Can the Colab part be done locally in a Jupyter notebook?
@nodematic 2 days ago
Yes, as long as you have enough local GPU memory and set up matching CUDA libraries and Python package versions.
@rinchennorbu3248 2 days ago
@@nodematic awesome, thanks for the reply. Also, at the end of the training step, once our model is fine-tuned on our data, how and where can I save it to share it? The model that is
@katarzynakuryo197 3 days ago
Hi, do you know if it is possible to use this model for multivariate time series?
@nodematic 3 days ago
Yes, the model is multivariate. Just be sure to include everything in the forecast call "inputs".
@katarzynakuryo197 2 days ago
@@nodematic I want to train my model on both time series at once. Do you know why it is not working?
    data = pd.DataFrame({
        'ds': train_df.index,
        'ts1': train_df["ts1"],
        'ts2': train_df["ts2"],
        'unique_id': 'sensor_1'
    })
    forecast_df = tfm.forecast_on_df(
        inputs=data,
        freq="D",
        value_name=["ts1", "ts2"]
    )
@triynizzles 4 days ago
I have been having tremendous difficulty. Can this be run locally in VS Code?
@nodematic 2 days ago
We haven't tested this, but it should work. The biggest concern would be not having enough GPU memory on your local machine, or not having a clean Python package and CUDA setup.
@triynizzles 16 hours ago
@@nodematic I have read about it more, and it looks like Windows isn't acting too friendly and most people are running Linux. :(
@triynizzles 4 days ago
Hello, I don't understand how, at 11:00, I can change "yahma/alpaca-cleaned" to a local .json file on my PC.
@nodematic 2 days ago
The Hugging Face datasets library is used in either case, to compile a dataset of training strings. The load_dataset("yahma/alpaca-cleaned") approach (or similar) applies only if your dataset is hosted on Hugging Face. The Dataset.from_dict used in the video should work if you read in the data from your local JSON and use it for the dictionary's "text" value. Depending on how the text is structured in your JSON, you may need to do string interpolation - the resulting "text" values for the dataset need to be pure strings.
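A minimal sketch of the local-JSON route described above. The record fields (instruction, input, output), the "### Input" / "### Response" headers, and the inline sample data are assumptions for illustration; adapt them to however your own JSON file is structured.

```python
import json

# Hypothetical records standing in for the contents of a local JSON file.
# With a real file you would instead do:
#     with open("my_data.json") as f:
#         records = json.load(f)
raw_json = """
[
  {"instruction": "Summarize the text.", "input": "Cats sleep a lot.", "output": "Cats are sleepy."},
  {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"}
]
"""
records = json.loads(raw_json)


def to_text(record):
    """Interpolate one record into a single training string."""
    return (
        f"{record['instruction']}\n"
        f"### Input\n{record['input']}\n"
        f"### Response\n{record['output']}"
    )


# The dictionary's "text" value must be a list of plain strings.
data_dict = {"text": [to_text(r) for r in records]}

try:
    from datasets import Dataset  # Hugging Face datasets library

    dataset = Dataset.from_dict(data_dict)
except ImportError:
    dataset = None  # library not installed; data_dict is still the right shape
```

The only part specific to your data is to_text; everything after it stays the same regardless of how the JSON is laid out.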
@triynizzles 16 hours ago
@@nodematic Thank you! I may have more questions in the future. :)
@adamhnat5123 6 days ago
Very helpful, thanks :)
@jelenacupac7 7 days ago
Really good tutorial!
@ausdavid 7 days ago
Thanks for the little tutorial, much appreciated. At my work I have to convert supplier quotes from PDF to Excel. This is to extract the item codes, description, pricing, etc. Adobe Acrobat Pro does have a PDF to Excel converter, but it's hit or miss in terms of extracting the text correctly and aligning information correctly within the respective columns/rows. Do you believe Google's offering is a good alternative, or are there other solutions I should look into? Thanks in advance :)
@nodematic 7 days ago
I think it's a good alternative, and you can also customize the parsing models by training them on your own data (we'll be releasing videos on this soon).
@ausdavid 6 days ago
Thanks for the quick response @@nodematic, looking forward to your future releases :) I'll give it a try in 1-2 months from now and let you know how it goes, just currently busy with other projects.
@techandgaming0 12 days ago
Any suggestions on how to achieve a zero-downtime deploy when uploading to the bucket?
@nodematic 11 days ago
I've seen a couple of things: 1) You could use gsutil rsync with the -d option, instead of gsutil rm and gsutil cp. This synchronizes, rather than strictly recreates, the objects in the bucket and should be at or near zero downtime. 2) You could use Cloud CDN in front of the bucket, with a "serve while stale" configuration. This configuration serves objects from a CDN cache if they don't exist in the bucket, for a preconfigured time, so the cache can "cover the gap" when the objects are updating.
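A small sketch of option 1, wrapping the gsutil rsync call in a deploy script. The local directory and bucket name are placeholders, and the helper name is made up for illustration. The -m (parallel), -r (recursive), -d (delete extra remote objects), and -n (dry run) flags are standard gsutil options; the sketch defaults to a dry run so nothing is changed until you opt in.

```python
import shlex


def build_rsync_command(local_dir, bucket, dry_run=True):
    """Build a gsutil rsync command that mirrors local_dir into the bucket."""
    cmd = ["gsutil", "-m", "rsync", "-r", "-d"]
    if dry_run:
        cmd.append("-n")  # -n: only report what would change
    cmd += [local_dir, f"gs://{bucket}"]
    return cmd


# Placeholder directory and bucket name.
command = build_rsync_command("./public", "my-site-bucket")
print(shlex.join(command))

# To actually deploy, run the non-dry-run command, for example:
#     import subprocess
#     subprocess.run(build_rsync_command("./public", "my-site-bucket", dry_run=False), check=True)
```

Because -d deletes remote objects that are missing locally, checking the dry-run output first is a cheap safeguard against pointing the sync at the wrong directory.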
@fire17102 15 days ago
Thanks for the video ❤ Are there any tools that run variations on all the hyperparameters and hone in on the best setup? Thanks and all the best :)
@nodematic 11 days ago
Google Cloud has a "Hyperparameter Tuning" feature/service in Vertex AI that can help with this. We haven't used it yet, and I'm not sure if it'll work cleanly with Darts, but we'll add it to our list of future video topics. An open source approach like Optuna would probably also work well.
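A minimal sketch of what the Optuna route could look like. The validation_error function is a toy stand-in for "train a model and return its validation error" (it is not Darts code), and the parameter names and ranges are assumptions; in practice you would fit your forecasting model inside the objective.

```python
def validation_error(learning_rate, num_layers):
    """Toy stand-in for training a model and measuring its validation error."""
    return (learning_rate - 0.01) ** 2 + (num_layers - 3) ** 2


def objective(trial):
    # Each trial samples one candidate configuration from the search space.
    learning_rate = trial.suggest_float("learning_rate", 1e-4, 1e-1, log=True)
    num_layers = trial.suggest_int("num_layers", 1, 6)
    return validation_error(learning_rate, num_layers)


try:
    import optuna

    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=30)
    best_params = study.best_params  # best configuration found so far
except ImportError:
    best_params = None  # Optuna is not installed
```

Optuna's default sampler uses earlier trial results to pick later candidates, which is what lets it narrow in on a good setup rather than sweeping every combination.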
@fire17102 11 days ago
@@nodematic awesome! Thanks for including the opensource option ftw!
@ChituyiDalmasWakhusama 16 days ago
Hi, I keep getting this error: "TypeError: argument of type 'NoneType' is not iterable". It originates from "usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py". Could you please share the requirements.txt? Also, it only happens when I try to push "merge_16bit". merge_4bit works just fine!
@user-ti7fg7gh7t 18 days ago
I hate how everyone does Unsloth tutorials that can't use a multi-GPU setup.
@ralvin3614 21 days ago
Really love the playlist! Great video.
@wilsonyt-we2sj 22 days ago
It was an informative video. I have a question, can we just extract the reference text from the main blocks against the list mentioned in the reference section for citations?
@SeanCallahan52 22 days ago
Have you had any success with large prompts of, say, 800k tokens, or tried to push 2000 images in a conversation? I always just get a 504. Would love to see something like that.
@nodematic 22 days ago
We haven't tried at that scale yet. You might want to experiment with the Gemini API directly (assuming you're currently using the Vertex AI Studio GUI), in case your browser is imposing some constraints. On the media front, you'll definitely want to use Cloud Storage, and then use those URIs in your prompts (as opposed to inline file encodings). We have some potential video ideas that involve a larger scale of Gemini usage - will highlight any challenges and tricks if/when we create those videos.
@SeanCallahan52 22 days ago
@@nodematic yeah, I’m actually talking about working with the API directly / the official Python SDK. Google gives you 20 GB of space to upload images and reference them (which obv you delete when the job is done), but I have yet to be able to actually provide a prompt within the range they claim and get a response that would a) fit that context window or b) output even 4k tokens, let alone the 8k output they claim. Even a hello world type example would be great content on that front.
@JackyLiang-zu8pz 23 days ago
I am running into an HTTP 400 error after using your JSON payload (word for word). Any ideas why?
@nodematic 15 days ago
We just retested and didn't have any problems - not sure how to reproduce the 400 status code. A 503 can come up for Cloud Storage permission issues (if you are using media files), but I haven't seen a 400. Since 400 is a "Bad Request", is it possible that your JSON is malformed? Maybe a missing comma, a comma where it shouldn't be, or non-ASCII quote characters? We're posting another video that involves ImageBind this week, which will feature calls to an associated Vertex AI Endpoints API. Maybe that will help.
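One way to rule out the malformed-JSON possibilities listed above is to check the payload locally before sending it. This is a rough sketch using only the Python standard library; the "instances" and "prompt" field names in the sample payloads are hypothetical and not taken from the video.

```python
import json

SMART_QUOTES = "\u201c\u201d\u2018\u2019"  # curly double and single quotes


def check_payload(raw):
    """Return a list of problems found in a raw JSON request body."""
    problems = []
    if any(ch in raw for ch in SMART_QUOTES):
        problems.append("contains non-ASCII (curly) quote characters")
    try:
        json.loads(raw)
    except json.JSONDecodeError as err:
        problems.append(
            f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
        )
    return problems


# Hypothetical payloads: one valid, one with a stray comma, one with curly quotes.
good = '{"instances": [{"prompt": "hello"}]}'
trailing_comma = '{"instances": [{"prompt": "hello"},]}'
curly = '{\u201cinstances\u201d: []}'

for payload in (good, trailing_comma, curly):
    print(check_payload(payload))
```

Curly quotes often sneak in when a payload is copied from a web page or a word processor, so the check flags them even when they appear inside an otherwise valid string value.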
@simonstrandgaard5503 24 days ago
Great explanation. The background music was a little distracting.
@nodematic 22 days ago
Thanks for the feedback - we'll keep this in mind for future videos.
@JavMend 24 days ago
pretty cool video. I had two questions: 1) what is the difference between loading the model in 4-bit quantization but doing tuning in 16 bit? previously, I loaded a model in bfloat16 and didn't have to specify anything when doing tuning - maybe I am misunderstanding; 2) do you have any video or code recommendations for where I can see fine-tuning but not using LoRA? I feel semi committed to trying this first before going the LoRA route. ty for the great video (and music hehe)
@nodematic 22 days ago
1) You can load the Mistral "base model" weights with 4-bit quantization, while using 16 bits for the fine-tuning adapter layers. So, for your fine-tuned model, most of the layers will be quantized, with a few non-quantized layers "on top". This is often a sweet spot between computational demands and model accuracy (generally a tradeoff). 2) We don't currently have non-adapter fine-tuning videos (e.g., full-parameter fine-tuning), but will try to create a video on this topic. Thanks for watching and good luck.
@ShdowGarden 24 days ago
Hi, I am fine-tuning a Llama 3 model but I am facing some issues. Your video was great. I was hoping to connect with you. Can we connect?
@nodematic 22 days ago
Thanks. You can reach out via email at [email protected]. We often do not have the staff to handle technical troubleshooting or architectural consulting, but we'll answer if we can.
@user-vm8yn4hb4w 26 days ago
I love every one of your videos.
@user-vm8yn4hb4w 1 month ago
I love every one of your tutorials, and I like your voice too 😅. I would like to contact you on LinkedIn so I can get advice while I'm building my project.
Thank you, this is very helpful. I'm trying to make a community application with Flutter, and I want to use Vertex AI for content moderation of comments and images.
@vahandag 1 month ago
You are a legend <3
@tristan2711 1 month ago
Nice video, very helpful! I fine-tuned Llama 3 with personal data. Few rows had nb_tokens > 512, so I chose max_seq_length=512. It worked well, but how does the model process data (instruction + input + target) with nb_tokens > 512? To me, RoPE scaling just provides generalized positional information, but I don't understand if every token's information is preserved in that case. I probably misunderstood RoPE scaling...
@nodematic 1 month ago
These LLMs are trained for a specific context window (8k tokens, in the case of this Llama 3 model), and RoPE scaling allows the model to understand/process sequences larger than that window (i.e., 8k+ tokens). Setting a max_seq_length that's below your longest row will mean truncation and information loss - whether or not you use RoPE scaling. Unless you have some hardware constraint, you should avoid this. It sounds like you are well under the 8k Llama 3 context window, so you shouldn't need RoPE scaling (and that will be automatically handled by the libraries/notebook). Just set the max_seq_length at or above your longest-sequence row, and you'll have good fine-tuning results without truncation. If that max_seq_length is less than 8k, the backend will not need RoPE, and if max_seq_length is more than 8k, the backend will need RoPE - but you can typically let the library handle that for you automatically.
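The rule of thumb above can be sketched in a few lines. The token counts are made-up numbers, the helper names are hypothetical, and the scaling-factor formula (target length divided by native length) is an assumption that holds for simple linear RoPE scaling; the library may choose its own value.

```python
NATIVE_CONTEXT = 8192  # Llama 3 context window mentioned above (8k tokens)


def count_truncated(token_counts, max_seq_length):
    """Number of rows that would lose tokens at this max_seq_length."""
    return sum(1 for n in token_counts if n > max_seq_length)


def plan_sequence_length(token_counts, native_context=NATIVE_CONTEXT):
    """Pick a max_seq_length that avoids truncation; report if RoPE scaling is needed."""
    longest = max(token_counts)
    needs_rope_scaling = longest > native_context
    # Assumption: linear scaling, factor = target length / native length.
    scaling_factor = longest / native_context if needs_rope_scaling else 1.0
    return {
        "max_seq_length": longest,
        "needs_rope_scaling": needs_rope_scaling,
        "scaling_factor": scaling_factor,
    }


# Made-up token counts per training row.
token_counts = [120, 480, 530, 910]
print(count_truncated(token_counts, 512))   # rows cut off at max_seq_length=512
print(plan_sequence_length(token_counts))   # well under 8k, so no RoPE scaling
print(plan_sequence_length([4000, 16384]))  # beyond 8k, so RoPE scaling applies
```

Counting the truncated rows first makes the cost of a low max_seq_length visible before any training run is started.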
@tristan2711 1 month ago
@@nodematic Thank you, it was very clear!
@atsileitumeleng 1 month ago
Can you please make a video about creating a Cloud Firestore function triggered by an image URL in the collection? I understand the text part, but I'm having issues with the image URL trigger. I was using the Gemini 1.5 Pro multimodal model.
@nodematic 1 month ago
Sure - we'll try to make a video in this area. Thanks for the suggestion.
@ThinhNguyen-ph1wd 1 month ago
Hi, thanks a lot for your tutorial. Could you make a tutorial on creating a "Custom Chatbot w/ Your PDF Data in Google Chat"? Have a nice day!
@nodematic 1 month ago
Thanks for the suggestion. We'll try to create a tutorial on this.
@redgenAI 1 month ago
Great tutorial, esp appreciate the data prep and custom data tweaking part
@nishitp28 1 month ago
Nice Video, What should be the format for data extraction, if I want to extract data from a chunk? Can I include something like: """ {Instruction or System Prompt} ### {Context or Chunks} ### {Question} ### {Answer} """
@nodematic 1 month ago
The "###" lines signify headers, so I wouldn't put your content on those lines - rather, they are used to categorize the line(s) of text below each header. If you're using a chunk of content (e.g., via some sort of RAG approach), yes, you could have that as a separate categorization. Something like:
    """
    {instruction}
    ### Background
    {chunk}
    ### Question
    {question}
    ### Answer
    {answer}
    """
For the best results, use the header terms in your instruction. For the example above, this could be something like: "Based on the provided background, which comes from documentation, FAQs, and/or support tickets, answer the supplied question as clearly and factually as possible. If the background is insufficient to answer the question, answer 'I don't know'."
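The template above maps directly onto Python string formatting. This is a small sketch with made-up sample values and a hypothetical helper name; the same function can produce training rows (answer filled in) and inference prompts (answer left empty for the model to complete).

```python
PROMPT_TEMPLATE = """{instruction}
### Background
{chunk}
### Question
{question}
### Answer
{answer}"""

INSTRUCTION = (
    "Based on the provided background, answer the supplied question as clearly "
    "and factually as possible. If the background is insufficient to answer "
    "the question, answer \"I don't know\"."
)


def build_example(chunk, question, answer=""):
    """Fill the template; leave answer empty to build an inference prompt."""
    return PROMPT_TEMPLATE.format(
        instruction=INSTRUCTION, chunk=chunk, question=question, answer=answer
    )


# Made-up sample values for illustration.
training_text = build_example(
    chunk="Refunds are processed within 5 business days.",
    question="How long do refunds take?",
    answer="Refunds take up to 5 business days.",
)
inference_prompt = build_example(
    chunk="Refunds are processed within 5 business days.",
    question="How long do refunds take?",
)
print(training_text)
```

Keeping one template for both training and inference avoids small formatting mismatches between the two, which the fine-tuned model can be sensitive to.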
@V-cy5dl 1 month ago
Any ideas how I could deploy this example without Elasticsearch?
@nodematic 1 month ago
You can swap in a different vector database and still use the same architecture. Here's a comparison of some popular vector DBs: benchmark.vectorview.ai/vectordbs.html. Pinecone is probably the best beginner option, if you aren't already familiar with these.
@Itsgosm 1 month ago
Amazing video! I've been curious: if I had to train on a set of code samples, which would have indentation (Python code, for example), would the data still need to be in the standard format with 'instruction', 'output', and 'input'? Would it be possible to train on 150+ code samples of quite high complexity? Are there any other ways to set up the dataset? And is Llama 3 capable of being trained on unstructured data?
@nodematic 1 month ago
Yes, you could use a different, non-Alpaca-style format. For the "text" field creation via string interpolation, replace that with a text block of your code lines (including line breaks). Llama-3 does well on HumanEval, so I suspect it would work well for your described use case. Just be careful with how you create your samples - getting the model to stop after generating the right line/block of code may not be easy (although you could trim things down with post-processing).
@minidraco2601 1 month ago
What's the name of the song at 3:47? Sounds pretty cool.
@nodematic 1 month ago
That's a Udio-generated custom song, and isn't published.
@adnenbenabdelaali6016 1 month ago
Great video and nice code. Can you do this context length extension for the DeepSeek Coder model?
@nodematic 1 month ago
I believe it's possible, but I haven't tried yet and there isn't an existing Unsloth model for this. We'll look into it though and try to create a video. Thanks for the suggestion.
@PuneetMehra 1 month ago
Turnkey chatbot - what is turnkey?
@nodematic 1 month ago
We mean "turnkey" in the sense of no model work, no code, and no installation/setup.
@user-fv8pi9uq6n 1 month ago
Do we need to create the repo first, before running the push-to-hub command?
@nodematic 1 month ago
No, just replace "hf/model" with your username (or organization name) and desired model name. Also, if you want a private repo, add a private=True argument to push_to_hub_merged.
@samfisher92sc 1 month ago
Great explanation. This could be a stupid question: how do we fine-tune for triggering function calls?
@nodematic 1 month ago
Thanks for your question - it's definitely not a stupid one! In your dataset, have fields like "instruction", "prompt", and "function", and then do the string interpolation to create your text field (you could do it similarly to the video, but replace "### Story" with "### Prompt" and "### Summary" with "### Function"). Make sure your training set has a consistent format for the function to trigger, and a consistent fallback value for non-triggering cases. Overall, the process should be quite similar to the video. Your model itself won't be able to actually trigger the function - it can only identify the right function to trigger (and possibly the arguments to supply to the function). You'll need to execute the function as a "next step" in some broader pipeline, application, service, or script. Hope I'm understanding the question correctly and that helps.
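A rough sketch of both halves of this approach: building the training text, and the "next step" that executes whatever function the model names. The JSON output format, the "none" fallback value, and the two example functions are all assumptions made for illustration; any consistent format would do.

```python
import json


def build_training_text(instruction, prompt, function_call):
    """Interpolate one dataset row into the single text field used for training."""
    return f"{instruction}\n### Prompt\n{prompt}\n### Function\n{function_call}"


# Hypothetical functions the application exposes.
def get_weather(city):
    return f"Weather lookup for {city}"


def set_timer(minutes):
    return f"Timer set for {minutes} minutes"


REGISTRY = {"get_weather": get_weather, "set_timer": set_timer}
NO_FUNCTION = "none"  # consistent fallback value for non-triggering cases


def dispatch(model_output):
    """Run the function named in the model output; return None for the fallback."""
    text = model_output.strip()
    if text == NO_FUNCTION:
        return None
    call = json.loads(text)
    func = REGISTRY[call["function"]]
    return func(**call.get("arguments", {}))


row = build_training_text(
    "Pick the function to call for the user's request, or answer none.",
    "What's the weather in Paris?",
    '{"function": "get_weather", "arguments": {"city": "Paris"}}',
)
print(row)
print(dispatch('{"function": "get_weather", "arguments": {"city": "Paris"}}'))
print(dispatch("none"))
```

Looking functions up in a fixed registry, rather than evaluating the model output directly, keeps the model from being able to run anything the application did not explicitly expose.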
@cobainkuy342 1 month ago
This is very helpful, thank you.
@nimesh.akalanka 1 month ago
How can I fine-tune the LLAMA 3 8B model for free on my local hardware, specifically a ThinkStation P620 Tower Workstation with an AMD Ryzen Threadripper PRO 5945WX processor, 128 GB DDR4 RAM, and two NVIDIA RTX A4000 16GB GPUs in SLI? I am new to this and have prepared a dataset for training. Is this feasible?
@nodematic 1 month ago
The approach highlighted in the video may work if your dataset doesn't have a very high token count. Just download the notebook and run it on your local machine. I haven't tried A4000s, but they use CUDA and the Ampere architecture, so they should work similarly. The fine-tuning would need to stay within 16 GB of GPU RAM, since the free, open source Unsloth doesn't include multi-GPU support.
@nimesh.akalanka 1 month ago
@@nodematic Thank you for the clarification
@drnicwilliams 1 month ago
LOL “We don’t need this code, so let’s put it in a text cell”
@alokrajsidhaarth7130 1 month ago
Great Video! I had a doubt about RoPE Scaling. How efficient is it and to what extent does it help solve the LLM context window size issue? Thanks!
@nodematic 1 month ago
RoPE is the standard way to solve the context window size issue with these open models. It can come at a quality cost, but it's basically the best method we have if you need to go beyond the model's default context window. Use it only if you truly need the additional tokens. In the video's example, the RoPE scaling is needed, because you simply can't summarize a 16k token story by only looking at the second-half 8k of tokens.
@npip99 1 month ago
@@nodematic Is there an easy API for RoPE? I don't even need fine-tuning, I just need a chat completion API for 32k context Llama 3.
@nodematic 1 month ago
Yes, you can use RoPE without fine-tuning (e.g., off-the-shelf Llama 3 with a 32k context). I would recommend using Hugging Face libraries, which can be configured for RoPE scaling (for example, TGI RoPE scaling is detailed here: huggingface.co/docs/text-generation-inference/en/basic_tutorials/preparing_model).
@excido7107 1 month ago
I was wondering, is there a certain format my fine-tuning dataset needs to have? For example, I want to train a Llama 3 model to prefill financial documents. I have completed versions of what I want, and I want to make a synthetic dataset. I was wondering if I need to use an instruction-based structure, or can just do

    alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

    ### Instruction:
    {}

    ### Input:
    {}

    ### Response:
    {}"""

    EOS_TOKEN = tokenizer.eos_token  # Must add EOS_TOKEN

    def formatting_prompts_func(examples):
        instructions = examples["instruction"]
        inputs = examples["input"]
        outputs = examples["output"]
        texts = []
        for instruction, input, output in zip(instructions, inputs, outputs):
            # Must add EOS_TOKEN, otherwise your generation will go on forever!
            text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
            texts.append(text)
        return {"text": texts}

or should I do it like

    # Convert dataset to OAI messages
    system_message = """You are Llama, an AI assistant created by Vignesh to be helpful and honest.
    Your knowledge spans a wide range of topics, allowing you to engage in substantive conversations and provide analysis on complex subjects."""

    def create_conversation(sample):
        if sample["messages"][0]["role"] == "system":
            return sample
        else:
            sample["messages"] = [{"role": "system", "content": system_message}] + sample["messages"]
            return sample

Just wondering before I start making a dataset. Thanks :)
@nodematic 1 month ago
You can deviate from the Alpaca prompt format, to whatever format you like. However, make sure you have a single text field for each example, by the time that you're actually running the training. Direct use of these OpenAI-style data structures will not be possible, but indirect use (via string interpolation) is possible. If you want to have a distinction between system prompts, AI messages, and user messages, be sure to reflect that with demarcation strings (similar to the stories and summaries in the video).
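As a sketch of the "indirect use" mentioned above, OpenAI-style message lists can be flattened into one string per example. The "### System" / "### User" / "### Assistant" demarcation strings and the function name are assumptions chosen for illustration, and the EOS token shown is a placeholder; in a notebook you would pass tokenizer.eos_token.

```python
# Assumed demarcation strings; any consistent markers would work.
ROLE_HEADERS = {
    "system": "### System",
    "user": "### User",
    "assistant": "### Assistant",
}


def messages_to_text(messages, eos_token=""):
    """Flatten a list of {"role", "content"} messages into one training string."""
    parts = [f"{ROLE_HEADERS[m['role']]}\n{m['content']}" for m in messages]
    return "\n".join(parts) + eos_token


sample = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is a TPD policy?"},
        {"role": "assistant", "content": "It covers total and permanent disability."},
    ]
}

# "</s>" is a placeholder; use tokenizer.eos_token in a real notebook.
text = messages_to_text(sample["messages"], eos_token="</s>")
print(text)
```

The resulting strings can then go into the dataset's single "text" field, the same as the Alpaca-style examples.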
@excido7107 1 month ago
@@nodematic I was just going to go with the instruction, input, output format, like

    {
      "instruction": "Explain the importance of disclosing any potential conflicts of interest or commissions related to the recommended insurance products.",
      "input": "The adviser has recommended that Dion and Valbone purchase life, TPD, trauma, and IP insurance policies from a specific insurance provider.",
      "output": "When recommending specific insurance products, it is crucial for the adviser to disclose any potential conflicts of interest or commissions they may receive from the insurance provider. This transparency helps establish trust and ensures that Dion and Valbone are aware of any potential biases or incentives that could influence the adviser's recommendations, allowing them to make informed decisions."
    },
    {
      "instruction": "Provide the full names of the clients from the provided information.",
      "input": "Client Information: Names - Robert Jefferson and Ruth Jefferson",
      "output": "The clients' full names are Robert Jefferson and Ruth Jefferson."
    etc.

What do you mean by a single string? Do I have to process the instruction format into something else before passing it for fine-tuning?
@nodematic 1 month ago
The instructions, inputs, and outputs should be fine, but the trainer will look for a "text" field, so make sure you somehow combine the instruction, input, and output into a string for that text value. You could set the text field to a string interpolation using the other fields (a process used in both the DIY data video process and base notebook Hugging Face off-the-shelf data process). The string interpolation could look something like this: """ {instruction} ### {input} ### {output} """ You can change the column used via the "dataset_text_field" trainer argument, but you can't set multiple fields for this - they need to be combined.
@danielhanchen 1 month ago
Oh fantastic video as always - absolutely packed with detailed information so great work!
@slimpbisquitte3942 1 month ago
Really comprehensive and well-explained! Great work! I wonder if it is also possible to fine-tune not a text generator but an image generator. Does someone have any ideas? I am super new to this field and pretty much in the dark so far. Could not find something for image generation yet :/ Thanks for any suggestions!
@nodematic 1 month ago
We'll try to make a video on this. Thanks for the suggestion.
@andrewmaynard6693 1 month ago
Good tone and voice! I appreciate it.
@Lerndemy 1 month ago
Underrated gem of a channel.
@fernandes7949 1 month ago
Great explanation of the Dataform product. Do you know if it is possible to integrate Dataform with Bitbucket? How can I trigger Dataform runs through Composer (Airflow)?
@nodematic 1 month ago
Yes, Bitbucket integration is possible. The Dataform Airflow operator should work for using Dataform via Composer - typically, this will be via the DataformCreateWorkflowInvocationOperator. I would recommend our Analytics Engineering with Dataform playlist if you want to dive into the details of Dataform: kzbin.info/www/bejne/Y3Lac2CHmK18d7c
@nicolassuarez2933 2 months ago
Outstanding! Will this approach work for data extraction? Let's say I want all the titles of a book? Thanks!
@nodematic 2 months ago
Thanks! Yes, your fine-tuned model could definitely be focused on data extraction, like book titles.
@branislavgaming8717 2 months ago
I started hosting Compliance on Google Cloud, and all roles are assigned. One month has already passed, but no one has started using Compliance on my hosting account. Can you make a video about that? Maybe I'm missing something. Thanks, I watch all your videos.
@nodematic 2 months ago
Yes, we'll try to make a video on this. Thanks for the suggestion.
@SaiyD 2 months ago
Thank you so much for this, you're awesome <3
@muhannadobeidat 2 months ago
Love the video and style. Great content
@rajendraprajapat538 2 months ago
I was looking for a similar video, thanks. It is a really great video.
@zacharyh5027 2 months ago
Is it possible to get this CDN setup with HTTPS and Signed URLs against a private bucket?
@nodematic 2 months ago
Yes, that should be possible. Cloud CDN does not prevent HTTPS for the Cloud Load Balancers or Signed URLs with a private bucket.
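For reference, a minimal sketch of how a Cloud CDN signed URL is assembled: Expires and KeyName query parameters are appended, the resulting URL is signed with HMAC-SHA1 using the base64url-decoded signing key, and the base64url-encoded signature is appended last. The hostname, key name, and key bytes below are placeholders, and the function name is made up. Note that this uses a signing key added to the CDN backend, which is separate from Cloud Storage's own signed URLs.

```python
import base64
import hashlib
import hmac


def sign_cdn_url(url, key_name, base64_key, expires_epoch):
    """Return url with Expires, KeyName, and Signature query parameters appended."""
    decoded_key = base64.urlsafe_b64decode(base64_key)
    separator = "&" if "?" in url else "?"
    url_to_sign = f"{url}{separator}Expires={expires_epoch}&KeyName={key_name}"
    digest = hmac.new(decoded_key, url_to_sign.encode("utf-8"), hashlib.sha1).digest()
    signature = base64.urlsafe_b64encode(digest).decode("utf-8")
    return f"{url_to_sign}&Signature={signature}"


# Placeholder 16-byte key; a real key is generated once and added to the CDN backend.
demo_key = base64.urlsafe_b64encode(b"0123456789abcdef").decode("utf-8")

signed = sign_cdn_url(
    "https://cdn.example.com/video.mp4",  # placeholder hostname and object path
    key_name="my-key",                    # placeholder key name
    base64_key=demo_key,
    expires_epoch=1700000000,             # Unix timestamp when the URL expires
)
print(signed)
```

In a real deployment, expires_epoch would be computed as the current time plus the desired lifetime of the link.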
@zacharyh5027 2 months ago
@@nodematic but how? Because I've tried and always ended up getting a 403 even with assigning bucket permissions to the CDN service account.