Comments
@rinchennorbu3248 2 days ago
Can the Colab part be done locally in a Jupyter notebook?
@nodematic 2 days ago
Yes, as long as you have enough local GPU memory and set up matching CUDA libraries and Python package versions.
@rinchennorbu3248 2 days ago
@@nodematic Awesome, thanks for the reply. Also, at the end of the training step, once our model is fine-tuned on our data, how and where can I save it so I can share it? The model, that is.
@katarzynakuryo197 3 days ago
Hi, do you know if it is possible to use this model for multivariate time series?
@nodematic 3 days ago
Yes, the model is multivariate. Just be sure to include everything in the forecast call "inputs".
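For example, here's a minimal sketch of the long-format input that forecast_on_df expects, with a separate unique_id per series (the column names and the value_name argument below are assumptions; adjust them to your data):

import pandas as pd

# Stack both series in long format: one row per (series, date) pair,
# with a distinct unique_id per series and a single value column.
data = pd.concat([
    pd.DataFrame({"unique_id": "ts1", "ds": train_df.index, "y": train_df["ts1"]}),
    pd.DataFrame({"unique_id": "ts2", "ds": train_df.index, "y": train_df["ts2"]}),
])
forecast_df = tfm.forecast_on_df(inputs=data, freq="D", value_name="y")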
@katarzynakuryo197 2 days ago
@@nodematic I want to train my model on both time series at once. Do you know why this is not working?

data = pd.DataFrame({
    'ds': train_df.index,
    'ts1': train_df["ts1"],
    'ts2': train_df["ts2"],
    'unique_id': 'sensor_1'
})
forecast_df = tfm.forecast_on_df(
    inputs=data,
    freq="D",
    value_name=["ts1", "ts2"]
)
@triynizzles 4 days ago
I have been having tremendous difficulty. Can this be run locally in VS Code?
@nodematic 2 days ago
We haven't tested this, but it should work. The biggest concerns would be not having enough GPU memory on your local machine, or not having a clean Python package and CUDA setup.
@triynizzles 16 hours ago
@@nodematic I have read about it more, and it looks like Windows isn't acting too friendly; most people are running Linux. :(
@triynizzles 4 days ago
Hello, I don't understand how, at 11:00, I can change "yahma/alpaca-cleaned" to a local .json file on my PC.
@nodematic 2 days ago
The Hugging Face datasets library is used in either case to compile a dataset of training strings. The load_dataset("yahma/alpaca-cleaned") approach (or similar) is only needed if your dataset is hosted on Hugging Face. The Dataset.from_dict approach used in the video should work if you read in the data from your local JSON and use it for the dictionary's "text" value. Depending on how the text is structured in your JSON, you may need to do string interpolation - the resulting "text" values for the dataset need to be pure strings.
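As a rough sketch (the file name and JSON keys below are assumptions; adjust them to match your data):

import json
from datasets import Dataset

# Hypothetical local file with a list of {"instruction", "input", "output"} records
with open("my_dataset.json") as f:
    records = json.load(f)

# String interpolation: combine each record into a single training string
texts = [
    f"{r['instruction']}\n\n### Input\n{r['input']}\n\n### Response\n{r['output']}"
    for r in records
]

dataset = Dataset.from_dict({"text": texts})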
@triynizzles 16 hours ago
@@nodematic Thank you! I may have more questions in the future. :)
@adamhnat5123 6 days ago
Very helpful, thanks :)
@jelenacupac7 7 days ago
Really good tutorial!
@ausdavid 7 days ago
Thanks for the little tutorial, much appreciated. At my work I have to convert supplier quotes from PDF to Excel, to extract the item codes, descriptions, pricing, etc. Adobe Acrobat Pro does have a PDF to Excel converter, but it's hit or miss in terms of how correctly it extracts the text and aligns information within the respective columns/rows. Do you believe Google's offering is a good alternative, or are there other solutions I should look into? Thanks in advance :)
@nodematic 7 days ago
I think it's a good alternative, and you can also customize the parsing models by training them on your own data (we'll be releasing videos on this soon)
@ausdavid 6 days ago
Thanks for the quick response @@nodematic, looking forward to your future releases :) I'll give it a try in a month or two and let you know how it goes; I'm just currently busy with other projects.
@techandgaming0 12 days ago
Any suggestions on how to achieve zero downtime deploy when uploading to the bucket?
@nodematic 11 days ago
I've seen a couple of things: 1) You could use gsutil rsync with the -d option instead of gsutil rm and gsutil cp. This synchronizes, rather than strictly recreates, the objects in the bucket and should be at or near zero downtime. 2) You could use Cloud CDN in front of the bucket, with a "serve while stale" configuration. This configuration serves objects from a CDN cache if they don't exist in the bucket, for a preconfigured time, so the cache can "cover the gap" while the objects are updating.
@fire17102 15 days ago
Thanks for the video ❤ Are there any tools that run variations on all the hyperparameters and home in on the best setup? Thanks and all the best :)
@nodematic 11 days ago
Google Cloud has a "Hyperparameter Tuning" feature/service in Vertex AI that can help with this. We haven't used it yet, and I'm not sure if it'll work cleanly with Darts, but we'll add it to our list of future video topics. An open source approach like Optuna would probably also work well.
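For instance, a minimal Optuna sketch (the search space and the helpers build_and_train_model / validation_error are hypothetical placeholders for your own Darts training and evaluation code):

import optuna

def objective(trial):
    # Hypothetical search space for a Darts forecasting model
    params = {
        "input_chunk_length": trial.suggest_int("input_chunk_length", 12, 96),
        "n_epochs": trial.suggest_int("n_epochs", 5, 50),
        "lr": trial.suggest_float("lr", 1e-4, 1e-1, log=True),
    }
    model = build_and_train_model(**params)  # hypothetical helper
    return validation_error(model)           # hypothetical helper, e.g. MAPE on a holdout set

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)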
@fire17102 11 days ago
@@nodematic Awesome! Thanks for including the open source option, FTW!
@ChituyiDalmasWakhusama 16 days ago
Hi, I keep getting this error: "TypeError: argument of type 'NoneType' is not iterable". It is originating from "usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py". Could you please share the requirements.txt? Also, it only happens when I try to push "merge_16bit"; merge_4bit works just fine!
@user-ti7fg7gh7t 18 days ago
I hate how everyone does Unsloth tutorials without being able to use a multi-GPU setup.
@ralvin3614 21 days ago
Really love the playlist! Great video.
@wilsonyt-we2sj 22 days ago
It was an informative video. I have a question: can we just extract the reference text from the main blocks, matched against the list in the reference section, for citations?
@SeanCallahan52 22 days ago
Have you had any success with large prompts of, say, 800k tokens, or trying to push 2,000 images in a conversation? I always just get a 504. Would love to see something like that.
@nodematic 22 days ago
We haven't tried at that scale yet. You might want to experiment with the Gemini API directly (assuming you're currently using the Vertex AI Studio GUI), in case your browser is imposing some constraints. On the media front, you'll definitely want to use Cloud Storage, and then use those URIs in your prompts (as opposed to inline file encodings). We have some potential video ideas that involve a larger scale of Gemini usage - will highlight any challenges and tricks if/when we create those videos.
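For reference, a minimal sketch of passing Cloud Storage URIs to Gemini via the Vertex AI Python SDK (the project, bucket, and model ID below are placeholders, and the exact model ID may differ for your setup):

import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="your-project", location="us-central1")

model = GenerativeModel("gemini-1.5-pro")  # placeholder model ID

# Reference an object already uploaded to Cloud Storage, rather than inlining the bytes
image_part = Part.from_uri("gs://your-bucket/image-0001.jpg", mime_type="image/jpeg")

response = model.generate_content(["Describe this image.", image_part])
print(response.text)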
@SeanCallahan52 22 days ago
@@nodematic Yeah, I'm actually talking about working with the API directly / the official Python SDK. Google gives you 20 GB of space to upload images and reference them (which you obviously delete when the job is done), but I have yet to be able to actually provide a prompt within the range they claim and get a response that would a) fit that context window or b) output even 4k tokens, let alone the 8k output they claim. Even a hello-world type example would be great content on that front.
@JackyLiang-zu8pz 23 days ago
I am running into an HTTP 400 error after using your JSON payload (word for word). Any ideas why?
@nodematic 15 days ago
We just retested and didn't have any problems - not sure how to reproduce the 400 status code. A 503 can come up for Cloud Storage permission issues (if you are using media files), but I haven't seen a 400. Since 400 is a "Bad Request", is it possible that your JSON is malformed? Maybe a missing comma, a comma where it shouldn't be, or non-ASCII quote characters? We're posting another video involving ImageBind this week, which will feature calls to an associated Vertex AI Endpoints API. Maybe that will help.
@simonstrandgaard5503 24 days ago
Great explanation. The background music was a little distracting.
@nodematic 22 days ago
Thanks for the feedback - we'll keep this in mind on future videos.
@JavMend 24 days ago
Pretty cool video. I had two questions: 1) What is the difference between loading the model in 4-bit quantization but doing tuning in 16-bit? Previously, I loaded a model in bfloat16 and didn't have to specify anything when doing tuning - maybe I am misunderstanding. 2) Do you have any video or code recommendations where I can see fine-tuning without LoRA? I feel semi-committed to trying this first before going the LoRA route. Thanks for the great video (and music hehe).
@nodematic 22 days ago
1) You can load the Mistral "base model" weights with 4-bit quantization, while using 16 bits for the fine-tuning adapter layers. So, for your fine-tuned model, most of the layers will be quantized, but with a few non-quantized layers "on top". This is often a sweet spot between computational demands and model accuracy (generally a tradeoff). 2) We don't currently have non-adapter fine-tuning videos (e.g., full-parameter fine-tuning), but will try to create a video on this topic. Thanks for watching and good luck.
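For example, a rough sketch of that split using Unsloth (the model name and LoRA hyperparameters here are placeholders, not the video's exact values):

from unsloth import FastLanguageModel

# Base weights loaded in 4-bit (quantized)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",  # placeholder model
    max_seq_length=2048,
    load_in_4bit=True,
)

# LoRA adapter layers added "on top" - these train in higher precision
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)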
@ShdowGarden 24 days ago
Hi, I am fine-tuning a Llama 3 model but I am facing some issues. Your video was great. I was hoping to connect with you. Can we connect?
@nodematic 22 days ago
Thanks. You can reach out via email at [email protected]. We often do not have the staff to handle technical troubleshooting or architectural consulting, but we'll answer if we can.
@user-vm8yn4hb4w 26 days ago
I love every one of your videos.
@user-vm8yn4hb4w 1 month ago
I love every one of your tutorials... and I like your voice too 😅. I would like to contact you on LinkedIn so I can get advice while I'm building my project.
@nodematic 1 month ago
Thanks! Please reach out at [email protected].
@user-vm8yn4hb4w 1 month ago
Thank you, this is very helpful. I'm trying to make a community application with Flutter, and I want to use Vertex AI for content moderation of comments and images.
@vahandag 1 month ago
You are a legend <3
@tristan2711 1 month ago
Nice video, very helpful! I fine-tuned Llama 3 with personal data. A few rows had nb_tokens > 512, so I chose max_seq_length=512. It worked well, but how does the model process data (instruction + input + target) with nb_tokens > 512? To me, RoPE scaling just provides generalized positional information, but I don't understand whether every token's information is preserved in that case. I probably misunderstood RoPE scaling...
@nodematic 1 month ago
These LLMs are trained for a specific context window (8k tokens, in the case of this Llama 3 model), and RoPE scaling allows the model to understand/process sequences larger than that window (i.e., 8k+ tokens). Setting a max_seq_length that's below your longest row will mean truncation and information loss - whether or not you use RoPE scaling. Unless you have some hardware constraint, you should avoid this. It sounds like you are well under the 8k Llama 3 context window, so you shouldn't need RoPE scaling (and that will be automatically handled by the libraries/notebook). Just set the max_seq_length at or above your longest-sequence row, and you'll have good fine-tuning results without truncation. If that max_seq_length is less than 8k, the backend will not need RoPE, and if max_seq_length is more than 8k, the backend will need RoPE - but you can typically let the library handle that for you automatically.
@tristan2711 1 month ago
@@nodematic Thank you, it was very clear!
@atsileitumeleng 1 month ago
Can you please make a video about creating a Cloud Firestore function triggered by an image URL in the collection? I understand the text part, but I'm having issues with the image URL trigger. I was using the Gemini 1.5 Pro multimodal model.
@nodematic 1 month ago
Sure - we'll try to make a video in this area. Thanks for the suggestion.
@ThinhNguyen-ph1wd 1 month ago
Hi, thanks a lot for your tutorial. Could you make a tutorial on creating a "Custom Chatbot w/ Your PDF Data in Google Chat"? Wish you a nice day!!!
@nodematic 1 month ago
Thanks for the suggestion. We'll try to create a tutorial on this.
@redgenAI 1 month ago
Great tutorial, especially appreciate the data prep and custom data tweaking part.
@nishitp28 1 month ago
Nice video. What should the format be for data extraction, if I want to extract data from a chunk? Can I include something like:

"""
{Instruction or System Prompt}

### {Context or Chunks}

### {Question}

### {Answer}
"""
@nodematic 1 month ago
The "###" lines signify headers, so I wouldn't put your content on those lines - rather, they are used to categorize the line(s) of text below each header. If you're using a chunk of content (e.g., via some sort of RAG approach), yes, you could have that as a separate categorization. Something like: """ {instruction} ### Background {chunk} ### Question {question} ### Answer {answer} """ For the best results, use the header terms in your instruction. For the example above, this could be something like "Based on the provided background, which comes from documentation, FAQs, and/or support tickets, answer the supplied question as clearly and factually as possible. If the background is insufficient to answer the question, answer "I don't know".".
@V-cy5dl 1 month ago
Any ideas how I could deploy this example without Elasticsearch?
@nodematic 1 month ago
You can swap in a different vector database and still use the same architecture. Here's a comparison of some popular vector DBs: benchmark.vectorview.ai/vectordbs.html. Pinecone is probably the best beginner option if you aren't already familiar with these.
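If you go the Pinecone route, a rough sketch with the v3 Python client might look like this (the index name and metadata are placeholders, and the embed function is assumed to come from your existing pipeline):

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("rag-chunks")  # placeholder index name, created beforehand

# Upsert a chunk embedding with its text as metadata
index.upsert(vectors=[{
    "id": "doc-1-chunk-3",
    "values": embed("some chunk of text"),  # hypothetical embedding function
    "metadata": {"text": "some chunk of text"},
}])

# Retrieve the most similar chunks for a query
results = index.query(vector=embed("user question"), top_k=3, include_metadata=True)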
@Itsgosm 1 month ago
Amazing video! I've been curious: if I had to train on a set of code samples, which would have indentation (take Python code, for example), would the data still need to be in the standard format of 'instruction', 'output', and 'input'? With 150+ code samples of quite high complexity, would it be possible to train it? Are there any other ways to set up the dataset? And is Llama 3 capable of being trained on unstructured data?
@nodematic 1 month ago
Yes, you could use a different, non-Alpaca-style format. For the "text" field creation via string interpolation, replace that with a text block of your code lines (including line breaks). Llama-3 does well on HumanEval, so I suspect it would work well for your described use case. Just be careful with how you create your samples - getting the model to stop after generating the right line/block of code may not be easy (although you could trim things down with post-processing).
@minidraco2601 1 month ago
What's the name of the song at 3:47? Sounds pretty cool.
@nodematic 1 month ago
That's a Udio-generated custom song, and isn't published.
@adnenbenabdelaali6016 1 month ago
Great video and nice code. Can you do this context length extension for the DeepSeek Coder model?
@nodematic 1 month ago
I believe it's possible, but I haven't tried yet and there isn't an existing Unsloth model for this. We'll look into it though and try to create a video. Thanks for the suggestion.
@PuneetMehra 1 month ago
Turnkey chatbot - what is turnkey?
@nodematic 1 month ago
We mean "turnkey" in the sense of no model work, no code, and no installation/setup.
@user-fv8pi9uq6n 1 month ago
Do we need to create the repo first, before the push to hub command?
@nodematic 1 month ago
No, just replace "hf/model" with your username (or organization name) and desired model name. Also, if you want a private repo, add a private=True argument to push_to_hub_merged.
@samfisher92sc 1 month ago
Great explanation. This could be a stupid question: how do we fine-tune to trigger function calling?
@nodematic 1 month ago
Thanks for your question - it's definitely not a stupid one! In your dataset, have fields like "instruction", "prompt", and "function", and then do the string interpolation to create your "text" field (you could do it similarly to the video, but replace "### Story" with "### Prompt" and "### Summary" with "### Function"). Make sure your training set has a consistent format for the function to trigger, and a consistent fallback value for non-triggering cases. Overall, the process should be quite similar to the video. Your model itself won't be able to actually trigger the function - only identify the right function to trigger (and possibly the arguments to supply to the function). You'll need to execute the function as a "next step" in some broader pipeline, application, service, or script. Hope I'm understanding the question correctly and that helps.
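A rough sketch of that interpolation (the field names and the dataset variable are assumptions about your own data, and the tokenizer is assumed to be in scope from the notebook):

function_prompt = """{}

### Prompt
{}

### Function
{}"""

EOS_TOKEN = tokenizer.eos_token

def format_examples(examples):
    # Combine instruction/prompt/function columns into a single "text" string per example
    texts = []
    for instruction, prompt, function in zip(
        examples["instruction"], examples["prompt"], examples["function"]
    ):
        texts.append(function_prompt.format(instruction, prompt, function) + EOS_TOKEN)
    return {"text": texts}

dataset = dataset.map(format_examples, batched=True)  # dataset: your Hugging Face Dataset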
@cobainkuy342 1 month ago
This is really helpful. Thank you!
@nimesh.akalanka 1 month ago
How can I fine-tune the LLAMA 3 8B model for free on my local hardware, specifically a ThinkStation P620 Tower Workstation with an AMD Ryzen Threadripper PRO 5945WX processor, 128 GB DDR4 RAM, and two NVIDIA RTX A4000 16GB GPUs in SLI? I am new to this and have prepared a dataset for training. Is this feasible?
@nodematic 1 month ago
The approach highlighted in the video may work if your dataset doesn't have a very high token count. Just download the notebook and run it on your local machine. I haven't tried A4000s, but it's CUDA + Ampere technology, so it should work similarly. The fine-tuning would need to stay within 16 GB of GPU RAM, since the free, open source Unsloth doesn't include multi-GPU support.
@nimesh.akalanka 1 month ago
@@nodematic Thank you for the clarification
@drnicwilliams 1 month ago
LOL “We don’t need this code, so let’s put it in a text cell”
@alokrajsidhaarth7130 1 month ago
Great Video! I had a doubt about RoPE Scaling. How efficient is it and to what extent does it help solve the LLM context window size issue? Thanks!
@nodematic 1 month ago
RoPE is the standard way to solve the context window size issue with these open models. It can come at a quality cost, but it's basically the best method we have if you need to go beyond the model's default context window. Use it only if you truly need the additional tokens. In the video's example, the RoPE scaling is needed, because you simply can't summarize a 16k token story by only looking at the second-half 8k of tokens.
@npip99 1 month ago
@@nodematic Is there an easy API for RoPE? I don't even need fine-tuning, I just need a chat completion API for 32k-context Llama 3.
@nodematic 1 month ago
Yes, you can use RoPE without fine-tuning (e.g., off-the-shelf Llama 3 with a 32k context). I would recommend using Hugging Face libraries, which can be configured for RoPE scaling (for example TGI RoPE scaling is detailed here huggingface.co/docs/text-generation-inference/en/basic_tutorials/preparing_model).
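As an illustration, a rough sketch of off-the-shelf RoPE scaling via Transformers (the scaling factor is a placeholder, and depending on your transformers version the dict key may be "type" or "rope_type"):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    rope_scaling={"type": "dynamic", "factor": 4.0},  # roughly 4x the default 8k context
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)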
@excido7107 1 month ago
I was wondering, is there a certain format I need to have my fine-tuning dataset in? For example, I want to train a Llama 3 model to prefill financial documents. I have completed versions of what I want and want to make a synthetic dataset. I was wondering if I need to do an instruction-based structure like this:

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN
def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    inputs = examples["input"]
    outputs = examples["output"]
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}

or should I do it like this:

# Convert dataset to OAI messages
system_message = """You are Llama, an AI assistant created by Vignesh to be helpful and honest. Your knowledge spans a wide range of topics, allowing you to engage in substantive conversations and provide analysis on complex subjects."""

def create_conversation(sample):
    if sample["messages"][0]["role"] == "system":
        return sample
    else:
        sample["messages"] = [{"role": "system", "content": system_message}] + sample["messages"]
        return sample

Just wondering before I start making a dataset. Thanks :)
@nodematic 1 month ago
You can deviate from the Alpaca prompt format, to whatever format you like. However, make sure you have a single text field for each example, by the time that you're actually running the training. Direct use of these OpenAI-style data structures will not be possible, but indirect use (via string interpolation) is possible. If you want to have a distinction between system prompts, AI messages, and user messages, be sure to reflect that with demarcation strings (similar to the stories and summaries in the video).
@excido7107 1 month ago
@@nodematic I was just going to go with the instruction, input, output format, like:

{
    "instruction": "Explain the importance of disclosing any potential conflicts of interest or commissions related to the recommended insurance products.",
    "input": "The adviser has recommended that Dion and Valbone purchase life, TPD, trauma, and IP insurance policies from a specific insurance provider.",
    "output": "When recommending specific insurance products, it is crucial for the adviser to disclose any potential conflicts of interest or commissions they may receive from the insurance provider. This transparency helps establish trust and ensures that Dion and Valbone are aware of any potential biases or incentives that could influence the adviser's recommendations, allowing them to make informed decisions."
},
{
    "instruction": "Provide the full names of the clients from the provided information.",
    "input": "Client Information: Names - Robert Jefferson and Ruth Jefferson",
    "output": "The clients' full names are Robert Jefferson and Ruth Jefferson."
}

etc. What do you mean by a single string? Do I have to process the instruction format into something else before passing it for fine-tuning?
@nodematic 1 month ago
The instructions, inputs, and outputs should be fine, but the trainer will look for a "text" field, so make sure you somehow combine the instruction, input, and output into a string for that text value. You could set the text field to a string interpolation using the other fields (a process used in both the DIY data video process and the base notebook Hugging Face off-the-shelf data process). The string interpolation could look something like this:

"""
{instruction}

### {input}

### {output}
"""

You can change the column used via the "dataset_text_field" trainer argument, but you can't set multiple fields for this - they need to be combined.
@danielhanchen 1 month ago
Oh fantastic video as always - absolutely packed with detailed information so great work!
@slimpbisquitte3942 1 month ago
Really comprehensive and well explained! Great work! I wonder if it is also possible to fine-tune not a text generator but an image generator. Does anyone have any ideas? I am super new to this field and pretty much in the dark so far. I could not find anything for image generation yet :/ Thanks for any suggestions!
@nodematic 1 month ago
We'll try to make a video on this. Thanks for the suggestion.
@andrewmaynard6693 1 month ago
Good tone and voice! I appreciate it.
@Lerndemy 1 month ago
Underrated gem of a channel.
@fernandes7949 1 month ago
Great explanation of the Dataform product. Do you know if it is possible to integrate Dataform with Bitbucket? How can I force Dataform to run through Composer/Airflow?
@nodematic 1 month ago
Yes, Bitbucket integration is possible. The Dataform Airflow operator should work for using Dataform via Composer - typically, this will be via the DataformCreateWorkflowInvocationOperator. I would recommend our Analytics Engineering with Dataform playlist if you want to dive into the details of Dataform. kzbin.info/www/bejne/Y3Lac2CHmK18d7c
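A rough sketch of a Composer/Airflow DAG fragment using the Google provider's Dataform operators (the project, region, and repository IDs are placeholders):

from airflow.providers.google.cloud.operators.dataform import (
    DataformCreateCompilationResultOperator,
    DataformCreateWorkflowInvocationOperator,
)

# Compile the Dataform repository (placeholder IDs)
create_compilation_result = DataformCreateCompilationResultOperator(
    task_id="create_compilation_result",
    project_id="your-project",
    region="us-central1",
    repository_id="your-dataform-repo",
    compilation_result={"git_commitish": "main"},
)

# Invoke a workflow run against that compilation result
create_workflow_invocation = DataformCreateWorkflowInvocationOperator(
    task_id="create_workflow_invocation",
    project_id="your-project",
    region="us-central1",
    repository_id="your-dataform-repo",
    workflow_invocation={
        "compilation_result": "{{ task_instance.xcom_pull('create_compilation_result')['name'] }}"
    },
)

create_compilation_result >> create_workflow_invocation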
@nicolassuarez2933 2 months ago
Outstanding! Will this approach work for data extraction? Let's say I want all the titles in a book? Thanks!
@nodematic 2 months ago
Thanks! Yes, your fine-tuned model could definitely be focused on data extraction, like book titles.
@branislavgaming8717 2 months ago
I started hosting Compliance on Google Cloud and all roles are assigned... it has already been a month, but no one has started using Compliance on my hosting account... Can you make a video about that? Maybe I'm missing something... Thanks... I watch all your videos.
@nodematic 2 months ago
Yes, we'll try to make a video on this. Thanks for the suggestion.
@SaiyD 2 months ago
Thank you so much for this, you're awesome <3
@muhannadobeidat 2 months ago
Love the video and style. Great content
@rajendraprajapat538 2 months ago
I was looking for a similar video, thanks. It is a really great video.
@zacharyh5027 2 months ago
Is it possible to get this CDN setup with HTTPS and Signed URLs against a private bucket?
@nodematic 2 months ago
Yes, that should be possible. Cloud CDN does not prevent HTTPS for the Cloud Load Balancers or Signed URLs with a private bucket.
@zacharyh5027 2 months ago
@@nodematic But how? I've tried and always ended up getting a 403, even after assigning bucket permissions to the CDN service account.