If you are interested in learning more about how to build robust RAG applications, check out this course: prompt-s-site.thinkific.com/courses/rag
@mike_nyc_212 · 6 months ago
In the provided training example, the first user "message" looks like it matches the "prompt". Btw, totally agree that my main frustration with fine-tuning toolkits has been how to format the training data (especially for multi-turn conversations)!
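For anyone puzzling over the same format question: mistral-finetune expects one JSON object per line, each holding a "messages" list of alternating user/assistant turns. A minimal sketch (the conversation content here is invented for illustration):

```python
import json

# One multi-turn training example in the instruct JSONL format:
# a "messages" list of alternating user/assistant roles.
example = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
        {"role": "user", "content": "And of Italy?"},
        {"role": "assistant", "content": "The capital of Italy is Rome."},
    ]
}

# Each example is serialized as a single line of a .jsonl file.
line = json.dumps(example)
print(line)
```

Multi-turn conversations just extend the same list, which is what makes this shape easier to work with than a flat prompt/completion pair.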
@unclecode · 6 months ago
Very interesting. I noticed that the library's only dependency is Torch; it doesn't involve HuggingFace. This means they built all these transformer classes themselves. I wonder about the motivation behind this. Although we can use HuggingFace for fine-tuning, it's intriguing that they optimized it for their own model. I'm also curious about setting the weight to 0. If it's applied per row, what happens if you set the weight to another number, like 0.5 or 1? Does it treat that as a multiplier?

Another idea: let's create a simple web interface for fine-tuning. What do you think? If you agree, I can create a repo and we can work on it together. We could build a simple interface where people can upload CSV files or other formats, use validation tools, prepare data, and then queue it for fine-tuning. I think it would be very useful for many people. Maybe you could add it to localGPT; it would be nice to see localGPT have a local fine-tuner too.
@engineerprompt · 6 months ago
I agree, it seems like they have a lot of custom code and optimizations. I think they might be releasing a lot more and have decided to create a dedicated implementation that can support their new models without waiting for HF transformers to be updated each time there is some new innovation. I suspect the weight is going to be just binary (I plan to look into their code a bit deeper when I get back from the break). Seems like this is part of their data filtering for samples that are not properly formatted.

I like your idea. I always wanted to integrate something like this into localGPT but never got around to it. One aspect which could actually be really useful is if someone could just upload raw text files and a model creates an instruct dataset out of them. I have had a few requests around this, and I think it might be very useful for people who want to fine-tune their own models but don't even know where to start with their datasets. Let's discuss this when I get back towards the end of the week.
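On the weight question being debated here: per the mistral-finetune README, an assistant message in the instruct format can carry an optional "weight" key set to 0 to exclude that turn from the loss. Whether fractional values like 0.5 act as multipliers is exactly the open question above. A sketch of what the field looks like and how a binary filter over it behaves (the example data is invented):

```python
import json

example = {
    "messages": [
        {"role": "user", "content": "Here is some context..."},
        # This assistant turn stays in the context but is not trained on.
        {"role": "assistant", "content": "Acknowledged.", "weight": 0},
        {"role": "user", "content": "Now answer the real question."},
        {"role": "assistant", "content": "The actual answer."},  # trained on
    ]
}

# Count assistant turns that contribute to the loss, treating a
# missing "weight" as 1 (the default).
trained = [
    m for m in example["messages"]
    if m["role"] == "assistant" and m.get("weight", 1) != 0
]
print(len(trained))  # → 1
```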
@unclecode · 6 months ago
@@engineerprompt True, they have just updated their library for Codestral. Try it, and please make some content on it. Regarding localGPT, looking forward to it, and I'll be happy to help.
@Stephan808 · 1 day ago
Sorry, can I fine-tune Mistral Large 2 or Pixtral too, or is this only for Mistral 7B?
@Stephan808 · 1 month ago
How do I create the dataset, by using an LLM? Is there a dialog-creator tool? Does it make sense to build it in a CSV and then convert it to JSONL? I can't imagine that anyone writes all the text manually...
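The CSV-then-convert route is straightforward to script. A sketch, assuming hypothetical "prompt" and "response" columns (in practice you would open a real file rather than the in-memory sample used here):

```python
import csv
import io
import json

# Hypothetical CSV contents; stand-in for a real file on disk.
csv_text = """prompt,response
What is 2+2?,4
Name a primary color.,Red
"""

def csv_to_jsonl(reader):
    """Turn each CSV row into one instruct-format JSONL line."""
    lines = []
    for row in csv.DictReader(reader):
        record = {
            "messages": [
                {"role": "user", "content": row["prompt"]},
                {"role": "assistant", "content": row["response"]},
            ]
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

jsonl = csv_to_jsonl(io.StringIO(csv_text))
print(jsonl)
```

Generating the rows themselves with an LLM and then validating/converting like this is a common pattern; nobody types thousands of examples by hand.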
@iainattwater1747 · 6 months ago
Nice video, thanks. Have you tried inferencing the Mistral fine-tune with TGI? If the tokenizer contains the chat template, then TGI/HF Chat UI should honor it. I'm going to try it.
@engineerprompt · 6 months ago
Not yet. I am looking into TGI and plan to create a tutorial on it.
@pawan3133 · 6 months ago
Thanks for another great video!! Can you please make a video, or at least share some material, on fine-tuning a quantized Mistral v0.3 model?
@engineerprompt · 6 months ago
In general, you want to load the model in 4-bit. Look at my fine-tuning videos using unsloth.
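For context, one common way to set up that 4-bit load outside of unsloth is HuggingFace transformers with bitsandbytes. This is a configuration sketch only, not something run here; the model id and dtype choices are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantized load (QLoRA-style).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # quantize weights to 4-bit on load
    bnb_4bit_quant_type="nf4",            # NormalFloat4, the usual QLoRA choice
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.3",          # illustrative model id
    quantization_config=bnb_config,
    device_map="auto",
)
```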
@spectre123 · 5 months ago
Thanks for this video. Can you make a video on the pretraining data format, where plain text is stored under the "text" key, e.g. {"text": "Text contained in document n°1"}? And how much text do we need for good fine-tuning results? Thanks.
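The pretraining format in that question is simply one JSON object per line with a single "text" key. A sketch (the second document is invented to show multiple lines):

```python
import json

# Raw documents go one-per-line under a "text" key, nothing else.
docs = [
    "Text contained in document n°1",
    "Text contained in document n°2",
]

jsonl_lines = [json.dumps({"text": d}, ensure_ascii=False) for d in docs]
for line in jsonl_lines:
    print(line)
```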
@godataprof · 6 months ago
Can you do a video on fine-tuning for function calling?
@vivekjainmaiet · 4 months ago
Most people do not have a dataset but do have unstructured data. Could you make a video on training a base model and then converting it into a chat model?
@farazfitness · 6 months ago
I did everything, but I'm unable to understand the last step: how do I push the model and run it locally with a ChatGPT-like interface? Can you do a video on how to take a model from Google Colab into GPT4All?
@tcwang5368 · 18 days ago
I have the same question!
@neretharanthanael9130 · 2 days ago
@@tcwang5368 There are already tutorials for that. Search for something like "how to use your own model with Ollama" and you'll find nice videos. I believe I watched the one from Matt Williams.
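For readers following that pointer: once the fine-tuned model is exported to GGUF, Ollama only needs a small Modelfile. The file name and parameter below are illustrative assumptions, not something covered in this thread:

```
# Modelfile (illustrative) — point Ollama at a local GGUF export
FROM ./my-finetuned-mistral.gguf
PARAMETER temperature 0.7
```

Then `ollama create my-model -f Modelfile` registers it, and `ollama run my-model` gives you a local chat loop.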
@BoHorror · 5 months ago
If I just wanted the model to speak in a certain way, and I have a PDF full of examples, what would I need to do?
@engineerprompt · 5 months ago
If it's just the tone, you could potentially use few-shot prompting to get it working.
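The few-shot approach suggested here amounts to seeding the chat with a couple of example exchanges in the target voice before the real question. A sketch with an invented persona and examples:

```python
def build_tone_prompt(question, examples, persona="a cheerful pirate"):
    """Assemble a chat-style message list with few-shot tone examples."""
    messages = [{"role": "system",
                 "content": f"You always answer in the voice of {persona}."}]
    for user_text, styled_reply in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": styled_reply})
    messages.append({"role": "user", "content": question})
    return messages

examples = [
    ("How is the weather?", "Arr, the skies be clear an' the wind be kind!"),
    ("What time is it?", "By me compass, 'tis high noon, matey!"),
]
msgs = build_tone_prompt("Where can I find good coffee?", examples)
print(len(msgs))  # → 6 (1 system + 2×2 examples + 1 question)
```

Examples extracted from a PDF of the target style could be dropped straight into the `examples` list.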
@BoHorror · 5 months ago
@@engineerprompt So just a simple example: the input would be "Speak like Jolly Roger" and the output would be Jolly Roger speaking?
@宋柏勳-u6p · 5 months ago
How do I convert the model to GGUF after fine-tuning?
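A common route for this (stated as an assumption on my part, since the thread doesn't answer it) is llama.cpp's conversion script, run against the merged HF-format model. Script and flag names follow llama.cpp and can change between versions:

```shell
# Sketch: convert a merged HuggingFace-format model to GGUF with llama.cpp.
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

# If you trained LoRA adapters, merge them into the base weights first,
# then convert the merged folder:
python llama.cpp/convert_hf_to_gguf.py ./merged-model \
    --outfile my-model.gguf --outtype q8_0
```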
@azkarathore4355 · 5 months ago
Can we fine-tune Mistral for a machine translation task on a low-resource language?
@engineerprompt · 5 months ago
Yes, I think that can be done.
@azkarathore4355 · 5 months ago
@@engineerprompt I have some queries about it, can you guide me?
@kenchang3456 · 5 months ago
Great detail, thank you very much. I'm interested in Mistral fine-tuning using a JSONL dataset for NER. Do you have any videos on that topic, or is this video sufficient and all I'd really need to do is determine what the JSONL data format should be?
@engineerprompt · 5 months ago
I don't have a specific video on that topic, but I think this will be a good start. My recommendation would be to try few-shot prompting first before you start thinking about fine-tuning. There is a LOT you can achieve with few-shot examples. Fine-tuning should be the last resort!
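For the NER case specifically, the few-shot-first advice translates to showing the model a few annotated examples before the real input. A sketch with an invented label set and sentences:

```python
import json

# Few-shot NER examples: (sentence, entities) pairs in the target schema.
few_shot_examples = [
    ("Tim Cook visited Berlin in June.",
     {"PERSON": ["Tim Cook"], "LOCATION": ["Berlin"], "DATE": ["June"]}),
    ("Acme Corp opened an office in Tokyo.",
     {"ORGANIZATION": ["Acme Corp"], "LOCATION": ["Tokyo"]}),
]

def build_ner_prompt(sentence):
    """Format few-shot pairs plus the target sentence as one prompt."""
    parts = ["Extract named entities as JSON."]
    for text, entities in few_shot_examples:
        parts.append(f"Sentence: {text}\nEntities: {json.dumps(entities)}")
    parts.append(f"Sentence: {sentence}\nEntities:")
    return "\n\n".join(parts)

prompt = build_ner_prompt("Marie Curie worked in Paris.")
print(prompt)
```

If few-shot results fall short, the same (sentence, entities) pairs convert directly into instruct-format JSONL for fine-tuning.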
@kenchang3456 · 5 months ago
@@engineerprompt Thank you for responding, I appreciate it.