If you are interested in learning more about how to build robust RAG applications, check out this course: prompt-s-site.thinkific.com/courses/rag
@mike_nyc_212 · 6 months ago
In the provided training example, the first user "message" looks like it matches the "prompt". Btw, totally agree that my main frustration with fine-tuning toolkits has been how to format the training data (especially for multi-turn conversations)!
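For anyone puzzling over the same format question: mistral-finetune expects one JSON object per line, each holding a "messages" list of alternating user/assistant turns. A minimal sketch (the conversation content here is invented for illustration):

```python
import json

# One multi-turn training example in the instruct JSONL format:
# a "messages" list of alternating user/assistant roles.
example = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
        {"role": "user", "content": "And of Italy?"},
        {"role": "assistant", "content": "The capital of Italy is Rome."},
    ]
}

# Each example is serialized as a single line of a .jsonl file.
line = json.dumps(example)
print(line)
```

Multi-turn conversations just extend the same list, which is what makes this shape easier to work with than a flat prompt/completion pair.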
@unclecode · 6 months ago
Very interesting. I noticed that the library's only dependency is Torch; it doesn't involve HuggingFace. This means they built all these transformer classes themselves. I wonder about the motivation behind this. Although we can use HuggingFace for fine-tuning, it's intriguing that they optimized it for their own model. I'm also curious about setting the weight to 0. If it's applied per row, what happens if you set the weight to another number, like 0.5 or 1? Does it treat that as a multiplier?

Another idea: let's create a simple web interface for fine-tuning. What do you think? If you agree, I can create a repo and we can work on it together. We could build a simple interface where people can upload CSV files or other formats, use validation tools, prepare data, and then queue it for fine-tuning. I think it would be very useful for many people. Maybe you could add it to localGPT; it would be nice to see localGPT have a local fine-tuner too.
@engineerprompt · 6 months ago
I agree, it seems like they have a lot of custom code and optimizations. I think they might be releasing a lot more and have decided to create a dedicated implementation that can support their new models without waiting for HF transformers to be updated each time there is some new innovation. I suspect the weight is going to be just binary (I plan to look into their code a bit deeper when I get back from the break). Seems like this is part of their data filtering for samples that are not properly formatted.

I like your idea. I always wanted to integrate something like this into localGPT but never got around to it. One aspect which could actually be really useful is if someone could just upload raw text files and a model creates an instruct dataset out of them. I have had a few requests around this, and I think it might be very useful for people who want to fine-tune their own models but don't even know where to start with their datasets. Let's discuss this when I get back towards the end of the week.
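On the weight question being debated here: per the mistral-finetune README, an assistant message in the instruct format can carry an optional "weight" key set to 0 to exclude that turn from the loss. Whether fractional values like 0.5 act as multipliers is exactly the open question above. A sketch of what the field looks like and how a binary filter over it behaves (the example data is invented):

```python
import json

example = {
    "messages": [
        {"role": "user", "content": "Here is some context..."},
        # This assistant turn stays in the context but is not trained on.
        {"role": "assistant", "content": "Acknowledged.", "weight": 0},
        {"role": "user", "content": "Now answer the real question."},
        {"role": "assistant", "content": "The actual answer."},  # trained on
    ]
}

# Count assistant turns that contribute to the loss, treating a
# missing "weight" as 1 (the default).
trained = [
    m for m in example["messages"]
    if m["role"] == "assistant" and m.get("weight", 1) != 0
]
print(len(trained))  # → 1
```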
@unclecode · 6 months ago
@@engineerprompt True, they have just updated their library for Codestral. Try it, and please make some content on it. Regarding localGPT, looking forward to it, and I'll be happy to help.
@Stephan808 · 1 day ago
Sorry, can I fine-tune Mistral Large 2 or Pixtral too, or is this only for Mistral 7B?
@Stephan808 · 1 month ago
How do I create the dataset, by using an LLM? Is there a dialog-creator tool? Does it make sense to build it in a CSV and then convert it to JSONL? I can't imagine that anyone writes all the text manually...
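The CSV-then-convert route is straightforward to script. A sketch, assuming hypothetical "prompt" and "response" columns (in practice you would open a real file rather than the in-memory sample used here):

```python
import csv
import io
import json

# Hypothetical CSV contents; stand-in for a real file on disk.
csv_text = """prompt,response
What is 2+2?,4
Name a primary color.,Red
"""

def csv_to_jsonl(reader):
    """Turn each CSV row into one instruct-format JSONL line."""
    lines = []
    for row in csv.DictReader(reader):
        record = {
            "messages": [
                {"role": "user", "content": row["prompt"]},
                {"role": "assistant", "content": row["response"]},
            ]
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

jsonl = csv_to_jsonl(io.StringIO(csv_text))
print(jsonl)
```

Generating the rows themselves with an LLM and then validating/converting like this is a common pattern; nobody types thousands of examples by hand.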
@iainattwater1747 · 6 months ago
Nice video, thanks. Have you tried inferencing the Mistral fine-tune with TGI? If the tokenizer contains the chat template, then TGI/HF Chat UI should honor it. I'm going to try it.
@engineerprompt · 6 months ago
Not yet. I am looking into TGI and plan to create a tutorial on it.
@pawan3133 · 6 months ago
Thanks for another great video!! Can you please make a video, or at least share some material, on fine-tuning a quantized Mistral v0.3 model?
@engineerprompt · 6 months ago
In general, you want to load the model in 4-bit. Look at my fine-tuning videos using unsloth.
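For context, one common way to set up that 4-bit load outside of unsloth is HuggingFace transformers with bitsandbytes. This is a configuration sketch only, not something run here; the model id and dtype choices are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantized load (QLoRA-style).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # quantize weights to 4-bit on load
    bnb_4bit_quant_type="nf4",            # NormalFloat4, the usual QLoRA choice
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.3",          # illustrative model id
    quantization_config=bnb_config,
    device_map="auto",
)
```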
@spectre123 · 5 months ago
Thanks for this video. Can you make a video on the pretraining data format, where plain text is stored under the "text" key, e.g. {"text": "Text contained in document n°1"}? And how much text do we need for good fine-tuning results? Thanks.
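The pretraining format in that question is simply one JSON object per line with a single "text" key. A sketch (the second document is invented to show multiple lines):

```python
import json

# Raw documents go one-per-line under a "text" key, nothing else.
docs = [
    "Text contained in document n°1",
    "Text contained in document n°2",
]

jsonl_lines = [json.dumps({"text": d}, ensure_ascii=False) for d in docs]
for line in jsonl_lines:
    print(line)
```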
@godataprof · 6 months ago
Can you do a video on fine-tuning for function calling?
@vivekjainmaiet · 4 months ago
Most people do not have a dataset but do have unstructured data. Could you make a video on training a base model and then converting it into a chat model?
@farazfitness · 6 months ago
I did everything, but I'm unable to understand the last step: how do I push the model and run it locally with a ChatGPT-like interface? Can you do a video on how to take a model from Google Colab into GPT4All?
@tcwang5368 · 18 days ago
I have the same question!
@neretharanthanael9130 · 2 days ago
@@tcwang5368 There are already tutorials for that. Search for something like "how to use your own model with Ollama" and you'll find nice videos. I believe I watched the one from Matt Williams.
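For readers following that pointer: once the fine-tuned model is exported to GGUF, Ollama only needs a small Modelfile. The file name and parameter below are illustrative assumptions, not something covered in this thread:

```
# Modelfile (illustrative) — point Ollama at a local GGUF export
FROM ./my-finetuned-mistral.gguf
PARAMETER temperature 0.7
```

Then `ollama create my-model -f Modelfile` registers it, and `ollama run my-model` gives you a local chat loop.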
@BoHorror · 5 months ago
If I just wanted the model to speak in a certain way, and I have a PDF full of examples, what would I need to do?
@engineerprompt · 5 months ago
If it's just the tone, you could potentially use few-shot prompting to get it working.
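The few-shot approach suggested here amounts to seeding the chat with a couple of example exchanges in the target voice before the real question. A sketch with an invented persona and examples:

```python
def build_tone_prompt(question, examples, persona="a cheerful pirate"):
    """Assemble a chat-style message list with few-shot tone examples."""
    messages = [{"role": "system",
                 "content": f"You always answer in the voice of {persona}."}]
    for user_text, styled_reply in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": styled_reply})
    messages.append({"role": "user", "content": question})
    return messages

examples = [
    ("How is the weather?", "Arr, the skies be clear an' the wind be kind!"),
    ("What time is it?", "By me compass, 'tis high noon, matey!"),
]
msgs = build_tone_prompt("Where can I find good coffee?", examples)
print(len(msgs))  # → 6 (1 system + 2×2 examples + 1 question)
```

Examples extracted from a PDF of the target style could be dropped straight into the `examples` list.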
@BoHorror · 5 months ago
@@engineerprompt So just a simple example: the input would be "Speak like Jolly Roger" and the output would be Jolly Roger speaking?
@宋柏勳-u6p · 5 months ago
How do I convert the model to GGUF after fine-tuning?
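A common route for this (stated as an assumption on my part, since the thread doesn't answer it) is llama.cpp's conversion script, run against the merged HF-format model. Script and flag names follow llama.cpp and can change between versions:

```shell
# Sketch: convert a merged HuggingFace-format model to GGUF with llama.cpp.
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

# If you trained LoRA adapters, merge them into the base weights first,
# then convert the merged folder:
python llama.cpp/convert_hf_to_gguf.py ./merged-model \
    --outfile my-model.gguf --outtype q8_0
```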
@azkarathore4355 · 5 months ago
Can we fine-tune Mistral for a machine translation task on a low-resource language?
@engineerprompt · 5 months ago
Yes, I think that can be done.
@azkarathore4355 · 5 months ago
@@engineerprompt I have some queries about it, can you guide me?
@kenchang3456 · 5 months ago
Great detail, thank you very much. I'm interested in Mistral fine-tuning using a JSONL dataset for NER. Do you have any videos on that topic, or is this video sufficient and all I'd really need to do is determine what the JSONL data format should be?
@engineerprompt · 5 months ago
I don't have a specific video on that topic, but I think this will be a good start. My recommendation would be to try few-shot prompting first before you start thinking about fine-tuning. There is a LOT you can achieve with few-shot examples. Fine-tuning should be the last resort!
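For the NER case specifically, the few-shot-first advice translates to showing the model a few annotated examples before the real input. A sketch with an invented label set and sentences:

```python
import json

# Few-shot NER examples: (sentence, entities) pairs in the target schema.
few_shot_examples = [
    ("Tim Cook visited Berlin in June.",
     {"PERSON": ["Tim Cook"], "LOCATION": ["Berlin"], "DATE": ["June"]}),
    ("Acme Corp opened an office in Tokyo.",
     {"ORGANIZATION": ["Acme Corp"], "LOCATION": ["Tokyo"]}),
]

def build_ner_prompt(sentence):
    """Format few-shot pairs plus the target sentence as one prompt."""
    parts = ["Extract named entities as JSON."]
    for text, entities in few_shot_examples:
        parts.append(f"Sentence: {text}\nEntities: {json.dumps(entities)}")
    parts.append(f"Sentence: {sentence}\nEntities:")
    return "\n\n".join(parts)

prompt = build_ner_prompt("Marie Curie worked in Paris.")
print(prompt)
```

If few-shot results fall short, the same (sentence, entities) pairs convert directly into instruct-format JSONL for fine-tuning.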
@kenchang3456 · 5 months ago
@@engineerprompt Thank you for responding, I appreciate it.