Fine-tuning Llama 3 for Text Classification of Stock Sentiment using QLoRA

1,695 views

Trade Mamba

A day ago

Fine-tuning Llama 3 for text classification (sentiment analysis of financial data) using Hugging Face Transformers with QLoRA, PEFT, and LoRA.
This video follows • LLaMA 3 LLM & Hugging ...
Code:
github.com/adidror005/youtube...
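
For orientation, a minimal sketch of the kind of QLoRA setup the video walks through (illustrative only: the checkpoint name, label count, and LoRA hyperparameters are assumptions, not values from the notebook):

```python
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model in 4-bit (the "Q" in QLoRA) with a classification head.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",          # assumed checkpoint
    num_labels=3,                          # e.g. negative / neutral / positive
    quantization_config=bnb_config,
)

# Attach small trainable LoRA adapters; the frozen 4-bit weights stay fixed.
lora_config = LoraConfig(
    task_type="SEQ_CLS",
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # illustrative choice of projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # only a small fraction is trainable
```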

Comments: 39
@MLAlgoTrader 16 days ago
Code here, including a short explanation of how to get the dataset: github.com/adidror005/youtube-videos/blob/main/LLAMA_3_Fine_Tuning_for_Sequence_Classification_Actual_Video.ipynb
@andrewbritt3117 15 days ago
Hello, thanks for the really informative walkthrough. I was looking to go back through your notebook for further review; however, the notebook is no longer available from the link.
@MLAlgoTrader 15 days ago
@@andrewbritt3117 github.com/adidror005/youtube-videos/blob/main/LLAMA_3_Fine_Tuning_for_Sequence_Classification_Actual_Video.ipynb
@andrewbritt3117 15 days ago
@@MLAlgoTrader thanks!
@ranxu9473 11 days ago
Thanks dude, that's very useful for me.
@MLAlgoTrader 10 days ago
@ranxu9473 thank you
@user-be8om3ov2k 19 days ago
Great video mate! Loved it.
@MLAlgoTrader 19 days ago
Glad you enjoyed it.
@am7-p 19 days ago
Once again, thank you for the informative channel and sharing this video.
@MLAlgoTrader 19 days ago
Thanks! I thought you guys on average didn't like LLM videos lol. My click-through rate is low, so it makes me happy you say that.
@am7-p 19 days ago
@@MLAlgoTrader What is click-through rate?
@am7-p 19 days ago
@@MLAlgoTrader Also, please consider that knowing what you are working on helps me plan the next steps of my development. Currently, I use and pay for the OpenAI API, but I do plan to implement a Llama in my home lab. Once I start to learn and practice Llama, I will go through your videos again.
@MLAlgoTrader 19 days ago
It's the percentage of viewers who click on the video after seeing the thumbnail. This one was small.
@MLAlgoTrader 19 days ago
Honestly, it is completely random. My next videos are on sequential bootstrap, implementing a gap trading strategy both with stocks and with options, the dangers of backtesting, and then I also plan to do ib_insync for beginners. ...I think Llama 3 (8B params) works on the free version of Colab for a bit until you get kicked off the GPU. There is also this API I used; I think you get quite a bit for free at first: docs.llama-api.com/quickstart
@amitocamitoc2294 17 days ago
Interesting!
@MLAlgoTrader 17 days ago
Glad you think so!
@salmakhaled-hn6gw 18 days ago
Thank you so much, it is very informative. Could I ask when you will provide the notebook you worked on?
@MLAlgoTrader 17 days ago
Yes, the delay is because I need a notebook to explain how to get the data.
@MLAlgoTrader 17 days ago
So I was literally about to share the video, but I had a bug and needed to restart. I must wait 24 hours due to an API limit, so I'll send it 25 hours from now lol!
@MLAlgoTrader 16 days ago
Code: github.com/adidror005/youtube-videos/blob/main/LLAMA_3_Fine_Tuning_for_Sequence_Classification_Actual_Video.ipynb
@salmakhaled-hn6gw 16 days ago
@@MLAlgoTrader Thank you so much🙏
@MLAlgoTrader 16 days ago
@@salmakhaled-hn6gw No problem. There are a few more things I left out; hopefully we can cover them in another video, like loading the model and merging it with the QLoRA weights. Does the part about getting the data make sense? You need that to run the notebook!
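
For reference, a minimal sketch of that load-and-merge step, assuming the adapter was saved to a hypothetical "adapter_path" directory; a common approach is to reload the base model in half precision (rather than 4-bit) before folding the LoRA weights in:

```python
import torch
from transformers import AutoModelForSequenceClassification
from peft import PeftModel

# Reload the base model in fp16 (not 4-bit) so the LoRA deltas can be merged.
base = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",   # assumed base checkpoint
    num_labels=3,
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "adapter_path")  # hypothetical adapter dir
model = model.merge_and_unload()   # folds the LoRA weights into the base layers
model.save_pretrained("merged-llama3-sentiment")
```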
@khachapuri_ 15 days ago
Is there a way to remove the attention mask from Llama-3 to turn it into a giant BERT (encoder-only transformer)?
@MLAlgoTrader 15 days ago
Being on 0 sleep I'll quote ChatGPT and get back to answering you later lol... Turning Llama-3 into an encoder-only transformer like BERT by removing the attention mask is theoretically possible but involves more than just altering the attention mechanism. Here are the steps and considerations for this transformation:
1. Modify the attention mechanism: In Llama-3, which is presumably an autoregressive transformer like GPT-3, each token can only attend to previous tokens. To make it behave like BERT, you need to allow each token to attend to all other tokens in the sequence. This involves changing the attention mask settings in the transformer's layers.
2. Change the training objective: BERT uses a masked language model (MLM) objective, where some percentage of the input tokens are masked and the model predicts those masked tokens. You would need to implement this training objective for the modified Llama-3.
3. Adjust the tokenizer and inputs: BERT is trained with pairs of sentences as inputs (for tasks like next sentence prediction) and uses special tokens (like [CLS] and [SEP]) to distinguish between sentences. You would need to adapt the tokenizer and data preprocessing steps to accommodate these requirements.
4. Retrain the model: Even after these modifications, the model would need to be retrained from scratch or fine-tuned extensively on a suitable dataset, because the pre-existing weights were optimized for a different architecture and objective.
5. Software and implementation: You need to ensure that the transformer library you're using supports these customizations. Libraries like Hugging Face Transformers are quite flexible and might be useful for this purpose.
This transformation essentially creates a new model, leveraging the architecture of Llama-3 but fundamentally changing its operation and purpose. Such a project would be substantial and complex, but interesting from a research and development perspective.
@khachapuri_ 15 days ago
@@MLAlgoTrader Thank you so much, appreciate the response! Since it's a classification task, it makes sense to remove the mask (make it encoder-only) and retrain the model on another objective function. I was just wondering, technically, how would you remove the mask from Llama-3? And maybe also add a feedforward layer? Is it possible to edit the architecture like that?
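
On the feedforward-layer half of that question: yes, the architecture can be edited. Below is a hedged sketch (an illustration, not code from the video) that wraps the bare Llama decoder stack with a custom feed-forward classification head. Removing the causal mask itself is more invasive, since recent transformers versions build it inside LlamaModel's forward pass, so a truly bidirectional Llama would mean patching that logic and retraining, as the quoted answer above notes:

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class LlamaWithFFNHead(nn.Module):
    """Llama decoder stack (no LM head) plus a small feed-forward classifier."""

    def __init__(self, model_name: str, num_labels: int = 2):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(model_name)  # bare LlamaModel
        hidden = self.backbone.config.hidden_size
        # Custom feed-forward head on top of the last hidden states.
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden // 2),
            nn.GELU(),
            nn.Linear(hidden // 2, num_labels),
        )

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        # The causal mask is still in place, so pool the last non-padded token,
        # which has attended to the whole sequence (assumes right padding).
        last_idx = attention_mask.sum(dim=1) - 1
        batch = torch.arange(input_ids.size(0), device=input_ids.device)
        pooled = out.last_hidden_state[batch, last_idx]
        return self.head(pooled)
```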
@dariyanagashi8958 13 days ago
Hello! Thank you so much for your tutorial, it is very helpful and easy to follow. I started applying it on my custom binary dataset, but stumbled on the training step. I get an error on this line of code: labels = inputs.pop("labels").long() -> KeyError: 'labels'. My inputs look like this: ['input_ids', 'attention_mask'], and I don't understand which "labels" you are referring to in that line. If it is not difficult for you, could you explain what it means? I would be most grateful! UPD: I renamed the columns of my dataset to "text" and "labels", and that solved the issue! 😀
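
For anyone hitting the same KeyError, a minimal sketch of that rename fix, assuming hypothetical original column names "sentence" and "class". The training code pops a "labels" key from the batch, so the dataset's label column has to be named exactly "labels" (and the raw input column "text" for the tokenization step):

```python
from datasets import load_dataset

# Hypothetical file and original column names, for illustration only.
ds = load_dataset("csv", data_files="my_binary_dataset.csv")
ds = ds.rename_column("sentence", "text")   # raw input column -> "text"
ds = ds.rename_column("class", "labels")    # label column -> "labels"
print(ds["train"].column_names)             # ['text', 'labels']
```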
@MLAlgoTrader 13 days ago
I will get back to you
@MLAlgoTrader 9 days ago
Hey, sorry I haven't gotten to this. I haven't forgotten; I will look sometime this week, I'm just overwhelmed.
@dariyanagashi8958 9 days ago
@@MLAlgoTrader Hi! I actually updated my comment: I found a workaround for that issue, although I still only vaguely understand how it helped. I need to read more documentation, I guess. Anyway, thank you for your tutorial; it helped me with my thesis 😊
@MLAlgoTrader 9 days ago
Wow very happy to hear!!!
@MLAlgoTrader 9 days ago
Your comment made my day. I'll do more videos related to NLP/LLMs/RAG/etc. soon, I hope.
@aibutsimple 17 days ago
Please provide the notebook, sir.
@MLAlgoTrader 17 days ago
It will be available later today. It is just useless if you can't get the data, and I can't get the data until this evening.
@MLAlgoTrader 16 days ago
Code: github.com/adidror005/youtube-videos/blob/main/LLAMA_3_Fine_Tuning_for_Sequence_Classification_Actual_Video.ipynb