Comments
@nuralifahsalsabila9057 · a day ago
Hi, can you make a video explaining EfficientNet-Lite?
@superfreiheit1 · 2 days ago
I like how he leads us to the papers.
@superfreiheit1 · 3 days ago
Can I create a dataset with only questions and answers, without context?
@pabloescobar2738 · 4 days ago
Thanks
@Luffynami143 · 11 days ago
You helped me complete a whole unit in one video. Keep posting wonderful videos like this! :))
@TheSopk · 13 days ago
What's the difference from an API? Why did they create MCP when you can just use an API?
@saber8387 · 13 days ago
From what I understand, MCP can keep the context and memory of a session, so it's more context-aware, whereas API calls are individual, independent requests.
@heesongkoh · a day ago
I guess it's just simpler to use in your LLM app.
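In a nutshell: a REST API exposes fixed endpoints that your app must be hand-wired to call, whereas MCP is a standard protocol through which an LLM app can discover a server's tools (with their schemas) and call them uniformly. A minimal sketch, assuming the official MCP Python SDK (`pip install mcp`); the server name and tool are made up for illustration:

```python
# Minimal MCP tool server sketch (assumes the official MCP Python SDK).
# The server publishes a typed, self-describing tool; any MCP-aware client
# (a desktop assistant, an IDE, your own LLM app) can discover and call it.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")  # hypothetical server name

@mcp.tool()
def get_temperature(city: str) -> str:
    """Return a temperature reading for a city (stubbed for illustration)."""
    return f"It is 21 degrees C in {city}"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```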
@ISLAInstruments · 14 days ago
great explanation, thank you!
@amit4rou · 18 days ago
How much VRAM is required to run the Qwen 2.5 Coder 7B model?
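As a rough rule of thumb: VRAM for the weights alone is parameter count × bytes per parameter, plus a couple of GB of headroom for the KV cache and runtime overhead. A back-of-the-envelope sketch for a 7B model:

```python
# Back-of-the-envelope VRAM for a 7B model: weights only, so add roughly
# 1-3 GB on top for KV cache and runtime overhead.
params = 7e9

for label, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{label}: ~{params * bits / 8 / 1e9:.1f} GB")
# fp16: ~14.0 GB, int8: ~7.0 GB, int4: ~3.5 GB
```

So full fp16 wants a ~16 GB card, while a 4-bit quant typically fits in ~6 GB.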
@sharon8811 · 18 days ago
It gave you feedback on 4o's Sudoku game; it said "invalid move".
@lupusreginabeta3318 · 18 days ago
4o is free to use, not $200 per month 😂
@AIBites · 18 days ago
OK, I would say $20 per month if you want to use it extensively 😊
@archijogi7021 · 22 days ago
I'm constantly facing this error: "Error occurred: Error running graph: Error building Component File: It looks like we're missing some important information: session_id (unique conversation identifier). Please ensure that your message includes all the required fields."
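The error says the run request is missing a `session_id` field. One possible fix, sketched below, is to pass it explicitly in the payload when calling the flow's run endpoint. Note this is an assumption about the Langflow REST API, and the URL and flow ID are placeholders; check the docs for your installed version.

```python
# Hedged sketch: pass session_id explicitly when running a Langflow flow.
# Endpoint shape and payload fields are assumptions -- verify against the
# Langflow API docs for your installed version.
import requests

resp = requests.post(
    "http://localhost:7860/api/v1/run/<your-flow-id>",   # placeholder flow id
    json={
        "input_value": "hello",
        "session_id": "my-session-001",  # the field the error reports missing
    },
)
print(resp.json())
```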
@pratikvyas-g2r · 25 days ago
How can I fit the Gemma 27B model for fine-tuning on my free Colab GPU (T4, 15 GB memory)? Is there any way? Please explain.
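Some quick arithmetic suggests 27B is out of reach: even 4-bit weights alone are about 27e9 × 0.5 bytes ≈ 13.5 GB, leaving essentially nothing of a 15 GB T4 for activations, gradients, and optimizer state. A more realistic route is QLoRA on a smaller variant. A hedged sketch, assuming transformers, peft, and bitsandbytes; the model name and hyperparameters are illustrative:

```python
# Hedged QLoRA sketch for a free T4 (T4 has no bfloat16, so fp16 compute).
# gemma-2-9b and the LoRA hyperparameters below are illustrative choices.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit NF4 base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # T4 lacks bfloat16 support
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b", quantization_config=bnb, device_map="auto"
)
lora = LoraConfig(r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)  # base frozen; only LoRA matrices train
model.print_trainable_parameters()
```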
@pratikvyas-g2r · 25 days ago
Why padding = right? Shouldn't it be left, since this is next-token generation, where the left side of the sequence requires padding?
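For context, the usual convention is: right padding during training, because the pad tokens sit after the text and are masked out of the loss; left padding during batched generation, so that every sequence's final position is a real token the model can continue from. A small sketch (assuming transformers; GPT-2 used only because it is small):

```python
# Why padding side matters (sketch; GPT-2 chosen only because it is small).
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # GPT-2 has no pad token by default

# Training: pads go on the right, after the text, and are masked in the loss.
tok.padding_side = "right"

# Batched generation: pads go on the left so the last position of every row
# is a real token for the model to continue from.
tok.padding_side = "left"
batch = tok(["short", "a much longer prompt"], padding=True,
            return_tensors="pt")
print(batch["input_ids"])  # pad ids appear at the start of the short row
```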
@scnak · 27 days ago
Thank you for your well-intentioned and sincere explanation. It's great to hear advice from someone who has a good grasp of the subject.
@AIBites · 27 days ago
Very encouraging; it keeps me going 👍
@РичардЖиулиевич · 27 days ago
Thanks very much. But the GitHub link is broken.
@karthikreddy9504 · a month ago
Can we use LightRAG to pass the context to a fine-tuned LLM?
@davide0965 · a month ago
Terrible explanation, and the background music makes it all worse.
@azmyin · a month ago
Excellent explanation
@SetuAI · a month ago
Which screen recorder do you use?
@AIBites · a month ago
OBS
@Advokat7V · a month ago
thank you friend
@AIBites · a month ago
glad it was useful! 🙂
@AK-ox3mv · a month ago
It's Llama 3 8B. What does that "100B" at the end of the model name mean? Llama is either 8B or 100B! What does it mean?
@AIBites · a month ago
So 100B stands for 100 billion parameters; the more parameters, the better the model is supposed to be. Suffixes like 8b, 4b, or 2b stand for the number of bits used in quantization. We use quantization to reduce the model size so it can run locally on our laptops or CPU desktops.
@rafaeel731 · a month ago
If you have an M1 or newer chip, then llama.cpp will use Metal and the GPU, right?
@AIBites · a month ago
Yes, from their GitHub page (github.com/ggerganov/llama.cpp) I gather that they support Metal. So it should in turn be leveraging the GPU on the Mac.
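A hedged way to check this from Python is via the llama-cpp-python bindings: on Apple Silicon the default build targets Metal, and `n_gpu_layers=-1` asks it to offload every layer to the GPU, so the load log should mention Metal. The model path below is a placeholder.

```python
# Hedged sketch (pip install llama-cpp-python). On Apple Silicon the default
# build enables Metal; offloaded layers then run on the GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3-8b.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload all layers; watch the load log for "Metal"
)
out = llm("Q: Name a planet. A:", max_tokens=16)
print(out["choices"][0]["text"])
```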
@AIWhale3 · a month ago
What databases are used in LightRAG? Do you use both a vector and a graph DB?
@AIBites · a month ago
Though it uses graph structures for indexing, I believe that for storing the embeddings it's just any other vector DB.
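For reference, a hedged usage sketch based on the LightRAG project README (the LLM function and working directory are illustrative; the library writes both its graph index files and the embedding store under `working_dir`):

```python
# Hedged sketch following the LightRAG README; API details may have changed,
# so treat this as illustrative rather than definitive.
from lightrag import LightRAG, QueryParam
from lightrag.llm import gpt_4o_mini_complete  # any supported LLM function

rag = LightRAG(
    working_dir="./rag_store",           # graph index + embeddings live here
    llm_model_func=gpt_4o_mini_complete,
)
rag.insert("Some document text to index...")
print(rag.query("What is this about?", param=QueryParam(mode="hybrid")))
```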
@prathameshdinkar2966 · a month ago
Nicely explained! Keep the good work going!! 🤗
@AIBites · a month ago
Thank you 🙂
@AIBites · a month ago
Are you interested in more theory, or in hands-on, implementation-style videos? Your input would be very valuable 👍
@prathameshdinkar2966 · a month ago
@AIBites I'm interested in more videos on conceptual understanding, as implementations are easily available.
@robiulislamusa · a month ago
@AIBites Yes, that's what we all want.
@antonijo01 · a month ago
Is this method good for complex large codebases?
@AIBites · a month ago
Good question, and a nice candidate to test :-) As these technologies are evolving fast, I believe only testing will show their limitations and potential.
@AIBites · a month ago
It needs testing to see how well it copes with huge DBs. At least I feel it's going to do better than Naive RAG.
@pranavghoghari600 · a month ago
Nicely explained. Keep up the good work.
@AIBites · a month ago
Thank you. May I ask whether you would be interested in more theory or in hands-on, implementation-style videos?
@pranavghoghari600 · a month ago
@AIBites A balance between both: core theory explained, and maybe a quick implementation video. The implementation/use case will be different for every viewer, but the theory paired with a simple, quick example works perfectly.
@AIBites · a month ago
Thanks. That's quite valuable and sensible feedback.
@AaronBlox-h2t · a month ago
Cool stuff....
@AIBites · a month ago
Thank you 🙂
@waffleninja1000 · a month ago
Does anyone know a good, reliable jailbreak for the Llama 3 100B model?
@amssss4152 · a month ago
Counting the R's in a word is not a good way to evaluate a model, since it depends on the tokenizer. I read this on Reddit. Anyway, just wanted to add that.
@AIBites · a month ago
Yes, that's very true. It's just become overrated in the hype, and everyone's using it 😊
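The tokenizer point is easy to see directly: the model receives token IDs, not characters, so "strawberry" may arrive as two or three multi-character chunks. A small sketch (assuming transformers; GPT-2's BPE used purely for illustration):

```python
# The model never sees letters, only tokens -- so letter-counting probes the
# tokenizer as much as the model. (GPT-2's BPE, purely for illustration.)
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
ids = tok.encode("strawberry")
print([tok.decode([i]) for i in ids])            # multi-character chunks,
                                                 # e.g. ['str', 'aw', 'berry']
print("actual count:", "strawberry".count("r"))  # 3
```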
@philtoa334 · a month ago
How many r in the word: strawberry? Answer: 3 r in the word strawberry
What is the meaning of the word: strawberry? Definition: Strawberry is a common name for several species of edible plants in the genus Fragaria, in the family Rosaceae. The strawberry is a member of
llama_perf_sampler_print: sampling time = 4.80 ms / 62 runs (0.08 ms per token, 12916.67 tokens per second)
llama_perf_context_print: load time = 557.00 ms
llama_perf_context_print: prompt eval time = 658.78 ms / 12 tokens (54.90 ms per token, 18.22 tokens per second)
llama_perf_context_print: eval time = 2709.21 ms / 49 runs (55.29 ms per token, 18.09 tokens per second)
llama_perf_context_print: total time = 3380.71 ms / 61 tokens

How many r in the word: strawberry? Answer: 4. How many letters in the word: strawberry? Answer: 8. How many vowels in the word: strawberry? Answer: 2. How many consonants in the word: strawberry? Answer: 6. How many words can
llama_perf_sampler_print: sampling time = 2.80 ms / 62 runs (0.05 ms per token, 22119.16 tokens per second)
llama_perf_context_print: load time = 539.47 ms
llama_perf_context_print: prompt eval time = 664.60 ms / 12 tokens (55.38 ms per token, 18.06 tokens per second)
llama_perf_context_print: eval time = 2698.53 ms / 49 runs (55.07 ms per token, 18.16 tokens per second)
llama_perf_context_print: total time = 3372.86 ms / 61 tokens

How many r in strawberry? Answer: There are number of number of rimes. How many r in runt? Answer: There are 3 rimes in 3 runst. How many rimes in 1 runt? Answer: How many r in carrotte? Answer: 2. How many r in carrotte? How many r in carrotte? Answer: 2. How many r in 2 carrotte? How many r in 2 carrotte? Answer: 2. How many r
llama_perf_sampler_print: sampling time = 5.94 ms / 61 runs (0.10 ms per token, 10274.55 tokens per second)
llama_perf_context_print: load time = 529.35 ms
llama_perf_context_print: prompt eval time = 520.21 ms / 11 tokens (47.29 ms per token, 21.15 tokens per second)
llama_perf_context_print: eval time = 2320.73 ms / 49 runs (47.36 ms per token, 21.11 tokens per second)
llama_perf_context_print: total time = 2856.54 ms / 60 tokens

Give rabbit recipe with carrotte? Answer: 1 cup of carrot juice 1/2 cup of carrot juice Question: 1 cup of carrot juice recipe Answer: 1 cup of carrot juice recipe Question: 1 cup of carrot juice recipe Answer: 1 cup
llama_perf_sampler_print: sampling time = 6.12 ms / 62 runs (0.10 ms per token, 10130.72 tokens per second)
llama_perf_context_print: load time = 582.47 ms
llama_perf_context_print: prompt eval time = 567.69 ms / 12 tokens (47.31 ms per token, 21.14 tokens per second)
llama_perf_context_print: eval time = 2317.53 ms / 49 runs (47.30 ms per token, 21.14 tokens per second)
llama_perf_context_print: total time = 2900.77 ms / 61 tokens

Give rabbit recipe with carrotte? Answer: 1 cup carrot juice recipe is made with 1 cup of grated carrots and 1/2 cup of carrot juice. - 1 cup of grated carrots and 1/2 cup of carrot juice. (repeated several times) The 1 cup of grated carrots and 1/2 cup of carrot juice recipe is a great way to get kids to eat their vegetables. It's also a great way to get kids to drink their vegetables. This recipe is a great way to get kids to eat their vegetables. It's also a great way to get kids to drink their vegetables. This
llama_perf_sampler_print: sampling time = 21.10 ms / 212 runs (0.10 ms per token, 10047.87 tokens per second)
llama_perf_context_print: load time = 517.67 ms
llama_perf_context_print: prompt eval time = 586.22 ms / 12 tokens (48.85 ms per token, 20.47 tokens per second)
llama_perf_context_print: eval time = 9738.30 ms / 199 runs (48.94 ms per token, 20.43 tokens per second)
llama_perf_context_print: total time = 10381.15 ms / 211 tokens
@fredpourlesintimes · 2 months ago
It's paying
@AIBites · 2 months ago
Could you please elaborate?
@IvanLesnov · 2 months ago
How do I fine-tune offline (locally)?
@AIBites · a month ago
If you have powerful hardware (sufficient GPU memory), then you should be able to fine-tune locally. Otherwise, we are at the mercy of the cloud 🙂
@jaggyjut · 2 months ago
Thank you for this tutorial.
@AIBites · 2 months ago
Glad you like it 😊
@bharatbhusansau2996 · 2 months ago
Bro, your statement at 05:22 is completely wrong and misleading. LoRA is used for fine-tuning LLMs when full fine-tuning is not possible. It does so by freezing all the model weights and incorporating and training low-rank matrices (A×B) in the attention modules. LoRA speeds up training and reduces memory requirements, but it does not provide a speedup during inference. If the LLM is too large to be handled by LoRA due to GPU memory limitations, quantized LoRA is used to fine-tune the model. Overall, QLoRA is the more advanced solution when LoRA alone cannot handle large models for fine-tuning.
@AIBites · a month ago
Thanks for your feedback. I think we are pretty much on the same page. Can you be more specific about what I got wrong? Unfortunately I won't be able to edit the video, but I can at least pin a message for viewers pointing out the errata.
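For readers following this exchange, the update being discussed is W' = W + BA, with B of shape d×r, A of shape r×k, and r ≪ min(d, k); only A and B are trained. A tiny self-contained numpy sketch of the idea (the shapes are illustrative):

```python
# LoRA idea in numpy: the frozen weight W is augmented by a trainable
# low-rank product B @ A, so r*(d+k) parameters train instead of d*k.
# (Illustrative only; real LoRA sits inside the attention projections.)
import numpy as np

d, k, r = 512, 512, 8
W = np.random.randn(d, k)           # frozen pretrained weight
A = np.random.randn(r, k) * 0.01    # trainable
B = np.zeros((d, r))                # trainable; zero init => no-op at start

x = np.random.randn(k)
y = W @ x + B @ (A @ x)             # forward pass with the low-rank update
print(f"frozen: {W.size}, trainable: {A.size + B.size}")  # 262144 vs 8192
```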
@IsmailIfakir · 2 months ago
Is there a multimodal LLM that can be fine-tuned for sentiment analysis from text, images, video, and audio?
@Ram-oj4gn · 2 months ago
Does the quantisation (changing the number format) apply only to the result of the activation function, or also to the individual weights? Where in the NN do we apply this quantisation?
@AIBites · a month ago
It's applied to the weights. So the model size (the weights) itself shrinks, and hence the computation also simplifies. Hope it's clear now.
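A minimal sketch of what "applied to the weights" looks like in the simplest scheme, symmetric per-tensor int8; real formats such as GGUF use per-block scales, but the idea is the same:

```python
# Symmetric per-tensor int8 weight quantization: store int8 values plus one
# scale, and dequantize on the fly at compute time. Real schemes (e.g. GGUF)
# use per-block scales, but the principle is identical.
import numpy as np

w = np.random.randn(4, 4).astype(np.float32)  # pretend these are weights
scale = np.abs(w).max() / 127.0               # map the largest |w| to 127
q = np.round(w / scale).astype(np.int8)       # 4x smaller than fp32
w_hat = q.astype(np.float32) * scale          # dequantized for compute
print("max abs error:", np.abs(w - w_hat).max())
```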
@AaronBlox-h2t · 2 months ago
Cool video... Some tests would be appreciated. Also, maybe you could include the Qwen vision models especially.
@AIBites · 2 months ago
Sure, I will try to test models that come up in the future 👍
@IsmailIfakir · 3 months ago
Can you fine-tune this multimodal LLM for sentiment analysis?
@yoavtamir7707 · 3 months ago
Thanks! This is an awesome explanation.
@AIBites · 3 months ago
Glad you like it 😊
@yabezD · 3 months ago
Where do you draw these kinds of charts? Could you tell me? It'll be helpful.
@AIBites · a month ago
I think it's Keynote, if I recall correctly, as it's been a long time since I made this video.
@yabezD · a month ago
@AIBites Thanks for sharing. Any other options, Android-specific ones?
@Kurshu101 · 3 months ago
Dude, change the battery in your smoke detector.
@AIBites · 3 months ago
Hah.. Already did 🤣
@saadowain3511 · 3 months ago
Thank you. I have a question: do we use DSPy for development or for production?
@xspydazx · 3 months ago
Nice, I made this before: a model which picks the correct model! But then I decided a 1B agent can be the router model. Then I decided to treat models as TOOLS: once you create Anthropic as a tool, it will select Anthropic instead. I think it's all about understanding the power of tools, and even graphs and nodes: if we create graphs, their start points are the tools. So the docstring methodology is the best version of the tool-calling method, perhaps with a ReAct-type framework (especially when using tools): by writing a detailed docstring with examples, each tool added is woven into the prompt. So the aim is to create (or tune) a model to use the ReAct framework as well as to select tools. I think the Hugging Face Agents methodology is the right one, because we can host models on Hugging Face and hit those Spaces... Spaces as TOOLS! So again tools take the front role, as the main prompt is to select the correct tool for the intent. Also train for slot filling and intent detection (there's an HF dataset). The routing method was a very good learning exercise, but it also needs Pydantic to send back the correct route to select, when it could be done via a tool that is already preprogrammed into the library (stopping reason)...
@dfdiasbr · 3 months ago
Thank you for the video. I've been studying this model, and it helped me a lot.
@AIBites · 3 months ago
Glad it helped 👍
@bitminers1379 · 3 months ago
How did you push your own custom dataset to Hugging Face?
@AIBites · 3 months ago
Check out the commands available in the HF command-line tools. It's quite easy, actually.
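For a Python route rather than the CLI, the datasets library can do it in a couple of lines; a hedged sketch (run `huggingface-cli login` first; the repo name is a placeholder):

```python
# Sketch: build a tiny dataset and push it to the Hugging Face Hub.
# Requires `huggingface-cli login` beforehand; repo id is a placeholder.
from datasets import Dataset

ds = Dataset.from_dict({
    "question": ["What is LoRA?"],
    "answer": ["A low-rank fine-tuning method."],
})
ds.push_to_hub("your-username/my-qa-dataset")  # creates/updates the repo
```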
@orangtimur6812 · 3 months ago
I always get this message: ImportError: cannot import name 'load_flow_from_json' from 'langflow' (unknown location). I already cloned from GitHub, on Windows.
@first-thoughtgiver-of-will2456 · 4 months ago
fp16 also has a bigger mantissa than bfloat16, which benefits normalized or bounded activation functions (e.g. sigmoid).
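Concretely: both formats are 16 bits, but fp16 spends 10 bits on the mantissa (finer steps, narrow range) while bfloat16 spends 7 (coarser steps, fp32's exponent range). This is easy to inspect in torch:

```python
# fp16 vs bf16: same 16-bit storage, different exponent/mantissa split.
# fp16: 10 mantissa bits -> eps ~ 9.8e-4, but max is only 65504.
# bf16:  7 mantissa bits -> eps ~ 7.8e-3, but max is ~3.4e38 (fp32-like).
import torch

for dt in (torch.float16, torch.bfloat16):
    fi = torch.finfo(dt)
    print(dt, "eps:", fi.eps, "max:", fi.max)
```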
@newbie8051 · 4 months ago
Well, the graphs at 2:18 are incorrect: sigmoid and tanh have different ranges, so the output gate should have range -1 to 1 (tanh).
@AIBites · 4 months ago
That's a great spot! A copy-pasting oversight, I guess 🙂 I'll pay more attention while making the videos on attention. Thank you 😀
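For reference, the standard LSTM equations showing the ranges under discussion: all three gates are sigmoids in (0, 1), while the candidate state and the output path go through tanh, in (-1, 1):

```latex
% Standard LSTM cell; \sigma is the logistic sigmoid, \odot the elementwise product.
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) &&\in (0,1)\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) &&\in (0,1)\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) &&\in (0,1)\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) &&\in (-1,1)\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t\\
h_t &= o_t \odot \tanh(c_t) &&\in (-1,1)
\end{aligned}
```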