Hi, can you make a video explaining EfficientNet-Lite?
@superfreiheit1 2 days ago
I like how he leads us to the papers.
@superfreiheit1 3 days ago
Can I create a dataset with only questions and answers, without context?
@pabloescobar2738 4 days ago
Thanks
@Luffynami143 11 days ago
You helped me complete a whole unit in one video. Keep posting wonderful videos like this :))
@TheSopk 13 days ago
What's the difference from an API? Why did they create MCP when you can just use an API?
@saber8387 13 days ago
From what I understand, MCP can keep the context and memory of the session, so it's more aware, whereas API calls are made individually.
@heesongkoh 1 day ago
I guess it's just simpler to use in your LLM app.
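To make the difference concrete, here is a minimal sketch of an MCP server using the FastMCP helper from the official MCP Python SDK (pip install mcp); the server name and the tool are made up for illustration:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather-demo")  # hypothetical server name

@mcp.tool()
def get_temperature(city: str) -> str:
    """Return a temperature reading for a city (stubbed for the example)."""
    return f"It is 21 degrees in {city}."  # a real server would call some API here

if __name__ == "__main__":
    mcp.run()  # serves the tool over MCP (stdio by default)

The point: you describe the tool once in a standard protocol, and any MCP-aware client (Claude Desktop, IDEs, agent frameworks) can discover and call it, instead of you hand-wiring each REST API into each app.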
@ISLAInstruments 14 days ago
great explanation, thank you!
@amit4rou 18 days ago
How much VRAM is required to run the Qwen 2.5 Coder 7B version?
@sharon8811 18 days ago
It gave you feedback on the 4o sudoku game; it said "invalid move".
@lupusreginabeta3318 18 days ago
4o is free to use, and it's not $200 per month 😂
@AIBites 18 days ago
OK, I'd say $20 per month if you want to use it extensively 😊
@archijogi7021 22 days ago
I'm constantly facing this error: "Error occurred: Error running graph: Error building Component File: It looks like we're missing some important information: session_id (unique conversation identifier). Please ensure that your message includes all the required fields."
@pratikvyas-g2r 25 days ago
How can I fit the Gemma 27B model for fine-tuning on my free Colab GPU (T4, 15 GB memory)? Is there any way? Please explain.
@pratikvyas-g2r 25 days ago
Why is padding = right? Shouldn't it be left, since this is next-token generation, where the left side of the sequence requires the padding?
@scnak 27 days ago
Thank you for your well-intentioned and sincere explanation. It's great to hear advice from someone who has a good grasp of the subject.
@AIBites 27 days ago
Very encouraging, it keeps me going 👍
@РичардЖиулиевич 27 days ago
Thanks very much, but the GitHub link is broken.
@karthikreddy9504 1 month ago
Can we use LightRAG to pass context to a fine-tuned LLM?
@davide0965 1 month ago
Terrible explanation, and the background music makes it all worse.
@azmyin 1 month ago
Excellent explanation
@SetuAI 1 month ago
Which screen recorder do you use?
@AIBites 1 month ago
OBS
@Advokat7V 1 month ago
Thank you, friend
@AIBites 1 month ago
Glad it was useful! 🙂
@AK-ox3mv 1 month ago
It's Llama 3 8B. What does the "100B" at the end of the model name mean? Llama is either 8B or 100B! What does it mean?
@AIBites 1 month ago
So 100B stands for billions of parameters; the more parameters, the better the model is supposed to be. 8b, 4b, or 2b stands for the number of bits used in quantization. We use quantization to reduce the model size so it can run locally on our laptops or CPU desktops.
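A quick back-of-the-envelope sketch of why the bit count matters (weights only, ignoring activations and the KV cache):

params = 8e9  # an 8B-parameter model
for bits in (16, 8, 4, 2):
    gb = params * bits / 8 / 1e9  # bits -> bytes -> gigabytes
    print(f"{bits}-bit weights: ~{gb:.0f} GB")
# 16-bit: ~16 GB, 8-bit: ~8 GB, 4-bit: ~4 GB, 2-bit: ~2 GB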
@rafaeel731 1 month ago
If you have an M1+ chip, then llama.cpp will use Metal and the GPU, right?
@AIBites 1 month ago
Yes, from their GitHub page (github.com/ggerganov/llama.cpp) I gather that they support Metal, so it should in turn be leveraging the GPU on the Mac.
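As a rough sketch (the model path is just an example), the llama-cpp-python bindings expose the GPU offload knob directly, and on Apple Silicon the default build uses Metal:

from llama_cpp import Llama  # pip install llama-cpp-python

# n_gpu_layers=-1 asks llama.cpp to offload all layers to the GPU (Metal on a Mac)
llm = Llama(model_path="models/llama-3-8b.Q4_K_M.gguf", n_gpu_layers=-1)
out = llm("Q: What is Metal on macOS? A:", max_tokens=48)
print(out["choices"][0]["text"])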
@AIWhale3 1 month ago
What databases are used in LightRAG? Do you use both a vector and a graph DB?
@AIBites 1 month ago
Though it uses graph structures for indexing, I believe that for storing the embeddings it's just like any other vector DB.
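For what it's worth, basic usage from the LightRAG repo looks roughly like this; treat it as a sketch, since the exact import paths have moved around between versions:

from lightrag import LightRAG, QueryParam
from lightrag.llm import gpt_4o_mini_complete  # import path may differ in newer versions

rag = LightRAG(working_dir="./rag_storage", llm_model_func=gpt_4o_mini_complete)
rag.insert("Your documents go here...")  # builds the entity/relation graph plus the embeddings
print(rag.query("What are the main themes?", param=QueryParam(mode="hybrid")))  # modes: naive / local / global / hybrid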
@prathameshdinkar2966 1 month ago
Nicely explained! Keep the good work going!! 🤗
@AIBites 1 month ago
Thank you 🙂
@AIBites 1 month ago
Are you interested in more theory or hands-on, implementation-style videos? Your input will be very valuable 👍
@prathameshdinkar2966 1 month ago
@AIBites I'm interested in more videos on concept understanding, as the implementations are easily available.
@robiulislamusa 1 month ago
@AIBites Yes, we all want that.
@antonijo01 1 month ago
Is this method good for complex large codebases?
@AIBites 1 month ago
Good question, and a nice candidate to test :-) As these technologies are evolving fast, I believe only testing will show their limitations and potential.
@antonijo01 1 month ago
Is this method good for complex large codebases?
@AIBites 1 month ago
It needs testing to see how well it copes with huge DBs. At the least, I feel it's going to do better than naive RAG.
@pranavghoghari600 1 month ago
Nicely explained. Keep up the good work.
@AIBites 1 month ago
Thank you. May I ask whether you'd be interested in more theory or hands-on, implementation-style videos?
@pranavghoghari600 1 month ago
@AIBites A balance between both: core theory explained, and maybe a quick implementation video. The implementation/use case will be different for every viewer, but theory paired with a simple, quick example works perfectly.
@AIBites 1 month ago
Thanks. That's quite valuable and sensible feedback.
@AaronBlox-h2t 1 month ago
Cool stuff....
@AIBites 1 month ago
Thank you 🙂
@waffleninja1000 1 month ago
Does anyone know a good, reliable jailbreak for the llama3 100B model?
@amssss4152 1 month ago
Counting the R's in a word is not a good way to evaluate a model, since it depends on the tokenizer. I read this on Reddit. Anyway, just wanted to add that.
@AIBites 1 month ago
Yes, that's very true. It's just become overrated in the hype and everyone's using it 😊
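A quick sketch of why the tokenizer matters here, using OpenAI's tiktoken library: the model sees multi-character chunks, never individual letters, so it can't literally "count" the r's.

import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a tokenizer used by GPT-4-era models
ids = enc.encode("strawberry")
print([enc.decode([i]) for i in ids])  # prints multi-character chunks, not letters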
@philtoa334 1 month ago
How many r in the word : strawberry ?
Answer: 3 r in the word strawberry
What is the meaning of the word : strawberry ?
Definition: Strawberry is a common name for several species of edible plants in the genus Fragaria, in the family Rosaceae. The strawberry is a member of
llama_perf_sampler_print: sampling time = 4.80 ms / 62 runs (0.08 ms per token, 12916.67 tokens per second)
llama_perf_context_print: load time = 557.00 ms
llama_perf_context_print: prompt eval time = 658.78 ms / 12 tokens (54.90 ms per token, 18.22 tokens per second)
llama_perf_context_print: eval time = 2709.21 ms / 49 runs (55.29 ms per token, 18.09 tokens per second)
llama_perf_context_print: total time = 3380.71 ms / 61 tokens

How many r in the word : strawberry ?
Answer: 4.
How many letters in the word : strawberry ? Answer: 8.
How many vowels in the word : strawberry ? Answer: 2.
How many consonants in the word : strawberry ? Answer: 6.
How many words can
llama_perf_sampler_print: sampling time = 2.80 ms / 62 runs (0.05 ms per token, 22119.16 tokens per second)
llama_perf_context_print: load time = 539.47 ms
llama_perf_context_print: prompt eval time = 664.60 ms / 12 tokens (55.38 ms per token, 18.06 tokens per second)
llama_perf_context_print: eval time = 2698.53 ms / 49 runs (55.07 ms per token, 18.16 tokens per second)
llama_perf_context_print: total time = 3372.86 ms / 61 tokens

How many r in strawberry ?
Answer: There are number of number of rimes.
How many r in runt ? Answer: There are 3 rimes in 3 runst.
How many rimes in 1 runt ? Answer:
How many r in carrotte ? Answer: 2.
How many r in carrotte ? How many r in carrotte ? Answer: 2.
How many r in 2 carrotte ? How many r in 2 carrotte ? Answer: 2.
How many r
llama_perf_sampler_print: sampling time = 5.94 ms / 61 runs (0.10 ms per token, 10274.55 tokens per second)
llama_perf_context_print: load time = 529.35 ms
llama_perf_context_print: prompt eval time = 520.21 ms / 11 tokens (47.29 ms per token, 21.15 tokens per second)
llama_perf_context_print: eval time = 2320.73 ms / 49 runs (47.36 ms per token, 21.11 tokens per second)
llama_perf_context_print: total time = 2856.54 ms / 60 tokens

Give rabbit recipe with carrotte ?
Answer: 1 cup of carrot juice 1/2 cup of carrot juice
Question: 1 cup of carrot juice recipe Answer: 1 cup of carrot juice recipe
Question: 1 cup of carrot juice recipe Answer: 1 cup
llama_perf_sampler_print: sampling time = 6.12 ms / 62 runs (0.10 ms per token, 10130.72 tokens per second)
llama_perf_context_print: load time = 582.47 ms
llama_perf_context_print: prompt eval time = 567.69 ms / 12 tokens (47.31 ms per token, 21.14 tokens per second)
llama_perf_context_print: eval time = 2317.53 ms / 49 runs (47.30 ms per token, 21.14 tokens per second)
llama_perf_context_print: total time = 2900.77 ms / 61 tokens

Give rabbit recipe with carrotte ?
Answer: 1 cup carrot juice recipe is made with 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
The 1 cup of grated carrots and 1/2 cup of carrot juice recipe is a great way to get kids to eat their vegetables. It's also a great way to get kids to drink their vegetables. This recipe is a great way to get kids to eat their vegetables. It's also a great way to get kids to drink their vegetables. This
llama_perf_sampler_print: sampling time = 21.10 ms / 212 runs (0.10 ms per token, 10047.87 tokens per second)
llama_perf_context_print: load time = 517.67 ms
llama_perf_context_print: prompt eval time = 586.22 ms / 12 tokens (48.85 ms per token, 20.47 tokens per second)
llama_perf_context_print: eval time = 9738.30 ms / 199 runs (48.94 ms per token, 20.43 tokens per second)
llama_perf_context_print: total time = 10381.15 ms / 211 tokens
@fredpourlesintimes 2 months ago
It's paying
@AIBites 2 months ago
Could you please elaborate?
@IvanLesnov 2 months ago
How do you fine-tune offline (locally)?
@AIBites 1 month ago
If you have powerful enough hardware (sufficient GPU memory), then you should be able to fine-tune locally. Otherwise, we are at the mercy of the cloud 🙂
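As a rough sketch of what local fine-tuning can look like with the transformers + peft + bitsandbytes stack (QLoRA-style; the model id is a placeholder):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "google/gemma-2b"  # placeholder; pick something your GPU can hold
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb)

# keep the 4-bit base weights frozen and attach small trainable LoRA adapters
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the full model
# ...then train as usual, e.g. with transformers' Trainer or trl's SFTTrainer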
@jaggyjut 2 months ago
Thank you for this tutorial.
@AIBites 2 months ago
Glad you like it 😊
@bharatbhusansau2996 2 months ago
Bro, your statement from 05:22 is completely wrong and misleading. LoRA is used for fine-tuning LLMs when full fine-tuning is not possible. It does so by freezing all the model weights and incorporating and training low-rank matrices (A*B) in the attention modules. LoRA speeds up training and reduces memory requirements, but it does not provide a speedup during inference. If the LLM is too large for LoRA to handle due to GPU memory limitations, Quantized LoRA is used to fine-tune the model. Overall, QLoRA is the more advanced solution when LoRA alone cannot handle large models for fine-tuning.
@AIBites 1 month ago
Thanks for your feedback. I think we are pretty much on the same page. Can you be more specific about what I got wrong? Unfortunately I won't be able to edit the video, but I can at least pin a message pointing viewers to the errata.
@IsmailIfakir 2 months ago
Is there a multimodal LLM that can be fine-tuned for sentiment analysis from text, image, video, and audio?
@Ram-oj4gn 2 months ago
Does the quantization (the change of number format) apply only to the result of the activation function, or also to the individual weights? Where in the NN do we apply this quantization?
@AIBites 1 month ago
It's applied to the weights. So the model size (the weights) itself shrinks, and hence the computation is also simplified. Hope it's clear now.
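A toy sketch of symmetric int8 weight quantization (real kernels are fancier, e.g. per-channel scales, but the idea is the same):

import numpy as np

w = np.random.randn(4, 4).astype(np.float32)  # a toy weight matrix

scale = np.abs(w).max() / 127.0               # one scale shared by the whole tensor
q = np.round(w / scale).astype(np.int8)       # weights now take 1 byte instead of 4
w_hat = q.astype(np.float32) * scale          # dequantized when needed for compute

print("max abs error:", np.abs(w - w_hat).max())  # small, for a 4x memory saving vs fp32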
@AaronBlox-h2t 2 months ago
Cool video.... Some tests would be appreciated. Also, maybe you can include the Qwen vision models, especially.
@AIBites 2 months ago
Sure, I'll try to test models that come up in the future 👍
@IsmailIfakir 3 months ago
Can you fine-tune this multimodal LLM for sentiment analysis?
@yoavtamir7707 3 months ago
Thanks! This is an awesome explanation.
@AIBites 3 months ago
Glad you like it 😊
@yabezD 3 months ago
Where do you draw these kinds of charts? Could you tell me? It'll be helpful.
@AIBites 1 month ago
I think it was Keynote, if I recall correctly; it's been a long time since I made this video.
@yabezD 1 month ago
@AIBites Thanks for sharing. Any others, Android-specific?
@Kurshu101 3 months ago
Dude, change the battery in your smoke detector.
@AIBites 3 months ago
Hah.. Already did 🤣
@saadowain3511 3 months ago
Thank you. I have a question: do we use DSPy for development or for production?
@xspydazx 3 months ago
Nice, I made this before: a model which picks the correct model! But then I decided that a 1B agent can be the router model. Then I decided on models as TOOLS: once you create Anthropic as a tool, it will select Anthropic instead. I think it's all about understanding the power of tools, and even of graphs and nodes: if we create graphs, their start points are the tools. So the docstring methodology is the best version of the tool-calling method, perhaps with a ReAct-type framework (especially when using tools): by writing a detailed docstring with an example in it, each tool you add gets woven into the prompt. So the aim is to create (or tune) a model to use the ReAct framework as well as to select tools. I think the Hugging Face agents approach is the correct methodology, because we can host models on Hugging Face and hit those Spaces... Spaces as TOOLS! So again we see tools taking a front role, with the main prompt being to select the correct tool for the intent; also train for slot filling and intent detection (there's an HF dataset). The routing method was a very good learning exercise, but it also needs Pydantic to send back the correct route to select, when it could be done via a tool which is already preprogrammed into the library (stopping reason)...
@dfdiasbr 3 months ago
Thank you for that video. I've been studying this model and it has helped me a lot.
@AIBites 3 months ago
Glad it helped 👍
@bitminers1379 3 months ago
How did you push your own custom dataset to Hugging Face?
@AIBites 3 months ago
Check out the commands available in the HF command-line tools. It's quite easy, actually.
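For example, a minimal sketch with the datasets library (the repo id is a placeholder; you'd log in first with huggingface-cli login):

from datasets import Dataset

data = {"question": ["What is 2 + 2?"], "answer": ["4"]}  # toy rows
ds = Dataset.from_dict(data)
ds.push_to_hub("your-username/my-qa-dataset")  # placeholder repo id; creates the dataset repo on the Hub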
@orangtimur6812 3 months ago
I always get this message: ImportError: cannot import name 'load_flow_from_json' from 'langflow' (unknown location). I already cloned it from GitHub, on Windows.
@first-thoughtgiver-of-will2456 4 months ago
fp16 also has a bigger mantissa than bfloat16 (10 bits vs. 7), which benefits normalized or bounded activation functions (e.g. sigmoid).
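A quick way to see both sides of that trade-off in PyTorch:

import torch

x = torch.tensor(1 / 3)
print(x.to(torch.float16).item())   # ~0.33325 (10 mantissa bits: finer precision)
print(x.to(torch.bfloat16).item())  # ~0.33398 (7 mantissa bits: coarser)

big = torch.tensor(1e10)
print(big.to(torch.float16).item())   # inf (fp16 tops out near 65504)
print(big.to(torch.bfloat16).item())  # ~1e10 (bfloat16 keeps fp32's exponent range)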
@newbie8051 4 months ago
Well, the graphs at 2:18 are incorrect: sigmoid and tanh have different ranges, so the output gate should have the range -1 to 1 (tanh).
@AIBites 4 months ago
That's a great spot! A copy-pasting oversight, I guess 🙂 I'll pay more attention while making the videos on attention. Thank you 😀