Hi, can you make a video explaining EfficientNet-Lite?
@superfreiheit1 2 days ago
I like how he leads us to the papers.
@superfreiheit1 3 days ago
Can I create a dataset with only questions and answers, without context?
@pabloescobar2738 4 days ago
Thanks
@Luffynami143 11 days ago
You helped me complete a whole unit in one video. Keep posting wonderful videos like this :))
@TheSopk 13 days ago
What's the difference from an API? Why did they create MCP when you can just use an API?
@saber8387 13 days ago
From what I understand, MCP can keep the context and memory of the session, so it's more aware, whereas API calls are made individually.
@heesongkoh 1 day ago
I guess it's just simpler to use in your LLM app.
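To make the difference concrete, here is a minimal sketch of an MCP server using the FastMCP helper from the official MCP Python SDK (pip install mcp); the server name and the tool are made up for illustration:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather-demo")  # hypothetical server name

@mcp.tool()
def get_temperature(city: str) -> str:
    """Return a temperature reading for a city (stubbed for the example)."""
    return f"It is 21 degrees in {city}."  # a real server would call some API here

if __name__ == "__main__":
    mcp.run()  # serves the tool over MCP (stdio by default)

The point: you describe the tool once in a standard protocol, and any MCP-aware client (Claude Desktop, IDEs, agent frameworks) can discover and call it, instead of you hand-wiring each REST API into each app.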
@ISLAInstruments 14 days ago
great explanation, thank you!
@amit4rou 18 days ago
How much VRAM is required to run the Qwen 2.5 Coder 7B version?
@sharon8811 18 days ago
It gave you feedback on the 4o sudoku game; it said "invalid move".
@lupusreginabeta3318 18 days ago
4o is free to use, and it's not $200 per month 😂
@AIBites 18 days ago
OK, I'd say $20 per month if you want to use it extensively 😊
@archijogi7021 22 days ago
I'm constantly facing this error: "Error occurred: Error running graph: Error building Component File: It looks like we're missing some important information: session_id (unique conversation identifier). Please ensure that your message includes all the required fields."
@pratikvyas-g2r 25 days ago
How can I fit the Gemma 27B model for fine-tuning on my free Colab GPU (T4, 15 GB memory)? Is there any way? Please explain.
@pratikvyas-g2r 25 days ago
Why is padding = right? Shouldn't it be left, since this is next-token generation, where the left side of the sequence requires the padding?
@scnak 27 days ago
Thank you for your well-intentioned and sincere explanation. It's great to hear advice from someone who has a good grasp of the subject.
@AIBites 27 days ago
Very encouraging, it keeps me going 👍
@РичардЖиулиевич 27 days ago
Thanks very much, but the GitHub link is broken.
@karthikreddy9504 1 month ago
Can we use LightRAG to pass context to a fine-tuned LLM?
@davide0965 1 month ago
Terrible explanation, and the background music makes it all worse.
@azmyin 1 month ago
Excellent explanation
@SetuAI 1 month ago
Which screen recorder do you use?
@AIBites 1 month ago
OBS
@Advokat7V 1 month ago
Thank you, friend
@AIBites 1 month ago
Glad it was useful! 🙂
@AK-ox3mv 1 month ago
It's Llama 3 8B. What does the "100B" at the end of the model name mean? Llama is either 8B or 100B! What does it mean?
@AIBites 1 month ago
So 100B stands for billions of parameters; the more parameters, the better the model is supposed to be. 8b, 4b, or 2b stands for the number of bits used in quantization. We use quantization to reduce the model size so it can run locally on our laptops or CPU desktops.
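A quick back-of-the-envelope sketch of why the bit count matters (weights only, ignoring activations and the KV cache):

params = 8e9  # an 8B-parameter model
for bits in (16, 8, 4, 2):
    gb = params * bits / 8 / 1e9  # bits -> bytes -> gigabytes
    print(f"{bits}-bit weights: ~{gb:.0f} GB")
# 16-bit: ~16 GB, 8-bit: ~8 GB, 4-bit: ~4 GB, 2-bit: ~2 GB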
@rafaeel731 1 month ago
If you have an M1+ chip, then llama.cpp will use Metal and the GPU, right?
@AIBites 1 month ago
Yes, from their GitHub page (github.com/ggerganov/llama.cpp) I gather that they support Metal, so it should in turn be leveraging the GPU on the Mac.
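As a rough sketch (the model path is just an example), the llama-cpp-python bindings expose the GPU offload knob directly, and on Apple Silicon the default build uses Metal:

from llama_cpp import Llama  # pip install llama-cpp-python

# n_gpu_layers=-1 asks llama.cpp to offload all layers to the GPU (Metal on a Mac)
llm = Llama(model_path="models/llama-3-8b.Q4_K_M.gguf", n_gpu_layers=-1)
out = llm("Q: What is Metal on macOS? A:", max_tokens=48)
print(out["choices"][0]["text"])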
@AIWhale3 1 month ago
What databases are used in LightRAG? Do you use both a vector and a graph DB?
@AIBites 1 month ago
Though it uses graph structures for indexing, I believe that for storing the embeddings it's just like any other vector DB.
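For what it's worth, basic usage from the LightRAG repo looks roughly like this; treat it as a sketch, since the exact import paths have moved around between versions:

from lightrag import LightRAG, QueryParam
from lightrag.llm import gpt_4o_mini_complete  # import path may differ in newer versions

rag = LightRAG(working_dir="./rag_storage", llm_model_func=gpt_4o_mini_complete)
rag.insert("Your documents go here...")  # builds the entity/relation graph plus the embeddings
print(rag.query("What are the main themes?", param=QueryParam(mode="hybrid")))  # modes: naive / local / global / hybrid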
@prathameshdinkar2966 1 month ago
Nicely explained! Keep the good work going!! 🤗
@AIBites 1 month ago
Thank you 🙂
@AIBites 1 month ago
Are you interested in more theory or hands-on, implementation-style videos? Your input will be very valuable 👍
@prathameshdinkar2966 1 month ago
@AIBites I'm interested in more videos on concept understanding, as the implementations are easily available.
@robiulislamusa 1 month ago
@AIBites Yes, we all want that.
@antonijo01 1 month ago
Is this method good for complex large codebases?
@AIBites 1 month ago
Good question, and a nice candidate to test :-) As these technologies are evolving fast, I believe only testing will show their limitations and potential.
@antonijo01 1 month ago
Is this method good for complex large codebases?
@AIBites 1 month ago
It needs testing to see how well it copes with huge DBs. At the least, I feel it's going to do better than naive RAG.
@pranavghoghari600 1 month ago
Nicely explained. Keep up the good work.
@AIBites 1 month ago
Thank you. May I ask whether you'd be interested in more theory or hands-on, implementation-style videos?
@pranavghoghari600 1 month ago
@AIBites A balance between both: core theory explained, and maybe a quick implementation video. The implementation/use case will be different for every viewer, but theory paired with a simple, quick example works perfectly.
@AIBites 1 month ago
Thanks. That's quite valuable and sensible feedback.
@AaronBlox-h2t 1 month ago
Cool stuff....
@AIBites 1 month ago
Thank you 🙂
@waffleninja1000 1 month ago
Does anyone know a good, reliable jailbreak for the llama3 100B model?
@amssss4152 1 month ago
Counting the R's in a word is not a good way to evaluate a model, since it depends on the tokenizer. I read this on Reddit. Anyway, just wanted to add that.
@AIBites 1 month ago
Yes, that's very true. It's just become overrated in the hype and everyone's using it 😊
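A quick sketch of why the tokenizer matters here, using OpenAI's tiktoken library: the model sees multi-character chunks, never individual letters, so it can't literally "count" the r's.

import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a tokenizer used by GPT-4-era models
ids = enc.encode("strawberry")
print([enc.decode([i]) for i in ids])  # prints multi-character chunks, not letters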
@philtoa334 1 month ago
How many r in the word : strawberry ?
Answer: 3 r in the word strawberry
What is the meaning of the word : strawberry ?
Definition: Strawberry is a common name for several species of edible plants in the genus Fragaria, in the family Rosaceae. The strawberry is a member of
llama_perf_sampler_print: sampling time = 4.80 ms / 62 runs (0.08 ms per token, 12916.67 tokens per second)
llama_perf_context_print: load time = 557.00 ms
llama_perf_context_print: prompt eval time = 658.78 ms / 12 tokens (54.90 ms per token, 18.22 tokens per second)
llama_perf_context_print: eval time = 2709.21 ms / 49 runs (55.29 ms per token, 18.09 tokens per second)
llama_perf_context_print: total time = 3380.71 ms / 61 tokens

How many r in the word : strawberry ?
Answer: 4.
How many letters in the word : strawberry ? Answer: 8.
How many vowels in the word : strawberry ? Answer: 2.
How many consonants in the word : strawberry ? Answer: 6.
How many words can
llama_perf_sampler_print: sampling time = 2.80 ms / 62 runs (0.05 ms per token, 22119.16 tokens per second)
llama_perf_context_print: load time = 539.47 ms
llama_perf_context_print: prompt eval time = 664.60 ms / 12 tokens (55.38 ms per token, 18.06 tokens per second)
llama_perf_context_print: eval time = 2698.53 ms / 49 runs (55.07 ms per token, 18.16 tokens per second)
llama_perf_context_print: total time = 3372.86 ms / 61 tokens

How many r in strawberry ?
Answer: There are number of number of rimes.
How many r in runt ? Answer: There are 3 rimes in 3 runst.
How many rimes in 1 runt ? Answer:
How many r in carrotte ? Answer: 2.
How many r in carrotte ? How many r in carrotte ? Answer: 2.
How many r in 2 carrotte ? How many r in 2 carrotte ? Answer: 2.
How many r
llama_perf_sampler_print: sampling time = 5.94 ms / 61 runs (0.10 ms per token, 10274.55 tokens per second)
llama_perf_context_print: load time = 529.35 ms
llama_perf_context_print: prompt eval time = 520.21 ms / 11 tokens (47.29 ms per token, 21.15 tokens per second)
llama_perf_context_print: eval time = 2320.73 ms / 49 runs (47.36 ms per token, 21.11 tokens per second)
llama_perf_context_print: total time = 2856.54 ms / 60 tokens

Give rabbit recipe with carrotte ?
Answer: 1 cup of carrot juice 1/2 cup of carrot juice
Question: 1 cup of carrot juice recipe Answer: 1 cup of carrot juice recipe
Question: 1 cup of carrot juice recipe Answer: 1 cup
llama_perf_sampler_print: sampling time = 6.12 ms / 62 runs (0.10 ms per token, 10130.72 tokens per second)
llama_perf_context_print: load time = 582.47 ms
llama_perf_context_print: prompt eval time = 567.69 ms / 12 tokens (47.31 ms per token, 21.14 tokens per second)
llama_perf_context_print: eval time = 2317.53 ms / 49 runs (47.30 ms per token, 21.14 tokens per second)
llama_perf_context_print: total time = 2900.77 ms / 61 tokens

Give rabbit recipe with carrotte ?
Answer: 1 cup carrot juice recipe is made with 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
- 1 cup of grated carrots and 1/2 cup of carrot juice.
The 1 cup of grated carrots and 1/2 cup of carrot juice recipe is a great way to get kids to eat their vegetables. It's also a great way to get kids to drink their vegetables. This recipe is a great way to get kids to eat their vegetables. It's also a great way to get kids to drink their vegetables. This
llama_perf_sampler_print: sampling time = 21.10 ms / 212 runs (0.10 ms per token, 10047.87 tokens per second)
llama_perf_context_print: load time = 517.67 ms
llama_perf_context_print: prompt eval time = 586.22 ms / 12 tokens (48.85 ms per token, 20.47 tokens per second)
llama_perf_context_print: eval time = 9738.30 ms / 199 runs (48.94 ms per token, 20.43 tokens per second)
llama_perf_context_print: total time = 10381.15 ms / 211 tokens
@fredpourlesintimes 2 months ago
It's paying
@AIBites 2 months ago
Could you please elaborate?
@IvanLesnov 2 months ago
How do you fine-tune offline (locally)?
@AIBites 1 month ago
If you have powerful enough hardware (sufficient GPU memory), then you should be able to fine-tune locally. Otherwise, we are at the mercy of the cloud 🙂
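As a rough sketch of what local fine-tuning can look like with the transformers + peft + bitsandbytes stack (QLoRA-style; the model id is a placeholder):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "google/gemma-2b"  # placeholder; pick something your GPU can hold
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb)

# keep the 4-bit base weights frozen and attach small trainable LoRA adapters
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the full model
# ...then train as usual, e.g. with transformers' Trainer or trl's SFTTrainer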
@jaggyjut 2 months ago
Thank you for this tutorial.
@AIBites 2 months ago
Glad you like it 😊
@bharatbhusansau2996 2 months ago
Bro, your statement from 05:22 is completely wrong and misleading. LoRA is used for fine-tuning LLMs when full fine-tuning is not possible. It does so by freezing all the model weights and incorporating and training low-rank matrices (A*B) in the attention modules. LoRA speeds up training and reduces memory requirements, but it does not provide a speedup during inference. If the LLM is too large for LoRA to handle due to GPU memory limitations, Quantized LoRA is used to fine-tune the model. Overall, QLoRA is the more advanced solution when LoRA alone cannot handle large models for fine-tuning.
@AIBites 1 month ago
Thanks for your feedback. I think we are pretty much on the same page. Can you be more specific about what I got wrong? Unfortunately I won't be able to edit the video, but I can at least pin a message pointing viewers to the errata.
@IsmailIfakir 2 months ago
Is there a multimodal LLM that can be fine-tuned for sentiment analysis from text, image, video, and audio?
@Ram-oj4gn 2 months ago
Does the quantization (the change of number format) apply only to the result of the activation function, or also to the individual weights? Where in the NN do we apply this quantization?
@AIBites 1 month ago
It's applied to the weights. So the model size (the weights) itself shrinks, and hence the computation is also simplified. Hope it's clear now.
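A toy sketch of symmetric int8 weight quantization (real kernels are fancier, e.g. per-channel scales, but the idea is the same):

import numpy as np

w = np.random.randn(4, 4).astype(np.float32)  # a toy weight matrix

scale = np.abs(w).max() / 127.0               # one scale shared by the whole tensor
q = np.round(w / scale).astype(np.int8)       # weights now take 1 byte instead of 4
w_hat = q.astype(np.float32) * scale          # dequantized when needed for compute

print("max abs error:", np.abs(w - w_hat).max())  # small, for a 4x memory saving vs fp32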
@AaronBlox-h2t 2 months ago
Cool video.... Some tests would be appreciated. Also, maybe you can include the Qwen vision models, especially.
@AIBites 2 months ago
Sure, I'll try to test models that come up in the future 👍
@IsmailIfakir 3 months ago
Can you fine-tune this multimodal LLM for sentiment analysis?
@yoavtamir7707 3 months ago
Thanks! This is an awesome explanation.
@AIBites 3 months ago
Glad you like it 😊
@yabezD 3 months ago
Where do you draw these kinds of charts? Could you tell me? It'll be helpful.
@AIBites 1 month ago
I think it was Keynote, if I recall correctly; it's been a long time since I made this video.
@yabezD 1 month ago
@AIBites Thanks for sharing. Any others, Android-specific?
@Kurshu101 3 months ago
Dude, change the battery in your smoke detector.
@AIBites 3 months ago
Hah.. Already did 🤣
@saadowain3511 3 months ago
Thank you. I have a question: do we use DSPy for development or for production?
@xspydazx 3 months ago
Nice, I made this before: a model which picks the correct model! But then I decided that a 1B agent can be the router model. Then I decided on models as TOOLS: once you create Anthropic as a tool, it will select Anthropic instead. I think it's all about understanding the power of tools, and even of graphs and nodes: if we create graphs, their start points are the tools. So the docstring methodology is the best version of the tool-calling method, perhaps with a ReAct-type framework (especially when using tools): by writing a detailed docstring with an example in it, each tool you add gets woven into the prompt. So the aim is to create (or tune) a model to use the ReAct framework as well as to select tools. I think the Hugging Face agents approach is the correct methodology, because we can host models on Hugging Face and hit those Spaces... Spaces as TOOLS! So again we see tools taking a front role, with the main prompt being to select the correct tool for the intent; also train for slot filling and intent detection (there's an HF dataset). The routing method was a very good learning exercise, but it also needs Pydantic to send back the correct route to select, when it could be done via a tool which is already preprogrammed into the library (stopping reason)...
@dfdiasbr 3 months ago
Thank you for that video. I've been studying this model and it has helped me a lot.
@AIBites 3 months ago
Glad it helped 👍
@bitminers1379 3 months ago
How did you push your own custom dataset to Hugging Face?
@AIBites 3 months ago
Check out the commands available in the HF command-line tools. It's quite easy, actually.
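For example, a minimal sketch with the datasets library (the repo id is a placeholder; you'd log in first with huggingface-cli login):

from datasets import Dataset

data = {"question": ["What is 2 + 2?"], "answer": ["4"]}  # toy rows
ds = Dataset.from_dict(data)
ds.push_to_hub("your-username/my-qa-dataset")  # placeholder repo id; creates the dataset repo on the Hub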
@orangtimur6812 3 months ago
I always get this message: ImportError: cannot import name 'load_flow_from_json' from 'langflow' (unknown location). I already cloned it from GitHub, on Windows.
@first-thoughtgiver-of-will2456 4 months ago
fp16 also has a bigger mantissa than bfloat16 (10 bits vs. 7), which benefits normalized or bounded activation functions (e.g. sigmoid).
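A quick way to see both sides of that trade-off in PyTorch:

import torch

x = torch.tensor(1 / 3)
print(x.to(torch.float16).item())   # ~0.33325 (10 mantissa bits: finer precision)
print(x.to(torch.bfloat16).item())  # ~0.33398 (7 mantissa bits: coarser)

big = torch.tensor(1e10)
print(big.to(torch.float16).item())   # inf (fp16 tops out near 65504)
print(big.to(torch.bfloat16).item())  # ~1e10 (bfloat16 keeps fp32's exponent range)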
@newbie8051 4 months ago
Well, the graphs at 2:18 are incorrect: sigmoid and tanh have different ranges, so the output gate should have the range -1 to 1 (tanh).
@AIBites 4 months ago
That's a great spot! A copy-pasting oversight, I guess 🙂 I'll pay more attention while making the videos on attention. Thank you 😀