"How to give GPT my business knowledge?" - Knowledge embedding 101

189,344 views

AI Jason

1 day ago

Comments: 332
@AIJasonZ 1 year ago
A few people asked “why only vectorise one column instead of the whole CSV?” Adding some more explanation here: vectorising is mainly for search, and the column you vectorise can be thought of as the “index” or “id” of the dataset, while the data it returns will still be the question/answer pair. I only want to vectorise one column because: 1. It saves cost: vectorising uses an embedding model, which means every token we vectorise costs money. 2. It increases accuracy: in this case I only want to search past customer emails, not sales responses. Searching both columns might return the wrong answer, e.g. a search for “interested in learning more” could return the pair “client: stop sending me emails; sales: understood, let us know if you are interested in learning more in future!” Hope this helps!
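The accuracy point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, and `rows`, `search`, and the second customer/sales pair are hypothetical names and data made up for the demo, not code or data from the video.

```python
import re
from collections import Counter
from math import sqrt

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical dataset: each row is a customer email plus the sales reply.
rows = [
    {"customer": "Stop sending me emails",
     "sales": "Understood, let us know if you are interested in learning more in future!"},
    {"customer": "I am interested in learning more about pricing",
     "sales": "Happy to help, here is our pricing guide."},
]

def search(query, key_fn):
    """Return the whole row whose indexed text (chosen by key_fn) best matches the query."""
    q = embed(query)
    return max(rows, key=lambda row: cosine(q, embed(key_fn(row))))

query = "interested in learning more"
# Index only the customer column: the search key is the customer email alone.
only_customer = search(query, lambda r: r["customer"])
# Index both columns: the sales reply pollutes the search key.
both_columns = search(query, lambda r: r["customer"] + " " + r["sales"])

print(only_customer["customer"])
print(both_columns["customer"])
```

In both cases the full question/answer row is returned; only the text used as the search key changes. With this toy similarity, indexing both columns picks the "stop sending me emails" row, which is the failure mode the comment describes. A real embedding model may behave differently on this exact example.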
@ozfish17 1 year ago
It seems embedding enriches your search query. How about answers? In your example, do you 'train' the LLM with Q&A pairs?
@AIJasonZ 1 year ago
@ozfish17 yep, it returns the full Q&A pair!
@Taskade 1 year ago
Jason, brilliant step-by-step guide on knowledge embedding! Your breakdown of the process was super insightful. I'm curious about how AI Agents in Langchain perform, especially in long-running scenarios. Hope you'll consider diving into that topic in the future. Keep up the stellar content!
@sandeepbansal1195 1 year ago
So if you want the output response email to be generated by the LLM based on a specific tone, why wouldn't the 2nd column be a part of vectorizing the dataset?
@csss142 1 year ago
Hey Jason! What would be the best way to do this with financial PDFs? I want to ask questions and get accurate insights from the large documents. Would using embeddings be best or the fine tuning from your other video? Thanks! @AIJasonZ
@psychxx7146 1 year ago
Small channels like this are the ones that hold the most value.
@Helpsmallbusinesses 1 year ago
In 2 minutes and 54 seconds you explained what vectorising is better than any other video online. You made it easy. Thank you!
@funkyboodah 9 months ago
man you have a really rare ability to explain super complicated things in a very simple way and organize the information so it's even more clear. Bravo and thank you
@photon2724 1 year ago
Anyone looking to make a great startup in AI, you have to jump on this!
@i_forget 1 year ago
Working on it!
@dragoon347 1 year ago
Working on it now
@verasalem5071 1 year ago
Love your content, very easy to digest and understand. The only recommendation I would give is to use other embeddings and LLM models besides OpenAI. Mid/large-sized companies cannot use OpenAI in their environment because of legal issues around OpenAI's data retention policy. A lot of companies want to develop their own implementations, so including other models like Llama 2, Vicuna, etc. would allow you to reach a bigger audience.
@AIJasonZ 1 year ago
Yeah, great points, thanks for the recommendation! Totally get that companies don't want to send any data to OpenAI LOL
@Ascended23 1 year ago
+1 for using more open models. I love your content and the approach you take to your videos. But even though I'm not a big company I just value using systems that are open instead of closed.
@averagegamer9513 1 year ago
Great video as always, Jason. Thank you for making one of the few channels with genuine AI tools videos that actually demonstrate implementation and applications rather than hyping up the content through sweet talk then simply dropping an affiliate link.
@devklepacki 1 year ago
This! I feel so grateful that the KZbin algorithm blessed me with Jason's channel. Beautiful explanations and clear steps.
@koen.mortier_fitchen 1 year ago
Yeah, he's one of the real ones. I've asked him if he could add a GitHub repo for the code. It's the only thing this channel lacks imo.
@frankchangshow 1 year ago
@devklepacki same feelings here
@sidavidsin 1 year ago
Thanks for sharing your knowledge with us, your channel is literally a gold mine of information. Keep doing what you're doing, Jason!
@humadi2001 8 months ago
I've watched many videos on this topic and I can say that your simple examples have covered most of what I need to know. Thanks Jason.
@nguyenvanduc2000 6 months ago
I have the same idea in mind. I have tons of product documents that I wish I could just ask an agent something about it instead of scrolling hundreds of word pages. I really appreciate your video man.
@_arman_ 1 year ago
Man... you have a serious gift for teaching! This is super helpful. Thanks.
@muhammadanasazambhatti2772 1 year ago
Thank you very much! Nobody explained Embedding and Vectorization like this! Thank you again!
@fuxxs5994 1 year ago
I really love your style, first explaining the theory and then demonstrating it by an example
@normanluismadrid422 1 year ago
this is virtual gold, mad props to jason for clearly describing complex topics and even showing practical application, saved me hours of research lol, it'd be great if you can touch up on the various services out there that offer AI services that embed, and how they compare in performance, pros / cons etc.
@farid3101 8 months ago
I am really surprised that these tools can help so many businesses doing the low-cost and autonomous response specifically for customer service! Great video!
@shivamroy1775 1 year ago
Absolutely great video, I loved that you took the time to explain everything in theory and then went on to give a detailed walkthrough of the code. Please keep posting such videos !
@VaibhavShewale 1 year ago
this is just awesome, now people who didn't have an idea not only have an idea but also a reference
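@coldestlin 10 months ago
Once the intermediate vector query results came up, the whole flow clicked right away. Really great. So you take the vector query results, pass them to the LLM as prompt instructions, and then let the LLM produce the answer.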
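The flow this comment describes (retrieve similar examples by vector search, then hand them to the LLM as part of the prompt) can be sketched as a plain prompt-building step. This is a minimal sketch, not the code from the video: `build_prompt`, the prompt wording, and the sample pair are all hypothetical, and the actual LLM call is left out.

```python
def build_prompt(customer_message, similar_pairs):
    """Assemble an LLM prompt from a new message and retrieved (question, answer) pairs."""
    examples = "\n\n".join(
        f"Customer: {question}\nSales: {answer}"
        for question, answer in similar_pairs
    )
    return (
        "You are a sales representative. Reply to the new message below, "
        "following the style of the past examples.\n\n"
        f"Past examples:\n{examples}\n\n"
        f"New message:\n{customer_message}\n\n"
        "Reply:"
    )

# Hypothetical output of the vector search step.
retrieved = [("Do you offer discounts?", "We do for annual plans, happy to share details.")]
prompt = build_prompt("Is there a student discount?", retrieved)
print(prompt)
```

The resulting string is what would be sent to the model; the model itself never sees the vector database, only the retrieved text.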
@devinoutfleet1998 1 year ago
Bro... you are incredibly smart and are a great teacher. This is going to provide 10x value to my users
@_yasser 5 months ago
This is my new favorite channel. The topics are pretty dense and dry - but you make them super easy and fun to learn. Thank you!
@davidkwon1233 1 year ago
one of the best channels out there, really appreciate your content!
@growthub8541 1 year ago
So helpful! I started using relevance ai because of your videos & just as a no-code developer been able to build some sick ass LLM chains with Zapier Custom HTTP Requests. I have my development team even using it & it’s definitely speeding up our velocity to iterate🙌🔥
@AIJasonZ 1 year ago
That's great to hear! 🤘
@camach28 1 year ago
It would be amazing if you could make a video on creating a knowledge base using long PDFs as the source, and using GPT as well to make an expert assistant on a topic.
@frankchangshow 1 year ago
Yes like if the data source is like a book and we want to search the contents in it giving relative data like “I remember this part of the book saying something like this… where was it?” … or “the book had this story … where was it and the main ideas”
@rahuliyer6007 1 year ago
Came here after the fine tune model video - looking for exactly this. Thanks!
@christhornham 7 months ago
Outstanding. Your ability to explain complicated topics is incredible. Thank you.
@stepkurniawan 1 year ago
yo bro.. i really like when you explain all the step-by-step and all relevant tools out there! thank you!
@Optable 1 year ago
You have helped the community so much with this valuable content. Keep it up my friend, i'll be watching!
@shethromesh 1 year ago
Would love to see a similar demo of knowledge search with open-source models rather than OpenAI models
@Gingeey23 1 year ago
Great video Jason, however the biggest challenge for companies will be ensuring that commercially sensitive information isn't fed into hosted LLM models due to security concerns. Would be really interested to see how you would approach this challenge, and potentially try to deploy this tool locally? keep up the good work!
@AIJasonZ 1 year ago
Thanks mate! Yeah, I agree. I hear businesses talk about sensitive information a lot, especially ones with client data. There are 2 ways I see it can be solved now: 1. Self-host the LLM, using Azure's self-host version or even open-source models, so you don't send info to OpenAI. 2. Anonymise your input/output data, so OpenAI doesn't have a clear idea that data A is from company A.
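The second option, anonymising text before it leaves your machine, could look roughly like the sketch below. Everything here is an assumption for illustration: the `anonymise` helper, the regular expressions, and the placeholder tokens are hypothetical, and simple patterns like these will miss plenty of sensitive data (personal names, for instance), so treat it as a starting point rather than a guarantee.

```python
import re

# Rough patterns for illustration only; real-world data needs more careful handling.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def anonymise(text, company_names=()):
    """Replace emails, phone-like numbers, and known company names with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    for name in company_names:
        text = re.sub(re.escape(name), "[COMPANY]", text, flags=re.IGNORECASE)
    return text

sample = "Hi, this is Dana from Acme Corp (dana@acme.example, +1 555 010 9999)."
print(anonymise(sample, company_names=["Acme Corp"]))
# Note: the first name "Dana" is left untouched by these patterns.
```

The same function would be applied to text before embedding it and before placing it in a prompt, so the hosted model only ever sees the placeholder tokens.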
@devklepacki 1 year ago
If using a hosted LLM like OpenAI's, this would probably require either 1. a lot of manual work clearing all the data, or 2. first pushing the data through a lighter local LLM tasked with clearing any sensitive information (like how one LLM was used to create training prompts for another LLM). Just a thought, tho
@SaminYasar_ 1 year ago
Keep it up man probably one of the only channels with incredible value
@jasonfinance 1 year ago
the best video about embedding I've seen; thank you!
@PlectrumShorts 1 year ago
Great tutorial! You covered a LOT of ground quickly, but thoroughly. Haha. Nice work.
@JJ-vq8mu 1 year ago
Great job, and I really appreciate you sharing your knowledge. Looking forward to Open LLM content.
@kurtcampher4716 1 year ago
Thank you for this. As a dev with no AI experience, you really make it easy to understand
@ludwigvanbeethoven61 1 year ago
I wonder why those AI channels, like yours, are not exploding. This is so important for the future what you all are doing. Only a few people get this!
@stevi32800 1 year ago
I really like your video. You know how to grab people's attention. Please make more videos like this 😊
@koen.mortier_fitchen 1 year ago
Thanks for your work Jason. You're one of the best, and I follow tons.
@Artificial_Noob 1 year ago
Great video man! I hope you can cover more "No Code Methods" for beginners like me that are not very technical! The last part of this video was GOLD for me. cheers!
@pietdebeer7972 1 year ago
Same here
@Ozla102 1 year ago
The video is very inspiring and straightforward, a valuable lesson
@michalf16 1 year ago
Love your content, good sir. Staying tuned for all your next videos, you are the leader
@wojpaw5362 1 year ago
Absolutely outstanding. I liked, subscribed and shared. Best explanation of knowledge embedding I have come across!!!!
@TheDestint 1 year ago
This is super duper helpful man ! Great work and thanks !
@maciejbalasinski2419 1 year ago
Thanks for the no-code alternatives
@pietdebeer7972 1 year ago
I'm blown away. Thank you!!
@ristopaasivirta9770 1 year ago
My friend. You have an uncanny ability to teach AI science and concepts to us pepegs. This and your other videos are really good at explaining on how the neural networks work, not just how to do the thing.
@half_way_expert 1 year ago
Another great video! Thanks Jason, keep up the excellent work
@manojnaidu619 6 months ago
Cannot be more valuable than this. Loved it 🎉
@gautamdawar5067 1 year ago
This is pure gold. Thank you so much!
@naimneman 1 year ago
Amazing video Jason! Pretty useful information. I would love to see a video about GPT4All as a personal assistant for everyday life.
@Grumptr0nix 1 year ago
This is exactly what I was looking for... I have a tremendous amount of assets (Requirements docs, project plans, etc) that we've created over and over for all our engagements, and I'm trying to find a way for us to stop reinventing the wheel. All of which are in our Google Drive, but I'm having trouble conceptualizing how I'd be able to turn that into vectored data (you talk about text splitter, but I'm still a bit confused about its application). Anyways, I'll do more research but this is amazing content. Thank you.
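On the text splitter question: the idea is to cut each long document into overlapping chunks, so that every chunk is small enough to embed and a sentence falling on a boundary still appears whole in at least one chunk. The sketch below is a simplified, character-based illustration under my own assumptions; `split_text` and its defaults are hypothetical, and library splitters (LangChain's, for example) typically split on separators such as paragraphs and sentences rather than at fixed character offsets.

```python
def split_text(text, chunk_size=200, overlap=40):
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    chunks = []
    start = 0
    step = chunk_size - overlap
    while True:
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks

# Tiny example so the overlap is easy to see.
print(split_text("abcdefghij", chunk_size=4, overlap=2))
```

Each chunk would then be embedded and stored on its own, usually alongside metadata such as the source file name, so a search can point back to the original document.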
@Grumptr0nix 1 year ago
And for sure, the legal issues with our business data and OpenAI that is discussed in other comments have been a blocker for us as well, but at least there's options.
@aliyousefi9735 1 year ago
you're the man Jason, great content!
@ridg2806 11 months ago
Really high quality content, thank you Jason!
@kiraakamaru 1 year ago
This is exactly what I was looking for, I have a question Jason: How can we secure our company personal data?
@aliq6709 7 months ago
This was super helpful. Thank you, Jason!
@robertcormia7970 10 months ago
Very well done! Straightforward to follow!
@kylelau1329 1 year ago
have been waiting for this video, Thank you!
@AssassinUK 1 year ago
This was 🔥🔥🔥. If I hadn't already subscribed, I would have. Excellent use case! Looking to implement this using Flowise.
@AlessaOxygen-ot4rl 10 months ago
This is hilariously good. Thanks for this wonderful resource!
@MrDe0 1 year ago
This is GOLD !! Thank You !
@shrvn110 1 year ago
this dude is on FIRE 🔥
@manideepatalukdar9201 1 year ago
Great video! Very simple to understand.
@nealshah5874 5 months ago
This is the greatest video ever created
@rimilien 4 months ago
Thank you my friend! Awesome work!
@IanTrolinger 1 year ago
this is the best video on your channel.
@karankanchetty8320 8 months ago
Great job. You deserve more subscribers.
@AI_Ron 1 year ago
These are gems
@AndrejsKarpovs 1 year ago
I have a couple of questions: 1) I have zero knowledge about vector databases, but don't you need to define some kind of access-related information (connection string, username/password, etc.) to use one? Did you define it in the .env file? 2) How does this method compare to PEFT/LoRA? Does it basically achieve the same thing? It looks like embeddings can be a faster solution
@AIJasonZ 1 year ago
Hey mate! 1/ If you want the vector database stored on a managed cloud solution like Pinecone, then yes, you can create an account and use it. In this example I used Faiss, which is not a managed database solution, so it just stores on your local machine. 2/ LoRA or other fine-tuning solutions, as I mentioned at the beginning, are more for getting the LLM to behave in a certain way (e.g. digitising someone), while embedding is useful for knowledge retrieval (e.g. Q&A on your own data)
@Fiop22 1 year ago
I haven’t watched the video, but to answer your first question: if you’re using a cloud service like Pinecone, then yes. Alternatively, you can store the embeddings locally, as a .csv for example, and perform the lookup via cosine similarity with numpy.
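The local alternative mentioned here, embeddings kept in a CSV and searched by cosine similarity, might look like the sketch below. To stay dependency-free it uses only the standard library instead of numpy, and it is a sketch under assumptions: the column names, the three-dimensional vectors, and the `lookup` helper are made up for illustration, and real embeddings would come from an embedding model and have hundreds or thousands of dimensions.

```python
import csv
import io
import json
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Write a tiny "embeddings.csv" into memory; each vector is stored as a JSON string.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["text", "embedding"])
writer.writerow(["refund policy", json.dumps([0.9, 0.1, 0.0])])
writer.writerow(["shipping times", json.dumps([0.1, 0.9, 0.2])])

# Load it back, as you would from a file on disk.
buffer.seek(0)
store = [(row["text"], json.loads(row["embedding"])) for row in csv.DictReader(buffer)]

def lookup(query_vector, k=1):
    """Return the k stored texts whose embeddings are most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

print(lookup([0.8, 0.2, 0.1]))
```

A brute-force scan like this is fine for a few thousand rows; libraries such as Faiss exist to make the same lookup fast at much larger scale.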
@BillVoisine 4 months ago
Thank you Jason, this is awesome and very helpful!
@chrisvienneau3366 1 year ago
Great content and love the intros
@KarlJuhl 1 year ago
Great resources Jason, I will add to the flood of comments - you are a great communicator and you move at a good speed. Thanks for sharing! It is interesting how many langchain UI apps are being built. Relevance AI looks to be the most integrated from end to end, with such an easy deploy process. I am curious to know your thoughts on using a UI tool like flowise or relevance AI versus custom programming.
@frankchangshow 1 year ago
I have the same question
@daviddoch4872 1 year ago
Great video. We all want an AI that does not censor social datasets based on race and gender. GPT will not answer any questions related to sociology!!!
@markieuanroberts 1 year ago
Awesome explanation, thanks.
@aibeginnertutorials 1 year ago
Hey Jason thanks for the always excellent presentations and information. The Streamlit and RelevanceAI information were interesting and useful. Relevance reminds me of another great product, Flowise.
@frankchangshow 1 year ago
I don’t know if I should use stack ai, relevance ai, or flow wise. Going into decision fatigue now
@patriciodiaz2377 1 year ago
Thanks a lot for the info!! Greetings from Mexico 🤙
@JoeyIngles 1 year ago
AI Jason is fast becoming Uncle Jason! Thank you, so much.
@himanshumishra6253 1 year ago
hey, can you share the salse_response.csv file also, it's not in the git repo
@YangYang-rh8uy 8 months ago
Exactly what I want, thanks Jason.
@CyberSQUID9000 1 year ago
More excellent content, thanks mate
@xulipaTV 1 year ago
You are the man Jason!
@kylearnold9647 1 year ago
Thank you! This was incredibly helpful
@aquahoodjd 9 months ago
I have a lot of high-quality and reliable data some for a better general model and some for special use cases.
@RaulCasado-l9t 10 months ago
Dude. You. Are. Awesome!
@ivant_true 7 months ago
you make really useful videos man
@adi2hot 1 year ago
Fantastic content, thank you.
@andrzejpec4886 1 year ago
Big thank you ❤
@takeshikriang 1 year ago
Great video, subscribed.
@davide.2349 1 year ago
Jason you are awesome!
@arunkabilan 1 year ago
Great explanation
@NickWatching 9 months ago
Amazing explanations, thank you!
@DeLeizard 1 year ago
Thank you for the super video. I'm learning LLM and am quite confused between knowledge base embedding, that was mentioned, vs prompt tuning. Could you tell me the difference?
@davidwylie8491 1 year ago
Amazing! Thanks for sharing
@fenderbender2096 1 year ago
Very nice video.
@tahunal 1 year ago
Bro you are awesome.
@tauraik 1 year ago
Amazing content my guy Amazing
@groccy 1 year ago
Thank you for making this great content, Jason! You literally created a gold mine for LLM practitioners. Really appreciate it! Any chance we can find the code from this video online?
@AIJasonZ 1 year ago
Hah, I had a hard time defining my audience, and LLM practitioner is kinda perfect! Sure thing, I will open up the GitHub link soon
@groccy 1 year ago
@AIJasonZ Thanks. Can’t wait!
@rverm1000 1 year ago
great video! is that enough info to go out and start building a customer response ai for other people or businesses?
@facundozupel4166 20 days ago
Jason, first of all, this video is very clear and well explained, so thank you and congratulations on such great content! Already subscribed. I have a question about how to evaluate whether the context is being used correctly. I know LangChain has some retrieval functionality, is it worth checking out?
@SoumyaDasgupta 2 months ago
Superb stuff.
@desiderata2745 1 year ago
Thanks!
@oscarcharliezulu 1 year ago
Excellent vid thank you !