RAG with Mistral AI!
12:25
5 months ago
NEW Pinecone Assistant
13:47
6 months ago
Semantic Chunking for RAG
29:56
8 months ago
AI Agent Evaluation with RAGAS
19:42
Multi-Modal NSFW Detection with AI
9:23
New LangChain XML Agents
20:22
11 months ago
OpenAI's NEW Embedding Models
16:31
FIRST Look at Pinecone Serverless!
11:06
Comments
@alx8439
@alx8439 5 hours ago
I've always considered that asking the LLM to spill out the Python code directly, using whatever functions you want it to use, works much more natively, robustly, and reliably than taking the detour with the JSON-here-is-how-you-call-the-tool thing. The only small downside is having to think in advance about how to run the LLM-generated code in a safe sandbox, so it won't do any crazy shit. But there's a plethora of options for that
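A minimal sketch of that pattern - the `llm()` helper is a hypothetical stand-in for whatever model call you use, and a subprocess with a timeout is just one of the many sandboxing options alluded to above:
```
import subprocess
import sys

def llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call that returns Python source."""
    raise NotImplementedError

def run_in_sandbox(code: str, timeout: float = 5.0) -> str:
    # Execute the generated code in a separate interpreter process with a
    # timeout, so runaway or hostile code can't block or act on the host.
    # A production sandbox would also restrict network and filesystem access.
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I runs Python in isolated mode
        capture_output=True, text=True, timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr)
    return result.stdout

code = llm("Write Python that prints the sum of the integers 1 to 100.")
print(run_in_sandbox(code))
```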
@bradleywoolf3351
@bradleywoolf3351 1 day ago
great description
@deniz-gunay
@deniz-gunay 2 days ago
When I run the following script:
```
from haystack.document_stores import PineconeDocumentStore

document_store = PineconeDocumentStore(
    api_key='**************************************',
    index='myindex',
    similarity="cosine",
    embedding_dim=768
)
```
it gives me this error: ApiException: (400) Request failed. You've reach the max pod-based indexes allowed. How do I fix that? EDIT: I have only one index, and even if I delete it, I still get the same error.
@nnamdiodozi7713
@nnamdiodozi7713 3 days ago
Thanks James. Can the assistant ingest scanned PDFs (requiring OCR) as well as other file extensions like .doc, .pptx, etc.?
@deekshitht786
@deekshitht786 3 days ago
You are awesome ❤
@fgfanta
@fgfanta 4 days ago
9:54 I don't get where the 9.8 million comparisons in every attention layer come from. Is every patch compared to every patch? That would be 256 patches => 65,536 comparisons.
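For what it's worth, self-attention over n tokens scores every token against every token, so the count per layer scales as n². A 256-patch sequence gives the 65,536 the comment computes; 9.8 million would instead match a roughly 3136-token sequence (e.g. a 56×56 patch grid), though whether that is what the video assumes is unclear:
```
n \text{ patches} \;\Rightarrow\; n^2 \text{ comparisons per layer}:
\quad 256^2 = 65{,}536,
\quad 3136^2 = 9{,}834{,}496 \approx 9.8 \text{ million}
```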
@natecodesai
@natecodesai 6 days ago
Congrats on the son!
@ofrylivney367
@ofrylivney367 8 days ago
Nice video! I've been implementing a version of this myself... this could be quite useful
@Swooshii-u4e
@Swooshii-u4e 8 days ago
Is this the same thing as Gemini 1.5 Deep Research?
@cloudcoders536
@cloudcoders536 10 days ago
Retrieval-Augmented Generation (RAG) has dominated the discussion around making GenAI applications useful ever since ChatGPT's advent exploded the AI hype. In recent evaluations, GraphRAG demonstrated its ability to answer "global questions" that address the entire dataset, a task where naive RAG approaches often fail. "We need an alternative retrieval method to demonstrate its ability to answer 'global questions' that address the entire dataset, a task where naive RAG approaches often fail in modern AI applications."
Welcome to GraphRAG! GraphRAG outperforms traditional RAG (Retrieval-Augmented Generation) for query-focused summarization. This book is for programmers, researchers, and developers who are interested in LLM techniques and advances in generative AI, specifically the recent GraphRAG: open-source research on knowledge graphs to support human sense-making, improve the accuracy of data discovery, solve RAG pain points, and enhance LLMs (Large Language Models). Includes a case study of a PDF AI chatbot built with Python.
Available on Amazon: www.amazon.com/dp/B0DJB2N5T3
@IzzyLazerson-q3z
@IzzyLazerson-q3z 16 days ago
great video - thanks!
@micbab-vg2mu
@micbab-vg2mu 19 days ago
Thanks! Google Search and citations are great features! :)
@jamesbriggs
@jamesbriggs 19 days ago
yeah they made adding citations easy
@jamesbriggs
@jamesbriggs 19 days ago
📌 Code: github.com/aurelio-labs/cookbook/blob/main/gen-ai/google-ai/gemini-2/web-search.ipynb
@jimy.wuhani9500
@jimy.wuhani9500 19 days ago
It was very clear, thank you man, you are the best
@wolpumba4099
@wolpumba4099 20 days ago
*Overview of Google's Gemini 2 Model and its Multimodal Capabilities*

* *0:00 Introduction:* The video discusses Google's new Gemini 2 model, highlighting its potential to be a strong competitor to OpenAI's models.
* *0:41 Focus on Agents:* Gemini 2 is designed with a focus on agent use cases, emphasizing its ability to produce structured output effectively.
* *1:53 Running the Code:* The presenter demonstrates running the code locally and provides instructions for running it in Google Colab.
* *3:08 Describing Images:*
  * The model accurately describes underwater scenes, identifying various fish and coral species.
  * It successfully identifies a camouflaged fish that was difficult to spot.
  * The model sometimes misidentifies objects, particularly corals.
* *9:29 Image Bounding Boxes:*
  * The model generates bounding boxes around objects in images, providing labels and coordinates.
  * A frequency penalty is used to prevent the model from repeating itself excessively.
  * The model is able to label various fish species, even those partially obscured.
  * The presenter noted improvement in the model's accuracy over a short period, indicating it might be under development.
  * The model struggles with identifying corals accurately.
  * It was able to identify a hidden fish in a shipwreck's turret.
* *21:06 Example 2:* A large fish in the center of the image is correctly identified as a "painted sweetlips", whereas it was not identified in a previous run.
* *23:29 Example 3:* A complex scene with many fish.
  * It's able to identify fish in the distance that are not clearly visible.
  * Identifies a black fish with a distinctive tail as a "Naso lituratus" - which may be related.
* *26:52 Example 4:*
  * Able to identify the scene as a shipwreck.
  * Correctly identifies a cylindrical object as a likely gun barrel.
  * Unable to determine a specific location for the shipwreck.
* *29:09 Image-to-Text:* The model can describe images and identify objects without explicit instructions to draw bounding boxes.
* *30:50 Google Gemini vs OpenAI GPTs:* The presenter expresses optimism about Gemini 2's potential as an agent and its ability to compete with OpenAI's models in real-world use cases.

I used gemini-1.5-pro-exp-0827 on rocketrecap dot com to summarize the transcript.
Cost (if I didn't use the free tier): $0.03
Input tokens: 20215
Output tokens: 547
@johnny017
@johnny017 20 days ago
I don't find this model really impressive for object detection. Florence2 can already do a similar job, and it is an under-1B-param model. For real-world cases, I would not trust prompt engineering to get my results; rather, I would prefer to fine-tune the model. It is also a nightmare when Google does some tweaking on the model, as you were experiencing. I'm also experimenting with grounding using Qwen 2VL, as they have <|object_ref_start|>, <|object_ref_end|>, <|box_start|>, and <|box_end|> tokens specifically for object detection. Thanks James for the update 🙏
@jamesbriggs
@jamesbriggs 20 days ago
thanks for the info - I'll try florence2 and qwen 2vl
@micbab-vg2mu
@micbab-vg2mu 21 days ago
I need to rethink my AI workflows - this model offers many new opportunities.
@jamesbriggs
@jamesbriggs 21 days ago
yes, looking forward to testing gemini more
@ultrasound1459
@ultrasound1459 21 days ago
keep rethinking every month then with new models coming out 💀
@madkimchi5444
@madkimchi5444 21 days ago
Thanks for the demo. An interesting model for sure, but anything non-open-source is not suitable for enterprise use. Not now, not ever - especially since even models tagged as "Appropriate for Enterprise" go through a lot of changes and have their instructions changed while live. It's an absolute nightmare to work with.
@absta1995
@absta1995 21 days ago
A lot, if not most, of the apps used by enterprises are not open source
@jamesbriggs
@jamesbriggs 20 days ago
imo in the future we'll be using more open source LLMs for the reasons @madkimchi5444 said, but right now open source LLMs can't do what we can do with OpenAI and other providers. So although the model changes are annoying, I think the only option for a lot of use-cases (not all) is to go with proprietary models locked behind APIs
@practical-ai-engineering
@practical-ai-engineering 21 days ago
I've been saying this for months. Building LLM-centric agents is extremely limiting and dramatically reduces reliability. You don't need o1 to build good agents. GPT-4-level models do a good job when you break the problem down into smaller pieces and only use LLMs in places where deterministic code can't do the job.
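As an illustration of that decomposition (all names here are hypothetical), deterministic code handles everything it can, and the model is called only for the genuinely open-ended remainder:
```
import re

def llm(prompt: str) -> str:
    """Hypothetical model call - any GPT-4-level endpoint would do."""
    raise NotImplementedError

def handle_request(text: str) -> str:
    # Deterministic first: a plain regex resolves order-status lookups
    # with no model call at all.
    match = re.search(r"order\s+#?(\d+)", text, re.IGNORECASE)
    if match:
        return f"Status for order {match.group(1)}: shipped"  # stubbed lookup

    # Only the open-ended remainder goes to the LLM.
    return llm(f"Answer this customer message concisely:\n{text}")
```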
@danielaguileragarcia4813
@danielaguileragarcia4813 22 days ago
Amazing!! Thanks James! Always explaining things really clearly!
@SidonRoids-cc9gd
@SidonRoids-cc9gd 24 days ago
Such an important topic in the whole of computer science, yet still only 30k views
@MohitAwasthi-p3o
@MohitAwasthi-p3o 25 days ago
Hi James, thanks for beautifully explaining the idea of ViT. May I request a similar video on SLIP as well? I would really appreciate it. Thanks in advance.
@DemetrioFilocamo
@DemetrioFilocamo 28 days ago
It would be good to have some examples to test or see real-life results. For instance, in the 75% discount example I could see how we can semantically block ANY discount from being proposed by the LLM, but will it work if we want to allow a max 10% discount instead?
@andreyklepikov7084
@andreyklepikov7084 29 days ago
Thank you for such a good explanation
@iyanello
@iyanello 1 month ago
James, how does the semantic router behave for long user queries? For example, if the user's request to `check the weather` is in the middle of a long query, the router can miss the function call because the score threshold is pretty high. But a lower threshold will hurt short queries.
@anshulbhide
@anshulbhide 1 month ago
Loved this video! Would be great to have more theoretical videos like this one.
@anshulbhide
@anshulbhide 1 month ago
Especially around the history of AI leading up to today.
@energyexecs
@energyexecs 1 month ago
Great video. I agree with James Briggs. I recently submitted a RAG idea to my AI Idea Evaluation Team: to do RAG on our massive library of PDF technical documents available in the public domain. We did the retrieval and the augmentation of our LLM, but when I started testing and prompting (asking questions), the answers from our LLM (ChatGPT) left a lot to be desired and I was somewhat disappointed. I learned a lot and realized that perhaps we were not really chunking the PDF documents into a vector database, and my prompts had to somewhat juice the LLM. After several prompts the answers got better and better. I was almost losing interest in the whole effort. Then I remembered James Briggs and am still pursuing RAG.
@anshulbhide
@anshulbhide 1 month ago
Great video! However, doesn't the latest version (0.0.72) use RouteLayer instead of SemanticRouter?
@jamesbriggs
@jamesbriggs 1 month ago
0.0.72 is the latest release, but there have been a few “prereleases”, which are all numbered from 0.1.0.dev0 to 0.1.0.dev3, because they’re prereleases, pip won’t install them unless you specify that exact version - once everything is ready I’ll mark the 0.1.0 full release
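For anyone following along: installing one of those prereleases means either naming the exact version or passing pip's `--pre` flag, e.g.:
```
pip install semantic-router==0.1.0.dev3
# or let pip consider prereleases when resolving:
pip install --pre semantic-router
```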
@2098ramkumar
@2098ramkumar 1 month ago
Best video I have seen on embeddings
@CEO_of_code
@CEO_of_code 1 month ago
"Not safe for work" and "safe for work", if you have been wondering
@duongkstn
@duongkstn 1 month ago
thanks <3
@jamesbriggs
@jamesbriggs 1 month ago
you're welcome!
@AristideFumagalli
@AristideFumagalli 1 month ago
Such a cool idea - and it does exactly what it sounds like it needs to do. Will be very useful for tool use and LLM routing
@jamesbriggs
@jamesbriggs 1 month ago
yeah we use it all the time for a lot of different things - I'm planning to share more hands-on examples soon
@wild0nE12
@wild0nE12 1 month ago
Would the semantic router work if the questions are written in different languages?
@DoktorUde
@DoktorUde 1 month ago
If you use a multilingual embedding model like those from OpenAI, you should be fine. It will likely work best when the utterances and input language are the same, but in theory, it should still function even if they are different.
@jamesbriggs
@jamesbriggs 1 month ago
I can put together an example for multilingual use - but yeah it's pretty easy. I'd probably recommend using Cohere's models if you want a hosted multilingual model; this link should help:

- Cohere multilingual model, i.e. "embed-multilingual-v3.0": huggingface.co/Cohere/Cohere-embed-multilingual-v3.0

In semantic router you would install the cohere dependency, i.e. `pip install semantic-router[cohere]`, then use their encoder like:

```
import os
from semantic_router.encoders import CohereEncoder

os.environ["COHERE_API_KEY"] = "your_api_key"
encoder = CohereEncoder(name="embed-multilingual-v3.0")
```

With that you're good to go :)
@wild0nE12
@wild0nE12 1 month ago
Thanks! 💯
@robfielding8566
@robfielding8566 1 month ago
I'm doing something very similar in Golang, manually handling the creation of the system prompt. The hardest thing is to get the models to simply obey the output format! For instance, a bot that fills out forms, acting as the secretary at the front desk of an office. It's instructed to use SAY, SET {fieldName}, SAVE; and its job is to interrogate the user about a list of form fields. So the system prompt has injected into it: []{fieldName, fieldFormatExample, fieldCurrentValue} ... and it interrogates you in whatever language you are speaking. The chat history accumulates existing values. In the end, I have a Go program that loads up a big XML file that contains all the assets such as prompts, HTML templates, and config. The main thing is getting back responses like this:

SAY We are setting your full name to Robert Jason Fielding
SET FirstName Robert
SET MiddleName Jason
SET LastName Fielding
SAVE
SAY What is your phone number?

i.e. filling out form fields in bulk, one round of conversation, easily switching languages. I don't use LangChain when I am using Python; it seems like those libraries are handling a bunch of miscellaneous stuff to read and write random data sources. I wish there was a way to make the output format completely mandatory - literally impossible for it to insert conversational junk that breaks the parser.
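One partial mitigation for the conversational-junk problem is a strict line parser that rejects anything outside the allowed verbs, so off-format output fails fast and can trigger a retry. A minimal Python sketch of the SAY/SET/SAVE grammar described above (the Go version would be analogous):
```
ALLOWED_VERBS = {"SAY", "SET", "SAVE"}

def parse_bot_output(raw: str) -> list:
    """Parse model output into (verb, args) commands, rejecting stray chatter."""
    commands = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        verb, _, rest = line.partition(" ")
        if verb not in ALLOWED_VERBS:
            # Strict mode: any non-command line fails the parse so the caller
            # can re-prompt the model with a corrective message.
            raise ValueError(f"unexpected output line: {line!r}")
        if verb == "SET":
            field, _, value = rest.partition(" ")
            commands.append(("SET", field, value))
        else:
            commands.append((verb, rest))
    return commands
```
Pairing a parser like this with constrained decoding (grammar-based sampling, where the serving stack supports it) is about the closest thing to making the format truly mandatory.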
@tonycardinal413
@tonycardinal413 1 month ago
awesome!
@DavitKhachaturyan-c1y
@DavitKhachaturyan-c1y 1 month ago
I am struggling to understand how the search would work if we needed to handle some kind of structured data. Does anyone have thoughts? I would appreciate your assistance
@felixkuria1250
@felixkuria1250 1 month ago
For building agents, which is preferred: langgraph or crewai?
@jamesbriggs
@jamesbriggs 1 month ago
I haven't used crewai enough to judge how they compare in functionality, but just on usability I find langgraph far better for actual applications - this is based purely on the number of dependencies between the two libraries. Crewai is bloated, and bloat means slow deployments, a higher chance of future issues, dependency conflicts, etc. - it's because of the bloat that I haven't given the library much of a chance. I do plan to try them out again soon though; it's been a while since I last tried
@АнтонБ-х9у
@АнтонБ-х9у 1 month ago
RAGAS must go. I don't really understand why such a weak method is promoted so much.
@janspoerer
@janspoerer 1 month ago
Thank you for the video! The notebook seems to have been removed from the repository. Is it still available?
@SwePalm
@SwePalm 1 month ago
If you want to go deep into the rabbit hole of Neuro-Symbolic AI, there are some interesting recordings here: www.youtube.com/@asimmunawar/videos
@shantanukaushikmathoholic
@shantanukaushikmathoholic 1 month ago
Very helpful, subscribed!!
@yotubecreators47
@yotubecreators47 1 month ago
The font at the beginning is hard to read
@davide0965
@davide0965 1 month ago
Too much talk and very few illustrations
@simonoliverhansen7307
@simonoliverhansen7307 1 month ago
Love your videos and walkthroughs - really easy to understand!! Thank you James
@olimoz
@olimoz 1 month ago
By the way, do check out Max Tegmark's recent paper "The Geometry of Concepts: Sparse Autoencoder Feature Structure" - it points in the same direction
@jamesbriggs
@jamesbriggs 1 month ago
very cool - I'll probably try to reproduce something similar, thanks for sharing!
@olimoz
@olimoz 1 month ago
I've loved embeddings for a long time - so very powerful for semantic AND analogical search. But their power relies on the quality of the embedding model. We've been looking at SAEs (mechanistic interpretability of LLMs) as embedders. All concepts in the embeddings are unique (sparse) and decomposed, i.e. interpretable embeddings from an LLM. Goodfire intends to make this into a service; do take a look at their Goodfire SDK, specifically the Latent Explorer. Their SDK is new this week for the ApartLabs Hackathon tomorrow.
@ramakantasamal7482
@ramakantasamal7482 1 month ago
Just curious if there is a way to pass a required-parameter option to the function schema definition and fail the function call if it's not provided.
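If this refers to OpenAI-style function calling, JSON Schema's `required` list is the standard way to mark mandatory parameters, though models don't always honor it, so validating before execution is a reasonable safeguard. A sketch with a hypothetical `get_weather` tool:
```
tool_schema = {
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],  # the model is told this field is mandatory
    },
}

def validate_call(args: dict) -> dict:
    # Fail the call on our side if the model omitted a required parameter.
    missing = [p for p in tool_schema["parameters"]["required"] if p not in args]
    if missing:
        raise ValueError(f"missing required parameter(s): {missing}")
    return args
```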
@codewithbrogs3809
@codewithbrogs3809 1 month ago
Not a great description of what's going on
@web3namesai
@web3namesai 1 month ago
Great RAG walkthrough! With Web3NS, you can combine this tech with YourName.Web3 to create decentralized AI agents that integrate seamlessly with Web3 ecosystems. 🚀
@UmarFaruk-f8t
@UmarFaruk-f8t 1 month ago
Are you suggesting the image encoder has never seen any cassette player during its training and still has formed an understanding of it in the latent space? I would disagree with this.