Information Extraction with LangChain & Kor

Views: 20,142

Sam Witteveen

1 day ago

Colab : colab.research.google.com/dri...
In this video I go through examples of how you can extract information from text using an LLM with LangChain and Kor.
My Links:
Twitter - / sam_witteveen
Linkedin - / samwitteveen
Github:
github.com/samwit/langchain-t...
github.com/samwit/llm-tutorials
00:00 Previous Video
00:38 Intro to Kor
02:38 Basic examples for text extraction
08:45 Pydantic Classes for extraction

Comments: 52
@VikramSoni2 1 year ago
Great work you are doing here, mate. Love the structure of your videos and Colab examples. I stopped looking in this area because of the explosion of repetitive tools, and now I use only your channel to find out what's new. Keep up the awesome videos!
@kevon217 1 year ago
You’re a godsend. You’ve really helped me understand and utilize the power of these approaches and the packages. Appreciate it!
@Canna_Science_and_Technology 1 year ago
Awesome! Thank you for taking the time to pass on some of your wisdom and knowledge.
@hichamzmarrou3762 1 year ago
Great video as usual. Thanks, Sam, for the great work!
@SloanMosley 1 year ago
Keep ‘em coming 🎉
@RedCloudServices 1 year ago
Sam, I subscribe to a ton of AI LLM channels, and you are a top-notch resource. Thank you. I just need to try this out on the weekends 😆
@gigabytechanz9646 1 year ago
Great Work! Thank You!
@tubingphd 1 year ago
Thank you Sam
@lacknerish 1 year ago
Great video as usual. Thanks for your hard work. You should do one about Microsoft Guidance! I find the template-driven format pretty natural and ideal.
@samwitteveenai 1 year ago
Yeah, I want to do one about that and guardrails etc.
@andy111007 9 months ago
Hey Sam, I changed the example to: examples=[("My first fav meal was at a restaurant called Burnt Ends in Singapore where there top dish italian cuisine was lasagne.", {"name": "Burnt Ends", "location": "Singapore", "style": "cuisine", "top_dish": "lasagne"})], and the results were much better. Cheers, Andy
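A quick way to sanity-check a hand-written few-shot example like the one above is to confirm that every expected value appears word for word in the example text. The sketch below is plain Python and does not use Kor; the helper name `values_missing_from_text` is made up for illustration, and whether literal matches explain the improvement reported above is an assumption, not something the thread confirms.

```python
# Hypothetical helper (not part of Kor or LangChain): list the keys whose
# expected value does not appear verbatim in the example text.
def values_missing_from_text(text, expected):
    lowered = text.lower()
    return [
        key
        for key, value in expected.items()
        if str(value).lower() not in lowered
    ]


# The example text and expected output are copied from the comment above.
example_text = (
    "My first fav meal was at a restaurant called Burnt Ends in Singapore "
    "where there top dish italian cuisine was lasagne."
)
example_output = {
    "name": "Burnt Ends",
    "location": "Singapore",
    "style": "cuisine",
    "top_dish": "lasagne",
}

missing = values_missing_from_text(example_text, example_output)
print(missing)
```

An empty list means every expected value can be found literally in the text; any key that is returned points at a value the model would have to infer rather than copy.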
@ChenXibo 1 year ago
It's just great! One idea: assuming the document being extracted from was written by a human, one validator could be an agent that asks the document's author for help, based on the extracted dataframe.
@mukkeshmckenzie7386 1 year ago
Thank you!
@MadhavanSureshRobos 1 year ago
Amazing. Why can't we use small open-source models to test and improve the responses for these tasks? That would be real value, instead of plugging everything into GPT-3 or GPT-4.
@alx8439 1 year ago
Have a look at Guanaco, which was trained following the new QLoRA approach. It might be interesting for you and your audience.
@adityaroy4261 11 months ago
Are there any open-source models that can extract information, apart from OpenAI?
@sammathew535 1 year ago
Thanks Sam. I actually used it for extracting items that even require some math reasoning (e.g., total cost, which requires math operations on the numbers in the text), and that part is left to the LLM's accuracy. It got the text objects right most of the time but didn't do all that well on numbers. Any suggestions on how this could be implemented?
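One common way to approach the arithmetic problem raised above is to have the LLM extract only the raw numbers and then do the math in ordinary code. The sketch below assumes a made-up extraction result (`extracted_items`, with hypothetical field names `quantity` and `unit_price`); it does not call Kor or any LLM.

```python
from decimal import Decimal

# Hypothetical extraction result: the LLM only copies numbers out of the text.
extracted_items = [
    {"item": "coffee", "quantity": 2, "unit_price": "3.50"},
    {"item": "bagel", "quantity": 3, "unit_price": "2.25"},
]


def total_cost(items):
    """Compute the total in code instead of asking the LLM to do the math."""
    return sum(
        (Decimal(str(row["quantity"])) * Decimal(row["unit_price"]) for row in items),
        Decimal("0"),
    )


total = total_cost(extracted_items)
print(total)
```

Keeping prices as strings and converting them with `Decimal` avoids floating-point rounding in the totals.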
@vivekmathur3068 11 months ago
I am trying to do the same. I can't seem to understand what I need to search for.
@efneogearbox 10 months ago
Hi Sam, thanks for this video! Do you know how to use vector stores with Kor? Kor generates long prompts, and when I add a text it usually exceeds the token limit on OpenAI. When using pure LangChain, I can easily use a text splitter and a vector store to grab the relevant chunks of text, but I find it difficult to replicate this with Kor. Any idea how to get around it? Thank you! Franek
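One simple workaround for the token limit described above is to split the text into overlapping chunks before extraction and run the extraction on each chunk separately. The sketch below is plain Python, not LangChain's text splitter; `split_into_chunks` is a hypothetical name, and character counts are used only as a rough stand-in for tokens.

```python
def split_into_chunks(text, max_chars=2000, overlap=200):
    """Split text into chunks of at most max_chars characters.

    Consecutive chunks share `overlap` characters so that an entity cut at a
    chunk boundary still appears whole in one of the two chunks.
    """
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    if not text:
        return []
    step = max_chars - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + max_chars])
        # Stop once the current chunk reaches the end of the text.
        if start + max_chars >= len(text):
            break
        start += step
    return chunks


demo_chunks = split_into_chunks("abcdefghij", max_chars=4, overlap=1)
print(demo_chunks)
```

The chunk size has to leave room for the schema and examples that the extraction prompt adds on top of each chunk.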
@user-gp6ix8iz9r 1 year ago
Hi Sam, good video 👍 Can you make a video on how to run PrivateGPT on your local machine and Colab? 🙂👍
@Truizify 1 year ago
Thanks for the awesome videos! 👏 What's interesting about this one is that it seems to work well in my limited testing, but the author himself claims the implementation is "half-baked" and prone to error. He recommends that people try alternative libraries like Promptify and MiniChain to achieve the same output. Could you do a video on either or both of those?
@samwitteveenai 1 year ago
Yeah, I should make a benchmark comparing it to the alternatives. I think the author of Kor is very honest, and I think many of the issues have to do with the quality of the LLM rather than the package.
@Truizify 1 year ago
@@samwitteveenai That would be awesome!
@user-ss9bh6jp9s 1 year ago
Hi, thank you so much for the wonderful video. Is it fine to use a company's confidential information for information extraction with LangChain? I mean, doesn't LangChain itself raise privacy concerns for that usage?
@samwitteveenai 1 year ago
No, it is just software that runs on your setup. It's where the tools and LLMs are hosted that causes the privacy issues etc.
@alexdantart 1 year ago
And what about not using OpenAI, and using any nice pretrained model in your language instead?
@shivanidwivedi1625 2 months ago
Does the output we get using Kor depend on the operating system we are using?
@nosiphondlovu6751 8 months ago
Thanks a lot. How is your real-life example not nested? How is that possible?
@user-wr4yl7tx3w 1 year ago
Wow, that’s really cool. Is Kor the only game in town for doing this currently?
@samwitteveenai 1 year ago
No, there are a couple of other ways as well, so I might make a video about them at some point.
@krishradha5709 10 months ago
Can we use the extracted text as an input to the LLM?
@fernandosanchezvillanueva4762 1 year ago
Great work. How long can the sentence be? The same number of tokens that ChatGPT allows?
@samwitteveenai 1 year ago
It's not limited to a sentence, really. It can be anything you can 'stuff' into one pass of the LLM. I generally do a few paragraphs at a time.
@user-dg7ud7wv5n 6 months ago
I cannot get past the StringIO error in the Kor library. Can anyone help me with this?
@pypypy4228 11 months ago
14:23 probably pd.json_normalize(json_data) would work out of the box here
@andy111007 9 months ago
Hey Sam, chain.predict_and_parse has been deprecated. Please change it to:
import json
def printOutput(output):
    print(json.dumps(output, sort_keys=True, indent=3))
output2 = chain.run(text=("Alice Doe moved from New York to Boston, MA while Bob Smith did the opposite"))["data"]
printOutput(output2)
Regards, Andy
@eduardomoscatelli 1 year ago
Incredible. The million-dollar question 😊: how do you "teach" ChatGPT the schema just once and then validate unlimited texts, without spending tokens on the schema in every prompt and without fine-tuning the model?
@samwitteveenai 1 year ago
It is all put in via ICL (in-context learning).
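In-context learning means the schema and the examples travel inside every prompt, so they are paid for in tokens on each call. The sketch below illustrates the idea with a made-up prompt layout; `SCHEMA_DESCRIPTION`, `EXAMPLES`, and `build_prompt` are hypothetical names, and this is not the prompt format Kor actually generates.

```python
import json

# Hypothetical schema description and few-shot example, for illustration only.
SCHEMA_DESCRIPTION = "Extract restaurants as JSON objects with the keys: name, location."
EXAMPLES = [
    (
        "I ate at Burnt Ends in Singapore.",
        {"name": "Burnt Ends", "location": "Singapore"},
    ),
]


def build_prompt(text):
    """Assemble one prompt: schema first, then examples, then the new text."""
    lines = [SCHEMA_DESCRIPTION, ""]
    for example_text, example_output in EXAMPLES:
        lines.append(f"Input: {example_text}")
        lines.append(f"Output: {json.dumps(example_output)}")
        lines.append("")
    lines.append(f"Input: {text}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_prompt("We had dinner at Noma in Copenhagen.")
print(prompt)
```

Every call to `build_prompt` repeats the schema and the examples, which is why the schema cannot be "taught once" without fine-tuning.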
@eduardomoscatelli 1 year ago
@@samwitteveenai Thanks for the tip. Can you point me to any material so I can do this in a no-code way?
@programwithpradhan 1 year ago
Can you please tell me whether, instead of giving text data, there is any way I can give embedding vectors as input to the LLM with the approach you discussed in this video?
@samwitteveenai 1 year ago
I am not sure why you would want to do that. Can you explain?
@programwithpradhan 1 year ago
@@samwitteveenai I can't give all of the raw text to OpenAI because the text is so long, more than 200k characters, so I need to split the text into chunks and do the embedding.
@TheKingfysher 1 year ago
Do you know if it's possible to feed multiple text chunks into the pipeline, like you can do with the LangChain QA chain?
@samwitteveenai 1 year ago
Yeah, that should be doable. It will really operate on any input.
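Feeding several chunks through the pipeline can be sketched as a loop that runs the extraction on each chunk and merges the records, dropping exact duplicates. In the sketch below, `extract_from_chunks` and `fake_extract` are hypothetical names; `fake_extract` is a stand-in for a real LLM extraction call and only exists so the example runs without an API key.

```python
import json


def extract_from_chunks(chunks, extract):
    """Run `extract` on every chunk and merge the records without duplicates."""
    merged = []
    seen = set()
    for chunk in chunks:
        for record in extract(chunk):
            # Serialize with sorted keys so identical records compare equal.
            key = json.dumps(record, sort_keys=True)
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged


def fake_extract(chunk):
    """Stand-in extractor; a real one would call the extraction chain."""
    known = {"Alice": "Boston", "Bob": "New York"}
    return [
        {"name": name, "city": city}
        for name, city in known.items()
        if name in chunk
    ]


chunks = ["Alice moved to Boston.", "Alice and Bob met; Bob lives in New York."]
records = extract_from_chunks(chunks, fake_extract)
print(records)
```

Exact-match deduplication is the simplest merge rule; records that differ only in spelling or partial fields would need a fuzzier comparison.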
@onirdutta666 1 year ago
Guys, is anybody else facing the await issue? How do you solve this one?
@programwithpradhan 1 year ago
Hi, have you solved that issue? I am facing the same problem. Please reply.
@muhamadabdallah7960 1 year ago
Can you please tell me about, or make a new video on, building a tool or a Transformers agent that can take audio and dub it into another language with Whisper or NLLB-200, and make a talking avatar say it with SadTalker, for free? Thank you very much.
@Quitcool 11 months ago
I think that after the release of the OpenAI Functions agent, Kor is useless.
@Teathebest0 1 year ago
Hi, does it work with LinkedIn?
@samwitteveenai 1 year ago
What do you want to do with LinkedIn?
@maninzn 1 year ago
I have a PDF with table data. What is the best way to extract that and store it as vectors for proper retrieval? The standard text splitter is not accurate, since it stores the table as one continuous text. Cheers!