LangChain101: Connect Google Drive Files To OpenAI

Views: 33,491

Greg Kamradt (Data Indy)

Comments: 96
@temozarela 1 year ago
I'm so obsessed, going through all of these videos one by one. No better way to spend my Saturday, especially when things work! Thanks for your amazing contribution!
@adamsardo 1 year ago
Appreciate what you've been doing and the time you've spent helping the community :)
@moreshk 1 year ago
It might be a bit silly to ask, but it would be useful if you could provide some guidance on how to set up the credentials JSON. I have been fumbling with it.
@merkemong1496 1 year ago
Same
@merkemong1496 1 year ago
Have you found a way to set up the credentials? I put my credentials.json at the correct path but it still says not found.
@davidwu3247 1 year ago
Awesome vid. Can't wait till GPT-4 is out and we can use Google Drive photos/text as multimodal input.
@DataIndependent 1 year ago
Big time! That is going to be super cool.
@fliu5282 1 year ago
Python + LangChain + basic HTML coding = Big Future = Prompt Engineering
@DataIndependent 1 year ago
Nice
@rossgalvanofficial 1 year ago
Thank you for sharing this, very interested.
@badrinarayanans355 3 months ago
Great insights
@VictorCardonan 1 year ago
Hello, thank you for the videos. They are really interesting. I have two questions: 1) Why are you not using embeddings in this case? 2) Would it make sense, and is it possible, to save the state of the summarizer so you don't have to run the whole process from scratch if you have 1,000+ documents? Thank you
@MK-jn9uu 1 year ago
I was thinking the same thing.
@EstherL-wd9yx 1 year ago
@DataIndependent - My main question is #2: How can we build a database of documents so that the knowledge base grows without redoing all of the processing from scratch?
@blocksystems202 1 year ago
You're amazing - thanks for sharing.
@DataIndependent 1 year ago
My pleasure!
@bladeplays6425 1 year ago
One use case that I would love to see is how this performs on Excel/Google Sheets data. Given event/log data from a website or a mobile app, plus documentation on what activity each event type in the log represents, does the model know how to answer questions about frequent (or user-specific) app activity?
@RussellDeming 1 year ago
Definitely interested in implementing this in my business.
@DataIndependent 1 year ago
Nice! What domain are you in? How are you thinking about using it?
@ahsanahmad3193 1 year ago
You should have shown the structure of the credentials file. Maybe add it in a comment.
@carlosterrazas5091 1 year ago
Great content, just a question about the security of the information. Do you know whether ChatGPT will see the information this way, as if you had entered it on their platform? My concern is that if you use it for private documents, the info will end up in ChatGPT's database for everyone to see. Thanks
@weipingwu7852 1 year ago
Thanks very much! I have a question: I want to restrict the use of my documents to internal company use only. If I use LangChain, can other parties, including OpenAI, see my documents? Thanks
@DataIndependent 1 year ago
Yes, if you use OpenAI as your LLM then they can see your data. Check out their data retention policies for more information. You could use a self-hosted LLM for privacy reasons, but that takes more setup.
@Iammikelovin 1 year ago
Hi, can you recommend info on self-hosted LLMs? Can I use OpenAI and basically not have them retain my data? Or do I have to use another LLM?
@frankrobert9199 1 year ago
Great
@rahuliitm 1 year ago
Great tutorial. Absolutely loving it. I'm trying to read a GitBook and summarise it, but apparently there's a prompt context length limit: "This model's maximum context length is 4097 tokens, however you requested 7592 tokens". Not sure where I can set the token limit.
@jmanhype1 1 year ago
Yeah, that's why he's selling his service to fill in the gaps.
@DataIndependent 1 year ago
Nice! Yes, there is a context limit for prompts. Check out either my video on asking a question to a 300-page book or my "workarounds for prompt limit" video.
@DataIndependent 1 year ago
Nothing to sell here - happy to help with any questions you have though.
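For readers hitting the same context-length error, here is a minimal sketch of the usual workaround: split the text into pieces that each fit a token budget, then summarize the pieces separately. The ratio of 4 characters per token is a rough assumption rather than OpenAI's real tokenizer, and `split_into_chunks` is a hypothetical helper, not a LangChain function.

```python
CHARS_PER_TOKEN = 4  # rough assumption, not the real tokenizer


def split_into_chunks(text: str, max_tokens: int = 3000) -> list[str]:
    """Greedily pack whole words into chunks that stay under the character budget.

    A single word longer than the budget still becomes its own (oversized) chunk.
    """
    max_chars = max_tokens * CHARS_PER_TOKEN
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0  # characters in the current chunk, including joining spaces
    for word in text.split():
        added = len(word) + (1 if current else 0)
        if current and current_len + added > max_chars:
            # The next word would overflow the budget, so close this chunk.
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += added
    if current:
        chunks.append(" ".join(current))
    return chunks


long_text = "word " * 10000
pieces = split_into_chunks(long_text, max_tokens=1000)
print(len(pieces), max(len(p) for p in pieces))
```

Each piece can then be summarized on its own and the partial summaries combined, which is the idea behind the map-reduce style chain types mentioned elsewhere in this thread.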
@briandao975 1 year ago
Awesome video, thank you. Do you have a video on how to use embeddings in the same scenario? I would like to create something similar but have a lot of docs. Also, is there a way to refresh the embeddings automatically or on a schedule? For example, if a doc gets updated, how does that get handled?
@eracton 1 year ago
Did you figure that out?
@manyavarshney4399 1 year ago
Hello, can you help resolve my error? I gave the credentials path and it executed. But when I loaded the document, it displayed "Access blocked to the Google Drive API".
@DataIndependent 1 year ago
Have you googled it? That sounds like a Google credential issue.
@ujjwalgupta1318 1 year ago
Aren't this and the directory loader doing a similar sort of thing?
@DheerSinghDel 1 year ago
Can you explain exactly what the path to the credentials folder should be, assuming that I am working with Google Colab and the .ipynb file resides in my Drive at /ColabNotebooks/LangChain/drivetest.ipynb?
@DataIndependent 1 year ago
I would put this question into ChatGPT and have it work with you on the details. It requires knowledge about your setup which I don't have.
@leticiaromanbernal4151 1 year ago
Hi, I would like to know if there's any possibility of connecting Google Sheets from my Google Drive account, as it does with Google Docs. Please help me. Thanks a lot :)
@DataIndependent 1 year ago
Big time - you can use LangChain's Drive loader: python.langchain.com/docs/modules/data_connection/document_loaders/integrations/google_drive
@AizzatAffero 1 year ago
Once LangChain has read all of it, does it store the data for when we reopen it?
@federicogiacomarra 1 year ago
Not sure if this is explained elsewhere: can you somehow retrieve the source document together with the answer?
@nsitkarana 1 year ago
Nice video. I have one follow-up: when I do any kind of interaction with OpenAI (for instance with the doc from Google Drive), or in the other video where I chunk/embed local documents, how safe are the personal documents? In other words, how safe is it to use OpenAI for personal documents? Does anyone have any idea?
@Iammikelovin 1 year ago
Hello, I have just started watching a few of your vids. They're super interesting and really well explained, thanks! Q: The source files, in my case several PDF docs, are confidential, and my idea is to create an internal Q&A. What about privacy? Does LangChain or OpenAI potentially have access to them? Does it add them to its "brain"? Or is it completely private? Thanks again
@bagamanocnon 1 year ago
Data sent through the OpenAI APIs, such as the questions fed to the LLM and the answers it outputs (what OpenAI calls prompts and completions, respectively), will be stored on their servers for 30 days before being purged. Per their policy, only a limited number of OpenAI employees, those who monitor for abuse, will have access to the data. Enterprise customers might even have the option to opt out of having their data stored at all. Look up the OpenAI API usage policies; I can't paste a link here. Using their embeddings service also exposes your data to OpenAI. The demo in this video doesn't use embeddings (it reads the text directly), but you almost always want to create a vector index with embeddings for your knowledge base (KB), especially if it consists of hundreds or thousands of documents. A vector index lets you retrieve only the relevant chunks instead of passing all of the raw text to the LLM. Cheers.
@DataIndependent 1 year ago
Agree! And if you don't want OpenAI to have your data, then you should be using a local model.
@AnkurChauhan-n3z 1 year ago
Hi Greg, I am getting an error while trying to connect Google Drive files to OpenAI. The error is: ValueError: Client secrets must be for a web or installed app. Could you please help me resolve this error? I am using Azure credentials.
@DataIndependent 1 year ago
Because Azure and Google Drive are run by different companies, the credentials won't work. Try getting Google credentials.
@AnkurChauhan-n3z 1 year ago
@DataIndependent Thanks Greg 😇
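Several comments ask what the credentials file should look like. As a hedged sketch: the error quoted above is what Google's OAuth client library raises when the JSON has neither a top-level "installed" key (desktop app) nor a "web" key. The example values below are placeholders, and `client_secret_kind` is a hypothetical helper written for illustration, not part of any Google or LangChain library.

```python
import json


def client_secret_kind(config: dict):
    """Return "installed" or "web" if the config has that top-level key, else None."""
    for kind in ("installed", "web"):
        if kind in config:
            return kind
    return None


# Placeholder example of the general shape of a desktop-app OAuth client file.
example = json.loads("""
{
  "installed": {
    "client_id": "YOUR_CLIENT_ID.apps.googleusercontent.com",
    "client_secret": "YOUR_CLIENT_SECRET",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "redirect_uris": ["http://localhost"]
  }
}
""")

print(client_secret_kind(example))  # prints: installed
```

If the helper returns None for your own file (loaded with `json.load`), the file is probably a different kind of credential, for example a service account key or a file from another provider, and an OAuth client ID would need to be downloaded from the Google Cloud console instead.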
@ahmadzaimhilmi 1 year ago
Still studying this LangChain module. I'm looking to chain a series of questions, i.e. use the result of one question to generate the next question.
@DataIndependent 1 year ago
Nice, that would likely be an agent. What's the example you want to do?
@ahmadzaimhilmi 1 year ago
@DataIndependent A business plan aims to develop a research plan for a thesis. The research plan needs to find a research gap, which means an unexplored area in the existing literature. Otherwise, the research would be repetitive and unoriginal. This is a difficult part that involves a lot of writing and concentration. It might take around nine months to finish if one is very committed. To do this, one has to go through hundreds of papers and learn about the methods, materials, standards and challenges of similar research. There is a technique for doing this, but an LLM simplifies it a lot. My approach is to use BERT or another tool to get relevant keywords from the papers and build on them for the research plan. This way, the researcher spends less time on the writing part and can focus on doing the experiment.
@photon2724 1 year ago
Another fantastic tutorial! Although, what is the credentials.json file, and how can I get my own?
@DataIndependent 1 year ago
Thanks! That is on the Google side of the house. developers.google.com/workspace/guides/create-credentials
@anishmanandhar1203 1 year ago
@DataIndependent And what do we do with it? How do we get the .json file?
@cgtinc4868 1 year ago
Sorry for the noob question: where do I place the "../../desktop_credetnaisl.json"? I have to admit that I am a non-coder, just following your video along the way.
@DataIndependent 1 year ago
Nice! You can place your credentials file wherever you want. By default your program will usually look in a root folder, but you can tell it to look wherever you need. If your credentials were in the same folder as your script, you could use "credentials.json" without going up/down from any folder.
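To make the path advice concrete, here is a small sketch using only the standard library. A relative path such as "../../credentials.json" is resolved against the current working directory, so printing the absolute path is a quick way to see where the program is actually looking. `resolve_credentials` is a hypothetical helper name used only for this illustration.

```python
from pathlib import Path


def resolve_credentials(relative_path: str, base_dir: Path) -> Path:
    """Resolve a relative path against base_dir and return the absolute path."""
    return (base_dir / relative_path).resolve()


# Each ".." moves up one folder from the base directory.
candidate = resolve_credentials("credentials.json", Path.cwd())
print(candidate, "exists" if candidate.is_file() else "is missing")
```

If the printed path is not where the file lives, either move the file there or pass the loader a path that matches the real location.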
@cgtinc4868 1 year ago
@DataIndependent Thanks! Wrote to you on Twitter as well.
@adamtemple8677 1 year ago
Is it still limited by the prompt token limits, or can you use an entire G-Drive and chat with all your documents?
@user-vm7xx3wi8cbz 9 months ago
Did you figure this out?
@HumzaAslam-i8l 1 year ago
How do I get my credentials path from Google?
@DataIndependent 1 year ago
*You* give your credentials path to Google. This guide may help: googleapis.dev/python/google-auth/latest/user-guide.html
@neon_Nomad 1 year ago
What about Nextcloud or Syncthing?
@DataIndependent 1 year ago
Could you link me to the examples you'd want to see?
@ivantan222 1 year ago
4:00 That's a pretty short summary of the long text. Is there any parameter to make it longer?
@DataIndependent 1 year ago
You can see the prompt that is being used to generate this summary here: github.com/hwchase17/langchain/blob/master/langchain/chains/summarize/stuff_prompt.py Under the hood it's just a prompt with your text in it. You could adjust the prompt manually (not by using the chain, but by writing your own prompt) to get a longer one.
@ivantan222 1 year ago
@DataIndependent Ah okay, thanks a lot for the info.
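As a sketch of the "write your own prompt" suggestion above: the template below is an assumption made up for illustration, not the prompt LangChain ships, and `build_summary_prompt` is a hypothetical helper. It only builds the prompt string; sending it to a model is left to whichever client you use.

```python
# Hypothetical template asking for a longer summary than a "concise" prompt would.
LONG_SUMMARY_TEMPLATE = (
    "Write a detailed summary of the following text in about {n_paragraphs} paragraphs:\n\n"
    '"{text}"\n\n'
    "DETAILED SUMMARY:"
)


def build_summary_prompt(text: str, n_paragraphs: int = 3) -> str:
    """Fill the template with the document text and the requested length."""
    return LONG_SUMMARY_TEMPLATE.format(text=text, n_paragraphs=n_paragraphs)


prompt = build_summary_prompt("LangChain can load documents from Google Drive.", n_paragraphs=4)
print(prompt)
```

Asking for a length in the prompt is only a nudge. The model's output token limit still caps how long the summary can be.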
@learnapplybuild 1 year ago
Please make a video on OneDrive.
@joelmartinez7628 1 year ago
Still skeptical about opening our internal information to GPT-3. The information will definitely be used for training, and internal information will become public once fed to GPT-3. Am I wrong to ask whether they have a plan under which they can use the data for training but not expose it as public information?
@DataIndependent 1 year ago
I totally agree - it's a problem that will need to get solved. I actually tweeted about this same question here: twitter.com/GregKamradt/status/1627338667936337921 AFAIK this isn't on the roadmap for them yet, but I hope I'm wrong.
@VictorCardonan 1 year ago
Why don't you use GPT4All, which can be installed locally and does not send any data outside? It won't be as good or as straightforward, but it can give you a good result.
@ezequielmelillan1708 1 year ago
Hi man, thanks for sharing, this is amazing. Can you make a video using an Alpaca/LLaMA integration with LangChain? Is it possible to use embeddings with those open-source models?
@DataIndependent 1 year ago
Yep, it's very possible. You just need to swap out your embeddings model.
@haisai4159 1 year ago
Amazing tutorial! Beginner here: can you do this for Google Sheets, and with a Google Colab notebook instead of a Jupyter notebook? Thank you!
@DataIndependent 1 year ago
What's the use case you'd want to run through?
@AmineBELALIA 1 year ago
@DataIndependent I have the same problem. I have a list of product specifications (2,000 specs) and I want to build a chatbot that can answer customer questions about these products and explain the technical details of each spec by searching the internet (the Google Sheet doesn't have this level of detail).
@vinosamari 1 year ago
Please do a map-reduce video.
@DataIndependent 1 year ago
Here's a video explaining the different chain_types: kzbin.info/www/bejne/nGrCc4qeeK5nkNE
@wardaraees4887 1 year ago
I want to ask questions of my Excel files or a dataset in CSV format (not a text file), or maybe of a table returned by a SQL query on SQL Server. Is it possible to upload such a file to Google Drive in the same way, or is this method just for text files? Or is there a direct way to ask questions of my SQL table with OpenAI?
@DataIndependent 1 year ago
Check out the LangChain documentation for how to query SQL databases, it's very doable.
@zes7215 1 year ago
wrg
@abdoualgerian5396 1 year ago
The only bad thing about your content is the disturbing background music. Not all people can concentrate on a mixture of more than one sound.
@we-hate-copy-pasting 1 year ago
Would it make sense to store embeddings in a database like Pinecone to avoid regenerating them with each call?
@DataIndependent 1 year ago
If you want them remote, then yep, that would work. I should have put that example in the video.
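The idea of not regenerating embeddings can be sketched without any external service. Below, a plain dictionary stands in for a vector database such as Pinecone, and `fake_embed` is a made-up placeholder for a real embeddings call. Both are assumptions for illustration only. Keying the cache on a hash of the text also covers the "what if the doc gets updated" question raised earlier, since edited text produces a new key and is embedded again.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Cache embeddings by a hash of the text so unchanged documents are not re-embedded."""

    def __init__(self, embed_fn: Callable[[str], List[float]]) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, List[float]] = {}  # stand-in for a vector database

    def get(self, text: str) -> List[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            # Only call the (potentially paid) embedding function on a cache miss.
            self._store[key] = self._embed_fn(text)
        return self._store[key]


calls: List[str] = []


def fake_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real embeddings API call."""
    calls.append(text)
    return [float(len(text)), float(text.count(" "))]


cache = EmbeddingCache(fake_embed)
cache.get("first document")
cache.get("first document")          # served from the cache, no second call
cache.get("first document, edited")  # content changed, so it is embedded again
print(len(calls))  # prints: 2
```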
@ryanonvr2267 1 year ago
---> 76 with open(self.token_path, "w") as token:
     77     token.write(creds.to_json())
     79 return creds
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\info\\.credentials\\token.json'
(Even though the cred file is correct somewhere else.) :( Newb
@DataIndependent 1 year ago
You can do two things: 1) Make sure your cred file is in the location your script is looking for (I'm guessing it's the directory you mentioned above). 2) Tell your script to look elsewhere. This would be the location of your creds file, wherever you would like it. I usually keep it in the same folder or a parent folder above.
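One more possibility, offered as an assumption based only on the traceback above: the failure happens while opening token.json for writing, and Python raises FileNotFoundError in that case when the parent folder does not exist. Creating the folder first is a cheap thing to try. `ensure_parent_dir` is a hypothetical helper, and the demo uses a temporary directory so it does not touch your home folder.

```python
import tempfile
from pathlib import Path


def ensure_parent_dir(file_path: Path) -> Path:
    """Create the folder that should contain file_path, if it is missing."""
    file_path.parent.mkdir(parents=True, exist_ok=True)
    return file_path


with tempfile.TemporaryDirectory() as tmp:
    token_path = ensure_parent_dir(Path(tmp) / ".credentials" / "token.json")
    # Without the mkdir above, this write would raise FileNotFoundError.
    token_path.write_text("{}")
    demo_ok = token_path.is_file()

print(demo_ok)  # prints: True
```

In the commenter's situation the equivalent call would be `ensure_parent_dir(Path.home() / ".credentials" / "token.json")`, run once before loading documents.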
@TreiGamer 1 year ago
Hey Data Independent, I'm new to Python and coding in general, but AI has been the push I needed to really dig into this. I got Jupyter running locally. Is there a recommended resource you'd point me towards for bringing your code into it?
@TreiGamer 1 year ago
Haha never mind, I figured it out. I just asked GPT 🤣 Love your content.
@DataIndependent 1 year ago
Nice! That's great. What I was going to say is: Easiest - copy and paste the code from the GitHub link in the description into your Jupyter notebook. More robust - git clone the repo so you can stay up to date with future changes as well.
@TreiGamer 1 year ago
I did the git clone method. Thank you.
@coachfrank2808 1 year ago
Nice!
@DataIndependent 1 year ago
Thanks!
@cgtinc4868 1 year ago
Great video, and as the founder of a startup I need this tool! Is there a way to access something other than Google Drive, like a Synology NAS (which we use)? That would be really, really helpful.
@DataIndependent 1 year ago
Thank you! I've never heard of Synology. To integrate it, you would either need a custom data loader from LangChain/Unstructured, or you'd need to export the files you want to another spot.
@cgtinc4868 1 year ago
@DataIndependent Thanks! It's just a brand of external NAS. Maybe you could do a video on a local hard drive; with that we could just change the path to wherever the source documents are :)