LangChain - Using Hugging Face Models locally (code walkthrough)

116,308 views

Sam Witteveen

1 day ago

Comments: 84
@insightbuilder · 1 year ago
Keep up the great work. And thanks for curating the important HF models that we can use as alternatives to paid LLMs. When learning new tech, using the free LLMs gives the learner a lot of benefits.
@bandui4021 · 1 year ago
Thank you! I am a newbie in this area and your vids are helping me a lot to get a better picture of the current landscape.
@قناة_لحظة_إدراك · 1 year ago
How can the ready-made projects on the platform be linked to Blogger blogs? I have spent long days searching, to no avail.
@sakshikumar7679 · 6 months ago
Saved me from hours of debugging and research! Thanks a ton.
@morespinach9832 · 10 months ago
This is helpful because in some industries like banking or telcos, it's impossible to use externally hosted services, so we need to host the models ourselves.
@tushaar9027 · 1 year ago
Great video Sam, I don't know how I missed this.
@luis96xd · 1 year ago
I have a problem: when I use low_cpu_mem_usage or load_in_8bit, I get an error that I need to install xformers. When I install xformers, I get an error that I need to install accelerate; when I install accelerate, an error that I need to install bitsandbytes; and so on through einops, accelerate, sentence_transformers, bitsandbytes. But finally I get the error *NameError: name 'init_empty_weights' is not defined*. I don't know how to solve this error or why it happens. Could you help me please?
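For anyone hitting the same cascade, a hedged sketch of what commonly fixes it in Colab: install all the dependencies up front, then restart the runtime before loading (init_empty_weights comes from accelerate, and it often fails to resolve if accelerate was installed after transformers was already imported). The model id and flags below are just placeholders.

    # A sketch, assuming a Colab runtime with a CUDA GPU. Install first,
    # then restart the runtime, then run the loading code.
    !pip install -q transformers accelerate bitsandbytes einops sentence_transformers

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "EleutherAI/gpt-neo-125M",   # placeholder model id
        device_map="auto",           # requires accelerate
        load_in_8bit=True,           # requires bitsandbytes + a CUDA GPU
    )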
@prestigious5s23 · 1 year ago
Great tutorial. I need to train a model on some private company documents that aren't publicly released yet, and this looks like it could be a big help to me. Subbed!!
@intelligenceservices · 5 months ago
Is there a way to consolidate a Hugging Face repo into a single safetensors file? (from a repo that has the separate directories: scheduler, text_encoder, text_encoder_2, tokenizer, etc.)
@atharvaparanjape9585 · 7 months ago
How can I load the model again later, once I've downloaded it to my local drive?
@steev3d · 1 year ago
Nice video. I'm trying to connect an LLM and use Unity 3D as my interface for STT and TTS with 3D characters. I just found a tool that enables connecting to an LLM on Hugging Face, which is how I discovered that you need a paid endpoint with GPU support to even run most of them. I kind of wish I'd found this video when you posted it. Very useful info.
@anubhavsarkar1238 · 6 months ago
Hello. Can you please make a video on how to use the SeamlessM4T Hugging Face model with LangChain? Particularly for text-to-text translation. I am trying to do some prompt engineering with the model using LangChain's LLMChain module, but it does not seem to work...
@Chris-se3nc · 1 year ago
Thanks for the video. Is there any way to get an example using the LangChain JavaScript library? I am new to this area, and I think many developers have a Node rather than a Python background.
@MohamedNihal-rq6cz · 1 year ago
Hi Sam, how do you feed in your personal documents and query them so the response comes back as generative question answering rather than extractive question answering? I'm a bit new to this library and I don't want to use OpenAI API keys. Please provide some guidance on doing this with open-source LLM models. Thanks in advance!
@samwitteveenai · 1 year ago
That would require fine-tuning the model if you want to put the facts in there. That is probably not the best way to go, though.
@binitapriya4976 · 1 year ago
Hi Sam, is there any way to generate questions and answers from a given text in a .txt file and save those questions and answers to another .txt file with the help of a free Hugging Face model?
@botondvasvari5758 · 7 months ago
And how can I use big models from Hugging Face? I can't load them into memory because many of them are bigger than 15 GB, and some are 130 GB+. Any thoughts?
@samwitteveenai · 7 months ago
You need a machine with multiple GPUs.
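With accelerate installed, device_map="auto" will shard a checkpoint across however many GPUs are visible, and it can offload whatever still doesn't fit. A hedged sketch (the model id is a placeholder):

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "some-org/very-large-model",  # placeholder id
        device_map="auto",            # shard layers across all visible GPUs
        offload_folder="offload",     # spill layers that still don't fit to disk
    )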
@megz_2387 · 1 year ago
How do I fine-tune this model so that it can follow instructions on the data provided?
@marko-o2-h20 · 1 year ago
If we cannot afford an A100, what cheaper option would you recommend to run these? I understand the models also differ in size. Thanks Sam.
@jzam5426 · 1 year ago
Thanks for the content!! Is there a way to run a HuggingFacePipeline-loaded model using M1/M2 processors on a Mac? How would one set that up?
@DanielWeikert · 1 year ago
I tried to store the YoutubeLoader documents in FAISS using HuggingFace embeddings, but the LLM was not able to do the similarity search, and Colab eventually timed out. Can you share how to do this without using OpenAI? With OpenAI I had no issues, but I'd like to do it with HF models instead, e.g. Flan. br
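For reference, a minimal sketch of the OpenAI-free version, assuming the documents are already loaded and split (class paths match the LangChain versions of that era; the embedding model id is just an example):

    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import FAISS

    # Free, local sentence-transformers model instead of OpenAI embeddings
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    db = FAISS.from_documents(docs, embeddings)   # docs: your split documents
    results = db.similarity_search("What is the video about?", k=4)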
@computadorhumano949 · 1 year ago
Hey, why does it take so long to respond? Does my CPU need to be fast for this?
@samwitteveenai · 1 year ago
Yeah, for the local stuff you really need a GPU rather than a CPU.
@induu954 · 1 year ago
Hi, I would like to know: can we chain 2 models, like a classification model and a pretrained model, using LangChain?
@samwitteveenai · 1 year ago
You could do it through a tool. I'm not sure there is anything built into LangChain for classification models, if you mean something like a BERT etc.
@luis96xd · 1 year ago
Amazing video, everything was well explained. I needed it, thank you so much!
@magnanimist · 1 year ago
Just curious, do you need to re-download the model every time you run scripts like these? Is there a way to save the model and use it after it's been downloaded?
@samwitteveenai · 1 year ago
If you are doing this on a local machine, the model is cached after the first download, so Hugging Face will load it from disk after that. You can also do model.save_pretrained('model_name')
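A minimal sketch of that save-then-reload flow (the path and model id are just examples):

    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model_id = "google/flan-t5-large"
    tokenizer = AutoTokenizer.from_pretrained(model_id)   # downloads once, then cached
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Save an explicit local copy so later runs never touch the network
    model.save_pretrained("./flan-t5-large-local")
    tokenizer.save_pretrained("./flan-t5-large-local")

    # Later: load straight from the local directory
    model = AutoModelForSeq2SeqLM.from_pretrained("./flan-t5-large-local")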
@AdrienSales · 1 year ago
Excellent tutorial, and so well explained. Thanks a lot.
@younginnovatorscenterofint8986 · 1 year ago
Hello Sam, how do you solve: "Token indices sequence length is longer than the specified maximum sequence length for this model (2842 > 512). Running this sequence through the model will result in indexing errors." Thank you in advance.
@samwitteveenai · 1 year ago
This is a limitation of the model, not LangChain. There are some models on HF that go to 2048.
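One common workaround is to split the input before it ever reaches the model; a hedged sketch using LangChain's text splitter (the sizes are arbitrary):

    from langchain.text_splitter import RecursiveCharacterTextSplitter

    # chunk_size is in characters here, so keep it well under the model's
    # token limit (roughly 4 characters per token for English text)
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_text(long_text)   # long_text: your 2842-token string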
@yves1893 · 1 year ago
I am using the Hugging Face model chavinlo/alpaca-native. However, when I use it with this pipeline:

    pipe = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        max_length=248,
        temperature=0.4,
        top_p=0.95,
        repetition_penalty=1.2,
    )
    local_llm = HuggingFacePipeline(pipeline=pipe)

my output is always only 1 word long. Can anyone explain this?
@SomuNayakVlogs · 1 year ago
Can you create one for CSV as input?
@samwitteveenai · 1 year ago
I made another video on using CSVs with LangChain; check that out.
@SomuNayakVlogs · 1 year ago
@samwitteveenai Thanks Sam, I already watched that video, but it is with OpenAI. I wanted LangChain with CSV and Hugging Face.
@SomuNayakVlogs · 1 year ago
Can you please help me with that?
@Marvaniamehul · 1 year ago
I am also curious whether we can use the Hugging Face pipeline (run locally) with LangChain to load a CSV file.
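Untested, but the pieces should compose along these lines: LangChain's CSVLoader turns each row into a document, and everything downstream can stay local (the file name and default models are placeholders):

    from langchain.document_loaders import CSVLoader
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import FAISS

    docs = CSVLoader(file_path="data.csv").load()   # one document per row
    db = FAISS.from_documents(docs, HuggingFaceEmbeddings())
    hits = db.similarity_search("which row mentions revenue?")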
@brianrowe1152 · 1 year ago
Stupid question, so I'll take a link to another video/docs/anything. Which Python version, CUDA version, and PyTorch version are best for this work? I see many using Python 3.9 or 3.10.6 specifically. The PyTorch site recommends 3.6/3.7/3.8 on the install page. Then the CUDA version, 11.7 or 11.8; it looks like 11.8 is experimental? And when I look at my nvcc output it says 11.5, but nvidia-smi says CUDA Version 12.0... head explodes. I'm on Ubuntu 22.04. I will google some more, but if someone knows the ideal setup, or at least a setup that works, I'd appreciate it!!! Thank you.
@stonez56 · 8 months ago
Please make a video on how to convert safetensors to GGUF format, or another format that can be used with Ollama? Thanks for these great AI videos!
@human_agi · 1 year ago
What kind of Colab do you need? I am using the $10 version with high RAM and the GPU on, and I still cannot run it: ValueError: A device map needs to be passed to run convert models into mixed-int8 format. Please run `.from_pretrained` with `device_map='auto'`
@samwitteveenai · 1 year ago
If you don't have access to a bigger GPU, then go with a smaller T5 model etc.
@rudy.d · 1 year ago
I think you just need to add the argument device_map='auto' to the same list of arguments in your model's *LM.from_pretrained(xxxx) call, where you have load_in_8bit=True.
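i.e. something like this (a sketch only; the model class and id depend on your checkpoint):

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        model_id,            # your model id
        load_in_8bit=True,
        device_map="auto",   # the missing argument the ValueError asks for
    )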
@hiramcoriarodriguez1252 · 1 year ago
I'm a transformers user and I still don't get the point of learning this new library. Is it just for very specific use cases?
@samwitteveenai · 1 year ago
Think of it as an abstraction layer for prompting and for managing user interactions with your LLM. It is not an LLM in itself.
@hiramcoriarodriguez1252 · 1 year ago
@samwitteveenai I know it's not an LLM. The biggest problem I see is learning a new library that wraps the OpenAI and Hugging Face libraries just to save 3 or 5 lines of code. I will follow your work; maybe that will change my mind.
@insightbuilder · 1 year ago
Consider Transformers as the first layer of abstraction over the neural nets that make up the LLMs. To interface with LLMs we can use many libraries, including HF; the HF Hub and LangChain form the second layer. The USP of LangChain is the ecosystem built around it, especially the Agents and Utility Chains. This ecosystem lets LLMs be connected with the outside world. The devs at LC have done a great job. Do learn it, and share these absolutely brilliant vids with your friends/team members etc.
@samwitteveenai · 1 year ago
Great way of describing it, @Kamalraj M M.
@neilzedd8777 · 1 year ago
@insightbuilder Beyond impressed with how healthy their documentation is. Working on a flan-ul2 + LC app right now; very fun times.
@alexandremarhic5526 · 1 year ago
Thanks for the work. Just letting you know, the Loire Valley is in the north of France ;)
@samwitteveenai · 1 year ago
Good for wine? :D
@alexandremarhic5526 · 1 year ago
@samwitteveenai Depends on your taste. If you love sweet wine, the south is better, especially for white wine like Jurançon.
@srimantht8302 · 1 year ago
Awesome video! I was wondering how I could use LangChain with a custom model running on SageMaker. Is that possible?
@samwitteveenai · 1 year ago
Yeah, that should be possible in a similar way.
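LangChain ships a SagemakerEndpoint wrapper for this; a hedged sketch following the documented pattern (the endpoint name is hypothetical, and the payload shape depends on how your model was deployed):

    import json
    from langchain.llms import SagemakerEndpoint
    from langchain.llms.sagemaker_endpoint import LLMContentHandler

    class ContentHandler(LLMContentHandler):
        content_type = "application/json"
        accepts = "application/json"

        def transform_input(self, prompt: str, model_kwargs: dict) -> bytes:
            # Shape of the request body your endpoint expects
            return json.dumps({"inputs": prompt, "parameters": model_kwargs}).encode("utf-8")

        def transform_output(self, output: bytes) -> str:
            # Shape of the response body your endpoint returns
            return json.loads(output.read().decode("utf-8"))[0]["generated_text"]

    llm = SagemakerEndpoint(
        endpoint_name="my-custom-endpoint",   # hypothetical endpoint name
        region_name="us-east-1",
        content_handler=ContentHandler(),
    )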
@hnikenna · 1 year ago
Thanks for this video. You just earned a subscriber.
@halgurgenci4834 · 1 year ago
These are great videos Sam. I am using a Mac M1. Therefore, it is impossible to run any model locally. I understand this is because PyTorch has not caught up with M1 yet.
@samwitteveenai · 1 year ago
Actually I think that's wrong. I use an M1 and M2 as well, but I run models in the cloud. I might try to get them to run on my M2 and make a video if it works.
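Untested on this exact setup, but recent PyTorch builds do expose Apple Silicon through the "mps" device, so a small model should run along these lines:

    import torch
    from transformers import pipeline

    device = "mps" if torch.backends.mps.is_available() else "cpu"
    pipe = pipeline("text2text-generation", model="google/flan-t5-base", device=device)
    print(pipe("Translate English to German: How old are you?"))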
@daryladhityahenry · 1 year ago
Hi! Did you find a way to load the Vicuna GPTQ version with this? I tried your video's approach with GPT-Neo 125M and it works, but not with Vicuna GPTQ. Thank youu.
@venkatesanr9455 · 1 year ago
Thanks for the valuable series, highly informative. Can you provide some discussion of in-context learning (providing context/query), reasoning, and chain of thought?
@samwitteveenai · 1 year ago
Hi, glad it is helpful. I am thinking about doing some vids on Chain of Thought prompting, Self-Consistency, and PAL, going through the basics of the papers and then looking at how they work in practice with an LLM. I will include the basics of in-context learning as well. Let me know if there are any others you think I should cover.
@DarrenTarmey · 1 year ago
It would be nice to have someone do a review for newbies, as there is so much to learn and it's hard to know where to start.
@samwitteveenai · 1 year ago
What exactly would you like me to cover? If you have any questions, I'm happy to make more vids etc.
@fintech1378 · 1 year ago
How do you build a Telegram chatbot with this?
@surajnarayanakaimal · 1 year ago
Thank you for the awesome content. It would be very helpful if you made a tutorial on how to use a custom model with LangChain and embed documents with it. I want to train on some documentation; currently we can use OpenAI or other service APIs, but consuming those APIs is very costly. So could you show how to do it locally? Please consider training on a site's documentation so it can answer from that documentation, with more context awareness and also chat-history memory. Currently we depend on the OpenAI APIs for that, so if it's achievable with a local model it would be very helpful.
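Until then, a hedged sketch of the fully local version of that pipeline: embed the docs locally, retrieve, and answer with an open-source model (all model ids are examples, and docs is assumed to be your already-split documents):

    from transformers import pipeline
    from langchain.llms import HuggingFacePipeline
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import FAISS
    from langchain.chains import RetrievalQA

    db = FAISS.from_documents(docs, HuggingFaceEmbeddings())
    local_llm = HuggingFacePipeline(pipeline=pipeline(
        "text2text-generation", model="google/flan-t5-large", max_new_tokens=256))
    qa = RetrievalQA.from_chain_type(llm=local_llm, retriever=db.as_retriever())
    print(qa.run("What does the documentation say about authentication?"))

For the chat-history part, LangChain's ConversationalRetrievalChain is the analogous piece.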
@XiOh · 1 year ago
You are not doing it locally in this video...
@samwitteveenai · 1 year ago
The LLMs are running locally on the machine where the code is running. The first bit shows pinging the API as a comparison.
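For anyone confused on this point, the two modes look like this (a minimal sketch; the flan-t5 ids are just examples):

    from langchain.llms import HuggingFaceHub, HuggingFacePipeline

    # Remote: pings the hosted Inference API (needs HUGGINGFACEHUB_API_TOKEN)
    remote_llm = HuggingFaceHub(repo_id="google/flan-t5-xl")

    # Local: downloads the weights and runs them on this machine
    local_llm = HuggingFacePipeline.from_model_id(
        model_id="google/flan-t5-large", task="text2text-generation")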
@SD-rg5mj · 1 year ago
Hello, and thank you very much for this video. The problem, though, is that I am not sure I understood everything; I speak English badly, as I am French.
@samwitteveenai · 1 year ago
Did you try the French subtitles? I upload English subtitles, so I hope YouTube does a decent job translating them. Also feel free to ask any questions if you are not sure.
@pankajkumarmondal4490 · 3 months ago
Really good!
@samwitteveenai · 3 months ago
Thanks! This video is actually a bit out of date now; it is almost 18 months old.
@KittenisKitten · 1 year ago
Would be useful if you explained what program you're using, or what page you're looking at. It seems like a waste of time if you don't know anything about the programs or what you're doing. 1/5
@samwitteveenai · 1 year ago
The Colab is linked in the description; it's all there to use.
@mrsupremegascon · 1 year ago
Ok, great tutorial, but as a Frenchman from Bordeaux, I am deeply disappointed by Google's answer about the best area to grow wine. Loire Valley? Seriously???? Name one great wine coming from the Loire, Google, I dare you. They are in the B league at best. The answer is obviously Bordeaux; I would maybe have accepted Agen (wrong) or even Bourg*gne (very, very wrong). But the Loire? It's outrageous, and this answer made me certain that I will never use this cursed model.
@samwitteveenai · 1 year ago
lol well at least you live in a very nice area of the world.
@nemtii_ · 1 year ago
What always happens with this setup (LangChain + HuggingFaceHub) is that it only generates about 80 characters per call. Anyone else having this problem? I tried max_length: 400 and it's still the same issue.
@nemtii_ · 1 year ago
It's not specific to LangChain; I used the client directly and still get the same issue.
@samwitteveenai · 1 year ago
I think this could be an issue with their API; perhaps on the Pro/paid version they allow more. I am not sure. To be honest, I don't use their API; I tend to load the models locally. You could also try the max_new_tokens setting rather than max_length; that could help.
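For reference, the distinction: max_length counts prompt tokens plus generated tokens, so a long prompt can leave almost no room to generate, whereas max_new_tokens counts only the generated part. A sketch of the adjusted pipeline from the question above (model and tokenizer as already loaded there):

    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=256,      # counts only generated tokens, unlike max_length
        temperature=0.4,
        top_p=0.95,
        repetition_penalty=1.2,
    )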
@nemtii_ · 1 year ago
@samwitteveenai Wow! Thank youuu!! It worked with max_new_tokens.
@nemtii_ · 1 year ago
@samwitteveenai I wish someone would make a list mapping which model sizes run on the free Google Colab versus the paid Colab, to see whether it's worth paying and what you can experiment with in each tier. I'm kind of lost in that sense, at a stage where I just want to evaluate models myself and think about a production environment later.
@samwitteveenai · 1 year ago
This would be good, I agree.
@ELECOEST · 11 months ago
Hello, thanks for your video. As of now it's:

    llm_chain = LLMChain(
        prompt=prompt,
        llm=HuggingFaceHub(
            repo_id="google/flan-t5-xxl",
            model_kwargs={"temperature": 0.9, "max_length": 64},
        ),
    )

The temperature must be > 0, and the model is flan-t5-xxl.
@litttlemooncream5049 · 10 months ago
Thx! Helped a lot! But I'm stuck at loading the model... it says google/flan-t5-xl is too large to be loaded automatically (11GB > 10GB)... qaq
@samwitteveenai · 10 months ago
Try a smaller model if your GPU isn't big enough, e.g. google/flan-t5-small or something like that.
PAL : Program-aided Language Models with LangChain code
19:00
Sam Witteveen
15K views
All You Need To Know About Running LLMs Locally
10:30
bycloud
189K views
LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab
19:39
Python RAG Tutorial (with Local LLMs): AI For Your PDFs
21:33
pixegami
331K views
$0 Embeddings (OpenAI vs. free & open source)
1:24:42
Rabbit Hole Syndrome
271K views
Fine-tuning Large Language Models (LLMs) | w/ Example Code
28:18
Shaw Talebi
377K views
Ollama + HuggingFace - 45,000 New Models
7:48
Sam Witteveen
11K views