Fine-tuning GPT from Scratch in 15 Minutes with JavaScript

49,171 views

1 day ago

In this video you'll learn how to build an application that fine-tunes GPT with custom data, from scratch, in less than 15 minutes.
🔗 Codebase - github.com/dabit3/gpt-fine-tu...
🔗 Blog Post - nader.substack.com/p/supercha...
🔗 GPT Repository Loader - github.com/mpoon/gpt-reposito...
We'll start by creating a new project and installing the dependencies.
We'll then walk through creating each interaction from modeling the data, to uploading it to OpenAI, and testing it out.
By the end of the video you should be able to build out your own custom models.
0:00 - Introduction
1:16 - Project setup
6:10 - Creating training data
7:22 - Creating local API interface
8:25 - Uploading training data
9:30 - Creating the fine tune with the training data
10:55 - Fetching and listing the fine tunes
11:40 - Testing it out
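
As a rough illustration of the "Creating training data" step, here is a minimal sketch of turning prompt/completion pairs into a JSONL string, the one-object-per-line format the legacy fine-tunes endpoint accepted. The example pairs, the `toJsonl` helper, and the separator and stop marker are assumptions made for illustration; they are not taken from the video's codebase.

```javascript
// Hypothetical example pairs; replace these with your own data.
const examples = [
  { prompt: "How do I reset my password?", completion: "Open Settings and choose Reset password." },
  { prompt: "Which plans include API access?", completion: "The Pro and Team plans include API access." },
];

// Convert pairs to JSONL: one JSON object per line.
// Assumption: a fixed separator ends every prompt, and every completion
// starts with a space and ends with a stop marker.
function toJsonl(pairs, separator = "\n\n###\n\n", stop = " END") {
  return pairs
    .map(({ prompt, completion }) =>
      JSON.stringify({
        prompt: prompt.trim() + separator,
        completion: " " + completion.trim() + stop,
      })
    )
    .join("\n");
}

const jsonl = toJsonl(examples);
console.log(jsonl);
```

The resulting string can be written to a `.jsonl` file and uploaded in the "Uploading training data" step. Newlines inside prompts are escaped by `JSON.stringify`, so each example stays on a single line.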

Comments: 63

@dco_ 1 year ago

Very cool Nader, thank you for digging in and sharing this tutorial! Really great tools for training GPT based on things like documentation or an API meant for other developers to use.

@prabhacar 1 year ago

Great tutorial, very clear and concise. I appreciate your effort to zoom in and out and make it so easy to follow. Thanks

@_susanzea 1 year ago

Thanks for a great tutorial! This was my first time playing around with the openai api and this made it easy peasy🙏🏼

@yujiahu7651 1 year ago

thanks for the clear explanation. very beginner friendly!

@kirso 1 year ago

You are awesome Nader!

@Ibrahim-oc5ql 1 year ago

Bro You come from a different planet or what? Mad knowledge for real WTH!!!!!!!

@VincentFulco 1 year ago

Great to see your focus on other cool tools aside from web3 stuff

@klaytonlam 1 year ago

Great tutorial! The only one (that I can find) that clearly explains how a fine-tuned model can be retrieved and retuned. Thanks! Any idea where I can find documentation on security around fine-tuned models?

@blottolotto7648 1 year ago

Great stuff. Also, ur the man for using 'import' instead of 'require'

@mariusirgens5555 1 year ago

Thanks for a quick and easy-to-follow tutorial! Do you know if there are major differences when using Python to fine-tune?

@ToddDunning 1 year ago

Nader thank you for the great tutorial, can you give a good use case or two?

@unclecode 1 year ago

Great content, if u don’t mind, would u share the software you use for screen recording? I notice it follows the mouse pointer and smooth zoom in and out. Is it a software or ur own post production editing? Thx

@ConversationDesign 1 year ago

Tutorial is fantastic. What are the use cases of this model? Can it be used outside of terminal like deployed on the web site or discord channel?

@codeSTACKr 1 year ago

Wouldn't it be better to use Embeddings with a vector store? From there you could "fine tune" results if needed. But I don't think fine tuning is the proper way to add "new" data such as a company's docs.
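
For readers unfamiliar with the embeddings approach mentioned above, here is a minimal sketch of the retrieval step: rank stored text chunks by cosine similarity to a query vector and place the best match in the prompt. The three-dimensional vectors, the `store` contents, and the `topK` helper are made up for illustration. In practice the vectors would come from an embedding model and the store would be a vector database.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy in-memory "vector store". The vectors are invented, not real embeddings.
const store = [
  { text: "Refunds are processed within 5 business days.", vector: [0.9, 0.1, 0.0] },
  { text: "The API rate limit is 60 requests per minute.", vector: [0.1, 0.9, 0.1] },
  { text: "Support is available on weekdays.", vector: [0.2, 0.1, 0.9] },
];

// Return the k documents most similar to the query vector.
function topK(queryVector, docs, k = 1) {
  return docs
    .map((doc) => ({ ...doc, score: cosineSimilarity(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Pretend this is the embedding of the question below.
const question = "How many requests can I send?";
const queryVector = [0.15, 0.85, 0.05];

const [best] = topK(queryVector, store);
const prompt = `Answer using only this context:\n${best.text}\n\nQuestion: ${question}`;
console.log(prompt);
```

With this pattern, new documents only need to be embedded and added to the store; the model itself is not retrained.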

@dancewithbrownie 1 year ago

Sounds reasonable. For the embeddings you could use Pinecone as a vector database (or store), right? In your opinion, what would be the proper way to add "new" data such as a company's knowledge DB? I just started to explore this topic. :) Thank you!

@GuinessOriginal 1 year ago

Depends on how different the documents are and how different the results you want are. The reality is that for most businesses, their documents aren’t going to be that different from the plethora of documents the model has already been trained on, and the way people interact with the documentation and the results they expect from it won’t be that different either, particularly if it’s just extraction of information and knowledge, analysis, etc. In that case, fine-tuning is all you need.

@dancewithbrownie 1 year ago

@@GuinessOriginal Thanks for your response. So, If I understand this correctly: Let's assume I have a company with specific knowledge in a certain field. This knowledge has been built up over years in all possible formats (e.g. documents, tables, files, whatever...). As an example, I would like to set up a model that is used for internal purposes in the company and runs as a kind of "CompanyGPT". Would fine tuning be the right thing to do here? Even if new data is added? Or would this be a case for a company plugin (which was released a few days ago)?

@GuinessOriginal 1 year ago

@@dancewithbrownie that really doesn’t tell me very much, I’m afraid. I’d need to know how big your company is, how much data you’ve got, what kind of data it was, whether it was completely bespoke or similar to other data in the industry, whether it was data similar to data Gpt has been trained on or not, what your use cases are, what kind of prompts/questions and responses/answers you are going to make and expect, with examples, what systems you’ve got, what state your data is in, where it resides, how connected/connectable it is, how much cleaning and labelling it needs, what your token space might look like, how much tokenisation it might need, there’s a lot of unknowns. What you need is an AI Consultancy to come in and do a requirements, systems and data analysis, and write an AI strategy and road map for your company based on that analysis. That will give you a firm foundation to make an informed choice on the options available to you and decide on the appropriate implementation plan. If you tell me your location I can probably recommend one or two such consultancies. When you say a “company plug in” was released, what do you mean exactly? A plug in specifically for software your company produces, runs, or a bespoke plug in your company has developed?

@surajkhan5834 1 year ago

@@dancewithbrownie As in this video, the fine-tune does not respond with what we gave in the completion. Why? Also, I am creating a project where I have a spreadsheet that contains many contacts, and I want to train GPT to fetch contacts in a chat, for example:
AI: What kind of information are you looking for?
Human: I am looking for some contacts that were stored from 2017.
AI: OK, I have around 1,000,00 contacts in my database from 2017 to date.
Human: I am looking for all the contacts in the LA area.
AI: OK, I have around 50,000 contacts for the LA area.
Is this possible?

@maskman4821 1 year ago

Very cool 😀😘👍🤩

@James-hb8qu 1 year ago

Appreciate the video. What sort of cost is involved in this?

@nicholasbrock1178 1 year ago

Does it continue to pull from the main data from openAI or just from what you trained it on? For instance, if I wanted to say what is a good color scheme for bright blue? Would it answer only based on what I have trained it on or also based on the general data as well?

@jerryyuan3958 1 year ago

Nice tutorial

@andresmontoya7852 1 year ago

Thank you for sharing! One question: how can we update a model that has already been created? If I want to add new data to an existing model, how can I do that? Thank you

@PradeepKumar-uo3rt 1 year ago

Hey Nader, cool video, but I didn't find out what the use cases of this model are, or how we can train a model on our own data to answer questions about that data. Thanks

@gopinathkrm58 1 year ago

Since it is a fine-tuned model, is it possible to restrict responses to a particular field? For example, only prompts related to software should get a response; otherwise it should simply say the topic or question is not related.

@RationalDissonances 11 months ago

What's the difference between fine-tuning like this and using LangChain?

@kH-ul4hk 1 year ago

If you would add to the custom model later, would you only give the new data for the new training? Or also the data it has already fine tuned on?

@TheKnowledgeAge 1 year ago

It depends, I think. Sometimes I'll back out and start new if I'm not happy with the results. But it's best to know why you want to continue adding and refining or start all over. Sorry if it's not the answer you're looking for😅

@GuinessOriginal 1 year ago

Depends on whether the data is in the same arena as the data you’ve already fed it. If it’s different in any way, you will probably want to train or at least fine-tune on it.

@carbaj03 1 year ago

They have recently released plugins for ChatGPT. From my point of view they can be a replacement for fine-tuning (or the same thing but done in a different way). What do you think?

@GuinessOriginal 1 year ago

How are plug-ins the same as fine tuning? Don’t understand your logic here at all, please explain it to me.

@mendsalbert 1 year ago

Nice presentation. What software do you use for your video editing? Those smooth zooms slap!

@vasupalivela 11 months ago

Thanks for the video and the explanation. The listFineTunes status never shows succeeded; it always shows pending. Any help?

@CarlosFernandez-me1uj 1 year ago

Awesome and really helpful. Can you do the same video for embeddings? It would be really cool

@naderdabit 1 year ago

did you see this one? kzbin.info/www/bejne/bJDQl4yGhciNbJI

@YoleYu 1 year ago

For the dataset, is there any way to generate it based on a real business/company/project? It's not easy for real people to prepare 100+ questions.

@jopadjr 1 year ago

184th... Thanks Nader

@aajn88 1 year ago

Love your VSCode theme! Which one is it?

@Ibrahim-oc5ql 1 year ago

Cyberpunk

@aajn88 1 year ago

@@Ibrahim-oc5ql Thanks!

@masteringreactnative 1 year ago

Is there a way to feed it the whole documentation?

@HotelSeptember 1 year ago

Good for a quick overview but too high level and impractical. Doesn’t go into enough detail for my taste and for any practical use-case. Thanks for being clear and concise!

@cliffordmwale5782 7 months ago

I have a database with more than 5 million rows of structured data. How can I train my own model on that?

@TJMcCarty 1 year ago

I have found fine-tuned models to be worthless. Maybe I'm doing things wrong or not understanding. It seems like the new data I give the fine tuned model has no effect on the results.

@shadmanmartinpiyal4057 10 months ago

How long did the fine-tuning take?

@yishakabrham5002 1 year ago

it says openai is not recognized in my cmd

@bossgd100 1 year ago

what vscode theme are u using ?

@shadmanmartinpiyal4057 10 months ago

Cyberpunk

@andresshamis4348 11 months ago

The part I am confused about is that after fine-tuning it completes my prompts perfectly, but when I try to have a conversation it doesn't remember the context of previous messages. So I got the idea to feed the entire conversation up to that point back in as input. However, it goes crazy after 3-4 messages because the training dataset does not contain any prompts that include conversation history. So how is this done in order to have conversations with it like ChatGPT? Please, someone help me
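
One way to approach the problem described above is to keep inference prompts and training prompts in the same shape, so the model has seen conversation history during fine-tuning. The sketch below shows that idea only; the `Human:`/`AI:` labels, the `buildPrompt` and `conversationToExamples` helpers, and the `maxTurns` cutoff are assumptions for illustration, not a documented recipe.

```javascript
// Build a completion-style prompt from recent history plus the new message.
// Assumption: only the last `maxTurns` turns are kept to bound prompt length.
function buildPrompt(history, userMessage, maxTurns = 4) {
  const recent = history.slice(-maxTurns);
  const lines = recent.map((turn) => `${turn.role}: ${turn.text}`);
  lines.push(`Human: ${userMessage}`, "AI:");
  return lines.join("\n");
}

// Turn one recorded conversation into several training pairs,
// each prompt containing the history up to that point.
function conversationToExamples(turns, maxTurns = 4) {
  const examples = [];
  for (let i = 0; i < turns.length - 1; i++) {
    if (turns[i].role === "Human" && turns[i + 1].role === "AI") {
      examples.push({
        prompt: buildPrompt(turns.slice(0, i), turns[i].text, maxTurns),
        completion: " " + turns[i + 1].text,
      });
    }
  }
  return examples;
}

// Hypothetical recorded conversation.
const conversation = [
  { role: "Human", text: "Do you ship to Canada?" },
  { role: "AI", text: "Yes, we ship to Canada." },
  { role: "Human", text: "How long does it take?" },
  { role: "AI", text: "Usually 5 to 7 business days." },
];

const trainingExamples = conversationToExamples(conversation);
console.log(JSON.stringify(trainingExamples, null, 2));
```

At inference time the same `buildPrompt` function would be called with the running history, so the prompt format matches what the model saw in training.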

@muneebwaqas400 2 months ago

Have you found out a way to achieve it because I also wanted to do it?

@andresshamis4348 2 months ago

@@muneebwaqas400 I managed to do this and A LOT more. If you want, let me know and I'd be happy to help you

@PocketRiches 11 months ago

Why can't I type in "openai"?! I'm getting so frustrated. It keeps telling me " The term 'openai' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again."

@shylesh7636 10 months ago

If the issue still persists, you might wanna check this thread: community.openai.com/t/unable-to-run-the-command-line-interface-cli-for-openai/6197/18

@Ibrahim-oc5ql 1 year ago

God bless you

@chevvvv 1 year ago

Can you fine-tune using free trial credits?

@gainmuscle 1 year ago

yes

@alchemication 1 year ago

Nice video, but all these videos really don’t focus on how the data gets created, or on what good-quality data means, with examples. I guess it's best to play around and keep the secret sauce off YouTube 😅

@Ibrahim-oc5ql 1 year ago

I meant to watch this