Comments
@EtyHatton
@EtyHatton 2 days ago
Thank you. I didn't really understand the Tags issue: why is it so complex to add a tag to a dataset, and what is the difference between tags and aspects?
@richardduncan3403
@richardduncan3403 5 days ago
Nice, simple starting-point example to follow along with.
@Steelxfaze
@Steelxfaze 7 days ago
Wow, this is so intimidating. Where did you learn how to do all this?
@nodematic
@nodematic 7 days ago
I personally learn best through side projects and work/consulting projects - a just-in-time learning sort of approach, with hands-on work and watching videos on topics as needed. After working through fine-tuning for a few projects, the concepts and terminology become way easier. The bigger picture, for real projects, will often look something like this kzbin.info/www/bejne/j6WypZRmeZ1gaMU.
@Steelxfaze
@Steelxfaze 6 days ago
@@nodematic That makes a lot of sense. I'm currently just learning how to use models in Python in my own projects (though I'm still struggling quite a bit with them), but I hope this will get easier with time and I can start fine-tuning.
@ivankomakech3912
@ivankomakech3912 8 days ago
Hello, thanks for the wonderful tutorial. However, I get an error on the step that publishes the image to the Artifact Registry. The error says: denied: Unauthenticated request. Unauthenticated requests do not have permission "artifactregistry.repositories.uploadArtifacts" on resource "projects/***/locations/us-central1/repositories/charity-wave-repo" (or it may not exist). Based on this, I have no idea how to proceed. Please help. Thanks.
@nodematic
@nodematic 8 days ago
It sounds like your authentication isn't set up. Try `gcloud init` to set that up.
@Chris-kq7ir
@Chris-kq7ir 8 days ago
What if I have just one docx file that I want to fine-tune the model with? How can I achieve that?
@nodematic
@nodematic 8 days ago
If it's a huge docx file, you could break it up and fine-tune on that. If you just need some sort of RAG/grounding, a no-code RAG solution like this would be best kzbin.info/www/bejne/fWTZqn5rj8yGoLM (works with docx files).
@Chris-kq7ir
@Chris-kq7ir 6 days ago
@@nodematic Thanks, would data preparation be the same no matter the file type? Just extract the text, then convert it into key pairs?
@nodematic
@nodematic 5 days ago
For fine-tuning, yes, you'll just need the extracted text to train on. Something like python-docx could do the extraction. Split up the extracted text into small enough samples that you don't run out of memory during the fine-tuning process (get the best GPU you can for this, like a Colab A100).
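As an illustration of the splitting step, here's a rough sketch in pure Python. The greedy paragraph packing and the character limit are assumptions you'd tune to your GPU memory (in practice you'd count tokens, not characters); the python-docx extraction itself is only shown as a comment.

```python
# Split extracted text into fine-tuning samples small enough to fit in memory.
# The extraction step would use python-docx, e.g.:
#   from docx import Document
#   text = "\n".join(p.text for p in Document("file.docx").paragraphs)

def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for para in text.split("\n"):
        para = para.strip()
        if not para:
            continue  # skip blank lines between paragraphs
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)  # current chunk is full; start a new one
            current = para
        else:
            current = f"{current}\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

# Two short paragraphs that don't fit together in 25 characters:
samples = chunk_text("First paragraph.\n\nSecond paragraph.", max_chars=25)
print(samples)  # → ['First paragraph.', 'Second paragraph.']
```

Paragraph boundaries keep the samples coherent; splitting mid-sentence tends to hurt fine-tuning quality.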
@superfreiheit1
@superfreiheit1 14 days ago
I would like to download 1000 arXiv papers and chat with an LLM about their content. Do you know how to do that?
@nodematic
@nodematic 13 days ago
I would suggest Google Cloud Agent Builder. The service has developed since this video, but this will give you the idea kzbin.info/www/bejne/n5zGo5V9nZaDj6s. Basically, you add the PDFs to a Google Cloud Storage bucket, then create a no-code Chatbot Agent on top of that bucket.
@nodematic
@nodematic 8 days ago
Here's a video kzbin.info/www/bejne/fWTZqn5rj8yGoLM
@aswathymg9081
@aswathymg9081 15 days ago
I am trying to download a 2.2 GB file, but only a partial download occurs. What is the reason?
@aswathymg9081
@aswathymg9081 15 days ago
How can I download the uploaded file from the tus server?
@ndvhareesh
@ndvhareesh 16 days ago
I'm so glad I found your channel; this is next-level content.
@DarkShadow-321
@DarkShadow-321 18 days ago
Thanks
@sounishnath513
@sounishnath513 19 days ago
If I choose to bypass Cloud Logging and directly insert application service logs into BigQuery, what are the implications, considering that real-time log analysis is not a requirement? Please advise.
@nodematic
@nodematic 18 days ago
That would work fine - no major concerns. Just keep in mind that the more structured the log data, the more useful and intuitive BigQuery-based analysis will be. Unstructured or semi-structured logs are going to be easier to analyze in Log Explorer.
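To illustrate the structured-vs-unstructured point, a minimal sketch (the field names here are hypothetical, not a Google Cloud schema): emitting each log entry as one JSON object per line means every key can map to a BigQuery column, so analysis becomes plain SQL instead of string parsing.

```python
import json

def make_log_line(severity: str, message: str, **fields) -> str:
    """Serialize one log entry as a single-line JSON object.
    Each key becomes a queryable column after loading into BigQuery,
    e.g. `SELECT * FROM logs WHERE severity = 'ERROR'`."""
    record = {"severity": severity, "message": message, **fields}
    return json.dumps(record, sort_keys=True)

line = make_log_line("ERROR", "upload failed", service="charity-wave", attempt=3)
print(line)
# → {"attempt": 3, "message": "upload failed", "service": "charity-wave", "severity": "ERROR"}
```

A free-text line like "upload failed for charity-wave (attempt 3)" carries the same information but would need regex extraction before it's useful in BigQuery.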
@sounishnath513
@sounishnath513 21 days ago
Thanks, nodematic team.
@mandeepmails
@mandeepmails 22 days ago
Really good content.
@mallninja9805
@mallninja9805 23 days ago
This tutorial represents the third unique config I've tried that has failed to get Superset up and running. This one fails with "[wait-for-postgres, superset-init-db] ... * timed out waiting for the condition". Superset is just not ready for prime time.
@Jean-AlainMignon
@Jean-AlainMignon 23 days ago
Very well-structured and detailed video comparing self-hosted and Google-managed Prometheus, thanks!
@redboy-1899
@redboy-1899 25 days ago
But in the f16 merge it was saved as .bin; how do I save as safetensors natively?
@jacobjonm0511
@jacobjonm0511 26 days ago
Nice work
@alewhois
@alewhois 28 days ago
Congratulations! Very nice content.
@刘春峰-w2e
@刘春峰-w2e 29 days ago
Very informative. Just what I need. Thanks.
@DarkShadow-321
@DarkShadow-321 a month ago
Thank you
@sounishnath513
@sounishnath513 a month ago
Thanks for sharing.
@ROKKor-hs8tg
@ROKKor-hs8tg a month ago
Is it free for all, or only for Pro at $9?
@nodematic
@nodematic a month ago
Pro is required.
@Mrsamssful
@Mrsamssful a month ago
I had this error when I launched the job: "denied: Unauthenticated request. Unauthenticated requests do not have permission "artifactregistry.repositories.uploadArtifacts" on resource". When I checked the logs, I found this message: "Warning: The gcloud CLI is not authenticated (or it is not installed). Authenticate by adding the "google-github-actions/auth" step prior to this one." So we must configure authorization before the job and add the "google-github-actions/auth" action. For example:
- id: auth
  uses: google-github-actions/auth@v2
  with:
    credentials_json: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS }}
- name: install gcloud cli
  uses: google-github-actions/setup-gcloud@v2
  with:
    project_id: ${{ secrets.GOOGLE_PROJECT }}
@siddhesh-shinde-trellix
@siddhesh-shinde-trellix a month ago
Nice demo. Really helpful.
@MitosNãoKagghamNaSuaCabeça
@MitosNãoKagghamNaSuaCabeça a month ago
I love the high quality of NotebookLM, what an amazing tool. But the one problem, a crucial one, is that only two voices are available. People are using them like crazy, so they're starting to sound oversaturated. That's why I'm searching for a free and unlimited method to change the voices while keeping the quality. I found an AI that does the job with the same quality, but it's a pity it's neither cheap nor unlimited. The search goes on...
@ChuckBaggett
@ChuckBaggett a month ago
They're playing music when they could be explaining what they're doing. Practically no one understands what they're doing. That lack of explanation led me to thumbs-down it and stop watching. Hopefully I'll find something more useful.
@ahmadrana-c1y
@ahmadrana-c1y a month ago
Is it worth using Unsloth with Amazon SageMaker?
@lvngleyptyltdnorthsideva1148
@lvngleyptyltdnorthsideva1148 a month ago
Hi, my name is Bongani, from South Africa. Firstly, thank you for such an informative video: short and straight to the point. I'm a non-technical co-founder in our startup, and I would like to ask you something somewhat technical. I'm in a region where the model has limited data on local languages; it's mainly good at detecting profanity. I would like to fix that by creating my own audio datasets, transcribing them, and then feeding those into the model to improve it. Is that something that is possible to do? I'm from a sound-engineering background, now working in the telehealth space.
@martinterreni1616
@martinterreni1616 a month ago
Very good overview video.
@deltaexistsss
@deltaexistsss a month ago
Great tutorial! Just wish we could use checkpoints using the online version...
@LukaszBrodziak
@LukaszBrodziak a month ago
When I follow the tutorial, I get an error: denied: Unauthenticated request. Unauthenticated requests do not have permission "artifactregistry.repositories.uploadArtifacts"
@MegaKrishnas
@MegaKrishnas a month ago
I am getting the same error as well :(
@sounishnath513
@sounishnath513 a month ago
I'm grateful that I found this channel without the recommendation.
@mrmuranga
@mrmuranga a month ago
This is super awesome 😎😎... thorough and easy to follow... thanks a lot 👍🏿👍🏿
@rafaelfox10
@rafaelfox10 a month ago
Thank you!
@saireddy-o6p
@saireddy-o6p 2 months ago
Hi, I got an error like: Not found: Dataset peak-catbird-440802-b3:dataform was not found in location US. I have given all the permissions you mentioned, and the dataset location is US in BigQuery.
@jitendrakumarnayak8857
@jitendrakumarnayak8857 2 months ago
Such an intuitive video... very disheartening to see so few views, likes, and subscribers 😑... hope you continue making videos like this.
@igorcastilhos
@igorcastilhos 2 months ago
I have a folder with many PDFs, and I would like to fine-tune a model to summarize these PDFs and respond to questions on my website. Is there a way to do that using the example in this video?
@abdulsami5843
@abdulsami5843 a month ago
You would first have to figure out the parsing logic to correctly extract the text, then put it in a summarizer. If all you want is a summary, there are many good models available on Hugging Face that you can use directly, OR just get a Gemini API key and use Gemini for it; it should do a decent job.
@igorcastilhos
@igorcastilhos a month ago
@ thank you
@abdulsami5843
@abdulsami5843 a month ago
@@igorcastilhos Also, if you want to use the PDFs as context to answer questions, then you'll probably need to parse them and put them into a vector store so they can be retrieved when needed; this is called RAG.
@igorcastilhos
@igorcastilhos a month ago
@@abdulsami5843 I'm using Ollama with the Web UI tool. Inside it, I'm adding some PDFs of attorneys' resolutions to the Knowledge collection, so that they can ask about them whenever they want. The main reason to use Ollama (llama3.2) instead of the OpenAI API is that it's free. But I'm having problems accessing the Web UI at localhost:3000 on our server from my machine; it doesn't show the models installed on the server machine.
@igorcastilhos
@igorcastilhos a month ago
@@abdulsami5843 Also, the RAG feature doesn't have very good documentation. In my case, we have a distributed folder in Microsoft Windows (like C:/), and inside that folder the attorneys and advocates will send new PDFs through the website. I wanted to use RAG for it, but it is very hard.
@TronggMjnh
@TronggMjnh 2 months ago
How can I download it as Excel to my computer?
@dante7222
@dante7222 2 months ago
How do I submit a project in Python? The main (driver) file main.py, the other files that main.py imports, and other project files: requirements.txt, configs.json, etc.
@liakat26
@liakat26 2 months ago
Fantastic tutorial. Can it recognize handwriting and extract it too? Google has amazing people.
@umairrkhn
@umairrkhn 2 months ago
Great video!
@lesptitsoiseaux
@lesptitsoiseaux 2 months ago
Could you link the actual notebook you're using somewhere, please? Feedback-wise: I'm 47 and have been learning forever; you have an awesome pace. Quality-wise, I'd pay for the content.
@nodematic
@nodematic 2 months ago
Thanks! Here's the notebook colab.research.google.com/drive/1vIrqH5uYDQwsJ4-OO3DErvuv4pBgVwk4?usp=sharing
@NhatNguyen-bq6jj
@NhatNguyen-bq6jj 2 months ago
Thanks!
@orafaelgf
@orafaelgf 2 months ago
Great, thanks. How do you bring up the interface from UC? Do you have a video explaining that?
@nodematic
@nodematic 2 months ago
At the time of video creation, the interface wasn't available, but we'll try to cover this in a future video.
@espetosfera8966
@espetosfera8966 2 months ago
Very good.
@hamedparsa8880
@hamedparsa8880 2 months ago
Thanks for the good tutorial, ma'am. If you only save the QLoRA adapter, how can we use it in tools like LM Studio?
@nodematic
@nodematic 2 months ago
You will always need both the base model and the adapter layers to actually run/use the model - it's just a question of if you want to merge and store everything "together". I haven't tried LM Studio, but upon a quick look, I would suggest saving/publishing your model as merged weights (should be simpler to pull into LM Studio). The "GGUF Conversion" portion of the fine-tuning notebook might actually be best for the export, based on LM Studio's website "LM Studio supports any GGUF Llama, Mistral, Phi, Gemma, StarCoder, etc model on Hugging Face". Hope that helps!
@hamedparsa8880
@hamedparsa8880 2 months ago
@@nodematic Yes, I've exported as GGUF with the merged option, and it worked. Although I found the model overfitting to a small dataset, which made its behavior go weird in some cases. I'm working to expand my training data, and maybe add different system prompts into the data as well... The question is: how do I divide my data into training/test datasets and check the loss function on the test dataset? Does that Colab notebook support it, or do I need to figure it out myself? ++ Thanks for the answer. 💙
@nodematic
@nodematic 2 months ago
The loss is reported after each step of training (you'll see this in the training cell in the notebook). A good approach is to see where the loss starts to "level off" (decrease significantly more slowly), and use the model at that point. You could also consider reducing the LoRA alpha value to put less emphasis on the adapter layers (and more emphasis on the base model). The expanded training set is a good idea, especially if you have fewer than ~200 examples. There isn't a traditional training/test split in fine-tuning like there is in other AI/ML problems, partly because responses are difficult to score quantitatively and with precision. Instead, people do a post-training quality step of RLHF to integrate feedback on which fine-tuned model answers were good and which were bad in testing. There are also some advanced methods to limit overfitting, but they're well beyond the scope of most small-model fine-tuning.
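The "level off" check above can be eyeballed from the training cell's output, or sketched mechanically like this (a rough illustration only; the 5% relative-improvement threshold is an arbitrary assumption, not anything from the notebook):

```python
def leveling_off_step(losses: list[float], rel_threshold: float = 0.05) -> int:
    """Return the first step index where the relative loss improvement
    drops below rel_threshold, i.e. where training has started to level off."""
    for i in range(1, len(losses)):
        prev, cur = losses[i - 1], losses[i]
        improvement = (prev - cur) / prev if prev > 0 else 0.0
        if improvement < rel_threshold:
            return i
    return len(losses) - 1  # never leveled off within the recorded steps

# Loss halves early, then barely moves from step 3 onward:
print(leveling_off_step([2.0, 1.0, 0.6, 0.59, 0.58]))  # → 3
```

Pasting the per-step losses from the training cell into a list like this gives a quick, reproducible way to pick a checkpoint, rather than judging the curve by eye.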
@Grynjolf
@Grynjolf 2 months ago
Does this require a load balancer? Or can you region-block my instances? I'm trying to stay within the free tier and block everything but US traffic.
@nodematic
@nodematic 2 months ago
The demonstrated setup requires a load balancer - you'd have to DIY something if you're routing traffic straight to your VM's public IP.
@prabhupreetamdas3971
@prabhupreetamdas3971 2 months ago
Thanks for the informative video. I really liked the way you demonstrated the workflows and the detailed steps for installing Milvus.
@hamedparsa8880
@hamedparsa8880 2 months ago
Nice tutorial, got me subscribed! ❤ Buuuut, I want a new, more detailed fine-tuning tutorial on Gemma 2-9b... especially for coding purposes.
@nodematic
@nodematic 2 months ago
Thanks. We'll add that to our video ideas.
@hamedparsa8880
@hamedparsa8880 2 months ago
@@nodematic Can't wait for that to come out! 💯 Keep it up! 💪🏼
@jonatasnascimento6584
@jonatasnascimento6584 2 months ago
Hello, I'm from Brazil and new to AI. I would like to build an artificial intelligence to automate university work, as I have a lot of work and can't keep up with it. I want an AI that can write papers like me, using my texts. What adjustments or training should I do? Do I need to change a parameter?
@jonatasnascimento6584
@jonatasnascimento6584 2 months ago
In my mind, I'm planning to use about 10 review texts from my articles, and 1 expanded summary. I want the AI to write like me without triggering AI plagiarism detection.