Data Curation for Open Source LLM Fine Tuning - Data Science Festival

Data Science Festival

A talk by Clemens Schroeer from Lemon AI.
This session covers Data Curation for Open Source LLM Fine-Tuning.
Everyone wants to fine-tune open source LLMs, but a lack of high-quality data makes this hard. Even the data that companies do have is difficult to understand, which makes it challenging to iterate towards a high-quality dataset that will give good fine-tuning results. Clemens will share his experience curating datasets to fine-tune models such as Mistral 7B and discuss some of the challenges that should be taken into consideration.
Technical Level: Technical practitioner
This session was part of the Data Science Festival MayDay event 2024. Find out more at datasciencefes...
The Data Science Festival is the place for data-driven people to come together, share cutting-edge ideas, and solve real-world problems. We run monthly events, meet-ups, and the biggest free-to-attend data festivals in the UK. Join the community at datasciencefes...

Comments: 1
@kevon217, 2 months ago
BERTopic for the win.