May Habib, Writer CEO, joins 'Closing Bell Overtime' to discuss the company's new AI model and how it is innovating in AI training.
Comments: 15
@TaskSwitcherify 2 days ago
How do you avoid Garbage In - Garbage Out? When training on synthetic data, some of which is already hallucinated, flawed, and misinformed, don't you get even more hallucinated and "machined" outputs and a form of data poisoning?
@Maioubi 1 day ago
Like humans, it's easier for AI to know something is good than to produce the good thing. AI is very good at "labeling" stuff (classification), and the tech for that is much more mature, dating from the early days of deep learning. Synthetic data allows the AI to convert its classifying ability into greater intelligence by generating tons of examples and discarding the bad output. The next model will then be slightly smarter and better at generating and labeling. It's not perfect, but it can go very far if you have vast computing power, and we're not close to the ceiling.
@KK-pm7ud 4 days ago
Sounds too good to be true
@joe_hoeller_chicago 3 days ago
Synthetic data doesn't work as well as you think for real-world tasks, especially within domains that require you to understand a context within a context.
@alshiferaw925 4 days ago
Everything the lady said in the talk was a bunch of air.
@sim-racer 2 days ago
Not really. She is right: smaller models perform much better when trained on high-quality synthetic data generated by LLMs.
@Cellardoor187 1 day ago
No, she did not. She is on point, and this is a very smart venture. That "bunch of air" she produced got her a 2B dollar valuation. So perhaps get off your high horse.
@mymusicpublisher 1 day ago
Not really. Her voice throws me off though.
@bluesque9687 1 day ago
I like your blonde hairstyle and the big ring earrings! A blast from the past!
@MrDonald911 1 day ago
Research has already shown it doesn't work, unfortunately.
@DanielKwan-b7g 2 days ago
Training on synthetic data gives the illusion that a model works, but bc it's trained on fake data it's less accurate lol. Lady, there is a reason why ppl don't want to go this route 😅
@Maioubi 1 day ago
Any synthetic data is reviewed by AI as well, and AI is better at knowing good from bad than at making good, kinda like us humans. Bad output is discarded. This cycle isn't perfect, but it definitely grows more accurate over time, not less so. Look at o1 by OpenAI, mostly trained on synthetic data.
@briandouglas7375 1 day ago
Synthetic data will be flawed.
@Maioubi 1 day ago
It's easier for AI to know something is good than to produce the good thing, just like humans. AI is very good at labeling stuff (classification), and the tech for that is much more mature, dating from before deep learning. Synthetic data allows the AI to generalize its classification skills into greater intelligence by generating tons of examples and discarding the bad output. The next model will then be slightly smarter and better at generating and labeling. If you have enough computing power, this cycle seems to have no upper bound.