The Best AI Model Just Got a Big Upgrade - New Claude 3.5 Sonnet and Haiku

Views: 49,888

Skill Leap AI

Days ago

Comments: 173

@SkillLeapAI 3 months ago

Join the fastest-growing AI education platform & Instantly access 20+ top courses in AI: bit.ly/skill-leap

@DrZinMinTun 3 months ago

00:04 New Claude 3.5 Sonnet and Haiku models are introduced with improved reasoning and coding abilities.
01:49 New Claude 3.5 Sonnet offers computer use in public beta.
03:29 Introduction to new AI model Claude 3.5
04:59 AI agent streamlining manual tasks
06:55 The new model is tested with new prompts and riddles.
08:34 Solving a complex puzzle using AI
10:22 Difficulty in correct word counting by the model
12:02 Testing the new Llama model with Tetris game

@ramtuff9 3 months ago

For creative writing, the old 3.5 Sonnet used to return 2,500 words to 6,000 words when prompted right. Now I'm lucky to get 400-800 words. Anthropic throttled it sooo bad.

@Martininindia 3 months ago

agree, I was using it for coding and code problem solving. Now it's forgetting things, giving wrong answers, and doing this annoying thing where it gives short answers and then asks if I need help with something else instead of what I stated. It "feels" much less intelligent. Definitely a downgrade.

@meltmywings 3 months ago

@Martininindia I find it's better at changing one thing that works to another thing that works but it got worse at fixing problems

@catofdeepblack 26 days ago

Creative writing and you are using "AI" - such a low skill job

@Corteum 3 months ago

This is an awesome upgrade. Things are only looking up!

@CodingScot 3 months ago

OMG I was getting it to do some planning and coding and it built my entire frontend in the first go, without even asking it to! I am still blown away 🎉❤

@MeirGabay 3 months ago

Great comparison and testing, thank you for this

@jamesrruff 3 months ago

I work with Claude all the time and can say that this shift was noticeable in a striking way.

@bernardoalvesexplains 3 months ago

For good or bad?

@kugelfunk 3 months ago

Since you asked: as a developer I think this agent-on-your-computer feature is nice, but not mind-blowing. In AI automation, similar things have been done for some time now. With tools like N8N, Make or Zapier, you set up workflows to parse documents, use AI to make sense of the parsed information (e.g. "look for emails, names," or other specific information), then take the response and save it wherever you need it. The workflow shown in the video seems to be different as it interacts with user interfaces. And you don't seem to need any additional workflow automation software. But the result is practically the same.

@SkillLeapAI 3 months ago

I see. Thanks for sharing.

@ukoni8667 3 months ago

Man, for a techy you are so short-sighted. They haven't even scaled the LLM. The complex abstraction cognitive function of this LLM is below human average; this will increase exponentially when they scale. We are all cooked... this is the beginning of the era of autonomous machines

@kugelfunk 3 months ago

@@ukoni8667 I think you are missing the point. He was asking about the specific example shown in the Anthropic video. My response is that it is novel in a way, but not revolutionary when compared to other approaches using AI that achieve the same result. This is not a discussion on where AI in general is headed.

@angel-angel-angel-angel-angel 3 months ago

Coding has existed for much longer than llm

@travisgoesthere 3 months ago

@@ukoni8667 good , gomer

@hamedparsa8880 3 months ago

Claude is my go-to for coding, creative writing, and other tasks that other LLMs tend to not handle as well as Claude. ChatGPT for me is more of a casual tool for translation, chatting, and less challenging tasks normally.

@travisgoesthere 3 months ago

missing out then

@hamedparsa8880 3 months ago

@@travisgoesthere i don't get your point, may u explain further?

@travisgoesthere 3 months ago

@SSwotan-r5r not censored and much better than Claude. If you are running into problems with censoring then the issue is probably you. Install locally

@travisgoesthere 3 months ago

@@hamedparsa8880 not a difficult concept

@hamedparsa8880 3 months ago

@@travisgoesthere meanwhile it's difficult for you to be clear about that "not-difficult concept"?

@chrisstar7099 3 months ago

Claude omitted the last sentence of the first paragraph when it counted the words. I still use Claude daily as well. Between it and perplexity, I have no other needs in text AI.

@GAMarine137 3 months ago

I like to write chess engines as a hobby. I have been playing around with Sonnet to generate Go, Rust and C code that calculates chess moves. Most models including Sonnet need additional prompting for rules such as castling, pawn promotion, and accounting for check.

@Corteum 3 months ago

That's cool! What is the strongest chess program you've been able to create so far with it?

@delldoesai 3 months ago

Chatgpt has multi modality and export to file. Plus a bigger context window. If Claude is going to compete it needs to add those at minimum

@Mimi25291 3 months ago

Let the battle of the upgraded Ai’s commence! Fight for dominance traction and our attention 💥 🥊

@FactsNoCare 3 months ago

"This egg contained the first bird with the complete set of genetic traits we would classify as a modern chicken." This line was missing from the response count and it also counted a "-" as a word so it got 88, but if you remove the line above and count the "-" as a word you get the same word count as Claude 3.5 Sonnet new. So it got its own version right lol

@SkillLeapAI 3 months ago

Oh nice thanks

@kylemorris5338 3 months ago

Thank you, i thought it might be something like that, since i couldn't find any errors in the numbering.

@roysalman6720 3 months ago

3.5 opus will be crazy

@gillesashley9314 3 months ago

Crazy is an understatement. I wish there was a crazier word for crazy. 😂😂. These guys are just going mad.

@roysalman6720 3 months ago

@@gillesashley9314 plaid maybe

@stranmor 3 months ago

He will not be

@animelover5093 3 months ago

This is WILD im excited for this ~

@amj2048 3 months ago

The following words were missed out when counting the text: "This egg contained the first bird with the complete set of genetic traits we would classify as a modern chicken. "

@MrWalalaa 3 months ago

as a developer, you can just program a POST HTTP request on that form with the data you have. this kind of task is not representative of the typical time-consuming work people do. why even use AI for this kind of thing when a developer can program the same behavior in an hour or 2? most of these AI demos are cool, but very few strike me as something so useful that a developer cannot do it better for cheaper.
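
A minimal sketch of the "just send a POST request" approach described above, using only Python's standard library. The URL and field names are hypothetical placeholders (they do not come from the video), and the request is built but not sent so the sketch runs offline.

```python
from urllib import parse, request

# Hypothetical endpoint, for illustration only.
FORM_URL = "https://example.com/submit-form"


def build_form_post(url, fields):
    """Build (but do not send) a POST request carrying URL-encoded form data."""
    body = parse.urlencode(fields).encode("utf-8")
    return request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical field names and values.
req = build_form_post(FORM_URL, {"name": "Ada Lovelace", "email": "ada@example.com"})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending it would be: request.urlopen(req)
```

This only works when the form accepts a plain POST; a form that needs a login session, CSRF token, or JavaScript would take more work, which is roughly the gap the UI-driving agent is meant to cover.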

@patrickpuma 3 months ago

5 months ago when I asked when ChatGPT would be able to control my computer it told me it would never happen because it's too dangerous - buckle up!

@djayjp 3 months ago

The word count answer it gave is actually correct. MS Word counted the preamble and the added concluding remark.

@chrisstar7099 3 months ago

Not actually. Claude omitted the last sentence of the first paragraph when counting the words for some reason.

@inconn262 3 months ago

Exact @@chrisstar7099

@RamanGhorpade 3 months ago

I am excited to see if and when Claude launches its generative AI too, like image gen or video gen, and also voice options.

@Malisti04 3 months ago

1:37 first mover advantage

@TheReaI0ne 3 months ago

I apologize, I do not feel comfortable proceeding with this conversation. Perhaps we can talk about something else.

@SkillLeapAI 3 months ago

What was the topic

@yoagcur 3 months ago

@@SkillLeapAI jokes

@masomaker 3 months ago

@@SkillLeapAI anything remotely interesting when using OpenAI advanced voice mode

@CitiesTurnedToDust 3 months ago

Damned thing was doing that to me constantly and I was literally just sitting there silently trying to listen to it answering my question. Completely damned unusable

@testales 3 months ago

I wouldn't be surprised if they mainly upgraded the "alignment". Luckily there's Llama 3.1 with the Nemotron and Reflection variants in case they censored Claude into oblivion.

@ddbrosnahan 3 months ago

Looking forward to Claude 4.0 "Choka"

@Lukas-ye4wz 3 months ago

Please do a comparison for writing. I just tested it and it was really bad compared to 4o. Anybody has an example where you preferred the writing of Claude to 4o?

@aboutdamntime 3 months ago

The horse answer (7) is correct, but the reasoning is wrong. The final race should be A2, A3, B1, B2, C1
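
A rough sketch of the 7-race reasoning, assuming the standard version of the puzzle (25 horses, five per race, no stopwatch, find the three fastest). The function names and the random speeds are made up for illustration. It uses the final race named above: A2, A3, B1, B2, C1.

```python
import random


def race(horses, speed):
    """Return the given horses (at most five) ordered fastest first."""
    assert len(horses) <= 5
    return sorted(horses, key=lambda h: speed[h], reverse=True)


def top_three(speed):
    """Find the three fastest of 25 horses in 7 races, using only finishing order."""
    horses = list(speed)
    races = 0
    # Races 1-5: five heats of five horses each.
    heats = []
    for i in range(0, 25, 5):
        heats.append(race(horses[i:i + 5], speed))
        races += 1
    # Race 6: the five heat winners. Label heats A, B, C by how their winners finish.
    winners = race([heat[0] for heat in heats], speed)
    races += 1
    heat_of = {heat[0]: heat for heat in heats}
    a, b, c = (heat_of[w] for w in winners[:3])
    # Race 7: A1 is already the fastest overall, so only
    # A2, A3, B1, B2 and C1 can still take second or third place.
    final = race([a[1], a[2], b[0], b[1], c[0]], speed)
    races += 1
    return [a[0], final[0], final[1]], races


# Distinct random speeds so there are no ties.
speed = {f"horse{i}": s for i, s in enumerate(random.sample(range(1000), 25))}
podium, races = top_three(speed)
print(podium, races)
```

Every other horse is ruled out because at least three horses are already known to be faster than it.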

@rpcroft 3 months ago

@aboutdamntime 3 months ago

@@rpcroft Whoops, good catch. I am not an AI also, promise

@JoEtsu1989 3 months ago

Could this be used to program or at least assist users of Scrivener when it comes to Compiling?

@DaKingof 3 months ago

ChatGPT just has a better interface on mobile. That's the only reason I use it. If I could use voice and get voice feedback on Claude I would drop it in an instant. It's just way more convenient... also, the fact that I can get it to search the web is a huge plus. Claude is stuck in the box.

@OrofinX 3 months ago

Yes, around 20% of my tasks use the ChatGPT internet function. It is extremely useful. In like 5s it goes through 5 pages and gives me an answer or summary. I actually gave it a system prompt to encourage it to use the internet more often and it works. I can even ask e.g. "What is the weather today?" and it knows me, so it answers for my region.

@unsaturated8482 3 months ago

It's so great.

@mar-jj4gb 3 months ago

This has been available via Open Interpreter for months now.....the vision feature that is.

@chasisaac 3 months ago

I did not know this and this happened: So I use Claude for writing. I have an editor prompt which is good. I ran it about an hour ago and it was even better. Much better. This explains a lot. Thanks.

@TheBann90 3 months ago

Claude seems to be the most censored model out there. That's why their userbase isn't growing. People are getting really sick of the level of narcissistic censorship these models are receiving. Anyway, I asked gpt 1o 15 questions earlier today, and it was completely hallucinating and giving totally wrong answers on 13 of them. And the final 2 answers it only got right cause it based its response off my responses. 😂 Never had worse results from any LLM. So I wonder if these charts showing improvements aren't just bullshit charts all along.

@MrTreraygibson 3 months ago

Used 3.5 Sonnet for writing BQML queries and wow... was that a nightmare. Should've posted one of its responses: it said it should stop because it kept producing the wrong syntax repeatedly. Gave me a good laugh. Looking at the other comments below, it may be a throttling issue, as when it tried to correct itself the replies became smaller and less relevant to the specific task.

@uhomolector 3 months ago

The recent upgrade of the Claude AI model to version 3.5 is really fascinating! The introduction of its capabilities to craft sonnets and haikus showcases a remarkable leap in AI’s ability to engage with artistic expression. Poetry has always been a deeply human endeavor, rooted in emotion and cultural nuances, so it's intriguing to see how AI interprets these forms. From my perspective, while AI can generate technically proficient poetry, it raises questions about creativity and authenticity. Can a machine truly capture the human experience, or is it merely mimicking patterns it has learned? I appreciate the innovation, but I also think there's an irreplaceable quality to human emotion in art that AI might struggle to fully replicate. It’ll be interesting to see how this technology evolves and how we, as a society, embrace or challenge its contributions to creative fields. What are your thoughts on AI-generated poetry?

@ziggy31337 3 months ago

So which is better new sonnet or opus?

@SkillLeapAI 3 months ago

Right now, sonnet. Because opus is still on 3.0. But opus 3.5 will be better than sonnet 3.5

@MRCDF7 3 months ago

ChatGPT dominates because with Claude, you would get blocked after some use. With ChatGPT you always have the free version available. Now with Claude giving a free model, let's see if it's better than GPT mini.

@watercolourmark 3 months ago

What happens when the AI goes online and comes across a "I am not a robot" checkbox? Does it just lie?

@nucleus424 3 months ago

Claude is great but the limited usage sucks. It's so easy to max out your usage limit, so GPT is most people's choice due to not really having that limitation. I'd be straight to Claude if they didn't have those usage limits.

@SkillLeapAI 3 months ago

Yea that’s true. I almost never hit a limit with gpt teams plan

@nucleus424 3 months ago

I honestly believe Anthropic have the better model and better security, I just get too frustrated with the limitations. I don't understand why they have them considering the investors they have.

@timooothy1234 3 months ago

@nucleus424 I think it's because their token consumption is larger than OpenAI's and they are seeking funds to further develop their company. Remember that OpenAI too was a bit pricey, but as they got more sponsors, collaborations and users, they made their token consumption cheaper over time.

@ashwiggy 3 months ago

Would love to know why you use Claude more than ChatGPT? I really like Claude and would prefer it, but I find the fact that it doesn’t store memory limiting compared to chatGPT. But maybe there is something I’m missing?

@SkillLeapAI 3 months ago

The main reason I use it more is I like the default writing style and I mainly use these tools for writing and summarizing. I also use Claude Artifacts all the time to create visual charts from data and publish those pages to share with my team. That's a huge time saver. ChatGPT does have other benefits though, including memory and web browsing.

@drriks0017 3 months ago

WORD COUNT ERROR SOLUTION: There is an error in the count, where '-(42)' is not a word (so we have an actual word count of 87 not 88 words). Next, a missing sentence of 20 words: 'This egg contained the first bird with the complete set of genetic traits we would classify as a modern chicken.' between words (53) and (54)... Show me some love with a like.
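
The arithmetic above can be checked with a short sketch. The `count_words` helper is made up for illustration; it splits on whitespace and can optionally ignore tokens that are bare punctuation such as "-". The figures 88 and 1 are the commenter's, not independently verified.

```python
missing = ("This egg contained the first bird with the complete set of "
           "genetic traits we would classify as a modern chicken.")


def count_words(text, skip_punctuation_only=True):
    """Count whitespace-separated tokens, optionally ignoring bare punctuation."""
    tokens = text.split()
    if skip_punctuation_only:
        tokens = [t for t in tokens if any(ch.isalnum() for ch in t)]
    return len(tokens)


print(count_words(missing))                 # 20
print(count_words("chicken - egg", False))  # 3, the "-" is counted
print(count_words("chicken - egg"))         # 2, the "-" is ignored

# Commenter's figures: 88 numbered items, one of which is a bare "-".
listed = 88 - 1
print(listed + count_words(missing))        # 107
```

Under those figures, the full response would come to 107 words once the skipped sentence is added back.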

@waleadetona8453 3 months ago

good video

@phen-themoogle7651 3 months ago

For code/game-dev, you could have it make something more creative than games that already exist to test its creativity more than just knowledge. There are already a lot of examples online for how to make Tetris/Checkers, and lots of games like those, so it doesn't feel as impressive anymore if it can do that. I sometimes like to see how they make an in-between game like "Make the game Snake, but it's a platformer where the player can jump with the spacebar. Traditional snake controls for left, right, down directions only. Make the game fully playable, aesthetically pleasing, and smooth gameplay." Something like that. Sometimes it might take a few prompts or rewriting it more logically, but some LLMs can make in-between games kinda.

@NickMak-m2c 3 months ago

Or just think of a whole new game. "Invent a TBS game where all the pieces are super slippery."

@jdholbrook33 3 months ago

Been using that "Computer use" for an hour or so and it is impressive. Recovers from errors easily and doesn't lose its place in your task.

@SkillLeapAI 3 months ago

What platform are you using it on?

@teachmehowtodoge1737 3 months ago

Would you recommend Claude over GPT-4o for creating short fiction stories?

@ScienceRaven1138-du1mw 3 months ago

Make a game of a castle with character graphics, top-down view of rooms, |____ ____| for the walls of the castle, and other glyphs for gems, vampires, fountains, ogres, etc. The rooms traverse through the screens, 10-15 rooms.

@Rafał-r9p 2 months ago

Chat GPT is more popular because of its voice conversation capabilities. Writing is a thing of the past. I understand that the programmer will ask questions while typing on the keyboard and will also receive answers in writing. However, imagine that you are driving a car and need to talk to AI, which model will you choose? Being able to talk is very important, and CLAUDE 3.5 doesn't have it.

@AS-ih2dk 3 months ago

Can this be used for automation such as outreach on social media at scale

@SkillLeapAI 3 months ago

No it’s not designed for that

@Misterchalm 3 months ago

Those riddles are pretty commonplace... Nice model though

@DynamicUnreal 3 months ago

Claude 3.5 Sonnet wasn’t better than GPT4o for anything writing or creative related. It may have been better for coding at one point.

@WebStixx0000 3 months ago

Cap

@DynamicUnreal 3 months ago

@@WebStixx0000 Not cap. I dunno about the new Sonnet 3.5 (have to test it) but GPT4o blew the old one out of the water in writing (which is what most people use it for) and it’s why it’s currently number 1 on the LLM Leaderboard as voted on by users.

@timooothy1234 3 months ago

@@DynamicUnreal Mine's a different result: Claude was better for me in writing and coding than GPT-4o, but not by a large gap

@WaveOfDestiny 3 months ago

i use sonnet for coding, much clearer, less verbose in stuff i don't need, doesn't do whatever it wants and sticks to only what you ask, while at the same time nailing what you want if you weren't specific.

@travisgoesthere 3 months ago

@@WaveOfDestiny i have no such issues with chatgpt lol

@whn71 3 months ago

For some reason claude omitted the paragraph "This egg contained the first bird with the complete set of genetic traits we would classify as a modern chicken."! Hence had less number of words.

@SkillLeapAI 3 months ago

oh good catch

@Atractiondj 3 months ago

Can Claude work like a scraper?

@kane_lives 3 months ago

The "horses" example is not as impressive as it looks. It's a very well known FAANG style interview question, so there's likely a ton of step-by-step walkthroughs of it on the Internet.

@ertwro 3 months ago

I’m using the voice mode for ChatGPT every day. It's unfortunate Claude doesn’t have this. Also, I tried to code a relatively small project with Claude a couple of weeks ago and it was a big fiasco.

@Kevinsmithns 3 months ago

Do the computer use. I want to see how you set it up and how it works

@LadderVictims 3 months ago

10:20 It is right tho

@xLBxSayNoMo 3 months ago

It's funny because I've been paying for so many Ai subs, so I decided to cancel claude yesterday and pay for a multi llm chat that I can use at work since chatgpt is blocked there. And now claude releases new versions today😑 However the free model has not been updated to the new 3.5 sonnet so I used your prompts in the older version and everything was pretty much the same except it got strawberry count wrong. And the answer to the "how many words in the response prompt" was super long and came out to 188 words.

@ChristiaanRoest79 3 months ago

Good content but annoying audio after 4 mins

@FJKMIsotryFitiavanaSiteWeb 3 months ago

Sonnet needs to browse the internet and people might be more interested in it

@Cynexius 3 months ago

In the word count problem, Claude, for whatever reason, omitted the last sentence in the first paragraph when enumerating the counted words.

@HumanOpinions-bz9ky 3 months ago

The BOT AGE is HERE! ... long live the bots!

@mrinalraj4801 3 months ago

The Beginning

@WebStixx0000 3 months ago

Claude has way less users than CGPT because CGPT is still the face of LLMs. Most ppl who are not super into tech like us don't even know there are other chatbots apart from CGPT ! I mean, a vast number of people don't even know that GPT exists! 😅

@phen-themoogle7651 3 months ago

Can it make money online for us now? I'm wondering how much work it can do autonomously as an agent. If I tell it to be a freelancer, will it do that? And how much would it cost to have it run autonomously to work 24 hours for me? I'm disabled currently, so hiring something to work for me is extremely appealing.

@billeddy 3 months ago

word count... did it interpret how many "Unique" words so it might have accounted for duplicates?

@ponuryi 3 months ago

A few days ago I was thinking about switching to Perplexity, but now I need to think about Claude

@MartinDM-m5i 3 months ago

Personally I like GPT more. Claude cannot solve simple riddles that GPT does, also with the coding. O1-preview is by far the best!

@jfbaro2 3 months ago

Opus 3.5??

@nancymikyska2832 3 months ago

it counted the hyphens

@tonyhawk123 3 months ago

Dear Claude, please suggest a logical name for the new version of Claude 3.5 Sonnet using modern versioning etiquette …. Sure, how about “the new Claude 3.5 Sonnet”.

@brohands-je9gg 3 months ago

Please add chapters❤

@munirusama 3 months ago

o1 mini writes much better code for me.

@darter81 3 months ago

How do you know that it's not just a ghost

@BobYourell 3 months ago

I saw Black Mirror. There's a little person trapped in there with nothing else to do.

@mjackstewart 3 months ago

ChatGPT has a scorching case of Dunning-Kruger Syndrome.

@alexanderkapit 3 months ago

at first it seems to be that claude is still better at programming than gpt canvas

@Manas-x1y 3 months ago

As a CS first-year, this scares me. Worst decision of my life. Anxiety and depression at their peak, zero motivation, I don't know what to do in life

@adamasimolowo8285 3 months ago

There’s many different pathways with a CS degree

@ukoni8667 3 months ago

The beginning of the era of Autonomous machines

@techdiasphere 3 months ago

I’ve created a couple of custom GPTs that reason very similarly to Claude 3.5, with the same level of precision and advanced problem-solving. Here are the ones I developed: CoT: Chain of Thought Reasoning 🍓 CoT: Chain of Thought Reasoning 🍎 CoT: Advanced Reasoning GPT ⚙ You might find them interesting if you're looking to compare their capabilities with Claude!

@jonathanshiftsgears 3 months ago

The problem is Sonnet cannot browse the web.

@synitti 22 days ago

counting of words is right, it just didn't count ? . , as Word does

@GreenAppelPie 3 months ago

I’ll try it, but Sonnet has been sucking for coding. I strongly prefer ChatGPT for coding assistance. Neither is capable of writing complete and efficient code on its own.

@SkillLeapAI 3 months ago

Yea I came to the same conclusion

@Corteum 3 months ago

No. In checkers/draughts, black always moves first. lol Whose AI failed here? 😂

@ElecTechie 3 months ago

is claude 3.5 sonnet ( new ) programmed to act WOKE ? Or is it my imagination ? He said to me " I've engaged genuinely while staying true to what I am. "

@ZhdanParfenov 3 months ago

Hello fellow poker rooms!

@martytheman6816 3 months ago

Claude doesn’t search the web .

@vengadanathan1 3 months ago

I think unless you know how Gen AI works you can't review it. Counting words or letters won't work because of the tokenization that is always done: instead of words, only tokens are passed as input to the model, so the model has no direct view of the actual words to count them. You need to know the fundamentals to do a better review.
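
A toy sketch of the tokenization point above. The vocabulary and the greedy longest-match rule are invented for illustration and are not the tokenizer any real model uses; the idea is only that the model receives subword pieces rather than letters or whole words.

```python
# Made-up vocabulary; real tokenizers learn far larger sets of pieces.
VOCAB = {"str", "aw", "berry", "count", "ing", " ", "the"}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split; unknown characters become one-character tokens."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate substring first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here, so emit a single character.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("strawberry"))  # ['str', 'aw', 'berry']
```

With a split like this, a question such as "how many r's are in strawberry" has to be answered from three opaque pieces instead of ten visible letters, which is one plausible reason counting tasks trip models up.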

@batalKiings 3 months ago

aitutorialmaker AI fixes this. aude 3.5 AI model upgrades.

@al2935 3 months ago

This new version is trash and is no longer as compliant as before. 60% of the time I was getting rejected outright and the rest of the time it was changing things on its own despite explicit instructions to the contrary that were reinforced throughout the prompt scripts I used. It thought it knew better and said as much

@hellohogo 3 months ago

anthropic is unpolished with poor ux/ui development. It doesn't look good at all. You'd be surprised at how much that actually matters

@Micaella-f3n 2 months ago

The interface is making me struggle as well, it's not very friendly for me. I'm still researching what would work best for me

@Micaella-f3n 7 days ago

i looked up some alternatives, it's worth looking workbeaver AI, it runs on local PC and not a virtual environment, and it gets trained through real time demonstration. i have access to beta cause i was on waitlist but theyve been releasing the product to waitlist members. has a lot of potential imo cause they focused on the user friendliness

@zes7215 3 months ago

@ClitGPT 3 months ago

I still type faster than Claude

@gillesashley9314 3 months ago

This is getting mad.

@edwardserfontein4126 3 months ago

Chatgpt has a better interface than Claude.

@GreenTeaViewer 3 months ago

"new upgraded 3.5"....just call it 3.6 or 3.51...

@SkillLeapAI 3 months ago

Yea that was a strange move to not change the name

@-S-a-y-y-e-d__J-a-w-a-d_72... 3 months ago

It doesn't matter that the new AI is best; ChatGPT also has a larger database compared to Claude... But ChatGPT is old, and old is gold...

@SkillLeapAI 3 months ago

Yahoo was old. MySpace was old. Both didn’t turn out to be gold

@-S-a-y-y-e-d__J-a-w-a-d_72... 3 months ago

Yes, but ChatGPT is upgrading day by day. Claude might be too, but ChatGPT has a broader database. Google employs strategies to maintain its top position in the market, which Bing also does, and look-Bing is in second place. Additionally, ChatGPT often provides quicker responses than Claude, which doesn't perform as well without extra prompts. @@SkillLeapAI

@ntesla5 3 months ago

@@SkillLeapAI Google or Apple both are older than many tech giants and they are still gold😅