Comments
@AbuBakr1 14 hours ago
Where the two differ: reasoning
@ccyberhub 1 day ago
The best way to use DeepSeek V3, in my opinion, and one that has worked perfectly with my flow: I use Cursor with Sonnet, and when I encounter an error THAT IS NOT BACKEND related I send it to DeepSeek V3 via Cline. I use it basically for frontend browser console errors.
@MarvijoSoftware 12 hours ago
Interesting! I'll make a video on this workflow!
@orthodox_gentleman 1 day ago
Also, in my experience DeepSeek is junk to say the least… typical for Chinese companies. It switches human languages at random and is just messy. I also think it might be stealing APIs from frontier providers.
@MarvijoSoftware 12 hours ago
Yes, many say it used frontier models for synthetic data training
@orthodox_gentleman 1 day ago
I’ve discovered that using the Architect slash command with a reasoning model (I employ Gemini 2.0 Flash Thinking) and then employing DeepSeek as the primary model and a smaller model as the weak model (Haiku, 4o mini), all through the OpenRouter API, within an .env file located in the root directory for Aider, yields significantly improved results when planning and executing a project. Aider is unable to provide a reasoning framework for standard LLMs that don’t incorporate Contextual Understanding, among other factors. Additionally, the Cline Code Extension is highly effective for project planning and execution since it systematically breaks down tasks into step-by-step instructions. Therefore, it might be beneficial to consider using Cline since it operates solely on one model at a time. LLMs such as Sonnet and DeepSeek struggle with complex one-shot prompts for the development of an entire application. It is crucial and imperative to utilize these “AI” tools appropriately and for the intended purpose! Peace, brother!
@BeYob 22 hours ago
Whoa, I love such detailed responses. I spent the last seven minutes looking through your profile, doing a reverse image search on your banner and wondering whether it’s Texas or Tennessee to refine the search. How do I slowly transition into something less creepy? 😅 I don’t have a specific purpose, but seven minutes of your time to just listen to you talk about a workflow or a product you’re building or what you had for dinner would… sigh. I’m procrastinating. I should really start. But I'd still love to chat. I’ll dance, dance. I’ll scrape, shoot, I’ll mow your lawn… in the metaverse
@MarvijoSoftware 12 hours ago
Agreed that the step-by-step process is more effective. Cline, however, used to use a lot more tokens before they introduced the diff edit method, like Aider did, so I leaned towards Aider. I'll give Cline another try since DeepSeek is so cheap and fast. I was using Qwen 2.5 and DeepSeek 2.5 before.
@neodark45895 7 hours ago
Cline is the best 🎉
@Swooshii-u4e 1 day ago
Do more like this
@MarvijoSoftware 12 hours ago
Will definitely do!
@Swooshii-u4e 1 day ago
You are the underdog
@MarvijoSoftware 12 hours ago
🙏🏾
@AB-cd5gd 1 day ago
So cool pls test phind, nestmind ai, qwen
@MarvijoSoftware 1 day ago
@AB-cd5gd will do
@AB-cd5gd 1 day ago
@MarvijoSoftware can't wait 🔥
@Seyfettin.a 1 day ago
Deepseek is the King of AI for Coding.
@jonathansegura1469 1 day ago
greattt! please use cline :)
@MarvijoSoftware 1 day ago
@jonathansegura1469 Cline uses a lot of tokens :) but now that DeepSeek 3 is here and it's CHEAP, I'll make a video with Cline. Also, Aider's prompts were better in my comparisons. Another "also": Aider has the powerful "always-yes" feature, which lets it continue when an LLM asks questions (I'll upload the video showcasing this once I'm done editing it)
@formixcode 1 day ago
with deepseek v4 plus vision no doubt that day hahaha either way is good for the devs
@DmitriZaitsev 1 day ago
Please use time codes
@MarvijoSoftware 1 day ago
I tried to use Chapters in the video. Please let me know if you need something else
@DmitriZaitsev 1 day ago
@MarvijoSoftware Can't see any chapters. Where are they located?
@DmitriZaitsev 1 day ago
Have you tried with a longer more detailed more precise prompt outlining all the details and requirements?
@MarvijoSoftware 1 day ago
Will do in a follow-up video. I wanted to show their capabilities first
@farrael004 1 day ago
Honestly, the work that you + aicoderking do is essential for programmers who use AI. It boggles my mind how others are not doing more content like this.
@yellowboat8773 1 day ago
Because it's mostly hype AI bro
@barberousse5112 1 day ago
The most interesting test is the one where you make a change in a repository, because most of the time we use Aider on an existing codebase. It would be better if there were more tests like that
@MarvijoSoftware 1 day ago
I'll definitely do more of those, thank you
@darkreader01 1 day ago
I have tested DeepSeek with multiple projects and it could not meet my expectations, while Claude did well. It's not worth the hype. My first choice is still Claude 3.5 Sonnet.
@TheBuzzati 1 day ago
Agreed
@MarvijoSoftware 1 day ago
I hear you. It's probably context length related. Deepseek is restricted. With that said, we always say, at this cost you can literally afford 2 extra prompts
@aryanvikash1967 1 day ago
Yeah, I agree.
@knubbe 1 day ago
Agreed! I had a problem with my code. I ran deepseek quite a few times, and nothing. Claude did it but it cost me $3 :((((((
@garfield584 1 day ago
Same.
@MarcusNeufeldt 1 day ago
Cool video mate! Keep it up
@MarvijoSoftware 1 day ago
🙏🏾 I'll definitely do, thank you for your kind words
@tecnopadre 1 day ago
I've made a complex app, and when DeepSeek couldn't make it, Claude did. Both together are awesome.
@jeffwads 1 day ago
Curious. I entered the Tetris prompt into the DeepSeek chat site (free) with DeepThink enabled and it produced a game similar in function to the one that Claude made. The screen and objects were fine, but they didn't move on their own unless I used the down or left/right arrow keys. Perhaps it would handle that other failed task better as well.
@MarcusNeufeldt 1 day ago
DeepSeek DeepThink is often better than V3... but obviously not so useful for production
@MarvijoSoftware 1 day ago
@MarcusNeufeldt nailed it!
@goqsane 1 day ago
Ever since DeepSeek came out I binned Sonnet. Not worth the money.
@blakesully7602 1 day ago
ur voice changed since the last video =\ (kzbin.info/www/bejne/qoSsaaSld7GKnpI) are u using some AI software to modify it? may I ask which one? 🤙
@Stuff_I_Found_Interesting 1 day ago
sounds the same to me? ¯\_(ツ)_/¯
@annonymbruger 2 days ago
That is a dead stupid comparison using aider. Aider is optimized for sonnet.
@remsee1608 1 day ago
It rules with DeepSeek V3; it’s much cheaper
@MarvijoSoftware 1 day ago
I looked at the Aider codebase and prompts. It was actually optimised for GPT-4 when it came out and later modified to cater for all models
@annonymbruger 1 day ago
@ yes, and Sonnet; it has a few variations depending on the model used. But there is nothing specific for DeepSeek, such as its smaller context window, diff format optimisations, or safeguards. Also, caching works differently with DeepSeek, and if you do not use DeepSeek correctly it will sometimes stream empty chunks. It’s not enough to declare the expected output in the system prompt.
@annonymbruger 1 day ago
@remsee1608 agree. And with a flow optimized for DeepSeek it outperforms Sonnet. Also, the training data for DeepSeek is newer and it makes a huge difference :)
@cbgaming08 5 days ago
Deepseek V3
@MarvijoSoftware 5 days ago
@cbgaming08 editing the video
@gemini_537 6 days ago
Do you know why Gemini 2.0 Exp didn't perform as well as Gemini 2.0 Flash Exp? I thought the former is larger and better.
@MarvijoSoftware 5 days ago
@gemini_537 this has been on my mind for a long time. I want to actually make that dedicated video. Just finishing up the DeepSeek 3 vs Sonnet video
@gemini_537 6 days ago
Gemini 2.0 Flash is super impressive ❤
@saladien9987 6 days ago
So as expected Claude is advanced while Gemini is a joke :D
@pabloescobar2738 8 days ago
Why compare Haiku, Anthropic's small model, with Gemini 2.0, a big model? It's crazy; you need to compare Sonnet 3.5 vs Gemini 2.0
@gemini_537 6 days ago
Gemini 2.0 Flash is a small model, so the comparison was correct.
@pabloescobar2738 6 days ago
@gemini_537 no, it's medium; if you say small model, that's Gemma 😉 I wonder whether you're a person or an AI, because your nickname reminds me of OpenAI's Strawberry
@gemini_537 6 days ago
@pabloescobar2738 That's not true. Gemini Pro is medium. Flash is small. 2.0 Flash Exp is free for now, but 1.5 Flash is 10x cheaper than 3.5 Haiku, which is further evidence that it is small
@gemini_537 6 days ago
@pabloescobar2738 No, Gemini Pro is the medium model while Gemini Flash is the small model. Gemma is not Gemini.
@henrymwingirwa884 9 days ago
I was trying to choose a coding LLM for a really complex codebase. Thanks, this is what I was looking for
@gemini_537 6 days ago
Context window matters for a complex codebase.
@James-bc2bf 10 days ago
I like your channel. No BS, just straight to the comparison. Entertaining commentary too.
@MarvijoSoftware 10 days ago
@James-bc2bf thank you man! That goes a long way
@petrosjr361 12 days ago
Include dubbed audio.
@MarvijoSoftware 12 days ago
I'll do that soon. Please subscribe in the meantime.
@sizwemsomi239 13 days ago
It's my first time coming across a South African LLM content creator. I've been coding with OpenAI and Claude for 2 years 😂😂 I started with ChatGPT 3.5, before people were even asking it to do snake games 😂😂 Bro, we started before the LLMs had an attach file option or copy code option. We had to code all these features on our own... back when jailbreaking would work easily 😂😂😂
@MarvijoSoftware 13 days ago
@sizwemsomi239 ola bafo! You also come a long way with LLMs I see. Let's push. If you haven't already, a sub would go a long way. My Reddit if you wanna keep in touch: u/marvijo-software
@augmentos 13 days ago
good vid
@MarvijoSoftware 13 days ago
@augmentos Thank you, I truly appreciate it. Please sub, it goes a long way
@Baleion 13 days ago
Interesting setup, thanks for sharing!
@MarvijoSoftware 13 days ago
Thanks! Glad you liked it. Please sub if you haven't already
@jrgzz 14 days ago
Weird Gemini is so slow for you. In my experience it’s insanely fast, like look away for a split second and it’s completed hundreds of lines of code fast. Very VERY much faster for me than Claude. I love Claude and pay for it but Gemini is so fast and free that I find myself using it for lots of stuff now
@MarvijoSoftware 13 days ago
You're probably using Gemini 2 Flash, which is very fast. I cover it in one of my other videos versus Claude 3.5 Haiku: kzbin.info/www/bejne/paGWmpSGh6eBkMksi=PgfH1EztFt_7Ofzy
@sentinel-q6j 9 days ago
Gemini just stops in the middle and loses all context for me
@selub1058 14 days ago
Gemini is not as fast, but the quality is better, I think
@selub1058 14 days ago
Thank you! Very interesting.❤
@MarvijoSoftware 13 days ago
Thank you @selub1058 for watching. Please subscribe if you haven't already
@jsalsman 14 days ago
[summary from o1, excerpted:] Overall Conclusion: Claude 3.5 Sonnet outperforms Gemini 2 (Exp) across all tasks, demonstrating faster response, better handling of instructions, and producing more functional, reliable code. Gemini 2 (Exp) is slower, struggles with the diff-edit format, frequently fails to follow instructions properly, and delivers less polished results. Final Verdict: Claude 3.5 Sonnet is the clear winner, consistently producing more usable, stable, and feature-complete solutions than Gemini 2 (Exp).
@MarvijoSoftware 14 days ago
Thanks. o1 Pro or just o1? Should I add these summaries somewhere like the description or a pinned comment for future videos, along with the available chapters? Won't that be a problem for viewers who don't want spoilers?
@jsalsman 13 days ago
@MarvijoSoftware just o1; don't pin, people who want spoilers can scrub to the end
@HoD999x 14 days ago
zero shot minigames and "uses database for prototype on its own" aren't real world use cases - give both 100kb spaghetti code and ask for a fix
@MarvijoSoftware 14 days ago
I hear and appreciate your comment. I have longer videos with an existing codebase on the channel. But I find that if LLMs can't complete the elementary code editing tasks like these (I gave them a code base with SQLite + Express + React (Vite) + Node with ShadCN and authentication already baked in), there's no need for us to test them in bigger code bases. I used that repo in multiple of these tests. Repo: github.com/marvijo-code/sqlite-express-react-nodejs-template In the Windsurf vs Cursor video I used another medium sized repo as a starting point because Claude 3.5 Sonnet already proved itself through these elementary tests: kzbin.info/www/bejne/mqavg4F6lpJ2iJI
@judgegroovyman 14 days ago
@MarvijoSoftware good points. I love these showdowns :)
@daviddeleeuw1 14 days ago
Is there an LLM that is good at .NET?
@jonathanwmaddison 15 days ago
Nice job - like the side by side comparison and relatively simple benchmarks.
@MarvijoSoftware 15 days ago
Hi Jonathan. Glad you enjoyed the comparison. I'll keep them coming. Any that you'd want to see? Qwen vs Deepseek perhaps?
@Swooshii-u4e 15 days ago
This is def going to get you subscribers over time. No one is doing this type of comparison
@MarvijoSoftware 15 days ago
Thank you, I'll definitely keep them coming
@MarvijoSoftware 15 days ago
NEW Claude 3.5 Haiku Code Test vs 3.5 Sonnet: kzbin.info/www/bejne/Z5jUhalnbbCqgMk
@MarvijoSoftware 15 days ago
Aider + Qwen 2.5 Coder 32B vs Claude 3.5 Sonnet (NEW): kzbin.info/www/bejne/qKnRd6GHfadneqc
@MichaelMerritt 20 days ago
Big fan of Windsurf. It built a fully functional, professional-looking Swift iOS app in a day, then I spent a second day just adding bonus features to it for fun. It fixed every bug, optimized performance, wrote a script to generate all sizes of iOS images, and walked me through releasing to the App Store. It’s pretty incredible; I haven’t been sleeping because I was so pumped about how well Windsurf works. I tried Cursor for a couple of days, but it wasn’t as good at reasoning with very large codebases (talking like 500k+ lines), while Windsurf does a decent job, though it sometimes gives up too early. It goes deep but not super super deep, yet it still often impresses me.
@MarvijoSoftware 22 days ago
Aider + NEW Claude 3.5 Sonnet FULL Stack Skill Sharing App | ReactJS ExpressJS + MongoDB: kzbin.info/www/bejne/rmWwm5KgpMx7b5Y
@MarvijoSoftware 22 days ago
Aider + Qwen 2.5 Coder 32B vs Claude 3.5 Sonnet (NEW)!: kzbin.info/www/bejne/qKnRd6GHfadneqc
@MarvijoSoftware 22 days ago
Qwen 2.5 Coder 32b + Aider | Desktop App + Database: kzbin.info/www/bejne/hoPLiYKPodeHsKM
@teocci 26 days ago
Marvijo very nice testing review. Ah could you please share that prompt test sheet? It will be nice to test other AI models. Thank you mate.
@MarvijoSoftware 26 days ago
@teocci thanks. @sheet okay, let me upload it somewhere so you can reuse it. I'll share the link
@MarvijoSoftware 22 days ago
docs.google.com/spreadsheets/d/1ybTpJvu0vJCYbGHJAG0DniyafNECTRzjgOjgzPSbOMo/edit?usp=sharing
@batibrahim 28 days ago
text to image?
@MarvijoSoftware 28 days ago
@batibrahim I'll try it in the longer video, along with a full stack app, very good idea
@MarvijoSoftware 28 days ago
Qwen 2.5 Coder 32b + Aider | Desktop App + Database: kzbin.info/www/bejne/hoPLiYKPodeHsKM
@MarvijoSoftware 28 days ago
Aider AI Agent Full Stack AI News App ASP.NET + Vite: kzbin.info/www/bejne/eaTOoIiNbdOLrMU