Reminds me of Google recently saying 25% of all code is now written by AI, when in reality it was just an LLM-powered auto-completion feature helping complete things such as function signatures.
@comradepeter87 a month ago
That's it, I'm repackaging every major LSP and claiming it already writes 90% of code (you tab-completed it, so it counts)
@UnexpectedTokens a month ago
it's what we all joked about when we first learned intellisense as a junior, "WHAT IF... we connect this to stackoverflow and google and autocomplete using those?!"
@PuntiS a month ago
It just feels like the name of the game these days is scamming people, investors, whatever you can rake quick money from. It's all so tiresome.
@WiseWeeabo a month ago
Yes, that's what Copilot is.. ?
@username7763 a month ago
Has google forgotten what it means to program? Or do they just not care?
@Kwazzaaap a month ago
I still can't believe there's people out there who think a programmer's product is "lines of code" instead of problem solving. Managers no less. Completely clueless.
@XDarkGreyX a month ago
The more legs a carpenter puts on a table, the better. Also, quality increases with number of screws and nails used.
@realhuman2545 a month ago
You have misinterpreted point number 2. What it is saying is that, given an arbitrary number of lines of code, Copilot creates code that has readability errors less frequently. Example: if in normal code a readability error occurs at every 100th line, then with Copilot it occurs at every 113th line. Note that I don't pass judgement on whether Copilot is good or not with this comment; I am simply pointing out your misinterpretation.
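To put numbers on that example: going from one readability error per 100 lines to one per 113 lines is a 13% gain in lines between errors, but only about an 11.5% drop in errors per line. Below is a quick sketch of the arithmetic. It uses only the 100 and 113 figures from the example above, and the function name is made up for illustration.

```python
def errors_per_line(lines_per_error: float) -> float:
    """Convert 'one error every N lines' into an error rate per line."""
    return 1.0 / lines_per_error

# Figures taken from the example in the comment above.
baseline_rate = errors_per_line(100)  # without Copilot
copilot_rate = errors_per_line(113)   # with Copilot

# Relative drop in errors per line (13/113, roughly 11.5%).
relative_reduction = (baseline_rate - copilot_rate) / baseline_rate

# Relative gain in lines written between errors (13%).
lines_gain = 113 / 100 - 1

print(f"errors per line: {baseline_rate:.4%} -> {copilot_rate:.4%}")
print(f"relative reduction in error rate: {relative_reduction:.1%}")
print(f"gain in lines between errors: {lines_gain:.1%}")
```

Both numbers describe the same change, so which one gets quoted is a presentation choice.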
@chanux a month ago
I keep joking with my colleagues that my dream is to lose my job to AIs that are capable of doing EXACTLY what managers ask them to. Nothing more. Nothing less. I'll go help the farmers for a while.
@LiveType a month ago
I had this metric before. You better believe the amount of lines of code I committed increased by 5000%. Top of the chart baby! That metric was quickly discarded when everybody figured out what I was doing and everybody started committing something like 5k lines per week. Also I only use ai code writing in my editor for single line auto complete. Anything more than a single line and accuracy drops dramatically to the point where it's not worth it. I've had to fix infuriatingly subtle bugs because of that more than once. But AI absolutely speeds up development if you know what you're doing. Anybody who says it doesn't do this is crazy. Overall code quality wise? I'm not entirely sure but at best it's the same, but typically worse. That's not an issue for personal projects but snowballs hard when you have 25 people all with differing opinions on how things should be done. The solution was a dramatic increase in the amount of extremely thorough tests. I'm talking like 15x the tests.
@Paxixify a month ago
In the 1700s, if you had money and too many sons, you put the useless son in the army and bought him a captaincy, so he could at least have a uniform for parties. The MBA serves the same purpose.
@asmrddict a month ago
Commit message: Made code 2.37% more readable. Merge Blocked: WTF are you talking about.
@exginto8053 a month ago
2,137%
@masu33 a month ago
push -f
@hungrymusicwolf a month ago
"We defined code errors as any code that reduces the ability for the code to be easily understood." - so they basically just gaslit us when talking about less errors as a statistic.
@MarcelRiegler a month ago
I bet they just added / removed things from what they consider "errors" until the numbers looked good. You know, draw the bullseye around the dart, like any good liar.
@Topcatyo. a month ago
15:30 - Hitchhiker’s Guide to the Galaxy (specifically, its sequel book) has a cloaking shield for a spaceship called the “Not My Problem” factor or something. It’s a cloaking shield that uses somebody’s unwillingness to have to deal with something like a giant spaceship being there to prevent people from seeing it. This feels very similar
@loopingdope a month ago
Douglas Adams is such a goat
@asgerkahljensen a month ago
Someone Else's Problem field
@robertbeisert3315 3 days ago
And it runs on a AA battery, because it takes very little to convince people that it's Someone Else's Problem
@roccociccone597 a month ago
The AI train has arrived at my company too... They recently released this internal chatbot which is useless, and it is often faster to simply read the Confluence article... As soon as you try to use these AI tools for something specific, it just falls apart. I am so tired of people without a clue hyping this shit up like it's the second coming of Christ. And the best thing is that you still need the expertise to review and correct mistakes the AI makes. So in the end you're probably spending more time trying to understand the code the AI created and making changes....
@OkAtheistExplainVim a month ago
My company is trying to position itself as "ai experts" because they hooked a chatbot into one demo app 😂
@sanjaycse9608 a month ago
@@OkAtheistExplainVimbroo😂
@fredrikfjeld1575 a month ago
I have seen live demos where AI drew out components and made it insanely easy to check for faults before loading it into a project. 4 people and two days of work done in an hour. A company delivered a project estimated to over 3000 hours and used only a bit over 100, with zero faults found during building. They have used AI to quality check something another company had manually quality checked, and found a fault that would cost tens of thousand each year. Now, people do try to get AI to do all sorts of things that doing manually would be better, but there are companies out there who have proven that it for certain tasks can improve productivity by more than 50%. Having seen a live demo of one, I am hundred percent certain my company will either have to use AI or downsize to take on really niche jobs very soon.
@ПошелНахуй-ш3и a month ago
It's so funny that something like "don't use made-up, non-existent functions and parameters" is a mandatory line in every AI prompt and AI still manages to use made-up things
@Opharg a month ago
We have a chatbot too, and it's actually ai, I couldn't believe it. The reason it's so absurdly funny is, that THE ONLY thing it can do is ask for your worker ID and birthday to reset your password. It's absolutely comical, like, just give me a form.
@guidoferri8683 a month ago
GitHub defined "error" with such an open mind that their brain fell out
@abdulkadiraminu262 a month ago
I swear, out of everything I was expecting from this article, I did not expect them to redefine an error
@dera_ng a month ago
🤣🤣🤣😂😂😂😅😅
@zeikjt a month ago
I bet they also redefined "statistically significant"
@PflanzenChirurg a month ago
@@zeikjt i bet they redefined every word of the article xD
@BboyKeny a month ago
For 100% readability, you need to align all text to the right stopping at column 80. This makes it more convenient to read on the punch cards.
@InvisibleSquids a month ago
Also helps if you emulate working on a 800x600 monochromatic tube
@Jckdrllng a month ago
Love when people say "statistically significant" without a p-value or any mention of what statistical tests they used. Stats is just hard; it's easy to use the wrong test or model for your data.
@0oShwavyo0 a month ago
Just to play devil’s advocate, once you start mentioning p-values your potential audience is going to get a lot smaller very quickly. I don’t think it’s uncommon to graduate a Bachelor’s program with only a single Stats class under your belt. Those folks should at least understand p-values, but to interpret how appropriate various methodologies are for a given data set requires more study than you can get in a single semester. I’d even say it requires you to stay up-to-date in the field. I took econometrics in school and can confidently say I’ve forgotten most of the information I’d need to choose an appropriate modeling strategy. With all that said, the problem emerges that eventually we need to rely on other statisticians to critique and police the statisticians presenting data to us, which does not resolve the issue of finding statisticians to be untrustworthy.
@harleyspeedthrust4013 a month ago
@@0oShwavyo0I learned about P values in high school, in a class about basic statistics. It's not too much to ask for a person to learn just a little bit about stats, and for those who report stats to include some more information. The problem is that "statistically significant" has lost its meaning. Nowadays something is "statistically significant" if someone feels like it is.
@wesleyd.4859 28 days ago
@@harleyspeedthrust4013 that’s not really the problem. Significance has a pretty strict definition. The problem is that the null hypotheses under most of these tests (0 difference, no error) are totally unrealistic. For a while statisticians have been screaming for people to abandon the notion of significance for other metrics. The American Statistician publishes something that is more or less titled “Abandon Statistical Significance” every year. Non-stats experts are slow to change because it’s so easy and so routine
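For anyone following the p-value discussion above, here is a minimal sketch of what a reported test could look like: a two-sided two-proportion z-test written with only the standard library. The counts are hypothetical and chosen for illustration; nothing here is taken from the study under discussion, and a real analysis might well call for a different test.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value), using the pooled standard error under the
    null hypothesis that both groups share the same true rate.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: the same 3-point gap (53% vs 50%) at two sample sizes.
z_small, p_small = two_proportion_z_test(53, 100, 50, 100)
z_large, p_large = two_proportion_z_test(5300, 10_000, 5000, 10_000)

print(f"n=100 per group:    z={z_small:.2f}, p={p_small:.3f}")
print(f"n=10000 per group:  z={z_large:.2f}, p={p_large:.6f}")
```

The same gap is nowhere near significant with 100 people per group and highly significant with 10,000, which is one reason a bare "statistically significant" says little without the sample size, the test used, and the size of the effect.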
@danmac_au a month ago
Wife who doesn't code had the perfect comment: You don't smoke your own green and you don't get your own AI to report on how well your AI performs
@youtubelisk a month ago
What backend does your wife use?
@user-fd7jd4jq1e a month ago
back 2 da kitchen @@youtubelisk
@pluto8404 a month ago
exactly, you dont beet your own meet. You get in a circle with your closest buddies and beat eachother.
@WiseWeeabo a month ago
Why would you not smoke your own green? You should 100% smoke your own green, why would you not? She's talking about being a drug dealer and how it will impact your profits if you consume the drug yourself and simply empty out your own stock. But if you're not a drug dealer then you should absolutely grow it for your own consumption, it's very economical and you have more control over the use of pesticides and stuff like that.
@WiseWeeabo a month ago
@@youtubelisk Probably Dyson and HexClad.
@bolinhodechuvaazedo a month ago
Crazy how LLMs should 10x everyone and now even MS giving bs stats could only conclude a 3% improvement ...? 3% is so small that a developer-to-developer change trumps that, even in 6 months one can easily get 3% better at coding
@ProfShibe a month ago
And LLMs will definitely not be 3% better at coding in 6 months 😂 they’re so bad, especially with databases, it’s comical
@ПошелНахуй-ш3и a month ago
Money-laundering capabilities for ai-hype are crazy
@ColinFox a month ago
LLMs could 10x our productivity if they didn't hallucinate and always produced top quality work. Unfortunately, they're brilliant idiots, and do stupid things as often as they do good things. You still need a skilled and knowledgeable developer there to actually make the decisions. I've been developing software professionally for 40 years. I use ChatGPT in a lot of my work, but more as an assistant/rubber duck than for producing code that I just drop into place. It's most useful when I have an idea for something and I want to spitball it out. It gets less useful when trying to solve more specific issues.
@defeqel6537 a month ago
a smart new keybinding is likely to give more of an improvement
@InvisibleSquids a month ago
@@defeqel6537setting up Prettier and EsLint to run on save has made a definitive difference for me, lol. Especially after defining solid configs for both
@samygafsi6429 a month ago
The part that weirds me out the most is that by replacing junior devs with AI, you cut the source for new senior and expert devs. Their numbers would only decrease then and never grow. Within few years, this might simply kill the dev industry without a replacement. What happens in this case ?
@XDarkGreyX a month ago
Age-old problem but as if any industry or even society learns
@borylesny a month ago
Same problem as with bridge designers and aerospace engineers? Will probably lead to similar solutions
@samygafsi6429 a month ago
@@borylesny Can you elaborate ? I don't quite understand what you mean
@borylesny a month ago
@@samygafsi6429 There are industries where junior members are a liability, and they do not disappear from lack of staff. A few years with a lack of senior programmers will cause wage inflation to the point where training seniors becomes an industry by itself.
@samygafsi6429 a month ago
@@borylesny That sounds like a reasonable take. I could see programming going more the medicine route, where programmers have to do many years of studies that include a lot of practice. And then you add more years for a specialty. It will be difficult to mimic the residency and internship phase in programming, but still not impossible.
@dancimpianu7217 a month ago
Thanks for reading my article, I'm glad you liked it! 💚
@RMDragon3 a month ago
Nice article! Something I wish you had also mentioned is how they never say how they evaluated terms like "concise". Obviously the intuitive meaning should be clear, but given that is not the case for other metrics in the article it is at least something I'm curious about. On top of that, there are plenty of (reasonable) ways you could evaluate how concise the code is. The definition of concise is giving a lot of information clearly and in few words. To me, for code, the "gives a lot of information" part would be code that achieves many tasks, but how do they evaluate that? Is it only based on the cases that passed all tests? Or is code that didn't successfully do everything also counted? Is the "few words" part evaluated based on number of lines, characters, files, functions, non-blank lines...? Also, what if the control group wrote code that would be necessary for a real application, but that wasn't tested for? For any of those options, you could change how you write your code to score well on those. To be clear, I'm not implying they told Copiloters what to do to score well. However, what I AM saying is that you can probably define a reasonable-sounding metric after the fact that gives you a 4% advantage towards the conclusion you already had.
@UltraGaivalas a month ago
yeah but who are you
@sutirk a month ago
LLM code is more "readable" because you've probably already read that specific function implementation on stack overflow before. Literally pre-read.
@agentofenhanced2428 a month ago
That's just a stupid over simplification thats not true, ai is trained on coding documents so when u ask it to write something it uses logic, yes if u ask it to create a few classes for a Minecraft clone it wont recreate an entire 3d engine library from scratch why would it ? it can just use lwjgl for example. And its funny how this guy is criticizing studies made by the company that makes the product while he is a programmer reviewing ai which would take his job if it were to work, and something he completely missed is that ai isnt bad because its stupid its bad because the hardware hasnt caught up, claude and chatgpt could create any fullstack app you can imagine if you gave it 100 rtx 4080 supers, ai sucks FOR you because you don't have enough hardware to contain such large context and performance
@RawrxDev a month ago
@@agentofenhanced2428 That's wrong across the board, first off, your usage of logic shows you don't understand how these models works, explain to me the part in the pipeline where logic is applied? Was it when the data is scraped? Was it when the data is transformed into vectors? Maybe when it's fit for the curve! Second, no, hardware is not the limiting factor, you're going to be very disappointed when scaling compute doesn't magically create AGI, diminishing returns is a buzzkill for sci-fi, but its part of our reality. We've already seen the start of diminishing returns, and I'd bet that the concept we've observed 1000000 times before won't magically be wrong this time just because people want to live in space.
@agentofenhanced2428 a month ago
@@RawrxDev dude I am not gonna deal with a "well akchtually" comment, you know damn well what I said. Your "we've reached the limit" is something people have thought about literally every single invention.
@RawrxDev a month ago
@@agentofenhanced2428 And most of the time, guess what, they're right. LLMs are a dead end. Could a different paradigm come about that leads to AGI? Sure, but LLMs are almost guaranteed to not be that path, and since that's where all the money is, I doubt we'll see big leaps for a while again. I think you're conflating scientific goals and "inventions". "Inventions" have always had their limits, which we've reached; it's the goal that we're still able to achieve by changing thought and perspective. The LLM invention is limited; the goal of intelligent AI is still possible.
@GoalOrientedLifting a month ago
@agentofenhanced2428 this is the hardest copium I've read in ages. Imagine assuming there's no flaws in AI, and only assuming it's a hardware limit😂😂😂
@Varelinski a month ago
AI slander is what makes my morning brighter.
@grendel6o a month ago
It's not slander if it's true.
@UnexpectedTokens a month ago
"13 more lines til error" "oh wow that's great, compiler errors right?" "no, linter errors" o_o
@Klaus-bm5ek a month ago
I suspect these readability improvements are entirely because of AIs adding too many comments, so you end up with "// this function sorts an array" right above "function sortArray(array)". I'm more curious about how you even measure "maintainability" as a metric (other than the obvious answer that you don't and it's all made up)
@pluto8404 a month ago
if AI code is so good you shouldnt even have to read it.
@sk-sm9sh a month ago
it would be funny if they asked AI to be the judge to decide which code is more readable
@agentofenhanced2428 a month ago
@@sk-sm9sh it actually does a good job at that, i gave claude two same programs in python, one was written completely explicitly and one implicitly, and it said that while its the same for it, it absolutely prefered the explicit version because it's easier to maintain, and a lot easier to read if multiple people are working on it
@JakubZabłocki-e5y a month ago
WAIT WAIT WAIT, why does this graph at 18:00 add up, but vertically? If you add up all the fails together you get 100%, same with the passes. This math, and the graph too, was made by AI, I bet you
@autohmae a month ago
amazing isn't it ? Like someone made a mistake with Excel to create a graph
@JakubZabłocki-e5y a month ago
@@autohmae But what mistake? And what is the logical path to creating that graph? The most logical way I see is "we take x examples with y condition and see how many times it fails and how many times it works". Here we have something so weird: "of 100% of the failures, X% is this group and Y% is that group". The logical path to that is so weird to me. That's why I think AI gave a little helping hand with this article
@autohmae a month ago
@@JakubZabłocki-e5y it's almost like it should have been horizontal instead of vertical. But Very much possible it was AI too.
@meherzadlakadia9740 a month ago
I assume the actual metric was, of those who did not pass all tests, 60% were control and 40% copiloters. It was probably done this way since it's the best illusion of gain. Bifurcation of a sliding scale into a binary scale (pass or not) was the first weirdness. If they'd not done that you'd likely plot a graph of score for each group (unit tests passed out of 10). Very shady.
@alpha.gamma.dingdong a month ago
wait wait wait so humans can make mistakes too? are you just proving that AI can help improve graphs too?
@nosotrosloslobosestamosreg4115 a month ago
Something owned by Micro$ift lied? What a shock!
@ToddMagnussonWasHere a month ago
I mean, Bill Gates has a pretty popular photo of himself pre-2019 next to the “How to Lie with Statistics” book by Darrell Huff. Read it out of curiosity, same shit an Advertising or Graphic/Marketing/Web Design college course will teach someone. But very telling of the man himself.
@rumplstiltztinkerstein a month ago
@@ToddMagnussonWasHere He must have written it during his vacation on Jeffstein Island
@lever1209 a month ago
microcock winblows 👍
@kowaihana a month ago
It's also laughable how my friend who uses Linux, uses GitHub copilot 😂
@nosotrosloslobosestamosreg4115 a month ago
@@kowaihana And browse with Edge as well XD
@rookbr a month ago
The const foo vs big ass regex line analysis 12:40 is just brilliant! :-)
@hungrymusicwolf a month ago
"There's lies, damned lies, and statistics" - Mark Twain (I believe). Whenever someone comes with statistics, assume those are taken out of context to the point of being wrong and you'll be statistically more likely to be right than if you believed those statistics. (Irony intended.)
@0oShwavyo0 a month ago
Ironically you’ve just presented a statistic to us (that the average presentation of statistical data is manipulative and inaccurate). Clearly there’s something pretty natural about using data to inform our worldview.
@hungrymusicwolf a month ago
@@0oShwavyo0 I'm well aware, hence the "irony intended" at the end.
@0oShwavyo0 a month ago
@ it would appear that my brain wasn’t braining this morning 🙃
@hungrymusicwolf a month ago
@@0oShwavyo0 We all got those days my friend. I hope the rest of your day is a better brained one (maybe even good dare I say).
@hinzster a month ago
Only checking code for "errors" like whitespace, repeated code or stupid stuff like that instead of checking for actual bugs is wild. But it's 100% on target for an LLM, which is a) a people pleaser and b) just a model that guesses the next word. Remember that. No, it is NOT intelligence. Forget that buzzword, it is a LANGUAGE model guessing the most likely next word to achieve total bliss in the requester's mind.
@alpha.gamma.dingdong a month ago
no offense but the human brain is a model that guesses the next word. it's just slightly better than an LLM some of the time. the information that you consider "intelligence" is just memory && context.
@0oShwavyo0 a month ago
@@alpha.gamma.dingdong The human brain is not an LLM. It has all kinds of functions that extend beyond prediction, like helping regulate breathing, heart rate, and other parts of the autonomic nervous system. Even in regard to its predictive capacity, the brain is capable of way more than any model that humans have conceived of currently, across many more domains at once, and much more efficiently. We grow new neurons over time, and form new synapses frequently. None of this is true for AI models. The only similarities come from the fact that the AI researchers are mildly interested in the brain for inspiration and like to name their systems after brain anatomy. Actual neurons do not exchange discrete digitized data with one another.
@Kwazzaaap a month ago
@@alpha.gamma.dingdong The idea that a brain is an AI model either assumes that we understand the brain or that AI models are beyond comprehension. Neither is true. I don't get why people feel so confident about being wrong about neuroscience. Maybe there's not enough neuroscientists to call you out on it.
@zzzdenda a month ago
They evaluated correctness and readability separately. They didn't say that bugs don't matter.
@Gwarzonicus a month ago
@@alpha.gamma.dingdongbruh no one understand the brain.
@Oler-yx7xj a month ago
I was just thinking how much will I have to lie in statistics (and how much of an artpiece do I want to make that lie) to make my uni homework project look actually decent, and here you are telling me exactly what to do! Thanks Prime, for your advice
@jorchard a month ago
0:34 so stoked that he accidentally highlighted the sentence correctly!!!
@arkinul2993 a month ago
why does he do it the way he does, anyway?, it annoys me an unreasonable amount
@sayandeepdey5471 a month ago
LLM hating is my hobby
@not_user11 a month ago
the wifeagen walking near the coding shed and then hearing: DORA! DOORAAAA!
@sk-sm9sh a month ago
It is possible to test readability very objectively, but hardly anyone ever tries to set it up. The experiment can look somewhat like this: you give two different pieces of code that implement the same functionality to two groups (large enough that the results have statistical significance) and you ask them to answer the same set of questions about what the code does. If one group answers more correctly and/or faster, that shows objectively that one implementation has better readability than the other.
@defeqel6537 a month ago
and to test maintainability, the participants would need to fix some bugs (how to make those equivalent in different types of code?) and implement new, large, features on top of the existing code base
@adambickford8720 a month ago
So simple, nobody does it.
@kzmatr a month ago
@@adambickford8720 he didn't say it is simple, but that it is just *possible* (although expensive) and you have to make a study just for that part of a larger study multiply that by each coding example you want to test
@0oShwavyo0 a month ago
Describing code in natural language and then trying to draw objective conclusions from that data is flawed. How do you accurately deduce how well someone understands a topic from their description, without introducing their writing ability as a variable? Better to ask what outputs or side effects a particular function has for a given set of inputs/initial state, some kind of question that has a measurable answer.
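The experiment described in this thread could be analysed along these lines. This is a rough sketch, assuming each participant gets a score with a measurable answer (here, correct answers out of 10 about what the code outputs) and that a permutation test on the difference in means is an acceptable analysis. The scores and function name are hypothetical, not data from any study.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    split = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Reassign participants to groups at random and recompute the gap.
        rng.shuffle(pooled)
        diff = mean(pooled[:split]) - mean(pooled[split:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (at_least_as_extreme + 1) / (n_resamples + 1)

# Hypothetical scores: correct answers out of 10 per participant.
scores_version_a = [9, 10, 9, 10, 9, 10, 9, 10]
scores_version_b = [6, 7, 5, 8, 6, 7, 5, 8]

observed_gap, p_value = permutation_test(scores_version_a, scores_version_b)
print(f"mean gap: {observed_gap:.2f} points, p ~ {p_value:.4f}")
```

The same function would work for completion times instead of scores. Deciding which questions to ask, and making them equally hard for both code versions, is the expensive part that no code sketch covers.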
@Saru-Dono a month ago
The "code quality" errors are even more nonsense if you take into account, that people who have Copilot installed are more likely to also have LSPs or formatters installed, which would warn you about some of this nonsense
@567saturn a month ago
3% percent encrease sound like a research error. How 3% encrease can even be considered an encrease. Big chance if you do same study again you get 2% decrease
@ValentineOficial a month ago
Increase* but yes
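Whether a small gap like this could be pure noise depends heavily on sample size, which a quick simulation can illustrate. The sketch below assumes the metric is a simple pass/fail rate, that both groups share the same true rate (so any gap is noise by construction), and that the gap is measured in percentage points. The group sizes, the 50% rate and the function name are all made up for illustration and are not taken from the study.

```python
import random

def chance_of_gap_under_null(n_per_group, min_gap_points,
                             true_rate=0.5, runs=300, seed=0):
    """Fraction of simulated studies showing a gap of at least
    `min_gap_points` percentage points when there is no real difference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        passes_a = sum(rng.random() < true_rate for _ in range(n_per_group))
        passes_b = sum(rng.random() < true_rate for _ in range(n_per_group))
        # Integer form of |a - b| / n >= min_gap_points / 100.
        if abs(passes_a - passes_b) * 100 >= min_gap_points * n_per_group:
            hits += 1
    return hits / runs

small_study = chance_of_gap_under_null(n_per_group=100, min_gap_points=3)
large_study = chance_of_gap_under_null(n_per_group=5000, min_gap_points=3)

print(f"100 per group:  {small_study:.0%} of no-effect studies show a 3+ point gap")
print(f"5000 per group: {large_study:.0%} of no-effect studies show a 3+ point gap")
```

With around a hundred people per group, a 3-point gap in either direction shows up most of the time even when nothing is going on; with thousands per group it becomes rare.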
@mattilindstrom a month ago
The term for seeing without perceiving is visual neglect. It's a debilitating condition mainly associated with traumatic brain injury. The condition can be life threatening, walking into traffic or dying of hunger with a full plate of food on the table.
@UNgineering a month ago
That study is the political equivalent of "we did a survey of a representative sample and discovered that 93% agree". What was the sample? What did they agree with? What does "representative" mean? and WHERE'S THE RAW DATA??
@nuvotion-live a month ago
That research was also 3% effective in convincing me. Statistically significant!
@rmidifferent8906 a month ago
18:47 The graph not adding up is so funny. If you swap lower numbers with each other you get 100%. Either they or the LLM mixed up the values. I wonder what is worse At least it's a statistically significant graph or something
@autohmae a month ago
It's not funny, it seems to be trying to create a misleading visual.
@CritLoren a month ago
@@autohmae eh it's still a 60-40 split either way, probably why it was so easy to make the mistake, but true, when 3% is "significant" that stands out
@AmirparsaDD a month ago
Lies or truths, data and statistics is cool no matter wat (only if its formatted in cool graphs and charts)
@Rohinthas a month ago
Huh, this was a really good article, it didnt just confirm my biases, it actually strengthened them!
@krisnahari a month ago
Great timing. I was just about to do my presentation. With some statistics!
@k98killer a month ago
One of my favorite features of Cursor is the ability to refactor something then hit tab as it applies the refactored pattern to other relevant bits of code. Letting it generate new GUI frames based on existing ones is also a good time saver -- starting from scratch with code gen is a painful experience of trial and error, but once you get it right, the AI is pretty good at replicating the correct patterns.
@bluedragon4008 a month ago
Just wow!! That "Grey Men" allegory is so precise. And correct.
@sammartin7515 a month ago
18:02 the chart percentages do add up to 100. its a really bad chart but its saying that of the people whose unit tests didnt pass, 37.8% were using copilot and 62.2% weren't. the same is true for the right side of the chart.
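The two ways of reading such a chart can be sketched as two different normalisations of the same table. The counts below are hypothetical, picked only so that the failure split comes out at 37.8% / 62.2% as quoted above; the real group sizes and the pass-side numbers are not known from this thread.

```python
# Hypothetical counts per group (NOT the study's data).
counts = {
    "copilot": {"fail": 17, "pass": 83},
    "control": {"fail": 28, "pass": 72},
}

def share_of_outcome(table, outcome):
    """Of everyone with this outcome, what fraction came from each group?
    These shares add up to 100% across groups (what the chart plots)."""
    total = sum(group[outcome] for group in table.values())
    return {name: group[outcome] / total for name, group in table.items()}

def rate_within_group(table, outcome):
    """Within each group, what fraction had this outcome?
    Pass and fail add up to 100% inside a group (what readers expect)."""
    return {name: group[outcome] / sum(group.values())
            for name, group in table.items()}

fail_shares = share_of_outcome(counts, "fail")
pass_rates = rate_within_group(counts, "pass")

for name in counts:
    print(f"{name}: {fail_shares[name]:.1%} of all failures, "
          f"{pass_rates[name]:.1%} pass rate within the group")
```

With these made-up counts, a 62/38 split of the failures corresponds to pass rates of 83% versus 72%, so the first framing makes the gap look larger than the per-group rates do.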
@haniffaris8917 a month ago
Finally, this format returns!
@victormasmontserrat5374 a month ago
a reference to the wheel of time with code ? phew .... you have earned my unconditional love
@CodecrafterArtemis a month ago
I've just reached 3:20 and I feel like if you have to write out percentages with two digits after the comma, you're trying to inflate the numbers in some way.
@blazernitrox6329 a month ago
Re: "People who present you with statistics are about to lie to you." I mean, I definitely get the sentiment. I would caveat that "People who present you with statistics _but no context_ are about to lie to you," but that happens so bleeding often that it seems to be the majority case (especially in our social-media-driven 280-characters-or-less social environment). If someone presents you with statistics, you should _immediately_ ask them how they got those statistics, and keep moving down the chain until you've gotten to the ins-and-outs of how whatever study was done (assuming there even _was_ a study). Re: Using AI to do the tedious stuff that you've already done before. As many people in the comments have pointed out, using AI doesn't make you a master at [insert field here]. You still need the expertise to verify the output. People are treating AI like it's some magic tool, but the truth is it's subject to all of the exact same caveats every other piece of software (or, really, tech in general) is. It's a tool. You still need to actually know what it is you're making and how it should be made; it just gives you an easier and faster way to do it - assuming you know how to use the tool as well. To really drive this point home, there's an interview (I believe with one of the big wigs at IBM, circa the early 1990s?) where the interviewee mentions that the thing with computers is that they should take over the _tedium_ so that people have more time to spend focusing on more important things. Likewise, it's mentioned (unsure if it's the same interview or a different one) that _computers can't be held responsible, and therefore shouldn't be allowed to make decisions._
@Imperial_Squid a month ago
3:37 speaking as a data scientist, I _really_ dislike this "all stats are lies" stance. While I agree it's possible to warp and bend statistics into saying what you want, they're still useful the vast majority of the time. If you don't have numbers to go off, what are you going to trust instead? Pure vibes? Analogies from yourself and your mates? Statistics, like a lot of things, are just tools. How they get used depends on the person using them. Saying "everyone who tells you stats is lying to you" is no more accurate than "everyone with a hammer is going to cave your skull in". If the person telling you the stats is trustworthy, the stats probably are too. So, as always, the real enemy here is companies trying to shill you products, that's all.
@ali32bit42 a month ago
i think he is saying the random statistics in marketing or internet posts are probably made up and misleading. actual science and research usually has real statistics because they have an incentive to use correct data instead of lying.
@dvhh a month ago
@@Imperial_Squid unfortunately most of the forewarning about statistics are usually not shared or ignored in order to "reduce the noise". And probably what I mean by that is that the most noisy statisticians are giving statistics a bad reputation.
@ONoesBird a month ago
Hey Prime. Love your content. Thank you for pushing educational content and not just playing games. It is harder and that sets you apart, and hey, you can do both! Play some games now and then, and then drop the knowledge once they're hooked ;)
@tylerthomas6932 a month ago
I built a service that you curl with some params to send localization JSON files with a target language which then runs through a lightweight LLM and stores it per page in a mongodb collection. It was just a demo, but the idea is, when you add text to your front end, you run this script and it translates to your target languages and when a user lands on your page, we can query the db, get the JSON and just smash it on the page and it should just work. Now to me that's a pretty cool use of AI.
@rumplstiltztinkerstein a month ago
This video is 13% more giga-chadable than the average
@pmiddlet72Ай бұрын
How does one measure things such as 'inconsistent', 'unclear', 'excessive', 'insufficient', or 'variable' without any formal definition of these terms? And given these subjective 'measures' sans definition, it appears there were some sort of null hypothesis significance tests, likely amounting to t-tests of means/proportions. The thing is that 'significance' often gets weaponized. As if a p-value is AT ALL a measure of true/false, yes/no, right/wrong. The ASA has made important statements on p-values (and p-hacking, along with a number of related unsavory practices) that explain by committee how these should / should not be used. There have been addenda to those statements as well to further clarify what has eternally been an issue in (often the misuse of) statistical analysis. What's also important is a concept called 'effect size': basically how large the difference between groups actually is, which in turn drives how many observations would be required to detect that difference in the first place given some chosen model to fit the data (whether it's the t-distribution - yeah, that's a model - or any other). On numbers not adding to 100%: this is curious, and for ease of understanding it should have been well vetted. However, it can happen due to things like sampling error, rounding error, missing data, and, what I think would be a possible BIG miss in this case, limitations in how the data were categorized in this experiment (some data may have overlapping information between 2 categories of things - this is what happens when one doesn't understand how to deal with hierarchical information). Not knowing the exact data and experimental methods they used (these should've been published), it's otherwise hard to know how those results were derived. But it 'does' give one the feeling of at least a hint of cherry-picking given what measures they reported on, doesn't it? These 'measures' feel more like proxies to me.
Maybe they didn't meet (or support) some desired marketing endpoints and decided to go with the easily subjective. It's one thing to get code correct (i.e. the code accurately reflects the input of the user, performs the function that it was intended to do, and ideally has no errors), which is a far cry from 'did the code meet PEP-8 or another standard for a language'. Would that mean the primary endpoint for Copilot was how well it does *linting*? Gah.... I won't dismiss the need for good code linting, as formatting does produce a certain class of errors syntactically. But if that were the goal of this experiment, they should've stated that up front. I guess the bottom line is simple - this wasn't created by any "data scientist" worth their weight in boogers vs a seasoned statistician (many folks of the data science ilk aren't, but then stats is only one part of that entire umbrella). Not that statisticians don't screw the pooch! Someone could've been paid to cherry-pick. Someone could've just been way off base with what methods should've been used for this sort of analysis (which could be anyone). Maybe it was an LLM that produced the output (always trust the digital oracle, right? There is no spoon....). Ok, maybe there's not a *simple* bottom line. But it's pretty clear to me there are some egregious errors, and many, if not most of them, feel 'marketing contrived'.
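To make the effect-size point in the comment above concrete: a measure such as Cohen's d expresses how large the difference between two groups is, in units of their pooled standard deviation, independently of whether a p-value crosses some threshold. A minimal Python sketch follows; the two score lists and the `cohens_d` helper name are invented for illustration and are not data or code from the study being discussed.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical scores, made up for this example only.
with_tool = [2, 3, 4, 5, 6]
without_tool = [1, 2, 3, 4, 5]

d = cohens_d(with_tool, without_tool)
print(f"Cohen's d = {d:.2f}")
```

A small d with a "significant" p-value usually just means the sample was large enough to detect a difference that may not matter in practice.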
@ThecatThecat-hq1opАй бұрын
30:40, anyone who is telling you that is lying to you. That's not how statistics work. The statistics derived from a sample that was taken in a specific location can only be applied in that location, and that's assuming the sample is actually random (and it's probably not in this case). This is statistics 101.
@jaroslavhuss7813Ай бұрын
I want more Family Feud... F@k I was laughing as hell, pure intellectual comedy.
@adissentingopinion848Ай бұрын
3:14 They're called lies, damned lies, and statistics for a reason
@gnerkusАй бұрын
PI
@Karol-g9dАй бұрын
In my 4th example, if the AI finds the issue, it adds 4 whitespaces to respect character spacing, like Google does. But the issue isn't Copilot; it's the way MS implements or vets the standard for keyboard cursor location. When I mentioned the issue to Copilot, it could not detect that the keyboard cursor location was wrong for some characters. I mean, that is the first thing that should be checked.
@ZT1STАй бұрын
@21:57; "Rendering dictionaries useless? I don't even know what that means!" *Two seconds later* "Okay, I wasn't expecting *that,* but I now know what they meant."
@SimeonRadivoevАй бұрын
The fun part is that no matter how much you mess around with the numbers, reality kicks in at some point. I guess in the process it will make some people lots of money in the short term.
@LyraelRayneАй бұрын
Just as you went on your code review rant it ran an ad with the lyrics “I just got fixated on somebody” and I wasn’t sure whether this was actually part of the video haha.
@boppinggamer8571Ай бұрын
18:56 It adds up to 100 if you only add the same color lines together. Purple added up with purple (37.8+62.2) and green with green.
@jazzochannelАй бұрын
1:58 "improved readability" all code should be write-only. I thought this discussion was settled.
@carlosdamasio349Ай бұрын
Statistics can be misleading or misused for sure but they’re easier to analyse or discuss than feelings or beliefs…
@timokreuzer1820Ай бұрын
"if you can’t write good code without an AI, then you shouldn’t use one in the first place." They shouldn't be writing any code. But they still do it. The tons of shitty code out there are proof of it. So maybe, if they started using AI, their code would be a bit less shitty. That would be a win for everyone. And they might even learn to write better code.
@rsdyeahhАй бұрын
at 18:20 the data is stacked vertically, and each stack adds up to 100%
@ShrikeGFXАй бұрын
The thing is AI does allow people to code things way out of their league. So in these leagues, the code quality is definitely reduced, but you can fish in much deeper waters now.
@derendohoda3891Ай бұрын
It is easy to lie with statistics but it is even easier to lie without them.
@sa-hq8jkАй бұрын
i’ve never asked chat gpt to generate code for me but i have asked it to point me in the right direction for a few specific problems. every. single. time. i’ve done that it’ll also give me a code sample of how it would be implemented and it’s always the ugliest most unconventional code i’ve ever seen (maybe cause it’s rust btw idk tho)
@ali32bit42Ай бұрын
when you DO want code samples... it hallucinates functions that don't exist or belong to some library you don't have. hey, how do i create a Jacobian matrix? it gave me "func _calcualate_jacubian_matrix(variable) " and no actual instructions on how it works and why...
@sa-hq8jkАй бұрын
@ i remember when it came out i asked it how to do bit packing in rust and it said to use u2 and u4 and it was at that moment that i realized the ai hype train was bs
@ali32bit42Ай бұрын
@@sa-hq8jk the prompt you would be writing to get the result you want can be longer and more complicated than the code sometimes.
@3v1lp1ngv1nАй бұрын
At a previous job we made a ctrie implementation that was 10-20% faster than what we'd been using before, and we abandoned it because it wasn't a significant enough improvement. And they are beating their chests about a 3-5% improvement.
@vukkumspАй бұрын
This is like my previous company, where every time we used their so-called AI tool to query something we had to fill in a record in an Excel sheet explaining how many hours the query saved and how many hours we would have expected to spend without it, along with the Jira numbers the query was related to, our office mail ID, employee ID number, the query itself, and the purpose of the query. The company gave awards and prizes to people who asked more queries (and updated the Excel sheet) and saved more hours (going by the "hours saved" numbers written in that Excel sheet).😅
@vukkumspАй бұрын
An even bigger problem is that they were more concerned about employees using their AI tool at least once per day and writing that query's details in Excel... Otherwise they'd call you and disturb your actual work. They proudly showcase those inflated numbers as if employees are saving time with their AI tool.
@9darkspells20 күн бұрын
19:00 I think I realize what they have done here: The true data is: of people who passed all unit tests, 39.2% of them weren't using copilot, and 60.8% were, meanwhile of the ones who did not pass all unit tests, 62.2% of them weren't using copilot and 37.8% were. So they can add up to 100% each, just not in the way they are presented.
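The arithmetic behind that reading can be checked directly. Below is a small Python sketch using the four percentages quoted in the comment above; the grouping into "passed" and "failed" follows the commenter's interpretation and is an assumption, not something confirmed by the article itself.

```python
from decimal import Decimal

# Percentages as quoted from the chart; keys are (outcome, group).
chart = {
    ("passed", "copilot"): Decimal("60.8"),
    ("passed", "no_copilot"): Decimal("39.2"),
    ("failed", "copilot"): Decimal("37.8"),
    ("failed", "no_copilot"): Decimal("62.2"),
}


def total(position, key):
    """Sum every chart entry whose key matches `key` at tuple index `position`."""
    return sum(value for k, value in chart.items() if k[position] == key)


# Grouping by outcome (same colour in the chart) versus by group (same row).
by_outcome = {o: total(0, o) for o in ("passed", "failed")}
by_group = {g: total(1, g) for g in ("copilot", "no_copilot")}

print("by outcome:", by_outcome)
print("by group:  ", by_group)
```

Only the per-outcome totals come to 100, which suggests the chart shows the share of each outcome by group rather than the pass rate within each group. Converting one into the other would need the group sizes, which are not given here.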
@scheimongАй бұрын
18:25 that was a very unexpected SNL reference. I literally muttered "woman" as you were reading that 😅
@LordHonkIncАй бұрын
"Don't trust a statistic you haven't faked yourself" was the first thing my prof said to us. Clearly facetious but with a kernel of truth, just like every great quote
@CheapoPremioАй бұрын
I do love copilot and jetbrains AI Assist, but only for boilerplate code completion. It's nice that it auto completes all fields on a table that I designed 5 minutes ago correctly. But other than that and the odd regex I create with it I don't rely on ai for my code in any way.
@TheFrankvHoofАй бұрын
Automatically analyzing what Regex is doing in any open Pull Request actually sounds like a great way to use AI. If an AI tells me what it thinks that regex does, I can compare that to what I think it does (or is supposed to be doing). Some company like Github should get on that.
@maccsguitarАй бұрын
Copiloters write the description for copilot to generate the functionality. Oh look wow, the code doesn't have 'missing documentation'
@mathematik1865Ай бұрын
It is true that only 30 samples are needed if your data represents the entire population. In this case, the data does not represent the whole population (of developers as you said) but rather a smaller sample. This is a common misunderstanding. What was done here is that they selected (as they claim) 200 developers. The question is: from where? For instance, if they focused on open-source contributors on GitHub, it would be accurate to say that this sample represents the population of open-source contributors on GitHub. However, this excludes developers who do not contribute to open-source projects or use GitHub alternatives. The 30 samples rule does not apply if you already have a selection bias.
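The selection-bias point above can be illustrated with a toy simulation: if you only ever sample from one sub-community, a larger sample just makes you more confident about that sub-community, not about the whole population. This is a sketch under stated assumptions; the group shares and success rates are made-up numbers, and `biased_sample_mean` is a hypothetical helper, not anything from the study.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 30% belong to the sampled community, 70% do not,
# and the two groups have different true success rates (made-up numbers).
IN_GROUP_SHARE, IN_GROUP_RATE, OUT_GROUP_RATE = 0.3, 0.8, 0.4
true_rate = IN_GROUP_SHARE * IN_GROUP_RATE + (1 - IN_GROUP_SHARE) * OUT_GROUP_RATE


def biased_sample_mean(n):
    """Estimate the success rate by sampling only from the in-group."""
    return sum(rng.random() < IN_GROUP_RATE for _ in range(n)) / n


for n in (30, 200, 5000):
    print(n, round(biased_sample_mean(n), 3), "vs population", round(true_rate, 3))

large_estimate = biased_sample_mean(5000)
```

The estimate settles near the in-group rate as n grows and stays well away from the population rate, no matter how many samples are drawn.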
@lestinkstiefelАй бұрын
Hey! Love listening to you reacting to articles! Have u ever considered releasing these as an audio format? I don’t own yt premium so I can’t play this with my phone off
@temistoclesvaldes5312Ай бұрын
Where are the confidence intervals for these stats? What about standard error? Hold on - you think 202 coders is an adequate sample size and representative of the population of coders? I'm at 2:11 and stopped and went and got another drink.
@joopie4661426 күн бұрын
every time i go on primeagen i see 41 minute long video, so I wait to see if the title pops up in the first minute otherwise I go
@monq02Ай бұрын
the graph part at 18:59 adds up to 100% if you add them by color instead of by line
@kalebbruwerАй бұрын
I love how they have qualitative measurements down to two significant digits
@ZutraxiАй бұрын
Is it just me, with too little experience, who thinks most of what devs do is essentially API endpoints (09:50)? Very few people get to work on the next crypto algorithm or quantum computer, or on building complex simulations/rendering. Most people will just shuffle data around, while a few build the tooling for that shuffling.
@defeqel6537Ай бұрын
usually there is a bit more logic behind those endpoints than just CRUD
@lazyman2451Ай бұрын
only use it for quick short snippets of code, everything else do not use it.
@googelygoo1Ай бұрын
I've been using vim movements for 10 years now and doing programming interviews with a simple embedded text editor in a web browser is one of the most painful things I've had to do in a long time
@JdotCarverАй бұрын
3:43 If this were true, ANY AND ALL statistics would be lies. Making the entire act of research unfounded. This is wrong on so many levels. I understand why he posted it to Twitter. ;)
@stormoffistsАй бұрын
Correcting "pedanterism" was good stuff
@ET.AIMusicАй бұрын
You made me laugh with this video man, happy TG! Strongly agree about building a strong team that can work with AI while getting compensated properly. AI is not here to replace those who perform and should be appreciated, for now at least. This is of course my own personal opinion as both a developer and a company owner.
@perz1valАй бұрын
2:20 I mean, copilot is suggesting a more verbose version, which I may not have written myself, because I've already got an idea for a more compact solution that is naturally a little harder to understand locally. However, shorter code (assuming no "magic") resulting in a 100 LOC file is easier to wrap your head around than a 300 LOC file, just because I'm not constantly searching through it to find the meaningful line. Example: array map with ternary one-liner VS a for loop with if+else pushing to result array, then assigning it to the original array variable. The latter is more readable, but do we care?
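For reference, the two styles compared in the comment above look roughly like this. The commenter was probably thinking of JavaScript's `Array.map`; this sketch uses Python's equivalent (a list comprehension with a conditional expression) instead, and the sample list and the "clamp negatives to zero" rule are invented purely for illustration.

```python
values = [3, -1, 4, -5, 9]

# Compact version: one expression with a conditional (Python's "ternary").
compact = [v if v >= 0 else 0 for v in values]

# Verbose version: explicit loop with if/else, pushing to a result list,
# then assigning it back to the original variable.
result = []
for v in values:
    if v >= 0:
        result.append(v)
    else:
        result.append(0)
values = result

print(compact == values)
```

Both produce the same list; the difference is one line to scan versus seven.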
@Tony-dp1rlАй бұрын
I would be really interested to see the comparison of AI vs Human in terms of achieving functional goals with the fewest statements.
@charlesd4572Ай бұрын
Interestingly, the guy who wrote VisiCalc retrained in business at the end of the 70s, as he thought that with the wide adoption of compilers and interpreters there would be less need for programmers (experts in assembly and lower-level compiled languages). That's when he got the idea for a spreadsheet. So this is not the first time we've been through this sort of thing. I think we'll end up with very high level languages at some stage that constrain the AI so it can be more precise and predictable.
@bastiana3611Ай бұрын
11:26 true, it's mathematically proven that every regular expression can be represented by a FSM.
@MankepankeАй бұрын
Isn't that as useful as saying "all programs can be expressed using an AST"?
@SpektralJoАй бұрын
That is true iff we mean the mathematical definition of regular expressions. Modern regexes have gained features like backreferences that allow you to match some non-regular languages as well, which cannot be represented by an FSM.
@diadetediotedio691815 күн бұрын
@@Mankepanke No? Saying "all programs can be expressed using an AST" just says about the syntactical properties of programs, saying "all regexes can be represented by a FSM" says about their operational semantics. It shows their limitations, and by relation what you can and cannot expect from them.
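A short sketch of both halves of this thread, in Python. The first part hand-writes a finite-state machine for the regular expression `(ab)*c` and checks it against Python's `re` module on a few strings; the second shows a backreference pattern that matches strings of the form aⁿbaⁿ, a language that is not regular. The pattern, state numbering, and sample strings are all chosen for illustration.

```python
import re

# Hand-written DFA for the regular expression (ab)*c.
# States: 0 = start / after a full "ab", 1 = after "a", 2 = after "c" (accepting).
TRANSITIONS = {(0, "a"): 1, (1, "b"): 0, (0, "c"): 2}
ACCEPTING = {2}


def dfa_accepts(text):
    """Run the DFA over `text`; reject as soon as no transition is defined."""
    state = 0
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING


PATTERN = re.compile(r"(ab)*c")
samples = ["c", "abc", "ababc", "", "ab", "ac", "abcc", "bac"]
agree = all(dfa_accepts(s) == bool(PATTERN.fullmatch(s)) for s in samples)
print("DFA and regex agree on all samples:", agree)

# A backreference forces the text after "b" to repeat the text before it,
# i.e. a^n b a^n. That language is not regular, so no finite-state machine
# recognises it for every n.
BACKREF = re.compile(r"(a+)b\1")
print(bool(BACKREF.fullmatch("aabaa")), bool(BACKREF.fullmatch("aaba")))
```

The DFA needs only three states because it never has to count; the backreference version has to remember an unbounded number of `a`s, which is exactly what a finite set of states cannot do.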
@ranzrumJojoJDDom373Ай бұрын
@4:20, readability issues in a blind review... sounds a bit suspect... or just bad readability.
@marcr8181Ай бұрын
I'd be interested in checking those 10 Unit Tests to see if you can easily write code that satisfies the tests but whose functionality is actually broken.
@k98killerАй бұрын
Fun fact: if you want a significant result, you can just do the same study 10-20 times and get a statistically significant result by pure random chance.
@defeqel6537Ай бұрын
Indeed. Or do a bunch of studies with different parameters / methodology and only publish the ones that are favorable to your product. Pharma does this all the time. There are also ways to pre-cook the data, perform the study afterward, and not mention it in the study.
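The repeated-study point above can be quantified. Assuming the studies are independent, there is no real effect, and each uses the conventional 0.05 significance threshold, the chance that at least one of them comes out "significant" is 1 - (1 - 0.05)^n. A minimal Python sketch, with `chance_of_false_positive` as a hypothetical helper name:

```python
def chance_of_false_positive(num_studies, alpha=0.05):
    """Probability that at least one of `num_studies` independent studies of a
    true null effect comes out significant at level `alpha`."""
    return 1 - (1 - alpha) ** num_studies


for n in (1, 10, 20):
    print(n, round(chance_of_false_positive(n), 3))
```

So a false positive is not guaranteed, but under these assumptions it is roughly a 40% chance with 10 attempts and about 64% with 20, which is why unreported repeat runs are a problem.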
@IvanRandomDudeАй бұрын
"There are lies, damned lies and statistics."
@diogofirmino7353Ай бұрын
If I close my eyes, it's like Steve Carell is reacting to tech articles
@dragonridertechnologiesАй бұрын
Ignored actual errors in place of style "issues"? If that's not today's corporate world in a nutshell I don't know what is!
@noid3571Ай бұрын
> programmers with copilot write more lines of code > AI code is more concise They already lost me
@tsyf1Ай бұрын
That Norm Macdonald joke left me both surprised and laughing 😂
@PJutchАй бұрын
37:00 is it bcs it became easier to change something or bcs more code sucks now?
@PJutchАй бұрын
Ok, if it's recent changes, then yeah...
@ShadowZero27Ай бұрын
what i need is a copilot with different dials so i can increase the different metrics, like more function, fewer readability errors per block
@alexlowe2054Ай бұрын
13:00 Prime describing Bike Shedding, with almost the perfect example for exactly when Bike Shedding is common, and also bad.
@riccardo-964Ай бұрын
Reliability, in the strictest sense, is the ability to function without ever ceasing to provide service. Now, how do you measure that in that code??
@woog1eАй бұрын
18:58 the graphs are correct, but kinda obfuscated: 37.8% (using copilot, did not pass all tests) + 62.2% (not using copilot, did not pass all tests) = 100%, and 60.8% (using copilot, passed all tests) + 39.2% (not using copilot, passed all tests) = 100%. i'm not defending AI :) but when i see a statement like that in the article (*), it doesn't reflect well in terms of merit or credibility
@dvhhАй бұрын
In a pedantic fashion, the graph would be incorrect as it failed to be read properly at first glance