Great... Github Lies About Copilot Stats

Views: 195,207

A day ago

Comments: 532

@SethGrantham-k1x a month ago

Reminds me of Google recently saying 25% of all code is now written by AI, when in reality, it was just an LLM-powered auto-completion feature helping complete things such as function signatures.

@comradepeter87 a month ago

That's it, I'm repackaging every major LSP and claiming it already writes 90% of code (you tab-completed it, so it counts)

@UnexpectedTokens a month ago

it's what we all joked about when we first learned intellisense as a junior, "WHAT IF... we connect this to stackoverflow and google and autocomplete using those?!"

@PuntiS a month ago

It just feels like the name of the game these days is scamming people, investors, whatever you can rake quick money from. It's all so tiresome.

@WiseWeeabo a month ago

Yes, that's what Copilot is.. ?

@username7763 a month ago

Has Google forgotten what it means to program? Or do they just not care?

@Kwazzaaap a month ago

I still can't believe there's people out there who think a programmer's product is "lines of code" instead of problem solving. Managers no less. Completely clueless.

@XDarkGreyX a month ago

The more legs a carpenter puts on a table, the better. Also, quality increases with number of screws and nails used.

@realhuman2545 a month ago

You have misinterpreted point number 2. What it is saying is that, given an arbitrary number of lines of code, Copilot creates code that has readability errors less frequently. Example: if in normal code a readability error occurs at every 100th line, then with Copilot it occurs at every 113th line. Note that I don't pass judgement on whether Copilot is good or not with this comment; I am simply pointing out your misinterpretation.
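To make the arithmetic in that example concrete, here is a minimal Python sketch. It assumes the claim really is about lines per readability error, and it uses the illustrative 100 and 113 from the comment above, not figures taken from the study. The function name is made up for illustration. Going from one error every 100 lines to one every 113 lines means 13% more lines between errors, which is roughly 11.5% fewer errors per line.

```python
def error_rate_reduction(baseline_lines_per_error: float,
                         new_lines_per_error: float) -> float:
    """Relative drop in errors per line when errors become less frequent.

    Both arguments are "lines of code per readability error"
    (hypothetical inputs, higher means fewer errors).
    """
    baseline_rate = 1 / baseline_lines_per_error  # errors per line before
    new_rate = 1 / new_lines_per_error            # errors per line after
    return 1 - new_rate / baseline_rate


# Illustrative numbers from the comment: every 100th line vs. every 113th line.
reduction = error_rate_reduction(100, 113)
print(f"{reduction:.1%}")  # prints 11.5%
```

So "13% more lines per error" and "about 11.5% fewer errors per line" describe the same change. Which of the two gets quoted affects how large it sounds.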

@chanux a month ago

I keep joking with my colleagues that my dream is to lose my job to AIs that are capable of doing EXACTLY what managers ask them to. Nothing more. Nothing less. I'll go help the farmers for a while.

@LiveType a month ago

I had this metric before. You better believe the amount of lines of code I committed increased by 5000%. Top of the chart baby! That metric was quickly discarded when everybody figured out what I was doing and everybody started committing something like 5k lines per week. Also I only use ai code writing in my editor for single line auto complete. Anything more than a single line and accuracy drops dramatically to the point where it's not worth it. I've had to fix infuriatingly subtle bugs because of that more than once. But AI absolutely speeds up development if you know what you're doing. Anybody who says it doesn't do this is crazy. Overall code quality wise? I'm not entirely sure but at best it's the same, but typically worse. That's not an issue for personal projects but snowballs hard when you have 25 people all with differing opinions on how things should be done. The solution was a dramatic increase in the amount of extremely thorough tests. I'm talking like 15x the tests.

@Paxixify a month ago

In the 1700s, if you had money and too many sons, you put the useless son in the army and bought him a captaincy, so he could at least have a uniform for parties. The MBA serves the same purpose.

@asmrddict a month ago

Commit message: Made code 2.37% more readable. Merge Blocked: WTF are you talking about.

@exginto8053 a month ago

2,137%

@masu33 a month ago

push -f

@hungrymusicwolf a month ago

"We defined code errors as any code that reduces the ability for the code to be easily understood." - so they basically just gaslit us when talking about less errors as a statistic.

@MarcelRiegler a month ago

I bet they just added / removed things from what they consider "errors" until the numbers looked good. You know, draw the bullseye around the dart, like any good liar.

@Topcatyo. a month ago

15:30 - Hitchhiker’s Guide to the Galaxy (specifically, its sequel book) has a cloaking shield for a spaceship called the “Not My Problem” factor or something. It’s a cloaking shield that uses somebody’s unwillingness to have to deal with something like a giant spaceship being there to prevent people from seeing it. This feels very similar

@loopingdope a month ago

Douglas Adams is such a goat

@asgerkahljensen a month ago

Somebody Else's Problem field

@guidoferri8683 a month ago

GitHub defined "error" with such an open mind that their brain fell out

@abdulkadiraminu262 a month ago

I swear, out of everything I was expecting from this article, I did not expect them to redefine an error

@dera_ng a month ago

🤣🤣🤣😂😂😂😅😅

@zeikjt a month ago

I bet they also redefined "statistically significant"

@PflanzenChirurg a month ago

@zeikjt I bet they redefined every word of the article xD

@danmac_au a month ago

Wife who doesn't code had the perfect comment: You don't smoke your own green and you don't get your own AI to report on how well your AI performs

@youtubelisk a month ago

What backend does your wife use?

@user-fd7jd4jq1e a month ago

back 2 da kitchen @@youtubelisk

@pluto8404 a month ago

exactly, you dont beet your own meet. You get in a circle with your closest buddies and beat eachother.

@WiseWeeabo a month ago

Why would you not smoke your own green? You should 100% smoke your own green, why would you not? She's talking about being a drug dealer and how it will impact your profits if you consume the drug yourself and simply empty out your own stock. But if you're not a drug dealer then you should absolutely grow it for your own consumption, it's very economical and you have more control over the use of pesticides and stuff like that.

@WiseWeeabo a month ago

@youtubelisk Probably Dyson and HexClad.

@roccociccone597 a month ago

The AI train has arrived at my company too... They recently released this internal chatbot which is useless, and it is often faster to simply read the Confluence article... As soon as you try to use these AI tools for something specific, it just falls apart. I am so tired of people without a clue hyping this shit up like it's the second coming of Christ. And the best thing is that you still need the expertise to review and correct mistakes the AI makes. So in the end you're probably spending more time trying to understand the code the AI created and making changes....

@OkAtheistExplainVim a month ago

My company is trying to position itself as "ai experts" because they hooked a chatbot into one demo app 😂

@sanjaycse9608 a month ago

@OkAtheistExplainVim broo 😂

@fredrikfjeld1575 a month ago

I have seen live demos where AI drew out components and made it insanely easy to check for faults before loading it into a project. 4 people and two days of work done in an hour. A company delivered a project estimated at over 3000 hours and used only a bit over 100, with zero faults found during building. They have used AI to quality check something another company had manually quality checked, and found a fault that would cost tens of thousands each year. Now, people do try to get AI to do all sorts of things that would be better done manually, but there are companies out there who have proven that, for certain tasks, it can improve productivity by more than 50%. Having seen a live demo of one, I am a hundred percent certain my company will either have to use AI or downsize to take on really niche jobs very soon.

@ПошелНахуй-ш3и a month ago

It's so funny that something like "don't use made-up non-existent functions and parameters" is a mandatory line in every AI prompt and AI still manages to use made-up things

@Opharg a month ago

We have a chatbot too, and it's actually AI, I couldn't believe it. The reason it's so absurdly funny is that THE ONLY thing it can do is ask for your worker ID and birthday to reset your password. It's absolutely comical, like, just give me a form.

@BboyKeny a month ago

For 100% readability, you need to align all text to the right stopping at column 80. This makes it more convenient to read on the punch cards.

@InvisibleSquids a month ago

Also helps if you emulate working on an 800x600 monochromatic tube

@Jckdrllng a month ago

Love when people say "statistically significant" without a p-value or any mention of what statistical tests they used. Stats is just hard; it's easy to use the wrong test or model for your data.

@0oShwavyo0 a month ago

Just to play devil’s advocate, once you start mentioning p-values your potential audience is going to get a lot smaller very quickly. I don’t think it’s uncommon to graduate a Bachelor’s program with only a single Stats class under your belt. Those folks should at least understand p-values, but to interpret how appropriate various methodologies are for a given data set requires more study than you can get in a single semester. I’d even say it requires you to stay up-to-date in the field. I took econometrics in school and can confidently say I’ve forgotten most of the information I’d need to choose an appropriate modeling strategy. With all that said, the problem emerges that eventually we need to rely on other statisticians to critique and police the statisticians presenting data to us, which does not resolve the issue of finding statisticians to be untrustworthy.

@harleyspeedthrust4013 a month ago

@0oShwavyo0 I learned about p-values in high school, in a class about basic statistics. It's not too much to ask for a person to learn just a little bit about stats, and for those who report stats to include some more information. The problem is that "statistically significant" has lost its meaning. Nowadays something is "statistically significant" if someone feels like it is.

@wesleyd.4859 22 days ago

@harleyspeedthrust4013 That's not really the problem. Significance has a pretty strict definition. The problem is that the null hypotheses under most of these tests (0 difference, no error) are totally unrealistic. For a while statisticians have been screaming for people to abandon the notion of significance for other metrics. The American Statistician publishes something that is more or less titled "Abandon Statistical Significance" every year. Non-stats experts are slow to change because it's so easy and so routine.

@dancimpianu7217 a month ago

Thanks for reading my article, I'm glad you liked it! 💚

@RMDragon3 a month ago

Nice article! Something I wish you had also mentioned is how they never say how they evaluated terms like "concise". Obviously the intuitive meaning should be clear, but given that is not the case for other metrics in the article, it is at least something I'm curious about. On top of that, there are plenty of (reasonable) ways you could evaluate how concise the code is. The definition of concise is giving a lot of information clearly and in few words. To me, for code, the "gives a lot of information" part would be code that achieves many tasks, but how do they evaluate that? Is it only based on the cases that passed all tests? Or is code that didn't successfully do everything also counted? Is the "few words" part evaluated based on number of lines, characters, files, functions, non-blank lines...? Also, what if the control group wrote code that would be necessary for a real application, but that wasn't tested for? For any of those options, you could change how you write your code to score well on those. To be clear, I'm not implying they told Copiloters what to do to score well. However, what I AM saying is that you can probably define a reasonable-sounding metric after the fact that gives you a 4% advantage towards the conclusion you already had.

@UltraGaivalas a month ago

yeah but who are you

@samygafsi6429 a month ago

The part that weirds me out the most is that by replacing junior devs with AI, you cut off the source of new senior and expert devs. Their numbers would only decrease then and never grow. Within a few years, this might simply kill the dev industry without a replacement. What happens in this case?

@XDarkGreyX a month ago

Age-old problem but as if any industry or even society learns

@borylesny a month ago

Same problem as with bridge designers and aerospace engineers? Will probably lead to similar solutions

@samygafsi6429 a month ago

@borylesny Can you elaborate? I don't quite understand what you mean

@borylesny a month ago

@samygafsi6429 There are industries where junior members are a liability, and they do not disappear from lack of staff. A few years with a lack of senior programmers will cause wage inflation to the point where training seniors becomes an industry by itself.

@samygafsi6429 a month ago

@borylesny That sounds like a reasonable take. I could see programming going more the medicine route, where programmers have to do many years of studies that include a lot of practice, and then add more years for a specialty. It will be difficult to mimic the residency and internship phase in programming, but still not impossible.

@bolinhodechuvaazedo a month ago

Crazy how LLMs should 10x everyone and now even MS giving bs stats could only conclude a 3% improvement ...? 3% is so small that a developer-to-developer change trumps that, even in 6 months one can easily get 3% better at coding

@ProfShibe a month ago

And LLMs will definitely not be 3% better at coding in 6 months 😂 they’re so bad, especially with databases, it’s comical

@ПошелНахуй-ш3и a month ago

Money-laundering capabilities for ai-hype are crazy

@ColinFox a month ago

LLMs could 10x our productivity if they didn't hallucinate and always produced top quality work. Unfortunately, they're brilliant idiots, and do stupid things as often as they do good things. You still need a skilled and knowledgeable developer there to actually make the decisions. I've been developing software professionally for 40 years. I use ChatGPT in a lot of my work, but more as an assistant/rubber duck than for producing code that I just drop into place. It's most useful when I have an idea for something and I want to spitball it out. It gets less useful when trying to solve more specific issues.

@defeqel6537 a month ago

a smart new keybinding is likely to give more of an improvement

@InvisibleSquids a month ago

@defeqel6537 Setting up Prettier and ESLint to run on save has made a definitive difference for me, lol. Especially after defining solid configs for both

@UnexpectedTokens a month ago

"13 more lines til error" "oh wow that's great, compiler errors right?" "no, linter errors" o_o

@Varelinski a month ago

AI slander is what makes my morning brighter.

@grendel6o a month ago

It's not slander if it's true.

@sutirk a month ago

LLM code is more "readable" because you've probably already read that specific function implementation on stack overflow before. Literally pre-read.

@agentofenhanced2428 a month ago

That's just a stupid oversimplification that's not true. AI is trained on coding documents, so when you ask it to write something it uses logic. Yes, if you ask it to create a few classes for a Minecraft clone it won't recreate an entire 3D engine library from scratch. Why would it? It can just use LWJGL, for example. And it's funny how this guy is criticizing studies made by the company that makes the product while he is a programmer reviewing AI, which would take his job if it were to work. Something he completely missed is that AI isn't bad because it's stupid; it's bad because the hardware hasn't caught up. Claude and ChatGPT could create any full-stack app you can imagine if you gave them 100 RTX 4080 Supers. AI sucks FOR you because you don't have enough hardware to contain such large context and performance.

@RawrxDev a month ago

@agentofenhanced2428 That's wrong across the board. First off, your usage of "logic" shows you don't understand how these models work. Explain to me the part in the pipeline where logic is applied. Was it when the data is scraped? Was it when the data is transformed into vectors? Maybe when it's fit to the curve! Second, no, hardware is not the limiting factor. You're going to be very disappointed when scaling compute doesn't magically create AGI. Diminishing returns are a buzzkill for sci-fi, but they're part of our reality. We've already seen the start of diminishing returns, and I'd bet that the concept we've observed 1000000 times before won't magically be wrong this time just because people want to live in space.

@agentofenhanced2428 a month ago

@RawrxDev Dude, I am not gonna deal with a "well akchtually" comment, you know damn well what I said. Your "we've reached the limit" is something people have thought about literally every single invention.

@RawrxDev a month ago

@agentofenhanced2428 And most of the time, guess what, they're right. LLMs are a dead end. Could a different paradigm come about that leads to AGI? Sure, but LLMs are almost guaranteed to not be that path, and since that's where all the money is, I doubt we'll see big leaps for a while again. I think you're conflating scientific goals and "inventions". "Inventions" have always had their limits, which we've reached; it's the goal that we're still able to achieve by changing thought and perspective. The LLM invention is limited; the goal of intelligent AI is still possible.

@GoalOrientedLifting a month ago

@agentofenhanced2428 this is the hardest copium I've read in ages. Imagine assuming there's no flaws in AI, and only assuming it's a hardware limit😂😂😂

@nosotrosloslobosestamosreg4115 a month ago

Something owned by Micro$ift lied? What a shock!

@ToddMagnussonWasHere a month ago

I mean, Bill Gates has a pretty popular photo of himself pre-2019 next to the "How to Lie with Statistics" book by Darrell Huff. Read it out of curiosity, same shit an Advertising or Graphic/Marketing/Web Design college course will teach someone. But very telling of the man himself.

@rumplstiltztinkerstein a month ago

@ToddMagnussonWasHere He must have written it during his vacation on Jeffstein Island

@lever1209 a month ago

microcock winblows 👍

@kowaihana a month ago

It's also laughable how my friend who uses Linux, uses GitHub copilot 😂

@nosotrosloslobosestamosreg4115 a month ago

@kowaihana And browses with Edge as well XD

@JakubZabłocki-e5y a month ago

WAIT WAIT WAIT, why does this graph at 18:00 add up, but vertically? If you add up all the fails together you get 100%, same with the passes. This math and graph were also made by AI, I bet you.

@autohmae a month ago

amazing isn't it ? Like someone made a mistake with Excel to create a graph

@JakubZabłocki-e5y a month ago

@autohmae But what mistake? And what is the logical path to create that graph? The most logical way I see is "we're taking x examples with y condition and we'll see how many times it fails and how many times it works". Here we have something so weird... "In 100% of the failing cases, X% is this case and Y% is this case". The logical path to this point is so weird to me. That's why I think that AI gave a little helping hand to this article.

@autohmae a month ago

@JakubZabłocki-e5y It's almost like it should have been horizontal instead of vertical. But it's very much possible it was AI too.

@meherzadlakadia9740 a month ago

I assume the actual metric was, of those who did not pass all tests, 60% were control and 40% copiloters. It was probably done this way since it's the best illusion of gain. Bifurcation of a sliding scale into a binary scale (pass or not) was the first weirdness. If they'd not done that you'd likely plot a graph of score for each group (unit tests passed out of 10). Very shady.

@alpha.gamma.dingdong a month ago

wait wait wait so humans can make mistakes too? are you just proving that AI can help improve graphs too?

@Klaus-bm5ek a month ago

I suspect these readability improvements are entirely because of AIs adding too many comments, so you end up with // this function sorts an array function sortArray(array). I'm more curious about how you even measure "maintainability" as a metric (other than the obvious answer that you don't and it's all made up)

@pluto8404 a month ago

If AI code is so good, you shouldn't even have to read it.

@sk-sm9sh a month ago

it would be funny if they asked AI to be the judge to decide which code is more readable

@agentofenhanced2428 a month ago

@sk-sm9sh It actually does a good job at that. I gave Claude two versions of the same program in Python, one written completely explicitly and one implicitly, and it said that while it's the same for it, it absolutely preferred the explicit version because it's easier to maintain, and a lot easier to read if multiple people are working on it

@rookbr a month ago

The const foo vs big ass regex line analysis 12:40 is just brilliant! :-)

@hungrymusicwolf a month ago

"There's lies, damned lies, and statistics" - Mark Twain (I believe). Whenever someone comes with statistics, assume those are taken out of context to the point of being wrong and you'll be statistically more likely to be right than if you believed those statistics. (Irony intended.)

@0oShwavyo0 a month ago

Ironically you’ve just presented a statistic to us (that the average presentation of statistical data is manipulative and inaccurate). Clearly there’s something pretty natural about using data to inform our worldview.

@hungrymusicwolf a month ago

@0oShwavyo0 I'm well aware, hence the "irony intended" at the end.

@0oShwavyo0 a month ago

@ it would appear that my brain wasn’t braining this morning 🙃

@hungrymusicwolf a month ago

@0oShwavyo0 We all get those days, my friend. I hope the rest of your day is a better brained one (maybe even good, dare I say).

@hinzster a month ago

Only checking code for "errors" like whitespace, repeated code or stupid stuff like that instead of checking for actual bugs is wild. But it's 100% on target for an LLM, which is a) a people pleaser and b) just a model that guesses the next word. Remember that. No, it is NOT intelligence. Forget that buzzword, it is a LANGUAGE model guessing the most likely next word to achieve total bliss in the requester's mind.

@alpha.gamma.dingdong a month ago

no offense but the human brain is a model that guesses the next word. it's just slightly better than an LLM some of the time. the information that you consider "intelligence" is just memory && context.

@0oShwavyo0 a month ago

@alpha.gamma.dingdong The human brain is not an LLM. It has all kinds of functions that extend beyond prediction, like helping regulate breathing, heart rate, and other parts of the autonomic nervous system. Even in regard to its predictive capacity, the brain is capable of way more than any model that humans have conceived of currently, across many more domains at once, and much more efficiently. We grow new neurons over time, and form new synapses frequently. None of this is true for AI models. The only similarities come from the fact that the AI researchers are mildly interested in the brain for inspiration and like to name their systems after brain anatomy. Actual neurons do not exchange discrete digitized data with one another.

@Kwazzaaap a month ago

@alpha.gamma.dingdong The idea that a brain is an AI model either assumes that we understand the brain or that AI models are beyond comprehension. Neither is true. I don't get why people feel so confident about being wrong about neuroscience. Maybe there's not enough neuroscientists to call you out on it.

@zzzdenda a month ago

They evaluated correctness and readability separately. They didn't say that bugs don't matter.

@Gwarzonicus 28 days ago

@alpha.gamma.dingdong bruh, no one understands the brain.

@Oler-yx7xj a month ago

I was just thinking how much will I have to lie in statistics (and how much of an artpiece do I want to make that lie) to make my uni homework project look actually decent, and here you are telling me exactly what to do! Thanks Prime, for your advice

@not_user11 a month ago

the wifeagen walking near the coding shed and then hearing: DORA! DOORAAAA!

@jorchard a month ago

0:34 so stoked that he accidentally highlighted the sentence correctly!!!

@arkinul2993 a month ago

Why does he do it the way he does, anyway? It annoys me an unreasonable amount

@mattilindstrom a month ago

The term for seeing without perceiving is visual neglect. It's a debilitating condition mainly associated with traumatic brain injury. The condition can be life threatening, walking into traffic or dying of hunger with a full plate of food on the table.

@sk-sm9sh a month ago

It is possible to test readability very objectively, but hardly anyone ever tries to set it up. The experiment can look something like this: you give two different pieces of code that implement the same functionality to two groups (large enough that the results would have statistical significance) and you ask them to answer the same set of questions about what the code does. If one group answers more correctly and/or faster, that is objective evidence that one implementation has better readability than the other.

@defeqel6537 a month ago

and to test maintainability, the participants would need to fix some bugs (how to make those equivalent in different types of code?) and implement new, large, features on top of the existing code base

@adambickford8720 a month ago

So simple, nobody does it.

@kzmatr a month ago

@adambickford8720 He didn't say it is simple, just that it is *possible* (although expensive). You have to make a study just for that part of a larger study, then multiply that by each coding example you want to test

@0oShwavyo0 a month ago

Describing code in natural language and then trying to draw objective conclusions from that data is flawed. How do you accurately deduce how well someone understands a topic from their description, without introducing their writing ability as a variable? Better to ask what outputs or side effects a particular function has for a given set of inputs/initial state, some kind of question that has a measurable answer.
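The experiment proposed in this thread (two groups, the same comprehension questions with measurable answers) could be analyzed in many ways. Below is a minimal Python sketch of one option, a permutation test on the difference in mean scores. The scores are hypothetical and made up purely for illustration, and nothing here is claimed to be what GitHub actually did. The function name and parameters are also assumptions.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Repeatedly shuffles the pooled scores and counts how often a random
    split produces a difference at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# HYPOTHETICAL data: questions answered correctly (out of 10) per reader.
scores_version_a = [7, 8, 6, 9, 7, 8, 7, 9]
scores_version_b = [5, 6, 7, 5, 6, 4, 6, 5]

p_value = permutation_test(scores_version_a, scores_version_b)
print(f"p = {p_value:.4f}")
```

A permutation test is used here only because it needs no distributional assumptions and fits in a few lines. A real study would also need to report group sizes, the questions asked, and the size of the difference, not just the p-value.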

@sayandeepdey5471 a month ago

LLM hating is my hobby

@Saru-Dono a month ago

The "code quality" errors are even more nonsense if you take into account, that people who have Copilot installed are more likely to also have LSPs or formatters installed, which would warn you about some of this nonsense

@nuvotion-live a month ago

That research was also 3% effective in convincing me. Statistically significant!

@k98killer a month ago

One of my favorite features of Cursor is the ability to refactor something then hit tab as it applies the refactored pattern to other relevant bits of code. Letting it generate new GUI frames based on existing ones is also a good time saver -- starting from scratch with code gen is a painful experience of trial and error, but once you get it right, the AI is pretty good at replicating the correct patterns.

@blazernitrox6329 a month ago

Re: "People who present you with statistics are about to lie to you." I mean, I definitely get the sentiment. I would caveat that "People who present you with statistics _but no context_ are about to lie to you," but that happens so bleeding often that it seems to be the majority case (especially in our social-media-driven 280-characters-or-less social environment). If someone presents you with statistics, you should _immediately_ ask them how they got those statistics, and keep moving down the chain until you've gotten to the ins-and-outs of how whatever study was done (assuming there even _was_ a study).

Re: Using AI to do the tedious stuff that you've already done before. As many people in the comments have pointed out, using AI doesn't make you a master at [insert field here]. You still need the expertise to verify the output. People are treating AI like it's some magic tool, but the truth is it's subject to all of the exact same caveats every other piece of software (or, really, tech in general) is. It's a tool. You still need to actually know what it is you're making and how it should be made; it just gives you an easier and faster way to do it - assuming you know how to use the tool as well. To really drive this point home, there's an interview (I believe with one of the big wigs at IBM, circa the early 1990s?) where the interviewee mentions that the thing with computers is that they should take over the _tedium_ so that people have more time to spend focusing on more important things. Likewise, it's mentioned (unsure if it's the same interview or a different one) that _computers can't be held responsible, and therefore shouldn't be allowed to make decisions._

@rmidifferent8906 a month ago

18:47 The graph not adding up is so funny. If you swap the lower numbers with each other you get 100%. Either they or the LLM mixed up the values. I wonder which is worse. At least it's a statistically significant graph or something

@autohmae a month ago

It's not funny, it seems to be trying to create a misleading visual.

@CritLoren a month ago

@autohmae eh, it's still a 60-40 split either way, probably why it was so easy to make the mistake, but true, when 3% is "significant" that stands out

@AmirparsaDD a month ago

Lies or truths, data and statistics is cool no matter wat (only if its formatted in cool graphs and charts)

@UNgineering a month ago

That study is the political equivalent of "we did a survey of a representative sample and discovered that 93% agree". What was the sample? What did they agree with? What does "representative" mean? and WHERE'S THE RAW DATA??

@567saturn a month ago

3% percent encrease sound like a research error. How 3% encrease can even be considered an encrease. Big chance if you do same study again you get 2% decrease

@ValentineOficial a month ago

Increase* but yes

@tylerthomas6932 a month ago

I built a service that you curl with some params to send localization JSON files with a target language, which then runs them through a lightweight LLM and stores the result per page in a MongoDB collection. It was just a demo, but the idea is, when you add text to your front end, you run this script and it translates to your target languages, and when a user lands on your page, we can query the db, get the JSON and just smash it on the page and it should just work. Now to me that's a pretty cool use of AI.

@Rohinthas a month ago

Huh, this was a really good article, it didn't just confirm my biases, it actually strengthened them!

@rumplstiltztinkerstein a month ago

This video is 13% more giga-chadable than the average

@sammartin7515 a month ago

18:02 The chart percentages do add up to 100. It's a really bad chart, but it's saying that of the people whose unit tests didn't pass, 37.8% were using Copilot and 62.2% weren't. The same is true for the right side of the chart.
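The reading described above can be illustrated with a small Python sketch. The counts below are hypothetical: they were picked only so that the "failed" column reproduces the 37.8% / 62.2% split quoted in the comment, and they are not the study's data. The helper names are made up as well. The point is that the same table gives different percentages depending on whether you normalize within an outcome (what the chart appears to do) or within a group (what most readers expect).

```python
# HYPOTHETICAL counts, chosen only so the "failed" column matches 37.8% / 62.2%.
counts = {
    "copilot": {"failed": 17, "passed": 31},
    "control": {"failed": 28, "passed": 20},
}


def share_within_outcome(table, outcome):
    """Of everyone with this outcome, what fraction came from each group?"""
    total = sum(group[outcome] for group in table.values())
    return {name: group[outcome] / total for name, group in table.items()}


def pass_rate_within_group(table):
    """Of everyone in a group, what fraction passed all tests?"""
    return {
        name: group["passed"] / (group["passed"] + group["failed"])
        for name, group in table.items()
    }


failed_shares = share_within_outcome(counts, "failed")
pass_rates = pass_rate_within_group(counts)

for name in counts:
    print(f"{name}: {failed_shares[name]:.1%} of all failures, "
          f"{pass_rates[name]:.1%} pass rate within the group")
```

With these made-up counts each outcome column sums to 100%, which matches the vertical reading of the chart, while the per-group pass rates are a different pair of numbers that do not need to sum to 100%.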

@krisnahari a month ago

Great timing. I was just about to do my presentation. With some statistics!

@victormasmontserrat5374 a month ago

a reference to the wheel of time with code ? phew .... you have earned my unconditional love

@derendohoda3891 a month ago

It is easy to lie with statistics but it is even easier to lie without them.

@haniffaris8917 a month ago

Finally, this format returns!

@bluedragon4008 a month ago

Just wow!! That "Grey Men" allegory is so precise. And correct.

@Imperial_Squid a month ago

3:37 speaking as a data scientist, I _really_ dislike this "all stats are lies" stance. While I agree it's possible to warp and bend statistics into saying what you want, they're still useful the vast majority of the time. If you don't have numbers to go off, what are you going to trust instead? Pure vibes? Analogies from yourself and your mates? Statistics, like a lot of things, are just tools. How they get used depends on the person using them. Saying "everyone who tells you stats is lying to you" is no more accurate than "everyone with a hammer is going to cave your skull in". If the person telling you the stats is trustworthy, the stats probably are too. So, as always, the real enemy here is companies trying to shill you products, that's all.

@ali32bit42 a month ago

I think he is saying the random statistics in marketing or internet posts are probably made up and misleading. Actual science and research usually have real statistics because they have an incentive to use correct data instead of lying.

@dvhh a month ago

@Imperial_Squid Unfortunately, most of the forewarnings about statistics are usually not shared, or are ignored, in order to "reduce the noise". And probably what I mean by that is that the noisiest statisticians are giving statistics a bad reputation.

@CodecrafterArtemis Ай бұрын

I've just reached 3:20 and I feel like if you have to write out percentages with two digits after the comma, you're trying to inflate the numbers in some way.

@jaroslavhuss7813 Ай бұрын

I want more Family Feud... F@k I was laughing as hell, pure intellectual comedy.

@carlosdamasio349 Ай бұрын

Statistics can be misleading or misused for sure but they’re easier to analyse or discuss than feelings or beliefs…

@pmiddlet72 Ай бұрын

How does one measure things such as 'inconsistent', 'unclear', 'excessive', 'insufficient', or 'variable' without any formal definition of these priors? And of course, given the subjective 'measures' sans definition, it appears there were some sort of null hypothesis significance tests that likely amount to t-tests of means/proportions. The thing is that 'significance' often gets weaponized. As if a p-value is AT ALL a measure of true/false, yes/no, right/wrong. There are important statements that have been made on p-values (and p-hacking, along with a number of related unsavory practices) by the ASA that explain by committee how these should / should not be used. There have been addenda to those statements as well to further clarify what has eternally been an issue in (often the misuse of) statistical analysis. What's also important is a concept called 'effect size' - basically a measure of how large the difference between groups actually is, which in turn determines how many observations would be required to detect that difference in the first place given some chosen model to fit the data (whether it's t-distribution - yeah, that's a model - or any other). On numbers not adding to 100%: this is curious, and for ease of understanding it should've been well vetted. However, it can happen due to things like sampling error, rounding error, missing data, and what I think would be in this case a possible BIG miss, limitations in how the data were categorized in this experiment (some data may have overlapping information between 2 categories of things - this is what happens when one doesn't understand how to deal with hierarchical information). Not knowing the exact data and experimental methods they used (these should've been published), it's otherwise hard to know how those results were derived. But it 'does' give one the feeling of at least a hint of cherry-picking given what measures they reported on, doesn't it? These 'measures' feel more like proxies to me.
Maybe the results didn't support some desired marketing endpoints and they decided to go with the easily subjective. It's one thing to get code correct (i.e. the code accurately reflects the input of the user, performs the function that it was intended to do, and ideally has no errors), which is a far cry from 'did the code meet PEP-8 or another standard for a language'. Would that mean the primary endpoint for Copilot was how well it does *linting*? Gah.... I won't dismiss the need for good code linting as formatting does produce a certain class of errors syntactically. But if that were the goal of this experiment, they should've stated that up front. I guess the bottom line is simple - this wasn't created by any "data scientist" worth their weight in boogers vs a seasoned statistician (many folks of the data science ilk aren't, but then stats is only one part of that entire umbrella). Not that statisticians don't screw the pooch! Someone could've been paid to cherry pick. Someone could've just been way off base with what methods should've been used for this sort of analysis (which could be anyone). Maybe it was an LLM that produced the output (always trust the digital oracle right? There is no spoon....). Ok maybe there's not a *simple* bottom line. But, it's pretty clear to me there are some egregious errors, and many, if not most of them feel 'marketing contrived'.

@ONoesBird Ай бұрын

Hey Prime. Love your content. Thank you for pushing with educational content and not just playing games. It is harder and that sets you apart and hey, you can do both! Play some games now and then and then drop the knowledge once they're hooked ;)

@adissentingopinion848 Ай бұрын

3:14 They're called lies, damned lies, and statistics for a reason

@gnerkus Ай бұрын

@SimeonRadivoev Ай бұрын

The fun part is that no matter how much you mess around with the numbers, reality kicks in at some point. I guess in the process will make some people lots of money in the short term.

@ZT1ST Ай бұрын

@21:57; "Rendering dictionaries useless? I don't even know what that means!" *Two seconds later* "Okay, I wasn't expecting *that,* but I now know what they meant."

@LyraelRayne Ай бұрын

Just as you went on your code review rant it ran an ad with the lyrics “I just got fixated on somebody” and I wasn’t sure whether this was actually part of the video haha.

@scheimong Ай бұрын

18:25 that was a very unexpected SNL reference. I literally muttered "woman" as you were reading that 😅

@jazzochannel Ай бұрын

1:58 "improved readability" all code should be write-only. I thought this discussion was settled.

@diogofirmino7353 Ай бұрын

If I close my eyes, it's like Steve Carell is reacting to tech articles

@ThecatThecat-hq1op Ай бұрын

30:40, anyone who is telling you that is lying to you. That's not how statistics work. The statistics derived from a sample that was taken in a specific location can only be applied in that location. Assuming the sample is actually random (and it's probably not in this case). This is statistics 101.

@lestinkstiefel Ай бұрын

Hey! Love listening to you reacting to articles! Have u ever considered releasing these as an audio format? I don’t own yt premium so I can’t play this with my phone off

@kalebbruwer Ай бұрын

I love how they have qualitative measurements down to two significant digits

@LordHonkInc Ай бұрын

"Don't trust a statistic you haven't faked yourself" was the first thing my prof said to us. Clearly facetious but with a kernel of truth, just like every great quote

@Karol-g9d Ай бұрын

in my 4 example, if ai find the issue, ai would add 4 white space to respect character spacing like google, but its not copilot the issue, its the way ms implement or vetted standard of keyboard cursor location. When i mentioned the issue to copilot? It could not detect the keyboard cursor location for some character was wrong. I mean that is the first thing that should be checked

@joopie46614 21 күн бұрын

every time i go on primeagen i see 41 minute long video, so I wait to see if the title pops up in the first minute otherwise I go

@vukkumsp Ай бұрын

This is like my previous company, where every time we used their so-called AI tool to query something we needed to fill in a record in an Excel sheet explaining how many hours were saved with this query and how many would be expected if we didn't use any query, along with the Jira numbers the query is related to, our office mail id, employee id number, the query itself, and an explanation of the purpose of the query as well. The company gave awards and prizes to people who asked more queries (and updated the Excel sheet) and saved more hours (the numbers written in that Excel sheet as intended savings).😅

@vukkumsp Ай бұрын

Even bigger problem is they are more concerned about employees using their AI tool at least once per day and writing that query's details in Excel... Else they call you and disturb your actual work. They proudly showcase those inflated numbers as if employees are saving time with their AI tool

@ShrikeGFX Ай бұрын

The thing is AI does allow people to code things way out of their league. So in these leagues, the code quality is definitely reduced, but you can fish in much deeper waters now.

@maccsguitar Ай бұрын

Copiloters write the description for copilot to generate the functionality. Oh look wow, the code doesn't have 'missing documentation'

@JdotCarver Ай бұрын

3:43 If this were true, ANY AND ALL statistics would be lies. Making the entire act of research unfounded. This is wrong on so many levels. I understand why he posted it to Twitter. ;)

@timokreuzer1820 Ай бұрын

"if you can’t write good code without an AI, then you shouldn’t use one in the first place." They shouldn't be writing any code. But they still do it. The tons of shitty code out there are proof of it. So maybe, if they started using AI, their code would be a bit less shitty. That would be a win for everyone. And they might even learn to write better code.

@temistoclesvaldes5312 Ай бұрын

Where are the confidence intervals for these stats? What about standard error? Hold on - you think 202 coders is an adequate sample size and representative of the population of coders? I'm at 2:11 and stopped and went and got another drink.
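
For a rough sense of scale on the sample-size question above, here is a minimal sketch of the normal-approximation margin of error for a proportion. It assumes the 202 participants can be treated as one simple random sample (an assumption, and a generous one; any per-group figure would have a wider margin), and the helper name `margin_of_error` is made up for illustration.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation 95% interval for a proportion."""
    return z * math.sqrt(p * (1.0 - p) / n)

# p = 0.5 is the worst case; n = 202 is the sample size mentioned in the comment.
print(f"+/- {100 * margin_of_error(0.5, 202):.1f} percentage points")
```

Under those assumptions the uncertainty is on the order of several percentage points, which is worth keeping in mind when a headline difference is itself only a few points.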

@IvanRandomDude Ай бұрын

"There are lies, damned lies and statistics."

@boppinggamer8571 Ай бұрын

18:56 It adds up to 100 if you only add the same color lines together. Purple added up with purple (37.8+62.2) and green with green.

@googelygoo1 Ай бұрын

I've been using vim movements for 10 years now and doing programming interviews with a simple embedded text editor in a web browser is one of the most painful things I've had to do in a long time

@rsdyeahh Ай бұрын

At 18:20 the data is stacked vertically, so each column adds up to 100%

@tsyf1 Ай бұрын

That Norm Macdonald joke left me both surprised and laughing 😂

@3v1lp1ngv1n Ай бұрын

At a previous job we have made a ctrie implementation that was 10-20% faster than what we've been using before and we have abandoned it because it wasn't a significant enough improvement. And they are beating their chests about a 3-5% improvement.

@TheFrankvHoof Ай бұрын

Automatically analyzing what a regex is doing in any open Pull Request actually sounds like a great way to use AI. If an AI tells me what it thinks that regex does, I can compare that to what I think it does (or is supposed to be doing). Some company like GitHub should get on that.

@ettoreferruccio6124 Ай бұрын

Thanks Prime for teaching me the `:Sex!` command.

@HolyMacaroni-i8e Ай бұрын

Thanks for teaching it to me also. Was too lazy to actually watch the vid just looked at the comments

@AshCozey Ай бұрын

sitting here wondering why this sounds so familiar, didn't realize i saw this live until like 10 minutes in 🙃

@XDarkGreyX Ай бұрын

At its worst when the recording is months old and Flip dusts it off for uploading

@ranzrumJojoJDDom373 Ай бұрын

@4:20, readability issues in a blind review... sounds a bit suspect... or just bad readability.

@bastiana3611 Ай бұрын

11:26 true, it's mathematically proven that every regular expression can be represented by a FSM.

@Mankepanke Ай бұрын

Isn't that as useful as saying "all programs can be expressed using an AST"?

@SpektralJo Ай бұрын

That is true iff we mean the mathematical definition of regular expressions. Modern regexes have gained features like backreferences that allow you to match some non-regular languages as well, which cannot be represented by an FSM

@diadetediotedio6918 9 күн бұрын

@@Mankepanke No? Saying "all programs can be expressed using an AST" just says about the syntactical properties of programs, saying "all regexes can be represented by a FSM" says about their operational semantics. It shows their limitations, and by relation what you can and cannot expect from them.
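
To make the regex/FSM point in this thread concrete, here is a small sketch. The state names and the hand-built transition table are illustrative assumptions; the pattern `(ab)*` is just a convenient regular language, and `(a+)b\1` is an example of a backreference pattern that describes a non-regular language (some a's, a b, then the same number of a's).

```python
import re

# Hand-written DFA for the regular expression (ab)*.
# States and transitions are an illustrative construction, not generated by `re`.
TRANSITIONS = {("start", "a"): "seen_a", ("seen_a", "b"): "start"}
ACCEPTING = {"start"}

def dfa_accepts(text: str) -> bool:
    """Run the DFA over text; a missing transition means rejection."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING

# The DFA and the regex engine agree on these inputs.
for sample in ["", "ab", "abab", "aba", "ba"]:
    assert dfa_accepts(sample) == bool(re.fullmatch(r"(ab)*", sample))

# A backreference: the text after 'b' must repeat exactly what (a+) captured.
# No finite-state machine can recognise this language for unbounded input.
BACKREF = re.compile(r"(a+)b\1")
print(bool(BACKREF.fullmatch("aabaa")), bool(BACKREF.fullmatch("aaba")))
```

The two-state table is the whole "program" for `(ab)*`, which is the limitation the last reply is pointing at: a finite set of states cannot count an unbounded number of a's.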

@alexlowe2054 Ай бұрын

13:00 Prime describing Bike Shedding, with almost the perfect example for exactly when Bike Shedding is common, and also bad.

@sa-hq8jk Ай бұрын

i’ve never asked chat gpt to generate code for me but i have asked it to point me in the right direction for a few specific problems. every. single. time. i’ve done that it’ll also give me a code sample of how it would be implemented and it’s always the ugliest most unconventional code i’ve ever seen (maybe cause it’s rust btw idk tho)

@ali32bit42 Ай бұрын

when you DO want code samples... it hallucinates functions that don't exist or belong to some library you don't have. "hey, how do i create a Jacobian matrix?" it gave me "func _calcualate_jacubian_matrix(variable) " and no actual instructions on how it works and why...

@sa-hq8jk Ай бұрын

@ i remember when it came out i asked it how to do bit packing in rust and it said to use u2 and u4 and it was at that moment that i realized the ai hype train was bs

@ali32bit42 Ай бұрын

@@sa-hq8jk the prompt you would be writing to get the result you want can be longer and more complicated than the code sometimes.

@charlesd4572 Ай бұрын

Interesting: the guy who wrote VisiCalc retrained in business at the end of the 70s, as he thought that with the wide adoption of compilers and interpreters there would be less need for programmers (experts in assembly and lower-level compiled languages). That's when he got the idea for a spreadsheet. So this is not the first time we've been through this sort of thing. I think we'll end up with very high-level languages at some stage that constrain the AI so it can be more precise and predictable.

@dragonridertechnologies Ай бұрын

Ignored actual errors in place of style "issues"? If that's not today's corporate world in a nutshell I don't know what is!

@Tony-dp1rl Ай бұрын

I would be really interested to see the comparison of AI vs Human in terms of achieving functional goals with the fewest statements.

@guilhermeraposo6080 Ай бұрын

"People feel confused without everything being presented in front of them." I feel attacked

@stormoffists Ай бұрын

Correcting "pedanterism" was good stuff

@ET.AIMusic Ай бұрын

You made me laugh with this video man, happy TG! Strongly agree about building a strong team that can work with AI while getting compensated properly. AI is not here to replace those who perform and should be appreciated, for now at least. This is of course my own personal opinion as both a developer and a company owner.

@k98killer Ай бұрын

Fun fact: if you want a significant result, you can just do the same study 10-20 times and get a statistically significant result by pure random chance.

@defeqel6537 Ай бұрын

Indeed. Or do a bunch of studies with different parameters / methodology and only publish the ones that are favorable to your product. Pharma does this all the time. There are also ways to pre-cook the data, perform the study afterward, and not mention it in the study.
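
The arithmetic behind the "repeat the study" point in this thread can be sketched as follows. It assumes independent studies of a true null effect and the conventional 0.05 threshold; the function name is made up for illustration.

```python
def prob_at_least_one_false_positive(n_studies: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent null studies reaches p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_studies

for n in (1, 10, 20):
    print(n, round(prob_at_least_one_false_positive(n), 3))
```

Under these assumptions, ten attempts give roughly a 40% chance of at least one "significant" result and twenty give roughly 64%, so it is likely rather than guaranteed.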

@jazzochannel Ай бұрын

3:39 the popular quote is: There are 3 types of lies. Lies, damned lies and statistics.

@gmcoelho Ай бұрын

We regret to inform you that your regex license has been revoked.

@guyincognito1406 Ай бұрын

I won’t accept that unless it comes packed in a jumble of symbols.

@marcr8181 Ай бұрын

I'd be interested into checking those 10 Unit Tests and see if you can easily write code that satisfies the test but the functionality itself is actually broken.

@9darkspells 14 күн бұрын

19:00 I think I realize what they have done here. The true data is: of the people who passed all unit tests, 39.2% weren't using Copilot and 60.8% were; meanwhile, of the ones who did not pass all unit tests, 62.2% weren't using Copilot and 37.8% were. So they do add up to 100% each, just not in the way they are presented.
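
A minimal sketch of the two ways to normalize such a chart. The counts below are hypothetical, picked only so that the within-outcome percentages match the ones quoted in the comment above; they are not taken from the study.

```python
# Hypothetical counts (NOT from the study), chosen to reproduce the quoted percentages.
counts = {
    "passed": {"copilot": 73, "no_copilot": 47},
    "failed": {"copilot": 31, "no_copilot": 51},
}

def share_within_outcome(table):
    """Percentages that sum to 100 inside each outcome (what the chart shows)."""
    return {
        outcome: {g: round(100 * n / sum(groups.values()), 1) for g, n in groups.items()}
        for outcome, groups in table.items()
    }

def pass_rate_by_group(table):
    """Pass rate inside each group (what a reader usually wants to compare)."""
    rates = {}
    for group in table["passed"]:
        total = table["passed"][group] + table["failed"][group]
        rates[group] = round(100 * table["passed"][group] / total, 1)
    return rates

print(share_within_outcome(counts))
print(pass_rate_by_group(counts))
```

The first function conditions on the outcome, the second on the group. Both are legitimate, but they answer different questions, which is why the chart is easy to misread.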

@monq02 Ай бұрын

the graph part at 18:59 adds up to 100% if you add them by color instead of by line

@PJutch Ай бұрын

37:00 is it bcs it became easier to change something or bcs more code sucks now?

@PJutch Ай бұрын

Ok, if it's recent changes, then yeah...

@CheapoPremio Ай бұрын

I do love Copilot and JetBrains AI Assistant, but only for boilerplate code completion. It's nice that it correctly auto-completes all the fields of a table that I designed 5 minutes ago. But other than that and the odd regex I create with it, I don't rely on AI for my code in any way.

@chrsbll Ай бұрын

"Lies, damned lies, and statistics" - Mark Twain

@Oler-yx7xj Ай бұрын

I wonder whether the API spec and unit tests were generated by ChatGPT

@peterprokop 28 күн бұрын

These percentages in the low single digit percentage with two digits after the decimal point are a joke. They shout "we have no idea what we are doing".