Crowdstruck (Windows Outage) - Computerphile

Views: 288,770

Computerphile

Days ago

Comments: 1,000
@luicecifer 4 months ago
"Well, well, well. Tell me, young gentlemen, why is it always you two when something bad happens??"
@throwaway6478 4 months ago
Because we rule the world, and a one in a billion chance is next Tuesday for us.
@SubTroppo 4 months ago
I am reminded of Cheech & Chong, but high on technology. I mean, man, what can you do?
@reallyWyrd 4 months ago
"It's a gift." -- the 4th Doctor
@Nicolas-L-F 4 months ago
Well put
@nahco3994 4 months ago
That's a bit unfair, isn't it? CrowdStrike managed to crash tons of Linux systems with the exact same software this April. Same software (Falcon), same problem (kernel panic). Only nobody made a big deal about it back then. Dr. Bagley even mentions it briefly in the video.
@leighhaynes 4 months ago
McAfee did something similar several years ago. A bad definition quarantined core system files. The McAfee CTO from that era is now CEO at Crowdstrike.
@somethinglikethat2176 4 months ago
To borrow a comment from elsewhere "real men test in production on a Friday"
@acrazydurian 4 months ago
A fine example of "failing up"
@alvintollah 4 months ago
1 time is a mistake to be learned from. 2 times are a pattern of behaviour, signalling deeper flaws.
@james_chatman 4 months ago
I got dragged into this and I'm now at 48 hours of overtime. Thanks CrowdStrike.
@jklax 4 months ago
@Nigelfarij I was about to say
@FrietjeOorlog 4 months ago
@Nigelfarij Tell that to the taxman.
@sunefred 4 months ago
That's crazy. What's your patch rate per hour? How many machines?
@Artifactorfiction 4 months ago
Ujjjj😢😢😢😢😢😢😢😢😢😢😢
@rationalbushcraft 4 months ago
Did you guys get the USB Microsoft created to automatically fix it? What is cool is that the WinPE USB drive just boots into safe mode and runs the repair.cmd file it creates. I am keeping this, as it will be easy to change that batch file and have it do other things in the future if I want to.
@TheAnonymmynona 4 months ago
So there were 3 separate failures from CrowdStrike. 1. The kernel driver didn't have proper input validation. 2. The channel file was broken. 3. The testing was so abysmal that they didn't notice before sending the update out to customers.
@torbjornlindh5108 4 months ago
It’s quite scary that they get their kernel driver signed, despite it not meeting the standard of validating all input! That’s a systemic problem with their entire solution! (Well, so is the third, but testing is not how you build quality into the system, so I think the first is the fatal flaw.)
@jbird4478 4 months ago
4. They didn't even notice that every client that updated went down, or at least they didn't respond. How that is even possible is beyond me. Their entire product is based on monitoring systems, but it took them hours to respond, and that was after Google had called them out for the chaos everywhere.
@SkandiaAUS 4 months ago
I think #3 is the worst and why their share price is tanking. Such an utter lack of responsibility to Yolo this into prod.
@ReverendTed 4 months ago
It does call into question the WHQL testing that allowed the driver to be signed, which does push some degree of responsibility back to Microsoft.
@jimfoye1055 4 months ago
@@ReverendTed Bingo.
@wcmatthysen 4 months ago
The problem is rolling out an update (that might not have been tested so well) TO EVERYONE ON THE PLANET AT THE SAME TIME. I can't believe CrowdStrike is operating like this. If you did a phased roll-out to a couple of smaller customers initially, and then monitored whether the update had any glaring issues, this whole situation could have been averted.
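The phased roll-out described in this comment can be sketched in a few lines. This is only an illustration under stated assumptions: the ring names, the `deploy` callback, and the 1% failure threshold are hypothetical, and nothing here reflects how CrowdStrike actually ships updates.

```python
from typing import Callable, Sequence


def staged_rollout(
    rings: Sequence[Sequence[str]],
    deploy: Callable[[str], bool],
    max_failure_rate: float = 0.01,
) -> list[str]:
    """Deploy ring by ring and stop once a ring's failure rate is too high.

    `deploy(host)` is a hypothetical callback that returns True when the host
    comes back healthy after the update. Returns the hosts updated successfully.
    """
    updated: list[str] = []
    for ring in rings:
        failures = 0
        for host in ring:
            if deploy(host):
                updated.append(host)
            else:
                failures += 1
        if ring and failures / len(ring) > max_failure_rate:
            # Halt here: later (larger) rings never receive the bad update.
            break
    return updated


# Hypothetical fleet: a small canary ring first, then everyone else.
rings = [["canary-1", "canary-2"], ["prod-1", "prod-2", "prod-3"]]

touched: list[str] = []


def bad_update(host: str) -> bool:
    """Simulates an update that crashes every host it reaches."""
    touched.append(host)
    return False


staged_rollout(rings, bad_update)
print(touched)  # only the canary hosts were ever touched
```

The point of the design is that the blast radius of a bad update is bounded by the size of the first ring rather than by the size of the whole fleet.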
@ChrisM541 4 months ago
That's the nuts & bolts of it. Zero QC/QA before release. In an unregulated industry, this is damningly the norm.
@lever2k 4 months ago
I can't believe huge customers don't have a tiered approach to allowing patches to be deployed.
@Jai-xj7vy 4 months ago
@@lever2k What company do you work at that tiers endpoint protection updates? Never heard of such a thing. CrowdStrike may not even offer that capability.
@rolfs2165 4 months ago
@@lever2k That's assuming the software even allows tiered deployment and doesn't expect _everything_ (including the main server) to be working on the same version - and any machine that isn't updated yet can only connect to update.
@TjPhysicist 4 months ago
@@lever2k Based on what I've been hearing from others online: a lot of companies **do** have a tiered approach for updates, including for CrowdStrike, but this update, deemed by CrowdStrike to be very critical, ignored ALL such settings and was deployed unilaterally to everything.
@IstasPumaNevada 4 months ago
"As I said online, you should just go outside and enjoy the sunshine." Okay, but what are people in the U.K. supposed to do?
@QuantumHistorian 4 months ago
Shots fired. But not seen in the UK, because of the dense cloud cover.
@blucat4 4 months ago
😄
@oourdumb 4 months ago
The real worry is the lack of QA at Enterprise companies. A state actor infiltrating one of these orgs would be absolutely devastating.
@SuperWolfkin 4 months ago
The real issue and worry is a monoculture. This sort of problem will always happen: someone is always going to be affected, and there will always be a cohort of people unfairly affected by things outside their control. The problem is that the cohort here happens to be extremely big because there's a monoculture of this type of software. Monopolies lead to monocultures, and monocultures lead to unique weaknesses. This unique weakness was able to take out millions of computers all around the world because everyone was using this software. We need more companies in this space. Even now, after this has happened, everyone basically has to look to CrowdStrike because that's who everyone uses. It sounds like there's no competitive alternative.
@vincei4252 4 months ago
It has been and still is devastating. Didn't need the boogeyman to show this.
@BongoBaggins 4 months ago
If you can think of it, someone has already done it.
@NoahSpurrier 4 months ago
There are probably already some bad actors out there. Just look at the catastrophic instances of espionage inside the CIA and the FBI. See Aldrich Ames and Robert Hanssen.
@sandwich2473 4 months ago
Agile!!!!!!!!! I love Agile development practices!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@solimm4sks510 4 months ago
Heh, the BSOD at 0:40 is cool: "For more information about this issue and possible fixes, do not ask us"
@DailyFrankPeter 4 months ago
But it's about as helpful as a genuine one!
@T_GingerDude5416 4 months ago
also LEET% complete
@paulmichaelfreedman8334 4 months ago
@@T_GingerDude5416 All hail 1337!
@telebubba5527 4 months ago
Haven't come across that for years. Had totally forgotten what it looks like.
@crazymonkeyVII 4 months ago
Could've been a genuine message from M$ then!
@era_s 4 months ago
"If you put everything on the cloud, and then the cloud's not there, you've got nothing."
@kevinmcfarlane2752 4 months ago
The clouds have multiple redundancies though, depending on how much the customer is willing to pay.
@tadeob_ 3 months ago
What if the cloud and its redundancies were affected? 😮
@BruceAngus 4 months ago
I was stuck in Atlanta's airport because of this. It was absolute madness, and everyone who talked about it, either from the airline or passengers, said it was a Microsoft issue. That's all most people are going to remember.
@0LoneTech 4 months ago
That's not entirely wrong. Microsoft did bless this software with the privileges to do whatever it likes to the entire system. They're in turn blaming this on the EU, but the EU only mandated that they give security software access at the same level as their own; it's Microsoft's choice to make that this risky. Then there's the trust placed in CrowdStrike; they were likely selected for being a known name, never mind that they ran a previous company into the ground in this particular manner. It's like the hotel manager decided to install an entry counter in the front door and nobody asked why it's also a guillotine.
@adityavardhanjain 4 months ago
I was waiting for this video with extreme excitement for the last 2 days. I jumped on YouTube as soon as I saw the notification.
@nosuchthing8 4 months ago
We all were
@LunarcomplexMain 4 months ago
I swear this is only the beginning for tech companies that are losing valued senior staff over the many, many decades...
@DoubleOhSilver 4 months ago
Honestly I see why. This career is mostly miserable and the pay seems to be going down.
@kaseyboles30 4 months ago
Senior staff that, in this case, probably cautioned against allowing code to run in kernel space before it's tested on a test system, because that's a fast track to exactly what happened. Senior staff likely tired of their expertise being ignored by suits who cannot comprehend that anything outside their niche might matter.
@vincei4252 4 months ago
Losing? They think they can do things cheaper elsewhere and AI can replace everyone. I wish them luck in the wars to come. Yes, this was a fun career, and all I've seen is degradation of quality of life on a massive scale, where everything is micromanaged by 100% non-technical types. I don't miss it at all.
@vincei4252 4 months ago
@@DoubleOhSilver YouTube censored my comment. Wanted to say that I totally concur with the sentiment. Not only is it miserable, the hiring process that is adopted across the board seems to consist of nonsensical hazing rituals that do not map to real-world problems or realistic development tasks and activities. The golden age is well and truly over.
@Abdega 4 months ago
Especially the ones who are losing senior staff who know the ins and outs of the product, and replacing them with “Business guy who does business things and doesn’t need to know how the technology works”
@vincei4252 4 months ago
In the modern version of Battlestar Galactica, Admiral Adama absolutely refused to have Galactica networked to other systems and ships in the fleet because of the risks to its critical systems. Yet here we are, allowing a rootkit to operate unconstrained on millions of machines. Fun times ahead.
@MrJegerjeg 4 months ago
Wow, I thought exactly the same! 😃
@evannibbe9375 4 months ago
A lot of the computers that businesses give out to employees (such as ATM screens and point-of-sale devices) are so cheap that they become completely useless without a network connection (like a Chromebook). So the system was working “correctly enough”: it detected a problem in those (theoretically) cheap end computers and cut them off from the network. The failure was that the wrong thing was found to be a threat, and all those end computers were cut off.
@rolfs2165 4 months ago
@@evannibbe9375 "Oops, it's all malware."
@thefrub 4 months ago
@@evannibbe9375 I'm amazed, literally everything you just said in that comment is wrong. It's like I just watched Calvin's dad explain computers
@ivonakis 4 months ago
And kernel level anticheat is a thing ...
@bilalsadiq1450 4 months ago
If Dr Bagley and Dr Pound had a podcast, I'd definitely listen to them talk for hours lol.
@paulmichaelfreedman8334 4 months ago
"The IT podcast with Bagley and Pound" Does that sound interesting to you?
@learningCodingWithMe 4 months ago
@@paulmichaelfreedman8334 Oh yeah, it does
@Turbo3032 4 months ago
A Computerphile podcast as a sister podcast to the Numberphile Podcast would be amazing!
@whathappenedman 4 months ago
Fr. I like listening to them speak
@scottydawg1234567 4 months ago
@@paulmichaelfreedman8334 Yes, actually.
@piranniayt 4 months ago
Perfect storm: no fuzz testing of the driver code, no staged deployment, no OS blue/green boot partition
@Ash_18037 4 months ago
No, not really. A perfect storm implies the issue was due to various timing or bad-luck factors, i.e. it lessens the culpability of ClownStrike. Each of the issues you mention was just plain incompetence.
@baumkuchen6543 4 months ago
I am afraid there was no testing at all in this mess. Everything points to that...
@draoi99 4 months ago
Third Party apps operating in kernelspace... FFS
@colinhobbs7265 4 months ago
@@draoi99 All operating systems do this. If you are saying FFS about that, you don't know how computers work. Yes, including macOS.
@BigMcLargeChungus 4 months ago
I think it's important to point out that Crowdstrike did the same thing back in April but it affected Linux machines (causing kernel panic).
@Techmagus76 4 months ago
But there was not much talk about that. Why? Probably because nearly all distros have a rollback mechanism that lets you boot a previous working kernel.
@heinzk023 4 months ago
Maybe CrowdStrike's management thinks and acts like Boeing's?
@nosuchthing8 4 months ago
Really??
@ChrisM541 4 months ago
And they caused a massive c*ck-up a few years ago. Seems they are 'too big' to fail.
@sinaghaderi9184 4 months ago
@@Techmagus76 Because no one installs an antivirus on Linux.
@jeraldbottcher1588 4 months ago
This boggles my mind as an IT professional. I was part of a team that deployed patches and software for years. This included OS deployment, patch deployment, and software deployment, the whole thing, on both workstations and servers. We tested our patches extensively before pushing them out to the entire population of the environment. This first included a sandbox environment, then a select user/system environment, and then we would stage our patches out over several hours so that if something happened we could back out before catastrophe struck. And honestly, sometimes we would find problems with the patches, and we would be able to immediately stop, suspend, and even back out. Yes, we would use third-party vendor solutions to help with this, and any time we changed ANYTHING we would follow our testing procedures and matrix; normal business. We would never shirk our procedure of testing first, then deploying. To me this is a total failure of IT governance and a failure to maintain standards. (IT governance is setting and maintaining standards and policies for the IT infrastructure.)
@Arthur-1337 4 months ago
The frowny face is absolutely necessary
@user-yv6xw7ns3o 4 months ago
Yes I agree. Absolutely necessary, even if not strictly so :(
@ICanDoThatToo2 4 months ago
I dunno, I'm starting to like 😉👍
@phizc 4 months ago
​@@ICanDoThatToo2 any of these would work too: 🤪 🤯 🥳 🥶 😱 💀 💩 🍐 🌋 🆘️ 🏳 Or an animation: 🤣 😂 🔫 😅 🔫 🥺🔫 🤯💥🔫 🧠💀
@blucat4 4 months ago
If Mike Pound says it, it must be true. Therefore you are wrong! 😁
@CheddarKungPao 4 months ago
When talking about this incident, it's worth remembering that hospitals were affected and people may have died because of this. So it's all well and good to say that when everything goes down, you should go outside and touch grass. But we also need to think seriously about whether we're doing enough to ensure software safety. We take it way less seriously than, for example, car safety. When a new model of car comes out, it has to go through all kinds of testing to ensure its safety. But we are doing nothing to ensure software safety; we are just 100% trusting the vendors. I've been a professional software engineer for 25 years and have long thought that the current approach is madness. Incidents like this one only make me more sure that we need standards that all critical system software must meet in its development, deployment, and implementation.
@Nadia1989 4 months ago
Someone left a message in a Spanish dev stream saying their aunt had a miscarriage and couldn't be operated on because all the hospital computers had BSOD'ed. She had an emergency procedure hours later.
@SuperWolfkin 4 months ago
100% true. It's definitely a big deal that this incident took down not just school computers or corporate businesses but hospitals that need them to keep people alive. People were missing their medications; for some people like me, missing medication means you end up throwing up for a couple of nights, but for other people the consequences can be much more dire. At the end of the day, as technology runs more and more of our lives, I do agree there's nothing you can do to prevent hospitals from being part of the affected class. These things will happen, and hospitals will be affected just like any other computerized business. The problem is that we don't need to have so many hospitals affected in a single incident. That is purely the result of a monoculture, which is the result of monopolistic practices, which is a result of the form of capitalism that we have in North America and its effects around the world. And that's just on a philosophical level, without even approaching all the specific problems that could have been prevented in this case.
@mohammednazir3249 4 months ago
bro is secretly working for the government
@jismeraiverhoeven 4 months ago
While I agree with your statement, digitalization also played a huge role in this. Nowadays everything needs to be "smart", even things where it doesn't make sense, like refrigerators. If those hospitals had alternatives to the computers they used (for example, paper copies of documents alongside digital versions), this would have hurt them far less. We are too dependent on digital computers.
@tyrand 4 months ago
Anyone using this horseshit on hospital computers needs sacking
@mfaizsyahmi 4 months ago
Seeing two academicians discuss this issue is so refreshing. So many ideas thrown back and forth.
@minxythemerciless 4 months ago
The guilty in this instance are both CrowdStrike and their Customer Security Managers. CrowdStrike has a history of shipping stuff that breaks systems, most recently their Linux product. The Customers said: Yes CrowdStrike just put whatever you want on our systems without monitoring. And by the way, we have no adequate disaster recovery plan. As a corollary, letting CrowdStrike put stuff on your systems also allows bad people to compromise CrowdStrike and deliver unlimited hurt. If I was a baddie I'd spend my every effort to subvert CrowdStrike!
@ipadista 4 months ago
There will most likely be a lot of QA positions opening at CrowdStrike in the aftermath of this. Bad actors just need to get one of "their guys" in through that recruitment process.
@LimitedWard 4 months ago
@@ipadista I'd sooner expect more attorney positions to open up before QA
@justgame5508 4 months ago
What an awful take
@haqvor 4 months ago
@@justgame5508 welcome to the corporate mindset. Protection against liability is more important than delivering a working product. Who do you think the company is prepared to pay the most, the lawyers or the engineers? That reflects how they value their respective services.
@jbird4478 4 months ago
@@lintfordpickle Yeah, but when our security software screws up it will a) first crash the test machine which would block the rest from receiving the update, and b) if that somehow fails our system would allow us to reboot with a previous system snapshot. To see these massive and vital organizations not have _any_ backup plans while putting full trust in an external company is mind boggling.
@kaseyboles30 4 months ago
The fix is simple: do not push untested code onto live systems where it will run as part of a must-run-to-boot kernel-level driver. Run it on a test system first. And never trust a 'security company' that says you should do otherwise (except in rare cases, such as a very bad zero day being exploited, where it's a gamble either way). If they allowed this for a run-of-the-mill non-emergency update, then they don't know cyber security and safety well enough to protect a home gaming system, let alone major systems. This goes past gross incompetence to the point where I wouldn't blame anyone for suspecting malice. Though I personally think it was "we don't screw up, we stop screw-ups" level hubris.
@ChrisM541 4 months ago
EXACTLY! Unfortunately, this braindead policy of offloading all QC/QA onto the end user is being practiced by an increasing majority of devs... all enabled by The Internet. Software development is the most uncontrolled, unregulated industry in existence. Governments MUST act... before it really is too late!
@haqvor 4 months ago
I quote Grey's law: "Any sufficiently advanced incompetence is indistinguishable from malice." It doesn't really matter if CrowdStrike did it out of malice or just cut corners to cheap out on development costs. They sell a product that is obviously not robust enough to be used on mission-critical systems, and they have made the decision to risk their customers' business to make more money for themselves. In turn, Microsoft allows their OS to hard crash due to a faulty third-party driver. That cannot be tolerated on mission-critical systems, so a large part of the blame goes to them as well. The end users seem to be pretty naive as well; they have hopefully learnt the expensive lesson on how not to build infrastructure.
@BillAnt 4 months ago
There's also a small chance that the files got corrupted during the transfer to a CDN which served the corrupted update to millions of computers. We shall see....
@wily_rites 4 months ago
Software running in the kernel pretending to be a driver, when in reality it is a parser, what could go wrong?
@blenderpanzi 4 months ago
Windows can in fact boot with the failing driver automatically disabled the next time, except for drivers that are marked as absolutely necessary for booting itself, and this driver is marked as such.
@irql2 4 months ago
Nah, it wasn't marked as boot critical; common talking point though. It doesn't change anything, though: unless you get to a desktop, Windows considers it a failed boot. Do that 3x and you end up in the recovery console.
@grokitall 4 months ago
@@irql2 Yes it was, but the decision as to whether it can be downgraded should be Microsoft's. Just because they want it to prevent booting if it cannot start does not mean that Windows cannot start without it.
@irql2 4 months ago
@@grokitall Stop parroting talking points and go look at how the driver is configured in the registry. People are super confident about things and won't even verify when it's very easy to do.
@grokitall 4 months ago
@@irql2 According to retired Microsoft engineer Dave Plummer, they had it marked as boot critical, according to his sources. I have no reason to doubt his statement. Despite how unimpressed I am with various choices Microsoft has made, I have no reason to doubt the quality of their engineers. That is why I am sure they are capable of determining whether it is actually boot critical when the driver is being signed. I am also sure that they are capable of writing code which will use that determination to downgrade the driver and disable it if it is too broken to boot, and to check whether it is stuck in a boot loop. For any OS, as long as you can get to startup and use the net, you can fix the driver with an update without having to manually log in to all the locked-down machines. The fact that they have not bothered to implement such a measure when this has happened before is disappointing.
@irql2 4 months ago
@@grokitall Thanks for confirming you won't even go look and you'll just parrot whatever anyone says. David is wrong too, and he would admit it if he looked. We're human, it happens... He probably doesn't have a dump to go and check, and honestly it doesn't matter. What's more concerning is how confidently wrong people are, and that they have no interest in learning anything that wasn't hand-delivered to them by some source they consider trustworthy. This is a huge problem, and our political climate is evidence enough of this. If you had asked "How do I verify this?", since you obviously don't know or even care to, I would have shared that information with you so that you could be more informed on the topic... but nah, Polly wants a cracker instead. For those who are interested in learning: csagent's Start value is set to 1, meaning it's just another driver; it's not special in regards to booting. If it were, you'd get a 7B on boot. This entire interaction is disappointing. What happened to the days when people went "Oh yeah? Show me"?
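For readers who want to check the claim in this comment themselves, the `Start` value of a Windows driver service maps to the standard start types sketched below. The mapping follows the documented Windows service start types; the `read_start` helper is an illustrative assumption, works only on Windows, and the service key name has to be supplied by the reader (the thread only says "csagent").

```python
# Documented Windows service/driver start types (the registry "Start" value).
START_TYPES = {
    0: "boot",      # SERVICE_BOOT_START: driver started by the system loader
    1: "system",    # SERVICE_SYSTEM_START: driver started during kernel initialization
    2: "auto",      # SERVICE_AUTO_START: started automatically at system startup
    3: "demand",    # SERVICE_DEMAND_START: started on request
    4: "disabled",  # SERVICE_DISABLED: cannot be started
}


def describe_start(value: int) -> str:
    """Translate a registry Start value into a readable start type."""
    return START_TYPES.get(value, f"unknown ({value})")


def read_start(service: str) -> int:
    """Windows only: read the Start value for a service from the registry.

    Hypothetical helper for illustration; the caller supplies the key name.
    """
    import winreg  # standard library module, available only on Windows

    path = "SYSTEM\\CurrentControlSet\\Services\\" + service
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        value, _value_type = winreg.QueryValueEx(key, "Start")
    return value


print(describe_start(0))  # boot
print(describe_start(1))  # system
```

On a Windows machine, `describe_start(read_start(name))` would show which of the two positions in this argument the local configuration supports.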
@daanwilmer 4 months ago
Thanks for being the first source I found that actually explains what CrowdStrike is and what went wrong here, and nice to hear some nuance and perspective as well.
@IceMetalPunk 4 months ago
If you want a little more detail: apparently, the definition file they pushed out left some index entries uninitialized, so some memory addresses that were meant to hold pointers ended up with junk data that, when dereferenced, pointed to invalid memory locations.
@Tahgtahv 4 months ago
@@IceMetalPunk Thanks, this is the best explanation I've heard so far. IMNSHO, the software should have been written in such a way that the definitions don't directly map to memory. Then, when you create data structures in memory, they always point to something valid. But nobody asked me.
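The idea in this exchange, never letting values from a definition file act as pointers without checking them, can be illustrated with a small sketch. The file layout here (a flat array of little-endian 32-bit indices into a rule table) is a made-up assumption for the example and is not the real channel file format.

```python
import struct


class InvalidDefinition(ValueError):
    """Raised when untrusted definition data fails validation."""


def load_rule_refs(blob: bytes, rule_table: list[str]) -> list[str]:
    """Parse little-endian uint32 indices and resolve them against rule_table.

    Every index is bounds-checked before use, so the returned list only ever
    refers to entries that really exist in the table.
    """
    if len(blob) % 4 != 0:
        raise InvalidDefinition("truncated index array")
    count = len(blob) // 4
    indices = struct.unpack(f"<{count}I", blob)
    resolved = []
    for position, index in enumerate(indices):
        if index >= len(rule_table):
            raise InvalidDefinition(
                f"entry {position} points at {index}, "
                f"but the table has only {len(rule_table)} rules"
            )
        resolved.append(rule_table[index])
    return resolved


rules = ["block-pipe", "log-only"]  # hypothetical rule names
print(load_rule_refs(struct.pack("<2I", 1, 0), rules))

try:
    # A junk entry, standing in for an uninitialized field in the file.
    load_rule_refs(struct.pack("<2I", 0, 0xDEADBEEF), rules)
except InvalidDefinition as err:
    print("rejected:", err)
```

A bad file then produces a rejected update and an error report rather than an invalid memory access inside the kernel.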
@alazarbisrat1978 4 months ago
@@Tahgtahv I think what you're talking about is Rust. But apparently there were numerous cracks in the program even before then, caused by the same QA issues that caused this crash; the crash was just everything finally falling apart.
@stco2426 4 months ago
Enjoyed this. Glad I watched the recent 'Dave's Garage' video where he explained the problem. Here I got a good understanding of the wider consequence management. Well worth watching both, I think.
@WilliamLeeSims 4 months ago
The CrowdStrike bug was what Y2K wished it could be.
@ZiggyGrok 4 months ago
Fortunately we fixed Y2K before it could cause this chaos. If we had done nothing, it would've been far far more devastating.
@davidmcgill1000 4 months ago
@@ZiggyGrok Y2K only affected those that were too lazy to add 2 more characters to their dates. If your code was vulnerable, it was terrible code to begin with.
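For context on the two-digit dates being argued about here: one common remediation at the time was "windowing", where stored two-digit years are interpreted relative to a pivot instead of widening every field. The sketch below assumes a pivot of 70, which is an arbitrary choice for the example.

```python
def expand_two_digit_year(yy: int, pivot: int = 70) -> int:
    """Interpret a stored two-digit year using a fixed window.

    Values at or above the pivot are read as 19xx, values below it as 20xx.
    The pivot of 70 is an assumption chosen for this example.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 1900 + yy if yy >= pivot else 2000 + yy


# The naive reading always adds 1900, so the year after '99' looks like 1900.
naive = 1900 + 0
windowed = expand_two_digit_year(0)
print(naive, windowed)
```

Windowing only postpones the ambiguity to the edge of the window, which is one reason widening the field to four digits was the more durable fix.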
@nosuchthing8 4 months ago
The world was not as interconnected then too.
@AySz88 4 months ago
@@davidmcgill1000 You realize that non-programmers use two digits for years too? A lot of it was a (lack of) standards issue, not just code.
@davidioanhedges 4 months ago
@@davidmcgill1000 Too lazy... No, they were using software originally designed when memory was small and expensive, and saving two characters per entry won programmers pay rises. Many years later, huge and expensive efforts were put in to check and update systems to get around the issues, and so almost nothing happened, but that doesn't mean there wasn't a problem.
@3Ppaatt 4 months ago
Working for a Bank we had drills where we simulated losing our systems for a few hours and had to do everything (and I mean every conceivable thing we might be asked to do in a normal day) without any computers. Including driving physical records to central processing locations.
@sunefred 4 months ago
Falcon uses definition files which are NOT part of the WHQL process, which the Falcon driver itself obviously is! I don't know how this works on Linux or Mac, but maybe Windows driver makers should not be allowed to deliver _anything_ to the kernel that does not go through WHQL certification.
@roippi3985 4 months ago
This is the part that’s wild for me. WHQL is supposed to be this Highest Level Of Scrutiny thing, and somehow WHQL reviewed this workaround to inject arbitrary runtime behavior without requiring WHQL recertification and said F It Ship It.
@IceMetalPunk 4 months ago
My only suspicion is that someone, somewhere thought requiring WHQL for definition files could delay definitions too long when new vulnerabilities are discovered and need to be monitored. Like, "if we do WHQL on every definition, by the time it gets released, so many people could be affected by this exploit!"
@sunefred 4 months ago
@@IceMetalPunk I think that's the reason, and I can't say I have any insight into the WHQL process to tell you how long it normally takes. Would be interested to know though, do you know? I would imagine most of it is automated.
@playground2137 4 months ago
Yeah that is an important part that they didn’t mention, I think.
@bierrollerful 4 months ago
Maybe definition files do not contain any code and are thus exempt from WHQL process? It could be that the definition file was simply corrupted and unreadable and the kernel driver crashed when trying to read it.
@lenwe33 4 months ago
13.37% complete... ISWYDT 🙃
@blackholesun4942 4 months ago
What does that mean?
@alazarbisrat1978 4 months ago
@@blackholesun4942 I see what you did there
@playground2137 4 months ago
@@blackholesun4942 I am not sure which part you didn’t get. The custom blue screen of death (BSOD) is something they fabricated. 1337 is often used in gamer culture to mean LEET (or rather, elite), usually indicating something like highly skilled (a 1337 player, for instance). ISWYDT: I see what you did there. So it is used a bit ironically here, because it was of course not a skilled update. Hope that helps.
@jeremytrees7266 4 months ago
​@@blackholesun4942 🏴‍☠️
@JonBrase 4 months ago
@@playground2137 TBF, 1337 is specifically turn-of-the-millennium gamer culture (late Gen X, elder millennial). I'm not sure I've even seen younger millennials using it, let alone Gen Z.
@m4rt_ 4 months ago
The new update to CrowdStrike Falcon included some corrupted channel files (they contained just zeroes instead of the intended data), and because the core driver that loaded the channel files didn't do enough input validation, it carried on using the messed-up channel files, and this revealed a bug that had likely been there for a while. The bug caused the driver to attempt to dereference a null pointer, which caused the BSOD.
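Whether or not the file really was zero-filled, the kind of input validation this comment says was missing is cheap to express. The sketch below assumes an invented file layout (a 4-byte magic number, a CRC32 of the payload, then the payload); the `CHNL` magic and both function names are hypothetical and do not describe CrowdStrike's actual format.

```python
import struct
import zlib

MAGIC = b"CHNL"  # hypothetical magic number, not the real file format


def build_channel_file(payload: bytes) -> bytes:
    """Demo helper: wrap a payload in the hypothetical header."""
    return struct.pack("<4sI", MAGIC, zlib.crc32(payload)) + payload


def parse_channel_file(data: bytes) -> bytes:
    """Return the payload of a well-formed file, or raise ValueError."""
    if len(data) < 8:
        raise ValueError("file too short for header")
    magic, expected_crc = struct.unpack("<4sI", data[:8])
    if magic != MAGIC:
        raise ValueError("bad magic number")
    payload = data[8:]
    if zlib.crc32(payload) != expected_crc:
        raise ValueError("checksum mismatch")
    return payload


good = build_channel_file(b"rule: example")
print(parse_channel_file(good))

try:
    parse_channel_file(bytes(64))  # a file containing nothing but zeroes
except ValueError as err:
    print("rejected:", err)
```

A loader written this way would refuse the file up front and could fall back to the last known-good version instead of continuing with data it cannot interpret.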
@David-bi6lf 4 months ago
Yeah, and CrowdStrike probably has not fixed the bug because it would require a new release of the driver, which would have to go through the Microsoft WHQL signing process again, something the use of these channel files seeks to avoid.
@MatthijsvanDuin 4 months ago
Note that this corruption claim is, as far as I know, coming from one random Twitter user and has been denied by CrowdStrike, who say there was a logic error in the updated rules file that caused the problem. It seems extremely unlikely to me that CrowdStrike does no validation on these files, given that they're being updated frequently on a huge number of machines and are therefore liable to get corrupted (due to power failures and such) on a regular basis.
@MatthijsvanDuin 4 months ago
I found a twitter post from someone that the problematic channel file was _not_ zero-filled on any of the systems he had to manually fix that day.
@Ny_babs
@Ny_babs 4 ай бұрын
My local pub went down.. no fish and chips for me..
@jklax
@jklax 4 ай бұрын
No cash in hand?
@Abdega
@Abdega 4 ай бұрын
“This was a phishing attack and a chip level attack?” “No, no… the cash register system is down thanks to broken Windows update” “They broke your windows and stole your cash?!” “No, the money is still here!” “Okay, I’ll just pay you in cash then” “I can’t do that! The register is locked unless the computer tells it to open! Besides, each purchase is required to update the inventory as well” “I don’t see what the Tories have to do with anything in this case” “… I don’t have time for your Monty Python shenanigans” “I’d think this stuff would be programmed in C and not Python” “GET OUT!”
@paulmichaelfreedman8334
@paulmichaelfreedman8334 4 ай бұрын
@@Abdega 😂
@KarimY-119
@KarimY-119 4 ай бұрын
in my local pub i can order by sending a SMS to their fax. cash-only place
@dhillaz
@dhillaz 4 ай бұрын
​@@Abdega When the best comment is buried in a thread
@DragoniteSpam
@DragoniteSpam 4 ай бұрын
A number of years ago Tom Scott did a fun talk called "Single Point of Failure." I think about that sometimes.
@Vospi
@Vospi 4 ай бұрын
Very enjoyable format of two people discussing. Sounds less monotonous, too. Great job.
@zhandanning8503
@zhandanning8503 4 ай бұрын
when the computer goes down, that is a sign to photosynthesize, nice
@Abdega
@Abdega 4 ай бұрын
It’s thunderstorming where I’m at so I’d have to wait
@lachlantula
@lachlantula 4 ай бұрын
that os/house/hotel analogy was really good!
@eructationlyrique
@eructationlyrique 4 ай бұрын
Linux has a feature that allows the sandboxing of channel updates using eBPF, although Crowdstrike doesn't use it yet. In theory, that could have prevented the BSODs had Windows had a similar feature. Also, I don't necessarily agree that Windows is blameless here. While Crowdstrike is definitely at fault, Windows did certify their driver, and that validation somehow didn't include testing for corrupted or invalid channel files. There's no reason the driver should blindly trust those files without validation.
@reybontje2375
@reybontje2375 4 ай бұрын
Yeah, Microsoft also allows eBPF, but it's in an alpha, very early state. Also, the people opining that "this isn't a Windows' issue" are right to a degree, but when you realize that there are design deficiencies around how Microsoft handles drivers, it can only be said, "they're right to a degree," especially when you can specify kernel command line options to disable drivers that are acting bad, or have a fallback initramfs that doesn't load the CrowdStrike driver, which Windows doesn't really allow. I believe that CrowdStrike is also on the eBPF design foundation alongside some other industry giants like Apple, Google, Microsoft, etc. I think CrowdStrike also uses eBPF for Linux in their newer agent after the debacle back in March/April with Debian.
@JonBrase
@JonBrase 4 ай бұрын
My understanding is that CrowdStrike does use some type of interpreted code in their definition files, which would imply that there was some bug in the interpreter (or code downstream of it) that allowed a null-pointer dereference through (or made a null pointer dereference on its own).
@TheFPSPower
@TheFPSPower 4 ай бұрын
@@reybontje2375 Windows does have self-recovery functions for bad acting drivers, but they do not work on boot drivers and Crowdstrike's driver is a boot driver so the system is not allowed to boot if it crashes by design unless you use safe mode.
@JonBrase
@JonBrase 4 ай бұрын
@@forbidden-cyrillic-handle Lol. Your username.
@sinaghaderi9184
@sinaghaderi9184 4 ай бұрын
But who would install this on Linux? I've never seen a Linux server with anti-virus or EDR. It sounds dumb.
@mythofechelon
@mythofechelon 4 ай бұрын
As someone who led the deployment of EDR and EPP to 18,000+ endpoints last year, agents are absolutely installed on Windows servers, yes. Updates like this that don’t go through change control are a calculated risk for more up-to-date protections. Problem is that the risk mitigation is that the vendor does testing and releases competently..
@PE4Doers
@PE4Doers 4 ай бұрын
I am a recently retired Cyber Security (though being heavily involved in Computer Security for over 30 years, and a software developer for 20 years prior to that, I prefer the traditional names of Computer or Systems Security) Compliance Officer. Although the systems I monitored were involved with critical infrastructure and not open to regular users of business systems, they were still peripherally dependent on many such systems. Since I was a stickler for avoiding the Cloud and third-party security products, my former employer has taken steps to ensure I never know if they were severely affected by the CrowdStruck (accepting the pun) event. The real issue is something you two gentlemen mentioned but did not go deeply into. What if there were malicious embeds (i.e. spies) working for that organization, or for Windows System development? We would not be facing a bad day or so; it could have been lights-out until every critical system was completely rebuilt and data backups restored. I can understand why discussion of that scenario would be avoided, but should it be avoided? If I were a critically ill patient in the hospital I would want to know so I could prepare for the aftermath.
@Scum42
@Scum42 4 ай бұрын
Every time there's some outage, or bug, or virus big enough to get in the news, I get excited about the inevitable computerphile video explaining it.
@phasm42
@phasm42 4 ай бұрын
Crowdstrike sounds like a nickname for Mustangs 😅
@marvinracer88
@marvinracer88 4 ай бұрын
good one lol
@cappaculla
@cappaculla 4 ай бұрын
We can probably thank Dave Plummer for making sure guys like this actually know how to explain the issue.
@Moose_33
@Moose_33 4 ай бұрын
Yesssssss, twas waiting for this. You beautiful channel you. The dynamic duo returns
@paultasker7788
@paultasker7788 4 ай бұрын
Finally, a really good explanation of crowdstrike and what it does and what went wrong.
@tocsa120ls
@tocsa120ls 4 ай бұрын
Crowdstrike did more harm to its clients, and to the Western world, than it could ever have possibly prevented for the entire duration of its existence as a company. How they ONLY lost 20% of their share value is mind-boggling.
@AlBoulley
@AlBoulley 4 ай бұрын
Love the point you've made.
@nicostigliano6393
@nicostigliano6393 4 ай бұрын
You said the most obvious thing
@Valgween
@Valgween 4 ай бұрын
robot movie pfp
@tocsa120ls
@tocsa120ls 4 ай бұрын
@@nicostigliano6393 nobody's saying it out loud tho
@vincentfiestada
@vincentfiestada 4 ай бұрын
Finally, FINALLY, some informed and cogent commentary on this issue that isn't just "Tech influencer says Windows is a mess and this would never happen in Linux or macOS"
@lis6502
@lis6502 4 ай бұрын
Crowdstruck? We gave this overtime event a codename of 'clownstrike'
@PowerShellWizard
@PowerShellWizard 4 ай бұрын
As an Ex MS employee and one that worked at Windows, I appreciate what was said at 7:42 :)
@bbellefson
@bbellefson 4 ай бұрын
Typical "Management Bug?" A CrowdStrike engineer or two urges more testing before release. Some executive then pounds the conference table and shouts, "No more f**king EXCUSES! I want that update NOW gawdammit!"
@wcmatthysen
@wcmatthysen 4 ай бұрын
Yeah, and I want it rolled out to everyone, NOW!!! Phased roll-outs are for pussies!
@aixtom979
@aixtom979 4 ай бұрын
Especially seeing that the CEO of Crowdstrike *now* was the CTO at McAfee back *then*, when McAfee brought down XP machines by deleting Windows core files in 2010. The common factor is the manager.
@Asidders
@Asidders 4 ай бұрын
I love listening to these engaged guys 😁
@tubehellcat
@tubehellcat 4 ай бұрын
😂 the example bluescreen at around 0:36 , 13.37% 😂 love it 😁
@taz9609
@taz9609 4 ай бұрын
I mean crowdstrike issues aside, WHAT A CHANNEL! Thank you
@rooboy69
@rooboy69 4 ай бұрын
Crowdstrike didn't do any validation (or not enough) in their driver to check the .sys file before running it, to confirm it wasn't just full of null values etc.
@TS6815
@TS6815 4 ай бұрын
These IT disasters always have the upside of flushing Dr Pound and Dr Bagley out of whatever else they’re up to, to give us these great explanations!
@sunefred
@sunefred 4 ай бұрын
It's going to be very interesting to see what Crowdstrike learns from this. One thing they didn't seem to use is a canary or blue/green deployment scheme. Hoping for some enlightening blog posts on the topic eventually.
@vincei4252
@vincei4252 4 ай бұрын
nothing. The guy in charge oversaw something exactly similar when he was at McAfee
@spartanj2957
@spartanj2957 4 ай бұрын
Microsoft,CS,Black rock the WEF and more are tied together .was no accident
@fatonaoladimeji9697
@fatonaoladimeji9697 4 ай бұрын
I would have listened to these guys talk about it for an hour
@rubenreyes2000
@rubenreyes2000 4 ай бұрын
You didn't mention that in order to install kernel drivers, the code needs to be submitted to Microsoft to be tested, approved and digitally signed. As you mentioned, the bug was not present in the main kernel, but in the "channel files" that are updated without following that same process. It is not clear to me if those "channel files" are code or just configuration, but maybe Microsoft is partially at fault here for allowing these channel files in the first place, or for not sufficiently checking the kernel driver had the necessary logic to gracefully crash without taking down the entire system.
@throwaway6478
@throwaway6478 4 ай бұрын
Clownstrike apparently uses a P-code interpreter to sneak unsigned code into their driver. You'd be a millionaire by Saturday if you invented a heuristic that can reliably detect a P-code interpreter and/or the P-code itself (which of course can be in any format the writer desires) running in kernel mode.
@nosuchthing8
@nosuchthing8 4 ай бұрын
As I understand it, if something fails in ring zero or kernel mode, the entire OS goes down.
@TheFPSPower
@TheFPSPower 4 ай бұрын
@@throwaway6478 In this case it's not that hard, it's a new file getting loaded from system32, the kernel knows every file you open so you could absolutely block unsigned files in system folders from loading, but as they said it would interfere with competing products so they can't do that, they signed an agreement to allow kernel drivers to work.
@ChrisM541
@ChrisM541 4 ай бұрын
There are exceptions to requiring to get your code MS Certified - code that needs to respond to Day 0 attacks don't need certified, for obvious speed reasons. Fortunately/unfortunately.
@irql2
@irql2 4 ай бұрын
The "bug" was in csagent.sys; that's the driver that was referencing an invalid memory address. Important to note that.
@HopliteSecurity
@HopliteSecurity 4 ай бұрын
Computer Phile is amazing! I love your content and calm but casual demeanor. Your explanations and ability to break things down is superb! Keep it up 🙏🙏🙏🙂❤️
@pnwlady
@pnwlady 4 ай бұрын
Are there no standards for deploying updates that run in the kernel?
@jjdawg9918
@jjdawg9918 4 ай бұрын
I can't find one YouTuber talking about proper sysadmin practices at the enterprise level that would have caught this before getting rolled out. I have never worked at a company where PCs weren't locked down from software installs and every update (even ones from MS) wasn't tested by local QA before being rolled out to the enterprise PCs. Unbelievable that airlines are being run this way. Unless CrowdStrike installed some rootkit that bypasses all these processes, I'm shocked at the state of sloppiness in IT.
@egria
@egria 4 ай бұрын
I am trying to voice the same thing, but not even tech guys understand. CS Falcon updates bypass everything, but I still don't understand how admins allow live updates on supposedly closed systems like airports, banks, POS etc. And the loophole seems to be either the same Windows update server being used for both live and testing, or just a plain network connection to the outside world that allows CS Falcon updates so that it can prevent zero-day security issues. It is just absurd!
@TechSY730
@TechSY730 4 ай бұрын
UPDATE: Thanks tma2001 for letting me know the zero file was not the cause. In fact there is validation in place, and the error was somewhere else, so the below is inaccurate.
Seems it was a lack of input validation. Apparently the root cause of the crash was that one of the files in the definition update was just a file filled with zeros for whatever reason, leading to a null pointer dereference (which always crashes, by design). But that makes me go like: input validation, anyone?! Does CrowdStrike Falcon fail to at least make sure the definition file makes sense as a definition file before blindly following its directions?
@necuz
@necuz 4 ай бұрын
Everyone who is even remotely competent knows to put headers on files, network packets and the like. A magic byte or two and some metadata goes a long way when validating.
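To make the magic-bytes idea concrete, here is a minimal Python sketch of a header check. The layout (a `CHNL` magic, a version, a length and a CRC32) and the names `build_file` and `validate_file` are invented for illustration; nothing here reflects CrowdStrike's real channel file format.

```python
import struct
import zlib

# Hypothetical layout, invented for illustration only:
# 4-byte magic | 2-byte version | 4-byte payload length | 4-byte CRC32 | payload
MAGIC = b"CHNL"
HEADER = struct.Struct("<4sHII")


def build_file(payload: bytes, version: int = 1) -> bytes:
    """Wrap a payload in the illustrative header."""
    return HEADER.pack(MAGIC, version, len(payload), zlib.crc32(payload)) + payload


def validate_file(blob: bytes) -> bytes:
    """Return the payload if the blob passes every check, else raise ValueError."""
    if len(blob) < HEADER.size:
        raise ValueError("file shorter than header")
    magic, version, length, checksum = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad magic bytes")
    if version != 1:
        raise ValueError("unsupported version")
    payload = blob[HEADER.size:]
    if len(payload) != length:
        raise ValueError("length mismatch")
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch")
    return payload
```

With this kind of check, a file consisting only of zeros is rejected at the magic-byte comparison, before anything tries to interpret its contents.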
@tma2001
@tma2001 4 ай бұрын
no that was a red herring - for some people it wasn't all zeros and CS confirmed in a technical blog post that null bytes in the channel file were not the cause. There are many possible reasons why it was a file of zeros for some folks - pre-allocated ahead of time before updated or wiped clean as a post processing step for security. Valid channel files have a magic signature at the beginning and they actually contain code in the form of byte code for a VM interpreter in the actual kernel driver. The logic error was in the byte code. Of course this means the actual driver can have gone through WHQL but is actually a dynamic entity.
@TechSY730
@TechSY730 4 ай бұрын
@@tma2001 Ooh, thanks for the correction. I hadn't heard any technical detail updates since the original 0'ed file finding
@tma2001
@tma2001 4 ай бұрын
@@TechSY730 you were not alone - I too was confused by what little folks had to go on initially. None of it made any sense! There is a full explanation by the Cloud Architect B Shyam Sundar on the Medium website that breaks it down.
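On the byte-code point made in this thread: if a signed driver interprets rule programs that arrive later, the interpreter itself has to treat every instruction as untrusted. The toy Python interpreter below is only a sketch of that idea; the opcodes (`LOAD`, `PUSH`, `ADD`) and the function `run_rules` are made up for illustration and say nothing about how Falcon's interpreter actually works.

```python
from typing import List, Sequence, Tuple

Instruction = Tuple[str, int]  # (opcode, operand); format invented for illustration


def run_rules(program: Sequence[Instruction], inputs: Sequence[int]) -> List[int]:
    """Run a toy rule program, rejecting bad instructions instead of crashing."""
    stack: List[int] = []
    for position, (opcode, operand) in enumerate(program):
        if opcode == "LOAD":
            # The operand indexes into the inputs; check it before reading.
            if not 0 <= operand < len(inputs):
                raise ValueError(f"instruction {position}: input {operand} out of range")
            stack.append(inputs[operand])
        elif opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            if len(stack) < 2:
                raise ValueError(f"instruction {position}: stack underflow")
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"instruction {position}: unknown opcode {opcode!r}")
    return stack
```

A faulty program then produces an error the caller can handle (for example by skipping that rule file), where an unchecked read would have been an invalid memory access in kernel mode.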
@HubrisInc
@HubrisInc 4 ай бұрын
Never fails, something big happens in the field of cybersec, we can guarantee that we'll get a Computerphile video starring Dr Bagley &/or Dr Pound :)
@miravlix
@miravlix 4 ай бұрын
Not seeing much understanding of administration. A system I was admining involved testing updates before they got installed on the live environment, and with this many computers you don't install it on all of them at the same second: you install it in segments and don't continue until you have successfully restarted the first batch of computers. This is all about GREED admining: they didn't want to pay for doing it properly. My way of admining was developed in the 19xx; we have INTENTIONALLY dropped security to save money.
@egria
@egria 4 ай бұрын
Yep, admin practices are the key, not a particular bug. Live updates in a closed system are a big NO, no matter what the sweet voice of the software vendor tells you. And the most common phrase nowadays is: "it is for your security" - be it the people or the machines.
@egria
@egria 4 ай бұрын
Some companies had staging environments, but they use the same Windows update server for both live and staging/testing, so this update just bypassed software-enforced policies and went live. Those are my speculations, based on admins sharing their cases. There is no in-depth public case analysis yet. Hush practice for the sake of reputation.
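The segmented rollout described in this thread can be sketched in a few lines of Python. This is a simplified model, not any vendor's deployment tooling: `staged_rollout`, `deploy` and `is_healthy` are hypothetical names, and the health check stands in for "the first batch restarted successfully".

```python
from typing import Callable, Iterable, List


def staged_rollout(
    hosts: List[str],
    stage_percents: Iterable[int],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Deploy to growing slices of hosts; stop after the first unhealthy batch.

    Returns the hosts that received the update.
    """
    updated: List[str] = []
    for percent in stage_percents:
        # Cumulative number of hosts that should be updated after this stage.
        target = max(1, len(hosts) * percent // 100)
        batch = hosts[len(updated):target]
        for host in batch:
            deploy(host)
            updated.append(host)
        if not all(is_healthy(host) for host in batch):
            break  # halt: later stages never see the bad update
    return updated
```

Called with stages such as `(1, 10, 100)`, a faulty update that fails the health check reaches only the first slice of machines; every later stage is skipped.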
@daryx.langdale
@daryx.langdale 4 ай бұрын
Big cyber-cockup (a beat) "Crowdstruck (Windows Outage) - Computerphile" ------ Right on time, thank you
@SyphistPrime
@SyphistPrime 4 ай бұрын
It also doesn't help that Microsoft took away the key combo to tell the OS to boot into safe mode on startup. If that was a thing I'm sure this would've been at least a bit smoother.
@throwaway6478
@throwaway6478 4 ай бұрын
It amazes me how many of you don't know about bootmenupolicy legacy.
@SyphistPrime
@SyphistPrime 4 ай бұрын
@@throwaway6478 because I don't specialize in the black box that is Windows. Also why should I have to dig through layers of archaic settings to change this when it's a sensible default?
@throwaway6478
@throwaway6478 4 ай бұрын
@@SyphistPrime You use an operating system where you have to edit dotfiles to configure your mouse. 🤣
@irql2
@irql2 4 ай бұрын
@@SyphistPrime oh stop it, you're not reading the source code for linux to figure out how something works, no one does that... you "can" do it, but thats not a thing an average person does. You're reading documentation just like people do with windows. Stop it.
@SyphistPrime
@SyphistPrime 4 ай бұрын
@@irql2 The documentation on Linux is leagues better than Windows. There's so many undocumented and hidden features in Windows where as with Linux it's all out in the open. Also I have read bits of source code when AUR packages failed to compile. I've very much used that to help fix issues with PKGBUILDs and compiler errors. It's not usually necessary to read source code because all the documentation is out in the open, unlike Windows.
@s2snider
@s2snider 4 ай бұрын
Excellent discussion. I'm so glad I'm not in IT any more.
@akashaabeysundara8454
@akashaabeysundara8454 4 ай бұрын
1:13 if that hotel is like linux then the guests would carry their own air conditioners 😂
@SanderEvers
@SanderEvers 4 ай бұрын
and smart guests will build their own hotel next to the original, with only a small difference.
@davidioanhedges
@davidioanhedges 4 ай бұрын
Linux can run CrowdStrike, and had a worryingly similar issue a few weeks ago, since it was in the kernel there was nothing Linux could do either... But only on a couple of distros and only if you had installed Falcon CS ...
@dhillaz
@dhillaz 4 ай бұрын
Room key is not in the sudoers file. This incident will be reported.
@timsmith2525
@timsmith2525 4 ай бұрын
And to get your room cleaned, the instructions would be, "Run make, look for any errors, and correct them."
@kentslocum
@kentslocum 4 ай бұрын
This was a fantastic conversation! 😊
@paranic7
@paranic7 4 ай бұрын
There is a bottle of water under the desk !
4 ай бұрын
Oooh boy, you guys are back. Finally!! ❤
@stefanreindel9888
@stefanreindel9888 4 ай бұрын
Wondering how it got past QA? Seems like installing the update on a docker instance or vm would have found this bug.
@ytechnology
@ytechnology 4 ай бұрын
Also, how was rollout conducted? Normally it would be tiered / staggered to minimize damage from faulty code. I haven't found any confirmation, but this looked like a "big bang" release.
@Tahgtahv
@Tahgtahv 4 ай бұрын
@@ytechnology It sounded like from the video, what they pushed out was definition files, and not code per se? Normally I would not expect that kind of thing to cause a kernel panic, so maybe they didn't either. Hopefully, this incident will make them take a hard look at how they do/deploy things in the future, no matter what it is.
@MrThebigcheese75
@MrThebigcheese75 4 ай бұрын
Friday update before the holidays strikes. Just like Friday built cars. Just push into production and go down the pub, will deal with problems when we get back.
@muhdiversity7409
@muhdiversity7409 4 ай бұрын
QA is a cost center. Everyone is getting rid of that. Why not have the devs responsible for QA, oh and deploying the stuff to the customers and datacenters. The above is not a joke, I've lived it for 5 years now.
@ChrisM541
@ChrisM541 4 ай бұрын
"Wondering how it got past QA?" - there was none. This industry is unregulated. The mentality is "push now, patch later". Maybe governments will finally wake up to the certainty of more timebombs.
@m4rt_
@m4rt_ 4 ай бұрын
"Anything that can go wrong will go wrong.." - Murphy's Law Another one I like is the variation of Murphy's law from Interstellar: "Anything that can happen will happen."
@ChrisM541
@ChrisM541 4 ай бұрын
Murphy also says... "Remove QC/QA and you're f*d !!"
@michipeka9973
@michipeka9973 4 ай бұрын
"Dave's Garage" a former microsoft software engineer just did a video about what he thinks happened about this. Very comprehensive and very clear. He also speaks extensively that this was possible because Crowdstrike works in kernel mode.
@murzilkastepanowich5818
@murzilkastepanowich5818 4 ай бұрын
why would anyone want to watch that scammer?
@cidercreekranch
@cidercreekranch 4 ай бұрын
@@murzilkastepanowich5818 WTF?
@michipeka9973
@michipeka9973 4 ай бұрын
@@murzilkastepanowich5818 Sorry, I am not aware about any of that or don't even know what you are talking about. Just found about it yesterday, the video in question seems fine and basically makes some of the same points as this one, but is a bit more detailed.
@murzilkastepanowich5818
@murzilkastepanowich5818 4 ай бұрын
@@cidercreekranch your wholesome 100 big le epic reddit content creator aint that wholesome 100 eh?
@Razzy_D9111
@Razzy_D9111 4 ай бұрын
@@murzilkastepanowich5818 take your meds
@JohnHessGA
@JohnHessGA Ай бұрын
Thanks, great video. Any recent update?
@rodolphenemr9064
@rodolphenemr9064 4 ай бұрын
Been waiting for this 🍿
@alexandrecolautoneto7374
@alexandrecolautoneto7374 4 ай бұрын
13:06 totally agree, we just need to develop our own technology. But we see how the US monopolizes all technological aspects, and any real competitor they ban out...
@lambda653
@lambda653 4 ай бұрын
8:42 It can happen and indeed DOES happen on mac and particularly linux machines but the difference is those operating systems have safety mechanisms in place so that mass IT outages like the kind that just occurred can't fail to the point of individually booting every single device into safe mode and deleting a driver file. As you said, there was a kernel panic error on clownstrike's linux distributions, yet it didn't crash the world's infrastructure because the error was handled correctly. So microsoft should be at fault in some part for not providing these error handling systems.
@Formalec
@Formalec 4 ай бұрын
This could be exactly as bad for linux machine if the driver is at ring 0.
@ipadista
@ipadista 4 ай бұрын
@@Formalec the x86 family supports four rings, but for reasons Linux didn't continue the tradition used in VMS and some other contemporary mini computer operating systems, where kernel is ring 0, drivers are ring 1 and shared libraries are in ring 2. Choosing to do the same as NT did, skipping rings 1 & 2 only leaving kernel and user processes. Since essentially nothing uses more than ring 0 & 3 nowadays most new CPU designs only implement 2 rings
@JonBrase
@JonBrase 4 ай бұрын
Linux allows you to specify a kernel command line from the bootloader, and you can blacklist individual drivers in the kernel command line, so recovery would be simpler.
@ipadista
@ipadista 4 ай бұрын
@@JonBrase Same as with BSoDs, you would still need some techie typing in the fix at the Console. On cloud servers, it could be automated, same as with BSoD fixes, but I doubt it could be done on standalone machines
@genehenson8851
@genehenson8851 4 ай бұрын
Mac has not allowed kernel level access since Big Sur.
@diogotrindade444
@diogotrindade444 4 ай бұрын
This is the best breakdown that I saw about this. Even if they do not say it, it screams: Start using better OS like macOS with APIs and avoid boot-start drivers running on kernel mode.
@spookycode
@spookycode 4 ай бұрын
Honestly I would have called it crowdstroke :p
@steevf
@steevf 4 ай бұрын
It's ironic that a bit of software intended to prevent a system from getting taken out ends up taking out the system.
@NoahSpurrier
@NoahSpurrier 4 ай бұрын
The cure was worse than the disease.
@charlesrussell6183
@charlesrussell6183 4 ай бұрын
Bagley and Pound. What a great duo.
@dembro27
@dembro27 4 ай бұрын
Sounds like a law firm.
@charlesrussell6183
@charlesrussell6183 3 ай бұрын
@@dembro27 A get things done firm
@steveftoth
@steveftoth 4 ай бұрын
"Sorry Elon"? Never apologize to that man.
@jonglass458
@jonglass458 4 ай бұрын
This point about the OS being able to roll back a failed patch may not apply in the case, since it seems this was just an updated config file, which exposed an existing logic flaw in the application code.
@ChrisM541
@ChrisM541 4 ай бұрын
If it was only one/more data files then yes, no input sanity checking. I'm not sure if there was any code in amongst it.
@johnhudson9167
@johnhudson9167 4 ай бұрын
Loving how social media is making comp sci lecturers get trendy haircuts and dress properly 😂
@AlanCanon2222
@AlanCanon2222 4 ай бұрын
Never, I say! NEVER! *puts on sandals over socks*
@sensecurities
@sensecurities 24 күн бұрын
00:03 Windows machines experienced widespread blue screens due to an operational error.
01:55 Windows utilizes safety mechanisms like blue screens to protect against critical failures.
03:43 Kernel-level code in Windows can cause serious errors if not managed properly.
05:32 Kernel mode software failures can severely disrupt essential services.
07:25 Microsoft's Windows systems faced critical issues due to a specific bug.
09:04 Mitigating system failures through advanced update mechanisms.
10:56 A genuine mistake led to significant issues, but damage could have been far worse.
12:42 Cloud dependency poses risks for individuals and organizations during outages.
14:24 Exploring advanced image recognition capabilities.
@choleralul
@choleralul 4 ай бұрын
Thanks Lord Targaryen
@8Dbaybled8D
@8Dbaybled8D 4 ай бұрын
Enjoying the fashion coordination to represent affected hosts, even down to the socks!
@IsYitzach
@IsYitzach 4 ай бұрын
12:50 don't apologize to Elon. He deadnames one of his kids. If he can do that, you can deadname his company. The best he's going to get out of me is ex-Twitter.
@spht9ng
@spht9ng 4 ай бұрын
And then uses his child as a culture war pawn publicly. Gross
@mukulnag1578
@mukulnag1578 4 ай бұрын
As someone whos network jumbox had this ... A very bad day for the IT guy in my company
@dahla1973
@dahla1973 4 ай бұрын
This was not very well informed, with a lot of lacking info and some facts clearly missing. Much better videos already out there. That being said, normally a fan ❤️
@LaurentBonnaud
@LaurentBonnaud 4 ай бұрын
On Linux an EDR software can use the eBPF kernel subsystem to probe system activity. And an eBPF program cannot take down the Linux kernel by design.
@tscoffey1
@tscoffey1 4 ай бұрын
Apple has the luxury of being able to force changes to their OS like that because only a minuscule percentage of the world infrastructure relies on it. Microsoft must remain backwards compatible as best they can with their OS upgrades precisely because they aren't a tiny player in this arena.
@shawnmendrek3544
@shawnmendrek3544 3 ай бұрын
This guy is great at explaining.
@TimothyWhiteheadzm
@TimothyWhiteheadzm 4 ай бұрын
"They may have implemented something badly, we don't know". Yes, we do know. It happened, therefore they implemented something badly. This sort of thing is why we have canary deployments, and apparently they have the infrastructure for that, and allow customers to have settings for which computers get updates first in order to validate them, but they also have some updates that simply ignore those settings, and this one one of them. Yes, they 'implemented something badly'.
@alazarbisrat1978
@alazarbisrat1978 4 ай бұрын
it was definition files not the drivers themselves that broke so it's held under less scrutiny
@TimothyWhiteheadzm
@TimothyWhiteheadzm 4 ай бұрын
@@alazarbisrat1978 'Held under less scrutiny' by whom? The reality is that it crashed computers, and this isn't the first time similar updates by Crowdstrike have caused crashes (including on Linux). The fact that they know this is a possibility but failed to implement proper testing before pushing out to everyone means they 'implemented something badly'.
@alazarbisrat1978
@alazarbisrat1978 4 ай бұрын
@@TimothyWhiteheadzm they didn't know that would happen, sorta how this ever got out in the first place. but companies always neglect QA, it's just how it is. and also definition files themselves couldn't do any of this without a huge screw-up so they're not as important to defend, but had they tested it there would be no problem. some programmers just prefer to test after failure tho, just a complete miss
@0LoneTech
@0LoneTech 4 ай бұрын
@@alazarbisrat1978 What makes this remarkable is that the entire purpose of this product and company is to address that QA neglect. They've demonstrated they're among the worst at the one thing they're claiming to do better.
@alazarbisrat1978
@alazarbisrat1978 4 ай бұрын
​@@0LoneTech not really, most companies do that, just that this one was widespread and broke something fundamental. they just got unlucky with their neglect and this slip-up got all the way and broke everything. legend has it that there have been many other issues in their code over time that went totally unnoticed and only now caused catastrophic failure
@feldmanovitch
@feldmanovitch 3 ай бұрын
Really amazing video, like always!
@---ox1lg
@---ox1lg 4 ай бұрын
"There's no problem with Microsoft. There's no problem with Windows."
@shiroyasha_007
@shiroyasha_007 4 ай бұрын
Perhaps 😢
@ChuckleDuck
@ChuckleDuck 4 ай бұрын
lol, lmao even.
@yurisebastiao1872
@yurisebastiao1872 4 ай бұрын
It's actually right .... only those windows machines with Crowd strike software were affected by such zero day attack (self attack actually, more like a buggy one:😂)
@yurisebastiao1872
@yurisebastiao1872 4 ай бұрын
They've created their own zero day attack by not testing pieces of codes in their software update release. 😂
@titaniummechanism3214
@titaniummechanism3214 4 ай бұрын
nothing wrong... ...other than the usual stuff
@ianflint4610
@ianflint4610 4 ай бұрын
The wider issue is that, while Windows acts in a way to mitigate the consequences of a malicious act (which this failed update mimicked), there has seemingly been no thought given to how to manage, contain and recover from such a problem when it is happening at scale on massive numbers of endpoints at a very rapid rate. The rate of 'infection' is far faster than it can be contained. Microsoft's kernel code policy on top of Crowdstrike's error has exacerbated the problem. The impact isn't a theoretical one, it is real, with potentially life-threatening consequences (like the Highways Agency being unable to control Smart motorways when their displays were not reflecting what signs were saying and they couldn't change them - that left people in Refuges being unable to rejoin live motorway lanes). It has exposed many weaknesses.
@dgo4490
@dgo4490 4 ай бұрын
It's been obvious for a while now - MS does NOT DO software testing, nor Crowdstruck evidently. They are delegating the testing straight to the end user. They pushed a bad binary to an "on-the-fly" update, and after the updated binary was first touched, it crashed the system. That's criminal negligence, brought to you by industry's greatest security providers.
@SiljCBcnr
@SiljCBcnr 4 ай бұрын
Thanks for explaining it so well. I love this channel!