x86 Internals for Fun & Profit • Matt Godbolt • GOTO 2014

Views: 32,963

Comments: 47

@carvoloco4229 3 years ago

Great video about a guy giving a presentation about the inner workings of an x86 execution pipeline. Wonderful gestures, I've seen few people point at slides like this presenter. And the way he sometimes moves his head up and down... Marvelous! The topic of the presentation seemed interesting too, but it is difficult to tell without actually seeing the slides :(

@ankushm3t 3 months ago

Slides in description ^

@edgeeffect 7 days ago

@@ankushm3t Yeah... but "Slides in video ^" are so much easier!

@test143000 8 years ago

A very bad video composition: the speaker discusses something on the screen, but instead of the screen with the code we keep watching the speaker for a prolonged time, followed by a short glimpse of the screen.

@me0262 5 years ago

Agreed. It's a good talk, but the editing is horrible. The presenter goes over multiple slides that are crucial to the talk, but these slides are not shown in the video. Focus more on the slides than the presenter.

@LemonChieff 5 years ago

Pretty sure it's automated… I mean it has to be. Nobody is that bad at editing, right? RIGHT?

@mekafinchi 2 years ago

The more I see about things involving x86 and RISC stuff the more interesting it gets, x86 is really hard to decode but the equivalent RISC encoding would be 4-6 times longer, and then x86 can turn around and convert the tiny CISC into VLIW level internal representations. tldr it's just really cool

@Carutsu 9 years ago

Great talk indeed. If I may suggest, please add the slides as a picture in picture view rather than switch to it. A lot of the time the speaker is making a reference that is lost because there's no way to see the slides.

@LemonChieff 5 years ago

It's simple, really: first read the 5000 page manual, then read the "RELATED LITERATURE" and then you're halfway there. Then just read the AMD manuals and realize it actually makes sense now, but you're also 10 years older.

@evilone1351 7 years ago

Excellent talk! Conveying CPU internals in such a clear way is a talent.

@angelsbmartin3431 9 years ago

That was really a great talk, highly recommended. Thanks, Matt.

@PingPong-em5pg 8 years ago

The person responsible for this "montage" should change profession and stay away from any video production related jobs. Where are the slides? Did the guy even watch it? *Sigh*

@KonradTheWizzard 2 years ago

One little correction at the end: while it is true that electrons don't move anywhere near the speed of light (they "drift" centimeters per second or less), the electric signal does move at the speed of light. Specifically the speed of light in the wire (of course for a very loose definition of light: wires are "transparent" to signals, which are electromagnetic, like light) - this is about 75-90% of c (speed of light in vacuum), depending on the specific material. One note for the incurably curious: when he says "very few electrons" in a RAM cell - for modern RAM or Flash cells that is in the region of a few hundred to one thousand electrons - it used to be a few tens of thousands in the '90s. That is femtofarads in storage capacity. In comparison, for an LED to briefly light up, millions to billions of electrons have to move.

@johnmccrary2959 9 years ago

Skip to 10:56 if you already know Intel assembly.

@eXcalibooor 7 years ago

The slides aren't visible half of the time :/

@frutiboy1 9 years ago

Great talk. Needs to be watched several times to grasp the vast information in there. I made it all the first time:)

@dascandy 8 years ago

wrt instruction decoding: you start by deciding in parallel, for each byte location X, how long the instruction would be if one started there, N bytes at a time. The cycle after, you do this for the next N bytes (and so on). In the second cycle, you sequentially add up the lengths to find the boundaries of the valid instructions and ship those instructions off to the actual decoder, still with implied links between them wrt dependencies. The following step then outputs a bitfield of the register changes each instruction makes, so that an instruction with an actual dependency on a previous one can be delayed. I'm more impressed that they made this kind of stuff work with self-modifying code, which is like throwing a giant wrench into this entire machine and making it do one instruction per full pipeline flush.
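
The two-step length decoding described in the comment above can be sketched roughly as follows. This is only a toy model: the encoding rule (low two bits of the first byte, plus one, gives the length) is an assumption made up for illustration and is not how x86 encodes lengths, and the names speculative_lengths and mark_boundaries are hypothetical.

```python
def speculative_lengths(window):
    """Step 1 (parallel in hardware): for EVERY byte offset, compute how long
    an instruction would be if one started there.
    Toy rule (an assumption, not x86): length = (first byte & 0b11) + 1."""
    return [(b & 0x3) + 1 for b in window]


def mark_boundaries(window, start=0):
    """Step 2 (sequential): walk from the known start, adding up lengths,
    to find which offsets really are instruction starts.
    Returns (list of start offsets, offset of the first instruction that
    does not fit completely in this window)."""
    lengths = speculative_lengths(window)
    starts = []
    pos = start
    while pos < len(window) and pos + lengths[pos] <= len(window):
        starts.append(pos)
        pos += lengths[pos]
    return starts, pos


# An 8-byte fetch window of made-up bytes.
window = bytes([0x01, 0xAA, 0x00, 0x02, 0xBB, 0xCC, 0x03, 0x11])
print(speculative_lengths(window))  # [2, 3, 1, 3, 4, 1, 4, 2]
print(mark_boundaries(window))      # ([0, 2, 3], 6)
```

Most of the speculative lengths are thrown away; only the ones at offsets 0, 2 and 3 turn out to matter, and the instruction starting at offset 6 spills into the next window.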

@JohnSmith-he5xg 8 years ago

Great discussion of the tradeoffs of RAM vs cache @ ~48 mins

@andrewmartin3671 10 years ago

Nice talk (though I've not seen all of it yet). Branch prediction first came up for me in this relatively well-known Stack Overflow question: "Why is processing a sorted array faster than an unsorted array?"

@MattGodbolt 10 years ago

Yup - I cover that and indeed demonstrate that even in a language like Python you can see similar effects to that SO question.
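
For anyone wanting to try the sorted-versus-unsorted experiment in Python, a minimal sketch might look like the one below. It is not the code from the talk; count_above and compare are hypothetical names, the sizes are arbitrary, and how big the difference is (if it shows up at all) depends on the interpreter version and the CPU.

```python
import random
import timeit


def count_above(data, threshold=128):
    """Count values >= threshold. The 'if' is the data-dependent branch:
    on sorted input it goes the same way for long runs."""
    count = 0
    for value in data:
        if value >= threshold:
            count += 1
    return count


def compare(n=100_000, repeats=5, seed=0):
    """Time the same loop over the same values, unsorted and then sorted.
    Returns (seconds_unsorted, seconds_sorted)."""
    rng = random.Random(seed)
    unsorted_data = [rng.randrange(256) for _ in range(n)]
    sorted_data = sorted(unsorted_data)
    t_unsorted = timeit.timeit(lambda: count_above(unsorted_data), number=repeats)
    t_sorted = timeit.timeit(lambda: count_above(sorted_data), number=repeats)
    return t_unsorted, t_sorted


if __name__ == "__main__":
    t_unsorted, t_sorted = compare()
    print(f"unsorted: {t_unsorted:.3f}s  sorted: {t_sorted:.3f}s")
```

Both runs do exactly the same amount of work and produce the same count, so any timing gap comes from how the hardware handles the access and branch pattern rather than from the algorithm.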

@walter0bz 8 years ago

how the hell did that ISA end up being our standard.

@ChristofferViken 8 years ago

+walter0bz Bit by bit

@walter0bz 8 years ago

Apple managed to change ISA... twice. Seemingly paradoxical that they ditched 68000 (the most elegant CISC) for PowerPC (a decent RISC), but simply by virtue of the world's momentum they ended up having to go to an inelegant CISC (x86).

@ChristofferViken 8 years ago

walter0bz That happens to provide the best performance per watt of power consumption. Only a few RISCs exceed x86 in that respect and the few that do are severely capped.

@ChristofferViken 8 years ago

And before you say it: Yes, it has everything to do with the amount of money they can put into the research for making it more efficient, and it will stay that way for a long time. Because they are the ones with the R&D money to make it stay true.

@eformance 6 years ago

Huh, at 31:04 it sounded like you said "most instruction sets have a decrement and jump if not zero, but x86 doesn't". I'm confused because that's exactly what loop does, it decrements CX and jumps if not zero.

@biggiefrosty 5 years ago

He meant a single instruction that does the compound operation "decrement and jump if not zero". That's two instructions in x86, but they really function as a unit that can and should be scheduled together. So internally, the chip fuses those two operations into a compound operation, even though it's still coded in assembly as two separate instructions.

@aquaz_eu 8 years ago

Link for slides : gotocon.com/dl/goto-chicago-2014/slides/MattGodbolt_X86InternalsForFunAndProfit.pdf

@ankushm3t 3 months ago

Thanks!

@mumblic 8 years ago

Could be a great video but, like mentioned before, the edit is terrible. Stopped watching after 5 minutes. The link for the slides doesn't work!!

@SomeAdam 6 years ago

gotocon.com/dl/goto-chicago-2014/slides/MattGodbolt_X86InternalsForFunAndProfit.pdf

@DaveWhoa 6 years ago

You mentioned the fetcher reads 16 bytes... that would make sense to me even though the max instruction size is 15, but if you put a valid 15-byte instruction on the last 15 bytes of an executable page (the next page is PAGE_NOACCESS or non-existent), it executes normally, i.e. without causing a page fault, whereas if your process tried to read the 16 bytes it would trigger a page fault from the 16th byte being on the invalid page. So maybe it just reads 1 byte at a time and determines if it needs to read more? Or maybe it still tries to read 16 bytes but simply doesn't trigger faults during fetch.

@paranoiatable 5 years ago

Much of the content in this talk is difficult to understand simply because the viewer can't see the diagrams/code or schematics that Mr. Godbolt is referencing.

@rocqua 7 years ago

Interesting to see how few steps it actually takes to go from the information in this talk to Spectre. And yet, those steps took 3.5 years. It's obvious in hindsight, which goes to show how big the difference between 'obvious' and 'obvious in hindsight' is.

@AlexanderBollbach 8 years ago

I'm way out of my element here, but couldn't these chips expose some API / control mechanism for informing them about branching patterns in the compiled code, which back-end compiler people could target? This only comes to mind because the task of looking for patterns in branch behavior seems like something naturally in the domain of software and not hardware.

@DaveWhoa 6 years ago

Good compilers will structure code in such a way that the most likely or most common tests/branches are done first. There's no real need for CPU-specific knowledge to do that.

@jubbernaut 10 years ago

I will be watching this video one minute at a time, before having a lie down, in order to avoid nosebleeds. I'm two minutes in and it seems good so far - although the bloke doing the talk seems a bit shifty

@Ljosi 9 years ago

+robin jubber If you are having such a hard time grasping the concepts in this video, that means you are moving ahead too fast. Go back to the basics and move up slowly, that's what I did. Now I understand 90% of everything he explained here.

@jubbernaut 9 years ago

Heh - thanks Ljosi. I used to work with Matt - I was mostly just taking the piss :-) But you give good advice nonetheless!

@MrWogle1 6 years ago

Am I crazy, or did this guy almost stumble on Spectre?

@AxelStrem 7 years ago

2^32 bits is not really 4 GB... sry couldn't hold it in :)
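
The arithmetic behind this nitpick, as a quick sketch (binary units assumed, i.e. 1 MiB = 2^20 bytes and 1 GiB = 2^30 bytes):

```python
MIB = 2**20  # bytes in one mebibyte
GIB = 2**30  # bytes in one gibibyte

bits = 2**32
bytes_from_bits = bits // 8      # 2**29 bytes

print(bytes_from_bits // MIB)    # 512  -> 2^32 bits is 512 MiB
print(2**32 // GIB)              # 4    -> 2^32 bytes is 4 GiB
```

So a 32-bit address space is 4 GiB because each address names a byte, not a bit.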

@featel1 9 years ago

I don't understand CPU architecture that well and my head is getting hit by terms I don't understand...

@greywolf271 6 years ago

You need intel inside...........

@desmondwilson3390 7 years ago

Chris Domas - Sandsifter: github.com/xoreaxeaxeax/sandsifter

sandsifter: the x86 processor fuzzer

Overview: The sandsifter audits x86 processors for hidden instructions and hardware bugs, by systematically generating machine code to search through a processor's instruction set, and monitoring execution for anomalies. Sandsifter has uncovered secret processor instructions from every major vendor; ubiquitous software bugs in disassemblers, assemblers, and emulators; flaws in enterprise hypervisors; and both benign and security-critical hardware bugs in x86 chips. With the multitude of x86 processors in existence, the goal of the tool is to enable users to check their own systems for hidden instructions and bugs.