How to Build a Virtual Machine

96,732 views

InfoQ


Comments: 58
@philipp7732 9 years ago
Awesome video, had some initial problems, but after debugging I found the issue and now I am a proud creator of a VM!
@unlockwithjsr 5 years ago
Wow, how? Did you open-source it, or did you build it on your own?
@chrisparker9672 4 years ago
@unlockwithjsr I mean, he literally builds one during the video.
@ryanlunger8215 10 years ago
I watched this video mainly because I admire Mr. Parr for his work. I enjoy the particulars of language design and implementation, so I knew starting out that this was going to cover pretty basic stuff. But I really appreciate and enjoy presentations given in this very informal way. My first C++ class was very much like this. There was a handful of us in the class, and with our instructor's involvement we would democratically discuss and design the solution to some particular assignment. A great deal of the more advanced stuff really sank in during those times.
@mmille10 4 years ago
I started trying to read the (last) implementation chapter of "Smalltalk-80: The Language and its Implementation," and got lost rather quickly. I got here after reading Chap. 7 of "Squeak: Open Personal Computing and Multimedia," where it talked about the VM implementation. Now, some basics are coming together in my head. Great talk!
@xN811x 8 years ago
Immediately ordered his book
@HistoriasCuentosLecciones 8 years ago
Title of the book?
@xN811x 8 years ago
Ruben Rivero: "Language Implementation Patterns"
@HistoriasCuentosLecciones 8 years ago
Thank you!
@avwie132 5 years ago
When you close your eyes it sounds like Tom Hanks is talking
@snorman1911 8 months ago
Wilsonnnnn!
@ivand8393 5 years ago
It is worth mentioning that the first one was probably the FORTH virtual machine.
@daschewie 5 years ago
The first VM was O-code for BCPL in 1966 followed by Forth and Pascal in 1970.
@solaxun 2 years ago
Great video, helped clear up many of the questions I had about how bytecode VMs work (along with his book). One thing I do not understand, though, is how compiled function code shares the same array as the rest of the code (see 1:23:05, discussing function calls). I can see how jumping to the address of a function's code works during the interpretation of the "call" instruction, but how do we avoid running into sections of compiled function code while just normally stepping through the code array (as we increment the instruction pointer)? Is a portion of the code array effectively blocked off for function use exclusively, and the compiler handles this separation during compilation?
@leandroaraujo4201 2 years ago
You could certainly step into other code if you do a bad jump, but in the case of a function this is handled by the RET instruction. Ret cleans up the stack, sets the return value, and jumps back to the caller address. Now if a function doesn't call RET, that's likely to happen, since you don't jump back.
@solaxun 2 years ago
@leandroaraujo4201 That's related to, but not quite the same as, the question I'm asking. It's tough to ask this one online, but I'll try. Imagine you have some pseudocode like this:
```
x = 10
define somefunc(x,y){dostuff}
y = 9
...more code ...
```
Since in the example VM all bytecode gets compiled into the same array, wouldn't you end up with compiled function bytecode in the middle of those variable assignments, which you would then "run into" while incrementing the program counter? For example, maybe the above code compiles to something like:
ICONST 10
GSTORE xaddr
**compiled function code**
ICONST 9
GSTORE yaddr
You start with the PC at "ICONST 10" and then immediately after running that, you increment the PC and step into the compiled function code, even though it hasn't been called, so there would have been no jump to that location. I could see this working if function code compiled to its own "region" of the array (like before or after everything else), but that part wasn't covered in the video.
@leandroaraujo4201 2 years ago
@solaxun Oh, that. Yes, that would happen, and I guess the only way to avoid it is structuring the code and setting the program counter correctly. See how, in the factorial example, the main function is last and the program counter is set to point to it. As far as I know, the same thing happens with assembly (with some minor changes).
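The layout described in this thread can be sketched in a few lines of Java. This is a minimal sketch, not the code from the talk: the opcode numbering, the `LayoutDemo` class, the `MAIN` constant, and the separate return-address array are all assumptions made for illustration. The function body sits at the start of the code array, main comes last, and execution starts at the address of main, so the function body is only ever reached through `CALL`.

```java
final class LayoutDemo {
    // Hypothetical opcode numbering; only the names echo the thread.
    static final int ICONST = 1, IADD = 2, GSTORE = 3, CALL = 4, RET = 5, HALT = 6;

    // Address of the first instruction of main (assumed to be known to the compiler).
    static final int MAIN = 4;

    static final int[] CODE = {
        // addresses 0..3: a function that adds 10 to the value on top of the stack
        ICONST, 10,
        IADD,
        RET,
        // addresses 4..14: main program
        ICONST, 10, GSTORE, 0,   // x = 10
        ICONST, 9,
        CALL, 0,                 // call the function at address 0
        GSTORE, 1,               // y = 9 + 10
        HALT
    };

    // Runs the code starting at startIp and returns the two global slots.
    static int[] run(int[] code, int startIp) {
        int[] globals = new int[2];
        int[] stack = new int[16];
        int[] returns = new int[16]; // simplification: return addresses kept apart
        int sp = -1, rsp = -1, ip = startIp;
        while (true) {
            int op = code[ip++];
            switch (op) {
                case ICONST: stack[++sp] = code[ip++]; break;
                case IADD: { int b = stack[sp--]; int a = stack[sp--]; stack[++sp] = a + b; break; }
                case GSTORE: globals[code[ip++]] = stack[sp--]; break;
                case CALL: { int target = code[ip++]; returns[++rsp] = ip; ip = target; break; }
                case RET: ip = returns[rsp--]; break;
                case HALT: return globals;
                default: throw new IllegalStateException("bad opcode " + op + " at " + (ip - 1));
            }
        }
    }

    public static void main(String[] args) {
        int[] globals = run(CODE, MAIN);
        System.out.println("x=" + globals[0] + " y=" + globals[1]);
    }
}
```

Starting at address 0 instead of `MAIN` would step straight into the function body without a call, which is exactly the problem raised above. A compiler could equally emit a branch over each function body; placing main last is just the option mentioned in the reply.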
@ThaerRazeq 9 years ago
The fastest way of doing this, other than generating native code manually, is to generate C code, compile it with LLVM, and run it at runtime. This is similar to how UnrealScript or Valve's Half-Life SDK do it. But I think iOS and Mac don't allow this for security reasons, due to the potential for malicious modifications.
@SimGunther 2 years ago
To get around that Mac walled-garden restriction, those generated operations must be built into the program and made indexable via a table. The technique has been formalized as "building a weird machine", and QEMU makes good use of this tech. Just ask Kate Temkin, as she did a !!Con talk on this sort of thing in 2021.
@petercheung63 7 years ago
It would be better to have English subtitles on this wonderful video for non-native English speakers.
@Speaks4itself 2 years ago
Use the auto-generated captions
@guilhermesaraiva3846 10 months ago
Is there any book on the subject this guy is talking about, building a VM step by step? I did not find one.
@kennethcarvalho3684 2 years ago
How can one get the Java source shown in this presentation?
@danielsmith5626 3 years ago
14:24 I'm still in shock that universities exist that don't introduce assembly language.
@kovertopz 10 years ago
I've written a tree interpreter for a small DSL. Traversing the tree isn't that bad. Creating the tree is the difficult part for me, since you have to create one that represents the order of operations. My grammar was very restricted and I should have gone with something like a Pratt parser.
@bloody_albatross 10 years ago
A tree-based interpreter has one application: when you have a throw-away expression that is only executed once, e.g. an input field where a user can enter a value via an expression that gets immediately evaluated and replaced with its value. (Blender 3D has these, IIRC. But I think it simply uses Python for that.) Also, I don't find tree-based interpreters harder than others, but then I haven't built complex interpreters. Only very basic stuff (simple mathematical functions with only one data type (number)).
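For the two points in this thread (building a tree that respects the order of operations, then walking it once to get a value), here is a minimal sketch in Java. It uses precedence climbing, a close relative of Pratt parsing, rather than a full Pratt parser. The `ExprSketch` class, its `Node` type, and the supported operators (`+ - * /` and parentheses over non-negative numbers) are all assumptions chosen for illustration.

```java
final class ExprSketch {
    static final char NUM = '\0'; // marks a number leaf

    // Tree node: a number leaf, or a binary operator with two children.
    static final class Node {
        final char op;
        final double value;
        final Node left, right;
        Node(double value) { this.op = NUM; this.value = value; this.left = null; this.right = null; }
        Node(char op, Node left, Node right) { this.op = op; this.value = 0; this.left = left; this.right = right; }
    }

    private final String src;
    private int pos = 0;

    ExprSketch(String text) { this.src = text.replace(" ", ""); }

    // Binding strength of each binary operator; 0 means "not an operator".
    static int precedence(char c) {
        switch (c) {
            case '+': case '-': return 1;
            case '*': case '/': return 2;
            default: return 0;
        }
    }

    // Parses operators whose precedence is at least minPrec.
    Node parseExpr(int minPrec) {
        Node left = parsePrimary();
        while (pos < src.length()) {
            char op = src.charAt(pos);
            int prec = precedence(op);
            if (prec == 0 || prec < minPrec) break;
            pos++;
            Node right = parseExpr(prec + 1); // prec + 1 gives left associativity
            left = new Node(op, left, right);
        }
        return left;
    }

    // Parses a number or a parenthesized sub-expression.
    Node parsePrimary() {
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++;
            Node inner = parseExpr(1);
            if (pos >= src.length() || src.charAt(pos) != ')')
                throw new IllegalArgumentException("expected ')' at " + pos);
            pos++;
            return inner;
        }
        int start = pos;
        while (pos < src.length() && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
        if (start == pos) throw new IllegalArgumentException("expected number at " + pos);
        return new Node(Double.parseDouble(src.substring(start, pos)));
    }

    // The tree-walking interpreter: one recursive visit per node.
    static double eval(Node n) {
        switch (n.op) {
            case NUM: return n.value;
            case '+': return eval(n.left) + eval(n.right);
            case '-': return eval(n.left) - eval(n.right);
            case '*': return eval(n.left) * eval(n.right);
            case '/': return eval(n.left) / eval(n.right);
            default: throw new IllegalStateException("unknown operator " + n.op);
        }
    }

    static double evaluate(String text) {
        ExprSketch p = new ExprSketch(text);
        Node tree = p.parseExpr(1);
        if (p.pos != p.src.length()) throw new IllegalArgumentException("unexpected input at " + p.pos);
        return eval(tree);
    }
}
```

Calling `ExprSketch.evaluate("1 + 2 * 3")` builds a tree with the multiplication below the addition, so precedence lives in the shape of the tree and `eval` needs no knowledge of it.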
@falc0knights496 9 years ago
fuck outta here, jk, you should write a book bruh, many books, and I will sql force my way to the domain of the book and DDOSS and analyze the book using sharkwire
@HumanBeingSpawn 7 years ago
adrian venegas Wireshark? lol
@BryanChance 3 years ago
How does the CPU do anything in the first place? That's what I want to know. Too low level?
@OttoFazzl 9 months ago
Check out the free course nand2tetris (part 1). In it, you build a CPU from scratch (only from NAND blocks), then you build a compiler, a bytecode VM similar to the one explained here (even more complex), then a high-level language and an OS for the computer. After that course, there will be no more "magic" in how the CPU works and how it's all connected.
@JackPurdon 3 years ago
Awesome presentation
@jeanclaudescandale 5 years ago
Hi, why not use a data stack and a return stack? Each instruction besides push could pop its arguments from the stack only, and the compiler could use macros to avoid the hassle of too many pushes and pops. For example, jmpz could be inlined as: push addr, push 0, sub, jmpz. What do you think?
@jeanclaudescandale 5 years ago
Or call (fun (arg1 arg2)): push arg2, push arg1, push 2 (or even nothing), push addr on top of ds and ip+1 onto the return stack, call. And the VM could use stack pointers instead of a linked list to manage the stacks.
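The two-stack idea proposed here can be sketched as follows. This is a sketch under stated assumptions, not anything shown in the talk: the `TwoStackVm` class, the opcode numbering, and the operand order for `JMPZ` (condition on top, target address beneath it, following the "push addr ... jmpz" order in the comment) are all hypothetical. Only `PUSH` carries an inline operand; `JMPZ` and `CALL` pop their targets from the data stack, return addresses live on a separate return stack, and both stacks are plain arrays indexed by a stack pointer.

```java
import java.util.Arrays;

final class TwoStackVm {
    // Hypothetical opcode numbering.
    static final int PUSH = 1, ADD = 2, SUB = 3, JMPZ = 4, CALL = 5, RET = 6, HALT = 7;

    static final int MAIN = 4;

    static final int[] CODE = {
        // addresses 0..3: function that adds 1 to the top of the data stack
        PUSH, 1, ADD, RET,
        // addresses 4..19: main
        PUSH, 14,        // jump target for JMPZ
        PUSH, 5,
        PUSH, 5,
        SUB,             // 5 - 5 = 0, the condition
        JMPZ,            // pops condition, then target; jumps because condition is 0
        PUSH, 99,        // skipped
        PUSH, 41,        // address 14: the argument
        PUSH, 0,         // address of the function
        CALL,            // pops the target, saves the return address on the return stack
        HALT
    };

    // Runs the code and returns whatever is left on the data stack.
    static int[] run(int[] code, int startIp) {
        int[] data = new int[32];
        int[] ret = new int[32];
        int dsp = -1, rsp = -1, ip = startIp;
        while (true) {
            int op = code[ip++];
            switch (op) {
                case PUSH: data[++dsp] = code[ip++]; break;
                case ADD: { int b = data[dsp--]; data[dsp] += b; break; }
                case SUB: { int b = data[dsp--]; data[dsp] -= b; break; }
                case JMPZ: { int cond = data[dsp--]; int target = data[dsp--]; if (cond == 0) ip = target; break; }
                case CALL: { int target = data[dsp--]; ret[++rsp] = ip; ip = target; break; }
                case RET: ip = ret[rsp--]; break;
                case HALT: return Arrays.copyOf(data, dsp + 1);
                default: throw new IllegalStateException("bad opcode " + op + " at " + (ip - 1));
            }
        }
    }
}
```

One trade-off of this design is that every branch and call costs an extra `PUSH`, which is where the compiler macros mentioned in the comment would come in.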
@peterfireflylund 7 years ago
Is it just me or is the 6502 memcpy code he shows in the beginning buggy? I think it decrements Y (CNT.L) outside the inner loop when it shouldn't. I also think it is wrong to increment SRC.H and DST.H after the inner loop because it will only have copied 256 bytes if CNT.L started out as 0.
@karlmin8471 7 years ago
He looks like Harrison Wells in The Flash, even his modal.
@carnelyve866 10 years ago
bookmarked and subscribed....
@xinyuliu7346 8 years ago
this guy is great.
@Kitulous 6 years ago
Xinyu Liu he is an Apple fanboy that doesn't understand that phones are just a tool, not the way to show off or something.
@HumanBeingSpawn 8 years ago
Name of book please
@ihnwtpu 9 years ago
I felt so smart when I was able to answer his question before he said the correct answer :D
@jovaha 8 years ago
In most of the instructions he uses the value stack[sp] and then decrements sp, or increments sp and then sets stack[sp], which would correspond to push/pop. But in the LOAD/STORE instructions he uses and sets the value stack[fp+offset]. To me that seems like something you couldn't do with a stack. I would like to implement this with an actual stack-like data structure. Is there any way of doing LOAD/STORE instructions with only pop/push?
@OttoFazzl 9 months ago
My understanding, based on this lecture and another course I took on this topic, is that the implementation of a stack virtual machine doesn't itself have to be stack-based. It purely depends on what language you are using to implement the machine. So there is nothing wrong with accessing the stack at an offset while implementing LOAD/STORE. He also used offsets when implementing CALL, for example. A stack virtual machine is a stack-based abstraction, but the implementation doesn't have to be.
@OttoFazzl 9 months ago
Even if you were implementing the stack VM in pure assembly, you can have offset-based memory access in assembly by calculating offset memory address and storing it in a register.
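To make the point in this thread concrete, here is a minimal sketch of a stack that supports both push/pop and frame-relative access. The `FrameStack` class, the `enterFrame` helper, and the convention that locals start at `fp` with non-negative offsets are assumptions for illustration; the talk's VM may lay out its frames differently.

```java
final class FrameStack {
    private final int[] slots = new int[64];
    private int sp = -1; // index of the top element
    private int fp = 0;  // base index of the current frame (hypothetical convention)

    void push(int v) { slots[++sp] = v; }

    int pop() { return slots[sp--]; }

    // Starts a new frame on top of the stack and reserves zeroed slots for its locals.
    void enterFrame(int nlocals) {
        fp = sp + 1;
        for (int i = 0; i < nlocals; i++) slots[fp + i] = 0;
        sp += nlocals;
    }

    // LOAD: copies the local at fp + offset onto the top of the stack.
    void load(int offset) { push(slots[fp + offset]); }

    // STORE: pops the top of the stack into the local at fp + offset.
    void store(int offset) { slots[fp + offset] = pop(); }
}
```

A container that exposes only push and pop cannot reach `fp + offset` without popping everything above it into temporary storage, which is why an array with two indexes is the usual backing store.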
@alessandromeyer4888 10 years ago
Really great! If u'd have used scala u'd have been twice as fast writing the code. :-)
@LarsHarmsen1337 10 years ago
When he was spitting on Android users I wanted to stop the video. And then, instead of naming the register "program counter", he gave it some funny name, because he's an Apple fanboy. Say what? But I kept watching until the end. And I guess it was worth it. I wasn't aware you can label things and break out of those specific parts in Java. A little criticism: I couldn't understand the questions from the audience.
@HansUhlig 10 years ago
Are you referring to the instruction pointer? The names are synonymous. See en.wikipedia.org/wiki/Program_counter
@bloody_albatross 10 years ago
Indeed. I learned both names in a lecture held by someone that is definitely not an Apple fanboy (Linux user).
@asuasuasu 6 years ago
x86 uses the instruction pointer name. the x86-64 register is named r**ip**.
@asuasuasu 6 years ago
oh i'm 3 years late ok
@HumanBeingSpawn 8 years ago
Did you know Microsoft uses the *STDCALL* convention in their Win32 API? lol It seems like you despise everything associated with Microsoft, or anything not associated with Linux.
@FacebookIL 9 years ago
the stack is growing downwards and not upwards.
@gonkula 9 years ago
+NoPTic S (Dersus) That's actually architecture dependent (and true for x86/x86_64, amongst others), but there are architectures where it's not true; HP's PA-RISC springs to mind.
@FacebookIL 9 years ago
I know, but most of them grow downwards. Anyway, nice to see some people who do understand some ASM!
@kilswitchengaged 9 years ago
+Alexandru Pană It's not irrelevant if you are working at the OS level generating machine code. You need to be aware of how much memory you have for the stack and whether you are overwriting something else.