Now imagine developing Roller Coaster Tycoon in assembly... insane
@homelessrobot11 ай бұрын
not that this diminishes the achievement too much, but it wasn't just naked assembly, he had macros. Once you get a foundation of good primitives you can build up essentially arbitrarily high-level abstractions with a good macro assembler. And if you are working alone there's nobody to chastise you for your program being almost entirely macros.
@promethuser11 ай бұрын
@@homelessrobot I mean that's true with any language. If you wanted him to remove the macros then well he's just duplicating stuff, which is pointless.
@BboyKeny11 ай бұрын
@@homelessrobot I heard that most self-rolled abstractions over assembly look like C. Now I imagine that the Rollercoaster Tycoon source code looks much like C.
@homelessrobot11 ай бұрын
It's really not true for any language, certainly not in the way that most people use most of them. Most languages that have macro systems intentionally hobble them so that every other program isn't a new programming language. Either that or the best practices for their usage strongly frown on getting carried away. No such restrictions or culture existed for macro assemblers, certainly not in 1994. Generally they are as potent and untethered as the implementor's imagination was capable of fathoming. If you are still dubious, you should go look at the other fasm assembler, fasmg. It has an algebra solver built in.
@promethuser11 ай бұрын
@@homelessrobot I thought by macros you meant just easing code duplication. If they are anything beyond that (like doing math not written by you) then yeah I get your point. But does tycoon use that? If he built the macros himself then it shouldn't take away from the achievement.
@bradmccoy174711 ай бұрын
It's so ironic that you went through the same stages of grief over the shufps instruction that I experienced when trying to figure out the pshufb instruction. In the end I felt like the instruction was simple too, and I was also mad at the documentation for making things more complex than necessary.
@Majkelo87911 ай бұрын
It is kinda funny how low-level people constantly invent super complicated instruction sets to make everything as fast as possible, so that high-level people (where everything is so much easier to optimize) can not give a shit about performance at all and do something like React xD Great session as always.
@antonioaugusto824411 ай бұрын
Lovely, your video themes always make me curious to learn more about themes I never considered before
@Narblo11 ай бұрын
"Subject" is a better word than "themes". I know that in our language "themes" is used for this too, but it doesn't mean the same thing in English. I hope this helps.
@RohitKumar-ku2lq11 ай бұрын
Your streams are like lofi songs. Just open one up and relax.❤
@mrcrafter_y11 ай бұрын
and... fall asleep
@beeverfeever493011 ай бұрын
This is exactly what I do, with the benefit of learning things constantly.
@theninedivides68518 ай бұрын
You still absorb information while sleeping
@BramBolder11 ай бұрын
There is a reason why they encode true as -1(integer) and not 1: You are supposed to use AND, OR, and ANDNOT with these masks and the floats instead of multiplication. So instead of p'=[C]*a+[^C]*b you use p'=( a AND C ) OR ( b ANDNOT C). No conversions between floats and integers and no multiplication by -1.0, etc. NOTE: ANDNOT destroys the mask!! Nice feature for the velocities is: v' = ( C
@Jowy98911 ай бұрын
Fascinating experiment. It shows so nicely how close C is to optimized assembly, as long as you don't introduce floats. Btw, if you are interested: in German we call ß "scharfes S", which refers to it being pronounced more sharply / harshly. Also, it is contained in one of your favorite words, "Scheiße", which is kinda pronounced like "scheisze", so it starts soft and gets sharp / harsh towards the end.
@SuperCacazinho11 ай бұрын
Bro, every time I see the titles of your videos, my eyes go wide lol, completely insane
@gustavohqueiroz11 ай бұрын
The guy really is good
@SuperCacazinho11 ай бұрын
@@gustavohqueiroz "gajo"? Please, speak Portuguese properly.
@gabrielfrigo279210 ай бұрын
Right, bro. This guy is insane. It's great to watch these videos, even when they run 4 hours.
@Salehalanazi-79 ай бұрын
I freaking love how you demand that your viewers take responsibility for their suggestions. What a cool channel I just found. Subbed
@sarasaworks6 ай бұрын
Here's a fun hack for next time: if you assemble using GCC rather than a dedicated assembler, you don't need to import external functions every time you want to use one. The difference is that GCC requires the _start label to be main instead. Happy Tsoding!
@justfly19842 ай бұрын
Only today I saw a video from Uncle Bob about people not writing assembly by hand anymore, and now I'm watching you write assembly by hand. Now do the same in the ARM instruction set!
@not_herobrine375211 ай бұрын
reminds me of someone writing an article about doing this with masm (only) with directx11, its great to know that there is someone on this planet sane enough to do this on linux as well
@LOL-cp6js11 ай бұрын
bro with his stream is equivalent to what I studied in two years at my university💀💀💔🤣
@justfly19842 ай бұрын
Holy cow, people say you can't learn anything in two weeks, but here I've learned so much in 3 hours at 1.5x speed. You can't imagine how good the timing is!
@dnkreative11 ай бұрын
Most SSE instruction mnemonics contain letters that encode the data type they work with: A/U for aligned/unaligned memory access, Sx/Px for scalar/packed, and S/D/Q/W/B for single/double/quad word/word/byte, etc. Integer opcodes mostly start with a P prefix (paddb, pavgb, etc.); float/double ones have no prefix and carry the type in the suffix instead (ps, pd, ss, sd).
@MyManJohnny11 ай бұрын
I could understand the official description of the SHUFPS instruction on my second read, but can't for the love of god wrap my head around the explanation you gave in your own words. It always surprises me how everyone's brain is wired differently.
@paco344711 ай бұрын
Nice! Old school vibes. I was on 68K ASM back in the late 80s until C compilers began to improve.
@learnwithabdulbari11 ай бұрын
What the fuck bro just created a window in 15 minutes from ASM. Too excited for this entire stream
@icarvs_vivit11 ай бұрын
Always great to see someone learn SIMD assembly, but frankly I'm surprised your laptop doesn't have avx. Some older, cheaper laptops forgo avx2 because of their use as primarily a low-power web browsing machine, but you might have regular avx"1" extensions, such as vpbroadcast in xword sizes. Anyway, you can check the feature extensions your processor has by using the cpuid utility, which you can find on the AUR, and invoking it with "cpuid -1 -l 1". Fun fact: the instruction "cpuid" does basically the same thing, and you can use it to programmatically detect the feature set of any modern x86 cpu at runtime and then dynamically choose to run only the code compiled for features the cpu actually has.
@anon_y_mousse11 ай бұрын
Funner fact, you can read all of that information at once from /proc/cpuinfo.
@TheRover10135 ай бұрын
Pretty sure his old laptop is ivybridge i5 mobile of some kind. AVX2 wasn't a thing until Haswell, just after that *shrug*
@Memepolicedoggo10 ай бұрын
Someone else has probably mentioned it but the "d" at the end of register names stands for "double word" and the "w" for "word" where a word is 2 bytes/16 bits. The "e" at the beginning stands for "extended" and the "r" probably stands for "re-extended" or some other bs because programmers can't name things
@wiktorwektor12311 ай бұрын
If you want to zero a register, xoring it with itself is cheaper than moving zero into it: the xor encoding is shorter than the mov one, and modern x86 CPUs recognize it as a zeroing idiom.
@iamdozerq11 ай бұрын
His laptop can do trillions of operations of that kind per second. Why even try to optimize asm?
@wiktorwektor12311 ай бұрын
@@iamdozerq Because you're an idiot and don't comprehend how many times CPU needs ZERO per nano second.
@not_herobrine375211 ай бұрын
@@iamdozerq high level language programs can outperform poorly made assembly programs
@namename898610 ай бұрын
@@iamdozerq easy to find by optimiser
@drdca826310 ай бұрын
@@iamdozerq”If you keep track of the pennies, the dollars will take care of themselves”
@kamhawy11 ай бұрын
Awesome, educational & fun as usual. Thanks for the great content man.
@dromedda681011 ай бұрын
i watched this without breaks and my brain has been working at full capacity to keep up with everything thats happening. best worst feeling.. im off to bed wish i didnt miss the streams, but life got in the way
@DatBoi_TheGudBIAS10 ай бұрын
If I remember correctly, the 8-bit mask in shufps works like this: 00 00 00 00. The first (lowest) 2 bits set the position in the source that will be copied to the 1st position (index 0) of the destination. The next 2 bits do the same for the 2nd position, the next 2 bits for the 3rd, and the next 2 bits for the 4th. So with a mask of 0, you're saying you want all 4 positions to get the 1st element, so you're just copying the 1st element to all positions of the xmm register.
@Mozartenhimer11 ай бұрын
Amazing how stating "the imm8 argument is treated as an array of 2-bit unsigned integers" would have cleared everything up.
@Rose-ec6he10 ай бұрын
If I recall correctly, you can pass the object file to a C compiler with a flag to not link C's `main` and to link the standard library. Then it outputs an executable that uses _start as its entrypoint but also adds the standard library and other symbols that C infrastructure expects to be present in a binary.
@tomasprerovsky301711 ай бұрын
I am hooked, that's coding at its very best.
@danielinacio156711 ай бұрын
I'm more comfortable with nasm syntax, but I think there are some interesting aspects in fasm, like assembler and linker information directly in the file.
@monad4210 ай бұрын
The address 38 in test.o can be explained as follows: check the machine code on the left, the operand to call is actually 0. Since the instruction call uses address relative to rip, a zero offset means that you are calling the function where rip currently points to, which is the next instruction at 38. The main function is put to address 0 coincidentally, so objdump displays the address relative to main, which is main+38. Now the remaining question is, why does that call instruction use a zero offset? The answer is that it is left blank for the linker to fill in.
@bbq142311 ай бұрын
3:43:55 I already explained why it shouldn't happen (modify position and save it back), I don't know why I argue... Wait a minute, we have leftovers from the old code where it modifies the position and saves it back! Lol.
@homelessrobot11 ай бұрын
i think the reason for writing vector and raster version of the drawing functions is simply because it makes working with graphical data that is either inherently raster or inherently vector easier to do and less error prone, and not because of the potential usecase from assembly.
@labsendeyshent11 ай бұрын
Next video: Electron in Assembly
@remrevo394410 ай бұрын
There also is the "align" macro for fasm to automatically fill the space with some bytes to align the following data.
@JekasObps11 ай бұрын
2:44:27 according to GNU, malloc is actually aligned to 8 bytes on 32-bit and to 16 bytes on 64-bit architectures
@vodracseck11 ай бұрын
this man is insane
@mertaliyigit328811 ай бұрын
ieee754 negative 0 is helpful because 1/0 results in ±infinity depending on the sign of zero which does have use cases
@darkobul110 ай бұрын
This is great. You just showed me what I was thinking was very complicated. Kudos for video.
@chesterhackenbush7 ай бұрын
Very informative - with a touch of genius. Well worth a subscription!
@DatBoi_TheGudBIAS10 ай бұрын
I work with Windows instead of Linux. I freaking memorized the calling conventions for 32 bits and 64 bits.
In 32 bits, you just push the arguments in reverse order onto the stack, and the return value is in, iirc, eax for ints and st0 (yes, the FPU stack) for floating points.
In 64 bits, you have fastcall for ints and vectorcall for floating points. In fastcall, the first four args go into rcx, rdx, r8 and r9, then the rest are pushed in reverse order onto the stack. In vectorcall, the first 4 arguments go into xmm0-3 and the rest are pushed in reverse order onto the stack. The return value is in rax for ints and xmm0 for floating points. If an argument is too large (for example a 10-byte struct), you pass the address (pointer) of the thing as the argument.
You can mix them, so if you have a function int func(int a, int b, float c, float d), in 64-bit asm it would look like:
mov rcx, a
mov rdx, b
movss xmm2, c
movss xmm3, d
I'm aware movss doesn't allow the 2nd argument to be a number; it's an example.
Also, movss and mulss are exclusively for floats. ss means scalar single precision. sd is for doubles; it stands for scalar double precision. cvt instructions convert ints to floating points or vice versa, or single precision to double precision and vice versa (like the cvtss2sd instruction somewhere in the first hour, which was converting the float to a double for some reason).
@mattshu10 ай бұрын
36:05-36:35 was the greatest 30 seconds i ever heard on this channel
@thehackr25811 ай бұрын
I appreciate your coding sessions. It's amazing how frequently you release; sometimes it gets hard to keep up. I'm still watching last year's videos lol
@ecosta10 ай бұрын
x86_64 is a bizarre architecture to code in assembly. It has x86 operands, which are rich and were created with humans coding in assembly in mind, mixed with newer stuff which makes more sense to compiler writers... I did a lot of x86 assembly years ago, but anything after Pentium felt messier than calling the Windows API.
@Heater-v1.0.010 ай бұрын
Back in the day I found a book that described how to use the Win32 API from assembler. So I had to try it. Got some windows and graphics up. Amazingly I found it was easier to do and read afterwards than the C API.
@trannusaran616411 ай бұрын
fascinating to see how the optimization works here! also lordy those variable length ops and overlapping register slices are cursed (aka, I've been ruined by risc-v)
@sukina506611 ай бұрын
7:45 oh boy we're getting there, hell yeah 13:50 ....
@ludwintor498611 ай бұрын
Conditions in shaders aren't inherently slow. They CAN be slow because shaders execute one instruction in parallel for different values, so if an "if" statement splits the code flow, the part that goes into the "if" body gets copied out (which slows things down) and executes the body while the part that doesn't go in stays still and waits for the others to return, resulting in additional slowdown. If the values don't split on the "if" statement, there's nothing to copy or wait for; execution just goes forward without any real slowdown. So in shaders it's not the "if" itself that is slow, but its consequences can be. For example, comparing constants or uniforms doesn't cause a split, so it will be as fast as possible. But if you compare alpha received from texture sampling, like (color.a > 0.1), it is likely that in one shader invocation there are some pixels with alpha 0 and some with 1, so it will cause a split and run slowly.
@ВладимирПирко-я6к11 ай бұрын
I can't keep up with your streams; right now I'm catching up on the musializer playlist. Thanks for your work
@cobbcoding11 ай бұрын
gonna need someone to give me the clip where jblow makes a specific float out of a hex. I need to see it!
@tekno67911 ай бұрын
Instead of doing the weird float operations of multiplying by minus one and so on in the "Top-Left Borders" video segment, you could just use ANDPS with the mask you get from the CMPSS instruction :)
@AlexPund10 ай бұрын
the pain in his face when the red cube slid accross the shity box >>xddd im dying
@maxellstudios121011 ай бұрын
I dont know how you do it but every time I get into something the next day you release a video about it
@vicenteeduardo559811 ай бұрын
A great idea at 3:04:01 is, instead of using XMM1 as a vector of floats (-1), to use it as a bit mask: since true is always all ones and false is always all zeros, you can treat the formula's multiplication as a bitwise AND and don't have to change the formula.
@maksymiliank513511 ай бұрын
I used to play around with some masm x64 assembly on windows a couple of years ago but SSE instructions were just too hard for me to understand. After watching this, I kinda wanna try doing that again
@N00byEdge11 ай бұрын
Honestly could improve this further. After doing the comparison, you did note that you get -1 if triggered. You could just literally take that result, convert to float and directly multiply that vector onto the velocity vector. You're already getting the -1s in all the correct slots of the vector, (if pos0.x goes out of bounds, you will get -1 in that slot and invert vel0.x)
@neutron_stz889410 ай бұрын
hint: you can use gcc test.c -o test.S -S to generate assembly file
@neojupyter32211 ай бұрын
Just what I was looking for. Thank you very much for your contribution.
@MySisterIsASlytherin4 ай бұрын
32:48 "As you can see we have a triangle" (shows red square)
@ecosta10 ай бұрын
2:23:12 - 100% agree with Tsoding that's "gate keeping". And I'm also 100% sure whoever wrote that shite thinks: 1. the text is clear and understandable; 2. the diagram helps a lot; 3. last case, the pseudo-code makes everything easier. I had to deal with this kind of engineers who can't comprehend how such documentation is only clear to whoever already knows that stuff. I hate this kind of documentation. Stop writing documentation for yourself and start writing for the readers!
@louis100110 ай бұрын
I'd argue 'Kronecker delta' is as cryptic a name as 'Boolean logic'. It's just someone's name.
@thefafala11 ай бұрын
Numpy and Scipy have some Fortran at heart. In general, numeric algorithms written in Fortran stay in Fortran.
@Noritoshi-r8m11 ай бұрын
omg Rollercoaster Tycoon 4 let`s go
@Devsharma-cl4go11 ай бұрын
How about making a fully functional OS that can perform all the tasks like process management, file management, etc. with C and assembly? I'd really like to follow that journey. By the way, if you decide to make it, please make sure that it will be bootable not only on legacy BIOS but also on UEFI (please, please make it).
@glowiak343011 ай бұрын
Cool! First time I see a 4 hour-long stream.
@alexloktionoff683310 ай бұрын
But for i32 gcc 13 still uses x87 math coprocessor commands, not SSE :(
@Ellefsen9710 ай бұрын
I know jackshit about Assembly programming, but I have found this very interesting. Makes me kinda want to learn Assembly a little bit, but it looks like such a daunting task
@johanngambolputty535111 ай бұрын
I always thought of numpy+matplotlib as an open source version of matlab, and matlab is the new fortran in the sense that it's basically designed for maths/numerical computing/scientists who like matrices/vectorisation... (applied math background, it comes down to modelling problems in terms of linear algebra). Fun fact: from what I remember, calling a minimisation function in scipy in python calls some C lib which still calls some old fortran lib.
@iamdozerq11 ай бұрын
Matlab is very clunky to use. When I switched away from it entirely, for some reason it became easier to do a lot of things. Also matlab is very slow.
@johanngambolputty535111 ай бұрын
@@iamdozerq Haven't used it in a long time, I just didn't like the licence, and preferred python loops and objects at the time, now I've joined the rust cult
@98danielray11 ай бұрын
@@iamdozerqvery slow is an exaggeration. if everything is correctly vectorized, you will get something at worst an order of magnitude slower than pure C and at best on par
@nimitzpro10 ай бұрын
can you graph the complex plane, and 3d for PDEs with python + maths libraries? I've done maths and programming separately but haven't combined them up to now
@johanngambolputty535110 ай бұрын
@@nimitzpro In short yes, but what do you mean graph the complex plane? A scalar function on the complex plane is just like one on 2D space, you use heatmaps, contours. Otherwise you can use multiple 2D plots (e.g. real and imaginary part). 3D figures in matplotlib can be a little clunky but otherwise fine. There's also libs like the js based plotly, but I always used matplotlib. I never used it (I used to embed figures in Qt), but if you want more interactive controls, I'd say have a look at dearpygui.
@chriswinslow11 ай бұрын
I demanded Tsoding start streaming live on YT for at least 4 times per week 😅
@grayishcolors11 ай бұрын
2:18:46 I am pretty sure I figured out the shufps thingy while watching (you might have figured it out by the end of the stream, idk). You have 4 float sections and you have four possible rearrangements, so that's 4^4 or 256. The last value just picks which one… The graphic they have is horrendous rofl
@grayishcolors11 ай бұрын
Like, each float section in the first half of the final value can be one of the four floats from the first passed floats, and each float section in the second half of the final value can be one of the four floats from the second passed floats. Why is it made that way? Idk.
hey man, it's been few months since I last checked your channel, just wanted to kick in and say fck you, you are awesome
@tdoc666___11 ай бұрын
*Sorry for the english guys* im currently working on a socket multi-threaded chat room in c++, all i can say is, dude, i've done a couple of chat rooms in other languages(js, ts, php, rust, nodejs...), but the complexity and difficulty of c++ in this kind of project is ABSURD, im stuck on how to handle multiple clients using different thread per peer, but there is something im missing about it, i wrote the code, but everytime i delete everything and start doing from the beginning so that i can learn whatever i wrote down, i stuck at the same point, then i realized how complex c++ might be when used for more advanced projects if used by yourself without any high level libraries to help you, is just a totally different concept, idk, maybe it is me, maybe im too dumb 🤣🤣
@mega251911 ай бұрын
what's up with the numbering in editor
@SlinkyD11 ай бұрын
Relative line numbers.
@yglyglya11 ай бұрын
Zozin try to write code in machine code (not assembly)
@TsodingDaily11 ай бұрын
Didn't I basically do that in the previous video about the JIT compiler tho? kzbin.info/www/bejne/o5OpimaIrNtqjq8
@berndeckenfels4 ай бұрын
I agree f-ing around is good for learning; it does however have the drawback that you don't know about guarantees or exceptions, so read up
@Ariiio12010 ай бұрын
Does someone know a good place to start learning assembly, especially with the fasm syntax?
@alejandroulisessanchezgame692411 ай бұрын
What roadmap do you recomend to learn assembly?
@TsodingDaily11 ай бұрын
Uhm... Have you seen how I learn things?
@younger.than.yester_day11 ай бұрын
Just think of something you want to do, think of all the things you need to know to do that, and then Google how to do those things.
@alejandroulisessanchezgame692411 ай бұрын
Ok got it
@not_herobrine375211 ай бұрын
having a background in how to import stuff from c will be a great start because you will be calling into them at some point or the other to get stuff done also, "the art of assembly" is a great read, though my 2003 edition is a bit out of date
@SlinkyD11 ай бұрын
14min in & you got my brain pumped up. Time to hit gym.h
@iamtimsson11 ай бұрын
yoooooo what a bad ass (sincere) good job kiddoe i do have envy of your skills and applied interests good job kiddoe
@fridgeburns11 ай бұрын
This is so cool man wtf
@lucaspalomodevelop11 ай бұрын
"meine Freunde" ... i love it XD .. greetings from germany
@corteztt5184 ай бұрын
Is this recommended for a beginner, or is it too advanced? If it's not too advanced, how do I set things up so that I can follow what he is doing?
@bbq142311 ай бұрын
shufps: you could look at imm8 as an array of four 2-bit numbers which chooses what part of the source register to copy to the part of the destination register that corresponds with the array index. So for 0b01'00'01'00, it basically does: xmm1[0] = xmm0[0b00]; xmm1[1] = xmm0[0b01]; xmm1[2] = xmm0[0b00]; xmm1[3] = xmm0[0b01]; Edit: seems you figured it out, the picture is awful. Interesting how x86 is full of instructions, yet the instructions of how to use them are horrible.
@mokalux10 ай бұрын
very instructive, thank you!
@frikimanhd408711 ай бұрын
just a little heads-up, there's a discord bot that automatically notifies when you go live on twitch so you don't have to
@Czeckie11 ай бұрын
parsing is such a weird academic discipline. So much effort goes into studying parser generators, but every useful language has a handcoded parser that's faster and gives error messages.
@alexandernazarov164210 ай бұрын
so what is that gf2 thingy?
@MenkoDany11 ай бұрын
I hope Tsoding tries "ASM++" (asm with functions, if/else, and while)
@SiddheshPardeshi-mp9cr11 ай бұрын
Did you mean C?
@Omar-fn2im11 ай бұрын
He uses C a lot, you should check out his other streams
@MenkoDany11 ай бұрын
@@Omar-fn2im I'm not stupid
@Omar-fn2im11 ай бұрын
@@MenkoDanyI didn’t call you stupid
@not_herobrine375211 ай бұрын
so... masm?
@alvaronaranjo258911 ай бұрын
3:10:32 apollo guidance computer: ‘looks good comrade’
@EineSchwarzeKatzeMiauАй бұрын
What kind of Scheiße, meine Freunde! Love your German. And stay just the way you are
@Eren_Ozdemir2 ай бұрын
INSANE !!!
@MladenMijatov10 ай бұрын
Last time I used assembly to program anything useful was many years ago. These new instructions are a maze, or at least that's how they look like to me. Perhaps it's a combination of sub-optimal documentation and my lack of experience.
@ivankramarenko10 ай бұрын
mov rax 200.0 mov eax 200.0f ??
@colonthree11 ай бұрын
You should look into Rollercoaster Tycoon 1 & 2 for gamedev in ASM. (I've not seen the whole video yet since it was loaded up only an hour ago... ;w;)
@accountprincipale229311 ай бұрын
what distro do you use and are you using VIM or what? i love your setup and how well it works but i don't really know or recognize anything in it
@mertaliyigit328811 ай бұрын
i3 and emacs
@matteovalentino489010 ай бұрын
On debian
@mooncorizer29011 ай бұрын
This is amazing bro
@Crux16111 ай бұрын
This video feels pretty epic 😁
@kamertonaudiophileplayer84711 ай бұрын
I did it in Assembler only in 80s, simple any other solution worked fast enough.
@blackhaze385611 ай бұрын
Easy for Tsoding, no-hit Dark Souls for us.
@berndeckenfels4 ай бұрын
3:48:12 Java Panama will Support That, but rather low-level