LMARV-1 reboot part 4: the sequencer

  Рет қаралды 3,224

Robert Baruch

Robert Baruch

Күн бұрын

This part explains and does a little bit of formal verification on the sequencer card.
github repo for code: github.com/RobertBaruch/riscv...
nMigen tutorial: github.com/RobertBaruch/nmige...

Пікірлер: 36
@marijnstollenga1601
@marijnstollenga1601 3 жыл бұрын
Thank god the intro is back!
@havresylt
@havresylt 3 жыл бұрын
This is about the coolest thing happening right now. Looking forward to the next episode!
@dxred2553
@dxred2553 3 жыл бұрын
In French, the verb "aller" means to go. If you were conjugating it with "je" ("I" in French), you would say "je vais". You could argue "j'allais" ("I went") would work, but you were on the right track using it as a mnemonic for JALR
@JonSeverinsson
@JonSeverinsson 3 жыл бұрын
In JAL and JALR you use rd on the register card as a temporary in order to save the value on the Z bus at end of cycle 1 and put it on X in cycle 2. At first glance this look fine, as you will overwrite it with PC+4 in cycle 2 anyway, but consider the case of rd = r0. In that case the register card will happily put a 0 on the X bus in cycle 2 instead of the value saved in cycle 1, and thus your cpu would erroneously jump to address 0. I think you will have to add a latch to the sequencer card that can save a value from the Z bus in one cycle and then route that value to the PC or memaddr in a subsequent cycle (note that if you do this, you don't have to be able to route X to PC or memaddr, so your muxes doesn't have to be any more complex).
@RobertBaruch
@RobertBaruch 3 жыл бұрын
And that's what formal verification did for me. I found the bug during formal verification :)
@mattnottingham3663
@mattnottingham3663 3 жыл бұрын
Cool stuff. What sort of clock speed are you hoping for with the actual hardware? Does it make sense to also repeat the formal testing with actual hardware - by which I mean use a FPGA to drive each 'daughter card' (where the cards are yhe register, alu, memory, etc circuits) individually to allow you to verify that each daughter card does actually conform to its nmigen model? Thanks. Matt.
@gudenau
@gudenau 3 жыл бұрын
Oh hey, you've got a wireless Vive. Cool stuff!
@adrianohaha7659
@adrianohaha7659 3 жыл бұрын
I'm here for the music.
3 жыл бұрын
I feel a bit complicated to do load things this way (shifting). I would say, it's even more simple to have system memory as 4 * 8 bit RAM chips in a way, that you can select only individual chips to be selected (or not) for writing and reading (so byte-wise the system stores addresses in this pattern: chip0, chip1, chip2, chip3, chip0 ...). With some not so complex sign extension logic in hardware, and almost ready. At least, I would guess (but I can be wrong) it's even more simple to do and do not require extra cycles for shifting back and forth on extra cycles (also saves some login in the sequencer maybe?).
@AndersJackson
@AndersJackson 3 жыл бұрын
ECALL is calls to OS or your monitor/BIOS. Should at least be a call through a vector of addresses. So arguments are, as you said, an operation call to the monitor. So implementing it should be possible.
@rolandruckerbauer7312
@rolandruckerbauer7312 3 жыл бұрын
If you are finished with designing all the cards with nmigen HDL, how do you plan to create your schematics? I think, because you abstracted essentially "real chips" with your nmigen modules, you could replace all of them with blackboxes, and use nmigen + yosys to generate a netlist. With some scripting you may be able to convert e.g. the json output of yosys into a netlist for kicad. The blackboxes in the json output would be converted into the correct parts in the netlist. From there, you can already do the layouting / routing :D. No schematic needed.
@RobertBaruch
@RobertBaruch 3 жыл бұрын
I think what you're referring to is techmapping. That's a whole other field that I don't know anything about, and don't want to know at this point. I'm planning on writing modules for actual chips, and replace non-chip logic with chip logic little by little in the design. When that's done, the code shows how the chips are hooked up. Generating a netlist *could* be done from that, but then there would be just as much work in making the schematics look nice as there would be just doing it from scratch. Also, a schematic helps when you need to to pinswaps because of physical layout constraints.
@mikescott58
@mikescott58 3 жыл бұрын
Yay! The theme song is back! Question regarding the program counter. Is there ever a case where you need to write a byte or half-word? If not, why not hard code A1:A0 to be zero and then you can employ a simple up counter for A31:A2. It may change how you deal with sign-extending bytes and half-words, but the sequencer could figure that out. Am I missing something.
@RobertBaruch
@RobertBaruch 3 жыл бұрын
If I don't implement compressed instructions, then I can safely ignore the bottom two bits. Compressed instructions are 16 bits long, and eventually I might want those. But not now. Also, an up counter would require a clock pulse (therefore, a machine cycle) while an adder is strictly combinatorial, so doesn't require any machine cycles.
@mikescott58
@mikescott58 3 жыл бұрын
@@RobertBaruch I was thinking something like using a 74LS160. Now you have the latch fed by a 3-input MUX (A-bus, Z-Bus, PC+4) clocked in. If you used a parallel-load counter and a 2-in MUX (A-bus, Z-bus) and instead of the +4, just don't parallel load and it will increment. Same number of clock cycles. It also gives you a reset input to initialize the PC to begin fetching code at memory location 0. As for compressed instructions, aren't those just to allow smaller code footprint for embedded applications?
@RobertBaruch
@RobertBaruch 3 жыл бұрын
@@mikescott58 Unfortunately, PC+4 doesn't only need to go to the PC, but also to the destination register for JAL and JALR. That means PC+4 needs to be computed combinatorically rather than synchronously. Otherwise, yes, a counter would work better. Compress instructions are mainly for memory-limited implementations. I probably won't be implementing the C extension.
@wChris_
@wChris_ 3 жыл бұрын
instead for load: place byte in the right destination and broadcast the highest bit to all above positions
@wChris_
@wChris_ 3 жыл бұрын
ie have some sign extension unit
@joedetode
@joedetode 3 жыл бұрын
I think the sticking point with this case is that there would need to be a multiplexer choosing between the sign extended and untouched versions. You would also need a way to choose between 8- and 16-bit extending. Whereas the video method uses the existing shifter hardware that would otherwise be unused in a load. In this case the tradeoff is more speed vs less complexity, simply an implementation choice to prefer one to the other. I think that this video is coming down on the side of less complexity ;)
@pengbertuuu
@pengbertuuu 9 ай бұрын
@@joedetodewe just need to add an alu op shift right and sign extend
@TomStorey96
@TomStorey96 3 жыл бұрын
RISC V and MIPS are so similar in many ways.
@lawrencemanning
@lawrencemanning 3 жыл бұрын
Yup. Both came from mostly the same people IIRC.
@wChris_
@wChris_ 3 жыл бұрын
i think there are already formal verification frameworks for RISC-V
@DantalionNl
@DantalionNl 3 жыл бұрын
There are but I think that is still a few steps ahead of the current progress.
@gudenau
@gudenau 3 жыл бұрын
Any chance you could explain how atomics work on a hardware level? I can't really wrap my head around that.
@RobertBaruch
@RobertBaruch 3 жыл бұрын
As far as I know, atomics could just be implemented as flip-flops. You can read them, process them, and write them atomically, as long as interrupts are inhibited during that process, and maybe that's how processors with atomic instructions operate. I don't know. I'm pretty sure software just uses locks where there are no atomic instructions available.
@gudenau
@gudenau 3 жыл бұрын
@@RobertBaruch The problem with software only locks is, how do you make a guarantee that another core doesn't modify it? Even if there aren't any interrupts there are other free running threads, and if you pause the entire CPU minus your current thread you'll lose a lot of aggregate performance with more cores. It's a hard question for me to reason with.
@eehawkee
@eehawkee 3 жыл бұрын
@@gudenau You are correct, special hardware (normally interfaced with special instructions) are required to implement atomicity, typically done at the system bus level, by locking the system bus when a CPU starts an atomic operation, and unlocked when completed, if a different processor tries to read or write the memory in the mean time it will be forced to stall until the bus is unlocked.
@gudenau
@gudenau 3 жыл бұрын
@@eehawkee There has got to be a better way to handle that. I wonder how stuff like ZEN and multi socket systems handle that.
@eehawkee
@eehawkee 3 жыл бұрын
@@gudenau I'm not familiar with zen, but with UltraSPARC-T2 which is a 8-core 64-thread CPU, and supports multi-socket, it does atomic operations with the L2 cache, so a atomic operation will invalidate the cache line for all connected CPUs, more efficient than bus locking, but still a performance hit. You should use atomics sparingly in other words.
@dr.chrismaldonado
@dr.chrismaldonado 3 жыл бұрын
How old is your channel photo where you have a Johnny Depp look?
@RobertBaruch
@RobertBaruch 3 жыл бұрын
A few years ago.
@dr.chrismaldonado
@dr.chrismaldonado 3 жыл бұрын
@@RobertBaruch You're doing an awesome job! Thanks, I've learnt a lot.
LMARV-1 reboot part 5: formal verification of the CPU
54:35
Robert Baruch
Рет қаралды 2,9 М.
LMARV-1 reboot part 6: About CSRs and interrupts
21:32
Robert Baruch
Рет қаралды 3,7 М.
- А что в креме? - Это кАкАооо! #КондитерДети
00:24
Телеканал ПЯТНИЦА
Рет қаралды 4,4 МЛН
Survival skills: A great idea with duct tape #survival #lifehacks #camping
00:27
NERF WAR HEAVY: Drone Battle!
00:30
MacDannyGun
Рет қаралды 54 МЛН
🤔Какой Орган самый длинный ? #shorts
00:42
Has Generative AI Already Peaked? - Computerphile
12:48
Computerphile
Рет қаралды 864 М.
This Video is About Electroadhesion.
14:05
Reactions
Рет қаралды 45 М.
LMARV-1 reboot part 14: Chips that don't exist
37:37
Robert Baruch
Рет қаралды 6 М.
How One Change Transformed my Indie Game!
11:05
BenBonk
Рет қаралды 14 М.
Compilers, How They Work, And Writing Them From Scratch
23:53
Adam McDaniel
Рет қаралды 105 М.
The End Of Jr Engineers
30:58
ThePrimeTime
Рет қаралды 322 М.
One Unique Fact for Every AoE2 Unique Unit (Part 1)
12:41
Spirit Of The Law
Рет қаралды 77 М.
Coding a Web Server in 25 Lines - Computerphile
17:49
Computerphile
Рет қаралды 327 М.
What is Formal Verification?
2:29
Galois
Рет қаралды 33 М.
😱Хакер взломал зашифрованный ноутбук.
0:54
Последний Оплот Безопасности
Рет қаралды 415 М.
Красиво, но телефон жаль
0:32
Бесполезные Новости
Рет қаралды 618 М.
Samsung Galaxy Unpacked July 2024: Official Replay
1:8:53
Samsung
Рет қаралды 23 МЛН