Get the project files here: github.com/hal... NOTE: This project is still under development. It's near the tail-end of development, but it still may have some latent issues not documented yet.
Comments: 14
@fabian999ification 5 days ago
This is a very interesting idea. I've used 2-phase in my CPU design for similar reasons (phase 1 is instruction and control state change, phase 2 is register state change), but never thought to use 4-phase or 6-phase. Instead I just use the slower "conventional" method of using multiple micro-instructions to disambiguate conflicting control lines (such as PC increment and RAM access).
@HalfBurntToast 5 days ago
Yeah, both are valid ways to approach it. The biggest performance improvement came from getting the microcode prefetching done in parallel, because it improves the speed of literally every opcode. All the other parallel optimizations are pretty conditional. Some opcodes still require separate micro-instructions for the PC increment when writing to a non-edge-triggered memory.
@ychto 4 days ago
I use a 6-part ring counter for my design. States t1-t3 point to addresses 0x00-0x02 in my microcode ROMs and handle incrementing the PC, moving the PC value to the IR and pulling the instruction out of RAM. States t4-tX perform the actual micro-ops of the instruction. If an instruction needs more than 3 micro-ops, there is an AND gate between t3 and t4 that will reset to t4 if the "extended instruction" value is set. Otherwise the final micro-op is to reset the ring counter back to t1.
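A minimal Python sketch of one reading of the ring counter described above, for anyone who wants to trace the state sequence. The function name, the `extended_passes` parameter, and the choice to loop back only after t6 are assumptions made for illustration; the comment does not spell out the exact gating, so this is not the commenter's actual circuit.

```python
def ring_counter_states(extended_passes=0, n_states=6, execute_start=4):
    """Return the T-states visited while running one instruction.

    Assumptions (hypothetical, not from the original design):
      - t1..t3 are the fetch states and t4..t6 are the execute states.
      - After t6, an "extended instruction" flag sends the counter back
        to t4 instead of t1; extended_passes is how many times that
        flag is set for this instruction.
    """
    states = []
    t = 1
    remaining = extended_passes
    while True:
        states.append(t)
        if t == n_states:
            if remaining > 0:
                # Flag set: skip the fetch states and re-enter execute.
                remaining -= 1
                t = execute_start
                continue
            # Flag clear: the counter would reset to t1 for the next fetch.
            break
        t += 1
    return states


short_instruction = ring_counter_states()
long_instruction = ring_counter_states(extended_passes=1)
```

Under these assumptions a normal instruction visits six states, and each extra pass through the execute states adds three more without repeating the fetch.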
@MrDejvidkit 5 days ago
Very good explanation.
@sohl947 4 days ago
Interesting! I'm a long-time software developer/researcher but a fairly recent FPGA designer as a hobby. I designed my first super-simple CPU to use two phases, but timing through my combinatorial logic was not fast enough (about 120 MHz native FPGA clock speed in a Xilinx Zynq SoC), so I divided it down to two two-cycle phases (four phases total), for around 30 MIPS. For my current project, a much more complex design, I designed around a four-phase sequence from the start. It also seems to be working at the same native clock speed, but maybe there are still some subtle timing bugs I'll need to fix after I do more comprehensive testing. Perhaps I'll need to add more phases, or maybe I can just put more work into the third phase, which right now doesn't do much except set up the bus address for the data read/write in the fourth phase.
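The 30 MIPS figure above follows from simple arithmetic, sketched here in Python. It assumes one instruction completes every four clock cycles (two phases of two cycles each); the function and variable names are made up for this example.

```python
def instructions_per_second(clock_hz, cycles_per_instruction):
    """Throughput when every instruction takes a fixed number of clock cycles."""
    return clock_hz / cycles_per_instruction


clock_hz = 120e6                 # ~120 MHz native FPGA clock, from the comment
cycles_per_instruction = 2 * 2   # two phases, two clock cycles each (assumed)

mips = instructions_per_second(clock_hz, cycles_per_instruction) / 1e6
print(f"{mips:.0f} MIPS")
```

By the same model, adding more phases at the same clock speed lowers throughput proportionally, which is the trade-off weighed at the end of the comment.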
@samuelemmettbray 5 days ago
This was wonderful, all around.
@MattSiegel 4 days ago
super! thank you for this clear and detailed follow up :D
@piotrludorowski9529 4 days ago
Thanks a lot! I didn't know about phases and microprograms, and now I do :) One thing: I read that the HC04 is better than the HC14 (Schmitt trigger) when used as a crystal clock generator. Have you tested that the 14 works? Or maybe the 1 Meg ohm resistor is what makes this work?
@HalfBurntToast 3 days ago
That is a good point. Ideally, I think the HCU series should be used because the inputs are unbuffered, which lets them play better in their linear region. That said, I tuned Rs to dampen the feedback of the inverter so it doesn't overdrive the crystal. That's probably enough to keep it stable as it is. But it would probably be more accurate and less power-hungry with a non-Schmitt gate.
@eitantal726 3 days ago
Is pipelining feasible?
@HalfBurntToast 3 days ago
Good question. No, not really. Complex instruction set computers are already tricky to pipeline just by the nature of having variable-length instructions and slower execution times per instruction. And that's not even going into dealing with hazards. It would take a complete redesign and a large amount of additional hardware to keep the functionality it currently has while adding a pipeline.
@eitantal726 3 days ago
@@HalfBurntToast I ask because your list of the phases' activities reminded me of MIPS and its pipeline.