Unfortunately, I have a correction: The RSP is close to the bottleneck now, but it is not entirely the bottleneck. There is still a bit of FIFO buffer time. A little bit more mem throughput optimization is required.
@lpfan4491Ай бұрын
Basically, we are approaching the actual limits of the console. It's pretty exciting, but also kinda sad that it seems like the journey is coming to its end soonish(tm).
@indask8Ай бұрын
The end of perf improvement but the beginning for better homebrew games.
@lmk10000Ай бұрын
@@indask8 this
@ZeFoxiiАй бұрын
Now if only someone did this for the GameCube, PS2, Wii, etc.
@tyjuarezАй бұрын
You have nearly a PirateSoftware level of MS Paint explanation skill
@BeefJerkeyАй бұрын
The hardware of the console is such a weird mess that literally no one knew how to properly use all of its resources back in the day. Rare and Factor 5 came the closest, it seems to me.
@FWCrashАй бұрын
I feel like all these mind blowing discoveries are inevitably delaying the release of your mod cuz you have to re-visit a thousand things you already thought were done to account for it LMAO
@fruitsnackia2012Ай бұрын
it doesn't bother me if he wants to keep finding new optimizations to improve the game, i'm open to it. plus he will get it out there. he's completed many mods so why not this? you can't rush art.
@FWCrashАй бұрын
@fruitsnackia2012 i never implied whether i was bothered by it or not, nor did i say it was a bad thing. Just a funny realization that came to me
@guillaumefigarella1704Ай бұрын
i would be fine with the game releasing in 2030 if its the only way
@fruitsnackia2012Ай бұрын
@@FWCrash ah. ok thats fair lol.
@fruitsnackia2012Ай бұрын
@@guillaumefigarella1704 i dont know for sure but if i had to guess probably in a few years.
@Wyatt_JamesАй бұрын
I wonder if this tactic could help SM643DS' performance. The 3DS doesn't have nearly the memory bottleneck of the N64, but vbo creation is a lot of data movement, so therefore a lot of memory reads. It's worth looking into at any rate.
@LandonEmmaАй бұрын
FR
@walidslammer8091Ай бұрын
im pretty sure the main reason performance is bad for that port is because it uses software rendering instead of using native 3d
@Wyatt_JamesАй бұрын
That would be laggy, but it doesn't use software rendering.
@guillaumefigarella1704Ай бұрын
@@walidslammer8091 crazy that its software rendered, it really put cpu power into perspective
@vurpo7080Ай бұрын
@@guillaumefigarella1704 it's not
@espfusionАй бұрын
VR4300 has a 8x4 byte write buffer, the same size as a cacheline. You should be able to write to one of the uncached mirrors of memory (kseg1 or a TLB mapped page w/caching disabled) to get writes to RDRAM with the same burst performance as cleaned cached writes but without thrashing the cache. I'm surprised games don't already do this.
@wirelyreАй бұрын
You inspired me to look up the user manual and now I'm surprised too lol. Skimming through, I wonder if cache write-backs are still faster than uncached writes from the buffer. Thrashing aside.
@ryan.crosbyАй бұрын
> I'm surprised games don't already do this. Looking at how most N64 games were developed, most developers were flying by the seat of their pants just to get any game working well enough in the 1-2 years that they had to get the game out the door. The N64 hardware was such a paradigm shift in terms of memory model and hardware, many developers were coming from the SNES or *maybe* PS1, and the official developer documentation was very lacking. So much knowledge seems to have been lost in translation between Silicon Graphics -> Nintendo -> game studios. I'm not surprised stuff like this got missed completely.
@KazeClipsАй бұрын
the problem with this is that you end up only being able to write 8x4 and i'm not 100% confident that they won't get flushed at randomish points (so that you might end up using 1 mem transfers for just 8 bytes). I've tried moving all the display list writes to uncached memory before and unfortunately it did not help performance, so for some reason this does not give as much performance as you'd hope.
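The uncached mirror mentioned in this thread is the standard MIPS kseg1 segment: kseg0 (0x80000000 to 0x9FFFFFFF) and kseg1 (0xA0000000 to 0xBFFFFFFF) both map to the same low 512 MB of physical memory, with kseg0 cached and kseg1 uncached. A minimal sketch of the address arithmetic follows; the helper names `to_uncached` and `to_cached` are made up for illustration, and this only models the address conversion, not the write buffer behavior or timing discussed above.

```python
KSEG0_BASE = 0x80000000  # cached, unmapped segment
KSEG1_BASE = 0xA0000000  # uncached, unmapped segment
KSEG_END = 0xC0000000    # first address past kseg1
PHYS_MASK = 0x1FFFFFFF   # low 29 bits hold the physical address


def _check_kseg(addr: int) -> None:
    # Only kseg0/kseg1 addresses have a direct cached/uncached mirror.
    if not KSEG0_BASE <= addr < KSEG_END:
        raise ValueError(f"{addr:#010x} is not a kseg0/kseg1 address")


def to_uncached(addr: int) -> int:
    """Return the kseg1 (uncached) alias of a kseg0 or kseg1 address."""
    _check_kseg(addr)
    return (addr & PHYS_MASK) | KSEG1_BASE


def to_cached(addr: int) -> int:
    """Return the kseg0 (cached) alias of a kseg0 or kseg1 address."""
    _check_kseg(addr)
    return (addr & PHYS_MASK) | KSEG0_BASE
```

For example, `to_uncached(0x80001000)` gives `0xA0001000`: the same physical bytes, but accessed without going through the data cache.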
@masterofdoom5000Ай бұрын
Any game console would be lucky to have you Kaze, cool stuff
@nintendo4life132Ай бұрын
Simple understanding, I think, I'm trying to follow along
@pukaloАй бұрын
The PS2 uses RDRAM and memory doesn't seem to be an issue for it, though iirc it is dual-channel and has a dedicated DMA controller.
@tcscommentАй бұрын
yea I'd say that the only memory problem on the PS2 is that it should've had more of it
@jrharbortproductionsАй бұрын
Dual channel as well as being a newer generation/revision of RDRAM with improved bandwidth and latency. Latency was still kinda high though.
@sebastiankulcheАй бұрын
PS2 is a next gen console. There is no comparison. Dual channel, more bandwidth, etc
@xdanic3Ай бұрын
I don't know how big of a deal this is, and I'm not sure I understood, but to me it sounds like a lot of rewriting code, changing a function in many places or rewriting the function that is already used
@PalendromeАй бұрын
Kaze, it would be interesting if SM64 had a dual stick mode like Goldeneye/Perfect Dark, where one player holds two controllers with the middle handle to have two analogue sticks.
@KazeClipsАй бұрын
i did this years ago
@timmygilbert4102Ай бұрын
I ... I understand the gist of it, but not the details 😅 How many cache lines are there? Is it that there are many 16 byte cache lines, each addressable as an individual chunk, but they can't be written without loading from main memory? And this instruction bypasses the loading tax by allowing free access to a cache line, regardless of its current status?
@Wyatt_JamesАй бұрын
If you want to overwrite an entire cache line, you normally need to first load it from main memory into cache, then write the data to cache, and finally save the data from cache back to memory. If you're going to be overwriting the entire cache line anyway, why bother loading from main memory first? Simply tell the CPU to pretend it's already loaded. That saves one memory transfer. Of course, if you only need to write part of the cache line, then you have to load from memory first no matter what, because you need to preserve some data that's already in memory at that location.
@KazeClipsАй бұрын
yeah exactly this
@DaltonisntabotАй бұрын
Wyatt already explained it. But, I'll reiterate. You read the code and put it in memory and you need to cache it, then load it into memory (run it), then put it back to execute it on the next frame. This is slow. We skip this part, tell the game 'Hey, it's already loaded, go ahead and execute it'. That's basically what it is. This does remove data and flushes it immediately, but that's fine, it can always be run again later. Whenever you close a program on a computer, it's still cached for example as it's slow to open a program from the hard drive as it has to read it, run it and finish executing the commands, validate everything is correct (don't want it to corrupt anything, viruses, etc). So, it caches it to memory once closed. Much faster and frees up the main memory for active programs.
@timmygilbert4102Ай бұрын
@@Daltonisntabot to be frank you make it more confusing by not naming the memory 😆 reading you I don't know if you are talking about main memory aka ram, cache memory, instructions memory or register, my issues isn't about the explanation but the memory lay out and hierarchy, ie less overloading of meaning on top of the word memory
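To put numbers on the cache line explanation earlier in this thread, here is a minimal counting sketch. It is a toy model, not a hardware simulation: it assumes 16-byte data cache lines (the figure used in the question above), a line-aligned buffer, that every written line misses the cache, and that every line is eventually written back to main memory. The function names and the `skip_fill` flag are invented for illustration.

```python
LINE_SIZE = 16  # bytes per data cache line (assumed, see lead-in)


def transfers_per_line(bytes_overwritten: int, skip_fill: bool) -> int:
    """Main-memory transfers needed to write into one missing cache line."""
    if not 0 < bytes_overwritten <= LINE_SIZE:
        raise ValueError("bytes_overwritten must be in 1..LINE_SIZE")
    full_overwrite = bytes_overwritten == LINE_SIZE
    # The fill (load from main memory) can only be skipped when the whole
    # line is overwritten; a partial write must keep the old bytes.
    fill = 0 if (skip_fill and full_overwrite) else 1
    writeback = 1  # the dirty line is saved back to main memory later
    return fill + writeback


def transfers_for_buffer(size_bytes: int, skip_fill: bool) -> int:
    """Total transfers for writing a line-aligned buffer of size_bytes."""
    full_lines, tail = divmod(size_bytes, LINE_SIZE)
    total = full_lines * transfers_per_line(LINE_SIZE, skip_fill)
    if tail:
        # The last, partially written line always needs the fill.
        total += transfers_per_line(tail, skip_fill)
    return total
```

Under these assumptions, writing a 64-byte buffer costs 8 transfers with the usual load-then-write path and 4 when the load is skipped. A 72-byte buffer costs 10 versus 6, because the trailing 8 bytes only cover half a line and still need the load.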
@eduardoanonimo3031Ай бұрын
Ignorant here 🙋♂️ So all the games of my n64 except battle of naboo and majoras mask look like crap because someone at SGI made the mistake of recursively calling the cache memory whether it is used or not?
@LotoTheHeroАй бұрын
Very cool!
@athos5359Ай бұрын
where s the idle time for the rdp? =gpu clock alive.
@blueyoshi8517Ай бұрын
Oh my.
@M.N.9Ай бұрын
Hey Kaze, is your romhack designed with Antialiasing in mind or not?
@KazeClipsАй бұрын
its optional
@guillaumefigarella1704Ай бұрын
from what i understand the cpu is vastly vastly too fast compared to the rcp, because the rcp does most of the heavy lifting? then why have that very fast cpu?
@tcscommentАй бұрын
because that's a flex
@IncognitoActivadoАй бұрын
Incompetence.
@jrharbortproductionsАй бұрын
The extra CPU power can be useful for other calculations related to things like physics in games that needed it.
@guillaumefigarella1704Ай бұрын
@@jrharbortproductions can you give an example of a game that had high cpu utilization? from what i know, world driver championship is the most demanding n64 game cpu wise and it's still very far from becoming even remotely cpu bound. and the game didn't use the FPU, it was z sorted, it had suspension physics, it ran at high resolution and fairly high framerate. why the hell did they think they needed that huge 4300i?
@jrharbortproductionsАй бұрын
@@guillaumefigarella1704 I can't really name an example off the top of my head, but I just think it would probably be a good use case for the extra power in a game.
@MuskelhiasАй бұрын
0:14 Cen64 is NOT the most accurate n64 emulator - ares is! Cen64 is old, unmaintained and deprecated.
@anki.8434Ай бұрын
What use do you give to internet explorer since you have it pinned? 😄
@-sdsideАй бұрын
masochism
@uponeric36Ай бұрын
INB4 he has a custom .exe byte code edited internet explorer that makes it good.
@XebozoneАй бұрын
It can be used to open really old, insecure websites, but edge also has an Internet Explorer compatibility mode now
@SimoneN64Ай бұрын
CEN64 is far from the most accurate. Ares is. Unfortunate common misconception
@KazeClipsАй бұрын
HUH i had no idea
@SimoneN64Ай бұрын
@@KazeClips Please check out Ares and all the accuracy tests it passes. It's truly a marvel. It does need somewhat of a powerful pc cause it uses Parallel-RDP, but it runs fast generally
@mariomadproductionsАй бұрын
Its probably more the case that they are accurate in different areas
@KazeClipsАй бұрын
i've used it before and unfortunately it did not pass the console check in my game and was detected as emulator, which is why i had thought it was not that accurate.
@SimoneN64Ай бұрын
@@mariomadproductions No. Ares is more accurate in every aspect of the hardware, including cache emulation. Cen has been abandonware for years. People really need to fact-check things before talking. Not to mention, cycle-accuracy hardly matters for anything. CEN64 is cycle-accurate, but hardly anything else is
@athos5359Ай бұрын
ps1 main ram was also slow, with only 5kb of cache on the cpu die, and on top of that the display list had to be stored in main ram as well. at least the n64 had more cache to play with.
@TheSamucacsАй бұрын
I still believe you'll make N64 have GameCube graphics
@killpidoneАй бұрын
Kaze, don't talk to german speaking bots lmao
@-sdsideАй бұрын
they are trying to trick you into converting to mormonism trust me