What is the fastest way to scroll gravel? | Game revolutions cycle 11

  Views: 314

Coco Town

A day ago

Comments: 20
@David_Ladd 3 months ago
Thank you for sharing @CocoTownRetro Great video as always :) Keep up the good work :)
@lostwizard 3 months ago
Curtis's suggestion looks better yet. However, with your unwound loop, you could just stash the +30 value in, say, U and then store U at the end instead of saving it into the code and reloading it. So "LDU 30,X" at the start and "STU ,X" at the end. That saves an STD and an LDD. Compared to the overall loop, that's negligible, but it does remove the need for the selfmod in the routine.
@CocoTownRetro 3 months ago
Yeah, that makes sense. Turns out that the current incarnation of this code had already done away with the selfmod for other reasons (stay tuned). Curtis's code still works great in the new version, and is about 30 cycles less.
@lostwizard 3 months ago
I'll come clean and say that I expected something more like duplicating the gravel data so there would be two copies of it back to back, removing the need to break the copy into two pieces, and then just doing a sequence of 16 LDD/STD pairs using extended addressing to store to the screen. (I had thought about extended addressing to read the gravel data, but that wouldn't work with rotation.) However, on a 6809, extended addressing and 5- or 8-bit constant offset indexing have the same cycle count, so it wouldn't be a win. It didn't actually occur to me to just rotate the screen data directly, so you get all the points for that.
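As a rough illustration of the back-to-back idea described above (not code from the video), the Python sketch below models a 32-byte gravel row, matching the 16 LDD/STD pairs mentioned in the comment. The names `ROW_BYTES`, `make_doubled` and `rotated_window` are hypothetical. With two copies stored end to end, any rotated view of the row is one contiguous slice, so the copy never has to be split in two.

```python
ROW_BYTES = 32  # assumption: 16 LDD/STD pairs of 2 bytes each


def make_doubled(row: bytes) -> bytes:
    """Store two copies of the row end to end."""
    return row + row


def rotated_window(doubled: bytes, offset: int) -> bytes:
    """Return the row rotated left by `offset` bytes as one contiguous slice."""
    width = len(doubled) // 2
    start = offset % width
    return doubled[start:start + width]


row = bytes(range(ROW_BYTES))
doubled = make_doubled(row)
window = rotated_window(doubled, 2)  # row rotated left by one 16-bit word
```

The cost is a second copy of the data in memory. As the comment points out, on a 6809 this layout would not save cycles over constant offset indexing.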
@CocoTownRetro 3 months ago
Ah, thanks for the full explanation of your thinking. My own thinking evolved something like this: 1) In high school, I just rotated the screen with a wound-up loop of LDD/STD from/to the screen. 2) More recently, I used stack blasting for everything. 3) What you saw me do in this video. 🙂 So ultimately it was a return to my original idea of in-place rotation, but with the loop unwound via constant offset indexing. And now, 4) Curtis's hybrid puls/std approach, which is even faster.
@simonjonassen9357 3 months ago
Sometimes it's cheaper to use 2,x 4,x etc. It depends on the situation at hand, I suppose (trust me, I love speeding up code). Unwinding code is also a very good way to gain cycles, and it's used a lot in the demo world (instead of writing out the code by hand, we use a piece of code that writes an unrolled loop in memory).
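A minimal sketch of the "code that writes the unrolled loop" idea, in Python for illustration only. It emits assembly source text, whereas the demo-scene technique mentioned above writes the instructions into memory at run time. The function name `unrolled_word_copy` and the choice of shifting each word down by 2 bytes are assumptions made for the example, not details taken from the video.

```python
def unrolled_word_copy(words, shift=2):
    """Emit LDD/STD pairs that copy `words` 16-bit words down by `shift` bytes.

    Each pair uses constant offset indexing from X, so there is no loop
    counter or branch in the generated code.
    """
    lines = []
    for i in range(words):
        dst = 2 * i
        src = dst + shift
        lines.append(f"    ldd {src},x")
        # A zero offset is written as ",x" in 6809 assembly.
        lines.append(f"    std {dst},x" if dst else "    std ,x")
    return lines


listing = unrolled_word_copy(15)  # 15 words = 30 bytes
print("\n".join(listing))
```

Generating the listing keeps the source short while still producing fully unrolled code, and the word count is a one-argument change.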
@DhinCardoso 3 months ago
Man of courage 🦾 ~ entertaining to watch, painful to even imagine Assembler's life
@CurtisBoyle 3 months ago
I haven't tested this, but it should be shorter and faster (combining stack blasting with unrolled). I quickly totalled up the cycle counts; hopefully I didn't calculate any wrong.
; Use U as the screen pointer to the gravel instead of X
; (if you need U preserved, a pshs u (5+2=7 cycles) would be added
; at the top, and the rts would change to puls pc,u
; (5+4=9 cycles replacing 5 cycle RTS). So add 11 cycles. Still a bit faster.)
RotateGroundTop:
        ldu   GRAPH                 ; (6)
        leau  GroundUpperLeft,u     ; (4+4 = 8)
        ldd   ,u++                  ; (5+3 = 8)
        std   RGTWrapWord+1         ; (6)
; above setup = 28 cycles
;
; RGTCopyWordLoop (30 bytes)
        pulu  d,x,y                 ; (5+6 = 11) 6 bytes moved
        std   -8,u                  ; (5+1 = 6)
        stx   -6,u                  ; (5+1 = 6)
        sty   -4,u                  ; (6+1 = 7)
        pulu  d,x,y                 ; (5+6 = 11) 12 bytes moved
        std   -8,u                  ; (5+1 = 6)
        stx   -6,u                  ; (5+1 = 6)
        sty   -4,u                  ; (6+1 = 7)
        pulu  d,x,y                 ; (5+6 = 11) 18 bytes moved
        std   -8,u                  ; (5+1 = 6)
        stx   -6,u                  ; (5+1 = 6)
        sty   -4,u                  ; (6+1 = 7)
        pulu  d,x,y                 ; (5+6 = 11) 24 bytes moved
        std   -8,u                  ; (5+1 = 6)
        stx   -6,u                  ; (5+1 = 6)
        sty   -4,u                  ; (6+1 = 7)
        pulu  d,x,y                 ; (5+6 = 11) 30 bytes moved
        std   -8,u                  ; (5+1 = 6)
        stx   -6,u                  ; (5+1 = 6)
        sty   -4,u                  ; (6+1 = 7)
; above 30 byte copy = 150 cycles
RGTWrapWord:
        ldd   #0000                 ; (3) and wraparound word
        std   ,u                    ; (5+1 = 6)
        rts                         ; (5)
; cleanup above = 14 cycles
;
; Total=192 cycles (including RTS, 187 cycles w/o). Should be shorter, too.
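The totals in the comment above can be re-added with a few lines of Python. This is only a tally of the per-instruction figures as quoted in the comment; it does not check them against the 6809 datasheet or against what LWASM reports, and the variable names are made up for the example.

```python
# Cycle figures copied from the comment, in listing order.
SETUP = [6, 8, 8, 6]     # ldu / leau / ldd ,u++ / std RGTWrapWord+1
BLOCK = [11, 6, 6, 7]    # pulu d,x,y / std -8,u / stx -6,u / sty -4,u
CLEANUP = [3, 6, 5]      # ldd # / std ,u / rts
BLOCKS = 5               # the pulu/std/stx/sty group is repeated 5 times

setup_cycles = sum(SETUP)
copy_cycles = BLOCKS * sum(BLOCK)
cleanup_cycles = sum(CLEANUP)
total_cycles = setup_cycles + copy_cycles + cleanup_cycles
bytes_moved = BLOCKS * 6  # each pulu d,x,y moves three 16-bit registers

print(setup_cycles, copy_cycles, cleanup_cycles, total_cycles, bytes_moved)
```

The sums agree with the subtotals stated in the comment: 28 for setup, 150 for the 30-byte copy, 14 for cleanup, and 192 overall.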
@CocoTownRetro 3 months ago
Clever! I'll need to try this out. Thank you!
@CocoTownRetro 3 months ago
Nailed it! Right now, I'm coding way ahead of the videos, so this code has had some changes since this video. But putting your code into the new routine works great and is about 30 cycles less than the latest I'd had. I'll be featuring your code in yet another upcoming vid, though it'll be a while before we get there. 🙂 Thanks for taking the time to write this all up!
@paranoidcactusgames 3 months ago
@CocoTownRetro I don't know if this is still relevant, but I played around with this routine to see if it could be faster with full stack blasting. After several iterations I managed to get it down to 168 cycles (if I've counted correctly). I haven't tested it, so hopefully I didn't get anything wrong here.
RotateGroundTop:
        ; Copy stack pointer to lds instruction
        sts   RGTRestoreStack+2     ; 7
        ; Point U to second byte
        ldu   GRAPH                 ; 6
        leau  GroundUpperLeft+1,U   ; 8 (4+4)
        ; Copy wrap byte to ldb instruction
        lda   -1,U                  ; 5 (4+1)
        sta   RGTWrapByte+1         ; 5
        ; Stack blast 3 x 10 byte blocks
        pulu  CC,D,DP,X,Y,S         ; 15 (5+10)
        leau  -1,U                  ; 5 (4+1)
        pshu  S,Y,X,DP,D,CC         ; 15 (5+10)
        leau  11,U                  ; 5 (4+1)
        pulu  CC,D,DP,X,Y,S         ; 15 (5+10)
        leau  -1,U                  ; 5 (4+1)
        pshu  S,Y,X,DP,D,CC         ; 15 (5+10)
        leau  11,U                  ; 5 (4+1)
        pulu  CC,D,DP,X,Y,S         ; 15 (5+10)
        leau  -1,U                  ; 5 (4+1)
        pshu  S,Y,X,DP,D,CC         ; 15 (5+10)
        ; Load last byte into A
        lda   11,U                  ; 5 (4+1)
RGTWrapByte:
        ldb   #00                   ; 2
        ; Store last byte and wrap byte
        std   10,U                  ; 6 (5+1)
RGTRestoreStack:
        lds   #0000                 ; 4
        rts                         ; 5
; Total = 168 cycles
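As a quick check of the arithmetic, the Python sketch below re-adds the cycle figures exactly as they are annotated in the listing above. The variable names are invented for the example, and the figures themselves are taken on trust from the comment.

```python
# Cycle figures copied from the listing's own annotations.
PROLOGUE = [7, 6, 8, 5, 5]   # sts / ldu / leau / lda -1,U / sta
PULL, BACK, PUSH, SKIP = 15, 5, 15, 5  # pulu / leau -1,U / pshu / leau 11,U
EPILOGUE = [5, 2, 6, 4, 5]   # lda 11,U / ldb # / std 10,U / lds # / rts
BLOCKS = 3                   # three 10-byte blocks

# "leau 11,U" only appears between blocks, so it runs BLOCKS - 1 times.
blast_cycles = BLOCKS * (PULL + BACK + PUSH) + (BLOCKS - 1) * SKIP
total_cycles = sum(PROLOGUE) + blast_cycles + sum(EPILOGUE)

print(sum(PROLOGUE), blast_cycles, sum(EPILOGUE), total_cycles)
```

The tally reproduces the stated 168 cycles. It covers only the instructions shown, so any extra work needed to preserve CC would add to it.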
@paranoidcactusgames 3 months ago
I've realised that this fails to retain the state of the CC register so the IRQ flags will end up containing garbage. It should have a "pshs CC" at the start and a "puls CC" at the end or, if you know what state those flags should be in, it could be achieved with a single "orcc" or "andcc". Also, as it modifies the stack pointer it would be a disaster if an interrupt can occur while it's executing.
@CocoTownRetro 3 months ago
Thanks for looking at this and offering your suggestion. Most of your code is indeed still relevant for the current version of this routine. This is definitely worth trying out and comparing the ultimate cycle counts that LWASM reports, once it’s integrated into the new code. I can't tell just yet, but it does seem possible that this might speed things up even further from what I now have. The good news is that I am already disabling interrupts at the PIA when this routine is run, because of other stack blasting I do elsewhere. It may take a while, but stay tuned for an upcoming video where I will try this out!
@markusfassbinder8275 3 months ago
Hmm. If you randomize your source code, you will eventually find a version of your game that is faster. But I doubt we'll live long enough to see that happen.
@CocoTownRetro 2 months ago
You’ve uncovered my secret to writing code: monkeys on typewriters. Lots of monkeys on typewriters.
@lamune6809 3 months ago
What's the development environment being used here? Is it part of MAME itself?
@alexgayer85 3 months ago
It appears to be Microsoft Visual Studio Code
@CocoTownRetro 3 months ago
VS Code is correct. Plus some extensions, lwasm to assemble (and count cycles), another tool to carry over source into MAME as comments, and the MAME debugger itself. You can get all the details from a couple past videos of mine: kzbin.info/www/bejne/sGPRgnuuoqt_sNk and kzbin.info/www/bejne/f4KYaZdqo9Cnrbc
@erroneus00 3 months ago
Apologize to your past self for not having access to MAME and its debugger as well!
@CocoTownRetro 3 months ago
Past self used state-of-the-art development tools (at the time), and couldn't imagine anything better. 😁