Overflow: I think you are right, but as I can't ATM take more projects, I would still like to play a little with the idea...
I still think that the colorful Gameboy stuff is out of reach, but if we look at where this idea originates we end up at those "dancing girls" that the Amiga used to have... Maybe something like that could be doable, especially if we move some of the work outside of the screen draw... Let me explain:
Ok, let's say that we update 2 bytes for each address setup. Then I think we need to add one NOP, so we are at 81 clocks per 2-byte change.
Let's imagine we have a SCREEN 1 where the fonts are generated by repeating the font number 8 times... This gives us the possibility to draw a B/W picture in a resolution of 256 x 24 that takes 768 bytes of memory to draw using the name table. Now if we change 2 bytes in the name table we update 16 pixels next to each other. This lets us move the edge of the figure by at least 8 pixels in the X-direction, which is quite ok.
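To make the layout concrete, here is a rough Python sketch of that SCREEN 1 trick as a PC-side model. The function names and the pixel-list rendering are just my assumptions for illustration, nothing here runs on the MSX itself.

```python
# Sketch only: a host-side model of the idea, not MSX code.
NAME_TABLE_SIZE = 32 * 24  # 768 bytes, one per character cell


def build_pattern_table():
    """Pattern n is the byte n repeated on all 8 rows of the character."""
    table = bytearray()
    for n in range(256):
        table.extend([n] * 8)
    return bytes(table)  # 256 characters * 8 rows = 2048 bytes


def render_row(name_table, row):
    """Return one character row as a list of 256 pixels (0 or 1)."""
    assert len(name_table) == NAME_TABLE_SIZE
    pixels = []
    for col in range(32):
        value = name_table[row * 32 + col]
        # Most significant bit is the leftmost pixel.
        pixels.extend((value >> bit) & 1 for bit in range(7, -1, -1))
    return pixels
```

With a font like this, the byte stored in the name table is the bit pattern itself, so writing 2 adjacent name table bytes rewrites 16 adjacent pixels.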
If that 228 cycles is accurate, this means we can do 22 such updates during 8 line draws until we hit another character row... With careful planning I think this is enough... When we get to the end of the draw area we can then start applying deltas to those 24 rows of (now modified) characters to get them ready for the top lines of each character row in the next frame.
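The budget above is simple enough to check in a few lines of Python. The constants are the ones quoted in the thread (228 cycles per line, 81 cycles per 2-byte update); the function name is made up.

```python
LINE_CYCLES = 228        # Z80 cycles per scanline, as quoted in the thread
UPDATE_CYCLES = 81       # estimate for one address setup + 2 bytes + NOP
LINES_PER_CHAR_ROW = 8   # scanlines before the next character row starts


def updates_per_char_row(line_cycles=LINE_CYCLES, update_cycles=UPDATE_CYCLES):
    """Whole 2-byte updates that fit into one character row of scanlines."""
    return (line_cycles * LINES_PER_CHAR_ROW) // update_cycles


print(updates_per_char_row())  # 22
```

If the real per-update cost turns out different, passing another `update_cycles` value shows how quickly the budget shrinks.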
I hope you can follow, as this idea is a bit hard to explain, but in other words the theory is that we have a 768 byte key frame that we modify as much as we can during the screen time, and after that we just continue to modify it until we have another 768 byte key frame ready for the next frame's delta modifications.
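A minimal sketch of what such 2-byte deltas between two key frames could look like, as a PC-side precalc step in Python. The greedy pairing and both function names are assumptions on my part; a real tool would also have to schedule the deltas against the raster position.

```python
def pair_deltas(prev, nxt):
    """Greedy list of (offset, byte0, byte1) updates that turn prev into nxt."""
    assert len(prev) == len(nxt) and len(prev) >= 2
    deltas = []
    i = 0
    while i < len(prev):
        if prev[i] != nxt[i]:
            start = min(i, len(prev) - 2)  # keep the pair inside the table
            deltas.append((start, nxt[start], nxt[start + 1]))
            i = start + 2
        else:
            i += 1
    return deltas


def apply_deltas(frame, deltas):
    """Apply 2-byte updates to a copy of frame and return the result."""
    out = bytearray(frame)
    for offset, b0, b1 in deltas:
        out[offset] = b0
        out[offset + 1] = b1
    return bytes(out)
```

Counting `len(pair_deltas(a, b))` for consecutive frames gives a first idea of whether an animation fits the update budget at all.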
From a data handling & planning point of view this is terrible stuff, but the end result might look pretty nice... and maybe even some quite ok raster colors could be added by carefully planning the characters.
@NYYRIKKI - as a fx variant, not only dancing babes
@maggoo - as an enhancement of HW-trick 1st used in Kefrens bars
Check single-color fx such as rotating Gotham(?) from Batman Forever (again! :D ).
Pretty cool, I always assumed this is how this was done but never gave it a try. I guess I'll need to give it a shot once my current project is out of the way!
BTW what tools did u use to generate the rotating Gotham data? Are they done on the Amstrad first or do you use PC tools to precalc all of it?
I'm thinking the delta compression 2 lines at a time might work actually. We just need to experiment to see how many pixels could be done within the time available.
228 cycles per line. Assuming the address needs to be changed for every byte, mmh, OUTI, OUT (99H),A, DEC C, OUTI, INC C (58 cycles per byte), plus two OUTIs every other line for the vertical offset adjust (18 cycles per line); 3.6 bytes per scanline. Hard to get above 4…
Is it 228 for the vblank period only or the entire line?
Entire line. VBlank lasts 57.33 Z80 cycles.
See http://map.grauw.nl/articles/vdp-vram-timing/vdp-timing.html, Horizontal line timing section, for the full line timing details (that doc is a treasure). The values are in VDP cycles; the VDP runs at 21.48 MHz, 6x the speed of the 3.58 MHz Z80, so a full line is 1368 / 6 = 228 Z80 cycles.
You can't play "racing the beam" on the MSX, as the VDP access speed varies from one machine to another. The effect would be ruined in the majority of the Sanyo and Panasonic MSX2+ machines, just to mention some of them.
As sd_snatcher rightfully points out, when the Z80 is not clocked by the same crystal as the VDP the timings will drift, as you can read back in the extensive discussion earlier in this thread :). In IO if there was drift it was visibly noticeable because it synchronously updates the colour in the visible area, and the graphics would look slanted.
But you can race the beam as long as you’re not syncing to it directly, just updating pixels ahead of it, which is the case for this double-scanline method (though the margin is small). That means taking a sufficient head start and not assuming you can stay cycle-accurately in sync. If the head start is in the middle of the VBlank area (~29 cycles), maybe the drift is small enough for the beam not to catch up or lag too far behind.
I think the IO demo can be used to gain information about the amount of drift if you have a variety of MSX models to test with. Look how many pixels it slants from top to bottom on problematic machines (4 VDP cycles per pixel), and you can find out what the worst cases for drift will be approximately.
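To turn an observed slant into a drift figure, something like this Python sketch would do. The conversion factors (4 VDP cycles per pixel, 6 VDP cycles per Z80 cycle, 1368 VDP cycles per line) come from the posts above; the function names and the example measurement of 12 pixels over 192 lines are invented purely for illustration.

```python
VDP_CYCLES_PER_LINE = 1368
VDP_CYCLES_PER_Z80_CYCLE = 6
VDP_CYCLES_PER_PIXEL = 4  # figure quoted in the thread


def drift_z80_cycles(slant_pixels):
    """Total drift in Z80 cycles for a slant measured in pixels."""
    return slant_pixels * VDP_CYCLES_PER_PIXEL / VDP_CYCLES_PER_Z80_CYCLE


def drift_per_line(slant_pixels, lines):
    """Average drift in Z80 cycles per scanline over the measured height."""
    return drift_z80_cycles(slant_pixels) / lines


# Hypothetical measurement: the picture slants 12 pixels over 192 lines.
print(VDP_CYCLES_PER_LINE / VDP_CYCLES_PER_Z80_CYCLE)  # 228.0
print(drift_z80_cycles(12))                            # 8.0
```

Comparing the total drift against the available head start then tells you roughly how much margin is left on a given machine.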
A less timing-critical application of racing the beam, more suitable for MSX perhaps, is if you draw graphics from top to bottom, and start with the right timing relative to the current vertical raster position so that you won’t catch up to it and will stay ahead of the next one, then you can get tearing-free screen updates without double buffering.
This could be used instead of the double-scanline method. You can update 932 bytes per frame that way @ 60 Hz (3579545 / 60 / 64 for OUTI, OUTI, DEC C, OUTI, INC C). Your data size grows by 50% though, because it needs to include MSB address values. And deltas are relative to the last frame instead of the last two scanlines, which is maybe good or maybe bad. In the Batman demo case, the fast rotating part is definitely too much :).
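The 932 figure can be reproduced with a small Python sketch. The per-instruction cycle counts are the usual MSX Z80 timings including the M1 wait state, which I'm assuming here because they add up to the 64 cycles quoted; the function name is made up.

```python
Z80_HZ = 3579545
# Assumed MSX Z80 timings (M1 wait state included); they sum to the 64 quoted.
CYCLES = {"OUTI": 18, "DEC C": 5, "INC C": 5}


def bytes_per_frame(sequence, frame_hz=60):
    """Data bytes per frame when each pass of `sequence` writes one byte."""
    cycles = sum(CYCLES[op] for op in sequence)
    return Z80_HZ / frame_hz / cycles


per_byte = ["OUTI", "OUTI", "DEC C", "OUTI", "INC C"]
print(int(bytes_per_frame(per_byte)))  # 932

# Stream size per update: LSB + data (2 bytes) versus LSB + MSB + data (3 bytes).
print(3 / 2 - 1)  # 0.5, i.e. the 50% growth
```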
Humm. But most of the changes occur in blocks of pixels, and not at totally random addresses. Each sequence of pixels could use OTIR to write them, at a cost of 22 cycles/byte, against 24 on the CPC.
But it will be more challenging to stay in ~sync because the timing is variable, and every 456 cycles you need to interrupt the stream to do the vertical offset update.
For the per-frame method though, it seems worthwhile:
outi outi ld b,(hl) inc hl dec c otir inc c
79 cycles per single byte, and each adjacent byte costs only 23 cycles, so the break-even point is when more than 37% of the bytes are sequential ones.
Per frame, this means a minimum throughput of 755 bytes and a maximum of 2569.
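Here is a Python sketch of one way to arrive at these numbers. It is my reading of the figures, not necessarily how they were originally worked out: the 37% falls out if you compare a run of 1 + k bytes against the plain 64-cycle version, and the maximum assumes runs of 256 bytes (the most a single OTIR can do with B as counter).

```python
Z80_HZ = 3579545
SIMPLE = 64     # cycles per byte, plain OUTI version
RUN_FIRST = 79  # cycles for the first byte of a run in the OTIR version
RUN_NEXT = 23   # cycles for each further adjacent byte


def break_even_ratio():
    """Sequential bytes per run start where both versions cost the same.

    RUN_FIRST + k * RUN_NEXT == (1 + k) * SIMPLE, solved for k.
    """
    return (RUN_FIRST - SIMPLE) / (SIMPLE - RUN_NEXT)


def bytes_per_frame(run_length, frame_hz=60):
    """Throughput if every run has exactly run_length bytes."""
    run_cycles = RUN_FIRST + (run_length - 1) * RUN_NEXT
    return Z80_HZ / frame_hz / run_cycles * run_length


print(round(break_even_ratio(), 2))  # 0.37
print(int(bytes_per_frame(1)))       # 755
print(int(bytes_per_frame(256)))     # 2569
```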
Back to the nonsequential double-scanline version,
228 cycles per line. Assuming the address needs to be changed for every byte, mmh, OUTI, OUT (99H),A, DEC C, OUTI, INC C (58 cycles per byte), plus two OUTIs every other line for the vertical offset adjust (18 cycles per line); 3.6 bytes per scanline. Hard to get above 4…
Another approach is to always update two bytes at once, which should take care of the most common type of sequence; OUTI, OUT (99H),A, DEC C, OUTI, OUTI, INC C (76 cycles per 2 bytes), plus two OUTIs every other line for the vertical offset adjust (18 cycles per line); 5.5 bytes per scanline.
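Both double-scanline budgets can be checked with the same small Python sketch. The instruction timings are the usual MSX Z80 values with the M1 wait state, assumed here because they match the 58 and 76 cycle totals; the function name is made up.

```python
LINE_CYCLES = 228
# Assumed MSX Z80 timings (M1 wait state included).
CYCLES = {"OUTI": 18, "OUT (99H),A": 12, "DEC C": 5, "INC C": 5}
# Two OUTIs every other line for the vertical offset adjust: 18 per line.
OFFSET_ADJUST_PER_LINE = 2 * CYCLES["OUTI"] / 2


def bytes_per_scanline(sequence, bytes_written):
    """Data bytes per scanline after paying for the vertical offset adjust."""
    cost = sum(CYCLES[op] for op in sequence)
    return (LINE_CYCLES - OFFSET_ADJUST_PER_LINE) / cost * bytes_written


single = ["OUTI", "OUT (99H),A", "DEC C", "OUTI", "INC C"]
pair = ["OUTI", "OUT (99H),A", "DEC C", "OUTI", "OUTI", "INC C"]
print(round(bytes_per_scanline(single, 1), 1))  # 3.6
print(round(bytes_per_scanline(pair, 2), 1))    # 5.5
```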
@ren: Here is the pt3 tune! As usual: no commercial use, and state the author.
Thanks a bunch! (Though I somehow thought an unabridged version would exist, but maybe not (in the latter case it's up to me then to make an extended version ;))
MSX never had a lot of PSG tunes that sounded SID/chiptune-like. Death Wish 3 (Gremlin/Daglish), Feud (Mastertronic/Whittaker) & VENOM Strikes Back (Gremlin/Daglish) come to mind (so all originally written for the C64/SID, I reckon).
I see now Factor6 is actually a member of this site as well, should check the topic (PSG arrangements of 1980s SID game musics) he posted in.
So perhaps I'll do an extended remake of the tune in Jeskola Buzz or something (as my VT skills are (still) lacking.. ;)) Of course all necessary accreditation will be given ;)
Tell Factor6 I said thanks! And can you perhaps recommend some tunes, or other prods (demos) featuring quality chip music by him (or artists alike)? Thanks! :)
Factor6 has been active for years, with a huge and amazing body of work on various oldschool platforms, mainly AY. I can't decide what to point to on Pouet. Ok, you may try some I clearly remember: Tailwind, Back to the Gemba, Batman Forever (Yzi did some music on it as well).