R800 DMA...

Page 5/5
1 | 2 | 3 | 4 |

By PingPong

Prophet (3435)

09-05-2019, 22:35

hit9918 wrote:

ok now we again got into VDP tricks
but what I wanted to say
there were two pages of complaining about the port IO
when the actual issue behind it was a question of game engine design!

and the port IO is not the main showstopper, :-)

By Grauw

Ascended (8388)

09-05-2019, 23:56

hit9918 wrote:

the C mirror byte. it works like an index or id.
if the desired C is the same as the mirror C, then one does not need to copy those 16 colorbytes.

every SAT gets 32 mirror bytes.

and somewhere in RAM one has those colour patterns, indexed by the C byte.

Ah, I see, thanks :)

Personally, I prefer not to optimise for the best case. If the worst case can run at 60 fps, I will never have frame drops and scroll stutters :) But if frame drops are acceptable, this kind of optimisation probably makes sense; e.g. in a shmup, I think occasional slow-down during busy sections may be preferable to not having busy sections at all.

Note that comparing this colour byte with its previous value, 32 times over, is not free.

For example: 32x CPI to compare, then if NZ a CALL to set the VRAM address followed by 16x OUTI, and finally 32x LDI to copy the new values over the old ones for the next comparison. When no colours change this costs roughly 32 * 47 = 1504 cycles. When all colours change it costs 32 * 422 = 13504 cycles (call / ret, calculate & set address, 16x OUTI).

For comparison, a straight OUTI of all 512 bytes of the sprite colour table (SCT) takes 9216 cycles.

So if all colours have changed, the “colour mirror byte” optimisation makes the SCT update about 47% slower (13504 vs. 9216 cycles)! The break-even point is at around 20 changed sprites.
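To make the arithmetic easy to check, here is a minimal sketch of the cost model in Python. The per-sprite figures (47 and 422 cycles) and the 9216-cycle total are the ones quoted above. The function names are made up for illustration, and the model assumes the cost grows linearly with the number of sprites whose colour byte changed.

```python
# Cycle figures taken from the post above; everything else is illustrative.
SPRITES = 32                # sprites per attribute table
BYTES_PER_SPRITE = 16       # colour bytes per sprite in the SCT
STRAIGHT_COPY_TOTAL = 9216  # straight OUTI of all 512 SCT bytes
UNCHANGED_CYCLES = 47       # per sprite: compare + copy, no VRAM write
CHANGED_CYCLES = 422        # per sprite: compare, call/ret, set address, 16x OUTI, copy


def straight_copy_cycles():
    """Cost of blindly writing the whole SCT every frame."""
    return STRAIGHT_COPY_TOTAL


def mirror_cycles(changed):
    """Cost of the mirror-byte scheme when `changed` sprites need new colours."""
    if not 0 <= changed <= SPRITES:
        raise ValueError("changed must be between 0 and 32")
    return changed * CHANGED_CYCLES + (SPRITES - changed) * UNCHANGED_CYCLES


def break_even():
    """Smallest number of changed sprites at which the mirror scheme is slower."""
    for n in range(SPRITES + 1):
        if mirror_cycles(n) > straight_copy_cycles():
            return n
    return None
```

Under these assumptions, `mirror_cycles(20)` gives 9004 cycles, still below the straight copy, while `mirror_cycles(21)` gives 9379, so the crossover sits just past 20 changed sprites.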

In my game-in-progress, this led me to remove the complicated sprite engine that I had built at first, which optimised for the best case (updating colours & allocating patterns on demand, etc.). I replaced it with something extremely trivial, straightforwardly outputting everything all the time. It was much faster in the worst case, and not even that much slower in the best case.

It has limits, but at the same time it is easy to work with and allows me to choose how to deal with the VDP limitations (e.g. Y=216) on a sprite-by-sprite basis. So the approach feels good, though admittedly I haven’t put it under anything like the duress a shmup would.

By hit9918

Prophet (2866)

18-05-2019, 17:04

coming from the other thread where the 9978 was a pimped 9938

NYYRIKKI wrote:

it was a real project of Yamaha that tried to combine V9958, DMA and V9990 features together

the candidates for DMA are floppy and PCM

a DMA for blitter commands SX SY etc is not needed. in the time it takes to set up the DMA you could just as well set up the command.
also DMA is not free but costs cpu time

it is the DOOM era. and there exists no fantastic DMA that could help a slow cpu.

in other words, there is no fantastic 9978 thing that the 9990 is missing.
