Micha:
I have to study what you are proposing, but the problem is not the speed of writing to VRAM.
In P1 graphics mode we have two scrolling planes with different palettes, plus the possibility of two sets of sprites, each with its own palette. That is, 64 different colors.
That's not true.
P1 has 1 set of 125 sprites, and each sprite can use any of the 4 palettes available.
You can put the sprite pattern table in any of the 8 available slots in VRAM below 40000h.
And that's all.
To be precise, there are 8 available slots in VRAM below 40000h for P1, and 16 available slots below 80000h for P2.
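As a rough illustration of those slot counts: assuming the slots are equal-sized and contiguous from VRAM address 0 (so 40000h / 8 = 80000h / 16 = 8000h bytes each), a slot's base address could be computed as below. The function name is made up for this example and is not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: slots are equal-sized and contiguous from VRAM address 0,
   which gives 40000h / 8 = 80000h / 16 = 8000h bytes per slot. */
#define SPRITE_SLOT_SIZE 0x8000UL

/* Hypothetical helper: VRAM base address of a sprite pattern slot.
   is_p2 == 0 selects P1 (8 slots), non-zero selects P2 (16 slots). */
uint32_t sprite_pattern_base(unsigned slot, int is_p2)
{
    assert(slot < (is_p2 ? 16u : 8u));   /* reject slots outside the mode's range */
    return (uint32_t)slot * SPRITE_SLOT_SIZE;
}
```

For example, `sprite_pattern_base(7, 0)` is the last P1 slot and `sprite_pattern_base(15, 1)` the last P2 slot, both under this assumption.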
But yes, there is only 1 set of sprites at a time.
The color palette can be specified for each sprite in the SAT, chosen from the 4 possible palette offsets: 0, 16, 32 and 48.
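To make the per-sprite palette offset concrete, here is a minimal sketch of packing one SAT entry. The 4-byte layout (Y, pattern number, X low byte, then an attribute byte with the palette offset in bits 7-6 and X bits 9-8 in bits 1-0) is an assumption recalled from the V9990 documentation, so check it against the datasheet before relying on it. `SatEntry` and `sat_entry` are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* One SAT entry. ASSUMED layout, verify against the V9990 manual:
   byte 0 = Y, byte 1 = pattern number, byte 2 = X bits 7-0,
   byte 3 = palette offset / 16 in bits 7-6, X bits 9-8 in bits 1-0. */
typedef struct {
    uint8_t y;
    uint8_t pattern;
    uint8_t x_low;
    uint8_t attr;
} SatEntry;

/* Hypothetical helper: build an entry with its own palette offset. */
SatEntry sat_entry(uint16_t x, uint8_t y, uint8_t pattern, uint8_t palette_offset)
{
    /* Only the four offsets named in the thread are valid. */
    assert(palette_offset == 0 || palette_offset == 16 ||
           palette_offset == 32 || palette_offset == 48);

    SatEntry e;
    e.y = y;
    e.pattern = pattern;
    e.x_low = (uint8_t)(x & 0xFF);
    /* offset / 16 gives 0..3, which goes into the top two bits. */
    e.attr = (uint8_t)(((palette_offset >> 4) << 6) | ((x >> 8) & 0x03));
    return e;
}
```

Since the offset lives in each entry, nothing stops sprite 0 from using offset 48 while sprite 1 uses offset 0.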
Let me be more precise. I have the first 64 sprites with one palette, and the rest with another.
I use the Team Bomba library. When the g9b files are generated with bmp2g9b without the -b parameter, they are compressed. When loading them into VRAM, the decompression routine already writes the palette in each case. I explain this because, with the sprites, I am not defining the palette for each one. That would be slow.
There is a difference between what you do, what the gfx9k library allows, and what you can do with the hardware.
On the hardware side, there is no sprite "set", and you can define a unique palette offset for each of the 125 sprites. You can of course set the same value on the first 64 entries, but it is not mandatory.
That said, depending on the use case, it may be good practice to group sprites that share the same palette.
This is what I do.
I cannot afford 12000 CPU cycles each frame.
I guess my only solution is to split the full update across multiple frames.
That doesn't sound good: one sprite wobbles relative to the other when they are updated in different frames. Better to use double buffering.
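One possible reading of the double-buffering suggestion, sketched in C: keep two copies of the sprite attributes in RAM, build the next state in the back copy over as many frames as needed, and only hand a finished copy to the upload routine. This is an assumption about what is meant, not a confirmed method. It does not make the upload itself cheaper; it only guarantees that a half-updated table never reaches the screen. The 4 bytes per sprite and all names here are assumptions for illustration.

```c
#include <stdint.h>
#include <string.h>

#define SPRITE_COUNT 125                 /* sprite count given in the thread */
#define SAT_BYTES    (SPRITE_COUNT * 4)  /* ASSUMED: 4 attribute bytes per sprite */

static uint8_t  sat_buf[2][SAT_BYTES];   /* two RAM shadow copies of the SAT */
static unsigned back = 0;                /* index of the copy being built */

/* Buffer that game logic may modify, possibly over several frames. */
uint8_t *sat_back(void)
{
    return sat_buf[back];
}

/* Call when the back buffer is complete. Returns the finished buffer,
   which the caller uploads to VRAM in one go during vblank, and seeds
   the new back buffer with the same contents so the next update starts
   from the current state. */
const uint8_t *sat_flip(void)
{
    const uint8_t *done = sat_buf[back];
    back ^= 1u;
    memcpy(sat_buf[back], done, SAT_BYTES);
    return done;
}
```

The caller would pass the pointer returned by `sat_flip()` to whatever VRAM copy routine it already uses, so all sprites change in the same frame.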