Vscreen optimization test

by snout on 23-11-2005, 00:17
Topic: Software

Active visitors of the MSX Resource Center already knew that Arturo Ragozini had found a way to enable VDP commands in SCREEN 4. During vblank, a switch to SCREEN 5 is made, enabling VDP commands. He revealed his theory in this forum thread and decided to prove his theory on Vscreen, the impressive smooth scrolling platform game engine.

The result is VTEST, a demo in which the Vscreen engine was modified to incorporate this trick. Full source code is included, so you can have a close look at how it all works. You can use the cursor keys to move around the test map. At the moment, it is not yet clear whether these optimizations will also be incorporated in the next VSCREEN release. Further details on the trick used can be found over here.

Relevant link: VTEST

Comments (10)

By ro

Guardian (4122)


23-11-2005, 08:08

and the results are....?!?

By ARTRAG

Enlighted (6276)


23-11-2005, 20:18

Actually it is too early for a news item like this :P

For now the only result is that there is a lot of Z80
spare time that could be used for other tasks.
In this demo the time is spent waiting for
the VDP CE flag. In a real situation you can do
other things, like extra I/O to the VRAM, for example.

From my tests you can OUTI more than 256 bytes while
updating a 32x24 screen. This could be sufficient for
sprite updating (you could have up to 2 sprite attribute tables
and 64 sprites, using a split screen).
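
The 256-byte figure lines up with two full sprite attribute tables. A minimal sketch of that arithmetic, assuming the standard layout of 32 entries per table and 4 bytes per entry (the constant and function names are only illustrative):

```python
# Byte budget for rewriting sprite attribute tables.
# Assumption: 32 sprites per table, 4 bytes per sprite entry
# (Y, X, pattern number, plus one more byte).
SPRITES_PER_TABLE = 32
BYTES_PER_SPRITE = 4


def attribute_bytes(tables):
    """Bytes needed to rewrite `tables` complete sprite attribute tables."""
    return tables * SPRITES_PER_TABLE * BYTES_PER_SPRITE


if __name__ == "__main__":
    print(attribute_bytes(1))  # one table
    print(attribute_bytes(2))  # two tables, one per half of a split screen
```

Two tables come to 256 bytes, which is the amount quoted above for OUTI transfers during a screen update.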

The Vblank time allows the VDP to do many different tasks,
like the animated object you can see. The example moves
a block of 32x4 tiles using TPSET, but larger areas are possible.

Another possible use could be updating color and pattern
definitions for sprites in real time, but there is a gain over a standard
RAM-to-VRAM copy only when you use 3 or more colors per line.

By Edwin

Paragon (1182)


23-11-2005, 21:09

I tried this method a while back when I was doing my first experiments for the Wings code. But for most cases there's just no point. It's useless for dynamic data. That needs to be stored in VRAM first anyway. You may as well store it in a place that you can change the pointer to. Also, in the VBLANK time you can barely copy more than the screen patterns. So all in all, I think it's only usable in very specific cases with fixed data and the need to free up CPU time. I'm even having trouble thinking of a good example.

By Maggoo

Paragon (1195)


23-11-2005, 21:28

Well, fast scrolling in screen 4 without using CPU time would be a good example :P Leave the CPU time available for, say, playing CPU-hungry OPL4 music and game intelligence...

By Edwin

Paragon (1182)


23-11-2005, 21:59

Good example. You could store a large vertical Aleste style level and copy the appropriate part to the screen. Efficiency for horizontal scrolling is much lower because of the random access writes needed to draw the new column. The same goes when there are a lot of dynamic updates, like pattern based enemies.

By Maggoo

Paragon (1195)


23-11-2005, 22:08

Have a look at the example made by ARTRAG. It actually does work for horizontal scrolling as well, and for pattern based enemies, using TPSET copy. That's the beauty of it!

By ARTRAG

Enlighted (6276)


23-11-2005, 22:36

@edwin
The demo updates a 32x24 screen with fewer than 8 VDP commands.
You can freely move the 32x24 window within a logical map of 128x256 tiles.

The same principle holds for a 32x24 window in a logical map of 256x256 tiles, but then I need up to 16 commands to update the screen.
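
The thread does not spell out how the tile map looks to the VDP command engine once SCREEN 5 is enabled. As a rough sketch, assuming the usual SCREEN 5 layout (128 bytes per line, 2 pixels per byte) and assuming the name table sits at 1800h, each tile number occupies a 2-pixel-wide cell at the coordinates computed below. The function name and the base address are assumptions for illustration, not taken from the VTEST source.

```python
# Map a SCREEN 4 name-table cell to the SCREEN 5 coordinates the VDP
# command engine would use to address the same VRAM byte.
BYTES_PER_LINE_SCR5 = 128   # 256 pixels at 4 bits per pixel
TILES_PER_ROW = 32          # name table row length in bytes
NAME_TABLE_BASE = 0x1800    # assumed name table address (illustrative)


def tile_to_scr5_xy(col, row, base=NAME_TABLE_BASE):
    """Return (x, y) in SCREEN 5 pixels for the tile at (col, row)."""
    addr = base + row * TILES_PER_ROW + col
    y = addr // BYTES_PER_LINE_SCR5
    x = (addr % BYTES_PER_LINE_SCR5) * 2   # one byte covers two pixels
    return x, y


if __name__ == "__main__":
    print(tile_to_scr5_xy(0, 0))    # top-left tile
    print(tile_to_scr5_xy(0, 1))    # first tile of the second row
    print(tile_to_scr5_xy(31, 23))  # bottom-right tile
```

Under these assumptions four name-table rows share one SCREEN 5 line, so the whole 32x24 table spans only six lines as seen by the command engine.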

By Sonic_aka_T

Enlighted (4130)


24-11-2005, 00:03

I did some tests once too, and it didn't really work out for me that well either. For a smooth vertical scroll it's great, but that's not what I was doing in this case. The thought first occurred to me when I ran into the reverse of this trick. My mode split was cutting off my scroller copy prematurely, something I never could quite fix by the way. Haven't looked at it since tho. But yes, there are indeed some applications for this 'trick'. The only downside is, like Edwin said, that there's only so much the VDP can move during VBLANK plus the odd few lines you can have SCREEN 5 enabled. In the case of VSCREEN it might work though, especially considering the status part of the screen could also be (or already is?) SCREEN 5.

By ARTRAG

Enlighted (6276)


24-11-2005, 11:44

The tests show that at 50Hz the VDP can, within the Vblank time alone,

1) scroll a 32x24 screen in 8 directions in a logical map of 128x256
2) move a first area of 32x4 tiles using TPSET (usable for animated objects)
3) move a second area of 32x4 tiles using TPSET (usable for a second animated object)

or

1) scroll a 32x24 screen in 8 directions in a logical map of 128x256
2) move an area of 32x8 tiles using TPSET (usable for animated objects)

Moreover, the VDP wait time is free for Z80 tasks.

The drawbacks are:

1) the scroll needs from 4 to 8 commands, so the Z80 time is sliced into 4 to 8 intervals

2) the update of the 32x4 area using TPSET needs 1 or 2 commands, thus the Z80 time could be sliced into 1 or 2 intervals

3) the update of the 32x8 area using TPSET needs from 2 to 4 commands, thus the Z80 time could be sliced into up to 4 intervals
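
As a quick tally of the figures above, assuming the command counts of the individual tasks simply add up within one Vblank (the names are illustrative only):

```python
# (min, max) VDP commands per task, as quoted in the list above.
SCROLL = (4, 8)       # scrolling the 32x24 screen
AREA_32X4 = (1, 2)    # one 32x4 area moved with TPSET
AREA_32X8 = (2, 4)    # one 32x8 area moved with TPSET


def total_commands(*tasks):
    """Sum the (min, max) command counts of the tasks run in one Vblank."""
    return (sum(t[0] for t in tasks), sum(t[1] for t in tasks))


if __name__ == "__main__":
    # Scroll plus two 32x4 areas
    print(total_commands(SCROLL, AREA_32X4, AREA_32X4))
    # Scroll plus one 32x8 area
    print(total_commands(SCROLL, AREA_32X8))
```

Under that assumption both configurations split the Z80 time into the same range of 6 to 12 intervals per frame.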

For bulks of OUTI this is good, for sprite AI it could be fine, for a music replayer it could be ugly :(

By ARTRAG

Enlighted (6276)


24-11-2005, 18:03

BTW I was thinking about the use of scr5/7 or scr6 for animations.
In this way you can get for free, using TPSET, extra frames for smooth
movement of objects made of tiles. In scr5 a logical movement of
1 pixel corresponds to a movement of one nibble in the tile map.

Assume 00h is a background tile and AAh is your moving block.
Assuming a platform of two tiles you have an initial configuration of

[...], 00h, AAh, AAh, 00h, [...]

using scr5/7 and TPSET you have:

step 1) 00h, 00h, AAh, AAh, 00h

step 2) 00h, 0Ah, AAh, A0h, 00h

step 3) 00h, AAh, AAh, 00h, 00h

If you define the tiles in a suitable way (in this case 0Ah, A0h and AAh),
you can have smooth object movement in steps of 4 pixels (using scr5 copy)
or 2 pixels (using scr6 copy).
This can waste a lot of tile definitions and causes a lot of problems in
the tile design because of the scr4 colour limitations; nevertheless this
seems a nice feature to use.
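
The three steps above can be reproduced on paper by treating each tile number as two 4-bit SCREEN 5 pixels and shifting the row by one nibble. The sketch below only models the effect on the data. On the real hardware the shift would come from a VDP copy with a one-pixel offset, and the helper name is hypothetical.

```python
def shift_left_one_pixel(row, fill=0x0):
    """Shift a row of bytes left by one SCREEN 5 pixel (one nibble).

    `row` holds tile numbers as seen in SCREEN 4; in SCREEN 5 each byte
    is two 4-bit pixels. `fill` is the nibble shifted in on the right.
    """
    # Split every byte into its high and low nibble.
    nibbles = []
    for b in row:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    # Drop the leftmost nibble and append the fill nibble on the right.
    shifted = nibbles[1:] + [fill & 0x0F]
    # Re-pack pairs of nibbles into bytes.
    return bytes((shifted[i] << 4) | shifted[i + 1]
                 for i in range(0, len(shifted), 2))


if __name__ == "__main__":
    step1 = bytes([0x00, 0x00, 0xAA, 0xAA, 0x00])
    step2 = shift_left_one_pixel(step1)
    step3 = shift_left_one_pixel(step2)
    for step in (step1, step2, step3):
        print(", ".join(f"{b:02X}h" for b in step))
```

The intermediate row contains the tile numbers 0Ah and A0h, which is why those tiles must be defined as the half-shifted versions of the AAh block.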