Transferring data after HMMC / LMMC command

Page 2/2
1 |

By sd_snatcher

Prophet (2734)

15-08-2017, 13:24

Grauw wrote:

Better alternative for LMCM:

10 SCREEN 0 : WIDTH 80 : PRINT : I = 65
20 VDP(26) = VDP(26) OR 64
30 VDP(47) = &HA0
40 VDP(37) = 0 : VDP(38) = 0 : VDP(39) = 0 : VDP(40) = 0
50 VDP(41) = 8 : VDP(42) = 0 : VDP(43) = 1 : VDP(44) = 0
60 VDP(46) = 0 : VDP(47) = &HF0
70 IF VDP(-2) AND 1 THEN VDP(45) = I : I = I + 1 : GOTO 70

Executes LMCM before it writes anything at all.

  • No need to set SX / SY or NX / NY values for the LMCM
  • No need to reset NY before the HMMC / LMMC
  • LMCM executes before everything else so can be done in a TR != 1 check after the CE wait loop
  • Or just always, it’s pretty quick, just a R#46 CMD write, that’s all
  • Plenty of time for TR to go high so definitely no need for a wait loop later (on line 60)
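
For readers mapping the listing onto the V9938 manual, here is a minimal Python sketch that translates the BASIC VDP() indices used above into register numbers. It assumes the offset-by-one rule that the listing itself implies (VDP(47) is the R#46 CMD write) and that negative indices address status registers. The helper name `describe` and the two lookup tables are hypothetical, made up for illustration only.

```python
# Command register names (R#36-R#46) and the two command codes used in the listing.
COMMAND_REGISTERS = {
    36: "DX low", 37: "DX high", 38: "DY low", 39: "DY high",
    40: "NX low", 41: "NX high", 42: "NY low", 43: "NY high",
    44: "CLR", 45: "ARG", 46: "CMD",
}
COMMANDS = {0xA0: "LMCM", 0xF0: "HMMC"}


def describe(basic_index, value=None):
    """Translate a BASIC VDP() index (and optional value) into a register label."""
    if basic_index < 0:
        # Negative indices read status registers: VDP(-2) is S#2.
        return f"S#{-basic_index}"
    # Assumption: only valid for the indices that appear in the listing (26, 37-47).
    reg = basic_index - 1
    name = COMMAND_REGISTERS.get(reg)
    text = f"R#{reg}" + (f" ({name})" if name else "")
    if reg == 46 and value in COMMANDS:
        text += f" = {COMMANDS[value]}"
    return text


if __name__ == "__main__":
    for index, value in [(47, 0xA0), (41, 8), (43, 1), (47, 0xF0), (45, 65), (-2, None)]:
        print(f"VDP({index}) -> {describe(index, value)}")
```

Read this way, line 30 issues LMCM, lines 40-50 set DX = DY = 0, NX = 8 and NY = 1, line 60 issues HMMC, and line 70 feeds bytes into CLR (R#44) while polling S#2.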

Awesome!

But I wonder: won't skipping the TR check break things on turbo machines?

By Grauw

Enlighted (7248)

15-08-2017, 14:24

No, because VDP access is always slowed down.

Additionally, even if there was no slowdown, there are 9 VDP register writes between the two CMD writes, which is plenty of wait time. An indication of the TR speed: in screen 5 with sprites enabled, I can back-to-back OUTI data to HMMC without write corruption, that’s just 18 cycles per byte. In the VDP_ExecuteCommandMMC routine I posted earlier, time between the two CMD writes is >200 cycles.

By DarkSchneider

Paladin (691)

16-08-2017, 10:46

I don't see much of a problem with specifying the color before the command executes. What I find more interesting is the possibility of modifying the coordinate values during execution. This would allow all the transfers to be done in a single command execution (or several "half" ones), by queueing up the corresponding source, destination and length values. Once finished, simply send a STOP command, so there is no need to rely on the command's own finish signal; alternatively, for the last block, set NY to its real value instead of NY+1.

Is there any benchmark showing whether there is a difference between using HMMC and a direct RAM -> VRAM transfer?

By NYYRIKKI

Enlighted (4947)

16-08-2017, 11:38

DarkSchneider wrote:

I don't see much of a problem with specifying the color before the command executes.

If you are just transferring data, it is more of a curiosity, but if you are generating data on the fly, this very easily turns into a problem: you either need some routines written twice, or you have to insert code that slows down the process.

By DarkSchneider

Paladin (691)

16-08-2017, 15:00

Procedural content, interesting.

By Grauw

Enlighted (7248)

16-08-2017, 16:05

HMMC is just as fast as using the indirect VRAM write. Both are just an OUTI to a port (18 cycles / byte, or exactly 16 bytes / line). Neither requires you to wait in the bitmap screen modes.

I like the idea about modifying the coordinates, however changes to SX, DX and NX may not immediately be applied. I think during command execution it increments / decrements X in internal registers, and only resets them to the set value when it advances to the next line.

By DarkSchneider

Paladin (691)

17-08-2017, 11:29

You probably need to execute CMD again to reset the counters.

If you are going to copy many blocks of the same width, you can set NX to that width, set NY to the sum of all their heights, and modify DX and DY for each block.
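
The bookkeeping for that idea could be sketched as follows, in Python. This only computes the register values and the line counts at which DX / DY would be changed; whether the VDP actually picks up such mid-command changes is exactly the open question in this thread. The function name `plan_batched_transfer` and the (dx, dy, height) tuple layout are hypothetical.

```python
def plan_batched_transfer(width, blocks):
    """Plan one command covering several equal-width blocks.

    blocks is a list of (dx, dy, height) tuples, in transfer order.
    Returns NX, the summed NY, and for each block the number of lines
    already transferred when DX / DY would have to be rewritten.
    """
    retarget_at = []
    lines_done = 0
    for dx, dy, height in blocks:
        retarget_at.append((lines_done, dx, dy))
        lines_done += height
    return {"NX": width, "NY": lines_done, "retarget_at": retarget_at}


if __name__ == "__main__":
    plan = plan_batched_transfer(16, [(0, 0, 8), (32, 40, 16), (64, 8, 8)])
    print(plan)
```

For three blocks of heights 8, 16 and 8, NY becomes 32 and the destination would be changed after 0, 8 and 24 lines.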

The other good thing is that the other commands let you apply logical operations using the VDP.

By Grauw

Enlighted (7248)

18-03-2018, 16:09

I used the LMCM method today in assembly (screen 5) and it works like a charm.

There was one snag though: in openMSX everything worked fine, but on my real MSX I encountered a freeze after a couple of frames (in a CE wait loop, I think) when I did two of these back-to-back each frame (moving diagonally). Strangely, it didn’t happen immediately or every time, and it also didn’t happen when I did it only once per frame with no other commands in between. Bit of a mystery, this one.

Since it was on the real MSX I haven’t debugged it fully yet, but I did find a fix: executing a STOP command after the LMCM avoided the issue. (A generous wait didn’t.) So be wary of this particular thing, and make sure to always test on a real MSX. If you apply this technique and find out more information, be sure to post here!

By PingPong

Prophet (3096)

18-03-2018, 18:34

Grauw wrote:

No, because VDP access is always slowed down.

Additionally, even if there was no slowdown, there are 9 VDP register writes between the two CMD writes, which is plenty of wait time. An indication of the TR speed: in screen 5 with sprites enabled, I can back-to-back OUTI data to HMMC without write corruption, that’s just 18 cycles per byte. In the VDP_ExecuteCommandMMC routine I posted earlier, time between the two CMD writes is >200 cycles.

Hi, Grauw. HMMC / OUTI works without corruption. OK.
What about LMMC / OUTI? Do you need to wait?

By Grauw

Enlighted (7248)

18-03-2018, 19:09

I haven’t tried, but looking at the speed measurements of LMMV and the command timing for LMMV, it is about two times slower than HMMV per processed unit (pixel / byte).

Since HMMC and LMMC work very similarly, from the command timing information we can see that at a minimum per unit HMMC needs 48-104 VDP cycles (8-17.3 CPU cycles), where LMMC needs 96-160 VDP cycles (16-26.7 CPU cycles). Given that OUTI takes 18 CPU cycles, it’s too fast for LMMC.
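
The arithmetic behind those numbers can be checked with a short Python sketch. It uses only the figures quoted above; the divisor of 6 VDP cycles per CPU cycle is inferred from them (48 VDP cycles given as 8 CPU cycles), and the names `TIMINGS` and `outi_keeps_up` are hypothetical.

```python
VDP_CYCLES_PER_CPU_CYCLE = 6   # inferred from the post: 48 VDP cycles = 8 CPU cycles
OUTI_CPU_CYCLES = 18           # per byte, as quoted in this thread

# Minimum VDP cycles per processed unit (best case, worst case), from the post.
TIMINGS = {"HMMC": (48, 104), "LMMC": (96, 160)}


def to_cpu_cycles(vdp_cycles):
    """Convert a VDP cycle count to CPU cycles."""
    return vdp_cycles / VDP_CYCLES_PER_CPU_CYCLE


def outi_keeps_up(command):
    """True if the command's worst case still fits within one OUTI."""
    _, worst = TIMINGS[command]
    return to_cpu_cycles(worst) <= OUTI_CPU_CYCLES


if __name__ == "__main__":
    for name, (best, worst) in TIMINGS.items():
        print(f"{name}: {to_cpu_cycles(best):.1f}-{to_cpu_cycles(worst):.1f} CPU cycles,"
              f" back-to-back OUTI safe: {outi_keeps_up(name)}")
```

The HMMC worst case (17.3 CPU cycles) fits just inside the 18 cycles of an OUTI, while the LMMC worst case (26.7) does not, which matches the conclusion above.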

(Additionally, I would think that with sprites enabled even HMMC would have issues, given that a VDP access slot is not always immediately available, however I’ve done that just recently and it did not seem to be the case, so I guess it manages to snag one in time after all. I don’t think it would explain the issues from my previous post.)
