GPGPU applications on MSX

By Grauw

Ascended (10767)

28-09-2008, 01:33

So ‘GPGPU’ (General Purpose GPU) programs are all the rage nowadays. I was wondering, who has good ideas for (calculation) tasks you could use the MSX v9938/58 VDP’s command engine for? Theoretically it is much faster than the Z80 :).

Some things the v9938 can do for you:

  • logical operations with other data (from VRAM or CPU): and, or, xor, not, timp, tand, tor, txor, tnot
  • 2 bit-precision rotates (in screen 6), i.e. multiplication and division by 4
  • 2 bit shifts (using and for bit masks)
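
To make the list above concrete, here is a minimal Python sketch that emulates these logical operations on a single pixel value. The helper name `vdp_logop` and its `bits` parameter are hypothetical, and the semantics (the ‘t’ variants leave the destination alone when the source colour is 0) are my understanding of the command engine, so treat it as an illustration rather than a reference.

```python
def vdp_logop(op, src, dst, bits=4):
    """Emulate one V9938-style logical operation on a single pixel.

    src and dst are colour values of `bits` bits each (4 in screen 5,
    2 in screen 6, 8 in screen 8). Hypothetical helper, not a VDP API.
    """
    mask = (1 << bits) - 1
    if op.startswith("t"):
        # Assumption: transparent variants skip the write when source is 0.
        if src == 0:
            return dst
        op = op[1:]
    if op == "imp":
        result = src
    elif op == "and":
        result = src & dst
    elif op == "or":
        result = src | dst
    elif op == "xor":
        result = src ^ dst
    elif op == "not":
        result = ~src
    else:
        raise ValueError("unknown logical operation: " + op)
    return result & mask


print(vdp_logop("xor", 0b1100, 0b1010))   # plain XOR of two 4-bit pixels
print(vdp_logop("tand", 0, 0b1010))       # source 0: destination kept
```

A block command applies the same operation to every pixel in a rectangle, which is where the parallelism would come from.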

Maybe there is more you can do by e.g. using screen 12, or by placing the data in memory in clever structures, or by using the ‘transparent’ versions of the operators, or by storing data in screen 5 and reading it in screen 6…? Maarten (tH) mentioned that you can color the area between two white dots on a black background by doing an XOR copy one pixel to the left; those kinds of techniques are interesting applications.

I was thinking about my Tiger Tree Hash implementation (link), which does a lot of 64-bit operations. These are relatively slow on the Z80, and the VDP could easily apply them to 8 bytes in one go, with carry-over. But I also need additions/subtractions and odd-number shifts, and TTH is not very easily parallelisable, so I would need to move a lot of data back and forth between the CPU and the VDP, which very likely causes too much overhead for this to be useful there.

So, any thoughts? Other types of calculations the VDP (or another MSX chip) can do, perhaps by coming up with clever combinations of these? And any applications for this? Certain kinds of hashes (CRC?), bitmap conversion, certain 3D calculations, folding@home, mp3 decoding :)?

Btw, you can also respond on my blog, if you want.

By Grauw

Ascended (10767)

28-09-2008, 02:23

Additionally, 2 or 4 bit to 8 bit conversion and vice versa (using LMMC/LMCM).
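
As a rough illustration of what such a conversion does to the data, here is a small Python sketch. The function names are hypothetical, and it assumes the leftmost pixel sits in the highest bits of each byte; it only models the data layout, not the LMMC/LMCM commands themselves.

```python
def unpack_pixels(data, bits):
    """Split each byte into 8 // bits pixel values, leftmost pixel first.

    Assumption: the leftmost pixel occupies the highest bits of the byte.
    """
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    pixels = []
    for byte in data:
        for i in range(per_byte):
            shift = 8 - bits * (i + 1)
            pixels.append((byte >> shift) & mask)
    return pixels


def pack_pixels(pixels, bits):
    """Inverse of unpack_pixels; len(pixels) must be a multiple of 8 // bits."""
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    out = bytearray()
    for start in range(0, len(pixels), per_byte):
        byte = 0
        for value in pixels[start:start + per_byte]:
            byte = (byte << bits) | (value & mask)
        out.append(byte)
    return bytes(out)


print(unpack_pixels(b"\xA5", 4))        # one byte as two 4-bit pixels
print(pack_pixels([3, 2, 1, 0], 2))     # four 2-bit pixels as one byte
```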

By Edwin

Paragon (1182)

28-09-2008, 10:04

Since any mathematical operation can be broken down into a series of logical operations, I'd expect any mass calculation to be possible. However, the sequential nature of operations such as addition (each carry depends on the previous one) will probably keep it from being as quick as you'd want it to be. I expect that the Z80 beats the hell out of the VDP for 8 and 16 bit additions. For larger ones you might close in. You'd have to do the math on that one. :P
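
To illustrate the point about sequential carries, here is a minimal Python sketch of addition built from XOR, AND and a shift only. The function name `add_with_logops` is hypothetical, and it assumes a 1-bit shift is available, which the list at the top of the thread suggests the VDP does not offer directly (screen 6 shifts by 2 bits), so this shows the principle rather than a working VDP recipe.

```python
def add_with_logops(a, b, width=8):
    """Add two unsigned integers using only XOR, AND and a 1-bit shift.

    Returns (sum modulo 2**width, number of passes needed).
    Assumption: a 1-bit left shift is available as a primitive.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    passes = 0
    while b:
        carry = ((a & b) << 1) & mask  # bits that carry into the next column
        a = a ^ b                      # column-wise sum without carries
        b = carry                      # feed the carries back in
        passes += 1
    return a, passes


total, passes = add_with_logops(200, 100)
print(total, passes)
```

Each pass corresponds to a full round of logical operations, and in the worst case the number of passes equals the width in bits, which is the overhead to weigh against the Z80's native add instructions.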

By mth

Champion (507)

28-09-2008, 12:29

Grauw: The XOR fill trick requires a copy one pixel to the right (unless you tell the VDP to read the pixels right-to-left). A friend told me about this trick, which is used a lot in the Amiga demo scene.
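
A minimal Python sketch of the trick on a single row, assuming the copy runs left to right and each source pixel is read after the previous destination pixel was written (so the result propagates along the row). The function name `xor_fill_row` is hypothetical.

```python
def xor_fill_row(row):
    """XOR-copy a row onto itself, shifted one pixel to the right.

    Assumption: pixels are processed left to right, so every write is
    visible to the next read and the XOR accumulates along the row.
    """
    for x in range(len(row) - 1):
        row[x + 1] ^= row[x]
    return row


WHITE = 15  # colour 15 on a black (0) background, as in screen 5
print(xor_fill_row([0, WHITE, 0, 0, WHITE, 0]))
```

The first dot switches the fill on and the second switches it off again, so the span from the first dot up to (but not including) the second ends up white.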

By Leo

Paragon (1236)

28-09-2008, 14:42

You can have an 8-bit × 8-bit lookup table in 64 kB for division or multiplication.
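
A rough Python sketch of such a table, assuming one result byte per entry and an index formed as `(a << 8) | b`; both the layout and the helper name `build_table` are hypothetical. Note that a full 8 × 8 bit product needs 16 bits, so with one byte per entry a 64 kB table holds either the low or the high byte of the product, whereas a quotient always fits in one byte.

```python
def build_table(op):
    """Build a 65536-entry table holding op(a, b) & 0xFF at index (a << 8) | b."""
    table = bytearray(256 * 256)
    for a in range(256):
        for b in range(256):
            table[(a << 8) | b] = op(a, b) & 0xFF
    return table


# Division by zero is mapped to 0 here; that choice is an assumption.
div_table = build_table(lambda a, b: a // b if b else 0)
mul_low = build_table(lambda a, b: a * b)          # low byte of the product
mul_high = build_table(lambda a, b: (a * b) >> 8)  # high byte of the product

a, b = 200, 100
product = (mul_high[(a << 8) | b] << 8) | mul_low[(a << 8) | b]
print(len(div_table), div_table[(200 << 8) | 7], product)
```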

By AuroraMSX

Paragon (1902)

29-09-2008, 10:52

It might then be even more interesting to use the V9990. It's faster than the V99x8 and has a more generic LogOp implementation. (TNXOR anyone? :P)

By Grauw

Ascended (10767)

30-09-2008, 11:23

Well maybe I’m missing something, or the documentation I’m looking at is not complete (http://msxbanzai.tni.nl/v9990/manual.html#writeoperations , I don’t have the official v9990 at hand :)), but as far as I can see the v9990 has the same logical operations as the v9938, which has TXOR as well… But yeah, it would be much faster :).

By AuroraMSX

Paragon (1902)

30-09-2008, 18:29

Read closer:
1. I said TNXOR :-P
2. The operations mentioned in the docs are examples. It even says so directly above the table. The four Lxx bits define the truth table for the logop, so you can define any binary operation you want.
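
A small Python sketch of what a truth-table-defined logop means. The function name and the bit order of the table (bit 3 for source=1/dest=1 down to bit 0 for source=0/dest=0) are assumptions made for illustration, so check the V9990 manual for the real register layout. The transparency part of a ‘T’ operation is not modelled here.

```python
def logop_from_truth_table(table, src, dst, bits=4):
    """Apply a 4-bit truth table to every bit pair of src and dst.

    Assumed table layout: bit index (s << 1) | d holds the result for
    source bit s and destination bit d.
    """
    result = 0
    for i in range(bits):
        s = (src >> i) & 1
        d = (dst >> i) & 1
        out = (table >> ((s << 1) | d)) & 1
        result |= out << i
    return result


XOR = 0b0110   # 1 where the two bits differ
NXOR = 0b1001  # 1 where the two bits are equal
AND = 0b1000
OR = 0b1110

print(bin(logop_from_truth_table(NXOR, 0b1100, 0b1010)))
```

With four table bits there are 16 possible operations, which covers every binary bitwise function, including ones the named list does not mention.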

By nitrofurano

Champion (303)

10-06-2016, 23:44

And how can we decompress RLE or ZX7 pictures into screens 12, 10, 8, 7, 6, 5...?

By Louthrax

Prophet (2465)

11-06-2016, 00:05

Maybe something like "the game of life" ??

By snout

Ascended (15187)

14-06-2016, 10:51

Might be a silly question, but would 192kB VRAM and/or ADVRAM have interesting benefits for GPGPU use as well?
