new GFX card

Page 12/20

By MagicBox

Master (198)

15-09-2008, 18:23

Okay. This is what I will do:

I'll keep quiet for now and design the VDPX with the basic specs as they evolved in this thread. Once I'm done and reached a point where I have a concrete design, perhaps I should make a dedicated threat to VDPX. What still has to be discussed is the preferred base port address (&H98 for the standard VDP), the register layout, etc.

But such a discussion only becomes useful once I have register definitions, which I don't have right now. Having gone the Altera way for the FPGA, the option is there to choose one with more capacity while retaining the same pinout. Eventually the project may grow into something that would replace/emulate a bunch of existing hardware, like SCC, FMPAC and Moonsound, as far as FPGA capacity allows. To be continued...

By ARTRAG

Enlighted (6573)

15-09-2008, 18:24

The OS has to deal with multiple colour depths in order to support the colour depth of different applications.
And applications cannot be built in 4 or 5 versions, or the dream of multi-platform applications under a single OS will fade.

By MagicBox

Master (198)

15-09-2008, 18:29

It can still do that. However, if SymbOS detects a VDPX, it would just let the VDPX do it. I'm sure SymbOS has some sort of display device driver code.

By Prodatron

Paragon (1801)

15-09-2008, 18:37

Yes, correct. The applications just have their 4 or 16 (or later 256) colour graphics and tell the OS to display them. It doesn't matter to the application which graphics mode the computer is really running in; that is the problem of the OS (and its display driver).
But if the VDPX could handle the required conversion, it would save a lot of code space and CPU time on the Z80 side.

@Trebmint: Such a VDPX version of SymbOS would always run in 256 colour mode and let the VDPX handle 4 and 16 colour graphics :)

By Trebmint

Champion (294)

15-09-2008, 19:30

Sorry, I don't want to hijack this thread as a SymbOS thing, but I don't think the 16>256, 4>256 and 2>256 conversions (and the reverse) will work with a 256 colour 3:3:2 or 2:3:3 bit layout. They work fine as they are laid out now, but they came from 16 colours down and were then worked into intensity grouping. That won't work very well for 256 unless the 256 colour mode is palettised. Then you can just drop the most significant nibble etc., and we'll just have to fit a full palette to match a corresponding 1 of the 16.

Where I'd like to see this go is to keep SymbOS 2/4/16 and perhaps 256 in GUI mode, but have a dedicated 256 colour full screen, multilayer mode for games. Keeping cross-platform compatibility for the games would then come down to developing versions of the VDPX for those SymbOS ports. SymbOS TBH is perfect for games as an OS, but drop the GUI for that.
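To make the difference concrete, here is a rough Python sketch of the two 256 colour schemes. It is only an illustration: the RRRGGGBB bit order, the function names and the idea of using the high nibble as a "palette bank" are assumptions, not anything defined for the VDPX or SymbOS.

```python
def rgb332_to_rgb888(pixel):
    """Expand a fixed 3:3:2 byte (assumed order RRRGGGBB) to 8-bit R, G, B."""
    r = (pixel >> 5) & 0x07
    g = (pixel >> 2) & 0x07
    b = pixel & 0x03
    # Scale each field to the 0..255 range.
    return (r * 255 // 7, g * 255 // 7, b * 255 // 3)


def widen_16_to_256(index16, palette_bank=0):
    """Place a 16 colour index in the low nibble of a palettised 256 colour index.

    palette_bank is a hypothetical name for the high nibble.
    """
    return ((palette_bank & 0x0F) << 4) | (index16 & 0x0F)


def narrow_256_to_16(index256):
    """Drop the most significant nibble to get back to 1 of the 16 colours."""
    return index256 & 0x0F
```

With a palettised mode the round trip is lossless as long as each group of 16 palette entries is fitted to the 16 base colours. With a fixed 3:3:2 layout, dropping a nibble just throws away red and part of green, which is why it does not map well.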

By PingPong

Prophet (3793)

15-09-2008, 20:23

@MagicBox: I have one suggestion about memory mapping. You mentioned using a 16KB page mapped into the Z80 address space. Please allow the bank number to be selected with a single out (port),bank. Just one OUT, because otherwise the Z80 has to do too much work.

You mentioned the old VDP style of addressing. It's good for block move operations. Here is a suggestion that is useful in some situations.
Split the VRAM pointer into two VRAM pointers, one for reading and one for writing, or better, give the option of having two VRAM pointers. That way, a copy that must be performed by the CPU *at any cost* is fast. Just do something like:

loop:
  in a,(...)     ; read a byte through the read pointer
  out (...),a    ; write it through the write pointer
  djnz loop      ; repeat B times

Also, give the ability to increase the VRAM pointer by a value held in a register, so not only by 1 but by a value in some range (0->256?). On the NES this helps a lot in block move operations.

Do you think it's possible to fit all of these suggestions on the FPGA?
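The two-pointer idea with a programmable step can be modelled in a few lines of Python. This is only a behavioural sketch: the class, the attribute names and the wrap-around at the end of VRAM are assumptions, and no real port numbers are implied.

```python
class VramPort:
    """Toy model of an I/O data port with separate read and write pointers."""

    def __init__(self, size=512 * 1024):
        self.vram = bytearray(size)
        self.read_ptr = 0
        self.write_ptr = 0
        self.read_step = 1    # added to read_ptr after every IN
        self.write_step = 1   # added to write_ptr after every OUT

    def port_in(self):
        """Model of 'in a,(...)': fetch a byte, then advance the read pointer."""
        value = self.vram[self.read_ptr]
        self.read_ptr = (self.read_ptr + self.read_step) % len(self.vram)
        return value

    def port_out(self, value):
        """Model of 'out (...),a': store a byte, then advance the write pointer."""
        self.vram[self.write_ptr] = value & 0xFF
        self.write_ptr = (self.write_ptr + self.write_step) % len(self.vram)


def cpu_copy(port, count):
    """The IN / OUT / DJNZ loop: no pointer setup is needed between bytes."""
    for _ in range(count):
        port.port_out(port.port_in())
```

Setting write_step to 2, for example, would spread the copied bytes over every other address, which is the kind of strided move the step register is meant for.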

Thx.

By Jorito

Mr. Ambassadors (1771)

15-09-2008, 20:33

Once I'm done and reached a point where I have a concrete design, perhaps I should make a dedicated threat to VDPX.

Won't that scare the poor VDPX? What if it runs away? :P

By MagicBox

Master (198)

15-09-2008, 20:56

OK, you caught a typo :P
@PingPong:

Yes, the idea was to have a 16KB window. There would be an OUT register to define the slot page, 0 ~ 3 (with an off bit to disable memory-mapped mode), and of course an OUT to select the VRAM bank (0 - 31 for normal VRAM, 512KB in total, and bank 32 being a special bank containing the sprite attribute table and the palette table).
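As a rough sketch of how that window would translate addresses, here is a small Python model. The function and constant names are made up for illustration, and it assumes slot page n starts at n x 16KB in the Z80 address space, which is the usual MSX page layout but not something fixed for the VDPX yet.

```python
WINDOW_SIZE = 16 * 1024   # one 16KB page
VRAM_BANKS = 32           # banks 0..31 are normal VRAM
SPECIAL_BANK = 32         # sprite attribute table and palette table


def translate(z80_addr, slot_page, bank):
    """Map a Z80 address to ('vram', offset) or ('special', offset).

    Returns None when the address falls outside the mapped window.
    """
    base = slot_page * WINDOW_SIZE          # page 0..3 -> 0x0000..0xC000
    if not (base <= z80_addr < base + WINDOW_SIZE):
        return None
    offset = z80_addr - base
    if bank == SPECIAL_BANK:
        return ("special", offset)
    if 0 <= bank < VRAM_BANKS:
        return ("vram", bank * WINDOW_SIZE + offset)
    raise ValueError("bank must be 0..32")
```

32 banks of 16KB give exactly the 512KB mentioned above, so the last byte of bank 31 is the last byte of VRAM.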

Adding separate read and write pointers for VRAM access through I/O takes only small logic functions that are easy to add. Even the step size you suggest can be implemented.

The most complex (and resource-consuming) part will be the sprite and pattern engine. Not to forget the VRAM arbiter, the logic that decides who gets to read/write VRAM. The contenders are the sprite engine, the pattern engine and the Z80.

Concurrent access is possible because the Z80 will take only 1 out of 60 VRAM cycles, even when using the fastest Z80 memory/I/O access via LDIR/OTIR.

The sprite engine does something smart. Before a new frame, it will copy all 128 selected sprite patterns (selected through the attribute table) into the internal SRAM of the FPGA, which runs at 200MHz. The copy will be completely finished in 1 scanline at most, even with concurrent Z80 access. When the screen is built, there are 60 cycles available per pixel. Since sprites can be processed 4 at a time (128 bits are evaluated per cycle), after 32 cycles of raw RAM access the engine has determined whether there's a sprite dot to show and, if so, which sprite and which dot in the pattern.
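The sprite budget is simple arithmetic, spelled out here in Python using only the figures from the paragraph above. The names are made up, and "32 bits per sprite per cycle" is just what 128 bits shared by 4 sprites works out to, not a stated spec.

```python
SPRITES = 128            # sprites selected through the attribute table
SPRITES_PER_CYCLE = 4    # sprites evaluated in parallel
BITS_PER_CYCLE = 128     # bits evaluated per cycle
CYCLES_PER_PIXEL = 60    # internal cycles available per pixel


def sprite_scan_cycles(sprites=SPRITES, per_cycle=SPRITES_PER_CYCLE):
    """Cycles needed to test every sprite against the current pixel."""
    return -(-sprites // per_cycle)   # ceiling division


bits_per_sprite = BITS_PER_CYCLE // SPRITES_PER_CYCLE   # 32
scan_cycles = sprite_scan_cycles()                       # 32
spare_cycles = CYCLES_PER_PIXEL - scan_cycles            # 28
```

So on these numbers the sprite scan uses a bit more than half of the per-pixel budget and leaves 28 cycles spare.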

Meanwhile the pattern engine performs 16 accesses out of 30 available per pixel. After those 16 accesses, it will determine in one clock what the resulting pixel will be, based on the 8 pixels read for each layer and the possible sprite pixel. This output is then buffered, ready for the pixel clock to pick it up.

I forgot one point: why use a Z80 to copy VRAM when the VDPX itself can copy pixels around 400x as fast? A Z80 machine cycle is 4 clock pulses (T1 - T4). In that time, the VDPX can move 10 pixels per clock = 40 pixels per Z80 machine cycle. I'm not sure, but a DJNZ'ed IN/OUT takes around 10 machine cycles. By the time the Z80 has copied 1 pixel, the VDPX would have copied 400 pixels. So why use Z80 copying? It's obsolete with this VDPX ;)
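The 400x figure can be checked with a quick Python calculation. The first result uses only the numbers from the post. The second one swaps in the documented Z80 T-state counts for IN A,(n), OUT (n),A and a taken DJNZ (11 + 11 + 13), without wait states; which instruction forms the loop would really use is an assumption, and both results treat "10 pixels per clock" as per Z80 clock pulse, as the post does.

```python
PIXELS_PER_CLOCK = 10          # VDPX throughput figure from the post
CLOCKS_PER_MACHINE_CYCLE = 4   # T1..T4, as in the post
LOOP_MACHINE_CYCLES = 10       # the post's rough guess for IN / OUT / DJNZ


def vdpx_pixels_per_z80_byte(loop_clocks):
    """Pixels the VDPX could move while the Z80 loop moves one byte."""
    return PIXELS_PER_CLOCK * loop_clocks


# Estimate exactly as in the post: 10 machine cycles of 4 clocks each.
estimate = vdpx_pixels_per_z80_byte(CLOCKS_PER_MACHINE_CYCLE * LOOP_MACHINE_CYCLES)

# Same calculation with T-state counts: IN A,(n)=11, OUT (n),A=11, DJNZ taken=13.
with_t_states = vdpx_pixels_per_z80_byte(11 + 11 + 13)
```

The estimate comes out at 400 and the T-state version at 350, so the rough guess of 10 machine cycles is in the right ballpark either way.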

By ARTRAG

Enlighted (6573)

15-09-2008, 21:12

Well, in theory copying VRAM to VRAM through the Z80 should be avoided, but in some cases there is processing that only the Z80 can do (e.g. on the PNT).

So you can have:

loop:
  in a,(...)     ; read a byte from VRAM
  ; complex processing on A goes here
  out (...),a    ; write the result back to VRAM
  djnz loop      ; repeat B times

Think of a VRAM depacker that does not use RAM buffers.
(MSX-o-mizer is slow due to this kind of problems)

By PingPong

Prophet (3793)

15-09-2008, 21:21

OK, you caught a typo :P
@PingPong:

I forgot one point: why use a Z80 to copy VRAM when the VDPX itself can copy pixels around 400x as fast? A Z80 machine cycle is 4 clock pulses (T1 - T4). In that time, the VDPX can move 10 pixels per clock = 40 pixels per Z80 machine cycle. I'm not sure, but a DJNZ'ed IN/OUT takes around 10 machine cycles. By the time the Z80 has copied 1 pixel, the VDPX would have copied 400 pixels. So why use Z80 copying? It's obsolete with this VDPX ;)

Yes, in the majority of situations it is pointless. But think about a simple colour replace routine that changes one pixel colour into another: such functions are too specific to be implemented on the VDPX. In functions like these, as ARTRAG pointed out, the CPU still has work to do.

Have you thought about the I/O port range that you will use? The range &H98 to &H9B should be reserved in order to preserve the old VDP...
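A colour replace of that kind is an IN / process / OUT loop where the processing step is an arbitrary function. In Python it could be sketched like this; the function names are hypothetical, and it assumes one byte per pixel and a forward copy, so overlapping source and destination ranges are not handled.

```python
def cpu_filter(vram, src, dst, count, transform):
    """Read each byte, let the CPU transform it, and write it back to VRAM."""
    for i in range(count):
        value = vram[src + i]                     # in a,(...)
        vram[dst + i] = transform(value) & 0xFF   # processing, then out (...),a


def replace_colour(old, new):
    """Build a transform that swaps one pixel value for another."""
    return lambda pixel: new if pixel == old else pixel
```

For example, cpu_filter(vram, 0, 4, 4, replace_colour(1, 9)) copies four pixels from address 0 to address 4 and turns every pixel of colour 1 into colour 9 on the way.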