NMS 8245 vdp upgrade

Page 4/6

By Grauw

Enlighted (6040)

20-08-2017, 02:02

Quoting a bit selectively… :) I was referring to having separate internal and external buses.

On the turboR the external bus timing is delayed by the S1990 to be within standard MSX bus parameters; 3.58 MHz clock, 2 cycles for memory access, 3 cycles for I/O access. The idle time between bus accesses is reduced of course, but that’s good, by using the bus bandwidth more effectively we get better performance.

Also internal hardware designed to run at 3.58 MHz should be on the external bus (e.g. OPLL, VDP, etc). Combine that with a connected V9958 /WAIT, and it might just work?

Basically what I meant was, the way most 7MHz turbo circuits and the 1chipMSX turbo modes work, where they speed up the external bus, is not right.

By sd_snatcher

Prophet (2540)

20-08-2017, 02:04

But the standard MSX bus parameters are 5 cycles for memory and 6 cycles for I/O. The R800 goes to the external bus to access the ROMs, the YM2413 and the T9769C much faster than that.

Again, the S1990 only forces it to respect the 5-cycle rule when it's doing memory accesses to the external slot connectors. And even that isn't enough if you count slow disk interfaces. This is why they made a software workaround in DOS2 to disable the R800 during disk accesses.

But being too conservative in a machine design can also have bad consequences, and the R800<->VRAM access speed is the perfect example of that. If I could have chosen the tradeoff, I would have picked faster VRAM access for the R800 without thinking twice. :)

The slow disk performance of DOS2 also comes to mind. Nextor is not that conservative and gets much better results.

By Grauw

Enlighted (6040)

20-08-2017, 02:21

sd_snatcher wrote:
Grauw wrote:

On the turboR the external bus timing is delayed by the S1990 to be within standard MSX bus parameters; 3.58 MHz clock, 2 cycles for memory access, 3 cycles for I/O access. The idle time between bus accesses is reduced of course, but that’s good, by using the bus bandwidth more effectively we get better performance.

But the standard MSX bus parameters are 5 cycles for memory and 6 cycles for I/O. The R800 goes to the external bus to access the ROMs, the YM2413 and the T9769C much faster than that.

Just to clarify, with 2-3 cycles I’m referring to bus cycles, so respectively 4-6 R800 cycles.
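A quick sketch of that arithmetic, assuming the R800 runs at exactly twice the 3.58 MHz bus clock (about 7.16 MHz). The function names are made up for illustration only:

```python
BUS_CLOCK_MHZ = 3.579545    # standard MSX bus clock
R800_PER_BUS_CYCLE = 2      # assumption: R800 clock is twice the bus clock


def bus_to_r800_cycles(bus_cycles):
    """Convert 3.58 MHz bus cycles into the equivalent number of R800 cycles."""
    return bus_cycles * R800_PER_BUS_CYCLE


def bus_cycles_to_us(bus_cycles):
    """Duration of the given number of bus cycles, in microseconds."""
    return bus_cycles / BUS_CLOCK_MHZ


for name, cycles in (("memory", 2), ("I/O", 3)):
    print(f"{name}: {cycles} bus cycles = {bus_to_r800_cycles(cycles)} R800 cycles "
          f"= {bus_cycles_to_us(cycles):.2f} us")
```

So the 2 and 3 bus cycles mentioned above correspond to 4 and 6 R800 cycles.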

Surely the YM2413 is connected to the external bus as well? The chip is built to operate around 3-4 MHz afaik, and pretty sure I/O to it is delayed just like any other I/O is… I can’t imagine that it’s getting a 3.5 MHz clock (because pitch), but full-on R800 high speed timing signals. Unless Yamaha made a custom YM2413 to operate at those speeds.

By sd_snatcher

Prophet (2540)

20-08-2017, 02:42

Some people find this confusing, but the MSX bus is in fact asynchronous. Otherwise you could never connect things like EPROMs or SRAMs to it. The /WAIT signal is typical of asynchronous buses.

The very same applies to chips like the YM2413, PSG and SCC. Their clock lines are only used to drive the internal frequency generators. From the design point of view, they're no different from a UART or the VDP: they can have their own clock for their internal processes, but the communication with the CPU is entirely asynchronous.

For example, the Sega FM-70 OPLL sound expansion for the Master System has its own crystal and has no issues communicating with a Z80 that runs faster than the one inside the MSX. A lot of MSX models have one crystal for the VDP and another crystal for the CPU.

The only reason they reused the 3.57 MHz clock from the VDP for the PSG, YM2413 and SCC was to save some pennies on a crystal. External cartridges should never have done that, because the clock pin on the slot is the CPU clock and not some "fixed general purpose clock".

The clock line of the bus is only needed for the rare cases when you need real synchronization with the CPU timing. But in those cases it's vital. For example, DRAM controllers. They need the clock to make the RAS/CAS transitions at the correct time.

By DarkSchneider

Champion (511)

20-08-2017, 09:42

Grauw wrote:

No need for a turbo :D, at the regular Z80 speed I have just recently been accessing the V9938 too fast in GRAPHIC4 mode while writing sprite attribute data with a series of 4x out (c),r during execution of a VDP command. In TEXT2 mode it is also very easy to surpass the limits; the VDP has very few access slots in that screen mode and is slower than the TMS9918. I ran into that with Synthesix a couple of years back.

But it doesn’t really bother me, just something I’ve learned from the beginning to take into account, at least I’m glad I know now what needs a wait and what doesn’t, and why, and how much (in the old days I was just adding spurious nops between any VDP access). Even if it was handled for the VDP, sound chips have the same need, even more so, so in my mind this is just a normal part of assembly programming.

Btw, I wonder what this note in the datapack (about the WTE setting) is about:

“ When setting this bit to "1", please make dummy VRAM access before that. ”

If you don’t do this, could it somehow perhaps halt the system or something?

What? I thought the wait circuitry avoided corruption on MSX2 or later for fast accesses. So can corruption still occur?

By Grauw

Enlighted (6040)

20-08-2017, 13:41

Yes, MSX2 or later does not have wait circuitry. Only machines with turbo do, out of necessity, to avoid VDP corruption in software not designed for turbo.

By Grauw

Enlighted (6040)

20-08-2017, 14:31

@sd_snatcher I took a long look at the turboR schematics, and the following is my understanding:

The S1990 communicates with the T9769C via the same data path as the external slots, so with standard timing. ROM, VDP, OPLL and FDC connect to the S1990 directly, and we know that it has different timing for the ROM and VDP. For the OPLL I’m pretty sure the S1990 applies the same timing to it as to the external bus; I doubt it can handle more. Idk about the FDC and the devices built into the S1990, like the timer and PCM, but afaik openMSX emulates all I/O as throttled to external bus speeds.

(Something I found interesting is that RAM is directly controlled by the R800, although I think the S1990 does manage the paging and introduces some waits, and the Z80 accesses the RAM through the (inactive) R800.)

Anyway, all I wanted to say originally is that I think a properly designed turbo should communicate with the external bus according to MSX standard timing, like the turboR does. This means a 3.58 MHz clock, and read/write timing for memory and I/O that matches (more or less) the Z80 at 3.58 MHz. Otherwise there are compatibility issues, and not just with chips that use the clock.

So given that, I wonder if the V9958 is connected to the external bus, whether connecting /WAIT to the CPU is then sufficient for it to work smoothly, because it gets the normal read/write timings that it was designed for.

By mohai

Hero (649)

20-08-2017, 16:44

I think the WAIT pin must be connected to the CPU somehow.
The V9958 is faster than previous VDP versions but legacy compatible. I mean, it can deal with data faster than previous versions, but that extra speed is exclusive to this chip.
Legacy software does proper delays to avoid VRAM corruption (on the 9918, 9938 ...) so it will work with no problem for sure. I think that if you want to transfer data faster and forget about software delays, you can do it, because the WAIT signal activates when needed. Let's say it is the pure V9958 way.
Imagine that you are writing a program for the V9958 and you are doing software delays when writing to or reading from VRAM. You will need to recalculate every delay, depending on the speed of the MSX. The WAIT signal avoids that.
As an example, as I said before, try F1 Spirit 3D on an MSX with no WAIT signal. You will see a lot of garbage on the in-game screen. It works perfectly on an MSX with the WAIT signal connected. I tried it on my 8245+ with and without it.
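
A rough sketch of why software delays have to be recalculated for each CPU speed. The 8 µs minimum gap between VRAM accesses is a placeholder chosen for illustration, not a datasheet figure, and the function name is hypothetical:

```python
import math


def cycles_needed(min_gap_us, cpu_clock_mhz):
    """CPU cycles that must elapse between two VRAM accesses
    so that at least min_gap_us microseconds pass."""
    return math.ceil(min_gap_us * cpu_clock_mhz)


MIN_GAP_US = 8.0  # hypothetical minimum gap, for illustration only

for clock_mhz in (3.579545, 7.15909):
    print(f"{clock_mhz:.2f} MHz: at least "
          f"{cycles_needed(MIN_GAP_US, clock_mhz)} cycles between accesses")
```

The same time gap costs about twice as many cycles at double the clock, so a delay loop tuned for 3.58 MHz is too short on a faster CPU. A hardware wait signal sidesteps that because the stall is measured in time, not in instructions.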

By mohai

Hero (649)

20-08-2017, 16:48

The problem with the turboR is that a new VDP (or maybe 2 VDPs in parallel) was in the first design, but due to fast-dropping sales and schedule delays the V9958 was used instead. So a simple solution was used to keep compatibility with this chip while having the R800 run at faster speeds.
Again, the WAIT signal was very convenient for this.

By Grauw

Enlighted (6040)

20-08-2017, 17:07

The V9958 is not faster than the V9938; its design is near-identical with just minor additions. And neither the turboR nor the MSX2+es with turbo use the V9958 wait signal; they use another wait mechanism. The only performance benefit the V9958 could offer turbo machines is that VRAM access can be done with the absolute minimal amount of wait required if the wait is connected (but it’s not).

Also, F1 Spirit 3D uses screensplits heavily (every line), so by its nature it is very timing-sensitive and if the code is not written carefully to take turbo CPUs into account, it will not look nice. A connected V9958 wait signal will not help, because it only triggers on VRAM access, while screensplits are done with only register read/writes. This is why every turbo circuit has a non-turbo mode for compatibility.

On a side note, V9938 register I/O is really quick and as far as I know does not require a delay other than the normal I/O bus timings. It’s only the VRAM access which is slow due to the access slots mechanism (for sharing VRAM access with the display circuitry). A shame that the turboR adds a big delay regardless. The V9958 wait signal does this better (but again, it’s unused).
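
To make the difference concrete, here is a toy model comparing a fixed per-access delay with a wait signal that stalls only as long as needed. All numbers and function names are invented for illustration; they are not measured turboR or V9958 timings:

```python
def fixed_delay_cost(gaps_us, penalty_us):
    """Total stall time when every access pays the same fixed penalty."""
    return sum(penalty_us for _ in gaps_us)


def wait_signal_cost(gaps_us, ready_after_us):
    """Total stall time when each access waits only for the time
    the VDP still needs since the previous access."""
    return sum(max(0.0, ready_after_us - gap) for gap in gaps_us)


# Time elapsed since the previous VRAM access, per access (hypothetical values).
gaps_us = [1.0, 4.0, 10.0]

print(fixed_delay_cost(gaps_us, penalty_us=8.0))      # fixed scheme
print(wait_signal_cost(gaps_us, ready_after_us=8.0))  # wait-signal scheme
```

In this model the fixed scheme stalls for 24.0 µs in total, while the wait-signal scheme stalls for 11.0 µs, because the access that arrives after 10 µs does not need to wait at all.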
