OUTI takes 1 extra cycle on Turbo machine

Page 3/4

By Grauw

Enlighted (7321)

28-07-2018, 21:18

Yeah, those wait states can be pretty annoying… I’m really mystified as to why they were introduced; surely they put a lot of value on compatibility, so why do that? I reckon it has to do with the 5.37 MHz turbo mode that the T9769 supports, used in the Panasonic FX, WX and WSX (but not the others), but it’s quite the oversight to keep it in place when the turbo is disabled. And since this engine was introduced with MSX2+, why not use the V9958 WAIT function instead? Oh well, no use crying over spilled milk.

By gdx

Prophet (2161)

29-07-2018, 12:10

Eugeny_Brychkov wrote:
Grauw wrote:

To be clear, most MSX2+ machines also have these wait cycles, even two of them:

And I guess it is not defined in the MSX standard... One more reason why MSX faded away from the market: companies started to make and release "custom" hardware competing with the PC. This is a very small deviation or variation in the design, but a severe obstacle to using the machine for an RTOS and real-time tasks like audio and video.

Since these machines have a turbo mode, it can compensate for this extra waiting time, right?

By Grauw

Enlighted (7321)

29-07-2018, 13:02

The Sanyos and Panasonic FM and Al Alamiah don't have a turbo mode...

By DarkSchneider

Paladin (716)

29-07-2018, 14:57

@Grauw because of laziness. From (the released) MSX2+ onwards the MSX systems were garbage simply because of small, easily fixable details. It is clear they had no intention of paying much attention to the system.
I have my own opinion about what happened and responsibilities but probably this is not the thread to talk about it.

By gdx

Prophet (2161)

29-07-2018, 14:58

Dammit! Sanyos too.

By Eugeny_Brychkov

Paladin (964)

10-08-2018, 15:44

Pencioner and I are taking measurements and doing some checks on the Panasonic FS-A1WX machine, and the results are as follows:

  • In non-turbo mode:
    • Checked/flipped bits of S1990 ports (ports E4/E5, which are not there) with no results;
    • Checked all 256 switched index regs (port 40); only #0 and #8 return the NOT of the value written. 0 is logical, as FF (no data) inverted is 0;
    • Checked/flipped all bits of switched index reg #8 (ports 41-4F) except the processor and ROM/DRAM change bits while running the performance measuring tool; in all circumstances it shows an additional wait state of 279/280 nanoseconds for writing to VDP ports.
  • In turbo mode:
    • The EX (SP),HL instruction runs one T-cycle less than LD A,(IX+0), which by the Z80 documentation is not correct. It seems “turbo” mode is not only about increasing speed by 1.5, but also changes the timing of the instructions. Might be a kind of “prequel” of the R800 inside;
    • The measurement tool performs a set of OUTs in a row, and there's an additional delay of 5.962 us (32 T-cycles @ 5.37) between OUT commands.
  • I am looking for ideas on what else we can do and what we should look into. The goal is to disable these additional VDP I/O wait states in non-turbo mode.

    Edit: I will put a measurement of these wait states into the next release of _NETSYSINFO, along with a measurement of RTC accuracy.
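The figures in the list above line up with whole T-cycles, which can be checked with a short Python sketch. It assumes the standard MSX Z80 clock of 3.579545 MHz and takes the 1.5x turbo factor from this post; the function and constant names are invented for the example.

```python
# Standard MSX Z80 clock. The 1.5x turbo factor is taken from the thread
# (it gives roughly 5.37 MHz); both names below are illustrative only.
BASE_CLOCK_HZ = 3_579_545
TURBO_CLOCK_HZ = BASE_CLOCK_HZ * 1.5


def t_cycles_to_ns(t_cycles, clock_hz):
    """Duration of the given number of T-cycles in nanoseconds."""
    return t_cycles * 1e9 / clock_hz


def ns_to_t_cycles(duration_ns, clock_hz):
    """Number of T-cycles (possibly fractional) that fit in a duration."""
    return duration_ns * clock_hz / 1e9


if __name__ == "__main__":
    print(f"1 T-cycle, non-turbo: {t_cycles_to_ns(1, BASE_CLOCK_HZ):.1f} ns")
    print(f"1 T-cycle, turbo:     {t_cycles_to_ns(1, TURBO_CLOCK_HZ):.1f} ns")
    print(f"32 T-cycles, turbo:   {t_cycles_to_ns(32, TURBO_CLOCK_HZ) / 1000:.3f} us")
    print(f"279 ns, non-turbo:    {ns_to_t_cycles(279, BASE_CLOCK_HZ):.3f} T-cycles")
```

Under these assumptions one T-cycle lasts roughly 279.4 ns in non-turbo mode and 186.2 ns in turbo mode, so the 279/280 ns figure corresponds to a single extra T-cycle and the 5.962 us delay to about 32 turbo T-cycles.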

By Grauw

Enlighted (7321)

10-08-2018, 18:54

Eugeny_Brychkov wrote:

The EX (SP),HL instruction runs one T-cycle less than LD A,(IX+0), which by the Z80 documentation is not correct. It seems “turbo” mode is not only about increasing speed by 1.5, but also changes the timing of the instructions. Might be a kind of “prequel” of the R800 inside;

No, that is correct; on MSX ex (sp),hl takes 20 cycles, and ld a,(ix + 0) takes 21. This is due to the M1 wait cycle which is present on the MSX architecture (and btw also documented in the Z80 manual), and due to the latter having two M1 cycles because it’s a prefixed instruction.

This is the case on all MSX computers, and should not differ between turbo and nonturbo modes. I don’t think a nonstandard Z80 is likely, especially not a prequel to the R800.

See this instruction reference table, particularly the “Timing Z80+M1” column, which is the one you should use as instruction timing reference for MSX.
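As a rough illustration of the M1 wait rule described above, the following Python sketch adds one wait per M1 (opcode fetch) cycle to the documented Z80 T-state counts. The table, the function name `msx_cycles` and the `m1_waits` parameter are made up for this example, and only instructions that appear in this thread are listed.

```python
# (documented Z80 T-states, number of M1 opcode-fetch cycles) per instruction.
# Illustrative table; only instructions mentioned in this thread are included.
INSTRUCTIONS = {
    "ex (sp),hl": (19, 1),
    "ld a,(ix+0)": (19, 2),  # the DD prefix adds a second M1 cycle
    "dec de": (6, 1),
    "ld a,d": (4, 1),
    "or e": (4, 1),
    "jr nz (taken)": (12, 1),
}


def msx_cycles(name, m1_waits=1):
    """T-cycles for an instruction when every M1 cycle gets m1_waits waits."""
    base_cycles, m1_count = INSTRUCTIONS[name]
    return base_cycles + m1_count * m1_waits


if __name__ == "__main__":
    for name in INSTRUCTIONS:
        print(f"{name:14s} Z80: {msx_cycles(name, 0):2d}  MSX: {msx_cycles(name):2d}")
```

With the default of one wait per M1 cycle this gives 20 cycles for ex (sp),hl and 21 for ld a,(ix+0), matching the numbers quoted above.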

(Btw, doesn’t GR8BIT have the M1 cycle wait circuitry implemented as well?)

Eugeny_Brychkov wrote:

The measurement tool performs a set of OUTs in a row, and there's an additional delay of 5.962 us (32 T-cycles @ 5.37) between OUT commands.

Interesting data. I presume this is OUTing to the VDP? Since I would expect the VDP to receive a longer wait than the other I/O ports. That’s quite a strong wait, similar to the Turbo R…

By Eugeny_Brychkov

Paladin (964)

10-08-2018, 20:34

Grauw wrote:

No, that is correct; on MSX ex (sp),hl takes 20 cycles, and ld a,(ix + 0) takes 21. This is due to the M1 wait cycle which is present on the MSX architecture (and btw also documented in the Z80 manual), and due to the latter having two M1 cycles because it’s a prefixed instruction.

But you can see in the picture that the difference in execution time in "turbo" mode between ex (sp),hl and ld a,(ix + 0) is -186 ns (negative, minus 186 nanoseconds). This is a test on the real machine. And it means that the specification is not fulfilled. I am stuck with it, as it should NOT be this way per both the Z80 specification and the MSX specification.

Here's the code, both loops running with DE=2048 as input. I must say that I do NOT assign IX any value, so it holds an arbitrary one. Also note that the machine is an MSX2+ (thus no DRAM mode in there).

waist1:
	ex	(sp),hl		; 19+n
	dec	de		; 6+n
	ld	a,d		; 4+n
	or	e		; 4+n
	jr	nz,waist1	; 12+n
	ret			; --- 45+5n (t0)
waist2:
	ld	a,(ix+0)	; 19+2n
	dec	de		; 6+n
	ld	a,d		; 4+n
	or	e		; 4+n
	jr	nz,waist2	; 12+n
	ret			; --- 45+6n (t1)

Grauw wrote:

This is the case on all MSX computers, and should not differ between turbo and nonturbo modes.

The real test shows the opposite!
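To see what the cycle annotations in the two loops predict, here is a small Python sketch that evaluates 45+5n and 45+6n. It assumes that n is the number of wait states per M1 cycle and that the turbo clock is 1.5 times 3.579545 MHz; the function names are invented for this example, and whether the measuring tool reports a per-iteration figure is not stated in the thread.

```python
BASE_CLOCK_HZ = 3_579_545            # standard MSX Z80 clock
TURBO_CLOCK_HZ = BASE_CLOCK_HZ * 1.5  # assumed turbo clock, about 5.37 MHz


def waist1_cycles(n):
    """T-cycles per iteration of the ex (sp),hl loop, from its comments."""
    return 45 + 5 * n


def waist2_cycles(n):
    """T-cycles per iteration of the ld a,(ix+0) loop, from its comments."""
    return 45 + 6 * n


def predicted_difference_ns(n, clock_hz):
    """Predicted (waist1 - waist2) time per iteration in nanoseconds."""
    return (waist1_cycles(n) - waist2_cycles(n)) * 1e9 / clock_hz


if __name__ == "__main__":
    for n in (0, 1):
        diff = predicted_difference_ns(n, TURBO_CLOCK_HZ)
        print(f"n={n}: {waist1_cycles(n)} vs {waist2_cycles(n)} cycles, "
              f"difference {diff:.1f} ns per iteration")
```

With n = 1 the predicted difference is one T-cycle per iteration, about -186.2 ns at the assumed turbo clock; with n = 0 the two loops would take the same time.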

Grauw wrote:

(Btw, doesn’t GR8BIT have the M1 cycle wait circuitry implemented as well?)

It does, configurable by switches from 0 to 15 wait states.

Grauw wrote:

Interesting data. I presume this is OUTing to the VDP? Since I would expect the VDP to receive a longer wait than the other I/O ports. That’s quite a strong wait, similar to the Turbo R…

I calculate the time difference between 2048 OUTs in a row to the GR8NET port (5E) and 2048 OUTs to port 99 (VDPrw) and to port 98 (VDPmw). Output to port 5E is expected to have no extra wait states, and these two values represent the difference in nanoseconds when outputting to the specific port.
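The differential measurement described above could be sketched as follows in Python. The function name and the totals in the example are hypothetical and are not measured values; the sketch only assumes that both runs execute the same number of OUTs and that the reference port adds no wait states.

```python
def extra_wait_per_out_ns(reference_total_ns, target_total_ns, out_count=2048):
    """Average extra time per OUT on the target port versus the reference port."""
    if out_count <= 0:
        raise ValueError("out_count must be positive")
    return (target_total_ns - reference_total_ns) / out_count


if __name__ == "__main__":
    # Synthetic totals for illustration only, NOT measurements from the thread.
    reference_total_ns = 2048 * 3000.0  # e.g. a run of OUTs to the reference port
    target_total_ns = 2048 * 3280.0     # e.g. the same run to a VDP port
    print(extra_wait_per_out_ns(reference_total_ns, target_total_ns))
```

Subtracting the reference run cancels the loop overhead that both runs share, so only the port-specific wait remains.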

Wouterv said that such a long time (32 T-cycles) can be a cumulative delay between OUT commands to VDP ports, so that time is free for performing other operations between two OUTs, but this requires a specific programming technique that game and other application developers certainly did not follow in the early MSX years. It also means that "streaming" video is not possible on such MSX2+ and Turbo-R machines, as there is nothing else to be done between moving data into VRAM (video is really only about moving data to the VRAM, not counting audio data, which is 10 times smaller than video data).

I can achieve such nanosecond precision with GR8NET as it has a controlled generator with a 100 ns time step.

My main question for this thread is still: where should we look, and what should we try, on this MSX2+ machine based on the T97* chip to find the setting that disables the additional 279 ns wait state for the VDP in non-turbo mode?

By Pencioner

Hero (637)

10-08-2018, 21:31

gdx wrote:

Since these machines have a turbo mode, it can compensate for this extra waiting time, right?

I tried with Manbow 2, Aleste, and Space Manbow and there are no delays in 5.37 mode (Carnivore 2 has a great option for turning MSX2+ turbo on and off without patching ROMs, so it was easy to test)

By Eugeny_Brychkov

Paladin (964)

20-08-2018, 22:33

Small update: searching through the S1990 Turbo-R registers (#e4/#e5) in turbo mode did not change the wait state behaviour; however, it seems register #6 has a bit 7 that causes the machine to reboot. This bit does not seem to be documented.

The Turbo-R technical handbook also lists port #e3 as "?"; we scanned it too without finding anything.
