R800 clock cycles per instruction

Page 2/2

By wouter_

Champion (508)

25-06-2011, 22:49


> Most (all?) R800 timing tables indeed list 2 cycles for a LD A,(HL) instruction. But that doesn't count the penalty cycles for switching between opcode fetching and data reads/writes.

> This reminds me of what I have since read about 256-byte DRAM pages. Code is typically in a different page, which I guess is what is meant by "switching". That would mean that with code and data in the same page, LD A,(HL) should take 2 cycles.

Unfortunately that's not the case. Even when data and code are in the same 256-byte memory region, there's still this 2-cycle overhead. See the doc/r800test.txt document for details.

By WORP3

Paladin (864)

01-07-2011, 08:43

One more question: what is the exact clock frequency of the turbo R? Is it 7.16 MHz?
I need it for recalculating some functions to R800 timing!

By wouter_

Champion (508)

01-07-2011, 10:46

> One more question, what is the exact clock frequency of the turbo-r, is this 7.16MHz?

Indeed, exactly twice as fast as the Z80 clock.

> I need it for re-calculating some functions to R800 timing!

It's very hard to calculate exactly how many cycles a routine on R800 will take. For Z80 you could simply add all the cycles of the individual instructions. The only difficulties on Z80 are:
* Compared to most Z80 timing tables you find, you have to add 1 or 2(!) extra wait cycles.
* Some instructions have two possible cycle counts (e.g. taken/non-taken conditional relative jump, last iteration of a LDIR, ...)

For R800 there are many more exceptions:
* Compared to most R800 timing tables, you have to add cycles for 'page-breaks' (see the 'LD A,(HL)' example earlier in this thread). A page-break occurs when the upper 8 bits of the (16-bit) address for a memory read/write are different from those of the previous memory read/write. A page-break _also_ occurs when the R800 switches between opcode fetching, memory reads and memory writes (though there are exceptions, e.g. 'EX (SP),HL'). There's also always a 'page-break' (or at least the extra cycle) when accessing any memory other than the internal RAM.
* Refresh cycles: __approximately__ every 184 or 185 cycles the R800 is stalled for 26 cycles, to refresh the (internal) RAM. Note that the exact timing behavior for refresh is not clear to me yet (help would be appreciated!). According to some (vague) documentation the timing may also be different on a turbo R FS-A1ST compared to a FS-A1GT (I didn't test this myself).
* The internal R800 clock runs at 7MHz, but the external clock for the MSX cartridge slots is still 3.5MHz. So if you e.g. perform an OUT instruction, it's possible the R800 has to wait for 1 cycle (1 cycle @ 7MHz, half a cycle @ 3.5MHz) to get 'aligned' to the external bus timing.
* JR and DJNZ take 1 cycle longer than expected when the instruction starts at address 0x..FE (see doc/r800-djnz.txt for details).
* Directly after an EI instruction there is never a RAM refresh (so a very long sequence of EI instructions executes faster than an equally long sequence of NOP instructions!)
* (likely there are more yet unknown exceptions)

All the exceptions above are implemented in openMSX, though there are still cases where the R800 timing on a real machine differs from the emulated timing in openMSX. (My current best guess is that the model for the refresh timing is not 100% correct. Any help here would be very much appreciated!)

So I'm afraid that if you want to re-calculate some functions to R800 timing, the best you can do is get an approximate or average timing.
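
For an approximate figure on real hardware, one option is to measure instead of calculate. The following is only a minimal sketch. It assumes the turbo R system timer on I/O ports E6h (low byte) and E7h (high byte), which counts up once every 3.911 µs (about 28 R800 cycles at 7.16 MHz) and is reset by any write to E6h. MEASURE and ROUTINE are hypothetical labels chosen for illustration.

```
; Returns HL = elapsed system-timer ticks for one call of ROUTINE.
; Assumptions: turbo R only; ROUTINE is a placeholder for the code under
; test and does not enable interrupts itself.
MEASURE:
	di			; keep interrupt handlers out of the measurement
	out	(0E6h),a	; any write resets the counter to 0
	call	ROUTINE
	in	a,(0E6h)	; low byte of the counter
	ld	l,a
	in	a,(0E7h)	; high byte of the counter
	ld	h,a
	ei
	ret
```

The two reads are not atomic, so the result can be off by up to 256 ticks if the low byte wraps between them. The figure also includes the CALL/RET overhead and whatever refresh stalls happened to fall inside the window, so averaging several runs probably gives a more useful number.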

By WORP3

Paladin (864)

01-07-2011, 15:28

Thanks for the info. Luckily I only need to guarantee that a routine takes up a minimum time; all the turbo R issues above will only make it slower, which is just fine ;)
When I get the time I will use a timer to determine a more exact timing, so I can make it switch automatically between a turbo R, 7 MHz or standard MSX.

Thanks,
Tjeerd.

By sd_snatcher

Prophet (3642)

01-07-2011, 20:35

@WORP3: the correct way of doing this kind of delay on an MSX Turbo-R is to use the System-Timer at I/O ports E6h/E7h. Take a look at the MSX Assembly Page for more info on how to use this timer.
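
As a rough illustration of the idea, here is a minimal delay sketch. It assumes the timer's low byte on port E6h advances once every 3.911 µs and that any write to E6h resets it to zero. TIMER_WAIT is a hypothetical label, and the routine only makes sense on a machine that actually has the timer.

```
; Wait at least B timer ticks (about 3.911 us each), B = 1..255.
; Modifies: AF
TIMER_WAIT:
	out	(0E6h),a	; any write resets the counter to 0
TIMER_WAIT_LOOP:
	in	a,(0E6h)	; read the low byte of the counter
	cp	b
	jr	c,TIMER_WAIT_LOOP	; loop while counter < B
	ret
```

On a machine where the port reads back FFh the loop exits at once, so there the delay has to come from somewhere else.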

What exactly do you want the processor to wait for?

By WORP3

Paladin (864)

02-07-2011, 17:17

I had to recalculate the YM2413 routines; they need a minimum dead time after writing the address or data port!
I can use the internal MIDI-PAC timer to determine the CPU type ;)

By sd_snatcher

Prophet (3642)

02-07-2011, 20:30

I was suspecting that was the case. :)

In other words, you want a Turbo-compatible way to write data on the YM2413. The two routines below will solve your problem without the need for complex recalculations.

1) The direct-hardware access YM2413 writing routine

This routine is the fastest one and is compatible with all existing MSX machines. But it doesn't follow the MSX programming guidelines, as it does direct I/O access to the YM2413 I/O ports. This means it will fail the Acid2Test. I used it in my old TurboFixes, but because it doesn't pass the Acid2Test I'm now migrating to the second solution.

This routine does support the Z80 up to 7.14MHz on machines without a system-timer, and any CPU speed on machines with a built-in SystemTimer.

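WRITE2413:
; Input: E=Register, D=Data
; Modifies: none

	push af
WAIT1:
	in a,(0E6h)	; system timer, low byte (one tick is about 3.911 us)
	cp 6
	jr c,WAIT1	; wait until >= 6 ticks since the timer was last reset
	out (0E6h),a	; any write to E6h resets the timer
	ld a,e
	out (07Ch),a	; YM2413 address port
	out (0E6h),a	; restart the timer for the address-to-data delay
WAIT2:
	in a,(0E6h)
	cp 1
	jr c,WAIT2	; wait until >= 1 tick since the address write
	ld a,d
	out (07Dh),a	; YM2413 data port
	out (0E6h),a	; restart the timer; the next call waits on it in WAIT1
	ex (sp),hl	; the two EX (SP),HL cancel each other out,
	ex (sp),hl	; so their net effect is only a delay
	pop af
	ret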

2) The MSX compliant YM2413 writing routine

According to the MSX-Datapack, the correct and fully compatible way to write to the YM2413 is to use the MSX-Music BIOS. And there really are games that use this method:

- Nearly all ASCII and MSX-Magazine games (i.e.: Penguin Wars-2, Dante-2, Super Zelixer, Fleet Commander II, Lübeck etc)
- R-type

The algorithm is simple:

1) In your MSX-Music detection routine, store the slot where you found the MSX-Music BIOS.

2) Write on the YM2413 chip using the routine below:

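WRTOPL	equ	4110h	; FM-BIOS entry that writes an OPLL register
CALSLT	equ	001Ch	; main BIOS inter-slot call
MSXBIOSSLT:	db	0	; Slot where you found the FM-BIOS


WRITE2413:
; Input (WRTOPL convention): A=Register, E=Data
	ld	iy,(MSXBIOSSLT-1)	; IYh = slot where the FM-BIOS was found (IYl is ignored)
	ld	ix,WRTOPL
	jp	CALSLT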

It may seem inefficient for some people, but I already tested it on FireHawk and the overall performance impact was negligible. And it has the advantage of being a very compact routine.
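
A hedged usage sketch, to be read together with the routine above: it assumes the detection routine leaves the slot ID in A, and that WRTOPL takes the register number in A and the data byte in E, as the MSX-Music BIOS documentation describes. EXAMPLE and the values 30h/10h are arbitrary placeholders, not anything taken from a real game.

```
; Assumes A holds the slot ID found by your detection routine on entry.
EXAMPLE:
	ld	(MSXBIOSSLT),a	; remember the FM-BIOS slot for WRITE2413
	ld	a,30h		; OPLL register number in A (example value)
	ld	e,10h		; data byte in E (example value)
	call	WRITE2413	; goes through CALSLT into WRTOPL
	ret
```

Keep in mind that an inter-slot call through CALSLT does not preserve IX, IY or the alternate registers, so a player that keeps state in those needs to save them around the call.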

By Edwin

Paragon (1182)

02-07-2011, 23:13

> * Refresh cycles: __approximately__ every 184 or 185 cycles the R800 is stalled for 26 cycles, to refresh the (internal) RAM. Note that the exact timing behavior for refresh is not clear to me yet (help would be appreciated!). According to some (vague) documentation the timing may also be different on a turbo R FS-A1ST compared to a FS-A1GT (I didn't test this myself).

I have had others do some testing a while back. It has nothing to do with the ST/GT version, but with the amount of internal memory. As a result, machines with 512k (both ST and GT) will run approximately 2.5% slower than those with 256k. I believe a 1MB extension board doesn't change it because it's wired differently and is not detected by the refresh circuit.

By WORP3

Paladin (864)

03-07-2011, 00:57

@SD-Snatcher,
Thanks for the routine, I really like this kind of structure :D
I had already adjusted my software, but maybe this would be a nice improvement, as I wouldn't have to fiddle around to determine the CPU type anymore ;) Not that I need it to be fast; it's only for a configuration program and not for a music player routine or something!

How has the experience been with this routine in combination with different kinds of MSX machines?
(If for some reason an old MSX does not return FF when reading port E6, it could crash!)
