TMS9918 sprite fetch timing


By TomH

Champion (335)

28-11-2017, 19:06

This question possibly relates to the 9938 too, but I'm endeavouring to start at the beginning and work my knowledge up.

Per my understanding of http://map.grauw.nl/articles/vdp-vram-timing/vdp-timing-2.html, in graphics mode 2 two things occur for the display of sprites. During the course of line n, sprite y position fetches are (mostly) interleaved with colour, pattern and name fetches, and from those the visible four are picked. Full details of those four are then fetched during the right and left borders, and they appear on line n+1.
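To make that model concrete, here is a minimal Python sketch of the selection step as I understand it. The function name and constants are illustrative only; the Y=208 list terminator is standard TMS9918 behaviour, while evaluating a line "-1" ahead of the first display line is purely the assumption under discussion in this thread, not a confirmed fact.

```python
SPRITE_LIST_END = 208   # a Y value of 208 ends sprite processing on the TMS9918
MAX_PER_LINE = 4        # at most four sprites are shown per line


def sprites_for_next_line(y_positions, line, height=8):
    """Return indices of the sprites picked while scanning `line`.

    The picked sprites are the ones that would be drawn on line + 1.
    `line` may be -1 to model an assumed pre-display scan for line 0.
    """
    picked = []
    for index, y in enumerate(y_positions[:32]):
        if y == SPRITE_LIST_END:
            break
        # A sprite with attribute Y first appears on display line Y + 1,
        # so the sprite row drawn on line + 1 is (line - Y) modulo 256.
        row = (line - y) & 0xFF
        if row < height:
            picked.append(index)
            if len(picked) == MAX_PER_LINE:
                break
    return picked
```

With this model a sprite at Y=255 is picked during line -1 and so lands on display line 0, which is the off-by-one behaviour the questions below are trying to pin down.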

Follow-up questions:

  • what's the rule for sprites on line 0? Does the VDP have an undepicted memory access pattern for the line before the display? Does it do the equivalent of a full character fetch but throw the non-sprite data away? Does it carry over information from the last line, 191? Does it not draw sprites at all?
  • similarly, what happens if I switch mode mid-display, or decide to switch the screen on when it's previously been off? No sprite output for the first line? Nonsense sprite output for that line (e.g. third and fourth sprite data is still fetched from somewhere and displayed depending on historic state, but the first and second sprites are definitely all shifted out, so invisible)?

I've read elsewhere that mode and display on/off changes don't take effect until the end of the line, and have assumed that to mean during sync. If it's before the right border then obviously the nonsense sprite output comment would be different.

By ARTRAG

Enlighted (6923)

28-11-2017, 22:59

I cannot answer about line 0 and line 191, but I do not think that mode change or display on/off are buffered until end of the line. On the contrary, having a "clean" screen split is usually very tricky.

By TomH

Champion (335)

29-11-2017, 00:23

ARTRAG wrote:

I cannot answer about line 0 and line 191, but I do not think that mode change or display on/off are buffered until end of the line. On the contrary, having a "clean" screen split is usually very tricky.

I got that from Grauw's Screensplit programming guide: "With seamless I am referring to the screen splits whose effect gets delayed by the VDP until the end of the current line, and therefore they always look pretty. As far as I can tell, there are two kinds of splits which are ‘seamless’, those are screen mode splits (does not include setting new table values) and screen blank splits using bit 6 of r#1. Although the screen mode splits themself are ‘seamless’, because the table base address registers aren’t you still need to put a little effort in it."

That document appears to be from 2003, so maybe it contains information now known to be inaccurate?

By ARTRAG

Enlighted (6923)

29-11-2017, 18:18

Blanking the screen does not get delayed until the end of the line.
You have to tune the split point using the HR flag (on msx2)

By TomH

Champion (335)

29-11-2017, 18:30

Okay, so the document is incorrect. Fair enough. It'd be interesting to know whether the line's access pattern is locked-in at the start of the line, but I think that's probably deep into the weeds.

Otherwise, I've decided to take a hint from the fact that sprite y positions need to be off by one, with y=255 (rather than, e.g., y=192) positioning a sprite on the first line: I take that to mean there is a special case for collecting that first line of sprite coordinates. It also seems to make most sense to me that it'd be the line before the main display, so that all changes you time to the interrupt work properly. So I'm just going to assume that.

Which means 69 lines of fast VRAM access in that window after the interrupt in NTSC, 120 in PAL?
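For the record, the arithmetic behind those two figures can be sketched as below. The 262 and 313 total line counts are the usual NTSC and PAL frame sizes; the single pre-display sprite line is the assumption made in this post, and the names are made up for illustration.

```python
ACTIVE_LINES = 192            # lines of pixel display
SPRITE_PREFETCH_LINES = 1     # assumed: one sprite-scan line before line 0
TOTAL_LINES = {"NTSC": 262, "PAL": 313}


def fast_access_lines(standard):
    """Lines per frame with neither display nor sprite fetches going on."""
    return TOTAL_LINES[standard] - ACTIVE_LINES - SPRITE_PREFETCH_LINES


for standard in TOTAL_LINES:
    print(standard, fast_access_lines(standard))
```

Without the assumed pre-display line the counts would be 70 and 121 instead.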

By Grauw

Ascended (10699)

29-11-2017, 20:14

ARTRAG wrote:

Blanking the screen does not get delayed until the end of the line.
You have to tune the split point using the HR flag (on msx2)

I just tested and my test does not match your claim… :) The blanking bit takes effect somewhere at the start of a line. Probably the blanking state is copied from that bit at the transition of the left border into the active display area. Though it could be a bit earlier as well, it’s hard to say precisely.

Before I blank I set the foreground colour to red and the background colour to light blue, after reactivating display I restore them. As you can see the red part continues until the end of the line (when the blanking takes effect), and when the light blue turns into dark blue two lines later, it keeps displaying the background colour until the next line. This test is on a turboR, openMSX gives the same result. Ignore the text btw, I hacked this test into another project :).

So it’s a good way to mask various other types of splits (e.g. display offset) without needing careful timing to make them occur in the horizontal blanking area, if a blank line is acceptable (e.g. between game level and a status bar).
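As a rough illustration of what such a blanking split writes, here is a small Python sketch that only computes the bytes involved; it does not touch any hardware. The helper names and the example r#1 value are hypothetical. What it relies on is that bit 6 of r#1 enables the display when set, and that a register write is the data byte followed by the register number with bit 7 set.

```python
DISPLAY_ENABLE_BIT = 0x40  # bit 6 of r#1: 1 = display on, 0 = blanked


def r1_with_display(r1, enabled):
    """Return the r#1 value with the display-enable bit set or cleared."""
    if enabled:
        return r1 | DISPLAY_ENABLE_BIT
    return r1 & ~DISPLAY_ENABLE_BIT & 0xFF


def register_write(reg, value):
    """Return the two bytes written to the VDP control port, in order."""
    return bytes([value & 0xFF, 0x80 | (reg & 0x3F)])


current_r1 = 0xE2  # hypothetical shadow copy of r#1
blank = register_write(1, r1_with_display(current_r1, False))
unblank = register_write(1, r1_with_display(current_r1, True))
```

Since the effect is held until the start of a line, the exact moment within the line at which these two bytes are sent should not matter much, which is the point being made above.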

p.s. I should update that article someday, indeed it’s pretty old by now and the writing style is a bit “20-yo” :). Maybe next time when I do some screensplit programming and feel like doing some new tests. One thing not mentioned there: a line takes exactly 228 cycles on a 3.58 MHz Z80.

By TomH

Champion (335)

29-11-2017, 20:55

Grauw wrote:

p.s. I should update that article someday, indeed it’s pretty old by now and the writing style is a bit “20-yo” :). Maybe next time when I do some screensplit programming and feel like doing some new tests. One thing not mentioned there: a line takes exactly 228 cycles on a 3.58 MHz Z80.

I recently reread some of the documentation I wrote as a 20-year old. It's absolutely dreadful.

That aside, thanks for running the test! I guess the intermediate pinks are just because you used a long exposure and captured several frames?

Separate question on 228 cycles: I've noticed that the Portar docs quote 227.75, which sounded a little fishy to me, and there's at least one thread here suggesting an observed 228.5. But I recall reading that the TMS plays fast and loose with the colour burst — it's technically non-standards conformant because it puts out the same phase of colour burst every line — so could it be that the non-composite chips, the 9928 and 9929, are often clocked slightly differently? 228.5 would allow space for the half-a-cycle line-on-line phase difference that's normal in NTSC, and 227.75 would similarly accommodate the quarter-of-a-cycle that's normal in PAL, in both cases without affecting the chip's perceived whole-line timings.

Could just all be coincidence, of course.

By Grauw

Ascended (10699)

29-11-2017, 21:38

Yes that’s the camera.

As for 228, check appendix 7 of the V9938 application manual, it quotes 1368 cycles per display line, and how much time it spends on what. The VDP runs at 6x the clock rate of the Z80; 1368 / 6 = 228. The VDP can be configured to have 1365 cycles per display line, but I think that’s when it’s in genlock mode (and the real time would actually depend on the input sync). See also the VDP timing article part 1.

Given the x6 clock rate of the VDP, 227.75 is not possible, and in the S01=2 mode it would be 227.5.
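A quick way to check those numbers, as a sketch with exact fractions (the function names are just for illustration):

```python
from fractions import Fraction

VDP_CLOCKS_PER_CPU_CYCLE = 6  # the VDP clock runs at 6x the Z80 clock


def cpu_cycles_per_line(vdp_cycles):
    """CPU cycles per line for a given number of VDP cycles per line."""
    return Fraction(vdp_cycles, VDP_CLOCKS_PER_CPU_CYCLE)


def vdp_cycles_needed(cpu_cycles):
    """VDP cycles per line that a given CPU cycle count would require."""
    return Fraction(cpu_cycles) * VDP_CLOCKS_PER_CPU_CYCLE


print(cpu_cycles_per_line(1368))    # normal line length
print(cpu_cycles_per_line(1365))    # the shorter configurable line length
print(vdp_cycles_needed("227.75"))  # not a whole number of VDP cycles
```

The last value comes out as a non-integer number of VDP cycles, which is why 227.75 cannot be produced by a line made of whole VDP clocks.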

By TomH

Champion (335)

30-11-2017, 20:33

Grauw wrote:

As for 228, check appendix 7 of the V9938 application manual, it quotes 1368 cycles per display line, and how much time it spends on what.

I'm asking about the 9918/9928/9929 rather than the '38, but just to complete the record: in this thread, when discussing machines on which his heavily line-timed demo IO appears not to work, Overflow makes the comment:

Overflow wrote:

The demos needs/assumes that (almost) 71364 cpu-cycles set a frame, which means 313 scanlines of 228 cycles each.
On NMS8280? well, there's something like 228,5 cycles on a scanline, WTF?

So it definitely sounds like he has in effect written a test case for 228 CPU cycles/line, found that NMS8280 fails to pass the test case, and observed that it appears to be somewhere around 228.5.
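To put rough numbers on how far apart those two figures are over a frame, here is a small sketch. It takes the 313 lines and 228 cycles from the quote and treats 228.5 as an exact value, which it almost certainly is not, so read the result as an order of magnitude only.

```python
from fractions import Fraction

LINES_PER_FRAME = 313  # PAL frame, as in the quote above


def cycles_per_frame(cycles_per_line):
    """CPU cycles in one frame for a given per-line cycle count."""
    return Fraction(cycles_per_line) * LINES_PER_FRAME


expected = cycles_per_frame(228)
observed = cycles_per_frame("228.5")
drift = observed - expected

print(expected, float(observed), float(drift))
```

A routine that counts 228 cycles per line would, on those figures, be off by more than half a line by the end of a single frame.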

It looks like that machine was sold specifically for genlocking, so similar guesses about independent clocks and video timing concerns are probably appropriate.

By TomH

Champion (335)

30-11-2017, 20:49

Upon rereading more of that thread, I see that Grauw is already on there, and fully in agreement about the diversity of different implementations of the 9918 feature set. So I think I've failed here to be sufficiently explicit about the scope of my questioning. Apologies for the confusion.

By Grauw

Ascended (10699)

30-11-2017, 23:51

The reason why Overflow counted 228.5 on that MSX2 is because the VDP clock crystal and CPU clock crystal are not the same in that machine (that’s not uncommon on MSX). As you may know a clock crystal has a certain deviation so indeed they can not be perfectly in sync, and you can not rely on counting exactly 228 cycles. That’s why despite the impressiveness of the demo, he got a little bit of flak about that style of cycle-exact programming, because compatibility :).

However it’s a useful factoid for math on whether something is feasible and fits within one line, or to time things roughly one line apart if you don’t need to be pixel-perfect (though consider turbo CPUs and sync on the HR flag preferably :D).

About TMS9918, going by what Overflow said, it’s also 228 cycles. Wonderful! :)
