Shadow of the Beast on Amstrad CPC+!!

Page 3/10
1 | 2 | | 4 | 5 | 6 | 7 | 8

By PingPong

Prophet (3413)

25-02-2018, 11:59

maxis wrote:
PingPong wrote:
maxis wrote:

The Z80 has a very sluggish BUSRQ/BUSACK mechanism, hence it's not used for any kind of synchronization. Instead, the CPC uses the Z80 WAIT line and hardware bus multiplexing to preempt the current CPU access and put a video DMA access in its place.

Do you have any reference to a schematic of the bus mux? I cannot find one online. (I'm interested in more detail.)

Check out this (the basic CPC464):

thanks !
IC114 and IC114 are the buffers isolating the microprocessor from the RAM. The multiplexing is done by the gate array, which controls the tri-state outputs of the 74LS244s.

By maxis

Champion (512)

25-02-2018, 12:08

Now check out the GX4000/CPC Plus arrangement:

D[0 : 7] and A[0 : 15] are connected to the Z80 and to a dedicated port on the Arnold ASIC.
RD[0 : 7] and RA[0 : 7] are connected to the RAM and the Arnold ASIC.
Inside the ASIC, the use of tri-state buses is not recommended (timing issues), and there are more than enough resources to do the RAM bus multiplexing between video/audio DMA and the CPU.

By sd_snatcher

Prophet (3031)

25-02-2018, 14:55

maxis wrote:

BTW, SMS solves the extra bandwidth requirement for the colorful sprites differently - SMS uses the unique and very elegant architecture - 16 bit pseudo static XRAM providing no penalty at random accesses and great throughput (like on the professional video equipment).

Sorry, but this is not true. I already programmed the SMS and its VRAM bandwidth is exactly the same as the TMS9918. XRAM chips are nothing more than PSRAM chips, but the latter term wasn't coined yet. IOW, DRAM chips disguised with a SRAM pinout.

This means that in mode-4 there are absolutely zero VRAM access time slots available to the CPU during the active display area.

Just like the NES, the CPU can only access the VRAM in the VBLANK, otherwise the data will be corrupted. They could only get away with this because it's a dedicated game machine. It's not acceptable for a general purpose computer.

By maxis

Champion (512)

25-02-2018, 17:40

sd_snatcher wrote:
maxis wrote:

BTW, SMS solves the extra bandwidth requirement for the colorful sprites differently - SMS uses the unique and very elegant architecture - 16 bit pseudo static XRAM providing no penalty at random accesses and great throughput (like on the professional video equipment).

Sorry, but this is not true. I already programmed the SMS and its VRAM bandwidth is exactly the same as the TMS9918. XRAM chips are nothing more than PSRAM chips, but the latter term wasn't coined yet. IOW, DRAM chips disguised with a SRAM pinout.

This means that in mode-4 there are absolutely zero VRAM access time slots available to the CPU during the active display area.

Just like the NES, the CPU can only access the VRAM in the VBLANK, otherwise the data will be corrupted. They could only get away with this because it's a dedicated game machine. It's not acceptable for a general purpose computer.

Look, the TMS9918's video overhead bandwidth to sustain the raster, attributes and sprites is significantly lower than the SMS's. Hence the SMS is based on a 16-bit video architecture in place of 8-bit. XRAM or PSRAM offers a good compromise to reduce the required pin count, since the address and data phases are interleaved. Without 16-bit XRAM the SMS VDP wouldn't be able to lift 4 bpp sprites. Also, by great throughput I meant the total throughput. Throughput-wise it's comparable to the 16-bit Amiga OCS. I know very well that there is no CPU access to the VRAM in the active screen area. However, CPU access isn't limited to VBLANK. Usually ppl enable/disable the raster based on the horizontal line interrupts, allocating the rest of the time to the CPU.

By PingPong

Prophet (3413)

25-02-2018, 17:50

maxis wrote:

However, the CPU access isn't just limited to VBLANKING. Usually ppl enable/disable the raster based on the horizontal line interrupts allocating the rest of the timing to the CPU.

Disabling the raster to do this does not appear to me to be a viable solution. No raster, no displayed data!
And disabling the raster creates the same situation as VBLANK time, so I think of this as an extended VBLANK.

By maxis

Champion (512)

25-02-2018, 17:59

PingPong wrote:
maxis wrote:

However, the CPU access isn't just limited to VBLANKING. Usually ppl enable/disable the raster based on the horizontal line interrupts allocating the rest of the timing to the CPU.

Disabling the raster to do this does not appear to me to be a viable solution. No raster, no displayed data!
And disabling the raster creates the same situation as VBLANK time, so I think of this as an extended VBLANK.

OK, we can call it "extended VBLANK". However, VBLANK is a very fixed and well-defined portion of the TV raster timing, unfortunately offering no flexibility. Sometimes this kind of rigidity requires 2 different IC masks: one for PAL timing and one for NTSC (as on the OCS Amiga, for example).
Agreed that during "extended VBLANK" there is no raster on the screen. As for the games, quite often the video display XY area is artificially limited in order to find the best balance between the frame rate and the graphic details. Nevertheless, IMHO, SMS games don't suffer from the aforementioned limitations too much.

By PingPong

Prophet (3413)

25-02-2018, 18:31

maxis wrote:

Agreed that during "extended VBLANK" there is no raster on the screen. As for the games, quite often the video display XY area is artificially limited in order to find the best balance between the frame rate and the graphic details. Nevertheless, IMHO, SMS games don't suffer from the aforementioned limitations too much.

This is of course doable on MSX2 and to some degree even on MSX1, but it is a workaround for a limitation, not an advantage.
For example, one can wait for the TMS VBLANK, then disable the screen and re-enable it after a fixed amount of time to gain bandwidth...

By Overflow

Resident (57)

26-02-2018, 16:46

@maxis thank you! Kudos for starting a thread about the Amstrad CPC on an MSX forum.

I'm the guy behind that "Shadow of the Beast" screen on Amstrad GX4000. Actually I also coded for MSX1 once, a few years ago, see "Io" demo/dentro. I have my own thread on this forum: starting from "Io", then back to CPC6128 with "Logon's Run", and last words about this SOTB screen - you can check bottom of page 15, there are some interesting discussions about similar topics found in this current thread.

-

Let's be honest: for an 8-bit machine released as late as 1990, the Amstrad GX4000/Plus series is crap, but low-cost. Sure! It has some nice features for an 8-bit platform, especially when compared (unfairly) with older platforms. It can't beat the MSX2(+), which was released earlier; no need to start a flame war (again). Then? Well, the MSX2(+) could produce some wonderful fx for a demo, such as this SOTB screen on GX4000. As I wrote in my thread: it's just missing some motivated guys to turn that "it could" into "yes we did it!"

-

Everyone should know Maggoo's great attempt on SOTB screen. To be fair, it should be compared with my own attempt on stock Amstrad CPC6128. Both screens were made years ago! and OK did not feature that multi-layers thingie (moon+balloon+cloud+tree=4layers). Well! now I'm feeling older and older - I should stop talking about 25+ years old events!

-

Comparing CPU timing between MSX1 and CPC? FYI, the last part of the "Io" demo is constant-timed: exactly 71364 cycles are spent during one frame, including decrunching YM to PSG, managing text plus scrolling, and finally color-splitting during the main screen. This means I know about CPU timing. Here are "some" numbers (T-cycles):
(MSX,CPC)
(5,4) LD A,B | JP (HL) | NOP | HALT note: I can confirm "simple" instructions are 4 t-cycles long
(8,8) INC HL | LD A,(HL) | LD A,xx
(11,12) POP | RET | JP xxxx | LD HL,xx
(12,12) ADD HL,BC | DEC (HL)
(12,16) PUSH HL
(13,12) JR xx note: strange, isn't it?
(14,16) DJNZ (B>0, jump taken)
(14,16) OUT (C),A note: OUT (xx),A can't be used on CPC because of its 16-bit I/O addressing
(18,20) LDI | OUTI
Draw your own conclusion.
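
The 71364 figure quoted above can be cross-checked with a little arithmetic. This is only a sketch under assumptions that are not stated in the post: a PAL MSX1 with a 3.579545 MHz Z80, 228 CPU cycles per scanline and 313 scanlines per frame.

```python
# Assumed PAL MSX1 figures (not taken from the post itself).
CPU_HZ = 3_579_545        # Z80 clock in Hz
CYCLES_PER_LINE = 228     # CPU cycles per scanline
LINES_PER_FRAME = 313     # scanlines per PAL frame

cycles_per_frame = CYCLES_PER_LINE * LINES_PER_FRAME
frame_rate_hz = CPU_HZ / cycles_per_frame

print(cycles_per_frame)          # 71364
print(round(frame_rate_hz, 2))   # 50.16
```

Under those assumptions the product matches the number of cycles the demo is said to spend per frame.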

By Grauw

Ascended (8309)

26-02-2018, 19:16

Interesting! So CPC rounds everything up to multiples of four, while MSX adds an M1 wait cycle…

(Btw, inc hl is 7 cycles on MSX rather than 8.)
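
Both observations can be checked against the numbers posted above. The sketch below is an illustration, not anyone's tool from this thread: the "zilog" and "m1" columns are the standard Z80 T-state counts and the number of opcode-fetch (M1) cycles per instruction, and the MSX model is simply "one extra wait state per M1 cycle". The "msx" and "cpc" columns are copied from the post as written.

```python
# (mnemonic, standard Z80 T-states, M1 cycles, MSX as posted, CPC as posted)
TABLE = [
    ("LD A,B",     4, 1,  5,  4),
    ("JP (HL)",    4, 1,  5,  4),
    ("NOP",        4, 1,  5,  4),
    ("HALT",       4, 1,  5,  4),
    ("INC HL",     6, 1,  8,  8),
    ("LD A,(HL)",  7, 1,  8,  8),
    ("LD A,n",     7, 1,  8,  8),
    ("POP HL",    10, 1, 11, 12),
    ("RET",       10, 1, 11, 12),
    ("JP nn",     10, 1, 11, 12),
    ("LD HL,nn",  10, 1, 11, 12),
    ("ADD HL,BC", 11, 1, 12, 12),
    ("DEC (HL)",  11, 1, 12, 12),
    ("PUSH HL",   11, 1, 12, 16),
    ("JR e",      12, 1, 13, 12),
    ("DJNZ",      13, 1, 14, 16),   # jump taken
    ("OUT (C),A", 12, 2, 14, 16),   # ED prefix counts as a second M1
    ("LDI",       16, 2, 18, 20),
    ("OUTI",      16, 2, 18, 20),
]

# Model: MSX time = standard T-states + one wait state per M1 cycle.
msx_mismatches = [
    (name, zilog + m1, msx)
    for name, zilog, m1, msx, _cpc in TABLE
    if zilog + m1 != msx
]

# Observation: every posted CPC value is a multiple of four.
cpc_all_multiples_of_four = all(cpc % 4 == 0 for *_rest, cpc in TABLE)

print(msx_mismatches)
print(cpc_all_multiples_of_four)
```

With this model the only row where the posted MSX value disagrees is INC HL (model 7, posted 8), which is the correction given above.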

By TomH

Champion (327)

26-02-2018, 19:43

It's almost true that the CPC rounds everything up to multiples of four, but not quite exactly once you factor in interrupt cycles, because the rounding up is an artefact of where the previous instruction leaves you relative to the three-in-four WAIT signal.

Though I guess that if you've disabled interrupts in order to race the beam very closely, that makes it exactly true.
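
To illustrate how a three-in-four WAIT signal can produce that rounding, here is a toy model. It is a sketch resting on assumptions, not a description of the real gate array: WAIT is released on one clock in every four (the phase `FREE_PHASE` is chosen arbitrarily), the Z80 samples WAIT at T2 of opcode-fetch and memory cycles and at the automatic wait state of I/O cycles, and interrupts are disabled. The machine-cycle breakdowns are the standard Z80 ones; the expected values are the CPC numbers posted earlier in the thread.

```python
FREE_PHASE = 1  # assumed: WAIT is released when clock % 4 == 1

def run(cycles, t=0):
    """Advance clock t through (kind, length) machine cycles, adding waits."""
    for kind, length in cycles:
        if kind != "int":                    # internal cycles never wait
            sample = t + (2 if kind == "io" else 1)
            t += (FREE_PHASE - sample) % 4   # stall until WAIT is released
        t += length
    return t

def effective(cycles):
    """Clocks from an aligned start until the next opcode fetch is aligned."""
    t = run(cycles)
    return t + (FREE_PHASE - (t + 1)) % 4    # stall absorbed by the next M1

# mnemonic -> (machine cycles, CPC T-cycles as posted in the thread)
CASES = {
    "LD A,B":    ([("m1", 4)], 4),
    "INC HL":    ([("m1", 6)], 8),
    "LD A,(HL)": ([("m1", 4), ("mem", 3)], 8),
    "POP HL":    ([("m1", 4), ("mem", 3), ("mem", 3)], 12),
    "ADD HL,BC": ([("m1", 4), ("int", 4), ("int", 3)], 12),
    "DEC (HL)":  ([("m1", 4), ("mem", 4), ("mem", 3)], 12),
    "PUSH HL":   ([("m1", 5), ("mem", 3), ("mem", 3)], 16),
    "JR e":      ([("m1", 4), ("mem", 3), ("int", 5)], 12),
    "DJNZ":      ([("m1", 5), ("mem", 3), ("int", 5)], 16),
    "OUT (C),A": ([("m1", 4), ("m1", 4), ("io", 4)], 16),
    "LDI":       ([("m1", 4), ("m1", 4), ("mem", 3), ("mem", 5)], 20),
    "OUTI":      ([("m1", 4), ("m1", 5), ("mem", 3), ("io", 4)], 20),
}

results = {name: effective(cycles) for name, (cycles, _cpc) in CASES.items()}
for name, (_cycles, posted) in CASES.items():
    print(f"{name:10} model={results[name]:2} posted={posted:2}")
```

In this model the stretch is applied per bus access rather than to the instruction total, which is why PUSH HL (11 T-states unstretched) comes out at 16 rather than 12, and why part of INC HL's cost is really paid by the opcode fetch that follows it.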
