A Doom engine in about 2000 lines of C

Page 3/4
1 | 2 | | 4

By Grauw

Ascended (10159)

Grauw's picture

07-06-2017, 10:53

ARTRAG wrote:

Have you already implemented portal rendering?
Do you aim to bitmap rendering or to flat color walls?

No, I will need to do a lot of culling and LODding and everything I can possibly optimise to minimise the nr. of polygons drawn, but right now I’m just drawing a moving rotating box. I’m doing flat shaded scanline rendering with a span buffer, so in principle a 3rd-generation rendering pipeline like Quake, minus texturing. Secretly I want to end up with something like Star Fox Big smile.

I got kind of inspired by the rotating gameboy in Overflow’s CPC demo (see bottom description), he is precomputing the rotating gameboy as image deltas. I figured on turboR, VDP I/O is a big bottleneck because of those 54 cycle waits, and these could be avoided by using deltas as well.

So my approach is to render into an RLE image buffer, and then only actually update the pixels which have changed since the previous frame. Since with flat shading usually only a few pixels around the edges change, I figured I should be able to get a decent framerate. Also as the polygons scale up, the number of pixels to update doesn’t increase quadratically, but linearly, so full screen should be possible.

So far there’s still a lot of optimisation to do though. Not sure yet if the approach is promising enough. I don’t even want to think about clipping and BSP trees and all that yet :D.

p.s. I found Michael Abrash's Graphics Programming Black Book nice to read.

By santiontanon

Paragon (1526)

santiontanon's picture

07-06-2017, 15:43

wow, very exciting! Looking forward to seeing some sneak peeks as you progress! Smile


Enlighted (5889)

NYYRIKKI's picture

09-06-2017, 00:52

@Grauw Yeah, I'm also interested about what you can come up with... My brains say that doing deltas to frame can't really work in a way that you can get reasonable output, but I'm happy if you can prove that I'm wrong...

If I would dream about Star fox type of graphics, I would rather think vector deltas... With this I mean that I would try to track time and draw lines to the shape edges in +-1 pixel steps (in correct line priority order) to get closer to next calculated "key frame". Even when it is not done on frame it would most probably look ok. From mathematical point of view this is terrible stuff to implement, but I think we could get pretty awesome animations out of MSX this way...

Ok, to be honest I don't understand what I just wrote even my self... Let's take a super simple 2 line example:

30 FOR I=0 TO 1500
40 Z=0:D=150+I
50 IF D<>0 THEN U=(I+Z)*140/D
60 LINE (U,U)-(255-U,U)
70 Z=-200:D=D+Z
80 IF D<>0 THEN U=(I+Z)*140/D
90 LINE (U,U)-(255-U,U),0
100 NEXT I

... now first thing to add would be skipping the line draws when they are not needed... This example also doesn't have more than 1 pixel steps... In order to handle those situations a routine would be needed that divides the time with biggest step and draws needed lines to upper and lower part in a way that there is no other than 0 or 1 pixel steps. If you keep the drawing the lines in correct order the tearing that happens when screen draw catches line draw is limited to pixel here and there. -> Pretty impossible to see during animation I think and in worst case you kind of end up displaying frame that is morphed somewhere between two actual calculated frames... The nightmare begins when you actually need to start figuring out what is the vector's turning point = What parts of the vector "draws" and what part of the vector "erases".

Definitely not anything close to easy, but I think this would be only possibility to display full screen filled vector graphics on MSX2 without speed issues... Still the Star Fox might be a bit out of reach, but animations like these should be easily possible.


Enlighted (6570)

ARTRAG's picture

09-06-2017, 09:36

If you accept some restrictions, probably a good approach for delta plotting is the one in the flat version of the raycasting demo.
Slice the active window in columns (I used 2x160 iirc, having 128 columns)
Assume a max number of color blocks in the column, say e.g. 4
A color block is a colored box 2xysize
Define for each column an array of 4 entries. Each entry tells : ymin of the color block, y size, color.

Now a single screen is defined by 4 (max number of blocks per column) x 128 x size of the entry bytes (3 byes).
Use two of them for double buffering
Delta plotting becomes only comparisons and differences within a column.
You can also save the ymin in the entry, as you can get it by the sum of the ysizes of the precedent blocks

By Grauw

Ascended (10159)

Grauw's picture

09-06-2017, 19:08

@NYYRIKKI I’ve got wireframe rendering working at the moment, it’s actually relatively heavy on the VRAM access bandwidth because there are quite a number of deltas, I notice this when zooming in, the framerate suffers. I’m still working on filled polygons but I think they will be lighter.

E.g. consider the left and right edge of a quad moving to the right by a couple of pixels per frame. In the case of a wireframe, each edge is writing VRAM in two places, one for drawing the new pixels, and one for erasing the previous frame’s pixels. So the wireframe quad has 4 VRAM writes per scanline each frame.

As opposed to flat shaded filled polygons, which only write to VRAM in two places, drawing pixels on the one side and erasing the pixels on the other. Additionally, since each edge is shared with an adjacent polygon, the erase of one polygon is the draw for another. So there is only 1 VRAM write per polygon per scanline. A factor 4 less than wireframe. Also the RLE image and span buffer are smaller because there are less transitions.

(Note I’m considering sequential writes as 1 write if they’re just a couple of pixels.)

When you’re rendering into an RLE image buffer, comparing the buffer with the previous frame’s buffer to know which pixels to update is relatively simple. Page flipping is also a simple matter of an additional buffer, however of course the update deltas grow bigger because it’s now comparing to two frames back, so it is a bit slower.

For wireframe graphics, it may be better to just double buffer, draw lines manually on a blank screen, and let the VDP clear the back buffer in the meanwhile, maximising parallelism.

By Grauw

Ascended (10159)

Grauw's picture

09-06-2017, 19:35

Ah, what the heck. Prepare not to be impressed Smile. Stuff is not optimised, just getting things working atm. I’ll make my own topic when I get the filled polygons working Smile.

Source code

For the center scanline there are 8 lines horizontally, so 16 VRAM writes, which gets a bit slow when it’s close to the camera. If these were filled polygons, there would be only 3 (left, middle and right edge).

By Louthrax

Prophet (2406)

Louthrax's picture

09-06-2017, 19:49

So there's this "Elite" 3D game which was not so bad on MSX1 in the 80's ;-) I know the C sources for that game are available for openGL / PC somewhere, maybe we could plug that with that filled-polygons engine to come ??

By Lord_Zett

Paladin (807)

Lord_Zett's picture

09-06-2017, 19:49

but how did it works in sandstone demo? the got real 3d stuff!

By Grauw

Ascended (10159)

Grauw's picture

10-06-2017, 11:57

Sandstone uses texturing so it can’t use the delta update optimisation I mentioned, they’re limited by the VDP output speed which means it slows down a lot when you zoom in. To avoid that is the key idea behind my approach.

Nevertheless it’s very impressive at what framerate it runs on even on a Z80. I have a ways to go! :D The screen update alone is only part of the performance cost, the scanline algorithm itself is also quite expensive. Would be nice to read an elaboration on what methods were used. Guess I should ask Jon or David!

@Louthrax Elite was awesome, I played the hell out of that :).


Enlighted (6570)

ARTRAG's picture

10-06-2017, 07:51

This was the differential plotting at work for flat walls

Page 3/4
1 | 2 | | 4