Grauw’s RPG in development

Page 7/22

By Grauw

Ascended (8515)

30-03-2018, 10:44

In this example I’m hardcoding the addresses, but speed aside, another benefit of setting registers with OUTI as opposed to OUT (n),a is that it’s easier to change the port, because it’s passed in a register and coded in fewer locations. So it’s a positive move towards a dynamic VDP port. I’m also trying to make this kind of change in other places in the code (though no promises).

And as for the system variables; I do update them for the general screen mode settings and such, where I call a slow library method which does other fancy things as well like restoring interrupt state. But for values that change every frame I don’t really care about their current value. Especially in screen splits I simply won’t have time for this, and there is no benefit to it. If the game is interrupted by some UI screen or something, they will be set again when the game resumes, so this seems an OK practice.

By Grauw

Ascended (8515)

31-03-2018, 19:48

Grauw wrote:

Slight change of plans; instead of writing sprite attributes from a 128-byte RAM buffer on the ISR, I’m going to double buffer the sprite attribute table, write the attribute data each frame, the colour data every other frame, and use a YMMM command to copy the 512 colour data bytes from the first buffer to the second.

Well, I did this: the sprite colour update time was reduced from 19.57% (3.26 ms / frame) to 10.25% (1.71 ms / frame). Works well! :D The 30 fps game loop with 60 fps motion is also implemented.

By Grauw

Ascended (8515)

31-03-2018, 23:35

I just finished the tile drawing code optimisation I mentioned a couple of days ago. Looking at my CPU meters now, in the worst case I still have 37% CPU time (12.3 ms) remaining for each 30 fps frame. That seems more than plenty! :)

I also published my source code:

https://bitbucket.org/grauw/tiletile

Note that the license for the code is the permissive Simplified BSD as usual, but all the artwork, music and story are licensed under the fairly restrictive Creative Commons BY-NC-ND, which most importantly means that no derivative works are permitted.

By santiontanon

Paladin (846)

01-04-2018, 02:59

Nice!!!

But btw, now that you are releasing source code, I think it'd be wise to first register your entry in the MSXDev competition asap. Last year, the game "Ghost" was not allowed in the competition, because they released the game and THEN tried to register for the competition. So, just in case :)

By Grauw

Ascended (8515)

01-04-2018, 11:06

Ah, thanks for the heads-up, I hadn’t thought of that.

I published the source code because I like developing in the open, but I wouldn’t call it a release… I wanted to officially announce the game when I know a name, have at least one demo level, and have some reasonable level of confidence that I’ll finish it. My motivation is very fickle and it goes from periods of lots of development to long inactivity and back ;). I also note the registration asks for a name, artwork, etc., and the game isn’t in such a state yet.

But I’ll contact the MSXDev’18 organisation just to make sure, because I am working hard on it now with MSXDev’18 in mind :D, so it would suck if there was a problem due to technicalities. A good reminder that I need to keep the framework of the competition in mind! I’d prefer not to register it yet at this point though.

By MOA

Champion (293)

01-04-2018, 19:34

I was trying to build the code, but it looks like Macros.asm is missing in the repository:

Exception in thread "main" nl.grauw.glass.AssemblyException: Include file not found: Macros.asm
[at src/ROM.asm:6]
	INCLUDE "Macros.asm"

Edit: seems like these are missing as well:

	INCLUDE "BIOS.asm"
	INCLUDE "System.asm"
	INCLUDE "Memory.asm"
	INCLUDE "MemoryTest.asm"
	INCLUDE "VDP.asm"
	INCLUDE "VDPCommand.asm"
	INCLUDE "Palette.asm"

Edit 2: never mind... these are in the separate neonlib repository.

By Grauw

Ascended (8515)

01-04-2018, 19:39

Cool! Yeah, it’s a subrepository… if you use Mercurial to clone the repository, it should check out the library subrepository as well. You also need Node.js (and npm) installed for the scripts to run.

By MOA

Champion (293)

02-04-2018, 04:16

I played around with your code, seeing if I could optimize it a bit (mainly through peephole optimizations). I gained a few %, so I can share if you're interested:

idle: [screenshot: idle2]

moving diagonally: [screenshot: movediag]

moving horizontally: [screenshot: movehor]

moving vertically: [screenshot: movevert]

Seems like most time is still spent waiting for the VDP during the hot loops... maybe some simple tasks can be run while the VDP is running its commands you're likely to wait for? (if I disable command wait I see a 3% to 4% improvement on an MSX 2+)

I see you create a lot of small functions, which an optimizing compiler would definitely inline on a modern platform. In my version I inlined most of them and it really helps. Maybe you could extend your assembler and add support for inlining?

i.e.:
- functions which are only called once are always suitable
- procedures with fewer than (some threshold) bytes of generated opcodes
- maybe some keyword to force inlining a procedure
- inlined procedures with ret [flag] in them automatically replace those mnemonics by jr/jp to ' end' (and ret at very end of the procedure will be removed)

in my version I just changed certain functions to macros, replaced certain rets with jr/jp (and added an _inline postfix).

By Grauw

Ascended (8515)

02-04-2018, 05:58

Hey MOA! Nice feedback, thanks!

MOA wrote:

Ah, they include the new “free CPU time” bar (no. 10) :). Measures the time spent halting in vsync.

MOA wrote:

Seems like most time is still spent waiting for the VDP during the hot loops... maybe some simple tasks can be run while the VDP is running its commands you're likely to wait for? (if I disable command wait I see a 3% to 4% improvement on an MSX 2+)

Interesting test! I just tried it, commented out the jr c,WaitReady in VDPCommand_Execute_HL, and the CPU time of the diagonal tile rendering code goes down from 33.55% (11.18 ms) to 26.90% (8.97 ms). Looks like it’s definitely VDP bound there for 2.22 ms while drawing the 32 tile fragments.

Per tile that means it’s waiting for 0.069 ms, or 248 cycles. Mmh, tricky to use that time effectively for something else… Maybe copying some data to the VRAM, probably the alternate registers are unused there so I can context switch relatively quickly. There’s only so many things that would qualify for that though, e.g. sprite colour data comes to mind, but it needs to complete before the primary subframe, so not an option... Maybe I’ll find something later.
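As a quick sanity check of the per-tile figure above, here is a minimal sketch in Python. It assumes the standard MSX Z80 clock of 3.579545 MHz, which the post does not state, and the function name is made up for illustration.

```python
# Assumption: standard MSX Z80 clock (not stated in the post).
Z80_CLOCK_HZ = 3_579_545

def wait_per_tile(total_wait_ms, tiles):
    """Split a total VDP wait time over a number of tiles.

    Returns the wait per tile in milliseconds and in CPU clock cycles.
    """
    wait_ms = total_wait_ms / tiles
    cycles = wait_ms / 1000 * Z80_CLOCK_HZ
    return wait_ms, cycles

# 2.22 ms of waiting spread over 32 tile fragments, as measured above.
wait_ms, cycles = wait_per_tile(2.22, 32)
print(f"{wait_ms:.3f} ms per tile, about {cycles:.0f} cycles")
```

With the measured 2.22 ms and 32 tile fragments this lands on roughly 0.069 ms, or about 248 cycles per tile, in line with the numbers quoted.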

MOA wrote:

I see you create a lot of small functions, which an optimizing compiler would definitely inline on a modern platform. In my version I inlined most of them and it really helps. Maybe you could extend your assembler and add support for inlining?

Yes! Things along those lines are in the planning :). Inlining is a bit tricky to automate as I’ll discuss later, but I really do want to implement C++-like templating. The idea is to generate methods based on the calling arguments, like a macro which generates multiple versions in-place:

	push ix
	call Scene_GetTileSet
	call TileSet_Load
	pop ix

	push iy
	call Engine_GetSpritePatternTableEvenIY
	call Scene_InitializeSpritePatterns
	pop iy

vs.

	call TileSet_Load(ix + Scene.tileSet)
	call Scene_InitializeSpritePatterns(iy + Engine.spritePatternTableEven)

Both easier on the eyes as well as faster, without compromising code structure. It would generate a version of Load which replaces ix with ix + Scene.tileSet, and since this is the only place Load is called, it will only generate that one version.

It would also allow me to auto-generate ix or iy versions of methods depending on which I need; currently I have to write one or the other, or multiple versions of them (with IX/IY suffix), which can be a bit annoying to use and refactor. So yeah, this is high on my assembler improvements wishlist.

I already did a similar thing with macros in the Engine_Submit* methods (see the _Select calls).

MOA wrote:

- functions which are only called once are always suitable
- procedures with less than (some threshold bytes) generated opcodes
- maybe some keyword to force inlining a procedure
- inlined procedures with ret [flag] in them automatically replace those mnemonics by jr/jp to ' end'

in my version I just changed certain functions to macros, replaced certain rets with jr/jp (and added an _inline postfix).

Yeah, for now that’s the way to do it…

Inlining is more tricky to automate in the assembler, since as you noted it needs to transform the source code, replacing ret with jr/jp to endm, and there are additional troublesome considerations like how to deal with tail calls. Code transformations are really cool, and I’m thinking of it, but for now I don’t have any ideas that I can do anytime soon because it will require a lot of work on the assembler.

So for this project I will probably resort to the manual method when it gets close to completion. If it’s needed, which I hope it won’t be… ;)
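To make the ret-to-jump rewrite discussed above concrete, here is a rough sketch in Python of the kind of source transformation involved. It is not how Glass works or would implement this; the function name, the end label and the sample procedure are all hypothetical. It always emits jp rather than jr (jr only accepts the z, nz, c and nc conditions and has limited range), and it ignores tail calls, labels, and comments on ret lines, which are exactly the troublesome cases.

```python
import re

# Matches a line that consists of "ret" with an optional condition code.
RET = re.compile(r"^(\s*)ret(?:\s+(\w+))?\s*$", re.IGNORECASE)

def inline_body(lines, end_label):
    """Rewrite a procedure body so it can be pasted in place of a call.

    Every ret becomes a jump to end_label; a final unconditional ret is
    dropped because execution simply falls through to the end label.
    """
    body = list(lines)
    if body:
        last = RET.match(body[-1])
        if last and last.group(2) is None:
            body.pop()
    out = []
    for line in body:
        m = RET.match(line)
        if m is None:
            out.append(line)                      # ordinary instruction
        elif m.group(2):
            out.append(f"{m.group(1)}jp {m.group(2)},{end_label}")
        else:
            out.append(f"{m.group(1)}jp {end_label}")
    out.append(f"{end_label}:")
    return out

# Hypothetical procedure body, for illustration only.
proc = ["\tld a,(hl)", "\tor a", "\tret z", "\tinc hl", "\tret"]
inlined = inline_body(proc, "Example_end")
print("\n".join(inlined))
```

In the sample, the conditional ret z turns into jp z,Example_end and the trailing ret disappears, which mirrors the manual macro conversion MOA described.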

By DarkSchneider

Paladin (880)

02-04-2018, 08:33

What you can do in the gap while the VDP is working is the big question. An interrupt when a command finishes would be so useful. Without one, it is hard to use that gap for anything more than preparing the next command, especially when the commands are small.

As for inlining, I'm not sure a compiler would inline on its own when targeting a machine with little memory.
