A long time ago, I was a MSX and MSX2 programmer

Page 5/12
1 | 2 | 3 | 4 | | 6 | 7 | 8 | 9 | 10

By ARTRAG

Enlighted (6515)

18-12-2012, 22:10

No reason apart from speed: the fewer times I access the nametable, the less time the blitting costs.

By ARTRAG

Enlighted (6515)

19-12-2012, 00:10

This is the current WIP:
https://docs.google.com/open?id=0Bx4kWAc-fapqbXNGZDlmM2FraWs
There are some glitches still to be investigated and removed.

When scrolling left, I think you can see the "first column problem" you mentioned: a few missing pixels in the first two or three lines, every 16 pixels.

By hit9918

Prophet (2904)

19-12-2012, 03:54

@ARTRAG, if the issue is just speed, let me try to convince you to go for 8x8:

Below I sketched code that draws multiple tiles without call/ret overhead, and all in registers.
56 rasterlines for 176 lines.

Remember how the Nemesis plants level caused tile combination bloat with the MSX1 scroll? It will do that with 16x16 too!
16x16 gives you 4 times less resolution for puzzling a level together than 8x8.
A 16x16 limit doesn't fit the picture: you are about to make a high performance engine for the high performance MSX2 :)

;call parameters:

;b  : loop count, amount of tiles to draw
;c  : 0x99
;de : offset to go one line down in nametable
;hl : vram pointer
;ix : pointer to 16bit nametable

;c' : 0x98
;de': offset in tile for X scroll

drawmultipletiles:
dmtl:
5	exx

21	ld l,(ix+0)
21	ld h,(ix+1)	;get tile address from nametable
12	add hl,de	;add offset for scroll
	
8*61	rept 8
	exx
	out (c),l
	out (c),h
	inc h
	exx

	outi
	endm

	
5	exx
17	add ix,de	;to next line in nametable
14	djnz dmtl
--
583 cycles for 8 lines: 72.9 cycles per line, 0.32 rasterlines per line.

	ret
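
The cycle arithmetic in the listing above can be re-checked with a small C sketch. The per-instruction counts are the ones annotated in the listing (MSX Z80 timing, M1 wait state included). The 228 cycles per rasterline figure is an assumption that matches the 0.32 ratio quoted above, and the function names are made up for illustration.

```c
#include <assert.h>

/* Cycle counts as annotated in the listing (MSX Z80, M1 wait included). */
enum {
    CYC_EXX = 5, CYC_LD_R_IX = 21, CYC_ADD_HL_DE = 12,
    CYC_OUT_C_R = 14, CYC_INC_R = 5, CYC_OUTI = 18,
    CYC_ADD_IX_DE = 17, CYC_DJNZ = 14   /* djnz when the jump is taken */
};

/* Assumption: 228 CPU cycles per rasterline. */
#define CYCLES_PER_RASTERLINE 228

/* One pass of the unrolled body: exx, two address bytes, inc h, exx, outi. */
static int cycles_per_pixel_line(void) {
    return CYC_EXX + 2 * CYC_OUT_C_R + CYC_INC_R + CYC_EXX + CYC_OUTI;
}

/* Whole loop iteration for one tile of the given height. */
static int cycles_per_tile(int lines) {
    assert(lines > 0);
    return CYC_EXX + 2 * CYC_LD_R_IX + CYC_ADD_HL_DE
         + lines * cycles_per_pixel_line()
         + CYC_EXX + CYC_ADD_IX_DE + CYC_DJNZ;
}

/* Rasterlines needed to draw one column of pixel_lines lines with 8x8 tiles. */
static double rasterlines_for(int pixel_lines) {
    return pixel_lines * (cycles_per_tile(8) / 8.0) / CYCLES_PER_RASTERLINE;
}
```

With these figures, cycles_per_tile(8) gives the 583 cycles of the listing and rasterlines_for(176) lands just above 56, in line with the numbers quoted in the post.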

By ARTRAG

Enlighted (6515)

19-12-2012, 08:08

If you add correct management of the high bits of the VRAM address, this 8x8 solution is by far slower than the one I posted with 16x16 tiles.
The sole strong point in your code is the 16-bit nametable, where I have (for now) this:

 	ld hl,(_p)  	; q = &tile[*p++][offs];
 	ld a,(hl)
	inc hl
	ld (_p),hl
	ld bc,(_offs)
	add a,b
	ld b,a
	ld hl,_tile
	add hl,bc   	; from here hl points the current tile column

and you have

21	ld l,(ix+0)
21	ld h,(ix+1)	;get tile address from nametable
12	add hl,de	;add offset for scroll
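
To make the difference between the two lookups explicit, here is a rough C model of both. It assumes 256 bytes per 16x16 tile, which is inferred from the asm adding the tile number to the high byte of the offset. Addresses are modelled as plain 16-bit values, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit nametable: the entry is a tile number. Mirrors "add a,b / ld b,a":
   the tile number goes into the high byte of the offset, i.e. entry * 256. */
static uint16_t lookup_index8(uint16_t tile_base, uint8_t entry, uint16_t offs) {
    uint16_t bc = (uint16_t)(offs + ((uint16_t)entry << 8));
    return (uint16_t)(tile_base + bc);
}

/* 16-bit nametable: the entry already is the address of the tile data,
   so only the scroll offset has to be added ("add hl,de"). */
static uint16_t lookup_addr16(uint16_t entry_addr, uint16_t offs) {
    return (uint16_t)(entry_addr + offs);
}

/* Offline conversion of an 8-bit nametable into a 16-bit one. */
static void build_addr_table(uint16_t tile_base, const uint8_t *names,
                             uint16_t *addrs, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        addrs[i] = (uint16_t)(tile_base + ((uint16_t)names[i] << 8));
}
```

Both lookups give the same address; the 16-bit table just moves the scaling and the base addition out of the draw loop, at the price of a nametable twice as large.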

Actually I would like to "compress" tiles by removing repeated columns.
This implies the use of two data structures, one holding the data
uchar column[Nmax][256];
and one holding the addresses of the actual columns for each tile
uchar* tile[Ntile][16];

In the end, I should compute offline something like
tile[n][offset] = &column[x][16*offset];
and thus in the code I should use something like
uchar* q = tile[level[i][j]][offset];

Just thinking out loud
;-)

By ARTRAG

Enlighted (6515)

19-12-2012, 09:18

Sorry, the column definition has to be [Nmax][16].
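
A minimal sketch of how the offline step could look, assuming 16 bytes per column (one byte per pixel, as in the corrected definition) and source tiles stored row by row. NMAX, intern_column and compress_tile are hypothetical names, and the linear search is only there to keep the example short.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char uchar;

#define COL_H  16     /* pixels per column, one byte each */
#define TILE_W 16     /* columns per tile */
#define NMAX   1024   /* hypothetical capacity of the column pool */

static uchar column[NMAX][COL_H];   /* unique columns only */
static int   ncolumns = 0;

/* Return the pool index of a column, adding it only if it is not there yet. */
static int intern_column(const uchar *col) {
    for (int i = 0; i < ncolumns; i++)
        if (memcmp(column[i], col, COL_H) == 0)
            return i;
    assert(ncolumns < NMAX);
    memcpy(column[ncolumns], col, COL_H);
    return ncolumns++;
}

/* For one source tile (pixels[y][x]), fill tile_cols[x] with a pointer
   to the shared copy of column x. */
static void compress_tile(uchar pixels[COL_H][TILE_W], uchar *tile_cols[TILE_W]) {
    for (int x = 0; x < TILE_W; x++) {
        uchar col[COL_H];
        for (int y = 0; y < COL_H; y++)
            col[y] = pixels[y][x];
        tile_cols[x] = column[intern_column(col)];
    }
}
```

At draw time the lookup is then the one from the post, q = tile[level[i][j]][offset], with q pointing at 16 consecutive bytes ready to be sent to VRAM.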

By hit9918

Prophet (2904)

19-12-2012, 18:16

I have trouble with the C 2D array descriptions, especially since I expect double indirection (two 16-bit fetches), while your asm seems to do one 8-bit fetch and then some high-byte offsetting with it.

Doing two 16-bit fetches, with some stack abuse, I end up with surprisingly little extra penalty.
And with DE freed up, it looks like one can use the faster core with 8*57 cycles instead of 8*61.

Funny, in the end it costs practically nothing, lol :)

Instead of
12 add hl,de

it becomes
11 ld sp,offset
12 add hl,sp
5 ld sp,hl
10 pop hl
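
The "costs practically nothing" claim can be checked with a few lines of C. The sketch takes the cycle figures quoted in this post at face value, and uses the 12 cycles that the first listing gives for add hl,de. The function names are made up.

```c
#include <assert.h>

/* Extra cost of the second fetch: ld sp,nn + add hl,sp + ld sp,hl + pop hl
   replace the single add hl,de of the first version. */
static int lookup_penalty(void) {
    return (11 + 12 + 5 + 10) - 12;
}

/* Cycles saved per pixel line by the 57-cycle core over the 61-cycle core. */
static int core_saving_per_line(void) {
    return 61 - 57;
}

/* Net change per tile; a negative value means the double-fetch version
   is faster overall. */
static int net_change_per_tile(int lines) {
    assert(lines > 0);
    return lookup_penalty() - lines * core_saving_per_line();
}
```

For 8-line tiles the 26 extra cycles of the lookup are more than paid back by the 32 cycles saved in the core, so the net change is slightly negative.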

;interrupts must be disabled (SP is used as a data pointer)
;call parameters:

;b  : loop count, amount of tiles to draw
;de : offset to go one line down in nametable

;ix : pointer to 16bit nametable
;sp : for X scroll, column number * 2

;c' : 0x98
;de': vram pointer


drawmultipletiles:
	ld (savesp),sp
dmtl:
5	exx

21	ld l,(ix+0)
21	ld h,(ix+1)	;get address from nametable

11	ld sp,offset	;self-modifying code will be faster; as a first version, ld sp,(offset) works too.
			;in column 1, offset = 2, because the entries are 16-bit pointers
12	add hl,sp
5	ld sp,hl
10	pop hl		;fetch pointer to pixel column
	
8*57	rept 8
	ld a,e
	out (0x99),a
	ld a,d
	out (0x99),a
	inc d

	outi
	endm

	
5	exx
17	add ix,de	;to next line in nametable
14	djnz dmtl

	ld sp,(savesp)
	ret
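
In C terms, the double indirection of the listing above might look like the sketch below. It assumes that each nametable entry points to a per-tile table of column pointers, which is how the asm reads. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TILE_W 8   /* columns per 8x8 tile */
#define TILE_H 8   /* pixel bytes per column */

typedef const uint8_t *column_ptr;   /* points to TILE_H pixel bytes */
typedef const column_ptr *tile_ptr;  /* points to TILE_W column pointers */

/* Two pointer fetches per tile:
   1) nametable entry -> tile        (ld l,(ix+0) / ld h,(ix+1))
   2) tile[xcol]      -> pixel column (add hl,sp / ld sp,hl / pop hl)
   In the asm the offset is column * 2 because pointers are 2 bytes on the Z80;
   C array indexing does that scaling implicitly. */
static column_ptr fetch_column(const tile_ptr *nametable, size_t cell, unsigned xcol) {
    assert(xcol < TILE_W);
    tile_ptr tile = nametable[cell];
    return tile[xcol];
}
```

Two tiles that share a column simply hold the same column_ptr in their tables, which is what makes the column compression discussed earlier in the thread possible with this layout.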
	

By hit9918

Prophet (2904)

19-12-2012, 18:09

So, once it is finally tuned, I guess there are no speed issues.
But I wonder whether you have some special usage in mind?
Something that is specifically about 16x16 compression?

In general usage, stuffing an 8x8 level into 16x16 tiles will again cause the tile combination bloat,
and in that case actually with column sharing defeated.
E.g. when one 8x8 tile is symmetric but the 8x8 tile below it is not, the 16x16 tile ends up without column sharing.

By ARTRAG

Enlighted (6515)

19-12-2012, 19:38

You ended up with my very same inner loop. Add management of the high bits of the VRAM address and you will arrive at my code, with extra overhead due to the larger number of tiles to be accessed...
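
For reference, a small C sketch of what "management of the high bits" involves on the V9938: a 17-bit VRAM address is split into register 14 (A16..A14) and two bytes written to port 0x99, the second one carrying the write flag. The struct and function names are hypothetical, and the 256 bytes per line layout assumes screen 8.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t r14;    /* A16..A14, written to VDP register 14 */
    uint8_t low;    /* A7..A0, first byte sent to port 0x99 */
    uint8_t high;   /* A13..A8 plus write flag 0x40, second byte to port 0x99 */
} vram_setup_t;

/* Split a 17-bit VRAM address into the three values needed for a write. */
static vram_setup_t vram_write_setup(uint32_t addr) {
    assert(addr < 0x20000u);   /* 128 KB of VRAM */
    vram_setup_t s;
    s.r14  = (uint8_t)((addr >> 14) & 0x07);
    s.low  = (uint8_t)(addr & 0xFF);
    s.high = (uint8_t)(((addr >> 8) & 0x3F) | 0x40);
    return s;
}

/* Number of 256-byte line steps (inc h / inc d) after which the 6-bit
   high field overflows and register 14 has to be rewritten. */
static unsigned steps_until_r14_change(uint32_t addr) {
    return 64u - (unsigned)((addr >> 8) & 0x3F);
}
```

With 256 bytes per line, a 16 KB window covers 64 lines, so a loop that only increments the high address byte needs a register 14 update every 64 lines, which is the extra bookkeeping the posted loops leave out.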

Column compression can also be applied to 8x8 tiles, with lower gains. The reason is that mapped RAM costs more than VRAM, and bank switching is usually much more annoying than VRAM access.

I have no special use in mind atm, maybe MOAM in screen 8, maybe something else ;-)

By Huey

Prophet (2675)

19-12-2012, 19:41

"This thread has been officially hijacked"

By Paulbrk

Hero (611)

19-12-2012, 19:48

Frederic.markus, you have to play the Manboy 2 game; it has wonderful screen 5 horizontal and vertical scrolling at the same time. BA-team
