Sprite to Bitmap

Page 2/3

By Chilly Willy

Resident (64)

22-06-2021, 12:52

Metalion wrote:

I don't understand at all how someone who is able to program in assembly, is unable to figure the maths behind this.

address = pointer + (Y/8)*32+(X/8)

How hard is that ?????

It's not so much that I have a hard time figuring out the math; it's writing it in a way that the compiler understands.
I am still learning, and at the end of the day, if one of you gave me a routine, I would dissect and analyze the thing until I got it.

Here is the reality, whether most of you admit it or not.

Someone showed you.

You could walk into a gaming company in the '80s, break out the technical manual, and still end up using in-house routines that were faster.

I would love to meet that one person who took Z80 assembly language in college during the 80's then went off to program games for the MSX or Colecovision.

By Dolphin101546015

Champion (324)

23-06-2021, 17:04

Metalion wrote:

I don't understand at all how someone who is able to program in assembly, is unable to figure the maths behind this.

address = pointer + (Y/8)*32+(X/8)

How hard is that ?????

Because you need to look at this math like:
address = pointer + ((Y>>3)<<5) + (X>>3)

So at first glance you get this optimization:
address = pointer + ((Y & 0xF8)<<2) + (X>>3)

PS: Of course you need to truncate some annoying bits (the low three bits of Y, hence the mask), but you get cleaner code that way.
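A quick way to check the shift form is to brute-force it. The Python sketch below is only an illustration: the function names and the use of $1800 as `pointer` are assumptions made for the example, not something taken from the posts above. It compares the division formula with the shifted one for every 8-bit X and Y.

```python
def tile_address_div(pointer, x, y):
    """Reference formula: pointer + (Y/8)*32 + (X/8), using integer division."""
    return pointer + (y // 8) * 32 + (x // 8)

def tile_address_shift(pointer, x, y):
    """Same result with shifts: mask off the low 3 bits of Y, then shift left by 2."""
    return pointer + ((y & 0xF8) << 2) + (x >> 3)

def tile_address_unmasked(pointer, x, y):
    """What you get if the low bits of Y are NOT truncated (wrong in general)."""
    return pointer + (y << 2) + (x >> 3)

POINTER = 0x1800  # assumed table base, for illustration only

mismatches = sum(
    tile_address_div(POINTER, x, y) != tile_address_shift(POINTER, x, y)
    for y in range(256)
    for x in range(256)
)
print(mismatches)  # 0
```

Without the mask the two forms already disagree at Y = 1, which is why the truncation matters.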

By Chilly Willy

Resident (64)

24-06-2021, 15:31

You guys have been so kind to me that I am going to share my completed and rewritten routine.
It can grab the tile either directly from VRAM or from a table in RAM; adjust the comments for the version you need.

It works, and it's short and sweet.
Remember that you have to provide offsets, because the sprite position is measured from its upper-left corner (0,0), while a sprite can be 8x8 or 16x16, or even bigger if you stack or enlarge them.

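SPRITE_TO_TILE: ; Updated June 24, 2021. Changed for pointer in RAM or direct from VRAM
    LD DE, TABLE      ; Table name in RAM, or $1800 for grabbing from VRAM
    LD A, (SPRITE_Y)  ; Y coordinate plus your offset (assumes SPRITE_Y is a RAM variable)
    AND $F8           ; Drop the low 3 bits: A = (Y/8)*8
    LD H, 0
    LD L, A
    ADD HL, HL
    ADD HL, HL        ; HL = (Y/8)*32
    LD A, (SPRITE_X)  ; X coordinate plus your offset (assumes SPRITE_X is a RAM variable)
    SRL A
    SRL A
    SRL A             ; A = X/8
    OR L              ; Low 5 bits of L are zero here, so OR acts as an add
    LD L, A
    ADD HL, DE        ; HL = table + (Y/8)*32 + (X/8)
    LD A, (HL)        ; Comment out if calling direct from VRAM

    ; DI
    ; LD A, L
    ; OUT (99h), A    ; Address low byte goes to the VDP control port (99h), not the data port
    ; LD A, H
    ; AND 3FH         ; Bits 7-6 clear = set up the address for reading
    ; OUT (99h), A
    ; EI
    ; IN A, (98h)     ; Read the tile from the VDP data port
    RET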
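To see what the routine computes, here is a rough Python model of its register steps, with the offsets mentioned above folded in. The name `sprite_to_tile`, the contents of `table`, and the +8 offsets for a 16x16 sprite are all assumptions made for this example.

```python
def sprite_to_tile(table, sprite_x, sprite_y, offset_x=0, offset_y=0):
    """Return the tile under (sprite_x + offset_x, sprite_y + offset_y)."""
    x = (sprite_x + offset_x) & 0xFF   # 8-bit coordinates, like register A
    y = (sprite_y + offset_y) & 0xFF
    hl = y & 0xF8                      # AND $F8 / LD H,0 / LD L,A
    hl = (hl << 2) & 0xFFFF            # ADD HL,HL twice -> (Y/8)*32
    hl |= x >> 3                       # SRL A x3 / OR L -> + X/8
    return table[hl]                   # ADD HL,DE / LD A,(HL)

# Hypothetical 32x24 name table where each cell stores its own index (mod 256)
table = [i & 0xFF for i in range(32 * 24)]

print(sprite_to_tile(table, 16, 8))        # 34: column 2, row 1
print(sprite_to_tile(table, 16, 8, 8, 8))  # 67: centre of a 16x16 sprite, column 3, row 2
```

In this model, a Y value of 192 or more indexes past the 768-entry table, so off-screen sprites need their own check.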

By ro

Scribe (4451)

26-06-2021, 16:31

Metalion wrote:

Yes, I also used this clever ADD HL,A operation a few times myself.
But I found out it is quicker to do :

ADD A,L
LD L,A
JR NC,$+3
INC H

I use something similar, but without a jump.

ADD A,L
LD L,A
LD A,0
ADC A,H
LD H,A

I love it when there's more than one way of doing the same thing :)
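Both sequences add the 8-bit value in A to HL. The Python sketch below models them side by side and checks them against a plain 16-bit addition; the function names are made up for the example. The second variant relies on LD A,0 leaving the carry flag untouched, which the model mirrors by keeping `carry` around.

```python
def add_hl_a_jump(h, l, a):
    """ADD A,L / LD L,A / JR NC,$+3 / INC H"""
    total = a + l
    l = total & 0xFF
    if total > 0xFF:           # carry set: the JR is not taken, so INC H runs
        h = (h + 1) & 0xFF
    return h, l

def add_hl_a_adc(h, l, a):
    """ADD A,L / LD L,A / LD A,0 / ADC A,H / LD H,A"""
    total = a + l
    l = total & 0xFF
    carry = total >> 8         # 0 or 1, preserved across LD A,0
    h = (0 + h + carry) & 0xFF
    return h, l

def reference(h, l, a):
    """Plain 16-bit addition HL + A, wrapped to 16 bits."""
    hl = (((h << 8) | l) + a) & 0xFFFF
    return hl >> 8, hl & 0xFF

all_equal = all(
    add_hl_a_jump(h, l, a) == add_hl_a_adc(h, l, a) == reference(h, l, a)
    for h in (0x00, 0x18, 0xFF)
    for l in range(256)
    for a in range(256)
)
print(all_equal)  # True
```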

By Bengalack

Champion (394)

26-06-2021, 17:52

ro wrote:
Metalion wrote:

Yes, I also used this clever ADD HL,A operation a few times myself.
But I found out it is quicker to do :

ADD A,L
LD L,A
JR NC,$+3
INC H

I use something similar, but without a jump.

ADD A,L
LD L,A
LD A,0
ADC A,H
LD H,A

I love it when there's more than one way of doing the same thing :)

Love this little detail. I wasn't aware of Chilly Willy's version, but I like that one best. Metalion's variant is 28/23 cycles, and is good, but it is not constant. One can make it constant by replacing JR with JP, but then it ends up at 26 cycles. The above is 28 cycles. Chilly Willy's variant is a constant 25 cycles.

Good input. Thanks!

By Metalion

Paragon (1421)

26-06-2021, 19:57

Bengalack wrote:

Metalion's variant is 28/23 cycles, and is good, but it is not constant

Nope, mine is 23 cycles and constant, that's the beauty of it! :)

ADD A,L     ;  5 cycles
LD  L,A     ;  5 cycles
JR  NC,$+3  ; 13 cycles if NC, 8 cycles if not
INC H       ;  5 cycles
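The arithmetic behind "constant" can be tallied in a few lines of Python. This is just a sketch: the first five timings are the ones quoted in the listing above, while the last three (for the jump-free variant) are standard MSX figures added here as an assumption for comparison.

```python
# Per-instruction timings on MSX (Z80 plus one M1 wait state).
CYCLES = {
    "ADD A,L": 5,
    "LD L,A": 5,
    "JR NC (taken)": 13,
    "JR NC (not taken)": 8,
    "INC H": 5,
    # Assumed values, not quoted in the listing above:
    "LD A,0": 8,
    "ADC A,H": 5,
    "LD H,A": 5,
}

def total(*instructions):
    """Sum the cycle cost of a sequence of instruction names."""
    return sum(CYCLES[name] for name in instructions)

jr_no_carry = total("ADD A,L", "LD L,A", "JR NC (taken)")
jr_carry = total("ADD A,L", "LD L,A", "JR NC (not taken)", "INC H")
adc_variant = total("ADD A,L", "LD L,A", "LD A,0", "ADC A,H", "LD H,A")

print(jr_no_carry, jr_carry, adc_variant)  # 23 23 28
```

The taken jump costs exactly as much as the untaken jump plus INC H, so both paths come out at 23.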

By Chilly Willy

Resident (64)

27-06-2021, 00:56

Metalion wrote:
Bengalack wrote:

Metalion's variant is 28/23 cycles, and is good, but it is not constant

Nope, mine is 23 cycles and constant, that's the beauty of it! :)

ADD A,L     ;  5 cycles
LD  L,A     ;  5 cycles
JR  NC,$+3  ; 13 cycles if NC, 8 cycles if not
INC H       ;  5 cycles

Hey, I always want to improve, so let's see the whole routine so I can plug it in and test it.

By Arjan

Paladin (730)

27-06-2021, 08:52

Well, your version is actually the fastest (OR L / LD L,A is just 10 cycles) but it only works correctly if the table address is a multiple of 32.

By Bengalack

Champion (394)

27-06-2021, 13:24

Metalion wrote:
Bengalack wrote:

Metalion's variant is 28/23 cycles, and is good, but it is not constant

Nope, mine is 23 cycles and constant, that's the beauty of it!

You are of course right. And I'm totally blind :) haha. And this is great. I tend to favor constant time, so this is right up my alley :)

By Micha

Resident (34)

28-06-2021, 11:28

The more I look into this, the more I appreciate the "Konami solution" posted by Guillian. It is a little gem and way faster (approx. 30%) than the other solutions posted. It cleverly uses the carry to copy bits from one register to the other. I also like that by using HL as input you can fetch the two coordinates from memory in one go ( ld hl,(nn) ) instead of doing ld a,(nn) twice, which saves 11 cycles. It also doesn't use BC and DE, which is nice.
There is one drawback, however: it is no longer easy to apply offsets to the coordinates, and in most cases you don't want the character behind the upper-left corner but the one behind the middle of the sprite.
The most efficient "Konami solution" with offsets I could come up with looks like this (I swapped l and h to leave out the ld l,h at the end):

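ld a,(X)
add a,offsetX
ld l,a              ; L = X + offset
ld a,(Y)
add a,offsetY       ; A = Y + offset
rra
rra
rra
rra                 ; carry = Y bit 3
rr l                ; rotate it into the top of L
rra
rr l                ; then Y bit 4
rra
rr l                ; then Y bit 5: L = Y5 Y4 Y3 X7..X3
and 3               ; A = Y7 Y6
add a,bufstart/256  ; bufstart must be 256-byte aligned
ld h,a
ld a,(hl)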

It should be 22 cycles faster than the other solution. Anyone able to optimise this even further ?
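For anyone who wants to check the bit shuffling without an emulator, here is a small Python model of the rotate-through-carry steps. It is a sketch, not the original Konami code: `rotate_right_through_carry` and `konami_lookup_address` are names made up for the example, the coordinates passed in are assumed to have their offsets already added, and `bufstart` is assumed to be 256-byte aligned.

```python
def rotate_right_through_carry(value, carry):
    """Model of RRA / RR r on 8 bits: returns (new_value, new_carry)."""
    return (value >> 1) | (carry << 7), value & 1

def konami_lookup_address(x, y, bufstart):
    """Address of the tile under (x, y), built the way the routine builds HL."""
    l = x & 0xFF
    a = y & 0xFF
    carry = 0                      # the incoming carry is masked out by AND 3 later
    for _ in range(4):             # rra x4: carry now holds Y bit 3
        a, carry = rotate_right_through_carry(a, carry)
    l, carry = rotate_right_through_carry(l, carry)      # rr l: Y3 enters L bit 7
    for _ in range(2):             # (rra, rr l) twice: Y4 and Y5 follow
        a, carry = rotate_right_through_carry(a, carry)
        l, carry = rotate_right_through_carry(l, carry)
    a &= 3                         # and 3: keep Y7 Y6
    h = (a + (bufstart >> 8)) & 0xFF                     # add a,bufstart/256
    return (h << 8) | l

BUFSTART = 0x1800  # assumed buffer address, for illustration only

all_match = all(
    konami_lookup_address(x, y, BUFSTART) == BUFSTART + (y // 8) * 32 + (x // 8)
    for y in range(256)
    for x in range(256)
)
print(all_match)  # True
```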
