Fast 16-bit signed comparison

Página 1/3
| 2 | 3

Por Metalion

Paragon (1451)

imagem de Metalion

05-12-2020, 19:09

Hello,
I'm searching for a (very) fast way to do a 16-bit signed comparison between 'de' and 'hl'.

More specifically, I have two signed numbers stored in 'de' and 'hl' and I want to be able to output according to that matrix :

	hl<=0	hl>0
de<=0	0	hl
de>0	de	max(hl,de)

I've coded this, but it's messy ('hl' contains the output):

	ld	a,h		;  5
	cp	d		;  5
	jp	z,.same		; 11 | same sign
	jp	c,.hl_max	; 11 | de < 0 and hl >= 0, therefore 'hl' is greater
	jp	.hl_max-1	; 11

.same	ld	a,l		;  5
	cp	e		;  5
	jr	nc,.hl_max+1	;  8/13 | l>e therefore 'hl' is greater
	ex	de,hl		;  5

.hl_max	xor	a		;  5
	cp	h		;  5 | is hl positive ?
	jp	z,.next		; 11
	ld	hl,0		; 11 | hl is negative
	jp	.dothings	; 11

.next	ld	a,l		;  5
	and	a		;  5 | is hl = 0 ?
	jp	nz,.dothings	; 11
	ld	hl,0		; 11 | hl =0

.dothings ...

Is there a faster and better way ? (I'm sure there is)

I thought I found a way using 'sbc hl,de' (see the the snippet of code below). However, unfortunately, it does not work correctly when 'hl' and 'de' have opposite signs ...

; de = max(hl,de) (-29/+41) 
	and	a		;  5
	sbc	hl,de		; 16
	jr	c,$+4		;  8/13
	add	hl,de		; 12 | hl>=de
	ex	de,hl		;  5
	...
Entrar ou registrar-se para comentar

Por Pencioner

Scribe (1480)

imagem de Pencioner

05-12-2020, 19:43

So you actually want the max(de, hl) but value should be set to zero if the max is negative?

Por Mumbly

Expert (86)

imagem de Mumbly

05-12-2020, 19:47

You want that ??? or I missed something

;------------------------------------------------------------
; Compare hl and DE
; jr c,hl_lessthan_de
; jr nc,hl_greaterthan_de
;------------------------------------------------------------

cmpHLDE:
ld a,h
sub d
ret nz
ld a,l
sub e
ret

Por Metalion

Paragon (1451)

imagem de Metalion

05-12-2020, 20:15

Pencioner wrote:

So you actually want the max(de, hl) but value should be set to zero if the max is negative?

Exactly.

Mumbly wrote:

;------------------------------------------------------------
; Compare hl and DE
; jr c,hl_lessthan_de
; jr nc,hl_greaterthan_de
;------------------------------------------------------------

I know about that routine but it does not work with signed numbers.
if hl = 5 and de = -2 (65534) then it will tell me that de is greater than hl, which is false.

Por Briqunullus

Champion (429)

imagem de Briqunullus

05-12-2020, 20:07

ld a,h
cp d
jp z,.same

If for instance HL=-257 (feff) and DE=-256 (ff00) the comparison fails, while both have the same sign.

Por Metalion

Paragon (1451)

imagem de Metalion

05-12-2020, 20:14

Briqunullus wrote:

ld a,h
cp d
jp z,.same

If for instance HL=-257 (feff) and DE=-256 (ff00) the comparison fails, while both have the same sign.

You are right, of course, but I did leave out of the explanation that this version of the code was linked to the fact that 'hl' and 'de' were capped. 'hl' could not be smaller than -255 and 'de' could not be greater than 255. However, this is a constraint (or an advantage) which is disappearing now in my present version.

So I should have added that this code is not working anymore with my current parameters. We have to consider that 'hl' and 'de' can go the full range.

Por Metalion

Paragon (1451)

imagem de Metalion

05-12-2020, 20:43

In other words, I want a very fast code to do this :
hl = max(0,hl,de)
with 'hl' and 'de' containing signed 16-bit numbers.

Por theNestruo

Champion (323)

imagem de theNestruo

05-12-2020, 21:14

Not sure if this is the fastest, but please take a look (warning: untested code!)

	bit	7, d
	jp	z, de_positive ; (de >= 0)
; de < 0
	bit	7, h
	ret	z ; ret hl (hl >= 0) (43 tstates)
; de < 0, hl < 0
	ld	hl, 0
	ret	; ret 0 (59 tstates)

de_positive:
; de > 0 (+21 tstates)
	bit	7, h
	jp	nz, ret_de ; (de > 0, hl < 0) (58 tstates)
; hl > 0, de > 0
	ld	a, d
	cp	h
	ret	c ; ret hl (h > d) (64 tstates)
	jp	nz, ret_de ; (h < d) (85 tsates)
; h = d
	ld	a, e
	cp	l
	ret	c ; ret hl (l > e) (91 tstates)
; h = d, l <= e (101 tstates)
ret_de:
	ex	de, hl
	ret

If you know your use cases, maybe it can be further optimized for the most common scenarios.

Por santiontanon

Paragon (1526)

imagem de santiontanon

05-12-2020, 22:52

Nice solution theNestruo! A small improvement over it. By checking first hl and then de, we avoid one comparison):

	bit 7,h
	jp z,hl_positive	; hl >= 0
; hl < 0
	ld hl,0
hl_positive:
	bit 7,d
	ret nz  ; de < 0
; hl > 0, de > 0
	ld a, d
	cp h
	ret c  ; hl is larger
	jp nz, ret_de  ; de is larger
	ld a, e
	cp l
	ret c  ; hl is larger
ret_de:
	ex de, hl
	ret

I have not tested it either, so might contain bugs Smile

Por Metalion

Paragon (1451)

imagem de Metalion

06-12-2020, 08:51

Thank you TheNuestro and santiontanon.

Nice idea to change 'hl' to zero, if it is negative, before testing both registers. Your version, santiontanon, gives :
. 43 cycles for the fastest case
. 107 cycles for the slowest case

Switching to 'de' for the result, using 'jr' instead of 'jp', and using 'sbc hl,de', here is my version :
. 45 cycles for the fastest case
. 100 cycles for the slowest case
with a provision to gain 5 cycles on the slowest case if carry does not need to be reset.

	bit	7,d		; 10
	jr	z,de_pos	;  8/13 | de >= 0
	ld	de,0		; 11
de_pos	bit	7,h		; 10
	ret	nz  		;  6/12 | hl < 0
; hl > 0, de > 0
	and	a		;  5
	sbc	hl,de		; 16
	ret	c		;  6/12
	add	hl,de		; 12
	ex	de,hl		;  5
	ret			; 11

Por theNestruo

Champion (323)

imagem de theNestruo

06-12-2020, 09:57

I think both santiontanon code and your last code are slower in the case when both hl and de are negative: they just change one of the values to 0 and perform a 16-bit comparison.
They are shorter routines but, if size is not a problem, I would keep the second sign comparison in that route to perform a faster ret 0 / ret the other value by avoiding the 16-bit comparison.

Página 1/3
| 2 | 3