8-bit atan2

Page 1/3
| 2 | 3

By ARTRAG

Enlighted (6538)

ARTRAG's picture

15-05-2016, 22:37

This function calculates the angle, in a 256-degree circle. The trick is to use logarithmic division to get the y/x ratio and integrate the power function into the atan table. The idea comes from a 6502 routine.
It seems (to me) the fastest solution ever seen on z80 to compute angles. It should be perfect for bullet hell shooters.
Maybe someone has a faster solution, any proposal for improvements ?

; 8-bit atan2

; Calculate the angle, in a 256-degree circle.
; The trick is to use logarithmic division to get the y/x ratio and
; integrate the power function into the atan table. 

;	input
; 	B = x, C = y	in -128,127
;
;	output
;	A = angle		in 0-255

;      |
;  q1  |  q0
;------+-------
;  q3  |  q2
;      |
		
atan2:	
		ld	de,0x8000			
		
		ld	a,c
		add	a,d
		rl	e				; y-					
		
		ld	a,b
		add	a,d
		rl	e				; x-					
		
		dec	e
		jp	z,q1
		dec	e
		jp	z,q2
		dec	e
		jp	z,q3
		
q0:			
		ld	h,log2_tab / 256
		ld	l,b
		
		ld	a,(hl)			; 32*log2(x)
		ld	l,c
		
		sub	a,(hl)			; 32*log2(x/y)
		
		jr	nc,1f			; |x|>|y|
		neg				; |x|<|y|	A = 32*log2(y/x)
1:		ld	l,a

		ld	h,atan_tab / 256
		ld	a,(hl)
		ret	c			; |x|<|y|
		
		neg
		and	0x3F			; |x|>|y|
		ret
				
q1:		ld	a,b
		neg
		ld	b,a
		call	q0
		neg
		and	0x7F
		ret
		
q2:		ld	a,c
		neg
		ld	c,a
		call	q0
		neg
		ret		
		
q3:		ld	a,b
		neg
		ld	b,a
		ld	a,c
		neg
		ld	c,a
		call	q0
		add	a,128
		ret


		; align to byte
		
		align 0x100
		
;		z=0:255; sprintf('%d,',fix(atan(2.^(z/32))*128/pi))

		;;;;;;;; atan(2^(x/32))*128/pi ;;;;;;;;
atan_tab:	
		db 020h,020h,020h,021h,021h,022h,022h,023h,023h,023h,024h,024h,025h,025h,026h,026h
		db 026h,027h,027h,028h,028h,028h,029h,029h,02Ah,02Ah,02Ah,02Bh,02Bh,02Ch,02Ch,02Ch
		db 02Dh,02Dh,02Dh,02Eh,02Eh,02Eh,02Fh,02Fh,02Fh,030h,030h,030h,031h,031h,031h,031h
		db 032h,032h,032h,032h,033h,033h,033h,033h,034h,034h,034h,034h,035h,035h,035h,035h
		db 036h,036h,036h,036h,036h,037h,037h,037h,037h,037h,037h,038h,038h,038h,038h,038h
		db 038h,039h,039h,039h,039h,039h,039h,039h,039h,03Ah,03Ah,03Ah,03Ah,03Ah,03Ah,03Ah
		db 03Ah,03Bh,03Bh,03Bh,03Bh,03Bh,03Bh,03Bh,03Bh,03Bh,03Bh,03Bh,03Ch,03Ch,03Ch,03Ch
		db 03Ch,03Ch,03Ch,03Ch,03Ch,03Ch,03Ch,03Ch,03Ch,03Dh,03Dh,03Dh,03Dh,03Dh,03Dh,03Dh
		db 03Dh,03Dh,03Dh,03Dh,03Dh,03Dh,03Dh,03Dh,03Dh,03Dh,03Dh,03Dh,03Eh,03Eh,03Eh,03Eh
		db 03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh
		db 03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Eh,03Fh,03Fh,03Fh,03Fh
		db 03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh
		db 03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh
		db 03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh
		db 03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh
		db 03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh,03Fh
 
 ;		x=0:255;x(1)=1; sprintf('%d,',fix(32*log2(x)))
 
		;;;;;;;; log2(x)*32 ;;;;;;;; 
log2_tab:  
		db 000h,000h,020h,032h,040h,04Ah,052h,059h,060h,065h,06Ah,06Eh,072h,076h,079h,07Dh
		db 080h,082h,085h,087h,08Ah,08Ch,08Eh,090h,092h,094h,096h,098h,099h,09Bh,09Dh,09Eh
		db 0A0h,0A1h,0A2h,0A4h,0A5h,0A6h,0A7h,0A9h,0AAh,0ABh,0ACh,0ADh,0AEh,0AFh,0B0h,0B1h
		db 0B2h,0B3h,0B4h,0B5h,0B6h,0B7h,0B8h,0B9h,0B9h,0BAh,0BBh,0BCh,0BDh,0BDh,0BEh,0BFh
		db 0C0h,0C0h,0C1h,0C2h,0C2h,0C3h,0C4h,0C4h,0C5h,0C6h,0C6h,0C7h,0C7h,0C8h,0C9h,0C9h
		db 0CAh,0CAh,0CBh,0CCh,0CCh,0CDh,0CDh,0CEh,0CEh,0CFh,0CFh,0D0h,0D0h,0D1h,0D1h,0D2h
		db 0D2h,0D3h,0D3h,0D4h,0D4h,0D5h,0D5h,0D5h,0D6h,0D6h,0D7h,0D7h,0D8h,0D8h,0D9h,0D9h
		db 0D9h,0DAh,0DAh,0DBh,0DBh,0DBh,0DCh,0DCh,0DDh,0DDh,0DDh,0DEh,0DEh,0DEh,0DFh,0DFh
		db 0DFh,0E0h,0E0h,0E1h,0E1h,0E1h,0E2h,0E2h,0E2h,0E3h,0E3h,0E3h,0E4h,0E4h,0E4h,0E5h
		db 0E5h,0E5h,0E6h,0E6h,0E6h,0E7h,0E7h,0E7h,0E7h,0E8h,0E8h,0E8h,0E9h,0E9h,0E9h,0EAh
		db 0EAh,0EAh,0EAh,0EBh,0EBh,0EBh,0ECh,0ECh,0ECh,0ECh,0EDh,0EDh,0EDh,0EDh,0EEh,0EEh
		db 0EEh,0EEh,0EFh,0EFh,0EFh,0EFh,0F0h,0F0h,0F0h,0F1h,0F1h,0F1h,0F1h,0F1h,0F2h,0F2h
		db 0F2h,0F2h,0F3h,0F3h,0F3h,0F3h,0F4h,0F4h,0F4h,0F4h,0F5h,0F5h,0F5h,0F5h,0F5h,0F6h
		db 0F6h,0F6h,0F6h,0F7h,0F7h,0F7h,0F7h,0F7h,0F8h,0F8h,0F8h,0F8h,0F9h,0F9h,0F9h,0F9h
		db 0F9h,0FAh,0FAh,0FAh,0FAh,0FAh,0FBh,0FBh,0FBh,0FBh,0FBh,0FCh,0FCh,0FCh,0FCh,0FCh
		db 0FDh,0FDh,0FDh,0FDh,0FDh,0FDh,0FEh,0FEh,0FEh,0FEh,0FEh,0FFh,0FFh,0FFh,0FFh,0FFh
 		


Login or register to post comments

By Grauw

Ascended (10061)

Grauw's picture

16-05-2016, 00:11

Nice snippet! Bookmarked Smile.

By NYYRIKKI

Enlighted (5873)

NYYRIKKI's picture

16-05-2016, 10:03

Very nice! I found following algorithm, but I did not test if it works ok or not:

10 'A=ATAN2(X,Y)
20 A1=ATN(1):AY=ABS(Y)
30 IF X<0 THEN A=A1*3-(AY+X)/(AY-X)*A1 ELSE A=A1-(X-AY)/(X+AY)*A1
40 IF Y<0 THEN A=-A

By ARTRAG

Enlighted (6538)

ARTRAG's picture

16-05-2016, 18:11

I've found that approximation as well, but it needs a division. The one above needs only a subtraction and 3 accesses to tables.
Moreover, I limited it to signed bytes in -128,127, but it works also in -256,255 provided that you store the signs of X and Y in E

By NYYRIKKI

Enlighted (5873)

NYYRIKKI's picture

16-05-2016, 19:52

ARTRAG wrote:

I've found that approximation as well, but it needs a division. The one above needs only a subtraction and 3 accesses to tables.

Yes, I noticed that it does not really compete with your code at all, but I wanted to drop it in, in case someone want's to play with some simple BASIC routines first. Smile

BTW this is quite a bit out of topic, but as I see you always love to play with math & speedy Z80 routines, you don't happen to have any reasonable clipping routine in your magic drawer? I mean something that would ie. take 4x signed 16bit values that represent vector (SX.SY)-(DX,DY) and clip it to 8bit signed address space... In MSX case ie (-127,-105)-(128,106)... In theory this should be pretty straight forward, but if I try to implement it I very soon start to realize that the accuracy is not what I want or the routine becomes very slow &complex... especially when both ends of the vector are out of "window" and yet still partly crossing it. (Maybe atan2 could be used to see if this is the case and to avoid needless divides? I didn't think this yet trough...)

By PingPong

Prophet (3758)

PingPong's picture

16-05-2016, 20:15

NYYRIKKI wrote:

....clipping routine....

Prodatron should have it used on graphics plot routines on SymbOS. Ask him....

By NYYRIKKI

Enlighted (5873)

NYYRIKKI's picture

16-05-2016, 20:38

PingPong wrote:
NYYRIKKI wrote:

....clipping routine....

Prodatron should have it used on graphics plot routines on SymbOS. Ask him....

Hmm... I think you got me a bit wrong... I didn't mean window to window clipping, but vector to window clipping.

By ARTRAG

Enlighted (6538)

ARTRAG's picture

17-05-2016, 07:34

Sorry, I have no clipping code in the drawer...
A fully working routine has to deal with lots of special cases according to the positions of the vertices with respect to the frame. It is not an easy task.

By DarkSchneider

Paladin (942)

DarkSchneider's picture

17-05-2016, 11:06

@ARTRAG could you put a link to your drawer?. I remember sometime you put the link (with some routines), but I don't have it in the bookmark.

By wouter_

Champion (467)

wouter_'s picture

17-05-2016, 18:33

NYYRIKKI wrote:

... you don't happen to have any reasonable clipping routine in your magic drawer? ...

I assume you already studied the Cohen–Sutherland and Liang–Barsky algorithms? Both indeed require divisions and multiplications.

I didn't fully work it out, but I think you can eliminate those divisions/multiplications by repeatedly calculating the midpoint of the line (=very easy in Z80 assembly), pick the right segment(s), and continue until you hit one of the screen boundaries.

This does require several steps, but likely less steps than what a (software) division routine needs. On the other hand, each step is likely a bit more expensive than a step in the division routine. So I don't know if it's a netto win.

By Metalion

Paragon (1444)

Metalion's picture

17-05-2016, 19:24

wouter_ wrote:

I assume you already studied the Cohen–Sutherland and Liang–Barsky algorithms?

Funny, because while reading the latest posts on this thread, I begin to think about how I would solve the clipping problem. And I thought : well, I can deduce from the vector its actual x-y equation, and therefore find the clipped segment limits by solving the equation for the window coordinates. I then proceed to read the wiki article about the algorythms, and ... Bam ... There you go, I had reinvented the Liang-Barsky algorythm all over again !
:D

I must say I felt a little bit proud about it ... For about 3 minutes.

Page 1/3
| 2 | 3