About C / Z80 optimizations (SDCC)

Page 7/15
1 | 2 | 3 | 4 | 5 | 6 | | 8 | 9 | 10 | 11 | 12

By Giangiacomo Zaffini 2

Master (154)

08-09-2019, 19:08

A new release 3.10 of SDCC is about to land. Time flies!

By DarkSchneider

Paladin (854)

08-09-2019, 20:07

PingPong wrote:

Look at the vrampoke and vrampeek functions. SDCC generates some rather lengthy code:
this segment

	ld	c, 1 (iy)
	ld	b, #0x00
	ld	a, c
	and	a, #0x3f
	or	a, #0x40
	out	(_port99), a

is simply extremely stupid.
It does ld c, then ld b, 00, then ld a, c (so the first ld c could be ld a, .....

I see you are using Fusion-C. It is known, and even acknowledged by the author, that the libraries are currently not optimized and that they call one another in layers, so the long code you see is probably a result of those library dependencies.
I think it would be better to simply test with pure C code that does not depend on libraries or machine-specific routines, which may not be optimized.

akumajo wrote:

I'm glad to see that there is a growing community of programmers using SDCC, and even more pleased to see that a specific version for Z80 has just been released. I just tried ZSDCC and I admit it really won me over; the generated code is clean and optimized. Thank you for this topic, which will allow many of us to improve at writing C programs with SDCC and/or ZSDCC!

Yes, Fusion-C currently uses SDCC version 3.6.5.

z88dk is nice and, as said, some of the optimizations included in the z88dk version of SDCC get copied into the main branch. On top of that, it has good (generic) libraries.

By zPasi

Champion (410)

08-09-2019, 20:13

PingPong wrote:

this segment

	ld	c, 1 (iy)
	ld	b, #0x00
	ld	a, c
	and	a, #0x3f
	or	a, #0x40
	out	(_port99), a

is simply extremely stupid.
It does ld c, then ld b, 00, then ld a, c (so the first ld c could be ld a, .....

Indeed it is. SDCC occasionally does that. I tried --opt-code-speed --max-allocs-per-node 100000 and even that didn't make the stupidity go away.

It is ugly, but it doesn't affect performance very much. And normally you hand-optimize this kind of function anyway.

Pushing parameters onto the stack has a bigger impact. When there is only one parameter you can use __z88dk_fastcall, but otherwise you can't. __z88dk_callee helps partially, even if it's not as good as register parameters.

BTW: you still have --opt-code-size in place, I can tell from lines like ___sdcc_enter_ix

By ericb59

Paladin (852)

08-09-2019, 20:34

This whole topic is interesting, but also very funny!
While you argue about which compiler is the best, or about which code is not optimal, nothing concrete is being coded for the MSX ...

You are all fabulous people who know the MSX like the back of your hand and speak assembler like a native language.
I'm not worth 1/10 of any single person here; all I want is for games to come out on MSX.
Fusion-C is one small brick in this project: a simple idea, probably not optimal and not perfect, but simple and easy enough that many programmers can be tempted to code on MSX without too much trouble.

Do you need super clean, super optimized code?
In recent times we have seen some games developed in compiled BASIC come out on the scene, so I think that even with awful code generated by an ugly compiler and a non-optimized library, we can do better ... ;)

Be happy, and code more for MSX :D

By zPasi

Champion (410)

08-09-2019, 21:07

ericb59 wrote:

This whole topic is interesting, but also very funny!
While you argue about which compiler is the best, or about which code is not optimal, nothing concrete is being coded for the MSX ...

Yes, it is funny in a way :)

My idea was not to argue about which compiler is the best. I was trying to discuss SDCC and best practices for writing well-performing code without resorting to assembly the moment there are slight speed issues.

Writing assembly and hand-optimizing compiler-generated code is a rather slow process. I would like to do less of that and more game logic. And C code is more portable and easier to change when needed.

When I write assembly, I'd like it to be versatile, something I could contribute to Fusion-C :)

Quote:

In recent times we have seen some games developed in compiled BASIC come out on the scene, so I think that even with awful code generated by an ugly compiler and a non-optimized library, we can do better ... ;)

Yes. Many game ideas are not so performance critical, so much can be done in current Fusion-C already. And utilities, of course.

Quote:

Be happy, and code more for MSX :D

Thanks, I'll try :D

By DarkSchneider

Paladin (854)

09-09-2019, 10:28

ericb59 wrote:

all I want is for games to come out on MSX.

Wise words. I'm currently not using SDCC, but if I move to it sometime in the future I'd like to contribute to Fusion-C.

zPasi wrote:

Writing assembly and hand-optimizing compiler-generated code is a rather slow process. I would like to do less of that and more game logic. And C code is more portable and easier to change when needed.

Ding ding, BINGO! Totally agree. Good logic is what makes a good game (interesting gameplay with possibilities). In ASM, changing the logic sometimes means modifying a lot of code to make the registers fit the new code. With a compiler you modify code based on structures and a friendly language, and the compiler then generates the new machine code for you.

My opinion is that going 100% ASM currently limits you in many ways, even if you are very good at coding, mainly in productivity, because rewriting is much slower.
For me the perfect combo is C for the main logic and ASM for critical routines (like the ISR), or for those that you see the compiler cannot optimize and that have a real impact (maybe the map decompression). Even in the second case, code them in C first so that changes can be made easily, and translate the final version to ASM at the optimization stage of development.

By zPasi

Champion (410)

10-09-2019, 14:57

I did a little comparison between SDCC and ZSDCC.

;  x = FT_RandomNumber(0,256); ZSDCC

   6177 AF            [ 4]22993 	xor	a, a
   6178 F5            [11]22994 	push	af
   6179 33            [ 6]22995 	inc	sp
   617A AF            [ 4]22996 	xor	a, a
   617B F5            [11]22997 	push	af
   617C 33            [ 6]22998 	inc	sp
   617D CDrE4r10      [17]22999 	call	_FT_RandomNumber
   6180 F1            [10]23000 	pop	af

; cycles: 69

;  x = FT_RandomNumber(0,256); SDCC

   624E 21 00 00      [10]23040 	ld	hl,#0x0000
   6251 E5            [11]23041 	push	hl
   6252 CDrFAr10      [17]23042 	call	_FT_RandomNumber
   6255 F1            [10]23043 	pop	af

; cycles: 48

;  newRock(x,y, dx, dy, i2); ZSDCC

   61A8 3Ar20r01      [13]23023 	ld	a, (_main_i2_65536_663)
   61AB F5            [11]23024 	push	af
   61AC 33            [ 6]23025 	inc	sp
   61AD 58            [ 4]23026 	ld	e, b	; dx in b
   61AE D5            [11]23027 	push	de	; dy in d
   61AF 2Ar23r01      [16]23028 	ld	hl, (_main_y_65536_663)
   61B2 E5            [11]23029 	push	hl
   61B3 2Ar21r01      [16]23030 	ld	hl, (_main_x_65536_663)
   61B6 E5            [11]23031 	push	hl
   61B7 CDr2Fr52      [17]23032 	call	_newRock
   61BA 21 07 00      [10]23033 	ld	hl, #7
   61BD 39            [11]23034 	add	hl, sp
   61BE F9            [ 6]23035 	ld	sp, hl

; cycles: 143

;  newRock(x,y, dx, dy, i2); SDCC

   627A 3Ar20r01      [13]23062 	ld	a,(_main_i2_1_412)
   627D F5            [11]23063 	push	af
   627E 33            [ 6]23064 	inc	sp
   627F C5            [11]23065 	push	bc	; dy in b
   6280 33            [ 6]23066 	inc	sp
   6281 7B            [ 4]23067 	ld	a,e		; dx in e
   6282 F5            [11]23068 	push	af
   6283 33            [ 6]23069 	inc	sp
   6284 2Ar23r01      [16]23070 	ld	hl,(_main_y_1_412)
   6287 E5            [11]23071 	push	hl
   6288 2Ar21r01      [16]23072 	ld	hl,(_main_x_1_412)
   628B E5            [11]23073 	push	hl
   628C CDr4Ar52      [17]23074 	call	_newRock
   628F 21 07 00      [10]23075 	ld	hl,#7
   6292 39            [11]23076 	add	hl,sp
   6293 F9            [ 6]23077 	ld	sp,hl

; cycles: 166

Interestingly, in the first case ZSDCC actually performs worse. The parameters are 0,256, but because the parameter type is (unsigned) char, the second one is converted to zero. That may not affect how the code is compiled. In any case, SDCC does better on this one.

In the second case it's ZSDCC's turn to perform better. But the cases are similar, so why don't the same optimizations work in both? Weird.

Ok, enough of this. I'd better stop testing compilers and resume coding, for now.

By Grauw

Ascended (8309)

10-09-2019, 15:10

Nooo, do more :D

Weird to see such deviation. ZSDCC is based on SDCC right?

By zPasi

Champion (410)

10-09-2019, 17:32

Grauw wrote:

Weird to see such deviation. ZSDCC is based on SDCC right?

Yes. A Z80-only branch. Generally it should optimize better for Z80.

By DarkSchneider

Paladin (854)

10-09-2019, 18:48

It is usually better, but SDCC is currently very close. As mentioned, later releases sometimes merge optimizations from the z88dk branch into the main one.
In any case, a few cycles is nothing to worry about. I think the current SDCC is fine.
