I would have put the 3rd parameter in BC; instead, the 3rd parameter and all following ones are passed on the stack.
I'm a little skeptical about the choice of some of the registers. When I talked to the author about it, he told me that it is the result of an analysis of existing code. That's a pretty unstoppable answer. ^^
We'd have to see what data was analyzed, but at least it's not an arbitrary choice.
And in any case, it's a big step forward for SDCC's compiler performance.
Personally, the strangest thing for me is that you can pass in registers: 1 x 8-bit parameter followed by 1 x 16-bit parameter (A and DE), but not the opposite! (In that case, the 1st parameter, 16-bit, goes into HL and the 2nd, 8-bit, goes onto the stack.)
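To make the asymmetry easier to see, here is a small host-side C model of the register assignment as described in this thread. It is only a sketch: the names `param_loc` and `assign_params` are made up for illustration, and the 8-bit-after-8-bit case (second parameter in L) is my reading of the SDCC manual rather than something stated above, so check it against your SDCC version.

```c
#include <assert.h>
#include <stddef.h>

/* Where a parameter ends up (hypothetical names, for illustration only). */
typedef enum { LOC_A, LOC_L, LOC_HL, LOC_DE, LOC_STACK } param_loc;

/* Model the placement of 8-bit and 16-bit parameters.
 * bits[i] is the width of parameter i; out[i] receives its location. */
static void assign_params(const int *bits, size_t n, param_loc *out)
{
    for (size_t i = 0; i < n; i++) {
        assert(bits[i] == 8 || bits[i] == 16);  /* wider types not modelled */
        if (i == 0)
            out[i] = (bits[0] == 8) ? LOC_A : LOC_HL;
        else if (i == 1 && bits[1] == 16)
            out[i] = LOC_DE;     /* 16-bit second parameter, after A or HL */
        else if (i == 1 && bits[0] == 8)
            out[i] = LOC_L;      /* assumption: 8-bit after 8-bit goes in L */
        else
            out[i] = LOC_STACK;  /* 8-bit after 16-bit, and every 3rd+ parameter */
    }
}
```

Feeding it the widths {8, 16} gives A and DE, while {16, 8} gives HL and the stack, which is exactly the odd case mentioned above.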
The next step for me would be to be able to choose directly which parameter uses which register. This way, not only could we optimize the calls to some functions written in inline assembler, but above all, we could create C interfaces to libraries written in assembler (like the BIOS for example).
But unless there is a big demand from the users, I'm afraid it will never see the light of day in SDCC. :-/
When you are concerned about speed, take note of the keyword __z88dk_callee as well. I've shaved off A LOT of cycles with this one in some cases.
Like this - how to avoid the normal stack-dance:
; extern void copy16x16blockInRam_fromC( unsigned char* pAddrSrc, unsigned char* pAddrDest ) __z88dk_callee;
_copy16x16blockInRam_fromC::
    pop iy          ; return address
    pop hl          ; pAddrSrc
    pop de          ; pAddrDest
    ; do whatever stuff in here...
    jp ( iy )       ; instead of return (faster - the address is already retrieved from the stack / SP already changed)
The great thing here is that the caller does not pop all the registers after the return either - the stack is cleaned up already. *LOTS* of cycles saved.
__z88dk_callee is only supported on the caller side, so if you want to use it, you can only use it on routines you write in asm yourself.
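For the C side of that routine, something like the following sketch could work. The `Z88DK_CALLEE` macro and the host stand-in are my own additions so the same file also builds with a desktop compiler for testing; the stand-in assumes the routine copies a contiguous 16x16-byte block, which the post above does not actually specify.

```c
#include <string.h>

#ifdef __SDCC
#define Z88DK_CALLEE __z88dk_callee   /* callee removes its arguments from the stack */
#else
#define Z88DK_CALLEE                  /* keyword does not exist on a host compiler */
#endif

/* Prototype taken from the comment in the asm listing above.
 * Under SDCC the body is the hand-written assembler routine. */
extern void copy16x16blockInRam_fromC(unsigned char *pAddrSrc,
                                      unsigned char *pAddrDest) Z88DK_CALLEE;

#ifndef __SDCC
/* Hypothetical host stand-in, only here so callers can be tested off-target.
 * Assumption: the block is 16 * 16 contiguous bytes. */
void copy16x16blockInRam_fromC(unsigned char *pAddrSrc,
                               unsigned char *pAddrDest)
{
    memcpy(pAddrDest, pAddrSrc, 16 * 16);
}
#endif
```

Callers just write `copy16x16blockInRam_fromC(src, dst);` as usual; the keyword only changes who cleans up the stack.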
I'm a little skeptical about the choice of some of the registers. When I talked to the author about it, he told me that it is the result of an analysis of existing code. That's a pretty unstoppable answer. ^^
We'd have to see what data was analyzed, but at least it's not an arbitrary choice.
Well, if you read the paper, it gives some reasonable insight, I think, although it neither specifies all the details of the data set nor attempts to give a logical explanation for the results found.
Personally, the strangest thing for me is that you can pass in registers: 1 x 8-bit parameter followed by 1 x 16-bit parameter (A and DE), but not the opposite! (In that case, the 1st parameter, 16-bit, goes into HL and the 2nd, 8-bit, goes onto the stack.)
Hm, that is a little bit odd indeed, because the paper explicitly mentions that case (and says it does use the register). I guess this aspect did not survive the discussions at some point in the review & integration process.
The next step for me would be to be able to choose directly which parameter uses which register. This way, not only could we optimize the calls to some functions written in inline assembler, but above all, we could create C interfaces to libraries written in assembler (like the BIOS for example).
I think there are really two different goals here that are currently using the same mechanism.
What you describe here is more about optimising the C to assembly interface. Although the current calling convention annotation system could be extended for it, it may well be better served with an entirely separate dedicated mechanism. For example, one where you assign the input parameters to a struct representing the Z80 register bank and then pass that into an assembly function call.
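As a rough idea of what such a register-bank mechanism could look like from C, here is a sketch. Nothing like this exists in SDCC as far as I know: `z80_regs`, `z80_routine` and `call_with_regs` are hypothetical names, and on a real target `call_with_regs` would be a small assembler trampoline that loads the registers from the struct, calls the routine and stores them back. Here it is plain C with a mock routine so it can run on a host.

```c
#include <stdint.h>

/* Hypothetical image of the Z80 registers used for inputs and outputs. */
typedef struct {
    uint8_t  a;
    uint16_t bc;
    uint16_t de;
    uint16_t hl;
} z80_regs;

/* Hypothetical view of an assembler routine: reads and updates the bank. */
typedef void (*z80_routine)(z80_regs *regs);

/* On target this would be an asm trampoline; here it just forwards. */
static void call_with_regs(z80_routine routine, z80_regs *regs)
{
    routine(regs);
}

/* Mock "assembler" routine for host testing: HL = HL + DE. */
static void mock_add_hl_de(z80_regs *regs)
{
    regs->hl = (uint16_t)(regs->hl + regs->de);
}
```

The caller fills in only the registers the routine expects, calls `call_with_regs(mock_add_hl_de, &regs)`, and reads the results back from the same struct.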
But from a pure C point of view, the calling convention is primarily an internal detail of how code generated by the C compiler interfaces with other C functions. Because you do not control the code that is generated, as a programmer you cannot make a meaningful decision on which calling convention is best for the generated code. So from that perspective it doesn’t make much sense to give fine-grained control of the calling convention, and it even arguably goes against the concept of having a higher level language like C.
and it even arguably goes against the concept of having a higher level language like C
I agree, the point of C is that you can write complicated stuff much faster than in assembler and don't care so much about losing cycles.
If you really want to optimize and avoid compiler shenanigans you can always store the parameters somewhere in memory, then use naked calls and inline asm code for performance critical functions. But well, if you do that everywhere then that's almost the same as coding directly in assembler...
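A minimal sketch of the "parameters in memory" pattern, written in plain C so it runs anywhere. The names `fill_dst`, `fill_value`, `fill_count` and `fill_from_globals` are made up for illustration; under SDCC the body of the function would be the part you replace with inline asm in a naked function, reading the same globals.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block: the caller writes these before the call. */
static uint8_t *fill_dst;
static uint8_t  fill_value;
static size_t   fill_count;

/* Takes no C arguments, so no calling convention is involved at all:
 * everything it needs is read from the globals above. */
static void fill_from_globals(void)
{
    uint8_t *p = fill_dst;
    size_t n = fill_count;

    while (n--)
        *p++ = fill_value;
}
```

The price is that the function is not reentrant and every call site has to set up the globals by hand.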
I am finding some issues with SDCC 4.1.12.
It seems they updated the z80 library to rely on the new calling convention, and made it incompatible with the old one. So far I see they removed register saving to the stack before integer division, for instance, and this is breaking all my tests. Upgrading my existing code base while using --sdcccall 0 is turning into a bit of a pain.
Currently the release is 4.1.14, but you are right that the major change in parameter handling came with release 4.1.12.
Yeah, I confirm some other things are broken in 4.1.12 compared to previous versions if you try to enable the legacy calling convention, so porting older code is quite a challenge. What I have found is the following:
- The z80 library is built with the newer calling convention, so several functions mess up the stack when called using the legacy one. As this is done when building SDCC, it cannot be changed afterwards. The solution is to re-implement the functions locally.
- Parameter passing using index arrays doesn't work (the index is ignored).
This is quite a problem, because syntactically correct code just doesn't do what is expected, or just hangs. I will try with the latest snapshot, but I am not very hopeful, because this indicates that the test suites used by SDCC are insufficient.
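As an example of what "re-implement locally" could mean for integer division, here is a sketch of an unsigned 16-bit shift-and-subtract division that never uses the `/` or `%` operators, so it does not pull in a library helper. This is not the SDCC library routine, and `udiv16` is a hypothetical name; you would have to call it explicitly instead of writing `a / b`.

```c
#include <assert.h>
#include <stdint.h>

/* Restoring (shift-and-subtract) division: returns num / den and, if rem is
 * not NULL, stores num % den there. den must not be zero. */
static uint16_t udiv16(uint16_t num, uint16_t den, uint16_t *rem)
{
    uint16_t quot = 0;
    uint16_t r = 0;

    assert(den != 0);
    for (int i = 15; i >= 0; i--) {
        uint16_t carry = (uint16_t)(r >> 15);          /* bit lost by the shift */
        r = (uint16_t)((r << 1) | ((num >> i) & 1u));  /* bring down next bit */
        if (carry || r >= den) {
            r = (uint16_t)(r - den);                   /* wraps correctly if carry */
            quot |= (uint16_t)(1u << i);
        }
    }
    if (rem)
        *rem = r;
    return quot;
}
```

It is slower than a tuned assembler routine, but it is ordinary C, so it gets compiled with whatever calling convention the rest of your code uses.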
There is a global setting on the command line that changes the default parameter convention, restoring the older method.
With it, you have to specify that you want the new convention by applying a specifier to your functions one by one.
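A sketch of how the per-function opt-in could look, assuming the `__sdcccall(0)` / `__sdcccall(1)` function specifiers of recent SDCC versions (check the manual for your release). The `NEW_CC` / `OLD_CC` macros and both functions are hypothetical; the macros expand to nothing on other compilers so the code can also be tested on a host.

```c
#ifdef __SDCC
#define NEW_CC __sdcccall(1)   /* register-based convention */
#define OLD_CC __sdcccall(0)   /* legacy stack-based convention */
#else
#define NEW_CC                 /* specifiers are SDCC-only */
#define OLD_CC
#endif

/* Opted in to the new convention, whatever the global default is. */
unsigned char add8(unsigned char a, unsigned char b) NEW_CC;

/* Pinned to the legacy convention, e.g. because old asm code calls it. */
unsigned char legacy_add8(unsigned char a, unsigned char b) OLD_CC;

unsigned char add8(unsigned char a, unsigned char b) NEW_CC
{
    return (unsigned char)(a + b);
}

unsigned char legacy_add8(unsigned char a, unsigned char b) OLD_CC
{
    return (unsigned char)(a + b);
}
```

Putting the specifier on the prototype in a shared header matters most, since caller and callee must agree on the convention.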