Gunzip for MSX

Page 6/11

By hit9918

Prophet (2868)

28-10-2015, 21:17

C compilers run out of registers all the time and then end up with a lot of (ix) accesses.
Using the EXX registers could give a lot of speed.

Without EXX, when out of registers, loading a pointer into a register pair takes
ld l,(ix+0)
ld h,(ix+1)
and in case HL first needs saving to the stack, it is another
ld (ix+2),l
ld (ix+3),h
That is a horrible 84 cycles to get another pointer usable.

while
push hl : exx : pop hl 
is 28 cycles to get a pointer usable together with DE or BC of the other bank.
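
The two totals above can be tallied per instruction. This is only a sketch, assuming MSX timing (standard Z80 T-states plus one wait state per M1 cycle, so two extra for IX-prefixed instructions); the frame offsets are the illustrative ones from the snippets above.

```asm
; Spill and reload through the IX frame (MSX cycle counts assumed)
        ld (ix+2),l     ; 19 + 2 = 21   save old HL, low byte
        ld (ix+3),h     ; 21            save old HL, high byte
        ld l,(ix+0)     ; 21            load new pointer, low byte
        ld h,(ix+1)     ; 21            load new pointer, high byte
                        ; total: 4 x 21 = 84 cycles

; Handing the pointer to the alternate bank instead
        push hl         ; 11 + 1 = 12
        exx             ;  4 + 1 =  5   switch to BC', DE', HL'
        pop hl          ; 10 + 1 = 11   HL' now holds the pointer
                        ; total: 12 + 5 + 11 = 28 cycles
```

On a machine without the M1 wait state the absolute numbers are lower, but the ratio stays about the same.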

another example
ld a,(de)
inc de
exx
ld (de),a
inc de
exx


When the version without EXX is out of registers,
it has to load and store a pointer two times:
168 cycles instead of 10, 16 opcode bytes instead of 2.
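
For comparison, a register-starved version of the same byte copy might look like the sketch below. It is a hypothetical illustration, not compiler output: it assumes both pointers live in the IX frame at made-up offsets (source at ix+0/ix+1, destination at ix+2/ix+3) and that HL is the only pair free. Cycle counts assume MSX timing.

```asm
; Hypothetical copy of one byte with both pointers spilled to the IX frame
        ld l,(ix+0)     ; 21  load source pointer
        ld h,(ix+1)     ; 21
        ld a,(hl)       ;  8
        inc hl          ;  7
        ld (ix+0),l     ; 21  store it back
        ld (ix+1),h     ; 21
        ld l,(ix+2)     ; 21  load destination pointer
        ld h,(ix+3)     ; 21
        ld (hl),a       ;  8
        inc hl          ;  7
        ld (ix+2),l     ; 21  store it back
        ld (ix+3),h     ; 21
                        ; pointer traffic alone: 8 x 21 = 168 cycles
                        ; the EXX version pays 2 x 5 = 10 cycles for the same job
```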

It would be so cool if SDCC went EXX. After some initial effort, it would beat a rocket-science compiler that has no EXX.

By Grauw

Ascended (8457)

28-10-2015, 23:14

Counter-example, increment a counter in memory:

ld a,(0C010H)  ; increment field on static object and check carry
inc a
ld (0C010H),a
jp z,Carry

44 cycles, 10 bytes, A modified.

ld de,10       ; increment field on dynamic object and check carry
push hl
add hl,de
inc (hl)
pop hl
jp z,Carry

74 cycles, 10 bytes, A and DE modified.

inc (ix + 10)  ; increment field on dynamic object and check carry
jp z,Carry

36 cycles, 6 bytes, no registers modified, readability & flexibility +9000.

Index registers are super awesome :)

By Prodatron

Paragon (1788)

28-10-2015, 23:24

Just for completeness :)

ld hl,counter
inc (hl)
jp z,carry

36 cycles, 7 bytes, HL modified, IX unused, so it can be used for other stuff

PS: Don't get me wrong. I like to use IX/IY. Currently I just need to find a way to adapt your routines as fast as possible without using the 2nd register set. Anyway I will post them as soon as something is running.

By hit9918

Prophet (2868)

28-10-2015, 23:32

It depends on the situation: is it misc stuff, or is it cacheable hot stuff? For example:

exx
inc e
exx

:)
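
Set against the counter examples earlier in the thread, this could be spelled out as below. EXX does not touch the flags, so the zero check still works after switching back. A sketch only, assuming the counter is kept in E' for the whole hot loop and that Carry is the same kind of handler label as in Grauw's snippets; cycle counts assume MSX timing.

```asm
; Hot counter cached in E' (assumption: nothing else uses DE' meanwhile)
        exx             ;  5  switch to the alternate bank
        inc e           ;  5  bump the counter, Z set on wrap to 0
        exx             ;  5  switch back, flags unchanged
        jp z,Carry      ; 11  total 26 cycles, 6 bytes
        ret

Carry:                  ; placeholder handler so the fragment assembles
        ret
```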

By hit9918

Prophet (2868)

28-10-2015, 23:38

But I had the impression that the single-bit fiddling and jumping takes most of the time?

By Prodatron

Paragon (1788)

28-10-2015, 23:40

Absolutely. What I want to say is that I have to find a solution that is nearly as fast as Grauw's implementation but doesn't use the 2nd register set. My current approach is not to use the index registers to achieve (more or less) the same speed.

By Grauw

Ascended (8457)

28-10-2015, 23:52

You’re going to use it for something? Nice! (Any hints?)

The 2nd register set is only used in a few places, but where I do use it I really need the extra registers so it will be hard to avoid. In Inflate, by making the alphabets static objects you could avoid using hl' and de', but bc' and a' are still used… I guess you could also statically allocate the reader and writer, freeing up ix and iy for use instead of bc' and a', though it would require a lot more code changes.

But why can’t programs use EXX? Due to the multitasking context switching? Surely even with a few context switches every frame, it doesn’t noticeably hurt performance to do a few extra pushes and pops?

By hit9918

Prophet (2868)

28-10-2015, 23:54

The idea that EXX is for interrupts comes from embedded Z80 usage.
What is asked of the Z80 in a home computer is much more extreme. EXX is badly needed.

By Prodatron

Paragon (1788)

29-10-2015, 00:13

@Grauw, ok, good to know!
You are right. The task scheduler is using the 2nd register set for faster context switching. If it's possible to lock the interrupts for a while and use push/pop for the 2nd register set, it would work. But in this case it has to be for a constant but limited time, so other processes are not stopped too long. Unfortunately that's not so easy in a platform-independent environment where the CPU speed can be very different. But anyway I think it's not a big issue.
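
The lock-and-save idea could look roughly like this. It is a sketch under stated assumptions, not SymbOS code: UseAltRegs and Work are hypothetical names, Work stands for a short, bounded routine that may use EXX and EX AF,AF' freely, must leave the main bank selected when it returns, and must not re-enable interrupts.

```asm
; Hypothetical wrapper: borrow the 2nd register set with interrupts locked
UseAltRegs:
        di              ; keep the scheduler out while the set is borrowed
        exx
        push bc         ; save BC', DE', HL'
        push de
        push hl
        exx
        ex af,af'
        push af         ; save AF'
        ex af,af'
        call Work       ; bounded work that uses the alternate registers
        ex af,af'
        pop af          ; restore AF'
        ex af,af'
        exx
        pop hl          ; restore HL', DE', BC' in reverse order
        pop de
        pop bc
        exx
        ei              ; scheduler may run again
        ret

Work:                   ; placeholder so the fragment assembles
        ret
```

How long Work may run with interrupts off is exactly the platform-dependent part mentioned above, so the bound would have to be chosen for the slowest supported CPU.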

Yes, I would like to implement an Unzip (deflate) tool, which is based on your code in SymbOS. There are two very motivating things:
- in SymbOS you can run it at a lower priority in the background; so you can still listen to MP3s, do a Telnet session or do some other stuff, while the remaining free Z80 CPU time is used for decompressing your ZIP files with some very fast Grauw-routines
- it will be the first time to have an UnZIP tool on three other computer systems as well: Amstrad CPC, Enterprise 64/128 and Amstrad PCW

Let's try it :)

By Louthrax

Prophet (2084)

29-10-2015, 09:40

Prodatron wrote:

Yes, I would like to implement an Unzip (deflate) tool, which is based on your code in SymbOS.

With a nice GUI to select & extract files :) ?
