LZEXE decompressor for Z80

By pgimeno

Master (190)

pgimeno's picture

09-06-2020, 22:25

For those who don't know, LZEXE is a compressor of EXE-format executable files for MS-DOS, written by Fabrice Bellard (also known for FFMPEG, QEmu, TCC and other projects). The compression ratios were often within an acceptable range, but where it really shone was in the decompression speed. You barely ever noticed that the executable was compressed back in the time (1990).

My amazement at this blazing-fast speed led me to disassemble the decompressor and try to understand it. When I looked into its guts, I was in awe about the techniques used to speed up the process. I knew it would be very hard for me to write a matching compressor for this, but the idea of using LZEXE itself as a general purpose compressor to compress data for my own projects was always in my mind since then.

Recently Grauw asked help about something compression related, and I posted my ideas. I was going to post this in his tread, but then I realized that I misread him and he needed a compressor, not a decompressor. Since the work was already done by the time I noticed, I decided to post it in a new thread instead.

Here's the code for a decompressor of LZEXE compressed streams, written in Z80 code. I used Pasmo. The syntax may vary in other assemblers but it uses nothing advanced anyway, so it should be straightforward to port to other assemblers if necessary.

http://www.formauri.es/personal/pgimeno/temp/Z80unLZE.zip

The ZIP includes also a sample project that compresses a text file (a Lorem Ipsum text generated online) using LZEXE under DOSBOX, producing a binary in the form of both a .BIN and a .CAS file that you can BLOAD with ,R; this binary decompresses the file and prints it. To build the sample project, GNU make and Python are required.

The decompressor takes just 157 bytes of memory, and if decompression speed is not that important, it can be shrunk to just 122 bytes by not inlining a certain routine.

The license text is:

; Copyright © 2020 Pedro Gimeno Fortea
;
; This is a decompressor for LZEXE-compressed streams. Copying, distribution
; and modification of this package or parts of it are permitted, under two
; conditions: that any copyright notices and this notice are preserved, and
; that any modified version is clearly marked as such. This package is offered
; as-is, without any warranty express or implied.
Login or register to post comments

By Grauw

Ascended (9161)

Grauw's picture

10-06-2020, 00:02

Very cool!

So as I understand it, in essence it has a separate (simple) Huffman coded bitstream and a bytestream, muxed together? That’s very clever and will keep the performance high because there’s no need to shift bytes out of the bitstream nor have a large Huffman decoding tree!

Definitely this deserves its own thread, it would only get snowed under in mine Smile. Indeed for the tooling I’m looking at the compression side, finding the repetition. But this gives some useful insight!

By pgimeno

Master (190)

pgimeno's picture

10-06-2020, 08:29

Grauw wrote:

So as I understand it, in essence it has a separate (simple) Huffman coded bitstream and a bytestream, muxed together? That’s very clever and will keep the performance high because there’s no need to shift bytes out of the bitstream nor have a large Huffman decoding tree!

Indeed, that's how it works! Sadly it requires right shifts for the bitstream. If it required left shifts, it could have done with ADD IX,IX to shift, saving some pushes and pops of BC. Maybe I create a tool to reverse the bitstream order and update the decompressor accordingly.

[Pedantic note: Strictly speaking, the encoding is not Huffman, but it uses a binary tree like Huffman or Shannon-Fano.]

By Bengalack

Master (135)

Bengalack's picture

10-06-2020, 09:43

Me wants a like button Big smile

By pgimeno

Master (190)

pgimeno's picture

10-06-2020, 17:40

Version 1.1 has a few minor optimizations.

I've created a repository in NotABug so you can have access to any version and check the differences:
https://notabug.org/pgimeno/z80unlze

By thegeps

Champion (479)

thegeps's picture

11-06-2020, 01:32

I agree, like button required!

By pgimeno

Master (190)

pgimeno's picture

13-06-2020, 18:27

I wasn't aware about the work of this guy:

https://github.com/uniabis/z80depacker

That repository contains a comparison table of several compressors with Z80 decompressors, with the compression ratio, the decompression speed and the decompressor size.

His decompressor dlzee_fast decompresses significantly faster than z80unlze, while offering basically identical compression ratios to LZEXE (typically with 1 byte difference in most cases), so I recommend that one instead of this. Furthermore, in this repository there is a compressor that can compress to both LZEE and LZEXE: https://github.com/uniabis/lzee - thus obviating the need of DOSBOX and the old LZEXE.EXE executable for compression and a Python program for extraction. It does all in one.

By Manuel

Ascended (16698)

Manuel's picture

13-06-2020, 18:33

Great comparison! Pick what suits your needs Smile

By Grauw

Ascended (9161)

Grauw's picture

13-06-2020, 19:06

So what does he do different than your code? You mentioned the left shifts earlier…

Also a nice trick that was pointed out to me by wouterv while I was working on gunzip is that you can shift a sentinel bit into the first shift after fetching a new byte, so you can do a simple zero check to see if a new byte needs to be fetched rather than keeping track of a bit counter.

By pgimeno

Master (190)

pgimeno's picture

13-06-2020, 19:36

No idea, I haven't analysed the code in any depth.

I had the zero check in mind, but with the lzexe stream I didn't find a way to use that method. I believe you have 1 bit of lag, which needs to be accounted for in the mixing of bitstream and bytestream. LZEE seems to be using that, see `call z,getbit` and the initial `scf` in the source here: https://github.com/uniabis/z80depacker/blob/master/dlzee_fas...

And anyway, since it's a 16-bit register, the zero check would be somewhat expensive.

He submitted a pull request with further optimizations, which place it a little bit under LDIR x 4: https://notabug.org/pgimeno/minetest/pulls/6