Monday, May 16, 2016

Performance testing a new parallel zip file implementation

Previously we looked into rewriting a zip file unpacker rather than modernising an existing one, and measured its performance. Now we have implemented fully parallel zip file compression, so it is time to measure its performance against existing applications. The new implementation is called Parzip and can be obtained from its GitHub repo.

Test setup

Parzip uses only LZMA compression (though it can uncompress deflated archives). We tested it against the following programs:

  • Info-Zip, the default zip/unzip implementation on most Unix platforms
  • 7-Zip, the first archiver to use LZMA
  • Tar + gzip, the default way of compressing files on Unix
  • Tar + xz, the same as above but using LZMA compression instead of deflate

The test corpus was a checkout of the Clang compiler, including full source code, SVN directories and a build tree. This amounts to roughly 27 gigabytes of data. The test machine was an i7 machine with 8 cores and a spinning disk running Debian stable. Parzip was compiled from source and uses liblzma from the Debian repository; all other binaries were the default versions provided by Debian. All compression settings were left at their default values.
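The timings below can be reproduced with a small wrapper around each tool. A minimal sketch follows; the command lines in the comment are placeholders, not the exact invocations used for this benchmark.

```python
import subprocess
import time

def time_command(cmd):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

# Hypothetical invocations mirroring the test setup (paths are placeholders):
# time_command(["parzip", "clang.zip", "clang-checkout/"])
# time_command(["tar", "-cJf", "clang.tar.xz", "clang-checkout/"])
```

Running each tool several times and taking the best result helps smooth out disk-cache effects.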

Results

Let's start with the most important measurement: how long did each app take to compress the directory tree.
Info-Zip and gzip both use the deflate algorithm which is a lot faster than LZMA. Here we can see how they are the fastest even though they only use one thread for compression. Parzip is the fastest LZMA compressor by quite a wide margin. Xz is the slowest because it uses only one thread. 7-zip has some threading support which makes it faster than xz, but it loses heavily to Parzip.

Decompression times show a similar trend.
Decompression is a lot faster than compression (this graph is in seconds as opposed to minutes). The interesting point here is that Parzip is much faster than Info-Zip and gzip even though it uses LZMA. Being able to use all cores really helps here.
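The parallel-decompression idea can be sketched in a few lines of Python: hand each archive entry to its own worker thread, so independent files are written concurrently. This is an illustrative sketch, not Parzip's actual implementation, and it uses Python's standard zipfile module (deflate) rather than LZMA.

```python
import zipfile
from concurrent.futures import ThreadPoolExecutor

def parallel_extract(archive_path, out_dir, workers=8):
    """Extract the entries of a zip archive using a pool of threads.

    Each worker opens its own ZipFile handle, since sharing one handle
    across threads is not safe for concurrent reads.
    """
    with zipfile.ZipFile(archive_path) as zf:
        names = zf.namelist()

    def extract_one(name):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extract(name, out_dir)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(extract_one, names))
```

Because each output file is independent, the threads never need to coordinate their writes, which is exactly why decompression scales so well.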

Finally let's look at sizes of the compressed archives.
Nothing particularly surprising here. The LZMA archives are roughly the same size, with xz the smallest, 7-zip following close behind and Parzip in third place. Programs using deflate achieve a noticeably worse compression ratio. Info-Zip is the clear loser here, perhaps because it uses its own deflate implementation rather than zlib.

Big iron test

We also tested the performance of xz and Parzip on a 48 core heavy duty server.
XZ takes almost three hours to compress the data set while Parzip can achieve the same in just over 20 minutes. This is almost 90% faster. The theoretical best possible speedup would be roughly 97%, but due to overhead and other issues we can't reach those numbers.
Decompression times are similar, with Parzip being 93% faster. This is because decompression parallelises better than compression: compression must write its results into a single file, whereas decompression can write its outputs to the file system independently of each other.

Conclusions

Using modern tools and technologies it is possible to create a parallel file compressor that achieves almost the same compression ratio as the current leading technology (tar + xz) while providing almost linear speedup with the number of processor cores available.

Caveat

If you plan on doing your own measurements (which you should, it's educational), be careful when analysing the results. If you have a slow hard disk and a lot of decompression threads, they may saturate the disk's write capacity.
