Comments on Nibble Stew: Clarifications to xz compression article (blog author: Jussi, http://www.blogger.com/profile/03370287682352908292; feed updated 2024-03-26T19:42:51.465+02:00)<br /><br />Comment posted 2017-01-04T20:38:46.903+02:00:<br /><br />Some notes about disk performance: Around 5-10 years ago most desktop users bought budget 5400-7200 rpm disks. They had specs like:<br />100 MB/s max sustained read performance<br />50-100 IOPS<br /><br />Now (Samsung 960 EVO, ~130 EUR):<br />3200 MB/s<br />330 000 IOPS<br /><br />Disk bandwidth has improved by over 3000% and IOPS by over 300000%. We should definitely expect more rapid improvements in the near future (e.g. Optane disks). So it's pretty obvious that things have become CPU/memory bound again. Many users also encrypt their file system, which affects disk performance. For instance, my old i7 3770k system (2012) can only encrypt/decrypt at up to 2000/2000 MB/s (multi-core parallel AES XTS 256b). If you combine that with transparent disk compression, even 600 MB/s SATA disks might be fast enough to saturate the CPU. LZ4/LZO are pretty much the only available algorithms that seem cheap enough not to slow everything down. For instance, squashfs/xz clearly slows down disk performance on most machines. The only place it improves perf is PXE/NFS boot over 100 Mbps Ethernet.<br /><br />Haswell and later generations provide somewhat better encryption perf due to new AVX instructions, but overall the situation is quite bad if you don't parallelize. Currently it seems that CPUs are pretty much stuck with 1-3% annual perf improvements for single-threaded programs. Most improvements come from higher (turbo) clock frequencies, not from improved IPC. 
I find it really unlikely that we could even get a 200% speedup with the current hardware development style during the next *100* years. As your graphs show, we could achieve 10000% NOW with better algorithms. Heck, we could even switch back to simpler, in-order cores to make room for a larger core count, and still achieve better performance than 100 years of incremental gate-level CPU optimizations would give. This is clearly the way to go.<br /><br />Given that most people really do expect constant perf improvements in all technology and are willing to spend lots of money on new, faster hardware every year, I don't quite understand the criticism here. Compared to new hardware, software upgrades are 'free'. You just install an update and instantly get better performance. Why waste money on bogus hardware upgrades? Better algorithms are clearly now the lowest-hanging fruit, with lots of potential. It's actually quite interesting that somebody spends time thinking about parallel compression. I've seen widely used CPU benchmarks where the compression tests typically yield worse results on server-class many-core hardware (vs. 2-4 core gaming CPUs with high turbo clocks).<br /><br />miasma (https://www.blogger.com/profile/12710096873877267938)
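
The percentages in the comment can be checked with a few lines of arithmetic. The sketch below is only an illustration, not something from the original comment: the helper names are made up, the 1-3% annual figure is assumed to compound year over year, and "200% speedup" is read as 3x the original performance (an assumption, since it could also be read as 2x).

```python
import math

def percent_improvement(old, new):
    """Improvement of `new` over `old`, in percent (hypothetical helper)."""
    return (new / old - 1.0) * 100.0

def compound_speedup(annual_rate, years):
    """Total speedup factor if performance grows by `annual_rate` every year."""
    return (1.0 + annual_rate) ** years

def years_to_reach(factor, annual_rate):
    """Years of compounding growth needed to reach a given speedup factor."""
    return math.log(factor) / math.log(1.0 + annual_rate)

if __name__ == "__main__":
    # Disk figures quoted in the comment: 100 MB/s -> 3200 MB/s,
    # and 50-100 IOPS -> 330 000 IOPS (using the 100 IOPS upper bound).
    print("bandwidth: %.0f%%" % percent_improvement(100, 3200))
    print("IOPS:      %.0f%%" % percent_improvement(100, 330_000))

    # Single-threaded CPU growth at the quoted 1% and 3% per year.
    for rate in (0.01, 0.03):
        print("%.0f%%/year: %.1fx after 100 years, %.0f years to reach 3x"
              % (rate * 100,
                 compound_speedup(rate, 100),
                 years_to_reach(3.0, rate)))
```

Under these assumptions, 1% per year compounds to roughly 2.7x over a century, so the "no 200% speedup in 100 years" estimate matches the low end of the quoted range. At 3% per year the same 3x would take under 40 years, which is still far slower than the algorithmic gains the comment points to.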