The Fastest Way to Archive and Compress Large Files
I recently tested three tools for compressing 102 NetCDF files containing gridded rainfall data (138GB in total), and 7-Zip came out ahead on both speed and archive size. Here’s a quick comparison:
Note: All tests were run on a machine with 192 cores.
- 7zz (7-Zip)
7zz a -tzip test_archive.zip *.nc
- Time: 1 minute 17 seconds
- Final Size: 6.6GB
- tar + gzip
tar -cvzf test_archive.tar.gz *.nc
- Time: 17 minutes
- Final Size: 6.9GB
- tar + pigz (Parallel gzip)
tar cf - *.nc | pigz -p 150 > archive.tar.gz
- Time: 6 minutes
- Final Size: 6.8GB
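In this test, 7-Zip was the quickest of the three and also produced the smallest archive. Plain gzip runs on a single core, so it cannot take advantage of the other 191. If you’re dealing with large datasets on a many-core machine, 7-Zip is a good first choice.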
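Results like these depend on the data and the hardware, so it is worth repeating the comparison on your own files. Here is a minimal sketch of a timing helper, assuming bash and the standard `date`, `du`, and `cut` utilities. The `bench` name and its output format are hypothetical, not part of any of the tools above.

```shell
#!/usr/bin/env bash
# bench: hypothetical helper. Runs a command, then reports wall-clock seconds
# and the size of the archive it produced.
# Usage: bench <output_file> <command> [args...]
bench() {
    local out="$1"; shift
    local start end
    start=$(date +%s)
    # Send the tool's own output to stderr so the report line stays clean.
    "$@" >&2 || return 1
    end=$(date +%s)
    # du -h prints "<size><TAB><name>"; keep only the size column.
    echo "elapsed: $((end - start))s, size: $(du -h "$out" | cut -f1)"
}

# Example calls, using the commands from the comparison above.
# Pipelines need to be wrapped in `bash -c` so they run as one command:
#   bench test_archive.zip    7zz a -tzip test_archive.zip *.nc
#   bench test_archive.tar.gz tar -cvzf test_archive.tar.gz *.nc
#   bench archive.tar.gz      bash -c 'tar cf - *.nc | pigz -p 150 > archive.tar.gz'
```

`date +%s` has one-second resolution, which is enough for runs that take minutes. Delete the old archive before repeating a run, because `7zz a` updates an existing archive rather than starting from scratch.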