less than 1 minute read

I recently tested three tools for compressing 102 NetCDF files contain gridded rainfall data (138GB) and found 7-Zip to be the best. Here’s a quick comparison:

Note: The tests were done on a multi-core machine, with 192 cores.

  1. 7zz (7-Zip)
    7zz a -tzip test_archive.zip *.nc
    
    • Time: 1 minute 17 seconds
    • Final Size: 6.6GB
  2. tar + gzip
    tar -cvzf test_archive.tar.gz *.nc
    
    • Time: 17 minutes
    • Final Size: 6.9GB
  3. tar + pigz (Parallel gzip)
    tar cf - *.nc | pigz -p 150 > archive.tar.gz
    
    • Time: 6 minutes
    • Final Size: 6.8GB

If you’re dealing with large datasets, 7-Zip is the quickest and most efficient choice.