Overview

Modern large-scale scientific simulations running on HPC systems generate very large volumes of data during a single run. To lessen the I/O load, scientists are forced to capture data infrequently, which makes data collection an intrinsically lossy process. Yet most lossless compression techniques are poorly suited to large-scale reduction of floating-point datasets from scientific simulations, because the data tends to be inherently random and hard to compress.

We introduce an effective method for In-situ Sort-And-B-spline Error-bounded Lossy Abatement (ISABELA) of scientific data. ISABELA is designed specifically for compressing spatio-temporal scientific data that is inherently noisy and random-like, and thus commonly believed to be incompressible. With ISABELA, we apply a preconditioner to seemingly random and noisy data along the spatial resolution, which yields an accurate fitting model that achieves a very high correlation (≥ 0.99) with the original data.
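The idea of preconditioning by sorting can be illustrated with a minimal sketch. This is not the ISABELA library code: the function names, the window size of 1024, and the choice of 32 knots are all assumptions made for illustration, and a piecewise-linear fit stands in for the B-spline fit so that the example needs only the Python standard library. Sorting a window of noisy values turns it into a smooth monotone curve that a small number of knots can approximate well; the sort permutation must be kept so the original ordering can be restored.

```python
import math
import random


def sort_precondition(window):
    """Sort a window; order[rank] is the original index of the value at that rank."""
    order = sorted(range(len(window)), key=window.__getitem__)
    return [window[i] for i in order], order


def fit_knots(values, n_knots):
    """Sample the curve at evenly spaced positions (stand-in for spline coefficients)."""
    last = len(values) - 1
    positions = [round(k * last / (n_knots - 1)) for k in range(n_knots)]
    return positions, [values[p] for p in positions]


def evaluate(positions, knot_vals, n):
    """Rebuild n points by linear interpolation between consecutive knots."""
    out, seg = [], 0
    for i in range(n):
        # Advance to the knot interval that contains position i.
        while seg < len(positions) - 2 and i > positions[seg + 1]:
            seg += 1
        x0, x1 = positions[seg], positions[seg + 1]
        y0, y1 = knot_vals[seg], knot_vals[seg + 1]
        t = (i - x0) / (x1 - x0) if x1 != x0 else 0.0
        out.append(y0 + t * (y1 - y0))
    return out


def reconstruct(approx_sorted, order):
    """Undo the sort: place each approximated value back at its original index."""
    restored = [0.0] * len(order)
    for rank, original_index in enumerate(order):
        restored[original_index] = approx_sorted[rank]
    return restored


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


random.seed(0)
window = [random.uniform(-1.0, 1.0) for _ in range(1024)]  # random-like data

# With the sort preconditioner: fit the sorted curve, then restore the order.
sorted_vals, order = sort_precondition(window)
positions, knot_vals = fit_knots(sorted_vals, 32)
approx = reconstruct(evaluate(positions, knot_vals, len(window)), order)
correlation = pearson(window, approx)

# Without it: the same 32-knot fit applied directly to the unsorted window.
raw_positions, raw_knots = fit_knots(window, 32)
baseline = pearson(window, evaluate(raw_positions, raw_knots, len(window)))

print(f"sorted fit: {correlation:.4f}   unsorted fit: {baseline:.4f}")
```

In this sketch the 1024 values are represented by 32 knot values plus the permutation `order`, so the cost of storing the permutation is part of any real compression ratio. The unsorted baseline shows why the preconditioner matters: the same number of knots fitted to the raw window captures almost none of the signal.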

Publications

  1. S. Lakshminarasimhan, N. Shah, S. Ethier, S. Klasky, R. Latham, R. Ross, and N.F. Samatova, "Compressing the Incompressible with ISABELA: In-situ Reduction of Spatio-Temporal Data", European Conference on Parallel and Distributed Computing (Euro-Par), Bordeaux, France, Aug. 2011
  2. S. Lakshminarasimhan, N. Shah, S. Ethier, S-H. Ku, C.S. Chang, S. Klasky, R. Latham, R. Ross, and N.F. Samatova, "ISABELA for Effective In Situ Compression of Scientific Data", Journal of Concurrency and Computation: Practice and Experience (CCPE), 2013

Code

ISABELA library [ISABELA-compress-0.2.1.tar.gz]