Evaluating small neural networks for general-purpose lossy data compression
neural compression
Running locally
uv sync --all-extras
Example usage:
# Fetching
python main.py --debug train --method fetch \
--dataset enwik9 --data-root /path/to/datasets
# Training
python main.py --debug train --method optuna \
--dataset enwik9 --data-root /path/to/datasets \
--model cnn --model-save-path /path/to/optuna-model
python main.py --debug --results /path/to/results train --method full \
--dataset enwik9 --data-root /path/to/datasets \
--model-load-path /path/to/optuna-model --model-save-path /path/to/full-model
# Compressing
python benchmark.py --debug compress \
--model-load-path /path/to/full-model \
--input-file inputfile --output-file outputfile
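Once the compress step has written its output file, a quick sanity check is to compare file sizes. The following is a minimal sketch, not part of the repository: the helper names `compression_ratio` and `gzip_baseline_ratio` are hypothetical, and it assumes `inputfile` and `outputfile` are the paths passed to `--input-file` and `--output-file` above. The gzip figure is only a rough reference point, since gzip is lossless while the models evaluated here are lossy.

```python
import gzip
from pathlib import Path


def compression_ratio(original: Path, compressed: Path) -> float:
    """Return original size divided by compressed size (higher is better)."""
    compressed_size = compressed.stat().st_size
    if compressed_size == 0:
        raise ValueError(f"{compressed} is empty")
    return original.stat().st_size / compressed_size


def gzip_baseline_ratio(original: Path, level: int = 9) -> float:
    """Ratio achieved by gzip on the same input, computed in memory."""
    data = original.read_bytes()
    return len(data) / len(gzip.compress(data, compresslevel=level))


# Example (paths are placeholders matching the command above):
# print(compression_ratio(Path("inputfile"), Path("outputfile")))
# print(gzip_baseline_ratio(Path("inputfile")))
```

Note that `gzip_baseline_ratio` reads the whole file into memory, which may be impractical for a dataset as large as enwik9; for large inputs, compressing in chunks or using the command-line tool is likely the better choice.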
Testing compression:
bash config/download_datasets.sh config/urls.txt /home/tdpeuter/data/ml-inputs
bash config/generate_csv.sh > config/sub.csv
bash config/local.sh
Running on the Ghent University HPC
See the Infrastructure docs for more information about the clusters.
module swap cluster/joltik # Specify the (GPU) cluster, {joltik,accelgor,litleo}
qsub job.pbs # Submit job
qstat # Check status