Evaluating small neural networks for general-purpose lossy data compression
This repository was archived on 2025-12-23. You can view files and clone it, but you cannot make any changes to its state, such as pushing or creating new issues, pull requests, or comments.
Latest commit: Robin Meersman 04fa7c2387 Merge pull request #12 from ML/process: (De-) Compression 2025-12-13 18:03:27 +01:00
config chore: Replace firefox with 7zip (smaller) 2025-12-11 22:45:46 +01:00
results feat: Time+memory tracking 2025-12-07 21:49:45 +01:00
src fixup! WIP: Attempt at switching 2025-12-11 23:58:54 +01:00
.gitignore feat: Time+memory tracking 2025-12-07 21:49:45 +01:00
.python-version chore: Change versions, setup HPC 2025-11-30 16:51:44 +01:00
benchmark.py feat: Time+memory tracking 2025-12-07 21:49:45 +01:00
job.pbs feat: Smaller genome dataset 2025-12-11 13:38:52 +01:00
main.py fix: No hardcoding context len 2025-12-11 23:23:55 +01:00
pyproject.toml Merge branch 'main' into process 2025-12-11 23:16:25 +01:00
README.md chore: Replace firefox with 7zip (smaller) 2025-12-11 22:45:46 +01:00
uv.lock Merge branch 'main' into process 2025-12-11 23:16:25 +01:00

neural compression

Running locally

uv sync --all-extras

Example usage:

# Fetching
python main.py --debug train --method fetch \
  --dataset enwik9 --data-root /path/to/datasets

# Training
python main.py --debug train --method optuna \
  --dataset enwik9 --data-root /path/to/datasets \
  --model cnn --model-save-path /path/to/optuna-model
python main.py --debug --results /path/to/results train --method full \
  --dataset enwik9 --data-root /path/to/datasets \
  --model-load-path /path/to/optuna-model --model-save-path /path/to/full-model

# Compressing
python benchmark.py --debug compress \
  --model-load-path /path/to/full-model \
  --input-file inputfile --output-file outputfile
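Once a file has been compressed, a quick sanity check is to compare the sizes of the input and output files. The following is a minimal sketch, not part of this repository: `compression_stats` is a hypothetical helper that only reads file sizes from disk and assumes the output file contains the complete compressed representation.

```python
import os


def compression_stats(input_path, output_path):
    """Return (original_bytes, compressed_bytes, ratio, bits_per_byte).

    Hypothetical helper, not provided by benchmark.py. `ratio` is the
    compressed size divided by the original size, so lower is better.
    """
    original = os.path.getsize(input_path)
    compressed = os.path.getsize(output_path)
    if original == 0:
        raise ValueError("input file is empty; the ratio is undefined")
    ratio = compressed / original
    # Each input byte costs 8 * ratio bits in the compressed output.
    return original, compressed, ratio, 8 * ratio
```

For the command above, this would be called as `compression_stats("inputfile", "outputfile")`. The size ratio says nothing about reconstruction quality, which has to be measured separately.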

Testing compression:

bash config/download_datasets.sh config/urls.txt /path/to/ml-inputs
bash config/generate_csv.sh > config/sub.csv
bash config/local.sh

Running on the Ghent University HPC

See the Infrastructure docs for more information about the clusters.

module swap cluster/joltik # Select the (GPU) cluster: one of joltik, accelgor, litleo

qsub job.pbs               # Submit job
qstat                      # Check status