Evaluating small neural networks for general-purpose lossy data compression

neural compression

This repository contains the code to train small neural network models (a CNN and an autoencoder) and to benchmark them as general-purpose lossy compressors, both locally and on the Ghent University HPC.

Running locally

Install the dependencies with uv:

uv sync --all-extras
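
To sanity-check the environment after syncing, you can invoke the entry points through uv. The --help calls below assume the scripts use a standard argument parser; adjust if they do not.

# Quick environment check (illustrative)
uv run python --version
uv run python main.py --help
uv run python benchmark.py --help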

Example usage:

# Fetching
python main.py --debug train --method fetch \
  --dataset enwik9 --data-root /path/to/datasets

# Training
python main.py --debug train --method optuna \
  --dataset enwik9 --data-root /path/to/datasets \
  --model cnn --model-save-path /path/to/optuna-model
python main.py --debug --results /path/to/results train --method full \
  --dataset enwik9 --data-root /path/to/datasets \
  --model-load-path /path/to/optuna-model --model-save-path /path/to/full-model

# Compressing
python benchmark.py --debug compress \
  --model-load-path /path/to/full-model \
  --input-file inputfile --output-file outputfile
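
To get a rough idea of how well the model compresses, compare file sizes and optionally a conventional baseline such as 7-Zip. This is a generic shell sketch reusing the inputfile/outputfile names from the example above; it is not part of the repository's tooling.

# Compression ratio of the neural compressor
orig=$(wc -c < inputfile)
comp=$(wc -c < outputfile)
echo "neural ratio: $(echo "scale=3; $orig / $comp" | bc)"

# Optional baseline, assuming 7-Zip is installed
7z a -mx=9 outputfile.7z inputfile
echo "7z size: $(wc -c < outputfile.7z) bytes"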

Testing compression:

bash config/download_datasets.sh config/urls.txt /home/tdpeuter/data/ml-inputs
bash config/generate_csv.sh > config/sub.csv
bash config/local.sh
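
The exact format of config/urls.txt is not documented here; it presumably lists one dataset URL per line. The snippet below is purely illustrative (the example URL points to the enwik9 dataset used above and is not taken from the repository's list):

# Illustrative URL list, assuming one URL per line
echo 'https://mattmahoney.net/dc/enwik9.zip' > /tmp/urls-example.txt
bash config/download_datasets.sh /tmp/urls-example.txt /path/to/datasets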

Running on the Ghent University HPC

See the Infrastructure docs for more information about the clusters.

module swap cluster/joltik # Specify the (GPU) cluster, {joltik,accelgor,litleo}

qsub job.pbs               # Submit job
qstat                      # Check status
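
The repository's job.pbs is not reproduced in this README. As a rough sketch only, a PBS job script for one of these GPU clusters could look like the following; the job name, resource requests, module and training command are placeholders, so adapt them to the real job.pbs and the UGent Infrastructure documentation.

#!/bin/bash
#PBS -N neural-compression           # job name (placeholder)
#PBS -l nodes=1:ppn=8:gpus=1         # 1 node, 8 cores, 1 GPU (check the cluster docs for exact syntax)
#PBS -l walltime=12:00:00            # maximum runtime (placeholder)

cd "$PBS_O_WORKDIR"                  # run from the submission directory

# Placeholder environment setup; load whatever the real job.pbs needs
module load Python

uv sync --all-extras
python main.py --debug train --method full \
  --dataset enwik9 --data-root /path/to/datasets \
  --model-load-path /path/to/optuna-model --model-save-path /path/to/full-model

Submit the script with qsub job.pbs as shown above.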