Quick Start
A blazing-fast, STORE-only ZIP writer with an embedded TACO Header for instant metadata access. Built in C with first-class Python bindings.
See tacozip API in action:
import tacozip
from pathlib import Path

# 1. Create sample files (placeholder bytes, not valid Parquet)
Path("train.parquet").write_bytes(b"training data..." * 1000)
Path("test.parquet").write_bytes(b"test data..." * 500)

# 2. Archive with row group metadata; each entry is (offset, length)
tacozip.create(
    "dataset.taco",
    src_files=["train.parquet", "test.parquet"],
    entries=[
        (1000, 5000),  # Row group 0: bytes 1000-6000
        (6000, 4500),  # Row group 1: bytes 6000-10500
    ],
)

# 3. Read metadata instantly (165 bytes only!)
entries = tacozip.read_header("dataset.taco")
print(f"Metadata: {entries}")

# 4. Works with cloud storage (S3, HTTP): fetch only the first 165 bytes
import requests

r = requests.get(
    "https://cdn.example.com/dataset.taco",
    headers={"Range": "bytes=0-164"},
)
entries = tacozip.read_header(r.content)
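
Once the header has been read, each entry can in principle be turned into a follow-up byte-range request. The sketch below is not part of tacozip: `range_header` is a hypothetical helper, and it assumes entries are (offset, length) pairs, as the comments in the example above suggest. It shows the one detail that is easy to get wrong: HTTP byte ranges are inclusive at both ends, so an entry covering bytes 1000-6000 (end exclusive) becomes `bytes=1000-5999`.

```python
from typing import Dict, Tuple


def range_header(entry: Tuple[int, int]) -> Dict[str, str]:
    """Build an HTTP Range header for one (offset, length) entry.

    Hypothetical helper for illustration; assumes the entry layout
    (offset, length) used in the quick-start example.
    """
    offset, length = entry
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends, so the last byte
    # requested is offset + length - 1.
    last_byte = offset + length - 1
    return {"Range": f"bytes={offset}-{last_byte}"}


# Row group 0 from the example: offset 1000, length 5000
print(range_header((1000, 5000)))

# The 165-byte header fetch corresponds to offset 0, length 165
print(range_header((0, 165)))
```

The returned dictionary can be passed as the `headers` argument of `requests.get`, in the same way as the fixed `bytes=0-164` request above.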

ISP · Image & Signal Processing
Universitat de València