
A cross-platform command-line tool + C library with a permissive license that makes compressed, file-deduplicated snapshots of directories with accompanying indexed metadata. It should be foolproof, use a fast compression algorithm like snappy or LZO, and be optimized for fast multicore creation and extraction (but with optional CPU-limiting). Metadata could be stored in an sqlite database.


To what end? I think borg or restic would both match these criteria. Maybe you're hoping for something that does this more online and continuously?


I'm looking for something in-between a version control system and an archiver, with simple file-based deduplication and as fast and robust as possible. Maybe you can explain whether restic and borg fit my needs. Desired functionality:

- Main inputs are an input directory or a list of files, the location of a file repository (1 file), the location of a metadata DB, and some indexed and searchable metadata like original path, modification date, semantic version number, arbitrary title, and other strings.

- If an input file has been changed or is not yet stored, it is archived in the file repository compressed with very fast compression like snappy or LZO with corresponding metadata and version in the metadata DB. Otherwise only the metadata information is stored. Support for deltas / storing incremental changes in the file repository would also be nice but is not required.

- The metadata is stored in the metadata DB so that a file can later be retrieved based on version information, its original modification date, its original path, and/or metadata regexp searches.

- Snapshots of an input directory or list of files can be extracted based on version information and any other metadata. Extraction needs to be very fast.

- The library needs to work on common file systems and at least on Linux, Windows, and macOS.

- Archiving should be ACID-compliant. Either a file is stored with all of its metadata so that it is retrievable, or nothing is stored and an error is returned. Likewise for whole input directories and for extraction.
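To make the list above concrete, here is a rough Python sketch of the core idea, not a real tool. Several things in it are my own assumptions: the file repository and the metadata DB live in one SQLite file (so a single transaction covers both), zlib at level 1 stands in for snappy/LZO because it ships with Python, whole files are read into memory, and the table and function names (`blobs`, `entries`, `open_repo`, `archive`, `extract`) are made up for illustration.

```python
import hashlib
import os
import sqlite3
import zlib

# Hypothetical schema: one row per unique file content, one row per archived path.
SCHEMA = """
CREATE TABLE IF NOT EXISTS blobs (
    sha256 TEXT PRIMARY KEY,
    data   BLOB NOT NULL
);
CREATE TABLE IF NOT EXISTS entries (
    id       INTEGER PRIMARY KEY,
    snapshot TEXT NOT NULL,
    path     TEXT NOT NULL,
    mtime    REAL NOT NULL,
    sha256   TEXT NOT NULL REFERENCES blobs(sha256),
    UNIQUE (snapshot, path)
);
CREATE INDEX IF NOT EXISTS entries_path ON entries(path);
"""


def open_repo(db_path):
    """Open (or create) the combined repository + metadata database."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def archive(conn, snapshot, paths):
    """Store every path under `snapshot`, or nothing at all if any step fails."""
    with conn:  # one transaction: commit on success, rollback on exception
        for path in paths:
            with open(path, "rb") as f:
                data = f.read()
            digest = hashlib.sha256(data).hexdigest()
            known = conn.execute(
                "SELECT 1 FROM blobs WHERE sha256 = ?", (digest,)
            ).fetchone()
            if known is None:
                # File-level dedup: content is compressed and stored only once.
                conn.execute(
                    "INSERT INTO blobs (sha256, data) VALUES (?, ?)",
                    (digest, zlib.compress(data, 1)),
                )
            # Metadata is always recorded, even when the content already exists.
            conn.execute(
                "INSERT INTO entries (snapshot, path, mtime, sha256) "
                "VALUES (?, ?, ?, ?)",
                (snapshot, path, os.path.getmtime(path), digest),
            )


def extract(conn, snapshot, path):
    """Return the original bytes of `path` as stored in `snapshot`."""
    row = conn.execute(
        "SELECT b.data FROM entries e JOIN blobs b ON b.sha256 = e.sha256 "
        "WHERE e.snapshot = ? AND e.path = ?",
        (snapshot, path),
    ).fetchone()
    if row is None:
        raise KeyError((snapshot, path))
    return zlib.decompress(row[0])
```

Keeping blobs and metadata in the same database is what makes the all-or-nothing requirement cheap here. With a separate repository file you would need some form of write-ahead journaling or a two-step commit to get the same guarantee.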



