fakedatafs

Feature request: Generate data using Markov chains

Open · cfcs opened this issue 9 years ago · 1 comment

It would be nice to use Markov chains (or something similar) to generate data with different patterns and degrees of "similarity", for use in benchmarking compression and deduplication.

Candidates include:

  • file/directory names
  • data blocks / "segments"
  • directory depth / structure
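To make the idea concrete, here is a minimal sketch of a byte-level Markov chain generator. It is not part of fakedatafs; the names `build_chain` and `generate`, the `order` parameter, and the choice of Python are all assumptions made for illustration. The chain is trained on a sample, and a fixed seed keeps the output reproducible, which matters for repeatable benchmarks.

```python
import random
from collections import defaultdict


def build_chain(sample: bytes, order: int = 2):
    """Map each `order`-byte context in `sample` to the bytes that followed it."""
    if len(sample) <= order:
        raise ValueError("sample must be longer than the chain order")
    chain = defaultdict(list)
    for i in range(len(sample) - order):
        chain[sample[i:i + order]].append(sample[i + order])
    return chain


def generate(chain, order: int, length: int, seed: int) -> bytes:
    """Emit `length` bytes by walking the chain; same seed gives same output."""
    rng = random.Random(seed)
    contexts = sorted(chain)  # sorted so the walk does not depend on dict order
    out = bytearray(rng.choice(contexts))
    while len(out) < length:
        followers = chain.get(bytes(out[-order:]))
        if not followers:
            # Dead end (context only seen at the very end of the sample):
            # restart from a random known context.
            out.extend(rng.choice(contexts))
            continue
        out.append(rng.choice(followers))
    return bytes(out[:length])


# Hypothetical usage: a repetitive sample yields highly compressible output.
sample = b"the quick brown fox jumps over the lazy dog " * 4
chain = build_chain(sample, order=2)
block = generate(chain, order=2, length=256, seed=1)
```

A higher `order` makes the output resemble the sample more closely (more redundancy for compressors to find), while a lower `order` pushes it toward noise. The same approach could plausibly be applied to file names by training on a list of real names, or to directory depth by treating "descend", "stay", and "ascend" as states.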

cfcs, Mar 07 '16 21:03

Speaking of which, there should be test cases for unusual path elements: long names, uncommon character encodings, broken (invalid) encodings, and so on. I suspect some of the rsync-based tools will have a hard time with files and folders whose names contain special characters.
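As a rough sketch of what such test cases could look like, the snippet below builds a list of awkward file names as raw bytes. The function name `awkward_names` and the specific names are hypothetical examples, not an existing fakedatafs feature, and the 255-byte limit is an assumption that matches many Linux filesystems but not all platforms.

```python
NAME_MAX = 255  # assumed per-component limit in bytes; varies by filesystem


def awkward_names():
    """Return path elements (as bytes) that tend to trip up backup tools."""
    return [
        b"a" * NAME_MAX,                    # maximum-length ASCII name
        ("ü" * 127).encode("utf-8"),        # long multi-byte UTF-8 name
        "日本語ファイル".encode("utf-8"),    # valid non-Latin UTF-8
        "日本語".encode("shift_jis"),        # legacy encoding, not valid UTF-8
        "naïve-café".encode("latin-1"),     # Latin-1 bytes, not valid UTF-8
        b"broken-\xff\xfe-utf8",            # bytes that can never occur in UTF-8
        b"-rf",                             # leading dash, looks like an option
        b"line\nbreak",                     # embedded newline
        b"tab\there",                       # embedded tab
        b" leading and trailing ",          # surrounding whitespace
        b"*?[glob]",                        # shell glob metacharacters
    ]


def is_valid_utf8(name: bytes) -> bool:
    """Report whether `name` decodes cleanly as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Names are kept as bytes rather than strings because POSIX file names are byte sequences that may not be valid in any encoding. Passing them to `os.open` or `os.mkdir` as bytes should create them on filesystems that accept arbitrary bytes; some platforms reject the invalid UTF-8 entries, which is itself worth covering in a test.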

cfcs, Mar 07 '16 21:03