Dealing with nested subdirectories

Open jomimc opened this issue 1 year ago • 2 comments

I have a lot of data to compress, and it is stored in nested subdirectories (e.g. /Data/Protein/Mutation/...pdb).

The default behavior of "foldcomp compress -r" seems to be to create a single output folder and put everything in there, so I run into the "Output file already exists" error.

Is there a way either to create a new directory with the same subdirectory structure, or to output the ".fcz" files in the same directories as the uncompressed pdb files?

jomimc, Oct 18 '23 09:10

I think you can write a script that iterates through the nested subdirectories.

#!/bin/bash
# Usage: ./foldcomp_recursive.sh <path> <threads>
threads="${2:-1}"  # fall back to 1 thread if none is given

# Visit subdirectories first, then compress the current directory
function run_command_in_dir {
    local dir
    for dir in "$1"/*; do
        if [ -d "$dir" ]; then
            run_command_in_dir "$dir"
        fi
    done

    # Check if pdb or cif files exist in the directory
    if ls "$1"/*.pdb 1> /dev/null 2>&1 || ls "$1"/*.cif 1> /dev/null 2>&1; then
        # Input and output are the same directory
        foldcomp compress -t "$threads" "$1" "$1"
    fi
}

run_command_in_dir "$1"

This is an example bash script that walks the input directory recursively and, for every directory that contains pdb or cif files, runs foldcomp compress with that same directory as both input and output.
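For the other option in the question (a new directory with the same subdirectory structure), a variation along these lines might work. It is only a sketch: `mirror_compress` is a made-up helper name, not part of foldcomp, and it assumes that `foldcomp compress -t <threads> <input_dir> <output_dir>` accepts an output directory that already exists, the same way the script above passes an existing directory as output. Directory names containing newlines are not handled.

```shell
#!/bin/bash
# Sketch: compress each directory under SRC that holds .pdb/.cif files into
# the matching subdirectory of DST, so the output tree mirrors the input tree.
# mirror_compress is a hypothetical helper, not a foldcomp feature.
mirror_compress() {
    local src="${1%/}"      # input root, trailing slash removed
    local dst="${2%/}"      # output root, trailing slash removed
    local threads="${3:-1}" # default to 1 thread
    local dir rel

    # find lists SRC itself and every nested subdirectory
    find "$src" -type d | while IFS= read -r dir; do
        # Skip directories that contain no pdb or cif files
        ls "$dir"/*.pdb > /dev/null 2>&1 || ls "$dir"/*.cif > /dev/null 2>&1 || continue

        rel="${dir#"$src"}"   # path relative to SRC: "" or "/sub/dir"
        mkdir -p "$dst$rel"   # assumption: foldcomp accepts an existing output dir
        # </dev/null keeps the command from consuming the directory list on stdin
        foldcomp compress -t "$threads" "$dir" "$dst$rel" < /dev/null
    done
}
```

After sourcing the file, it would be called as `mirror_compress /Data /Data_fcz 8`, where both paths and the thread count are placeholders. Directories without structure files are skipped, so the output tree contains only the branches that actually hold compressed data.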

khb7840, Oct 20 '23 07:10

That's what I did, thanks. I managed to get all ~60,000 pdb files compressed within an hour.

jomimc, Oct 21 '23 08:10