
manage byproducts

Open ziyenano opened this issue 7 years ago • 4 comments

While running a bigflow program, I find that it outputs some byproducts, e.g., entity-* .flume ... After several runs, these leave the folder in a mess. Could you please put those byproducts into a pre-specified subfolder so that they are easier to manage?

ziyenano avatar Dec 21 '17 12:12 ziyenano

Yeah, this is an issue we have wanted to address for a long time but haven't gotten to yet. We hope some of our users can help us improve it in the future. PS: Be careful when removing all the byproducts, because deleting a file that is currently in use will cause a failure.

acmol avatar Dec 21 '17 13:12 acmol

An alternative method: save the following bash script, e.g., as path/bigflow_cleanup.sh

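#!/bin/bash
# Remove bigflow byproducts from the given folder (default: current folder).
set -x
set -e
# If a folder is passed as the first argument, clean up there instead.
if [ $# -ne 0 ]; then
    cd "$1"
fi
pwd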

rm -rf ./entity-*
rm -rf ./.flume-resource-*
rm -rf ./.flume-app-*.tar.gz
rm -rf ./.empty-*.tar.gz
rm -rf ./hs_err_*
rm -rf ./.tmp

then add an alias in your .bashrc or .zshrc, etc.:

alias bigflow_cleanup='sh path/bigflow_cleanup.sh'

Source the .bashrc or .zshrc file, and then you can easily clean up those byproducts in any folder by executing the bigflow_cleanup command.

ziyenano avatar Dec 21 '17 16:12 ziyenano

Yes, you could do it your way, but it is very easy to make a running job fail if the job is still using these tmp files. So I don't think it's a good idea to make this command built-in.

A proper way would be to put all the paths under the same tmp folder, such as .tmp/<uuid>/

Then users could run something like bigflow cleanup 3days to clean up the folders that are older than 3 days.
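
Until something like that exists, the age-based idea can be approximated with a small shell function. This is only a rough sketch under assumptions: cleanup_old_runs is a hypothetical helper, not a bigflow command, and the .tmp/<uuid>/ layout is the one proposed above, not what bigflow currently writes. It relies on the -mindepth, -maxdepth and -mmin options of GNU/BSD find. Directory modification time is only a heuristic, so a long-running job could still be using a folder that looks old.

```bash
#!/bin/bash
# Hypothetical helper (not part of bigflow): delete per-run folders under a
# shared tmp root that were last modified more than N days ago.
# Assumed layout: <root>/<uuid>/
cleanup_old_runs() {
    local root="${1:-.tmp}"   # shared tmp folder (assumption)
    local days="${2:-3}"      # age threshold in days

    # Nothing to do if the root folder does not exist.
    [ -d "$root" ] || return 0

    # Only look at the immediate sub-folders of the root, and select those
    # whose modification time is more than days*24*60 minutes in the past.
    find "$root" -mindepth 1 -maxdepth 1 -type d \
        -mmin +"$((days * 1440))" -exec rm -rf {} +
}
```

For example, cleanup_old_runs .tmp 3 would remove the sub-folders of .tmp that have not been modified in the last 3 days and leave newer ones alone.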

acmol avatar Dec 28 '17 12:12 acmol

Normally, those files are cleaned up after a successful run. If they are not, something has probably gone wrong.

chunyang-wen avatar Dec 31 '17 14:12 chunyang-wen