bigflow
manage byproducts
While running a bigflow program, I noticed that it leaves byproducts behind, e.g., entity-* .flume ... After several runs, they clutter the folder. Could you please put these byproducts into a pre-specified subfolder so that they are easier to manage?
Yeah, this is an issue we have wanted to fix for a long time but haven't gotten to yet. We hope that some of our users can help us improve it in the future. PS: Be careful when removing all the byproducts, because deleting a file that is currently in use will cause a failure.
An alternative: save the following bash script, e.g., as path/bigflow_cleanup.sh:
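#!/bin/bash
# Remove bigflow byproducts from the given folder (default: current folder).
set -x  # print each command as it runs
set -e  # stop at the first failing command
if [ "$#" -ne 0 ]; then
    cd "$1"
fi
pwd
# Do not run this while a bigflow job is still using the folder.
rm -rf ./entity-*
rm -rf ./.flume-resource-*
rm -rf ./.flume-app-*.tar.gz
rm -rf ./.empty-*.tar.gz
rm -rf ./hs_err_*
rm -rf ./.tmp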
Then add an alias to your .bashrc, .zshrc, etc.:
alias bigflow_cleanup='sh path/bigflow_cleanup.sh'
Source the .bashrc or .zshrc file, and then you can clean up the byproducts in the current folder by executing the bigflow_cleanup command, or in any other folder by passing its path as the first argument.
Yes, you could do it your way, but it can easily make a running job fail if the job is still using these tmp files. So I don't think it's a good idea to make this command built-in.
A proper way would be to put all the paths under the same tmp folder, such as
.tmp/<uuid>/
Then the user could run
bigflow cleanup 3days
to clean up the folders that are older than 3 days.
Normally, these files are cleaned up after a successful run. If they are not, something has probably gone wrong.
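Until something like the age-based cleanup proposed above exists, the same idea can be approximated with a small shell function. This is only a sketch: it assumes the proposed .tmp/<uuid>/ layout, which bigflow does not use today, and the function name cleanup_old_runs and its arguments are made up for illustration. It also relies on the -mindepth and -maxdepth options of find, which GNU and BSD find provide but POSIX does not require.

```shell
#!/bin/bash
# Hypothetical helper (not part of bigflow): delete per-run subfolders of a
# temp root that have not been modified for more than a given number of days.
# Assumes the proposed layout <root>/<uuid>/, one subfolder per run.
cleanup_old_runs() {
    local root="${1:-.tmp}"  # temp root that holds one subfolder per run
    local days="${2:-3}"     # age threshold in days
    # Nothing to do if the temp root does not exist.
    [ -d "$root" ] || return 0
    # -mindepth 1 -maxdepth 1: look only at direct children of the root.
    # -mtime +N: match entries whose age in whole days is greater than N.
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" \
        -exec rm -rf -- {} +
}
```

After sourcing the file, running cleanup_old_runs .tmp 3 from a job folder would remove the stale run folders. Note that a directory's modification time only changes when entries are added to or removed from it, so age is a heuristic: a long-running job could still own a folder that looks old, and the earlier warning about deleting files in use still applies.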