
exclude folders not working

Open lyenliang opened this issue 4 years ago • 24 comments

I want to back up a folder /home/ec2-user/rsync_folder from one machine to another machine.

Here's the content of the folder

.
|-- excludeme
|   `-- backupFile.txt
|-- file2.txt
|-- hello.txt
`-- mylink.txt -> hello.txt

The excludeme folder should not be copied to the destination machine.

Based on this tutorial, I created an exclude file, exclude_list.txt. Here's its content:

- excludeme/

After running the command ./rsync_tmbackup.sh -i ~/.ssh/my-key.pem -p 22 /home/ec2-user/rsync_folder [email protected]:/home/ec2-user exclude_list.txt, the excludeme folder still appears on the other machine.

Why does this happen?

Is there anything wrong with the exclude file?

I've also tried - /excludeme/, - /excludeme/*, and - ./excludeme/. None of them prevented the excludeme folder from being copied to the other machine.

lyenliang avatar Nov 12 '19 10:11 lyenliang

check some guidance here: https://linuxize.com/post/how-to-exclude-files-and-directories-with-rsync/

exclude the directory content but not the directory itself: - /excludeme/*

or exclude the directory

- /excludeme

kapitainsky avatar Nov 12 '19 10:11 kapitainsky

- /excludeme doesn't work, either. The excludeme folder still appears on the destination machine after running rsync_tmbackup.sh command.

lyenliang avatar Nov 12 '19 10:11 lyenliang

/home/ec2-user/rsync_folder/excludeme is not the same as /excludeme ?

kapitainsky avatar Nov 12 '19 10:11 kapitainsky

it all works very well and I've been using it for years. You just have to carefully build your exclusion file with correct paths and syntax

kapitainsky avatar Nov 12 '19 10:11 kapitainsky

/home/ec2-user/rsync_folder/excludeme is not the same as /excludeme ?

The full path of the excludeme folder is /home/ec2-user/rsync_folder/excludeme.

There isn't a folder named excludeme under / path.

lyenliang avatar Nov 12 '19 10:11 lyenliang

I just tried - /home/ec2-user/rsync_folder/excludeme. It doesn't work, either.

lyenliang avatar Nov 12 '19 10:11 lyenliang

try - /excludeme

kapitainsky avatar Nov 12 '19 10:11 kapitainsky

I just tried myself

sudo /usr/local/bin/rsync_tmbackup.sh /home/pi /media/db_tc_1/BackupTest/ /media/db_tc_1/BackupTest/excluded_patterns.txt

if I create excludeme folder in /home/pi

and include - /excludeme in /media/db_tc_1/BackupTest/excluded_patterns.txt

it is properly excluded from backup

kapitainsky avatar Nov 12 '19 10:11 kapitainsky

Let me describe more details of my problem:

I placed rsync-time-backup under the /home/ec2-user/ folder. The file exclude_list.txt, which lists what should be excluded, is placed under /home/ec2-user/rsync-time-backup.

$ tree /home/ec2-user/rsync-time-backup
/home/ec2-user/rsync-time-backup
|-- exclude_list.txt
|-- README.md
|-- rsync_tmbackup.sh
`-- tests
    `-- populate_dest.php

The folder to be backed up to another machine is /home/ec2-user/rsync_folder.

$ tree /home/ec2-user/rsync_folder
/home/ec2-user/rsync_folder
|-- excludeme
|   `-- backupFile.txt
|-- file2.txt
|-- hello.txt
`-- mylink.txt -> hello.txt

/home/ec2-user/rsync_folder/excludeme/ is a folder that should not be copied to the other machine.

The content of exclude_list.txt is - /excludeme. I created this file based on this tutorial.

After running the following command, excludeme folder appears on the other machine. It is not excluded from the backup. Why does this happen?

$ cd /home/ec2-user/rsync-time-backup
$ ./rsync_tmbackup.sh -i ~/.ssh/my-key.pem -p 22 /home/ec2-user/rsync_folder [email protected]:/home/ec2-user exclude_list.txt

Here's the log of the command:

rsync_tmbackup: No previous backup - creating new one.
rsync_tmbackup: Creating destination [email protected]:/home/ec2-user/2019-11-12-134015
rsync_tmbackup: Starting backup...
rsync_tmbackup: From: /home/ec2-user/rsync_folder/
rsync_tmbackup: To:   [email protected]:/home/ec2-user/2019-11-12-134015/
rsync_tmbackup: Running command:
rsync_tmbackup: rsync  -e 'ssh -p 22 -i /home/ec2-user/.ssh/my-key.pem -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null' -D --numeric-ids --links --hard-links --one-file-system --itemize-changes --times --recursive --perms --owner --group --stats --human-readable --compress --log-file '/home/ec2-user/.rsync_tmbackup/2019-11-12-134016.log' --exclude-from 'exclude_list.txt'  -- '/home/ec2-user/rsync_folder/' '[email protected]:/home/ec2-user/2019-11-12-134015/'
Warning: Permanently added '123.123.123.123' (ECDSA) to the list of known hosts.
.d..t...... ./
<f+++++++++ file2.txt
<f+++++++++ hello.txt
cL+++++++++ mylink.txt -> hello.txt
cd+++++++++ excludeme/
<f+++++++++ excludeme/backupFile.txt

Number of files: 6 (reg: 3, dir: 2, link: 1)
Number of created files: 5 (reg: 3, dir: 1, link: 1)
Number of deleted files: 0
Number of regular files transferred: 3
Total file size: 27 bytes
Total transferred file size: 18 bytes
Literal data: 18 bytes
Matched data: 0 bytes
File list size: 0
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 355
Total bytes received: 87

sent 355 bytes  received 87 bytes  884.00 bytes/sec
total size is 27  speedup is 0.06
rsync_tmbackup: Backup completed without errors.

lyenliang avatar Nov 13 '19 01:11 lyenliang

The version of my rsync is rsync version 3.1.2 protocol version 31

lyenliang avatar Nov 13 '19 01:11 lyenliang

@kapitainsky, May I know your rsync's version?

lyenliang avatar Nov 13 '19 01:11 lyenliang

Exclusion for a file (- hello.txt) works. But exclusion for a directory doesn't work.

lyenliang avatar Nov 13 '19 09:11 lyenliang

rsync version 3.1.3 protocol version 31

kapitainsky avatar Nov 13 '19 11:11 kapitainsky

can you try pure rsync?

kapitainsky avatar Nov 13 '19 12:11 kapitainsky

but I would not think that issue is with rsync version. 3.1.2 is one I used for ages.. no issues with exclusions.

kapitainsky avatar Nov 13 '19 12:11 kapitainsky

Have you managed to fix it? I have replicated your exact config and used a remote backup destination over ssh. Excludes work for me exactly as they should, so I am rather puzzled by your experience.

kapitainsky avatar Nov 14 '19 15:11 kapitainsky

No, I haven't found a solution to exclude directories.

lyenliang avatar Nov 18 '19 01:11 lyenliang

what is your OS?

kapitainsky avatar Nov 18 '19 09:11 kapitainsky

Here's my OS information:

NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"

It's generated from cat /etc/os-release

lyenliang avatar Nov 22 '19 07:11 lyenliang

I have a hunch this is related to how the script is written to handle recursion and the way you seem to think of a directory as one exclusion. Hint: think of the directory and its contents separately. Try this rule set:

- /excludeme/
- /excludeme/*

Caveat: I realize you may already actually think of them this way, but I doubt the poor rsync program picked up on this human thought

let me know if this helps

reactive-firewall avatar Mar 19 '20 02:03 reactive-firewall

I hope this clears things up and helps someone.

My source tree (command tree ./test-source/):

./test-source/
├── folder01
│   └── should-be-excluded.txt
├── folder02
│   └── should-be-excluded.txt
├── folder03
│   └── should-be-excluded.txt
├── folder04
│   ├── should-be-excluded.txt
│   └── should-not-be-excluded.txt
├── folder05
│   └── should-be-excluded.txt
├── folder06
│   └── should-be-excluded.txt
├── folder07
│   └── should-be-excluded.txt
├── should-not-be-excluded.txt
└── test-tree
    ├── folder08
    │   └── should-be-excluded.txt
    ├── folder09
    │   └── should-be-excluded.txt
    ├── folder10
    │   └── will-not-be-excluded.txt
    ├── folder11
    │   └── will-not-be-excluded.txt
    └── folder12
        └── should-be-excluded.txt

 

This is my excludes file (filename excludes.txt):

# excludes.txt
# Directories all under ./test-source/
- /folder01
- folder02
+ folder03/should-*
- folder03/
+ folder04/should-not*
- folder04/*
- folder04/*
- /folder05/*
- folder06/
- /folder07/

# Directories all under ./test-source/test-tree/
- folder08
- folder09/
- /folder10
- /folder11
- /test-tree/folder12

 

The command:

./rsync-time-backup/rsync_tmbackup.sh \
  ./test-source/ /tmp/test-backup-storage/ \
  ./excludes.txt

 

The result (command tree /tmp/test-backup-storage/):

/tmp/test-backup-storage
├── 2020-06-04-160842
│   ├── folder04
│   │   └── should-not-be-excluded.txt
│   ├── folder05
│   ├── should-not-be-excluded.txt
│   └── test-tree
│       ├── folder10
│       │   └── will-not-be-excluded.txt
│       └── folder11
│           └── will-not-be-excluded.txt
├── backup.marker
└── latest -> 2020-06-04-160842

The surprising bit may be the + folder03/should-* that was not honored. Why? I suppose because the - folder03/ is a rule one level up, or some such? I don't know but it sorta makes sense. You will notice how things were treated differently for the file in folder04.

taw00 avatar Jun 04 '20 20:06 taw00

That's very useful. Thank you.

jackcctse avatar Aug 09 '20 18:08 jackcctse

Quoting doesn't seem to work if there is a directory name with embedded spaces.

I want to exclude this directory:

/System Volume Information/

Using it like this doesn't work:

- "/System Volume Information/"

This doesn't work, either:

- '/System Volume Information/'

It also doesn't work like this:

- "/System Volume Information/"
- "/System Volume Information/*"

The only thing that works is if I include question marks:

- /System?Volume?Information/

peter-fb avatar Jan 21 '21 10:01 peter-fb

I found my way here because I seem to be experiencing a similar problem. I understand that this is not the rsync issue tracker but I thought it might be of interest to this project.

My version of rsync on Debian Buster

hbarta@olive:~/Downloads/rsync$ rsync --version
rsync  version 3.1.3  protocol version 31
Copyright (C) 1996-2018 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes, prealloc

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.
hbarta@olive:~/Downloads/rsync$ 

The command I'm using

/usr/bin/rsync -azXAS --delete -v --exclude='.git/*' /home/hbarta/Programming/ oak:/srvpool/srv/redwood/hbarta/olive/2021-05-01/Programming

(Note: I've tried using --exclude='.git/' with similar results.)

Some of the output.

C++/log-filter/
C++/log-filter/_filedir
C++/log-filter/blah
C++/log-filter/blah.cpp
C++/log-filter/log-01
C++/log-filter/log-01.cpp
C++/log-filter/putx
C++/log-filter/putx.cpp
C++/log-filter/to_upper
C++/log-filter/to_upper.cpp
C++/log-filter/.git/
C++/log-filter/.git/COMMIT_EDITMSG
C++/log-filter/.git/HEAD
C++/log-filter/.git/config
C++/log-filter/.git/description

I've done some cursory testing by running the same rsync command from a terminal, and it excludes .git directories as desired. The output above is from a bash script executed by cron. The process runs with normal user rights and the remote system is on the same version of Debian and rsync. I have not been able to craft a test case that demonstrates this issue.

HankB avatar May 02 '21 16:05 HankB