noobaa-core
NSFS: migration timestamp is updated unexpectedly
Environment info
- NooBaa Version: noobaa-core-5.16.0-20240218.el8.x86_64
- Platform: RPM
Actual behavior
- The migration timestamp `logs/timestamp.migrate` is updated even if the migration script is not called.
Expected behavior
- The migration timestamp `logs/timestamp.migrate` is updated only when the migration script is called.
Steps to reproduce
```
# for i in `seq 5`; do aws s3api put-object --bucket testbucket01 --key 10M${i} --body ~/testdata/dummy.M.10M --storage-class GLACIER; done
# sleep 900
# echo -n "Mig: "; cat timestamp.migrate; echo; echo -n "Rst: "; cat timestamp.restore; echo
Mig: 2024-02-29T05:39:11.318Z
Rst: 2024-02-29T05:39:31.279Z
# /usr/local/noobaa-core/node/bin/node /usr/local/noobaa-core/src/cmd/manage_nsfs.js glacier migrate
# echo -n "Mig: "; cat timestamp.migrate; echo; echo -n "Rst: "; cat timestamp.restore; echo
Mig: 2024-02-29T05:56:05.263Z
Rst: 2024-02-29T05:39:31.279Z
# aws s3api restore-object --bucket testbucket01 --key 10M1 --restore-request '{"Days":1}'
# sleep 900
# /usr/local/noobaa-core/node/bin/node /usr/local/noobaa-core/src/cmd/manage_nsfs.js glacier restore
# echo -n "Mig: "; cat timestamp.migrate; echo; echo -n "Rst: "; cat timestamp.restore; echo
Mig: 2024-02-29T05:56:05.263Z
Rst: 2024-02-29T06:13:11.750Z
# /usr/local/noobaa-core/node/bin/node /usr/local/noobaa-core/src/cmd/manage_nsfs.js glacier migrate
# echo -n "Mig: "; cat timestamp.migrate; echo; echo -n "Rst: "; cat timestamp.restore; echo
Mig: 2024-02-29T06:13:42.856Z
Rst: 2024-02-29T06:13:11.750Z
```
More information - Screenshots / Logs / Other output
Here is the log of the 2nd `manage_nsfs.js glacier migrate` run:
```
# /usr/local/noobaa-core/node/bin/node /usr/local/noobaa-core/src/cmd/manage_nsfs.js glacier migrate; echo $?
load_nsfs_nc_config.setting config.NSFS_NC_CONF_DIR /gpfs/tapecloud/nsfs
nsfs: config_dir_path=/gpfs/tapecloud/nsfs config.json= {
  ENDPOINT_FORKS: 1,
  NOOBAA_LOG_LEVEL: 'nsfs',
  ENDPOINT_PORT: 80,
  ENDPOINT_SSL_PORT: 443,
  EP_METRICS_SERVER_PORT: 7690,
  ALLOW_HTTP: true,
  NSFS_CALCULATE_MD5: false,
  UV_THREADPOOL_SIZE: 256,
  GPFS_DL_PATH: '/usr/lpp/mmfs/lib/libgpfs.so',
  NSFS_BUF_POOL_MEM_LIMIT: 104857600,
  NSFS_BUF_SIZE: 16777216,
  NSFS_OPEN_READ_MODE: 'rd',
  NSFS_CHECK_BUCKET_BOUNDARIES: false,
  NSFS_TRIGGER_FSYNC: true,
  DENY_UPLOAD_TO_STORAGE_CLASS_STANDARD: true,
  NSFS_GLACIER_ENABLED: true,
  NSFS_GLACIER_LOGS_ENABLED: true,
  NSFS_GLACIER_BACKEND: 'TAPECLOUD',
  NSFS_GLACIER_TAPECLOUD_BIN_DIR: '/opt/ibm/tapecloud/bin',
  NSFS_GLACIER_LOGS_DIR: '/gpfs/tapecloud/nsfs/logs',
  NSFS_GLACIER_LOGS_MAX_INTERVAL: 900000,
  NSFS_GLACIER_MIGRATE_INTERVAL: 600000,
  NSFS_GLACIER_RESTORE_INTERVAL: 600000,
  NSFS_GLACIER_EXPIRY_INTERVAL: 43200000
}
2024-02-29 15:13:41.556098 [PID-1985829/TID-1985829] FS::GPFS GPFS_DL_PATH=/usr/lpp/mmfs/lib/libgpfs.so
2024-02-29 15:13:41.556264 [PID-1985829/TID-1985829] FS::GPFS found GPFS lib file GPFS_DL_PATH=/usr/lpp/mmfs/lib/libgpfs.so
2024-02-29 15:13:41.564207 [PID-1985829/TID-1985829] [L1] FS::set_debug_level 5
2024-02-29 15:13:42.710803 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Stat _path=/gpfs/tapecloud/nsfs
2024-02-29 15:13:42.711094 [PID-1985829/TID-1985862] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.711197 [PID-1985829/TID-1985862] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.711504 [PID-1985829/TID-1985862] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs took: 0.181459 ms
(node:1985829) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023.
Please migrate your code to use AWS SDK for JavaScript (v3).
For more information, check the migration guide at https://a.co/7PzMCcy
(Use `node --trace-warnings ...` to show where the warning was created)
Feb-29 15:13:42.715 [/1985829] [LOG] CONSOLE:: read_rand_seed: reading 32 bytes from /dev/random ...
2024-02-29 15:13:42.717541 [PID-1985829/TID-1985829] [L1] FS::Stat::OnOK: _path=/gpfs/tapecloud/nsfs _stat_res.st_ino=12546 _stat_res.st_size=4096
2024-02-29 15:13:42.718051 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Stat _path=/gpfs/tapecloud/nsfs/buckets
2024-02-29 15:13:42.718652 [PID-1985829/TID-1985865] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/buckets _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.718778 [PID-1985829/TID-1985865] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/buckets _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.718983 [PID-1985829/TID-1985865] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/buckets took: 0.083815 ms
Feb-29 15:13:42.719 [/1985829] [LOG] CONSOLE:: read_rand_seed: got 32 bytes from /dev/random, total 32 ...
Feb-29 15:13:42.719 [/1985829] [LOG] CONSOLE:: read_rand_seed: closing fd ...
2024-02-29 15:13:42.720681 [PID-1985829/TID-1985829] [L1] FS::Stat::OnOK: _path=/gpfs/tapecloud/nsfs/buckets _stat_res.st_ino=36115 _stat_res.st_size=4096
2024-02-29 15:13:42.720902 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Stat _path=/gpfs/tapecloud/nsfs/accounts
2024-02-29 15:13:42.728656 [PID-1985829/TID-1985866] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/accounts _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.728776 [PID-1985829/TID-1985866] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/accounts _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.728996 [PID-1985829/TID-1985866] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/accounts took: 0.099093 ms
Feb-29 15:13:42.729 [/1985829] [LOG] CONSOLE:: init_rand_seed: seeding with 32 bytes
2024-02-29 15:13:42.729916 [PID-1985829/TID-1985829] [L1] FS::Stat::OnOK: _path=/gpfs/tapecloud/nsfs/accounts _stat_res.st_ino=36116 _stat_res.st_size=4096
2024-02-29 15:13:42.730146 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Stat _path=/gpfs/tapecloud/nsfs/access_keys
2024-02-29 15:13:42.739651 [PID-1985829/TID-1985870] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/access_keys _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.739843 [PID-1985829/TID-1985870] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/access_keys _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.740063 [PID-1985829/TID-1985870] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/access_keys took: 0.097466 ms
2024-02-29 15:13:42.740578 [PID-1985829/TID-1985829] [L1] FS::Stat::OnOK: _path=/gpfs/tapecloud/nsfs/access_keys _stat_res.st_ino=36117 _stat_res.st_size=4096
2024-02-29 15:13:42.741196 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: FileOpen _path=/gpfs/tapecloud/nsfs/logs/cluster.lock _flags=577 _mode=438
2024-02-29 15:13:42.744339 [PID-1985829/TID-1985872] [L1] FS::FSWorker::Execute: FileOpen _path=/gpfs/tapecloud/nsfs/logs/cluster.lock _flags=577 _mode=438 _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.744512 [PID-1985829/TID-1985872] [L1] FS::FSWorker::Execute: FileOpen _path=/gpfs/tapecloud/nsfs/logs/cluster.lock _flags=577 _mode=438 _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.745684 [PID-1985829/TID-1985872] [L1] FS::FSWorker::Execute: FileOpen _path=/gpfs/tapecloud/nsfs/logs/cluster.lock _flags=577 _mode=438 took: 0.9709 ms
Feb-29 15:13:42.758 [/1985829] [LOG] CONSOLE:: generate_entropy: entropy_avail 2048
Feb-29 15:13:42.759 [/1985829] [LOG] CONSOLE:: init_rand_seed: done
2024-02-29 15:13:42.759490 [PID-1985829/TID-1985829] [L1] FS::FileOpen::OnOK: _path=/gpfs/tapecloud/nsfs/logs/cluster.lock
2024-02-29 15:13:42.759734 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock
2024-02-29 15:13:42.766276 [PID-1985829/TID-1985874] [L1] FS::FSWorker::Execute: FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.766407 [PID-1985829/TID-1985874] [L1] FS::FSWorker::Execute: FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.766540 [PID-1985829/TID-1985874] [L1] FS::FSWorker::Execute: FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock took: 0.012023 ms
2024-02-29 15:13:42.766668 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: undefined FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock
2024-02-29 15:13:42.853525 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate
2024-02-29 15:13:42.853715 [PID-1985829/TID-1985875] [L1] FS::FSWorker::Execute: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.853841 [PID-1985829/TID-1985875] [L1] FS::FSWorker::Execute: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.854127 [PID-1985829/TID-1985875] [L1] FS::FSWorker::Execute: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate took: 0.148095 ms
2024-02-29 15:13:42.854268 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate
2024-02-29 15:13:42.855549 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Readdir _path=/gpfs/tapecloud/nsfs/logs
2024-02-29 15:13:42.855685 [PID-1985829/TID-1985876] [L1] FS::FSWorker::Execute: Readdir _path=/gpfs/tapecloud/nsfs/logs _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.855831 [PID-1985829/TID-1985876] [L1] FS::FSWorker::Execute: Readdir _path=/gpfs/tapecloud/nsfs/logs _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.856146 [PID-1985829/TID-1985876] [L1] FS::FSWorker::Execute: Readdir _path=/gpfs/tapecloud/nsfs/logs took: 0.181358 ms
2024-02-29 15:13:42.856316 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: Readdir _path=/gpfs/tapecloud/nsfs/logs
2024-02-29 15:13:42.857341 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438
2024-02-29 15:13:42.857459 [PID-1985829/TID-1985877] [L1] FS::FSWorker::Execute: Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438 _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.857601 [PID-1985829/TID-1985877] [L1] FS::FSWorker::Execute: Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438 _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.858322 [PID-1985829/TID-1985877] [L1] FS::FSWorker::Execute: Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438 took: 0.479036 ms
2024-02-29 15:13:42.858510 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: undefined Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438
2024-02-29 15:13:42.858686 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19
2024-02-29 15:13:42.858786 [PID-1985829/TID-1985878] [L1] FS::FSWorker::Execute: FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19 _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.858986 [PID-1985829/TID-1985878] [L1] FS::FSWorker::Execute: FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19 _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.859212 [PID-1985829/TID-1985878] [L1] FS::FSWorker::Execute: FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19 took: 0.017905 ms
2024-02-29 15:13:42.859308 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: undefined FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19
```
Hi, I thought about this a little more. Should we really skip updating the timestamp if there was nothing to process? I don't think there is any harm in updating it, but I don't yet understand the gain either.

Pros: sounds "right".
Cons: migrate gets triggered every minute, so instead of batching uploads that happened in the last ...MIGRATE_INTERVAL, we would only batch uploads that happened between the last cron scheduling and the current one.

WDYT @piste-jp-ibm @guymguym
In my opinion, it is good to update the timestamp in the cases below:

- when `manage_nsfs` issues the `migration` script
- when the first entry comes into the `migration` file under the log directory; in other words, when NooBaa creates the `migration` log file under the log directory
- another thought is to keep this value in another place and decide go or no-go from the two values, `timestamp.migrate` and this value

I think this implementation would ensure that all files start migrating within 11 minutes at most.
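The go/no-go decision from two values could look roughly like the sketch below. All names here are hypothetical and this is not NooBaa's code: `lastMigrateMs` stands for `timestamp.migrate`, and `firstPendingEntryMs` stands for the separately kept time of the first pending entry (the second value):

```javascript
// Hedged sketch of the "two values" go/no-go idea: migration is issued
// once the oldest pending entry has waited a full interval, or once a
// full interval has passed since the last real migration. When there
// are no pending entries, the answer is no-go, and timestamp.migrate
// would not be rewritten at all.
function shouldIssueMigration({ lastMigrateMs, firstPendingEntryMs, intervalMs, nowMs }) {
    // No pending entries at all: never go.
    if (firstPendingEntryMs === undefined) return false;
    return (nowMs - firstPendingEntryMs >= intervalMs) ||
           (nowMs - lastMigrateMs >= intervalMs);
}

module.exports = { shouldIssueMigration };
```

With a 10-minute `NSFS_GLACIER_MIGRATE_INTERVAL` and a cron that fires every minute, this would bound the wait of any upload to roughly interval plus one cron tick, which matches the "11 minutes at most" estimate above.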
This issue had no activity for too long - it will now be labeled stale. Update it to prevent it from getting closed.
This issue is stale and had no activity for too long - it will now be closed.