nydus icon indicating copy to clipboard operation
nydus copied to clipboard

WIP fscache: blob gc feature for fscache with inuse and cull

Open yawqi opened this issue 2 years ago • 25 comments

This patch is still working in progress, hasn't been tested at all, very likely to have bugs or even errors. Does anyone have suggestions on how can I test it?

Add blob gc for fscache by using inuse and cull, this implementation reference the implementation of the cachefilesd, but since we use ondemand mode, so it's sightly different.

The gc watermark for now is 95% used blocks or 95% used files, in the future, we need to change it configurable.

#454

Signed-off-by: Qi Wang [email protected]

yawqi avatar Jun 21 '22 01:06 yawqi

When reclaiming the data blob files, we should also reclaim the associated chunk map state file. Actually we should reclaim the chunk map state file before reclaim the blob file. Otherwise it may cause inconsistent state when the blob is used again.

jiangliu avatar Jun 21 '22 02:06 jiangliu

When reclaiming the data blob files, we should also reclaim the associated chunk map state file. Actually we should reclaim the chunk map state file before reclaim the blob file. Otherwise it may cause inconsistent state when the blob is used again.

If we only reclaim unused blob, is it necessary to reclaim the chunk map state? Does the chunk map state contains unused blob? The cull of cachefile will only cull the unused blob.

/*
 * Cull an object if it's not in use
 * - called only by cache manager daemon
*/
int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir, char *filename)

yawqi avatar Jun 21 '22 02:06 yawqi

@yawqi , Your pull request has been updated. A new test job has been submitted. Please wait in patience.

yqleng1987 avatar Jun 21 '22 04:06 yqleng1987

@yawqi , Test job has been submitted. Please wait in patience.

yqleng1987 avatar Jun 21 '22 04:06 yqleng1987

@yawqi , The CI test is completed, please check result:

Test CaseTest Result
merge-target-branch:white_check_mark:SUCCESS
build-docker-image:white_check_mark:SUCCESS
compile-nydus:white_check_mark:SUCCESS
compile-ctr-remote:white_check_mark:SUCCESS
compile-nydus-snapshotter:white_check_mark:SUCCESS
start-nydus-snapshotter-config-containerd:white_check_mark:SUCCESS
run-container-with-nydus-image:white_check_mark:SUCCESS

Congratulations, your test job passed!

yqleng1987 avatar Jun 21 '22 04:06 yqleng1987

When reclaiming the data blob files, we should also reclaim the associated chunk map state file. Actually we should reclaim the chunk map state file before reclaim the blob file. Otherwise it may cause inconsistent state when the blob is used again.

If we only reclaim unused blob, is it necessary to reclaim the chunk map state? Does the chunk map state contains unused blob? The cull of cachefile will only cull the unused blob.

/*
 * Cull an object if it's not in use
 * - called only by cache manager daemon
*/
int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir, char *filename)

The chunk map is used to track chunk readiness for blob files. If the stale chunk map if left over and when the blob is used again, the fscache driver will send READ request as expected, but the nydusd won't fetch data from remote and write data to the new blob file because the chunk map file says all chunks are ready.

jiangliu avatar Jun 21 '22 05:06 jiangliu

Filesystem atime could never be changed per as mount option. Will it affect your design philosophy?

changweige avatar Jun 21 '22 06:06 changweige

Filesystem atime could never be changed per as mount option. Will it affect your design philosophy?

IMHO, It won't. I use atime as a rough estimate, don't depend on its accuracy. Such as outdated image's unused blob must have the oldest atime, so we can delete it first. For other files, I think the atime can partly reflect whether it is in use or when it has been used. And in most sceneries, mount won't disable atime (I guess...) : ).

yawqi avatar Jun 21 '22 07:06 yawqi

When reclaiming the data blob files, we should also reclaim the associated chunk map state file. Actually we should reclaim the chunk map state file before reclaim the blob file. Otherwise it may cause inconsistent state when the blob is used again.

If we only reclaim unused blob, is it necessary to reclaim the chunk map state? Does the chunk map state contains unused blob? The cull of cachefile will only cull the unused blob.

/*
 * Cull an object if it's not in use
 * - called only by cache manager daemon
*/
int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir, char *filename)

The chunk map is used to track chunk readiness for blob files. If the stale chunk map if left over and when the blob is used again, the fscache driver will send READ request as expected, but the nydusd won't fetch data from remote and write data to the new blob file because the chunk map file says all chunks are ready.

OK, I will fix it first.

yawqi avatar Jun 21 '22 08:06 yawqi

When reclaiming the data blob files, we should also reclaim the associated chunk map state file. Actually we should reclaim the chunk map state file before reclaim the blob file. Otherwise it may cause inconsistent state when the blob is used again.

Sorry to bother you, but I wonder whether it is ok if I just delete the corresponding ${work_dir}/${blob_id}.chunk_map file? Thanks a lot!

yawqi avatar Jun 22 '22 07:06 yawqi

When reclaiming the data blob files, we should also reclaim the associated chunk map state file. Actually we should reclaim the chunk map state file before reclaim the blob file. Otherwise it may cause inconsistent state when the blob is used again.

Sorry to bother you, but I wonder whether it is ok if I just delete the corresponding ${work_dir}/${blob_id}.chunk_map file? Thanks a lot!

Actually, Should I just delete the ${blob_id}, ${blob_id}.chunk_map and ${blob_id}.blob.meta all together?

yawqi avatar Jun 22 '22 08:06 yawqi

When reclaiming the data blob files, we should also reclaim the associated chunk map state file. Actually we should reclaim the chunk map state file before reclaim the blob file. Otherwise it may cause inconsistent state when the blob is used again.

Sorry to bother you, but I wonder whether it is ok if I just delete the corresponding ${work_dir}/${blob_id}.chunk_map file? Thanks a lot!

Actually, Should I just delete the ${blob_id}, ${blob_id}.chunk_map and ${blob_id}.blob.meta all together?

I think I'm gonna add a gc_file(blob_id) method in trait BlobCacheMgr or struct FsCacheMgr to actually delete the chunk_map file.

yawqi avatar Jun 22 '22 09:06 yawqi

When reclaiming the data blob files, we should also reclaim the associated chunk map state file. Actually we should reclaim the chunk map state file before reclaim the blob file. Otherwise it may cause inconsistent state when the blob is used again.

Sorry to bother you, but I wonder whether it is ok if I just delete the corresponding ${work_dir}/${blob_id}.chunk_map file? Thanks a lot!

Actually, Should I just delete the ${blob_id}, ${blob_id}.chunk_map and ${blob_id}.blob.meta all together?

blob.meta is a const file, should be small, and affect container startup time. So it would be better to keep it. For chunk_map, we need a safe way to delete to avoid race conditions. We need to add an interface to safely delete the chunk map file.

jiangliu avatar Jun 22 '22 15:06 jiangliu

blob.meta is a const file, should be small, and affect container startup time. So it would be better to keep it. For chunk_map, we need a safe way to delete to avoid race conditions. We need to add an interface to safely delete the chunk map file.

I am thinking that adding another flag delete in the chunk map's header which can be used to indicate the status of the current chunk map file when we need to cull the blob file, we first set the delete flag in the header. If it can be culled, we may delete the chunk map file afterward.

pub(crate) struct Header {
    /// PersistMap magic number
    pub magic: u32,
    pub version: u32,
    pub magic2: u32,
    pub all_ready: u32,
    __pub delete: u32__,
    pub reserved: [u8; HEADER_RESERVED_SIZE],
}

Or should I go with the nydus-snapshotter way?

yawqi avatar Jun 23 '22 07:06 yawqi

blob.meta is a const file, should be small, and affect container startup time. So it would be better to keep it. For chunk_map, we need a safe way to delete to avoid race conditions. We need to add an interface to safely delete the chunk map file.

I am thinking that adding another flag delete in the chunk map's header which can be used to indicate the status of the current chunk map file when we need to cull the blob file, we first set the delete flag in the header. If it can be culled, we may delete the chunk map file afterward.

pub(crate) struct Header {
    /// PersistMap magic number
    pub magic: u32,
    pub version: u32,
    pub magic2: u32,
    pub all_ready: u32,
    __pub delete: u32__,
    pub reserved: [u8; HEADER_RESERVED_SIZE],
}

Or should I go with the nydus-snapshotter way?

I like this way:) We could set the flag, delete the chunkmap file from fs. All clients holding active fd to the chunkmap file won't get broken and will eventually see the flag. And the chunkmap can't be used by any new clients because it has been removed from fs.

jiangliu avatar Jun 26 '22 17:06 jiangliu

@yawqi , your pull request has been updated. A new test job will be submitted. Please wait in patience.

yqleng1987 avatar Jun 29 '22 10:06 yqleng1987

@yawqi , your test job has passed, and no need to test again.

yqleng1987 avatar Jun 29 '22 10:06 yqleng1987

Seems something gets wrong with CI,could you please help to retrigger the CI?

jiangliu avatar Jul 01 '22 06:07 jiangliu

I am currently occupied by the school affairs, is it ok if I close this PR temporarily? After these two months when I have passed the mid-term of my graduation project, I will have more time to work on this. Much thanks!

yawqi avatar Jul 01 '22 07:07 yawqi

I am currently occupied by the school affairs, is it ok if I close this PR temporarily? After these two months when I have passed the mid-term of my graduation project, I will have more time to work on this. Much thanks!

Let‘s just keep it open:)

jiangliu avatar Jul 02 '22 02:07 jiangliu

@yawqi Hi, how about your dealing with this problem?

adamqqqplay avatar Mar 23 '23 03:03 adamqqqplay

@adamqqqplay Sorry about that, I kinda forget about this PR, is this feature still needed? It seems there is a discussion about this feature last year, so I didn't work on it since then. I apologize for my delay. Should I keep on working on this feature or close it?

yawqi avatar Mar 23 '23 04:03 yawqi

@adamqqqplay Sorry about that, I kinda forget about this PR, is this feature still needed? It seems there is a discussion about this feature last year, so I didn't work on it since then. I apologize for my delay. Should I keep on working on this feature or close it?

@yawqi I found a similar PR #894, but I'm not sure if it has the same feature. Could you please confirm it if you have time? cc @changweige @kevinXYin

adamqqqplay avatar Mar 23 '23 06:03 adamqqqplay

@adamqqqplay Sorry about that, I kinda forget about this PR, is this feature still needed? It seems there is a discussion about this feature last year, so I didn't work on it since then. I apologize for my delay. Should I keep on working on this feature or close it? @yawqi I found a similar PR #894, but I'm not sure if it has the same feature. Could you please confirm it if you have time? cc @changweige @kevinXYin

It seems that this feature has been implemented in the nydus-snapshotter, maybe this PR should be closed? https://github.com/containerd/nydus-snapshotter/pull/262

yawqi avatar Mar 23 '23 08:03 yawqi

why we need to bother with whether close it or not? IOWs, I have to say, watermark from the kernel is needed anyway, so this PR should be needed eventually.

hsiangkao avatar Mar 24 '23 02:03 hsiangkao