Cache Hadoop FileSystem instances on the Gravitino server to improve performance

Open yuqi1129 opened this issue 10 months ago • 3 comments

Currently, all Gravitino file system providers use the following code
(taking HDFSFileSystemProvider as an example):

[Image: screenshot of the HDFSFileSystemProvider code; not preserved]

FileSystem.newInstance always creates a new FileSystem on every call, even though the resulting instances are mostly identical. Hadoop's FileSystem does have a built-in cache mechanism: if we use FileSystem.get instead, that cache takes effect. However, because the Gravitino Virtual FileSystem (GVFS) client shares the same FileSystemProviders and supports separate credentials for each unique path, we should be cautious when enabling the cache at the FileSystem level. In summary:

  • On the Gravitino server side, we can enable the cache at the FileSystem level.
  • In GVFS, we need to disable it at the FileSystem level and cache file system instances at the GVFS level.
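
To make the second point concrete, here is a minimal sketch of what a GVFS-level cache could look like. It is not Gravitino's actual implementation: the class name FsInstanceCache, the Key fields, and the credentialId string are all hypothetical. The idea is to key cached instances by scheme, authority, and the credential in use, so that two paths on the same cluster share an instance only when they also share credentials. The type parameter T stands in for org.apache.hadoop.fs.FileSystem so the sketch compiles without Hadoop on the classpath; in real code the factory would presumably call FileSystem.newInstance(uri, conf), or the Hadoop-level cache would be turned off with fs.<scheme>.impl.disable.cache=true.

```java
import java.net.URI;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical client-level cache; an illustration only, not Gravitino code. */
final class FsInstanceCache<T> {

  /** Cache key: scheme and authority of the path plus an id for the credential in use. */
  static final class Key {
    final String scheme;
    final String authority;
    final String credentialId; // assumption: credentials can be reduced to a stable id

    Key(URI uri, String credentialId) {
      this.scheme = uri.getScheme();
      this.authority = uri.getAuthority();
      this.credentialId = credentialId;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof Key)) return false;
      Key k = (Key) o;
      return Objects.equals(scheme, k.scheme)
          && Objects.equals(authority, k.authority)
          && Objects.equals(credentialId, k.credentialId);
    }

    @Override
    public int hashCode() {
      return Objects.hash(scheme, authority, credentialId);
    }
  }

  private final ConcurrentHashMap<Key, T> instances = new ConcurrentHashMap<>();
  private final Function<Key, T> factory;

  /** The factory creates a fresh, uncached instance for a key. */
  FsInstanceCache(Function<Key, T> factory) {
    this.factory = factory;
  }

  /** Returns the cached instance for (scheme, authority, credential), creating it once if absent. */
  T get(URI uri, String credentialId) {
    return instances.computeIfAbsent(new Key(uri, credentialId), factory);
  }

  int size() {
    return instances.size();
  }
}
```

With this shape, two paths under hdfs://nn1:8020 accessed with the same credential id resolve to one instance, while the same path accessed with a different credential id gets its own. Eviction and closing of cached instances are left out of the sketch and would need to be handled in a real client.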

yuqi1129 avatar Feb 27 '25 02:02 yuqi1129

@yuqi1129 please assign it to me.

sunxiaojian avatar Mar 02 '25 16:03 sunxiaojian

Of course, I will send it to you if you are interested in this issue.

yuqi1129 avatar Mar 03 '25 03:03 yuqi1129

> Of course, I will send it to you if you are interested in this issue.

ok, thanks

sunxiaojian avatar Mar 03 '25 06:03 sunxiaojian

Good catch!

shaofengshi avatar Mar 24 '25 03:03 shaofengshi