Optimize the code path in `createFileset` and optimize path.

Open yuqi1129 opened this issue 10 months ago • 2 comments

Code in HadoopCatalogOperations#createFileset

    try {
      // formalize the path to avoid path without scheme, uri, authority, etc.
      filesetPath = formalizePath(filesetPath, conf);

      FileSystem fs = getFileSystem(filesetPath, conf);
      if (!fs.exists(filesetPath)) {
        if (!fs.mkdirs(filesetPath)) {
          throw new RuntimeException(
              "Failed to create fileset " + ident + " location " + filesetPath);
        }

        LOG.info("Created fileset {} location {}", ident, filesetPath);
      } else {
        LOG.info("Fileset {} manages the existing location {}", ident, filesetPath);
      }

    } catch (IOException ioe) {
      throw new RuntimeException(
          "Failed to create fileset " + ident + " location " + filesetPath, ioe);
    }
  • filesetPath = formalizePath(filesetPath, conf);
  • FileSystem fs = getFileSystem(filesetPath, conf);

Both of these lines get and initialize the file system, so the same work is done twice. They can be merged into a single step.
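A minimal sketch of what the merged step could look like. It uses only the JDK, so `Fs`, `Resolved`, `resolve`, and the `file:///` root are hypothetical stand-ins, not the Hadoop or Gravitino API. It also assumes that path formalization needs a file system handle to qualify the path. The idea is to obtain the handle once and return it together with the qualified path.

```java
import java.net.URI;
import java.util.concurrent.atomic.AtomicInteger;

final class SingleLookupSketch {
  // Counts how often the (expensive) file system initialization runs.
  static final AtomicInteger INIT_COUNT = new AtomicInteger();

  /** Hypothetical stand-in for a file system handle; not Hadoop's FileSystem. */
  static final class Fs {
    final URI root;

    Fs(URI root) {
      this.root = root;
    }

    // Qualifies a bare path with the scheme of this file system.
    URI qualify(String path) {
      return root.resolve(path);
    }
  }

  /** Hypothetical pair of the handle and the qualified path it produced. */
  static final class Resolved {
    final Fs fs;
    final URI path;

    Resolved(Fs fs, URI path) {
      this.fs = fs;
      this.path = path;
    }
  }

  // Stand-in for getFileSystem(filesetPath, conf): every call pays the init cost.
  static Fs getFileSystem(String path) {
    INIT_COUNT.incrementAndGet();
    return new Fs(URI.create("file:///"));
  }

  // Merged step: one lookup, reused for qualifying the path and for later I/O.
  static Resolved resolve(String path) {
    Fs fs = getFileSystem(path);
    return new Resolved(fs, fs.qualify(path));
  }
}
```

With this shape, `createFileset` would call `resolve` once and use the returned handle for the `exists` and `mkdirs` checks instead of looking the file system up a second time.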

      AtomicReference<FileSystem> fileSystem = new AtomicReference<>();
      Awaitility.await()
          .atMost(timeoutSeconds, TimeUnit.SECONDS)
          .until(
              () -> {
                fileSystem.set(provider.getFileSystem(path, config));
                return true;
              });
      return fileSystem.get();

This code can be replaced with Java's `Future` mechanism to reduce the time spent polling for status.
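A rough sketch of the `Future`-based alternative, using only `java.util.concurrent`. The class name `TimedInit`, the helper `callWithTimeout`, and the choice to wrap failures in `IOException` are assumptions for illustration, not existing Gravitino code. `Future.get(timeout, unit)` blocks until the result is ready or the timeout expires, so no poll interval is involved.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedInit {
  // Hypothetical helper: runs `init` on a worker thread and waits at most `timeout`.
  static <T> T callWithTimeout(Callable<T> init, long timeout, TimeUnit unit)
      throws IOException {
    // Daemon thread so a stuck initialization cannot keep the JVM alive.
    ExecutorService executor =
        Executors.newSingleThreadExecutor(
            r -> {
              Thread t = new Thread(r, "fs-init");
              t.setDaemon(true);
              return t;
            });
    Future<T> future = executor.submit(init);
    try {
      // Returns as soon as the task completes; no periodic status polling.
      return future.get(timeout, unit);
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the worker if it is still running
      throw new IOException("Timed out after " + timeout + " " + unit, e);
    } catch (ExecutionException e) {
      throw new IOException("Initialization failed", e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      throw new IOException("Interrupted while waiting for initialization", e);
    } finally {
      executor.shutdownNow();
    }
  }
}
```

The Awaitility block above would then reduce to something like `callWithTimeout(() -> provider.getFileSystem(path, config), timeoutSeconds, TimeUnit.SECONDS)`. Creating an executor per call keeps the sketch short; a shared executor may be preferable in real code.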

There may be other minor points to improve.

yuqi1129 • Feb 27 '25 02:02

I would like to work on it.

Abyss-lord • Feb 27 '25 06:02

OK, just go ahead.

yuqi1129 • Feb 27 '25 12:02

@Abyss-lord Do you have time to work on this issue?

yuqi1129 • Mar 12 '25 09:03

@yuqi1129 Yes, I can finish it in two days.

Abyss-lord • Mar 12 '25 15:03