[Improvement] Get `catalogInUse` and `metalakeInUse` from cache instead of retrieving them from storage

Open yuqi1129 opened this issue 10 months ago • 2 comments

What would you like to be improved?

`catalogInUse` and `metalakeInUse` are called frequently and are very time-consuming because they load data directly from backend storage on every call. This can be optimized.

```java
  protected <R, E extends Throwable> R doWithCatalog(
      NameIdentifier ident, ThrowableFunction<CatalogManager.CatalogWrapper, R> fn, Class<E> ex)
      throws E {
    checkCatalogInUse(store, ident);

    try {
      CatalogManager.CatalogWrapper c = catalogManager.loadCatalogAndWrap(ident);
      return fn.apply(c);
    } catch (Throwable throwable) {
      if (ex.isInstance(throwable)) {
        throw ex.cast(throwable);
      }
      if (RuntimeException.class.isAssignableFrom(throwable.getClass())) {
        throw (RuntimeException) throwable;
      }
      throw new RuntimeException(throwable);
    }
  }

  public static void checkCatalogInUse(EntityStore store, NameIdentifier ident)
      throws NoSuchMetalakeException, NoSuchCatalogException, CatalogNotInUseException,
          MetalakeNotInUseException {
    NameIdentifier metalakeIdent = NameIdentifier.of(ident.namespace().levels());
    checkMetalake(metalakeIdent, store);

    if (!getCatalogInUseValue(store, ident)) {
      throw new CatalogNotInUseException("Catalog %s is not in use, please enable it first", ident);
    }
  }

  private static boolean getCatalogInUseValue(EntityStore store, NameIdentifier catalogIdent) {
    try {
      CatalogEntity catalogEntity =
          store.get(catalogIdent, EntityType.CATALOG, CatalogEntity.class);
      return (boolean)
          BASIC_CATALOG_PROPERTIES_METADATA.getOrDefault(
              catalogEntity.getProperties(), PROPERTY_IN_USE);

    } catch (NoSuchEntityException e) {
      LOG.warn("Catalog {} does not exist", catalogIdent, e);
      throw new NoSuchCatalogException(CATALOG_DOES_NOT_EXIST_MSG, catalogIdent);

    } catch (IOException e) {
      LOG.error("Failed to do store operation", e);
      throw new RuntimeException(e);
    }
  }

  public static boolean metalakeInUse(EntityStore store, NameIdentifier ident)
      throws NoSuchMetalakeException {
    try {
      BaseMetalake metalake = store.get(ident, EntityType.METALAKE, BaseMetalake.class);
      return (boolean)
          metalake.propertiesMetadata().getOrDefault(metalake.properties(), PROPERTY_IN_USE);

    } catch (NoSuchEntityException e) {
      LOG.warn("Metalake {} does not exist", ident, e);
      throw new NoSuchMetalakeException(METALAKE_DOES_NOT_EXIST_MSG, ident);

    } catch (IOException e) {
      LOG.error("Failed to do store operation", e);
      throw new RuntimeException(e);
    }
  }
```

After this improvement, the QPS of APIs such as `loadFileset` will increase by as much as 50%.
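
One possible shape for such a cache is sketched here, only to make the idea concrete. `InUseCache`, its time-to-live parameter, and the string keys are hypothetical and are not part of Gravitino. The sketch assumes the in-use flag can be keyed by the identifier's string form and that a bounded staleness window is acceptable.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/** Hypothetical helper (not a Gravitino class): caches a boolean "in use" flag per identifier. */
final class InUseCache {
  /** A cached flag together with the time it was loaded. */
  private static final class Entry {
    final boolean inUse;
    final long loadedAtMillis;

    Entry(boolean inUse, long loadedAtMillis) {
      this.inUse = inUse;
      this.loadedAtMillis = loadedAtMillis;
    }
  }

  private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock; // injected so expiry can be tested without sleeping

  InUseCache(long ttlMillis, LongSupplier clock) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  /**
   * Returns the cached flag for ident. The loader (for example, a read from the
   * backend store) runs only when there is no entry or the entry has expired.
   * If the loader throws, nothing is cached and the exception propagates.
   */
  boolean get(String ident, Predicate<String> loader) {
    long now = clock.getAsLong();
    Entry entry =
        entries.compute(
            ident,
            (key, old) ->
                (old != null && now - old.loadedAtMillis < ttlMillis)
                    ? old
                    : new Entry(loader.test(key), now));
    return entry.inUse;
  }

  /** Drops the entry so the next read reloads it; call this when the flag changes. */
  void invalidate(String ident) {
    entries.remove(ident);
  }
}
```

Under these assumptions, `checkCatalogInUse` would pass a loader that wraps the existing store lookup, and any code path that enables, disables, or drops a catalog or metalake would call `invalidate`. In a multi-node deployment, other nodes could still serve a stale flag until the time-to-live expires, so the TTL would need to be chosen with that in mind.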

How should we improve?

No response

yuqi1129 · Feb 27 '25 08:02

I would like to work on it.

Abyss-lord · Feb 27 '25 09:02

> I would like to work on it.

Sorry, I'm already working on it. If you are interested in this point, you can take part in https://github.com/apache/gravitino/issues/6560

yuqi1129 · Feb 27 '25 12:02