[Bug]: (External Catalog) Iceberg rest catalog implementation error

Open Shreyas220 opened this issue 1 year ago • 6 comments

What happened?

I am trying to connect Amoro to an external Iceberg REST catalog, using the docker-compose setup from the Iceberg Spark quickstart: https://iceberg.apache.org/spark-quickstart/#docker-compose

For the catalog implementation I am using the jar from https://github.com/databricks/iceberg-rest-image

but I am getting this error:


java.lang.IllegalArgumentException: Cannot initialize Catalog implementation 
/usr/local/amoro/lib/iceberg-rest-image-all.jar: Cannot find constructor for interface
org.apache.iceberg.catalog.Catalog Missing /usr/local/amoro/lib/iceberg-rest-image-all.jar
[java.lang.ClassNotFoundException: /usr/local/amoro/lib/iceberg-rest-image-all.jar]

Affects Versions

Latest Amoro image

What table formats are you seeing the problem on?

Iceberg

What engines are you seeing the problem on?

AMS

How to reproduce

Use this docker-compose file (https://iceberg.apache.org/spark-quickstart/#docker-compose) along with Amoro,

then try connecting to the Iceberg REST catalog:

version: "3"

services:
  spark-iceberg:
    image: tabulario/spark-iceberg
    container_name: spark-iceberg
    build: spark/
    networks:
      iceberg_net:
    depends_on:
      - rest
      - minio
    volumes:
      - ./warehouse:/home/iceberg/warehouse
      - ./notebooks:/home/iceberg/notebooks/notebooks
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
    ports:
      - 8888:8888
      - 8080:8080
      - 10000:10000
      - 10001:10001
  rest:
    image: tabulario/iceberg-rest
    container_name: iceberg-rest
    networks:
      iceberg_net:
    ports:
      - 8181:8181
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
      - CATALOG_WAREHOUSE=s3://warehouse/
      - CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO
      - CATALOG_S3_ENDPOINT=http://minio:9000
  minio:
    image: minio/minio
    container_name: minio
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=password
      - MINIO_DOMAIN=minio
    networks:
      iceberg_net:
        aliases:
          - warehouse.minio
    ports:
      - 9001:9001
      - 9000:9000
    command: ["server", "/data", "--console-address", ":9001"]
  mc:
    depends_on:
      - minio
    image: minio/mc
    container_name: mc
    networks:
      iceberg_net:
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
    entrypoint: >
      /bin/sh -c "
      until (/usr/bin/mc config host add minio http://minio:9000 admin password) do echo '...waiting...' && sleep 1; done;
      /usr/bin/mc rm -r --force minio/warehouse;
      /usr/bin/mc mb minio/warehouse;
      /usr/bin/mc policy set public minio/warehouse;
      tail -f /dev/null
      "
networks:
  iceberg_net:

Relevant log output

No response

Anything else

[Image: Screenshot 2024-12-04 at 6 46 28 PM]

Are you willing to submit a PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

Shreyas220 avatar Dec 04 '24 13:12 Shreyas220

If this is not an issue, please advise which jar to use as the catalog-impl for the Iceberg REST catalog.

Shreyas220 avatar Dec 04 '24 13:12 Shreyas220

@Shreyas220, thanks for reporting the issue here. Could you please try setting the value of catalog-impl to the full class name of the REST catalog, and putting the jar into the $AMORO_HOME/lib directory?

klion26 avatar Dec 05 '24 06:12 klion26

@Shreyas220 catalog-impl should be org.apache.iceberg.rest.RESTCatalog, and you are also missing some required properties such as uri. The Custom catalog type actually invokes CatalogUtil.loadCatalog, so you need to fill in the native Iceberg catalog properties; refer to https://iceberg.apache.org/javadoc/1.6.0/org/apache/iceberg/CatalogProperties.html
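To make the point above concrete, here is a minimal sketch in Python of the kind of property map a custom REST catalog needs, plus a small check for the mistake in the original report (a jar path used where a class name is expected). The helper name `check_rest_catalog_properties` is hypothetical. The `uri`, `warehouse`, and `s3.endpoint` values are assumptions taken from the docker-compose file in this issue and only resolve if Amoro runs on the same Docker network. The property keys (`uri`, `warehouse`, `io-impl`, `s3.endpoint`) are standard Iceberg catalog properties; this is not Amoro's own validation logic.

```python
import re

# Heuristic for a fully-qualified Java class name: dot-separated identifiers.
# A filesystem path such as "/usr/local/amoro/lib/x.jar" does not match.
_CLASS_NAME = re.compile(r"^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)+$")


def check_rest_catalog_properties(props):
    """Return a list of problems found in a custom-catalog property map.

    Hypothetical helper for illustration only.
    """
    problems = []
    impl = props.get("catalog-impl", "")
    if not _CLASS_NAME.match(impl):
        problems.append("catalog-impl must be a class name, got: %r" % impl)
    if not props.get("uri"):
        problems.append("missing required property: uri")
    return problems


# Assumed values, based on the docker-compose services "rest" and "minio".
rest_props = {
    "catalog-impl": "org.apache.iceberg.rest.RESTCatalog",
    "uri": "http://rest:8181",
    "warehouse": "s3://warehouse/",
    "io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    "s3.endpoint": "http://minio:9000",
}

# The configuration from the original report: a jar path instead of a class.
broken_props = {"catalog-impl": "/usr/local/amoro/lib/iceberg-rest-image-all.jar"}

for problem in check_rest_catalog_properties(broken_props):
    print(problem)
```

The jar path fails the class-name check, which mirrors the ClassNotFoundException in the report: the value of catalog-impl is loaded as a class, not opened as a file.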

xxubai avatar Dec 05 '24 12:12 xxubai

Hey @XBaith, @klion26, thanks a lot 🙌, I got Amoro working with the Iceberg REST catalog.

The next step is connecting it to the Glue catalog. Can you also point me to the properties that would be required for it?

Also, is it possible to use S3 storage when using Hive Metastore?

Shreyas220 avatar Dec 07 '24 04:12 Shreyas220

The next step is connecting it to the Glue catalog. Can you also point me to the properties that would be required for it? Also, is it possible to use S3 storage when using Hive Metastore?

  1. The Amoro package includes the iceberg-aws dependency by default, so you can connect to the Glue catalog by setting the catalog-impl property to org.apache.iceberg.aws.glue.GlueCatalog.

  2. Yes, you can use S3 storage with Hive Metastore, but you may need to choose Hadoop storage in Amoro and put the S3 information in hdfs-site.xml. You can find more information here: https://hadoop.apache.org/docs/r3.4.1/hadoop-aws/tools/hadoop-aws/index.html
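As a rough illustration of both answers, the Python sketch below shows an assumed Glue property map and renders a set of S3A settings into Hadoop's `<configuration>` XML layout. The bucket name, endpoint, and credentials are placeholders, and the helper `to_hadoop_site_xml` is hypothetical. The `fs.s3a.*` keys are standard hadoop-aws properties; which of them a given deployment needs is an assumption here, and GlueCatalog may additionally rely on the AWS SDK's usual region and credential settings (for example the AWS_REGION environment variable).

```python
import xml.etree.ElementTree as ET

# Assumed Glue catalog properties; "my-bucket" is a placeholder.
glue_props = {
    "catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
    "warehouse": "s3://my-bucket/warehouse",
    "io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
}

# Standard hadoop-aws (S3A) keys; the values are placeholders.
s3a_props = {
    "fs.s3a.endpoint": "http://minio:9000",
    "fs.s3a.access.key": "admin",
    "fs.s3a.secret.key": "password",
    "fs.s3a.path.style.access": "true",
}


def to_hadoop_site_xml(props):
    """Render a dict as Hadoop site XML (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


site_xml = to_hadoop_site_xml(s3a_props)
print(site_xml)
```

The generated `<property>` elements would be merged into the site file uploaded for the Hadoop storage option; they are shown here only to make the shape of the configuration explicit.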

zhoujinsong avatar Mar 12 '25 07:03 zhoujinsong

Maybe we can enhance the documentation for configuring the Rest catalog so that the user can easily play around with Amoro with the Rest catalog.

klion26 avatar Apr 17 '25 06:04 klion26

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] avatar Oct 15 '25 00:10 github-actions[bot]

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

github-actions[bot] avatar Oct 29 '25 00:10 github-actions[bot]