langchainjs

S3 File Loader: Endpoint is missing in configuration

Open lawetis opened this issue 9 months ago • 3 comments

Checklist

  • [X] I added a very descriptive title to this issue.
  • [X] I included a link to the documentation page I am referring to (if applicable).

Issue with current documentation:

According to the current documentation at https://api.js.langchain.com/interfaces/langchain_document_loaders_web_s3.S3LoaderParams.html

The current configuration is:

import { S3Loader } from "langchain/document_loaders/web/s3";

const loader = new S3Loader({
  bucket: "my-document-bucket-123",
  key: "AccountingOverview.pdf",
  s3Config: {
    region: "us-east-1",
    credentials: {
      accessKeyId: "AKIAIOSFODNN7EXAMPLE",
      secretAccessKey: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    },
  },
  unstructuredAPIURL: "http://localhost:8000/general/v0/general",
  unstructuredAPIKey: "", // this will soon be required
});

So, is it possible to configure the endpoint, which is an optional setting?

Idea or request for content:

After checking the types in @aws-sdk/client-s3, I am convinced that endpoint is an optional configuration option:

export interface ClientInputEndpointParameters {
    region?: string | Provider<string>;
    useFipsEndpoint?: boolean | Provider<boolean>;
    useDualstackEndpoint?: boolean | Provider<boolean>;
    endpoint?: string | Provider<string> | Endpoint | Provider<Endpoint> | EndpointV2 | Provider<EndpointV2>;
    forcePathStyle?: boolean | Provider<boolean>;
    useAccelerateEndpoint?: boolean | Provider<boolean>;
    useGlobalEndpoint?: boolean | Provider<boolean>;
    disableMultiregionAccessPoints?: boolean | Provider<boolean>;
    useArnRegion?: boolean | Provider<boolean>;
    disableS3ExpressSessionAuth?: boolean | Provider<boolean>;
}


lawetis, May 11 '24 15:05

The API reference documentation is automatically generated from the source code. However, we are aware it has some limitations, especially when showing externally imported types/interfaces, such as S3ClientConfig in this file.

Is this issue about the API reference documentation not showing the types that the S3ClientConfig interface contains, or something else? I just checked, and it is possible to pass endpoint in the s3Config object of this class.
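
A minimal sketch of why this type-checks, using simplified stand-in types rather than the real ones from @aws-sdk/client-s3 and langchain. The names EndpointParams, ClientConfigSketch, and LoaderParamsSketch, as well as the localhost URL, are hypothetical and exist only for illustration. Because the client config type includes the endpoint parameters, an object nested under s3Config may carry an optional endpoint:

```typescript
// Simplified stand-ins; the real types live in @aws-sdk/client-s3 and langchain.
interface EndpointParams {
  region?: string;
  endpoint?: string; // optional, as in ClientInputEndpointParameters
  forcePathStyle?: boolean;
}

// Stand-in for S3ClientConfig: it includes the endpoint parameters.
interface ClientConfigSketch extends EndpointParams {
  credentials?: { accessKeyId: string; secretAccessKey: string };
}

// Stand-in for S3LoaderParams: s3Config carries the client config.
interface LoaderParamsSketch {
  bucket: string;
  key: string;
  s3Config?: ClientConfigSketch;
}

// endpoint is omitted here, which is allowed because it is optional.
const withoutEndpoint: LoaderParamsSketch = {
  bucket: "my-document-bucket-123",
  key: "AccountingOverview.pdf",
  s3Config: { region: "us-east-1" },
};

// endpoint is set here; the URL is a hypothetical local endpoint.
const withEndpoint: LoaderParamsSketch = {
  ...withoutEndpoint,
  s3Config: { ...withoutEndpoint.s3Config, endpoint: "http://localhost:9000" },
};
```

Both objects satisfy the same interface, so the missing entry in the generated reference is a documentation gap rather than a missing option.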

bracesproul, May 12 '24 21:05

You can configure it through s3Config.endpoint. The ClientInputEndpointParameters you mentioned is a subset of S3LoaderParams['s3Config'].

Example:

new S3Loader({
  bucket: "my-document-bucket-123",
  key: "AccountingOverview.pdf",
  s3Config: {
    region: "us-east-1",
    endpoint: "your endpoint",
    credentials: {
      accessKeyId: "AKIAIOSFODNN7EXAMPLE",
      secretAccessKey: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    },
  },
  unstructuredAPIURL: "http://localhost:8000/general/v0/general",
  unstructuredAPIKey: "", // this will soon be required
});
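
If the endpoint is only needed in some environments, the s3Config object can be built conditionally. The following is a rough sketch, not part of langchain: buildS3Config and S3ConfigSketch are hypothetical names, the localhost URL is a placeholder, and enabling forcePathStyle alongside a custom endpoint is an assumption that suits many self-hosted S3-compatible stores but may not suit yours.

```typescript
// Hypothetical local type mirroring the fields used in the example above.
interface S3ConfigSketch {
  region: string;
  endpoint?: string;
  forcePathStyle?: boolean;
  credentials: { accessKeyId: string; secretAccessKey: string };
}

// Hypothetical helper (not a langchain API): adds endpoint only when supplied.
function buildS3Config(
  region: string,
  credentials: S3ConfigSketch["credentials"],
  endpoint?: string,
): S3ConfigSketch {
  const config: S3ConfigSketch = { region, credentials };
  if (endpoint) {
    config.endpoint = endpoint;
    // Assumption: a custom endpoint usually expects path-style URLs.
    config.forcePathStyle = true;
  }
  return config;
}

const s3Config = buildS3Config(
  "us-east-1",
  {
    accessKeyId: "AKIAIOSFODNN7EXAMPLE",
    secretAccessKey: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
  },
  "http://localhost:9000", // hypothetical local endpoint
);
```

The resulting object can then be passed as the s3Config property of the S3Loader constructor shown above.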

@lawetis

jeasonnow, May 13 '24 08:05

@bracesproul @jeasonnow Thanks a lot, this was my problem and I didn't notice it.

lawetis, May 13 '24 15:05