amazon-textract-transformer-pipeline

Post-process Amazon Textract results with Hugging Face transformer models for document understanding

Results: 17 amazon-textract-transformer-pipeline issues

Bumps [pdfjs-dist](https://github.com/mozilla/pdfjs-dist) from 3.4.120 to 4.2.67. Dependabot will resolve any conflicts with this PR as long as you don't alter...

dependencies
javascript

When trying to run this cell, I got a 'sndfile library not found' error. Even after I pip installed the packages, the issue was still not resolved. Can anyone suggest how...

Recently heard from a user facing the following error on CDK deploy: Resource handler returned message: "The runtime parameter of go1.x is no longer supported for creating or updating AWS...

bug

Bumps [tar](https://github.com/isaacs/node-tar) from 6.1.13 to 6.2.1. Changelog (sourced from tar's changelog): 7.0: Rewrite in TypeScript, provide ESM and CommonJS hybrid interface; add tree-shake friendly exports, like import('tar/create') and import('tar/read-entry')...

dependencies
javascript

Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.5.2 to 4.5.3. Changelog (sourced from vite's changelog): 4.5.3 (2024-03-24): fix: fs.deny with globs with directories (#16250) (96a7f3a), closes #16250. Commits: aac695e release: v4.5.3; 96a7f3a fix: fs.deny...

dependencies
javascript

We're aware that the `amazon-textract-transformer-pipeline-assets` S3 bucket used by the "Launch stack" button on the [root README](https://github.com/aws-samples/amazon-textract-transformer-pipeline?tab=readme-ov-file#getting-started) is no longer publicly accessible, and we are working to find a resolution... In the...

I wanted to ask whether this solution currently supports the Language-Independent Layout Transformer - RoBERTa model (LiLT). If not, I wanted to request that the inference code be updated to...

enhancement
help wanted

Today we demonstrate annotation and training for entity extraction only. For many use cases, document classification is also important, and it should be pretty straightforward to support this too. A...

enhancement

As of #26, users can train generative models to normalize entity text after extraction, for example to standardize date or currency formats or to correct common OCR error patterns. This is...

enhancement

Currently, the [custom online human review UI](https://github.com/aws-samples/amazon-textract-transformer-pipeline/blob/5415fb1befa900466d9f03ca098037f2db06b2b3/img/human-review-sample.png) can render detection bounding boxes over a full multi-page document at once, but the [training data annotation UI](https://github.com/aws-samples/amazon-textract-transformer-pipeline/blob/5415fb1befa900466d9f03ca098037f2db06b2b3/notebooks/img/smgt-custom-template-demo.png) is...

enhancement