
Is there any chance to install this in AWS Lambda?

Open adantart opened this issue 5 years ago • 4 comments

I mean, my question is: which libraries and files need to be included in the zip for AWS Lambda?

adantart avatar Nov 27 '19 23:11 adantart

It depends on which file type you want to parse. Textract is just a wrapper for external parsers. Most are Python packages, but textract uses external CLI tools as well.

jpweytjens avatar Dec 06 '19 09:12 jpweytjens

I think what the OP means is that the zip file you get after installing all dependencies is around 55 MB, which is too large to deploy to AWS Lambda.

This is a showstopper for me as well.

` An error occurred (RequestEntityTooLargeException) when calling the CreateFunction operation: Request must be smaller than 69905067 bytes for the CreateFunction operation

This is likely because the deployment package is 54.9 MB. Lambda only allows deployment packages that are 50.0 MB or less in size. To avoid this error, decrease the size of your chalice application by removing code or removing dependencies from your chalice application. `
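For anyone wanting to see which dependencies push the package over the limit, here is a minimal sketch that sums file sizes per top-level entry in a build directory. The names `dir_size_bytes` and `largest_packages` are hypothetical helpers, not part of textract or chalice. It assumes the 50 MB in the message means 50 * 1024 * 1024 bytes, and it measures uncompressed sizes, so it only hints at what the zipped package will weigh.

```python
import os

# Assumption: the "50.0 MB" limit in the error above is 50 * 1024 * 1024 bytes.
LAMBDA_ZIP_LIMIT_BYTES = 50 * 1024 * 1024


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so that linked files are not counted twice.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def largest_packages(site_packages, top=5):
    """Return the `top` largest entries in site_packages as (name, bytes)."""
    sizes = []
    for entry in os.scandir(site_packages):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((entry.name, dir_size_bytes(entry.path)))
        elif entry.is_file(follow_symlinks=False):
            sizes.append((entry.name, entry.stat().st_size))
    # Largest first, so the heaviest dependency is at index 0.
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]
```

Calling `largest_packages("path/to/build/dir")` on the directory that gets zipped should show which packages are worth trimming before comparing the total against `LAMBDA_ZIP_LIMIT_BYTES`.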

skjortan23 avatar Jan 21 '20 15:01 skjortan23

I have no experience with AWS Lambda, so thank you for pointing that out.

Most of the dependencies are very small, but SpeechRecognition takes up ~32 MB. I'll think about making some dependencies optional to reduce the package size.

jpweytjens avatar Jan 22 '20 08:01 jpweytjens

@jpweytjens Yes. To work around this, I made a fork yesterday, commented out the dependency on SpeechRecognition, and successfully deployed to Lambda.

So making SpeechRecognition an optional dependency would be great.
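One common way to make a heavy dependency optional is to import it lazily, only when the feature that needs it is used. The sketch below illustrates that general pattern and is not how textract is actually implemented; `load_optional` is a hypothetical helper name.

```python
import importlib


def load_optional(module_name, feature):
    """Import module_name on demand, with a clear error if it is missing.

    Hypothetical helper for illustration only, not part of textract.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{feature} requires the optional package '{module_name}', "
            "which is not installed."
        ) from exc
```

A parser for audio files could then call `load_optional("speech_recognition", "Audio transcription")` inside the function that does the transcription (`speech_recognition` is the import name of the SpeechRecognition package), so a Lambda deployment that never touches audio would not need the package at all. On the packaging side, setuptools' `extras_require` is the usual place to declare such a dependency, though whether textract would adopt it is an open question.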

skjortan23 avatar Jan 22 '20 08:01 skjortan23