Lee Prevost comments


Write to more than one influx database

It seems like the second use case could be handled within Influx continuous queries (ver. 1.8) or tasks (>2.0). Also, Influx ver. 2 apparently supports a store/forward/sync service with...

make offsite middleware downloader middleware instead of spider middleware

Did this go anywhere? I’m struggling with an issue where my spider crawls links from a link extractor that denies PDFs. But some links are redirected to a PDF on...
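
A minimal sketch of one way to catch the redirected-PDF case described above, assuming the check runs on the final response instead of on the extracted link. The helper name `looks_like_pdf` and the constant `PDF_MAGIC` are hypothetical, not part of Scrapy. It only inspects a Content-Type value and the first bytes of a body.

```python
from typing import Optional

# Every PDF file starts with this signature.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(content_type: Optional[str], body_prefix: bytes = b"") -> bool:
    """Return True if the header or the first bytes indicate a PDF."""
    if content_type:
        # Drop parameters such as "; charset=binary" and normalise case.
        media_type = content_type.split(";", 1)[0].strip().lower()
        if media_type == "application/pdf":
            return True
    # Fall back to the file signature when the header is missing or wrong.
    return body_prefix.startswith(PDF_MAGIC)
```

In a Scrapy downloader middleware, a check like this could be called from `process_response` with the response's Content-Type header and the first few bytes of `response.body`, raising `scrapy.exceptions.IgnoreRequest` when it returns True. That wiring is an assumption about how one might use it, not something confirmed in the thread.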

Guide/readme/example for using with AWS Glue ETL job

Thanks for the response. I think I am good on your bullets 1 and 3 within my scripts (yes, using pyspark). But, on item 2, I'm struggling with the following:...

Guide/readme/example for using with AWS Glue ETL job

Further, from the AWS [docs](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-glue-arguments.html): `--extra-jars`: The Amazon S3 paths to additional Java .jar files that AWS Glue adds to the Java classpath before executing your script. Multiple values must be...

Guide/readme/example for using with AWS Glue ETL job

So, reducing my question even further, I think all I need is to get some of these (which ones?) onto my S3 and add the extra jars path. Again, don't have...

Guide/readme/example for using with AWS Glue ETL job

Thank you. I was making it much harder than necessary. Am testing now and will report back.

Guide/readme/example for using with AWS Glue ETL job

Ok, I added these two parameters to my job definition:

```
'--extra-jars': "s3://aws-glue-assets-[my account num]-us-east-1/jars/",  # path to the splittablegzip-1.3.jar file
'--user-jars-first': "true",
```

I then added this to my...
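
For illustration, the two job parameters above could be assembled in Python as below. This is a sketch only: the helper name `glue_jar_arguments` and the bucket `example-bucket` are hypothetical, and it assumes `--extra-jars` is given the full S3 path of a single jar file, which may differ from the directory path used in the comment.

```python
from typing import Dict


def glue_jar_arguments(jar_path: str, user_jars_first: bool = True) -> Dict[str, str]:
    """Build the Glue job arguments that put one extra jar on the classpath."""
    if not jar_path.startswith("s3://"):
        raise ValueError("jar_path must be an S3 URI, e.g. s3://bucket/key.jar")
    return {
        # S3 path of the jar to add to the Java classpath.
        "--extra-jars": jar_path,
        # Glue job arguments are strings, so the flag is "true"/"false".
        "--user-jars-first": "true" if user_jars_first else "false",
    }


# Hypothetical bucket and key, for illustration only.
args = glue_jar_arguments("s3://example-bucket/jars/splittablegzip-1.3.jar")
```

A dictionary like `args` could then be merged into the `DefaultArguments` of a job definition, for example when calling boto3's `glue.create_job`.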

Guide/readme/example for using with AWS Glue ETL job

OK, am reporting back that I commented out the changes above and the script is running fine, but with everything loaded on one executor, no parallelization, and slow! So, something about...

Guide/readme/example for using with AWS Glue ETL job

Thinking about this some more: Am wondering if the last post on this thread on the Spark JIRA is the answer: https://issues.apache.org/jira/browse/SPARK-29102 or, AWS Glue has a capability to install...

Guide/readme/example for using with AWS Glue ETL job

This looks promising: [SO question](https://stackoverflow.com/a/66987359/7397195). Again, I see I need extra jars with a pointer to the jar file on S3. No problem there. But in the config statement, I can...
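
As a hedged sketch of the reader-side configuration being discussed, the splittablegzip project registers its codec through the Hadoop option `io.compression.codecs`, using the class `nl.basjes.hadoop.io.compress.SplittableGzipCodec`. The helper names below (`splittable_gzip_options`, `apply_options`) are hypothetical, and the sketch assumes the jar is already on the classpath. Whether this resolves the Glue problem in this thread is not confirmed here.

```python
from typing import Any, Dict

# Codec class shipped in the splittablegzip jar.
SPLITTABLE_GZIP_CODEC = "nl.basjes.hadoop.io.compress.SplittableGzipCodec"


def splittable_gzip_options() -> Dict[str, str]:
    """Reader options that ask Hadoop to use the splittable gzip codec."""
    return {"io.compression.codecs": SPLITTABLE_GZIP_CODEC}


def apply_options(reader: Any, options: Dict[str, str]) -> Any:
    """Apply each option to any reader exposing a chainable .option(key, value)."""
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader
```

With PySpark this could be used as `apply_options(spark.read, splittable_gzip_options()).csv(path)`, where `spark` is an existing `SparkSession` and `path` points at the gzipped input.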