Reid Hewitt


harvest.py doesn't implement the arg parsing function. that'll happen in [#4715](https://github.com/GSA/data.gov/issues/4715). i reworded this ticket.

sqlalchemy job store for apscheduler [tables](https://github.com/agronholm/apscheduler/blob/d2204627917d8729a3bd0512d779578beafc43de/src/apscheduler/datastores/sqlalchemy.py#L278-L330). keeping our existing `job` table alongside it seems redundant. looks like custom metadata for jobs and tasks will be supported in [v4.0](https://github.com/agronholm/apscheduler/issues/508#:~:text=APScheduler%204.0%20will%20support%20metadata%20for%20schedules%20and%20jobs%20(which%20are%20a%20separate%20concept%20there).). [v4.0 progress tracker](https://github.com/agronholm/apscheduler/issues/465)...

maybe we can store our job results in the return value [here](https://github.com/agronholm/apscheduler/blob/d2204627917d8729a3bd0512d779578beafc43de/src/apscheduler/datastores/sqlalchemy.py#L329)?
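for reference, we can already capture a job's return value today, either by wrapping the job callable or via a listener on `EVENT_JOB_EXECUTED` (the event object carries `retval` in 3.x). a minimal stdlib sketch of the idea, with a plain dict standing in for our job table and `run_and_record`/`record_result` as hypothetical helper names:

```python
# sketch: persist each job's return value keyed by job id.
# a dict stands in for the harvest db `job` table; in apscheduler 3.x
# the same pattern works via a listener on EVENT_JOB_EXECUTED, whose
# event carries the job id and the job's return value (`retval`).

job_results = {}

def record_result(job_id, retval):
    # in a real listener this would be an UPDATE against our job table
    job_results[job_id] = retval

def run_and_record(job_id, func, *args):
    retval = func(*args)  # the value apscheduler would hand back per-job
    record_result(job_id, retval)
    return retval

def harvest(source):
    # hypothetical harvest job returning a summary of what it did
    return {"source": source, "records": 42}

run_and_record("harvest-1", harvest, "data.json")
print(job_results["harvest-1"])  # {'source': 'data.json', 'records': 42}
```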

[running jobs immediately](https://apscheduler.readthedocs.io/en/3.x/userguide.html#:~:text=To%20run%20a%20job%20immediately%2C%20omit%20trigger%20argument%20when%20adding%20the%20job.)

```python
# a way to start a job now
scheduler_object.get_job(job_id="my_job_id").modify(next_run_time=datetime.datetime.now())
```

we're anticipating running the harvests as tasks in cloudfoundry. how does flask-apscheduler know when a job is complete if cf is doing the work? do we stream the cf task...
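one option: make the apscheduler job's body a polling loop against the CF tasks API, which reports each task's state until it reaches a terminal `SUCCEEDED`/`FAILED`. rough sketch below; `FakeCFHandler` and its `get_task_state` method are hypothetical stand-ins for whatever our CF client exposes:

```python
import time

class FakeCFHandler:
    """stand-in for a CF client (hypothetical interface); pretends the
    task finishes on the third poll."""
    def __init__(self):
        self._polls = 0

    def get_task_state(self, task_guid):
        self._polls += 1
        return "SUCCEEDED" if self._polls >= 3 else "RUNNING"

def wait_for_task(handler, task_guid, interval=0.01, timeout=5):
    # poll the CF tasks API until the task reaches a terminal state;
    # this loop is what the apscheduler job would block on
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = handler.get_task_state(task_guid)
        if state in ("SUCCEEDED", "FAILED"):
            return state
        time.sleep(interval)
    raise TimeoutError(f"task {task_guid} did not finish in {timeout}s")

print(wait_for_task(FakeCFHandler(), "abc-123"))  # SUCCEEDED
```

the apscheduler job only "completes" when this loop returns, so job state in the scheduler tracks task state in CF without streaming anything.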

here's a mockup of a flask-apscheduler job...

```python
cf_handler = CFHandler()  # cf interface
app = create_app()  # flask app
scheduler = create_scheduler(app)

@scheduler.task("interval", id="do_job_1", seconds=60)
def job1(source_data):
    app_guuid =...
```

implementing a [custom executor](https://apscheduler.readthedocs.io/en/master/extending.html#custom-job-executors)

job processing workflow in apscheduler
- after the scheduler has determined that a job needs to be run, the job is submitted to the thread/process pool ([source](https://github.com/agronholm/apscheduler/blob/677ce711a5381c3d44f4c6313098844d26b70666/apscheduler/executors/pool.py#L28))
- `run_job` runs...
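the pool executor linked above boils down to submit-plus-done-callback (the callback is where apscheduler calls `_run_job_success` / `_run_job_error`). a stdlib approximation of that flow, with `run_job`, `on_done`, and the job id made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

completed = []

def run_job(job_id, func):
    # stand-in for apscheduler's run_job: execute the callable and
    # return enough info for the callback to report on it
    return job_id, func()

def on_done(future):
    # done-callback: this is where apscheduler's pool executor would
    # call _run_job_success (or _run_job_error if the future raised)
    job_id, result = future.result()
    completed.append((job_id, result))

with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(run_job, "harvest-1", lambda: "ok")
    future.add_done_callback(on_done)
    # exiting the with-block waits for the job and its callback

print(completed)  # [('harvest-1', 'ok')]
```

a custom executor for us would keep this same shape but swap the local pool submit for a CF task launch.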

apscheduler has a [datastore](https://apscheduler.readthedocs.io/en/master/api.html#data-stores) to keep track of tasks, jobs, and schedules. would this not replace our job table in the harvest db?