
is there a way to prevent adding a background task twice ?

Open thomasleveil opened this issue 4 years ago • 17 comments

I had a quick look at the code and it seems apidaora does not offer any means of preventing a task with the same parameters from being queued twice.

Let's say POST /hello from the example takes a minute to finish.

I'm looking for a way to have a first call to curl -X POST -i localhost:8000/hello?name=Me accept the task, but a second call to curl -X POST -i localhost:8000/hello?name=Me refuse the task if the first one is still queued or running.

  • does apidaora provide such a feature?
  • how could I hack apidaora to implement such a feature?

thomasleveil avatar Apr 09 '20 10:04 thomasleveil

Hello @thomasleveil !

If I understood your question right, apidaora does not have a feature to prevent the same request from being called more than once.

But I think it is possible to implement. Do you know of other frameworks that implement this feature?

dutradda avatar Apr 09 '20 18:04 dutradda

@thomasleveil I think we can do it using the middlewares module. I will try it and get back to you.

dutradda avatar Apr 10 '20 17:04 dutradda
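As a rough illustration of the middleware idea (plain Python, not the actual apidaora middlewares API — the class and exception names here are my own): keep a set of in-flight call keys and refuse any call whose key is already present.

```python
# Hypothetical sketch of in-flight deduplication, independent of apidaora.
import threading


class DuplicateCallError(Exception):
    """Raised when an identical call is already in flight."""


class InFlightGuard:
    def __init__(self):
        self._running = set()          # keys of calls currently executing
        self._lock = threading.Lock()  # protects the set across threads

    def run(self, func, *args, **kwargs):
        # Key a call by function name plus its arguments.
        key = (func.__name__, args, tuple(sorted(kwargs.items())))
        with self._lock:
            if key in self._running:
                raise DuplicateCallError(f'already running: {key}')
            self._running.add(key)
        try:
            return func(*args, **kwargs)
        finally:
            with self._lock:
                self._running.discard(key)
```

A real middleware would map the rejection to an HTTP error response instead of letting the exception propagate.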

Hi @dutradda

It is quite difficult to find such a feature in this realm. Here are a few examples:

My use case is the following:

I want to build a monitoring application which will check every 10 minutes whether websites are responding (around 1000 websites). Currently we use Cabot for this purpose. Our issue is that sometimes the worker fails and tasks continue to fill up the queue. When we restart the worker we have to make sure to purge the queue, or else we will spam the websites. Having a task system which ensures a check for a given website cannot be queued twice would be nice.
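The deduplication this use case needs could look like the following sketch (my own illustration, not Cabot or apidaora code): take a short-lived lock keyed by the website before enqueueing, so a pending check blocks re-queueing only until the lock expires. With redis-py the lock would be redis.set(key, '1', nx=True, ex=ttl); an in-memory stand-in keeps the sketch self-contained.

```python
# Sketch: a lock store with SET-NX-EX semantics guarding a task queue.
import time


class LockStore:
    def __init__(self):
        self._locks = {}  # key -> expiry timestamp

    def acquire(self, key, ttl):
        """Take the lock unless a non-expired one already exists."""
        now = time.monotonic()
        expiry = self._locks.get(key)
        if expiry is not None and expiry > now:
            return False  # an identical check is already queued/running
        self._locks[key] = now + ttl
        return True


def enqueue_check(store, queue, url, ttl=600):
    """Queue a website check unless an identical one is pending."""
    if not store.acquire(f'check:{url}', ttl):
        return False
    queue.append(url)
    return True
```

The TTL also addresses the stuck-worker scenario: if the worker dies, the locks expire on their own and checks can be queued again without purging anything.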


Also, I really like how you implemented this background task system, and I wonder if it could be extracted into its own Python module so that it could be used with any ASGI server. Currently my favorite ASGI framework is FastAPI, but it lacks such a solid background task system (especially one backed by Redis).

thomasleveil avatar Apr 10 '20 19:04 thomasleveil

@thomasleveil I have implemented a simple middleware that prevents the same request from being called again until the body has been processed.

Here is an example how to use it: https://dutradda.github.io/apidaora/using-prevent-request-middleware/

The code is here: https://github.com/dutradda/apidaora/blob/master/apidaora/middlewares.py#L117

dutradda avatar Apr 11 '20 21:04 dutradda

Also, I really like how you implemented this background task system, and I wonder if it could be extracted into its own Python module so that it could be used with any ASGI server.

This feature is coupled with the apidaora code. I don't know how difficult it would be to split it out. I will analyse it.

I used the background task feature on the chunli app: https://dutradda.github.io/chunli/

dutradda avatar Apr 11 '20 21:04 dutradda

@thomasleveil I can implement the blocking request on the background tasks feature, then. That is what you want, right?

dutradda avatar Apr 11 '20 21:04 dutradda

I can implement the blocking request on the background tasks feature, then.

Yes, I want the endpoint to refuse to queue a second task with the same parameters if the first task has not been processed yet (the task, not the creation request).

thomasleveil avatar Apr 11 '20 23:04 thomasleveil

hello @thomasleveil

I have implemented the feature you request! 🎉

Take a look on documentation and tell me what you think: https://dutradda.github.io/apidaora/background-task-controller/using-background-task-controller-lock-args-redis/

dutradda avatar Apr 12 '20 16:04 dutradda

Thank you for considering this feature :)

I gave it a try, and I'm confused by task_id being the same when running two tasks with the same parameters. For both tasks, the task_key should indeed be the same in order to perform the deduplication; but task_id should be unique per task.

You can clone this gist to run my tests
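For reference, here is a minimal sketch of the distinction being discussed (the names make_task_key and make_task_id are my own, not apidaora's API): the key is a deterministic hash of the arguments and drives the lock, while the id is a fresh UUID for every accepted task.

```python
# Sketch: deterministic dedup key vs. unique per-task id.
import hashlib
import json
import uuid


def make_task_key(controller_name, kwargs):
    """Same controller + same arguments -> same key, always."""
    signature = hashlib.sha256(
        json.dumps(kwargs, sort_keys=True).encode()
    ).hexdigest()
    return f'{controller_name}:{signature}'


def make_task_id():
    """A fresh identifier for each accepted task."""
    return str(uuid.uuid4())
```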

thomasleveil avatar Apr 13 '20 01:04 thomasleveil

For both tasks, the task_key should indeed be the same in order to perform the deduplication; but task_id should be unique per task.

You are right @thomasleveil, I will change it to be unique too! Thanks for your feedback!

dutradda avatar Apr 13 '20 14:04 dutradda

I'm also uneasy with the HTTP 400 status code returned when creating a task for the second time. 4xx statuses are for client errors, but there is nothing the client did wrong.

Searching the web for possible standards, I was unable to find anything close to a standard status code for async tasks, but I stumbled upon several sources which suggest using the HTTP 303 See Other status code.

What I like about the HTTP 303 response is that it must include a Location header whose value is the URL of the already existing task. Apidaora could provide the URL for /hello-single?task_id=<id of 1st task>

On a side note, some say it is more correct (with regard to REST) to respond with HTTP 202 Accepted rather than HTTP 201 Created for endpoints creating an async task.
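The 202/303 scheme described above could be sketched as follows (the endpoint path and dict shapes are illustrative, not apidaora's implementation): the first POST is accepted with 202, and a duplicate gets a 303 whose Location header points at the existing task.

```python
# Sketch: task-creation handler returning 202 on accept, 303 on duplicate.
def create_task_response(pending_tasks, task_key, new_task_id):
    """pending_tasks maps task_key -> task_id of the in-flight task."""
    existing = pending_tasks.get(task_key)
    if existing is not None:
        # Duplicate: redirect the client to the already existing task.
        return 303, {'Location': f'/hello-single?task_id={existing}'}
    pending_tasks[task_key] = new_task_id
    return 202, {}
```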

thomasleveil avatar Apr 13 '20 15:04 thomasleveil

@thomasleveil I have implemented the unique task-id for lock args: https://dutradda.github.io/apidaora/background-task-controller/lock-args-redis/

I will implement the 202 status code; it is more correct than 201.

About the 303, I will think about it and search for other status codes too.

dutradda avatar Apr 14 '20 07:04 dutradda

Here are a few links regarding asynchronous REST:

  • https://www.adayinthelifeof.nl/2011/06/02/asynchronous-operations-in-rest/
  • https://farazdagi.com/2014/rest-and-long-running-jobs/
  • https://stackoverflow.com/questions/36421707/http-status-code-for-async-tasks/36422103

thomasleveil avatar Apr 14 '20 08:04 thomasleveil

I don't know if it is intentional, but the 200 response when getting the status of a finished task does not include the args_signature property.

thomasleveil avatar Apr 14 '20 09:04 thomasleveil

@thomasleveil I added the 202 and 303 status codes.

I fixed the finished task response too, adding the args_signature to it.

https://dutradda.github.io/apidaora/background-task-controller/lock-args-redis/

dutradda avatar Apr 15 '20 00:04 dutradda

Excellent :tada:

I took the time to test it and found no issue with the locking behavior. Thank you again for your quick reaction to this issue :+1:

thomasleveil avatar Apr 15 '20 22:04 thomasleveil

Actually, I found an issue when adding two tasks from different modules whose associated functions have the same name:

# module1.py
import time

from apidaora import route


@route.background('/hello-single-1', lock_args=True)
def hello_task(name: str) -> str:
    time.sleep(1)
    return f'Hello {name} from module 1!'

# module2.py
import time

from apidaora import route


@route.background('/hello-single-2', lock_args=True)
def hello_task(name: str) -> str:
    time.sleep(1)
    return f'Hello {name} from module 2!'

# myapp.py
from apidaora import appdaora
import module1
import module2

app = appdaora([
    module1.hello_task,
    module2.hello_task
])

uvicorn myapp:app &
curl -X POST -i localhost:8000/hello-single-1?name=Me  # 202
curl -X POST -i localhost:8000/hello-single-2?name=Me  # 303

The fix might be as easy as replacing controller.__name__ with controller.__qualname__.

Or maybe use the request path instead of the controller function name to build the task key?
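That second suggestion could be sketched like this (my own illustration, not apidaora's actual key-building code): derive the lock key from the route path plus the sorted arguments, so identically named controllers on different paths can never collide.

```python
# Sketch: build the dedup key from the route path, not the function name.
def make_lock_key(path, kwargs):
    """Two routes with same-named controllers get distinct keys."""
    args_part = '&'.join(f'{k}={v}' for k, v in sorted(kwargs.items()))
    return f'{path}?{args_part}'
```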

You can clone this gist to run my tests

thomasleveil avatar Apr 15 '20 23:04 thomasleveil