fmtm
fmtm copied to clipboard
Redesign project/task/history schemas and relationships to better align with event-driven programming
Problem
- This is a suggestion made for Tasking Manager, but also applies here.
- Credit to original post by
qeef
a community volunteer:- DB design: https://qeef.srht.site/dd-hot-tm/database-tasks-history.html
- API design: https://qeef.srht.site/dd-hot-tm/api-endpoints.html
- Code: https://git.sr.ht/~qeef/dd-hot-tm/tree
The key unit of change in tasking managers is the task status / events.
Currently we have:
- Duplicated info between
tasks
andtasks_history
. - An inefficient design for the flow of task events.
- There is a logical flow between the task statuses, so we can make use of this and make the code more event driven.
DB Redesign
Related to other issues in this milestone: https://github.com/hotosm/fmtm/milestone/39
task_events table
- Rename task_history to task_events.
- Update the schema to be something like:
CREATE TABLE public.task_events (
event_id UUID,
project_id integer,
task_id integer,
user_id integer,
state TaskState,
comment text,
created_at timestamp without time zone DEFAULT timezone('utc'::text, now()),
PRIMARY KEY (event_id),
FOREIGN KEY (project_id) REFERENCES public.projects (project_id),
FOREIGN KEY (project_id, task_id) REFERENCES public.tasks (project_id, task_id)
FOREIGN KEY (user_id) REFERENCES public.users (user_id),
);
-- See details on TaskState enum below
Note I made the event_id a UUID so it is a universal ID, across multiple databases / environments, and is more compatible with other database systems. It should probably be generated in the Python code before creating the DB object?
Note 2: adding the project_id may seem redundant, as task_id is unique, but it allows for more effective indexing. Tasks are divided up by project.
Better indexing
-- Create a better composite index for public.tasks
CREATE INDEX tasks_composite ON public.tasks USING btree (task_id, project_id);
-- Existing indexes from task_history
CREATE INDEX idx_task_event_composite ON public.task_events USING btree (task_id, project_id);
CREATE INDEX idx_task_event_project_id_user_id ON public.task_events USING btree (user_id, project_id);
CREATE INDEX idx_task_event_project_id ON public.task_events USING btree (project_id);
CREATE INDEX idx_task_event_user_id ON public.task_events USING btree (user_id);
-- Add additional index for event date
CREATE INDEX idx_task_event_created_at ON task_events USING btree (created_at);
Refactor enums
- Remove TaskStatus enum as described in https://github.com/hotosm/fmtm/issues/1604 and refactor all usage.
- Also remove TaskAction in it's current implementation.
- Add two enums:
Events via the API:
class EventType(str, Enum):
"""Events that can be used via the API to update a state
Specify the event type for ``POST`` to:
``/project/{pid}/event``
or for Entities endpoints too.
Possible values are:
- ``map`` -- Set to *locked for mapping*, i.e. mapping in progress.
- ``finish`` -- Set to *unlocked to validate*, i.e. is mapped.
- ``validate`` -- Request recent task ready to be validate.
- ``good`` -- Set the state to *unlocked done*.
- ``bad`` -- Set the state *unlocked to map* again, to be mapped once again.
- ``split`` -- Set the state *unlocked done* then generate additional subdivided task areas.
- ``assign`` -- For a requester user to assign a task to another user. Set the state *locked for mapping* passing in the required user id.
- ``comment`` -- Keep the state the same, but simply add a comment.
Note that ``task_id`` must be specified in the endpoint too.
"""
MAP = "map"
FINISH = "finish"
VALIDATE= "validate"
GOOD = "good"
BAD = "bad"
SPLIT = "split"
ASSIGN= "assign"
COMMENT = "comment"
and the available states of an object in the db:
class State(int, Enum):
"""The state of an object (task or entity).
The state can be:
- ``unlocked to map``
- ``locked for mapping``
- ``unlocked to validate``
- ``locked for validation``
- ``unlocked done``
"""
UNLOCKED_TO_MAP = 0
LOCKED_FOR_MAPPING = 1
UNLOCKED_TO_VALIDATE = 2
LOCKED_FOR_VALIDATION = 3
UNLOCKED_DONE = 4
-
task_events.state
should use the State enum + we could probably also reuse this for the Entitiesstatus
field.
Updating state via the API
Models
- Pydantic model for POST update to state:
class NewEvent(BaseModel):
"""DTO for ``POST`` request to ``/project/{pid}/events`` endpoint.
task_id must be set, as this always relates to a task.
The EventType enum provides available options.
"""
event: EventType
task_id: conint(ge=1)
requested_user_id: Optional[conint(ge=1)]
- Replace
tasks.task_status: TaskStatus
withtasks.state: State
. - Refactor
TaskHistory
model toTaskState
:
class TaskState(BaseModel):
"""DTO for ``GET`` request to ``/project/{pid}/task/history``.
And also for ``/project/{pid}/task/{tid}/history``.
"""
task_id: conint(ge=1)
user_id: conint(ge=1)
profile_img: str
state: State
comment: str
date: datetime
- The
Task
model that is nested as a list of tasks under theProject
model could be refactored too.- Remove
locked_by
etc fields, removetask_status
. - Provide a key
history_recent
, which includes the most recent 10 TaskState objects. - We can gather the rest of the info we need from this (i.e. display history in sidebar with comments, show who mapped / locked, etc)
- Remove
Endpoints
Note if using a local-first paradigm, this logic could be implemented on the frontend instead.
Endpoint to update task state via events:
# modify this to use case/when!
@app.post("/project/{project_id}/event")
async def new_event(project_id: conint(ge=1), user_id=Depends(login_required), detail: NewEvent):
if detail.event == EventType.MAP:
updated = await _db_api().lock_for_mapping(project_id, user_id, detail.task_id)
elif detail.event == EventType.FINISH:
updated = await _db_api().finish(project_id, user_id, detail.task_id)
elif detail.event == EventType.VALIDATE:
updated = await _db_api().lock_for_validation(project_id, user_id, detail.task_id)
elif detail.event == EventType.GOOD:
updated = await _db_api().good(project_id, user_id, detail.task_id)
elif detail.event == EventType.BAD:
updated = await _db_api().bad(project_id, user_id, detail.task_id)
elif detail.event == EventType.SPLIT:
updated = await _db_api().split(project_id, user_id, detail.task_id)
elif detail.event == EventType.ASSIGN:
updated = await _db_api().assign(project_id, detail.requested_user_id, detail.task_id)
elif detail.event == EventType.COMMENT:
updated = await _db_api().comment(project_id, user_id, detail.task_id)
if not updated:
raise fastapi.HTTPException(
status_code=400,
detail=f"Failed to update task ({detail.task_id}) with event ({detail.event})")
# 201 response as event is created
return HTTPResponse(status=201)
Database logic
- The _db_api() methods above should be as such:
async def finish(pid, uid, tid) -> None:
r = await db.sq(
"""
WITH last AS (
SELECT *
FROM public.task_events
WHERE project_id=$1 AND task_id=$2
ORDER BY event_id DESC
LIMIT 1),
locked AS (
SELECT *
FROM last
WHERE user_id=$3 AND state='LOCKED_FOR_MAPPING'
)
INSERT INTO public.task_events(project_id, task_id, user_id, event, comment)
SELECT $1, $2, $3, 'UNLOCKED_TO_VALIDATE', 'Note: Mapping finished',
FROM last
WHERE user_id=$3
RETURNING project_id, task_id, user_id
""",
project_id,
task_id,
user_id)
assert 1 == len(r)
assert r[0]["project_id"] == pid
assert r[0]["task_id"] == tid
assert r[0]["user_id"] == uid
return None
- For an
ASSIGN
event the task state should move tolocked for mapping
, with the user_id set to the requested user id. The user should also be notified as part of the logic.
In this flow, the task_events are a centralised data store where each event is stored as a sequence of immutable records, also called a log. Each new event is appended to the end of the list. This is the single source of truth for the task state, which can be accessed from multiple places.