Add Axon Serving API
WIP: Need to add tests
From our original implementation I added a few features, and there are still a few things I think we need to determine before merging:
- `preprocess`/`postprocess` - I found when trying to deploy a pipeline that it makes sense to have something like this. For example, in an image classification task you probably want the featurizing and config-processing logic to be handled in batches rather than having to worry about doing it yourself. Both `preprocess` and `postprocess` take an input and `state` (discussed later) and return tensors. The issue right now is that if we focus on pre-processing raw inputs, it's not possible to determine the batch size unless we allow `predict` to have a signature which accepts tensors, a single raw input, or a list of raw inputs.
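  To illustrate the batch-size ambiguity, here is a minimal Python sketch (not the Axon API; the `preprocess`/`postprocess` functions, the featurizer, and the label list are all hypothetical): the batch size only becomes known after the raw inputs have been featurized and stacked.

  ```python
  # Hypothetical sketch (not the Axon API): preprocess maps a list of raw
  # inputs to one batched result, so batch size is only known afterwards.

  def preprocess(raw_inputs, state):
      # Featurize every raw input with a featurizer held in state,
      # then stack the results into a single batch.
      featurize = state["featurizer"]
      return [featurize(x) for x in raw_inputs]

  def postprocess(outputs, state):
      # Map each batched output row back to a label.
      labels = state["labels"]
      return [labels[row.index(max(row))] for row in outputs]

  # Toy state: a "featurizer" that scales pixel values to [0, 1].
  state = {
      "featurizer": lambda img: [p / 255 for p in img],
      "labels": ["cat", "dog"],
  }

  batch = preprocess([[0, 255], [128, 64]], state)
  batch_size = len(batch)  # only determinable after preprocessing raw inputs
  ```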
- `state` - Again, imagine the image classification app: you have a featurizer which is loaded on startup. It makes sense to include this as part of the serving state, so you can use it in `preprocess` and `postprocess`. The state is arbitrary, so it can also be something like a tokenizer.
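  A sketch of that idea in Python (again hypothetical, not the Axon API): state is built once at startup and then threaded through every call, here carrying a toy tokenizer instead of a featurizer.

  ```python
  # Hypothetical sketch: state is arbitrary, built once at startup, and
  # threaded through preprocess/postprocess on every request.

  def init_state():
      # Toy "tokenizer" loaded once at startup.
      vocab = {"hello": 0, "world": 1}
      return {"tokenizer": lambda text: [vocab[w] for w in text.split()]}

  def preprocess(texts, state):
      # The tokenizer loaded at startup is reused for every batch.
      tokenize = state["tokenizer"]
      return [tokenize(t) for t in texts]

  state = init_state()
  tokens = preprocess(["hello world", "world"], state)
  ```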
- We still do not correctly handle container outputs. I think we can safely handle map outputs by just padding each entry; I don't think it's possible to have an unbatched output.
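  What "padding each entry of a map output" could look like, as a hedged Python sketch (the function name and representation are hypothetical; each entry is assumed to be batched along its leading dim):

  ```python
  # Hypothetical sketch: pad every entry of a map (dict) output
  # independently up to batch_size along its leading dim.

  def pad_map_output(outputs, batch_size, pad_value=0):
      padded = {}
      for name, rows in outputs.items():
          missing = batch_size - len(rows)
          width = len(rows[0])
          padded[name] = rows + [[pad_value] * width for _ in range(missing)]
      return padded

  out = pad_map_output({"logits": [[1, 2]], "hidden": [[3, 4]]}, batch_size=3)
  ```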
- We still do not correctly handle unbatched inputs. I handled optional inputs by discarding those not specified in `:shape`, but for inputs like `head_mask` which are not batched, the API will have a fit and raise if their leading dim is not the same as `batch_size`.
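  The failure mode can be sketched as a leading-dim check in Python (hypothetical, not the actual validation code): every input is assumed to carry `batch_size` as its leading dim, so an unbatched input such as `head_mask` (whose leading dim is something else entirely) trips the check.

  ```python
  # Hypothetical sketch of the described failure mode: every input shape
  # is assumed to have batch_size as its leading dim, so an unbatched
  # input like head_mask raises.

  def validate_leading_dims(input_shapes, batch_size):
      for name, shape in input_shapes.items():
          if shape[0] != batch_size:
              raise ValueError(
                  f"{name}: leading dim {shape[0]} != batch_size {batch_size}"
              )

  validate_leading_dims({"input_ids": (8, 128)}, batch_size=8)   # passes
  # validate_leading_dims({"head_mask": (12, 64)}, batch_size=8) # raises
  ```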
- We also still have the question of starting a task to pad/split batches and do inference. I guess the only benefit there is that we do not block the queue from filling up? Not sure I have enough info to make a decision ATM.
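  For reference, the pad/split step such a task would run can be sketched in a few lines of Python (hypothetical helpers, with the task/queue machinery itself omitted): queued requests are split into fixed-size batches, and a final short batch is padded up to `batch_size`.

  ```python
  # Hypothetical sketch of the pad/split step a background task would run.

  def split_batches(items, batch_size):
      # Chop the queued items into consecutive fixed-size batches.
      return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

  def pad_batch(batch, batch_size, pad_item=None):
      # Pad a short final batch up to batch_size with filler items.
      return batch + [pad_item] * (batch_size - len(batch))

  batches = split_batches([1, 2, 3, 4, 5], batch_size=2)
  last = pad_batch(batches[-1], batch_size=2)
  ```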
Otherwise, I will add tests tomorrow, and once we address these concerns I think this is good to merge :)