Add Axon Serving API
WIP: Need to add tests
From our original implementation I added a few features, and there are still a few things I think we need to determine before merging:
- `preprocess`/`postprocess` - I found when trying to deploy a pipeline that it makes sense to have something like this. For example, in an image classification task you probably want the featurizing and processing logic to be handled in batches rather than having to worry about doing it yourself. Both `preprocess` and `postprocess` take an input and `state` (discussed later) and return tensors. The issue right now is that if we focus on pre-processing raw inputs, it's not possible to determine the batch size unless we allow `predict` to have a signature which accepts tensors, a raw input, or a list of raw inputs.
- `state` - Again, imagine the image classification app. You have a featurizer which is loaded on startup; it makes sense to include this as part of the serving state, so you can use it in `preprocess` and `postprocess`. The state is arbitrary, so it can also be something like a tokenizer or anything else.
- We still do not correctly handle container outputs. I think we can safely handle map outputs by just padding each entry; I don't think it's possible to have an unbatched output.
- We still do not correctly handle unbatched inputs. I handled optional inputs by discarding those not specified in `:shape`, but for inputs like `head_mask` which are not batched, the API will raise if their leading dim is not the same as `batch_size`.
- We also still have the question of starting a task to pad/split batches and do inference. I guess the only benefit there is that we do not block the queue from filling up? Not sure I have enough info to make a decision ATM.
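To make the `preprocess`/`state`/padding flow concrete, here is a rough mock in Python (just for illustration; the real thing is Elixir/Nx, and all names here, including `pad_batch` and the dict-shaped `state`, are hypothetical rather than proposed API):

```python
import numpy as np

def preprocess(state, raw_inputs):
    # state carries resources loaded at startup (e.g. a featurizer/tokenizer)
    featurize = state["featurizer"]
    return np.stack([featurize(x) for x in raw_inputs])

def pad_batch(tensor, batch_size):
    # pad the leading (batch) dim up to the fixed serving batch size,
    # remembering the real row count so padding can be stripped later
    n = tensor.shape[0]
    if n >= batch_size:
        return tensor, n
    pad_width = [(0, batch_size - n)] + [(0, 0)] * (tensor.ndim - 1)
    return np.pad(tensor, pad_width), n

def postprocess(state, outputs, real_n):
    # drop padded rows; applied per-entry for "container" (map) outputs
    return {k: v[:real_n] for k, v in outputs.items()}

# usage with a dummy featurizer and a stand-in for predict
state = {"featurizer": lambda x: np.asarray(x, dtype=np.float32)}
batch = preprocess(state, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
padded, real_n = pad_batch(batch, batch_size=4)
outputs = {"logits": padded * 2.0}  # pretend model output (a map)
result = postprocess(state, outputs, real_n)
```

The sketch also shows why the batch-size question above is real: `preprocess` only learns the batch size from the length of the raw-input list it receives.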
Otherwise, I will add tests tomorrow, and once we address these concerns I think this is good to merge :)