Dummy inference engine #325
Implemented DummyInferenceEngine to simulate inference without loading a model or running any real computation. The engine supports:
- Static output mode, returning predefined values.
- Random output mode, generating random outputs of a specified shape.
- Customizable latency, using `asyncio.sleep` to simulate inference time with a configurable mean and standard deviation.
- Fully asynchronous operation, so the engine never blocks the event loop.
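For illustration, a minimal sketch of what such an engine could look like. The constructor parameters and the `infer` method name are assumptions for this example, not necessarily the PR's actual API:

```python
import asyncio
import random
import numpy as np

class DummyInferenceEngine:
    """Simulates inference without loading a model (illustrative sketch)."""

    def __init__(self, output_mode="static", static_value=None,
                 output_shape=(1, 10), latency_mean=0.1, latency_std=0.02):
        self.output_mode = output_mode    # "static" or "random"
        self.static_value = static_value  # returned verbatim in static mode
        self.output_shape = output_shape  # shape of random outputs
        self.latency_mean = latency_mean  # mean simulated latency (seconds)
        self.latency_std = latency_std    # std dev of simulated latency

    async def infer(self, prompt):
        # Sample a latency and clamp at zero so a large std dev
        # never produces a negative sleep duration.
        latency = max(0.0, random.gauss(self.latency_mean, self.latency_std))
        await asyncio.sleep(latency)
        if self.output_mode == "static":
            return self.static_value
        return np.random.randn(*self.output_shape)
```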
Testing:
Added unit tests to verify functionality, including:
- Static output mode.
- Random output mode, validating output shape and latency.
- Latency checks to ensure the measured latency falls within the expected range.
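A rough idea of what such tests could look like, assuming the hypothetical engine sketch above (the parameter names and bounds here are illustrative):

```python
import asyncio
import time

# Assumes the DummyInferenceEngine sketch above is importable.

def test_static_output_mode():
    engine = DummyInferenceEngine(output_mode="static", static_value="ok",
                                  latency_mean=0.0, latency_std=0.0)
    assert asyncio.run(engine.infer("hi")) == "ok"

def test_random_output_shape_and_latency():
    engine = DummyInferenceEngine(output_mode="random", output_shape=(2, 4),
                                  latency_mean=0.05, latency_std=0.0)
    start = time.monotonic()
    out = asyncio.run(engine.infer("hi"))
    elapsed = time.monotonic() - start
    assert out.shape == (2, 4)
    # With std=0 the sleep equals the mean; allow slack for timer
    # granularity and event-loop scheduling overhead.
    assert 0.04 <= elapsed < 0.5
```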
Looks good!
Can you add this as an option to the cli too? --inference-engine dummy
Updated the code and added the CLI option.
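With argparse, the wiring could look roughly like this; the list of engine choices is illustrative, not the project's actual set:

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical choices; the real CLI may expose different engine names.
parser.add_argument("--inference-engine", type=str, default=None,
                    choices=["mlx", "tinygrad", "dummy"],
                    help='Inference engine to use ("dummy" skips model loading)')
args = parser.parse_args()

if args.inference_engine == "dummy":
    engine = DummyInferenceEngine()
```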
@AlexCheema clarification is needed: does the dummy inference only need to run locally, without involving any other nodes in the network? In a full-fledged sharded version, the requirement could be for the dummy inference engine to shard the pseudo-work among nodes that communicate with each other, with all of them running the dummy inference engine.
Both need to be possible. This is outside the scope of the InferenceEngine: the networking and orchestration between nodes happen outside the InferenceEngine.
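To illustrate that separation of concerns (all names here are hypothetical): a node owns an engine and handles peer communication itself, so any engine, including the dummy one, can take part in sharded runs unchanged:

```python
class Node:
    """Owns an InferenceEngine; networking lives here, not in the engine."""

    def __init__(self, engine, peers):
        self.engine = engine  # e.g. DummyInferenceEngine or a real engine
        self.peers = peers    # hypothetical handles to downstream nodes

    async def process_shard(self, prompt):
        # Run the local shard of (pseudo-)work, then forward the result
        # to the next node if one exists.
        result = await self.engine.infer(prompt)
        if self.peers:
            await self.peers[0].send(result)  # hypothetical networking call
        return result
```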
ready for review
Dummy implemented in a separate PR. Please email [email protected] with your Ethereum address to claim the $100 bounty.