TensorFI
TensorFI: Add state space based random sampling
In the "oneFaultPerRun" mode, the operator to inject a fault into was sampled from a uniform probability distribution over the operators.
However, it makes more sense to sample across the operator state space, as that is a closer model of fault occurrence.
Signed-off-by: Niranjhana Narayanan [email protected]
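For readers of this thread, the difference between the two sampling schemes can be sketched as follows. This is only an illustration, not the code in this pull request: it assumes that an operator's "state space" can be summarized as a single size (for example, the number of elements in its output tensor), and the function name and the `op_sizes` mapping are hypothetical.

```python
import random


def sample_op_by_state_space(op_sizes, rng=random):
    """Pick one operator name with probability proportional to its size.

    op_sizes: dict mapping operator name -> state-space size (hypothetical
    representation; here, an assumed count of output elements).
    """
    names = list(op_sizes)
    weights = [op_sizes[name] for name in names]
    if sum(weights) <= 0:
        raise ValueError("at least one operator must have a positive size")
    # Uniform sampling would be rng.choice(names); weighting by size makes
    # a large operator proportionally more likely to receive the fault.
    return rng.choices(names, weights=weights, k=1)[0]
```

With sizes such as `{"conv": 9000, "bias_add": 10}`, uniform sampling would pick each operator half the time, whereas the weighted version picks `conv` in the vast majority of runs.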
Hi NJ, thanks for the work, I think the high-level idea of implementing it is ok. Just one thing:
Your sampling of the total state space is based on all the ops in the fiMap. This is reasonable when we want to inject into all the ops. However, some users might only want to inject into a subset of the ops in the graph, e.g., those ops that are involved in computing the results (the same reason why we specify "instance" in the yaml file).
Could you modify this part so that the distribution is sampled more flexibly, according to our yaml conf file? Then try running it on some simple tests to see if it works.
Thanks.
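One way to read the request above is: first filter the operators down to those enabled in the configuration, then do the size-weighted sampling over that subset only. The sketch below illustrates that reading under stated assumptions. The `(name, op_type, size)` tuples and the `enabled_types` set are hypothetical stand-ins for whatever the fiMap and the parsed yaml file actually provide.

```python
import random


def sample_configured_op(ops, enabled_types, rng=random):
    """Size-weighted sampling restricted to operator types enabled in config.

    ops: iterable of (name, op_type, size) tuples (hypothetical layout).
    enabled_types: set of op types selected for injection (assumed to come
    from the yaml conf file).
    """
    candidates = [
        (name, size)
        for name, op_type, size in ops
        if op_type in enabled_types and size > 0
    ]
    if not candidates:
        raise ValueError("no enabled operator with a positive size")
    names, weights = zip(*candidates)
    # Only the enabled subset contributes to the total state space.
    return rng.choices(names, weights=weights, k=1)[0]
```

For example, with `ops = [("add_1", "ADD", 10), ("mul_1", "MUL", 1000)]` and `enabled_types = {"ADD"}`, the large `mul_1` operator is excluded from the distribution and `add_1` is always selected.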
Sure Zitao, that makes sense. I'll make that change and recommit, thanks for the review!
Just saw that this pull request is still open. Zitao and Niranjhana, should we merge it? Thanks.