Track initial tensor shapes and dtypes through literals
All tensors are initially assigned the dimensions of the MNIST dataset, regardless of whether they actually have those dimensions.
[Node: synthetic < PythonLoader, Ltensorflow/functions/ones, do()LRoot; > Context: CallStringContext: [ script tf2_test_add7.py.do()LRoot;@103 ], v2] --> [SITE_IN_NODE{<Code body of function Lscript tf2_test_add7.py>:Llist in CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]}]
[Node: synthetic < PythonLoader, Ltensorflow/functions/ones, do()LRoot; > Context: CallStringContext: [ script tf2_test_add7.py.do()LRoot;@108 ], v2] --> [SITE_IN_NODE{<Code body of function Lscript tf2_test_add7.py>:Llist in CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]}]
callees of node Lscript tf2_test_add7.py : [import, add, ones, ones]
IR of node 2, context CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]
<Code body of function Lscript tf2_test_add7.py>
...
100 v251 = new <PythonLoader,Llist>@100 tf2_test_add7.py [14:16] -> [14:22]
101 fieldref v251.v252:#0 = v253:#1 = v253:#1 tf2_test_add7.py [14:16] -> [14:22]
102 fieldref v251.v254:#1 = v255:#2 = v255:#2 tf2_test_add7.py [14:16] -> [14:22]
103 v248 = invokeFunction < PythonLoader, LCodeBody, do()LRoot; > v249,v251 @103 exception:v256 tf2_test_add7.py [14:8] -> [14:23]
...
105 v259 = new <PythonLoader,Llist>@105 tf2_test_add7.py [14:33] -> [14:39]
106 fieldref v259.v252:#0 = v255:#2 = v255:#2 tf2_test_add7.py [14:33] -> [14:39]
107 fieldref v259.v254:#1 = v255:#2 = v255:#2 tf2_test_add7.py [14:33] -> [14:39]
108 v257 = invokeFunction < PythonLoader, LCodeBody, do()LRoot; > v258,v259 @108 exception:v260 tf2_test_add7.py [14:25] -> [14:40]
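Judging from the source offsets in the IR above, line 14 of the test passes two two-element list literals to the ones "generator" API and feeds both results to add. The literal-only inference rule can be sketched in plain Python using the ast module (this is an illustration, not Ariadne's WALA-based implementation; the reconstructed SOURCE line and the helper names are assumptions):

```python
import ast

# Hypothetical reconstruction of the kind of line the IR above analyzes;
# the exact contents of tf2_test_add7.py are an assumption.
SOURCE = "c = tf.math.add(tf.ones([2, 2]), tf.ones([2, 2]))"

def literal_shape(arg):
    """Return the shape encoded by a list literal of ints, or None if
    the argument is anything else (e.g., a variable)."""
    if isinstance(arg, ast.List) and all(
        isinstance(e, ast.Constant) and isinstance(e.value, int)
        for e in arg.elts
    ):
        return tuple(e.value for e in arg.elts)
    return None

def generator_shapes(source, api_name="ones"):
    """Collect the inferred shape for each call to a 'generator' API
    whose first argument is a list literal; None marks a call whose
    shape could not be inferred."""
    shapes = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == api_name
                and node.args):
            shapes.append(literal_shape(node.args[0]))
    return shapes

print(generator_shapes(SOURCE))  # [(2, 2), (2, 2)]
```

Both ones calls receive list literals, so both shapes are recoverable directly from the syntax, mirroring what the pointer analysis sees for the two Llist allocations at @100 and @105.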
Currently, we infer initial tensor shapes from literal values passed to "generator" APIs. The return values of those APIs are then stored in variables that carry the inferred shapes. If those variables are later used as inputs to other APIs, we cannot infer shapes at those call sites, because the arguments are variables rather than literals, and variables are not yet supported for shape inference. This shows up in the tests: when one API is used to "generate" a tensor argument for another API, we cannot infer the shapes of the corresponding parameters of that other API. I therefore think we should scope this issue to literal values passed to generator APIs for now, and open a separate issue to track shape inference through variables. That way, we can at least make progress on shape inference for generator APIs without being blocked on variable support.
The good news is that we can already infer shapes for many generator APIs when literal values are passed in. Those inferred shapes can then be propagated to infer shapes for other APIs that consume the generated tensors.
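For a consuming API like elementwise add, the propagation step is straightforward once the operand shapes are known. A minimal sketch of the rule (ignoring broadcasting, and using a hypothetical helper name):

```python
def add_result_shape(lhs, rhs):
    """Shape of an elementwise add: preserved when both operand shapes
    are known and equal; unknown (None) otherwise. Broadcasting is
    deliberately out of scope for this sketch."""
    if lhs is not None and lhs == rhs:
        return lhs
    return None  # an unknown or mismatched operand defeats propagation

print(add_result_shape((2, 2), (2, 2)))  # (2, 2)
print(add_result_shape(None, (2, 2)))    # None
```

This also shows why the literal restriction matters: one unknown operand (a tensor that came through a variable) makes the result shape unknown as well.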
For tensors that don't originate from literal values at any point (e.g., tensors read from files), it's a different story. We likely won't be able to infer their shapes unless we have some other mechanism for providing shape information (e.g., annotations). But that's out of scope for this issue.
The fundamental issue here is that we are using the pointer analysis alone to do shape inference. Perhaps there needs to be a second pass that combines the tensor analysis with the pointer analysis, since on the first pass the tensor analysis, I believe, won't yet be fully populated.