shschaefer
Tensors are n-dimensional and can have several distinguishing attributes beyond type and shape. In GPU memory environments, strides are also common. Images and video have channel and depth...
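To illustrate why strides matter alongside type and shape, here is a minimal sketch of a tensor descriptor. The names `TensorDesc`, `contiguous_strides`, and `element_offset` are hypothetical and do not come from any proposal in this thread; strides are assumed to be counted in elements, not bytes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TensorDesc:
    """Hypothetical descriptor: type and shape plus explicit strides."""
    dtype: str
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]  # assumption: measured in elements, not bytes


def contiguous_strides(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Row-major strides for a densely packed tensor of the given shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def element_offset(desc: TensorDesc, index: Tuple[int, ...]) -> int:
    """Linear element offset of a multi-dimensional index."""
    if len(index) != len(desc.shape):
        raise ValueError("index rank does not match tensor rank")
    for i, dim in zip(index, desc.shape):
        if not 0 <= i < dim:
            raise IndexError("index out of bounds")
    return sum(i * s for i, s in zip(index, desc.strides))


# A dense 2x3 tensor and a row-padded 2x3 tensor (rows padded to 4 elements).
dense = TensorDesc("f32", (2, 3), contiguous_strides((2, 3)))
padded = TensorDesc("f32", (2, 3), (4, 1))
```

Both descriptors have the same type and shape, yet element `(1, 2)` sits at a different offset in each, so type and shape alone cannot describe the padded layout.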
@abrown, the session/execution context interface caches the parameterization of configuration and device selection. You are going to leave this state on the `graph`, to be initialized during model load? And...
@jlb6740, the power preference in WebNN and many frameworks is designed to enable the caller to choose from more than one potential device. With GPUs, there are often more than...
I agree with the addition of a method for loading a graph by name. This is how my team's current implementation works. We use URNs as they provide the means...
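As a rough sketch of what loading a graph by name could look like, the following uses a hypothetical in-memory `GraphRegistry` keyed by URN strings. It is not the implementation mentioned above; the class, its methods, and the `urn:example:...` name are assumptions made for illustration only.

```python
from typing import Callable, Dict


class GraphRegistry:
    """Hypothetical registry that resolves a URN-style name to graph bytes."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], bytes]] = {}

    def register(self, name: str, loader: Callable[[], bytes]) -> None:
        # Assumption: names are URNs, so they must start with the "urn:" scheme.
        if not name.lower().startswith("urn:"):
            raise ValueError(f"not a URN: {name!r}")
        self._loaders[name] = loader

    def load_by_name(self, name: str) -> bytes:
        # The caller supplies only the name; where the bytes live is hidden.
        try:
            loader = self._loaders[name]
        except KeyError:
            raise LookupError(f"no graph registered under {name!r}") from None
        return loader()


registry = GraphRegistry()
registry.register("urn:example:model:demo", lambda: b"graph-bytes")
graph_bytes = registry.load_by_name("urn:example:model:demo")
```

The point of the indirection is that the caller never handles the model bytes or their location directly; the host decides how a name is resolved.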
Are you assuming that all LLMs have intrinsic tokenization? Not all foundation models are transforms from string to string. Is the implication that kv-cache and other stateful items will...
Why are these attached to the graph rather than the execution context?