stormpy
stormpy copied to clipboard
Parameter instantiation of PDTMC results in negative values
Dear Stormpy team,
I have a question regarding the instantiation of the FSC-policy parameters in a parametric DTMC. I have built the parametric DTMC by unfolding memory onto a Stormpy POMDP model, as done in the examples. I then collect the FSC parameters by calling pdtmc.collect_probability_parameters()
. Next I want to instantiate the values of the pDTMC from the FSC that I have computed. I have parsed the FSC parameter names (p{product_obs}_{product_action}) into a dictionary parameter_names
so that I know which parameter of the pDTMC corresponds to the FSC action and next-memory distribution.
for fsc_parameter in self.pdtmc.fsc_parameters:
m, o, a, next_m = parameters_names[fsc_parameter.name]
memory_prob = policy.next_memories[m, o] == policy.M[next_m] # assign positive probability only if the next memory is next_m
action_prob = policy.action_distributions[m, o, a]
prob = action_prob * memory_prob
points[fsc_parameter] = stormpy.RationalRF(prob)
instantiated_model = self._pdtmc_instantiator.instantiate(points)
pdtmc = PDTMCModelWrapper(instantiated_model, self.properties, self.pomdp.nS, self.product_pomdp.nM)
return pdtmc
After I have done this, I model-check the fully instantiated DTMC for a certain specification. I also explore the model by checking the transition probabilities between all state pairs. When I do this, I notice that many transition probabilities are negative.
I am aware that not all parameters of the product POMDP are kept as some are redundant (e.g. because of the law of total probability). My thought is that the negative values therefore happen because some of the parameters that I have instantiated have value 0 (in other words: the sum of the instantiated variables is already 1). This violates the graph-preserving constraint of the product.
Indeed, when I subtract a small number from the FSC values (prob
in the code above), I notice that more states of the DTMC are reachable. However, the negative probabilities for some of the transitions remain.
My question is: is there a way to know which transitions will be given a negative probability, and how can I best circumvent this?
Thanks a lot in advance,
Best.