PyPOMDP Some minor errors need to be fixed

Open tianyunzhe opened this issue 4 years ago • 1 comments

Very good repo! But there are still some errors that need to be fixed.

Line 187 of pypomdp/parsers/env_parser.py (https://github.com/namoshizun/PyPOMDP/blob/3f115b4b0a2c2efea6c484c849fae242d63912ff/pypomdp/parsers/env_parser.py#L187) should be changed to: self.Z[(action, next_state, obs)] = float(prob)
Lines 32-34 of pypomdp/models/model.py (https://github.com/namoshizun/PyPOMDP/blob/3f115b4b0a2c2efea6c484c849fae242d63912ff/pypomdp/models/model.py#L32-L34) should be changed to: return len(self.actions)

Nov 14 '20 09:11 tianyunzhe

Besides, in 'pbvi', it seems that you don't use the belief point set expansion proposed in the original paper. Here is my code to expand the belief point set, following the description in that paper. I hope it helps.


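import numpy as np

# Both functions are methods of the PBVI solver class.
def belief_point_expansion(self):
    # Iterate over a snapshot, because self.belief_points grows inside the loop.
    for belief_state in list(self.belief_points):
        # Sample a concrete state from the current belief.
        init_state = self.model.states[draw_args(belief_state)]
        largest_distance = 0
        add_belief = None
        for action in self.model.actions:
            # Simulate one step and compute the resulting belief.
            sj, oj, reward, cost = self.model.simulate_action(init_state, action)
            new_belief = self.update_belief(belief_state, action, oj)
            # Distance to the set is the L1 distance to its nearest member.
            distance = min(self.norm1(b, new_belief) for b in self.belief_points)
            if distance > largest_distance:
                largest_distance = distance
                add_belief = new_belief
        # Skip if no candidate differs from the points already in the set.
        if add_belief is not None:
            self.belief_points = np.vstack((self.belief_points, add_belief))

def norm1(self, a, b):
    # L1 distance between two belief vectors.
    return float(np.abs(np.array(a) - np.array(b)).sum())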
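
For anyone who wants to try the idea outside the repo, here is a minimal standalone sketch of the same expansion step. It is not PyPOMDP's API: the function names, the array layouts T[a, s, s'] and Z[a, s', o], and the use of a NumPy random generator are all assumptions made for illustration. Each belief point gets one simulated successor per action, and only the successor farthest (in L1 distance) from the current set is kept.

```python
import numpy as np


def l1_distance(a, b):
    """L1 distance between two belief vectors."""
    return float(np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)).sum())


def update_belief(belief, action, obs, T, Z):
    """Bayes filter: b'(s') is proportional to Z[a, s', o] * sum_s T[a, s, s'] * b(s)."""
    predicted = belief @ T[action]               # shape (S,)
    unnormalized = Z[action][:, obs] * predicted
    total = unnormalized.sum()
    if total == 0.0:
        return None                              # observation impossible under this belief
    return unnormalized / total


def expand_belief_points(belief_points, T, Z, rng):
    """One expansion pass (hypothetical helper, assumed array layouts).

    T has shape (A, S, S) and Z has shape (A, S, O).
    """
    points = [np.asarray(b, dtype=float) for b in belief_points]
    new_points = []
    for belief in points:
        # Sample a concrete state from the belief.
        state = rng.choice(len(belief), p=belief)
        best, best_dist = None, 0.0
        for action in range(T.shape[0]):
            # Simulate one step: next state, then observation.
            next_state = rng.choice(T.shape[2], p=T[action, state])
            obs = rng.choice(Z.shape[2], p=Z[action, next_state])
            candidate = update_belief(belief, action, obs, T, Z)
            if candidate is None:
                continue
            # Distance to the set is the distance to its nearest member.
            dist = min(l1_distance(candidate, b) for b in points + new_points)
            if dist > best_dist:
                best, best_dist = candidate, dist
        if best is not None:
            new_points.append(best)
    return np.vstack(points + new_points)


# Toy two-state model: action 0 "listens" (state unchanged, noisy observation),
# action 1 resets the state uniformly and yields an uninformative observation.
T = np.array([[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]])
Z = np.array([[[0.85, 0.15], [0.15, 0.85]], [[0.5, 0.5], [0.5, 0.5]]])
expanded = expand_belief_points([[0.5, 0.5]], T, Z, np.random.default_rng(0))
```

In this toy model the reset action maps the uniform belief back onto itself, so only the belief produced by listening is added to the set.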

Nov 14 '20 15:11 tianyunzhe
