tensor-house
Wrong Calculation of Total Profit [price-optimization-using-dqn-reinforcement-learning]
Filename : pricing/price-optimization-using-dqn-reinforcement-learning.ipynb
At t=0, the function below calls profit_t with p[0] and p[-1] as parameters, which seems incorrect to me, because p[-1] in Python refers to the last element of the array.
def profit_total(p, unit_cost, q_0, k, a, b):
    return profit_t(p[0], p[0], q_0, k, 0, 0, unit_cost) + sum(map(lambda t: profit_t(p[t], p[t-1], q_0, k, a, b, unit_cost), range(len(p))))
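The first term already covers t=0, so with range(len(p)) the first period is also counted a second time against the last price. To fix this, we can use range(1, len(p)) so that the sum starts at t=1.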
def profit_total(p, unit_cost, q_0, k, a, b):
    return profit_t(p[0], p[0], q_0, k, 0, 0, unit_cost) + sum(map(lambda t: profit_t(p[t], p[t-1], q_0, k, a, b, unit_cost), range(1, len(p))))
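To make the difference visible, here is a minimal sketch that records which (p_t, p_t_1) pairs each version passes to profit_t. The profit_t below is a hypothetical stub of my own that only logs its first two arguments; it is not the notebook's definition. The names profit_total_original, profit_total_fixed, and price_pairs are also just for this illustration.

```python
calls = []

def profit_t(p_t, p_t_1, q_0, k, a, b, unit_cost):
    # Hypothetical stub, NOT the notebook's profit_t:
    # it only records the (current price, previous price) pair.
    calls.append((p_t, p_t_1))
    return 0.0

def profit_total_original(p, unit_cost, q_0, k, a, b):
    # Version from the notebook: the sum starts at t=0, so p[t-1] is p[-1].
    return profit_t(p[0], p[0], q_0, k, 0, 0, unit_cost) + sum(
        map(lambda t: profit_t(p[t], p[t-1], q_0, k, a, b, unit_cost), range(len(p))))

def profit_total_fixed(p, unit_cost, q_0, k, a, b):
    # Proposed version: the sum starts at t=1.
    return profit_t(p[0], p[0], q_0, k, 0, 0, unit_cost) + sum(
        map(lambda t: profit_t(p[t], p[t-1], q_0, k, a, b, unit_cost), range(1, len(p))))

def price_pairs(fn, p):
    # Run one version and return the price pairs it evaluated, in order.
    calls.clear()
    fn(p, 0, 0, 0, 0, 0)
    return list(calls)

prices = [10, 20, 30]
original_pairs = price_pairs(profit_total_original, prices)
fixed_pairs = price_pairs(profit_total_fixed, prices)
print(original_pairs)  # [(10, 10), (10, 30), (20, 10), (30, 20)]
print(fixed_pairs)     # [(10, 10), (20, 10), (30, 20)]
```

With the original range, the pair (10, 30) shows up: the first price is compared against the last one, and the first period is evaluated twice.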
@ikatsov Do let me know if I am wrong or have misunderstood something.