dyypholdem

Poor performance against Slumbot

Open jidma opened this issue 3 years ago • 11 comments

I ran 1800 hands against Slumbot and got the following results:

  • Earnings: -15.92 BB/100
  • Baseline Earnings: -24.49 BB/100
  • Num Hands: 1803
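For context, BB/100 is simply big blinds won per 100 hands. A minimal sketch of the metric (the -287 bb total here is back-calculated from the figures above for illustration, not taken from the actual logs):

```python
def bb_per_100(total_winnings_bb: float, num_hands: int) -> float:
    """Winrate in big blinds per 100 hands."""
    return total_winnings_bb / num_hands * 100

# e.g. losing roughly 287 bb over 1803 hands:
print(round(bb_per_100(-287, 1803), 2))  # -15.92
```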

When I checked the weights:

Street   Epoch   Loss
Preflop  67      0.001736
Flop     50      0.067477
Turn     50      0.072931
River    95      0.057868

Is this poor performance due to the loss? How can I improve it? Thanks.

jidma avatar Jan 08 '22 19:01 jidma

I tried to regenerate the data and re-train the model. Running python training/main_train.py 4, the training loss doesn't go below 0.1 and the validation loss doesn't go below 0.11.

MeowTheCat avatar Jan 09 '22 20:01 MeowTheCat

I ran 1800 hands against Slumbot and got the following results:

  • Earnings: -15.92 BB/100
  • Baseline Earnings: -24.49 BB/100
  • Num Hands: 1803

When I checked the weights:

Street   Epoch   Loss
Preflop  67      0.001736
Flop     50      0.067477
Turn     50      0.072931
River    95      0.057868

Is this poor performance due to the loss? How can I improve it? Thanks.

1800 hands is meaningless. Besides, the original DeepStack didn't claim to beat Slumbot (although their models were much, much more accurate). [image]
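One way to see why 1800 hands is meaningless: the standard error of a winrate estimate shrinks only with the square root of the number of hands. A rough sketch, assuming a per-hand standard deviation of about 8 bb (roughly 80 bb/100, a commonly quoted ballpark for heads-up no-limit; the exact figure varies by matchup):

```python
import math

def winrate_std_error(per_hand_std_bb: float, num_hands: int) -> float:
    """Standard error of a BB/100 winrate estimate after num_hands hands."""
    return per_hand_std_bb * 100 / math.sqrt(num_hands)

# With ~8 bb per-hand std dev, 1803 hands gives a one-sigma error
# of roughly 19 bb/100, larger than the measured -15.92 bb/100 itself.
print(round(winrate_std_error(8.0, 1803), 1))
```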

There are some tweaks that made DeepStack beat Slumbot by 15 bb/100. [image]

yegorrr avatar Feb 09 '22 22:02 yegorrr

I tried to regenerate the data and re-train the model. Running python training/main_train.py 4, the training loss doesn't go below 0.1 and the validation loss doesn't go below 0.11.

You need a much bigger sample size. The default is 100k; the models were trained on 1m samples.

yegorrr avatar Feb 09 '22 22:02 yegorrr

I tried to regenerate the data and re-train the model. Running python training/main_train.py 4, the training loss doesn't go below 0.1 and the validation loss doesn't go below 0.11.

You need a much bigger sample size. The default is 100k; the models were trained on 1m samples.

I already used 1m samples.

MeowTheCat avatar Feb 09 '22 22:02 MeowTheCat

Besides, the original DeepStack didn't claim to beat Slumbot.

They claimed to have at least approached the Nash equilibrium, which means it should at least not lose against it. Right?

(although their models were much, much more accurate)

Do you know how much more accurate they were, and how many training samples they used compared to the models in this repo?

There are some tweaks that made DeepStack beat Slumbot by 15 bb/100.

Can you elaborate on that, please? Edit: found this link.

jidma avatar Feb 09 '22 23:02 jidma

I tried to regenerate the data and re-train the model. Running python training/main_train.py 4, the training loss doesn't go below 0.1 and the validation loss doesn't go below 0.11.

You need a much bigger sample size. The default is 100k; the models were trained on 1m samples.

I already used 1m samples

For me, for example, the river network produced validation errors smaller than 0.1 on samples of a couple hundred datapoints, and below 0.057 (better than the model in the repository) on a 1m sample, without any modification of the data generation or training code, or even changing any parameters (that said, the default data generation is 10k files, i.e. 100k samples).

yegorrr avatar Feb 10 '22 05:02 yegorrr

Besides, the original DeepStack didn't claim to beat Slumbot.

They claimed to have at least approached the Nash equilibrium, which means it should at least not lose against it. Right?

(although their models were much, much more accurate)

Do you know how much more accurate they were, and how many training samples they used compared to the models in this repo?

There are some tweaks that made DeepStack beat Slumbot by 15 bb/100.

Can you elaborate on that, please? Edit: found this link.

Not necessarily. A Nash equilibrium means that neither player can improve their expectation (winnings or losses!) by changing only their own strategy. In theory, in heads-up NL hold'em, a Nash equilibrium does guarantee that you won't lose if the opponent plays perfectly, and that you win whenever they deviate from that play. But in practice neither player plays a perfect strategy, so you are holding two assumptions: one about your own "perfect" strategy, and one about your opponent's strategy (which in practice may be very different from your assumption, because they have a different model with different errors!).

For example, in practice you may give very little weight to certain nodes in your strategy tree (and hence have huge error margins there!) because of your belief about how your opponent should play (i.e. those nodes shouldn't be reached very often, or at all). But your opponent's strategy may be very different from what you assume, those nodes may be reached quite a lot, and then you would lose a lot.
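The "neither player can improve by deviating" property can be illustrated on a toy zero-sum game (matching pennies here, not HUNL itself): against the equilibrium mix the best response gains nothing, while against a skewed strategy it gains a lot.

```python
import numpy as np

# Matching pennies payoff matrix for the row player
# (rows: Heads/Tails, columns: Heads/Tails).
A = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])

def best_response_value(col_strategy: np.ndarray) -> float:
    """Max EV the row player can achieve against a fixed column strategy."""
    return float(max(A @ col_strategy))  # best pure row action

# Against the equilibrium mix (0.5, 0.5) the best response earns nothing:
print(round(best_response_value(np.array([0.5, 0.5])), 6))  # 0.0
# Against a skewed opponent (0.7, 0.3) the best response wins 0.4 per game:
print(round(best_response_value(np.array([0.7, 0.3])), 6))  # 0.4
```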

Yes, I meant that article in particular, and also Noam Brown's comments on DeepStack's "alternative" approach in his other papers; see his papers on nested/safe search.

If you're interested in this project, please contact me at @yayegor on Telegram.

yegorrr avatar Feb 10 '22 05:02 yegorrr

[image] This is this exact algorithm, without any changes, vs Slumbot, although I re-trained the river/turn/flop models down to ~0.031 Huber loss (that took a lot of computation). Obviously the sample size of ~3k is meaningless; probably the only conclusion is that this implementation is not getting crushed by Slumbot (and probably not crushing it either!), so it's probably robust.
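For reference, the ~0.031 figure refers to the Huber loss used to train the counterfactual-value networks: quadratic for small errors, linear for large ones, so outliers don't dominate training. A minimal sketch of the metric, with made-up sample values:

```python
import numpy as np

def huber_loss(pred: np.ndarray, target: np.ndarray, delta: float = 1.0) -> float:
    """Mean Huber loss: 0.5*e^2 for |e| <= delta, delta*(|e| - 0.5*delta) otherwise."""
    err = np.abs(pred - target)
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))

# Illustrative predicted vs. target counterfactual values (not real data):
pred = np.array([0.10, -0.25, 2.0])
target = np.array([0.12, -0.20, 0.0])
print(huber_loss(pred, target))
```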

I went through the hand histories, and as a professional poker player with 12+ years of experience I can say both bots showed a very solid game.

yegorrr avatar Feb 12 '22 07:02 yegorrr

[image] This is this exact algorithm, without any changes, vs Slumbot, although I re-trained the river/turn/flop models down to ~0.031 Huber loss (that took a lot of computation). Obviously the sample size of ~3k is meaningless; probably the only conclusion is that this implementation is not getting crushed by Slumbot (and probably not crushing it either!), so it's probably robust.

I went through the hand histories, and as a professional poker player with 12+ years of experience I can say both bots showed a very solid game.

Hi yegorrr,

Did you also re-train this model, changing the algorithm based on the Supremus bot? I think it is not so difficult to change. I will give it a try... If you want to help, tell me. I'd appreciate it.

dan-bio avatar Feb 17 '23 21:02 dan-bio

[image] This is this exact algorithm, without any changes, vs Slumbot, although I re-trained the river/turn/flop models down to ~0.031 Huber loss (that took a lot of computation). Obviously the sample size of ~3k is meaningless; probably the only conclusion is that this implementation is not getting crushed by Slumbot (and probably not crushing it either!), so it's probably robust. I went through the hand histories, and as a professional poker player with 12+ years of experience I can say both bots showed a very solid game.

Hi yegorrr,

Did you also re-train this model, changing the algorithm based on the Supremus bot? I think it is not so difficult to change. I will give it a try... If you want to help, tell me. I'd appreciate it.

Does this match support 6 players?

c976237222 avatar May 04 '23 12:05 c976237222

Does this match support 6 players?

c976237222 avatar May 04 '23 12:05 c976237222