dyypholdem
Poor performance against Slumbot
I ran 1,800 hands against Slumbot and got the following results:
- Earnings: -15.92 BB/100
- Baseline Earnings: -24.49 BB/100
- Num Hands: 1803
When I checked the weights:
| Street | Epoch | Loss |
|---|---|---|
| Preflop | 67 | 0.001736 |
| Flop | 50 | 0.067477 |
| Turn | 50 | 0.072931 |
| River | 95 | 0.057868 |
Is this poor performance due to the loss? How can I improve it? Thanks.
I tried to regenerate the data and re-train the model. With `python training/main_train.py 4`, the training loss doesn't go below 0.1 and the validation loss doesn't go below 0.11.
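For context on the numbers above: DeepStack-style value networks are typically trained with a Huber loss on the predicted counterfactual values, and a later comment in this thread quotes its results in Huber loss as well. A minimal sketch of how such a per-street validation loss is computed, assuming the repo follows that convention (`huber_loss` and `mean_huber` here are illustrative helpers, not the repo's actual functions):

```python
def huber_loss(pred, target, delta=1.0):
    """Huber loss: quadratic for errors below delta, linear above."""
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err * err
    return delta * (err - 0.5 * delta)

def mean_huber(preds, targets, delta=1.0):
    """Mean Huber loss over a validation set (e.g. predicted vs.
    target counterfactual values for one street's network)."""
    return sum(huber_loss(p, t, delta) for p, t in zip(preds, targets)) / len(preds)

# toy check: small prediction errors give a small loss
preds   = [0.10, 0.52, -0.31]
targets = [0.12, 0.50, -0.30]
print(round(mean_huber(preds, targets), 6))  # → 0.00015
```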
1,800 hands is meaningless. Besides, the original DeepStack didn't claim to beat Slumbot (although their models were much more accurate).
There are some tweaks that made DeepStack beat Slumbot by 15 bb/100.
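As a rough sanity check on why such a small sample says so little: assuming, purely for illustration, a per-hand standard deviation of about 10 big blinds (the true figure depends heavily on the bots' styles), the 95% confidence interval on a win rate measured over 1,800 hands is enormous:

```python
import math

def ci_halfwidth_bb100(std_per_hand_bb, num_hands, z=1.96):
    """95% CI half-width of a measured win rate, in bb/100 hands.
    The std of the mean shrinks with sqrt(n); scale to per-100-hands."""
    return z * std_per_hand_bb / math.sqrt(num_hands) * 100

# assumed, for illustration only: ~10 bb standard deviation per hand
print(round(ci_halfwidth_bb100(10, 1800), 1))       # → 46.2
print(round(ci_halfwidth_bb100(10, 1_000_000), 1))  # → 2.0
```

At roughly ±46 bb/100, a measured -15.92 bb/100 over 1,800 hands is statistically indistinguishable from break-even.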
> I tried to regenerate the data and re-train the model. With `python training/main_train.py 4`, the training loss doesn't go below 0.1 and the validation loss doesn't go below 0.11.
You need a much bigger sample size. The default is 100k; the models were trained on 1M samples.
> You need a much bigger sample size. The default is 100k; the models were trained on 1M samples.
I already used 1M samples.
> Besides, the original DeepStack didn't claim to beat Slumbot.

They claimed to have at least approached a Nash equilibrium, which means it should at least not lose against it, right?

> (although their models were much more accurate)

Do you know how much more accurate they were, and how many training samples they used compared to the models in this repo?

> There are some tweaks that made DeepStack beat Slumbot by 15 bb/100.

Can you elaborate on that, please? Edit: found this link.
> > You need a much bigger sample size. The default is 100k; the models were trained on 1M samples.
>
> I already used 1M samples.
For me, for example, the river network produced errors smaller than 0.1 on validation sets with samples of a couple hundred data points, and below 0.057 (better than the model in the repository) on a 1M sample, without any modification of the data-generation or training code, or even changing any parameters. (That said, the default data generation is 10k files, i.e. 100k samples.)
> They claimed to have at least approached a Nash equilibrium, which means it should at least not lose against it, right?
>
> Do you know how much more accurate they were, and how many training samples they used compared to the models in this repo?
>
> Can you elaborate on that, please? Edit: found this link.
Not necessarily. A Nash equilibrium means that neither player can improve their expectation (winnings or losses!) by changing only their own strategy. In theory, in heads-up NL hold'em a Nash equilibrium guarantees that you won't lose if the opponent plays perfectly, and that you win every time they deviate from that play. But in practice neither player plays a perfect strategy, so you hold two assumptions: one about your own supposedly perfect strategy, and one about your opponent's strategy (which in practice may be very different from your assumption, because they have a different model with different errors!).
For example, in practice you may give very little weight to certain nodes in your strategy tree (and hence accept huge error margins there!) because of your belief about how your opponent should play, i.e. that those nodes shouldn't be reached very often, or at all. But your opponent's strategy may be very different from what you assume; those nodes would then be reached quite a lot, and you would lose a lot.
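The "no profitable unilateral deviation" property, and the way a flawed strategy becomes exploitable, can be checked concretely on a toy zero-sum game. A sketch using rock-paper-scissors (just an illustration of the definitions, not hold'em):

```python
# Rock-paper-scissors payoff matrix for the row player (+1 win, -1 loss).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def expected_value(row_strat, col_strat):
    """Row player's expected payoff when both play mixed strategies."""
    return sum(row_strat[i] * col_strat[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))

def best_response_value(col_strat):
    """The most the row player can win against a fixed column strategy
    (the column strategy's exploitability)."""
    return max(sum(col_strat[j] * PAYOFF[i][j] for j in range(3))
               for i in range(3))

uniform = [1/3, 1/3, 1/3]    # the Nash equilibrium of RPS
biased  = [0.5, 0.25, 0.25]  # a flawed strategy -- a "model with errors"

print(best_response_value(uniform))  # → 0.0: no deviation beats the equilibrium
print(best_response_value(biased))   # → 0.25: a best response exploits the flaw
```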
Yes, I mentioned this article in particular, and also Noam Brown's comments on DeepStack's "alternative" approach in his other papers; see his papers on nested/safe search.
If you're interested in this project, please contact me at @yayegor on Telegram.
This is the exact algorithm without any changes vs. Slumbot, although I re-trained the river, turn, and flop models down to ~0.031 Huber loss (that took a lot of computation). Obviously, the sample size of ~3k hands is meaningless; probably the only conclusion is that this implementation is not getting crushed by Slumbot (and probably not crushing it either!), so it's probably robust.
I went through the hand histories, and as a professional poker player with 12+ years of experience I can tell that both bots showed a very solid game.
Hi yegorrr,
Did you also re-train this model, changing the algorithm based on the Supremus bot? I think it's not so difficult to change. I will give it a try... If you want to help, tell me. I'd appreciate it.
Does this match support 6 players?