Bill Chen - "The Mathematics Of Poker" Study Group

Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
Yea, I was actually looking for more of a summer internship (I'm still studying) but it turned out I could actually get a job after (or maybe even before) it ends :rock:
 
dealio96

dealio96

The LAG Monkeys
Silver Level
Joined
Dec 3, 2013
Total posts
7,960
Awards
5
Chips
0
Yea, I was actually looking for more of a summer internship (I'm still studying) but it turned out I could actually get a job after (or maybe even before) it ends :rock:


Weeeeeeeeeee!!! Sir Marty's back!!

giphy.gif


I may not be as far along as you guys are(in the book), but I'm happy that you're back! Seemed like an eternity!!
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
Very nice gif, Steven.

@Figaroo2
@rhombus

Do you two actually have the book or you are just patiently waiting for my Chapter Summaries? :(
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
I have the book so Chapter 10 next I guess if we are on to Optimal Play
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
Ok great, so Chapter 10 till Sunday and then we could jump right into Chapter 12, which is about Jam-Fold games, to start doing something more practical!
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
You guys want to learn how to solve Nim as "an example" for Chapter 10? :eek:
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
Chapter 10 Summary.

(from Chapter 4 Summary)
1. A Game consists of:
a) 2+ players
b) 1+ player has choices (strategy)
(in Poker every player has a strategy whereas in Blackjack a Dealer (who is considered a player) does not have any choices - he just has to follow a specifc set of rules).
c) Set of outcomes for each player (eg: $won/lost) - payoffs
d) Outcomes depend on players' choices - strategic options

2. Properties of the game.

a) Constant-sum - the total of the payoffs to the various players is a constant C. If C = 0, then the game is said to be a zero-sum game (eg: in HU poker, one player wins exactly the amount of money/chip that the other one loses). All constant sum games can be converted to zero-sum games by scalling payoffs accordingly.

b) Sequential (players take turns; chess, poker, board games) or simultaneous (players act simultaneously; odds & Evens, Roshambo).

c) Deterministic - if we can predict the future state of the game based on current state/information (chess). There is no element of randomness.

d) Hidden Information - if all the known information is available to all the players (chess - yes, poker - no). Optimal strategies in games with hidden information can contain mixed strategies.

3. Optimal strategy pairs (in 2 player zero-sum games).

a) Strategy pairs: combination of strategy_1 (for player_1) and strategy_2 (player_2).

b) A strategy pair is said to be optimal if neither player can improve his EV by unilaterally changing his strategy.

c) An optimal strategy is a strategy with the highest EV against Nemesis (an opponent who always plays MES against anyone; nobody can exploit him!). In other words: an optimal strategy pair consists of two strategies that maximally exploit each other (which has been probably mentioned already like 100x times...).

d) Optimal strategy pairs are also called Nash equilibria (for anyone who hasn't figured it out yet) and...multiplayer games also have at least one equilibria. Additionally some games can have multiple.

e) As far as optimal strategies in zero-sum 2 players games go:
- they always exist as long as mixed strategies are allowed.
- if an optimal strategy contains a mixed strategy, then the EV of each strategic alternative must be equal against the opponent's optimal strategy (otherwise the player could increase his EV by dumping that lower EV strategy, which is not possible by Nash equilibrium definition).

4. Definitions of various types of strategies.

Dominated strategy: A strategy S is said to be dominated if there is a strategy S' such that EV(S') >= EV(S) against all opponents' strategies and EV(S') > EV(S) for at least one (Roshambo-F was an example of a game where a strategy of "playing the Flower" was a dominated strategy). Strictly dominating means that a strategy S' performs better than S against all possible strategies (eg: opponent goes all-in, we have AA in BB: EV(calling) > EV(folding) no matter what opponents strategy (range) is). We can obtain simpler games by recursively removing dominated strategies from a larger game (or like they wrote in the book: "A game G can be reduced to a sub-game G' by removing dominated strategic options from both sides. An optimal strategy pair for G' will be an optimal strategy pair for G.")

Co-optimal strategy: a strategy that is part of an optimal strategy. A dominated strategy CAN also be a co-optimal strategy (there is an example of that in the book).

Mixed strategies: occur when both sides have the ability to exploit each other's pure strategies (and "the mixture" will be between the two extremes of those pure strategies). If we know what components are mixed, we can solve game systematically by solving equations that make both sides indifferent to the various mixed options (like in Cops & Robbers game). If we dont know which components are mixed, we have to guess at structure of solution → parametrizations.

5. Example.

There were 3 games described in this chapter, if anyone has some problems with understanding/solving them I can help. Other than that, what do you want me to do as an additional example ffs? :mad:
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
I got/understood the three examples when using the matrixes, although unsure of the equations and some of the explanations in the book. (think I should have stayed on at school )

The one for Roshambo S

<B, rock > = (0)( a) + (-l)(b) + (1)(c)
Does the 1st part of the equation mean <B, rock > mean : -
Is it Bs a b and c choices bases on A choosing Rock

i.e. When A chooses Rock
Bs a) choice is a tie
Bs b) choice is b loses 1
Bs c) choice is b wins 1

and using the three results
rock = c - b
paper =a - 2c
scissors =2b - a

wasnt sure how it got to a=2b=2c (1/2, 1/4, 1/4)
 
Last edited:
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
Sooo... the game is simultaneous, right? B can't look at what A is showing (playing) in any particular moment and then just pick the best response (if he could, it would be rather a ME strategy not an Optimal one) - what I'm basically saying is that B is not choosing anything. It's A actually in that example...so A has basically a "fixed" strategy of playing each option (rock, paper, scissors) some % of time (a%, b%, c%) (it does not depend on B's choices). B also has the same options and we are just calculating EVs of B playing each one of them - that's what those equations are. It's the EV of B playing rock against A's strategy - so you know, rock will tie with itself a% of time, lose to paper b% of time and win with scissors c% of time.

wasnt sure how it got to a=2b=2c (1/2, 1/4, 1/4)

You come up with those EVs for each of B's possible action and than you want to find at which A's strategy, B is indifferent between his actions -> meaning that it does not matter what B chooses -> it occurs when the EVs of all of his possible actions are the same: EV(rock) = EV(paper) = EV(scissors). You get 3 equations with 3 variables, you solve them and then you get that: a = 2b = 2c which is equal to a strategy: (a, a/2, a/2). Because a + b + c = 1 (those are all % after all) you come up with: a + a/2 + a/2 = 1. From that you get a = 1/2 and if you put it back into the A's strategy...oh look, you get: (1/2, 1/4, 1/4) :cool:
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
what I'm basically saying is that B is not choosing anything. It's A actually in that example...
thats what i was trying to say but probably didnt explain it well

<B, rock > is bs results to A choosing rock

<B, rock > = (0)( a) + (-l)(b) + (1)(c)

i.e. When A chooses Rock
Bs a) choice is a tie
Bs b) choice is b loses 1
Bs c) choice is b wins 1

also thanks for the second part a = 2b = 2c
Over 20 years since i did algebra :eek:
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
going back to the a = 2b = 2c
from

<B, rock > = c - b

<B, paper > = a - 2c

<B, scissors > = 2b - a

not sure how we get a=2b=2c
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
going back to the a = 2b = 2c
from

<B, rock > = c - b

<B, paper > = a - 2c

<B, scissors > = 2b - a

not sure how we get a=2b=2c

As I said, you want to make B indifferent so you search for (a, b, c) that make B's EVs equal.

EV(rock) = EV(paper) = EV(scissors)

c - b = a - 2c = 2b - a

Three equations, three variables:

1) c - b = a - 2c => a = 3c - b
2) a - 2c = 2b - a => b = a - c
3) c - b = 2b - a => c = 3b - a


Now you can pick for example first equation and

* substitute for c (from 3)): a = 3(3b - a) - b => a = 2b
* substitute for b (from 2)): a = 3c - a + c => a = 2c

You get that a = 2b AND a = 2c which can be also written as: a = 2b = 2c
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
As I said, you want to make B indifferent so you search for (a, b, c) that make B's EVs equal.

EV(rock) = EV(paper) = EV(scissors)

c - b = a - 2c = 2b - a

Three equations, three variables:

1) c - b = a - 2c => a = 3c - b
2) a - 2c = 2b - a => b = a - c
3) c - b = 2b - a => c = 3b - a


Now you can pick for example first equation and

* substitute for c (from 3)): a = 3(3b - a) - b => a = 2b
* substitute for b (from 2)): a = 3c - a + c => a = 2c

You get that a = 2b AND a = 2c which can be also written as: a = 2b = 2c

what does the => mean, does it eman therefore as in excel it means Equal or greater than.

Also is the 2nd equation right, when i substituted the letter for numbers i got the 1st and 3rd to work out but not the 2nd
 

Attachments

  • mop.jpg
    mop.jpg
    6.3 KB · Views: 179
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
what does the => mean, does it eman therefore as in excel it means Equal or greater than.
It's an arrow :) I use it to indicate that something is related to or came from something. Viewing it as "therefore" probably works too. I will use "->" in the future because you are right that "=>" can be confused with "greater or equal".
Also is the 2nd equation right, when i substituted the letter for numbers i got the 1st and 3rd to work out but not the 2nd
I'm not sure I understand what you mean... You picked some number for, let's say: b, and the other ones (a and c) come out "wrong"?
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
It's an arrow :) I use it to indicate that something is related to or came from something. Viewing it as "therefore" probably works too. I will use "->" in the future because you are right that "=>" can be confused with "greater or equal".

I'm not sure I understand what you mean... You picked some number for, let's say: b, and the other ones (a and c) come out "wrong"?

the 3 equations
1) c - b = a - 2c => a = 3c - b
a = 7 b=8 c=5
so
5 - 8 = 7 -10 = -3

3) c - b = 2b - a => c = 3b - a
a = 15 b = 6 c = 3
so
3 - 6 = 12 - 15 = -3

But in Equation 2
2) a - 2c = 2b - a => b = a - c

a = 12 b = 15 c = 3
so 12 - 6 = 30 - 12
1st part is 6 and 2nd part is 18
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
But in Equation 2
2) a - 2c = 2b - a => b = a - c

a = 12 b = 15 c = 3
so 12 - 6 = 30 - 12
1st part is 6 and 2nd part is 18

Because b = 9 not 15 (a - c = 12 - 3 = 9). I'm still not sure what you're trying to do tbh...because if you randomly pick two numbers (eg: a and b) and then calculate the third one (c)...both sides of the equation will always be equal because...that's the point of solving the equation after all... Am I missing something?
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
Because b = 9 not 15 (a - c = 12 - 3 = 9). I'm still not sure what you're trying to do tbh...because if you randomly pick two numbers (eg: a and b) and then calculate the third one (c)...both sides of the equation will always be equal because...that's the point of solving the equation after all... Am I missing something?

oops sorry got numbers wrong way round, i was just using numbers to make sure I got my algebra right
and if A =12 B = 9 and C = 3

12 - 6 = 18 - 12 = 6
Whereas equation 1 and 2 = -3
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
a tleast my algebra is getting better as managed to work my way thorugh the cops and robbers ones

<robber, rob > = (-1)(x) + (1)(1 - x)
=-x + 1 - 1x
<robber, rob > = 1 - 2x

<robber, don't > = (1)(x) + 0(1 - x)
=1x + 0 + 0x
<robber, don't > = x

so
1 - 2x = x
1 = 3x
x = 1/3

-------------------------------------------------------------------------
<cop, patrol > = (1)(x) + (-1)(1 - x)
= 1x -1 + 1x
<cop, patrol > = 2x - 1

<cop, don't > = (-1)(x) + 0(1 - x)
= -x + 0 -0x
<cop, don't > = - x

so
2x - 1 = -x
3x = 1
x = 1/3
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
That's actually really impressive that you are willing to go over/solve those examples on your own (I was too lazy to do it, I just "accepted" the solutions/results that were already written in the book :eek: ).

There is one more thing I would like to discuss, that is not mentioned in the book (or maybe it is, I don't know I haven't seen it yet), but it might come in handy in the near future Epsilon-equilibrium. It's really cool because it allows us to estimate how close a particular solution/guess is to Nash equlibrium without even knowing the equlibrium itself (which is often the case). I will get back to it tomorrow/Friday.

In the meantime, does anyone want to try to solve a simple game? :)
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
That's actually really impressive that you are willing to go over/solve those examples on your own (I was too lazy to do it, I just "accepted" the solutions/results that were already written in the book :eek: ).

There is one more thing I would like to discuss, that is not mentioned in the book (or maybe it is, I don't know I haven't seen it yet), but it might come in handy in the near future Epsilon-equilibrium. It's really cool because it allows us to estimate how close a particular solution/guess is to Nash equlibrium without even knowing the equlibrium itself (which is often the case). I will get back to it tomorrow/Friday.

In the meantime, does anyone want to try to solve a simple game? :)
Ok the simpler the better :)
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
Ok the simpler the better :)

So...this game is not really that related to the content of the chapter but I think you can give it a try if you want. It's a combinatioral game - which basically means that: a) the play is sequential with perfect information and no chance moves b) the game is determined by a set of positions - so players alternate moves from one position to another; there is usually a set of starting positions from which they begin the game and some kind of terminal position where they want to end up to win the game. There are probably few more things but we won't get into that...

So here are the rules:
1) 2 players: I and II
2) a pile of 21 chips (so this is the starting position whereas the pile of 0 chips is the terminal position)
3) both players alternate moves with player I starting the game. A move consists of removing 1, 2 or 3 chips from the pile
4) player that removes the last chip wins (in other words: a player that can't make a move loses and the player who moves to the terminal position wins)

The question is of course...what is the winning strategy for player I (the player who starts the game)?


Straightaway, I include a little hint so that you guys won't waste much time on this one if you get stuck or don't know where/how to begin.
A good starting point is to analyze the game looking at the simplest cases and trying to spot "the pattern" (which is very often a good approach when analyzing various problems). So begin with a pile of 1 chip - which player can win the game in this case? What about a pile of 2, 3 chips? Does anything change when you add the 4th chip? (hint2: yes, it does :)) Try to find a pattern! What does player_I have to do to guarantee himself to be always in the position for winning the game?
 
R

rhombus

Legend
Silver Level
Joined
Mar 1, 2012
Total posts
2,601
Chips
0
working backwards from 4 as player 2 cant finish, Player 1 should try to go to 8 12 16 20.

so whatever player 2 does player 1 will go down in 4s so 1st move is 20 maybe
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
working backwards from 4 as player 2 cant finish, Player 1 should try to go to 8 12 16 20.

so whatever player 2 does player 1 will go down in 4s so 1st move is 20 maybe

Yea, that's pretty much it :) I knew it was too easy! If we want to be more precise about it, there is this terminology of N and P positions. N-position is a position that is a winning position for the Next player (the one who is about to make the move) and consequently a P-position is a winning position for the Previous player (the one who just moved). Of course the terminal position is a P-position.

Applying this to that game, you just crushed, we get:

Code:
0 1 2 3 4 5 6 7 8 9 10 ... 19 20 21 ... (number of chips)
P N N N P N N N P N N ...   N  P N ...  (position)

So...yea the winning strategy is to always move to the P-position :)

The "formal" algorithm used for labeling those positions which you can use for solving that type of games (they are called "Take-away" or "Substraction" games afaik - a Game of Nim is actually one of the more popular one of that kind) goes like:

1. Label every terminal position as a P-position.
2. Label every position reachable in one move from labelled P-positions as a N-position.
3. Label all the positions whose only moves are to the labelled N-positions as P-positions.
4. IF there were no new P-positions found in step 3, stop. Otherwise go to step 2.
 
Fknife

Fknife

Legend
Silver Level
Joined
Oct 26, 2013
Total posts
1,128
Chips
0
Going back to what I was saying about Epsilon Equilibrium: it's a way of measuring how close given strategies are to the equlibrium without knowing how the equlibrium itself looks like. This is how you do it: you take a pair of strategies, let's say A and B. You calculate what is the EV of playing a strategy A against B. Then, you make A play ME against B and you calculate the EV of that strategy. Next, you do the same but with B playing ME against A. Epsilon Equlibrium is the maximum change in EV one can gain from deviating from it's "original" strategy. Obviously, it will be 0 if we are already at the equlibrium because (going by definition) neither player can improve their EV by deviating.

To give an example how to calculate dEV, I made a simple spreadsheet to quickly calculate EVs in Roshambo . Suppose we don't know what the optimal strategy pair for this game looks like so we take a guess (rock%, paper%, scissors%):

A: (10, 30, 60)
B: (50, 20, 30)

1) EV of A's strategy against B's strategy is: EV(A) = -$0.11
2) Best response (ME strategy) for A against B is to start playing only Paper (becuse B plays rock "too much"): EV(A') = $0.20
3) Best response for B against A is to start playing only Rock (because A plays Scissors a lot): EV(A'') = -$0.30

dEV = $0.33 (it's the longer distance from those ME strategies to the original one: dEV = max(EV(A') - EV(A); EV(A'') - EV(A))). Next guess:

A: (35, 25, 40)
B: (45, 20, 35)

1) EV(A) = -$0.02
2) EV(A') = $0.10
3) EV(A'') = -$0.15

dEV = $0.13 -> it's closer to dEV = 0 then the previous guess so...we are closer to Equilibrium :)

If we go down this path of minimizing dEV (which is what those poker GTO programs do) we will eventually end up in equlibrium (which is (33%, 33%, 33%) for Roshambo)) or close to it if we want to save time and we don't mind stopping at a certain (close) distance from it.

I'm leaving for the weekend so in case of questions etc I will get back to them on monday. For something more about "dEV in poker", you can check out this post and that video (and probably few others on that channel):

 
Top