Euchre wisdom: pass on the bower, lose for an hour? Part 2
Context
This post is a followup to Euchre wisdom: pass on the bower, lose for an hour? and CFR for euchre. In the counterfactual regret minimization (CFR) for euchre post, we saw that CFR for just the pre-play phase of euchre could result in a stronger agent than just perfect information monte carlo tree search (PIMCTS). We also saw that the pre-play CFR bot did not recommend always picking up the jack as the dealer – in fact it recommended passing ~30% of the time.
In this post, I summarize the bot’s strategy.
All of these results are based on using CFR for the pre-play phases of euchre and PIMCTS with 50 worlds for the play phase. The CFR was trained over 20m iterations using CFR with external sampling.
Analyzing high-level strategy
We can plot a heatmap of the bots policy based on what cards are in a hand. Each cell represents all hands containing the matching card from the cell’s row and column. The color of the cell represents the probability of the bot picking up the jack of spades. The blue colors mean to pickup the jack more than 50% of the time. While red colors are for <50% of the time. The cards are sorted from highest likelihood of pickup to lowest.
As an example, the top left area of the heatmap represents when the agent has multiple spade cards (would be trump). This area is dark blue – as expected, the bot is likely to pickup the jack of spades if it already has many spades in it’s hand.
I identified 5 major regions of the policy:
Area | Recommended action |
---|---|
A) At least one spade in hand | Likely pickup, would have many trump cards |
B) At least one off-suit ace in hand | Usually pickup, high non-trump cards |
C) Other cards | Neutral, these cards don’t seem to have a large impact on the recommended action |
D) Low clubs | Lean towards not picking up |
E) Red jack | Rarely pickup, would have highest or second highest card if someone else calls a red suit as trump |
Translating to a decision tree
The heatmap gives a highlevel overview of how the bots value cards, but it can only represent information for 2 cards at a time. To get a deeper understanding of the strategy, we can use the heatmap regions as additional inputs to train a decision tree model.
Using Minimal Cost-Complexity Pruning, I generated a simplified, 3-level, decision tree, that achieves 94% accuracy for predicting the action a bot would take.
Decision tree for passing or picking up the jack based on cards in hand
|--- 0 spades | |--- Pass |--- 1 spade | |--- 1+ red jacks | | |--- 0 offsuit aces | | | |--- Pass | | |--- 1+ offsuit ace | | |--- Pickup | |--- 0 red jacks | |--- Pickup |--- 2+ spades | |--- Pickup
As expected, having more spades or offsuit aces is a strong indicator for picking up the jack of spades – these are strong cards that can win tricks.
Of note, the bots will rarely pickup a jack if they have no other spades. This only happens for 11 hands [0].
Passing to make room for euchring the opponent
As with the heatmap, having a red jack causes the bots to be more likely to pass. And in the last post, I noted that JcQcJhAh9d|Js|PPP
– a hand with a single red jack – is always a pass for the bots.
To better understand what’s going on I had the bots play 50k games where the dealer is dealt this hand. Of those games, 52% (26k) made it to the dealer being able to choose to pass or pickup the jack.
When the dealer picks up the jack, their team wins 62% of games. But the average score is 0.0002 – the dealer’s team wins as many points as they lose by often getting ecuhred (losing the majority of tricks when calling trump).
Distribution of scores JcQcJhAh9d|Js|PPPT
Score
-2 |████ 38%
-1 | 0 (can't happen since dealer's team called trump)
1 |██████ 57%
2 |█ 5%
Things look remarkably different if the dealer passes. Their team wins 74% of games. But the average score is only 0.0003. An extra point every ten thousand games.
Distribution of scores JcQcJhAh9d|Js|PPPP
Score
-2 |2%
-1 |██ 23%
1 |██ 18%
2 |█████ 54%
The bot is able to get two points most of the time. This could be due to either euchering the opponent or by taking all 5 tricks. We can look at a breakdown of the score by who calls trump to better understand what is happening:
Heatmap of dealer team’s score by who called trump
Player 0 is the player left of the dealer and player 3 is the dealer. The dealer is getting most of the wins by euchering the other team. Player zero is calling clubs as trump 32% of the time. And with the Jc
, Qc
, and an offsuit ace (Ah
), the dealer is in a good position to euchre.
The propensity for player 0 to call clubs explains area D from the heatmap at the start of the post. The bot is more likely to pass if it has clubs in its hand because it values those cards for euchering the other team when they call Clubs as trump.
Future work and limitations
This post outlines simplified rules for when to pass on the bower. But these results are not based on fully solving the game of euchre. And there is not guarantee that the combined CFR + PIMCTS agent will converge to an optimal strategy.
Next steps are to:
- Baseline performance against human players
- Get a theoretical measure for exploitability
- Extend CFR into the play phase of the game
[0] Hands where the bot will pickup with no other spades:
AcKhAhKdAd
AcAhQdKdAd
AcThAhTdAd
Ac9hThAhAd
AcQhAhQdAd
AcQhAh9dAd
Ac9hAh9dAd
AcQhKhAhAd
AcAh9dKdAd
AcThQhAhAd
AcAhTdQdAd