math mode does not work in gitea
parent 9823a78177
commit 3d28e60ba0
1 changed files with 6 additions and 6 deletions
12
README.md
@@ -35,19 +35,19 @@ considered draw if no one won.
 ## Combinatorics
 
 Without taking into account anything, we can estimate the upper bound of the
-number of possible boards. There is $ 3^9 = 19683 $ possibilites.
+number of possible boards. There is `3**9 = 19683` possibilites.
 
 There are 8 different symetries possibles (dihedral group of order 8, aka the
 symetry group of the square). This drastically reduce the number of possible
 boards.
 
 Taking into account the symetries and the impossible boards (more O than X for
-example), we get $765$ boards.
+example), we get `765` boards.
 
 Since we do not need to store the last board in the DAG, this number drops to
-$627$ non-ending boards.
+`627` non-ending boards.
 
-This make our state space size to be $627$ and our action space size to be $9$.
+This make our state space size to be `627` and our action space size to be `9`.
 
 ## Reward
 
@@ -60,10 +60,10 @@ determined. We backtract over all the states and moves to update the Q-table,
 given the appropriate reward for each player.
 Since the learning is episodic it can only be done at the end.
 
-The learning rate $\alpha$ is set to $1$ because the game if fully
+The learning rate α is set to `1` because the game if fully
 deterministic.
 
-We use an $\varepsilon$-greedy (expentionnally decreasing) strategy for
+We use an ε-greedy (expentionnally decreasing) strategy for
 exploration/exploitation.
 
 The Bellman equation is simplified to the bare minimum for the special case of
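Review note: the counts quoted in the Combinatorics hunk (19683, 765, 627) can be checked independently with a short enumeration. The sketch below is not taken from the repository. It assumes X moves first, encodes cells as 0 (empty), 1 (X) and 2 (O), and uses hypothetical helper names (`canonical`, `reachable_canonical_boards`). It walks every board reachable by legal play and keeps one representative per orbit of the 8 symmetries of the square.

```python
# Cell encoding (an assumption for this sketch): 0 = empty, 1 = X, 2 = O.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def rotate(b):
    """Rotate the 3x3 board by a quarter turn."""
    return tuple(b[3 * (2 - c) + r] for r in range(3) for c in range(3))


def mirror(b):
    """Reflect the board left to right."""
    return tuple(b[3 * r + (2 - c)] for r in range(3) for c in range(3))


def canonical(b):
    """Smallest of the 8 symmetric images (4 rotations, each mirrored)."""
    images = []
    for _ in range(4):
        b = rotate(b)
        images.append(b)
        images.append(mirror(b))
    return min(images)


def winner(b):
    for i, j, k in LINES:
        if b[i] != 0 and b[i] == b[j] == b[k]:
            return b[i]
    return 0


def is_terminal(b):
    """A board is terminal when someone has won or no cell is empty."""
    return winner(b) != 0 or 0 not in b


def reachable_canonical_boards():
    """All boards reachable by legal play, up to symmetry (X moves first)."""
    start = (0,) * 9
    seen = {start}
    stack = [start]
    while stack:
        b = stack.pop()
        if is_terminal(b):
            continue  # play stops here, so no children are generated
        player = 1 if b.count(1) == b.count(2) else 2
        for i in range(9):
            if b[i] == 0:
                child = canonical(b[:i] + (player,) + b[i + 1:])
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
    return seen


boards = reachable_canonical_boards()
non_terminal = [b for b in boards if not is_terminal(b)]
print(3 ** 9, len(boards), len(non_terminal))
```

Under these assumptions the three printed numbers should agree with the figures in the README: the naive upper bound, the number of boards up to symmetry, and the number of non-ending boards that need a row in the Q-table.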
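Review note: the second hunk mentions an exponentially decreasing ε-greedy strategy but the diff does not show the schedule itself. The following is only a minimal sketch of what such a schedule can look like. The function names and the values `eps_start`, `eps_end` and `decay` are assumptions chosen for illustration, not values from the repository.

```python
import math
import random


def epsilon_at(episode, eps_start=1.0, eps_end=0.01, decay=0.001):
    """Exploration rate after `episode` episodes (all defaults are assumed)."""
    return eps_end + (eps_start - eps_end) * math.exp(-decay * episode)


def choose_action(q_values, legal_actions, epsilon, rng=random):
    """Pick a random legal move with probability epsilon, else the greedy one.

    `q_values` is indexed by action (0..8 for the nine cells) and
    `legal_actions` lists the indices of the empty cells.
    """
    if rng.random() < epsilon:
        return rng.choice(legal_actions)
    return max(legal_actions, key=lambda a: q_values[a])
```

Early episodes explore almost uniformly, while later ones mostly follow the Q-table. Restricting the choice to `legal_actions` keeps the agent from selecting an occupied cell.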