math mode does not work in gitea

master
otthorn 3 years ago
parent 9823a78177
commit 3d28e60ba0

@@ -35,19 +35,19 @@ considered draw if no one won.
## Combinatorics
Without taking any constraint into account, we can estimate an upper bound on the
number of possible boards. There are $ 3^9 = 19683 $ possibilities.
number of possible boards. There are `3**9 = 19683` possibilities.
There are 8 possible symmetries (the dihedral group of order 8, i.e. the
symmetry group of the square). This drastically reduces the number of possible
boards.
Taking into account the symmetries and the impossible boards (more O than X, for
example), we get $765$ boards.
example), we get `765` boards.
Since we do not need to store terminal boards in the DAG, this number drops to
$627$ non-terminal boards.
`627` non-terminal boards.
This makes our state space size $627$ and our action space size $9$.
This makes our state space size `627` and our action space size `9`.
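
The counts quoted in this hunk can be checked with a short enumeration. The sketch below is not taken from the repository: the board encoding (a 9-tuple with `0` for empty, `1` for X, `2` for O), the assumption that X moves first, and every function name are illustrative choices. It walks the game tree from the empty board, stops at finished games, and keeps one representative per symmetry class.

```python
# Illustrative sketch: count tic-tac-toe boards up to the 8 symmetries of the square.
# Assumed encoding (not from the repository): 0 = empty, 1 = X, 2 = O, X moves first.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

ROTATE = (6, 3, 0, 7, 4, 1, 8, 5, 2)  # index permutation: 90 degree rotation
MIRROR = (2, 1, 0, 5, 4, 3, 8, 7, 6)  # index permutation: left-right mirror


def permute(board, perm):
    """Return the board with its cells rearranged by an index permutation."""
    return tuple(board[i] for i in perm)


def canonical(board):
    """Pick the smallest of the 8 symmetric variants as class representative."""
    variants = []
    for _ in range(4):
        board = permute(board, ROTATE)
        variants.append(board)
        variants.append(permute(board, MIRROR))
    return min(variants)


def winner(board):
    """Return 1 or 2 if that player has three in a row, else 0."""
    for a, b, c in LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


def is_terminal(board):
    """A board is terminal when someone has won or no cell is empty."""
    return winner(board) != 0 or 0 not in board


def enumerate_boards():
    """Collect canonical representatives of every board reachable by legal play."""
    start = (0,) * 9
    seen = {start}
    frontier = [start]
    while frontier:
        board = frontier.pop()
        if is_terminal(board):
            continue  # finished games have no successors
        player = 1 if board.count(1) == board.count(2) else 2
        for i in range(9):
            if board[i] == 0:
                child = canonical(board[:i] + (player,) + board[i + 1:])
                if child not in seen:
                    seen.add(child)
                    frontier.append(child)
    return seen


if __name__ == "__main__":
    boards = enumerate_boards()
    print(3 ** 9)                                         # raw upper bound
    print(len(boards))                                    # boards up to symmetry
    print(sum(1 for b in boards if not is_terminal(b)))   # non-terminal boards
```

Expanding only canonical representatives is enough here, because applying a symmetry to a board and then playing a move gives the same class as playing the corresponding move first.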
## Reward
@@ -60,10 +60,10 @@ determined. We backtrack over all the states and moves to update the Q-table,
giving the appropriate reward to each player.
Since the learning is episodic it can only be done at the end.
The learning rate $\alpha$ is set to $1$ because the game is fully
The learning rate α is set to `1` because the game is fully
deterministic.
We use an $\varepsilon$-greedy (exponentially decreasing) strategy for
We use an ε-greedy (exponentially decreasing) strategy for
exploration/exploitation.
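
As a rough illustration of an exponentially decreasing ε-greedy policy, the sketch below shows one common way to write it. The schedule parameters (`eps_start`, `eps_end`, `decay`), the function names, and the layout of a Q-table row as a list of 9 values are all assumptions made for the example, not the repository's actual code.

```python
import math
import random


def epsilon(episode, eps_start=1.0, eps_end=0.01, decay=1e-3):
    """Exploration rate after `episode` games (hypothetical schedule parameters)."""
    return eps_end + (eps_start - eps_end) * math.exp(-decay * episode)


def choose_action(q_row, legal_moves, eps, rng=random):
    """Pick a move: random with probability eps, otherwise the best known one.

    q_row is assumed to hold one Q-value per cell (9 entries);
    legal_moves lists the indices of the empty cells.
    """
    if rng.random() < eps:
        return rng.choice(legal_moves)                 # explore
    return max(legal_moves, key=lambda a: q_row[a])    # exploit


if __name__ == "__main__":
    q_row = [0.0, 0.2, 0.0, 0.9, 0.0, 0.0, 0.1, 0.0, 0.0]
    legal = [1, 3, 6]
    for episode in (0, 1000, 5000):
        eps = epsilon(episode)
        print(episode, round(eps, 3), choose_action(q_row, legal, eps))
```

Restricting both branches to `legal_moves` keeps the agent from ever selecting an occupied cell, even though the action space has size 9.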
The Bellman equation is simplified to the bare minimum for the special case of
