From 582e41687bc9da511e6e8f3e18e26966858809be Mon Sep 17 00:00:00 2001
From: Otthorn
Date: Mon, 17 May 2021 00:53:53 +0200
Subject: [PATCH] :pencil2: typo

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2ada376..f8766d3 100644
--- a/README.md
+++ b/README.md
@@ -60,7 +60,7 @@ determined.
 We backtract over all the states and moves to update the Q-table, given the
 appropriate reward for each player. Since the learning is episodic it can only
 be done at the end.
-The learning rate α is set to `1` because the game if fully
+The learning rate α is set to `1` because the game is fully
 deterministic.
 
 We use an ε-greedy (expentionnally decreasing) strategy for