Steps to implement the Q-learning algorithm
1. Initialize the table of Q values, Q(s,a), for every state s and action a.
2. Observe the current state s.
3. Select an action a for the current state using one of the action selection strategies (ε-soft, ε-greedy, or softmax).
4. Take the action and observe the reward r as well as the new state s'.
5. Update the Q value for the state-action pair (s,a) using the observed reward and the largest Q value available in the next state, which is the maximum estimated future reward. The update follows the formula described above.
6. Set the current state to the new state s' and repeat the process until a terminal state is reached.
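
The steps above can be sketched in Python as a minimal tabular example. Everything specific here is an assumption made for illustration and is not taken from the text: the toy corridor environment in `step`, the helper `epsilon_greedy`, and the values chosen for the learning rate `alpha`, discount factor `gamma`, and exploration rate `epsilon`. The update line assumes the referenced formula is the standard Q-learning rule Q(s,a) ← Q(s,a) + α[r + γ max Q(s',a') − Q(s,a)].

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties randomly among the greedy actions.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def step(state, action, n_states):
    """Hypothetical toy environment: a corridor of n_states cells.

    Action 0 moves left, action 1 moves right. Reaching the last cell
    yields reward 1 and ends the episode; every other move yields 0.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(n_states=5, n_actions=2, episodes=200,
               alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize the Q table to zeros.
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0  # Step 2: observe the current (start) state.
        done = False
        while not done:
            a = epsilon_greedy(Q[s], epsilon, rng)       # Step 3
            s_next, r, done = step(s, a, n_states)       # Step 4
            # Step 5: assumed standard Q-learning update; a terminal
            # next state contributes no future value.
            target = r if done else r + gamma * max(Q[s_next])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next                                   # Step 6
    return Q


def greedy_policy(Q):
    """Return the index of the highest-valued action for each state."""
    return [row.index(max(row)) for row in Q]
```

Calling `Q = q_learning()` and then `greedy_policy(Q)[:-1]` gives the learned action for each non-terminal state; in this toy corridor the learned policy should be to move right (action 1) everywhere. ε-greedy was chosen only because it is the shortest of the three strategies to write down; ε-soft or softmax selection could replace `epsilon_greedy` without changing the rest of the loop.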