In reinforcement learning, what term describes the agent's strategy for selecting actions based on the current state?
Policy
Q-function
