Tech Term Decoded: Reinforcement Learning

Definition

Reinforcement Learning (RL) is a machine learning approach concerned with how agents learn to make decisions through trial and error in order to maximize their total reward. An RL agent learns by interacting with an environment and receiving feedback on its actions in the form of rewards or penalties [1].
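To make "total reward" concrete, here is a minimal Python sketch of how a sequence of rewards is often added up. The discount factor gamma is an assumption borrowed from the standard RL formulation, not something the definition above specifies; it simply makes later rewards count a little less than earlier ones. The function name discounted_return is hypothetical.

```python
def discounted_return(rewards, gamma=0.99):
    """Add up a sequence of rewards, discounting each step by gamma.

    gamma is an assumed parameter: gamma=1.0 gives a plain sum,
    smaller values make the agent care more about near-term rewards.
    """
    total = 0.0
    # Work backwards so each reward is discounted once per step of delay.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total
```

With gamma set to 1.0 this is just the sum of rewards and penalties the agent collected, which is the quantity the agent is trying to make as large as possible.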

A good example of reinforcement learning is a bank call center queue management system whose goal is to reduce customer wait times. The AI learns through trial and error, without human judgement, to route calls efficiently during peak banking hours, and it is rewarded with shorter queues and higher customer satisfaction scores.
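The call routing idea can be sketched in a few lines of Python. This is a toy illustration only, assuming a simple epsilon-greedy strategy (one common trial-and-error technique, not necessarily what a real call center would use). The function route_calls, the queue wait times, and the noise model are all hypothetical.

```python
import random


def route_calls(mean_wait, n_calls=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy call routing over hypothetical queues.

    mean_wait: assumed average wait in minutes for each queue (hidden from the agent).
    Returns (estimated reward per queue, number of calls sent to each queue).
    """
    rng = random.Random(seed)
    n_queues = len(mean_wait)
    estimates = [0.0] * n_queues  # running average reward per queue
    counts = [0] * n_queues       # calls routed to each queue so far

    for _ in range(n_calls):
        if rng.random() < epsilon:
            # Explore: occasionally try a random queue.
            queue = rng.randrange(n_queues)
        else:
            # Exploit: pick the queue that currently looks best.
            queue = max(range(n_queues), key=lambda q: estimates[q])

        # Simulated feedback: shorter waits mean higher (less negative) reward.
        wait = max(0.0, rng.gauss(mean_wait[queue], 1.0))
        reward = -wait

        # Update the running average for the chosen queue.
        counts[queue] += 1
        estimates[queue] += (reward - estimates[queue]) / counts[queue]

    return estimates, counts
```

Calling route_calls([8.0, 3.0, 5.0]) should end up sending most calls to the second queue, since it has the shortest simulated wait. Nobody tells the agent which queue is best; it finds out from the rewards alone.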

Figure: How reinforcement learning works [2].

Origin

Reinforcement learning traces its origins to the 1930s and the pioneering work of behavioral psychologist B. F. Skinner. He demonstrated that animals could be trained to perform complex tasks through simple reinforcement mechanisms, such as receiving a food reward for performing a desired behavior. This led him to develop the concept of positive reinforcement, based on the idea that an animal, or any agent, can learn to optimize its behavior by learning from past experience.

Today, RL is viewed as a type of machine learning that enables computers to improve by learning from reinforcement signals such as rewards or punishments [3].

Context and Usage

Reinforcement learning (RL) has a number of real-world use cases, including the following:

Training systems that deliver customized instruction and materials tailored to each student's needs

Text summarization engines, dialogue agents (text and speech), and game playing

Robotics for industrial automation

Autonomous (self-driving) cars

Why it Matters

Reinforcement learning is an important approach for training machine learning systems because it enables an agent to learn to navigate the complexities of the environment it was built for. For instance, a robot in an industrial setting can be taught to perform a specific task, or an agent can be taught to play a video game. Over time, the agent learns from its environment and optimizes its behavior through a feedback system of rewards and punishments [4].
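As a rough illustration of that feedback loop, the sketch below uses tabular Q-learning, one widely used RL algorithm, on a made-up environment: a short corridor where the agent starts at the left end and earns a reward for reaching the right end. The corridor, the reward values, and the names train_corridor and greedy_policy are assumptions chosen for this example.

```python
import random


def train_corridor(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts in cell 0. Reaching the last cell gives +1 (reward);
    every other step costs -0.01 (a mild punishment).
    Actions: 0 = move left, 1 = move right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of each episode
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                # Exploit, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # The environment responds with a new state and a reward.
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else -0.01

            # Q-learning update: nudge the estimate toward the observed target.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-looking action in each state (0 = left, 1 = right)."""
    return [0 if left > right else 1 for left, right in q]
```

After training, greedy_policy(train_corridor()) should choose "right" in every cell before the goal. The agent is never told which way to go; it works that out from the rewards and small penalties it receives along the way.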

In Practice

A good real-life case study of reinforcement learning in practice is AlphaGo Zero. Using reinforcement learning, AlphaGo Zero learned the game of Go from scratch by playing against itself. After 40 days of self-training, AlphaGo Zero outperformed AlphaGo Master, the version that had defeated world number one Ke Jie [5].
