What is Reinforcement Learning? A 3-Minute Overview

Published
4 min read

With hands-on experience from my internships at Samsung R&D and Wictronix, where I worked on innovative algorithms and AI solutions, as well as my role as a Microsoft Learn Student Ambassador teaching over 250 students globally, I bring a wealth of practical knowledge to my Hashnode blog. As a three-time award-winning blogger with over 2400 unique readers, my content spans data science, machine learning, and AI, offering detailed tutorials, practical insights, and the latest research. My goal is to share valuable knowledge, drive innovation, and enhance the understanding of complex technical concepts within the data science community.

I hope you are doing great. I recently wrote some blogs about supervised machine learning, and since I am now discovering the world of reinforcement learning, I thought I would share a short summary of what I have learned so far.

Introduction to Reinforcement Learning

Most of us are aware of the two conventional types of machine learning, supervised and unsupervised, but there is another type that is very different from these two: reinforcement learning.

So today you will learn about reinforcement learning, but before diving in, let me give you a short recap of when we use supervised and unsupervised machine learning. To figure out which type to use, first analyze your data. If your data is labeled, meaning it has both inputs (features) and the corresponding outputs (labels), we use supervised machine learning. If the labels are missing, we use unsupervised machine learning.

Now the main question arises, What is Reinforcement Learning?

image.png

Example to solidify the understanding

If this general definition is a little difficult to understand, don't worry, let me explain what you just read. As the definition says, "reinforcement learning is feedback-based machine learning." This tells us that in reinforcement learning there is feedback, in the form of a reward or a punishment, which is given to the agent (the entity that takes actions) every time it takes an action in the environment.

If reinforcement learning is feedback-based learning, then do we need data for training our machine learning model?

What makes reinforcement learning different from the conventional types is the absence of a prepared training dataset. The model trains itself through the reward it gets for a desired output and the punishment it gets for an undesired one. Let us consider this example:
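To make this feedback loop concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the `LineWorld` environment, its five-cell corridor, and the +1 / -1 feedback values are hypothetical and not part of any library. The agent here acts randomly and does not learn yet; the point is only to show an action going in and a reward or punishment coming back.

```python
import random


class LineWorld:
    """Hypothetical toy environment: a corridor of cells 0..4, goal at cell 4."""

    def __init__(self, size=5):
        self.size = size
        self.goal = size - 1
        self.position = 0

    def step(self, action):
        """Apply an action (-1 = left, +1 = right).

        Returns (new_position, feedback, done).
        """
        old_distance = self.goal - self.position
        # Keep the agent inside the corridor.
        self.position = min(max(self.position + action, 0), self.size - 1)
        new_distance = self.goal - self.position
        # Reward for getting closer to the goal, punishment otherwise.
        feedback = 1 if new_distance < old_distance else -1
        return self.position, feedback, self.position == self.goal


def run_episode(seed=0, max_steps=100):
    """Let a random agent act in the environment and record the feedback."""
    rng = random.Random(seed)
    env = LineWorld()
    history = []
    for _ in range(max_steps):
        action = rng.choice([-1, 1])  # the agent takes an action
        position, feedback, done = env.step(action)  # the environment answers
        history.append((action, position, feedback))
        if done:
            break
    return history
```

Calling `run_episode()` returns a list of `(action, position, feedback)` tuples, one per step. A learning agent would use that feedback to prefer the actions that earned rewards.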

image.png

In the above illustration, we have a grid with 12 blocks, and S4 is the block we want our agent (the robot in this case) to reach. Since this is reinforcement learning, we do not provide any prior data to train the robot to reach block S4 efficiently. So when the robot takes its first action from its present position to, let's say, block S6, which is not on the right path, we give it negative feedback.

In the same way, the next time the robot moves to some block, let's say S10 this time, which is on the right path, we give it positive feedback. In this way, through trial and error, the robot reaches the desired block by taking guidance from the feedback it receives. One problem remains, though, and that is efficiency, because there are 2 paths from block S9 to block S4.

image.png
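The trial-and-error idea can be sketched in code. The example below uses tabular Q-learning, which is a technique I am swapping in for illustration and not something this post has introduced. The grid layout is also an assumption, since the illustration is not reproduced here: a 3x4 grid numbered S1 to S12 row by row, with no walls, the robot starting at S9 and the goal at S4. The reward values (+10 for reaching the goal, -1 for every other move) and the hyperparameters are hypothetical choices too. The -1 per move is what nudges the robot toward shorter routes.

```python
import random

ROWS, COLS = 3, 4
START, GOAL = (2, 0), (0, 3)  # S9 and S4 under the assumed row-by-row numbering
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def name(state):
    """Convert a (row, col) pair into a block label such as 'S9'."""
    row, col = state
    return f"S{row * COLS + col + 1}"


def step(state, action):
    """Move on the grid; moving into an edge leaves the robot where it is."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), ROWS - 1)
    col = min(max(state[1] + d_col, 0), COLS - 1)
    next_state = (row, col)
    reward = 10.0 if next_state == GOAL else -1.0  # assumed feedback values
    return next_state, reward


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values purely from rewards, with no training dataset."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0
         for r in range(ROWS) for c in range(COLS) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap the length of one episode
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # explore: try something random
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = step(state, action)
            best_next = 0.0 if next_state == GOAL else max(
                q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)])
            state = next_state
            if state == GOAL:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from the start block."""
    state, path = START, [name(START)]
    while state != GOAL and len(path) <= max_steps:
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _ = step(state, action)
        path.append(name(state))
    return path
```

Running `print(" -> ".join(greedy_path(train())))` shows the route the trained robot prefers. On this open grid several routes are equally short, so which one it prints depends on the random exploration.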

How to increase the efficiency of a reinforcement learning model?

We improve the efficiency of the robot by using an algorithm referred to as Generative Action Selection through Probability (GRASP), which improves exploration in reinforcement learning by reshaping the environment, also known as the exploration space, to limit the choices the robot has to make. In the next blog we will take a look at different algorithms used in reinforcement learning.

Short note

So this was a very quick introduction to reinforcement learning. In the coming blogs we will start off with more fundamental concepts and terminology related to reinforcement learning. Also, if you have any suggestions, let me know in the comments. It would really help me a lot.

💡
Also, I would love to connect with you, so here are my Twitter and LinkedIn.