[Data Science] What is bias and why is it so important?

Namaste (hello),

I have started reading about neural networks and deep learning on http://neuralnetworksanddeeplearning.com.

While learning about data analysis, one might wonder what this bias variable is and why it is used in machine learning algorithms. I used to wonder about this too while working on my Bachelor's thesis. Initially I thought that bias was just a variable we add in order to achieve better accuracy. Later on, I realized that it is something fundamental and is used in all the algorithms I have come across so far. While reading chapter 1 today at the above link (which has a very good example), I thought about it for some time, tried to explain it to myself, and came up with a simple scenario. (Feel free to share your own scenarios.)

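Put in simple terms, bias is like a real-life inclination towards a particular outcome or preference that a person has. In a neuron, the bias term plays this role: it is an offset that pushes the output towards one outcome before any input is considered. In the statistical sense, a model with high bias (one whose built-in assumptions are too rigid) tends to under-fit, and we would not get the desired output.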

Consider two individuals, ‘A’ and ‘B’. Say both live in a city with high pollution levels and heavy traffic. ‘A’ has a strong preference for taking a private vehicle (negative bias), and ‘B’ has a strong bias towards taking public transport or walking to the destination (positive bias).

In a normal scenario, A’s bias is a poor fit for such a city, whereas B’s bias fits it very well. In case of an emergency or accident, however, A’s bias might prove helpful, since A can quickly take the injured person to the nearby hospital in their vehicle. B’s strong bias towards public transport would not be feasible here, since public transport can be more time-consuming.

Hence we need to find a person whose bias is balanced: one that minimizes this error and helps in choosing the correct mode of transport for a given scenario. This will help us achieve a good fit.
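The scenario above can be sketched with the perceptron rule: output 1 when the weighted sum of the inputs plus the bias is positive, otherwise 0. This is only an illustration under my own assumptions. The single input (1 in an emergency, 0 otherwise), the weight, the three bias values, and the helper names `perceptron` and `choose` are all hypothetical, picked so that the numbers match the story.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum plus the bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

# Output 1 = public transport / walk, output 0 = private vehicle.
# Hypothetical weight: an emergency pushes the decision towards the private vehicle.
WEIGHTS = [-4.0]

# Hypothetical bias values: A leans private (negative), B leans public (positive).
BIASES = {"A": -3.0, "B": 6.0, "balanced": 2.0}

def choose(person, emergency):
    """Mode of transport picked by a person, given whether it is an emergency."""
    out = perceptron([1 if emergency else 0], WEIGHTS, BIASES[person])
    return "public" if out == 1 else "private"

for person in BIASES:
    print(person,
          "| normal day:", choose(person, False),
          "| emergency:", choose(person, True))
```

With these made-up numbers, A always picks the private vehicle and B always picks public transport, whatever the input says. Only the person with the moderate bias switches from public transport on a normal day to the private vehicle in an emergency, which is the good fit we are looking for.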

In supervised learning, the data set mostly comes from a real scenario, so there will be some bias in the output. Hence we use a bias variable to minimize the error that results from this bias in the training data set: it lets the model learn a baseline offset that does not depend on the inputs. A perfect fit might not be possible, I think, since even humans have some level of bias towards something or someone.
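To see how a bias variable reduces the error, here is a minimal sketch that fits a straight line to data with and without a bias. Everything in it is an assumption for illustration: the data is made up (it follows y = 2x + 5, so the outputs carry a built-in offset of 5), the function names are hypothetical, and I use plain least squares for a single input rather than anything taken from the book.

```python
def fit_without_bias(xs, ys):
    """Least-squares slope for y = w * x (line forced through the origin)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_with_bias(xs, ys):
    """Least-squares slope and bias for y = w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - w * mean_x
    return w, b

def mean_squared_error(xs, ys, w, b=0.0):
    """Average squared difference between the predictions w * x + b and ys."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Made-up data with an offset: every output is shifted up by 5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 5.0 for x in xs]

w_only = fit_without_bias(xs, ys)
w, b = fit_with_bias(xs, ys)

mse_without = mean_squared_error(xs, ys, w_only)
mse_with = mean_squared_error(xs, ys, w, b)

print("without bias: w =", w_only, "error =", mse_without)
print("with bias:    w =", w, "b =", b, "error =", mse_with)
```

Without the bias, the line has to pass through the origin, so it cannot account for the offset and the error stays noticeably large (the model under-fits). With the bias, the fit recovers the slope 2 and the offset 5, and the error drops to essentially zero.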

P.S: The above scenario is just an assumption and not a bias towards anything 😉

Published by Jackson Isaac
