Simple Linear Regression


Simple linear regression models the relationship between a single input variable (also called the predictor, independent variable, input feature or input parameter) and an output variable (also called the response, dependent variable, output feature or output parameter), provided that both variables are continuous. It describes how the output variable changes with the input variable by fitting a straight line to the data.
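In symbols, the straight line is usually written as below. The names for the intercept, slope and error term are standard textbook notation, not symbols taken from this post:

```latex
Y = \beta_0 + \beta_1 X + \varepsilon
```

Here $\beta_0$ is the intercept (the value of Y when X is 0), $\beta_1$ is the slope (the change in Y for a one-unit change in X), and $\varepsilon$ is the part of Y that the line does not explain.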


A scatter plot of Y against X tells us three things about their relationship:

  1. The direction
  2. The strength
  3. The linearity

These characteristics describe the relationship between variable Y and variable X. In the scatter plot for this example, Y and X show a strong positive linear relationship, so we can fit a straight line that describes the data closely.

If the relationship between variable X and variable Y is strong and linear, we conclude that the independent variable X is an effective input variable for predicting the dependent variable Y.
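A scatter plot like the one described here can be drawn with base R. This is a minimal sketch on simulated data: the values of X and Y, the seed and the coefficients 5 and 2 are assumptions made for illustration, not data from this post.

```r
# Simulated data for illustration only (not from the original post).
set.seed(1)
X <- runif(50, min = 0, max = 10)      # hypothetical input variable
Y <- 5 + 2 * X + rnorm(50, sd = 1.5)   # positive linear trend plus random noise

# Scatter plot of Y against X. Read off the direction (upward or downward),
# the strength (how tightly the points cluster) and the linearity (no curve).
plot(X, Y,
     main = "Y versus X",
     xlab = "X (independent variable)",
     ylab = "Y (dependent variable)")
```

With these settings the points should rise from left to right in a fairly tight band, which is the pattern a strong positive linear relationship produces.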

To measure the strength of the linear relationship between variable X and variable Y, we use the correlation coefficient (r), which gives a numerical value for the correlation between two variables. The correlation can be strong, moderate or weak. The closer the absolute value of r is to 1, the stronger the case for using input variable X to predict output variable Y. A few properties of r are listed below:

  1. Range of r: -1 to +1
  2. Perfect positive relationship: +1
  3. Perfect negative relationship: -1
  4. No Linear relationship: 0
  5. Strong correlation: |r| > 0.85 (a rule of thumb; the cutoff depends on the business scenario)
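The properties above can be checked on small made-up vectors. In this sketch the vector x and the three formulas are arbitrary choices for illustration. The third case shows that r = 0 means no linear relationship, even when Y depends entirely on X.

```r
# Made-up values for illustration only.
x <- -5:5

r_pos  <- cor(x, 2 * x + 1)    # points lie exactly on an increasing line
r_neg  <- cor(x, -3 * x + 4)   # points lie exactly on a decreasing line
r_none <- cor(x, x^2)          # symmetric curve: related, but not linearly

print(c(positive = r_pos, negative = r_neg, none = r_none))
```

Up to floating-point rounding, the three values are +1, -1 and 0.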

The command used to calculate r in R (for example, in RStudio) is:

> cor(X, Y)

where X is the independent variable and Y is the dependent variable. If the absolute value of the result is greater than 0.85, simple linear regression is a reasonable choice.

If |r| < 0.85, try transforming the data (for example, taking the logarithm or square root of X or Y) to increase |r|, and then build a simple linear regression model on the transformed data.
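The check-then-transform rule can be sketched as follows. The data are hypothetical: Y is constructed to grow exponentially with X, so a log transformation of Y is known in advance to straighten it. With real data, which transformation helps (if any) has to be found by trying.

```r
# Hypothetical data: Y grows exponentially with X (illustration only).
X <- 1:20
Y <- exp(0.5 * X)

r_raw <- cor(X, Y)        # correlation on the original scale
r_log <- cor(X, log(Y))   # correlation after log-transforming Y

threshold <- 0.85         # rule-of-thumb cutoff used in this post
if (abs(r_raw) > threshold) {
  message("Use X and Y as they are")
} else if (abs(r_log) > threshold) {
  message("Use X and log(Y)")
} else {
  message("Try another transformation or another predictor")
}
```

A model built on log(Y) predicts log(Y), so its predictions must be converted back with exp() to be on the original scale of Y.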

Steps to Implement Simple Linear Regression:

  1. Analyze the data (check the scatter plot for linearity)
  2. Get sample data for model building
  3. Design a model that explains the sample data
  4. Use the fitted model on the whole population to make predictions
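The steps above can be sketched in R with lm() and predict(). Everything about the data here is an assumption for illustration: the data frame name population, the true line Y = 3 + 2X, the noise level and the sample size of 100.

```r
# Simulated "population" for illustration only (not data from this post).
set.seed(42)
population <- data.frame(X = runif(200, min = 0, max = 10))
population$Y <- 3 + 2 * population$X + rnorm(200, sd = 1)

# Steps 1 and 2: draw a sample and check it for a strong linear relationship.
sample_rows <- sample(nrow(population), size = 100)
sample_data <- population[sample_rows, ]
r <- cor(sample_data$X, sample_data$Y)

# Step 3: fit a straight line Y = intercept + slope * X to the sample.
model <- lm(Y ~ X, data = sample_data)
print(coef(model))   # estimated intercept and slope

# Step 4: use the fitted model to predict Y for every row of the population.
population$Y_hat <- predict(model, newdata = population)
```

The estimated intercept and slope should land close to the values 3 and 2 used to generate the data. summary(model) gives the standard errors and the R-squared of the fit.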
