What is Multiple Regression?

A multiple regression analysis examines the relationship between many independent variables and one dependent variable.

Multiple regression is a statistical method for examining the relationship between several independent variables and a single dependent variable. It aims to predict the value of the dependent variable from known values of the independent variables. Each predictor is weighted, and the weights indicate how much that predictor contributes to the overall forecast.

Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. The goal of MLR is to model the linear relationship between the explanatory (independent) variables and the response (dependent) variable. In essence, multiple regression extends simple ordinary least-squares (OLS) regression to more than one explanatory variable.

Why is it essential?

Researchers can use multiple regression analysis to assess the strength of the relationship between an outcome (the dependent variable) and several predictor variables. It also shows the importance of each predictor to that relationship, often with the effect of the other predictors statistically controlled for.

What Multiple Linear Regression (MLR) Can Tell You

Simple linear regression allows an analyst or statistician to make predictions about one variable based on what is known about another variable. It applies when there are two continuous variables: an independent variable and a dependent variable. The independent variable is the input used to calculate the dependent variable, or outcome. A multiple regression model extends this to several explanatory variables.

The MLR model is based on the following assumptions:

  • There is a linear relationship between the dependent variable and the independent variables
  • The independent variables are not too highly correlated with each other
  • yi observations are selected independently and randomly from the population
  • Residuals should be normally distributed with a mean of 0 and a constant variance σ²
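
The assumption that the independent variables are not too highly correlated can be checked numerically. One common diagnostic, not named in this article, is the variance inflation factor (VIF): regress each predictor on the others and compute 1 / (1 − R²). The sketch below uses NumPy and synthetic data; the function name `vif` and the variables `x1`, `x2`, `x3` are illustrative assumptions, and values well above roughly 5 to 10 are often read as a warning sign.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress predictor j on the remaining predictors plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Synthetic predictors (assumption): x3 is almost a copy of x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)  # x1 and x3 should show large values, x2 a value near 1
```

In this setup the two nearly identical predictors get large factors, while the unrelated one stays close to 1.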

The coefficient of determination (R-squared, or R²) is a statistical metric that measures how much of the variation in the outcome can be explained by the variation in the independent variables. R² never decreases as more predictors are added to the MLR model, even when the added predictors are unrelated to the outcome variable.

R² by itself therefore cannot be used to decide which predictors should be included in a model and which should be excluded. R² ranges from 0 to 1, where 0 indicates that the outcome cannot be predicted by any of the independent variables and 1 indicates that the outcome can be predicted without error from the independent variables.
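
To make this behaviour concrete, the sketch below fits a model with and without a pure-noise predictor and compares the two R-squared values. It also computes adjusted R-squared, a standard penalised variant that this article does not cover. The data are synthetic, and the helper names `r_squared` and `adjusted_r_squared` are assumptions made for illustration.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an OLS fit of y on the columns of X plus an intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Synthetic data (assumption): y depends on x only; `noise` is unrelated to y.
rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=(n, 1))
y = 2.0 + 3.0 * x[:, 0] + rng.normal(size=n)
noise = rng.normal(size=(n, 1))

r2_small = r_squared(x, y)
r2_big = r_squared(np.column_stack([x, noise]), y)
adj_small = adjusted_r_squared(r2_small, n, 1)
adj_big = adjusted_r_squared(r2_big, n, 2)

print(r2_small, r2_big)    # the second value is never lower than the first
print(adj_small, adj_big)  # the adjusted version can fall when a useless predictor is added
```

Because the larger model contains the smaller one, its R-squared cannot be lower; the adjusted figure charges for each extra predictor, so it is the more useful number when comparing models of different sizes.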

When interpreting the results of multiple regression, each beta coefficient describes the effect of its own variable while holding all other variables constant (“all else equal”). The output from a multiple regression can be displayed horizontally as an equation, or vertically in table form.

Example of How to Use Multiple Linear Regression (MLR)

As an example, an analyst may want to know how the movement of the market affects the price of ExxonMobil (XOM). In this case, the linear equation will have the value of the S&P 500 index as the independent variable, or predictor, and the price of XOM as the dependent variable.

In reality, multiple factors predict the outcome of an event. The price movement of ExxonMobil, for example, depends on more than just the performance of the overall market. Other predictors such as the price of oil, interest rates, and the price movement of oil futures can affect the price of ExxonMobil (XOM) and the stock prices of other oil companies. To understand a relationship in which more than two variables are present, MLR is used.

MLR is used to determine a mathematical relationship among several random variables. In other words, MLR examines how multiple independent variables are related to one dependent variable. Once each independent variable has been shown to help predict the dependent variable, the information on all of them can be combined to estimate how much effect each has on the outcome variable. The model fits a linear relationship that best approximates all the individual data points.

The MLR equation has the form yi = B0 + B1·xi1 + B2·xi2 + … + Bp·xip + ϵ, where ϵ is the model's error term (the residual). In our example:

  • yi = dependent variable—the price of XOM
  • xi1 = interest rates
  • xi2 = oil price
  • xi3 = value of S&P 500 index
  • xi4= price of oil futures
  • B0 = y-intercept (the predicted value of yi when every independent variable equals zero)
  • B1 = regression coefficient that measures the change in the dependent variable for a one-unit change in xi1, holding the other variables constant: the change in XOM price when interest rates change
  • B2 = regression coefficient that measures the change in the dependent variable for a one-unit change in xi2, holding the other variables constant: the change in XOM price when oil prices change

The least-squares estimates (B0, B1, B2, …, Bp) are usually computed by statistical software. The model can include any number of independent variables, each identified by an index: 1, 2, 3, 4, …, p.
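
As a rough illustration of what that software does, the sketch below estimates B0 through B4 for the XOM example with NumPy's least-squares solver. All numbers are synthetic: the predictor series, the "true" coefficients in `true_b`, and the noise level are invented for the demonstration and are not real market data or real XOM sensitivities.

```python
import numpy as np

# Synthetic stand-ins for the four predictors (assumption: not real market data).
rng = np.random.default_rng(42)
n = 500
interest_rate = rng.normal(3.0, 0.5, n)    # xi1
oil_price = rng.normal(80.0, 10.0, n)      # xi2
sp500 = rng.normal(4000.0, 300.0, n)       # xi3
oil_futures = rng.normal(82.0, 10.0, n)    # xi4

# Invented coefficients B0..B4 used only to generate the synthetic XOM price.
true_b = np.array([10.0, -2.0, 0.5, 0.01, 0.3])

# Design matrix: a column of ones for the intercept, then one column per predictor.
X = np.column_stack([np.ones(n), interest_rate, oil_price, sp500, oil_futures])
xom_price = X @ true_b + rng.normal(0.0, 1.0, n)

# Least-squares estimates of B0..B4.
b_hat, *_ = np.linalg.lstsq(X, xom_price, rcond=None)

for name, est in zip(["B0", "B1", "B2", "B3", "B4"], b_hat):
    print(f"{name} = {est:.4f}")
```

The estimated slopes should land close to the invented values. Real oil prices and oil futures prices move together closely, so a fit on actual data would run into the correlated-predictors problem that MLR assumes away.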

The Difference Between Linear and Multiple Regression

Ordinary least squares (OLS) regression estimates how a dependent variable responds to a change in an explanatory variable. However, a dependent variable is rarely explained by only one variable. In this case, an analyst uses multiple regression, which attempts to explain a dependent variable using more than one independent variable.

Multiple regressions can be linear or nonlinear. MLRs are based on the assumption that there is a linear relationship between the dependent variable and the independent variables. They also assume no major correlation between the independent variables.

How Are Multiple Regression Models Used in Finance?

Any econometric model that looks at more than one explanatory variable may be a multiple regression. Factor models compare two or more factors to analyze relationships between variables and the resulting performance. The Fama and French Three-Factor Model is one such model: it expands on the capital asset pricing model (CAPM) by adding size risk and value risk factors to the market risk factor in CAPM (which is itself a regression model). Small-cap and value stocks have tended to outperform the broader market. By including these two additional factors, the model adjusts for that tendency, which is thought to make it a better tool for evaluating manager performance.
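
A three-factor model of this kind is fitted as an ordinary multiple regression of a fund's excess return on the market, size, and value factors. The sketch below shows the mechanics with NumPy on synthetic monthly data. The factor series `mkt`, `smb`, and `hml`, the "true" loadings, and the noise level are all invented assumptions, not real factor returns.

```python
import numpy as np

# Synthetic monthly factor returns (assumption: invented, not real factor data).
rng = np.random.default_rng(7)
n = 240                                  # 20 years of monthly observations
mkt = rng.normal(0.006, 0.045, n)        # market excess return
smb = rng.normal(0.002, 0.030, n)        # size factor (small minus big)
hml = rng.normal(0.003, 0.030, n)        # value factor (high minus low)

# Invented loadings used to generate a hypothetical fund's excess return.
true_alpha, true_mkt, true_smb, true_hml = 0.001, 1.1, 0.4, -0.2
fund_excess = (true_alpha + true_mkt * mkt + true_smb * smb + true_hml * hml
               + rng.normal(0.0, 0.01, n))

# Multiple regression: intercept (alpha) plus one loading per factor.
X = np.column_stack([np.ones(n), mkt, smb, hml])
coef, *_ = np.linalg.lstsq(X, fund_excess, rcond=None)
alpha_hat, b_mkt, b_smb, b_hml = coef

print(f"alpha={alpha_hat:.4f}  market={b_mkt:.3f}  size={b_smb:.3f}  value={b_hml:.3f}")
```

The intercept (alpha) is the part of the fund's return that the three factors do not explain, which is why this kind of regression is used when evaluating manager performance.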

Owais Siddiqui