Multiple Regression Analysis and Forecasting User Guide: Analyzing the Output
The Output of the Multiple Regression Forecasting model provides the regression equation used to predict the dependent variable from the independent variable values, together with key statistical tests for evaluating its reliability. Information on each of the measures is available by clicking the corresponding question mark icon.
R-Squared: The primary statistic is the R-Squared, which measures the percentage of variation in the dependent variable that can be explained by the independent variables. The aim is to find combinations of independent variables that maximize this percentage. The chart at the top right visualizes the R-Squared by plotting actual versus predicted values of the dependent variable: the closer the dots lie to the line, the higher the R-Squared and the stronger the relationship; widely scattered dots indicate the opposite.
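The model computes R-Squared for you, but the underlying calculation is straightforward. As an illustrative sketch (not the model's own code; the sample data is invented), the following Python snippet fits a one-variable least-squares line and measures R-Squared as one minus the ratio of residual variation to total variation:

```python
# Illustrative R-Squared calculation for a single predictor (sample data invented).
x = [1, 2, 3, 4, 5]          # independent variable observations
y = [2, 4, 5, 4, 5]          # dependent variable observations

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least-squares slope and intercept.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * xi for xi in x]

# R-Squared = 1 - (residual sum of squares / total sum of squares).
sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
sst = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1 - sse / sst
print(round(r_squared, 4))   # 0.6 for this sample data
```

An R-Squared of 0.6 means 60% of the variation in the dependent variable is explained by the predictor, which is what the actual-versus-predicted chart shows visually.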
F-statistic: The next key statistic tests whether the analysis is significant at a 95% level of confidence. The F-statistic must be greater than the critical F-statistic shown to its right, which depends on the number of observations used and the number of variables in the relationship. This test must be passed before the analysis can even be considered as a predictor for the dependent variable.
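The F-statistic for overall significance follows directly from R-Squared, the number of observations n, and the number of independent variables k. A minimal sketch using the standard textbook formula (the example values are invented, not taken from the model's sheet):

```python
# Standard F-statistic for overall regression significance (illustrative values).
def f_statistic(r_squared, n, k):
    """Explained variance per predictor over unexplained variance per residual df."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# Example: R-Squared of 0.6 from 5 observations and 1 independent variable.
f = f_statistic(0.6, n=5, k=1)
print(round(f, 2))  # 4.5
```

This value is then compared against the critical F-value for (k, n - k - 1) degrees of freedom at the chosen confidence level, which the model looks up for you.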
Below these statistics are the actual components of the multiple regression equation. To the right of the equation components, each individual independent variable is analyzed to help identify and eliminate any problem variables. Individual R-Squared statistics are returned so that independent variables with low R-Squared values can be removed and potentially replaced by stronger value drivers. If the feature selection values were set to zero to disable automatic feature selection, this becomes an iterative process of strengthening the overall analysis.
Durbin-Watson statistic: The Durbin-Watson statistic, at the top right of the Output sheet, tests the overall regression equation for the presence of autocorrelation. Essentially, this tests whether a significant part of the change in the dependent variable is actually a product of time rather than of the independent variables (i.e. whether one observation period has effects on future observations). Comparison against critical ranges of the Durbin-Watson statistic highlights the presence of positive or negative autocorrelation; for minimal autocorrelation the statistic should be close to 2. If the overall equation exhibits autocorrelation, the same statistic, displayed below for each independent variable against the dependent variable, can be used to isolate the source.
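The Durbin-Watson statistic is computed from the regression residuals using the standard formula: the sum of squared successive differences divided by the sum of squared residuals. A minimal sketch with invented residuals:

```python
# Durbin-Watson statistic from regression residuals (residuals are invented).
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Residuals that alternate in sign, a pattern typical of negative autocorrelation.
residuals = [0.5, -0.3, 0.4, -0.2, 0.1]
dw = durbin_watson(residuals)
print(round(dw, 3))  # 2.873, i.e. above 2
```

Values near 2 indicate little autocorrelation; values well below 2 suggest positive autocorrelation and values well above 2 suggest negative autocorrelation.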
Multicollinearity: Sometimes two or more independent variables have a similar relationship to the dependent variable and are closely related to each other. This is known as multicollinearity, and it biases the analysis toward an artificially strong predictive outcome. In such cases it is usually appropriate to eliminate the independent variable that is more difficult to predict in the forecast.
To detect multicollinearity, the model calculates an R-Squared statistic for each independent variable against all other independent variables. Where a strong relationship is detected, the independent variables responsible are listed.
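The same idea can be sketched with a simple pairwise check: square the Pearson correlation between each pair of independent variables and flag pairs above a threshold. This is an illustrative simplification of what the model does (the model regresses each variable against all of the others together); the data and the 0.8 cutoff here are invented:

```python
# Pairwise multicollinearity check via squared Pearson correlation (illustrative).
def r_squared_pair(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab ** 2 / (saa * sbb)

independents = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 4, 6, 8, 10],   # exactly twice x1, so perfectly collinear with it
    "x3": [5, 3, 6, 2, 4],
}

names = list(independents)
flagged = []
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        r2 = r_squared_pair(independents[names[i]], independents[names[j]])
        if r2 > 0.8:          # invented threshold for a "strong" relationship
            flagged.append((names[i], names[j]))

print(flagged)  # [('x1', 'x2')]
```

Here x2 is an exact multiple of x1, so the pair is flagged; one of the two would normally be removed from the analysis, keeping whichever is easier to predict in the forecast.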