Consider another example in which we want to create a model for predicting the number of weekly operating errors in a manufacturing plant. The variables that have been measured include:
- Number Of Weekly Operating Errors (the dependent variable)
- Machine Downtime (%)
- Number Of Workers
Let's start with a simple linear regression using machine downtime as the explanatory variable, which results in the equation:
# Of Weekly Errors = 50.49 − 3.82 × % Downtime
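As a minimal sketch of how an equation of this form can be obtained, the snippet below fits a least-squares line using the closed-form formulas for slope and intercept. The data values and the names `downtime_pct`, `weekly_errors`, and `fit_simple_regression` are hypothetical, invented for illustration. They are not the plant data behind the equation above, so the coefficients will differ.

```python
# Hypothetical weekly observations, invented for illustration only.
downtime_pct = [2.0, 4.0, 6.0, 8.0, 10.0]        # machine downtime (%)
weekly_errors = [43.0, 35.0, 28.0, 19.0, 12.0]   # operating errors per week


def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_regression(downtime_pct, weekly_errors)
print(f"errors = {intercept:.2f} + ({slope:.2f}) * downtime")
```

In practice a statistics package would normally do this fit. The hand-written version is shown only to make the calculation visible.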
We can use a residual plot to help evaluate the fit of our simple regression to the data.
This residual plot does not show any discernible trend, so for this simple regression a linear relationship appears to be a reasonable model.
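The residuals behind such a plot are the observed values minus the values predicted by the fitted line. The sketch below computes them using the coefficients from the equation above. The observations and the names `downtime_pct`, `weekly_errors`, and `residuals` are hypothetical, made up for illustration, and are not the data set discussed here.

```python
# Coefficients taken from the fitted equation in the text.
INTERCEPT = 50.49
SLOPE = -3.82

# Hypothetical observations, invented for illustration only.
downtime_pct = [2.0, 4.0, 6.0, 8.0, 10.0]        # machine downtime (%)
weekly_errors = [43.0, 35.0, 28.0, 19.0, 12.0]   # operating errors per week


def residuals(x, y, intercept, slope):
    """Return observed minus predicted values for the line intercept + slope * x."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


res = residuals(downtime_pct, weekly_errors, INTERCEPT, SLOPE)

# List the residuals in order of the explanatory variable. A steady drift or a
# curved pattern in this column would suggest that a straight line is a poor fit.
for xi, ri in sorted(zip(downtime_pct, res)):
    print(f"downtime = {xi:5.1f}%   residual = {ri:+.2f}")
```

Plotting these residuals against downtime gives the residual plot. Reading the printed column is a rough substitute when no plotting library is available.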
However, suppose we want to turn this into a multiple regression by adding a second variable, to see if we can create a model that is a better fit for the data.