Statistics 4 Business - RISHABH LALA

Linear Regression for Better Business Decisions in Engineering Projects

In the realm of civil engineering, the ability to predict project outcomes accurately is invaluable. Linear regression, a foundational statistical tool, plays a crucial role in this predictive process. By understanding and applying the assumptions and diagnostics of linear regression models, engineers and project managers can make informed decisions, optimize resources, and improve the efficiency and profitability of projects.

The Essence of Linear Regression in Engineering: Linear regression models predict a continuous outcome variable based on one or more predictor variables. The model's accuracy hinges on several critical assumptions:

Linearity: The relationship between the predictors and the outcome must be linear. To check, plot the residuals against the fitted values: the points should lie in a roughly horizontal band around zero with no curvature. Fix: investigate and, where justified, remove outliers, or apply transformations to the data to achieve linearity.
Multivariate Normality: Residuals should follow a normal distribution. Check visually with a Q-Q (quantile-quantile) plot of the residuals. Formal tests include the Shapiro-Wilk test (commonly used for n < 2000) and the Kolmogorov-Smirnov test (for n > 2000).
Independence: Observations must be independent of each other. The residuals should be uncorrelated with one another and with the predictor variables.
Normality of Residuals: The model's residuals, or differences between observed and predicted values, should follow a normal distribution.
Homoscedasticity: The variance of the error terms should be constant across all levels of the independent variables. In a plot of the residuals against the fitted values (or against a predictor), the spread of the residuals should show no pattern, such as a funnel shape, as the value on the x-axis increases. Heteroscedasticity, the opposite of homoscedasticity, violates this assumption of linear regression. Formal tests include the Bartlett test (p > 0.05 means equal variance across the groups being compared is not rejected), the Levene test, the Breusch-Pagan test, and the White test. To correct for heteroscedasticity, use heteroscedasticity-consistent standard errors (HCSE, also called robust standard errors). These correct only the standard errors and leave the coefficient estimates unchanged. Alternatively, re-estimate the model with FGLS (feasible generalized least squares) or WLS (weighted least squares).
No Multicollinearity: Predictor variables should not be highly correlated with each other. In the ideal case they are uncorrelated.

Practical Application and Example: Consider a civil engineering firm planning a bridge construction project. The project's cost could be influenced by various factors such as material costs, labor rates, and project duration. By employing linear regression, the firm can predict total project costs based on these factors.
Formula: Total Cost = β0 + β1(Material Cost) + β2(Labor Rate) + β3(Project Duration) + ϵ

Material Cost, Labor Rate, and Project Duration are the predictor variables.
Total Cost is the outcome variable.
β0, β1, β2, and β3 are the coefficients estimated by the regression model.
ϵ represents the error term.
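
To make the formula concrete, the following is a minimal sketch of estimating β0 through β3 by ordinary least squares, solving the normal equations in plain Python. Everything in it is an assumption for illustration: the helper names (solve_linear_system, fit_ols, predict), the units, and the project figures, which are invented and generated without noise from known coefficients so that the fit can be checked against them.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    """Predicted outcome for one set of predictor values."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


# Hypothetical past projects: (material cost, labor rate, project duration).
projects = [
    (120.0, 35.0, 10.0), (200.0, 40.0, 8.0), (150.0, 32.0, 16.0),
    (300.0, 45.0, 12.0), (250.0, 38.0, 20.0), (180.0, 42.0, 14.0),
    (220.0, 30.0, 9.0), (140.0, 44.0, 18.0),
]
# Known coefficients used only to simulate noise-free total costs.
TRUE_BETA = [50.0, 1.2, 3.0, 4.5]
total_cost = [predict(TRUE_BETA, p) for p in projects]

beta_hat = fit_ols(projects, total_cost)
print("estimated coefficients:", [round(b, 4) for b in beta_hat])
print("predicted cost for a new project:", round(predict(beta_hat, (210.0, 36.0, 15.0)), 2))
```

Real project data contains noise, so the estimates would only approximate the underlying coefficients, and the residuals would then feed the diagnostic checks.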

Diagnostics: After fitting the model, diagnostics are crucial. Residual plots can help check for violations of linearity and homoscedasticity. For instance, plotting residuals against fitted values should ideally show no clear pattern, indicating a good model fit.

Linear regression is a powerful tool for making better business decisions in engineering projects. By rigorously checking regression assumptions and employing diagnostic tools, engineers can enhance model accuracy. This leads to more reliable predictions and outcomes, driving project success and innovation in civil engineering practices.

