Dick’s Sporting Goods is known for selling a wide variety of sports equipment, such as exercise machines, fishing gear, and hunting products, as well as sports accessories and supplies. The company was founded in 1948 by Richard Stack. At the time, shortly after the end of World War II, he was working at an Army/Navy store in his hometown in New York. His grandmother gave him $300, and he rented a store to start the first Dick’s Sporting Goods. Slowly the company started to take off, and he expanded by opening several more stores.
As of 2015, Dick’s has about 610 stores in 48 states, mainly in the eastern United States.

Steps for Regression

Regression measures the strength of the statistical relationship between a dependent variable and one or more independent variables, which change over time. Regression takes a variety of variables and tries to predict Y by finding a mathematical relationship between them. Multiple regression means that more than one independent variable is used to predict Y, the dependent variable.
The main idea is to see the relationship between the independent variables and the dependent variable. A fitted regression is a line that minimizes the sum of squared errors. After identifying all the variables needed to continue the work, the first step in Minitab is to produce a Correlation Matrix. The reason for this step is to see whether any of the independent variables are correlated with each other, which we do not want. The Correlation Matrix will show whether there is any collinearity between the independent variables.
Collinearity is the problem of two or more independent variables being very highly correlated with each other; as a consequence, the coefficients have large standard errors and smaller t statistics. When two or more independent variables are very highly correlated, including all of them does not make the forecast better; in fact, it makes it less accurate. The preferred action when this happens is to drop one of the variables and keep the remaining data.
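The correlation check described above can be sketched in a few lines of Python instead of Minitab. This is only an illustration: the variable names and numbers are made up (they are not Dick’s Sporting Goods data), and the 0.8 cutoff for "very highly correlated" is an assumption, not a value taken from the course.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlated_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for predictor pairs with |r| above threshold.

    The threshold of 0.8 is an assumed cutoff for illustration only.
    """
    names = list(predictors)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(predictors[a], predictors[b])
            if abs(r) > threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged

# Hypothetical independent variables (made-up numbers).
predictors = {
    "income": [10, 12, 15, 18, 21, 25],
    "spending": [20, 24, 30, 36, 42, 50],  # exactly twice income
    "rainfall": [5, 1, 4, 2, 6, 3],
}

flagged = correlated_pairs(predictors)
print(flagged)
```

Here only the income and spending pair is flagged, so one of those two would be the candidate to drop before running the regression.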
Scatterplots offer another way to check, by visual inspection, the relationship between the independent variables and the dependent variable. If we draw an imaginary straight line that fits the data and see many points lying close to that line of best fit, there is a strong relationship between the two variables. On the other hand, if it is difficult to tell where the line should be drawn, there is no significant relationship and probably little or no correlation.
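To make the idea of a line of best fit concrete, the sketch below fits a straight line to one independent variable by least squares and shows that moving the line away from the fitted values makes the sum of squared errors larger. The data points are made up for illustration, and the function names are my own.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single independent variable."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared vertical distances between the points and the line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Hypothetical data that lies close to a straight line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
best_sse = sum_squared_errors(x, y, intercept, slope)

# Any other line gives a larger sum of squared errors.
tilted_sse = sum_squared_errors(x, y, intercept, slope + 0.1)
shifted_sse = sum_squared_errors(x, y, intercept + 0.1, slope)

print(intercept, slope, best_sse, tilted_sse, shifted_sse)
```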
The next part is the percentage of error, otherwise known as the alpha value. Typically, alpha values of .01, .05, and .10 are reasonable choices. Once we have the dependent and independent variables, the next step is to go ahead and run a regression. The regression will tell us whether there is significance in the data. To check for significance, we must look at the T and P values. A T-value measures how far an estimated coefficient is from zero relative to its standard error. A P-value is the probability of observing a result equal to or more extreme than the one obtained, assuming the null hypothesis is true.
After running the regression, if a variable's p-value is higher than alpha, we should drop that variable from the model. We must be careful to eliminate one variable at a time so that we can observe the changes in the results. After eliminating a variable we should always run the regression again. Any insignificant variables should be eliminated so that we present the most accurate model possible with the data available. VIFs (Variance Inflation Factors) will also appear in the regression results.
The VIF measures how much the variance of an estimated coefficient is inflated by correlation with the other independent variables. A VIF close to 1 tells us that there is no collinearity, and VIFs below 2.5 are generally not a concern; on the other hand, a VIF higher than 5 indicates collinearity. These cutoffs are rules of thumb rather than a formal hypothesis test. For serial correlation there are two different tests we can use: first the Durbin Watson test, then the LM test.
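The VIF rule of thumb can be illustrated with a small sketch. For simplicity it assumes a model with only two independent variables, where the VIF reduces to 1 / (1 - r²) with r the correlation between them; with more variables, r² would instead come from regressing each independent variable on all the others. The data and variable names are hypothetical, and the "unclear" label for values between 2.5 and 5 is my own assumption.

```python
def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two independent variables: 1 / (1 - r^2)."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    var1 = sum((a - m1) ** 2 for a in x1)
    var2 = sum((b - m2) ** 2 for b in x2)
    r_squared = cov * cov / (var1 * var2)
    if r_squared >= 1.0:
        return float("inf")  # perfectly collinear variables
    return 1.0 / (1.0 - r_squared)

def interpret_vif(vif):
    """Apply the cutoffs from the text; the middle band is an assumed label."""
    if vif > 5:
        return "collinearity"
    if vif < 2.5:
        return "no collinearity"
    return "unclear"

# Hypothetical independent variables (made-up numbers).
advertising = [1, 2, 3, 4, 5, 6]
promotions = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]  # moves almost in step with advertising
temperature = [3, 1, 4, 1, 5, 2]             # nearly unrelated to advertising

high_vif = vif_two_predictors(advertising, promotions)
low_vif = vif_two_predictors(advertising, temperature)

print(high_vif, interpret_vif(high_vif))
print(low_vif, interpret_vif(low_vif))
```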
The Durbin Watson test checks for the presence of autocorrelation in the residuals. If the statistic is between 1.50 and 2.50, there is no serial correlation and we do not reject the null; if it is below 1.50 or above 2.50, we must reject the null (there is serial correlation). The LM test statistic is the number of observations multiplied by the R squared value from a regression of the residuals on their lagged values and the independent variables. We then compare that result with the critical value. If the statistic is higher than the critical value, there is serial correlation; if it is lower, there is no serial correlation.
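As a rough sketch of the Durbin Watson check only (the LM test is not shown), the statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The residual series below are made up to show each case, and the 1.50 to 2.50 band is the rule of thumb used in this paper.

```python
def durbin_watson(residuals):
    """Durbin Watson statistic for a list of residuals in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def has_serial_correlation(dw, low=1.5, high=2.5):
    """Rule of thumb from the text: outside 1.50 to 2.50 means serial correlation."""
    return dw < low or dw > high

# Hypothetical residual series.
trending = [1, 1, 1, -1, -1, -1]         # residuals stay on one side for a while
alternating = [1, -1, 1, -1, 1, -1]      # residuals flip sign every period
balanced = [1, -1, -1, 1, 1, -1, -1, 1]  # no clear pattern

for name, series in [("trending", trending),
                     ("alternating", alternating),
                     ("balanced", balanced)]:
    dw = durbin_watson(series)
    print(name, round(dw, 3), has_serial_correlation(dw))
```

A value well below 2 points to positive serial correlation, a value well above 2 to negative serial correlation, and a value near 2 to none.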
The KB test tells us whether the model is heteroscedastic or homoscedastic. To run it, the squared residuals are regressed on the squared fits, and we look at the T-value of the resulting coefficient. If it is below 2.0 we do not reject the null, which tells us the model is homoscedastic; above 2.0 we reject, and the model is heteroscedastic. The T-test tells us whether the coefficients are significantly different from zero. If the T-values are not significantly different from zero, we do not reject the null. The F-test helps us determine how solid, or strong, the model is.
The F-test compares the fits of different linear models. As a rule of thumb, above 5.0 we reject the null and below 5.0 we do not reject the null.

The Nine Regression Issues

The first problem is residuals that are not normally distributed, along with the mean of the residuals not being equal to zero. One way to detect this issue is by visual inspection of the histogram on the 4-in-1 residual plot. At this point we can do one of two things: ignore it, with the consequence of larger error and an inaccurate model, or remove variables and get a lower residual error and a more accurate model.
The second problem is serial correlation, which is the relationship between a variable and itself over various time intervals. It means the covariance of the error terms is not equal to zero. Visual inspection can be done with the 4-in-1 residual plot. We can use the Durbin Watson test to detect it, as well as the LM test. If the Durbin Watson statistic falls outside the range of 1.50 to 2.50, we have serial correlation. In this case we cannot rely on the F-test and T-test because they will be unreliable, and the estimates will be inefficient. Getting better data and variables will help.
For my company, Dick’s Sporting Goods, some of the data I wanted was hard to find, so I used the best available data I could find. A potential temporary fix is to add a lag, but we still need to come up with a better one. The third problem is heteroscedasticity. This issue can be detected by visual inspection using the 4-in-1 residual plot. Another test we can use for heteroscedasticity is the KB test. Heteroscedasticity can cause the model to no longer have the smallest variance, which makes it inefficient. Possible fixes are to get better data, or to ignore the issue if it is not significant.
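The KB test can be sketched as follows, assuming it is run as a simple regression of the squared residuals on the squared fits and judged by the T-value of the slope against the cutoff of 2.0. The fits and residuals are made-up numbers, and the function names are my own rather than anything from Minitab.

```python
from math import sqrt

def slope_t_value(x, y):
    """T-value of the slope in a simple regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    standard_error = sqrt(sse / (n - 2) / sxx)
    return slope / standard_error

def kb_t_value(fits, residuals):
    """Regress squared residuals on squared fits and return the slope's T-value."""
    squared_fits = [f ** 2 for f in fits]
    squared_residuals = [e ** 2 for e in residuals]
    return slope_t_value(squared_fits, squared_residuals)

# Hypothetical fitted values from some regression.
fits = [1, 2, 3, 4, 5, 6, 7, 8]

# Residuals that spread out as the fits grow (heteroscedastic pattern).
growing = [0.1, -0.2, 0.3, -0.5, 0.6, -0.9, 1.0, -1.3]

# Residuals with roughly constant spread (homoscedastic pattern).
steady = [0.5, -0.4, 0.6, -0.5, 0.4, -0.6, 0.5, -0.4]

t_growing = kb_t_value(fits, growing)
t_steady = kb_t_value(fits, steady)

print("growing spread:", round(t_growing, 2), abs(t_growing) > 2.0)
print("steady spread:", round(t_steady, 2), abs(t_steady) > 2.0)
```

Comparing the absolute T-value with 2.0 flags the first series as heteroscedastic and the second as homoscedastic.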
Converting the data to logarithms can also help with heteroscedasticity. The fourth problem is incorrect signs on the coefficients, which we can check in the regression analysis. Incorrect signs often go hand in hand with multicollinearity. Better data will fix most of the problem. The fifth problem is collinearity between the independent variables. It can be detected with the Correlation Matrix and also from the VIFs (Variance Inflation Factors). The consequence is that the confidence intervals may become wider. A possible fix is to drop a variable and rerun the model.
When we drop a variable, however, our model may lack an important variable. The sixth problem is coefficients that are not significantly different from zero. We can detect this with the T-test: if a coefficient does not come out significantly different from zero, the model is inaccurate. To fix this we can drop those variables, which changes the model. The seventh problem is all coefficients being equal to zero at the same time. To detect this we must use the F-test.
A possible fix is to get better variables and data to change the model, which might lead to a more accurate model. The eighth problem is an R squared near zero, meaning the independent variables are not explaining the dependent variable. We can find this in the regression analysis. The consequence is an ineffective model. It can be fixed by getting better data, which leads to a more accurate model. The ninth problem is a significant SSE, which is also found in the regression output. If this is found, it indicates that the model is inaccurate and ineffective. We can get better variables to make it more accurate.
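The link between SSE, R squared, and the F-test can be shown with a short sketch. It uses the standard formulas R squared = 1 - SSE / SST and F = (R squared / k) / ((1 - R squared) / (n - k - 1)), where n is the number of observations and k the number of independent variables. The actual and fitted values are made up, and k = 2 is assumed only for the example.

```python
def sse_and_r_squared(actual, fitted):
    """Return (SSE, R squared) for observed values and regression fits."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return sse, 1 - sse / sst

def f_statistic(r_squared, n, k):
    """Overall F statistic for a regression with n observations and k predictors."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# Hypothetical observed values and fits (made-up numbers).
actual = [10, 12, 15, 19, 24, 30]
fitted = [9.5, 12.5, 15.5, 18.5, 24.5, 29.5]

sse, r2 = sse_and_r_squared(actual, fitted)
f_value = f_statistic(r2, n=len(actual), k=2)  # k = 2 is an assumption

print("SSE:", sse)
print("R squared:", round(r2, 4))
print("F:", round(f_value, 1), "reject null:", f_value > 5.0)
```

A small SSE relative to the total variation gives an R squared near 1 and a large F value, while an R squared near zero gives an F value below the 5.0 rule of thumb.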