A Brief Interpretation of the Output of a Simple Regression

[EViews regression output table; the numbered parts (1)–(24) are discussed below.]
(1)
Number of observations: It must be greater than the number of estimated coefficients (the number of independent variables plus one for the constant). Here we want to estimate the effect of 1 variable only, so the number of observations must be 3 or more; we have 41 observations, which is good.
It is better to have a large number of observations to get a good result, like 100 or more. (The larger, the better.)
(2) and (3)
C is the constant and its value is 0.155798. This result says that if there is no X, or say if X is zero, then the value of Y is 0.155798.
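Putting this constant together with the slope reported below in (8), the fitted line implied by this table can be written as a quick worked equation (the numbers are the ones from the table itself):

```latex
\hat{Y} = 0.155798 + 1.025555\,X
\qquad\Longrightarrow\qquad
\hat{Y} = 0.155798 \ \text{when}\ X = 0
```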

(4)
0.422690 is the standard error of 0.155798. The standard error measures how precisely the coefficient 0.155798 is estimated; you can use it later for hypothesis tests and confidence intervals. (The smaller, the better.)


(5)
 0.368588 is t-Stattistic for coefficient 0.155798
 If you divide coefficient by its standard error  you will get  its t-statistic.  0.155798/0.422690=0.368588. So 0.368588 is the t-Stattistics for 0.155798
T statistics tells us whether coefficient is significant or not. If  absolute t-statistics (without positive or negative sign) is greater than the critical value of T distribution then coefficient is significant. Insignificant otherwise. For instance, t-critical value for 41 observations and two parameters is 1.685 . since 0.368588 is not greater than 1.685 the coefficient 0.155798 is not significant.  However, for now  you don’t need to perform tests using t-statistics  because Eviews calculates P-values for you which is easier to calculate the significance .
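As a quick check of this arithmetic, the sketch below (assuming scipy is available; the numbers are the ones from the table) recomputes the t-statistic and looks up the critical value; 1.685 corresponds to the one-sided 5% critical value with 41 - 2 = 39 degrees of freedom:

```python
from scipy import stats

coef, se = 0.155798, 0.422690
t_stat = coef / se                      # 0.3686, as in the table
df = 41 - 2                             # observations minus estimated parameters

t_crit = stats.t.ppf(0.95, df)          # ~1.685, one-sided 5% critical value
print(abs(t_stat) > t_crit)             # False -> coefficient not significant
```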
(6)
0.7144 is the p-value of the t-statistic.
It tells us whether the coefficient is significant or not, and it is easier to use than (5). If the p-value is 0.01 or smaller, the coefficient is significant at the 1% level, meaning that the estimated coefficient is very strongly significant. If it is 0.05 or smaller, the coefficient is strongly significant at the 5% level. If it is 0.10 or smaller, the coefficient is significant, but not as strongly as in the previous two cases. In the table the p-value is 0.7144, which means that the coefficient 0.155798 is not significant.
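The reported p-value is the two-sided tail probability of the t-distribution, and recomputing it from the t-statistic above reproduces the 0.7144 in the table (a sketch assuming scipy):

```python
from scipy import stats

t_stat, df = 0.368588, 41 - 2
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-sided p-value
print(round(p_value, 4))                    # ~0.7144, matching the table
```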
(7)
X is the independent variable, the variable whose effect on Y you want to test.
(8)
The coefficient of the independent variable. It is the most important part of this table. It tells us how much the dependent variable changes if X changes by 1 unit. The estimated value 1.025555 means that if X increases by 1 unit, Y increases by 1.025555 units, and if X decreases by 1 unit, Y decreases by 1.025555 units. Please note that the coefficient is positive: it means the relation between X and Y is positive, or that X has a positive effect on Y. If you find a negative value, it means they have a negative relation, or that X has a negative impact on Y. (A small worked example follows below.)
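In equation form, a change in X moves the prediction by 1.025555 times that change (using the slope from the table):

```latex
\Delta\hat{Y} = 1.025555\,\Delta X
\qquad\text{e.g. } X: 10 \to 11 \ \Rightarrow\ \hat{Y}\ \text{rises by about } 1.03
```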
(9) Standard error of 1.025555; same explanation as (4).
(10) t-statistic for 1.025555; same explanation as (5).
(11) p-value of the t-statistic for 1.025555; same explanation as (6).
Note that here the p-value is 0.0000, which is smaller than 0.01. It implies that the coefficient 1.025555 is strongly significant (at the 1% significance level). Now you can say that variable X significantly affects Y, or that variable X has a statistically significant effect on Y.

(12) R-squared:
It always lies between 0 and 1 (in a regression with a constant term). It tells you how successful your model is in predicting; a higher R-squared is better. In a very poor model R-squared is close to zero, like 0.03. Here R-squared is 0.79193, which implies that about 79% of the changes in Y are explained by the changes in the independent variable X.
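For reference, R-squared relates the Sum squared resid from item (14) to the total variation of Y around its mean, where the e-hats are the residuals:

```latex
R^2 \;=\; 1 \;-\; \frac{\sum_{i} \hat{e}_i^{\,2}}{\sum_{i} \left(Y_i - \bar{Y}\right)^2}
```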
(13) Adjusted R-squared:
It is always equal to or smaller than R-squared, and it does the same job as R-squared: measuring how good your model is at predicting. But it has a specialty: if you add more variables, even irrelevant ones, R-squared increases, but adjusted R-squared does not have to. Therefore adjusted R-squared is kind of smarter than R-squared ;)
The higher the adjusted R-squared (closer to 1), the better the model. Sometimes, in a very poor model, adjusted R-squared becomes negative; a negative adjusted R-squared is treated as zero. The formula below shows where the adjustment comes from.
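With n observations and k estimated coefficients (including the constant), the usual formula is:

```latex
\bar{R}^2 \;=\; 1 - \left(1 - R^2\right)\frac{n - 1}{n - k}
```

Adding a variable raises k, which inflates the penalty factor (n - 1)/(n - k), so an irrelevant variable can lower adjusted R-squared even though plain R-squared never falls.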
(14) S.E. of regression and Sum squared resid:
Both measure how much the estimated Y differs from the actual Y (the actual Ys are the values in the Y series of your data file). Of course, it is not good if they differ too much. A smaller S.E. of regression and a smaller Sum squared resid are better for any model.
Note: the S.E. of regression is the square root of the Sum squared resid divided by the degrees of freedom (see the formula below); it estimates the standard deviation of the regression errors.
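In formula form, with n observations and k estimated coefficients (so the degrees of freedom are n - k):

```latex
\text{S.E. of regression} \;=\; \sqrt{\frac{\text{Sum squared resid}}{n - k}}
```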
(15) Log likelihood:
It is useful when you compare two nested models. You will usually find this value negative; a higher value is better, for example -40 is better than -90. A negative value closer to zero indicates a better-fitting model, and between two models you should choose the one with the higher log-likelihood.
(16) F-statistic:
It is used for testing the overall significance of a model, especially in a model with more than one independent variable. Do all the independent variables in the model jointly have a significant effect on the dependent variable? The F-statistic answers this question. If the F-statistic is greater than the F-critical value, you can say that the variables are jointly significant. (F-critical values are available in the last pages of econometrics and statistics books.)
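For a model with a constant, the F-statistic can be written in terms of R-squared; plugging in this table's values (R-squared = 0.79193, n = 41, k = 2) gives roughly 148:

```latex
F \;=\; \frac{R^2/(k-1)}{\left(1 - R^2\right)/(n - k)}
\;=\; \frac{0.79193/1}{0.20807/39} \;\approx\; 148.4
```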
(17) Prob(F-statistic):
However, you don't need to compare the F-statistic with the F-critical value yourself. By looking at Prob(F-statistic) you can easily check the overall significance of the independent variables. If Prob(F-statistic) is 0.01 or smaller, all the variables in the model jointly have a significant effect on the dependent variable at the 1% significance level; if it is 0.05 or smaller, at the 5% level; and if it is 0.10 or smaller, at the 10% level.
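A sketch (assuming scipy) of how Prob(F-statistic) follows from the F-statistic and its degrees of freedom:

```python
from scipy import stats

r2, n, k = 0.79193, 41, 2                   # values from this table
F = (r2 / (k - 1)) / ((1 - r2) / (n - k))   # ~148.4
prob_f = stats.f.sf(F, k - 1, n - k)        # Prob(F-statistic)
print(prob_f < 0.01)                        # True -> jointly significant at 1%
```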

(18) Mean dependent var:
Simply the average of Y, the dependent variable.
(19) S.D. dependent var:
It measures how much the values of Y differ from their average value.
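Both are just the usual sample formulas applied to the Y series (I am assuming the standard n - 1 denominator for the standard deviation):

```latex
\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i,
\qquad
\text{S.D.} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}
```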

(20) Akaike info criterion, (21) Schwarz criterion, (22) Hannan-Quinn criter.:
They are calculated with almost the same formulas, using the log-likelihood, and are called MODEL SELECTION CRITERIA. Smaller values are preferred. If you have different models to compare, the preferred model is the one with the smaller AIC, SC, and HQ values, or with smaller values for at least two of them. The AIC is especially widely used for time-series models.
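For reference, with log-likelihood l, k estimated coefficients, and n observations, the criteria look like this in per-observation form (I believe this is the scaling EViews reports, but check your software's documentation):

```latex
\text{AIC} = \frac{-2\ell + 2k}{n},\qquad
\text{SC}  = \frac{-2\ell + k\ln n}{n},\qquad
\text{HQ}  = \frac{-2\ell + 2k\ln(\ln n)}{n}
```

All three reward a higher log-likelihood and penalize extra coefficients; SC penalizes them hardest, which is why it tends to pick smaller models.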

(24) Durbin-Watson stat:
It tells us whether our model suffers from a 'serial correlation problem'.
If it is close to 2: no serial correlation in the model.
If it is close to 0: positive serial correlation in the model.
If it is close to 4: negative serial correlation in the model.
It is better if the Durbin-Watson stat is near 2, such as 1.70, 2.01, or 2.20. In our model we found 1.69882, indicating no serial correlation in the model (see the formula below).
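The statistic is computed from the residuals, and the approximation DW ≈ 2(1 - rho) explains the three cases above: a first-order residual autocorrelation near 0 gives DW near 2, near 1 gives DW near 0, and near -1 gives DW near 4.

```latex
DW \;=\; \frac{\sum_{t=2}^{n}\left(\hat{e}_t - \hat{e}_{t-1}\right)^2}{\sum_{t=1}^{n} \hat{e}_t^{\,2}}
\;\approx\; 2\left(1 - \hat{\rho}\right)
```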





Important Note!
The purpose of this post is to give a basic idea of the results of a simple regression model computed by econometric software (I have used EViews). So some of my comments about the results are somewhat oversimplified. For example, "a higher R-squared is better" does not make sense if you are dealing with non-stationary variables.
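If you would like to experiment with these statistics yourself outside EViews, here is a minimal sketch in Python using statsmodels (this is not the author's EViews workflow, and the data are made up for illustration); its summary() prints the same quantities discussed above: coefficients, standard errors, t-statistics, p-values, R-squared, log likelihood, F-statistic, information criteria, and the Durbin-Watson stat.

```python
# A minimal sketch: simple OLS regression in Python (statsmodels).
# The data below are simulated for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=41)                  # 41 observations, as in the post
y = 0.15 + 1.03 * x + rng.normal(size=41)

X = sm.add_constant(x)                   # adds the constant term C
results = sm.OLS(y, X).fit()
print(results.summary())                 # coefficients, std. errors, t-stats,
                                         # p-values, R-squared, AIC/BIC, DW, ...
```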