A | B |

QUESTION: What does a regression slope coefficient tell you? | ANSWER: A slope coefficient in a linear specification tells you how much Y changes when X is increased by one unit, holding any other explanatory variables constant. |

QUESTION: What does a regression intercept tell you? | ANSWER: The intercept in a regression model gives the expected value of Y when all of the explanatory variables are zero. |

QUESTION: Geometrically, what is the easiest way to remember "slope"? | ANSWER: Slope is most easily imagined as "rise over run". |

QUESTION: Why do you have to be careful when looking at P-values associated with regression coefficients? | ANSWER: P-values (or prob-values) are sometimes defined differently in different types of regression software. |

QUESTION: What hypotheses are tested by the automatically provided t-ratios on regression coefficients? | ANSWER: The automatic t-ratios produced in SHAZAM regressions test the "zero hypothesis." Is the parameter point estimate too far from zero for us to believe that its true value could be zero? |

QUESTION: Can the automatic t-ratios test hypotheses other than the "zero hypotheses" for each estimated coefficient? | ANSWER: If you want to test something other than the zero hypothesis about a regression slope or intercept parameter, you need to use a TEST command (or construct the test by hand). |

QUESTION: What do ordinary regression models assume about the processes that give rise to the values of the right-hand-side variables? | ANSWER: Ordinary regression models assume that all right-hand-side variables are exogenous--determined somewhere outside the process that gives rise to the value of Y for each observation. |

QUESTION: What problem may exist if the same person (or entity) who chooses by their actions the value of the Y variable is also choosing the values of the X variables? | ANSWER: With individual data, if the person whose decisions give the value of Y can also choose the values of the X variable(s) we want to use to explain Y, there is potential endogeneity bias. |

QUESTION: What is the main hazard associated with omitted right-hand-side variables? | ANSWER: If an included right-hand-side variable is correlated with an inadvertently omitted right-hand-side variable that affects Y, then the coefficient on the included variable will capture the effect of the omitted variable, creating "omitted variables bias" in the coefficient on the included variable. |

QUESTION: What different effects can omitted variables have on your estimated model? | ANSWER: Omitted relevant explanatory variables can either create the appearance of an effect that is really not there, or mask an effect that is there. |

QUESTION: What effect(s) can heteroscedastic errors have on your hypothesis testing? | ANSWER: If your data have heteroscedasticity (nonconstant error variances), then failing to correct for this problem means that your usual OLS standard errors, t-ratios, and p-values are just plain incorrectly computed: your hypothesis tests are invalid. |

QUESTION: What effects can serially correlated errors have on your hypothesis testing? | ANSWER: If your data have serial correlation (errors correlated over time), then failing to correct for it means that your usual OLS standard errors, t-ratios, and p-values are just plain incorrectly computed. |

QUESTION: What is the usual hypothesis testing consequence of POSITIVE serial correlation in the errors of a regression model? | ANSWER: POSITIVE serial correlation in regression errors typically leads to underestimated parameter standard errors, over-estimated t-ratios, and p-values that are too small. We tend to reject the zero hypothesis when it should not be rejected. |

QUESTION: With survey data on individuals, why do we have to worry about systematic non-response? | ANSWER: When working with survey data, if patterns of response and non-response are in any way systematically related to the values of Y for each person, you risk non-response bias in your parameter estimates. |

QUESTION: What's another way of describing variables that are uncorrelated? | ANSWER: When two variables are uncorrelated, we sometimes refer to them as orthogonal. |

QUESTION: In a regression model that is linear in both parameters and variables, what can be said about slopes, as opposed to derivatives of Y with respect to each X? | ANSWER: In a linear regression model (linear in both parameters and variables), a slope coefficient on a variable is the same thing as the derivative of Y with respect to that variable. |

QUESTION: What is the interpretation of a slope coefficient in a log-log regression model? | ANSWER: In a log-log model (a model that is linear in the logarithms of all variables), a slope coefficient can be interpreted as an elasticity of Y with respect to that X variable: the percentage change in Y associated with a one-percent change in X. |

QUESTION: Why do we sometimes need to depart from linear-in-variables regression models? | ANSWER: Linear-in-variables models fit planar surfaces (or hyperplanes) to the scatter of data. For curved surfaces, you need non-linear-in-variables models. |

QUESTION: What is the geometric consequence of using an interaction term? | ANSWER: Interaction terms (new variables constructed by multiplying together other variables) serve to allow an otherwise planar surface to develop a twist. |
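
One way to see the "twist", using a hypothetical fitted surface Y = 1 + 2*X1 + 3*X2 + 1.5*X1*X2 (coefficients invented for illustration): the slope in X1 now depends on the level of X2.

```python
# Hypothetical fitted surface: Y = 1 + 2*X1 + 3*X2 + 1.5*X1*X2.
# The X1*X2 interaction makes the slope in X1 depend on X2.
def slope_in_x1(x2):
    return 2 + 1.5 * x2

print(slope_in_x1(0))  # 2.0: slope in X1 when X2 = 0
print(slope_in_x1(4))  # 8.0: a steeper slope in X1 when X2 = 4
```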

QUESTION: What is the advantage of using polynomial terms in X in a regression model? | ANSWER: Polynomial terms (notably squared terms) allow the derivative of Y with respect to X to change over the range of the data. Handy if you know the effect of X on Y is not the same at all values of X. |
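
The same idea for a squared term, with hypothetical fitted coefficients: if Y-hat = 1 + 4*X - 0.5*X**2, the marginal effect of X is 4 - X, which changes sign over the range of the data.

```python
# Hypothetical fitted quadratic: Y_hat = 1 + 4*X - 0.5*X**2.
b1, b2 = 4.0, -0.5

def marginal_effect(x):
    # dY/dX = b1 + 2*b2*X changes across the range of X.
    return b1 + 2 * b2 * x

print(marginal_effect(1))  # 3.0: X still raises Y at low values of X
print(marginal_effect(6))  # -2.0: the effect has turned negative
```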

QUESTION: In a model that involves more than one term that contains the same X variable, how do you test whether X affects Y? | ANSWER: In a nonlinear model where there are many terms involving the same X variable, testing whether X has any effect on Y means testing whether ANY of the coefficients on ANY term involving X could be nonzero--an F test of the joint significance of all the different X-coefficients. |

QUESTION: Linear-in-everything specifications are usually adequate. True, False, Uncertain? | ANSWER: Never settle for a simple linear-in-variables specification without thinking about whether it captures everything that might be going on in your data. Explore some nonlinear-in-variables specifications. See what happens. If the simple model is adequate, fine. But you don't know until you try fancier models. |

QUESTION: If you have included every X variable that you have in your data set, you have a complete model. True, False, Uncertain? | ANSWER: Always think about potential missing explanatory variables, ones that may be important, but that you do not have. Are they correlated with your included variables? Could this be biasing the coefficients on these variables? |

QUESTION: What happens if you omit an explanatory variable that is UNcorrelated with any included right-hand-side variable in your model? | ANSWER: If an omitted variable is uncorrelated with all of your included variables, the slopes on those included variables will be unbiased. However, if the omitted variable would contribute a lot of explanatory power, your parameter standard errors will be larger in its absence, making it harder to reject hypotheses about the coefficients in the smaller model than it would be in the full model. |

QUESTION: Can R-squared statistics for a regression be used to test hypotheses about goodness-of-fit of your regression model? | ANSWER: R-squared measures goodness-of-fit, but we cannot use it to test statistical hypotheses because its distribution under the null hypothesis is not well-understood. Use the same ingredients to construct an appropriate F-test. For this alternative test statistic, fortunately, the distribution under the null hypothesis is known. |

QUESTION: How do R-squared values for cross-sectional data typically compare to R-squared values for time-series data? | ANSWER: R-squared values are almost always much lower for cross-sectional samples of individual data than they are for time-series of aggregate data. The aggregation process washes out a lot of the individual-level noise. |

QUESTION: Big t-test statistics for a model using time-series data are a cause for celebration, since you have found lots of statistically significantly different-from-zero coefficients. True, False, Uncertain? | ANSWER: Do not cheer for big t-ratios (tiny SHAZAM p-values) when you have time-series data. You may have overlooked positively serially correlated errors and the apparent statistical significance results may be just plain wrong. |

QUESTION: Most economic relationships are inherently linear, so linear regressions are especially appropriate for modelling relationships among economic variables. True, False, Uncertain? | ANSWER: The world is not necessarily linear. Sometimes, economic intuition will suggest obvious potential forms of nonlinearity--such as goods which are normal at low incomes and inferior at higher incomes. |