| A | B |
| What is "Pearson's r"? | It is a descriptive statistic that summaries the "magnitude" and "direction" of a relationship between two variables. |
| When is a "correlation coefficient" used? | It is used to draw "inferences" about relationships in populations. |
| What is the most widely tested correlational null hypothesis? | It is that there is "no relationship" between the two variables in the population (when there is "no" relationship, the "correlation" is 0). |
| What is the symbol for the "population correlation coefficient"? | It is the Greek letter rho (ρ). |
| What is the "alternative hypothesis"? | It is that there is a relationship between the two variables. |
| What is the "nondirectional alternative hyhpothesis"? | It is H1: p does does equal .00. The alternative hypotheses "does not predict the nature" of the relationship between two variables; therefore, it is "nondirectional". |
| When can reserachers use the "alternate hypothesis" to test for direction? | They can when there is a reason to believe that "p is greater than zero" or a reason to believe that "p is less than zero". |
| In testing the "null hypothesis", when do researchers use "r (correlation coefficeint)? | Researchers use "r" to estimate for "p (probability)"; in addition, "r" can be used to compare to a sampling distribution to determine whether its value is "improbable" or if the null hyhpothesis is true. |
| How can a theoretical "sampling distribution of a correlation coefficient" be constructed? | It can be constructed in the same fashion as other sampling distributions. |
| When is it appropriate to use "Pearson r"? | It is appropriate to use "Pearson r" when both variables are measured approximately on an interval level or on a ratio level; in addition, "Pearson r" is suitable for detecting "linear relationships" between two variables. |
| What assumptions are made beforehand for testing the "null hypothesis" that "ρ = .00"? | The assumptions are the following: 1) it is assumed that the participants are randomly selected and "independently" sampled from a population; 2) it is assumed that the variables being correlated (X and Y) have an underlying distribution that is "bivariate normal" (that is, scores on variable X are assumed to be normally distributed for each value of variable Y, and vice versa); 3) it is assumed scores are "homoscedastic" (that is, for each value of X, the variability of Y scores must be about the same, and vice versa). |
| Is "bivariate normality" easy to test? | No, but fortunately failure to meet this assumption typically has only a "small effect" on the validity of a statistical test (particularly when the "sample size is larger than 15"). |
| What is "Pearson r"? | It is an "inferential statistic". |
| What is the "formula" for finding the "degrees of freedom" using "Pearson r statistic". | The formula is the following: df = N-2 |
| With "Pearon r" when is the "null hypothesis" "rejected"? | It is "rejected" when the "value for r" is "greater" than the "critical vaule". |
| When is it inappropriate to use the "critical value" for a "directional test" "after computing r"? | it is inappropriate if a directional hypothesis was "not specified in advance". |
| Why when researchers accept a "null hypothesis" they can not totally conclude that there is no relationship between the two variables in the population. | That is because there may be a "Type II error (an "incorrect acceptance" of the "null hypothesis"); in addition, another reason it can be a "false acceptance" of the null hypothesis is that it is a possibility that the variables are "related" to the population but "in a nonlinear" manner." |
| What is used by researchers when they are comparing "two correlations" looking if there are errors? | Researchers can use a "logarithemic transformation of r" developed by the statistian Fisher; it is called the "r-to-z transformation" which allows researchers to use normal distribution for comparing two correlation coefficients; researchers use "tables" to "convert" the "r score" into a "z score" that can be used to compare correalation coefficients. |
| Does the "Pearson r statistic" indicate the "magnitude" for researchers? | Yes, it does indicate the "magnitude"; if its "sign is negative" it indicates that the high values of a variable are associated with "low values" on the second; if its "sign is positive" it indicates that "high values for X" are associated with "high values for Y". |
| Using the "Pearson r statistic", how is the "magnitude of a relationship measured"? | The "higher the abosolute value" of the correration coefficient, the "stronger the relationship". |
| What is the "symbol" for "magnitude"? | The sysmbole for "magnitude" is the following: r^2 (it is called the "coefficient of determination") |
| What is the "coefficient of determination"? | It is the measurre of "magnitude" which is "r^2"; in addition, it tells researchers the proportion of variance in variable Y that is associated with variable X (the proportion of variance that is "shared" by the two variables. |
| What is the primary reason for conducting a "power analysis" when planning a study? | It is to learn how large a sample is needed to minimize the risk of a "Type II error". |
| What is one of the primary factors that affects the "Pearson r statistic"? | It is the existence of a "curvilinear, rather than linear", relationship between the two variables. |
| With the "Pearson r statistic", what happens when the "range is restricted"? | The "deviations" are smaller and thus the "magnitude of r" is "reduced". |
| How do researchers estimate "post hoc power" for a "completed analysis"? | Researchers find the closest approximation to the actual sample size in the column corresponding "most closely" to the obtained "r". |
| What results when "only extreme groups from both ends of a distribution" are included in the sample? | The "magnitude" of the "correlation coefficient" may "increase". |
| When extreme groups from a population have been sampled, what care must be taken? | Care must be taken not to interpret correlation coefficients as reflecting relationships for the entire population. |
| When a sample is relatively small, how can a person with an "extreme value" affect it? | When a sample is "relatively small", a "person with an extreme value" on one or both variables being correlated can have a "dramatic effect" on the "magnitude" of the "coefficient". |
| What does a "smaller ellipse" surrounding the values "without the outlier" suggest? | It suggests a "modest correlation". |
| When the outlier is included, what does the "shape" of the "outer ellipse" indicate? | It indicates a "much stronger relationship". |
| When do researchers both "include the outliers" and "exclude the outliers" in a study? | They do it when the "disparity" in the results is "great". |
| What is "unreliability"? | "Unreliability" refers to the fact that virtually "all quantitative measures" contain some "measurement error". |
| What effect do "measurement errors" have on the result of a study? | "Measurement errors" "reduce" the "magnitude" of "correlation coefficients"; in addition, this effect is called "attenuation". |
| What graphing tool helps to recognize "anomalies in the data" (such as "outliers" and "extreme values")? | The graphing tool is the "scatterplot". |
| What is the "Spearman's rank-order correlation"? | It is a "nonparametric analog" of Pearson's r. |
| When is the "Spearman's rank-order correlation" used? | It is used in the following ways: 1) When the "dependent variable" is "measured" on the "ordinal scale"; 2) When one or both variables being "correlated" is "severely skrewed" or has an "outliner"; 3) It is preferred by some researchers when there are "fewer than 30 cases". |
| What is a "Sperman's correlation coefficient" called? | It can be called "Spearman's rho" (symbol for this is "r subscript s"). |
| What must be in ploace to compute "Spearman's correlation coefficient"? | "Both" variables must be in "rank order". |
| What is the "range" for "Spearman's correlation coefficient"? | The range is the same as it is for the Pearson r statistic which is "between -1.00 through 0.00 to +1.00"; in addition, a "high positive value" indicates a strong tendency for the "paired ranks" to be "similar" and a "negative value" indicates a tendency for "low ranks" on "one variable" to be "associated" with "high ranks on the others". |
| When using "Sperman's correlation coefficient" what is the "null hypothesis"? | The "null hypothesis" is that there is "no linear relationship" between the "two sets of ranks" [that the "population correlation (p) is "zero"]. |
| With "Sperman's correlation coefficient" when can the "null hypothesis" be "rejected"? | The "null hypothesis" can be "rejected" when the "computed value" is "greater than the critical value". |
| Why is the "Spearman's correlation coefficient" overall considered "somewhat less accurate than might be desired"? | That is because the "approximations to theoretical sampling distributions" are "imperfect"; especially, for samples of "intermediate size". |
| What is the "Kendalls tau" statistic? | It is a method for developing a "correlation statistic" that has "advantagous statistical properties"; however, "Kendalls tau" is somewhat more "complicated to compute". |
| What is the "point-biserial correlation coefficient"? | This "correlation coefficient" summarizes the strengh and direction of a relationship between a "dichotomousm nominal-level variable." |
| What does "regression" mean? | "Regression" refers to "techniques" that are used to "analyze relationships" between "varaibles" and to make "predictions" about "values of variables"; in addition, it is used as a "foundation" for "many other complex statistical analysis". |
| What is the advantage of "linear relationships"? | When a "relationship" is "linear and perfect", "knowledge of one variable" allows you to "know or predict" the "value of the second variable" with "complete accuracy". |
| How is a "linear relationship" expressed? | It is expressed by a "straight line"; in addition, it is allows for a "simple equation (Y=X)". |
| What is the "slope" of a "line"? | It is the "degree" of "increase, decrease, of lack of either". |
| In what terms can any "line" be described? | All "lines" can be described in terms of "slope and intersect" (this is known as a "linear model). |
| What is the "intercept" of a "line"? | The "intercept" of a "line' is where its "line" cross the "Y-axix on a graph). |
| What is the "regression equation"? | The "regresion equation" is the "formula" for the "best-fitting straight line" to "characterize" the "linear relationship" between "X and Y" (equation: Y=a +bX). |
| What is the relationship between the "regress equation" and the "linear (straight line) equation"? | They "both" have the "same" equation, Y=a + bX, however, except that the "regression equation" "predicts" values of the variable Y (Y^1, called "Y predicted); in addition, in this equation "Y is the dependent variable" and "X is the independent variable (or predictor variable). |
| What must researchers do to "solve" the "regression equation"? | Y=a + bX: Researchers must solve for "a" the "intercept constant" and "b the slope" (the "slope is called a "regression coefficient"). |
| What is the advantage using "refression"? | It allows researchers to "make predictions" about the "values of one variable" "based" on "values of a second variable".. |
| What are "errors of prediction" or "residuals"? | When reseachers are "comparing" the "actual values" with the "predicted values", by subtracting them, the difference between them indicates "errors" present" (if the "difference is small" then the "errors are insignificant" but as the "difference becomes larger" then "the significance of the errors increases"). |
| Is the "regression equation" the "best" representation of "X and Y"? | Yes, because the "regression equation" results in a "line" that "minimizes" the "errors of prediction"; in addition, more precisely, the "regression equation" "minimizes" the "sums of the squares of the prediction errors (this is where the "term" "least squares" comes from)". |
| What are "standard regression procedures called? | They are call the following: "OLS regression (which stands for "ordinary least-squares regresssion)". |
| What is used when a "regression equation" is used to "make" a "prediction" and then to be sure that the "prediction is good"? | Reasearchers use the "standard error of estimate (SE)" to test the "validity" of "the prediction"; in addtion, t he "smaller" the "SE estimate", the "more accurate predictions" are "likely to be". |
| Why are "correlation coefficients" "frequently displayed" in "research reports"? | They are frequently used in research reports because they are "efficient indexs" and because they "summarize concisely" the "magnitude", "nature", and the "direction" of a relationship between two variables. |
| What are some of the "descriptive applications" of "bivariate correlation analysis"? | They are the following: 1) is to answer the question; 2) to make predictions; 3) assessing an instruments "reliability"; 4) assessment of "validity"; 5) as a "variable selection" for "complex models". |
| When researchers conduct "reliability assessments", what are the "correlation coefficients" called? | They are called the following: "reliability coefficients ("reliability coefficients" "range" between -1.00 and +1.00 but in "most cases are always positive"; in addtion, the higher" the "reliability coefficient" the "greater is the reliability of the instrument ("reliability coefficients generally should be at least .70 and in some cases even "higher" to be "considered acceptable")". |
| What is "validity"? | "Validity" concerns the "degree" to which "an instrument" is "measureing" what "it suppose to be measuring". |
| What plays an "important role" in "measuring validity"? | "Correlation" plays an "important role" in "validity assesments". |
| What is the "criterion-related validity" of an "instrument"? | It is an "approach" that involves "examining" the relationship between an "instrument" and "practical criterion". |
| What is "test-retest reliability"? | It is when researchers are "interested" in "assessing an instrument's stability over time (they "administer" it to the "same people" on "two seperate occasions". |
| What is "interrater reliability"? | This is when "two observers" are used to provide "independent ratings" about "key phenomena". |