For an example dataset we use the data for Exercise 6.37 on page 231 of the textbook. The dataset URL is
http://superior.stat.umn.edu/~charlie/3011/ex0637.dat
If you submit the R command
qqnorm(x)
you get a normal quantile-quantile (Q-Q) plot of the data.
If the data are normally distributed, the points in this plot should lie close to a straight line. They won't lie exactly on a straight line because of randomness, but they should be close. To help see the line, there is also a command to fit a line to the plot (not the regression line, but the line through the first and third quartiles). The two commands
qqnorm(x)
qqline(x)
make the plot and draw the line. With the line drawn, it is even easier to see that the four points on the right depart clearly from linearity. These data do not appear to be normally distributed.
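To make "not the regression line" concrete, here is a minimal sketch of the line that qqline draws with its default settings: the line through the first and third quartiles of the data and of the standard normal distribution. The variable x below is a hypothetical stand-in sample, not the textbook dataset; in Rweb, x would come from the dataset URL instead.

```r
# Hypothetical stand-in data: a right-skewed sample, NOT the Exercise 6.37 data.
set.seed(42)
x <- rexp(30)

# First and third quartiles of the sample and of the standard normal.
y_q <- quantile(x, c(0.25, 0.75), names = FALSE)
z_q <- qnorm(c(0.25, 0.75))

# Line through the two quartile points.
slope <- diff(y_q) / diff(z_q)
intercept <- y_q[1] - slope * z_q[1]

qqnorm(x)
qqline(x)                                 # line drawn by qqline
abline(intercept, slope, lty = 2, col = "red")  # hand-computed line
```

With default settings the dashed red line should fall on top of the line drawn by qqline. Because the line is pinned to the middle half of the data, points in the tails are free to show their departure from it.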
It is hard to tell whether a Q-Q plot is close enough to a straight line that the data should be considered normally distributed. Even data that are actually normally distributed will not lie exactly on a line. To see this, do
z <- rnorm(length(x))
qqnorm(z)
qqline(z)
Here z has exactly the same sample size as x and is normally distributed by construction (rnorm is the R function that generates normal random numbers). If the Q-Q plot for x looks no less linear than the Q-Q plot for z, then it is reasonable to treat x as normally distributed too.
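The comparison is easier to make when the two plots sit side by side. The following is a sketch only: x is a hypothetical stand-in sample (in Rweb it would come from the dataset URL), and the two-panel layout via par(mfrow = ...) is an assumption about how you might want to view the plots, not something the Rweb page itself sets up.

```r
# Hypothetical stand-in data, NOT the Exercise 6.37 data.
set.seed(1)
x <- rexp(30)

# Known-normal reference sample with the same sample size as x.
z <- rnorm(length(x))

# Two panels side by side; par() returns the old settings so we can restore them.
old_par <- par(mfrow = c(1, 2))
qqnorm(x, main = "Q-Q plot of x (data)")
qqline(x)
qqnorm(z, main = "Q-Q plot of z (simulated normal)")
qqline(z)
par(old_par)
```

If the left panel bends away from its line more than the right panel does, that suggests the departure in x is more than random scatter, although a single reference plot is weak evidence either way.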
Since Q-Q plots are rather variable, especially if the sample size is small, you should probably make several Q-Q plots for known normal data (just repeat the three lines above). This will give you some idea of the statistical variability of Q-Q plots. If the Q-Q plot for x looks no less linear than any of the Q-Q plots for z, then we are even more confident that x is also normally distributed.
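Repeating the three lines by hand works, but a loop can draw several reference plots at once. This is a minimal sketch under stated assumptions: x is a hypothetical stand-in sample, and the choice of eight simulated samples in a 3-by-3 grid is arbitrary.

```r
# Hypothetical stand-in data, NOT the Exercise 6.37 data.
set.seed(3011)
x <- rexp(30)

n_sim <- 8  # number of known-normal reference plots (arbitrary choice)

# 3-by-3 grid with smaller margins; old settings are restored afterwards.
old_par <- par(mfrow = c(3, 3), mar = c(3, 3, 2, 1))

# First panel: the data.
qqnorm(x, main = "x (data)")
qqline(x)

# Remaining panels: fresh normal samples of the same size as x.
for (i in seq_len(n_sim)) {
  z <- rnorm(length(x))
  qqnorm(z, main = paste("simulated normal", i))
  qqline(z)
}

par(old_par)
```

The eight simulated panels show how much wobble to expect from truly normal data at this sample size, which is the yardstick for judging the panel for x.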
See also Q-Q Plots of Regression Residuals.