University of Minnesota
School of Statistics

Statistics 5021
Spring Semester 2003

Email and Office Hours

Instructor Christopher Bingham 372 Ford Hall, 612-625-1024, Email: kb@umn.edu
Office hours: Monday, Wednesday, Friday 2:30-4:00 or by appointment
Teaching Assistants Shr-Wei Chen 612-624-5569, Email: swchen@stat.umn.edu
Office Hours Wednesday 11:00 am - 1:00 pm, 352 Ford Hall
Fan Yang 612-625-6844, Email: yangfan@stat.umn.edu
Office Hours Tuesday 9:45 - 10:45, 352 Ford Hall

When you email, please identify yourself as a student in Stat 5021 (if you are).

Lecture and Lab Schedule

Lectures Monday, Wednesday, Friday 10:10 - 11:00 a.m., Akerman 319
Labs Section 2: Thursday 9:05 - 9:55, B53 Ford Hall (Computer Lab)
Section 3: Thursday 10:10 - 11:00, B53 Ford Hall (Computer Lab)
Section 4: Thursday 11:15 - 12:05, B53 Ford Hall (Computer Lab)

Announcements

1/27/03 An error in the Example of MacAnova handout has been corrected. The error was in the description of the "depth" column of a stemplot produced by the MacAnova function stemleaf().

The Thursday January 23 lab sessions will meet.

Some handouts may be available only as Acrobat PDF (Portable Document Format) files. Most computers come with a free PDF reader (Acrobat Reader, sometimes called Acroread). If you don't have it, you should download it: Link to download site for Acrobat Reader.

Course Information

Course Information Sheet (PDF file)

Assignments

Solutions to Homework

Solutions to Homeworks 1 - 12 are now available for downloading. They are in the form of PDF documents prepared by the TAs. To access solutions, you will need the same password as required for class notes.

Solutions to Exams

Overhead displays from lectures

You can download PDF files of edited overheads from lectures. You will need the user name and password announced in class and lab.

Computing

The computer program we will be emphasizing in this course, and which will be used in most examples, is MacAnova, a free interactive program running under Windows 95/98/NT/XP, Macintosh and Unix.

The latest version is Release 1 of MacAnova 4.13. If you have a version of MacAnova earlier than Release 4 of MacAnova 4.12, you should download the latest version.

Data Files

Data sets on Moore and McCabe CD ROM
On the CD-ROM that comes with the text are data files, for both Macintosh and Windows computers, of most of the data sets in Tables, Examples and Exercises in Moore and McCabe. Unfortunately, a number of these are not well formatted for use with MacAnova.

I have prepared more up-to-date and corrected versions of these files, which are also self-documenting. These were derived from files downloaded directly from the publisher's web page for IPS.

These have been posted on the web. You can download all the data sets in compressed archive files, IPS4Data.sit for Macintosh and IPS4Data.zip for Windows, or you can download individual files, one for each data set.

Note: An error in data set ex07_131.txt has been found (it also included data from Table 7.1). It has been corrected and data set ta07_001.txt has been added. The archive files have not been corrected as yet.

When working with MacAnova, you should use these specially prepared data sets rather than the data sets on the CD ROM which comes with the book. They are designed to be read by the MacAnova command readdata(). Type help(readdata) for information about this command or see An Introduction to MacAnova.

Other data sets (updated 2/07/03)
Besides the Moore and McCabe data sets, I will from time to time post other data sets on the data download page. These may be data sets used as examples in lecture or to be analyzed as part of homework.

Downloading and using the archive files

Windows: File names that end in .zip are archive files, containing multiple files in a specially coded form. They cannot be read directly by a word processor. Most Windows computers have a utility program called WinZip, which is designed to decode Zip files. It can also create them, if that's what you want.

There is also a freeware program, Aladdin Expander, which can decode Zip files as well as many other specially coded files.

You should download IPS4Data.zip using the right button on your mouse. Then you need to run either WinZip or Aladdin Expander to decode the files.

Macintosh: Files whose names end in .sit are Stuffit archive files. Most Macintoshes come equipped with Stuffit Expander, which can unpack these files. Both Netscape and Explorer are usually configured automatically to use the freeware program Stuffit Expander to unpack .sit files as they are downloaded. If not, you can just drag and drop the downloaded file onto the Stuffit Expander icon.

MacAnova Macro Files for Stat 5021

I will post here links to any new macro files that will be helpful to you in doing homework.

Macro file Description
box5num.mac or
box5num.mac.txt

This contains macro box5num() which draws boxplots
similar to those in Moore and McCabe using only
the five number summary and computing quartiles the same way.

You can use box5num() exactly like vboxplot(),
as in box5num(x,title:"Marsupial pouch sizes").

In addition, you can use keyword fivenum to direct the output
of a five number summary for each distribution plotted, as in
box5num(x,title:"Marsupial pouch sizes",fivenum:T)
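
For readers who want to see the arithmetic behind a five number summary, here is a minimal sketch in Python (not MacAnova, and not the code of box5num()). It assumes the quartile rule is the one described in Moore and McCabe: each quartile is the median of the observations on its side of the overall median's position, with the middle observation left out of both halves when n is odd. The function name and the data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Quartiles are medians of the lower and upper halves of the sorted
    data; with an odd count the middle observation is in neither half.
    Needs at least two observations.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]        # observations below the median position
    upper = xs[n - half:]    # observations above the median position
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

# Hypothetical data, not from the course
sizes = [12, 15, 11, 19, 14, 13, 18]
print(five_number_summary(sizes))  # (11, 12, 14, 18, 19)
```

Other programs use slightly different quartile rules, so a summary computed elsewhere can differ a little from this one.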

densities.mac or
densities.mac.txt
Macro to draw densities of standard distributions such as normal,
Student's t, Chi-squared, and F. Take a look at some examples
of their use.
densityest.mac or
densityest.mac.txt
Macro to compute an estimate of the density of data. You can plot
the estimated density by lineplot(densityest(sample,xvals))
or add the estimated density to a histogram by
addlines(densityest(sample, xvals)), where sample
contains the data and xvals contains values, usually
equally spaced at which the density will be estimated.
runs.mac or
runs.mac.txt
Macro to determine lengths of runs of identical values in a vector.
Example of usage:
Cmd> addmacrofile(getfilename()) # find runs.mac or runs.mac.txt

Cmd> x <- vector(1,1,3,3,3,3,2,2,1,2,1,1,1,1,1,3,3)

Cmd> lengths <- runs(x); lengths
WARNING: searching for unrecognized macro runs near lengths <- runs(
(1)           2           4           2           1           1
(6)           5           2

Cmd> max(lengths) # longest run
(1)           5
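
The idea of a run length can also be sketched outside MacAnova. The following Python fragment is only an illustration of the computation, not the code of runs(); the function name run_lengths is hypothetical. It uses the same vector as the session above.

```python
from itertools import groupby

def run_lengths(values):
    """Lengths of consecutive runs of identical values, in order."""
    return [len(list(group)) for _, group in groupby(values)]

x = [1, 1, 3, 3, 3, 3, 2, 2, 1, 2, 1, 1, 1, 1, 1, 3, 3]
lengths = run_lengths(x)
print(lengths)       # [2, 4, 2, 1, 1, 5, 2]
print(max(lengths))  # 5, the longest run
```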

catdata.mac or
catdata.mac.txt
(Updated 4/17/03)
File of macros for working with categorical data. A page with examples will be posted soon.
Example of usage of macro binomlimits() in catdata.mac:
Cmd> addmacrofile(getfilename()) # find and attach catdata.mac

Cmd> x <- 22; n <- 42 # observed count and number of trials

Cmd> binomlimits(x,n,.95) # traditional limits (Wald)
WARNING: searching for unrecognized macro binomlimits near 
binomlimits(
(1)     0.37277     0.67485

Cmd> binomlimits(x,n,.95,"wilson")
(1)     0.37739     0.66609

Cmd> binomlimits(x,n,.95,"exact") # "exact limits", always conservative
(1)     0.36418     0.67996

Cmd> binomlimits(x,n,.95,"score") # close to exact
(1)     0.37722      0.6664
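
To show where limits like these come from, here is a rough Python sketch of three textbook formulas. It is not the code of binomlimits(), and the function names are hypothetical. The matching of formulas to options is an inference from the numbers alone: the "plus four" interval (add two successes and two failures, then use the Wald formula) appears to agree with the "wilson" output above, and the Wilson score interval appears to agree with the "score" output. The "exact" limits need beta or F quantiles and are not attempted here.

```python
from math import sqrt
from statistics import NormalDist

def z_value(conf):
    """Two-sided standard normal critical value for confidence level conf."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)

def wald_limits(x, n, conf=0.95):
    """Traditional limits: p-hat plus or minus z * standard error."""
    z = z_value(conf)
    p = x / n
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half

def plus_four_limits(x, n, conf=0.95):
    """Wald formula applied after adding 2 successes and 2 failures."""
    z = z_value(conf)
    p = (x + 2) / (n + 4)
    half = z * sqrt(p * (1 - p) / (n + 4))
    return p - half, p + half

def score_limits(x, n, conf=0.95):
    """Wilson score interval, obtained by inverting the z test for p."""
    z = z_value(conf)
    p = x / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

x, n = 22, 42  # observed count and number of trials, as in the session above
for limits in (wald_limits, plus_four_limits, score_limits):
    low, high = limits(x, n)
    print(limits.__name__, round(low, 4), round(high, 4))
```

With x = 22 and n = 42 the three pairs should agree with the corresponding session values above to about four decimal places.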

meansanova.mac or
meansanova.mac.txt
(Updated 5/07/03)
File containing macro meansanova() for doing analysis of variance using cell means, cell standard deviations and cell sample sizes. Here is an example of its simplest usage, for one-way ANOVA. The means, standard deviations and sample sizes are from Figure 12.11, p. 766 of IPS.
Cmd> addmacrofile("") # find meansanova()

Cmd> xbar <- vector(41.0454545, 46.7272727, 44.2727273)

Cmd> sd <- vector(5.6355781, 7.3884196, 5.7667505)

Cmd> n <- vector(22,22,22) # sample sizes; 22 per group is implied by the 63 error DF

Cmd> meansanova(xbar,sd,n,fstat:T)
Model used is @Y=@GROUPS
                DF          SS          MS           F     P-value
CONSTANT         1  1.2786e+05  1.2786e+05  3207.18687     < 1e-08
GROUPS           2       357.3      178.65     4.48108    0.015151
ERROR1          63      2511.7      39.868
	
Variables SS and DF, but not RESIDUALS, are defined as with anova(), and most of the functions that you can use after anova() can still be used.
Cmd> SS
    CONSTANT      GROUPS      ERROR1
  1.2786e+05       357.3      2511.7

Cmd> DF
    CONSTANT      GROUPS      ERROR1
           1           2          63

Cmd> secoefs(GROUPS)
component: coefs
(1)     -2.9697      2.7121     0.25758
component: se
(1)      1.0991      1.0991      1.0991

Cmd> RESIDUALS
UNDEFINED
	
meansanova() can also do two-way and higher order ANOVAs without covariates.
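
The one-way sums of squares in the table above can be checked by hand from the cell summaries. The Python sketch below is only an illustration of that arithmetic, not the code of meansanova(); the function name is hypothetical. It assumes 22 observations in each of the three groups, which is what the 63 error degrees of freedom imply if the groups are the same size. The P-value is omitted because it needs the F distribution.

```python
def oneway_from_summary(means, sds, ns):
    """One-way ANOVA sums of squares and F from cell means, SDs and sizes."""
    total_n = sum(ns)
    k = len(means)
    grand = sum(n * m for n, m in zip(ns, means)) / total_n
    ss_groups = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
    ss_error = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    df_groups, df_error = k - 1, total_n - k
    f_stat = (ss_groups / df_groups) / (ss_error / df_error)
    return ss_groups, df_groups, ss_error, df_error, f_stat

xbar = [41.0454545, 46.7272727, 44.2727273]
sd = [5.6355781, 7.3884196, 5.7667505]
n = [22, 22, 22]  # assumed: equal group sizes consistent with 63 error DF

ss_groups, df_groups, ss_error, df_error, f_stat = oneway_from_summary(xbar, sd, n)
print(round(ss_groups, 1), df_groups)  # compare with the GROUPS line
print(round(ss_error, 1), df_error)    # compare with the ERROR1 line
print(round(f_stat, 3))                # compare with the F column
```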

The two links for each file are to identical files. Some browsers have difficulty with file names that end in .mac. If you have such a problem, use the second link.

Take a look at instructions on how to set up MacAnova to use a macro file.

Frequently asked questions about MacAnova use

I will add any questions about MacAnova that are of general interest to a list of Frequently Asked Questions About MacAnova.

Distribution of Final Examination Scores

Here is a stemplot of the final exam scores (out of 170 possible points):

n=27, Min=23, Q1=92.5, M=120, Q3=134, Max=157
    1     2|3
    2     3|8
    2     4|
    2     5|
    3     6|8
    4     7|0
    5     8|9
    8     9|123
    9    10|1
   13    11|5677
  ( 5)   12|00788
    9    13|026
    6    14|147
    3    15|377

          1|1 represents 11  Leaf digit unit = 1
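
As a reading aid for the leftmost ("depth") column, the Python sketch below rebuilds the scores from the stems and leaves and recomputes the depths. It reflects my reading of the display, not MacAnova's own algorithm: each depth is the number of scores counted in from the nearer end of the distribution, and the row containing the median shows its own leaf count in parentheses instead. The function name is hypothetical.

```python
from statistics import median

# Stems and leaves copied from the stemplot above (leaf digit unit = 1)
rows = [(2, "3"), (3, "8"), (4, ""), (5, ""), (6, "8"), (7, "0"), (8, "9"),
        (9, "123"), (10, "1"), (11, "5677"), (12, "00788"), (13, "026"),
        (14, "147"), (15, "377")]

def depth_column(counts):
    """Depth of each row; the median row shows its own count in parentheses."""
    n = sum(counts)
    depths = []
    from_top = 0
    for count in counts:
        from_top += count                    # scores in this row and above
        from_bottom = n - from_top + count   # scores in this row and below
        if from_top > n / 2 and from_bottom > n / 2:
            depths.append(f"({count})")      # row that contains the median
        else:
            depths.append(str(min(from_top, from_bottom)))
    return depths

scores = [10 * stem + int(leaf) for stem, leaves in rows for leaf in leaves]
print(len(scores), min(scores), median(scores), max(scores))  # 27 23 120 157
print(depth_column([len(leaves) for _, leaves in rows]))
# ['1', '2', '2', '2', '3', '4', '5', '8', '9', '13', '(5)', '9', '6', '3']
```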

Handouts and examples

I will put links to any handouts here. Some may be available only in PDF format.


Sample Exams

DOWNLOAD MacAnova


The views and opinions expressed in this page are strictly those of the page author.
The contents of this page have not been reviewed or approved by the University of Minnesota.

C Bingham
Updated Fri May 16 08:58:02 CDT 2003