Assignment #9, Stat 8053, Fall 2011

Problems

These problems are due in class on Wednesday, December 7.

  1. Problem 12.3.
  2. The file http://www.stat.umn.edu/~sandy/courses/8053/Data/crudeoil1.txt contains data on crude oil samples from three zones of sandstone, called Wilhelm, Sub-Mulinia, and Upper. Each sample was measured for five trace elements: X1 = vanadium, X2 = iron, X3 = beryllium, X4 = saturated hydrocarbons, and X5 = aromatic hydrocarbons.

    Apply Fisher's LDA, and be sure to assess the likely error rates for classification of future samples. Do this analysis twice: once assuming that the prior probabilities of the three zones are equal, and once assuming that the prior probability is 0.1 for Wilhelm and 0.45 for each of the other two zones.

    Compare your results with those from other classification methods we have learned, such as a neural network or a random forest. If you have time, also compare them with the fit of a multinomial logit model, which you probably learned in 8052 (use multinom in the nnet package; see Agresti Ch. 7 for details).
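
As a rough illustration of how the prior probabilities in Problem 2 enter a linear discriminant rule, and of one way to estimate error rates for future samples, here is a minimal sketch in plain Python. It is not the course's method or software. It assumes the usual pooled-covariance score x' S^-1 m_k - (1/2) m_k' S^-1 m_k + log(prior_k) and uses leave-one-out cross-validation as the error estimate. The function names (fit, solve, predict, loo_error) are hypothetical, and the two-variable toy data are invented for illustration; they are not taken from crudeoil1.txt.

```python
import math

def fit(X, y):
    """Class means and pooled within-class covariance matrix."""
    classes = sorted(set(y))
    p = len(X[0])
    means = {}
    S = [[0.0] * p for _ in range(p)]
    for k in classes:
        rows = [x for x, lab in zip(X, y) if lab == k]
        means[k] = [sum(col) / len(rows) for col in zip(*rows)]
        for x in rows:
            d = [xi - mi for xi, mi in zip(x, means[k])]
            for i in range(p):
                for j in range(p):
                    S[i][j] += d[i] * d[j]
    dof = len(X) - len(classes)  # n minus number of classes
    return means, [[s / dof for s in row] for row in S]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][j] * z[j] for j in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z

def predict(x, means, S, priors):
    """Assign x to the class with the largest linear discriminant score."""
    scores = {}
    for k, mu in means.items():
        w = solve(S, mu)  # w = S^-1 m_k
        scores[k] = (sum(wi * xi for wi, xi in zip(w, x))
                     - 0.5 * sum(wi * mi for wi, mi in zip(w, mu))
                     + math.log(priors[k]))
    return max(scores, key=scores.get)

def loo_error(X, y, priors):
    """Leave-one-out estimate of the misclassification rate."""
    wrong = 0
    for i in range(len(X)):
        means, S = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        wrong += predict(X[i], means, S, priors) != y[i]
    return wrong / len(X)

# Invented toy data in two variables (NOT the crude oil measurements).
X = [(0, 0), (1, 0), (0, 1), (1, 1),
     (4, 0), (5, 0), (4, 1), (5, 1),
     (0, 4), (1, 4), (0, 5), (1, 5)]
y = ["Wilhelm"] * 4 + ["SubMulinia"] * 4 + ["Upper"] * 4

equal = {"Wilhelm": 1 / 3, "SubMulinia": 1 / 3, "Upper": 1 / 3}
unequal = {"Wilhelm": 0.1, "SubMulinia": 0.45, "Upper": 0.45}

means, S = fit(X, y)
new_x = (2.4, 0.5)  # slightly closer to the toy Wilhelm mean
print(predict(new_x, means, S, equal))    # Wilhelm
print(predict(new_x, means, S, unequal))  # SubMulinia
print(loo_error(X, y, equal), loo_error(X, y, unequal))
```

The two predictions differ only because of the log(prior) term: the small prior for Wilhelm moves the boundary toward the Wilhelm mean, so a borderline sample changes class. In practice one would use an established LDA routine rather than this hand-rolled version, and the same leave-one-out loop could wrap any of the other classifiers being compared.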



S Weisberg
2011-11-30