Numbers and other data can be stored in variables for later use.
Variable names in R are strings of letters, digits, and dots
beginning with a letter, such as x, x2, and a.very.long.variable.name.
Case matters: foo, Foo, and FOO are names of different variables.
To assign a value to a variable, use the assignment operator <- like this:

    x <- 23

(Note that this "arrow" is composed of two characters, "less than" and "hyphen".)
If you submit this to Rweb, nothing appears to happen. Rweb does assign the value to the variable, but doesn't do anything with it. Moreover, the Rweb server doesn't remember anything between submissions, so it doesn't even remember this assignment.
But if you follow the assignment with some use of the variable, you can see the effect of the assignment. For example, if you submit the commands

    x <- 23
    2 * x + 3

Rweb will print the result 49.
If the expression following the assignment is just a single variable by itself, Rweb prints the value of that variable. For example, if you submit the commands

    x <- 23
    x

Rweb will print 23.
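Putting the naming rules and the assignment operator together, here is a minimal sketch. The values are arbitrary and chosen only for illustration. It shows that names differing only in case are separate variables, and that digits and dots are allowed after the first letter.

```r
# Three distinct variables whose names differ only in case.
# The values are arbitrary, chosen only for illustration.
foo <- 1
Foo <- 2
FOO <- 3

# Digits and dots may follow the leading letter.
x2 <- 20
a.very.long.variable.name <- 10

# Each name refers to its own value.
foo + Foo + FOO
a.very.long.variable.name + x2
```

As with the examples above, the two assignments print nothing by themselves; only the last two lines, which are expressions rather than assignments, produce output.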
The basic R data type is not a single number, but a "vector", which is
what R calls a sequence of numbers. R uses vectors to represent whole
data sets. The R function c collects numbers into a single
vector object, for example,

    x <- c(2, 4, 11, 17)

creates a vector of length 4.
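A short sketch of working with this vector may help. It uses the vector from the text and relies on standard R behavior: length reports the number of elements, and arithmetic such as the earlier 2 * x + 3 is applied to each element in turn.

```r
# Collect four numbers into one vector, as in the text.
x <- c(2, 4, 11, 17)

length(x)   # number of elements in the vector
2 * x + 3   # arithmetic is applied to every element
mean(x)     # summary functions take the whole vector at once
```

Because a whole data set lives in one variable, a single expression can transform or summarize all of the measurements.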
Often data sets consist of several vectors of the same length, which
consist of measurements of different variables on the same individuals.
R has a data structure that caters to this situation called a
data frame. If x, y, and z are vectors of the same length, then

    fred <- data.frame(x, y, z)

produces a data frame containing these vectors.
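To make this concrete, here is a small sketch that builds such a data frame and inspects it. The three vectors hold made-up values, used only so that the example runs on its own.

```r
# Three vectors of the same length (made-up values for illustration).
x <- c(2, 4, 11, 17)
y <- c(1.5, 3.0, 8.2, 12.9)
z <- c(10, 20, 30, 40)

fred <- data.frame(x, y, z)

names(fred)  # column names are taken from the vector names
nrow(fred)   # one row per individual
fred$y       # extract a single variable from the data frame
```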
For our purposes the most important use of data frames is reading in a dataset from a file on a web server. This is done using the "Dataset URL" area just below the text area where you submit R commands to the Rweb server.
Rweb always treats the file at this URL as plain text containing a data frame. It reads the file in and makes all the variables in the data frame available for calculations in the submission.
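Outside Rweb, a roughly comparable step can be sketched with the standard R function read.table. This is only an illustration, not a description of how Rweb works internally. It assumes the file is whitespace-separated with a header line of variable names, and it uses a made-up temporary file in place of a real dataset URL.

```r
# Write a small whitespace-separated file with a header line.
# The contents are made up; real dataset files may be laid out differently.
path <- tempfile(fileext = ".dat")
writeLines(c("x y",
             "1 2.1",
             "2 3.9",
             "3 6.2"), path)

# Read the file into a data frame; header = TRUE takes the names from line 1.
X <- read.table(path, header = TRUE)

names(X)          # the variables found in the file
with(X, mean(x))  # use a variable from the data frame by name
with(X, mean(y))
```

read.table also accepts a URL in place of a local path, so the same call could in principle be pointed at a dataset file on a web server.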
For an example dataset we use the data for Example 4.2 on page 103 of the textbook. To use this data, type

    http://superior.stat.umn.edu/~charlie/3011/te0402.dat

in the "Dataset URL" window. This dataset contains two variables, x and y. If you use this dataset URL and submit the R command

    plot(x, y)

you get the scatter plot of the data. It should look just like the left hand panel of Figure 4.3 in the textbook except for the labels.
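Since the default plot differs from the textbook figure only in its labels, a sketch of setting labels explicitly may be useful. The data are made up and merely stand in for the x and y variables of a dataset, and the label text is a placeholder. xlab, ylab, and main are standard arguments of plot.

```r
# Made-up data standing in for the x and y variables of a dataset.
x <- c(1, 2, 3, 4, 5)
y <- c(2.3, 4.1, 5.8, 8.2, 9.9)

# Scatter plot with explicit axis labels and a title (placeholder text).
plot(x, y,
     xlab = "explanatory variable",
     ylab = "response variable",
     main = "Scatter plot of y against x")
```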
In order to use a data set read in over the web, you need to know what the
variable names are. Typically, we will just use x for a single
variable, and x and y if there are two. In a more
complicated data set, you will have to look and see. For example, the data
for Exercise 4.7 in the textbook is in the file

    http://superior.stat.umn.edu/~charlie/3011/ex0407.dat

This data set has four variables, two x,y pairs for two different scatter plots. We can't call both of the "x" variables x, although that's what the textbook does. That would confuse the computer. So we called them x.a, y.a, x.n, and y.n,
the "a" and "n" suffixes relating to the two different parts of the data, which
are "Angustifolia" and "Nacional". There is no way you can just guess what
names we used. How do you find them? There are two ways.
    Rweb:> names(X)
    [1] "x.a" "y.a" "x.n" "y.n"
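The names call shown above can be tried on any data frame. The sketch below builds a stand-in data frame named X that has the same four variable names. The numeric values are invented, so only the names correspond to the dataset in the text.

```r
# A stand-in data frame with the four variable names from the text.
# The numeric values are invented; only the names match the source.
X <- data.frame(x.a = c(1, 2, 3),
                y.a = c(2, 4, 6),
                x.n = c(1, 2, 3),
                y.n = c(3, 5, 8))

names(X)  # lists the variable names in the data frame
X$x.a     # one variable, selected by name
```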