R is an open-source language that began its life as a derivative of the widely popular statistical package S-PLUS, the commercial adaptation of S, which was designed by Becker and Chambers. However, S-PLUS is large and sometimes hard to teach with, so two professors began writing a smaller version that would be easier to teach to students.
The two professors who designed the language were Ross Ihaka and Robert Gentleman. Seeing their first names, you might get an idea of where 'R' came from as a title. However, R is also the letter preceding S, and the language is a reduced version of S. Both may have influenced the name of the language. In any case, that's enough history. If you're interested in learning more, or in getting a more thorough introduction, the document Intro to R is a good place to start. However, if you would like a hardcover book to read, I highly recommend Introductory Statistics with R, by Peter Dalgaard.
Interestingly enough, Z-Bo has pointed out that the R plotting package conforms to W.S. Cleveland's heuristics for unbiased graphical representations of data, defined in his book "The Elements of Graphing Data"[5], as linked to on JSTOR. For more information about multivariate analysis with graphing, the book Interactive and Dynamic Graphics for Data Analysis by Dianne Cook, Deborah F. Swayne et al. is informative and full of diagrams. It does not deal strictly with R, but also with the statistical package GGobi, which is used to display multivariate graphs and produces far more graphical output than R.
Syntax:
R syntax is reminiscent of Perl and C, though without sigils and the need for semicolons. It has powerful native vector support, such as:
c(...) - constructs a vector of the given comma-separated elements
seq(start, end, by) - constructs a vector beginning at start and incrementing by 'by' until end is reached
rep(i, n) - constructs a vector repeating i n times
m:n - returns a vector of the integers from m to n, which can be used in conjunction with c(...) and array subscripting
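As a minimal sketch of the constructors listed above (the variable names and values are made up for illustration), the following builds a few vectors and uses them as subscripts:

```r
# Build vectors with the four constructors described above.
primes <- c(2, 3, 5, 7)        # combine explicit values
evens  <- seq(2, 10, by = 2)   # 2 4 6 8 10
zeros  <- rep(0, 3)            # 0 0 0
idx    <- 2:4                  # 2 3 4

# Both m:n and c(...) can be used as subscripts (indexing starts at 1).
primes[idx]       # 3 5 7
primes[c(1, 4)]   # 2 7

# Arithmetic is applied element by element, with no explicit loop.
evens / 2         # 1 2 3 4 5
```

Note that arithmetic on a vector works on every element at once, which is a large part of what "native vector support" means in practice.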
Conditional statements are also supported, using typical C syntax (although 'if' must be lower-case). There are also several data types built in, and a rich object system that can be used to create your own types.
data.frame - these are the whole picture, and can contain mixed categorical and quantitative data
Matrices - ideal for two-dimensional quantitative data
Lists - can hold elements of different types, and can be concatenated and treated as one large list
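The three types above, together with a C-style conditional, might be used as in the following sketch. The data and the names (students, verdict, and so on) are invented for the example:

```r
# A data frame mixes a categorical column (a factor) with numeric columns.
students <- data.frame(
  name  = c("Ann", "Ben", "Cal"),
  year  = factor(c("first", "second", "first")),
  score = c(81, 67, 95)
)

# A matrix holds values of a single type arranged in rows and columns.
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns, filled column by column

# A list may hold elements of different types; c() concatenates two lists.
a <- list(1, "two")
b <- list(TRUE)
combined <- c(a, b)          # one list with three elements

# Conditionals look like C, with lower-case keywords.
if (mean(students$score) >= 80) {
  verdict <- "at least 80"
} else {
  verdict <- "below 80"
}
```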
There are also several large libraries of useful functions and example data for teachers to use, and students to learn from.
- Standard library
- Contains functions such as:
table(), which tabulates counts
is.matrix(), is.data.frame(), to test for type
as.matrix(), as.data.frame(), to cast
- Graphics library
- Contains the necessary functions to create graphical plots
- barplot() - makes bar plots (spine plots are made with the related spineplot())
- pie() - somewhat self-explanatory (makes pie charts)
- abline() - adds a straight line of the form y = a + bx to an existing plot
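The functions listed above can be tried on a small made-up data set. This is only a sketch: the pets vector and the df data frame are invented, and the plots appear on whatever graphics device is current (a window in the GUI, or a file when run as a script).

```r
# Example data: a made-up vector of pet types.
pets <- c("cat", "dog", "dog", "fish", "dog", "cat")

# table() counts how often each distinct value occurs.
counts <- table(pets)

# Type tests and casts.
df <- data.frame(x = 1:3, y = c(2, 4, 6))
is.data.frame(df)    # TRUE
m <- as.matrix(df)   # cast the data frame to a numeric matrix
is.matrix(m)         # TRUE

# Each of these calls draws a new plot on the current graphics device.
barplot(counts, main = "Pets")
pie(counts, main = "Pets")

# abline() adds a line to the plot that is already there.
plot(df$x, df$y)
abline(a = 0, b = 2)  # intercept 0 and slope 2, i.e. y = 0 + 2x
```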
Using these functions, one can present a wide variety of data accurately and attractively, and apply statistical methods to find information about the data.
summary() - finds the min, quartiles, median, mean, and max of the vector
sd() - finds the standard deviation of the vector
mean() - finds the average of the vector
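For instance, on a small invented vector x, the three functions can be called directly:

```r
# A made-up sample of eight measurements.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

summary(x)   # minimum, quartiles, median, mean, and maximum
mean(x)      # 5
sd(x)        # sample standard deviation (divides by n - 1)
```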
Using R
The R Project provides a GUI for the R interpreter that is easy to use as well as pleasing to the eye. The install file is less than 30 MB (barely), and installs to 20 MB in the minimal install, and 60 MB in the full.
Help files are readily provided in several formats and have the added bonus of being genuinely helpful. They contain the complete argument list for the function and a description of each argument. Along with this comes an example of how to use the function, and sometimes a detailed explanation of particular arguments; for example, the various algorithms that are available to calculate the bar width for a bar plot. To access the help on a specific function, one may call help("mean"), or ?mean, for example. If you are unsure of the name of the command you are looking for, help.search("Cleveland Dot Plot") will use a fuzzy search to find matching articles.
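Typed at the R prompt, the help calls mentioned above look like the sketch below. The variable names h and hits are only there for illustration; calling the functions without assigning the result displays the page or the list of matches directly.

```r
# Look up the help page for mean(); ?mean is shorthand for the same lookup.
h <- help("mean")
?mean

# Search titles and keywords of the installed help pages when the
# function name is unknown.
hits <- help.search("Cleveland Dot Plot")

# Printing the results shows the help page and the matching articles.
print(h)
print(hits)
```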
On the project website, there is also a wiki, which has articles on how to get started as well as tips and tricks for more advanced users.
[5] http://links.jstor.org/sici?sici=0039-7989(198512)34%3A4%3C471%3ATEOGD%3E2.0.CO%3B2-M