- Why Call C or Fortran from R?
- C functions and Fortran subroutines callable from R
- Compiling and Dynamic Loading
- The Call to C
- R Wrapper Functions
- Using R Random Number Generators
- Error Messages from C

R is rotten at iterative algorithms that require loops iterated many times (especially Markov chain Monte Carlo), not as bad as S-PLUS, but still bad. A way to get all the speed advantages of C or Fortran with most of the convenience of R is to write the inner loop in C and call it from R.

This is a very brief introduction. For much more, read the chapters System and foreign language interfaces and The R API: entry points for C code in the book Writing R Extensions available (free) from CRAN.

Here is a very simple C function (downloadable as foo.c).

void foo(int *nin, double *x) { int n = nin[0]; int i; for (i=0; i<n; i++) x[i] = x[i] * x[i]; }

which does nothing but square the elements of the *n*-vector *x*.

Notice two properties that are required for a function callable from R

- The function does not return a value. All work is accomplished as a "side effect" (changing the values of arguments).
- All the arguments are pointers. Even scalars are vectors (of length one) in R.

Here is a Fortran function that does the same thing (downloadable as bar.f).

subroutine bar(n, x) integer n double precision x(n) integer i do 100 i = 1, n x(i) = x(i) ** 2 100 continue end

Notice it has the same two properties as the C function.

- The function does not return a value. That is, it is a Fortran
*subroutine*rather than a Fortran*function*. Fortran*functions*are not callable from R. - The second property required for C functions is automatic in Fortran. Fortran has only one way of passing arguments, which corresponds to C pointer arguments. The fact that R passes all arguments Fortran style is a legacy of the first version of S, which was written in Fortran.

Put the C example code in a file foo.c and compile it to a shared library. The command

R CMD SHLIB foo.c

in UNIX (outside of R) does this. Now the code can be dynamically loaded into R by doing (in R)

dyn.load("foo.so")

(note: the filename may be `foo.sl`

on some computer architectures).

The actual call to C from R is made by the R function `.C`

like this

.C("foo", n=as.integer(5), x=as.double(rnorm(5)))

The function `.C`

takes *nargs* + 1 arguments if
the C function being called takes *nargs* arguments.
The first argument is the name of the C function to be called,
and the rest are the arguments to the function call. The arguments
generally have to be coerced to the proper type using
`as.integer`

, `as.double`

, and similar functions.
The function `.C`

returns a list whose elements are the arguments
to the function call (in this case *n* and *x*). The values
are the values after the call (as changed by the function `foo`

).

For more on this subject see the R help for the .C function.

The Fortran is similar. In UNIX do

R CMD SHLIB bar.f

and in R do

dyn.load("bar.so") .Fortran("bar", n=as.integer(5), x=as.double(rnorm(5)))

We just change the call from `.C`

to `.Fortran`

.

It is considered user-friendly
to wrap the call to `.C`

in an R function, like

foo <- function(x) { if (!is.numeric(x)) stop("argument x must be numeric") out <- .C("foo", n=as.integer(length(x)), x=as.double(x)) return(out$x) }

This has three benefits

- It allows some error checking in R, where it is easier than in C.
- It allows some arguments (like
*n*here) to be calculated so they don't have to be supplied by the user. - It allows you to return only what the user needs.

Now the call is much simpler, just

foo(x)

The R random number generators and also all the other functions
for probability distributions (not only `rnorm`

but also
`dnorm`

, `pnorm`

, and `qnorm`

and
so forth for other distributions) are callable from C (but not Fortran,
see however the section of *Writing R Extensions* about Calling C from FORTRAN and vice versa on how to write
Fortran calls to C functions that call the R random number generators).

Here is an example (downloadable as baz.c).

#include <R.h> #include <Rmath.h> void baz(int *nin, double *sin, double *tin, double *x) { int n = nin[0]; double s = sin[0]; double t = tin[0]; int i; GetRNGstate(); for (i = 0; i < n; i++) x[i] = rbeta(s, t); PutRNGstate(); }

Also the statements

GetRNGstate(); PutRNGstate();

must be done *before* and *after* (respectively)
all calls to random number functions (in this example `rbeta`

).
With that done, the random numbers work in C just like in R.

set.seed(42) rbeta(5, 1.5, 2.5) dyn.load("baz.so") baz <- function(n, s, t) { .C("baz", n = as.integer(n), s = as.double(s), t = as.double(t), x = double(n))$x } set.seed(42) baz(5, 1.5, 2.5)

(note, exactly the same random numbers when using the same seeds).

And how does one know how to call the C function `rbeta`

(which is *not* the R function of the same name, rather
the C function that the R function calls)? Unfortunately there
is no API documentation, so one must
RTFS.

On our computers, the source code happens to be (when this was written)
in the directory `/APPS/src/R-2.6.2`

.
For example, the source code for `rbeta`

is in
`/APPS/src/R-2.6.2/src/nmath/rbeta.c`

, but we may move it
(the directory name changes with each version), and if you
aren't here, you will have to figure out where it is locally
or get it from
CRAN.

Use the R `error`

or `warning`

function.
They work just like `printf`

but produce an R error
or warning as the case may be.
Here is an example (downloadable as qux.c).

#include <R.h> void qux(int *nin, double *x) { int n = nin[0]; int i; if (n < 1) error("arg n was %d, must be positive\n", n); for (i = 0; i < n; i++) x[i] = x[i] * x[i]; }

And try it out.

dyn.load("qux.so") .C("qux", n = as.integer(0), x = as.double(1:4))

Questions or comments to: Charles Geyer charlie@stat.umn.edu