Massimo Caliman
by Massimo Caliman
3 min read

Categories

  • R

Command-Line Loops

The for, while and loop constructs in general are useful when programming, but are not particularly easy when working interactively from the command line. There are some functions that implement loops to make life easier. These are the *apply functions (lapply,sapply,apply,tapply,mapply)

A useful auxiliary function in combination with the *apply functions is split.

lapply

lapply accepts three arguments: (1) an X list; (2) a FUN function; (3) other arguments via its … argument . If X is not a list, it will be forced to a list using as.list. The actual loop is done internally in C code (so pay attention to the data side of profiling). lapply always returns a list, regardless of the input class .

f <- function(x) {x=x+1}
l <- 1:4
y <- lapply(x,f)
y
[[1]]
[1] 2

[[2]]
[1] 3

[[3]]
[1] 4

[[4]]
[1] 5

sapply

sapply attempts to simplify the result of lappy if possible. If the result is a list where each element is of length 1, a vector is returned If the result is a list where each element is a vector of the same length ( > 1 ) , an array is returned. If sapply fails to simplify, a list is returned.

> z<-sapply(x,f)
> z
[1] 2 3 4 5

Convenient, isn’t it?

apply

apply is used for evaluating a function (often anonymous in nature) to the columns or rows of an array. It is most often used to apply a function to the rows or columns of an array It is not faster than writing a for loop, but it works in a single line.

> str(apply)
function (X, MARGIN, FUN, ...) 

MARGIN is an integer vector indicating which margins are to be “maintained”. The example below will hopefully make this clearer (rnorm is only needed to populate an array with a normal distribution)

x <- matrix(rnorm(200), 20, 10)
> x <- matrix(rnorm(200), 20, 10)
> x
            [,1]        [,2]       [,3]         [,4]        [,5]        [,6]
 [1,]  0.3963635 -0.62742303  0.8559908  0.041758983  0.27968550 -1.51067608
 [2,]  2.7989429 -0.31108344  1.1125721  0.493892152 -0.99877881 -0.03635573
 [3,]  1.2729939  0.68517518 -0.5754894  0.001258338  0.89842612  0.76899917
 [4,] -0.1972043  0.14728076  0.5571848 -2.317407366 -0.94433394  0.22643282
 [5,] -1.0229955 -0.88704721 -0.2349964 -0.583044998 -0.38175450  0.79859082
 [6,] -2.2307039 -0.88393872 -1.1304861 -0.980230092 -1.19778042 -0.43944938
 [7,] -0.3239625 -0.07626231 -0.9167339  0.801793737  0.04583193 -0.44677327
 [8,] -2.1721930 -1.82162706 -0.1360091 -0.182583486  0.55254262 -0.55473856
 [9,]  0.5550490  0.17079315  0.3854185 -0.373603391 -1.23080985 -0.21907699
[10,] -1.2234382 -1.87715072  0.6801925 -0.017450599  0.58342107  1.06882246
[11,]  0.6361036  0.97475767 -0.3576995 -0.976218424 -0.31131696 -0.68747609
[12,] -0.8353006  0.10183363 -0.0431286  0.806635605  0.03548096  0.21567848
[13,] -0.5288823 -0.63436779  0.6056696  1.327055934  1.14380212  1.37146119
[14,] -1.9836052 -0.73356595 -0.6248551 -2.267205407  0.02242875  0.16491377
[15,]  0.8844882 -0.80770863 -1.9859983 -0.451040532 -0.32321439 -0.01792890
[16,]  1.0193884  2.37281492  2.4778778 -0.360414709 -2.43461189  0.30425171
[17,] -0.8749364  0.08347713  0.8584569  0.517227375 -0.65717674  0.67537576
[18,]  1.2128844 -0.69963544  1.4121788  0.318647904  1.81284950 -2.80769002
[19,] -1.0978996 -0.65164157  0.3670463  0.324967520  0.30765970  1.63676395
[20,] -0.2675550  0.48960663 -0.1027670 -0.120393201 -0.17096206 -0.91307091
             [,7]        [,8]        [,9]       [,10]
 [1,] -2.07670554  1.23033091  0.76424487  1.12007931
 [2,]  0.50517705  0.37403268 -0.45168099  0.33952347
 [3,] -0.61391033 -1.38076619 -0.36363689  0.26394571
 [4,] -0.44591035 -0.49826675 -1.24337397 -0.82563916
 [5,]  1.37051701  0.12793799 -0.85043741  0.61634465
 [6,] -0.28234386  0.81821623  1.77128818  0.51458107
 [7,] -1.38204064  1.33920876  1.18078293 -1.75728697
 [8,] -0.27306276  0.60752641  1.90182306  0.21603295
 [9,] -1.01114509 -0.19292137 -2.46298102  0.39364777
[10,]  0.49441357 -0.92996020  0.68627864  0.77054459
[11,]  0.60748224  0.68570630  0.85982787 -1.03110133
[12,]  0.05768264 -0.02326520 -0.02293647  0.45317369
[13,] -0.44162534  0.01338081 -0.83990493  0.23590051
[14,]  0.19310634 -0.67757005 -0.56180696  1.91955470
[15,]  0.37236650 -2.81973163  0.28650254 -1.45773888
[16,]  0.70371313  1.34526893  0.41251129  0.32920038
[17,]  0.52115092  0.92246177 -0.70595800 -0.74992266
[18,]  0.25084633  1.41506634  0.67849775  1.00418821
[19,] -0.74722596 -0.57426375  0.65830451 -0.01854943
[20,] -2.49597657  0.85960698 -0.18431592  0.34410820
>  apply(x, 2, mean)
 [1] -0.19912313 -0.24928564  0.16022123 -0.19981773 -0.14843056 -0.02009729
 [7] -0.23467453  0.13209995  0.07565145  0.13402934

mapply

mapply` is a multivariate version of apply which applies a function in parallel over a series of arguments . An example is below, l1 contains the same elements as l2 rep is a function that “replicates” the passed element a required number of times and returns it as a list, e.g.

> rep(1,5)
[1] 1 1 1 1 1
l1<-list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))

l2<-mapply(rep, 1:4, 4:1)

tapply

tapply is used to apply a function to subsets of a vector

x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10)
tapply(x, f, mean)

split

split takes a vector (or other type of list of objects) and divides it into groups determined by a factor or list of factors.

For example, we split the elements of x >1 and those <=1 into two groups

> split(x,x[]>1)
$`FALSE`
 [1] -1.13857804  0.21853696 -0.25563963  0.51938904  0.24105612  0.13492624
 [7] -0.59042828 -0.52559665  0.67772599  0.79394383  0.81569013  0.48512614
[13]  0.34841619  0.60159291  0.45714304  0.40336332  0.71783361  0.86962872
[19]  0.84581567  0.94133809 -0.36174582  0.83549908  0.05453992

$`TRUE`
[1] 1.343577 1.742792 1.389213 1.252149 1.650007 1.350450 3.162776