## Apply Family

### Apply function

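The apply function applies a function to every row or every column of a matrix or data frame. It is typically used with an aggregating function, such as mean or sum, that returns a single number (a scalar) for each row or column.
The examples below use the iris dataset, which ships with R, for practice.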

### R code with apply function

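> data_prac=iris[,-5]

The index -5 removes the fifth column, Species, from the iris dataset. We cannot use apply() with a numeric function such as mean on a dataset that contains a factor or character variable, because apply() first converts the data frame to a matrix, and a single non-numeric column turns the whole matrix into character data.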

>apply(data_prac,2,mean)
Sepal.Length   Sepal.Width   Petal.Length   Petal.Width
5.843333  3.057333  3.758000  1.199333

Here data_prac is the data, 2 is the margin and means that the function is applied to each column (1 would mean each row), and mean is the function that is applied to every column.
Similarly, we can replace mean with sum or var.
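
As a minimal sketch of swapping the function and the margin, the block below computes column sums, column variances and row means on the same numeric columns. The variable names col_sums, col_vars and row_means are only illustrative and do not come from the original text.

```r
# Numeric columns of the built-in iris data (Species, column 5, is dropped)
data_prac <- iris[, -5]

# MARGIN = 2: apply the function to each column
col_sums <- apply(data_prac, 2, sum)
col_vars <- apply(data_prac, 2, var)

# MARGIN = 1: apply the function to each row instead
row_means <- apply(data_prac, 1, mean)

print(col_sums)
print(col_vars)
print(head(row_means))
```

With margin 2 the result has one value per column (four here), while with margin 1 it has one value per row (150 here).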

Create a character vector whose length equals the number of rows of the iris dataset, such that each element is "Greater than 5" if the corresponding Sepal.Length > 5, and "Less than 5" otherwise (values exactly equal to 5 also fall into the second group).
a) Write the logic for the above problem statement using a for loop and an if-else statement.
b) Write the logic for the above problem statement using the ifelse() function.
c) Write the logic for the same problem statement using the apply() function.
A) Chara_vector<-character(nrow(iris))
for(i in 1:nrow(iris)){
  if(iris\$Sepal.Length[i]>5){
    Chara_vector[i]<-"Greater than 5"
  } else {
    Chara_vector[i]<-"Less than 5"
  }
}

B) chara_vector2<-ifelse(iris\$Sepal.Length>5,"Greater than 5","Less than 5")

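C) chara_vector3<-apply(iris[,-c(3,4,5)],2,function(x) {ifelse(x>5,"Greater than 5","Less than 5")})
chara_vector3[,"Sepal.Length"]

Here apply() labels both remaining columns (Sepal.Length and Sepal.Width) and returns a character matrix; the last line extracts the Sepal.Length column.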
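
The three solutions can be checked against each other. The following is a sketch rather than the author's own code: the names by_loop, by_ifelse and by_apply are hypothetical, and the apply() variant is restricted to a one-column data frame for brevity.

```r
# (a) for loop with if/else, filling a preallocated character vector
n <- nrow(iris)
by_loop <- character(n)
for (i in seq_len(n)) {
  if (iris$Sepal.Length[i] > 5) {
    by_loop[i] <- "Greater than 5"
  } else {
    by_loop[i] <- "Less than 5"
  }
}

# (b) vectorised ifelse()
by_ifelse <- ifelse(iris$Sepal.Length > 5, "Greater than 5", "Less than 5")

# (c) apply() over the single column of a one-column data frame
label_matrix <- apply(iris[, "Sepal.Length", drop = FALSE], 2,
                      function(x) ifelse(x > 5, "Greater than 5", "Less than 5"))
by_apply <- as.vector(label_matrix[, "Sepal.Length"])

# All three approaches should produce the same labels
print(identical(by_loop, by_ifelse))
print(identical(by_loop, by_apply))
print(table(by_ifelse))
```

For a task like this the vectorised ifelse() is usually the shortest and clearest of the three.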

### Lapply

lapply is similar to apply, but it takes a list (or vector) as input and returns a list as output.
The elements of the list can have different lengths.

### R code for lapply

First we have to create a list:
> data_lapp<-list(x=1:5,y=6:12,z=15:25)

x, y and z have different lengths.

> lapply(data_lapp,FUN = mean)
\$x
[1] 3
\$y
[1] 9
\$z
[1] 20

Here data_lapp is the input list and FUN is the function to apply, in this case mean, so the mean is calculated for x, y and z.
Similarly, to calculate the variance:

> lapply(data_lapp,FUN = var)
\$x
[1] 2.5
\$y
[1] 4.666667
\$z
[1] 11

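Next, import the two score datasets, half_yearly and finally.
We then calculate the mean and variance of the indicator "Maths score above 70" for half_yearly and finally. Because the vectors in the list are logical, the mean is the proportion of scores above 70.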

> list_score=list(math_half=half_yearly\$Maths>70,math_final=finally\$Maths>70)
> lapply(list_score,mean)
\$math_half
[1] 0.38
\$math_final
[1] 0.5

> lapply(list_score,var)
\$math_half
[1] 0.2404082
\$math_final
[1] 0.255102
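
Since the half_yearly and finally datasets are not shown here, the sketch below uses made-up scores to illustrate why the mean of a logical vector gives a proportion. The vectors maths_half and maths_final are hypothetical stand-ins, not the real data, so the numbers differ from the output above.

```r
# Hypothetical scores standing in for half_yearly$Maths and finally$Maths
maths_half  <- c(65, 72, 80, 55, 91, 68, 74, 60, 85, 70)
maths_final <- c(71, 75, 82, 58, 95, 69, 78, 66, 88, 73)

# Each list element is a logical vector: TRUE where the score is above 70
list_score <- list(math_half = maths_half > 70, math_final = maths_final > 70)

# TRUE counts as 1 and FALSE as 0, so mean() gives the share of TRUE values
prop_above  <- lapply(list_score, mean)
# sum() gives the number of TRUE values
count_above <- lapply(list_score, sum)

print(prop_above)
print(count_above)
```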

### Sapply

sapply is similar to lapply, but it returns a vector instead of a list whenever the result can be simplified.

### R code for sapply

> sapply(data_lapp,FUN = mean)
x y z
3 9 20

Similarly, we can apply other functions.

>list_score=list(math_half=half_yearly\$Maths>70,math_final=finally\$Maths>70)
> sapply(list_score,mean)
math_half math_final
0.38 0.50
> sapply(list_score,var)
math_half math_final
0.2404082 0.2551020
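
To make the difference in return type concrete, here is a small sketch that runs lapply and sapply on the same list. The names as_list, as_vector and as_matrix are illustrative only, and the use of range is an added example, not part of the original walkthrough.

```r
data_lapp <- list(x = 1:5, y = 6:12, z = 15:25)

as_list   <- lapply(data_lapp, mean)  # a list with one number per element
as_vector <- sapply(data_lapp, mean)  # a named numeric vector

# When the function returns a vector of the same length for every element
# (range() returns the minimum and maximum), sapply() builds a matrix
# with one column per list element
as_matrix <- sapply(data_lapp, range)

print(class(as_list))
print(as_vector)
print(as_matrix)
```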

### tapply

tapply splits a vector into groups according to another variable, usually the levels of a factor, and then applies a function to each group.
Take the iris dataset: there we have only one factor variable, Species, which has three levels.

>data_taap=iris
>tapply(data_taap\$Sepal.Length,data_taap\$Species,mean)
setosa  versicolor  virginica
5.006  5.936  6.588

Here data_taap\$Sepal.Length is the variable to which the function is applied, data_taap\$Species is the splitting factor variable, and mean is the function applied.
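
The split-then-apply behaviour can be reproduced by hand, which may help to see what tapply does internally. This is a sketch under the assumption that split() followed by sapply() is an acceptable stand-in; the names groups, by_tapply and by_split are illustrative.

```r
# One call: group Sepal.Length by Species and take the mean of each group
by_tapply <- tapply(iris$Sepal.Length, iris$Species, mean)

# The same in two steps: split into a list of groups, then apply mean to each
groups   <- split(iris$Sepal.Length, iris$Species)
by_split <- sapply(groups, mean)

print(by_tapply)
print(by_split)
print(all.equal(as.vector(by_tapply), as.vector(by_split)))
```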

### Mapply

mapply is the multivariate version of sapply.
We can pass several lists or vectors, and mapply applies the function to the first elements of each, then to the second elements, and so on.

> l1 <- list(a = c(1:10), b = c(11:20))
>l2 <- list(c = c(21:30), d = c(31:40))

Sum the corresponding elements of l1 and l2:

>mapply(sum, l1\$a, l1\$b, l2\$c, l2\$d)
[1] 64  68   72   76  80  84  88  92  96  100
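
As a rough illustration of the element-by-element behaviour, the sketch below compares mapply with plain vector addition and then uses a custom two-argument function. The names by_mapply, by_plus and weighted, and the anonymous function itself, are hypothetical additions.

```r
l1 <- list(a = 1:10, b = 11:20)
l2 <- list(c = 21:30, d = 31:40)

# sum() is called once per position: sum(1, 11, 21, 31), sum(2, 12, 22, 32), ...
by_mapply <- mapply(sum, l1$a, l1$b, l2$c, l2$d)

# For simple addition, vectorised arithmetic gives the same result
by_plus <- l1$a + l1$b + l2$c + l2$d

# mapply() also accepts a function of several arguments
weighted <- mapply(function(p, q) p * 2 + q, l1$a, l1$b)

print(by_mapply)
print(all(by_mapply == by_plus))
print(weighted)
```

When the operation is already vectorised, as addition is, the arithmetic form is simpler; mapply is more useful when the function only works on one set of arguments at a time.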