Simulation and Control Structures

Dr. Mine Dogucu

1 / 20

Goals

  • Probability distributions and simulating data in R

  • Control structures (if/else, for and while loops, and mapping)

2 / 20
  • Go to the course organization on GitHub
  • Start a new repo called week-07-simulate-data-username, where username is your own GitHub username.
3 / 20

Probability Distributions - Normal

dnorm(x = -1.96, mean = 0, sd = 1)
[1] 0.05844094
pnorm(q = -1.96, mean = 0, sd = 1)
[1] 0.0249979
qnorm(p = 0.0249979, mean = 0, sd = 1, lower.tail = TRUE)
[1] -1.96
rnorm(n = 3, mean = 0, sd = 1)
[1] -1.7421388 0.2278699 -0.8605599
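
The four calls above are related: dnorm() gives the density, pnorm() the cumulative probability, qnorm() the quantile (the inverse of pnorm()), and rnorm() random draws. A minimal sketch of how pnorm() and qnorm() undo each other, using only base R (the object names are chosen here for illustration):

```r
# pnorm() and qnorm() are inverses of each other
p <- pnorm(q = -1.96, mean = 0, sd = 1)
q_back <- qnorm(p = p, mean = 0, sd = 1)
all.equal(q_back, -1.96)   # TRUE

# Area within 1.96 standard deviations of the mean (about 0.95)
central_area <- pnorm(1.96) - pnorm(-1.96)
central_area

# lower.tail = FALSE returns the upper-tail probability instead;
# by symmetry it matches the lower tail at -1.96
upper <- pnorm(q = 1.96, lower.tail = FALSE)
all.equal(upper, p)        # TRUE
```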
4 / 20

Probability Distributions - Normal

library(ggplot2)

# Plot the standard normal density between -3 and 3, hiding the y-axis breaks
ggplot(data = data.frame(x = c(-3, 3)), aes(x)) +
  stat_function(fun = dnorm, n = 101, args = list(mean = 0, sd = 1)) +
  ylab("") +
  scale_y_continuous(breaks = NULL)

5 / 20

Other Probability Functions

  • dbinom(), pbinom(), qbinom(), rbinom()
  • dbeta(), pbeta(), qbeta(), rbeta()
  • dunif(), punif(), qunif(), runif()
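
These families follow the same d/p/q/r naming pattern as the normal functions. As a small illustration with the binomial and uniform families (the parameter values are arbitrary choices for this sketch):

```r
# Probability of exactly 3 successes in 10 trials with success probability 0.5
exactly_3 <- dbinom(x = 3, size = 10, prob = 0.5)
exactly_3   # equals choose(10, 3) / 2^10

# Probability of 3 or fewer successes
at_most_3 <- pbinom(q = 3, size = 10, prob = 0.5)

# The cumulative probability is the sum of the point probabilities
all.equal(at_most_3, sum(dbinom(x = 0:3, size = 10, prob = 0.5)))   # TRUE

# Five draws from a uniform distribution on [0, 1]; values differ between runs
runif(n = 5, min = 0, max = 1)
```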
6 / 20
runif(1)
[1] 0.1590387
set.seed(92697)
runif(1)
[1] 0.7773408

set.seed() fixes the state of the random number generator, so code that involves randomness returns the same results every time it is run.
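
One way to check this is to set the same seed twice and compare the draws. A minimal sketch (the object names are made up for this example):

```r
set.seed(92697)
first_draw <- runif(3)

set.seed(92697)            # reset the generator to the same state
second_draw <- runif(3)

identical(first_draw, second_draw)   # TRUE

third_draw <- runif(3)     # no reset: the stream continues, so values differ
identical(second_draw, third_draw)   # FALSE
```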

7 / 20

while loops

count <- 0
while(count < 10) {
  count <- count + 1
  print(count)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
count <- 0
while(count < 10) {
  print(count)
  count <- count + 1
}
[1] 0
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
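
A while loop is most useful when the number of iterations is not known in advance, which is common in simulation. The sketch below keeps drawing uniform random numbers until one exceeds 0.9; the seed and the 0.9 threshold are arbitrary assumptions made for this example.

```r
set.seed(1)   # arbitrary seed, chosen only for reproducibility

n_draws <- 0
draw <- 0
while (draw <= 0.9) {
  draw <- runif(1)          # one new uniform draw per iteration
  n_draws <- n_draws + 1    # count how many draws were needed
}
n_draws   # number of draws until the first value above 0.9
```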
8 / 20

if/else

if(condition) {
  ## do something
}
## Rest of the code

if(condition) {
  ## do something
} else {
  ## do something else
}

if(condition) {
  ## do something
} else if(another_condition) {
  ## do something different
} else {
  ## do something else
}
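
To make the templates above concrete, here is a small sketch with all three branches. describe_number() is a hypothetical helper written for this example, not a function from the slides or from base R.

```r
# describe_number() is a hypothetical helper for illustration
describe_number <- function(x) {
  if (x < 0) {
    "negative"
  } else if (x == 0) {
    "zero"
  } else {
    "positive"
  }
}

describe_number(-2)   # "negative"
describe_number(0)    # "zero"
describe_number(5)    # "positive"
```

if() expects a single TRUE or FALSE. For a whole vector of conditions, base R's ifelse() is the vectorized counterpart.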
9 / 20
count <- 0
while(count < 10) {
  if(count < 5){
    print(paste(count, "small number"))
  }
  count <- count + 1
}
[1] "0 small number"
[1] "1 small number"
[1] "2 small number"
[1] "3 small number"
[1] "4 small number"
10 / 20
count <- 0
while(count < 10) {
  if(count %% 2 == 0){
    print(paste(count, "even number"))
  }
  count <- count + 1
}
[1] "0 even number"
[1] "2 even number"
[1] "4 even number"
[1] "6 even number"
[1] "8 even number"
11 / 20
count <- 0
while(count < 10) {
  if(count %% 2 == 0){
    print(paste(count, "even number"))
  } else {
    print(paste(count, "odd number"))
  }
  count <- count + 1
}
[1] "0 even number"
[1] "1 odd number"
[1] "2 even number"
[1] "3 odd number"
[1] "4 even number"
[1] "5 odd number"
[1] "6 even number"
[1] "7 odd number"
[1] "8 even number"
[1] "9 odd number"
12 / 20

for loops

for (i in 1:10){
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
13 / 20

for loops

sample_size <- c(30, 60, 100)
for (i in 1:3){
  print(sample_size[i])
}
[1] 30
[1] 60
[1] 100
sample_size <- c(30, 60, 100)
for (i in 1:length(sample_size)){
  print(sample_size[i])
}
[1] 30
[1] 60
[1] 100
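
In a simulation, a for loop usually stores a result for each iteration rather than printing it. The sketch below preallocates a result vector and fills it with one sample mean per sample size. It uses seq_along(), a base R alternative to 1:length() that also behaves sensibly for empty vectors. The seed and object names are assumptions made for this example.

```r
set.seed(1)   # arbitrary seed for reproducibility
sample_size <- c(30, 60, 100)
sample_means <- numeric(length(sample_size))   # preallocate the result vector

for (i in seq_along(sample_size)) {
  draws <- rnorm(n = sample_size[i], mean = 0, sd = 1)
  sample_means[i] <- mean(draws)               # store one mean per sample size
}
sample_means

# Why seq_along() can be safer than 1:length()
empty <- c()
1:length(empty)    # 1 0, so a loop over it would run twice
seq_along(empty)   # integer(0), so a loop over it would not run
```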
14 / 20

apply : R Documentation

Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix.

apply(X, MARGIN, FUN, ..., simplify = TRUE)

X an array, including a matrix.

MARGIN a vector giving the subscripts which the function will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. Where X has named dimnames, it can be a character vector selecting dimension names.

FUN the function to be applied: see ‘Details’. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted.

... optional arguments to FUN.

15 / 20
some_matrix <- matrix(1:30, nrow = 5, ncol = 6)
some_matrix
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    6   11   16   21   26
[2,]    2    7   12   17   22   27
[3,]    3    8   13   18   23   28
[4,]    4    9   14   19   24   29
[5,]    5   10   15   20   25   30
apply(some_matrix, 1, sum) # sum of each row
[1]  81  87  93  99 105
apply(some_matrix, 2, sum) # sum of each column
[1]  15  40  65  90 115 140
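
The `...` argument of apply() passes extra arguments on to FUN. A minimal sketch, assuming one value of the matrix is missing (the NA is introduced here only for illustration):

```r
some_matrix <- matrix(1:30, nrow = 5, ncol = 6)

# Row sums via apply() agree with the dedicated base R function rowSums()
row_totals <- apply(some_matrix, 1, sum)
all.equal(row_totals, rowSums(some_matrix))   # TRUE

# Introduce a missing value to show how extra arguments reach FUN
with_missing <- some_matrix
with_missing[1, 1] <- NA

apply(with_missing, 2, sum)                   # first column sum is NA
col_totals <- apply(with_missing, 2, sum, na.rm = TRUE)
col_totals                                    # first column: 2 + 3 + 4 + 5 = 14
```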
16 / 20

lapply and sapply: R Documentation

Apply a Function over a List or Vector

Description

lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.

sapply is a user-friendly version and wrapper of lapply by default returning a vector, matrix or, if simplify = "array", an array if appropriate, by applying simplify2array(). sapply(x, f, simplify = FALSE, USE.NAMES = FALSE) is the same as lapply(x, f).

17 / 20
sapply(c(0, 1, 2), exp)
[1] 1.000000 2.718282 7.389056
lapply(c(0, 1, 2), exp)
[[1]]
[1] 1
[[2]]
[1] 2.718282
[[3]]
[1] 7.389056
18 / 20
sapply(c(0, 1, 2), exp)
[1] 1.000000 2.718282 7.389056
library(magrittr) # provides the %>% pipe
lapply(c(0, 1, 2), exp) %>%
  unlist()
[1] 1.000000 2.718282 7.389056
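
Both functions also accept an anonymous function, which makes them a compact alternative to a for loop in simulations. A minimal sketch, reusing the sample sizes from the for loop slides (the seed and object names are assumptions made for this example):

```r
set.seed(1)   # arbitrary seed for reproducibility
sample_size <- c(30, 60, 100)

# sapply() returns a plain numeric vector here because each result has length 1
means_vec <- sapply(sample_size, function(n) mean(rnorm(n)))
means_vec

# lapply() keeps each result as a separate list element,
# which suits results of different lengths
draws_list <- lapply(sample_size, function(n) rnorm(n))
lengths(draws_list)   # 30 60 100
```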
19 / 20

This week's (long) task:

How does missing data (missing completely at random) impact bias and variance in simple linear regression?

Design a simulation to answer this question. You are in charge of developing sub-questions.
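
One possible starting point, offered only as a rough sketch and not as the expected solution: simulate data from a known regression line, delete responses completely at random, and compare the slope estimates with and without the deletions. Every setting below (sample size, number of simulations, proportion missing, true coefficients) is an assumption chosen for illustration.

```r
set.seed(1)            # arbitrary seed for reproducibility

n <- 100               # observations per simulated data set (assumed)
n_sims <- 200          # number of simulated data sets (assumed)
prop_missing <- 0.3    # chance that each y value is deleted (assumed)
true_slope <- 2        # slope used to generate the data (assumed)

slope_full <- numeric(n_sims)
slope_mcar <- numeric(n_sims)

for (s in seq_len(n_sims)) {
  x <- rnorm(n)
  y <- 1 + true_slope * x + rnorm(n)

  # MCAR: every observation has the same chance of being deleted
  y_missing <- y
  y_missing[runif(n) < prop_missing] <- NA

  slope_full[s] <- unname(coef(lm(y ~ x))[2])
  # lm() drops rows with NA under the default na.action
  slope_mcar[s] <- unname(coef(lm(y_missing ~ x))[2])
}

# Bias: average estimate minus the true slope
c(bias_full = mean(slope_full) - true_slope,
  bias_mcar = mean(slope_mcar) - true_slope)

# Variance: spread of the estimates across simulated data sets
c(var_full = var(slope_full), var_mcar = var(slope_mcar))
```

Sub-questions could then vary one setting at a time, for example the proportion missing or the sample size.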

20 / 20
