[1] "Hello, World!"
14:540:384: Simulation Models in IE (Spring 2025)
2025-01-22
Develop a motivating example for IE simulation models
Demonstrate the type of coding complexity that we will build up to during the course
Gauge students’ level of expertise in probability models and programming
Tip
For \(\textsf{R}\) tutorials, go to top-right corner of \(\textsf{RStudio}\) and look for the “Tutorial” tab. Follow instructions in the pane.
\(\textsf{R}\) is a go-to language for many domains because of its mature packages for many kinds of data analysis. These packages are easily accessible via the Comprehensive R Archive Network (CRAN)
note: click on the tabs to see additional content
This tells CRAN to find your desired package and download a copy to your working environment.
install.packages("<pkg_name>") \(\rightarrow\) follow promptsThis tells your current R session to read the [downloaded] packages and then you can use them.
library(<pkg_name>).An offshore wind farm operates 300 turbines. The breakdown rate follows an exponential distribution with a rate of one per 2000 hours of operation. How many breakdowns should be anticipated per year?
With this information we can simply take the parameter, \(\lambda\), and multiply it by the hours in a year and the total number of turbines to get the expected number of failures in the year.
\[ X \sim Exp \left( \lambda = \frac{1}{2,000\ hrs} \right) \]
\[ f_X(t | \lambda) = \lambda e^{-\lambda t} \]
\[ N = 300\ turbines \frac{365\ days}{year} \frac{24\ hrs}{day} \frac{1\ breakdown/turbine}{2,000\ hrs} = 1,314\ breakdowns/year \]
This is a silly problem to simulate, but we will do it anyway. We have a few options:
# I've defined certain "params" in the header of the document
lambda <- 1 / params$breakdown_rate # Breakdown rate (failures per hour)
N <- params$sim_yrs # Total simulation time in years
K <- params$turbines # Number of turbines
yr <- 24 * 365 # number of hours in a year
# define a RV, X, which will sample the exp dist. when we call X()
X <- function(n=1) {
# n gives the number of samples, with the default of 1
rexp(n, rate = lambda)
}Tip
<- and = are chosen to follow convention. <- assigns a “value” to a variable while = is used in function signatures.
Sample it once:
Sample it several times:
Averaging a few samples
Averaging a lot of samples
# set the samples and increment
n_values = seq(10, 5000, by = 10)
# create a tibble with N, mean, and sd for sampling X() n times
X_sim <- tibble(
n = n_values,
Mean = map_dbl(n_values, ~ mean(X(.x))),
SD = map_dbl(n_values, ~ sd(X(.x)))
)
#plot results
X_sim %>%
ggplot(aes(x = n, y = Mean)) +
geom_line(color = "blue") +
geom_ribbon(aes(ymin = Mean - SD, ymax = Mean + SD), fill = "blue", alpha = 0.2) +
scale_x_log10() +
labs(
x = "Sample Size (n)",
y = "Mean ± SD"
) +
theme_minimal()# Initialize total breakdown count
breakdowns <- 0
# Iterate over each turbine
for (k in 1:K) {
# simulate for N years
for (n in 1:N) {
# Reset initial time to 0
t <- 0
# while t is less than one year
while (t <= yr) {
# Sample from the exponential distribution (time to next breakdown)
t <- t + X()
# Increment the breakdown count if within the simulation period
if (t <= yr) {
breakdowns <- breakdowns + 1
}
}
}
}
# Calculate the average number of breakdowns per year across all turbines
breakdowns <- breakdowns / NWe should expect 1310.12 breakdowns in a year for the 300-turbine wind farm.
flowchart LR
A["X ~ Exp(λ)
breakdowns = 0
t = 0"]
B{"t <= 1 year"}
C["t += X()"]
D{"t <= 1 year"}
E["breakdowns++"]
Terminate([Terminate])
A --> B
B --true--> C
C --> D
D --true--> E
D --false--> B
E --> B
B --false-----> Terminate
We should expect 1309.88 breakdowns in a year for the 300-turbine wind farm.
Tip
Search documentation by going to the lower right pane and clicking the “Help” tab. Then type your term (i.e. rpois) into the search bar.
Repairs are assumed to take 96 hours (the average). How much turbine-down time should be expected per year? How many breakdowns should be anticipated per year?
\[ N = 300\ turbines \frac{365\ days}{year} \frac{24\ hrs}{day} \frac{1\ breakdown/turbine}{2,096\ hrs} = 1,253.8\ breakdowns/year \]
\[ N = 300\ turbines \frac{365\ days}{year} \frac{24\ hrs}{day} \frac{96\ hrs\ downtime}{2,096\ hrs\ operation} = 120,366.4\ hrs\ downtime \]
# repair time, Y
Y <- params$repair_avg
# Initialize breakdowns and uptime
breakdowns <- 0
uptime <- 0
# Iterate over each turbine
for (k in 1:K) {
# simulate for N years
for (n in 1:N) {
# Reset simulation time to 0
t <- 0
# while t is less than one year
while (t <= yr) {
# sample breakdown time
time_to_breakdown <- X()
# check if the breakdown time will exceed a year
if (t + time_to_breakdown <= yr) {
# if not, then we get another breakdown
breakdowns <- breakdowns + 1
# the simulation time increases by the time to breakdown
t <- t + time_to_breakdown
# and the uptime increases by the time to breakdown
uptime <- uptime + time_to_breakdown
# for now, Y is a constant
time_to_repair <- Y
t <- t + time_to_repair
} else {
# if the breakdown time exceeds one year, then uptime
# extends to the end of the year and we don't get a new breakdown
uptime <- uptime + (yr - t)
t <- t + time_to_breakdown
}
}
}
}
# Calculate the average number of breakdowns per year across all turbines
uptime <- uptime / N
breakdowns <- breakdowns / N
downtime <- 300*yr - uptime| Metric | Analytical | Simulation |
|---|---|---|
| \(breakdowns/yr\) | 1253.8 | 1255.52 |
| \(total\ turbine\ downtime/yr\) | 120366.4 | 119890.2 |
Assumes every turbine is working at the start of the simulation
What can we say about the probability of downtime exceeding some value?
If we want to increase up time, should we focus on expediting repairs or performing preventive maintenance to increase time between failures?
Do all repairs take the same amount of time?
\[ Y \sim LogNormal(\mu, \sigma^2) \] \[ f_Y(t | \mu, \sigma) = \frac{1}{\sqrt{2\pi \sigma t}} \exp \left( \frac{-(log(t)-\mu)^2}{2\sigma^2} \right) \]
These are the same distribution with the left x-axis on a log scale. The lower end of ranges look basically like a Normal distribution for both plots, but the right plot shows that the higher values of \(t\) are more likely than if it were just a Normal distribution. The pdf looks like it is being stretched.
Tip
You often can’t validate your entire simulation. However, you can and should verify the components as much as possible
Need to determine how many times \(X + Y\) divide into one year
Adding two distributions is not straightforward. How do you add a pair of 6-sided die?
This is really a convolution, denoted as \(X \ast Y\). It can be found as:
\[(f_X * f_Y)(t) = \int_{-\infty}^{\infty} f_X(\tau) f_Y(t - \tau) \, d\tau\]
Sometimes, this can be achieved easier through Laplace Transforms
Unfortunately, a closed-form analytical solution is not always possible
# Initialize breakdowns and uptime
breakdowns <- 0
uptime <- 0
# Iterate over each turbine
for (k in 1:K) {
# simulate for N years
for (n in 1:N) {
# Reset simulation time to 0
t <- 0
# while t is less than one year
while (t <= yr) {
# sample breakdown time
time_to_breakdown <- X()
# check if the breakdown time will exceed a year
if (t + time_to_breakdown <= yr) {
# if not, then we get another breakdown
breakdowns <- breakdowns + 1
# the simulation time increases by the time to breakdown
t <- t + time_to_breakdown
# and the uptime increases by the time to breakdown
uptime <- uptime + time_to_breakdown
# now we will sample Y
time_to_repair <- Y()
t <- t + time_to_repair
} else {
# if the breakdown time exceeds one year, then uptime
# extends to the end of the year and we don't get a new breakdown
uptime <- uptime + (yr - t)
t <- t + time_to_breakdown
}
}
}
}
# Calculate the average number of breakdowns per year across all turbines
uptime <- uptime / N
breakdowns <- breakdowns / N
downtime <- 300*yr - uptime| Metric | Analytical | Simulation |
|---|---|---|
| \(breakdowns/yr\) | 1253.8 | 1267.12 |
| \(total\ turbine\ downtime/yr\) | 120366.4 | 120610.4 |
We want to ask questions about the distribution of failures and repairs. We need to collect more data from the simulation and store it in an easy format for manipuation.
# Iterate over each turbine
for (k in 1:K) {
# simulate for N years
for (n in 1:N) {
# move these to inner loop to record results of each iteration
breakdowns <- 0
uptime <- 0
t <- 0
# while t is less than one year
while (t <= yr) {
# sample breakdown time
time_to_breakdown <- X()
# check if the breakdown time will exceed a year
if (t + time_to_breakdown <= yr) {
# if not, then we get another breakdown
breakdowns <- breakdowns + 1
# the simulation time increases by the time to breakdown
t <- t + time_to_breakdown
# and the uptime increases by the time to breakdown
uptime <- uptime + time_to_breakdown
# now sampling Y
time_to_repair <- Y()
t <- t + time_to_repair
} else {
# if the breakdown time exceeds one year, then uptime
# extends to the end of the year and we don't get a new breakdown
uptime <- uptime + (yr - t)
t <- t + time_to_breakdown
}
}
# write the data to the rows of our tibble
results <- results |>
add_row(
turbine_id = k,
year = n,
breakdowns = breakdowns,
uptime = uptime
)
}
}
# calculating a column for downtime
results <- results |>
mutate(downtime = yr - uptime)We have some familiarity with the development environment in R, literate programming, and Quarto
Provided motivation for why even “simple” questions may be best answered with simulation
Introduced Exponential, Poisson, and LogNormal distributions
Demonstrated how to write a computer program which simulates our problem, check it against an available analytical solution, and then extend it to the more complex case