#1 Simulations Overview (Wind Turbine Repair)

14:540:384: Simulation Models in IE (Spring 2025)

Daniel Moore

2025-01-22

Learning Objectives

Develop a motivating example for IE simulation models
Demonstrate the type of coding complexity that we will build up to during the course
Gauge students’ level of expertise in probability models and programming

1 R Programming Basics

1.1 Development Stack

\(\textsf{R}\): R is a language and environment for statistical computing and graphics. Its predecessor, “S”, was developed by Bell Labs in Murray Hill, NJ.
\(\textsf{RStudio}\): Used by millions of people weekly, the RStudio integrated development environment (IDE) is a set of tools built to help you be more productive with R and Python.
Quarto: An open-source scientific and technical publishing system. Analyze. Share. Reproduce. You have a story to tell with data—tell it with Quarto.
Markdown. Lightweight language to format plain text.

1.2 Hello, World!

Standard test to make sure a programming environment is properly configured

print("Hello, World!")

[1] "Hello, World!"

Tip

For \(\textsf{R}\) tutorials, go to top-right corner of \(\textsf{RStudio}\) and look for the “Tutorial” tab. Follow instructions in the pane.

1.3 Package Management

\(\textsf{R}\) is a go-to language for many domains because of its mature packages for many kinds of data analysis. These packages are easily accessible via the Comprehensive R Archive Network (CRAN)

note: click on the tabs to see additional content

Install Packages
Load Packages

This tells CRAN to find your desired package and download a copy to your working environment.

Option 1: Go to Files Pane (lower right) \(\rightarrow\) Packages tab \(\rightarrow\) Install \(\rightarrow\) <search for your package> \(\rightarrow\) follow prompts in console
Option 2: In the console pane (lower left), type install.packages("<pkg_name>") \(\rightarrow\) follow prompts

Code

# installing packages
install.packages("tidyverse")
install.packages("ggExtra")

This tells your current R session to read the [downloaded] packages and then you can use them.

Option 1: File Pane \(\rightarrow\) Packages Tab \(\rightarrow\) scroll or search for your package \(\rightarrow\) click the checkbox
Option 2: In the console or your working file put: library(<pkg_name>).

Code

# loading packages
library(tidyverse)
library(ggExtra)

2 Code Time!

An offshore wind farm operates 300 turbines. The breakdown rate follows an exponential distribution with a rate of one per 2000 hours of operation. How many breakdowns should be anticipated per year?

With this information we can simply take the parameter, \(\lambda\), and multiply it by the hours in a year and the total number of turbines to get the expected number of failures in the year.

\[ X \sim Exp \left( \lambda = \frac{1}{2,000\ hrs} \right) \]

\[ f_X(t | \lambda) = \lambda e^{-\lambda t} \]

\[ N = 300\ turbines \frac{365\ days}{year} \frac{24\ hrs}{day} \frac{1\ breakdown/turbine}{2,000\ hrs} = 1,314\ breakdowns/year \]

This is a silly problem to simulate, but we will do it anyway. We have a few options:

Sample \(X\) to get breakdown times and see how many we get in one year. Then multiply that by 300
Sample \(X\) to get breakdown times for each turbine and add them all together
Do (1) or (2) for \(N\) years and then divide the result by \(N\). (\(N\) could be less than 1. What does that mean?)
Recognize Exponential gives failure times, Poisson gives failures in a given time. Use that instead

Simulation Parameters

# I've defined certain "params" in the header of the document
lambda <- 1 / params$breakdown_rate # Breakdown rate (failures per hour)
N <- params$sim_yrs                 # Total simulation time in years
K <- params$turbines                # Number of turbines
yr <- 24 * 365                      # number of hours in a year

# define a RV, X, which will sample the exp dist. when we call X()
X <- function(n=1) {
  # n gives the number of samples, with the default of 1
  rexp(n, rate = lambda)
}

Tip

<- and = are chosen to follow convention. <- assigns a “value” to a variable while = is used in function signatures.

Sampling Random Variable

Sample it once:

X()

[1] 1820.334

Sample it several times:

X(5)

[1] 5476.1236  317.0939 3002.9103 8743.8847  934.9308

Averaging a few samples

mean(X(15))

[1] 1225.256

Averaging a lot of samples

mean(X(15000))

[1] 2008.425

N-Sample Visualization

Code

# set the samples and increment
n_values = seq(10, 5000, by = 10)

# create a tibble with N, mean, and sd for sampling X() n times
X_sim <- tibble(
  n = n_values,
  Mean = map_dbl(n_values, ~ mean(X(.x))),
  SD = map_dbl(n_values, ~ sd(X(.x)))
)

#plot results
X_sim %>%
  ggplot(aes(x = n, y = Mean)) +
  geom_line(color = "blue") +
  geom_ribbon(aes(ymin = Mean - SD, ymax = Mean + SD), fill = "blue", alpha = 0.2) +
  scale_x_log10() +
  labs(
    x = "Sample Size (n)",
    y = "Mean ± SD"
  ) +
  theme_minimal()

Simulation Code

# Initialize total breakdown count
breakdowns <- 0

# Iterate over each turbine
for (k in 1:K) {
  # simulate for N years
  for (n in 1:N) {
    # Reset initial time to 0
    t <- 0
    
    # while t is less than one year
    while (t <= yr) {
      # Sample from the exponential distribution (time to next breakdown)
      t <- t + X()
      
      # Increment the breakdown count if within the simulation period
      if (t <= yr) {
        breakdowns <- breakdowns + 1
      }
    }
  }
}

# Calculate the average number of breakdowns per year across all turbines
breakdowns <- breakdowns / N

We should expect 1310.12 breakdowns in a year for the 300-turbine wind farm.

Code Flow Chart:

flowchart LR
  A["X ~ Exp(λ)
  breakdowns = 0
  t = 0"]
  B{"t <= 1 year"}
  C["t += X()"]
  D{"t <= 1 year"}
  E["breakdowns++"]
  Terminate([Terminate])
  
  A --> B
  B --true--> C
  C --> D
  D --true--> E
  D --false--> B
  E --> B
  B --false-----> Terminate

Smarter Simulation

Exponential gives time between events
Poisson gives number of events in a given time

# Poisson parameter is # events per time period
lambda_p <- yr * lambda

# we want to sample K*N times, This gives K*N breakdowns (vector)
poisson_breakdowns <- rpois(N*K, lambda_p)

# sum the vector and divide by N
poisson_breakdowns <- sum(poisson_breakdowns) / N

We should expect 1309.88 breakdowns in a year for the 300-turbine wind farm.

Tip

Search documentation by going to the lower right pane and clicking the “Help” tab. Then type your term (i.e. rpois) into the search bar.

2.2 Wind Turbine Repair

Repairs are assumed to take 96 hours (the average). How much turbine-down time should be expected per year? How many breakdowns should be anticipated per year?

Analytical Solution

With a fixed repair time, we can tack 96 hours on to the breakdown rate, 1 break down every2000 hours + 96 hours to repair = 1 breakdown every 2096.

\[ N = 300\ turbines \frac{365\ days}{year} \frac{24\ hrs}{day} \frac{1\ breakdown/turbine}{2,096\ hrs} = 1,253.8\ breakdowns/year \]

Accounting for repair time, our number of breakdowns…decreased? Is this sensible?

\[ N = 300\ turbines \frac{365\ days}{year} \frac{24\ hrs}{day} \frac{96\ hrs\ downtime}{2,096\ hrs\ operation} = 120,366.4\ hrs\ downtime \]

Are we confident that this math checks out? Let’s simulate it…

Simulation Modification

# repair time, Y
Y <- params$repair_avg

# Initialize breakdowns and uptime
breakdowns <- 0
uptime <- 0

# Iterate over each turbine
for (k in 1:K) {
  # simulate for N years
  for (n in 1:N) {
    # Reset simulation time to 0
    t <- 0
    
    # while t is less than one year
    while (t <= yr) {
      
      # sample breakdown time
      time_to_breakdown <- X()

      # check if the breakdown time will exceed a year
      if (t + time_to_breakdown <= yr) {
        # if not, then we get another breakdown
        breakdowns <- breakdowns + 1
        # the simulation time increases by the time to breakdown
        t <- t + time_to_breakdown
        # and the uptime increases by the time to breakdown
        uptime <- uptime + time_to_breakdown
        
        # for now, Y is a constant
        time_to_repair <- Y

        t <- t + time_to_repair
        
      } else {
        # if the breakdown time exceeds one year, then uptime
        # extends to the end of the year and we don't get a new breakdown
        uptime <- uptime + (yr - t)
        t <- t + time_to_breakdown
      }
    }
  }
}

# Calculate the average number of breakdowns per year across all turbines
uptime <- uptime / N
breakdowns <- breakdowns / N
downtime <- 300*yr - uptime

Results Comparison

Metric	Analytical	Simulation
\(breakdowns/yr\)	1253.8	1255.52
\(total\ turbine\ downtime/yr\)	120366.4	119890.2

Shortfalls?

Assumes every turbine is working at the start of the simulation
What can we say about the probability of downtime exceeding some value?
If we want to increase up time, should we focus on expediting repairs or performing preventive maintenance to increase time between failures?
Do all repairs take the same amount of time?

2.3 LogNormally Distributed Repair Time

Most repairs take about the same amount of time, but some take a long time (think specialty overseas parts etc).

\[ Y \sim LogNormal(\mu, \sigma^2) \] \[ f_Y(t | \mu, \sigma) = \frac{1}{\sqrt{2\pi \sigma t}} \exp \left( \frac{-(log(t)-\mu)^2}{2\sigma^2} \right) \]

LogNormal Distribution

These are the same distribution with the left x-axis on a log scale. The lower end of ranges look basically like a Normal distribution for both plots, but the right plot shows that the higher values of \(t\) are more likely than if it were just a Normal distribution. The pdf looks like it is being stretched.

Creating LogNormal Distribution

mu <- params$repair_avg
sigma <- params$repair_std
mu_log <- log( mu^2 / ( mu^2 + sigma^2)^(1/2) )
sigma_log <- (log(1 + (sigma^2/mu^2)))^(1/2)

Y <- function(n=1){
  rlnorm(n, mu_log, sigma_log)
}

Verifying LogNormal Distribution

repair_times <- Y(500)

mu_calc <- mean(repair_times)
sigma_calc <- sd(repair_times)

Sample mean: 96.1573838
Sample standard deviation: 25.0175831

Tip

You often can’t validate your entire simulation. However, you can and should verify the components as much as possible

Analytical Solution

Need to determine how many times \(X + Y\) divide into one year
Adding two distributions is not straightforward. How do you add a pair of 6-sided die?
This is really a convolution, denoted as \(X \ast Y\). It can be found as:

\[(f_X * f_Y)(t) = \int_{-\infty}^{\infty} f_X(\tau) f_Y(t - \tau) \, d\tau\]

Sometimes, this can be achieved easier through Laplace Transforms
Unfortunately, a closed-form analytical solution is not always possible

Simulation Adjustment

# Initialize breakdowns and uptime
breakdowns <- 0
uptime <- 0

# Iterate over each turbine
for (k in 1:K) {
  # simulate for N years
  for (n in 1:N) {
    # Reset simulation time to 0
    t <- 0
    
    # while t is less than one year
    while (t <= yr) {
      
      # sample breakdown time
      time_to_breakdown <- X()

      # check if the breakdown time will exceed a year
      if (t + time_to_breakdown <= yr) {
        # if not, then we get another breakdown
        breakdowns <- breakdowns + 1
        # the simulation time increases by the time to breakdown
        t <- t + time_to_breakdown
        # and the uptime increases by the time to breakdown
        uptime <- uptime + time_to_breakdown
        
        # now we will sample Y
        time_to_repair <- Y()

        t <- t + time_to_repair
        
      } else {
        # if the breakdown time exceeds one year, then uptime
        # extends to the end of the year and we don't get a new breakdown
        uptime <- uptime + (yr - t)
        t <- t + time_to_breakdown
      }
    }
  }
}

# Calculate the average number of breakdowns per year across all turbines
uptime <- uptime / N
breakdowns <- breakdowns / N
downtime <- 300*yr - uptime

Results Comparison, LogNormal Dist.

Metric	Analytical	Simulation
\(breakdowns/yr\)	1253.8	1267.12
\(total\ turbine\ downtime/yr\)	120366.4	120610.4

2.4 Collecting more Simulation Data

We want to ask questions about the distribution of failures and repairs. We need to collect more data from the simulation and store it in an easy format for manipuation.

Tidy Data & Tidyverse

# Create an empty table which we will populate during simulation
results <- tibble(
  turbine_id = integer(),
  year = integer(),
  breakdowns = integer(),
  uptime = numeric()
)

Tip

tidyverse is the one-stop-shop for (nearly) everything needed for working with data and creating great visualizations. It includes several different packages which all adhere to the “tidy” way of doing things. We’ll see more of this in the future.

Final Simulation

# Iterate over each turbine
for (k in 1:K) {
  # simulate for N years
  for (n in 1:N) {
    # move these to inner loop to record results of each iteration
    breakdowns <- 0
    uptime <- 0
    t <- 0
    
    # while t is less than one year
    while (t <= yr) {
      
      # sample breakdown time
      time_to_breakdown <- X()

      # check if the breakdown time will exceed a year
      if (t + time_to_breakdown <= yr) {
        # if not, then we get another breakdown
        breakdowns <- breakdowns + 1
        # the simulation time increases by the time to breakdown
        t <- t + time_to_breakdown
        # and the uptime increases by the time to breakdown
        uptime <- uptime + time_to_breakdown
        
        # now sampling Y
        time_to_repair <- Y()

        t <- t + time_to_repair
        
      } else {
        # if the breakdown time exceeds one year, then uptime
        # extends to the end of the year and we don't get a new breakdown
        uptime <- uptime + (yr - t)
        t <- t + time_to_breakdown
      }
    }
    
    # write the data to the rows of our tibble
    results <- results |>
      add_row(
        turbine_id = k,
        year = n,
        breakdowns = breakdowns,
        uptime = uptime
      )
  }
}

# calculating a column for downtime 
results <- results |>
  mutate(downtime = yr - uptime)

Simulation Visualizations

Breakdowns
Downtime
Downtime vs. Breakdowns

Results Cumulative Distributions

Total Breakdowns
Annual Downtime

3 Conclusion

We have some familiarity with the development environment in R, literate programming, and Quarto
Provided motivation for why even “simple” questions may be best answered with simulation
Introduced Exponential, Poisson, and LogNormal distributions
Demonstrated how to write a computer program which simulates our problem, check it against an available analytical solution, and then extend it to the more complex case