# 8.2.1: Simulating Multi-step Experiments - Mathematics

## Lesson

Let's simulate more complicated events.

Exercise \(\PageIndex{1}\): Notice and Wonder: Ski Business

What do you notice? What do you wonder?

Exercise \(\PageIndex{2}\): Alpine Zoom

Alpine Zoom is a ski business. To make money over spring break, they need it to snow at least 4 out of the 10 days. The weather forecast says there is a \(\frac{1}{3}\) chance it will snow each day during the break.

Use the applet to simulate the weather for 10 days of break to see if Alpine Zoom will make money.

1. Describe a chance experiment that you could use to simulate whether it will snow on the first day of break.
2. How could this chance experiment be used to determine whether Alpine Zoom will make money?
• In each trial, spin the spinner 10 times, then record the number of 1’s that appear in the row.
• The applet reports in the last column whether Alpine Zoom will make money.
• Click Next to get the spin button back and start the next simulation.
3. Based on your simulations, estimate the probability that Alpine Zoom will make money.
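The spinner experiment can also be run in code. Below is a plain-Python sketch of the same simulation the applet performs (the function name and trial count are illustrative): each trial spins a three-section spinner 10 times, a result of 1 represents snow, and the fraction of trials with at least 4 snowy days estimates the probability.

```python
import random

def simulate_break(num_trials=1000, seed=1):
    """Fraction of simulated breaks in which it snows on at least
    4 of the 10 days, given a 1/3 chance of snow each day."""
    rng = random.Random(seed)
    money_makers = 0
    for _ in range(num_trials):
        # one trial = 10 spins of a three-section spinner; "1" means snow
        snowy_days = sum(1 for _ in range(10) if rng.randint(1, 3) == 1)
        if snowy_days >= 4:
            money_makers += 1
    return money_makers / num_trials

print(simulate_break())
```

With many trials the estimate settles near 0.44, so Alpine Zoom makes money a little less than half the time.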

Exercise \(\PageIndex{3}\): Kiran's Game

Kiran invents a game that uses a board with alternating black and white squares. A playing piece starts on a white square and must advance 4 squares to the other side of the board within 5 turns to win the game.

For each turn, the player draws a block from a bag containing 2 black blocks and 2 white blocks. If the block color matches the color of the next square on the board, the playing piece moves onto it. If it does not match, the playing piece stays on its current square.

1. Take turns playing the game until each person in your group has played the game twice.
2. Use the results from all the games your group played to estimate the probability of winning Kiran’s game.
3. Do you think your estimate of the probability of winning is a good estimate? How could it be improved?

How would each of these changes, on its own, affect the probability of winning the game?

1. Change the rules so that the playing piece must move 7 spaces within 8 moves.
2. Change the board so that all the spaces are black.
3. Change the blocks in the bag to 3 black blocks and 1 white block.
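As a rough check on the estimates from playing the game by hand, Kiran's game can be simulated in code. This is a sketch in Python (the lesson uses physical blocks), assuming the drawn block is returned to the bag after every turn; `play_kiran` and `estimate_win_probability` are illustrative names.

```python
import random

def play_kiran(rng):
    """One game: the piece must advance 4 squares within 5 turns.
    Board squares alternate colors and the piece starts on white,
    so the next square is black, then white, and so on."""
    position = 0  # squares advanced so far
    for _ in range(5):
        next_color = "black" if position % 2 == 0 else "white"
        draw = rng.choice(["black", "black", "white", "white"])
        if draw == next_color:
            position += 1
        if position == 4:
            return True
    return False

def estimate_win_probability(num_games=10000, seed=1):
    rng = random.Random(seed)
    return sum(play_kiran(rng) for _ in range(num_games)) / num_games

print(estimate_win_probability())
```

Because the bag holds 2 black and 2 white blocks, every turn matches with probability \(\frac{1}{2}\) regardless of the square's color, so the exact win probability is \(\frac{6}{32} \approx 0.19\), and the simulated estimate should land nearby.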

Exercise \(\PageIndex{4}\): Simulation Nation

Match each situation to a simulation.

Situations:

1. In a small lake, 25% of the fish are female. You capture a fish, record whether it is male or female, and toss the fish back into the lake. If you repeat this process 5 times, what is the probability that at least 3 of the 5 fish are female?
2. Elena makes about 80% of her free throws. Based on her past successes with free throws, what is the probability that she will make exactly 4 out of 5 free throws in her next basketball game?
3. On a game show, a contestant must pick one of three doors. In the first round, the winning door has a vacation. In the second round, the winning door has a car. What is the probability of winning a vacation and a car?
4. Your choir is singing in 4 concerts. You and one of your classmates both learned the solo. Before each concert, there is an equal chance the choir director will select you or the other student to sing the solo. What is the probability that you will be selected to sing the solo in exactly 3 of the 4 concerts?

Simulations:

1. Toss a standard number cube 2 times and record the outcomes. Repeat this process many times and find the proportion of the simulations in which a 1 or 2 appeared both times to estimate the probability.
2. Make a spinner with four equal sections labeled 1, 2, 3, and 4. Spin the spinner 5 times and record the outcomes. Repeat this process many times and find the proportion of the simulations in which a 4 appears 3 or more times to estimate the probability.
3. Toss a fair coin 4 times and record the outcomes. Repeat this process many times, and find the proportion of the simulations in which exactly 3 heads appear to estimate the probability.
4. Place 8 blue chips and 2 red chips in a bag. Shake the bag, select a chip, record its color, and then return the chip to the bag. Repeat the process 4 more times to obtain a simulated outcome. Then repeat this process many times and find the proportion of the simulations in which exactly 4 blues are selected to estimate the probability.

### Summary

The more complex a situation is, the harder it can be to estimate the probability of a particular event happening. Well-designed simulations are a way to estimate a probability in a complex situation, especially when it would be difficult or impossible to determine the probability from reasoning alone.

To design a good simulation, we need to know something about the situation. For example, if we want to estimate the probability that it will rain every day for the next three days, we could look up the weather forecast for the next three days. Here is a table showing a weather forecast:

|  | today (Tuesday) | Wednesday | Thursday | Friday |
| --- | --- | --- | --- | --- |
| probability of rain | 0.2 | 0.4 | 0.5 | 0.9 |

Table \(\PageIndex{1}\)

We can set up a simulation to estimate the probability of rain each day with three bags.

• In the first bag, we put 4 slips of paper that say “rain” and 6 that say “no rain.”
• In the second bag, we put 5 slips of paper that say “rain” and 5 that say “no rain.”
• In the third bag, we put 9 slips of paper that say “rain” and 1 that says “no rain.”

Then we can select one slip of paper from each bag and record whether or not there was rain on all three days. If we repeat this experiment many times, we can estimate the probability that there will be rain on all three days by dividing the number of times all three slips said “rain” by the total number of times we performed the simulation.
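The three-bag simulation described above can be sketched in Python, with slips of paper as list entries and one trial drawing one slip from each bag. Names and the trial count are illustrative.

```python
import random

def rain_all_three_days(num_trials=10000, seed=1):
    """Draw one slip from each of the three bags; a trial succeeds
    when all three slips say 'rain'."""
    rng = random.Random(seed)
    bags = [
        ["rain"] * 4 + ["no rain"] * 6,  # Wednesday (0.4)
        ["rain"] * 5 + ["no rain"] * 5,  # Thursday (0.5)
        ["rain"] * 9 + ["no rain"] * 1,  # Friday (0.9)
    ]
    successes = sum(
        1
        for _ in range(num_trials)
        if all(rng.choice(bag) == "rain" for bag in bags)
    )
    return successes / num_trials

print(rain_all_three_days())
```

Since the three days are independent in this model, the estimate should settle near \(0.4 \cdot 0.5 \cdot 0.9 = 0.18\).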

## Practice

Exercise \(\PageIndex{5}\)

Priya’s cat is pregnant with a litter of 5 kittens. Each kitten has a 30% chance of being chocolate brown. Priya wants to know the probability that at least two of the kittens will be chocolate brown.

To simulate this, Priya put 3 white cubes and 7 green cubes in a bag. For each trial, Priya pulled out and returned a cube 5 times. Priya conducted 12 trials.

Here is a table with the results.

| trial number | outcome |
| --- | --- |
| 1 | ggggg |
| 2 | gggwg |
| 3 | wgwgw |
| 4 | gwggg |
| 5 | gggwg |
| 6 | wwggg |
| 7 | gwggg |
| 8 | ggwgw |
| 9 | wwwgg |
| 10 | ggggw |
| 11 | wggwg |
| 12 | gggwg |

Table \(\PageIndex{2}\)

1. How many successful trials were there? Describe how you determined if a trial was a success.
2. Based on this simulation, estimate the probability that exactly two kittens will be chocolate brown.
3. Based on this simulation, estimate the probability that at least two kittens will be chocolate brown.
4. Write and answer another question Priya could answer using this simulation.
5. How could Priya increase the accuracy of the simulation?

Exercise \(\PageIndex{6}\)

A team has a 75% chance to win each of the 3 games they will play this week. Clare simulates the week of games by putting 4 pieces of paper in a bag, 3 labeled “win” and 1 labeled “lose.” She draws a paper, writes down the result, then replaces the paper and repeats the process two more times. Clare gets the result: win, win, lose. What can Clare do to estimate the probability the team will win at least 2 games?

Exercise \(\PageIndex{7}\)

1. List the sample space for selecting a letter at random from the word “PINEAPPLE.”
2. A letter is randomly selected from the word “PINEAPPLE.” Which is more likely, selecting “E” or selecting “P”? Explain your reasoning.

(From Unit 8.1.5)

Exercise \(\PageIndex{8}\)

On a graph of side length of a square vs. its perimeter (perimeter on the horizontal axis), two points are plotted: (9, 2.25) and (20, 5).

1. Add at least two more ordered pairs to the graph.

2. Is there a proportional relationship between the perimeter and side length? Explain how you know.

(From Unit 2.4.2)

## 7.8 Probability and Sampling

IM 6–8 Math was originally developed by Open Up Resources and authored by Illustrative Mathematics®, and is copyright 2017-2019 by Open Up Resources. It is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). OUR's 6–8 Math Curriculum is available at https://openupresources.org/math-curriculum/.

The second set of English assessments (marked as set "B") are copyright 2019 by Open Up Resources, and are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

Spanish translation of the "B" assessments are copyright 2020 by Illustrative Mathematics, and are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The Illustrative Mathematics name and logo are not subject to the Creative Commons license and may not be used without the prior and express written consent of Illustrative Mathematics.

## Lesson 7

In this lesson, students see that compound events can be simulated by using multiple chance experiments. In this case, it is important to communicate precisely what represents one outcome of the simulation (MP6). For example, if we want to know the probability that a family with three children will have at least one girl, we can toss one coin to represent each child and use each set of three coin tosses to represent one family. Therefore, if we toss a coin 30 times, we will have run this simulation only 10 times.
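The three-children example in the paragraph above can be sketched in Python, with each simulated family consuming exactly three coin tosses. The function name is illustrative.

```python
import random

def at_least_one_girl(num_families=10000, seed=1):
    """Each simulated family uses three coin tosses: 'G' = girl, 'B' = boy.
    Returns the fraction of families with at least one girl."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_families):
        children = [rng.choice("GB") for _ in range(3)]
        if "G" in children:
            hits += 1
    return hits / num_families

print(at_least_one_girl())
```

The estimate should settle near \(\frac{7}{8} = 0.875\), since the only family with no girls is BBB.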

Students continue to consider how a real-world situation can be represented using simulation (MP4).

### Learning Goals

Let’s simulate more complicated events.

### Required Preparation

Print and cut up spinners from the Alpine Zoom blackline master. One spinner for each group of 3 students.

For the Kiran’s Game activity, a paper bag containing 4 snap cubes (2 black and 2 white) is needed for every 3 students.

Other simulation tools (number cubes, bags with colored snap cubes, etc.) should be available.



## Introduction to Econometrics with R

This section treats five sources that cause the OLS estimator in (multiple) regression models to be biased and inconsistent for the causal effect of interest and discusses possible remedies. All five sources imply a violation of the first least squares assumption presented in Key Concept 6.4:

- omitted variable bias
- misspecification of the functional form
- errors in variables (measurement errors in the regressors)
- missing data and sample selection
- simultaneous causality bias

Besides these threats to consistency of the estimator, we also briefly discuss causes of inconsistent estimation of OLS standard errors.

### Omitted Variable Bias: Should I Include More Variables in My Regression?

Inclusion of additional variables reduces the risk of omitted variable bias but may increase the variance of the estimator of the coefficient of interest.

We present some guidelines that help in deciding whether to include an additional variable:

1. Specify the coefficient(s) of interest.
2. Identify the most important potential sources of omitted variable bias by using knowledge available before estimating the model. You should end up with a baseline specification and a set of questionable regressors.
3. Use different model specifications to test whether questionable regressors have coefficients different from zero.
4. Use tables to provide full disclosure of your results, i.e., present different model specifications that both support your argument and enable the reader to see the effect of including the questionable regressors.

By now you should be aware of omitted variable bias and its consequences. Key Concept 9.2 gives some guidelines on how to proceed if there are control variables that may help reduce omitted variable bias. If including additional variables to mitigate the bias is not an option because there are no adequate controls, there are several approaches to solve the problem:

- panel data methods (discussed in Chapter 10)
- instrumental variables regression (discussed in Chapter 12)
- randomized controlled experiments (discussed in Chapter 13)

#### Misspecification of the Functional Form of the Regression Function

If the population regression function is nonlinear but the regression function is linear, the functional form of the regression model is misspecified. This leads to a bias of the OLS estimator.

### Functional Form Misspecification

A regression suffers from misspecification of the functional form when the functional form of the estimated regression model differs from the functional form of the population regression function. Functional form misspecification leads to biased and inconsistent coefficient estimators. A way to detect functional form misspecification is to plot the estimated regression function and the data. This may also be helpful to choose the correct functional form.

It is easy to come up with examples of misspecification of the functional form: consider the case where the population regression function is \[Y_i = X_i^2\] but the model used is \[Y_i = \beta_0 + \beta_1 X_i + u_i.\] Clearly, the regression function is misspecified here. We now simulate data and visualize this.

It is evident that the regression errors are relatively small for observations close to \(X=-3\) and \(X=3\) but that the errors increase for \(X\) values closer to zero and even more for values beyond \(-4\) and \(4\). The consequences are drastic: the intercept is estimated to be \(8.1\) and the slope estimate is obviously very close to zero. This issue does not disappear as the number of observations is increased because OLS is biased and inconsistent due to the misspecification of the regression function.
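The book's R code for this simulation is not reproduced in this excerpt; the following plain-Python sketch illustrates the same point under assumed data-generating details (uniform \(X\) on \([-5,5]\), unit-variance noise), fitting the misspecified linear model to data generated from \(Y_i = X_i^2\).

```python
import random

def simulate_misspecification(n=1000, seed=1):
    """Population regression function is Y = X^2, but we fit the
    misspecified linear model Y = b0 + b1*X by ordinary least squares."""
    rng = random.Random(seed)
    xs = [rng.uniform(-5, 5) for _ in range(n)]
    ys = [x ** 2 + rng.gauss(0, 1) for x in xs]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = simulate_misspecification()
# The slope comes out near zero and the intercept near E[X^2] = 25/3,
# no matter how large n is: the linear fit cannot capture the curvature.
```

This reproduces the qualitative result described above: a flat fitted line with a large intercept, unaffected by increasing the sample size.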

### Errors-in-Variable Bias

When independent variables are measured imprecisely, we speak of errors-in-variables bias. This bias does not disappear if the sample size is large. If the measurement error has mean zero and is uncorrelated with the true value of the affected variable, the OLS estimator of the respective coefficient is biased toward zero.

Suppose you incorrectly measure the single regressor \(X_i\), so that there is a measurement error and you observe \(\overset{\sim}{X}_i\) instead of \(X_i\). Then, instead of estimating the population regression model \[Y_i = \beta_0 + \beta_1 X_i + u_i,\] you end up estimating \[\begin{align*} Y_i &= \beta_0 + \beta_1 \overset{\sim}{X}_i + \underbrace{\beta_1 (X_i - \overset{\sim}{X}_i) + u_i}_{=v_i} \\ Y_i &= \beta_0 + \beta_1 \overset{\sim}{X}_i + v_i, \end{align*}\] where \(\overset{\sim}{X}_i\) and the error term \(v_i\) are correlated. Thus OLS would be biased and inconsistent for the true \(\beta_1\) in this example. One can show that the direction and strength of the bias depend on the correlation between the observed regressor \(\overset{\sim}{X}_i\) and the measurement error \(w_i = \overset{\sim}{X}_i - X_i\). This correlation in turn depends on the type of measurement error made.

The classical measurement error model assumes that the measurement error \(w_i\) has zero mean and that it is uncorrelated with the variable \(X_i\) and the error term of the population regression model \(u_i\):
\[\overset{\sim}{X}_i = X_i + w_i, \quad \rho_{w_i,u_i} = 0, \quad \rho_{w_i,X_i} = 0.\]
Then one can show that
\[\begin{align} \widehat{\beta}_1 \xrightarrow{p} \frac{\sigma_{X}^2}{\sigma_{X}^2 + \sigma_{w}^2} \beta_1, \tag{9.1} \end{align}\]
which implies inconsistency since \(\sigma_{X}^2, \sigma_{w}^2 > 0\), so the fraction in (9.1) is smaller than \(1\). Note that there are two extreme cases:

- If there is no measurement error, \(\sigma_{w}^2 = 0\) such that \(\widehat{\beta}_1 \xrightarrow{p} \beta_1\).
- If \(\sigma_{w}^2 \gg \sigma_{X}^2\), we have \(\widehat{\beta}_1 \xrightarrow{p} 0\). This is the case if the measurement error is so large that there is essentially no information on \(X\) in the data that can be used to estimate \(\beta_1\).

The most obvious way to deal with errors-in-variables bias is to use an accurately measured \(X\). If this is not possible, instrumental variables regression is an option. One might also deal with the issue by using a mathematical model of the measurement error and adjusting the estimates appropriately: if it is plausible that the classical measurement error model applies and if there is information that can be used to estimate the ratio in equation (9.1), one can compute an estimate that corrects for the downward bias.

For example, consider two bivariate normally distributed random variables \(X, Y\). It is a well-known result that the conditional expectation function of \(Y\) given \(X\) has the form
\[\begin{align} E(Y\vert X) = E(Y) + \rho_{X,Y} \frac{\sigma_Y}{\sigma_X} \left[X - E(X)\right]. \tag{9.2} \end{align}\]
Thus for
\[\begin{align} (X, Y) \sim \mathcal{N}\left[\begin{pmatrix} 50 \\ 100 \end{pmatrix}, \begin{pmatrix} 10 & 5 \\ 5 & 10 \end{pmatrix}\right], \tag{9.3} \end{align}\]
according to (9.2), the population regression function is
\[\begin{align} Y_i &= 100 + 0.5 (X_i - 50) \\ &= 75 + 0.5 X_i. \tag{9.4} \end{align}\]
Now suppose you gather data on \(X\) and \(Y\), but you can only measure \(\overset{\sim}{X}_i = X_i + w_i\) with \(w_i \overset{i.i.d.}{\sim} \mathcal{N}(0,10)\). Since the \(w_i\) are independent of the \(X_i\), there is no correlation between the \(X_i\) and the \(w_i\), so we have a case of the classical measurement error model. We now illustrate this example in R using the package mvtnorm (Genz et al. 2020).

We now estimate a simple linear regression of \(Y\) on \(X\) using this sample data and then run the same regression again, this time with i.i.d. \(\mathcal{N}(0,10)\) measurement errors added to \(X\).

Next, we visualize the results and compare with the population regression function.

In the situation without measurement error, the estimated regression function is close to the population regression function. Things are different when we use the mismeasured regressor \(\overset{\sim}{X}\): both the estimate for the intercept and the estimate for the coefficient on \(X\) differ considerably from the results obtained using the “clean” data on \(X\). In particular \(\widehat{\beta}_1 = 0.255\), so there is a downward bias. We are in the comfortable situation of knowing \(\sigma_X^2\) and \(\sigma_w^2\), which allows us to correct for the bias using (9.1). Using this information we obtain the bias-corrected estimate
\[\frac{\sigma_X^2 + \sigma_w^2}{\sigma_X^2} \cdot \widehat{\beta}_1 = \frac{10+10}{10} \cdot 0.255 = 0.51,\]
which is quite close to \(\beta_1 = 0.5\), the true coefficient from the population regression function.
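The excerpt's R code (using mvtnorm) is not included here; the following plain-Python sketch mimics the same experiment under the parameters of (9.3): population slope \(0.5\), \(\sigma_X^2 = \sigma_w^2 = 10\), so the attenuated slope should be near \(0.25\) and the correction from (9.1) should roughly recover \(0.5\). Function and variable names are illustrative.

```python
import math
import random

def eiv_demo(n=20000, seed=1):
    """Errors-in-variables sketch: population slope is 0.5, but regressing
    Y on the mismeasured X_tilde = X + w attenuates the estimate."""
    rng = random.Random(seed)
    xs, ys, xs_noisy = [], [], []
    for _ in range(n):
        x = 50 + math.sqrt(10) * rng.gauss(0, 1)  # X ~ N(50, 10)
        # population regression Y = 75 + 0.5 X; total Var(Y) = 10 as in (9.3)
        y = 75 + 0.5 * x + math.sqrt(7.5) * rng.gauss(0, 1)
        w = math.sqrt(10) * rng.gauss(0, 1)       # classical error, N(0, 10)
        xs.append(x)
        ys.append(y)
        xs_noisy.append(x + w)

    def ols_slope(x, y):
        xb, yb = sum(x) / len(x), sum(y) / len(y)
        sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
        sxx = sum((a - xb) ** 2 for a in x)
        return sxy / sxx

    b_clean = ols_slope(xs, ys)             # close to 0.5
    b_noisy = ols_slope(xs_noisy, ys)       # attenuated toward 0.25
    b_corrected = (10 + 10) / 10 * b_noisy  # bias correction via (9.1)
    return b_clean, b_noisy, b_corrected
```

The clean regression recovers the population slope, the noisy one is biased toward zero by the factor \(\sigma_X^2/(\sigma_X^2+\sigma_w^2) = 0.5\), and the correction undoes the attenuation.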

Bear in mind that the above analysis uses a single sample. Thus one may argue that the results are just a coincidence. Can you show the contrary using a simulation study?

### Sample Selection Bias

When the sampling process influences the availability of data, and this process is related to the dependent variable beyond its dependence on the regressors, we say that there is sample selection bias. Such a bias is due to correlation between one or more regressors and the error term. Sample selection implies both bias and inconsistency of the OLS estimator.

There are three cases of sample selection. Only one of them poses a threat to internal validity of a regression study. The three cases are:

Data are missing at random.

Data are missing based on the value of a regressor.

Data are missing due to a selection process which is related to the dependent variable.

Let us jump back to the example of variables (X) and (Y) distributed as stated in equation (9.3) and illustrate all three cases using R.

If data are missing at random, this is nothing but losing observations. For example, losing \(50\%\) of the sample would be the same as never having observed the (randomly chosen) half of the sample. Therefore, missing data do not introduce an estimation bias and “only” lead to less efficient estimators.

The gray dots represent the \(500\) discarded observations. When using the remaining observations, the estimation results deviate only marginally from the results obtained using the full sample.

Selecting data randomly based on the value of a regressor also has the effect of reducing the sample size and does not introduce estimation bias. We will now drop all observations with \(X > 45\), estimate the model again, and compare.

Note that although we dropped more than \(90\%\) of all observations, the estimated regression function is very close to the line estimated from the full sample.

In the third case we face sample selection bias. We can illustrate this by using only observations with \(X_i < 55\) and \(Y_i > 100\). These observations are easily identified using the function `which()` and logical operators: `which(dat$X < 55 & dat$Y > 100)`.

We see that the selection process leads to biased estimation results.
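The R illustrations of the three cases are not part of this excerpt; here is a plain-Python sketch under the same setup (equation (9.3), population slope \(0.5\)). Names and sample sizes are illustrative.

```python
import math
import random

def selection_demo(n=50000, seed=1):
    """Three sample-selection schemes applied to data from
    Y = 75 + 0.5 X + u, with X ~ N(50, 10) as in (9.3)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 50 + math.sqrt(10) * rng.gauss(0, 1)
        y = 75 + 0.5 * x + math.sqrt(7.5) * rng.gauss(0, 1)
        data.append((x, y))

    def ols_slope(pairs):
        xs = [p[0] for p in pairs]
        ys = [p[1] for p in pairs]
        xb, yb = sum(xs) / len(xs), sum(ys) / len(ys)
        sxy = sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
        return sxy / sum((a - xb) ** 2 for a in xs)

    case1 = [p for p in data if rng.random() < 0.5]        # missing at random
    case2 = [p for p in data if p[0] <= 45]                # based on regressor X
    case3 = [p for p in data if p[0] < 55 and p[1] > 100]  # depends on Y

    return ols_slope(case1), ols_slope(case2), ols_slope(case3)
```

Cases 1 and 2 return slope estimates near the population value \(0.5\); in case 3, where the selection depends on \(Y\), the estimated slope is visibly flatter, illustrating the bias.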

There are methods that correct for sample selection bias. However, these methods are beyond the scope of this book and are therefore not considered here. The concept of sample selection bias is summarized in Key Concept 9.5.

### Simultaneous Causality Bias

So far we have assumed that the changes in the independent variable (X) are responsible for changes in the dependent variable (Y) . When the reverse is also true, we say that there is simultaneous causality between (X) and (Y) . This reverse causality leads to correlation between (X) and the error in the population regression of interest such that the coefficient on (X) is estimated with bias.

Suppose we are interested in estimating the effect of a \(20\%\) increase in cigarette prices on cigarette consumption in the United States using a multiple regression model. This may be investigated using the dataset CigarettesSW, which is part of the AER package. CigarettesSW is a panel data set on cigarette consumption for all 48 continental U.S. federal states from 1985 to 1995 and provides data on economic indicators and average local prices, taxes, and per capita pack consumption.

After loading the data set, we pick observations for the year 1995 and plot logarithms of the per pack price, price, against pack consumption, packs, and estimate a simple linear regression model.

Remember from Chapter 8 that, due to the log-log specification, the coefficient on the logarithm of price in the population regression is interpreted as the price elasticity of consumption. The estimated coefficient suggests that a \(1\%\) increase in cigarette prices reduces cigarette consumption by about \(1.2\%\), on average. Have we estimated a demand curve? The answer is no: this is a classic example of simultaneous causality, see Key Concept 9.6. The observations are market equilibria, which are determined by both changes in supply and changes in demand. Therefore the price is correlated with the error term and the OLS estimator is biased. We can consistently estimate neither a demand nor a supply curve using this approach.

We will return to this issue in Chapter 12 which treats instrumental variables regression, an approach that allows consistent estimation when there is simultaneous causality.

#### Sources of Inconsistency of OLS Standard Errors

There are two central threats to computation of consistent OLS standard errors:

Heteroskedasticity: implications of heteroskedasticity have been discussed in Chapter 5. Heteroskedasticity-robust standard errors as computed by the function vcovHC() from the package sandwich produce valid standard errors under heteroskedasticity.

Serial correlation: if the population regression error is correlated across observations, we have serial correlation. This often happens in applications where repeated observations are used, e.g., in panel data studies. As with heteroskedasticity, the sandwich package provides estimators (such as vcovHAC()) that yield valid standard errors when there is serial correlation.

Inconsistent standard errors will produce invalid hypothesis tests and wrong confidence intervals. For example, when testing the null that some model coefficient is zero, we cannot trust the outcome anymore because the test may fail to have a size of \(5\%\) due to the wrongly computed standard error.

Key Concept 9.7 summarizes all threats to internal validity discussed above.

### Threats to Internal Validity of a Regression Study

The five primary threats to internal validity of a multiple regression study are:

1. Omitted variable bias
2. Misspecification of the functional form
3. Errors in variables (measurement errors in the regressors)
4. Sample selection
5. Simultaneous causality

All these threats lead to failure of the first least squares assumption \[E(u_i\vert X_{1i},\dots,X_{ki}) \neq 0,\] so that the OLS estimator is biased and inconsistent.

Furthermore, if one does not adjust for heteroskedasticity and/or serial correlation, incorrect standard errors may be a threat to internal validity of the study.

Many real-world situations are difficult to repeat enough times to get an estimate for a probability. If we can find probabilities for parts of the situation, we may be able to simulate the situation using a process that is easier to repeat.

For example, if we know that each egg of a fish in a science experiment has a 13% chance of having a mutation, how many eggs do we need to collect to make sure we have 10 mutated eggs? If getting these eggs is difficult or expensive, it might be helpful to have an idea about how many eggs we need before trying to collect them.

We could simulate this situation by having a computer select random numbers between 1 and 100. If the number is between 1 and 13, it counts as a mutated egg. Any other number would represent a normal egg. This matches the 13% chance of each fish egg having a mutation.

We could continue asking the computer for random numbers until we get 10 numbers that are between 1 and 13. How many times we asked the computer for a random number would give us an estimate of the number of fish eggs we would need to collect.

To improve the estimate, this entire process should be repeated many times. Because computers can perform simulations quickly, we could simulate the situation 1,000 times or more.
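The egg-collecting procedure described above can be sketched in Python, using random integers from 1 to 100 exactly as described. The function name and run count are illustrative.

```python
import random

def eggs_needed(num_runs=1000, seed=1):
    """Average number of eggs collected until 10 mutated eggs are found,
    when each egg independently has a 13% chance of mutation."""
    rng = random.Random(seed)
    totals = []
    for _ in range(num_runs):
        mutated = collected = 0
        while mutated < 10:
            collected += 1
            # a random number from 1 to 100; 1 through 13 means "mutated"
            if rng.randint(1, 100) <= 13:
                mutated += 1
        totals.append(collected)
    return sum(totals) / len(totals)

print(eggs_needed())
```

Over many repetitions the average settles near \(10 \div 0.13 \approx 77\) eggs, which gives a useful planning estimate before any eggs are collected.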

## DEM Simulation of Biaxial Compression Experiments of Inherently Anisotropic Granular Materials and the Boundary Effects

The reliability of discrete element method (DEM) numerical simulations is significantly dependent on the particle-scale parameters and boundary conditions. To verify the DEM models, two series of biaxial compression tests on ellipse-shaped steel rods are used. The comparisons on the stress-strain relationship, strength, and deformation pattern of experiments and simulations indicate that the DEM models are able to capture the key macro- and micromechanical behavior of inherently anisotropic granular materials with high fidelity. By using the validated DEM models, the boundary effects on the macrodeformation, strain localization, and nonuniformity of stress distribution inside the specimens are investigated using two rigid boundaries and one flexible boundary. The results demonstrate that the boundary condition plays a significant role on the stress-strain relationship and strength of granular materials with inherent fabric anisotropy if the stresses are calculated by the force applied on the wall. However, the responses of the particle assembly measured inside the specimens are almost the same with little influence from the boundary conditions. The peak friction angle obtained from the compression tests with flexible boundary represents the real friction angle of particle assembly. Due to the weak lateral constraints, the degree of stress nonuniformity under flexible boundary is higher than that under rigid boundary.

#### 1. Introduction

Natural granular materials such as sands and gravels commonly exhibit anisotropy due to deposition under gravity or compaction. A number of studies on the bearing capacity of shallow foundations [1–3] and slope stability [4, 5] have demonstrated that the deformation and strength anisotropy of granular materials play a significant role in geotechnical engineering.

The mechanical behavior of granular materials with inherent fabric anisotropy has been investigated using almost all the available laboratory testing methods such as triaxial compression tests [6, 7], direct shear tests [8, 9], plane strain compression tests [6, 10, 11], and hollow cylinder torsion shear tests [10, 12]. All of these experimental results indicate that the deformation and strength of inherently anisotropic granular materials are significantly dependent on the direction of applied stresses with respect to the bedding plane. In order to correlate these macrodeformation behaviors to the evolution of fabric characteristics, various new testing technologies including microstructural observation of thin sections fixed by resin [13], X-ray CT [14, 15], and stereophotogrammetry [16] have been used. However, these methods are too expensive or even impossible to capture the particle-scale quantities during the whole process of deformation.

Instead of attempting particle-scale fabric measurements in real 3D laboratory experiments, biaxial compression tests were conducted using two-dimensional rod assemblages [19, 20]. In the tests conducted by Konishi et al. [19], photoelastic rods with oval cross-section were used to investigate inherent anisotropy and shear strength. Their results indicated that the deformation behavior of these 2D rods resembled that of real granular materials to a great extent. Moreover, compared with 3D laboratory experiments, it is much easier to capture the evolution of the fabric characteristics during the deformation process of the specimen.

The discrete element method (DEM) is capable of providing detailed information about particle movement, rotation, and interparticle interactions. A large number of numerical simulations of biaxial/triaxial compression tests [18, 21–25] and direct/simple shear tests [18, 22, 26–28] have demonstrated that DEM is a powerful tool for studying the microdeformation mechanisms of granular materials. However, these DEM models differ greatly in how they simulate particle shape and boundary conditions, both of which strongly affect the macro- and particle-scale responses of granular materials.

The present paper aims at simulating the biaxial compression tests of ellipse-shaped steel rod assembly with high fidelity. The DEM model is validated by comparing the macro- and particle-scale responses of laboratory experiments and numerical simulations for two series of biaxial compression tests. The effects of boundary conditions on the stress-strain relationship, strength, strain localization, and stress nonuniformity are investigated.

#### 2. Validation of Discrete Element Models

##### 2.1. Biaxial Compression Experiments

Two series of biaxial compression tests on an ellipse-shaped steel rod assembly are used to validate the DEM models in this paper. The biaxial compression test equipment was developed by the second author [17]; its structural diagram is shown in Figure 1. A rectangular sample container, 240 mm in height and 120 mm in width, was formed between the top plate of a conventional triaxial compression apparatus and the reaction frame. During shearing, the vertical deformation of the sample was controlled by the vertical movement of the loading platform, while the top plate was kept immovable. It should be pointed out that the base, together with both side plates and the bottom plate, moved upward at the same speed as the loading platform, which is not common for compression tests. The vertical pressure applied on the sample was measured by a force gauge, and separate gauges measured the confining pressure applied on the left and right side platens. Two displacement sensors were installed on each of the left and right side plates and on the base of the frame, so six displacement sensors in total, denoted DT1 to DT6, were used.

The materials tested were ellipse-shaped steel rods with a uniform aspect ratio (minor axis length to major axis length) of 1 : 2 and a length of 40 mm. The specimen aggregate was made by mixing three kinds of rods with major axis lengths of 4 mm, 2 mm, and 1 mm, with the mass ratio controlled to be 8 : 2 : 1.

To investigate the loading-direction-dependent responses of the rod assembly, specimens with various tilting angles were fabricated, as shown in Figure 2. The tilting angle is defined as the angle between the bedding plane and the plane of the major principal stress. A black rectangular frame, with inside dimensions of 240 mm in height, 120 mm in width, and 50 mm in length, was used to contain the rod assembly. To fix the frame at a prescribed tilting angle, a transparent organic glass plate with marked lines and holes was used. A specimen with a given tilting angle was fabricated as follows. First, the horizontal rectangular frame was rotated clockwise by the required tilting angle and fixed on the organic glass using bolts. Then the mixed steel rods were placed into the frame layer by layer by hand while keeping the major axes of the rods horizontal. When the frame was full, gentle shaking was applied for 1 minute to homogenize the rod assembly. After that, the frame was removed from the organic glass and rotated counterclockwise back to the horizontal. Finally, the rod assembly was pushed horizontally into the rectangular sample container of the biaxial compression equipment using an organic glass plate with the same inside width and height as the frame and the container. The specimen with the prescribed tilting angle was then ready for biaxial compression testing. Two series of biaxial compression tests, varying the tilting angle and the confining pressure, were conducted.

Figure 2: Specimen fabrication: (a) fixing the rectangular frame at the required tilting angle.

##### 2.2. Discrete Element Model

The DEM simulation package PPDEM, developed by Fu and Dafalias [18, 22], was used in this study. As described in those papers, PPDEM can characterize arbitrary noncircular particle shapes using "polyarc" elements. The initial fabric anisotropy of the specimen can be well represented by modeling the deposition process under gravity. In addition, local quantities such as local stress, strain, particle orientation, rotation, and void ratio can be measured conveniently by defining a polygon-shaped "mask" whose vertices are attached to particles.

To simulate the biaxial compression experiments on the rod assembly described above, particles of three sizes, with major axis lengths of 4 mm, 2 mm, and 1 mm, were produced with a number ratio of 1 : 1 : 2, the same as in the tested rod assembly. The biaxial compression specimens with various tilting angles were produced using the method described by Fu and Dafalias [18]. As Figure 3(a) shows, a "master pack" of 30000 particles was first fabricated by particle pluviation, with its bedding plane horizontal. The master pack was then rotated counterclockwise by the tilting angle, and the biaxial compression specimens were "trimmed" horizontally out of it. Around 10000 particles were included in each trimmed specimen, whose initial size was 240 mm in height and 120 mm in width, matching the sample container. The initial void ratios of the specimens with different tilting angles were essentially the same.

Figure 3: Specimen preparation and boundaries of the biaxial compression simulations: (a) fabrication of a specimen at a given tilting angle; (b) boundary and loading conditions.

When the specimen was fabricated, four rigid walls were applied as the boundary of the specimen. The loading in the numerical simulations was controlled to be the same as in the laboratory experiments, as shown in Figure 3(b). After the specimen was consolidated isotropically at the required confining pressure, shearing began. In the vertical direction, the bottom wall moved upward at a specified rate while the top wall was kept immovable. The two end walls were free to move in the horizontal direction. For the left and right lateral walls, the horizontal confining pressure was maintained constant by the wall servo-control; however, both lateral walls moved vertically at the same speed as the bottom wall.
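The wall servo-control can be pictured as a simple proportional controller that drives the measured wall stress toward the target confining pressure. The sketch below is a generic illustration, not PPDEM's actual interface; the function name, gain, and numerical values are assumptions.

```python
# Minimal sketch of a DEM wall servo-control step (illustrative; not PPDEM's API).
# The lateral wall's normal velocity is set proportional to the stress error so
# the measured confining stress is driven toward the target value.

def servo_step(wall_force, wall_area, target_stress, gain=1e-4, max_speed=0.01):
    """Return the wall's normal velocity for one DEM step (positive = outward)."""
    measured = wall_force / wall_area              # current stress on the wall
    error = measured - target_stress               # > 0: stress too high, retreat
    speed = gain * error
    return max(-max_speed, min(max_speed, speed))  # clamp for numerical stability

# Toy usage: a stress 10% above target moves the wall outward to relieve it.
v = servo_step(wall_force=110.0, wall_area=1.0, target_stress=100.0)
```

In a real simulation this update would run every time step, with the gain tuned so the confining stress stays within a small tolerance of the target.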

As described by Fu et al. [29], the overlap-area contact law was adopted for the interparticle behavior in PPDEM. The research by Mirghasemi et al. [30] demonstrated that the specific form of the contact law has only a minor effect on the macroscopic behavior of a particle assembly as long as the model parameters are appropriately selected. The contact behavior of the tested steel rods was therefore not measured, and the overlap-area contact law is used in the numerical simulations. A parameter sensitivity analysis showed that two parameters, the interparticle friction angle and the particle-wall friction angle, have significant effects on the macromechanical behavior of the particle assembly. These two parameters were chosen to be 30° and 10°, respectively, by comparing the stress-strain relationships between experiments and simulations for the two series of tests varied in tilting angle and confining pressure.
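As a rough illustration of what an overlap-area contact law computes, the sketch below takes the normal contact force proportional to the lens-shaped overlap area of two equal discs (the 2D cross-sections of circular rods). The stiffness value and circular geometry are assumptions for illustration only; the actual PPDEM formulation for polyarc particles is given by Fu et al. [29].

```python
import math

# Overlap-area contact law sketched for two equal discs of radius r whose
# centers are a distance d apart: normal force = k_area * overlap area.
# The stiffness k_area and the circular geometry are illustrative assumptions.

def disc_overlap_area(r, d):
    """Lens-shaped overlap area of two discs of radius r with center distance d."""
    if d >= 2 * r:
        return 0.0                      # no contact
    if d <= 0:
        return math.pi * r * r          # fully overlapped (degenerate case)
    # Standard equal-radii circle-circle intersection area:
    return 2 * r * r * math.acos(d / (2 * r)) - 0.5 * d * math.sqrt(4 * r * r - d * d)

def normal_force(r, d, k_area=1.0e6):
    """Contact normal force proportional to the overlap area."""
    return k_area * disc_overlap_area(r, d)

# Toy usage: zero force out of contact, growing force as the overlap deepens.
f_apart = normal_force(r=1.0, d=2.5)
f_touch = normal_force(r=1.0, d=1.9)
f_deep = normal_force(r=1.0, d=1.5)
```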

##### 2.3. Verification of Discrete Element Model

Two series of biaxial compression tests are used to validate the discrete element models. In the first, the tilting angle varies from 0° to 90° in 15° intervals at the same confining pressure; in the second, the confining pressure varies at a fixed tilting angle. Figures 4(a)–4(d) compare the evolution of the principal stress ratio and the deformation pattern between laboratory experiments and numerical simulations for the first series at tilting angles of 0°, 30°, 60°, and 90°, respectively. The principal stresses σ1 and σ3 are calculated using the same method as in the laboratory experiments: σ1 is obtained by dividing the average vertical force on the two end walls by the specimen width, and σ3 by dividing the horizontal force on the two lateral walls by the specimen height.
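The stress computation just described can be written out directly. The sketch below assumes 2D stresses per unit rod length and uses the specimen dimensions given earlier (240 mm by 120 mm); the force values are made up for illustration.

```python
# Principal stresses from wall forces, as described for the biaxial tests:
# sigma_1 = average vertical force on the two end walls / specimen width,
# sigma_3 = horizontal force on the lateral walls / specimen height.
# 2D stresses per unit rod length; input values are illustrative.

def principal_stresses(f_top, f_bottom, f_lateral, width, height):
    sigma_1 = 0.5 * (f_top + f_bottom) / width   # average end-wall force / width
    sigma_3 = f_lateral / height                 # lateral-wall force / height
    return sigma_1, sigma_3

# Toy usage with the specimen dimensions of the tests (0.24 m x 0.12 m):
s1, s3 = principal_stresses(f_top=36.0, f_bottom=36.0, f_lateral=24.0,
                            width=0.12, height=0.24)
ratio = s1 / s3  # principal stress ratio plotted in Figures 4 and 6
```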

Figure 4: Stress-strain relationship and deformation comparison between experiments and simulations at different tilting angles.

As Figures 4(a)–4(d) show, the key direction-dependent mechanical behavior of granular materials with inherent fabric anisotropy is captured by the numerical simulations with high fidelity, although the initial shear modulus of the simulations is slightly higher than that of the experiments. The response of the granular material depends significantly on the loading direction in both simulations and experiments. At tilting angles of 0° and 30°, the principal stress ratio reaches a peak followed by strain softening; as the tilting angle increases, the strain softening weakens. At 60° and 90°, the principal stress ratio develops in a strain-hardening manner, progressively approaching a plateau and then remaining constant. As deformation continues, the specimen contracts first and then dilates, and the dilation is reduced as the tilting angle increases. These results are qualitatively similar to the plane strain test results obtained by Oda et al. [6] and Tatsuoka et al. [10]. In addition, the deformation patterns of the specimens at the typical states A to G look visually similar, which suggests that the experiments and simulations share the same particle-scale deformation mechanism.

Figure 5 compares the peak friction angle with respect to the tilting angle. The peak friction angle corresponds to the maximum principal stress ratio in Figures 4(a)–4(d) and is calculated from the curve of principal stress ratio versus axial strain by assuming zero cohesion. The evolution of the peak friction angle with the tilting angle follows the same tendency in experiments and simulations: as the tilting angle increases, the peak friction angle decreases. The largest difference between simulations and experiments is about 2°.
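Assuming zero cohesion, the peak friction angle follows from the maximum principal stress ratio R = sigma_1/sigma_3 via the Mohr-Coulomb relation sin(phi) = (R - 1)/(R + 1). A short sketch (the function name and example value are illustrative):

```python
import math

# Peak friction angle from the peak principal stress ratio R = sigma_1/sigma_3,
# assuming zero cohesion (Mohr-Coulomb): sin(phi) = (R - 1) / (R + 1).

def peak_friction_angle(stress_ratio_max):
    """Peak friction angle in degrees from the maximum principal stress ratio."""
    sin_phi = (stress_ratio_max - 1.0) / (stress_ratio_max + 1.0)
    return math.degrees(math.asin(sin_phi))

# Toy usage: a peak stress ratio of 3 corresponds to a 30-degree friction angle.
phi = peak_friction_angle(3.0)
```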

Figure 6 compares experiments and simulations for the second series of tests, in which three different confining pressures are applied at a fixed tilting angle. The stress-strain relationship shows strain softening under all three confining pressures, and the effect of confining pressure is well modeled: as the confining pressure increases, the peak strength reduces. The specimen contracts and then dilates, and the maximum volumetric contraction increases with the confining pressure.

Figure 6: Stress-strain relationship comparison between experiments and simulations under different confining pressures.

#### 3. Investigation of Boundary Effects

It should be pointed out that in the above biaxial compression tests and simulations the two lateral platens move vertically at the same speed as the bottom platen, which is not common for compression tests. In the following, this mode of boundary control is denoted Rigid boundary A. To investigate the effects of the boundary condition, two other boundaries are used, as Figure 7 shows. One, denoted Rigid boundary B, is the same as Rigid boundary A except that the two lateral walls are free to move in the vertical direction. The other, denoted Flexible boundary, resembles the conventional triaxial compression test: the top and bottom boundaries are simulated by rigid walls, while the two lateral boundaries are flexible, like a membrane, and the confining pressure is applied directly on the particles as described by Fu and Dafalias [18].


## Course Descriptions

Elements of topology on the real line. Rigorous treatment of limits, continuity, differentiation, and the Riemann integral. Taylor series. Introduction to metric spaces. Pointwise and uniform convergence for sequences and series of functions. Applications.

MATH-GA.1410-002 Intro To Math Analysis I Recitation

3 Points, Mondays, 7:10-8:25PM, TBA

MATH-GA.2010-001 Numerical Methods I

3 Points, Mondays, 5:10-7:00PM, Benjamin Peherstorfer

Prerequisites: A good background in linear algebra, and some experience with writing computer programs (in MATLAB, Python, or another language). MATLAB will be used as the main language for the course; alternatively, you can use Python for the homework assignments. You are encouraged but not required to learn and use a compiled language.

This course is part of a two-course series meant to introduce graduate students in mathematics to the fundamentals of numerical mathematics (but any Ph.D. student seriously interested in applied mathematics should take it). It will be a demanding course covering a broad range of topics. There will be extensive homework assignments involving a mix of theory and computational experiments, and an in-class final. Topics covered in the class include floating-point arithmetic, solving large linear systems, eigenvalue problems, interpolation and quadrature (approximation theory), nonlinear systems of equations, linear and nonlinear least squares, nonlinear optimization, and Fourier transforms. This course will not cover differential equations, which form the core of the second part of this series, Numerical Methods II.

Recommended Text (Springer books are available online from the NYU network):

• Deuflhard, P. & Hohmann, A. (2003). Numerical Analysis in Modern Scientific Computing. Texts in Applied Mathematics [Series, Bk. 43]. New York, NY: Springer-Verlag.

Further Reading (available on reserve at the Courant Library):

• Bau III, D., & Trefethen, L.N. (1997). Numerical Linear Algebra. Philadelphia, PA: Society for Industrial & Applied Mathematics.
• Quarteroni, A., Sacco, R., & Saleri, F. (2006). Numerical Mathematics (2nd ed.). Texts in Applied Mathematics [Series, Bk. 37]. New York, NY: Springer-Verlag.

If you want to brush up your MATLAB:

• Gander, W., Gander, M.J., & Kwok, F. (2014). Scientific Computing – An Introduction Using Maple and MATLAB. Texts in Computational Science and Engineering [Series, Vol. 11]. New York, NY: Springer-Verlag.
• Moler, C. (2004). Numerical Computing with Matlab. SIAM. Available online.

MATH-GA.2011-001 Advanced Topics In Numerical Analysis: Inverse Problems

3 Points, Wednesdays, 5:10-7:00PM, Georg Stadler

MATH-GA.2011-002 Advanced Topics In Numerical Analysis: Numerical Optimization

3 Points, Tuesdays, 1:25-3:15PM, Margaret Wright

MATH-GA.2011-003 Advanced Topics In Numerical Analysis: Computational Methods For Classical PDEs In The Physical Sciences

3 Points, Wednesdays, 11:00-12:50PM, Aleksandar Donev

This seminar will follow up on Numerical Methods II and cover more advanced computational methods (finite difference (FD), volume (FV), and element (FE) schemes including boundary conditions, and boundary integral methods) for solving PDEs that arise in physical sciences. The class will assume familiarity with multi-step and multi-stage (Runge-Kutta) methods for solving systems of ODEs including stability theory, basic finite difference methods for elliptic, parabolic, and hyperbolic PDEs (including von Neumann/Fourier stability analysis), and basic spectral methods (e.g. FFT-based schemes for periodic domains). The class will cover electrostatics (FD and FEM for Poisson including geometric and algebraic multigrid, Laplace in confined domains using boundary integral), elasticity (variational formulation, finite-element methods), linear wave equation (electromagnetism, acoustics, geofluids), and fluid dynamics (FD/FV for 1D conservation laws including dispersion and dissipation, stability, accuracy, and modified equations, FV advection-diffusion including limiters, MAC/FEM for incompressible Navier-Stokes, immersed-boundary method, Stokes flow including boundary-integral methods).

MATH-GA.2041-001 Computing In Finance

3 Points, Thursdays, 7:10-9:00PM, Lee Maclin and Eran Fishler

Prerequisites: Procedural programming, some knowledge of Java recommended.

Description: This course will introduce students to the software development process, including applications in financial asset trading, research, hedging, portfolio management, and risk management. Students will use the Java programming language to develop object-oriented software, and will focus on the most broadly important elements of programming - superior design, effective problem solving, and the proper use of data structures and algorithms. Students will work with market and historical data to run simulations and test strategies. The course is designed to give students a feel for the practical considerations of software development and deployment. Several key technologies and recent innovations in financial computing will be presented and discussed.

MATH-GA.2043-001 Scientific Computing

3 Points, Thursdays, 5:10-7:00PM, Jonathan Goodman

Prerequisites: Undergraduate multivariate calculus and linear algebra. Programming experience strongly recommended but not required.

Overview: This course is intended to provide a practical introduction to computational problem solving. Topics covered include: the notion of well-conditioned and poorly conditioned problems, with examples drawn from linear algebra; the concepts of forward and backward stability of an algorithm, with examples drawn from floating point arithmetic and linear algebra; basic techniques for the numerical solution of linear and nonlinear equations and for numerical optimization, with examples taken from linear algebra and linear programming; principles of numerical interpolation, differentiation and integration, with examples such as splines and quadrature schemes; an introduction to numerical methods for solving ordinary differential equations, with examples such as multistep, Runge-Kutta and collocation methods, along with basic concepts such as convergence and linear stability; an introduction to basic matrix factorizations, such as the SVD; techniques for computing matrix factorizations, with examples such as the QR method for finding eigenvectors; and basic principles of the discrete/fast Fourier transform, with applications to signal processing, data compression and the solution of differential equations.

This is not a programming course but programming in homework projects with MATLAB/Octave and/or C is an important part of the course work. As many of the class handouts are in the form of MATLAB/Octave scripts, students are strongly encouraged to obtain access to and familiarize themselves with these programming environments.

• Bau III, D., & Trefethen, L.N. (1997). Numerical Linear Algebra. Philadelphia, PA: Society for Industrial & Applied Mathematics
• Quarteroni, A.M., & Saleri, F. (2006). Texts in Computational Science & Engineering [Series, Bk. 2]. Scientific Computing with MATLAB and Octave (2nd ed.). New York, NY: Springer-Verlag
• Otto, S.R., & Denier, J.P. (2005). An Introduction to Programming and Numerical Methods in MATLAB. London: Springer-Verlag London

MATH-GA.2045-001 Nonlinear Problems In Finance: Models And Computational Methods

3 Points, Wednesdays, 7:10-9:00PM, TBA

Prerequisites: Continuous Time Finance or permission of instructor.

Description: The classical curriculum of mathematical finance programs generally covers the link between linear parabolic partial differential equations (PDEs) and stochastic differential equations (SDEs), resulting from the Feynman-Kac formula. However, the challenges faced by today's practitioners mostly involve nonlinear PDEs. The aim of this course is to provide the students with the mathematical tools and numerical methods required to tackle these issues, and to illustrate the methods with practical case studies such as American option pricing, uncertain volatility, uncertain mortality, different rates for borrowing and lending, calibration of models to market smiles, credit valuation adjustment (CVA), transaction costs, illiquid markets, and super-replication under delta and gamma constraints.

We will strive to make this course reasonably comprehensive, and to find the right balance between ideas, mathematical theory, and numerical implementations. We will spend some time on the theory: optimal stopping, stochastic control, backward stochastic differential equations (BSDEs), McKean SDEs, branching diffusions. But the main focus will deliberately be on ideas and numerical examples, which we believe help a lot in understanding the tools and building intuition.

PDE methods suffer from the curse of dimensionality. Since most quantitative finance problems are high-dimensional, we will mostly focus on simulation-based methods (a.k.a. Monte Carlo algorithms). This course exposes students to a wide variety of machine learning techniques, old and new, including parametric regression, nonparametric regression, neural networks, the kernel trick, etc. These techniques allow us to compute quantities that are key ingredients of the nonlinear Monte Carlo algorithms.

The Python programming language will be used to provide simple numerical simulations illustrating the methods presented in the course. Homeworks will allow the students to check their understanding of the course by solving exercises inspired by our experience as quantitative analysts, and will involve some coding in Python.

• Guyon, J. and Henry-Labordère, P.: Nonlinear Option Pricing, Chapman & Hall/CRC Financial Mathematics Series, 2014.

MATH-GA.2046-001 Advanced Statistical Inference And Machine Learning

3 Points, Wednesdays, 5:10-7:00PM, Gordon Ritter

Prerequisites: Financial Securities and Markets; Risk & Portfolio Management; and Computing in Finance, or equivalent programming experience.

Description: A rigorous background in Bayesian statistics geared towards applications in finance, including decision theory and the Bayesian approach to modeling, inference, point estimation, and forecasting, sufficient statistics, exponential families and conjugate priors, and the posterior predictive density. A detailed treatment of multivariate regression including Bayesian regression, variable selection techniques, multilevel/hierarchical regression models, and generalized linear models (GLMs). Inference for classical time-series models, state estimation and parameter learning in Hidden Markov Models (HMMs) including the Kalman filter, the Baum-Welch algorithm and more generally, Bayesian networks and belief propagation. Solution techniques including Markov Chain Monte Carlo methods, Gibbs Sampling, the EM algorithm, and variational mean field. Real world examples drawn from finance to include stochastic volatility models, portfolio optimization with transaction costs, risk models, and multivariate forecasting.

MATH-GA.2047-001 Trends In Financial Data Science

3 Points, Tuesdays, 7:10-9:00PM, Petter Kolm and Ivailo Dimov

Prerequisites: The following four courses, or equivalent: (1) Data Science and Data-Driven Modeling, (2) Financial Securities and Markets, (3) Machine Learning & Computational Statistics, and (4) Risk and Portfolio Management. It is important you have experience with the Python stack.

Course description: This is a full semester course covering recent and relevant topics in alternative data, machine learning, and data science relevant to financial modeling and quantitative finance. This is an advanced course suitable for students who have taken the more basic graduate machine learning and finance courses: Data Science and Data-Driven Modeling, Machine Learning & Computational Statistics, Financial Securities and Markets, and Risk and Portfolio Management.

MATH-GA.2049-001 Alternative Data In Quantitative Finance (2nd Half Of Semester)

1.5 Points, Thursdays, 7:10-9:00PM, Gene Ekster

Prerequisites: Risk and Portfolio Management and Computing in Finance. In addition, students should have a working knowledge of statistics, finance, and basic machine learning. Students should have working experience with the Python stack (numpy/pandas/scikit-learn).

Description: This half-semester elective course examines techniques for dealing with the challenges of the alternative data ecosystem in quantitative and fundamental investment processes. We will address the quantitative tools and techniques for alternative data, including identifier mapping, stable panel creation, dataset evaluation, and sensitive information extraction. We will go through the quantitative process of transforming raw data into investment data and tradable signals using text mining, time series analysis, and machine learning. It is important that students taking this course have working experience with the Python stack. We will analyze real-world datasets and model them in Python using techniques from statistics, quantitative finance, and machine learning.

MATH-GA.2070-001 Data Science And Data-Driven Modeling (1st Half Of Semester)

1.5 Points, Tuesdays, 7:10-9:00PM, TBA

This is a half-semester course covering practical aspects of econometrics/statistics and data science/machine learning in an integrated and unified way as they are applied in the financial industry. We examine statistical inference for linear models, supervised learning (Lasso, ridge and elastic-net), and unsupervised learning (PCA- and SVD-based) machine learning techniques, applying these to solve common problems in finance. In addition, we cover model selection via cross-validation; manipulating, merging and cleaning large datasets in Python; and web-scraping of publicly available data.

MATH-GA.2110-002 Linear Algebra I

3 Points, Tuesdays, 5:10-7:00PM, TBA

Prerequisites: Undergraduate linear algebra or permission of the instructor.

Description: Linear spaces, subspaces. Linear dependence, linear independence; span, basis, dimension, isomorphism. Quotient spaces. Linear functionals, dual spaces. Linear mappings, null space, range, fundamental theorem of linear algebra. Underdetermined systems of linear equations. Composition, inverse, transpose of linear maps; algebra of linear maps. Similarity transformations. Matrices, matrix multiplication, matrix inverse, matrix representation of linear maps; determinant, Laplace expansion, Cramer's rule. Eigenvalue problem, eigenvalues and eigenvectors, characteristic polynomial, Cayley-Hamilton theorem. Diagonalization.

Lax, P.D. (2007). Pure and Applied Mathematics: A Wiley Series of Texts, Monographs and Tracts [Series, Bk. 78]. Linear Algebra and Its Applications (2nd ed.). Hoboken, NJ: John Wiley & Sons/Wiley-Interscience.

MATH-GA.2111-001 Linear Algebra (One-Term)

3 Points, Thursdays, 9:00-10:50AM, Dimitris Giannakis

Description: Linear algebra is two things in one: a general methodology for solving linear systems, and a beautiful abstract structure underlying much of mathematics and the sciences. This course will try to strike a balance between both. We will follow the book of our own Peter Lax, which does a superb job in describing the mathematical structure of linear algebra, and complement it with applications and computing. The most advanced topics include spectral theory, convexity, duality, and various matrix decompositions.

Text: Lax, P.D. (2007). Pure and Applied Mathematics: A Wiley Series of Texts, Monographs and Tracts [Series, Bk. 78]. Linear Algebra and Its Applications (2nd ed.). Hoboken, NJ: John Wiley & Sons/Wiley-Interscience.

Recommended Text: Strang, G. (2005). Linear Algebra and Its Applications (4th ed.). Stamford, CT: Cengage Learning.

3 Points, Thursdays, 7:10-9:00PM, Alena Pirutka

Prerequisites: Elements of linear algebra and the theory of rings and fields.

Description: Basic concepts of groups, rings and fields. Symmetry groups, linear groups, Sylow theorems; quotient rings, polynomial rings, ideals, unique factorization, Nullstellensatz; field extensions, finite fields.

• Artin, M. (2010). Featured Titles for Abstract Algebra [Series]. Algebra (2nd ed.). Upper Saddle River, NJ: Pearson
• Chambert-Loir, A. (2004). Undergraduate Texts in Mathematics [Series]. A Field Guide to Algebra (2005 ed.). New York, NY: Springer-Verlag
• Serre, J-P. (1996). Graduate Texts in Mathematics [Series, Vol. 7]. A Course in Arithmetic (Corr. 3rd printing, 1996 ed.). New York, NY: Springer-Verlag

3 Points, Thursdays, 5:10-7:00PM, Robert Ji Wai Young

Prerequisites: Any knowledge of groups, rings, vector spaces and multivariable calculus is helpful. Undergraduate students planning to take this course must have V63.0343 Algebra I or permission of the Department.

Description: After introducing metric and general topological spaces, the emphasis will be on the algebraic topology of manifolds and cell complexes. Elements of algebraic topology to be covered include fundamental groups and covering spaces, homotopy and the degree of maps and its applications. Some differential topology will be introduced including transversality and intersection theory. Some examples will be taken from knot theory.

• Hatcher, A. (2002). Algebraic Topology. New York, NY: Cambridge University Press
• Munkres, J. (2000). Topology (2nd ed.). Upper Saddle River, NJ: Prentice-Hall/Pearson Education
• Guillemin, V., Pollack, A. (1974). Differential Topology. Englewood Cliffs, NJ: Prentice-Hall
• Milnor, J.W. (1997). Princeton Landmarks in Mathematics [Series]. Topology from a Differentiable Viewpoint (Rev. ed.). Princeton, NJ: Princeton University Press

MATH-GA.2350-001 Differential Geometry I

3 Points, Tuesdays, 3:20-5:10PM, Jeff Cheeger

Prerequisites: Multivariable calculus and linear algebra.

Description: Differentiable manifolds, tangent bundle, embedding theorems, vector fields and differential forms. Introduction to Riemannian metrics, connections and geodesics.

Text: Lee, J.M. (2009). Graduate Studies in Mathematics [Series, Vol. 107]. Manifolds and Differential Geometry. Providence, RI: American Mathematical Society.

MATH-GA.2420-001 Advanced Topics Mathematics: Working Group In Modeling And Simulation

3 Points, Thursdays, 12:30-2:00PM, Aleksandar Donev and Miranda Holmes-Cerfon and Leif Ristroph

As part of our new NSF research training group (RTG) in Modeling & Simulation, we will be organizing a lunchtime group meeting for students, postdocs, and faculty working in applied mathematics who do modeling & simulation. The aim is to create a space to discuss applied mathematics research in an informal setting: to (a) give students and postdocs a chance to present their research (or a topic of common interest) and get feedback from the group, (b) learn about other ongoing and future research activities in applied math at the Institute, and (c) discuss important open problems and research challenges.

MATH-GA.2420-002 Advanced Topics: Seminar In AOS

3 Points, Fridays, 3:45-5:00PM, TBA

Description: The Atmosphere Ocean Science Student Seminar focuses on research and presentation skills. The course is spread across two semesters, and participants are expected to take part in both to earn the full 3 credits. Participants will prepare and present a full-length (45-50 minute) talk on their research each semester, for a total of two over the duration of the course. In addition, short "elevator talks" are developed and given in the second semester, the goal being to encapsulate the key points of your research in under 5 minutes. A main goal of the course is learning to present your research to different audiences: overview talks appropriate for a department-wide colloquium, specialty talks as would be given in a focused seminar, and a broad pitch you would give when meeting people and entering the job market. When not presenting, students are expected to engage with the speaker, asking questions and providing feedback at the end of the talk.

MATH-GA.2430-001 Real Variables (One-Term)

3 Points, Mondays, Wednesdays, 9:30-10:45AM, Jalal Shatah

Note: Master's students need permission of course instructor before registering for this course.

Prerequisites: A familiarity with rigorous mathematics, proof writing, and the epsilon-delta approach to analysis, preferably at the level of MATH-GA 1410, 1420 Introduction to Mathematical Analysis I, II.

Description: Measure theory and integration. Lebesgue measure on the line and abstract measure spaces. Absolute continuity, Lebesgue differentiation, and the Radon-Nikodym theorem. Product measures, the Fubini theorem, etc. L p spaces, Hilbert spaces, and the Riesz representation theorem. Fourier series.

Main Text: Folland's Real Analysis: Modern Techniques and Their Applications

Secondary Text: Bass' Real Analysis for Graduate Students

MATH-GA.2450-001 Complex Variables I

3 Points, Mondays, 9:00-10:50AM, Antoine Cerfon

Description: Complex numbers; analytic functions; Cauchy-Riemann equations; Cauchy's theorem; Laurent expansion; analytic continuation; calculus of residues; conformal mappings.

Text: Marsden and Hoffman, Basic Complex Analysis, 3d edition

MATH-GA.2451-001 Complex Variables (One-Term)

3 Points, Tuesdays, Thursdays, 2:00-3:15PM, Fengbo Hang

Note: Master's students need permission of course instructor before registering for this course.

Prerequisites: Complex Variables I (or equivalent) and MATH-GA 1410 Introduction to Math Analysis I.

Description: Complex numbers, the complex plane. Power series, differentiability of convergent power series. Cauchy-Riemann equations, harmonic functions. Conformal mapping, linear fractional transformations. Integration, Cauchy integral theorem, Cauchy integral formula. Morera's theorem. Taylor series, residue calculus. Maximum modulus theorem. Poisson formula. Liouville theorem. Rouche's theorem. Weierstrass and Mittag-Leffler representation theorems. Singularities of analytic functions: poles, essential singularities, branch points. Analytic continuation, monodromy theorem, Schwarz reflection principle. Compactness of families of uniformly bounded analytic functions. Integral representations of special functions. Distribution of function values of entire functions.

Text: Ahlfors, L. (1979). International Series in Pure and Applied Mathematics [Series, Bk. 7]. Complex Analysis (3rd ed.). New York, NY: McGraw-Hill.

MATH-GA.2490-001 Introduction To Partial Differential Equations

3 Points, Mondays, 11:00-12:50PM, Guido DePhilippis

Note: Master's students should consult course instructor before registering for PDE II in the spring.

Prerequisites: Knowledge of undergraduate-level linear algebra and ODE; also some exposure to complex variables (can be taken concurrently).

Description: A basic introduction to PDEs, designed for a broad range of students whose goals may range from theory to applications. This course emphasizes examples, representation formulas, and properties that can be understood using relatively elementary tools. We will take a broad viewpoint, including how the equations we consider emerge from applications, and how they can be solved numerically. Topics will include: the heat equation; the wave equation; Laplace's equation; conservation laws; and Hamilton-Jacobi equations. Methods introduced through these topics will include: fundamental solutions and Green's functions; energy principles; maximum principles; separation of variables; Duhamel's principle; the method of characteristics; numerical schemes involving finite differences or Galerkin approximation; and many more.

• Guenther, R.B., & Lee, J.W. (1996). Partial Differential Equations of Mathematical Physics and Integral Equations. Mineola, NY: Dover Publications.
• Evans, L.C. (2010). Graduate Studies in Mathematics [Series, Bk. 19]. Partial Differential Equations (2nd ed.). Providence, RI: American Mathematical Society.

3 Points, Thursdays, 9:00-10:50AM, Sylvia Serfaty

Prerequisites: MATH-GA 2500 (Partial Differential Equations), or a comparable introduction to PDE using Sobolev spaces and functional analysis.

Description: Selected PDE topics of broad importance and applicability, including: boundary integral methods for elliptic PDE; regularity via Schauder estimates; steepest-descent and dynamical-systems perspectives on some nonlinear parabolic equations; weak and strong solutions of the Navier-Stokes equations; and topics from the calculus of variations, including homogenization and Gamma convergence.

MATH-GA.2563-001 Harmonic Analysis

3 Points, Wednesdays, 3:20-5:10PM, Sinan Gunturk

Prerequisites: Real analysis; basic knowledge of complex variables and functional analysis.

Description: Fourier series and integrals, Hardy-Littlewood maximal function, interpolation theory, Hilbert transform, singular integrals and Calderon-Zygmund theory, oscillatory integrals, Littlewood-Paley theory, pseudo-differential operators and Sobolev spaces. If time allows: paradifferential calculus, T1 theorem.

Text: Fourier Analysis by Javier Duoandikoetxea (Graduate Studies in Mathematics, AMS, 2001); the course will also draw on lecture notes by Terry Tao.

MATH-GA.2701-001 Methods Of Applied Math

3 Points, Thursdays, 3:20-5:10PM, Aaditya Rangan

Prerequisites: Elementary linear algebra and differential equations.

Description: This is a first-year course for all incoming PhD and Masters students interested in pursuing research in applied mathematics. It provides a concise and self-contained introduction to advanced mathematical methods, especially in the asymptotic analysis of differential equations. Topics include scaling, perturbation methods, multi-scale asymptotics, transform methods, geometric wave theory, and calculus of variations.

• Barenblatt, G.I. (1996). Cambridge Texts in Applied Mathematics [Series, Bk. 14]. Scaling, Self-similarity, and Intermediate Asymptotics: Dimensional Analysis and Intermediate Asymptotics. New York, NY: Cambridge University Press
• Hinch, E.J. (1991). Cambridge Texts in Applied Mathematics [Series, Bk. 6]. Perturbation Methods. New York, NY: Cambridge University Press
• Bender, C.M., & Orszag, S.A. (1999). Advanced Mathematical Methods for Scientists and Engineers [Series, Vol. 1]. Asymptotic Methods and Perturbation Theory. New York, NY: Springer-Verlag
• Whitham, G.B. (1999). Pure and Applied Mathematics: A Wiley Series of Texts, Monographs and Tracts [Series, Bk. 42]. Linear and Nonlinear Waves (Reprint ed.). New York, NY: John Wiley & Sons/ Wiley-Interscience
• Gelfand, I.M., & Fomin, S.V. (2000). Calculus of Variations. Mineola, NY: Dover Publications

MATH-GA.2702-001 Fluid Dynamics

3 Points, Wednesdays, 1:25-3:15PM, Antoine Cerfon

Prerequisites: Introductory complex variable and partial differential equations.

Description: The course will expose students to basic fluid dynamics from mathematical and physical perspectives, covering both compressible and incompressible flows. Topics: conservation of mass, momentum, and energy. Eulerian and Lagrangian formulations. Basic theory of inviscid incompressible and compressible fluids, including the formation of shock waves. Kinematics and dynamics of vorticity and circulation. Special solutions to the Euler equations: potential flows, rotational flows, irrotational flows, and conformal mapping methods. The Navier-Stokes equations, boundary conditions, boundary layer theory. The Stokes equations.

Text: Childress, S. Courant Lecture Notes in Mathematics [Series, Bk. 19]. An Introduction to Theoretical Fluid Mechanics. Providence, RI: American Mathematical Society/ Courant Institute of Mathematical Sciences.

Recommended Text: Acheson, D.J. (1990). Oxford Applied Mathematics & Computing Science Series [Series]. Elementary Fluid Dynamics. New York, NY: Oxford University Press.

MATH-GA.2707-001 Time Series Analysis & Statistical Arbitrage

3 Points, Mondays, 5:10-7:00PM, Farshid Asl and Robert Reider

Prerequisites: Financial Securities and Markets; Scientific Computing in Finance (or Scientific Computing); and familiarity with basic probability.

Description: The term "statistical arbitrage" covers any trading strategy that uses statistical tools and time series analysis to identify approximate arbitrage opportunities while evaluating the risks inherent in the trades (considering the transaction costs and other practical aspects). This course starts with a review of Time Series models and addresses econometric aspects of financial markets such as volatility and correlation models. We will review several stochastic volatility models and their estimation and calibration techniques as well as their applications in volatility based trading strategies. We will then focus on statistical arbitrage trading strategies based on cointegration, and review pairs trading strategies. We will present several key concepts of market microstructure, including models of market impact, which will be discussed in the context of developing strategies for optimal execution. We will also present practical constraints in trading strategies and further practical issues in simulation techniques. Finally, we will review several algorithmic trading strategies frequently used by practitioners.

MATH-GA.2751-001 Risk & Portfolio Management

3 Points, Wednesdays, 5:10-7:00PM, Kenneth Winston

Prerequisites: Multivariate calculus, linear algebra, and calculus-based probability.

Description: Risk management is arguably one of the most important tools for managing investment portfolios and trading books and quantifying the effects of leverage and diversification (or lack thereof).

This course is an introduction to portfolio and risk management techniques for portfolios of (i) equities, delta-1 securities, and futures and (ii) basic fixed income securities.

A systematic approach to the subject is adopted, based on selection of risk factors, econometric analysis, extreme-value theory for tail estimation, correlation analysis, and copulas to estimate joint factor distributions. We will cover the construction of risk measures (e.g. VaR and Expected Shortfall) and portfolios (e.g. portfolio optimization and risk). As part of the course, we review current risk models and practices used by large financial institutions.

It is important that students taking this course have good working knowledge of multivariate calculus, linear algebra and calculus-based probability.

MATH-GA.2755-001 Project & Presentation

3 Points, Thursdays, 5:10-7:00PM, Petter Kolm

Students in the Mathematics in Finance program conduct research projects individually or in small groups under the supervision of finance professionals. The course culminates in oral and written presentations of the research results.

MATH-GA.2791-001 Financial Securities And Markets

3 Points, Wednesdays, 7:10-9:00PM, Marco Avellaneda

Prerequisites: Multivariate calculus, linear algebra, and calculus-based probability.

This course provides a quantitative introduction to financial securities for students who are aspiring to careers in the financial industry. We study how securities are traded, priced, and hedged in the financial markets. Topics include: arbitrage; risk-neutral valuation; the log-normal hypothesis; binomial trees; the Black-Scholes formula and applications; the Black-Scholes partial differential equation; American options; one-factor interest rate models; swaps, caps, floors, swaptions, and other interest-based derivatives; credit risk and credit derivatives; clearing; valuation adjustment; and capital requirements. It is important that students taking this course have good working knowledge of multivariate calculus, linear algebra, and calculus-based probability.

MATH-GA.2793-001 Dynamic Asset Pricing (2nd Half Of Semester)

1.5 Points, Mondays, 7:10-9:00PM, Alireza Javaheri and Samim Ghamami

Prerequisites: Calculus-based probability, Stochastic Calculus, and a one semester course on derivative pricing (such as what is covered in Financial Securities and Markets).

MATH-GA.2803-001 Fixed Income Derivatives: Models & Strategies In Practice (1st Half Of Semester)

1.5 Points, Thursdays, 7:10-9:00PM, Leon Tatevossian and Amir Sadr

Prerequisites: Computing in Finance, or equivalent programming skills; and Financial Securities and Markets, or equivalent familiarity with Black-Scholes interest rate models.

Description: This half-semester class focuses on the practical workings of the fixed-income and rates-derivatives markets. The course content is motivated by a representative set of real-world trading, investment, and hedging objectives. Each situation will be examined from the ground level and its risk and reward attributes will be identified. This will enable the students to understand the link from the underlying market views to the applicable product set and the tools for managing the position once it is implemented. Common threads among products – structural or model-based – will be emphasized. We plan on covering bonds, swaps, flow options, semi-exotics, and some structured products.

A problem-oriented holistic view of the rate-derivatives market is a natural way to understand the line from product creation to modeling, marketing, trading, and hedging. The instructors hope to convey their intuition about both the power and limitations of models and show how sell-side practitioners manage these constraints in the context of changes in market backdrop, customer demands, and trading parameters.

MATH-GA.2805-001 Trends In Sell-Side Modeling: Xva, Capital And Credit Derivatives

3 Points, Tuesdays, 5:10-7:00PM, Leif Andersen

Prerequisites: Advanced Risk Management; Financial Securities and Markets, or equivalent familiarity with market and credit risk models; and Computing in Finance, or equivalent programming experience.

Description: This class explores technical and regulatory aspects of counterparty credit risk, with an emphasis on model building and computational methods. The first part of the class will provide technical foundation, including the mathematical tools needed to define and compute valuation adjustments such as CVA and DVA. The second part of the class will move from pricing to regulation, with an emphasis on the computational aspects of regulatory credit risk capital under Basel 3. A variety of highly topical subjects will be discussed during the course, including: funding costs, XVA metrics, initial margin, credit risk mitigation, central clearing, and balance sheet management. Students will get to build a realistic computer system for counterparty risk management of collateralized fixed income portfolios, and will be exposed to modern frameworks for interest rate simulation and capital management.

MATH-GA.2830-004 Advanced Topics In Applied Math: Mathematical Statistics

3 Points, Fridays, 2:00-3:40PM, Jonathan Niles-Weed

MATH-GA.2830-005 Advanced Topics In Applied Math: Mathematical Statistics Lab

3 Points, Fridays, 4:55-5:45PM, TBA

MATH-GA.2901-001 Essentials Of Probability

3 Points, Wednesdays, 5:10-7:00PM, Charles Newman

Prerequisites: Calculus through partial derivatives and multiple integrals; no previous knowledge of probability is required.

The course introduces the basic concepts and methods of probability.

Topics include: probability spaces, random variables, distributions, law of large numbers, central limit theorem, random walk, Markov chains and martingales in discrete time, and if time allows diffusion processes including Brownian motion.

Text: Probability Essentials, by J. Jacod and P. Protter. Springer, 2004.

MATH-GA.2902-002 Stochastic Calculus Optional Problem Session

3 Points, Wednesdays, 5:30-7:00PM, TBA

MATH-GA.2903-001 Stochastic Calculus (2nd Half Of Semester)

1.5 Points, Mondays, 7:10-9:00PM, Jonathan Goodman

Prerequisite: Multivariate calculus, linear algebra, and calculus-based probability.

Description: The goal of this half-semester course is for students to develop an understanding of the techniques of stochastic processes and stochastic calculus as they are applied in finance. We begin by constructing Brownian motion (BM) and the Ito integral, studying their properties. Then we turn to Ito's lemma and Girsanov's theorem, covering several practical applications. Towards the end of the course, we study the link between SDEs and PDEs through the Feynman-Kac equation. It is important that students taking this course have good working knowledge of calculus-based probability.

MATH-GA.2911-001 Probability Theory I

3 Points, Tuesdays, 11:00-12:50PM, Paul Bourgade

Prerequisites: A first course in probability and familiarity with the Lebesgue integral, or MATH-GA 2430 Real Variables as a mandatory co-requisite.

Description: First semester of an annual sequence in Probability Theory, aimed primarily at Ph.D. students. Topics include laws of large numbers, weak convergence, central limit theorems, conditional expectation, martingales, and Markov chains.

MATH-GA.2931-001 Advanced Topics In Probability: Topic TBA (September 21st Thru November 12th)

3 Points, Tuesdays, Thursdays, 9:00-10:50AM, Ofer Zeitouni

MATH-GA.3001-001 Geophysical Fluid Dynamics

3 Points, Tuesdays, 9:00-10:50AM, Oliver Buhler

This course serves as an introduction to the fundamentals of geophysical fluid dynamics. No prior knowledge of fluid dynamics will be assumed, but the course will move quickly into the subtopic of rapidly rotating, stratified flows. Topics to be covered include (but are not limited to): the advective derivative, momentum conservation and continuity, the rotating Navier-Stokes equations and non-dimensional parameters, equations of state and thermodynamics of Newtonian fluids, atmospheric and oceanic basic states, the fundamental balances (thermal wind, geostrophic and hydrostatic), the rotating shallow water model, vorticity and potential vorticity, inertia-gravity waves, geostrophic adjustment, the quasi-geostrophic approximation and other small-Rossby number limits, Rossby waves, baroclinic and barotropic instabilities, Rayleigh and Charney-Stern theorems, geostrophic turbulence. Students will be assigned bi-weekly homework assignments and some computer exercises, and will be expected to complete a final project. This course will be supplemented with out-of-class instruction.

## Simulating Dependent Random Variables Using Copulas

This example shows how to use copulas to generate data from multivariate distributions when there are complicated relationships among the variables, or when the individual variables are from different distributions.

MATLAB® is an ideal tool for running simulations that incorporate random inputs or noise. Statistics and Machine Learning Toolbox™ provides functions to create sequences of random data according to many common univariate distributions. The Toolbox also includes a few functions to generate random data from multivariate distributions, such as the multivariate normal and multivariate t. However, there is no built-in way to generate a multivariate distribution with arbitrary marginal distributions, or one in which the individual variables come from different distributions.

Recently, copulas have become popular in simulation models. Copulas are functions that describe dependencies among variables, and provide a way to create distributions to model correlated multivariate data. Using a copula, a data analyst can construct a multivariate distribution by specifying marginal univariate distributions, and choosing a particular copula to provide a correlation structure between variables. Bivariate distributions, as well as distributions in higher dimensions, are possible. In this example, we discuss how to use copulas to generate dependent multivariate random data in MATLAB, using Statistics and Machine Learning Toolbox.
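The recipe just described, choosing marginal distributions and then a copula to tie them together, can be sketched in a few lines. The original example uses MATLAB and Statistics and Machine Learning Toolbox; the sketch below uses Python with NumPy/SciPy as a stand-in, and the specific choices (a Gaussian copula with correlation 0.7, Gamma and Student-t marginals) are illustrative assumptions, not part of the text.

```python
# Sketch: building a bivariate distribution from a Gaussian copula.
# Illustrative parameters only; the text's example is in MATLAB.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Step 1: draw from a bivariate normal with the desired correlation.
rho = 0.7
cov = [[1.0, rho], [rho, 1.0]]
z = rng.multivariate_normal([0.0, 0.0], cov, size=10_000)

# Step 2: push each component through the standard normal CDF.
# u has uniform marginals but retains the dependence (a Gaussian copula).
u = stats.norm.cdf(z)

# Step 3: apply the inverse CDF of whichever marginals are wanted.
x = stats.gamma.ppf(u[:, 0], a=2.0)   # first variable: Gamma(shape 2)
y = stats.t.ppf(u[:, 1], df=5)        # second variable: Student t, 5 dof

# x and y are dependent even though their marginals differ.
print(np.corrcoef(x, y)[0, 1])
```

Because each inverse CDF is monotone, the dependence imposed in step 1 survives into the final variables, which is the whole point of the construction.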

### Dependence Between Simulation Inputs

One of the design decisions for a Monte-Carlo simulation is a choice of probability distributions for the random inputs. Selecting a distribution for each individual variable is often straightforward, but deciding what dependencies should exist between the inputs may not be. Ideally, input data to a simulation should reflect what is known about dependence among the real quantities being modelled. However, there may be little or no information on which to base any dependence in the simulation, and in such cases, it is a good idea to experiment with different possibilities, in order to determine the model's sensitivity.

However, it can be difficult to actually generate random inputs with dependence when they have distributions that are not from a standard multivariate distribution. Further, some of the standard multivariate distributions can model only very limited types of dependence. It's always possible to make the inputs independent, and while that is a simple choice, it's not always sensible and can lead to the wrong conclusions.

For example, a Monte-Carlo simulation of financial risk might have two random inputs that represent different sources of insurance losses. These inputs might be modeled as lognormal random variables. A reasonable question to ask is how dependence between these two inputs affects the results of the simulation. Indeed, it might be known from real data that the same random conditions affect both sources, and ignoring that in the simulation could lead to the wrong conclusions.
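To make the sensitivity question concrete, a simulation can compare independent and copula-coupled versions of the same two lognormal loss sources and look at the tail of the total loss. The sketch below is again in Python rather than MATLAB, and every parameter value (lognormal sigma = 0.5, copula correlation 0.8) is an illustrative assumption.

```python
# Sketch: how dependence between two lognormal loss sources changes
# the tail of the total loss. Illustrative parameters only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 100_000
sigma = 0.5                      # shape of each lognormal marginal

# Case 1: independent inputs.
indep = stats.lognorm.rvs(sigma, size=(n, 2), random_state=rng)

# Case 2: same marginals, coupled by a Gaussian copula (correlation 0.8).
z = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=n)
dep = stats.lognorm.ppf(stats.norm.cdf(z), sigma)

total_indep = indep.sum(axis=1)
total_dep = dep.sum(axis=1)

# Same marginals, hence the same expected total loss, but the
# dependent case puts more mass in the tail, which is exactly what
# a risk simulation cares about.
print(np.quantile(total_indep, 0.99), np.quantile(total_dep, 0.99))
```

The means of the two totals agree; only the tail quantiles differ, so an analyst who assumed independence by default would understate the 99th-percentile loss.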

## Abstract

The use of Pipeline Inspection Gauges (PIGs) to clean pipes and to determine the exact location of existing pipes is essential to protect them from damage during operation. An in-depth study of the most important component, the sealing disc, is a new way to address the failure of current PIGs during operation. This study focuses on redesigning the sealing disc in combination with soft-robot drive technology. First, a new multi-airbag sealing disc was structured from a conventional sealing disc and several airbags located above it. The multi-airbag sealing disc was then successfully fabricated using 3D printing technology and multi-step molding, and finally active control of the disc was realized by inflating the airbags. For the newly designed multi-airbag sealing disc, the quantities of most interest are the bending angle β of the sealing disc and the expansion ratio η of the airbag. Numerical and experimental results were compared to verify the accuracy of the numerical method. Based on this method, the effects of airbag position, height ratio, and thickness ratio on the bending angle of the multi-airbag sealing disc and the expansion ratio of the airbag were studied. The results show that, at a given air pressure, the position of the airbag does not substantially affect the expansion ratio of the airbag itself, while the smaller the distance between the airbag and the outer edge of the sealing disc, the easier the disc is to bend. The larger the height ratio and the smaller the thickness ratio, the larger both the bending angle of the sealing disc and the expansion ratio of the airbag.