Introduction to Monte Carlo Simulation

For Summer 1997 Envision-It! Workshop

A technique which has had a great impact in many different fields of computational science is a technique called "Monte Carlo Simulation." This technique derives its name from the casinos in Monte Carlo - a Monte Carlo simulation uses random numbers to model some sort of a process. This technique works particularly well when the process is one where the underlying probabilities are known but the results are more difficult to determine. A great deal of the CPU time on some of the fastest computers in the world is spent performing Monte Carlo simulations because we can write down some of the fundamental laws of physics but cannot analytically solve them for problems of interest.

An example of how a Monte Carlo simulation can be used in everyday life is a project one of my students did in a FORTRAN class - he wanted to find the best strategy for winning money at blackjack. The conventional approach (using just statistics) would be to write down the probability of having a particular combination (such as having an Ace and a Five with the dealer showing a Jack) and then calculating the expected payoff from each possible scenario (hit or no hit, but then you must calculate what to do if you get a seven after a hit). If you start thinking about all of the possible it quickly becomes overwhelming. This problem, however, works very well as a Monte Carlo simulation. We know the underlying probabilities (drawing any particular card out a deck is 1/52 or if there are 5 decks in a shoe it is 1/5*52) and so all we need are the "rules" to use. In this case, the student wrote a program which would randomly generate a shoe of 5 decks of cards. It would then "deal" the cards to him and the "dealer." The "dealer" always followed the standard rules (hit on 16 and stay on 17) and the student programmed in the betting stratagy he wanted to test (on what combinations he would hit or stay, double, split, etc.). Then he would run the program and have it generate several hundred shoes and keep track of the winnings (or losings) and print the results at the end (this would take about an hour on a PC). One can then test various stratagies and see how they will actually work out in the long run.

Let us do a simple example of a Monte Carlo simulation to illustrate the technique. First, let us consider the following problem, we want to make a simulation that will allow us to find the value of pi. We will do this in the following way: consider a square that has one corner at the origin of a coordinate system and has sides of length 1 - it will obviously have an area of 1. Now consider inscribing a quarter of a circle inside of this with a radius of 1 - we know that its area is pi/4. We can a Monte Carlo simulation to find the relative area of the circle and square and then multiply the circle's area by 4 to find pi. In particular, the way we will find the area of the circle is to note the following: for a point (X,Y) to be inside of a circle of radius 1, its distance from the origin (X2+Y2) will be less than or equal to 1. We can generate thousands of random (X,Y) positions and determine whether each of them are inside of the circle. Each time it is inside of the circle, we will add one to a counter. After generating a large number of points, the ratio of the number of points inside the circle to the total number of points generated will approach the ratio of the area of the circle to the area of the square. Thus the value of pi would simply be

pi is approximately 4*(Number of Points inside circle)/ (Total Points Generated)
Thus we can find an approximation to pi using simple math.

The program below can be used to find an approximation of pi

   %  pimc.m
   %  Matlab Program to Find Pi using Random Numbers
   %  Tom Huber,  June 15, 1996
   Nrand = input('How Many Random Numbers ');
   NInside = 0;
   for nloops=1:Nrand
      Xrand = rand;   % Generate Random XY Point
      Yrand = rand;
      Rrand = Xrand^2 + Yrand^2;  % Find its distance from origin
      if (Rrand <= 1)
          NInside = NInside + 1;
   disp(['Total Generated: ' num2str(Nrand) ' Inside Pts: ' ...
   piapprox = 4*NInside/Nrand;
   disp(['  Approximation to pi = ' num2str(piapprox)]);

Try running the program with about 1000 random points. How good is the approximation? Does it give the same result each time you run the program? By how much does it vary? Does the result improve with 5000 points? How much longer does it take to calculate?

We can greatly improve the speed of the program above by optimizing it for Matlab. In particular, Matlab is very fast at working with vector and matrices of numbers. In the student edition of matlab, the largest vector which can be used is 8192 elements, so we will have the program generate 8192 random numbers with one call to rand. Then we will determine the radius for all of these simultaneously using the command

Rrand = Xrand.^2 + Yrand.^2;
Note the use of the .^ operator - this squares each element of the vector separately instead of doing a matrix multiplication. Finally, we can check all of the elements of the vector to determine if the radius is < 1 using a single command

CheckValue = Rrand<=1.;
This will create a vector CheckValue which will have a 1 for every element which satisfies the condition (Rrand<=1.) and a 0 for every element which does not satisfy the condition. Finally, we can determine the number inside by adding up all of the values in CheckValue

NInside = NInside + sum(CheckValue);
The program following program can be used to find an approximation of pi and will be much faster than the previous version.
   %  pimc2.m
   %  Optimized Matlab Program to Find Pi using Random Numbers
   %  Tom Huber,  June 15, 1996
   Nrand = 8192;  % Largest size of an array in Student Version of Matlab
   Nmax  = input('How Many Loops (of 8192 Random Numbers Each) ');
   NTrand = 0;
   NInside = 0;
   for nloops=1:Nmax
      Xrand = rand(1,Nrand);        % Generates 8192 Random XY Points
      Yrand = rand(1,Nrand);
      Rrand = Xrand.^2 + Yrand.^2;  % Finds the radius for all 8192 random points
      CheckValue = Rrand<=1.;  % Has 1 if True & 0 if False for each element
      NInside = NInside + sum(CheckValue);  % Total number of Points Inside
      NTrand = NTrand + Nrand;              % Total number of Points Generated
   disp(['Total Generated: ' num2str(NTrand) ' Inside Pts: ' ...
   piapprox = 4*NInside/NTrand;
   pierror = 4*sqrt(NInside)/NTrand;
   disp(['  Approximation to pi = ' num2str(piapprox) ...
      ' With Error ' num2str(pierror)]);
Try running 1 loop of the program above and note how much faster it is than running 8192 loops of the previous program. Now run 10 or 100 loops and notice how the accuracy of the approximation improves.

For Monte Carlo simulations, the processes are random, so each time it is run it will come up with slightly different results. It can be shown that the error in a random number of counts generated by a Monte Carlo simulation is approximately the square-root of the number. Thus, in this case the uncertainty in our value of pi is

pierror = 4*sqrt(NInside)/NTrand;
This shows one problem with a Monte Carlo method - to improve the accuracy of the results by a factor of 10 we need to run approximately 100 times longer. The bottom line is that although this method of finding pi is very easy conceptually, it would not be a very efficient method for finding pi to a large number of digits (there are programs that have calculated pi to well over a million digits, but not using a Monte Carlo approach since it would take far too much CPU time).

Now, for the problem we are studying, namely determining the global climate, there are several places where a Monte Carlo simulation can be of use. In particular, we will need it to help us determine the global average temperature and the amount of sunlight which falls into each latitude band. To determine the global average temperature, we want to average of the temperature of each latitude band, but there is obviously much more land area in the region from the equator to a latitude of 10o than in a band from 80o to the north pole. Therefore to determine the average temperature, we will want to weight the temperature of each band by the fraction of the earths land area in that band. We can do this analytically using integration, but this also can be done well with a Monte Carlo method.

We will modify the program above to generate random (XYZ) points in a cube of sides 1 unit. Next we will determine if they are on the surface of a sphere of radius 1 by using the following:

                       Rrand = Xrand.^2 + Yrand.^2 + Zrand.^2;
                       CheckValue = Rrand<=1.01 & Rrand>=.99;
this will determine if the points are on the surface of the sphere. Next, we will check if points are within each latitude band as well as on the surface of the sphere. The program will increment a counter for each point which meets these criteria. At the end, we can divide the number in each latitude band by the total number of points which were on the surface to find the area in each band.

A program to do this is below.

%  Spheremc.m
%  Program to Determine Fraction of Area in Latitude Bands on a Sphere
%  Tom Huber  June 25, 1996
Theta1 = 0;
Theta2 = 90;
NSubDiv = 9; % Nine Subdivisions of 10 Degrees Each
dTh = (Theta2-Theta1)/NSubDiv;  % Width of Each Division (10 Degrees)
ThLow = Theta1:dTh:Theta2-dTh;  % Lower Limit for Each Region ( 0,10,20..80)
ThHigh = Theta1+dTh:dTh:Theta2; % Upper Limit for Each Region (10,20,30..90)
Nrand = 8192;    % Number of Points for Student Edition of Matlab
Nmax  = input('How Many Loops of 8192 Values Each ');
NTrand = 0;               % Initialize Total number of Points Generated
NGoodPts = 0;             % Initialize Total number of Points on Sphere
NZone = zeros(1,NSubDiv); % Initialize Number in each zone
T0 = clock;               % Keep track of CPU time (for reference purposes)
for nloops=1:Nmax
   Xrand = rand(1,Nrand);    % Generate XYZ Points in space
   Yrand = rand(1,Nrand);
   Zrand = rand(1,Nrand);
   Rrand = Xrand.^2 + Yrand.^2 + Zrand.^2;  % Find distance from origin
   CheckValue = Rrand<=1.01 & Rrand>=.99;   % See if on surface of sphere
   NGoodPts = NGoodPts + sum(CheckValue);   % Keep track of total on surface
   Lat = asin(Zrand)*180/pi;                % Find the Latitude of Each point
   for i=1:NSubDiv                          % Sweep through all latitudes
       NZoneCheck = Lat < ThHigh(i) & Lat >= ThLow(i); % Check if in latitude
       NZoneCheck = NZoneCheck .* CheckValue;          %  and on surface
       NZone(i) = NZone(i) + sum(NZoneCheck);          % If so, add to sums
   NTrand = NTrand + Nrand;                 % Total number of Points Generated
T0 = clock - T0;                            % CPU Time at end of program
disp(['Total Generated: ' num2str(NTrand) ' Good Pts: '...
    num2str(NGoodPts)  ' Seconds: ' num2str(T0)]);
fLatitude = NZone/NGoodPts;
fError = fLatitude./sqrt(NZone);
fActual = sin(ThHigh*pi/180.)-sin(ThLow*pi/180.);
disp('Summary for Zones');
disp('Lower Angle, Upper Angle, Simulated Fraction in Band, Uncertainty,');
disp('   Actual Fraction (using Calculus)');
disp([ ThLow' ThHigh' fLatitude' fError' fActual']);

Electronic Copy:
Created: 8-JUL-97 by Tom Huber, Physics Department, Gustavus Adolphus College.