UON Programming Weather Forecast and Gambler Ruin Problem Project

Description



Assignment
Description of the assignment:
Some of the questions require exploring different sources and comparing the discussion among them. After collecting all the answers and explanations, you need to compile a unified summary and provide management techniques and strategies. Only one copy per group representative is required. Please submit one Word document answering the questions and one .py file containing all Python code. No page limit or format is imposed. Try your best to submit a professional and comprehensive report.
Problem 1: Weather Forecast:
Suppose that whether or not it rains today depends on previous weather conditions through the last
two days. Specifically, suppose that if it has rained for the past two days, then it will rain tomorrow
with probability 0.7; if it rained today but not yesterday, then it will rain tomorrow with probability
0.5; if it rained yesterday but not today, then it will rain tomorrow with probability 0.4; if it has
not rained in the past two days, then it will rain tomorrow with probability 0.2.
If we let the state at time n depend only on whether or not it is raining at time n, then the preceding
model is not a Markov chain (why not?). However, we can transform this model into a Markov
chain by saying that the state at any time is determined by the weather conditions during both that
day and the previous day. In other words, we can say that the process is in

• State 0 if it rained both today and yesterday
• State 1 if it rained today but not yesterday
• State 2 if it rained yesterday but not today
• State 3 if it did not rain either yesterday or today
The preceding would then represent a four-state Markov chain having a transition probability
matrix
You should carefully check the matrix P, and make sure you understand how it was obtained.
a) Calculate the proportion of days that it rains.
b) If it rains today, what is the probability that it also rains tomorrow?
c) If it rains on Monday, what is the probability that it is sunny on Wednesday?
d) If it rains on Monday, what is the chance that it also rains on three consecutive days (Tuesday, Wednesday, and Thursday)?
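Since the transition matrix P referred to above is not reproduced in this extract, the following sketch rebuilds it directly from the four conditional probabilities stated in the problem and computes the stationary distribution needed for part (a). numpy is assumed to be available, and the variable names are mine.

```python
import numpy as np

# States: 0 = rain today and yesterday, 1 = rain today but not yesterday,
#         2 = rain yesterday but not today, 3 = no rain on either day.
# Each row lists the next-day transition probabilities stated in the text.
P = np.array([
    [0.7, 0.0, 0.3, 0.0],   # rained both days: rain tomorrow w.p. 0.7
    [0.5, 0.0, 0.5, 0.0],   # rained today only: rain tomorrow w.p. 0.5
    [0.0, 0.4, 0.0, 0.6],   # rained yesterday only: rain tomorrow w.p. 0.4
    [0.0, 0.2, 0.0, 0.8],   # no rain either day: rain tomorrow w.p. 0.2
])

# Stationary distribution: solve pi @ P = pi together with sum(pi) = 1.
A = np.vstack([P.T - np.eye(4), np.ones(4)])
b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
pi = np.linalg.lstsq(A, b, rcond=None)[0]

# It rains today exactly in states 0 and 1.
print("stationary distribution:", pi)
print("long-run proportion of rainy days:", pi[0] + pi[1])
```

Parts (b) through (d) can then be answered from the rows of P and of its powers (for example, P squared for a two-day-ahead forecast).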
Discussion Questions:
e) If you are a weather forecast manager, do you think the Markov Chains method alone is
enough in predicting the weather? What other factors should have been taken into account
for better predictions? How do you think the probability data in the matrix above (P) can
be obtained?
f) Provide some variations to extend this problem and create three additional questions for
the variation you just introduced. You do not need to answer the questions at this time.
Problem 2: Random Walk:
Consider a Markov chain whose state space consists of the integers i = 0, ±1, ±2, . . ., and has
transition probabilities given by
Pi,i+1 = p = 1 − Pi,i−1,
i = 0, ±1, ±2,…
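Although the rest of Problem 2 is cut off in this extract, the chain defined above is easy to simulate; a minimal sketch (the function name is mine), stepping +1 with probability p and -1 otherwise:

```python
import random

def random_walk(p, n_steps, seed=0):
    """Simulate the chain with P(i, i+1) = p = 1 - P(i, i-1), starting at 0."""
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(n_steps):
        # Step right with probability p, left with probability 1 - p.
        position += 1 if rng.random() < p else -1
        path.append(position)
    return path

print(random_walk(p=0.5, n_steps=10))  # positions X_0 through X_10
```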
the spring, being elongated, pulls it back to the left. Conversely, if the block is placed to the left of its equilibrium position (x < 0), then the spring is compressed and pushes the block to the right. In both cases, we can express the component along the x-axis of the force due to the spring according to the following formula:

F_x = -k * x

Here, we have the following:
• F_x is the force.
• k is the elastic constant.
• x is the horizontal coordinate that indicates the position of the mass m.

From the second law of dynamics, we can derive the component of acceleration along x as follows:

a_x = -(k / m) * x

Here, we have the following:
• a_x is the acceleration.
• k is the elastic constant.
• m is the mass of the block.
• x is the horizontal coordinate that indicates the position of the mass m.

If we indicate with dv/dt the rate of change of the speed and with v = dx/dt the speed, we can obtain the evolution equations of the dynamic system, as follows:

dx/dt = v
dv/dt = -ω² * x

Here, ω² = k / m.

For these equations, we must associate the initial conditions of the system, which we can write in the following way:

x(0) = x0
v(0) = v0

The solutions to the previous differential equations are as follows:

x(t) = x0 * cos(ωt) + (v0 / ω) * sin(ωt)
v(t) = v0 * cos(ωt) - x0 * ω * sin(ωt)

In this way, we obtained the mathematical model of the analyzed system. In order to study the evolution of the oscillation phenomenon of the mass block m over time, it is enough to vary the time and calculate the position of the mass at that instant and its speed.

In decision-making processes characterized by high levels of complexity, the use of analytical models is not possible. In these cases, it is necessary to resort to models that differ from analytical ones in that they use the computer not only as a calculation tool, as in mathematical programming models, but also as a way of representing the elements that make up reality and of studying the relationships between them.
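The closed-form oscillator solution above can be checked numerically; a quick sketch, where the values of k, m, x0, and v0 are illustrative assumptions of mine:

```python
import math

k, m = 2.0, 0.5            # elastic constant and mass (illustrative values)
omega = math.sqrt(k / m)   # omega^2 = k / m
x0, v0 = 1.0, 0.0          # initial position and speed

def x(t):
    """Position of the block at time t, from the analytic solution."""
    return x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)

def v(t):
    """Speed of the block at time t, from the analytic solution."""
    return v0 * math.cos(omega * t) - x0 * omega * math.sin(omega * t)

# Sample one full period T = 2*pi/omega of the oscillation.
T = 2 * math.pi / omega
for i in range(5):
    t = i * T / 4
    print(f"t = {t:.3f}  x = {x(t):+.3f}  v = {v(t):+.3f}")
```

A useful sanity check on this solution is that the total energy (k*x² + m*v²)/2 stays constant over time.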
Predator-prey model

In the field of simulations, simulating the functionality of production and logistic processes is considerably important. These systems are, in fact, characterized by high complexity, numerous interrelationships between the different processes that pass through them, segment failures, unavailability, and the stochasticity of the system parameters.

To understand how complex the analytical modeling of some phenomena is, let's analyze a universally widespread biological model. This is the predator-prey model, which was developed independently by the Italian researcher Vito Volterra and the American biophysicist Alfred Lotka.

On an island, there are two populations of animals: prey and predators. The vegetation of the island provides the prey with nourishment in quantities that we can consider unlimited, while the prey is the only food available for the predators. We can consider the birth rate of the prey constant over time; this means that, in the absence of predators, the prey would grow according to an exponential law. Their mortality rate, on the other hand, depends on the probability they have of falling prey to a predator and therefore on the number of predators present per unit area. As for the predators, the mortality rate is constant, while their growth rate depends on the availability of food and therefore on the number of prey per unit area present on the island. We want to study the trend of the size of the two populations over time, starting from a known initial situation (number of prey and predators).
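The dynamics just described (exponential prey growth, a predation coupling term, and constant predator mortality) can be simulated numerically; below is a sketch with a hand-rolled fourth-order Runge-Kutta integrator, where all parameter values and the initial populations are illustrative assumptions of mine:

```python
def lotka_volterra(state, alpha, beta, gamma, delta):
    """Right-hand side of the predator-prey system."""
    prey, pred = state
    return (alpha * prey - beta * prey * pred,    # prey growth minus predation
            delta * prey * pred - gamma * pred)   # predator growth minus mortality

def rk4_step(f, state, dt, *args):
    """One fourth-order Runge-Kutta step for a system given as a tuple."""
    def add(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = f(state, *args)
    k2 = f(add(state, k1, dt / 2), *args)
    k3 = f(add(state, k2, dt / 2), *args)
    k4 = f(add(state, k3, dt), *args)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# Illustrative parameters (alpha, beta, gamma, delta) and initial populations.
params = (1.0, 0.1, 1.5, 0.075)
state = (10.0, 5.0)            # x(0) prey, y(0) predators
dt, steps = 0.01, 2000

history = [state]
for _ in range(steps):
    state = rk4_step(lotka_volterra, state, dt, *params)
    history.append(state)

print(f"final prey = {state[0]:.2f}, final predators = {state[1]:.2f}")
```

With these assumed parameters, both populations oscillate around the equilibrium point and remain positive throughout the run.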
To carry out a simulation of this biological system, we can model it by means of the following system of differential equations, where x(t) and y(t) are the number of prey and predators at time t, respectively:

dx/dt = α * x(t) - β * x(t) * y(t)
dy/dt = δ * x(t) * y(t) - γ * y(t)

Here, we have the following:
• α, β, γ, δ are positive real parameters related to the interaction of the two species.
• dx/dt is the instantaneous growth rate of the prey.
• dy/dt is the instantaneous growth rate of the predators.

The following hypotheses underlie this model:
• In the absence of predators, the number of prey increases according to an exponential law, that is, with a constant rate.
• Similarly, in the absence of prey, the number of predators decreases at a constant rate.

This is a deterministic and continuous simulation model. In fact, the state of the system, represented by the size of the two populations in each instant of time, is univocally determined by the initial state and the parameters of the model. Furthermore, at least in principle, the variables, that is, the sizes of the populations, vary continuously over time. The differential system is in normal form and can be solved with respect to the derivatives of maximum order, but it cannot be separated into variables. It is not possible to solve it in an analytical form, but with numerical methods it is solved immediately (the Runge-Kutta method). The solution obviously depends on the values of the four constants and the initial values.

Summary

In this chapter, we learned what is meant by simulation modeling. We understood the difference between modeling and simulation, and we discovered the strengths of simulation models, as well as their defects. To understand these concepts, we clarified the meaning of the terms that appear most frequently when dealing with these topics.
We then analyzed the different types of models: static versus dynamic, deterministic versus stochastic, and continuous versus discrete. We then explored the workflow connected to a numerical simulation process and highlighted the crucial steps. Finally, we studied some practical modeling cases to understand how to elaborate on a model starting from the initial considerations.

In the next chapter, we will learn how to approach a stochastic process and understand the random number simulation concepts. Then, we will explore the differences between pseudorandom and non-uniform random numbers, as well as the methods we can use for random distribution evaluation.

2 Understanding Randomness and Random Numbers

In many real-life situations, it is useful to flip a coin in order to decide what to do. Many computers also use this procedure as part of their decision-making process. In fact, many problems can be solved in a very effective and relatively simple way by using probabilistic algorithms. In an algorithm of this type, decisions are made based on random contributions that resemble the roll of a die, with the help of a randomly chosen value.

The generation of random numbers has ancient roots, but only recently has the process been sped up, allowing it to be used on a large scale in scientific research as well. These generators are mainly used for computer simulations, statistical sampling techniques, or in the field of cryptography.

In this chapter, we're going to cover the following topics:
• Stochastic processes
• Random number simulation
• The pseudorandom number generator
• Testing uniform distribution
• Exploring generic methods for random distributions
• Random number generation using Python

Technical requirements

In this chapter, we will introduce random number generation techniques. In order to understand these topics, a basic knowledge of algebra and mathematical modeling is needed.
To work with the Python code in this chapter, you need the following files (available on GitHub at https://github.com/PacktPublishing/Hands-On-SimulationModeling-with-Python):
• LinearCongruentialGenerator.py
• LearmouthLewisGenerator.py
• LaggedFibonacciAlgorithm.py
• UniformityTest.py
• Random.Generation.py

Stochastic processes

A stochastic process is a family of random variables that depends on a parameter, t. A stochastic process is specified using the following notation:

{X_t, t ∈ T}

Here, t is a parameter, and T is the set of possible values of t. Usually, time is indicated by t, so a stochastic process is a family of time-dependent random variables. The variability range of t, that is, the set T, can be a set of real numbers, possibly coinciding with the entire time axis. But it can also be a discrete set of values.

The random variables, X_t, are defined on the set, X, called the space of states. This can be a continuous set, in which case it is defined as a continuous stochastic process, or a discrete set, in which case it is defined as a discrete stochastic process. Consider the following elements:

x0, x1, x2, …, xn ∈ X

These values that the random variables, X_t, can take are called system states and represent the possible results of an experiment. The X_t variables are linked together by dependency relationships. We can know a random variable if we know both the values it can assume and the probability distribution. So, to understand a stochastic process, it is necessary to know not only the values that X_t can take but also the probability distributions of the variables and the joint distributions between the values. Simpler stochastic processes, in which the variability range of t is a discrete set of time values, can also be considered.

Important note
In practice, there are numerous phenomena that are studied through the theory of stochastic processes.
A classic application in physics is the study of the motion of a particle in a medium, the so-called Brownian motion. This study is carried out statistically using a stochastic process. There are processes where, even by knowing the past and the present, the future cannot be determined; whereas, in other processes, the future is determined by the present without considering the past.

Types of stochastic process

Stochastic processes can be classified according to the following characteristics:
• Space of states
• Time index
• Type of stochastic dependence between random variables

The state space can be discrete or continuous. In the first case, the stochastic process with discrete space is also called a chain, and space is often referred to as the set of non-negative integers. In the second case, the set of values assumed by the random variables is not finite or countable, and the stochastic process is in continuous space.

The time index can also be discrete or continuous. A discrete-time stochastic process is also called a stochastic sequence and is denoted as follows:

{X_n | n ∈ T}

Here, the set, T, is finite or countable. In this case, the changes of state are observed only in certain instants: finite or countable. If state changes occur at any instant in a finite or infinite set of real intervals, then there is a continuous-time process, which is denoted as follows:

{X(t) | t ∈ T}

The stochastic dependence between random variables, X(t), for different values of t characterizes a stochastic process and sometimes simplifies its description. A stochastic process is stationary in the strict sense if the distribution function is invariant with respect to a shift on the time axis, T. A stochastic process is stationary in the broad sense if the first two moments of the distribution are independent of the position on the T axis.
Examples of stochastic processes

The mathematical treatment of stochastic processes seems complex, yet we find cases of stochastic processes every day. For example, the number of patients admitted to a hospital as a function of time, observed at noon each day, is a stochastic process in which the space of states is discrete, being a finite subset of the natural numbers, and time is discrete. Another example of a stochastic process is the temperature measured in a room as a function of time, observed at every instant, with continuous state space and continuous time. Let's now look at a number of structured examples that are based on stochastic processes.

The Bernoulli process

The concept of a random variable allows us to formulate models that are useful for the study of many random phenomena. An important early example of a probabilistic model is the Bernoulli distribution, named in honor of the Swiss mathematician James Bernoulli (1654-1705), who made important contributions to the field of probability. Some experiments consist of repeatedly performing a given trial. For example, we want to know the probability of getting a head when throwing a coin 1,000 times. In each of these examples, we look for the probability of obtaining x successes in n trials. If x indicates the successes, then n - x will be the failures.

A sequence of Bernoulli trials consists of repeated trials under the following hypotheses:
• There are only two possible mutually exclusive results for each trial, arbitrarily called success and failure.
• The probability of success, p, is the same for each trial.
• All trials are independent.

Independence means that the result of a trial is not influenced by the result of any other trial. For example, the event, the third trial was successful, is independent of the event, the first trial was successful.
The toss of a coin is a Bernoulli trial: the heads event can be considered a success, and the tails event a failure. In this case, the probability of success is p = 1/2. In rolling two dice, the event, the sum of the points is seven, can be considered a success, and the complementary event a failure. In this case, it is a Bernoulli trial and the probability of success is p = 1/6.

Important note
Two events are said to be complementary when the occurrence of the first excludes the occurrence of the second, but one of the two will certainly occur.

Let p denote the probability of success in a Bernoulli trial. The random variable, X, which counts the number of successes in n trials is called the binomial random variable with parameters n and p. X can take integer values between 0 and n.

Random walk

The random walk is a discrete parameter stochastic process in which X_t, where X represents a random variable, describes the position taken at time t by a moving point. The term, random walk, refers to the mathematical formalization of statistics that describe the displacement of an object that moves randomly. This kind of simulation is extremely important for a physicist and has applications in statistical mechanics, fluid dynamics, and quantum mechanics. Random walks represent a mathematical model that is used universally to simulate a path formalized by a succession of random steps. This model can assume a variable number of degrees of freedom, depending on the system we want to describe. From a physical point of view, the path traced over time will not necessarily simulate a real motion, but it will represent the trend of the characteristics of the system over time. Random walks find applications in chemistry, biology, and physics, but also in other fields such as economics, sociology, and information technology.
One-dimensional random walking is a model that is used to simulate the movement of a particle moving along a straight line. There are only two potential movements on the allowed path: either to the right (with a probability that is equal to p) or to the left (with a probability that is equal to q) of the current position. Each step has a constant length and is independent of the others, as shown in the following diagram:

Figure 2.1 – One-dimensional walking

The position of the point in each instant is identified by its abscissa, X(n). This position, after n steps, will be characterized by a random term. Our aim is to calculate the probability of passing through the starting point after n movements. Obviously, nothing assures us that the point will return to the starting position.

The variable, X(n), returns the abscissa of the particle after n steps. It is a discrete random variable with a binomial distribution. At each instant, the particle steps right or left based on the value returned by a random variable, Z(n). This variable can take only two values: +1 and -1. It assumes a value of +1 with a probability of p > 0 and a value of -1 with a probability of q. The sum of the two probabilities is p + q = 1. The position of the particle at instant n is given by the following equation:

X_n = X_{n-1} + Z_n;  n = 1, 2, …
Let us now consider the average number of returns of the particle to the origin, denoted by μ. If p is the probability of a single return, the mean number of returns is given by the following geometric series:

μ = Σ_{n=0}^{∞} n * pⁿ * (1 - p)

For the symmetric walk, the probability of finding the particle at the origin after n steps decreases as 1/√(n * π), but the sum of these terms diverges:

Σ_n 1/√(n * π) → ∞

It follows that the probability of the particle returning to the origin tends to 1. This means that, although the frequency of the returns decreases as the number of steps taken increases, the returns still occur infinitely often over an infinite number of steps. So, we can conclude that a particle with equal probability of left and right movement, left free to walk randomly to infinity, returns with high probability infinitely many times to the point from which it started.
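This conclusion can be checked empirically; a sketch (the function name is mine) counting how often a symmetric walk revisits the origin over longer and longer runs:

```python
import random

def count_origin_returns(n_steps, seed=1):
    """Count how often a symmetric walk starting at 0 revisits the origin."""
    rng = random.Random(seed)
    position, returns = 0, 0
    for _ in range(n_steps):
        position += 1 if rng.random() < 0.5 else -1
        if position == 0:
            returns += 1
    return returns

# With a fixed seed, a longer walk extends the same path, so the count grows.
for n in (10**3, 10**4, 10**5):
    print(n, count_origin_returns(n))
```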
The Poisson process
There are phenomena in which certain events, with reference to a certain interval of time
or space, rarely happen. The number of events that occur in that interval varies from 0 to
n, and n cannot be determined a priori. For example, the number of cars passing through
an uncrowded street in a randomly chosen 5-minute time frame can be considered a rare
event. Similarly, the number of accidents at work that happen at a company in a week, or
the number of printing errors on a page of a book, is rare.
In the study of rare events, a reference to a specific interval of time or space is
fundamental. For the study of rare events, the Poisson probability distribution is used,
named in honor of the French mathematician, Simeon Denis Poisson (1781-1840), who
first obtained the distribution. The Poisson distribution is used as a model in cases where
the events or realizations of a process, distributed randomly in space or time, are counts,
that is, discrete variables.
The binomial distribution is based on a set of hypotheses that define the Bernoulli trials,
and the same happens for the Poisson distribution. The following conditions describe
the so-called Poisson process:
• The realizations of the events are independent, meaning that the occurrence of
an event in a time or space interval has no effect on the probability of the event
occurring a second time in the same, or another, interval.
• The probability of a single realization of the event in each interval is proportional to
the length of the interval.
• In any arbitrarily small part of the interval, the probability of the event occurring
more than once is negligible.
An important difference between the Poisson distribution and the binomial distribution
is the number of trials and successes. In a binomial distribution, the number, n, of trials
is finite and the number, x, of successes cannot exceed n; in a Poisson distribution, the
number of tests is essentially infinite and the number of successes can be infinitely large,
even if the probability of having x successes becomes very small as x increases.
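The relationship between the two distributions can be seen numerically: for large n and small p, binomial probabilities approach Poisson probabilities with λ = n * p. A sketch using only the standard library (the parameter values are illustrative):

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Probability of x successes in n Bernoulli trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Probability of x events for a Poisson distribution with mean lam."""
    return lam**x * exp(-lam) / factorial(x)

n, p = 1000, 0.003   # many trials, rare event
lam = n * p          # Poisson parameter lambda = n * p
for x in range(6):
    print(x, round(binomial_pmf(x, n, p), 5), round(poisson_pmf(x, lam), 5))
```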
Random number simulation
The availability of random numbers is a necessary requirement in many applications.
In some cases, the quality of the final application strictly depends on the possibility
of generating good quality random numbers. Think, for example, of applications such
as video games, cryptography, generating visuals or sound effects, telecommunications,
signal processing, optimizations, and simulations. In algorithms of this type, decisions are made based on the flip of a virtual coin, that is, based on a randomly chosen value.
There is no single or general definition of a random number since it often depends on the
context. The concept of random number itself is not absolute, as any number or sequence
of numbers can appear to be random to an observer, but not to another who knows the
law with which they are generated. Put simply, a random number is defined as a number
selected in a random process from a finite set of numbers. With this definition, we focus
on the concept of randomness in the process of selecting a sequence of numbers.
In many cases, the problem of generating random numbers concerns the random
generation of a sequence of 0s and 1s, from which numbers in any format can be obtained:
integers, fixed points, floating points, or strings of arbitrary length. With the right
functions, it is possible to obtain good quality sequences that can also be used in scientific
applications, such as the Monte Carlo simulation. These techniques should be easy to
implement and be usable by any computer. In addition, like all software solutions, they
should be very versatile and quickly improved.
Important note
These techniques have a big problem that is inherent to the algorithmic nature
of the process: the final string can be predicted from the starting seed. This is
why we call this process pseudorandom.
Despite this, many problems of an algorithmic nature are solved very effectively and
relatively simply using probabilistic algorithms. The simplest example of a probabilistic
algorithm is perhaps the randomized quicksort. This is a probabilistic variant of the homonymous sorting algorithm where, by choosing the pivot element at random, the algorithm guarantees optimal complexity in the average case, no matter the distribution of the input. Cryptography is a field in which randomness plays
a fundamental role and deserves specific mention. In this context, randomness does
not lead to computational advantages, but it is essential to guarantee the security of
authentication protocols and encryption algorithms.
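As a sketch of the randomized quicksort just mentioned (a simple, non-in-place variant):

```python
import random

def randomized_quicksort(items):
    """Quicksort with a uniformly random pivot, giving O(n log n) expected
    time regardless of the input order."""
    if len(items) <= 1:
        return list(items)
    pivot = random.choice(items)           # random pivot choice
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)

print(randomized_quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

The random pivot means no single adversarial input can force the quadratic worst case on every run, which is exactly the computational advantage the text describes.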
Probability distribution
It is possible to characterize a random process from different points of view. One of
the most important characteristics is the probability distribution. The probability
distribution is a model that associates a probability with each observable modality
of a random variable.
The probability distribution can be either discrete or continuous, depending on whether the random variable is discrete or continuous. It is discrete if the phenomenon is observable with an integer number of modes. The throw of a die is a discrete statistical phenomenon because the number of observable modalities is equal to 6. The random variable can take only six values (1, 2, 3, 4, 5, and 6). Therefore, the probability distribution of the phenomenon is discrete. The probability distribution is continuous when the random variable assumes a continuous set of values; in this case, the statistical phenomenon can be observed with an infinite or very high number of modalities. The probability distribution of body temperature is continuous because it is a continuous statistical phenomenon, that is, the values of the random variable vary continuously.
Let’s now look at different kinds of probability distributions.
Uniform distribution
In many cases, processes characterized by a uniform distribution are considered and
used. This means that each element is as likely as any of the others to be selected if an
infinite number of extractions is performed. If you represent the elements and their
respective probabilities of being extracted on a graph, you get a rectangular graph
as follows:
Figure 2.2 – The probabilities of the elements
Since the probability is expressed as a real number between 0 and 1, where 0 represents
the impossible event and 1 the certain event, in a uniform distribution, each element will
have a 1/n probability of being selected, where n is the number of items. In this case, the sum of all the probabilities must equal 1, since, in an extraction, one of the elements is certainly chosen. A uniform distribution is typical of artificial random processes such as dice rolling, lotteries, and roulette, and it is also the most used in several applications.
Gaussian distribution
Another very common probability distribution is the Gaussian or normal distribution,
which has a typical bell shape. In this case, the values that are closer to the center of the curve are more likely to be extracted than those that are far away from the center. The following diagram shows a typical Gaussian distribution:
Figure 2.3 – Gaussian distribution
Gaussian distribution is important because it is typical of natural processes. For example,
it can represent the distribution of the noise in many electronic components, or it can
represent the distribution of errors in measurements. It is, therefore, used to simulate
statistical distributions in the fields of telecommunications or signal processing.
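For instance, Gaussian samples of this kind can be drawn with Python's standard library; a sketch with illustrative mean and standard deviation:

```python
import random

random.seed(42)
mu, sigma = 0.0, 1.0   # mean and standard deviation (illustrative values)
samples = [random.gauss(mu, sigma) for _ in range(10000)]

# The sample mean and variance should be close to mu and sigma^2.
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```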
Properties of random numbers
By random number, we refer to a random variable distributed in a uniform way between 0
and 1. The statistical properties that a sequence of random numbers must possess are
as follows:
• Uniformity
• Independence
Suppose you divide the interval, [0, 1], into n subintervals of equal amplitude. The consequence of the uniformity property is that if N observations of a random number are made, then the number of observations falling in each subinterval is approximately N/n. The consequence of the independence property is that the probability of obtaining a value in a particular range is independent of the values that were previously obtained.
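Both properties can be checked empirically; a sketch that divides [0, 1] into n subintervals and counts how many of N observations fall in each:

```python
import random

random.seed(0)
N, n = 100000, 10        # number of observations and of subintervals
counts = [0] * n
for _ in range(N):
    u = random.random()  # uniform value in [0, 1)
    counts[int(u * n)] += 1

# Each count should be close to the expected value N/n.
expected = N / n
print(counts)
print("max deviation from N/n:", max(abs(c - expected) for c in counts))
```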
The pseudorandom number generator
The generation of truly random sequences using deterministic algorithms is impossible: at most, pseudorandom sequences can be generated. These are apparently random sequences that are actually perfectly predictable and repeat themselves after a certain number of extractions. A pseudorandom number generator (PRNG) is an algorithm designed to output a sequence of values that appear to be generated randomly.
The pros and cons of a random number generator
A random number generation routine must be:
• Replicable
• Fast
• Free of large gaps between two generated numbers
• Equipped with a sufficiently long period
• Able to generate numbers with statistical properties that are as close as possible to ideal ones
The most common cons of random number generators are as follows:
• Numbers not uniformly distributed
• Discretization of the generated numbers
• Incorrect mean or variance
• Presence of cyclical variations
Random number generation algorithms
The first to deal with the random number generation problem was John von Neumann, in
1949. He proposed a method called middle-square. This method allows us to understand
some important characteristics of a random number generation process. To start, we need
to provide an input as the seed or a value that starts the sequence. This is necessary to be
able to generate different sequences each time. However, it is important to ensure that the
good behavior of the generator does not depend on which seed is used. From here, the
first flaw of the middle-square method appears, that is, when using the zero value as
a seed, only a sequence of zeros will be obtained.
Another shortcoming of this method is the repetitiveness of the sequences. As in all of the
other PRNGs that will be discussed, each value depends on the previous one, and, at most,
on the internal state variables of the generator. Since this is a limited number of digits, the
sequence can only be repeated from a certain point onward. The length of the sequence,
before it begins to repeat itself, is called the period. A long period is important because
many practical applications require a large amount of random data, and a repetitive
sequence might be less effective. In such cases, it is important that the choice of the seed
has no influence on the potential outcomes.
Another important aspect is the efficiency of the algorithm. The size of the output data
values ​​and internal state, and, therefore, the generator input (seed), are often intrinsic
features of the algorithm and remain constant. For this reason, the efficiency of a PRNG should be assessed not so much in terms of computational complexity, but in terms of the possibility of a fast and efficient implementation on the computing architectures available.
In fact, depending on the architecture you are working on, the choice of different PRNGs,
or different design parameters of a certain PRNG, can result in a faster implementation by
many orders of magnitude.
Linear congruential generator
One of the most common methods for generating random numbers is the Linear Congruential Generator (LCG). The theory on which it rests is simple to understand
and implement. It also has the advantage of being computationally light. The recursive
relationship underlying this technique is provided by the following equation:
x_{k+1} = (a * x_k + c) mod m
Here, we can observe the following:
• a is the multiplier (a non-negative integer)
• c is the increment (a non-negative integer)
• m is the modulus (a non-negative integer)
• x0 is the initial value (the seed, a non-negative integer)
The modulo function, denoted by mod, results in the remainder of the Euclidean division
of the first number by the second. For example, 18 mod 4 gives 2 as it is the remainder of
the Euclidean division between the two numbers.
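The worked example can be confirmed in code. The following minimal snippet shows that Python's built-in % operator and NumPy's np.mod (the function used throughout this chapter) compute the same remainder:

```python
import numpy as np

# 18 mod 4: the remainder of the Euclidean division of 18 by 4
print(18 % 4)         # prints 2 (Python's built-in modulo operator)
print(np.mod(18, 4))  # prints 2 (NumPy's equivalent, used in the generators below)
```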
The linear congruence technique has the following characteristics:
• It is cyclical, with a period that is at most equal to m
• The generated numbers are discretized
To use this technique effectively, it is necessary to choose very large values of m. As an
example, let's set small parameter values and generate the first 16 pseudorandom values.
Here is the Python code that generates that sequence of numbers:
import numpy as np
a = 2
c = 4
m = 5
x = 3
for i in range(1, 17):
    x = np.mod((a*x + c), m)
    print(x)
The following results are returned:
0
4
2
3
0
4
2
3
0
4
2
3
0
4
2
3
In this case, the period is equal to 4. It is easy to verify that at most m distinct
integers, Xn, can be generated in the interval [0, m – 1]. If c = 0, the generator is called
multiplicative. Let's analyze the Python code line by line. The first line was used to import
the library:
the library:
import numpy as np
numpy is a library of additional scientific functions for the Python language, designed
to perform operations on vectors and multidimensional matrices. numpy allows you to
work with vectors and matrices more efficiently and faster than you can with lists and lists
of lists (matrices). In addition, it contains an extensive library of high-level mathematical
functions that can operate on these arrays.
After importing the numpy library, we set the parameters that will allow us to generate
random numbers using LCG:
a = 2
c = 4
m = 5
x = 3
At this point, we can use the LCG formula to generate random numbers. We only generate
the first 16 numbers, but we will see from the results that these are enough to understand
the algorithm. To do this, we use a for loop:
for i in range(1, 17):
    x = np.mod((a*x + c), m)
    print(x)
To generate random numbers according to the LCG formula, we have used the
np.mod() function. This function returns the remainder of a division when given
a dividend and divisor.
Random numbers with uniform distribution
A sequence of numbers uniformly distributed between [0, 1] can be obtained using the
following formula:
U_n = X_n / m
The obtained sequence is periodic, with a period less than or equal to m. If the period is
exactly m, the generator is said to have a full period. This occurs when the following
conditions are all true:
• m and c are relatively prime
• If m is divisible by a prime number, b, then a – 1 must also be divisible by b
• If m is divisible by 4, then a – 1 must also be divisible by 4
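These full-period conditions (the Hull–Dobell theorem) can be checked mechanically. The following sketch defines two illustrative helpers of our own, prime_factors and full_period, that test all three conditions for given parameters a, c, and m:

```python
from math import gcd

def prime_factors(n):
    """Return the set of prime factors of n."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def full_period(a, c, m):
    """Check the three full-period conditions for an LCG."""
    if gcd(c, m) != 1:                  # m and c must be relatively prime
        return False
    if any((a - 1) % p != 0 for p in prime_factors(m)):
        return False                    # a - 1 divisible by every prime factor of m
    if m % 4 == 0 and (a - 1) % 4 != 0:
        return False                    # if 4 divides m, 4 must divide a - 1
    return True

# The toy example above: 5 divides m but not a - 1, so no full period
print(full_period(2, 4, 5))   # prints False
# A parameter set that satisfies all three conditions
print(full_period(5, 3, 16))  # prints True
```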
Important note
By choosing a large value of m, you can reduce both the phenomenon of
periodicity and the problem of generating rational numbers.
Furthermore, it is not necessary for simulation purposes that all numbers in [0, 1]
are generated, because these are infinite in number. However, it is necessary that as many
numbers as possible within the range have the same probability of being generated.
Generally, a value of m ≥ 10^9 is used so that the generated numbers constitute a dense
subset of the interval [0, 1].
An example of a multiplicative generator that is widely used on 32-bit computers is the
Learmonth-Lewis generator. This is a generator in which the parameters assume the
following values:
• a = 7^5 = 16807
• c = 0
• m = 2^31 – 1
Let’s analyze the code that generates the first 100 random numbers according to this
method:
import numpy as np
a = 7**5
c = 0
m = 2**31 - 1
x = 0.1
for i in range(1, 101):
    x = np.mod((a*x + c), m)
    u = x/m
    print(u)
The code we have just seen was analyzed line by line in the Linear congruential generator
section of this chapter. The difference, in addition to the values of the parameters, lies
in the generation of a uniform distribution in the range of [0, 1] through the following
command:
u = x/m
The following results are returned:
Figure 2.4 – LCG output
Since we are dealing with random numbers, the output will be different from the
previous one.
A comparison between the different generators must be made based on the analysis
of the periodicity, the goodness of the uniformity of the numbers generated, and the
computational simplicity. This is because the generation of very large numbers can lead
to the use of expensive computer resources. Also, if the Xn numbers get too big, they are
truncated, which can cause a loss of the desired uniformity statistics.
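The periodicity comparison described above can be made concrete with a small helper that iterates an LCG until a state repeats and returns the cycle length; the function name measure_period is our own illustrative choice, not part of any library:

```python
def measure_period(a, c, m, seed):
    """Iterate x -> (a*x + c) mod m until a state repeats; return the cycle length."""
    seen = {}
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]  # length of the cycle the orbit falls into

# The toy generator from this section: period 4, out of at most m = 5 states
print(measure_period(2, 4, 5, 3))   # prints 4
# A generator satisfying the full-period conditions: period equal to m = 16
print(measure_period(5, 3, 16, 0))  # prints 16
```

For realistic moduli such as 2^31 – 1, exhaustive cycle detection like this becomes expensive, which is why full-period conditions that can be verified analytically are so useful.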
Lagged Fibonacci generator
The lagged Fibonacci algorithm for generating pseudorandom numbers arises from the
attempt to generalize the method of linear congruences. One of the reasons that led to the
search for new generators was the need, useful for many applications especially in parallel
computing, to lengthen the generator period. The period of a linear generator, when m is
approximately 10^9, is enough for many applications, but not for all of them.
One of the techniques developed is to make X_{n+1} depend on the two previous values,
X_n and X_{n-1}, instead of only on X_n, as is the case in the LCG method. In this case, the
period can approach m^2, because the sequence will not repeat itself until the following
equality is obtained:
(X_{n+λ}, X_{n+λ+1}) = (X_n, X_{n+1})
The simplest generator of this type is the Fibonacci sequence represented by the following
equation:
X_{n+1} = (X_n + X_{n-1}) mod m
This generator was first analyzed in the 1950s and provides a period of about m, but the
sequence does not pass even the simplest statistical tests. Attempts were then made to
improve the sequence using the following equation:
X_{n+1} = (X_n + X_{n-k}) mod m
This sequence, although better than the Fibonacci sequence, does not return satisfactory
results. We had to wait until 1958, when Mitchell and Moore proposed the following
sequence:
X_n = (X_{n-24} + X_{n-55}) mod m,  n ≥ 55
Here, m is even and X_0, …, X_54 are arbitrary integers that are not all even. The constants
24 and 55 are not chosen at random: they define a sequence whose least significant
bits (X_n mod 2) have a period of length 2^55 – 1. Therefore, the sequence (X_n) must
have a period of length at least 2^55 – 1. The sequence has a period of 2^{M-1}(2^55 – 1),
where m = 2^M.
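The Mitchell and Moore recurrence can be sketched with a double-ended queue holding the last 55 values; the function name mitchell_moore and the seed values (3i + 1, arbitrary integers that are not all even) are our own illustrative choices:

```python
from collections import deque

def mitchell_moore(n, m=2**32):
    """Generate n values of X_n = (X_{n-24} + X_{n-55}) mod m, for n >= 55."""
    # Illustrative seed: 55 arbitrary integers, at least one of them odd
    state = deque((3 * i + 1 for i in range(55)), maxlen=55)
    out = []
    for _ in range(n):
        # state[0] is X_{n-55} and state[-24] is X_{n-24}
        x = (state[-24] + state[0]) % m
        state.append(x)  # maxlen=55 drops the oldest value automatically
        out.append(x)
    return out

print(mitchell_moore(5))
```

Using a deque with maxlen=55 keeps only the 55 values the recurrence actually needs, so the memory cost stays constant no matter how many numbers are generated.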
Numbers 24 and 55 are commonly called lags and the sequence (Xn) is called a Lagged
Fibonacci Generator (LFG). The LFG sequence can be generalized with the following
equation:
X_n = (X_{n-l} ⊗ X_{n-k}) mod 2^M,  l > k > 0
Here, ⊗ refers to any of the following operations: +, −, ×, or ⊕ (exclusive or).
Only some pairs (k, l) give sufficiently long periods; in these cases, the period is
2^{M-1}(2^l – 1). The pairs (k, l) must be chosen appropriately. The only condition on the first l
values is that at least one of them must be odd; otherwise, the sequence will be composed
of even numbers.
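The requirement that at least one initial value be odd is easy to demonstrate for the additive case: the sum of two even numbers is always even, so all-even seeds trap the sequence in the even residues. The sketch below uses a hypothetical helper, additive_lfg, with a small illustrative pair (k, l) = (2, 5) chosen to show the parity effect, not for period quality:

```python
def additive_lfg(seeds, k, l, n, m=2**8):
    """Additive lagged Fibonacci: X_n = (X_{n-k} + X_{n-l}) mod m, with l > k > 0."""
    assert len(seeds) == l
    state = list(seeds)  # the first l values X_0 ... X_{l-1}
    for _ in range(n):
        state.append((state[-k] + state[-l]) % m)
    return state[l:]     # only the generated values

# All-even seeds: every output stays even
even_run = additive_lfg([2, 4, 6, 8, 10], k=2, l=5, n=50)
print(all(x % 2 == 0 for x in even_run))   # prints True

# A single odd seed is enough for odd values to appear
mixed_run = additive_lfg([2, 4, 6, 8, 11], k=2, l=5, n=50)
print(any(x % 2 == 1 for x in mixed_run))  # prints True
```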
Let's look at how to implement a simple example of an additive LFG in Python using the
following parameters: x0 = x1 = 1 and m = 2^32. Here is the code to generate the first 100
random numbers:
import numpy as np
x0 = 1
x1 = 1
m = 2**32
for i in range(1, 101):
    x = np.mod((x0 + x1), m)
    x0 = x1
    x1 = x
    print(x)
Let’s analyze the Python code line by line. The first line was used to import the library:
import numpy as np
After importing the numpy library, we set the parameters that will allow us to generate
random numbers using LFG:
x0 = 1
x1 = 1
m = 2**32
At this point, we can use the LFG formula to generate random numbers. We only generate
the first 100 numbers. To do this, we use a for loop:
for i in range(1, 101):
    x = np.mod((x0 + x1), m)
    x0 = x1
    x1 = x
    print(x)
To generate random numbers according to the LFG formula, we use the np.mod()
function. This function returns the remainder of a division when given a dividend and
divisor. After generating the random number, the two previous values are updated as
follows:
x0 = x1
x1 = x
The following random numbers are printed:
Figure 2.5 – Table of random numbers using LFG
The initialization of an LFG is particularly delicate, and the results of this method are
very sensitive to the initial conditions. If extreme care is not taken when choosing the
initial values, statistical defects may occur in the output sequence. These defects may
show up immediately in the first values, or they may surface periodically in later parts of
the sequence. Another potential problem with LFGs is that the mathematical theory
behind the method is incomplete, making it necessary to rely on statistical tests rather
than on theoretical guarantees.
Testing uniform distribution
Test adaptation (that is, the goodness of fit) in general, has the purpose of verifying
whether a variable under examination does or does not have a certain hypothesized
distribution on the basis, as usual, of experimental data. It is used to compare a set of
frequencies observed in a sample, with similar theoretical quantities assumed for the
population. By means of the test, it is possible to quantitatively measure the degree of
deviation between the two sets of values.
The results obtained from samples do not always agree exactly with the theoretical results
expected according to the rules of probability; indeed, it is very rare for this to occur. For
example, although theoretical considerations lead us to expect 100 heads and 100 tails
from 200 flips of a coin, it is rare that these exact counts are obtained. Despite this, we
must not hastily conclude that the coin is rigged.
The chi…
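Anticipating the chi-square test, the coin example can be made concrete by computing the chi-square statistic, the sum of (O − E)^2 / E over all categories, directly with NumPy. The observed counts 108/92 below are invented purely for illustration:

```python
import numpy as np

observed = np.array([108, 92])   # hypothetical heads/tails counts from 200 flips
expected = np.array([100, 100])  # counts predicted for a fair coin

# Chi-square statistic: sum over categories of (observed - expected)^2 / expected
chi2 = np.sum((observed - expected) ** 2 / expected)
print(chi2)  # prints 1.28
```

With one degree of freedom, the 5% critical value is about 3.84, so a statistic of 1.28 gives no reason to reject the hypothesis that the coin is fair.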