UON Programming Weather Forecast and Gambler Ruin Problem Project

Description

4 attachmentsSlide 1 of 4attachment_1attachment_1attachment_2attachment_2attachment_3attachment_3attachment_4attachment_4

Unformatted Attachment Preview

Assignment
Description of the assignment:
Some of the questions require exploring in different sources and conversation among them. After
collecting all the answers and explanations, you need to collect a unified summary and provide a
report about your observations about the companies and organizations using operations
management techniques and strategies. Only one copy by representative is required. Please submit
one WORD copy for answering the questions and one .py file containing all Python codes. No
page limit or format is imposed. Try your best to submit a professional and comprehensive report.
Problem 1: Weather Forecast:
Suppose that whether or not it rains today depends on previous weather conditions through the last
two days. Specifically, suppose that if it has rained for the past two days, then it will rain tomorrow
with probability 0.7; if it rained today but not yesterday, then it will rain tomorrow with probability
0.5; if it rained yesterday but not today, then it will rain tomorrow with probability 0.4; if it has
not rained in the past two days, then it will rain tomorrow with probability 0.2.
If we let the state at time n depend only on whether or not it is raining at time n, then the preceding
model is not a Markov chain (why not?). However, we can transform this model into a Markov
chain by saying that the state at any time is determined by the weather conditions during both that
day and the previous day. In other words, we can say that the process is in



State 0
if it rained both today and yesterday,
State 1
if it rained today but not yesterday,
State 2
if it rained yesterday but not today, • State 3
yesterday or today.
if it did not rain either
The preceding would then represent a four-state Markov chain having a transition probability
matrix
You should carefully check the matrix P, and make sure you understand how it was obtained.
Answer the following questions:
a)
b)
c)
d)
Calculate the proportion of days that it rains.
If it rains today, what is the probability that a it also rains tomorrow?
If it rains on Monday, what is the probability that it is sunny on Wednesday?
If it rains on Monday, what is the chance it also rains on three consecutive days (Tuesday,
Wednesday, and Thursday)?
Discussion Questions:
e) If you are a weather forecast manager, do you think the Markov Chains method alone is
enough in predicting the weather? What other factors should have been taken into account
for better predictions? How do you think the probability data in the matrix above (P) can
be obtained?
f) Provide some variations to extend this problem and create three additional questions for
the variation you just introduced. You do not need to answer the questions at this time.
Problem 2: Random Walk:
Consider a Markov chain whose state space consists of the integers i = 0, ±1, ±2, . . ., and has
transition probabilities given by
Pi,i+1 = p = 1 − Pi,i−1,
i = 0, ±1, ±2,…
where 0 < p < 1. In other words, on each transition the process either moves one step to the right (with probability p) or one step to the left (with probability 1 − p). One colorful interpretation of this process is that it represents the wanderings of a drunken man as he walks along a straight line. Another is that it represents the winnings of a gambler who on each play of the game either wins or loses one dollar. a) consider a gambler who starts with $100 who receives $1 with p = 0.4 each time he plays the game and losses $1 with probability 1 – p = 0.6. assuming that p = 0.4. Using a simulation model (in Python), calculate the probability of coming back to $100 at least once after playing the game 10 times. b) Repeat part b) using p = 0.5. c) What is the probability that the gambler gets ruined (losing all $100) after 1000 games? d) What is the probability that the gambler gets ruined (losing all $100) at all? Discussion Questions: e) Provide one more example of random walk in the real world. Add some variations to extending this problem and, accordingly, create three additional questions for the problem variation. You do not need to answer the questions at this time. Hands-On Simulation Modeling with Python Develop simulation models to get accurate results and enhance decision-making processes Giuseppe Ciaburro BIRMINGHAM—MUMBAI Hands-On Simulation Modeling with Python Copyright © 2020 Packt Publishing All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews. Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book. Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information. Commissioning Editor: Sunith Shetty Acquisition Editor: Devika Battike Senior Editor: David Sugarman Content Development Editor: Joseph Sunil Technical Editor: Sonam Pandey Copy Editor: Safis Editing Project Coordinator: Aishwarya Mohan Proofreader: Safis Editing Indexer: Manju Arasan Production Designer: Joshua Misquitta First published: July 2020 Production reference: 1160720 Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK. ISBN 978-1-83898-509-7 www.packt.com Packt.com Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website. Why subscribe? • Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals • Improve your learning with Skill Plans built especially for you • Get a free eBook or video every month • Fully searchable for easy access to vital information • Copy and paste, print, and bookmark content Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details. At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks. Contributors About the author Giuseppe Ciaburro holds a PhD in environmental technical physics, along with two master’s degrees. His research was focused on machine learning applications in the study of urban sound environments. He works at the Built Environment Control Laboratory at the Università degli Studi della Campania Luigi Vanvitelli, Italy. He has over 18 years’ professional experience in programming (Python, R, and MATLAB), first in the field of combustion, and then in acoustics and noise control. He has several publications to his credit. About the reviewers Greg Walters has been involved with computers and computer programming since 1972. He is well versed in Visual Basic, Visual Basic.NET, Python, and SQL, and is an accomplished user of MySQL, SQLite, Microsoft SQL Server, Oracle, C++, Delphi, Modula-2, Pascal, C, 80x86 Assembler, COBOL, and Fortran. He is a programming trainer and has trained numerous individuals in many pieces of computer software, including MySQL, Open Database Connectivity, Quattro Pro, Corel Draw!, Paradox, Microsoft Word, Excel, DOS, Windows 3.11, Windows for Workgroups, Windows 95, Windows NT, Windows 2000, Windows XP, and Linux. He is currently retired and, in his spare time, is a musician and loves to cook. He is also open to working as a freelancer on various projects. Yoon Hyup Hwang is a seasoned data scientist in the marketing and finance industries and is the author of two applied machine learning books. He has almost a decade of experience building numerous machine learning models and data science products. He holds an M.S.E. in Computer and Information Technology from the University of Pennsylvania and a B.A. in Economics from the University of Chicago. He enjoys training various martial arts, snowboarding, and roasting coffee. He currently lives in the Greater New York Area with his artist wife, Sunyoung, and a playful dog, Dali (named after Salvador Dali). Packt is searching for authors like you If you’re interested in becoming an author for Packt, please visit authors. packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea. Table of Contents Preface Section 1: Getting Started with Numerical Simulation 1 Introducing Simulation Models Introducing simulation models 4 Decision-making workflow Comparing modeling and simulation Pros and cons of simulation modeling Simulation modeling terminology 5 6 6 7 Classifying simulation models 9 Comparing static and dynamic models 9 Comparing deterministic and stochastic models 9 Comparing continuous and discrete models10 Approaching a simulationbased problem 10 Problem analysis Data collection Setting up the simulation model Simulation software selection Verification of the software solution Validation of the simulation model Simulation and analysis of results 11 11 11 13 14 15 16 Dynamical systems modeling 16 Managing workshop machinery Simple harmonic oscillator Predator-prey model 17 18 20 Summary22 2 Understanding Randomness and Random Numbers Technical requirements Stochastic processes 24 24 Types of stochastic process 25 Examples of stochastic processes The Bernoulli process Random walk 26 26 27 The Poisson process 29 Random number simulation 30 Probability distribution Properties of random numbers 31 32 The pseudorandom number generator33 The pros and cons of a random number generator 33 Random number generation algorithms 34 Linear congruential generator 34 Random numbers with uniform distribution37 Lagged Fibonacci generator 39 Testing uniform distribution 42 The chi-squared test Uniformity test 42 45 Exploring generic methods for random distributions 51 The inverse transform sampling method 51 The acceptance-rejection method 52 Random number generation using Python 53 Introducing the random module The random.random() function The random.seed() function The random.uniform() function The random.randint() function The random.choice() function The random.sample() function Generating real-valued distributions 54 54 55 56 56 57 58 58 Summary 59 3 Probability and Data Generation Processes Technical requirements 62 Explaining probability concepts 62 Exploring probability distributions69 Understanding Bayes’ theorem 66 Probability density function Mean and variance Uniform distribution Binomial distribution Normal distribution Compound probability Bayes’ theorem Summary83 Types of events Calculating probability Probability definition with an example 62 63 63 66 68 Section 2: Simulation Modeling Algorithms and Techniques 70 71 72 76 79 4 Exploring Monte Carlo Simulations Technical requirements 88 Introducing Monte Carlo simulation88 Monte Carlo components First Monte Carlo application Monte Carlo applications Applying the Monte Carlo method for Pi estimation 89 89 90 91 Understanding the central limit theorem96 Law of large numbers Central limit theorem 96 97 simulation101 Generating probability distributions Numerical optimization Project management Performing numerical integration using Monte Carlo Defining the problem Numerical solution Min-max detection Monte Carlo method Visual representation 101 102 103 104 104 106 108 109 111 Summary113 Applying Monte Carlo 5 Simulation-Based Markov Decision Processes Technical requirements 116 Overview of Markov processes 116 The Bellman equation explained138 The agent-environment interface Exploring MDPs Understanding the discounted cumulative reward Comparing exploration and exploitation concepts 117 119 Multi-agent simulation 140 Summary142 Introducing Markov chains 124 Transition matrix Transition diagram Markov chain applications 122 123 125 126 127 Introducing random walks 127 Simulating a one-dimensional random walk129 Simulating a weather forecast 132 Dynamic programming concepts Principle of optimality The Bellman equation 139 139 140 6 Resampling Methods Technical requirements Introducing resampling methods Sampling concepts overview Reasoning about sampling Pros and cons of sampling Probability sampling How sampling works 144 144 145 146 146 147 147 Exploring the Jackknife technique148 Defining the Jackknife method 148 Estimating the coefficient of variation 150 Applying Jackknife resampling using Python 151 Demystifying bootstrapping 156 Introducing bootstrapping Bootstrap definition problem Bootstrap resampling using Python Comparing Jackknife and bootstrap 156 157 158 161 Explaining permutation tests 162 Approaching cross-validation techniques163 The validation set approach Leave-one-out cross validation K-fold cross validation Cross-validation using Python 163 164 165 165 Summary168 7 Using Simulation to Improve and Optimize Systems Technical requirements Introducing numerical optimization techniques 170 170 Defining an optimization problem 171 Explaining local optimality 173 Defining the descent methods 174 Approaching the gradient descent algorithm174 Understanding the learning rate 177 Explaining the trial and error method 178 Implementing gradient descent in Python178 Facing the Newton-Raphson method183 Using the Newton-Raphson algorithm for root-finding 183 Approaching Newton-Raphson for numerical optimization Applying the Newton-Raphson technique 184 185 Deepening our knowledge of stochastic gradient descent 189 Discovering the multivariate optimization methods in Python191 The Nelder–Mead method 191 Powell's conjugate direction algorithm 195 Summarizing other optimization methodologies197 Summary198 Section 3: Real-World Applications 8 Using Simulation Models for Financial Engineering Technical requirements Understanding the geometric Brownian motion model 202 202 Defining a standard Brownian motion 203 Addressing the Wiener process as random walk 204 Implementing a standard Brownian motion 205 Handling the stock price trend as time series213 Introducing the Black-Scholes model 215 Applying Monte Carlo simulation 216 Studying risk models for portfolio management 220 Using Monte Carlo methods for stock price prediction 207 Using variance as a risk measure 221 Introducing the value-at-risk metric 221 Estimating the VaR for some NASDAQ assets223 Exploring the Amazon stock price trend 208 Summary232 9 Simulating Physical Phenomena Using Neural Networks Technical requirements 234 Introducing the basics of neural networks 234 Understanding biological neural networks Exploring ANNs Splitting the data 256 Explaining multiple linear regression 258 Understanding a multilayer perceptron regressor model 260 235 236 Exploring deep neural networks264 Understanding feedforward neural networks 242 Exploring neural network training 243 264 265 266 Getting familiar with convolutional neural networks Examining recurrent neural networks Analyzing LSTM networks Summary267 Simulating airfoil self-noise using ANNs 244 Importing data using pandas Scaling the data using sklearn Viewing the data using matplotlib 246 249 252 10 Modeling and Simulation for Project Management Technical requirements 270 Introducing project management270 Addressing management problems using MDPtoolbox Changing the probability of fire 278 282 Understanding what-if analysis Scheduling project time using Monte Carlo simulation 284 271 Managing a tiny forest problem272 Summarizing the Markov decision process272 Exploring the optimization process 273 Introducing MDPtoolbox 275 Defining the tiny forest management example 275 Defining the scheduling grid 285 Estimating the task's time 286 Developing an algorithm for project scheduling287 Exploring triangular distribution 288 Summary294 11 What's Next? Summarizing simulation modeling concepts 295 Generating random numbers 296 Applying Monte Carlo methods 298 Addressing the Markov decision process299 Analyzing resampling methods 300 Exploring numerical optimization techniques302 Using artificial neural networks for simulation303 Applying simulation model to real life 304 Other Books You May Enjoy Leave a review - let other readers know what you think 317 Modeling in healthcare Modeling in financial applications Modeling physical phenomenon Modeling public transportation Modeling human behavior 304 305 306 307 308 Next steps for simulation modeling309 Increasing the computational power 309 Machine learning-based models 311 Automated generation of simulation models312 Summary313 Preface Simulation modeling helps you to create digital prototypes of physical models to analyze how they work and predict their performance in the real world. With this comprehensive guide, you’ll learn about various computational statistical simulations using Python. Starting with the fundamentals of simulation modeling, you’ll learn about concepts such as randomness and explore data generating processes, resampling methods, and bootstrapping techniques. You’ll then cover key algorithms such as Monte Carlo simulations and the Markov Decision Process, which are used to develop numerical simulation models, and discover how they can be used to solve real-world problems. As you make progress, you’ll develop simulation models to help you get accurate results and enhance decision-making processes. Using optimization techniques, you’ll learn to modify the performance of a model to improve results and make optimal use of resources. The book will guide you through creating a digital prototype using practical use cases for financial engineering, prototyping project management to improve planning, and simulating physical phenomena using neural networks. By the end of this book, you’ll be able to construct and deploy simulation models of your own to solve real-world challenges. Who this book is for Hands-On Simulation Modeling with Python is for simulation developers and engineers, model designers, and anyone already familiar with the basic computational methods that are used to study the behavior of systems. This book will help you explore advanced simulation techniques such as Monte Carlo methods, statistical simulations, and much more using Python. Working knowledge of the Python programming language is required. What this book covers Chapter 1, Introduction, analyzes the basics of numerical simulation and highlights the difference between modeling and simulation and the strengths of simulation models such as defects. The different types of models are analyzed, and we study practical modeling cases to understand how to elaborate a model starting from the initial considerations. viii Preface Chapter 2, Understanding Randomness and Random Numbers, defines stochastic processes and explains the importance of using them to address numerous real-world problems. The main methods for generating random numbers with practical examples in Python code, and the generation of uniform and generic distributions, are both explored. It also explains how to perform a uniformity test using the chi-square method. Chapter 3, Probability and the Data Generating Process, shows how to distinguish between the different definitions of probabilities and how they can be integrated to obtain useful information in the simulation of real phenomena. Chapter 4, Monte Carlo Simulations, explores techniques based on Monte Carlo methods for process simulation. We will first learn the basic concepts, and then we will see how to apply them to practical cases. Chapter 5, Simulation-Based Markov Decision Process, shows how to deal with decisionmaking processes with Markov chains. We will analyze the concepts underlying Markovian processes and then analyze some practical applications to learn how to choose the right actions for the transition between different states of the system. Chapter 6, Resampling Methods, shows how to apply resampling methods to approximate some characteristics of the distribution of a sample in order to validate a statistical model. We will analyze the basics of the most common resampling methods and learn how to use them by solving some practical problems. Chapter 7, Use of Simulation to Improve and Optimize Systems, shows how to use the main optimization techniques to improve the performance of our simulation models. We will see how to use the gradient descent technique, the Newton-Raphson method, and stochastic gradient descent. We will also see how to apply these techniques with practical examples. Chapter 8, Simulation Models for Financial Engineering, shows practical cases of using simulation methods in a financial context. We will learn how to use Monte Carlo methods to predict stock prices and how to assess the risk associated with a portfolio of shares. Chapter 9, Simulating Physical Phenomena with Neural Networks, shows how to develop models based on artificial neural networks to simulate physical phenomena. We will start by exploring the basic concepts of neural networks, and we will examine their architecture and its main elements. We will see how to train a network to update its weights. Preface ix Chapter 10, Modeling and Simulation for Project Management, deals with practical cases of project management using the tools we learned how to use in the previous chapters. We will see how to evaluate in advance the results of the actions undertaken in the management of a forest using Markov processes, and then move on to evaluating the time required for the execution of a project using the Monte Carlo simulation. Chapter 11, What’s Next?, provides a better understanding of the problems associated with building and deploying simulation models and additional resources and technologies to learn how to hone your machine learning skills. To get the most out of this book Working knowledge of Python programming language is required. If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code. Download the example code files You can download the example code files for this book from your account at www. packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/ support and register to have the files emailed directly to you. You can download the code files by following these steps: 1. Log in or register at www.packt.com. 2. Select the Support tab. 3. Click on Code Downloads. 4. Enter the name of the book in the Search box and follow the onscreen instructions. x Preface Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of: • WinRAR/7-Zip for Windows • Zipeg/iZip/UnRarX for Mac • 7-Zip/PeaZip for Linux The code bundle for the book is also hosted on GitHub at https://github.com/ PacktPublishing/Hands-On-Simulation-Modeling-with-Python. In case there’s an update to the code, it will be updated on the existing GitHub repository. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out! Download the color images We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/ files/downloads/9781838985097_ColorImages.pdf. Conventions used There are a number of text conventions used throughout this book. Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system.” A block of code is set as follows: import random import statistics import matplotlib.pyplot as plt When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold: import random import statistics import matplotlib.pyplot as plt Preface xi Any command-line input or output is written as follows: $ python jakknife_estimator.py Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: “Select System info from the Administration panel.” Tips or important notes Appear like this. Get in touch Feedback from our readers is always welcome. General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected] Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, selecting your book, clicking on the Errata Submission Form link, and entering the details. Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material. If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors. packtpub.com. Reviews Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you! For more information about Packt, please visit packt.com. Section 1: Getting Started with Numerical Simulation In this section, the basic concepts of simulation modeling are addressed. This section helps you to understand the fundamental concepts and elements of numerical simulation. This section contains the following chapters: Chapter 1, Introducing Simulation Models Chapter 2, Understanding Randomness and Random Numbers Chapter 3, Probability and Data Generating Processes 1 Introducing Simulation Models A simulation model is a tool capable of processing information and data and predicting the responses of a real system to certain inputs, thus becoming an effective support for analysis, performance evaluation, and decision-making processes. The term simulation refers to reproducing the behavior of a system. In general, we speak of simulation both in the case in which a concrete model is used and in the case in which an abstract model is used that reproduces reality using a computer. An example of a concrete model is a scale model of an airplane that is then placed in a wind tunnel to carry out simulated tests to estimate suitable performance measures. Although, over the years, physicists have developed theoretical laws that we can use to obtain information on the performance of dynamic systems, often, the application of these laws to a real case takes too long. In these cases, it is convenient to construct a numerical simulation model that allows us to simulate the behavior of the system under certain conditions. This elaborated model will allow us to test the functionality of the system in a simple and immediate way, saving considerable resources in terms of time and money. 4 Introducing Simulation Models In this chapter, we're going to cover the following main topics: • Introducing simulation models • Classifying simulation models • Approaching a simulation-based problem • Dynamical systems modeling Important Note In this chapter, an introduction to simulation techniques will be discussed. In order to deal with the topics at hand, it is necessary that you have a basic knowledge of algebra and mathematical modeling. Introducing simulation models Simulation uses abstract models built to replicate the characteristics of a system. The operation of a system is simulated using probability distributions to randomly generate system events, and statistical observations are obtained from the simulated system. It plays a very important role, especially in the design of a stochastic system and in the definition of its operating procedures. By not working directly on the real system, many scenarios can be simulated simply by changing the input parameters, thus limiting the costs that would occur if this solution were not used and, ultimately, reducing the time it would take. In this way, it is possible to quickly try alternative policies and design choices and model systems of great complexity by studying their behavior and evolution over time. Important Note Simulation is used when working on real systems is not convenient due to high costs, technical impossibility, and the non-existence of a real system. Simulation allows you to predict what happens to the real system if certain inputs are used. Changing these input parameters simulates different scenarios that allow us to identify the most convenient one from various points of view. Introducing simulation models 5 Decision-making workflow In a decision-making process, the starting point is identifying the problematic context that requires a change and therefore a decision. The context that's identified is then analyzed in order to highlight what needs to be studied for the decisions that need to be made; that is, those elements that seem the most relevant are chosen, the relationships that connect them are highlighted, and the objectives to be achieved are defined. At this point, a formal model is constructed, which allows us to simulate the identified system in order to understand its behavior and to arrive at identifying the decisions to be made. The following diagram describes the workflow that allows us to make a decision, starting from observing the problematic context: Figure 1.1 – Decision-making workflow This represents a way of spreading knowledge and involves various actors. Constructing a model is a two-way process: • Definition of conceptual models • Continuous interaction between the model and reality by comparison In addition, learning also has a participatory characteristic: it proceeds through the involvement of different actors. The models also allow you to analyze and propose organized actions so that you can modify the current situation and produce the desired solution. 6 Introducing Simulation Models Comparing modeling and simulation To start, we will clarify the differences between modeling and simulation. A model is a representation of a physical system, while simulation is the process of seeing how a model-based system would work under certain conditions. Modeling is a design methodology that is based on producing a model that implements a system and represents its functionality. In this way, it is possible to predict the behavior of a system and the effects of the variations or modifications that are made on it. Even if the model is a simplified representation of the system, it must still be close enough to the functional nature of the real system, but without becoming too complex and difficult to handle. Important Note Simulation is the process that puts the model into operation and allows you to evaluate its behavior under certain conditions. Simulation is a fundamental tool for modeling because, without necessarily resorting to physical prototyping, the developer can verify the functionality of the modeled system with the project specifications. Simulation allows us to study the system through a wide spectrum of conditions so that we can understand how representative the model is of the system that it refers to. Pros and cons of simulation modeling Simulation is a tool that's widely used in a variety of fields, from operational research to the application industry. This technique can be made successful by it overcoming the difficulties that each complex procedure contains. The following are the pros and cons of simulation modeling. Let's start with the concrete advantages that can be obtained from the use of simulation models (pros): • It reproduces the behavior of a system in reference to situations that cannot be directly experienced. • It represents real systems, even complex ones, while also considering the sources of uncertainty. • It requires limited resources in terms of data. • It allows experimentation in limited time frames. • The models that are obtained are easily demonstrable. Introducing simulation models 7 As anticipated, since it is a technique capable of reproducing complex scenarios, it has some limitations (cons): • The simulation provides indications of the behavior of the system, but not exact results. • The analysis of the output of a simulation could be complex and it could be difficult to identify which may be the best configuration. • The implementation of a simulation model could be laborious and, moreover, it may take long calculation times to carry out a significant simulation. • The results that are returned by the simulation depend on the quality of the input data: it cannot provide accurate results in the case of inaccurate input data. • The complexity of the simulation model depends on the complexity of the system it intends to reproduce. Nevertheless, simulation models represent the best solution for the analysis of complex scenarios. Simulation modeling terminology In this section, we will analyze the elements that make up a model and those that characterize a simulation process. We will give a brief description of each so that you understand their meaning and the role they play in the numerical simulation process. System The context of an investigation is represented through a system; that is, the set of elements that interact with each other. The main problem linked to this element concerns the system boundaries, that is, which elements of reality must be inserted in the system that represents it and which are left out and the relationships that exist between them. State variables A system is described in each instant of time by a set of variables. These are called state variables. For example, in the case of a weather system, the temperature is a state variable. In discrete systems, the variables change instantly at precise moments of time that are finite. In continuous systems, the variables vary in terms of continuity with respect to time. 8 Introducing Simulation Models Events An event is defined as any instantaneous event that causes the value of at least one of the status variables to change. The arrival of a blizzard for a weather system is an event, as it causes the temperature to drop suddenly. There are both external events and internal events. Parameters Parameters represent essential terms when building a model. They are adjusted during the model simulation process to ensure that the results are brought into the necessary convergence margins. They can be modified iteratively through sensitivity analysis or in the model calibration phase. Calibration Calibration represents the process by which the parameters of the model are adjusted in order to adapt the results to the data observed in the best possible way. When calibrating the model, we try to obtain the best possible accuracy. A good calibration requires eliminating, or minimizing, errors in data collection and choosing a theoretical model that is the best possible description of reality. The choice of model parameters is decisive and must be done in such a way as to minimize the deviation of its results when applied to historical data. Accuracy Accuracy is the degree of correspondence of the simulation result that can be inferred from a series of calculated values with the actual data, that is, the difference between the average modeled value and the true or reference value. Accuracy, when calculated, provides a quantitative estimate of the quality expected from a forecast. Several indicators are available to measure accuracy. The most used are mean absolute error (MAE), mean absolute percentage error (MAPE), and mean squared error (MSE). Sensitivity The sensitivity of a model indicates the degree to which the model's outputs are affected by changes in the selected input parameters. A sensitivity analysis identifies the sensitive parameters for the output of the model. It allows us to determine which parameters require further investigation so that we have a more realistic evaluation of the model's output values. Furthermore, it allows us to identify which parameters are not significant for the generation of a certain output and therefore can possibly be eliminated from the model. Finally, it tells us which parameters should be considered in a possible and subsequent analysis of the uncertainty of the output values provided by the model. Classifying simulation models 9 Validation This is the process that verifies the accuracy of the proposed model. The model must be validated to be used as a tool to support decisions. It aims to verify whether the model that's being analyzed corresponds conceptually to our intentions. The validation of a model is based on the various techniques of multivariate analysis, which, from time to time, study the variability and interdependence of attributes within a class of objects. Classifying simulation models Simulation models can be classified according to different criteria. The first distinction is between static and dynamic systems. So, let's see what differentiates them. Comparing static and dynamic models Static models are the representation of a system in an instant of time, or representative models of a system in which the time variable plays no role. An example of a static simulation is a Monte Carlo model. Dynamic models, on the other hand, describe the evolution of the system over time. In the simplest case, the state of the system at time t is described by a function x (t). For example, in population dynamics, x (t) represents the population present at time t. The equation that regulates the system is dynamic: it describes the instantaneous variation of the population or the variation in fixed time intervals. Comparing deterministic and stochastic models A model is deterministic when its evolution, over time, is uniquely determined by its initial conditions and characteristics. These models do not consider random elements and lend themselves to be solved with exact methods that are derived from mathematical analysis. In deterministic models, the output is well determined once the input data and the relationships that make up the model have been specified, despite the time required for data processing being particularly long. For these systems, the transformation rules univocally determine the change of state of the system. Examples of deterministic systems can be observed in some production and automation systems. Stochastic models, on the other hand, can be evolved by inserting random elements into the evolution. These are obtained by extracting them from statistical distributions. Among the operational characteristics of these models, there is not just one relationship that fits all. There's also probability density functions, which means there is no one-to-one correspondence between the data and system history. 10 Introducing Simulation Models A final distinction is based on how the system evolves over time: this is why we distinguish between continuous and discrete simulation models. Comparing continuous and discrete models Continuous models represent systems in which the state of the variables changes continuously as a function of time. For example, a car moving on a road represents a continuous system since the variables that identify it, such as position and speed, can change continuously with respect to time. In discrete models, the system is described by an overlapping sequence of physical operations, interspersed with inactivity pauses. These operations begin and end in welldefined instances (events). The system undergoes a change of state when each event occurs, remaining in the same state throughout the interval between the two subsequent events. This type of operation is easy to treat with the simulation approach. Important Note The stochastic or deterministic, or continuous or discrete, nature of a model is not its absolute property and depends on the observer's vision of the system itself. This is determined by the objectives and the method of study, as well as by the experience of the observer. Now that we've analyzed the different types of models in detail, we will learn how to develop a numerical simulation model. Approaching a simulation-based problem To tackle a numerical simulation process that returns accurate results, it is crucial to rigorously follow a series of procedures that partly precede and partly follow the actual modeling of the system. We can separate the simulation process workflow into the following individual steps: 1. Problem analysis 2. Data collection 3. Setting up the simulation model 4. Simulation software selection Approaching a simulation-based problem 11 5. Verification of the software solution 6. Validation of the simulation model 7. Simulation and analysis of results To fully understand the whole simulation process, it is essential to analyze the various phases that characterize a study based on simulation in depth. Problem analysis In this initial step, the goal is to understand the problem by trying to identify the aims of the study and the essential components, as well as the performance measures that interest them. Simulation is not simply an optimization technique and therefore there is no parameter that needs to be maximized or minimized. However, there is a series of performance indices whose dependence on the input variables must be verified. If an operational version of the system is already available, the work is simplified as it is enough to observe this system to deduce its fundamental characteristics. Data collection This represents a crucial step in the whole process since the quality of the simulation model depends on the quality of the input data. This step is closely related to the previous one. In fact, once the objective of the study has been identified, data is collected and subsequently processed. Processing the collected data is necessary to transform it into a format that can be used by the model. The origin of the data can be different: sometimes, the data is retrieved from company databases, but more often than not, direct measurements in the field must be made through a series of sensors that, in recent years, have become increasingly smart. These operations weigh down the entire study process, thus lengthening their execution times. Setting up the simulation model This is the crucial step of the whole simulation process; therefore, it is necessary to pay close attention to it. To set up a simulation model, it is necessary to know the probability distributions of the variables of interest. In fact, to generate various representative scenarios of how a system works, it is essential that a simulation generates random observations from these distributions. 12 Introducing Simulation Models For example, when managing stocks, the distribution of the product being requested and the distribution of time between an order and the receipt of the goods is necessary. On the other hand, when managing production systems with machines that can occasionally fail, it will be necessary to know the distribution of time until a machine fails and the distribution of repair times. If the system is not already available, it is only possible to estimate these distributions by deriving them, for example, from the observation of similar, already existing systems. If, from the analysis of the data, it is seen that this form of distribution approximates a standard type distribution, the standard theoretical distribution can be used by carrying out a statistical test to verify whether the data can be well represented by that probability distribution. If there are no similar systems from which observable data can be obtained, other sources of information must be used: machine specifications, instruction manuals for the machines, experimental studies, and so on. As we've already mentioned, constructing a simulation model is a complex procedure. Referring to simulating discrete events, constructing a model involves the following steps: 1. Defining the state variables 2. Identifying the values that can be taken by the state variables 3. Identifying the possible events that change the state of the system 4. Realizing a simulated time measurement, that is, a simulation clock, that records the flow of simulated time 5. Implementing a method for randomly generating events 6. Identifying the state transitions generated by events After following these steps, we will have the simulation model ready for use. At this point, it will be necessary to implement this model in a dedicated software platform; let's see how. Approaching a simulation-based problem 13 Simulation software selection The choice of the software platform that you will perform the numerical simulation with is fundamental for the success of the project. In this regard, we have several solutions that we can adopt. This choice will be made based on our knowledge of programming. Let's see what solutions are available: • Simulators: These are application-oriented packages for simulation. There are numerous interactive software packages for simulation, such as MATLAB, COMSOL Multiphysics, Ansys, SolidWorks, Simulink, Arena, AnyLogic, and SimScale. These pieces of software represent excellent simulation platforms whose performance differs based on the application solutions provided. These simulators allow us to elaborate on a simulation environment using graphic menus without the need to program. They are easy to use but many of them have excellent modeling capabilities, even if you just use their standard features. Some of them provide animations that show the simulation in action, which allows you to easily illustrate the simulation to non-experts. The limitations presented by this software solution are the high costs of the licenses, which can only be faced by large companies, and the difficulty in modeling solutions that have not been foreseen by the standards. • Simulation languages: A more versatile solution is offered by the different simulation languages ​​available. There are solutions that facilitate the task of the programmer who, with these languages, ​​can develop entire models or sub-models with a few lines of code that would otherwise require much longer drafting times, with a consequent increase in the probability of error. An example of a simulation language ​​is the general-purpose simulation system (GPSS). This is a generic programming language that was developed by IBM in 1965. In it, a simulation clock advances in discrete steps, modeling a system as transactions enter the system and are passed from one service to another. It is mainly used as a process flow-oriented simulation language and is particularly suitable for application problems. Another example of a simulation language ​​is SimScript, which was developed in 1963 as an extension of Fortran. SimScript is an event-based scripting language, so different parts of the script are triggered by different events. 14 Introducing Simulation Models • GPSS: General-purpose programming languages are designed to be able to create software in numerous areas of application. They are particularly suitable for the development of system software such as drivers, kernels, and anything that communicates directly with the hardware of a computer. Since these languages are ​​ not specifically dedicated to a simulation activity, they require the programmer to work harder to implement all the mechanisms and data structures necessary in a simulator. On the other hand, by offering all the potential of a high-level programming language, they offer the programmer a more versatile programming environment. In this way, you can develop a numerical simulation model perfectly suited to the needs of the researcher. In this book, we will use this solution by devoting ourselves to programming with Python. This software platform offers a series of tools that have been created by researchers from all over the world that make the elaboration of a numerical modeling system particularly easy. In addition, the open source nature of the projects written in Python makes this solution particularly inexpensive. Now that we've made the choice of the software platform we're going to use and have elaborated on the numerical model, we need to verify the software solution. Verification of the software solution In this phase, a check is carried out on the numerical code. This is known as debugging, which consists of ensuring that the code correctly follows the desired logical flow, without unexpected blocks or interruptions. The verification must be provided in real time during the creation phase because correcting any concept or syntax errors becomes more difficult as the complexity of the model increases. Although verification is simple in theory, debugging large-scale simulation code is a difficult task due to virtual competition. The correctness or otherwise of executions depends on time, as well as on the large number of potential logical paths. When developing a simulation model, you should divide the code into modules or subroutines in order to facilitate debugging. It is also advisable to have more than one person review the code, as a single programmer may not be a good critic. In addition, it can be helpful to perform the simulation when considering a large variety of input parameters and checking that the output is reasonable. Approaching a simulation-based problem 15 Important Note One of the best techniques that can be used to verify a discrete-event simulation program is one based on tracking. The status of the system, the content of the list of events, the simulated time, the status variables, and the statistical counters are shown after the occurrence of each event and then compared with handmade calculations to check the operation of the code. A track often produces a large volume of output that needs to be checked event by event for errors. Possible problems may arise, including the following: • There may be information that hasn't been requested by the analyst. • Other useful information may be missing, or a certain type of error may not be detectable during a limited debugging run. After the verification process, it is necessary to validate the simulation model. Validation of the simulation model In this step, it is necessary to check whether the model that has been created provides valid results for the system in question. We must check whether the performance measurements of the real system are well approximated by the measurements generated by the simulation model. A simulation model of a complex system can only approximate it. A simulation model is always developed for a set of objectives. A model that's valid for one purpose may not be valid for another. Important Note Validation is a where the level of accuracy between the model and the system is respected. It is necessary to establish whether the model adequately represents the behavior of the system. The value of a model can only be defined in relation to its use. Therefore, validation is a process that aims to determine whether a simulation model accurately represents the system for the set objectives. In this step, the ability of the model to reproduce the real functionality the system is ascertained; that is, it is ensured that the calibrated parameters, relative to the calibration scenario, can be used to correctly simulate other system situations. Once the validation phase is over, the model can be considered transferable and therefore usable for the simulation of any new control strategies and new intervention alternatives. As widely discussed in the literature on this subject, it is important to validate the model parameters that were previously calibrated on the basis of data other than that used to calibrate the model, always with reference to the phenomenon specific to the scenario being analyzed. 16 Introducing Simulation Models Simulation and analysis of results A simulation is a process that evolves during its realization and where the initial results help lead the simulation toward more complex configurations. Attention should be paid to some details. For example, it is necessary to determine the transient length of the system before reaching stationary conditions if you want performance measures of the system at full capacity. It is also necessary to determine the duration of the simulation after the system has reached equilibrium. In fact, it must always be kept in mind that a simulation does not produce the exact values ​​of the performance measures of a system since each simulation is a statistical experiment that generates statistical observations regarding the performance of the system. These observations are then used to produce estimates of performance measures. Increasing the duration of the simulation can increase the accuracy of these estimates. The simulation results return statistical estimates of a system's performance measures. A fundamental point is that each measurement is accompanied by the confidence interval, within which it can vary. These results could immediately highlight a better system configuration than the others, but more often, more than one candidate configuration will be identified. In this case, further investigations may be needed to compare these configurations. Dynamical systems modeling In this section, we will analyze a real case of modeling a production process. In this way, we will learn how to deal with the elements of the system and how to translate the production instances into the elements of the model. A model is created to study the behavior of a system over time. It consists of a set of assumptions about the behavior of the system being expressed using mathematical logical-symbolic relationships. These relationships are between the entities that make up the system. Recall that a model is used to simulate changes in the system and predict the effects of these changes on the real system. Simple models are resolved analytically, using mathematical methods, while complex models are numerically simulated on the computer, where the data is treated as the data of a real system. Dynamical systems modeling 17 Managing workshop machinery In this section, we will look at a simple example of how a discrete event simulation of a dynamic system is created. A discrete event system is a dynamic system whose states can assume logical or symbolic values, rather than numerical ones, and whose behavior is characterized by the occurrence of instantaneous events within an irregular timing sequence not necessarily known a priori. The behavior of these systems is described in terms of states and events. In a workshop, there are two machines, which we will call A1 and A2. At the beginning of the day, five jobs need to be carried out: W1, W2, W3, W4, and W5. The following table shows how long we need to work on the machines in minutes: Figure 1.2 – Table showing work time on the machines A zero indicates that a job does not require that machine. Jobs that require two machines must pass through A1 and then through A2. Suppose that we decide to carry out the jobs by assigning them to each machine so that when they become available, the first executable job is started first, in the order from 1 to 5. If, at the same time, more jobs can be executed on the same machine, we will execute the one with a minor index first. The purpose of modeling is to determine the minimum time needed to complete all the works. The events in which state changes can occur in the system are as follows: 1. A job becomes available for a machine. 2. A machine starts a job. 3. A machine ends a job. 18 Introducing Simulation Models Based on these rules and the evaluation times indicated in the previous table, we can insert the sequence of the jobs, along with the events scheduled according to the execution times, into a table: Figure 1.3 – Table of job sequences This table shows the times of the events in sequence, indicating the start and end of the work associated with the two machines available in the workshop. At the end of each job, a new job is sent to each machine according to the rules set previously. In this way, the deadlines for the work and the subsequent start of another job are clearly indicated. This is just as easy as it is to identify the time interval in which each machine is used and when it becomes available again. The table solution we have proposed represents a simple and immediate way of simulating a simple dynamic discrete system. The example we just discussed is a typical case of a dynamic system in which time proceeds in steps, in a discrete way. However, many dynamic systems are best described by assuming that time passes continuously. In the next section, we will analyze the case of a continuous dynamic system. Simple harmonic oscillator Consider a mass m resting on a horizontal plane, without friction, and attached to a wall by an ideal spring, of elastic constant k. Suppose that, when the horizontal coordinate x is zero, the spring is at rest. The following diagram shows the scheme of a simple harmonic oscillator: Dynamical systems modeling 19 Figure 1.4 – The scheme of a harmonic oscillator If the block of mass m is moved to the right with respect to its equilibrium position (x> 0),
the spring, being elongated, calls it to the left. Conversely, if the block is placed to the left
of its equilibrium position (x < 0), then the spring is compressed and pushes the block to the right. In both cases, we can express the component along the x-axis of the force due to the spring according to the following formula: Here, we have the following: • is the force. • is the elastic constant. • 𝐹𝐹𝑥𝑥 = −𝑘𝑘 ∗ 𝑥𝑥 is the horizontal coordinate that indicate the position of the mass m. From the second law of dynamics, we can derive the component of acceleration along x as follows: Here, we have the following: • is the acceleration. • is the elastic constant. • • 𝑎𝑎𝑥𝑥 = − 𝑘𝑘 ∗ 𝑥𝑥 𝑚𝑚 is the mass of the block. is the horizontal coordinate that indicate the position of the mass m. 20 Introducing Simulation Models = = If we indicate with the rate of change of the speed and with we can obtain the evolution equations of the dynamic system, as follows: Here, we have the following: • 𝜔𝜔2 = − the speed, 𝑑𝑑𝑑𝑑 = 𝑣𝑣 { 𝑑𝑑𝑑𝑑 𝑑𝑑𝑑𝑑 = − 𝜔𝜔2 ∗ 𝑥𝑥 𝑑𝑑𝑑𝑑 𝑘𝑘 𝑚𝑚 For these equations, we must associate the initial conditions of the system, which we can write in the following way: 𝑥𝑥(0) = 𝑥𝑥0 { 𝑣𝑣(0) = 𝑣𝑣0 The solutions to the previous differential equations are as follows: 𝑣𝑣0 𝑥𝑥(𝑡𝑡) = 𝑥𝑥0 cos(𝜔𝜔𝜔𝜔) + sin (𝜔𝜔𝜔𝜔) { 𝜔𝜔 𝑣𝑣(𝑡𝑡) = 𝑣𝑣0 cos(𝜔𝜔𝜔𝜔) − 𝑥𝑥0 𝜔𝜔 sin (𝜔𝜔𝜔𝜔) In this way, we obtained the mathematical model of the analyzed system. In order to study the evolution of the oscillation phenomenon of the mass block m over time, it is enough to vary the time and calculate the position of the mass at that instant and its speed. In decision-making processes characterized by high levels of complexity, the use of analytical models is not possible. In these cases, it is necessary to resort to models that differ from those of an analytical type for the use of the calculator as a tool not only for calculation, such as in mathematical programming models, but also for representing the elements that make up reality why studying the relationships between them. Predator-prey model In the field of simulations, simulating the functionality of production and logistic processes is considerably important. These systems are, in fact, characterized by high complexity, numerous interrelationships between the different processes that pass through them, segment failures, unavailability, and the stochasticity of the system parameters. Dynamical systems modeling 21 To understand how complex the analytical modeling of some phenomena is, let's analyze a universally widespread biological model. This is the predator-prey model, which was developed independently by the Italian researcher Vito Volterra and the American biophysicist Alfred Lotka. On an island, there are two populations of animals: prey and predators. The vegetation of the island provides the prey with nourishment in quantities that we can consider as unlimited, while the prey is the only food available for the predators. We can consider the birth rate of the prey constant over time; this means that in the absence of predators, the prey would grow by exponential law. Their mortality rate, on the other hand, depends on the probability they have of falling prey to a predator and therefore on the number of predators present per unit area. As for the predators, the mortality rate is constant, while their growth rate depends on the availability of food and therefore on the number of prey per unit area present on the island. We want to study the trend of the size of the two populations over time, starting from a known initial situation (number of prey and predators). To carry out a simulation of this biological system, we can model it by means of the following system of finite difference equations, where x(t) and y(t) are the number of prey and predators at time t, respectively: 𝑑𝑑𝑑𝑑 = 𝛼𝛼 ∗ 𝑥𝑥(𝑡𝑡) − 𝛽𝛽 ∗ 𝑥𝑥(𝑡𝑡) ∗ 𝑦𝑦(𝑡𝑡) { 𝑑𝑑𝑑𝑑 𝑑𝑑𝑑𝑑 = 𝛾𝛾 ∗ 𝑦𝑦(𝑡𝑡) − 𝛿𝛿 ∗ 𝑥𝑥(𝑡𝑡) ∗ 𝑦𝑦(𝑡𝑡) 𝑑𝑑𝑑𝑑 Here, we have the following: • α, β, γ, δ are positive real parameters related to the interaction of the two species • is the instantaneous growth rates of the prey. • is the instantaneous growth rates of the predators. The following hypotheses underlie in this model: • In the absence of predators, the number of prey increases according to an exponential law, that is, with a constant rate. • Similarly, in the absence of prey, the number of predators decreases at a constant rate. This is a deterministic and continuous simulation model. In fact, the state of the system, represented by the size of the two populations in each instance of time, is univocally determined by the initial state and the parameters of the model. Furthermore, at least in principle, the variables – that is, the size of the populations – vary continuously over time. The differential system is in normal form and can be solved with respect to the derivatives of maximum order but cannot be separated into variables. It is not possible to solve this in an analytical form, but with numerical methods, it is solved immediately (the RungeKutta method). The solution obviously depends on the values of the four constants and the initial values. Summary In this chapter, we learned what is meant by simulation modeling. We understood the difference between modeling and simulation, and we discovered the strengths of simulation models, such as defects. To understand these concepts, we clarified the meaning of the terms that appear most frequently when dealing with these topics. We then analyzed the different types of models: static versus dynamic, deterministic versus stochastic, and continuous versus discrete. We then explored the workflow connected to a numerical simulation process and highlighted the crucial steps. Finally, we studied some practical modeling cases to understand how to elaborate on a model starting from the initial considerations. In the next chapter, we will learn how to approach a stochastic process and understand the random number simulation concepts. Then, we will explore the differences between pseudo and non-uniform random numbers, as well as the methods we can use for random distribution evaluation. 2 Understanding Randomness and Random Numbers In many real-life situations, it is useful to flip a coin in order to decide what to do. Many computers also use this procedure as part of their decision-making process. In fact, many problems can be solved in a very effective and relatively simple way by using probabilistic algorithms. In an algorithm of this type, decisions are made based on random contributions that remember the dice roll with the help of a randomly chosen value. The generation of random numbers has ancient roots, but only recently has the process been sped up, allowing it to be used on a large scale in scientific research as well. These generators are mainly used for computer simulations, statistical sampling techniques, or in the field of cryptography. In this chapter, we're going to cover the following topics: • Stochastic processes • Random number simulation • The pseudorandom number generator 24 Understanding Randomness and Random Numbers • Testing uniform distribution • Exploring generic methods for random distributions • Random number generation using Python Technical requirements In this chapter, we will introduce random number generation techniques. In order to understand these topics, a basic knowledge of algebra and mathematical modeling is needed. To work with the Python code in this chapter, you need the following files (available on GitHub at https://github.com/PacktPublishing/Hands-On-SimulationModeling-with-Python): • LinearCongruentialGenerator.py • LearmouthLewisGenerator.py • LaggedFibonacciAlgorithm.py • UniformityTest.py • Random.Generation.py Stochastic processes A stochastic process is a family of random variables that depends on a parameter, t. A stochastic process is specified using the following notation: {𝑋𝑋𝑡𝑡 , 𝑡𝑡 ∈ 𝑇𝑇} Here, t is a parameter, and T is the set of possible values ​​of t. Usually, time is indicated by t, so a stochastic process is a family of time-dependent random variables. The variability range of t, that is, the set, T, can be a set of real numbers, possibly coinciding with the entire time axis. But it can also be a discrete set of values. The random variables, Xt, are defined on the set, X, called the space of states. This can be a continuous set, in which case it is defined as a continuous stochastic process, or a discrete set, in which case it is defined as a discrete stochastic process. Consider the following elements: 𝑥𝑥0 , 𝑥𝑥1 , 𝑥𝑥2 , … , 𝑥𝑥𝑛𝑛 ∈ 𝑋𝑋 Stochastic processes 25 This means the values ​​that the random variables, Xt, can take are called system states and represent the possible results of an experiment. The Xt variables are linked together by dependency relationships. We can know a random variable if we know both the values it can assume and the probability distribution. So, to understand a stochastic process, it is necessary not only to know the values ​​that Xt can take but also the probability distributions of the variables and the joint distributions between the values. Simpler stochastic processes, in which the variability range of t is a discrete set of time values, ​​can also be considered. Important note In practice, there are numerous phenomena that are studied through the theory of stochastic processes. A classic application in physics is the study of the motion of a particle in each medium, the so-called Brownian motion. This study is carried out statistically using a stochastic process. There are processes where even by knowing the past and the present, the future cannot be determined; whereas, in other processes, the future is determined by the present without considering the past. Types of stochastic process Stochastic processes can be classified according to the following characteristics: • Space of states • Time index • Type of stochastic dependence between random variables The state space can be discrete or continuous. In the first case, the stochastic process with discrete space is also called a chain, and space is often referred to as the set of non-negative integers. In the second case, the set of values assumed by the random variables is not finite or countable, and the stochastic process is in continuous space. The time index can also be discrete or continuous. A discrete-time stochastic process is also called a stochastic sequence and is denoted as follows: Here, the set, T, is finite or countable. {𝑋𝑋𝑛𝑛 | 𝑛𝑛 ∈ 𝑇𝑇} 26 Understanding Randomness and Random Numbers In this case, the changes of state are observed only in certain instances: finite or countable. If state changes occur at any instant in a finite or infinite set of real intervals, then there is a continuous-time process, which is denoted as follows: {𝑋𝑋(𝑡𝑡) | 𝑡𝑡 ∈ 𝑇𝑇} The stochastic dependence between random variables, X(t), for different values of t characterizes a stochastic process and sometimes simplifies its description. A stochastic process is stationary in the strict sense that the distribution function is invariant with respect to a shift on the time axis, T. A stochastic process is stationary in the broad sense that the first two moments of the distribution are independent of the position on the T axis. Examples of stochastic processes The mathematical treatment of stochastic processes seems complex, yet we find cases of stochastic processes every day. For example, the number of patients admitted to a hospital as a function of time, observed at noon each day, is a stochastic process in which the space of states is discrete, being a finite subset of natural numbers, and time is discrete. Another example of a stochastic process is the temperature measured in a room as a function of time, observed at every instant, with continuous state space and continuous time. Let's now look at a number of structured examples that are based on stochastic processes. The Bernoulli process The concept of a random variable allows us to formulate models that are useful for the study of many random phenomena. An important early example of a probabilistic model is the Bernoulli distribution, named in honor of the Swiss mathematician, James Bernoulli (1654-1705), who made important contributions to the field of probability. Some of these experiments consist of repeatedly performing a given test. For example, we want to know the probability of getting a head when throwing a coin 1,000 times. In each of these examples, we look for the probability of obtaining x successes in n trials. If x indicates the successes, then n - x will be the failures. A sequence of Bernoulli trials consists of a Bernoulli trial under the following hypotheses: • There are only two possible mutually exclusive results for each trial, arbitrarily called success and failure. • The probability of success, p, is the same for each trial. • All tests are independent. Stochastic processes 27 Independence means that the result of a test is not influenced by the result of any other test. For example, the event, the third test was successful, is independent of the event, the first test was successful. The toss of a coin is a Bernoulli trial: the heads event can be considered successful, and the tails event can be considered unsuccessful. In this case, the probability of success is p = 1/2. In rolling two dice, the event, the sum of the points is seven, and the complementary event are both unsuccessful. In this case, it is a Bernoulli trial and the probability of success is p = 1/6. Important note Two events are said to be complementary when the occurrence of the first excludes the occurrence of the second but one of the two will certainly occur. Let p denote the probability of success in a Bernoulli trial. The random variable, X, which counts the number of successes in n trials is called the binomial random variable of the n and p parameters. X can take integer values between 0 and n. Random walk The random walk is a discrete parameter stochastic process in which Xt, where X represents a random variable, describes the position taken at time t by a moving point. The term, random walk, refers to the mathematical formalization of statistics that describe the displacement of an object that moves randomly. This kind of simulation is extremely important for a physicist and has applications in statistical mechanics, fluid dynamics, and quantum mechanics. Random walks represent a mathematical model that is used universally to simulate a path formalized by a succession of random steps. This model can assume a variable number of degrees of freedom, depending on the system we want to describe. From a physical point of view, the path traced over time will not necessarily simulate a real motion, but it will represent the trend of the characteristics of the system over time. Random walks find applications in chemistry, biology, and physics, but also in other fields such as economics, sociology, and information technology. 28 Understanding Randomness and Random Numbers Random one-dimensional walking is a model that is used to simulate the movement of a particle moving along a straight line. There are only two potential movements on the allowed path: either to the right (with a probability that is equal to p) or to the left (with a probability that is equal to q) of the current position. Each step has a constant length and is independent of the others, as shown in the following diagram: Figure 2.1 – One-dimensional walking The position of the point in each instant is identified by its abscissa, X(n). This position, after n steps, will be characterized by a random term. Our aim is to calculate the probability of passing from the starting point after n movements. Obviously, nothing assures us that the point will return to the starting position. The variable, X(n), returns the abscissa of the particle after n steps. It is a discrete random variable with a binomial distribution. At each instant, the particle steps right or left based on the value returned by a random variable, Z(n). This variable can take only two values: +1 and -1. It assumes a + 1 value with a probability of p > 0 and a value of -1 with a probability that is equal to q. The sum
of the two probabilities is p + q = 1. The position of the particle at instant n is given by the
following equation:
𝑋𝑋𝑛𝑛 = 𝑋𝑋𝑛𝑛−1 + 𝑍𝑍𝑛𝑛 ; 𝑛𝑛 = 1, 2, …
This shows the average number of returns to the origin of the particle, named p. The
probability of a single return is given by the following geometric series:


𝑛𝑛=0
𝑛𝑛=0
𝜇𝜇 = ∑ 𝑛𝑛 𝑝𝑝𝑛𝑛 (1 − 𝑝𝑝) = ∑ 𝑛𝑛 𝑝𝑝𝑛𝑛
1
√𝑛𝑛 ∗ 𝜋𝜋
→∞
We assume that the probability of the particle returning to the origin tends to 1. This
means that despite the frequency of the returns decreasing with the increase in the
number of steps taken, they will always be in an infinite value of steps taken. So, we can
conclude that a particle with equal probability of left and right movement, left free to
walk casually to infinity with great probability, returns infinite times to the point from
which it started.
Stochastic processes
29
The Poisson process
There are phenomena in which certain events, with reference to a certain interval of time
or space, rarely happen. The number of events that occur in that interval varies from 0 to
n, and n cannot be determined a priori. For example, the number of cars passing through
an uncrowded street in a randomly chosen 5-minute time frame can be considered a rare
event. Similarly, the number of accidents at work that happen at a company in a week, or
the number of printing errors on a page of a book, is rare.
In the study of rare events, a reference to a specific interval of time or space is
fundamental. For the study of rare events, the Poisson probability distribution is used,
named in honor of the French mathematician, Simeon Denis Poisson (1781-1840), who
first obtained the distribution. The Poisson distribution is used as a model in cases where
the events or realizations of a process, distributed randomly in space or time, are counts,
that is, discrete variables.
The binomial distribution is based on a set of hypotheses that define the Bernoulli trials,
and the same happens for the Poisson distribution. The following conditions describe
the so-called Poisson process:
• The realizations of the events are independent, meaning that the occurrence of
an event in a time or space interval has no effect on the probability of the event
occurring a second time in the same, or another, interval.
• The probability of a single realization of the event in each interval is proportional to
the length of the interval.
• In any arbitrarily small part of the interval, the probability of the event occurring
more than once is negligible.
An important difference between the Poisson distribution and the binomial distribution
is the number of trials and successes. In a binomial distribution, the number, n, of trials
is finite and the number, x, of successes cannot exceed n; in a Poisson distribution, the
number of tests is essentially infinite and the number of successes can be infinitely large,
even if the probability of having x successes becomes very small as x increases.
30
Understanding Randomness and Random Numbers
Random number simulation
The availability of random numbers is a necessary requirement in many applications.
In some cases, the quality of the final application strictly depends on the possibility
of generating good quality random numbers. Think, for example, of applications such
as video games, cryptography, generating visuals or sound effects, telecommunications,
signal processing, optimizations, and simulations. In an algorithm of this type, decisions
are made based on the pull of a virtual currency, which is based on a randomly
chosen value.
There is no single or general definition of a random number since it often depends on the
context. The concept of random number itself is not absolute, as any number or sequence
of numbers can appear to be random to an observer, but not to another who knows the
law with which they are generated. Put simply, a random number is defined as a number
selected in a random process from a finite set of numbers. With this definition, we focus
on the concept of randomness in the process of selecting a sequence of numbers.
In many cases, the problem of generating random numbers concerns the random
generation of a sequence of 0 and 1, from which numbers in any format can be obtained:
integers, fixed points, floating points, or strings of arbitrary length. With the right
functions, it is possible to obtain good quality sequences that can also be used in scientific
applications, such as the Monte Carlo simulation. These techniques should be easy to
implement and be usable by any computer. In addition, like all software solutions, they
should be very versatile and quickly improved.
Important note
These techniques have a big problem that is inherent to the algorithmic nature
of the process: the final string can be predicted from the starting seed. This is
why we call this process pseudorandom.
Despite this, many problems of an algorithmic nature are solved very effectively and
relatively simply using probabilistic algorithms. The simplest example of a probabilistic
algorithm is perhaps the randomized quicksort. This is a probabilistic variant of the
homonymous sorting algorithm, where, by choosing the pivot element, the algorithm
manages to randomly guarantee optimal complexity in the average case, no matter
the distribution of the input. Cryptography is a field in which randomness plays
a fundamental role and deserves specific mention. In this context, randomness does
not lead to computational advantages, but it is essential to guarantee the security of
authentication protocols and encryption algorithms.
Random number simulation
Probability distribution
It is possible to characterize a random process from different points of view. One of
the most important characteristics is the probability distribution. The probability
distribution is a model that associates a probability with each observable modality
of a random variable.
The probability distribution can be either discrete or continuous, depending on whether
the variable is random, discrete, or continuous. It is discrete if the phenomenon is
observable with an integer number of modes. The throw of the dice is a discrete
statistical phenomenon because the number of observable modalities is equal to 6. The
random variable can take only six values ​​(1, 2, 3, 4, 5, and 6). Therefore, the probability
distribution of the phenomenon is discrete. The probability distribution is continuous
when the random variable assumes a continuous set of values; in this case, the statistical
phenomenon can be observed with an infinite or very high number of modalities. The
probability distribution of body temperature is continuous because it is a continuous
statistical phenomenon, that is, the values ​​of the random variable vary continuously.
Let’s now look at different kinds of probability distributions.
Uniform distribution
In many cases, processes characterized by a uniform distribution are considered and
used. This means that each element is as likely as any of the others to be selected if an
infinite number of extractions is performed. If you represent the elements and their
respective probabilities of being extracted on a graph, you get a rectangular graph
as follows:
Figure 2.2 – The probabilities of the elements
31
32
Understanding Randomness and Random Numbers
Since the probability is expressed as a real number between 0 and 1, where 0 represents
the impossible event and 1 the certain event, in a uniform distribution, each element will
have a 1/n probability of being selected, where n is the number of items. In this case, the
sum of all the probabilities must give a uniform result, since, in an extraction, at least one
of the elements is chosen for sure. A uniform distribution is typical of artificial random
processes such as dice rolling, lotteries, and roulette and is also the most used in several
applications.
Gaussian distribution
Another very common probability distribution is the Gaussian or normal distribution,
which has a typical bell shape. In this case, the smaller values, or those that are closer to
the center of the curve, are more likely to be extracted than the larger ones, which are far
away from the center. The following diagram shows a typical Gaussian distribution:
Figure 2.3 – Gaussian distribution
Gaussian distribution is important because it is typical of natural processes. For example,
it can represent the distribution of the noise in many electronic components, or it can
represent the distribution of errors in measurements. It is, therefore, used to simulate
statistical distributions in the fields of telecommunications or signal processing.
Properties of random numbers
By random number, we refer to a random variable distributed in a uniform way between 0
and 1. The statistical properties that a sequence of random numbers must possess are
as follows:
• Uniformity
• Independence
The pseudorandom number generator
Suppose you divide an interval, [0.1], into n subintervals of equal amplitude. The
consequence of the uniformity property is that if N observations of a random number
are made, then the number of observations in each subinterval is equal to N/n. The
consequence of the independence property is that the probability of obtaining a value in
a particular range is independent of the values ​​that were previously obtained.
The pseudorandom number generator
The generation of real random sequences using deterministic algorithms is impossible:
at most, pseudorandom sequences can be generated. These are, apparently, random
sequences that are actually perfectly predictable and can be repeated after a certain
number of extractions. A PRNG is an algorithm designed to output a sequence of
values that appear to be generated randomly.
The pros and cons of a random number generator
A random number generation routine must be the following:
• Replicable
• Fast
• Not have large gaps between two generated numbers
• Have a sufficiently long running period
• Generate numbers with statistical properties that are as close as possible to
ideal ones
The most common cons of random number generators are as follows:
• Numbers not uniformly distributed
• Discretization of the generated numbers
• Incorrect mean or variance
• Presence of cyclical variations
33
34
Understanding Randomness and Random Numbers
Random number generation algorithms
The first to deal with the random number generation problem was John von Neumann, in
1949. He proposed a method called middle-square. This method allows us to understand
some important characteristics of a random number generation process. To start, we need
to provide an input as the seed or a value that starts the sequence. This is necessary to be
able to generate different sequences each time. However, it is important to ensure that the
good behavior of the generator does not depend on which seed is used. From here, the
first flaw of the middle-square method appears, that is, when using the zero value as
a seed, only a sequence of zeros will be obtained.
Another shortcoming of this method is the repetitiveness of the sequences. As in all of the
other PRNGs that will be discussed, each value depends on the previous one, and, at most,
on the internal state variables of the generator. Since this is a limited number of digits, the
sequence can only be repeated from a certain point onward. The length of the sequence,
before it begins to repeat itself, is called the period. A long period is important because
many practical applications require a large amount of random data, and a repetitive
sequence might be less effective. In such cases, it is important that the choice of the seed
has no influence on the potential outcomes.
Another important aspect is the efficiency of the algorithm. The size of the output data
values ​​and internal state, and, therefore, the generator input (seed), are often intrinsic
features of the algorithm and remain constant. For this reason, the efficiency of a PRNG
should be assessed not so much in terms of computational complexity, but in terms of the
possibility of a fast and efficient implementation of the calculation architectures available.
In fact, depending on the architecture you are working on, the choice of different PRNGs,
or different design parameters of a certain PRNG, can result in a faster implementation by
many orders of magnitude.
Linear congruential generator
One of the most common methods for generating random numbers is the Linear
Congruence Generator (LCG). The theory on which it rests is simple to understand
and implement. It also has the advantage of being computationally light. The recursive
relationship underlying this technique is provided by the following equation:
𝑥𝑥𝑘𝑘+1 = (𝑎𝑎 ∗ 𝑥𝑥𝑘𝑘 + 𝑐𝑐) 𝑚𝑚𝑚𝑚𝑚𝑚 𝑚𝑚
Here, we can observe the following:
• a is the multiplier (non-negative integers)
• c is the increment (non-negative integers)
The pseudorandom number generator
35
• m is the mode (non-negative integers)
• x0 is the initial value (seed or non-negative integers)
The modulo function, denoted by mod, results in the remainder of the Euclidean division
of the first number by the second. For example, 18 mod 4 gives 2 as it is the remainder of
the Euclidean division between the two numbers.
The linear congruence technique has the following characteristics:
• It is cyclical with a period that is approximately equal to m
• The generated numbers are discretized
To use this technique effectively, it is necessary to choose very large m values. As an
example, set the parameters of the method and generate the first 16 pseudorandom values.
Here is the Python code that allowed us to generate that sequence of numbers:
import numpy as np
a = 2
c = 4
m = 5
x = 3
for i in range (1,17):
    x= np.mod((a*x+c), m)
    print(x)
The following results are returned:
0
4
2
3
0
4
2
3
0
4
2
3
36
Understanding Randomness and Random Numbers
0
4
2
3
In this case, the period is equal to 4. It is easy to verify that, at most, m distinct
integers, Xn, can be generated in the interval, [0, m – 1]. If c = 0, the generator is called
multiplicative. Let’s analyze the Python code line by line. The first line was used to import
the library:
import numpy as np
numpy is a library of additional scientific functions of the Python language, designed
to perform operations on vectors and dimensional matrices. numpy allows you to work
with vectors and matrices more efficiently and faster than you can do with lists and lists
of lists (matrices). In addition, it contains an extensive library of high-level mathematical
functions that can operate on these arrays.
After importing the numpy library, we set the parameters that will allow us to generate
random numbers using LCG:
a
c
m
x
=
=
=
=
2
4
5
3
At this point, we can use the LCG formula to generate random numbers. We only generate
the first 16 numbers, but we will see from the results that these are enough to understand
the algorithm. To do this, we use a for loop:
for i in range (1,17):
    x= np.mod((a*x+c), m)
    print(x)
To generate random numbers according to the LCG formula, we have used the
np.mod() function. This function returns the remainder of a division when given
a dividend and divisor.
The pseudorandom number generator
37
Random numbers with uniform distribution
A sequence of numbers uniformly distributed between [0, 1] can be obtained using the
following formula:
𝑈𝑈𝑛𝑛 =
𝑋𝑋𝑛𝑛
𝑚𝑚
The obtained sequence is periodic, with a period less than or equal to m. If the period is
m, then it has a full period. This occurs when the following conditions are true:
• If m and c have prime numbers
• If m is divisible by a prime number, b, for which it must also be divisible
• If m is divisible by 4, then a – 1 must also be divisible by 4
Important note
By choosing a large value of m, you can reduce both the phenomenon of
periodicity and the problem of generating rational numbers.
Furthermore, it is not necessary for simulation purposes that all numbers between [0, 1]
are generated, because these are infinite. However, it is necessary that as many numbers as
possible within the range have the same probability of being generated.
Generally, a value of m is m ≥ 109 so that the generated numbers constitute a dense subset
of the interval, [0, 1].
An example of a multiplicative generator that is widely used in 32-bit computers is the
Learmonth-Lewis generator. This is a generator in which the parameters assume the
following values:
• a = 75
• c = 0
• m = 231 – 1
Let’s analyze the code that generates the first 100 random numbers according to this
method:
import numpy as np
a = 75
c = 0
m = 2**(31) -1
x = 0.1
38
Understanding Randomness and Random Numbers
for i in range(1,100):
    x= np.mod((a*x+c),m)
    u = x/m
    print(u)
The code we have just seen was analyzed line by line in the Linear congruential generator
section of this chapter. The difference, in addition to the values of the parameters, lies
in the generation of a uniform distribution in the range of [0, 1] through the following
command:
u = x/m
The following results are returned:
Figure 2.4 – LCG output
Since we are dealing with random numbers, the output will be different from the
previous one.
A comparison between the different generators must be made based on the analysis
of the periodicity, the goodness of the uniformity of the numbers generated, and the
computational simplicity. This is because the generation of very large numbers can lead
to the use of expensive computer resources. Also, if the Xn numbers get too big, they are
truncated, which can cause a loss of the desired uniformity statistics.
The pseudorandom number generator
39
Lagged Fibonacci generator
The lagged Fibonacci algorithm for generating pseudorandom numbers arises from the
attempt to generalize the method of linear congruences. One of the reasons that led to the
search for new generators was the need, useful for many applications especially in parallel
computing, to lengthen the generator period. The period of a linear generator when m is
approximately 109 is enough for many applications, but not all of them.
One of the techniques developed is to make Xn + 1 dependent on the two previous values,
Xn and Xn − 1, instead of only on Xn, as is the case in the LCG method. In this case, the
period may arrive close to the value, m2, because the sequence will not repeat itself until
the following equality is obtained:
(𝑋𝑋𝑛𝑛+𝜆𝜆 , 𝑋𝑋𝑛𝑛+𝜆𝜆+1 ) = (𝑋𝑋𝑛𝑛 , 𝑋𝑋𝑛𝑛+1 )
The simplest generator of this type is the Fibonacci sequence represented by the following
equation:
𝑋𝑋𝑛𝑛+1 = (𝑋𝑋𝑛𝑛 + 𝑋𝑋𝑛𝑛−1 ) 𝑚𝑚𝑚𝑚𝑚𝑚 𝑚𝑚
This generator was first analyzed in the 1950s and provides a period, m, but the sequence
does not pass the simplest statistical tests. We then tried to improve the sequence using
the following equation:
𝑋𝑋𝑛𝑛+1 = (𝑋𝑋𝑛𝑛 + 𝑋𝑋𝑛𝑛−𝑘𝑘 ) 𝑚𝑚𝑚𝑚𝑚𝑚 𝑚𝑚
This sequence, although better than the Fibonacci sequence, does not return satisfactory
results. We had to wait until 1958, when Mitchell and Moore proposed the following
sequence:
𝑋𝑋𝑛𝑛 = (𝑋𝑋𝑛𝑛−24 + 𝑋𝑋𝑛𝑛−55 ) 𝑚𝑚𝑚𝑚𝑚𝑚 𝑚𝑚,
𝑛𝑛 ≥ 55
Here, m is even and X0, … X54 are arbitrary integers that are not all even. Constants 24
and 55 are not chosen at random but are numbers that define a sequence whose least
significant bits (Xn mod 2) have a period of length 255-1. Therefore, the sequence (Xn) must
have a period of length of at least 255-1. The succession has a period of 2M-1 (255-1) where
m = 2M.
Numbers 24 and 55 are commonly called lags and the sequence (Xn) is called a Lagged
Fibonacci Generator (LFG). The LFG sequence can be generalized with the following
equation:
𝑋𝑋𝑛𝑛 = (𝑋𝑋𝑛𝑛−𝑙𝑙 ⊗ 𝑋𝑋𝑛𝑛−𝑘𝑘 ) 𝑚𝑚𝑚𝑚𝑚𝑚 2𝑀𝑀 , 𝑙𝑙 > 𝑘𝑘 > 0
40
Understanding Randomness and Random Numbers
Here, ⊗ refers to any of the following operations: +, −, ×, or ⊗ (exclusive or).
Only some pairs (k, l) give sufficiently long periods. In these cases, the period is 2M-1
(2l-1). The pairs (k, l) must be chosen appropriately. The only condition on the first l
values is that at least one of them must be odd; otherwise, the sequence will be composed
of even numbers.
Let’s look at how to implement a simple example of additive LFG in Python using the
following parameters: x0 = x1 = 1 and m = 232. Here is the code to generate the first 100
random numbers:
import numpy as np
x0=1
x1=1
m=2**32
for i in range (1,101):
    x= np.mod((x0+x1), m)
    x0=x1
    x1=x
    print(x)
Let’s analyze the Python code line by line. The first line was used to import the library:
import numpy as np
After importing the numpy library, we set the parameters that will allow us to generate
random numbers using LFG:
x0=1
x1=1
m=2**32
At this point, we can use the LFG formula to generate random numbers. We only generate
the first 100 numbers. To do this, we use a for loop:
for i in range (1,101):
    x= np.mod((x0+x1), m)
    x0=x1
    x1=x
    print(x)
The pseudorandom number generator
41
To generate random numbers according to the LFG formula, we use the np.mod()
function. This function returns the remainder of a division when given a dividend and
divisor. After generating the random number, the two previous values are updated as
follows:
x0=x1
x1=x
The following random numbers are printed:
Figure 2.5 – Table of random numbers using LFG
The initialization of an LFG is particularly complex, and the results of this method are
very sensitive to the initial conditions. If extreme care is not taken when choosing the
initial values, statistical defects may occur in the output sequence. These defects could
harden the initial values along with subsequent values that have a careful periodicity.
Another potential problem with LFG is that the mathematical theory behind the method
is incomplete, making it necessary to rely on statistical tests rather than theoretical
performance.
42
Understanding Randomness and Random Numbers
Testing uniform distribution
Test adaptation (that is, the goodness of fit) in general, has the purpose of verifying
whether a variable under examination does or does not have a certain hypothesized
distribution on the basis, as usual, of experimental data. It is used to compare a set of
frequencies observed in a sample, with similar theoretical quantities assumed for the
population. By means of the test, it is possible to quantitatively measure the degree of
deviation between the two sets of values.
The results obtained in the samples do not always exactly agree with the theoretical results
that are expected according to the rules of probability. Indeed, it is very rare for this to
occur. For example, although theoretical considerations lead us to expect 100 heads and
100 tails from 200 flips of a coin, it is rare that these results are obtained exactly. However,
despite this, we must not unnecessarily deduce that the coin is rigged.
The chi…

Purchase answer to see full
attachment

Explanation & Answer:
Programming Problems

User generated content is uploaded by users for the purposes of learning and should be used following Studypool’s honor code & terms of service.

Reviews, comments, and love from our customers and community:

Article Writing

Keep doing what you do, I am really impressed by the work done.

Alexender

Researcher

PowerPoint Presentation

I am speechless…WoW! Thank you so much!

Stacy V.

Part-time student

Dissertation & Thesis

This was a very well-written paper. Great work fast.

M.H.H. Tony

Student

Annotated Bibliography

I love working with this company. You always go above and beyond and exceed my expectations every time.

Francisca N.

Student

Book Report / Review

I received my order wayyyyyyy sooner than I expected. Couldn’t ask for more.

Mary J.

Student

Essay (Any Type)

On time, perfect paper

Prof. Kate (Ph.D)

Student

Case Study

Awesome! Great papers, and early!

Kaylin Green

Student

Proofreading & Editing

Thank you Dr. Rebecca for editing my essays! She completed my task literally in 3 hours. For sure will work with her again, she is great and follows all instructions

Rebecca L.

Researcher

Critical Thinking / Review

Extremely thorough summary, understanding and examples found for social science readings, with edits made as needed and on time. Transparent

Arnold W.

Customer

Coursework

Perfect!

Joshua W.

Student

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>