MAT 243 Project One Summary Report

[Full Name]

Notes:

• Replace the bracketed text on page one (the cover page) with your personal information.

• You will use your selected team for all three projects.

1. Introduction: Problem Statement

Discuss the statement of the problem in terms of the statistical analyses that are being performed. In your response, you should address the following questions:

● What is the problem you are going to solve?
● What data set are you using?
● What statistical methods will you be using to do the analysis for this project?

Answer the questions in a paragraph response. Remove all questions and this note before submitting! Do not include Python code in your report.

2. Introduction: Your Team and the Assigned Team

In this project, you picked a team and you were assigned a team to do comparative analysis.

See Steps 1 and 2 in the Python script to address the following items:

● What team did you pick and what years were picked to do the analysis?
● What team and range of years were you assigned for the comparative study? (Hint: This is called the assigned team in the Python script.) Present this information in a formatted table as shown below.

Table 1. Information on the Teams

              Name of Team          Assigned Years
1. Yours      Team (e.g. Knicks)    XXXX-YYYY (e.g. 2013-2015)
2. Assigned   Team (e.g. Bulls)     XXXX-YYYY (e.g. 2013-2015)

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

3. Data Visualization: Points Scored by Your Team

In the Python script, you created a visualization for the distribution of points scored by your team.

See Step 3 in the Python script to address the following items in a paragraph response:

● In general, how is data visualization used to study data distributions and trends?
● In this activity, you were asked to pick one of the two plots that best describes the data distribution of the variable for your team. Include a screenshot of this plot in your report.
● Why did you pick this plot? Explain.
● What can you say about the distribution of the variable by visually inspecting this plot? What does this signify?

Answer the questions in a paragraph response. Remove all questions and this note before submitting! Do not include Python code in your report.

4. Data Visualization: Points Scored by the Assigned Team

In the Python script, you created a visualization for the distribution of points scored by the assigned team.

See Step 4 in the Python script to address the following items in a paragraph response:

● In this activity, you were asked to pick one of the two plots that best describes the data distribution of the variable for the assigned team. Include this plot in your report.
● Why did you pick this plot? Explain.
● What can you say about the distribution of the variable by visually inspecting this plot? What does this signify?

Answer the questions in a paragraph response. Remove all questions and this note before submitting! Do not include Python code in your report.

5. Data Visualization: Comparing the Two Teams

In the Python script, you created a visualization for the difference in the distributions of points scored by your team and the assigned team.

See Step 5 in the Python script to address the following items in a paragraph response:

● In general, how is data visualization used to compare two different data distributions?
● In this activity, you were asked to pick one of the two plots that best compares the data distributions of your team with the assigned team. Include a screenshot of this plot in your report.
● Why did you pick this plot? Explain.
● How do the two distributions compare to each other?

Answer the questions in a paragraph response. Remove all questions and this note before submitting! Do not include Python code in your report.

6. Descriptive Statistics: Points Scored By Your Team in Home Games

In the Python script, you calculated descriptive statistics on the points scored by your team in games played at the home venue. These included the mean, median, variance, and standard deviation for the relative skill of your team.

See Step 6 in the Python script to address the following items:

● Summarize all statistics in a formatted table as shown below. Use one row for each statistic. You will need to add rows to the table in order to include all of your statistics.

Table 2. Descriptive Statistics for Points Scored by Your Team in Home Games

Statistic              Value
(for example, Mean)    X.XX

*Round off to 2 decimal places.

● In general, how are the measures of central tendency and variability used to analyze a data distribution?
● Interpret each statistic in detail and explain what it represents in this scenario.
● Use the mean and the median to describe the distribution of points scored by your team in home games.
  ○ Describe the skew: Is it left, right, or bell-shaped?
  ○ Explain which measure of central tendency is best to use to represent the center of the distribution based on its skew.

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

7. Descriptive Statistics: Points Scored By Your Team in Away Games

In the Python script, you calculated descriptive statistics on the points scored by your team in games played at the opponent’s venue (away). These included the mean, median, variance, and standard deviation for the relative skill of the assigned team.

See Step 7 in the Python script to address the following items:

● Summarize all statistics in a formatted table as shown below. Use one row for each statistic. You will need to add rows to the table in order to include all of your statistics.

Table 3. Descriptive Statistics for Points Scored by Your Team in Away Games

Statistic Name                   Value
Statistic (for example, Mean)    X.XX

*Round off to 2 decimal places.

● Interpret each statistic in detail and explain what it represents in this scenario.
● Use the mean and the median to describe the distribution of points scored by your team in away games.
  a. Describe the skew: Is it left, right, or bell-shaped?
  b. Explain which measure of central tendency is best to use to represent the center of the distribution based on its skew.
● Is your team performing better in games played at home than those played away? Use the mean and the standard deviation to answer this question. What can be deduced by comparing the standard deviation of points scored in home games and points scored in away games?

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

8. Confidence Intervals for the Average Relative Skill of All Teams in Your Team’s Years

In the Python script, you calculated a 95% confidence interval for the average relative skill of all teams in the league during the years of your team. Additionally, you calculated the probability that a given team in the league has a relative skill level less than that of the team that you picked.

See Step 8 in the Python script to address the following items:

● Report the confidence interval in a formatted table as shown below.

Table 4. Confidence Interval for Average Relative Skill of Teams in Your Team’s Years

Confidence Level (%)      Confidence Interval
XX% (for example, 95%)    (X.XX, X.XX)

*Round off to 2 decimal places.

● Describe how confidence intervals are generally used in estimating the measures of central tendency for a population.
● Provide a detailed interpretation of the confidence interval in terms of the average relative skill of teams in the range of years that you picked.
● How would your interval be different if you had used a different confidence level?
● What is the probability that a given team in the league has a relative skill level less than that of the team that you picked? Is it unusual that a team has a skill level less than your team?

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

9. Confidence Intervals for the Average Relative Skill of All Teams in the Assigned Team’s Years

In the Python script, you calculated a 95% confidence interval for the average relative skill of all teams in the league during the years of the assigned team. Additionally, you calculated the probability that a given team in the league has a relative skill level less than that of the assigned team.

See Step 9 in the Python script to address the following items:

● Report the confidence interval in a formatted table as shown below.

Table 5. Confidence Interval for Average Relative Skill of Teams in Assigned Team’s Years

Confidence Level (%)      Confidence Interval
XX% (for example, 95%)    (X.XX, X.XX)

*Round off to 2 decimal places.

● Provide a detailed interpretation of the confidence interval in terms of the average relative skill of teams in the assigned team’s range of years.
● Discuss how your interval would be different if you had used a different confidence level.
● How does this confidence interval compare with the previous one? What does this signify in terms of the average relative skill of teams in the range of years that you picked versus the average relative skill of teams in the assigned team’s range of years?

Answer the questions in a paragraph response. Remove all questions and this note (but not the table) before submitting! Do not include Python code in your report.

10. Conclusion

Describe the results of your statistical analyses clearly, using proper descriptions of statistical terms and concepts.

● What is the practical importance of the analyses that were performed?
● Describe what these results mean for the scenario.

Answer the questions in a paragraph response. Remove all questions and this note before submitting! Do not include Python code in your report.

11. Citations

You were not required to use external resources for this report. If you did not use any resources, you should remove this entire section. However, if you did use any resources to help you with your interpretation, you must cite them. Use proper APA format for citations.

Insert references here in the following format:

Author’s Last Name, First Initial. Middle Initial. (Year of Publication). Title of book: Subtitle of book, edition. Place of Publication: Publisher.

Project One: Data Visualization, Descriptive Statistics, Confidence Intervals

This notebook contains the step-by-step directions for Project One. It is very important to run through the steps in order. Some steps depend on the outputs of earlier steps. Once you have completed the steps in this notebook, be sure to write your summary report.

You are a data analyst for a basketball team and have access to a large set of historical data that you can use to analyze performance patterns. The coach of the team and your management have requested that you use descriptive statistics and data visualization techniques to study distributions of key performance metrics that are included in the data set. These data-driven analytics will help make key decisions to improve the performance of the team. You will use the Python programming language to perform the statistical analyses and then prepare a report of your findings to present for the team’s management. Since the managers are not data analysts, you will need to interpret your findings and describe their practical implications.

There are four important variables in the data set that you will study in Project One.

Variable   What does it represent?
pts        Points scored by the team in a game
elo_n      A measure of the relative skill level of the team in the league
year_id    Year when the team played the games
fran_id    Name of the NBA team

The ELO rating, represented by the variable elo_n, is used as a measure of the relative skill of a team. This measure is inferred based on the final score of a game, the game location, and the outcome of the game relative to the probability of that outcome. The higher the number, the higher the relative skill of a team.

In addition to studying data on your own team, your management has assigned you a second team so that you can compare its performance with your own team’s.

Team            What does it represent?
Your Team       This is the team that has hired you as an analyst. This is the team that you will pick below. See Step 2.
Assigned Team   This is the team that the management has assigned to you to compare against your team. See Step 1.

Reminder: It may be beneficial to review the summary report template for Project One prior to starting this Python script. That will give you an idea of the questions you will need to answer with the outputs of this script.

Step 1: Data Preparation & the Assigned Team

This step uploads the data set from a CSV file. It also selects the assigned team for this analysis. Do not make any changes to the code block below.

1. The assigned team is the Chicago Bulls from the years 1996-1998.

Click the block of code below and hit the Run button above.

import numpy as np
import pandas as pd
import scipy.stats as st
import matplotlib.pyplot as plt
from IPython.display import display, HTML

# Load the full data set and keep regular-season NBA games only.
nba_orig_df = pd.read_csv('nbaallelo.csv')
nba_orig_df = nba_orig_df[(nba_orig_df['lg_id']=='NBA') & (nba_orig_df['is_playoffs']==0)]
columns_to_keep = ['game_id','year_id','fran_id','pts','opp_pts','elo_n','opp_elo_n', 'game_location', 'game_result']
nba_orig_df = nba_orig_df[columns_to_keep]

# The dataframe for the assigned team is called assigned_team_df.
# The assigned team is the Chicago Bulls from 1996-1998.
assigned_years_league_df = nba_orig_df[(nba_orig_df['year_id'].between(1996, 1998))]
assigned_team_df = assigned_years_league_df[(assigned_years_league_df['fran_id']=='Bulls')]
assigned_team_df = assigned_team_df.reset_index(drop=True)

display(HTML(assigned_team_df.head().to_html()))
print("printed only the first five observations...")
print("Number of rows in the data set =", len(assigned_team_df))

   game_id       year_id  fran_id  pts  opp_pts  elo_n      opp_elo_n  game_location  game_result
0  199511030CHI  1996     Bulls    105  91       1598.2924  1531.7449  H              W
1  199511040CHI  1996     Bulls    107  85       1604.3940  1458.6415  H              W
2  199511070CHI  1996     Bulls    117  108      1605.7983  1310.9349  H              W
3  199511090CLE  1996     Bulls    106  88       1618.8701  1452.8268  A              W
4  199511110CHI  1996     Bulls    110  106      1621.1591  1490.2861  H              W

printed only the first five observations…

Number of rows in the data set = 246
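The filtering in Step 1 combines boolean masks with between(). The sketch below replays the same pattern on a tiny hand-made table so the effect of each mask is easy to see. The rows and the names toy_df, league_df and team_df are invented for illustration and are not taken from nbaallelo.csv.

```python
import pandas as pd

# Hypothetical toy rows (not from nbaallelo.csv), using the same column names.
toy_df = pd.DataFrame({
    'year_id': [1995, 1996, 1997, 1998, 1998, 1999],
    'fran_id': ['Bulls', 'Bulls', 'Knicks', 'Bulls', 'Suns', 'Bulls'],
    'pts':     [99, 105, 92, 110, 101, 88],
})

# between() is inclusive on both ends by default, so 1996 and 1998 are kept.
years_mask = toy_df['year_id'].between(1996, 1998)
league_df = toy_df[years_mask]

# A second mask narrows the league-wide rows down to a single franchise,
# and reset_index(drop=True) renumbers the surviving rows from 0.
team_df = league_df[league_df['fran_id'] == 'Bulls'].reset_index(drop=True)

print(len(league_df), "league rows;", len(team_df), "team rows")
```

Keeping the league-wide frame separate from the team frame matters later: the confidence intervals are computed on the league-wide rows, while the descriptive statistics use the team rows.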

Step 2: Pick Your Team

In this step, you will pick your team. The range of years that you will study for your team is 2013-2015. Make the following edits to the code block below:

1. Replace ??TEAM?? with your choice of team from one of the following team names.
Bucks, Bulls, Cavaliers, Celtics, Clippers, Grizzlies, Hawks, Heat, Jazz, Kings, Knicks, Lakers, Magic, Mavericks, Nets, Nuggets, Pacers, Pelicans, Pistons, Raptors, Rockets, Sixers, Spurs, Suns, Thunder, Timberwolves, Trailblazers, Warriors, Wizards
Remember to enter the team name within single quotes. For example, if you picked the Suns, then ??TEAM?? should be replaced with 'Suns'.

After you are done with your edits, click the block of code below and hit the Run button above.

# Range of years: 2013-2015 (Note: The line below selects ALL teams within the
# three-year period 2013-2015. This is not your team's dataframe.)
your_years_leagues_df = nba_orig_df[(nba_orig_df['year_id'].between(2013, 2015))]

# The dataframe for your team is called your_team_df.
# ---- TODO: make your edits here ----
your_team_df = your_years_leagues_df[(your_years_leagues_df['fran_id']=='Lakers')]
your_team_df = your_team_df.reset_index(drop=True)

display(HTML(your_team_df.head().to_html()))
print("printed only the first five observations...")
print("Number of rows in the data set =", len(your_team_df))

   game_id       year_id  fran_id  pts  opp_pts  elo_n      opp_elo_n  game_location  game_result
0  201210300LAL  2013     Lakers   91   99       1541.7585  1533.9297  H              L
1  201210310POR  2013     Lakers   106  116      1531.7184  1460.7015  A              L
2  201211020LAL  2013     Lakers   95   105      1518.7981  1580.8679  H              L
3  201211040LAL  2013     Lakers   108  79       1527.5927  1409.0566  H              W
4  201211070UTA  2013     Lakers   86   95       1521.1603  1535.9674  A              L

printed only the first five observations…

Number of rows in the data set = 246

Step 3: Data Visualization: Points Scored by Your Team

The coach has requested that you provide a visual that shows the distribution of points scored by your team in the years 2013-2015. The code below provides two possible options. Pick ONE of these two plots to include in your summary report. Choose the plot that you think provides the best visual for the distribution of points scored by your team. In your summary report, you must explain why you think your visual is the best choice.

Click the block of code below and hit the Run button above.

NOTE: If the plots are not created, click the code section and hit the Run button again.

import seaborn as sns

# Histogram
fig, ax = plt.subplots()
plt.hist(your_team_df['pts'], bins=20)
plt.title('Histogram of points scored by Your Team in 2013 to 2015', fontsize=18)
ax.set_xlabel('Points')
ax.set_ylabel('Frequency')
plt.show()
print("")

# Scatterplot (x and y are passed by keyword, which recent seaborn releases require)
plt.title('Scatterplot of points scored by Your Team in 2013 to 2015', fontsize=18)
sns.regplot(x=your_team_df['year_id'], y=your_team_df['pts'], ci=None)
plt.show()

Step 4: Data Visualization: Points Scored by the Assigned Team

The coach has also requested that you provide a visual that shows a distribution of points scored by the Bulls from years 1996-1998. The code below provides two possible options. Pick ONE of these two plots to include in your summary report. Choose the plot that you think provides the best visual for the distribution of points scored by the assigned team. In your summary report, you will explain why you think your visual is the best choice.

Click the block of code below and hit the Run button above.

NOTE: If the plots are not created, click the code section and hit the Run button again.

import seaborn as sns

# Histogram
fig, ax = plt.subplots()
plt.hist(assigned_team_df['pts'], bins=20)
plt.title('Histogram of points scored by the Bulls in 1996 to 1998', fontsize=18)
ax.set_xlabel('Points')
ax.set_ylabel('Frequency')
plt.show()

# Scatterplot (x and y are passed by keyword, which recent seaborn releases require)
plt.title('Scatterplot of points scored by the Bulls in 1996 to 1998', fontsize=18)
sns.regplot(x=assigned_team_df['year_id'], y=assigned_team_df['pts'], ci=None)
plt.show()

Step 5: Data Visualization: Comparing the Two Teams

Now the coach wants you to prepare one plot that provides a visual of the differences in the distribution of points scored by the assigned team and your team. The code below provides two possible visuals. Choose the plot that allows for the best comparison of the data distributions.

Click the block of code below and hit the Run button above.

NOTE: If the plots are not created, click the code section and hit the Run button again.

import seaborn as sns

# Side-by-side boxplots
both_teams_df = pd.concat((assigned_team_df, your_team_df))
plt.title('Boxplot to compare points distribution', fontsize=18)
sns.boxplot(x='fran_id', y='pts', data=both_teams_df)
plt.show()
print("")

# Histograms (overlaid, semi-transparent so both teams stay visible)
fig, ax = plt.subplots()
plt.hist(assigned_team_df['pts'], 20, alpha=0.5, label='Assigned Team')
plt.hist(your_team_df['pts'], 20, alpha=0.5, label='Your Team')
plt.title('Histogram to compare points distribution', fontsize=18)
plt.xlabel('Points')
plt.legend(loc='upper right')
plt.show()
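A boxplot draws the median and quartiles of each group, so the same comparison can also be read off numerically. This is a minimal sketch, assuming two small made-up samples in place of the 'pts' columns of assigned_team_df and your_team_df. The name both_df and the values are hypothetical.

```python
import pandas as pd

# Hypothetical points totals standing in for the two teams' 'pts' columns.
both_df = pd.DataFrame({
    'fran_id': ['Bulls'] * 5 + ['Lakers'] * 5,
    'pts':     [98, 103, 105, 110, 117, 86, 91, 95, 106, 108],
})

# Quartiles per team: the numbers a side-by-side boxplot draws as the
# box edges (25% and 75%) and the line inside the box (50%, the median).
summary = both_df.groupby('fran_id')['pts'].quantile([0.25, 0.5, 0.75]).unstack()

# Interquartile range: the height of each box, a measure of spread.
summary['iqr'] = summary[0.75] - summary[0.25]
print(summary)
```

A higher median line indicates a team that typically scores more, while a taller box (larger interquartile range) indicates less consistent scoring.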

Step 6: Descriptive Statistics: Relative Skill of Your Team

The management of your team wants you to run descriptive statistics on the relative skill of your team from 2013-2015. In this project, you will use the variable 'elo_n' to represent the relative skill of the teams. Calculate descriptive statistics including the mean, median, variance, and standard deviation for the relative skill of your team. Make the following edits to the code block below:

1. Replace ??MEAN_FUNCTION?? with the name of the Python function that calculates the mean.
2. Replace ??MEDIAN_FUNCTION?? with the name of the Python function that calculates the median.
3. Replace ??VAR_FUNCTION?? with the name of the Python function that calculates the variance.
4. Replace ??STD_FUNCTION?? with the name of the Python function that calculates the standard deviation.

After you are done with your edits, click the block of code below and hit the Run button above.

print("Your Team's Relative Skill in 2013 to 2015")
print("-------------------------------------------------------")

# ---- TODO: make your edits here ----
mean = your_team_df['elo_n'].mean()
median = your_team_df['elo_n'].median()
variance = your_team_df['elo_n'].var()
stdeviation = your_team_df['elo_n'].std()

print('Mean =', round(mean,2))
print('Median =', round(median,2))
print('Variance =', round(variance,2))
print('Standard Deviation =', round(stdeviation,2))

Your Team's Relative Skill in 2013 to 2015
-------------------------------------------------------
Mean = 1440.49
Median = 1412.34
Variance = 6337.75
Standard Deviation = 79.61
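The summary report asks for the skew to be described from the mean and the median. A common rule of thumb is that a mean above the median suggests a right skew and a mean below the median suggests a left skew. The sketch below encodes that rule. The function name skew_direction and its tolerance parameter are hypothetical helpers, not part of the notebook, and the rule is only a heuristic that should be checked against the histogram.

```python
def skew_direction(mean, median, tolerance=0.0):
    """Rule-of-thumb skew label from the gap between mean and median."""
    if mean - median > tolerance:
        return 'right'   # a long upper tail pulls the mean above the median
    if median - mean > tolerance:
        return 'left'    # a long lower tail pulls the mean below the median
    return 'roughly symmetric'

# Mean and median reported in the Step 6 output.
print(skew_direction(1440.49, 1412.34))
```

When the distribution is skewed, the median is usually the safer summary of the center because it is less affected by extreme values.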

Step 7: Descriptive Statistics: Relative Skill of the Assigned Team

The management also wants you to run descriptive statistics for the relative skill of the Bulls from 1996-1998. Calculate descriptive statistics including the mean, median, variance, and standard deviation for the relative skill of the assigned team.

You are to write this code block yourself.

Use Step 6 to help you write this code block. Here is some information that will help you write this code block.

1. The dataframe for the assigned team is called assigned_team_df.
2. The variable 'elo_n' represents the relative skill of the teams.
3. Your statistics should be rounded to two decimal places.

Write your code in the code block section below. After you are done, click this block of code and hit the Run button above. Reach out to your instructor if you need more help with this step.

print("Assigned Team's Relative Skill in 1996 to 1998")
print("------------------------------------------------------")

mean_ = assigned_team_df['elo_n'].mean()
median_ = assigned_team_df['elo_n'].median()
variance_ = assigned_team_df['elo_n'].var()
stdeviation_ = assigned_team_df['elo_n'].std()

print("Mean = ", round(mean_,2))
print("Median = ", round(median_,2))
print("Variance = ", round(variance_,2))
print("Standard Deviation = ", round(stdeviation_,2))

Assigned Team's Relative Skill in 1996 to 1998
------------------------------------------------------
Mean = 1739.8
Median = 1751.23
Variance = 2651.55
Standard Deviation = 51.49

Step 8: Confidence Intervals for the Average Relative Skill of All Teams in Your Team’s Years

The management wants you to calculate a 95% confidence interval for the average relative skill of all teams in 2013-2015. To construct a confidence interval, you will need the mean and standard error of the relative skill level in these years. The code block below calculates the mean and the standard deviation. Your edits will calculate the standard error and the confidence interval. Make the following edits to the code block below:

1. Replace ??SD_VARIABLE?? with the variable name representing the standard deviation of relative skill of all teams from your years. (Hint: the standard deviation variable is in the code block below)
2. Replace ??CL?? with the confidence level of the confidence interval.
3. Replace ??MEAN_VARIABLE?? with the variable name representing the mean relative skill of all teams from your years. (Hint: the mean variable is in the code block below)
4. Replace ??SE_VARIABLE?? with the variable name representing the standard error. (Hint: the standard error variable is in the code block below)

The management also wants you to calculate the probability that a team in the league has a relative skill level less than that of the team that you picked. Assuming that the relative skill of teams is Normally distributed, Python methods for a Normal distribution can be used to answer this question. The code block below uses two of these Python methods. Your task is to identify the correct Python method and report the probability.

After you are done with your edits, click the block of code below and hit the Run button above.

print("Confidence Interval for Average Relative Skill in the years 2013 to 2015")
print("------------------------------------------------------------------------------------------------------------")

# Mean relative skill of all teams from the years 2013-2015
mean = your_years_leagues_df['elo_n'].mean()

# Standard deviation of the relative skill of all teams from the years 2013-2015
stdev = your_years_leagues_df['elo_n'].std()
n = len(your_years_leagues_df)

# Confidence interval
# ---- TODO: make your edits here ----
stderr = stdev/(n ** 0.5)
conf_int_95 = st.norm.interval(0.95, mean, stderr)

print("95% confidence interval (unrounded) for Average Relative Skill (ELO) in the years 2013 to 2015 =", conf_int_95)
print("95% confidence interval (rounded) for Average Relative Skill (ELO) in the years 2013 to 2015 = (", round(conf_int_95[0], 2),",", round(conf_int_95[1], 2),")")

print("\n")
print("Probability a team has Average Relative Skill LESS than the Average Relative Skill (ELO) of your team in the years 2013 to 2015")
print("----------------------------------------------------------------------------------------------------------------------------------------------------------")

mean_elo_your_team = your_team_df['elo_n'].mean()
choice1 = st.norm.sf(mean_elo_your_team, mean, stdev)
choice2 = st.norm.cdf(mean_elo_your_team, mean, stdev)

# Pick the correct answer.
print("Which of the two choices is correct?")
print("Choice 1 =", round(choice1,4))
print("Choice 2 =", round(choice2,4))

Confidence Interval for Average Relative Skill in the years 2013 to 2015
------------------------------------------------------------------------------------------------------------
95% confidence interval (unrounded) for Average Relative Skill (ELO) in the years 2013 to 2015 = (1502.0236894390478, 1507.1824625533618)
95% confidence interval (rounded) for Average Relative Skill (ELO) in the years 2013 to 2015 = ( 1502.02 , 1507.18 )

Probability a team has Average Relative Skill LESS than the Average Relative Skill (ELO) of your team in the years 2013 to 2015
----------------------------------------------------------------------------------------------------------------------------------------------------------
Which of the two choices is correct?
Choice 1 = 0.7147
Choice 2 = 0.2853
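For a Normal distribution, cdf(x) is the probability of a value below x and sf(x) is the probability of a value above x, so the two always sum to 1. The sketch below shows this with made-up numbers. league_mean, league_sd and team_mean are illustrative values, not taken from the data set.

```python
import scipy.stats as st

# Hypothetical league-wide parameters, for illustration only.
league_mean = 1500.0
league_sd = 100.0
team_mean = 1440.0   # a team rated below the league mean

p_below = st.norm.cdf(team_mean, league_mean, league_sd)  # P(X < team_mean)
p_above = st.norm.sf(team_mean, league_mean, league_sd)   # P(X > team_mean)

print(round(p_below, 4), round(p_above, 4))
print("sum =", round(p_below + p_above, 4))
```

A quick sanity check follows from this: a team whose mean rating sits below the league mean must have a "less than" probability under 0.5, so the smaller of the two printed values is the one produced by cdf.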

Step 9: Confidence Intervals for the Average Relative Skill of All Teams in the Assigned Team’s Years

The management also wants you to calculate a 95% confidence interval for the average relative skill of all teams in the years 1996-1998. Calculate this confidence interval.

You are to write this code block yourself.

Use Step 8 to help you write this code block. Here is some information that will help you write this code block. Reach out to your instructor if you need help.

1. The dataframe for the years 1996-1998 is called assigned_years_league_df.
2. The variable 'elo_n' represents the relative skill of teams.
3. Start by calculating the mean and the standard deviation of relative skill (ELO) in years 1996-1998.
4. Calculate n that represents the sample size.
5. Calculate the standard error which is equal to the standard deviation of Relative Skill (ELO) divided by the square root of the sample size n.
6. Assuming that the population standard deviation is known, use Python methods for the Normal distribution to calculate the confidence interval.
7. Your statistics should be rounded to two decimal places.
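Items 4 to 6 amount to the usual Normal-based interval for a mean. The notation here is introduced for illustration and does not appear in the notebook: $\bar{x}$ is the sample mean, $s$ the standard deviation, $n$ the sample size, and $z^{*}$ the Normal critical value (about 1.96 at the 95% level).

```latex
\mathrm{SE} = \frac{s}{\sqrt{n}}, \qquad
\mathrm{CI} = \left(\bar{x} - z^{*}\,\mathrm{SE},\;\; \bar{x} + z^{*}\,\mathrm{SE}\right)
```

Because n sits under a square root in the denominator, a league-wide sample with thousands of games gives a small standard error and therefore a narrow interval.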

The management also wants you to calculate the probability that a team had a relative skill level less than the Bulls in years 1996-1998. Assuming that the relative skill of teams is Normally distributed, calculate this probability.

You are to write this code block yourself.

Use Step 8 to help you write this code block. Here is some information that will help you write this code block.

1. Calculate the mean relative skill of the Bulls. Note that the dataframe for the Bulls is called assigned_team_df. The variable 'elo_n' represents the relative skill.
2. Use Python methods for a Normal distribution to calculate this probability.
3. The probability value should be rounded to four decimal places.

Write your code in the code block section below. After you are done, click this block of code and hit the Run button above. Reach out to your instructor if you need more help with this step.

print("Confidence Interval for Average Relative Skill in the years 1996 to 1998")
print("------------------------------------------------------------------------------------------------------------")

# Mean relative skill of all teams from the years 1996-1998
mean = assigned_years_league_df['elo_n'].mean()

# Standard deviation of the relative skill of all teams from the years 1996-1998
stdev = assigned_years_league_df['elo_n'].std()
n = len(assigned_years_league_df)

# Confidence interval
# ---- TODO: make your edits here ----
stderr = stdev/(n ** 0.5)
conf_int_95 = st.norm.interval(0.95, mean, stderr)

print("95% confidence interval (unrounded) for Average Relative Skill (ELO) in the years 1996 to 1998 =", conf_int_95)
print("95% confidence interval (rounded) for Average Relative Skill (ELO) in the years 1996 to 1998 = (", round(conf_int_95[0], 2),",", round(conf_int_95[1], 2),")")

print("\n")
print("Probability a team has Average Relative Skill LESS than the Average Relative Skill (ELO) of Bulls in the years 1996 to 1998")
print("----------------------------------------------------------------------------------------------------------------------------------------------------------")

mean_elo_assigned_team = assigned_team_df['elo_n'].mean()
# cdf gives the probability of a value below the Bulls' mean rating.
answer1 = st.norm.cdf(mean_elo_assigned_team, mean, stdev)

print("Answer =", round(answer1,4))

Confidence Interval for Average Relative Skill in the years 1996 to 1998
------------------------------------------------------------------------------------------------------------
95% confidence interval (unrounded) for Average Relative Skill (ELO) in the years 1996 to 1998 = (1487.6565859527095, 1493.6465501840999)
95% confidence interval (rounded) for Average Relative Skill (ELO) in the years 1996 to 1998 = ( 1487.66 , 1493.65 )

Probability a team has Average Relative Skill LESS than the Average Relative Skill (ELO) of Bulls in the years 1996 to 1998
----------------------------------------------------------------------------------------------------------------------------------------------------------
Answer = 0.9732
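The summary report asks how the interval would change at a different confidence level. The sketch below backs the mean and standard error out of the 95% interval printed for 1996 to 1998 and recomputes the interval at 90%, 95% and 99%. The variable names are illustrative, and the approach assumes the same Normal-based interval used in the notebook.

```python
import scipy.stats as st

# 95% interval printed in the Step 9 output (unrounded).
low_95, high_95 = 1487.6565859527095, 1493.6465501840999

# The interval is symmetric, so its midpoint is the mean and its
# half-width equals the 95% critical value times the standard error.
mean = (low_95 + high_95) / 2
stderr = (high_95 - low_95) / 2 / st.norm.ppf(0.975)

# Same mean and standard error, three confidence levels.
widths = {}
for level in (0.90, 0.95, 0.99):
    low, high = st.norm.interval(level, mean, stderr)
    widths[level] = high - low
    print(f"{level:.0%}: ({low:.2f}, {high:.2f})")
```

The interval widens as the confidence level rises: being more certain of capturing the true mean requires a larger margin of error.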

End of Project One

Download the HTML output and submit it with your summary report for Project One. The HTML output can be downloaded by clicking File, then Download as, then HTML. Do not include the Python code within your summary report.
