Rowan-Classes/5th-Semester-Fall-2023/Prob-and-Stat-for-ECEs/Notes/Chapter-05.md
2024-02-22 14:23:12 -05:00


Chapter 5

Statistical Inference

Statistical inference is the process of drawing conclusions about the entire population based on information from a sample.

Parameter vs. Statistic

A parameter is a number that summarizes data from an entire population.

A statistic is a number that summarizes data from a sample.

|                    | parameter | statistic |
| ------------------ | --------- | --------- |
| mean               | \mu       | \bar{x}   |
| standard deviation | \sigma    | s         |
| variance           | \sigma^2  | s^2       |

Example

Suppose you were interested in the number of hours that Rowan students spend studying on Sundays. You take a random sample of n = 100 students and the average time they study on Sunday is $\bar{x}= 3.2$[hrs].

We use $\bar{x} = 3.2$[hrs] as our best estimate for \mu.

Variability of Sample Statistics

We normally think of a parameter as a fixed value. Sample statistics vary from sample to sample.

Sampling Distribution

A sampling distribution is the distribution of sample statistics computed for different samples of the same sample size from the same population.

The mean of the sample means is \mu. For a random sample of size n, the variance of the sample mean is:

\text{var}(\bar{x}) = {\sigma^2 \over n}

so the standard error of the mean is \sigma / \sqrt{n}.
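A quick simulation makes this concrete (a sketch using NumPy; the population values \mu = 50, \sigma = 12 and the sample size n = 36 are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 50, 12, 36   # arbitrary population parameters and sample size

# Draw many samples of size n and record each sample mean
sample_means = [rng.normal(mu, sigma, n).mean() for _ in range(20_000)]

print(np.mean(sample_means))  # close to mu
print(np.var(sample_means))   # close to sigma**2 / n = 4
```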

Central Limit Theorem

If \bar{x} is the mean of a random sample of size n taken from a population with mean \mu and finite variance \sigma^2, then the limiting form of the distribution of

z = {\sqrt{n} (\bar{x} - \mu )\over \sigma}

as n \to \infty is the standard normal distribution. The approximation generally holds for n \ge 30. If n < 30, it is still good so long as the population is not too different from a normal distribution.
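To see the CLT at work with a clearly non-normal population (a sketch using NumPy; the exponential population and the sample size n = 50 are my choices):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n = 1.0, 1.0, 50   # exponential(1) has mean 1 and sd 1, but is skewed

# Standardize the mean of each sample exactly as in the CLT statement
z = [np.sqrt(n) * (rng.exponential(mu, n).mean() - mu) / sigma
     for _ in range(20_000)]

print(np.mean(z), np.std(z))  # approximately 0 and 1, as N(0, 1) predicts
```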

Unbiased Estimator

A statistic, \hat{\theta}, is said to be an unbiased estimator of the parameter, \theta, if:

E[\hat{\theta}] = \theta

or

E[\hat{\theta} - \theta] =0

The mean:

\bar{x} = {1\over n} \sum_{i=1}^{n} x_i

is an unbiased estimator of \mu.

Proof:

E[\bar{x}] = E\left[ {1\over n} \sum_{i=1}^n x_i\right]
= {1\over n} E[x_1 + x_2 + x_3 + \cdots + x_n]
= {1\over n} \left[ E[x_1] + E[x_2] + \cdots + E[x_n]\right]
= {1\over n} [\mu + \mu + \cdots + \mu]
= {1\over n} [n\mu] = \mu
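The same idea explains why the sample variance divides by n - 1 rather than n: the n - 1 divisor is what makes s^2 an unbiased estimator of \sigma^2. A quick check (a sketch using NumPy; the normal population with \sigma^2 = 9 and n = 10 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2, n = 9.0, 10   # arbitrary population variance and sample size

biased, unbiased = [], []
for _ in range(50_000):
    x = rng.normal(0, 3, n)
    biased.append(np.var(x))            # divides by n: systematically low
    unbiased.append(np.var(x, ddof=1))  # divides by n - 1: unbiased

print(np.mean(biased))    # close to (n - 1)/n * sigma2 = 8.1
print(np.mean(unbiased))  # close to sigma2 = 9.0
```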

Confidence Interval for \mu if \sigma is known:

If our sample size is "large", then the CLT tells us that:

{\sqrt{n} (\bar{x} - \mu) \over \sigma} \sim N(0,1) \text{ as } n \to \infty
1 - \alpha = P\left(-z_{\alpha \over 2} \le {\bar{x} - \mu \over \sigma/\sqrt{n}} \le z_{\alpha \over 2}\right)

A 100(1 - \alpha)\% confidence interval for \mu is:

\bar{x} \pm z_{\alpha \over 2} {\sigma \over \sqrt{n}}

90% CI: z_{\alpha \over 2} = 1.645

95% CI: z_{\alpha \over 2} = 1.96

99% CI: z_{\alpha \over 2} = 2.576
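These critical values come straight from the standard normal inverse CDF; with SciPy they can be reproduced as follows (a small sketch):

```python
from scipy.stats import norm

for conf in (0.90, 0.95, 0.99):
    alpha = 1 - conf
    # Two-sided critical value: the point leaving alpha/2 in the upper tail
    print(conf, round(norm.ppf(1 - alpha/2), 3))  # 1.645, 1.96, 2.576
```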

Example

In a random sample of 75 Rowan students, the sample mean height was 67 inches. Suppose the population standard deviation is known to be \sigma = 7 inches. Construct a 95% confidence interval for the mean height of all Rowan students.

\bar{x} \pm z_{\alpha \over 2} {\sigma \over \sqrt{n}}
\bar{x} = 67
z_{\alpha \over 2} = 1.96
\sigma = 7
n = 75

A 95% CI for \mu:

67 \pm 1.96 \left({7\over\sqrt{75}}\right) = (65.4, 68.6)

Interpretation

95% confident that the mean height of all Rowan students is somewhere between 65.4 and 68.6 inches.

From the sample, we found that \bar{x} = 67 inches. Using the confidence interval, we are saying that we are 95% confident that \mu is somewhere between 65.4 and 68.6 inches.
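The interval above can be reproduced in a few lines (a sketch using SciPy for the z critical value):

```python
import numpy as np
from scipy.stats import norm

xbar, sigma, n, conf = 67, 7, 75, 0.95
z = norm.ppf(1 - (1 - conf) / 2)   # ≈ 1.96
moe = z * sigma / np.sqrt(n)       # margin of error
print((round(xbar - moe, 1), round(xbar + moe, 1)))  # (65.4, 68.6)
```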

A limitation of the z confidence interval is that \sigma is rarely known in practice.

Confidence interval for \mu if \sigma is unknown:

If \sigma is unknown, we then estimate the standard error, {\sigma \over \sqrt{n}} as {s \over \sqrt{n}}.

When we estimate the standard error, the standardized statistic no longer follows a normal distribution. Instead, it follows a t-distribution with n - 1 degrees of freedom:

{\bar{x} - \mu \over {s \over \sqrt{n}}}

A 100(1 - \alpha)\% confidence interval for \mu when \sigma is unknown is:

\bar{x} \pm t^* {s\over \sqrt{n}}

Where t^* is a critical value chosen from the t-distribution; it varies with the sample size and the desired confidence level.
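A helper like the t_crit_value used below can be sketched with SciPy (the function name t_crit is my own; SciPy's t.ppf does the real work):

```python
from scipy.stats import t

def t_crit(n, conf):
    # Two-sided critical value with n - 1 degrees of freedom
    return t.ppf(1 - (1 - conf) / 2, df=n - 1)

print(t_crit(15, 0.95))    # about 2.145: wider than z = 1.96 for small n
print(t_crit(1000, 0.95))  # approaches 1.96 as n grows
```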

Example

A research engineer for a tire manufacturer is investigating tire life for a new rubber compound and has built 115 tires and tested them to end-of-life in a road test. The sample mean and sample standard deviation are 60139.7 and 3645.94 kilometers, respectively.

Find a 90% confidence interval for the mean life of all such tires.

\bar{x} \pm t^* {s\over\sqrt{n}}
\bar{x} = 60139.7
s = 3645.94
n = 115
t^* = \texttt{t\_crit\_value(115, 0.90)} = 1.658
60139.7 \pm 1.658 {3645.94 \over \sqrt{115}} = (59576.0, 60703.4)
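The same computation in Python (a sketch; SciPy supplies the critical value with df = n - 1 = 114):

```python
import numpy as np
from scipy.stats import t

xbar, s, n, conf = 60139.7, 3645.94, 115, 0.90
tcrit = t.ppf(1 - (1 - conf) / 2, df=n - 1)   # ≈ 1.658
moe = tcrit * s / np.sqrt(n)
print((round(xbar - moe, 1), round(xbar + moe, 1)))
```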

Width of a Confidence Interval

\bar{x} \pm t_{\alpha \over 2} {s \over \sqrt{n}}

As sample size increases the width of the confidence interval decreases, and \bar{x} becomes a better approximation of \mu.

\lim_{n\to\infty} {s \over \sqrt{n}} = 0
\lim_{n\to\infty} P(|\bar{x} - \mu| < \varepsilon) = 1

Where \varepsilon > 0.
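This shrinkage is easy to see empirically (a sketch using NumPy/SciPy; the normal population with mean 100 and standard deviation 15 is an arbitrary choice):

```python
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(3)
widths = []
for n in (10, 100, 1000, 10_000):
    x = rng.normal(100, 15, n)
    tcrit = t.ppf(0.975, df=n - 1)
    widths.append(2 * tcrit * x.std(ddof=1) / np.sqrt(n))   # 95% CI width

print([round(w, 2) for w in widths])  # widths shrink roughly like 1/sqrt(n)
```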

One-Sided Confidence Intervals

A 100(1 - \alpha)\% upper-bound confidence interval for \mu uses t_\alpha rather than t_{\alpha \over 2}:

\left(-\infty, \bar{x} + t_\alpha {s \over \sqrt{n}}\right)

Confidence Intervals in Python

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

intervals = []
iterations = 100

def tvalue(sample_size, conf_level):
    # Two-sided t critical value with n - 1 degrees of freedom
    return stats.t.ppf(1 - (1 - conf_level)/2, sample_size - 1)

for i in range(iterations):
    # chi-square with df = 10 has population mean 10
    sample = np.random.chisquare(df=10, size=100)
    sample_mean = np.mean(sample)
    std = np.std(sample, ddof=1)   # sample standard deviation (n - 1 divisor)
    t_value = tvalue(100, 0.95)
    lb = sample_mean - t_value * (std / np.sqrt(100))
    ub = sample_mean + t_value * (std / np.sqrt(100))
    intervals.append((lb, ub))

plt.figure(figsize=(15, 5))

for j, (lb, ub) in enumerate(intervals):
    # Red if the interval misses the true mean (10), green otherwise;
    # about 95 of the 100 intervals should contain it
    if 10 < lb or 10 > ub:
        plt.plot([j, j], [lb, ub], 'o-', color='red')
    else:
        plt.plot([j, j], [lb, ub], 'o-', color='green')

plt.show()
```

Hypothesis Testing

Many problems require that we decide whether to accept or reject a statement about some parameter.

Hypothesis

A claim that we want to test or investigate

Hypothesis Test

A statistical test that is used to determine whether results from a sample are convincing enough to allow us to conclude something about the population.

Use sample evidence to back up claims about a population

Null Hypothesis

The claim that there is no effect or no difference (H_0).

Alternative Hypothesis

The claim for which we seek evidence (H_a).

Using H_0 and H_a

Does the average Rowan student spend more than $300 each semester on books?

In a sample of 226 Rowan students, the mean cost of a student's textbooks was $344 with a standard deviation of $106.

H_0: \mu = 300.

H_a: \mu > 300.

H_0 and H_a are statements about population parameters, not sample statistics.

In general, the null hypothesis is a statement of equality (=), while the alternative hypothesis is a statement of inequality (<, >, \ne).

Possible outcomes of a hypothesis test

  1. Reject the null hypothesis
    • Rejecting H_0 means we have enough evidence to support the alternative hypothesis
  2. Fail to reject the null hypothesis
    • Not enough evidence to support the alternative hypothesis

Figuring Out Whether Sample Data is Supported

If we assume that the null hypothesis is true, what is the probability of observing sample data that is as extreme or more extreme than what we observed?

In the Rowan example, we found that \bar{x} = 344.

One-Sample T-test for a Mean

To test a hypothesis regarding a single mean, there are two main parametric options:

  1. z-test
  2. t-test

The z-test requires knowledge of the population standard deviation. Since \sigma is unlikely to be known, we will use a t-test.

To test H_0: \mu = \mu_0 against its alternative H_a: \mu \ne \mu_0, use the t-statistic.

t^* = {\bar{x} - \mu_0 \over {s \over \sqrt{n}}}
P-Value

A measure of inconsistency between the null hypothesis and the sample data.

Significance Level (\alpha)

The significance level \alpha is a threshold for the p-value: a p-value below \alpha is taken as statistically significant evidence against the null hypothesis.

Common \alpha levels are 0.01, 0.05, 0.10.

The lower the \alpha, the stronger the evidence required to reject H_0. If the p-value is less than \alpha, reject H_0, but if the p-value is greater than \alpha, fail to reject H_0.

Steps of a Hypothesis Test

  1. State the H_0 and H_a
  2. Calculate the test statistic
  3. Find the p-value
  4. Reject or fail to reject H_0
  5. Write conclusion in the context of the problem

Example

A researcher is interested in testing whether a particular brand of battery has a mean life exceeding 40 hours.

A random sample of n = 70 batteries has a mean life of \bar{x} = 40.5 hours with s = 1.75 hours. Let \alpha = 0.05.

H_0: \mu = 40

H_a: \mu > 40

t^* = {\bar{x} - \mu_0 \over {s \over \sqrt{n}}}
t^* = {40.5 - 40 \over {1.75 \over \sqrt{70}}} = 2.39

Find the p-value:

P(t_{n-1} \ge t^*) \approx 0.0098

```python
from scipy.stats import t

t_star = 2.39   # the t-statistic
n = 70          # the sample size

p_value = t.sf(t_star, df=n - 1)   # one-sided p-value ≈ 0.0098
```

If in fact H_0 is true, the probability of observing a test statistic that is as extreme or more extreme than t^* = 2.39 is about 0.0098. That is to say, the sample is very unlikely to occur under H_0. Since the p-value is less than \alpha, H_0 is rejected.

Sample evidence suggests that the mean battery life of this particular brand exceeds 40 hours.

Type 1 Error

When H_0 is rejected despite it being true.

The probability that a type 1 error occurs is \alpha.

Type 2 Error

When H_0 is not rejected despite it being false.

NOTE:

Our group of subjects should be representative of the entire population of interest.

Because we cannot impose an experiment on an entire population, we are often forced to examine a small sample, and we hope that the sample statistics, \bar{x} and s^2, are good estimates of the population parameters, \mu and \sigma^2.

Example

The effects of caffeine on the body have been well studied. In one experiment, a group of 20 male college students was trained in a particular tapping movement and to tap at a rapid rate. They were randomly divided into caffeine and non-caffeine groups and given approximately 2 cups of coffee (either 200[mg] of caffeine or decaf). After a two-hour period, the tapping rate was measured.

The population of interest is male college-aged students.

The question of interest: is the mean tap rate of the caffeinated group different than that of the non-caffeinated group.

Let \mu_c be the mean of the caffeinated group, and \mu_d be the mean of the decaf (non-caffeinated) group.

H_0: \mu_c = \mu_d

H_a: \mu_c \ne \mu_d

Two-Sample T-Test

To test:

H_0: \mu_1 = \mu_2

H_a: \mu_1 \ne \mu_2

Use the following statistic:

t^* = {(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2) \over s_p \sqrt{{1\over n_1} + {1\over n_2}}}

Where:

s_p^2 = {(n_1 -1)s_1^2 + (n_2 - 1)s_2^2 \over n_1 + n_2 -2}

Under H_0, t^* follows a t-distribution with n_1 + n_2 - 2 degrees of freedom. Thus, the p-value is P(t_{n_1 + n_2 - 2} \ge |t^*|) for a one-sided test, and twice that for a two-sided test.

Assumptions:

The two populations are independently normally distributed with the same variance.
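Putting the two formulas together (a minimal sketch, assuming NumPy/SciPy; the function name pooled_t_test is my own):

```python
import numpy as np
from scipy.stats import t

def pooled_t_test(x1, x2):
    """Two-sided pooled two-sample t-test (equal population variances assumed)."""
    n1, n2 = len(x1), len(x2)
    # Pooled variance, weighting each sample variance by its degrees of freedom
    sp2 = ((n1 - 1) * np.var(x1, ddof=1) + (n2 - 1) * np.var(x2, ddof=1)) \
          / (n1 + n2 - 2)
    t_star = (np.mean(x1) - np.mean(x2)) / np.sqrt(sp2 * (1/n1 + 1/n2))
    p = 2 * t.sf(abs(t_star), df=n1 + n2 - 2)   # two-sided p-value
    return t_star, p
```

This should agree with SciPy's built-in scipy.stats.ttest_ind(x1, x2, equal_var=True).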

Example

H_0: \mu_c = \mu_d

H_a: \mu_c \ne \mu_d

s_p^2 = {(n_1 -1)s_1^2 + (n_2 - 1)s_2^2 \over n_1 + n_2 -2}
s_p^2 = {(10 -1)(5.73) + (10 - 1)(4.9) \over 18} = 5.315
s_p = \sqrt{5.315}

Find the p-value:

2P(t_{n_1 + n_2 - 2} \ge |3.394|) = 0.00326

Since the p-value < \alpha, we reject H_0.

Sample evidence suggests that the mean tap rate for the caffeinated group is different than that for the non-caffeinated group.
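The p-value quoted above can be checked with SciPy (a quick sketch; df = n_1 + n_2 - 2 = 18 for the two groups of 10):

```python
from scipy.stats import t

t_star, df = 3.394, 18          # t-statistic and degrees of freedom
p = 2 * t.sf(abs(t_star), df)   # two-sided p-value
print(round(p, 5))              # about 0.0033
```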

Example

The thickness of a plastic film (in mils) on a substrate material is thought to be influenced by the temperature at which the coating is applied. A completely randomized experiment is carried out. 11 substrates are coated at 125$^\circ$F, resulting in a sample mean coating thickness of \bar{x}_1 = 103.5 and a sample standard deviation of s_1 = 10.2. Another 13 substrates are coated at 150$^\circ$F, where \bar{x}_2 = 99.7 and s_2 = 15.1. It is suspected that raising the process temperature would reduce the mean coating thickness. Does the data support this claim? Use \alpha = 0.01.

|         | 125$^\circ$F | 150$^\circ$F |
| ------- | ------------ | ------------ |
| \bar{x} | 103.5        | 99.7         |
| s       | 10.2         | 15.1         |
| n       | 11           | 13           |

H_0: \mu_1 = \mu_2

H_a: \mu_1 > \mu_2 (the higher temperature, group 2, is suspected to reduce the mean thickness)

t^* = {(\bar{x}_1 - \bar{x}_2) \over s_p \sqrt{{1\over n_1} + {1\over n_2}}}
s_p^2 = {(11 - 1)(10.2)^2 + (13-1)(15.1)^2 \over 11 + 13 - 2} = 171.66
s_p = 13.1
t^* = {103.5 - 99.7 \over 13.1 \sqrt{{1\over11} + {1\over13}}} = 0.71

Find the p-value:

P(t_{n_1 + n_2 - 2} \ge 0.71) = 0.243

Since the p-value is greater than \alpha, we fail to reject H_0. That is to say sample evidence does not suggest that raising the process temperature would reduce the mean coating thickness.
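The arithmetic in this example can be verified in a few lines (a sketch using NumPy/SciPy):

```python
import numpy as np
from scipy.stats import t

xbar1, s1, n1 = 103.5, 10.2, 11   # 125 °F group
xbar2, s2, n2 = 99.7, 15.1, 13    # 150 °F group

sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
t_star = (xbar1 - xbar2) / np.sqrt(sp2 * (1/n1 + 1/n2))
p = t.sf(abs(t_star), df=n1 + n2 - 2)   # one-sided p-value
print(round(t_star, 2), round(p, 3))    # t* ≈ 0.71, p ≈ 0.24
```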

Practical vs. Statistical Significance

More samples is not always better.

  • Waste of resources
  • Statistical significance \ne practical significance

Example

Consider an SAT score improvement study.

$600 study plan: x_{11}, x_{12}, \cdots, x_{1n}

Traditional study plan: x_{21}, x_{22}, \cdots, x_{2n}

Test for H_0: \mu_1 = \mu_2

H_a: \mu_1 \ne \mu_2

Test statistic:

t^* = {\bar{x}_1 - \bar{x}_2 \over s_p \sqrt{{1\over n_1} + {1\over n_2}}}

Suppose the true difference is \mu_1 - \mu_2 = 1 point. As n \to \infty, \bar{x}_1 - \bar{x}_2 \xrightarrow{p} \mu_1 - \mu_2 and s_p^2 \to \sigma^2, so t^* grows without bound and H_0 is eventually rejected, even though a 1-point improvement has no practical value.
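A simulation illustrates the danger (a sketch using NumPy/SciPy; the SAT-like numbers are invented for illustration):

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(4)
n = 1_000_000
# Hypothetical scores: the true improvement is only 1 point on a ~100-point spread
x1 = rng.normal(1001, 100, n)   # $600 study plan
x2 = rng.normal(1000, 100, n)   # traditional study plan

print(ttest_ind(x1, x2).pvalue)   # tiny p-value: statistically significant
print(np.mean(x1) - np.mean(x2))  # ~1 point: practically negligible
```

With a million students per group the test flags the 1-point difference as highly significant, yet the improvement itself is practically worthless.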