\hypertarget{chapter-5}{%
\section{Chapter 5}\label{chapter-5}}

\hypertarget{statistical-inference}{%
\subsection{Statistical Inference}\label{statistical-inference}}

Statistical inference is the process of drawing conclusions about the
entire population based on information from a sample.

\hypertarget{parameter-vs.-statistic}{%
\subsubsection{Parameter vs.~Statistic}\label{parameter-vs.-statistic}}

A parameter is a number that summarizes data from an entire population.

A statistic is a number that summarizes data from a sample.

\begin{longtable}[]{@{}lll@{}}
\toprule()
& parameter & statistic \\
\midrule()
\endhead
mean & \(\mu\) & \(\bar{x}\) \\
standard deviation & \(\sigma\) & \(s\) \\
variance & \(\sigma^2\) & \(s^2\) \\
\bottomrule()
\end{longtable}

\hypertarget{example}{%
\subsubsection{Example}\label{example}}

Suppose you were interested in the number of hours that Rowan students
spend studying on Sundays. You take a random sample of \(n = 100\)
students, and the average time they study on Sunday is
\(\bar{x} = 3.2\){[}hrs{]}.

We use \(\bar{x} = 3.2\){[}hrs{]} as our best estimate for \(\mu\).

\hypertarget{variability-of-sample-statistics}{%
\subsubsection{Variability of Sample
Statistics}\label{variability-of-sample-statistics}}

We normally think of a parameter as a fixed value. Sample statistics
vary from sample to sample.

\hypertarget{sampling-distribution}{%
\subsubsection{Sampling Distribution}\label{sampling-distribution}}

A sampling distribution is the distribution of sample statistics
computed for different samples of the same sample size from the same
population.

The mean of the sample means is \(\mu\). For a random sample of size
\(n\), the variance of the sample mean is:
\[\text{var}(\bar{x}) = {\sigma^2 \over n}\]
so the standard error of the mean is \({\sigma \over \sqrt{n}}\).

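A quick simulation makes this concrete. The sketch below (the
population parameters \(\mu = 50\), \(\sigma = 12\), and \(n = 36\) are
illustrative choices, not values from these notes) draws many samples
and checks that the sample means center on \(\mu\) with standard
deviation near \(\sigma / \sqrt{n}\):

\begin{Shaded}
\begin{Highlighting}[]
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 50, 12, 36

# draw 10,000 samples of size n and record each sample mean
means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)

print(np.mean(means))  # close to mu = 50
print(np.std(means))   # close to sigma/sqrt(n) = 12/6 = 2
\end{Highlighting}
\end{Shaded}
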
\hypertarget{central-limit-theorem}{%
\subsubsection{Central Limit Theorem}\label{central-limit-theorem}}

If \(\bar{x}\) is the mean of a random sample of size \(n\), taken from
a population with mean \(\mu\) and finite variance \(\sigma^2\), then
the limiting form of the distribution of
\[z = {\sqrt{n} (\bar{x} - \mu )\over \sigma}\]
as \(n \to \infty\) is the standard normal distribution. This generally
holds for \(n \ge 30\). If \(n < 30\), the approximation is good so long
as the population is not too different from a normal distribution.

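To see the theorem in action, the following sketch standardizes many
sample means drawn from a skewed population; the exponential population
and \(n = 40\) are illustrative assumptions:

\begin{Shaded}
\begin{Highlighting}[]
import numpy as np

rng = np.random.default_rng(1)
n = 40                 # sample size (at least 30)
mu, sigma = 1.0, 1.0   # an Exponential(1) population is skewed, not normal

# standardize 10,000 sample means: z = sqrt(n) * (xbar - mu) / sigma
xbars = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
z = np.sqrt(n) * (xbars - mu) / sigma

print(np.mean(z), np.std(z))      # approximately 0 and 1
print(np.mean(np.abs(z) < 1.96))  # approximately 0.95, as for N(0, 1)
\end{Highlighting}
\end{Shaded}
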
\hypertarget{unbiased-estimator}{%
\subsubsection{Unbiased Estimator}\label{unbiased-estimator}}

A statistic, \(\hat{\theta}\), is said to be an unbiased estimator of
the parameter, \(\theta\), if: \[E[\hat{\theta}] = \theta\] or
\[E[\hat{\theta} - \theta] = 0\]

The mean: \[\bar{x} = {1\over n} \sum_{i=1}^{n} x_i\] is an unbiased
estimator of \(\mu\).

Proof: \[E[\bar{x}] = E\left[ {1\over n} \sum_{i=1}^n x_i\right]\]
\[= {1\over n} E[x_1 + x_2 + x_3 + \cdots + x_n]\]
\[= {1\over n} \left[ E[x_1] + E[x_2] + \cdots + E[x_n]\right]\]
\[= {1\over n} [\mu + \mu + \cdots + \mu]\]
\[= {1\over n} [n\mu] = \mu\]

\hypertarget{confidence-interval-for-mu-if-sigma-is-known}{%
\subsubsection{\texorpdfstring{Confidence Interval for \(\mu\) if
\(\sigma\) is
known:}{Confidence Interval for \textbackslash mu if \textbackslash sigma is known:}}\label{confidence-interval-for-mu-if-sigma-is-known}}

If our sample size is ``large'', then the CLT tells us that:
\[{\sqrt{n} (\bar{x} - \mu) \over \sigma} \sim N(0,1) \text{ as } n \to \infty\]

\[1 - \alpha = P\left(-z_{\alpha \over 2} \le {\bar{x} - \mu \over \sigma/\sqrt{n}} \le z_{\alpha \over 2}\right)\]

A \(100(1 - \alpha)\%\) confidence interval for \(\mu\) is:
\[\bar{x} \pm z_{\alpha \over 2} {\sigma \over \sqrt{n}}\]

90\% CI: \(z_{\alpha \over 2} = 1.645\)

95\% CI: \(z_{\alpha \over 2} = 1.96\)

99\% CI: \(z_{\alpha \over 2} = 2.576\)

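These critical values come from the standard normal quantile function;
a minimal check with scipy:

\begin{Shaded}
\begin{Highlighting}[]
from scipy.stats import norm

# z critical value: the point with alpha/2 upper-tail area
for conf in (0.90, 0.95, 0.99):
    alpha = 1 - conf
    print(conf, round(norm.ppf(1 - alpha / 2), 3))
# prints 1.645, 1.96, 2.576
\end{Highlighting}
\end{Shaded}
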
\hypertarget{example-1}{%
\subsubsection{Example}\label{example-1}}

In a random sample of 75 Rowan students, the sample mean height was 67
inches. Suppose the population standard deviation is known to be
\(\sigma = 7\) inches. Construct a 95\% confidence interval for the mean
height of \emph{all} Rowan students.

\[\bar{x} \pm z_{\alpha \over 2} {\sigma \over \sqrt{n}}\]
\[\bar{x} = 67\] \[z_{\alpha \over 2} = 1.96\] \[\sigma = 7\] \[n = 75\]

A 95\% CI for \(\mu\):
\[67 \pm 1.96 \left({7\over\sqrt{75}}\right) = (65.4, 68.6)\]

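The same arithmetic in Python (a small sketch of the computation
above):

\begin{Shaded}
\begin{Highlighting}[]
import numpy as np

xbar, z, sigma, n = 67, 1.96, 7, 75
margin = z * sigma / np.sqrt(n)
print(xbar - margin, xbar + margin)  # (65.4, 68.6) after rounding
\end{Highlighting}
\end{Shaded}
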
\hypertarget{interpretation}{%
\paragraph{Interpretation}\label{interpretation}}

We are 95\% confident that the mean height of all Rowan students is
somewhere between 65.4 and 68.6 inches.

From the sample, we found that \(\bar{x} = 67\) inches. Using the
confidence interval, we are saying that we are 95\% confident that
\(\mu\) is somewhere between 65.4 and 68.6 inches.

A limitation of the \(z\) confidence interval is that \(\sigma\) is
unlikely to be known.

\hypertarget{confidence-interval-for-mu-if-sigma-is-unknown}{%
\subsubsection{\texorpdfstring{Confidence interval for \(\mu\) if
\(\sigma\) is
unknown:}{Confidence interval for \textbackslash mu if \textbackslash sigma is unknown:}}\label{confidence-interval-for-mu-if-sigma-is-unknown}}

If \(\sigma\) is unknown, we estimate the standard error,
\({\sigma \over \sqrt{n}}\), by \({s \over \sqrt{n}}\).

When we estimate the standard error, the statistic is no longer normal.
Instead, it follows a t-distribution with \(n-1\) degrees of freedom:
\[{\bar{x} - \mu \over {s \over \sqrt{n}}}\]

A \(100(1 - \alpha)\%\) confidence interval for \(\mu\) when \(\sigma\)
is unknown is: \[\bar{x} \pm t^* {s\over \sqrt{n}}\]

Where \(t^*\) is a critical value chosen from the t-distribution.
\(t^*\) varies with the sample size and the desired confidence level.

\hypertarget{example-2}{%
\subsubsection{Example}\label{example-2}}

A research engineer for a tire manufacturer is investigating tire life
for a new rubber compound and has built 115 tires and tested them to
end-of-life in a road test. The sample mean and standard deviation are
60139.7 and 3645.94 kilometers.

Find a 90\% confidence interval for the mean life of all such tires.
\[\bar{x} \pm t^* {s\over\sqrt{n}}\] \[\bar{x} = 60139.7\]
\[s = 3645.94\] \[n = 115\]
\[t^* = \texttt{t\_crit\_value(115, 0.90)} = 1.658\]
\[60139.7 \pm 1.658 {3645.94 \over \sqrt{115}} = (59576.0, 60703.4)\]

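The helper \texttt{t\_crit\_value} is not defined in these notes; a
plausible sketch of it using scipy, assuming it takes the sample size
and confidence level and returns the two-sided critical value:

\begin{Shaded}
\begin{Highlighting}[]
from scipy.stats import t
import numpy as np

def t_crit_value(sample_size, conf_level):
    # two-sided critical value with sample_size - 1 degrees of freedom
    return t.ppf(1 - (1 - conf_level) / 2, sample_size - 1)

t_star = t_crit_value(115, 0.90)           # about 1.658
margin = t_star * 3645.94 / np.sqrt(115)
print(60139.7 - margin, 60139.7 + margin)  # the 90% CI above
\end{Highlighting}
\end{Shaded}
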
\hypertarget{width-of-a-confidence-interval}{%
\subsubsection{Width of a Confidence
Interval}\label{width-of-a-confidence-interval}}

\[\bar{x} \pm t_{\alpha \over 2} {s \over \sqrt{n}}\] As the sample size
increases, the width of the confidence interval decreases, and
\(\bar{x}\) becomes a better approximation of \(\mu\).
\[\lim_{n\to\infty} {s \over \sqrt{n}} = 0\]
\[\lim_{n\to\infty} P(|\bar{x} - \mu| < \varepsilon) = 1\] Where
\(\varepsilon > 0\).

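A short illustration of the shrinking width (the normal population and
the sample sizes below are illustrative assumptions):

\begin{Shaded}
\begin{Highlighting}[]
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(2)
for n in (10, 100, 1000, 10000):
    sample = rng.normal(50, 12, size=n)
    t_star = t.ppf(0.975, n - 1)
    width = 2 * t_star * sample.std(ddof=1) / np.sqrt(n)
    print(n, round(width, 3))  # width shrinks roughly like 1/sqrt(n)
\end{Highlighting}
\end{Shaded}
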
\hypertarget{one-sided-confidence-intervals}{%
\subsubsection{One-Sided Confidence
Intervals}\label{one-sided-confidence-intervals}}

An upper one-sided confidence interval for \(\mu\) is:
\[\left(-\infty, \bar{x} + t_\alpha {s \over \sqrt{n}}\right)\]
and the corresponding lower one-sided interval is
\(\left(\bar{x} - t_\alpha {s \over \sqrt{n}}, \infty\right)\). Note
that a one-sided interval uses \(t_\alpha\) rather than
\(t_{\alpha \over 2}\).

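Because the bound is one-sided, the quantile is \(1 - \alpha\) rather
than \(1 - \alpha/2\); a brief sketch reusing the tire numbers from the
example above:

\begin{Shaded}
\begin{Highlighting}[]
from scipy.stats import t
import numpy as np

n, xbar, s = 115, 60139.7, 3645.94
t_a = t.ppf(0.90, n - 1)            # one-sided: 1 - alpha, not 1 - alpha/2
print(xbar + t_a * s / np.sqrt(n))  # 90% upper confidence bound
\end{Highlighting}
\end{Shaded}
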
\hypertarget{confidence-intervals-in-python}{%
\subsubsection{Confidence Intervals in
Python}\label{confidence-intervals-in-python}}

\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{import}\NormalTok{ numpy }\ImportTok{as}\NormalTok{ np}
\ImportTok{import}\NormalTok{ matplotlib.pyplot }\ImportTok{as}\NormalTok{ plt}
\ImportTok{import}\NormalTok{ scipy.stats }\ImportTok{as}\NormalTok{ stat}

\NormalTok{conf\_levels }\OperatorTok{=}\NormalTok{ []}
\NormalTok{iterations }\OperatorTok{=} \DecValTok{100}

\KeywordTok{def}\NormalTok{ tvalue(sample\_size, conf\_level):}
\CommentTok{    \# two{-}sided t critical value, sample\_size {-} 1 degrees of freedom}
    \ControlFlowTok{return}\NormalTok{ stat.t.ppf(}\DecValTok{1} \OperatorTok{{-}}\NormalTok{ (}\DecValTok{1} \OperatorTok{{-}}\NormalTok{ conf\_level)}\OperatorTok{/}\DecValTok{2}\NormalTok{, sample\_size }\OperatorTok{{-}} \DecValTok{1}\NormalTok{)}

\ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(iterations):}
\NormalTok{    sample }\OperatorTok{=}\NormalTok{ np.random.chisquare(df}\OperatorTok{=}\DecValTok{10}\NormalTok{, size}\OperatorTok{=}\DecValTok{100}\NormalTok{)}
\NormalTok{    sample\_mean }\OperatorTok{=}\NormalTok{ np.mean(sample)}
\NormalTok{    std }\OperatorTok{=}\NormalTok{ np.std(sample, ddof}\OperatorTok{=}\DecValTok{1}\NormalTok{) }\CommentTok{\# sample standard deviation}
\NormalTok{    t\_value }\OperatorTok{=}\NormalTok{ tvalue(}\DecValTok{100}\NormalTok{, }\FloatTok{.95}\NormalTok{)}
\NormalTok{    lb }\OperatorTok{=}\NormalTok{ sample\_mean }\OperatorTok{{-}}\NormalTok{ t\_value}\OperatorTok{*}\NormalTok{(std }\OperatorTok{/}\NormalTok{ np.sqrt(}\DecValTok{100}\NormalTok{))}
\NormalTok{    ub }\OperatorTok{=}\NormalTok{ sample\_mean }\OperatorTok{+}\NormalTok{ t\_value}\OperatorTok{*}\NormalTok{(std }\OperatorTok{/}\NormalTok{ np.sqrt(}\DecValTok{100}\NormalTok{))}
\NormalTok{    conf\_levels.append((lb, ub))}

\NormalTok{plt.figure(figsize}\OperatorTok{=}\NormalTok{(}\DecValTok{15}\NormalTok{,}\DecValTok{5}\NormalTok{))}

\CommentTok{\# the true mean of a chi{-}square distribution with df=10 is 10;}
\CommentTok{\# color an interval red when it misses the true mean}
\ControlFlowTok{for}\NormalTok{ j, (lb, ub) }\KeywordTok{in} \BuiltInTok{enumerate}\NormalTok{(conf\_levels):}
    \ControlFlowTok{if} \DecValTok{10} \OperatorTok{\textless{}}\NormalTok{ lb }\KeywordTok{or} \DecValTok{10} \OperatorTok{\textgreater{}}\NormalTok{ ub:}
\NormalTok{        plt.plot([j,j], [lb,ub], }\StringTok{\textquotesingle{}o{-}\textquotesingle{}}\NormalTok{, color}\OperatorTok{=}\StringTok{\textquotesingle{}red\textquotesingle{}}\NormalTok{)}
    \ControlFlowTok{else}\NormalTok{:}
\NormalTok{        plt.plot([j,j], [lb,ub], }\StringTok{\textquotesingle{}o{-}\textquotesingle{}}\NormalTok{, color}\OperatorTok{=}\StringTok{\textquotesingle{}green\textquotesingle{}}\NormalTok{)}

\NormalTok{plt.show()}
\end{Highlighting}
\end{Shaded}

\includegraphics{ConfidenceInterval.png}

\hypertarget{hypothesis-testing}{%
\subsection{Hypothesis Testing}\label{hypothesis-testing}}

Many problems require that we decide whether to accept or reject a
statement about some parameter.

\hypertarget{hypothesis}{%
\subparagraph{Hypothesis}\label{hypothesis}}

A claim that we want to test or investigate.

\hypertarget{hypothesis-test}{%
\subparagraph{Hypothesis Test}\label{hypothesis-test}}

A statistical test that is used to determine whether results from a
sample are convincing enough to allow us to conclude something about
the population.

We use sample evidence to back up claims about a population.

\hypertarget{null-hypothesis}{%
\subparagraph{Null Hypothesis}\label{null-hypothesis}}

The claim that there is no effect or no difference \((H_0)\).

\hypertarget{alternative-hypothesis}{%
\subparagraph{Alternative Hypothesis}\label{alternative-hypothesis}}

The claim for which we seek evidence \((H_a)\).

\hypertarget{using-h_0-and-h_a}{%
\paragraph{\texorpdfstring{Using \(H_0\) and
\(H_a\)}{Using H\_0 and H\_a}}\label{using-h_0-and-h_a}}

Does the average Rowan student spend more than \$300 each semester on
books?

In a sample of 226 Rowan students, the mean cost of a student's
textbooks was \$344 with a standard deviation of \$106.

\(H_0\): \(\mu = 300\).

\(H_a\): \(\mu > 300\).

\(H_0\) and \(H_a\) are statements about population parameters, not
sample statistics.

In general, the null hypothesis is a statement of equality \((=)\),
while the alternative hypothesis is a statement of inequality
\((<, >, \ne)\).

\hypertarget{possible-outcomes-of-a-hypothesis-test}{%
\paragraph{Possible outcomes of a hypothesis
test}\label{possible-outcomes-of-a-hypothesis-test}}

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item
  Reject the null hypothesis

  \begin{itemize}
  \tightlist
  \item
    Rejecting \(H_0\) means we have enough evidence to support the
    alternative hypothesis
  \end{itemize}
\item
  Fail to reject the null hypothesis

  \begin{itemize}
  \tightlist
  \item
    Not enough evidence to support the alternative hypothesis
  \end{itemize}
\end{enumerate}

\hypertarget{figuring-out-whether-sample-data-is-supported}{%
\subsubsection{Figuring Out Whether Sample Data is
Supported}\label{figuring-out-whether-sample-data-is-supported}}

If we assume that the null hypothesis is true, what is the probability
of observing sample data that is as extreme or more extreme than what
we observed?

In the Rowan example, we found that \(\bar{x} = 344\).

\hypertarget{one-sample-t-test-for-a-mean}{%
\subsubsection{One-Sample T-test for a
Mean}\label{one-sample-t-test-for-a-mean}}

To test a hypothesis regarding a single mean, there are two main
parametric options:

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item
  z-test
\item
  t-test
\end{enumerate}

The z-test requires knowledge of the population standard deviation.
Since \(\sigma\) is unlikely to be known, we will use a t-test.

To test \(H_0\): \(\mu = \mu_0\) against its alternative \(H_a\):
\(\mu \ne \mu_0\), use the t-statistic:

\[t^* = {\bar{x} - \mu_0 \over {s \over \sqrt{n}}}\]

\hypertarget{p-value}{%
\subparagraph{P-Value}\label{p-value}}

A measure of inconsistency between the null hypothesis and the sample
data.

\hypertarget{significance-level-alpha}{%
\subparagraph{\texorpdfstring{Significance Level
\((\alpha)\)}{Significance Level (\textbackslash alpha)}}\label{significance-level-alpha}}

\(\alpha\) is a threshold for the test of hypothesis: a p-value below
\(\alpha\) is taken as statistically significant evidence against the
null hypothesis.

Common \(\alpha\) levels are 0.01, 0.05, and 0.10.

The lower the \(\alpha\), the stronger the evidence required to reject
\(H_0\). If the p-value is less than \(\alpha\), reject \(H_0\); if the
p-value is greater than \(\alpha\), fail to reject \(H_0\).

\hypertarget{steps-of-a-hypothesis-test}{%
\paragraph{Steps of a Hypothesis
Test}\label{steps-of-a-hypothesis-test}}

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item
  State the \(H_0\) and \(H_a\)
\item
  Calculate the test statistic
\item
  Find the p-value
\item
  Reject or fail to reject \(H_0\)
\item
  Write conclusion in the context of the problem
\end{enumerate}

\hypertarget{example-3}{%
\subsubsection{Example}\label{example-3}}

A researcher is interested in testing a particular brand of batteries
and whether its battery life exceeds 40 hours.

A random sample of \(n=70\) batteries has a mean life of
\(\bar{x} = 40.5\) hours with \(s = 1.75\). Let \(\alpha = 0.05\).

\(H_0\): \(\mu = 40\)

\(H_a\): \(\mu > 40\)

\[t^* = {\bar{x} - \mu_0 \over {s \over \sqrt{n}}}\]
\[t^* = {40.5 - 40 \over {1.75 \over \sqrt{70}}} = 2.39\]

Find the p-value (one-sided):
\[P(t_{n-1} \ge t^*) \approx 0.0098\]

\begin{Shaded}
\begin{Highlighting}[]
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}} \ImportTok{from}\NormalTok{ scipy.stats }\ImportTok{import}\NormalTok{ t}

\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ t\_star }\OperatorTok{=} \FloatTok{2.39} \CommentTok{\# The t{-}score}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ n }\OperatorTok{=} \DecValTok{70} \CommentTok{\# The sample size}

\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ p }\OperatorTok{=}\NormalTok{ t.sf(t\_star, n }\OperatorTok{{-}} \DecValTok{1}\NormalTok{) }\CommentTok{\# one{-}sided p{-}value, n {-} 1 degrees of freedom}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}} \BuiltInTok{round}\NormalTok{(p, }\DecValTok{4}\NormalTok{)}
\FloatTok{0.0098}
\end{Highlighting}
\end{Shaded}

If in fact \(H_0\) is true, the probability of observing a test
statistic that is as extreme or more extreme than \(t^* = 2.39\) is
about \(0.0098\). That is to say, the sample is very unlikely to occur
under \(H_0\). Since the p-value is less than \(\alpha\), \(H_0\) is
rejected.

Sample evidence suggests that the mean battery life of this particular
brand exceeds 40 hours.

\hypertarget{type-1-error}{%
\subparagraph{Type 1 Error}\label{type-1-error}}

When \(H_0\) is rejected despite it being true.

The probability that a type 1 error occurs is \(\alpha\).

\hypertarget{type-2-error}{%
\subparagraph{Type 2 Error}\label{type-2-error}}

When \(H_0\) is not rejected despite it being false.

\hypertarget{note}{%
\subsubsection{NOTE:}\label{note}}

Our group of subjects should be representative of the entire population
of interest.

Because we cannot impose an experiment on an entire population, we are
often forced to examine a small sample, and we hope that the sample
statistics, \(\bar{x}\) and \(s^2\), are good estimates of the
population parameters, \(\mu\) and \(\sigma^2\).

\hypertarget{example-4}{%
\subsubsection{Example}\label{example-4}}

The effects of caffeine on the body have been well studied. In one
experiment, a group of 20 male college students were trained in a
particular tapping movement and taught to tap at a rapid rate. They
were randomly divided into caffeine and non-caffeine groups and given
approximately 2 cups of coffee (either 200{[}mg{]} of caffeine or
decaf). After a two-hour period, the tapping rate was measured.

The population of interest is male college-aged students.

The question of interest: is the mean tap rate of the caffeinated group
different from that of the non-caffeinated group?

Let \(\mu_c\) be the mean of the caffeinated group, and \(\mu_d\) be
the mean of the decaffeinated group.

\(H_0\): \(\mu_c = \mu_d\)

\(H_a\): \(\mu_c \ne \mu_d\)

\hypertarget{two-sample-t-test}{%
\subsection{Two-Sample T-Test}\label{two-sample-t-test}}

To test:

\(H_0\): \(\mu_1 = \mu_2\)

\(H_a\): \(\mu_1 \ne \mu_2\)

Use the following statistic:
\[t^* = {(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2) \over s_p \sqrt{{1\over n_1} + {1\over n_2}}}\]
Where: \[s_p^2 = {(n_1 -1)s_1^2 + (n_2 - 1)s_2^2 \over n_1 + n_2 -2}\]
Where \(t^*\) follows a t-distribution with \(n_1 + n_2 -2\) degrees of
freedom under \(H_0\). Thus, the p-value is
\(P(t_{n_1 + n_2 -2} \ge |t^*|)\) for a one-sided test, and twice that
for a two-sided test.

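In Python, \texttt{scipy.stats.ttest\_ind} with
\texttt{equal\_var=True} implements exactly this pooled statistic. A
sketch with hypothetical tap-rate data (the numbers are made up for
illustration, not taken from the study):

\begin{Shaded}
\begin{Highlighting}[]
from scipy.stats import ttest_ind

# hypothetical tap-rate data for illustration (not from the notes)
caffeine = [246, 248, 250, 252, 248, 250, 246, 248, 245, 250]
decaf    = [242, 245, 244, 248, 247, 248, 242, 244, 246, 242]

# equal_var=True gives the pooled two-sample t-test described above
t_stat, p_value = ttest_ind(caffeine, decaf, equal_var=True)
print(t_stat, p_value)  # two-sided p-value
\end{Highlighting}
\end{Shaded}
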
\hypertarget{assumptions}{%
\subparagraph{Assumptions:}\label{assumptions}}

The two populations are independently normally distributed with the
same variance.

\hypertarget{example-5}{%
\subsubsection{Example}\label{example-5}}

\(H_0\): \(\mu_c = \mu_d\)

\(H_a\): \(\mu_c \ne \mu_d\)

\[s_p^2 = {(n_1 -1)s_1^2 + (n_2 - 1)s_2^2 \over n_1 + n_2 -2}\]
\[s_p^2 = {(10 -1)(5.73) + (10 - 1)(4.9) \over 18} = 5.315\]
\[s_p = \sqrt{5.315}\]

Using the observed group means, the test statistic is \(t^* = 3.394\).

Find the p-value: \[2P(t_{n_1 + n_2 - 2} \ge |3.394|) = 0.00326\] Since
the p-value \(< \alpha\), we reject \(H_0\).

Sample evidence suggests that the mean tap rate for the caffeinated
group is different from that for the non-caffeinated group.

\hypertarget{example-6}{%
\subsubsection{Example}\label{example-6}}

The thickness of a plastic film in mils on a substrate material is
thought to be influenced by the temperature at which the coating is
applied. A completely randomized experiment is carried out. 11
substrates are coated at 125\(^\circ\)F, resulting in a sample mean
coating thickness of \(\bar{x}_1 = 103.5\) and a sample standard
deviation of \(s_1 = 10.2\). Another 13 substrates are coated at
150\(^\circ\)F, where \(\bar{x}_2 = 99.7\) and \(s_2 = 15.1\). It is
suspected that raising the process temperature would reduce the mean
coating thickness. Does the data support this claim? Use
\(\alpha = 0.01\).

\begin{longtable}[]{@{}lll@{}}
\toprule()
& 125\(^\circ\)F & 150\(^\circ\)F \\
\midrule()
\endhead
\(\bar{x}\) & 103.5 & 99.7 \\
\(s\) & 10.2 & 15.1 \\
\(n\) & 11 & 13 \\
\bottomrule()
\end{longtable}

\(H_0\): \(\mu_1 = \mu_2\)

\(H_a\): \(\mu_1 > \mu_2\)

\[t^* = {(\bar{x}_1 - \bar{x}_2) \over s_p \sqrt{{1\over n_1} + {1\over n_2}}}\]
\[s_p^2 = {(11 - 1)(10.2)^2 + (13-1)(15.1)^2 \over 11 + 13 - 2} = 171.66\]
\[s_p = 13.1\]
\[t^* = {(103.5 - 99.7) \over 13.1 \sqrt{{1\over11} + {1\over13}}} = 0.71\]

Find the p-value: \[P(t_{n_1 + n_2 - 2} \ge 0.71) = 0.243\] Since the
p-value is greater than \(\alpha\), we fail to reject \(H_0\). That is
to say, sample evidence does not suggest that raising the process
temperature would reduce the mean coating thickness.

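This can be checked from the summary statistics alone;
\texttt{ttest\_ind\_from\_stats} in scipy pools the variances as above
(here with the one-sided alternative):

\begin{Shaded}
\begin{Highlighting}[]
from scipy.stats import ttest_ind_from_stats

# Example 6: is mean thickness at 125F greater than at 150F?
res = ttest_ind_from_stats(mean1=103.5, std1=10.2, nobs1=11,
                           mean2=99.7, std2=15.1, nobs2=13,
                           equal_var=True, alternative='greater')
print(res.statistic, res.pvalue)  # about 0.71 and 0.243
\end{Highlighting}
\end{Shaded}
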
\hypertarget{practical-vs.-statistical-significance}{%
\subsection{Practical vs.~Statistical
Significance}\label{practical-vs.-statistical-significance}}

More samples is not always better.

\begin{itemize}
\tightlist
\item
  Waste of resources
\item
  Statistical significance \(\ne\) practical significance
\end{itemize}

\hypertarget{example-7}{%
\subsubsection{Example}\label{example-7}}

Consider an SAT score improvement study.

\$600 study plan: \(x_{11}, x_{12}, \cdots, x_{1n}\)

Traditional study plan: \(x_{21}, x_{22}, \cdots, x_{2n}\)

Test for \(H_0\): \(\mu_1 = \mu_2\)

\(H_a\): \(\mu_1 \ne \mu_2\)

Test statistic:
\[t^* = {\bar{x}_1 - \bar{x}_2 \over s_p \sqrt{{1\over n_1} + {1\over n_2}}}\]
Suppose that \(\mu_1 - \mu_2 = 1\) point. As \(n \to \infty\),
\(\bar{x}_1 - \bar{x}_2 \xrightarrow{p} \mu_1 - \mu_2\) and
\(s_p^2 \to \sigma^2\), while
\(\sqrt{{1\over n_1} + {1\over n_2}} \to 0\), so \(t^* \to \infty\) and
the p-value goes to 0. With a large enough sample, even a practically
meaningless 1-point improvement becomes statistically significant.

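A simulation of this effect (the population standard deviation of 100
points and the 1-point true difference are illustrative assumptions):

\begin{Shaded}
\begin{Highlighting}[]
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)
for n in (100, 10_000, 1_000_000):
    a = rng.normal(501, 100, size=n)  # $600 plan: true mean 501
    b = rng.normal(500, 100, size=n)  # traditional plan: true mean 500
    print(n, ttest_ind(a, b, equal_var=True).pvalue)
# for large n the p-value becomes tiny, even though the
# 1-point difference is practically meaningless
\end{Highlighting}
\end{Shaded}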