\hypertarget{chapter-5}{% \section{Chapter 5}\label{chapter-5}} \hypertarget{statistical-inference}{% \subsection{Statistical Inference}\label{statistical-inference}} Statistical inference is the process of drawing conclusions about the entire population based on information from a sample. \hypertarget{parameter-vs.-statistic}{% \subsubsection{Parameter vs.~Statistic}\label{parameter-vs.-statistic}} A parameter is a number that summarizes data from an entire population. A statistic is a number that summarizes data from a sample. \begin{longtable}[]{@{}lll@{}} \toprule() & parameter & statistic \\ \midrule() \endhead mean & \(\mu\) & \(\bar{x}\) \\ standard deviation & \(\sigma\) & \(s\) \\ variance & \(\sigma^2\) & \(s^2\) \\ \bottomrule() \end{longtable} \hypertarget{example}{% \subsubsection{Example}\label{example}} Suppose you were interested in the number of hours that Rowan students spend studying on Sundays. You take a random sample of \(n = 100\) students and the average time they study on Sunday is \(\bar{x}= 3.2\){[}hrs{]}. We use \(\bar{x} = 3.2\){[}hrs{]} as our best estimate for \(\mu\). \hypertarget{variability-of-sample-statistics}{% \subsubsection{Variability of Sample Statistics}\label{variability-of-sample-statistics}} We normally think of a parameter as a fixed value. Sample statistics vary from sample to sample. \hypertarget{sampling-distribution}{% \subsubsection{Sampling Distribution}\label{sampling-distribution}} A sampling distribution is the distribution of sample statistics computed for different samples of the same sample size from the same population. The mean of the sample means is \(\mu\). 
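The claim that the mean of the sample means is \(\mu\) can be checked by simulation. A minimal sketch, assuming a hypothetical normal population (the parameter values below are illustrative, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 3.2, 1.0, 100   # hypothetical population parameters
reps = 10_000                  # number of independent samples drawn

# Each row is one sample of size n; record every sample mean
sample_means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

# Individual sample means vary, but their average is close to mu
print(sample_means.mean())
```

Any single \(\bar{x}\) misses \(\mu\) a little, but the distribution of the sample means is centered at \(\mu\).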
For a random sample of size \(n\), the variance of the sample mean is \[\text{var}(\bar{x}) = {\sigma^2 \over n},\] so the standard error of \(\bar{x}\) is \({\sigma \over \sqrt{n}}\). \hypertarget{central-limit-theorem}{% \subsubsection{Central Limit Theorem}\label{central-limit-theorem}} If \(\bar{x}\) is the mean of a random sample of size \(n\) taken from a population with mean \(\mu\) and finite variance \(\sigma^2\), then the limiting form of the distribution of \[z = {\sqrt{n} (\bar{x} - \mu )\over \sigma}\] as \(n \to \infty\) is the standard normal distribution. This approximation generally holds for \(n \ge 30\). If \(n < 30\), the approximation is still good so long as the population is not too different from a normal distribution. \hypertarget{unbiased-estimator}{% \subsubsection{Unbiased Estimator}\label{unbiased-estimator}} A statistic \(\hat{\theta}\) is said to be an unbiased estimator of the parameter \(\theta\) if: \[E[\hat{\theta}] = \theta\] or, equivalently, \[E[\hat{\theta} - \theta] = 0\] The sample mean: \[\bar{x} = {1\over n} \sum_{i=1}^{n} x_i\] is an unbiased estimator of \(\mu\). 
Proof: \[E[\bar{x}] = E\left[ {1\over n} \sum_{i=1}^n x_i\right]\] \[= {1\over n} E[x_1 + x_2 + x_3 + \cdots + x_n]\] \[= {1\over n} \left[ E[x_1] + E[x_2] + \cdots + E[x_n]\right]\] \[= {1\over n} [\mu + \mu + \cdots + \mu]\] \[= {1\over n} [n\mu] = \mu\] \hypertarget{confidence-interval-for-mu-if-sigma-is-known}{% \subsubsection{\texorpdfstring{Confidence Interval for \(\mu\) if \(\sigma\) is known:}{Confidence Interval for \textbackslash mu if \textbackslash sigma is known:}}\label{confidence-interval-for-mu-if-sigma-is-known}} If our sample size is ``large'', then the CLT tells us that: \[{\sqrt{n} (\bar{x} - \mu) \over \sigma} \sim N(0,1) \text{ as } n \to \infty\] \[1 - \alpha = P\left(-z_{\alpha \over 2} \le {\bar{x} - \mu \over \sigma/\sqrt{n}} \le z_{\alpha \over 2}\right)\] A \(100(1 - \alpha)\%\) confidence interval for \(\mu\) is: \[\bar{x} \pm z_{\alpha \over 2} {\sigma \over \sqrt{n}}\] 90\% CI: \(z_{\alpha \over 2} = 1.645\) 95\% CI: \(z_{\alpha \over 2} = 1.96\) 99\% CI: \(z_{\alpha \over 2} = 2.576\) \hypertarget{example-1}{% \subsubsection{Example}\label{example-1}} In a random sample of 75 Rowan students, the sample mean height was 67 inches. Suppose the population standard deviation is known to be \(\sigma = 7\) inches. Construct a 95\% confidence interval for the mean height of \emph{all} Rowan students. \[\bar{x} \pm z_{\alpha \over 2} {\sigma \over \sqrt{n}}\] \[\bar{x} = 67\] \[z_{\alpha \over 2} = 1.96\] \[\sigma = 7\] \[n = 75\] A 95\% CI for \(\mu\): \[67 \pm 1.96 \left({7\over\sqrt{75}}\right) = (65.4, 68.6)\] \hypertarget{interpretation}{% \paragraph{Interpretation}\label{interpretation}} We are 95\% confident that the mean height of all Rowan students is somewhere between 65.4 and 68.6 inches. From the sample, we found that \(\bar{x} = 67\) inches. The confidence interval says that we are 95\% confident that the fixed but unknown parameter \(\mu\) lies somewhere between 65.4 and 68.6 inches. 
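The interval in the example can be computed directly; a short sketch using \texttt{scipy.stats.norm} with the numbers above:

```python
import numpy as np
from scipy.stats import norm

x_bar, sigma, n = 67, 7, 75        # sample mean, known sigma, sample size
z = norm.ppf(1 - 0.05 / 2)         # z_{alpha/2} for 95% confidence, about 1.96

half_width = z * sigma / np.sqrt(n)
lb, ub = x_bar - half_width, x_bar + half_width
print(round(lb, 1), round(ub, 1))  # matches (65.4, 68.6)
```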
A limitation of the \(z\) confidence interval is that \(\sigma\) is unlikely to be known. \hypertarget{confidence-interval-for-mu-if-sigma-is-unknown}{% \subsubsection{\texorpdfstring{Confidence interval for \(\mu\) if \(\sigma\) is unknown:}{Confidence interval for \textbackslash mu if \textbackslash sigma is unknown:}}\label{confidence-interval-for-mu-if-sigma-is-unknown}} If \(\sigma\) is unknown, we estimate the standard error, \({\sigma \over \sqrt{n}}\), as \({s \over \sqrt{n}}\). When we estimate the standard error, the distribution is no longer normal. Instead, the statistic \[{\bar{x} - \mu \over {s \over \sqrt{n}}}\] follows a t-distribution with \(n-1\) degrees of freedom. A \(100(1 - \alpha)\%\) confidence interval for \(\mu\) when \(\sigma\) is unknown is: \[\bar{x} \pm t^* {s\over \sqrt{n}}\] where \(t^*\) is a critical value chosen from the t-distribution. \(t^*\) varies based on sample size and desired confidence level. \hypertarget{example-2}{% \subsubsection{Example}\label{example-2}} A research engineer for a tire manufacturer is investigating tire life for a new rubber compound and has built 115 tires and tested them to end-of-life in a road test. The sample mean and standard deviation are 60139.7 and 3645.94 kilometers, respectively. Find a 90\% confidence interval for the mean life of all such tires. \[\bar{x} \pm t^* {s\over\sqrt{n}}\] \[\bar{x} = 60139.7\] \[s = 3645.94\] \[n = 115\] \[t^* = \texttt{t\_crit\_value(115, 0.90)} = 1.658\] \[60139.7 \pm 1.658 {3645.94 \over \sqrt{115}} = (59576.0, 60703.4)\] \hypertarget{width-of-a-confidence-interval}{% \subsubsection{Width of a Confidence Interval}\label{width-of-a-confidence-interval}} \[\bar{x} \pm t_{\alpha \over 2} {s \over \sqrt{n}}\] As the sample size increases, the width of the confidence interval decreases, and \(\bar{x}\) becomes a better approximation of \(\mu\): \[\lim_{n\to\infty} {s \over \sqrt{n}} = 0\] \[\lim_{n\to\infty} P(|\bar{x} - \mu| < \varepsilon) = 1\] for any \(\varepsilon > 0\). 
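The tire-life interval can be reproduced with \texttt{scipy.stats.t.interval}; a sketch using the summary statistics from the example (the bounds differ slightly from the hand calculation because the exact critical value is used rather than the rounded 1.658):

```python
import numpy as np
from scipy.stats import t

x_bar, s, n = 60139.7, 3645.94, 115   # summary statistics from the example
se = s / np.sqrt(n)                   # estimated standard error of x_bar

# 90% confidence interval with n - 1 degrees of freedom
lb, ub = t.interval(0.90, df=n - 1, loc=x_bar, scale=se)
print(round(lb, 1), round(ub, 1))
```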
\hypertarget{one-sided-confidence-intervals}{% \subsubsection{One-Sided Confidence Intervals}\label{one-sided-confidence-intervals}} A one-sided confidence interval bounds \(\mu\) on only one side. For example, an upper-bound confidence interval for \(\mu\) is: \[\left(-\infty, \bar{x} + t_\alpha {s \over \sqrt{n}}\right)\] Note that the critical value is \(t_\alpha\) rather than \(t_{\alpha \over 2}\), since all of \(\alpha\) is placed in one tail. \hypertarget{confidence-intervals-in-python}{% \subsubsection{Confidence Intervals in Python}\label{confidence-intervals-in-python}} \begin{Shaded} \begin{Highlighting}[]
\ImportTok{import}\NormalTok{ numpy }\ImportTok{as}\NormalTok{ np}
\ImportTok{import}\NormalTok{ matplotlib.pyplot }\ImportTok{as}\NormalTok{ plt}
\ImportTok{import}\NormalTok{ scipy.stats }\ImportTok{as}\NormalTok{ stat}

\NormalTok{conf\_levels }\OperatorTok{=}\NormalTok{ []}
\NormalTok{iterations }\OperatorTok{=} \DecValTok{100}

\CommentTok{\# t critical value for a two{-}sided interval at the given confidence level}
\KeywordTok{def}\NormalTok{ tvalue(sample\_size, conf\_level):}
    \ControlFlowTok{return}\NormalTok{ stat.t.ppf(}\DecValTok{1} \OperatorTok{{-}}\NormalTok{ (}\DecValTok{1} \OperatorTok{{-}}\NormalTok{ conf\_level)}\OperatorTok{/}\DecValTok{2}\NormalTok{, sample\_size }\OperatorTok{{-}} \DecValTok{1}\NormalTok{)}

\ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(iterations):}
\NormalTok{    sample }\OperatorTok{=}\NormalTok{ np.random.chisquare(df}\OperatorTok{=}\DecValTok{10}\NormalTok{, size}\OperatorTok{=}\DecValTok{100}\NormalTok{)}
\NormalTok{    sample\_mean }\OperatorTok{=}\NormalTok{ np.mean(sample)}
\NormalTok{    std }\OperatorTok{=}\NormalTok{ np.std(sample, ddof}\OperatorTok{=}\DecValTok{1}\NormalTok{)}  \CommentTok{\# ddof=1 gives the sample standard deviation}
\NormalTok{    t\_value }\OperatorTok{=}\NormalTok{ tvalue(}\DecValTok{100}\NormalTok{, }\FloatTok{.95}\NormalTok{)}
\NormalTok{    lb }\OperatorTok{=}\NormalTok{ sample\_mean }\OperatorTok{{-}}\NormalTok{ t\_value}\OperatorTok{*}\NormalTok{(std }\OperatorTok{/}\NormalTok{ np.sqrt(}\DecValTok{100}\NormalTok{))}
\NormalTok{    ub }\OperatorTok{=}\NormalTok{ sample\_mean }\OperatorTok{+}\NormalTok{ t\_value}\OperatorTok{*}\NormalTok{(std }\OperatorTok{/}\NormalTok{ np.sqrt(}\DecValTok{100}\NormalTok{))}
\NormalTok{    conf\_levels.append((lb, ub))}
\NormalTok{plt.figure(figsize}\OperatorTok{=}\NormalTok{(}\DecValTok{15}\NormalTok{,}\DecValTok{5}\NormalTok{))}
\CommentTok{\# The chi{-}square(10) population has true mean 10; intervals that miss it are red}
\ControlFlowTok{for}\NormalTok{ j, (lb, ub) }\KeywordTok{in} \BuiltInTok{enumerate}\NormalTok{(conf\_levels):}
    \ControlFlowTok{if} \DecValTok{10} \OperatorTok{\textless{}}\NormalTok{ lb }\KeywordTok{or} \DecValTok{10} \OperatorTok{\textgreater{}}\NormalTok{ ub:}
\NormalTok{        plt.plot([j, j], [lb, ub], }\StringTok{\textquotesingle{}ro{-}\textquotesingle{}}\NormalTok{)}
    \ControlFlowTok{else}\NormalTok{:}
\NormalTok{        plt.plot([j, j], [lb, ub], }\StringTok{\textquotesingle{}go{-}\textquotesingle{}}\NormalTok{)}
\NormalTok{plt.show()}
\end{Highlighting} \end{Shaded} \includegraphics{ConfidenceInterval.png} \hypertarget{hypothesis-testing}{% \subsection{Hypothesis Testing}\label{hypothesis-testing}} Many problems require that we decide whether to accept or reject a statement about some parameter. \hypertarget{hypothesis}{% \subparagraph{Hypothesis}\label{hypothesis}} A claim that we want to test or investigate. \hypertarget{hypothesis-test}{% \subparagraph{Hypothesis Test}\label{hypothesis-test}} A statistical test that is used to determine whether results from a sample are convincing enough to allow us to conclude something about the population; we use sample evidence to back up claims about a population. \hypertarget{null-hypothesis}{% \subparagraph{Null Hypothesis}\label{null-hypothesis}} The claim that there is no effect or no difference \((H_0)\). \hypertarget{alternative-hypothesis}{% \subparagraph{Alternative Hypothesis}\label{alternative-hypothesis}} The claim for which we seek evidence \((H_a)\). 
\hypertarget{using-h_0-and-h_a}{% \paragraph{\texorpdfstring{Using \(H_0\) and \(H_a\)}{Using H\_0 and H\_a}}\label{using-h_0-and-h_a}} Does the average Rowan student spend more than \$300 each semester on books? In a sample of 226 Rowan students, the mean cost of a student's textbooks was \$344 with a standard deviation of \$106. \(H_0\): \(\mu = 300\). \(H_a\): \(\mu > 300\). \(H_0\) and \(H_a\) are statements about population parameters, not sample statistics. In general, the null hypothesis is a statement of equality \((=)\), while the alternative hypothesis is a statement of inequality \((<, >, \ne)\). \hypertarget{possible-outcomes-of-a-hypothesis-test}{% \paragraph{Possible outcomes of a hypothesis test}\label{possible-outcomes-of-a-hypothesis-test}} \begin{enumerate} \def\labelenumi{\arabic{enumi}.} \tightlist \item Reject the null hypothesis \begin{itemize} \tightlist \item Rejecting \(H_0\) means we have enough evidence to support the alternative hypothesis \end{itemize} \item Fail to reject the null hypothesis \begin{itemize} \tightlist \item Not enough evidence to support the alternative hypothesis \end{itemize} \end{enumerate} \hypertarget{figuring-out-whether-sample-data-is-supported}{% \subsubsection{Figuring Out Whether Sample Data is Supported}\label{figuring-out-whether-sample-data-is-supported}} If we assume that the null hypothesis is true, what is the probability of observing sample data that is as extreme or more extreme than what we observed? In the Rowan example, we found that \(\bar{x} = 344\). \hypertarget{one-sample-t-test-for-a-mean}{% \subsubsection{One-Sample T-test for a Mean}\label{one-sample-t-test-for-a-mean}} To test a hypothesis regarding a single mean, there are two main parametric options: the z-test and the t-test. The z-test requires knowledge of the population standard deviation. Since \(\sigma\) is unlikely to be known, we will use a t-test. 
To test \(H_0\): \(\mu = \mu_0\) against its alternative \(H_a\): \(\mu \ne \mu_0\), use the t-statistic: \[t^* = {\bar{x} - \mu_0 \over {s \over \sqrt{n}}}\] \hypertarget{p-value}{% \subparagraph{P-Value}\label{p-value}} A measure of inconsistency between the null hypothesis and the sample data. \hypertarget{significance-level-alpha}{% \subparagraph{\texorpdfstring{Significance Level \((\alpha)\)}{Significance Level (\textbackslash alpha)}}\label{significance-level-alpha}} The significance level \(\alpha\) of a hypothesis test is the threshold below which a p-value is considered statistically significant evidence against the null hypothesis. Common \(\alpha\) levels are 0.01, 0.05, and 0.10. The lower the \(\alpha\), the stronger the evidence required to reject \(H_0\). If the p-value is less than \(\alpha\), reject \(H_0\); if the p-value is greater than \(\alpha\), fail to reject \(H_0\). \hypertarget{steps-of-a-hypothesis-test}{% \paragraph{Steps of a Hypothesis Test}\label{steps-of-a-hypothesis-test}} \begin{enumerate} \def\labelenumi{\arabic{enumi}.} \tightlist \item State \(H_0\) and \(H_a\) \item Calculate the test statistic \item Find the p-value \item Reject or fail to reject \(H_0\) \item Write the conclusion in the context of the problem \end{enumerate} \hypertarget{example-3}{% \subsubsection{Example}\label{example-3}} A researcher is interested in testing a particular brand of batteries and whether its battery life exceeds 40 hours. A random sample of \(n=70\) batteries has a mean life of \(\bar{x} = 40.5\) hours with \(s = 1.75\) hours. Let \(\alpha = 0.05\). 
\(H_0\): \(\mu = 40\) \(H_a\): \(\mu > 40\) \[t^* = {\bar{x} - \mu_0 \over {s \over \sqrt{n}}}\] \[t^* = {40.5 - 40 \over {1.75 \over \sqrt{70}}} = 2.39\] Find the p-value. Since the test is one-sided: \[P(t_{n-1} \ge t^*) = 0.0097\] \begin{Shaded} \begin{Highlighting}[]
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}} \ImportTok{from}\NormalTok{ scipy.stats }\ImportTok{import}\NormalTok{ t}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ t\_star }\OperatorTok{=} \FloatTok{2.39} \CommentTok{\# The t{-}statistic (named so it does not shadow t)}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ df }\OperatorTok{=} \DecValTok{70} \OperatorTok{{-}} \DecValTok{1} \CommentTok{\# Degrees of freedom = n {-} 1}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ t.sf(t\_star, df)} \CommentTok{\# One{-}sided p{-}value, approximately 0.0097}
\end{Highlighting} \end{Shaded} If in fact \(H_0\) is true, the probability of observing a test statistic that is as extreme or more extreme than \(t^* = 2.39\) is about \(0.0097\). That is to say, the sample is very unlikely to occur under \(H_0\). Since the p-value is less than \(\alpha\), \(H_0\) is rejected. Sample evidence suggests that the mean battery life of this particular brand exceeds 40 hours. \hypertarget{type-1-error}{% \subparagraph{Type 1 Error}\label{type-1-error}} When \(H_0\) is rejected despite it being true. The probability of a type 1 error is \(\alpha\). \hypertarget{type-2-error}{% \subparagraph{Type 2 Error}\label{type-2-error}} When \(H_0\) is not rejected despite it being false. \hypertarget{note}{% \subsubsection{NOTE:}\label{note}} Our group of subjects should be representative of the entire population of interest. Because we cannot impose an experiment on an entire population, we are often forced to examine a small sample, and we hope that the sample statistics, \(\bar{x}\) and \(s^2\), are good estimates of the population parameters, \(\mu\) and \(\sigma^2\). 
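The battery-life test can also be run from raw data with \texttt{scipy.stats.ttest\_1samp}. A sketch on a synthetic dataset constructed to match the example's summary statistics exactly (the data themselves are fabricated for illustration; the \texttt{alternative} keyword requires a reasonably recent SciPy):

```python
import numpy as np
from scipy.stats import ttest_1samp

# Build an illustrative dataset with mean 40.5 and sample sd 1.75 exactly
raw = np.arange(70, dtype=float)
z = (raw - raw.mean()) / raw.std(ddof=1)   # mean 0, sample sd 1
x = 40.5 + 1.75 * z                        # mean 40.5, sample sd 1.75

# One-sided test of H0: mu = 40 against Ha: mu > 40
stat_val, p_value = ttest_1samp(x, popmean=40, alternative='greater')
print(stat_val, p_value)   # t* about 2.39, p about 0.0097
```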
\hypertarget{example-4}{% \subsubsection{Example}\label{example-4}} The effects of caffeine on the body have been well studied. In one experiment, a group of 20 male college students were trained in a particular tapping movement and to tap at a rapid rate. They were randomly divided into caffeine and non-caffeine groups and given approximately 2 cups of coffee (either 200{[}mg{]} of caffeine or decaf). After a two-hour period, the tapping rate was measured. The population of interest is male college-aged students. The question of interest: is the mean tap rate of the caffeinated group different than that of the non-caffeinated group? Let \(\mu_c\) be the mean of the caffeinated group, and \(\mu_d\) be the mean of the decaffeinated group. \(H_0\): \(\mu_c = \mu_d\) \(H_a\): \(\mu_c \ne \mu_d\) \hypertarget{two-sample-t-test}{% \subsection{Two-Sample T-Test}\label{two-sample-t-test}} To test: \(H_0\): \(\mu_1 = \mu_2\) \(H_a\): \(\mu_1 \ne \mu_2\) use the following statistic: \[t^* = {(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2) \over s_p \sqrt{{1\over n_1} + {1\over n_2}}}\] where the pooled variance is: \[s_p^2 = {(n_1 -1)s_1^2 + (n_2 - 1)s_2^2 \over n_1 + n_2 -2}\] Under \(H_0\), \(t^*\) follows a t-distribution with \(n_1 + n_2 -2\) degrees of freedom. Thus, the p-value is \(P(t_{n_1 + n_2 -2} \ge |t^*|)\) for a one-sided test, and twice that for a two-sided test. \hypertarget{assumptions}{% \subparagraph{Assumptions:}\label{assumptions}} The two populations are independently normally distributed with the same variance. \hypertarget{example-5}{% \subsubsection{Example}\label{example-5}} \(H_0\): \(\mu_c = \mu_d\) \(H_a\): \(\mu_c \ne \mu_d\) \[s_p^2 = {(n_1 -1)s_1^2 + (n_2 - 1)s_2^2 \over n_1 + n_2 -2}\] \[s_p^2 = {(10 -1)(5.73) + (10 - 1)(4.9) \over 18} = 5.315\] \[s_p = \sqrt{5.315}\] From the observed sample means (not shown here), the test statistic is \(t^* = 3.394\). Find the p-value: \[2P(t_{n_1 + n_2 - 2} \ge |3.394|) = 0.00326\] Since the p-value \(< \alpha\), we reject \(H_0\). 
Sample evidence suggests that the mean tap rate for the caffeinated group is different than that for the non-caffeinated group. \hypertarget{example-6}{% \subsubsection{Example}\label{example-6}} The thickness of a plastic film in mils on a substrate material is thought to be influenced by the temperature at which the coating is applied. A completely randomized experiment is carried out. 11 substrates are coated at 125\(^\circ\)F, resulting in a sample mean coating thickness of \(\bar{x}_1 = 103.5\) and a sample standard deviation of \(s_1 = 10.2\). Another 13 substrates are coated at 150\(^\circ\)F, where \(\bar{x}_2 = 99.7\) and \(s_2 = 15.1\). It is suspected that raising the process temperature would reduce the mean coating thickness. Does the data support this claim? Use \(\alpha = 0.01\). \begin{longtable}[]{@{}lll@{}} \toprule() & 125\(^\circ\)F & 150\(^\circ\)F \\ \midrule() \endhead \(\bar{x}\) & 103.5 & 99.7 \\ \(s\) & 10.2 & 15.1 \\ \(n\) & 11 & 13 \\ \bottomrule() \end{longtable} \(H_0\): \(\mu_1 = \mu_2\) \(H_a\): \(\mu_2 < \mu_1\) (raising the temperature reduces the mean thickness) \[t^* = {(\bar{x}_2 - \bar{x}_1) \over s_p \sqrt{{1\over n_1} + {1\over n_2}}}\] \[s_p^2 = {(11 - 1)(10.2)^2 + (13-1)(15.1)^2 \over 11 + 13 - 2} = 171.66\] \[s_p = 13.1\] \[t^* = {(99.7 - 103.5) \over 13.1 \sqrt{{1\over11} + {1\over13}}} = -0.71\] Find the p-value: \[P(t_{n_1 + n_2 - 2} \le -0.71) = 0.243\] Since the p-value is greater than \(\alpha\), we fail to reject \(H_0\). That is to say, sample evidence does not suggest that raising the process temperature reduces the mean coating thickness. \hypertarget{practical-vs.-statistical-significance}{% \subsection{Practical vs.~Statistical Significance}\label{practical-vs.-statistical-significance}} More samples are not always better: \begin{itemize} \tightlist \item Waste of resources \item Statistical significance \(\ne\) practical significance \end{itemize} \hypertarget{example-7}{% \subsubsection{Example}\label{example-7}} Consider an SAT score improvement study. 
\$600 study plan: \(x_{11}, x_{12}, \cdots, x_{1n}\) Traditional study plan: \(x_{21}, x_{22}, \cdots, x_{2n}\) Test \(H_0\): \(\mu_1 = \mu_2\) against \(H_a\): \(\mu_1 \ne \mu_2\). Test statistic: \[t^* = {\bar{x}_1 - \bar{x}_2 \over s_p \sqrt{{1\over n_1} + {1\over n_2}}}\] Suppose that \(\mu_1 - \mu_2 = 1\) point. As \(n \to \infty\), \(\bar{x}_1 - \bar{x}_2 \xrightarrow{p} \mu_1 - \mu_2\) and \(s_p^2 \to \sigma^2\), while \(\sqrt{{1\over n_1} + {1\over n_2}} \to 0\). Hence \(|t^*| \to \infty\), and with a large enough sample even this practically meaningless 1-point difference will be declared statistically significant.
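The point can be illustrated by simulation; a sketch assuming hypothetical score distributions (every number below is illustrative). With two million students per group, a true difference of only 1 SAT point produces a tiny p-value, even though the effect is practically meaningless:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
n = 2_000_000                                  # an enormous sample per group

# True means differ by a single point; sd 100 for both groups (illustrative)
study_plan = rng.normal(1051, 100, size=n)     # $600 study plan
traditional = rng.normal(1050, 100, size=n)    # traditional study plan

stat_val, p_value = ttest_ind(study_plan, traditional, equal_var=True)
print(p_value)   # tiny: statistically significant, practically meaningless
```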