\hypertarget{chapter-4}{%
\section{Chapter 4}\label{chapter-4}}

\hypertarget{expected-value}{%
\subsection{Expected Value}\label{expected-value}}

\hypertarget{definition}{%
\subsubsection{Definition}\label{definition}}

Let \(X\) be a random variable with probability distribution \(f(x)\).
The mean, or expected value, of \(X\) is:

For a discrete distribution:
\[E[X] = \sum\limits_x xf(x)\]

For a continuous distribution:
\[E[X] = \int\limits_{-\infty}^{\infty} xf(x)\,dx\]

Given the data set \(\{1, 2, 3, 3, 5\}\), the mean is:
\[{1+2+3+3+5 \over 5} = 2.8\]
Equivalently, assign each distinct value a probability equal to its
relative frequency:
\[f(x) = \begin{cases}
{1\over5} & x=1 \\
{1\over5} & x=2 \\
{2\over5} & x=3 \\
{1\over5} & x=5
\end{cases}\]

\[\sum\limits_x xf(x) = {1\over5}(1) + {1\over5}(2) + {2\over5}(3) + {1\over5}(5) = 2.8\]

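As a quick check of the weighted sum for the data set \(\{1, 2, 3, 3, 5\}\), the following is a minimal Python sketch that builds \(f(x)\) from relative frequencies and evaluates \(\sum_x xf(x)\) exactly. The helper name \texttt{expected\_value} is illustrative only and does not come from these notes.

```python
from collections import Counter
from fractions import Fraction


def expected_value(data):
    """Return E[X] for the empirical distribution of `data` (hypothetical helper)."""
    counts = Counter(data)
    n = len(data)
    # f(x) is the relative frequency of each distinct value x.
    pmf = {x: Fraction(c, n) for x, c in counts.items()}
    # E[X] = sum over x of x * f(x), kept exact with Fraction.
    return sum(x * p for x, p in pmf.items())


mean = expected_value([1, 2, 3, 3, 5])
print(mean, float(mean))
```

The exact result is \(14/5\), which matches the arithmetic mean of 2.8.
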
\hypertarget{example}{%
\subsubsection{Example}\label{example}}

The probability distribution of a discrete random variable \(X\) is:
\[f(x) = {3 \choose x}\left({1 \over 4}\right)^x\left({3\over4}\right)^{3-x}, \quad x \in \{0, 1, 2, 3\}\]
Find \(E[X]\). Evaluating \(f(x)\) at each value of \(x\):
\[f(x) =
\begin{cases}
{27\over64} \approx 0.422 & x=0 \\
{27\over64} \approx 0.422 & x=1 \\
{9\over64} \approx 0.141 & x=2 \\
{1\over64} \approx 0.016 & x=3
\end{cases}\]

\[E[X] = \sum\limits_x x {3 \choose x}\left({1\over4}\right)^x \left({3\over4}\right)^{3-x}\]
\[E[X] = {27\over64}(0) + {27\over64}(1) + {9\over64}(2) + {1\over64}(3) = {48\over64} = 0.75\]

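The binomial probabilities and the expected value in this example can be checked with a short Python sketch. The function name \texttt{binomial\_pmf} is an assumption made for illustration; exact fractions are used so that no rounding enters the check.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 3, Fraction(1, 4)
pmf = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# E[X] = sum over x of x * f(x)
expected = sum(x * fx for x, fx in pmf.items())

for x, fx in pmf.items():
    print(x, fx, round(float(fx), 3))
print("E[X] =", expected)
```

The result \(E[X] = 3/4\) also agrees with the binomial mean \(np = 3 \cdot {1\over4}\).
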
\hypertarget{example-1}{%
\subsubsection{Example}\label{example-1}}

Let \(X\) be the random variable that denotes the life in hours of a
certain electronic device. The PDF is:
\[f(x) =
\begin{cases}
{20000\over x^3} & x > 100 \\
0 & \text{elsewhere}
\end{cases}\]

Find the expected life of this type of device:
\[E[X] = \int\limits_{-\infty}^{\infty} xf(x)\,dx = \int\limits_{100}^{\infty}x{20000 \over x^3}\,dx = 200 \text{ [hrs]}\]

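The value of the integral for the expected device life can be verified by writing out the antiderivative:
\[\int\limits_{100}^{\infty} {20000 \over x^2}\,dx = \left[-{20000 \over x}\right]_{100}^{\infty} = 0 + {20000 \over 100} = 200\]
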
\textbf{Note:} The second moment is computed the same way:
\[E[X^2] = \int\limits_{-\infty}^{\infty}x^2f(x)\,dx\]

\hypertarget{properties-of-expectations}{%
\subsubsection{Properties of
Expectations}\label{properties-of-expectations}}

For constants \(a\) and \(b\), and random variables \(X\) and \(Y\):
\[E[b] = b\]
\[E[aX] = aE[X]\]
\[E[aX + b] = aE[X] + b\]
\[E[X + Y] = E[X] + E[Y]\]

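For a continuous \(X\) with density \(f(x)\), the property for \(aX + b\) follows directly from the definition of expected value, using the fact that \(f(x)\) integrates to 1:
\[E[aX + b] = \int\limits_{-\infty}^{\infty} (ax + b)f(x)\,dx = a\int\limits_{-\infty}^{\infty} xf(x)\,dx + b\int\limits_{-\infty}^{\infty} f(x)\,dx = aE[X] + b\]
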
\hypertarget{example-2}{%
\subsubsection{Example}\label{example-2}}

Given:
\[f(x) = \begin{cases}
{x^2\over3} & -1 < x < 2 \\
0 & \text{elsewhere}
\end{cases}\]
Find the expected value of \(Y = 4X + 3\):

\[E[Y] = E[4X + 3] = 4E[X] + 3\]
\[E[X] = \int\limits_{-1}^{2} x{x^2 \over 3}\,dx = \int\limits_{-1}^{2} {x^3 \over 3}\,dx = {1\over12}x^4 \Big|_{-1}^{2} = {16 - 1 \over 12} = {5\over4}\]
\[E[Y] = 4\left({5\over4}\right) + 3 = 8\]

\hypertarget{variance-of-a-random-variable}{%
\subsubsection{Variance of a Random
Variable}\label{variance-of-a-random-variable}}

The expected value (mean) is of special importance because it describes
where the probability distribution is centered. However, we also need to
characterize how widely the distribution is spread about that center,
which is what the variance measures.

\hypertarget{definition-1}{%
\subsubsection{Definition}\label{definition-1}}

Let \(X\) be a random variable with probability distribution \(f(x)\)
and mean \(\mu\). The variance of \(X\) is given by:
\[\text{Var}[X] = E[(X-\mu)^2]\]
which is the average squared distance from the mean. This simplifies to:
\[\text{Var}[X] = E[X^2] - E[X]^2\]
\textbf{Note:} Generally,
\[E[X^2] \ne E[X]^2\]

The standard deviation, \(\sigma\), is given by:
\[\sigma = \sqrt{\text{Var}[X]}\]

\textbf{Note:} The variance is a measure of uncertainty (spread) in the
data.

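The simplified form of the variance follows from expanding the square and applying the properties of expectations, with \(\mu = E[X]\) treated as a constant:
\[E[(X-\mu)^2] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - E[X]^2\]
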
\hypertarget{example-3}{%
\subsubsection{Example}\label{example-3}}

The weekly demand for a drinking water product, in thousands of liters,
from a local chain of efficiency stores is a continuous random variable,
\(X\), having the probability density:
\[f(x) = \begin{cases}
2(x-1) & 1 < x < 2 \\
0 & \text{elsewhere}
\end{cases}\]

Find the expected value:
\[E[X] = \int\limits_1^2 x (2(x-1))\,dx = 2\int\limits_1^2 (x^2 - x)\,dx\]
\[E[X] = 2\left[{1\over3}x^3 - {1\over2}x^2\right]_1^2 = {5\over3}\]

Find the variance:
\[\text{Var}[X] = E[X^2] - E[X]^2\]
\[E[X^2] = \int\limits_1^2 2x^2(x-1)\,dx = 2\int\limits_1^2 (x^3 - x^2)\,dx\]
\[E[X^2] = 2\left[{1\over4}x^4 - {1\over3}x^3\right]_1^2 = {17\over6}\]
\[\text{Var}[X] = {17\over6} - \left({5\over3}\right)^2 = {1\over18}\]

Find the standard deviation:
\[\sigma = \sqrt{\text{Var}[X]} = {1\over3\sqrt{2}} = {\sqrt{2}\over6}\]

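Because the density in the water demand example is a polynomial, its moments can be integrated exactly term by term. The sketch below does this with Python's \texttt{fractions} module; the helper \texttt{moment} and the coefficient-list representation are assumptions made for illustration, not part of the notes.

```python
from fractions import Fraction


def moment(coeffs, lower, upper, k):
    """Exact integral of x**k * p(x) over [lower, upper].

    coeffs[i] is the coefficient of x**i in the polynomial density p(x).
    """
    total = Fraction(0)
    for i, c in enumerate(coeffs):
        power = i + k + 1  # antiderivative of x**(i + k) is x**power / power
        total += Fraction(c) * (Fraction(upper) ** power - Fraction(lower) ** power) / power
    return total


density = [-2, 2]  # f(x) = 2(x - 1) = -2 + 2x on 1 < x < 2
total_prob = moment(density, 1, 2, 0)
mean = moment(density, 1, 2, 1)
second_moment = moment(density, 1, 2, 2)
variance = second_moment - mean**2

print(total_prob, mean, second_moment, variance)
```

The zeroth moment confirms that the density integrates to 1, and the remaining values reproduce \(E[X] = 5/3\), \(E[X^2] = 17/6\), and \(\text{Var}[X] = 1/18\).
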
\hypertarget{example-4}{%
\subsubsection{Example}\label{example-4}}

The mean and variance are useful when comparing two or more
distributions.

\begin{longtable}[]{@{}lll@{}}
\toprule()
& Plan 1 & Plan 2 \\
\midrule()
\endhead
Avg Score Improvement & \(+17\) & \(+15\) \\
Standard deviation & \(\pm8\) & \(\pm2\) \\
\bottomrule()
\end{longtable}

Plan 1 has the larger average improvement, but Plan 2 is far more
consistent: its results stay much closer to its mean.

\hypertarget{theorem}{%
\subsubsection{Theorem}\label{theorem}}

If \(X\) has variance \(\text{Var}[X]\), then for any constants \(a\)
and \(b\), \(\text{Var}[aX + b] = a^2\text{Var}[X]\).

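A short justification of the theorem, writing \(\mu = E[X]\) so that \(E[aX + b] = a\mu + b\):
\[\text{Var}[aX + b] = E\left[(aX + b - a\mu - b)^2\right] = E\left[a^2(X - \mu)^2\right] = a^2E\left[(X-\mu)^2\right] = a^2\text{Var}[X]\]
The shift \(b\) cancels, so it has no effect on the spread, while the scale factor \(a\) enters squared.
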
\hypertarget{example-5}{%
\subsubsection{Example}\label{example-5}}

The length of time, in minutes, for an airplane to obtain clearance at a
certain airport is a random variable, \(Y = 3X - 2\), where \(X\) has
the density:
\[f(x) = \begin{cases}
{1\over4} e^{-x/4} & x > 0 \\
0 & \text{elsewhere}
\end{cases}\]

\[E[X] = 4\]
\[\text{Var}[X] = 16\]

Find the mean, variance, and standard deviation of \(Y\):
\[E[Y] = E[3X-2] = 3E[X] - 2 = 10\]
\[\text{Var}[Y] = 3^2\text{Var}[X] = 144\]
\[\sigma_Y = \sqrt{\text{Var}[Y]} = 12\]

\hypertarget{the-exponential-distribution}{%
\subsection{The Exponential
Distribution}\label{the-exponential-distribution}}

The continuous random variable, \(X\), has an exponential distribution
with parameter \(\beta\) if its density function is given by:
\[f(x) = \begin{cases}
{1\over\beta}e^{-x/\beta} & x > 0 \\
0 & \text{elsewhere}
\end{cases}\]

where \(\beta > 0\).

The mean is \(E[X] = \beta\). To show this, start from the definition:
\[E[X] = \int\limits_0^{\infty} x{1\over\beta}e^{-x/\beta}\,dx\]

Aside: the gamma function is
\[\Gamma(Z) = \int\limits_0^\infty x^{Z - 1}e^{-x}\,dx\]
where \(\Gamma(Z) = (Z - 1)!\) for positive integers \(Z\).

\[E[X] = \beta \int\limits_0^\infty \left({x\over\beta}\right)^{(2-1)} e^{-x/\beta} \left({dx\over\beta}\right) = \beta\Gamma(2)\]
\[E[X] = \beta(2-1)! = \beta\]

The variance follows in the same way:
\[\text{Var}[X] = E[X^2] - E[X]^2\]
\[E[X^2] = \int\limits_0^\infty x^2{1\over\beta}e^{-x/\beta}\,dx = \beta^2 \int\limits_0^\infty \left({x\over\beta}\right)^{(3-1)} e^{-x/\beta} \left({dx\over\beta}\right)\]
\[E[X^2] = \beta^2\Gamma(3) = 2\beta^2\]
\[\text{Var}[X] = 2\beta^2 - \beta^2 = \beta^2\]

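The same gamma function argument gives every moment of the exponential distribution at once. With the substitution \(u = x/\beta\), so that \(dx = \beta\,du\), and for a positive integer \(n\):
\[E[X^n] = \int\limits_0^\infty x^n{1\over\beta}e^{-x/\beta}\,dx = \beta^n\int\limits_0^\infty u^{n}e^{-u}\,du = \beta^n\Gamma(n+1) = n!\,\beta^n\]
Setting \(n = 1\) and \(n = 2\) recovers \(E[X] = \beta\) and \(E[X^2] = 2\beta^2\).
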
\hypertarget{application}{%
\paragraph{Application}\label{application}}

Reliability analysis: the time to failure of a certain electronic
component can be modeled by an exponential distribution.

\hypertarget{example-6}{%
\subsubsection{Example}\label{example-6}}

Let \(T\) be the random variable which measures the time to failure, in
years, of a certain electronic component. Suppose \(T\) has an
exponential distribution with \(\beta = 5\).

\[f(x) = \begin{cases}
{1\over5}e^{-x/5} & x > 0 \\
0 & \text{elsewhere}
\end{cases}\]

If 6 of these components are in use, what is the probability that
exactly 3 components are still functioning at the end of 8 years?

First, find the probability that an individual component is still
functioning after 8 years:

\[P(T > 8) = \int\limits_8^\infty {1\over5}e^{-x/5}\,dx = e^{-8/5} \approx 0.2\]

Then, assuming the components fail independently, the number still
functioning out of 6 is binomial with \(n = 6\) and \(p \approx 0.2\):

\[{6 \choose 3}(0.2)^3(0.8)^3 = 0.08192\]

\begin{Shaded}
\begin{Highlighting}[]
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}} \ImportTok{from}\NormalTok{ math }\ImportTok{import}\NormalTok{ comb}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ comb(}\DecValTok{6}\NormalTok{,}\DecValTok{3}\NormalTok{) }\OperatorTok{*} \FloatTok{0.2}\OperatorTok{**}\DecValTok{3} \OperatorTok{*} \FloatTok{0.8}\OperatorTok{**}\DecValTok{3}
\FloatTok{0.08192000000000003}
\end{Highlighting}
\end{Shaded}

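The component reliability calculation rounds the survival probability to 0.2 before applying the binomial formula. As a rough sensitivity check, the sketch below repeats the calculation with the unrounded value \(e^{-8/5}\). The variable names are illustrative, and independence of the components is assumed.

```python
from math import comb, exp

beta = 5.0   # mean time to failure from the example
t = 8.0      # time horizon, in the same units as beta
n, k = 6, 3  # components in use, components still functioning

# Survival probability of one component: P(T > t) = exp(-t / beta).
p = exp(-t / beta)

# Binomial probability that exactly k of n independent components survive.
prob = comb(n, k) * p**k * (1 - p) ** (n - k)

print(round(p, 4), round(prob, 4))
```

With the unrounded survival probability of about 0.2019, the answer comes out near 0.084 rather than 0.082, so the rounding changes the third decimal place.
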
\hypertarget{the-normal-distribution}{%
\subsection{The Normal Distribution}\label{the-normal-distribution}}

The most important continuous probability distribution in the field of
statistics is the normal distribution. It is characterized by 2
parameters, the mean, \(\mu\), and the variance, \(\sigma^2\). The curve
is symmetric about \(\mu\), so:
\[\text{mean} = \text{median} = \text{mode}\]

\[f(x|\mu,\sigma^2) = {1 \over \sqrt{2\pi} \sigma} e^{-{1 \over 2\sigma^2}(x-\mu)^2}\]
\[E[X] = \mu\]
\[\text{Var}[X] = \sigma^2\]

For a normal curve:
\[P(x_1 < X < x_2) = \int\limits_{x_1}^{x_2} f(x)\,dx\]

\hypertarget{definition-2}{%
\subsubsection{Definition}\label{definition-2}}

The distribution of a normal variable with mean 0 and variance 1 is
called a standard normal distribution.

Any normal random variable, \(X\), with mean \(\mu\) and standard
deviation \(\sigma\) is transformed into a standard normal variable,
\(Z\), by:
\[Z = {X - \mu \over \sigma}\]

\hypertarget{example-7}{%
\subsubsection{Example}\label{example-7}}

Given a normal distribution with mean \(\mu = 30\) and standard
deviation \(\sigma = 6\), find the normal curve area to the right of
\(x = 17\).

Transform to standard normal:
\[Z = {17 - 30 \over 6} \approx -2.16\]

That is, \(x = 17\) on a normal distribution with \(\mu = 30\) and
\(\sigma = 6\) is equivalent to \(Z \approx -2.16\) on a normal
distribution with \(\mu = 0\) and \(\sigma = 1\).

\[P(X > 17) = P(Z > -2.16)\]

\[P(Z > -2.16) = 1 - P(Z \le -2.16) = 1 - 0.0154 = 0.9846\]

\begin{Shaded}
\begin{Highlighting}[]
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}} \ImportTok{from}\NormalTok{ scipy.stats }\ImportTok{import}\NormalTok{ norm}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ norm.cdf(}\OperatorTok{{-}}\FloatTok{2.16}\NormalTok{)}
\FloatTok{0.015386334783925445}
\end{Highlighting}
\end{Shaded}

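The area to the right of \(x = 17\) can also be computed without rounding \(Z\), and without SciPy, using \texttt{NormalDist} from Python's standard \texttt{statistics} module. The sketch below is one way to compare the standardized and direct calculations; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 30.0, 6.0
x = 17.0

# Standardize, then use the standard normal CDF.
z = (x - mu) / sigma
area_via_z = 1.0 - NormalDist().cdf(z)

# Same area computed directly on the original scale.
area_direct = 1.0 - NormalDist(mu, sigma).cdf(x)

print(round(z, 4), round(area_via_z, 4), round(area_direct, 4))
```

Both routes give the same area, about 0.985. The small difference from 0.9846 comes from using \(Z = -13/6\) exactly instead of the two-decimal value \(-2.16\).
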
\hypertarget{example-8}{%
\subsubsection{Example}\label{example-8}}

The finished inside diameter of a piston ring is normally distributed
with mean \(\mu = 10\) {[}cm{]} and standard deviation
\(\sigma = 0.03\) {[}cm{]}.

What is the probability that a piston ring will have an inside diameter
between 9.97 {[}cm{]} and 10.03 {[}cm{]}?

\[Z_1 = {9.97 - 10 \over 0.03} = -1\]
\[Z_2 = {10.03 - 10 \over 0.03} = 1\]
\[P(9.97 < X < 10.03) = P(-1 < Z < 1) \approx 0.68\]

\begin{Shaded}
\begin{Highlighting}[]
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}} \ImportTok{from}\NormalTok{ scipy.stats }\ImportTok{import}\NormalTok{ norm}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ norm.cdf(}\DecValTok{1}\NormalTok{) }\OperatorTok{{-}}\NormalTok{ norm.cdf(}\OperatorTok{{-}}\DecValTok{1}\NormalTok{)}
\FloatTok{0.6826894921370859}
\end{Highlighting}
\end{Shaded}