\hypertarget{chapter-4}{%
\section{Chapter 4}\label{chapter-4}}

\hypertarget{expected-value}{%
\subsection{Expected Value}\label{expected-value}}

\hypertarget{definition}{%
\subsubsection{Definition}\label{definition}}

Let \(X\) be a random variable with probability distribution \(f(x)\). The mean, or expected value, of \(X\) is defined as follows.

For a discrete distribution:

\[E[X] = \sum\limits_x xf(x)\]

For a continuous distribution:

\[E[X] = \int\limits_{-\infty}^{\infty} xf(x)dx\]

For example, given the data \(\{1, 2, 3, 3, 5\}\), the mean is:

\[{1+2+3+3+5 \over 5} = 2.8\]

Equivalently, treating the relative frequencies as a probability distribution:

\[f(x) = \begin{cases} {1\over5} & x=1 \\ {1\over5} & x=2 \\ {2\over5} & x=3 \\ {1\over5} & x=5 \\ \end{cases}\]

\[\sum\limits_x xf(x) = {1\over5}(1) + {1\over5}(2) + {2\over5}(3) + {1\over5}(5) = 2.8\]

\hypertarget{example}{%
\subsubsection{Example}\label{example}}

The probability distribution of a discrete random variable \(X\) is:

\[f(x) = {3 \choose x}\left({1 \over 4}\right)^x\left({3\over4}\right)^{3-x}, \quad x \in \{0, 1, 2, 3\}\]

Find \(E[X]\). Evaluating \(f(x)\) at each value of \(x\):

\[f(x) = \begin{cases} 0.422 & x=0 \\ 0.422 & x=1 \\ 0.14 & x=2 \\ {1\over64} & x=3 \end{cases}\]

\[E[X] = \sum\limits_x x {3 \choose x}\left({1\over4}\right)^x \left({3\over4}\right)^{3-x}\]

\[E[X] = 0.422(0) + 0.422(1) + 0.14(2) + {1\over64}(3) \approx 0.75\]

\hypertarget{example-1}{%
\subsubsection{Example}\label{example-1}}

Let \(X\) be the random variable that denotes the life in hours of a certain electronic device.
The PDF is:

\[f(x) = \begin{cases} {20000\over x^3} & x > 100 \\ 0 & \text{elsewhere} \end{cases}\]

Find the expected life of this type of device:

\[E[X] = \int\limits_{-\infty}^{\infty} xf(x)dx = \int\limits_{100}^{\infty}x{20000 \over x^3}dx = -{20000 \over x} \Big|_{100}^{\infty} = 200 \text{ [hrs]}\]

\textbf{Note:} The expected value of \(X^2\) is computed the same way:

\[E[X^2] = \int\limits_{-\infty}^{\infty}x^2f(x)dx\]

\hypertarget{properties-of-expectations}{%
\subsubsection{Properties of Expectations}\label{properties-of-expectations}}

\[E[b] = b\]

where \(b\) is a constant.

\[E[aX] = aE[X]\]

where \(a\) is a constant.

\[E[aX + b] = aE[X] + b\]

\[E[X + Y] = E[X] + E[Y]\]

where \(X\) and \(Y\) are random variables.

\hypertarget{example-2}{%
\subsubsection{Example}\label{example-2}}

Given:

\[f(x) = \begin{cases} {x^2\over3} & -1 < x < 2 \\ 0 & \text{elsewhere} \end{cases}\]

Find the expected value of \(Y = 4X + 3\):

\[E[Y] = E[4X + 3] = 4E[X] + 3\]

\[E[X] = \int\limits_{-1}^{2} {x^3 \over 3}dx = {1\over12}x^4 \Big|_{-1}^{2}={5\over4}\]

\[E[Y] = 4\left({5\over4}\right) + 3 = 8\]

\hypertarget{variance-of-a-random-variable}{%
\subsubsection{Variance of a Random Variable}\label{variance-of-a-random-variable}}

The expected value, or mean, is of special importance because it describes where the probability distribution is centered. However, we also need to characterize the spread of the distribution, which is what the variance measures.

\hypertarget{definition-1}{%
\subsubsection{Definition}\label{definition-1}}

Let \(X\) be a random variable with probability distribution \(f(x)\) and mean \(\mu\). The variance of \(X\) is given by:

\[\text{Var}[X] = E[(X-\mu)^2]\]

This is the average squared distance from the mean. It simplifies to:

\[\text{Var}[X] = E[X^2] - E[X]^2\]

\textbf{Note:} Generally,

\[E[X^2] \ne E[X]^2\]

The standard deviation, \(\sigma\), is given by:

\[\sigma = \sqrt{\text{Var}[X]}\]

\textbf{Note:} The variance is a measure of uncertainty (spread) in the data.
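
As a sketch of why the shortcut formula for the variance holds, the definition can be expanded using the linearity properties of expectation listed earlier, writing \(\mu = E[X]\) and treating \(\mu\) as a constant:

\[\begin{aligned} \text{Var}[X] &= E[(X-\mu)^2] = E[X^2 - 2\mu X + \mu^2] \\ &= E[X^2] - 2\mu E[X] + \mu^2 \\ &= E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - E[X]^2 \end{aligned}\]

As a quick check, take the data \(\{1, 2, 3, 3, 5\}\) from the first definition, with mean \(2.8\):

\[E[X^2] = {1+4+9+9+25 \over 5} = 9.6\]

\[\text{Var}[X] = 9.6 - (2.8)^2 = 1.76\]

Averaging the squared deviations from the mean directly gives the same value, \({3.24 + 0.64 + 0.04 + 0.04 + 4.84 \over 5} = 1.76\).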
\hypertarget{example-3}{%
\subsubsection{Example}\label{example-3}}

The weekly demand for a drinking water product, in thousands of liters, from a local chain of efficiency stores is a continuous random variable, \(X\), with the probability density:

\[f(x) = \begin{cases} 2(x-1) & 1 < x < 2 \\ 0 & \text{elsewhere} \end{cases}\]

Find the expected value:

\[E[X] = \int\limits_1^2 x (2(x-1)) dx = 2\int\limits_1^2 (x^2 - x)dx\]

\[E[X] = 2\left[{1\over3}x^3 - {1\over2}x^2 \right]_1^2 = {5\over3}\]

Find the variance:

\[\text{Var}[X] = E[X^2] - E[X]^2\]

\[E[X^2] = \int\limits_1^2 2x^2(x-1)dx = 2\int\limits_1^2 (x^3 - x^2)dx\]

\[E[X^2] = 2\left[{1\over4}x^4 - {1\over3}x^3 \right]_1^2 = {17\over6}\]

\[\text{Var}[X] = {17\over6} - \left({5\over3}\right)^2 = {1\over18}\]

Find the standard deviation:

\[\sigma = \sqrt{\text{Var}[X]} = {1\over3\sqrt{2}} = {\sqrt{2}\over6}\]

\hypertarget{example-4}{%
\subsubsection{Example}\label{example-4}}

The mean and variance are useful when comparing two or more distributions.

\begin{longtable}[]{@{}lll@{}}
\toprule()
& Plan 1 & Plan 2 \\
\midrule()
\endhead
Average score improvement & \(+17\) & \(+15\) \\
Standard deviation & \(\pm8\) & \(\pm2\) \\
\bottomrule()
\end{longtable}

Plan 1 has the higher average improvement, but Plan 2 has a much smaller spread, so its results are more consistent.

\hypertarget{theorem}{%
\subsubsection{Theorem}\label{theorem}}

If \(X\) has variance \(\text{Var}[X]\) and \(a\) and \(b\) are constants, then \(\text{Var}[aX + b] = a^2\text{Var}[X]\).
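
A short justification of the theorem, offered as a sketch that assumes only the definition of variance and the linearity of expectation, with \(\mu = E[X]\): since \(E[aX + b] = a\mu + b\),

\[\begin{aligned} \text{Var}[aX + b] &= E\left[\left(aX + b - (a\mu + b)\right)^2\right] \\ &= E\left[a^2(X - \mu)^2\right] \\ &= a^2E[(X-\mu)^2] = a^2\text{Var}[X] \end{aligned}\]

The constant \(b\) shifts the distribution without changing its spread, so it drops out. The factor \(a\) scales every deviation from the mean, so it enters squared.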
\hypertarget{example-5}{%
\subsubsection{Example}\label{example-5}}

The length of time, in minutes, for an airplane to obtain clearance at a certain airport is a random variable, \(Y = 3X - 2\), where \(X\) has the density:

\[f(x) = \begin{cases} {1\over4} e^{-x/4} & x > 0 \\ 0 & \text{elsewhere} \end{cases}\]

\[E[X] = 4\]

\[\text{Var}[X] = 16\]

Find the mean, variance, and standard deviation of \(Y\):

\[E[Y] = E[3X-2] = 3E[X] - 2 = 10\]

\[\text{Var}[Y] = 3^2\text{Var}[X] = 144\]

\[\sigma = \sqrt{\text{Var}[Y]} = 12\]

\hypertarget{the-exponential-distribution}{%
\subsection{The Exponential Distribution}\label{the-exponential-distribution}}

The continuous random variable, \(X\), has an exponential distribution with parameter \(\beta\) if its density function is given by:

\[f(x) = \begin{cases} {1\over\beta}e^{-x/\beta} & x > 0 \\ 0 & \text{elsewhere} \end{cases}\]

where \(\beta > 0\). Its mean is:

\[E[X] = \beta\]

To derive this, start from the definition:

\[E[X] = \int\limits_0^{\infty} x{1\over\beta}e^{-x/\beta} dx\]

Aside: the gamma function is defined as

\[\Gamma(Z) = \int\limits_0^\infty x^{Z - 1}e^{-x}dx\]

where \(\Gamma(Z) = (Z - 1)!\) for a positive integer \(Z\). Rewriting the integral in terms of \(x/\beta\):

\[E[X] = \beta \int\limits_0^\infty \left({x\over\beta}\right)^{(2-1)} e^{-x/\beta} \left({dx\over\beta}\right) = \beta\Gamma(2)\]

\[E[X] = \beta(2-1)! = \beta\]

The variance follows in the same way:

\[\text{Var}[X] = E[X^2] - E[X]^2\]

\[E[X^2] = \int\limits_0^\infty x^2{1\over\beta}e^{-x/\beta}dx = \beta^2 \int\limits_0^\infty \left({x\over\beta}\right)^{(3-1)} e^{-x/\beta} \left({dx\over\beta}\right)\]

\[E[X^2] = \beta^2\Gamma(3) = 2\beta^2\]

\[\text{Var}[X] = 2\beta^2 - \beta^2 = \beta^2\]

\hypertarget{application}{%
\paragraph{Application}\label{application}}

Reliability analysis: the time to failure of a certain electronic component can be modeled by an exponential distribution.

\hypertarget{example-6}{%
\subsubsection{Example}\label{example-6}}

Let \(T\) be the random variable which measures the time to failure, in years, of a certain electronic component. Suppose \(T\) has an exponential distribution with \(\beta = 5\).
\[f(x) = \begin{cases} {1\over5}e^{-x/5} & x > 0 \\ 0 & \text{elsewhere} \end{cases}\]

If 6 of these components are in use, what is the probability that exactly 3 components are still functioning at the end of 8 years?

First, find the probability that an individual component is still functioning after 8 years:

\[P(T > 8) = \int\limits_8^\infty {1\over5}e^{-x/5}dx = e^{-8/5} \approx 0.2\]

The number of components still functioning then follows a binomial distribution with \(n = 6\) and \(p \approx 0.2\):

\[{6 \choose 3}(0.2)^3(0.8)^3 = 0.08192\]

\begin{Shaded}
\begin{Highlighting}[]
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}} \ImportTok{from}\NormalTok{ math }\ImportTok{import}\NormalTok{ comb}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ comb(}\DecValTok{6}\NormalTok{,}\DecValTok{3}\NormalTok{) }\OperatorTok{*} \FloatTok{0.2}\OperatorTok{**}\DecValTok{3} \OperatorTok{*} \FloatTok{0.8}\OperatorTok{**}\DecValTok{3}
\FloatTok{0.08192000000000003}
\end{Highlighting}
\end{Shaded}

\hypertarget{the-normal-distribution}{%
\subsection{The Normal Distribution}\label{the-normal-distribution}}

The most important continuous probability distribution in the field of statistics is the normal distribution. It is characterized by two parameters: the mean, \(\mu\), and the variance, \(\sigma^2\). Because the distribution is symmetric about \(\mu\):

\[\text{mean} = \text{median} = \text{mode}\]

Its density function is:

\[f(x|\mu,\sigma^2) = {1 \over \sigma\sqrt{2\pi}} e^{-{1 \over 2\sigma^2}(x-\mu)^2}\]

\[E[X] = \mu\]

\[\text{Var}[X] = \sigma^2\]

For a normal curve:

\[P(x_1 < X < x_2) = \int\limits_{x_1}^{x_2} f(x)dx\]

\hypertarget{definition-2}{%
\subsubsection{Definition}\label{definition-2}}

The distribution of a normal variable with mean 0 and variance 1 is called a standard normal distribution. Any normal random variable, \(X\), with mean \(\mu\) and standard deviation \(\sigma\) is transformed into a standard normal variable, \(Z\), by:

\[Z = {X - \mu \over \sigma}\]

\hypertarget{example-7}{%
\subsubsection{Example}\label{example-7}}

Given a normal distribution with mean \(\mu = 30\) and standard deviation \(\sigma = 6\), find the normal curve area to the right of \(x = 17\).
Transform to standard normal (the value is truncated to two decimal places):

\[Z = {17 - 30 \over 6} \approx -2.16\]

That is, \(x = 17\) on a normal distribution with \(\mu = 30\) and \(\sigma = 6\) is equivalent to \(Z=-2.16\) on a normal distribution with \(\mu = 0\) and \(\sigma = 1\).

\[P(X > 17) = P(Z > -2.16)\]

\[P(Z > -2.16) = 1 - P(Z \le -2.16) = 1 - 0.0154 = 0.9846\]

\begin{Shaded}
\begin{Highlighting}[]
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}} \ImportTok{from}\NormalTok{ scipy.stats }\ImportTok{import}\NormalTok{ norm}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ norm.cdf(}\OperatorTok{{-}}\FloatTok{2.16}\NormalTok{)}
\FloatTok{0.015386334783925445}
\end{Highlighting}
\end{Shaded}

\hypertarget{example-8}{%
\subsubsection{Example}\label{example-8}}

The finished inside diameter of a piston ring is normally distributed with mean \(\mu = 10\){[}cm{]} and standard deviation \(\sigma = 0.03\){[}cm{]}. What is the probability that a piston ring will have an inside diameter between 9.97{[}cm{]} and 10.03{[}cm{]}?

\[Z_1 = {9.97 - 10 \over 0.03} = -1\]

\[Z_2 = {10.03 - 10 \over 0.03} = 1\]

\[P(9.97 < X < 10.03) = P(-1 < Z < 1) \approx 0.68\]

\begin{Shaded}
\begin{Highlighting}[]
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}} \ImportTok{from}\NormalTok{ scipy.stats }\ImportTok{import}\NormalTok{ norm}
\OperatorTok{\textgreater{}\textgreater{}\textgreater{}}\NormalTok{ norm.cdf(}\DecValTok{1}\NormalTok{) }\OperatorTok{{-}}\NormalTok{ norm.cdf(}\OperatorTok{{-}}\DecValTok{1}\NormalTok{)}
\FloatTok{0.6826894921370859}
\end{Highlighting}
\end{Shaded}
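
The same probability can be checked by hand from a standard normal table. The notation \(\Phi(z) = P(Z \le z)\) for the standard normal cumulative distribution is an assumption introduced here and is not used elsewhere in these notes. By the symmetry of the normal curve, \(\Phi(-z) = 1 - \Phi(z)\), so:

\[P(-1 < Z < 1) = \Phi(1) - \Phi(-1) = 2\Phi(1) - 1\]

Taking the tabulated value \(\Phi(1) \approx 0.8413\):

\[P(-1 < Z < 1) \approx 2(0.8413) - 1 = 0.6826\]

This agrees with the computed value to the precision of the table.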