Binomial Random Variable Correction for Continuity

I will outline how to work a question similar to the two numerically very different versions of part (b). I will use 2500 raisins for 100 loaves, and answer for the average of four loaves. I hope you can use this outline to see how to work whichever problem is of interest.

By the CLT the number of raisins in one loaf is approximately $$X_i \sim \mathsf{Norm}(\mu = 25, \sigma = 4.975).$$ [You should explain how to get the values $\mu$ and $\sigma$ from binomial $n$ and $p.$]
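As a quick check on those two values, here is a minimal R sketch. It assumes, as the setup implies, that each of the 2500 raisins independently ends up in a given loaf with probability 1/100; the variable names are only illustrative.

```r
# Assumed model: each raisin lands in one particular loaf with probability 1/100,
# independently of the others, so the count per loaf is Binomial(n, p).
n <- 2500                        # raisins in the whole batch
p <- 1/100                       # chance a given raisin is in this loaf
mu    <- n * p                   # binomial mean, n*p
sigma <- sqrt(n * p * (1 - p))   # binomial standard deviation, sqrt(n*p*(1-p))
c(mu = mu, sigma = sigma)        # roughly 25 and 4.975
```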

Then the mean of four loaves would be approximately

$$\bar X \sim \mathsf{Norm}(\mu_{\bar X} = 25,\, \sigma_{\bar X} = 4.975/\sqrt{4} = 2.4875).$$ [You should explain how the values $\mu_{\bar X}$ and $\sigma_{\bar X}$ are obtained from the values $\mu$ and $\sigma$ above.]

You seek $P(\bar X > 32) = P(\bar X > 32.5) = P(\bar X \ge 33).$ The continuity correction uses the second of these. [Why?] Then $$P(\bar X > 32.5) = P\left(\frac{\bar X - \mu_{\bar X}}{\sigma_{\bar X}} > \frac{32.5 - 25}{2.4875} \right) = P(Z > 3.015),$$ which you can evaluate using a printed normal table or software. But you should recognize immediately that this probability is rather small. [Why?]
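If you prefer software to a printed table, the standardization above can be checked with a short R sketch. It simply reuses the numbers from this outline; the variable names are my own.

```r
mu_bar    <- 25
sigma_bar <- 4.975 / sqrt(4)          # SD of the mean of four loaves
z <- (32.5 - mu_bar) / sigma_bar      # standardized cutoff, about 3.015
# Upper-tail normal probability beyond the continuity-corrected cutoff 32.5
tail_prob <- pnorm(32.5, mean = mu_bar, sd = sigma_bar, lower.tail = FALSE)
c(z = z, tail_prob = tail_prob)       # tail_prob is on the order of 0.001
```

A cutoff about three standard deviations above the mean leaves very little area in the tail, which is why the answer is small.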

Addendum: OK, it seems you are mainly interested in when to use a continuity correction to improve normal approximations to binomial probabilities. Here are three relevant examples showing the effect of continuity corrections. [I will use R for quick computation, but you could get about the same normal approximations by standardizing and using printed tables.]

Continuity correction crucial for useful answer: $X \sim \mathsf{Binomial}(n = 16, p = 1/2)$ approximated by $\mathsf{Normal}(\mu = 8, \sigma = 2).$ Find $$P(6 \le X \le 9) = P(5.5 < X < 9.5) = P(5 < X < 10).$$ For the discrete binomial random variable $X,$ all three statements give identical results. But the middle form is the one to use for continuity correction. The exact binomial probability $$P(5.5 < X < 9.5) = P(X=6) + P(X=7) + P(X=8) + P(X = 9)\\ = P(X \le 9) - P(X \le 5) = 0.6677$$ to four places. The normal approximation with continuity correction gives $0.6825.$ (You can usually expect normal approximations to be accurate to about two places.) The normal approximations without continuity correction (0.5328 and 0.7745) are quite far from the mark.

    sum(dbinom(6:9, 16, .5))
    ## 0.6676941                   # exact: P(X=6) + P(X=7) + P(X=8) + P(X=9)
    diff(pbinom(c(5,9), 16, .5))
    ## 0.6676941                   # exact: P(X <= 9) - P(X <= 5)
    diff(pnorm(c(6, 9), 8, 2))
    ## 0.5328072                   # botched norm aprx: too small
    diff(pnorm(c(5.5, 9.5), 8, 2))
    ## 0.6824948                   # norm aprx w/ cont corr: closest
    diff(pnorm(c(5, 10), 8, 2))
    ## 0.7745375                   # botched norm aprx: too big

In the figure below, we want the total height of the four binomial bars between the vertical broken lines. The normal approximation with continuity correction includes the area under the normal curve between the two broken lines.

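[Figure: bars of the $\mathsf{Binomial}(16, 1/2)$ probabilities with the approximating normal density curve; two vertical broken lines enclose the four bars for $X = 6, 7, 8, 9.$]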
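A plot of this kind can be sketched in base R roughly as follows. This is my own rough reconstruction, not the code behind the original figure; the broken lines are placed at the continuity-corrected endpoints 5.5 and 9.5.

```r
n <- 16; p <- 1/2
k <- 0:n                                   # possible binomial values
mu <- n * p; sigma <- sqrt(n * p * (1 - p))

# Binomial probabilities as vertical bars
plot(k, dbinom(k, n, p), type = "h", lwd = 3,
     xlab = "x", ylab = "Probability",
     main = "Binomial(16, 1/2) with normal approximation")
# Approximating normal density drawn over the bars
curve(dnorm(x, mu, sigma), add = TRUE)
# Continuity-corrected endpoints for P(6 <= X <= 9)
abline(v = c(5.5, 9.5), lty = "dashed")
```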

Continuity correction important: $X \sim \mathsf{Binomial}(n = 100, p = 1/2)$ approximated by $\mathsf{Normal}(\mu = 50, \sigma = 5).$ Find $$P(40 \le X \le 52) = P(39.5 < X < 52.5) = P(39 < X < 53).$$ The exact binomial probability is $0.6738;$ the normal approximation with continuity correction is $0.6736.$ The approximation with continuity correction is clearly better than the other two (0.6327 and 0.7118).

    sum(dbinom(40:52, 100, .5))
    ## 0.6737502                    # exact binomial probability as sum of PDF values
    diff(pbinom(c(39,52), 100, .5))
    ## 0.6737502                    # exact binom. probability as diff. of two CDF values
    diff(pnorm(c(40, 52), 50, 5))
    ## 0.6326716                    # normal aprx, too small
    diff(pnorm(c(39.5, 52.5), 50, 5))
    ## 0.673598                     # normal aprx with continuity correction. Best.
    diff(pnorm(c(39, 53), 50, 5))
    ## 0.7118434


Continuity correction less important: $X \sim \mathsf{Binomial}(n = 100, p = 1/2)$ approximated by $\mathsf{Normal}(\mu = 50, \sigma = 5).$ Find $$P(40 \le X \le 60) = P(39.5 < X < 60.5) = P(39 < X < 61).$$ The exact binomial probability is $0.9648;$ the normal approximation with continuity correction is $0.9643.$ The normal approximations without the continuity correction (0.9545 and 0.9722) are not as good, but they are not disastrously misleading.

    sum(dbinom(40:60, 100, .5))
    ## 0.9647998                       # exact binomial
    diff(pnorm(c(40, 60), 50, 5))
    ## 0.9544997
    diff(pnorm(c(39.5, 60.5), 50, 5))  # normal aprx with continuity correction. Best
    ## 0.9642712
    diff(pnorm(c(39, 61), 50, 5))
    ## 0.9721931
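If you want to try other intervals, the three comparisons above can be wrapped in one small function. The helper name `compare_cc` and its arguments are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper: compare the exact P(a <= X <= b) for X ~ Binomial(n, p)
# with three normal approximations (inner endpoints, continuity-corrected, outer).
compare_cc <- function(a, b, n, p) {
  mu    <- n * p
  sigma <- sqrt(n * p * (1 - p))
  c(exact     = sum(dbinom(a:b, n, p)),
    no_cc_in  = diff(pnorm(c(a, b), mu, sigma)),              # uses a and b as given
    with_cc   = diff(pnorm(c(a - 0.5, b + 0.5), mu, sigma)),  # continuity correction
    no_cc_out = diff(pnorm(c(a - 1, b + 1), mu, sigma)))      # uses a - 1 and b + 1
}

# The three examples from this answer, one row each
round(rbind("n=16:  6..9"   = compare_cc(6, 9, 16, 0.5),
            "n=100: 40..52" = compare_cc(40, 52, 100, 0.5),
            "n=100: 40..60" = compare_cc(40, 60, 100, 0.5)), 4)
```

Each row should reproduce the figures quoted in the corresponding example, with the `with_cc` column closest to `exact`.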

Notes: (a) It is difficult to give rules of thumb to predict when the continuity correction will be really important, so good practice is always to use it. [Or better yet in applied situations, to use software to get the exact binomial result.]

(b) My examples all use $p = 1/2.$ Usually, the normal approximation to the binomial works best when $p = 1/2,$ because then the binomial distribution is symmetrical, which makes it easier for the symmetrical normal distribution to give a good approximation. When $p$ is far from $1/2,$ normal approximations may be problematic, and the continuity correction may be even more important.

(c) For full disclosure, I have to admit there are quirky cases (especially with $p$ far from $1/2$) in which continuity correction may decrease accuracy, but my opinion is that those are cases in which the normal approximation really shouldn't be used at all.

(d) In some applications when the distribution is taken to be approximately normal and rounding is customary, one does not usually use a continuity correction. For example, if you are taking men's heights, measured in inches, to be $\mathsf{Norm}(\mu = 68, \sigma = 3.5)$ and the problem is to find the probability a randomly chosen man is over 6 feet tall, then most texts wouldn't expect you to compute $P(X > 71.5).$ [Or to worry about $P(X < 0),$ which is technically positive.]
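To experiment with notes (b) and (c), here is a small R sketch for a skewed binomial. The case $n = 25,$ $p = 0.1$ and the cutoff $k = 1$ are my own choices for illustration, not taken from the examples above; change them to explore other situations.

```r
# Illustrative skewed case (values chosen only for this sketch)
n <- 25; p <- 0.1; k <- 1
mu <- n * p; sigma <- sqrt(n * p * (1 - p))   # 2.5 and 1.5

exact   <- pbinom(k, n, p)              # exact P(X <= k)
with_cc <- pnorm(k + 0.5, mu, sigma)    # normal approximation with continuity correction
no_cc   <- pnorm(k, mu, sigma)          # normal approximation without it
round(c(exact = exact, with_cc = with_cc, no_cc = no_cc), 4)
```

In this particular case the corrected value is the closer of the two approximations, yet it still misses the exact probability by more than in the $p = 1/2$ examples, which fits the caution in note (b).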

Picton Frinslazince
