Friday, November 18, 2011

Chapter 7 Q22

Q. The annual cost of automobile insurance is $939. Assume that the population standard deviation is $245. Find the probability that a SRS of insurance policies will have a sample mean within $25 of the population mean for sample sizes 30, 50, 100, and 400.

A. The interval runs from 939 – 25 to 939 + 25, or 914 to 964. We can do this in Excel. For the first sample, where n = 30, find the standard error. This is the population standard deviation divided by the square root of the sample size. Where n = 30, the standard error is 245/root 30 = 44.73.

First find the probability from the extreme left of the distribution to 964. In Excel that is =norm.dist(964,939,44.73,true) = 0.71. 

Now find the probability from the extreme left of the distribution to 914. In Excel this is =norm.dist(914,939,44.73,true) = 0.29. 

The final step is to subtract the smaller probability from the larger one. This gives the probability of the area between 914 and 964. This is 0.71 – 0.29 = 0.42.

Follow the same steps with the larger samples. You will see that the probability increases with the sample size. As the sample size increases, the standard error decreases and we are more confident of the location of the unknown population parameter µ. For example, with a sample size of n=400, the probability increases to 0.96.
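If you would like to check all four sample sizes outside Excel, here is a minimal sketch in Python. It assumes Python 3.8 or later, where the standard library's statistics.NormalDist does the job of =norm.dist. The function name prob_within is my own label, not something from the textbook.

```python
from math import sqrt
from statistics import NormalDist

MU = 939      # population mean from the question
SIGMA = 245   # population standard deviation from the question

def prob_within(margin, n, mu=MU, sigma=SIGMA):
    """Probability that the sample mean lands within `margin` of mu."""
    se = sigma / sqrt(n)                 # standard error of the mean
    sampling_dist = NormalDist(mu, se)   # sampling distribution of the sample mean
    # Area to the left of the upper bound minus area to the left of the lower bound
    return sampling_dist.cdf(mu + margin) - sampling_dist.cdf(mu - margin)

# One probability per sample size in the question
results = {n: prob_within(25, n) for n in (30, 50, 100, 400)}
for n, p in results.items():
    print(f"n = {n:3d}  probability = {p:.2f}")
```

The probabilities should climb as n grows, because the standard error in the denominator of the z value keeps shrinking.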

In this question we know µ. But make the intellectual leap to see that we can use the same method to estimate the location of µ if we did not know it. 

Wednesday, November 16, 2011

Chapter 6 Q20

An alert student has noticed a mistake in the textbook answers. Here I've worked through Q20 from Chapter 6.


a. We want the area to the left of 50, because the question asks for fewer than 50 hours. Use =norm.dist(50,77,20,true) to get 0.089.

b. We want more than 100, the area to the right of 100. Excel adds up from the left, so we need to subtract from 1. Use =1-norm.dist(100,77,20,true) to get 0.125.

c. The question asks for the “upper 20%”. We are asked to find the value of the random variable which corresponds to this area. In other words, what is the value of the random variable X which separates the top 20% from the bottom 80%? We can use =norm.inv for this, but we will need to write =norm.inv(0.8,77,20) to get 93.83. If we had written =norm.inv(0.2,77,20) then the result would be the value of the random variable X that separates the bottom 20% from the top 80%. Draw a sketch and make sure you get this. This posting on our blog might help make this clearer.
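The three parts can also be checked with a short Python sketch. This assumes Python 3.8 or later; cdf stands in for =norm.dist(...,true) and inv_cdf stands in for =norm.inv. The variable names are mine, chosen only for illustration.

```python
from statistics import NormalDist

# Mean 77 and standard deviation 20, as used in the Excel formulas
hours = NormalDist(mu=77, sigma=20)

p_fewer_than_50 = hours.cdf(50)         # a. area to the left of 50
p_more_than_100 = 1 - hours.cdf(100)    # b. area to the right of 100
cutoff_top_20 = hours.inv_cdf(0.8)      # c. value with 80% below it and 20% above it
cutoff_bottom_20 = hours.inv_cdf(0.2)   # what you get if you type 0.2 instead of 0.8

print(round(p_fewer_than_50, 3), round(p_more_than_100, 3))
print(round(cutoff_top_20, 2), round(cutoff_bottom_20, 2))
```

Notice that the two cutoffs sit the same distance either side of the mean of 77, which is a handy way to catch the 0.2 versus 0.8 mix-up.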

Tuesday, November 15, 2011

Q4 p333

Q: A 95% confidence interval for a population mean was reported to be 152 to 160. If σ = 15, what sample size was used in this study?

A: The mean must be right in the middle of the CI, so here the mean, xbar, is 156. Therefore the Margin of Error was 4. Recall how we find the MofE in the first place? It is MofE =1.96*sigma/root n.

So in this case 4=(1.96 *15)/root n.

Switch the terms around so that root n = (1.96*15)/4 = 7.35. That’s the square root of the sample size, so square it to get n = 54.02. ROUND UP to get n = 55.
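Here is the same rearrangement as a small Python sketch, in case you want to see each step as a number. The variable names are my own, and the rounding uses math.ceil to follow the round-up rule stated above.

```python
from math import ceil

z = 1.96                  # z value for 95% confidence
sigma = 15                # population standard deviation from the question
lower, upper = 152, 160   # reported confidence interval

x_bar = (lower + upper) / 2     # the mean sits in the middle of the interval
margin = (upper - lower) / 2    # margin of error is half the width
root_n = z * sigma / margin     # rearranged from margin = z * sigma / root n
n_exact = root_n ** 2           # square it to get the sample size
n_rounded_up = ceil(n_exact)    # round up to a whole number

print(x_bar, margin, round(root_n, 2), round(n_exact, 2), n_rounded_up)
```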

Sunday, November 13, 2011

Chapter 7 Good Questions Answered

Chapter 7 Q20b

Q. Mean length of employment is 17.5 weeks and population standard deviation is 4 weeks. You take a sample of 50 unemployed individuals for a follow-up study. What is the probability that the sample will provide a sample mean within one week of the population mean?

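A. This question is asking you to find the area (which represents a probability) one week either side of the population mean. Because we are dealing with a sample mean, use the standard error, 4/root 50 = 0.57, not the population standard deviation of 4. In Excel, go:

=norm.dist(18.5,17.5,4/sqrt(50),true)-norm.dist(16.5,17.5,4/sqrt(50),true)

This gives about 0.92.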
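Here is a sketch of this calculation in Python (3.8 or later assumed). Because the question is about a sample mean from n = 50, the spread in this sketch is the standard error, 4/root 50, rather than the population standard deviation of 4. The variable names are mine.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 17.5, 4, 50     # values from the question

se = sigma / sqrt(n)           # standard error of the sample mean
x_bar_dist = NormalDist(mu, se)

# Area between one week below and one week above the population mean
p_within_one_week = x_bar_dist.cdf(mu + 1) - x_bar_dist.cdf(mu - 1)

print(round(se, 4), round(p_within_one_week, 2))
```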

Chapter 7 Q44

Survey results give the standard error of the mean as 20. The population standard deviation is 500.

a. How large was the sample used? Answer: the SE = standard deviation/root n. So root n = standard deviation/SE. Here root n = 500/20 = 25. So n = 25 squared = 625.

b. What is the probability that the point estimate was within plus/minus 25 of the population mean?

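Answer: we are looking for the area between the population mean plus 25 and the population mean minus 25. The population mean isn't given, but this doesn't matter. Use any number you like. Here I've used 100. The point estimate is a sample mean, so the spread to use is the standard error of 20, not the population standard deviation of 500. In Excel: =norm.dist(125,100,20,true)-norm.dist(75,100,20,true) which gives about 0.79.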
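Both parts of Q44 can be sketched in Python (3.8 or later assumed). The sketch uses the standard error of 20 as the spread of the sample mean, and it tries two different made-up population means to show that the choice really doesn't matter. The names any_mean and other_mean are hypothetical placeholders.

```python
from statistics import NormalDist

sigma = 500   # population standard deviation from the question
se = 20       # standard error of the mean from the question

# a. SE = sigma / root n, so n = (sigma / SE) squared
n = (sigma / se) ** 2

# b. Area within 25 either side of the mean, using the standard error as the spread
def prob_within_25(mean):
    dist = NormalDist(mean, se)
    return dist.cdf(mean + 25) - dist.cdf(mean - 25)

any_mean = 100      # placeholder, the real population mean isn't given
other_mean = 5000   # a second placeholder to show the answer doesn't change
p = prob_within_25(any_mean)
p_other = prob_within_25(other_mean)

print(n, round(p, 4), round(p_other, 4))
```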

Friday, November 11, 2011

Difference between binomdist and norm.dist

We use binomdist when:

1. The random variables are "discrete". For example, the number of cockroaches found in a jar would be discrete, because it pretty much has to be an integer.

2. There is the idea of "trials"

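binomdist gives a probability. Now, watch for "false" and "true": with "false" you get the probability of exactly that number of successes, and with "true" you get the cumulative probability of that number of successes or fewer. Here is a YouTube video on this.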
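To see what "false" and "true" do, here is a small Python sketch that builds the binomial probabilities by hand from the standard library's math.comb (Python 3.8 or later). The numbers are made up for illustration: 10 trials, a 0.5 chance of success on each, and 3 successes. They mirror =binomdist(3,10,0.5,false) and =binomdist(3,10,0.5,true).

```python
from math import comb

def binom_exact(k, n, p):
    """Probability of exactly k successes in n trials (the "false" case)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cumulative(k, n, p):
    """Probability of k or fewer successes in n trials (the "true" case)."""
    return sum(binom_exact(i, n, p) for i in range(k + 1))

# Hypothetical example: 10 trials, success probability 0.5, 3 successes
exactly_3 = binom_exact(3, 10, 0.5)
at_most_3 = binom_cumulative(3, 10, 0.5)

print(exactly_3, at_most_3)
```

The cumulative figure is always at least as big as the exact one, because it adds in the probabilities for 0, 1 and 2 successes as well.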

We use norm.dist when the random variable is continuous. This posting (in my blog here) might help.