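Course title: Introduction to Statistics
Course type: Department of Mathematics elective
Instructor: 郑明燕
College: College of Science
Departments: Department of Mathematics, Graduate Institute of Mathematics, Graduate Institute of Applied Mathematical Sciences
Exam date: January 12, 2015 (Monday), 15:30-17:20
Time limit: 110 minutes
Exam questions: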
Introductory Statistics Final Examination January 12, 2015
(I) True or False (16 pts.)
1. __ The t-distribution has a variance that is greater than one.
2. __ If the null hypothesis is rejected, one is confident that the
alternative hypothesis is true.
3. __ If the null hypothesis is not rejected, one is confident that the
null hypothesis is true.
4. __ A regression line can be used to predict the y-value corresponding
to any x-value.
5. __ The t test is used when the sample size is small and the sample is
taken from a normal population.
6. __ A researcher obtained a 95% confidence interval for the proportion
of shoppers that favored longer shopping hours as (0.56,0.78).
Therefore there is a 95% chance that the true proportion will fall
in the interval.
7. __ If a researcher wanted to compare 8 means, it is preferable to use
the F test rather than using the t test to compare the means two at
a time.
8. __ When performing an F test, if the means differ significantly, the
between-group variation will be much larger than the within-group
variation.
(II) Short Answers (total 43 pts)
1. (4 pts) Explain the confidence level of an interval estimate.
2. Explain the terms in hypothesis testing:
(a) (6 pts) Type I error and Type II error.
(b) (6 pts) significance level and power.
3. Define the following two methods used to test hypotheses.
(a) (4 pts) The p-value method.
(b) (4 pts) The confidence interval method.
4. (4 pts) State the purpose of the Tukey Honestly Significant Difference
test.
5. Two means.
(a) (4 pts) Define the two-sided level-α t-test for the difference of two
population means based on two independent samples.
(b) (4 pts) Define the two-sided level-α t-test for the difference of two
population means based on a matched-pair sample.
(c) (3 pts) What is the purpose of matching pairs?
(d) (4 pts) Suppose the sample variances and sample size in (a) and (b)
are the same. Which method gives shorter confidence interval?
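For reference, the two statistics usually contrasted in a question like this can be written as follows. This is a sketch under assumptions not stated in the exam: the symbols d_i, s_d, s_p, n_1, n_2 are introduced here for illustration, the null hypothesis is taken to be μ_1 = μ_2, and the independent-sample version assumes equal population variances (pooled variance).

```latex
% Matched pairs: d_i = x_{1i} - x_{2i}, i = 1, ..., n
t_{\text{paired}} = \frac{\bar d}{s_d / \sqrt{n}},
\qquad \text{reject } H_0 \text{ if } |t_{\text{paired}}| > t_{\alpha/2,\; n-1}

% Two independent samples, pooled variance
s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2},
\qquad
t_{\text{pooled}} = \frac{\bar x_1 - \bar x_2}{s_p \sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}}},
\qquad \text{reject } H_0 \text{ if } |t_{\text{pooled}}| > t_{\alpha/2,\; n_1 + n_2 - 2}
```

The paired test has fewer degrees of freedom (n - 1), but s_d removes the variation shared within each pair, which is the trade-off parts (c) and (d) ask about.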
(III) (total 41 pts)
1. (8 pts) Suppose you want to check three vending machines to determine
whether they are properly dispensing 12 ounces of coffee. What would you do?
Formulate the statistical model and provide the solution.
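One possible approach to this question is a one-way ANOVA comparing the three machine means. The sketch below is only an illustration: the function name one_way_anova and the readings are hypothetical (the exam gives no data), and the F test checks only whether the three means are equal. Whether the common mean equals 12 ounces would need an additional one-sample t-test.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group variation: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical fill volumes in ounces, five cups per machine (not exam data).
machines = [
    [11.9, 12.1, 12.0, 12.2, 11.8],
    [12.3, 12.4, 12.2, 12.5, 12.1],
    [11.7, 11.9, 11.8, 12.0, 11.6],
]
f_stat, df_b, df_w = one_way_anova(machines)
print(f"F = {f_stat:.3f} with ({df_b}, {df_w}) degrees of freedom")
```

The computed F would then be compared with the critical value of the F distribution for the given degrees of freedom at the chosen level α.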
2. (8 pts) Write down a general multiple linear regression model for the
relationship between income (y), the dependent variable, and two
independent variables, age (x1) and grade point average (x2), for an
employee of a company. State the assumptions and explain the meaning of
each term in the model.
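One common way to write such a model is sketched below. The coefficient symbols β_0, β_1, β_2 and the error term ε are notation assumed here, not taken from the exam.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2) \text{ independently across employees}
```

Under this notation, β_0 is the intercept, β_1 is the expected change in income per one-unit increase in age with grade point average held fixed, β_2 plays the same role for grade point average with age held fixed, and ε collects the variation in income that the two predictors do not explain.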
3. (10 pts) A researcher wants to know whether the typing speed of a secretary
(in words per minute) is related to the time (in hours) that it takes to
learn to use a new word processing program. The data are as follows.
Speed(x) │48 74 52 79 83 56 85 63 88 74 90 92
─────┼────────────────────────
Time(y) │ 7 4 8 3.5 2 6 2.3 5 2.1 4.5 1.9 1.5
(a) Compute the explained variation and the unexplained variation.
(b) Find the regression line. Why is it called the line of best fit?
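The computations asked for in (a) and (b) can be sketched with the standard least-squares formulas. The variable names below are chosen for illustration; the data are transcribed from the table in the question.

```python
# Data transcribed from the question's table.
speed = [48, 74, 52, 79, 83, 56, 85, 63, 88, 74, 90, 92]   # x, words per minute
hours = [7, 4, 8, 3.5, 2, 6, 2.3, 5, 2.1, 4.5, 1.9, 1.5]   # y, learning time

n = len(speed)
x_bar = sum(speed) / n
y_bar = sum(hours) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in speed)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(speed, hours))

# Least-squares estimates: the line minimizing the sum of squared residuals.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * x for x in speed]
total_variation = sum((y - y_bar) ** 2 for y in hours)
explained = sum((f - y_bar) ** 2 for f in fitted)               # regression SS
unexplained = sum((y - f) ** 2 for y, f in zip(hours, fitted))  # residual SS

print(f"regression line: y' = {intercept:.3f} + ({slope:.4f}) x")
print(f"explained variation   = {explained:.3f}")
print(f"unexplained variation = {unexplained:.3f}")
print(f"total variation       = {total_variation:.3f}")
```

The explained and unexplained variations add up to the total variation, which gives a quick check on the arithmetic.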
4. (15 pts.) Two types of outdoor paint, enamel and latex, were tested to see
how long (in months) each lasted before it began to crack, flake, and peel.
They were tested in the United States to study the effects of climate on
the paint. The data and the ANOVA table follow.
             │              Geographic location
Type of paint│ North           East            South           West
─────────────┼───────────────────────────────────────────────────────────────
Enamel       │ 60,53,58,62,57  54,63,62,71,76  80,82,62,88,71  62,76,55,48,61
Latex        │ 36,41,54,65,53  62,61,77,53,64  68,72,71,82,86  63,65,72,71,63
Source SS d.f. MS F
──────────────────
paint type 12.1
location 2501
interaction 268.1
within group 2326.8
──────────────────
total 5108
Let X_ijk be the number of months the paint lasted for the kth observation
using the ith paint type tested at the jth geographic location. Five possible
models for the data are
(A) X_ijk = μ + ε_ijk
(B) X_ijk = μ + α_i + ε_ijk
(C) X_ijk = μ + β_j + ε_ijk
(D) X_ijk = μ + α_i + β_j + ε_ijk
(E) X_ijk = μ + α_i + β_j + γ_ij + ε_ijk
Here, ε_ijk denotes the natural variation and is assumed to follow a
Normal(0, σ^2) distribution.
(a) Complete the ANOVA table.
(b) Test significance, at level α=0.05, of the main effects and the
interaction.
(c) Which of the models A, B, C, D, and E is most appropriate for the data?
Explain its meaning.
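As a check on part (a), the sums of squares can be recomputed from the raw data and the remaining columns filled in. This is a sketch using the usual balanced two-way ANOVA formulas with 5 observations per cell; the variable and function names are illustrative.

```python
from statistics import mean

# Raw data transcribed from the question, keyed by (paint type, location).
data = {
    ("Enamel", "North"): [60, 53, 58, 62, 57],
    ("Enamel", "East"):  [54, 63, 62, 71, 76],
    ("Enamel", "South"): [80, 82, 62, 88, 71],
    ("Enamel", "West"):  [62, 76, 55, 48, 61],
    ("Latex", "North"):  [36, 41, 54, 65, 53],
    ("Latex", "East"):   [62, 61, 77, 53, 64],
    ("Latex", "South"):  [68, 72, 71, 82, 86],
    ("Latex", "West"):   [63, 65, 72, 71, 63],
}
paints = ["Enamel", "Latex"]
locations = ["North", "East", "South", "West"]
n_cell = 5  # observations per cell (balanced design)

all_values = [x for cell in data.values() for x in cell]
grand = mean(all_values)

def level_mean(index, level):
    """Mean of all observations at one level of factor 0 (paint) or 1 (location)."""
    return mean(x for key, cell in data.items() if key[index] == level for x in cell)

ss_paint = sum(n_cell * len(locations) * (level_mean(0, p) - grand) ** 2 for p in paints)
ss_loc = sum(n_cell * len(paints) * (level_mean(1, l) - grand) ** 2 for l in locations)
ss_cells = sum(n_cell * (mean(cell) - grand) ** 2 for cell in data.values())
ss_inter = ss_cells - ss_paint - ss_loc
ss_within = sum((x - mean(cell)) ** 2 for cell in data.values() for x in cell)
ss_total = sum((x - grand) ** 2 for x in all_values)

df_paint = len(paints) - 1
df_loc = len(locations) - 1
df_inter = df_paint * df_loc
df_within = len(all_values) - len(data)

ms_within = ss_within / df_within
rows = [("paint type", ss_paint, df_paint),
        ("location", ss_loc, df_loc),
        ("interaction", ss_inter, df_inter)]
f_ratios = {}
print(f"{'Source':<13}{'SS':>9}{'d.f.':>6}{'MS':>10}{'F':>9}")
for name, ss, df in rows:
    f_ratios[name] = (ss / df) / ms_within
    print(f"{name:<13}{ss:>9.1f}{df:>6}{ss / df:>10.2f}{f_ratios[name]:>9.3f}")
print(f"{'within group':<13}{ss_within:>9.1f}{df_within:>6}{ms_within:>10.2f}")
print(f"{'total':<13}{ss_total:>9.1f}{len(all_values) - 1:>6}")
```

For part (b), each F ratio would be compared with the 0.05 critical value of the F distribution with the corresponding numerator degrees of freedom and 32 denominator degrees of freedom, read from a table or statistical software.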