An Introduction To Statistical Methods And Data Analysis, 7th Edition Solution Manual


An Introduction to Statistical Methods and Data Analysis
Seventh Edition
R. Lyman Ott
Michael Longnecker
Texas A&M University, College Station, TX


Contents

Chapter 1: Statistics and the Scientific Method
Chapter 2: Using Surveys and Experimental Studies to Gather Data
Chapter 3: Data Description
Chapter 4: Probability and Probability Distributions
Chapter 5: Inferences about Population Central Values
Chapter 6: Inferences Comparing Two Population Central Values
Chapter 7: Inferences about Population Variances
Chapter 8: Inferences about More Than Two Population Central Values
Chapter 9: Multiple Comparisons
Chapter 10: Categorical Data
Chapter 11: Linear Regression and Correlation
Chapter 12: Multiple Regression and the General Linear Model
Chapter 13: Further Regression Topics
Chapter 14: Analysis of Variance for Completely Randomized Designs
Chapter 15: Analysis of Variance for Blocked Designs
Chapter 16: The Analysis of Covariance
Chapter 17: Analysis of Variance for Some Fixed-, Random-, and Mixed-Effects Models
Chapter 18: Split-Plot, Repeated Measures, and Crossover Designs
Chapter 19: Analysis of Variance for Some Unbalanced Designs


Chapter 1: Statistics and the Scientific Method

1.1
a. The population of interest is all salmon released from fish farms located in Norway.
b. The samples are the two batches of salmon released (1,996 and 2,499 in northern and southern Norway, respectively).
c. The migration pattern and survival of salmon released from fish farms.
d. Since the sample is only a small proportion of the whole population, it is necessary to evaluate what the mean weight may be for any other random selection of farmed salmon.

1.2
a. All private water wells.
b. The 100 private water wells in or near the Barnett Shale in Texas.
c. The level of contaminants in the water wells.
d. We want to relate the level of contaminants of the 100 points in the sample to the level in the whole suspect area. Thus we need to know how accurate a portrayal of the population is provided by the 100 points in the sample.

1.3
a. All families that have had the option of SNAP (food stamps).
b. 60,782 examined over the time period of 1968 to 2009.
c. Adult health and economic outcomes (specifically, the incidence of metabolic health outcomes and economic self-sufficiency).
d. In order to evaluate how closely the sample families represent the American population over this time period.

1.4
a. All head impacts resulting from playing football over a given period of time.
b. The 1,281,444 head impacts recorded.
c. The number (or percent) of concussions suffered through these impacts.
d. The advances in tackling techniques imply that there is variability in how a tackle is performed. We need to see if our sample was representative of the hits that may be sustained.

1.5
a. The population of interest is the population of those who would vote in the 2004 senatorial campaign.
b. The population from which the sample was selected is registered voters in this state.
c. The sample will adequately represent the population, unless there is a difference between registered voters in the state and those who would vote in the 2004 senatorial campaign.
d. The results from a second random sample of 5,000 registered voters will not be exactly the same as the results from the initial sample. Results vary from sample to sample. With either sample we hope that the results will be close to the views of the population of interest.


1.6
a. The professor’s population of interest is college freshmen at his university.
b. The sampled population is all freshmen enrolled in HIST 101.
c. Yes, there is a major difference in the two populations. Those enrolled in HIST 101 may not accurately reflect the population of all freshmen at his university. For example, they might be more interested in history.
d. Had the professor lectured on the American Revolution, those students in HIST 101 would be more likely to know which country controlled the original 13 states prior to the American Revolution than other freshmen at the university.


Chapter 2: Using Surveys and Experimental Studies to Gather Data

2.1
a. The explanatory variable is level of alcohol drinking. One possible confounding variable is smoking. Perhaps those who drink more often also tend to smoke more, which would impact incidence of lung cancer. To eliminate the effect of smoking, we could block the experiment into groups (e.g., nonsmokers, light smokers, heavy smokers).
b. The explanatory variable is obesity. Two confounding variables are hypertension and diabetes. Both hypertension and diabetes contribute to coronary problems. To eliminate the effect of these two confounding variables, we could block the experiment into four groups (e.g., hypertension and diabetes, hypertension but no diabetes, diabetes but no hypertension, neither hypertension nor diabetes).

2.2
a. The explanatory variable is the new blood clot medication. The confounding variable is the year in which patients were admitted to the hospital. Because those admitted to the hospital the previous year were not given the new blood clot medication, we cannot be sure whether the medication is working or something else is going on. We can eliminate the effects of this confounding by randomly assigning stroke patients to the new blood clot medication or a placebo.
b. The explanatory variable is the software program. The confounding variable is whether students choose to stay after school for an hour to use the software on the school’s computers. Those students who choose to stay after school to use the software on the school’s computers may differ in some way from those students who do not choose to do so, and that difference may relate to their mathematical abilities. To eliminate the effect of the confounding variable, we could randomly assign some students to use the software on the school’s computers during class time and the rest to stay in class and learn in a more traditional way.

2.3 Possible confounding factors include student-teacher ratios, expenditures per pupil, previous mathematics preparation, and access to technology in the inner city schools. Adding advanced mathematics courses to inner city schools will not solve the discrepancy between minority students and white students, since there are other factors at work.

2.4 There may be a difference in student-teacher ratios, expenditures per pupil, and previous preparation between the schools that have a foreign language requirement and schools that do not have a foreign language requirement.

2.5 The relative merits of the different types of sampling units depend on the availability of a sampling frame for individuals, the desired precision of the estimates from the sample to the population, and the budgetary and time constraints of the project.

2.6 She could conduct a stratified random sample in which the states serve as the strata. A simple random sample could then be selected within each state. This would provide information concerning the differences between the states along with the individual opinions of the employees.
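The stratified plan described in 2.6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state codes, employee IDs, and the per-state sample size of 5 are hypothetical and do not come from the exercise.

```python
import random


def stratified_sample(employees_by_state, n_per_state, seed=0):
    """Draw a simple random sample within each state (stratum)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sample = {}
    for state, employees in employees_by_state.items():
        # A stratum smaller than n_per_state is sampled in full.
        k = min(n_per_state, len(employees))
        sample[state] = rng.sample(employees, k)  # sampling without replacement
    return sample


# Hypothetical sampling frame: employee IDs grouped by state.
frame = {
    "TX": [f"TX-{i}" for i in range(1, 41)],
    "OK": [f"OK-{i}" for i in range(1, 26)],
    "NM": [f"NM-{i}" for i in range(1, 4)],
}

chosen = stratified_sample(frame, n_per_state=5)
for state, ids in chosen.items():
    print(state, ids)
```

Because every state contributes its own sample, between-state comparisons remain possible even when one state has far fewer employees than the others.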


2.7
a. All residents in the county.
b. All registered voters.
c. Survey nonresponse: those who responded were probably the people with much stronger opinions than those who did not respond, which then makes the responses not representative of the responses of the entire population.

2.8
a. In the first scenario, people would be more willing to lie about using a biodegradable detergent because there is no follow-up to verify and individuals usually prefer to appear environmentally conscious. The second survey has a check in place to verify the answers given are truthful.
b. The first survey would likely yield a higher percentage of those who say they use a biodegradable detergent. The second may anger the individuals who tell the truth, as if their honesty is being tested.

2.9
a. Alumni (men only?) who graduated from Yale in 1924.
b. No. Alumni whose addresses were on file 25 years later would not necessarily be representative of their class.
c. Alumni who responded to the mail survey would not necessarily be representative of those who were sent the questionnaires. Income figures may not be reported accurately (intentionally), or may be rounded off to the nearest $5,000, say, in a self-administered questionnaire.
d. Rounding income responses would make the figure $25,111 unlikely. The fact that higher income respondents would be more likely to respond (bragging), and the fact that incomes are likely to be exaggerated, would tend to make the estimate too high.

2.10
a. Simple random sampling.
b. Stratified sampling.
c. Cluster sampling.

2.11
a. Simple random sampling.
b. Stratified sampling.
c. Cluster sampling.

2.12
a. Stratified sampling. Stratify by job category and then take a random sample within each job category. Different job categories will use software applications differently, so this sampling strategy will allow us to investigate that.
b. Systematic random sampling. Sample every tenth patient (starting from a randomly selected patient from the first ten patients). Provided that there is no relationship between the type of patient and the order that the patients come into the emergency room, this will give us a representative sample.

2.13
a. Stratified sampling. We should stratify by type of degree and then sample 5% of the alumni within each degree type. This method will allow us to examine the employment status for each degree type and compare among them.


b. Simple random sampling. Once we find 100 containers we will stop. Still it will be difficult to get a completely random sample. However, since we don’t know the locations of the containers, it would be difficult to use either a stratified or cluster sample.

2.14
a. Water temperature and Type of hardener
b. Water temperature: 175°F and 200°F; Type of hardener: H1, H2, H3
c. Manufacturing plants
d. Plastic pipe
e. Location on plastic pipe
f. 2 pipes per treatment
g. Covariates: None
h. 6 treatments: (175°F, H1), (175°F, H2), (175°F, H3), (200°F, H1), (200°F, H2), (200°F, H3)

2.15 This is an example where there are two levels of experimental units, and the analysis is discussed in Chapter 18.

To study the effect of month:
a. Factors: Month
b. Factor levels: 8 levels of month (Oct - May)
c. Block = each section
d. Experimental unit (whole plot EU) = each tree
e. Measurement unit = each orange
f. Replications = 8 replications of each month
g. Covariates = none
h. Treatments = 8 treatments (Oct - May)

To study the effect of location:
a. Factors: Location
b. Factor levels: 3 levels of location (top, middle, bottom)
c. Block = each section
d. Experimental unit (split plot EU) = each location tree
e. Measurement unit = each orange
f. Replications = 8 replications of each location
g. Covariates = none
h. Treatments = 3 treatments (top, middle, bottom)

2.16
a. Factors: Type of drug
b. Factor levels: D1, D2, Placebo
c. Blocks: Hospitals
d. Experimental units: Wards
e. Measurement units: Patients
f. Replications: 2 wards per drug in each of the 10 hospitals
g. Covariates: None
h. Treatments: D1, D2, Placebo
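When a design has more than one factor, as in 2.14, the treatments are all combinations of the factor levels. A minimal sketch of that enumeration follows; the string labels are just the levels listed in 2.14(b), and the variable names are illustrative.

```python
from itertools import product
from math import prod

# Factor levels as listed in Exercise 2.14(b).
temperatures = ["175F", "200F"]
hardeners = ["H1", "H2", "H3"]

# Every (temperature, hardener) pair is one treatment.
treatments = list(product(temperatures, hardeners))
for t in treatments:
    print(t)

# The number of treatments is the product of the numbers of levels.
n_treatments = prod([len(temperatures), len(hardeners)])
print(n_treatments)
```

The same counting rule applies to larger factorial structures: three factors with 3, 4, and 4 levels give 3 × 4 × 4 = 48 treatments.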


2.17
a. Factors: Type of treatment
b. Factor levels: D1, D2, Placebo
c. Blocks: Hospitals, Wards
d. Experimental units: Patients
e. Measurement units: Patients
f. Replications: 2 patients per drug in each of the ward/hospital combinations
g. Covariates: None
h. Treatments: D1, D2, Placebo

2.18
a. Factors: Type of school
b. Factor levels: Public; Private – non-parochial; Parochial
c. Blocks: Geographic region
d. Experimental units: Classrooms
e. Measurement units: Students in classrooms
f. Replications: 2 classrooms per each type of school in each of the city/region combinations
g. Covariate: Measure of socio-economic status
h. Treatments: Public; Private – non-parochial; Parochial

2.19
a. Factors: Temperature, Type of seafood
b. Factor levels: Temperature (0°C, 5°C, 10°C); Type of seafood (oysters, mussels)
c. Blocks: None
d. Experimental units: Package of seafood
e. Measurement units: Sample from package
f. Replications: 3 packages per temperature
g. Treatments: (0°C, oysters), (5°C, oysters), (10°C, oysters), (0°C, mussels), (5°C, mussels), (10°C, mussels)

2.20 Randomized complete block design with blocking variable (10 orange groves) and 48 treatments in a 3 × 4 × 4 factorial structure.
Experimental units: Plots
Measurement units: Trees

2.21 Randomized complete block design with blocking variable (10 warehouses) and 5 treatments (5 vendors).

2.22 Randomized complete block design, blocked by day, with a 2-factor treatment structure (the factors are type of glaze and thickness).

2.23
a. Design B. The experimental units are not homogeneous since one group of consumers gives uniformly low scores and another group gives uniformly high scores, no matter what recipe is used.


Using design A, it is possible to have a group of consumers that gives mostly low scores randomly assigned to a particular recipe. This would bias this particular recipe. Using design B, the experimental error would be reduced since each consumer would evaluate each recipe. That is, each consumer is a block and each of the treatments (recipes) is observed in each block. This results in having each recipe subjected to consumers who give low scores and to consumers who give high scores.
b. This would not be a problem for either design. In design A, each of the remaining 4 recipes would still be observed by 20 consumers. In design B, each consumer would still evaluate each of the 4 remaining recipes.

2.24
a. “Employee” should refer to anyone who is eligible for sick days.
b. Use payroll records. Stratify by employee categories (full-time, part-time, etc.), employment location (plant, city, etc.), or other relevant subgroup categories. Consider systematic selection within categories.
c. Sex (women more likely to be care givers), age (younger workers less likely to have elderly relatives), whether or not they care for elderly relatives now or anticipate doing so in the near future, how many hours of care they (would) provide (to define “substantial”), etc. The company might want to explore alternative work arrangements, such as flex-time, offering employees 4 ten-hour days, cutting back to 3/4-time to allow more time to care for relatives, etc., or other options that might be mutually beneficial and provide alternatives to taking sick days.

2.25
a. Each state agency and some federal agencies have records of licensed physicians, professional corporations, facility licenses, etc. Professional organizations such as the American Medical Association, American Hospital Administrators Association, etc., may have such lists, but they may not be as complete as licensing records.
b. What nursing specialties are available at this time at the physician’s offices or medical facilities? What medical specialties/facilities do they anticipate adding or expanding? What staffing requirements are unfilled at this time or may become available when expansion occurs? What is the growth/expansion time frame?
c. Licensing boards may have this information. Many professional organizations have special categories for members who are unemployed, retired, working in fields not directly related to nursing, students who are continuing their education, etc.
d. Population growth estimates may be available from the Census Bureau, university economic growth research, bank research studies (prevailing and anticipated load patterns), etc. Health risk factors and location information would be available from state health departments, the EPA, epidemiological studies, etc.
e. Licensing information should be stratified by facility type, size, physician’s specialty, etc., prior to sampling.

2.26 If phosphorus first: [P, N]
[10,40], [10,50], [10,60], then [20,60], [30,60]
or
[20,40], [20,50], [20,60], then [10,60], [30,60]
or
[30,40], [30,50], [30,60], then [10,60], [20,60]


If nitrogen first: [N, P]
[40,10], [40,20], [40,30], then [50,30], [60,30]
or
[50,10], [50,20], [50,30], then [40,30], [60,30]
or
[60,10], [60,20], [60,30], then [40,30], [50,30]

So, for example:

Phosphorus   30    30    30    10    20
Nitrogen     40    50    60    60    60
Yield        150   170   190   165   185

Recommendation: Phosphorus at 30 pounds, and Nitrogen at 60 pounds.

2.27
             Factor 2
Factor 1     I     II    III
A            25    45    65
B            10    30    50

2.28
a. Group dogs by sex and age:

Group          Dog
Young female   2, 7, 13, 14
Young male     3, 5, 6, 16
Old female     1, 9, 10, 11
Old male       4, 8, 12, 15

b. Generate a random permutation of the numbers 1 to 16:

15 7 4 11 3 13 8 1 12 16 2 5 6 10 9 14

Go through the list; the first two numbers that appear in each of the four groups receive treatment L1 and the other two receive treatment L2.

Group          Dog-Treatment
Young female   2-L2, 7-L1, 13-L1, 14-L2
Young male     3-L1, 5-L2, 6-L2, 16-L1
Old female     1-L1, 9-L2, 10-L2, 11-L1
Old male       4-L1, 8-L2, 12-L2, 15-L1

2.29
a. Bake one cake from each recipe in the oven at the same time. The baking period is a block with the four treatments (recipes) appearing once in each block. The four recipes should be randomly assigned to the four positions, one cake per position. Repeat this procedure r times.
b. If position in the oven is important, then position in the oven is a second blocking factor along with the baking period. Thus, we have a Latin square design. To have r = 4, we would need to have each recipe appear in each position exactly once within each of four baking periods. For example (each period shown as a 2 × 2 arrangement of oven positions):

Period 1   Period 2   Period 3   Period 4
R1 R2      R4 R1      R3 R4      R2 R3
R3 R4      R2 R3      R1 R2      R4 R1
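The Latin square property claimed for the arrangement in 2.29(b) can be checked mechanically. The sketch below assumes the example is read as four baking periods, each with four oven positions in the order (top-left, top-right, bottom-left, bottom-right); that reading of the layout, and the function name, are assumptions made for illustration.

```python
# One tuple per baking period; entries are the recipes in positions
# (top-left, top-right, bottom-left, bottom-right). Assumed layout.
periods = [
    ("R1", "R2", "R3", "R4"),  # Period 1
    ("R4", "R1", "R2", "R3"),  # Period 2
    ("R3", "R4", "R1", "R2"),  # Period 3
    ("R2", "R3", "R4", "R1"),  # Period 4
]
recipes = {"R1", "R2", "R3", "R4"}


def is_latin_square(periods, recipes):
    """True if every recipe occurs once per period and once per position."""
    n = len(recipes)
    if len(periods) != n or any(len(p) != n for p in periods):
        return False
    every_period_complete = all(set(p) == recipes for p in periods)
    # zip(*periods) regroups the entries by oven position across periods.
    every_position_complete = all(set(pos) == recipes for pos in zip(*periods))
    return every_period_complete and every_position_complete


print(is_latin_square(periods, recipes))
```

Swapping two recipes within a single period keeps that period complete but makes two positions repeat a recipe, so the check then fails.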


c. We now have an incompleteness in the blocking variable period since only four of the five recipes can be observed in each period. In order to achieve some level of balance in the design, we need to select enough periods in order that each recipe appears the same number of times in each period and the same total number of times in the complete experiment. For example, suppose we wanted to observe each recipe r = 4 times in the experiment. It would be necessary to have 5 periods in order to observe each recipe 4 times in each of the 4 positions with exactly 4 recipes observed in each of the 5 periods.

Period 1   Period 2   Period 3   Period 4   Period 5
R1 R2      R5 R1      R4 R5      R3 R4      R2 R3
R3 R4      R2 R3      R1 R2      R5 R1      R4 R5

2.30
a. The 223 plots of approximately equal sized land from Google Earth (excluding water).
b. If there is some reason to believe the trees in the ‘watery’ regions differ from those in the other regions, this discrepancy may create a gap between our sampling frame and the population of all trees in the region.
c. Again, if trees in the watery region tend to have larger trunk diameter, we would underestimate the number of trees with diameter of 12 inches or more.

2.31
a. All cars (and by extension, their tires) in the state.
b. Cars registered in the 4 months in which the sample was taken.
c. Two potential concerns arise: not all cars in the region are registered, and the time of year may lead to ignoring some cars (some people leave the area for the winter). Unregistered cars may have a higher proportion of unsafe tire tread thickness.

2.32
a. All corn fields in the state.
b. All corn fields in the state (if a list is available).
c. Stratified sampling plan in which the number of acres planted in corn determines the strata.
d. No biases appear present.

2.33
a. People are notoriously bad at recall. A telephone interview immediately following the time of interest would likely be best, but nonresponse is often high. Mailed questionnaires would likely be administered too late to be of use, and personal interviews would be impractical to conduct in a timely manner.
b. All three are potential avenues. Interviews are more personal but more time consuming. Mailing questionnaires should also work as the editor has a list of his/her clientele, but if he/she wants to garner information about perspectives of those not reading his/her paper, he/she may need to blanket the city with questionnaires. Telephone interviews may be difficult as finding the numbers of those in the area may be difficult.
c. Again, all three methods would be viable. A mailed questionnaire would be the easiest and cheapest but the response rate would likely be lower.
d. If the county believes they have an accurate list of those with dogs, a mailed questionnaire or telephone interview would work, but using a list of registered dogs may underrepresent those who haven’t taken good care of their dogs (and thereby misstate the proportion with rabies shots).


2.34 People who cheat on their taxes are unlikely to admit to it readily. Therefore, the poll likely underestimates the true percentage of people who cheat on their taxes. Garnering truthful responses, even if anonymity is guaranteed, on questions of a personal nature can be a challenge.
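Looking back at the randomization rule in 2.28(b), where the first two dogs of each group to appear in a random permutation receive L1 and the other two receive L2, the procedure can be sketched as follows. The groups and the permutation are those given in 2.28; the function and variable names are illustrative only.

```python
# Groups from Exercise 2.28(a) and the permutation from 2.28(b).
groups = {
    "Young female": [2, 7, 13, 14],
    "Young male": [3, 5, 6, 16],
    "Old female": [1, 9, 10, 11],
    "Old male": [4, 8, 12, 15],
}
permutation = [15, 7, 4, 11, 3, 13, 8, 1, 12, 16, 2, 5, 6, 10, 9, 14]


def assign_within_groups(groups, permutation, n_first=2):
    """Within each group, the first n_first dogs in permutation order get L1."""
    order = {dog: i for i, dog in enumerate(permutation)}
    assignment = {}
    for dogs in groups.values():
        # Sort the group's dogs by where they appear in the permutation.
        by_appearance = sorted(dogs, key=lambda d: order[d])
        for rank, dog in enumerate(by_appearance):
            assignment[dog] = "L1" if rank < n_first else "L2"
    return assignment


assignment = assign_within_groups(groups, permutation)
for name, dogs in groups.items():
    print(name, [f"{d}-{assignment[d]}" for d in dogs])
```

For a fresh randomization, the fixed list could be replaced by `random.sample(range(1, 17), 16)`, which returns the numbers 1 to 16 in random order.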


Chapter 3: Data Description

3.1
a. The following is a pie chart of the federal expenditures for the 2014 fiscal year (in billions of dollars).

[Figure: Pie Chart of Federal Program. Categories: National Defense, Social Security, Medicare & Medicaid, National Debt Interest, Major Social-Aid Programs, Other. Slice labels as extracted, not matched to categories: 532, 562, 253, 821, 852, 612.]

b. The following is a bar chart of the federal expenditures for the 2014 fiscal year (in billions of dollars).

[Figure: Chart of 2014 Expenditures ($ billions). Horizontal axis: Federal Program; vertical axis: 2014 Expenditures ($ billions), 0 to 900.]


c. The following are a pie chart and bar chart of the federal expenditures for the 2014 fiscal year (in percentages).

[Figure: Pie Chart of Federal Program. Categories: National Defense, Social Security, Medicare & Medicaid, National Debt Interest, Major Social-Aid Programs, Other. Slice labels as extracted, not matched to categories: 14.6%, 15.5%, 7.0%, 22.6%, 23.5%, 16.9%.]

[Figure: Chart of 2014 Expenditures, percent within all data. Horizontal axis: Federal Program; vertical axis: Percent of 2014 Expenditures, 0 to 25.]

d. The pie chart using percentages is probably most informative to the tax-paying public. Here the tax-paying public can compare the percentages spent by the Federal government for domestic and defense programs as part of a whole.

3.2
a. Pie charts would not be appropriate to display these data. We would not be able to see trends over time.


b. The following bar chart shows the changes across the 20 years in the public’s choice in vehicle.

[Figure: Chart of 1990, 1995, 2000, 2005, 2006, 2007, 2008, 2009, 2010, percent within variables. Categories: SUV/Light Truck, Passenger Car; vertical axis: percent, 0 to 100.]

c. It appears that the percentage of passenger cars has decreased over the period 1990-2010. If there was a substantial increase in gasoline prices, we would expect the percentage of passenger cars to increase.

3.3
a. The following bar chart shows the increase in the number of family practice physicians (in thousands of physicians) over the period 1980-2001.

[Figure: bar chart. Horizontal axis: Year (1980, 1990, 1995, 1998, 1999, 2000, 2001); vertical axis: Family Practice Physicians (in thousands), 0 to 70.]
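As a check on the percentage charts in Exercise 3.1(c), each dollar amount can be converted to a share of the total. The sketch below assumes the six pie-chart slice labels read as 532, 562, 253, 821, 852, and 612 (billions of dollars); which amount belongs to which program is not recoverable here, so the values are left unlabeled.

```python
# Slice values in $ billions, read from the Exercise 3.1 pie chart
# (assumed parse; not matched to program categories).
amounts = [532, 562, 253, 821, 852, 612]

total = sum(amounts)
# Percentage of the total for each slice, rounded to one decimal place.
percents = [round(100 * a / total, 1) for a in amounts]

print(total)     # 3632
print(percents)  # [14.6, 15.5, 7.0, 22.6, 23.5, 16.9]
```

The computed shares agree, in the same order, with the percentage labels on the pie chart in part (c).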
