STA 3032-CMB – Probability and Statistics for Engineers
Spring 2014 Term Project
Available: Thursday, January 23. Due: Thursday, April 10.

Instructions:
Use Excel, Minitab, R, or any other software of your choice to analyze the data provided in the Excel file.
Minitab tutorials are available in the MeetMinitab file on the course homepage in webcourses. If you decide to use Excel, ensure that the Analysis ToolPak add-in is present.
Visit: http://www.excel-easy.com/data-analysis/analysis-toolpak.html if you don’t have it.
Every graph must have appropriate titles and subtitles (if necessary).
Numerical answers must be given to 3 decimal places.
Show all workings clearly and logically.
No hard copy of the project will be accepted. Only MS Word and PDF file submissions will be accepted, and all submissions must be made in webcourses.
The questions are worth 98 points in total; the remaining 2 points will be awarded for professional presentation.
This is an individual assignment. No collaboration among students is allowed.

Questions:

Periodically, the Federal Trade Commission (FTC) ranks domestic cigarette brands according to tar, nicotine, and carbon monoxide content. The test results are obtained by using a sequential smoking machine to “smoke” cigarettes to a 23-millimeter butt length. The tar, nicotine, and carbon monoxide concentrations (in milligrams) in the residual “dry” particulate matter of the smoke are then measured. The nicotine levels of 500 cigarette brands are saved in the NICOTINE sheet of the given Excel data file.
(a) Find the range, variance, and standard deviation of this data set.
(b) Create the relative frequency histogram for the nicotine level.
(c) Compute the interval $\bar{y} \pm 2s$.
(d) Estimate the percentage of brands with nicotine levels of less than 0.5 mg.
20 points
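For students working in Python rather than Excel or Minitab, the quantities asked for in this question could be computed roughly as sketched below. The ten nicotine values and the class edges are hypothetical placeholders chosen only for illustration; the real data are in the NICOTINE sheet, and the histogram itself would still need to be drawn with a plotting tool.

```python
import statistics

# Hypothetical nicotine levels in mg; the real values are in the NICOTINE sheet.
nicotine = [0.2, 0.4, 0.5, 0.7, 0.8, 0.9, 1.0, 1.0, 1.1, 1.4]
n = len(nicotine)

data_range = max(nicotine) - min(nicotine)
y_bar = statistics.mean(nicotine)
variance = statistics.variance(nicotine)  # sample variance (divisor n - 1)
s = statistics.stdev(nicotine)            # sample standard deviation

# Interval y_bar +/- 2s
interval = (y_bar - 2 * s, y_bar + 2 * s)

# Relative frequency of each class [lo, hi); these edges are an arbitrary choice.
edges = [0.0, 0.5, 1.0, 1.5]
rel_freq = [
    sum(lo <= y < hi for y in nicotine) / n
    for lo, hi in zip(edges, edges[1:])
]

# Percentage of observations strictly below 0.5 mg
pct_below = 100 * sum(y < 0.5 for y in nicotine) / n

print(f"range = {data_range:.3f}, variance = {variance:.3f}, s = {s:.3f}")
print(f"interval = ({interval[0]:.3f}, {interval[1]:.3f})")
print(f"relative frequencies = {rel_freq}")
print(f"percentage below 0.5 mg = {pct_below:.3f}")
```

The same steps apply unchanged once the placeholder list is replaced by the 500 values read from the data file.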

In 1879, A. A. Michelson made 100 determinations of the velocity of light in air using a modification of a method proposed by the French physicist Foucault. He made the measurements in five trials of 20 measurements each. The observations (in kilometers per second) follow. Each value has 299,000 subtracted from it (see the LIGHT sheet of the given Excel data file). The currently accepted true velocity of light in a vacuum is 299,792.5 kilometers per second. Stigler (1977, The Annals of Statistics) reported that the “true” value for comparison to these measurements is 734.5.
(a) Construct the normal probability plots for Trial 4 and Trial 5.
(b) Construct comparative box plots of these measurements.
(c) Does it seem that all five trials are consistent with respect to the variability of the measurements?
(d) Are all five trials centered on the same value?
(e) How does each group of trials compare to the true value?
(f) Could there have been bias in the measuring instrument?

Trial 1 Trial 2 Trial 3 Trial 4 Trial 5
850 960 880 890 890
1000 830 880 910 870
740 940 880 810 840
980 790 910 920 870
900 960 880 810 780
930 810 850 890 810
1070 940 860 820 810
650 880 870 860 740
930 880 720 800 760
760 880 840 880 810
850 800 720 770 810
810 830 840 720 940
950 850 620 760 790
1000 800 850 840 950
980 880 860 740 810
1000 790 840 850 800
980 900 970 750 820
960 760 840 850 810
880 840 950 760 850
960 800 840 780 870

20 points
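As a numerical companion to the box plots, the sketch below summarizes each trial in Python and compares its mean with the comparison value 734.5. It uses the observations exactly as listed in the table above; the variable names are illustrative, and the summary is only a starting point for the discussion, not a substitute for the plots.

```python
import statistics

# Observations from the table above (km/s, with 299,000 subtracted);
# each row lists Trial 1 through Trial 5.
ROWS = """
850 960 880 890 890
1000 830 880 910 870
740 940 880 810 840
980 790 910 920 870
900 960 880 810 780
930 810 850 890 810
1070 940 860 820 810
650 880 870 860 740
930 880 720 800 760
760 880 840 880 810
850 800 720 770 810
810 830 840 720 940
950 850 620 760 790
1000 800 850 840 950
980 880 860 740 810
1000 790 840 850 800
980 900 970 750 820
960 760 840 850 810
880 840 950 760 850
960 800 840 780 870
"""
TRUE_VALUE = 734.5  # comparison value reported by Stigler (1977)

rows = [[int(x) for x in line.split()] for line in ROWS.strip().splitlines()]
trials = [list(col) for col in zip(*rows)]  # transpose rows into one list per trial

summary = []
for number, obs in enumerate(trials, start=1):
    m = statistics.mean(obs)
    summary.append({
        "trial": number,
        "mean": m,
        "median": statistics.median(obs),
        "stdev": statistics.stdev(obs),  # sample standard deviation
        "offset": m - TRUE_VALUE,        # how far the trial mean sits from 734.5
    })

for r in summary:
    print(f"Trial {r['trial']}: mean = {r['mean']:.3f}, median = {r['median']:.3f}, "
          f"s = {r['stdev']:.3f}, mean - true = {r['offset']:.3f}")
```

Comparing the standard deviations addresses the question about variability, while the offsets from 734.5 bear on the questions about centering and possible bias.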

Refer to the MPG sheet of the provided Excel data file.
(a) Describe the MPG mileage ratings of the 100 cars with a box plot. Show the outliers present in the data, along with their values.
(b) Display a table of the descriptive statistics (i.e., variance, mean, squared error, standard deviation, Q1, Q3, and IQR).
18 points
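One common convention for flagging box plot outliers is the 1.5 x IQR rule, sketched below in Python. The twelve mileage values are hypothetical (the real ratings are in the MPG sheet), and quartile definitions differ between software packages, so Q1 and Q3 computed in Excel or Minitab may not match these exactly.

```python
import statistics

# Hypothetical mileage ratings; the real 100 values are in the MPG sheet.
mpg = [30.0, 32.5, 34.2, 35.1, 36.0, 36.3, 36.8, 37.0, 37.4, 38.1, 39.0, 44.9]

# statistics.quantiles defaults to the "exclusive" method, which
# interpolates at position (n + 1) * p in the sorted data.
q1, q2, q3 = statistics.quantiles(mpg, n=4)
iqr = q3 - q1

# Fences for the 1.5 x IQR rule; points outside them are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [y for y in mpg if y < lower_fence or y > upper_fence]

table = {
    "mean": statistics.mean(mpg),
    "variance": statistics.variance(mpg),  # sample variance
    "std dev": statistics.stdev(mpg),
    "Q1": q1,
    "Q3": q3,
    "IQR": iqr,
}
for name, value in table.items():
    print(f"{name:>8}: {value:.3f}")
print("outliers:", outliers)
```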

An engineer is studying the formulation of a cement mortar. He is interested in finding out if treating the mixture with an emulsifier will impact the curing time and tension bond strength of the mortar. The data from this experiment are given below:
Modified 16.85 16.4 17.21 16.35 16.52 17.04 16.96 17.15 16.59 16.57
Unmodified 16.62 16.75 17.37 17.12 16.98 16.87 17.34 17.02 17.08 17.27

(a) Draw and compare the dot diagrams for the two data sets.
(b) Draw box plots to assist in the interpretation of the tension strength data from this experiment.
(c) Are there any outliers in the data?
15 points

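A dot diagram can be roughed out in plain text before drawing it properly. The Python sketch below places both mortar samples from this question on a shared scale; the helper name `dot_row` and the 50-column width are illustrative choices, and a digit is printed wherever several points fall in the same column.

```python
import statistics

modified = [16.85, 16.4, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27]

def dot_row(values, lo, hi, width=50):
    """Return a one-line text dot diagram; digits mark stacked points."""
    counts = [0] * (width + 1)
    for v in values:
        counts[round((v - lo) / (hi - lo) * width)] += 1
    return "".join(" " if c == 0 else ("o" if c == 1 else str(c)) for c in counts)

# Use a common axis so the two samples can be compared directly.
lo = min(modified + unmodified)
hi = max(modified + unmodified)

row_mod = dot_row(modified, lo, hi)
row_unmod = dot_row(unmodified, lo, hi)

print(f"Modified   |{row_mod}|  mean = {statistics.mean(modified):.3f}")
print(f"Unmodified |{row_unmod}|  mean = {statistics.mean(unmodified):.3f}")
print(f"axis from {lo} to {hi}")
```

Plotting both rows against the same axis makes any shift in location or difference in spread between the two formulations visible at a glance.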
The maker of a shampoo knows that customers like this product to have a lot of foam. Ten sample bottles of the product are selected at random and the foam heights observed are as follows (in millimeters): 210, 215, 194, 195, 211, 201, 198, 204, 208, and 196.
(a) Is there evidence to support the assumption that foam height is normally distributed?
(b) Find a 95% CI on the mean foam height.
(c) Find a 95% prediction interval on the next bottle of shampoo that will be tested.
(d) Find an interval that contains 95% of the shampoo foam heights with 99% confidence.
(e) Explain the difference in the intervals computed in parts (b), (c), and (d).
15 points
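The three intervals in this question share the form mean plus or minus a multiplier times s, and differ only in the multiplier. The Python sketch below illustrates this with the ten foam heights given above, assuming normality. The critical value 2.262 and the tolerance factor 4.265 are taken from standard t and tolerance-factor tables (n = 10) and should be checked against the tables used in the course.

```python
from math import sqrt
from statistics import mean, stdev

foam = [210, 215, 194, 195, 211, 201, 198, 204, 208, 196]  # mm, from the question
n = len(foam)
y_bar, s = mean(foam), stdev(foam)  # sample mean and sample standard deviation

T_CRIT = 2.262  # t quantile for 95% two-sided, 9 degrees of freedom (table value)
K_TOL = 4.265   # two-sided tolerance factor: n = 10, 95% coverage, 99% confidence
                # (table value; an assumption to verify against the course tables)

# 95% confidence interval on the mean
half_ci = T_CRIT * s / sqrt(n)
conf_int = (y_bar - half_ci, y_bar + half_ci)

# 95% prediction interval on the next observation
half_pi = T_CRIT * s * sqrt(1 + 1 / n)
pred_int = (y_bar - half_pi, y_bar + half_pi)

# Tolerance interval covering 95% of the population with 99% confidence
half_ti = K_TOL * s
tol_int = (y_bar - half_ti, y_bar + half_ti)

for name, (lo, hi) in [("CI", conf_int), ("PI", pred_int), ("TI", tol_int)]:
    print(f"{name}: ({lo:.3f}, {hi:.3f})")
```

The widths increase from the confidence interval to the prediction interval to the tolerance interval, which is the contrast the last part of the question asks about.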

The National Snow and Ice Data Center (NSIDC) collects data on albedo, depth, and physical characteristics of ice melt ponds in the Canadian Arctic. Albedo is the ratio of the light reflected by ice to that received by it. (High albedo values give a white appearance to the ice.) Visible albedo values are recorded for a sample of 504 ice melt ponds located in the Canadian Arctic. These data are saved in the ICE sheet of the provided data file. Another variable of interest is the type of ice observed for each pond. Ice type (column F) is classified as first-year ice (represented as the number 1), multiyear ice (2), or landfast ice (3).
(a) Construct a summary table and a horizontal bar graph to describe the ice types of the 504 melt ponds.
(b) Find a 90% confidence interval for the true mean visible albedo value of all Canadian Arctic ice ponds in columns L to T.
(c) Find a 90% confidence interval for the true variance of the visible albedo values of all Canadian Arctic ice ponds in columns L to T.
(d) What is the interpretation of the intervals obtained in parts (b) and (c)?
10 points
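The summary table and the two 90% intervals in this question could be sketched in Python as follows. The data here are synthetic stand-ins generated with a fixed seed, not the ICE sheet values. Two approximations are assumed: a normal quantile is used for the mean (reasonable for a sample of 504), and the chi-square quantiles come from the Wilson-Hilferty approximation in the hypothetical helper `chi2_quantile`, since the standard library has no exact chi-square quantile function. Statistical software would give exact values.

```python
import random
from collections import Counter
from math import sqrt
from statistics import NormalDist, mean, variance

def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (not exact)."""
    z = NormalDist().inv_cdf(p)
    return df * (1 - 2 / (9 * df) + z * sqrt(2 / (9 * df))) ** 3

# Synthetic stand-ins for the ICE sheet; these are NOT the real measurements.
rng = random.Random(0)
ice_type = [rng.choice([1, 2, 3]) for _ in range(504)]
albedo = [rng.uniform(0.05, 0.60) for _ in range(504)]

# Summary table: count and relative frequency for each ice type
labels = {1: "first-year", 2: "multiyear", 3: "landfast"}
counts = Counter(ice_type)
for code in sorted(counts):
    print(f"{labels[code]:<10} {counts[code]:>4} {counts[code] / len(ice_type):.3f}")

n = len(albedo)
y_bar, s2 = mean(albedo), variance(albedo)  # sample mean and sample variance
alpha = 0.10

# 90% CI for the mean (large-sample normal approximation)
z = NormalDist().inv_cdf(1 - alpha / 2)
half = z * sqrt(s2 / n)
mean_ci = (y_bar - half, y_bar + half)

# 90% CI for the variance: (n-1)s^2 divided by upper and lower chi-square quantiles
df = n - 1
var_ci = (df * s2 / chi2_quantile(1 - alpha / 2, df),
          df * s2 / chi2_quantile(alpha / 2, df))

print(f"mean CI = ({mean_ci[0]:.3f}, {mean_ci[1]:.3f})")
print(f"variance CI = ({var_ci[0]:.3f}, {var_ci[1]:.3f})")
```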