STA 3032-CMB – Probability and Statistics for Engineers

Spring 2014 Term Project

Available: Thursday, January 23. Due: Thursday, April 10.

Instructions:

Use Excel, Minitab, R, or any other software of your choice to analyze the data provided in the Excel file.

Minitab tutorials are available in the MeetMinitab file on the course homepage in webcourses. If you decide to use Excel, ensure that the Analysis ToolPak add-in is present.

Visit: http://www.excel-easy.com/data-analysis/analysis-toolpak.html if you don’t have it.

Every graph must have appropriate titles and subtitles (if necessary).

Numerical answers must be given to 3 decimal places.

Show all workings clearly and logically.

No hard copy of the project will be accepted. Only MS Word and PDF file submissions will be accepted, and all submissions must be made through Webcourses.

The questions are worth 98 points in total; the remaining 2 points will be awarded for professional presentation.

This is an individual assignment. No collaboration among students is allowed.

Questions:

Periodically, the Federal Trade Commission (FTC) ranks domestic cigarette brands according to tar, nicotine, and carbon monoxide content. The test results are obtained by using a sequential smoking machine to “smoke” cigarettes to a 23-millimeter butt length. The tar, nicotine, and carbon monoxide concentrations (in milligrams) in the residual “dry” particulate matter of the smoke are then measured. The nicotine levels of 500 cigarette brands are saved in the NICOTINE sheet of the given Excel data file.

(a) Find the range, variance, and standard deviation of this data set.

(b) Create a relative frequency histogram of the nicotine levels.

(c) Compute the interval $\bar{y} \pm 2s$.

(d) Estimate the percentage of brands with nicotine levels below 0.5 mg.

20 points
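For the nicotine question above, the numerical summaries can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `summarize` and the list `demo` are hypothetical, and `demo` holds made-up placeholder values, not the contents of the NICOTINE sheet. The percentage below 0.5 mg is estimated here as the plain sample proportion, which is one reasonable reading of the question.

```python
import statistics

def summarize(levels, cutoff=0.5):
    """Range, sample variance, sample SD, ybar +/- 2s, and % of values below cutoff."""
    ybar = statistics.mean(levels)
    s = statistics.stdev(levels)  # sample standard deviation (n - 1 divisor)
    return {
        "range": max(levels) - min(levels),
        "variance": statistics.variance(levels),  # sample variance (n - 1 divisor)
        "std_dev": s,
        "interval": (ybar - 2 * s, ybar + 2 * s),
        "pct_below": 100 * sum(y < cutoff for y in levels) / len(levels),
    }

# Hypothetical placeholder values, NOT the NICOTINE sheet data.
demo = [0.3, 0.6, 0.9, 1.1, 0.8, 0.4, 1.0, 0.7]

result = summarize(demo)
low, high = result["interval"]
print(f"range     = {result['range']:.3f}")
print(f"variance  = {result['variance']:.3f}")
print(f"std dev   = {result['std_dev']:.3f}")
print(f"ybar +/- 2s = ({low:.3f}, {high:.3f})")
print(f"% below 0.5 = {result['pct_below']:.3f}")
```

Replacing `demo` with the 500 values from the sheet would give the figures asked for, reported to 3 decimal places as the instructions require.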

In 1879, A. A. Michelson made 100 determinations of the velocity of light in air using a modification of a method proposed by the French physicist Foucault. He made the measurements in five trials of 20 measurements each. The observations (in kilometers per second) are listed below; each value has had 299,000 subtracted from it (see the LIGHT sheet of the given Excel data file). The currently accepted true velocity of light in a vacuum is 299,792.5 kilometers per second. Stigler (1977, The Annals of Statistics) reported that the “true” value for comparison to these measurements is 734.5.

(a) Construct normal probability plots for Trial 4 and Trial 5.

(b) Construct comparative box plots of these measurements.

(c) Does it seem that all five trials are consistent with respect to the variability of the measurements?

(d) Are all five trials centered on the same value?

(e) How does each group of trials compare to the true value?

(f) Could there have been bias in the measuring instrument?

Trial 1   Trial 2   Trial 3   Trial 4   Trial 5
    850       960       880       890       890
   1000       830       880       910       870
    740       940       880       810       840
    980       790       910       920       870
    900       960       880       810       780
    930       810       850       890       810
   1070       940       860       820       810
    650       880       870       860       740
    930       880       720       800       760
    760       880       840       880       810
    850       800       720       770       810
    810       830       840       720       940
    950       850       620       760       790
   1000       800       850       840       950
    980       880       860       740       810
   1000       790       840       850       800
    980       900       970       750       820
    960       760       840       850       810
    880       840       950       760       850
    960       800       840       780       870

20 points
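As a starting point for comparing the trials in the Michelson question above, the sketch below computes each trial's mean, standard deviation, and offset from the comparison value 734.5. The rows are copied from the table in the question; the names `TABLE`, `trials`, and `TRUE_VALUE` are illustrative choices. It does not draw the plots and is not a substitute for the written interpretation.

```python
import statistics

TRUE_VALUE = 734.5  # Stigler's comparison value, on the "minus 299,000" scale

# Rows copied from the table in the question; columns are Trial 1 .. Trial 5.
TABLE = """
850 960 880 890 890
1000 830 880 910 870
740 940 880 810 840
980 790 910 920 870
900 960 880 810 780
930 810 850 890 810
1070 940 860 820 810
650 880 870 860 740
930 880 720 800 760
760 880 840 880 810
850 800 720 770 810
810 830 840 720 940
950 850 620 760 790
1000 800 850 840 950
980 880 860 740 810
1000 790 840 850 800
980 900 970 750 820
960 760 840 850 810
880 840 950 760 850
960 800 840 780 870
"""

rows = [[int(v) for v in line.split()] for line in TABLE.strip().splitlines()]
trials = list(zip(*rows))  # transpose: one tuple of 20 values per trial

for number, values in enumerate(trials, start=1):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 divisor)
    print(f"Trial {number}: mean = {mean:.3f}, sd = {sd:.3f}, "
          f"mean - true = {mean - TRUE_VALUE:.3f}")
```

Comparing the printed standard deviations speaks to the consistency of variability, and the offsets from 734.5 speak to centering and possible bias.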

Refer to the MPG sheet of the provided Excel data file.

(a) Describe the MPG mileage ratings of the 100 cars with a box plot. Identify any outliers present in the data and report their values.

(b) Display a table of the descriptive statistics (i.e., mean, variance, standard deviation, standard error, Q1, Q3, and IQR).

18 points
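The descriptive statistics table for the MPG question above could be assembled along the following lines. This is a sketch under assumptions: `describe` and `demo_mpg` are hypothetical names, `demo_mpg` contains made-up placeholder values rather than the MPG sheet, and the "error" entry in the question is read here as the standard error of the mean, s/sqrt(n). Quartile conventions differ between packages, so Q1, Q3, and the IQR may not match Excel or Minitab exactly.

```python
import math
import statistics

def describe(values):
    """Return the descriptive statistics listed in the question as a dict."""
    n = len(values)
    s = statistics.stdev(values)  # sample SD (n - 1 divisor)
    # quantiles() defaults to the "exclusive" method; other software may differ.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),
        "std dev": s,
        "std error": s / math.sqrt(n),  # assumed meaning of the "error" entry
        "Q1": q1,
        "Q3": q3,
        "IQR": q3 - q1,
    }

# Hypothetical placeholder values, NOT the MPG sheet data.
demo_mpg = [30.0, 32.5, 35.1, 36.3, 37.0, 37.9, 38.4, 40.2, 44.9]

table = describe(demo_mpg)
for name, value in table.items():
    print(f"{name:<10}{value:>10.3f}")
```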

An engineer is studying the formulation of a cement mortar. He is interested in finding out whether treating the mixture with an emulsifier affects the curing time and tension bond strength of the mortar. The tension bond strength data from this experiment are given below:

Modified 16.85 16.4 17.21 16.35 16.52 17.04 16.96 17.15 16.59 16.57

Unmodified 16.62 16.75 17.37 17.12 16.98 16.87 17.34 17.02 17.08 17.27

(a) Draw and compare the dot diagrams for the two data sets.

(b) Draw box plots to assist in the interpretation of the tension bond strength data from this experiment.

(c) Are there any outliers in the data?

15 points
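One common way to check the mortar data above for outliers is the box-plot rule that flags points more than 1.5 × IQR beyond the quartiles. The sketch below applies that rule to the two samples given in the question. The helper name `iqr_outliers` is illustrative, and the 1.5 × IQR rule and the quartile method are assumptions, since the question does not name a specific criterion.

```python
import statistics

# Tension bond strength values as listed in the question.
modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27]

def iqr_outliers(values):
    """Return the points lying outside the 1.5 * IQR box-plot fences."""
    # quantiles() defaults to the "exclusive" method; other software may differ.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

for label, data in (("Modified", modified), ("Unmodified", unmodified)):
    print(f"{label}: mean = {statistics.mean(data):.3f}, "
          f"outliers = {iqr_outliers(data)}")
```

The same fences are what most box-plot routines use to decide which points to draw individually, so the list returned here should agree with the plotted box plots up to differences in quartile convention.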

The maker of a shampoo knows that customers like this product to have a lot of foam. Ten sample bottles of the product are selected at random and the foam heights observed are as follows (in millimeters): 210, 215, 194, 195, 211, 201, 198, 204, 208, and 196.

(a) Is there evidence to support the assumption that foam height is normally distributed?

(b) Find a 95% CI on the mean foam height.

(c) Find a 95% prediction interval on the next bottle of shampoo that will be tested.

(d) Find an interval that contains 95% of the shampoo foam heights with 99% confidence.

(e) Explain the difference in the intervals computed in parts (b), (c), and (d).

15 points
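The confidence and prediction intervals in the shampoo question above differ only in the factor that multiplies s, which the following sketch makes explicit. It assumes normally distributed foam heights and takes the critical value t(0.025, 9) = 2.262 from a standard t table rather than computing it. The tolerance interval is not computed here because it needs a tolerance factor k from a separate table. Variable names such as `T_CRIT` are illustrative.

```python
import math
import statistics

foam = [210, 215, 194, 195, 211, 201, 198, 204, 208, 196]  # mm, from the question

n = len(foam)
ybar = statistics.mean(foam)
s = statistics.stdev(foam)  # sample SD (n - 1 divisor)

T_CRIT = 2.262  # t(0.025, df = 9), read from a standard t table (assumed)

ci_half = T_CRIT * s / math.sqrt(n)          # CI for the mean: shrinks as n grows
pi_half = T_CRIT * s * math.sqrt(1 + 1 / n)  # PI for one new bottle: never below t*s

ci = (ybar - ci_half, ybar + ci_half)
pi = (ybar - pi_half, ybar + pi_half)

print(f"mean = {ybar:.3f}, s = {s:.3f}")
print(f"95% CI for the mean:      ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI for next bottle:   ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval is wider because it accounts for the variability of a single future observation in addition to the uncertainty in the estimated mean.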

The National Snow and Ice Data Center (NSIDC) collects data on albedo, depth, and physical characteristics of ice melt ponds in the Canadian Arctic. Albedo is the ratio of the light reflected by ice to that received by it. (High albedo values give a white appearance to the ice.) Visible albedo values are recorded for a sample of 504 ice melt ponds located in the Canadian Arctic. These data are saved in the ICE sheet of the provided data file. Another variable of interest is the type of ice observed for each pond. Ice type (column F) is classified as first-year ice (represented as the number 1), multiyear ice (2), or landfast ice (3).

(a) Construct a summary table and a horizontal bar graph to describe the ice types of the 504 melt ponds.

(b) Find a 90% confidence interval for the true mean visible albedo value of all Canadian Arctic ice ponds, using the data in columns L to T.

(c) Find a 90% confidence interval for the true variance of the visible albedo values of all Canadian Arctic ice ponds, using the data in columns L to T.

(d) What is the interpretation of the intervals obtained in (b) and (c)?

10 points
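For reference, the two intervals requested in the albedo question above take the following standard forms, written for a confidence level of 1 − α with α = 0.10. The notation is an assumption of this note: $\bar{y}$ and $s^2$ are the sample mean and variance of the n albedo values, $z_{\alpha/2}$ and $\chi^2_{\alpha/2,\,n-1}$ denote upper-tail critical values, and the large-sample z form is used for the mean because n is large. The variance interval additionally assumes the albedo values are approximately normally distributed.

```latex
% 90% CI for the mean (large-sample form), with z_{0.05} = 1.645
\bar{y} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}

% 90% CI for the variance (assumes approximate normality)
\frac{(n-1)\,s^{2}}{\chi^{2}_{\alpha/2,\,n-1}}
\;\le\; \sigma^{2} \;\le\;
\frac{(n-1)\,s^{2}}{\chi^{2}_{1-\alpha/2,\,n-1}}
```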