This document assumes you have Minitab installed on your computer. The instructions are based on Minitab 14, which is known to run on Windows 98SE and XP. Version 15 requires XP but initially had problems with Vista.
Getting and Opening Data Files
We will use an example data set from Regression Analysis by Example (4th ed.) by Chatterjee and Hadi (Wiley, New York, 2006). Go to the web site for this book at http://www.ilr.cornell.edu/~hadi/rabe4/. We will use the computer repair data. In this study a random sample of service call records for a computer repair operation was examined, and the length of each call (in minutes) and the number of components repaired or replaced were recorded. The data are in file P027.MTB. Follow the directions on the book’s home page to download this file and save it where you can find it again on your computer. The web site is a little misleading in that the file you actually obtain will be P027.zip, so you will need a program that can unzip this file into P027.mtb. Now you can run Minitab. From Minitab’s main menu, select File, Open Worksheet and browse to where you put P027.mtb. Select that file and click on the Open button.
Simple Plots for Each Variable
Of course, the first step is to look at your data. Pull down the Graph menu and select Stem-and-Leaf. Double-click on each variable.
MTB > GStd.
MTB > Stem-and-Leaf 'Minutes' 'Units'.

Stem-and-leaf of Minutes  N = 14
Leaf Unit = 10
  2   0  22
  3   0  4
  5   0  67
 (3)  0  899
  6   1  01
  4   1
  4   1  445
  1   1  6

Stem-and-leaf of Units  N = 14
Leaf Unit = 0.10
  1    1  0
  2    2  0
  3    3  0
  5    4  00
  6    5  0
 (2)   6  00
  6    7  0
  5    8  0
  4    9  00
  2   10  00
We could have made histograms or boxplots. We simply want to see if there are any peculiarities in the data for each variable by itself before we look into relationships between variables. We see none here.
Pull down the Graph menu and select Scatterplot. Accept the defaults on the first dialog. On the second, you must indicate your response and predictor variables. Available choices appear in a list at left. Double click on Minutes to select it as the Y variable. The cursor moves to the X variable column and you can now double click on Units to select that. Then click on OK and a scatterplot will appear in a new window.
Note that the dialog boxes include numerous options (which we do not need at the moment). We are not surprised to see that the length of a service call increases with the number of components repaired or replaced.
Correlation and Covariance
Minitab has a command language in addition to a menu interface. Each has its pros and cons. A major advantage of the command line is that you can store commands in a macro file and rerun the same analysis over and over. This is useful, for example, if you do the same analysis every month for new data on the same variables (such as data on your business). If commands are not appearing in the top window when you make choices from the menu (as they did in the stem-and-leaf example above), select Editor (not Edit) and Enable Commands. You can check that commands are enabled by putting the cursor at the Minitab prompt “MTB >” in the top half of the screen and typing desc c1. This stands for “describe column 1”, where column 1 is the first column of the spreadsheet in the bottom half of the screen. The commands for correlation and covariance are easy to remember. You can cite variables by column (as c1, for example) or by name (in single quotes).
MTB > corr 'Minutes' 'Units'

Pearson correlation of Minutes and Units = 0.994
P-Value = 0.000

MTB > covariance 'Minutes' 'Units'

           Minutes     Units
Minutes  2136.0275
Units     136.0000    8.7692
The extra numbers in the covariance table are variances. Pull down the Stat menu and select Basic Statistics, then Display Descriptive Statistics. Pick one or both variables. Summary statistics for the variable(s) of your choice should appear in the existing top window.
Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
Minutes   14   0   97.2     12.4   46.2     23.0   60.3    96.5  146.0    166.0
Units     14   0  6.000    0.791  2.961    1.000  3.750   6.000  9.000   10.000
(You may get a different selection of summary statistics.) These summaries do not include variances, so go through the process again. In the dialog box, click on the Statistics button. You get a list of possible summary statistics. Tick Variance and make any other changes that appeal to you. Click OK. The variances should be added to a new summary table and should match (except for rounding) the numbers in the covariance window.
Variable   N  N*   Mean  SE Mean  StDev  Variance  Minimum     Q1  Median
Minutes   14   0   97.2     12.4   46.2    2136.0     23.0   60.3    96.5
Units     14   0  6.000    0.791  2.961     8.769    1.000  3.750   6.000

Variable     Q3  Maximum
Minutes   146.0    166.0
Units     9.000   10.000
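The correlation, covariance, and variances reported above are tied together by the usual identity r = cov(x, y) / (sd(x) sd(y)). The following is a minimal Python sketch (Python is not part of the Minitab workflow; it is used here only as a calculator) that rechecks Minitab's numbers from the covariance table. The function name is illustrative.

```python
import math

# Values copied from the Minitab covariance output.
var_minutes = 2136.0275      # variance of Minutes
var_units = 8.7692           # variance of Units
cov_minutes_units = 136.0000 # covariance of Minutes and Units


def correlation_from_covariance(cov_xy, var_x, var_y):
    """Pearson correlation: covariance divided by the product of the SDs."""
    return cov_xy / math.sqrt(var_x * var_y)


r = correlation_from_covariance(cov_minutes_units, var_units, var_minutes)

# Standard deviations are the square roots of the variances.
sd_minutes = math.sqrt(var_minutes)
sd_units = math.sqrt(var_units)

print(f"r = {r:.3f}")
print(f"StDev Minutes = {sd_minutes:.1f}, StDev Units = {sd_units:.3f}")
```

The printed values should agree with the Pearson correlation and the StDev column in the Minitab output, apart from rounding.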
Running the Regression
Now pull down Stat yet again and select Regression, Regression. Select your variables and click OK. A brief regression output should appear.
The regression equation is
Minutes = 4.16 + 15.5 Units

Predictor     Coef  SE Coef      T      P
Constant     4.162    3.355   1.24  0.239
Units      15.5088   0.5050  30.71  0.000

S = 5.39172   R-Sq = 98.7%   R-Sq(adj) = 98.6%

Analysis of Variance
Source          DF     SS     MS       F      P
Regression       1  27420  27420  943.20  0.000
Residual Error  12    349     29
Total           13  27768
The t-values (here “T”) test the hypotheses that the corresponding population parameters are 0. To test a nonzero value, subtract it from the coefficient in the regression output and divide the difference by the coefficient’s standard error (SE Coef). For a confidence interval, take the coefficient plus or minus the product of its standard error and the t-value for the desired confidence level with 12 degrees of freedom (the Residual Error DF, which is n - 2 = 14 - 2). A calculator is enough for both computations.
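The two hand calculations described here can be sketched in a few lines of Python standing in for the calculator. The coefficient and standard error are copied from the regression output; the null value of 12 minutes per unit is purely hypothetical, and the critical value 2.179 (two-sided 95%, 12 degrees of freedom) is taken from a standard t table rather than from Minitab.

```python
# Copied from the Minitab regression output for the Units coefficient.
coef = 15.5088
se_coef = 0.5050

# Assumption: two-sided 95% critical value for 12 df, from a standard t table.
T_CRIT_95_DF12 = 2.179

# Hypothetical null value, chosen only to illustrate a nonzero test.
null_value = 12.0


def t_statistic(estimate, hypothesized, std_error):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


def confidence_interval(estimate, std_error, t_crit):
    """estimate +/- t_crit * standard error."""
    margin = t_crit * std_error
    return (estimate - margin, estimate + margin)


t_zero = t_statistic(coef, 0.0, se_coef)        # reproduces Minitab's T column
t_nonzero = t_statistic(coef, null_value, se_coef)
ci_low, ci_high = confidence_interval(coef, se_coef, T_CRIT_95_DF12)

print(f"T for slope = 0:  {t_zero:.2f}")
print(f"T for slope = {null_value:g}: {t_nonzero:.2f}")
print(f"95% CI for slope: ({ci_low:.2f}, {ci_high:.2f})")
```

Compare the absolute value of the computed t with the same critical value to decide the two-sided test at the 5% level.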
To plot the regression line on the scatterplot, redo the scatterplot but this time pick With Regression in the first dialog box.
You can cut and paste Minitab output into your own reports, but note that the text windows on the statistics.com Assignments page will only accept text input. So, of the output examples above, the scatterplots could not be pasted there. All the text that appears in the upper Session window in Minitab can be pasted into Assignments. To copy the contents of a graphics window (say, for a report you are writing with your word processor), first click on the graph window to make it the current window if it isn’t already, then select Edit > Copy Window. You will not see anything happen, but if you go to another application you can paste there.
Regression through the Origin
To fit a regression line through the origin (i.e., intercept=0) redo the regression but this time select Options on the dialog box where you pick variables. (While you are here, notice some of the other choices, such as computing the Durbin-Watson statistic.) Untick the box Fit Intercept. The new results (with commands) should be
MTB > Regress 'Minutes' 1 'Units';
SUBC>   NoConstant;
SUBC>   Brief 2.

The regression equation is
Minutes = 16.1 Units

Predictor     Coef  SE Coef      T      P
Noconstant
Units      16.0744   0.2213  72.63  0.000

S = 5.50228

Analysis of Variance
Source          DF      SS      MS        F      P
Regression       1  159683  159683  5274.42  0.000
Residual Error  13     394      30
Total           14  160077
or, with more decimal places, Minutes = 16.0744 * Units.
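For a line through the origin, the least-squares slope is the sum of x*y divided by the sum of x squared, and the Total row of the ANOVA table is the uncorrected sum of squares of Minutes with n = 14 degrees of freedom (no mean is subtracted). The Python sketch below illustrates this under one loud assumption: the 14 data pairs are not listed in this handout, so they are typed in here from the published data set and should be checked against your own worksheet before you rely on them. The function name is illustrative.

```python
import math

# ASSUMPTION: values transcribed from the published computer repair data;
# they are not printed in this handout, so verify them against P027.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]


def fit_through_origin(x, y):
    """Least-squares fit of y = b * x (no intercept)."""
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    sum_yy = sum(yi * yi for yi in y)   # uncorrected total SS

    slope = sum_xy / sum_xx
    ss_regression = slope * sum_xy
    ss_error = sum_yy - ss_regression
    df_error = len(x) - 1               # only one parameter is estimated
    s = math.sqrt(ss_error / df_error)
    se_slope = s / math.sqrt(sum_xx)
    return {
        "slope": slope,
        "se_slope": se_slope,
        "s": s,
        "ss_regression": ss_regression,
        "ss_error": ss_error,
        "ss_total": sum_yy,
    }


fit = fit_through_origin(units, minutes)
for name, value in fit.items():
    print(f"{name:>14}: {value:.4f}")
```

If the data are entered correctly, the slope, its standard error, S, and the three sums of squares should match the Minitab output above to the digits shown there.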
If you wish to explore Minitab’s command line, pull down the main Help menu and select Session Command Help (which may not be present unless you asked for it when installing Minitab).
To make a prediction (with confidence interval), rerun the regression. (We will rerun the original regression rather than the one through the origin.) In the Options dialog box is a window labeled Prediction Intervals for New Observations. Type in the value of Units for which you want a prediction of Minutes. We predicted the length of a service call with four components repaired or replaced. The prediction output appears after the regular regression output and looks like
Predicted Values for New Observations
New Obs    Fit  SE Fit       95% CI           95% PI
      1  66.20    1.76  (62.36, 70.03)  (53.84, 78.55)

Values of Predictors for New Observations
New Obs  Units
      1   4.00
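The prediction output can be rechecked by hand with the textbook formulas for simple linear regression. The Python sketch below uses only numbers that appear in the Minitab output earlier in this document (coefficients, S, n, the mean and variance of Units) plus the table value 2.179 for a two-sided 95% t with 12 degrees of freedom, which is an assumption taken from a standard t table. It shows the standard formulas, not necessarily Minitab's internal computation, and the function name is illustrative.

```python
import math

# Copied from the Minitab output earlier in this document.
b0, b1 = 4.162, 15.5088     # intercept and slope
s = 5.39172                 # residual standard deviation S
n = 14                      # number of observations
mean_units = 6.000
var_units = 8.7692

# Assumption: two-sided 95% critical value for 12 df, from a standard t table.
T_CRIT_95_DF12 = 2.179


def predict_with_intervals(x0):
    """Fitted value, SE of fit, 95% CI for the mean, and 95% PI at x0."""
    sxx = (n - 1) * var_units                   # corrected sum of squares of Units
    fit = b0 + b1 * x0
    se_fit = s * math.sqrt(1.0 / n + (x0 - mean_units) ** 2 / sxx)
    se_pred = math.sqrt(s ** 2 + se_fit ** 2)   # adds new-observation noise
    ci = (fit - T_CRIT_95_DF12 * se_fit, fit + T_CRIT_95_DF12 * se_fit)
    pi = (fit - T_CRIT_95_DF12 * se_pred, fit + T_CRIT_95_DF12 * se_pred)
    return fit, se_fit, ci, pi


fit4, se_fit4, ci4, pi4 = predict_with_intervals(4.0)
print(f"Fit = {fit4:.2f}, SE Fit = {se_fit4:.2f}")
print(f"95% CI = ({ci4[0]:.2f}, {ci4[1]:.2f})")
print(f"95% PI = ({pi4[0]:.2f}, {pi4[1]:.2f})")
```

The results should agree with Minitab to within rounding; small differences in the last digit can arise because the inputs here are rounded values copied from the output. The prediction interval is wider than the confidence interval because it also allows for the scatter of an individual call around the regression line.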
© 2007 statistics.com