Getting Started with S-PLUS 6 for Windows
July 2001
Insightful Corporation
Seattle, Washington
Proprietary Notice
Insightful Corporation owns both this software program and its
documentation. Both the program and documentation are
copyrighted with all rights reserved by Insightful Corporation.
The correct bibliographical reference for this document is as follows:
Getting Started with S-PLUS 6 for Windows, Insightful Corporation,
Seattle, WA.
Printed in the United States.
Copyright Notice
Copyright © 1987-2001, Insightful Corporation. All rights reserved.
Insightful Corporation
1700 Westlake Avenue N, Suite 500
Seattle, WA 98109-3044
USA
CONTENTS
Chapter 1 What’s New in S-PLUS 6
S Version 4 Engine
S-PLUS Graphlets™
Microsoft Excel
CONNECT/C++
Statistics
Graphics
Data Import and Export
Project Folders and Chapters
Object Explorer
Additional Features
Chapter 2 Quick Tour
Overview
Getting Data
Creating a 2D Graph
Performing a Linear Regression
Identifying and Labeling Data Points
Editing a Graph
Creating a 3D Graph
Viewing Objects and Databases
Organizing Your Work
Chapter 3 Extended Tour
Importing a File
Editing Variable Names and Adding Descriptions
Variable Names
Column Descriptions
Creating a 2D Graph
Changing Graph Features
Axes and Labels
Plot Properties
Titles
Using Trellis Graphics for Multipanel Conditioning
Highlighting Data Points
Extracting Graph Panels
Applying Statistical Models
Summaries
Linear Models
Doing More With Graphics
Varying 2D Axes Types
Creating Graphs With Multiple Axes
Embedding and Extracting Data in Graph Sheets
Creating a Graph Using the Object Explorer
Editing a Plot in the Object Explorer
Viewing Plots in Separate Panels
Removing Outliers
Creating a 3D Graph
Adding Color Draping
Creating PowerPoint Slides
Using S-PLUS With Microsoft Excel
Creating Excel Data Inside S-PLUS
Plotting Excel Data
Analyzing Excel Data
Using the S-PLUS Language
S-PLUS Language Basics
Listing Objects
Fitting a Linear Model
Running an S-PLUS Script
Getting Help in S-PLUS
HTML Help
Help in the Commands and Script Windows
Chapter 4 Summary of Basic Procedures
Using Menus
Main Menus
Shortcut (Right-Click) Menus
Using Dialogs
The Apply Button
The Dialog Rollback Buttons
Typing and Editing in Dialogs
Using Toolbars and Palettes
ToolTips
DataTips
Selecting Data
Using the Select Data Dialog
Using the Object Explorer
Importing Data
Selecting Variables to Plot
Selecting Variables in a Data Window
Selecting Variables in the Object Explorer
Creating Plots
Selecting Graph Objects
Index
1 WHAT’S NEW IN S-PLUS 6
S-PLUS 6 is a significant new release of S-PLUS integrating the next-
generation S version 4 language from Lucent Technologies into the
Windows product line and providing further improvements to the
graphical user interface and functional capabilities.
Outlined below is a brief overview of the major new or enhanced
features included in S-PLUS 6. For more detailed information, refer to
the printed or online manuals, as well as the online help.
S Version 4 Engine
The new, more powerful S language underpinning S-PLUS 6 provides
enhanced object-oriented capabilities, support for large data sets, and
enhanced performance and memory management. In addition, new
cross-platform file compatibility of data objects between the Windows
and UNIX versions of S-PLUS makes it easy to access S-PLUS data
across platforms.
S-PLUS Graphlets™
S-PLUS 6 brings you S-PLUS Graphlets™, a new interactive graphics
format for displaying graphical information on the Web. Because
S-PLUS Graphlets are interactive, your graphics come alive. Using
S-PLUS Graphlets, you can create data mining applications where the
viewer can drill down into your data or you can create hyperlinked
graphics, giving the viewer access to further information on other
Web pages.
Microsoft Excel
Tighter integration with Microsoft Excel makes it easier than ever to
analyze data stored in Excel format, giving you the ability to open
Excel worksheets within S-PLUS and create graphics or perform
statistical analyses directly from the data. Now you can:
1. Open any Excel (.xls) file as an active document inside
S-PLUS.
2. Link regions of your Excel spreadsheet to S-PLUS data frames
using the new Link Wizard.
3. Update S-PLUS data frames at the click of a button when your
data change in Excel.
4. Automatically re-establish links from Excel to S-PLUS data
frames when you next open your Excel worksheet in S-PLUS.
CONNECT/C++
Also new in S-PLUS 6 is the CONNECT/C++ Foundation Class
Library, an object-oriented C++ interface to the S engine that allows
C++ developers to write a client program using data objects and
structures from the S engine, run S functions, evaluate S syntax, and
process the results. The CONNECT/C++ foundation classes are for
C++ developers who want to construct client applications that use the
S engine for data processing and computation.
CONNECT/C++ provides a suite of C++ classes and methods for
handling S-PLUS objects in C++. The C++ classes allow easy
conversion of S-PLUS functions to C++ code that can be dynamically
loaded into S-PLUS and can run orders of magnitude faster than the
equivalent interpreted S-PLUS code. Benefits of the new classes include:
1. Significant reduction in the effort and code requirements to
write a client application using the S engine.
2. Ability to create and work with S objects without having to
know the details of the structure of the objects themselves.
3. Automatic reference counting of S objects so that objects are
cleaned up automatically when the scope changes in the
program.
4. Powerful Vector, Array, and Matrix classes that provide for
fast and powerful data manipulation through easy-to-use
methods.
5. Easy access to S-PLUS functions and results by a client
application since functions may be called from the client
given the name of the function and a list of arguments passed
to the function.
6. Overridable event handler methods that provide notification
to the client application when S-PLUS objects change and
when output is available from evaluations or data operations.
7. Lower code overhead in the client application when creating
objects since named objects can be created and initialized in
one call.
8. Ability to source S-PLUS code and debug C++ or Fortran
code from Developer Studio using the App Wizard.
Statistics
S-PLUS 6 offers new statistical techniques, including the latest NLME
methods from Pinheiro and Bates, as well as cutting-edge techniques
for robust regression and missing data handling. In addition, key
statistical functions, such as linear regression, now operate on large
data sets. New or enhanced statistics features include:
1. Functionality to perform the Shapiro-Wilk test for normality.
This test has better power than Pearson’s chi-squared test,
Kolmogorov’s test, and other tests against a variety of
commonly encountered alternatives.
2. Functionality to compute the Durbin-Watson test statistic.
This statistic is used to test the residuals of a linear model for
serial correlation.
3. NLME library updated to 3.3, including:
• A new pdMat class, pdBand, for representing banded
positive-definite matrices.
• A new corStruct class, corBand, for banded correlation
structures.
• A revised approach for calculating approximate variance-
covariance matrices for the estimates after convergence,
which allows the Hessian matrix to be obtained with
respect to restricted or unrestricted parameterizations.
• The ability to set a fixed value for the within-group
standard error (lme and nlme) or the residual standard
error (gls and gnls) in the optimization algorithm.
4. Updated user-contributed libraries, included as unsupported
libraries: the class, MASS, nnet, and spatial libraries of Bill
Venables and Brian Ripley, and the design and hmisc
libraries of Frank Harrell. (These libraries can be loaded using
the Load Library dialog.)
5. Robust Methods and Missing Data libraries added.
• Robust Methods provides better estimates and predictions
for heavy-tailed data without sacrificing efficiency for data
that are close to Gaussian. The Robust Methods library
contains the following functionality:
    • Improved robust regression for the linear model
    • Robust ANOVA
    • Robust Poisson and logistic generalized linear models
    • Robust covariance and correlation matrix estimation
    • New plots for outlier detection and comparing fits
    • New multiple model fits and comparisons paradigm
• Missing Data provides new, cutting-edge methods for
proper handling of missing data. The library uses a
model-based approach, with models fit by EM and data
augmentation (Gibbs sampler) algorithms. The data
augmentation algorithms produce multiple imputations;
users may also use their own routines for creating multiple
imputations.
The Missing Data library includes capabilities for
performing arbitrary analyses “in parallel” on multiple
completed data sets. Numerical results can also be
combined in ways that reflect the additional uncertainty
due to missing data.
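This guide does not say how the Missing Data library combines results across completed data sets. One widely used approach for multiply imputed data is Rubin's rules, sketched below in Python (not S-PLUS) purely as an illustration of the idea. The function name `pool_estimates` and the sample numbers are hypothetical and are not part of S-PLUS.

```python
from math import sqrt
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool one scalar estimate across m completed data sets (Rubin's rules).

    estimates: the point estimate obtained from each completed data set
    variances: the squared standard error of that estimate in each data set
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical results from m = 3 completed data sets
estimate, std_error = pool_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

In this made-up example the pooled standard error is larger than the standard error from any single completed data set, because the between-imputation term adds the uncertainty that comes from not knowing the missing values.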