
Multivariate Data Plots

Posted: October 22, 2018

 Why plot data?

1)     Plotting your data should usually be the first thing done once a data set is ready to be analyzed.  The purpose is to

a)    Look for trends

b)    Discover unusual observations (outliers)

c)    Suggest items to examine in more sophisticated statistical analyses

2)     During a sophisticated statistical analysis, plots may be helpful to lead to particular conclusions (example: residual plots in regression analysis)

3)     Once the sophisticated statistical analysis is done, plots should be used to help explain the results to yourself and to others.  This is especially helpful for statisticians consulting with subject-matter researchers.

1       Three-Dimensional Data Plots

 Bubble Plot

Scatter plot of two variables with the size of the plotting character proportional to a third variable.  The plotting character for the third variable is usually a circle (bubble).

 Example: Cereal data (cereal_ch3.sas)

 Partial SAS code: 

dm 'log;clear;output;clear;';
options ps=50 ls=70 pageno=1;
goptions reset=global border ftext=swiss gunit=cm htext=0.4 
         htitle=0.5;
goptions display noprompt;

**********************************************************;
**                                                      **;
** AUTHOR: Chris Bilder                                 **;
** COURSE: STAT 873                                     **;
** DATE: 1-18-01                                        **;
** UPDATE: 12-13-01, 8-24-03                            **;
** PURPOSE: Construct plots for the cereal data         **;
**                                                      **;
** NOTES:                                               **;

**                                                      **;
**********************************************************;
title1 'Chris Bilder, STAT 873';
*Read in Excel file containing the cereal data;
*  Note: The variable names are ID, shelf, cereal, size_g, sugar_g, fat_g, and sodium_mg;
proc import out=set1 
            datafile="c:\Chris\UNL\Stat5063\Chapter 1\cereal_data.xls"
            dbms=excel2000 replace;
     getnames=yes;
run;

*adjust for the serving size;

data set2;
  set set1;
  sugar = sugar_g/size_g;
  fat = fat_g/size_g;
  sodium = sodium_mg/size_g;
run; 

*Construct bubble plot;
*  Notes: bsize is the scaling factor for the bubbles with a default;
*         value of 5.  If you are not careful, changing bsize may;
*         change your interpretation of the plot;
proc gplot data=set2;
  bubble sugar*fat = sodium / vaxis=axis1 haxis=axis2 frame grid 
                              bcolor=blue bsize=5;
  title2 'Sugar vs. fat with bubble proportional to sodium';
  axis1 label = (a=90 'Sugar')
        length = 12;
  axis2 label = ('Fat')
        length = 12;
run;

It is difficult to interpret the above plot since there does not appear to be much of a difference between the bubble sizes.  Below is the SAS code needed to change the size of the bubbles and to label each bubble with its corresponding sodium value.

*In order to label the bubbles with their corresponding sodium values,;
*  they need to be rounded.  If they were not, 10+ digits would be;
*  shown on the plot for each sodium value;

data set3;
  set set2;
  sodium_round = round(sodium,0.01);
run;
*Note: blabel prints the sodium value on the plot;
*      bsize is 10 instead of the default of 5;
proc gplot data=set3;
  bubble sugar*fat = sodium_round / vaxis=axis1 haxis=axis2 frame 
                    grid bcolor=blue bsize=10 blabel;
  title2 'Sugar vs. fat with bubble proportional to sodium';
  axis1 label = (a=90 'Sugar')
        length = 12;
  axis2 label = ('Fat')
        length = 12;
run;
 

Changing the bubble size does help some, and increasing it even more helps a little more (plot excluded).  Note the two 0 values in the southwest corner of the plot.  These are cereals with no sodium.  Without labeling the points, these cereals would have been missed!  What are these cereals? [t1]
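One way to answer the question directly is to list the observations whose adjusted sodium is zero.  The step below is only a sketch; it assumes the set2 data set created earlier and the variable names given in the PROC IMPORT note (ID, shelf, cereal, sodium_mg).

```sas
*Sketch only: list the cereals with zero sodium per gram;
proc print data=set2;
  where sodium = 0;
  var ID shelf cereal sodium_mg sodium;
  title2 'Cereals with zero sodium';
run;
```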

What happens if standardized data values were plotted instead of the raw values?  

*Compare the above plot to a plot using standardized values;
proc standard data=set2 out=stand mean=0 std=1;
  var sugar fat sodium;
run;
data stand2;
  set stand;
  sodium_round = round(sodium,0.01);
run;

*Note: blabel prints the sodium value on the plot;
proc gplot data=stand2;
  bubble sugar*fat = sodium_round / vaxis=axis1 haxis=axis2 
         frame grid bcolor=blue bsize=10 blabel;
  title2 'Sugar vs. fat with bubble proportional to sodium 
          (standardized data)';
  axis1 label = (a=90 'Sugar')
        length = 12;
  axis2 label = ('Fat')
        length = 12;
run;
 

Sodium values near 0 have small bubbles.  As sodium values increase in absolute size, the bubbles get larger.  Bubble plots are also useful for model diagnostic plots.  In a regression class, you may have seen these used. 
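PROC STANDARD with mean=0 and std=1 replaces each value x by z = (x - sample mean)/(sample standard deviation), computed separately for each variable.  A quick check of the result, sketched below under the assumption that the stand data set from the code above exists, is to confirm that each standardized variable has mean 0 (up to rounding) and standard deviation 1.

```sas
*Sketch only: verify the standardization done by PROC STANDARD;
proc means data=stand n mean std min max maxdec=4;
  var sugar fat sodium;
  title2 'Check of standardized cereal variables';
run;
```

The MIN column is also useful later, since the most negative standardized value is what a negative-valued bubble or ray has to accommodate.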

2   3D Scatter Plot

Scatter plot of three variables

Example: Cereal data (cereal_ch3.sas)

 

*Construct 3D scatter plot;
*  Note: Rotate=70 and tilt=70 are the defaults;
proc g3d data=set2;
  scatter sugar*fat = sodium / grid  zticknum=6 xticknum=6 yticknum=6
                               shape='cube' color='blue' rotate=140
                               tilt=40;
  title2 '3D scatter plot';
run;

 


Again, it is difficult to see specific trends in the plot that may help us interpret the data better.  We do see some potential outliers on the far left side.    

A few different values of TILT and ROTATE should be tried to see if looking at the data from a different angle may help. 
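As a sketch of how several viewing angles might be tried in one step, the ROTATE= option of the SCATTER statement accepts a list of angles and draws a separate graph for each one.  The particular angles below are arbitrary choices for illustration, not values from the original program.

```sas
*Sketch only: draw the same 3D scatter plot from four rotation angles;
*  The angle list 20, 60, 100, 140 is an arbitrary illustration;
proc g3d data=set2;
  scatter sugar*fat = sodium / grid shape='cube' color='blue'
                               rotate=20 to 140 by 40 tilt=40;
  title2 '3D scatter plot from several rotation angles';
run;
```

A list can be given to TILT= in the same way if the vertical viewing angle should change as well.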

 


3  Plots of Higher Dimensional Data

3D Bubble Plot

3D scatter plot with the plotting symbol proportional to a 4th variable. PROC G3D in SAS can be used to construct these types of plots. 

 Example: Cereal data (cereal_ch3.sas)

Below is additional SAS code used to construct a 3D bubble plot.  A list of predefined SAS plotting colors is available at www.okstate.edu/sas/v8/sashtml/gref/zgscheme.htm#zxvalues [t4] and the shapes are available at http://support.sas.com/91doc/getDoc/graphref.hlp/symbolchap.htm#global-symboltable

data set3;
  set set2;
  length color $6 shape $8; *lengths must hold the longest values (purple and cylinder);
  if shelf=1 then do; 
     shape = 'balloon';
     color = 'red';
     end;
  if shelf=2 then do; 
     shape = 'cube';
     color = 'purple';
     end;
  if shelf=3 then do;
     shape = 'cylinder';
     color = 'green';
     end;
  if shelf=4 then do;
     shape = 'pyramid';
     color = 'blue';
     end;
run;

*Construct 3D scatter plot;
proc g3d data=set3;
  scatter sugar*fat = sodium / grid  zmin=0 zmax=12 zticknum=7 
                               xticknum=7 yticknum=7
                               shape=shape size=shelf 
                               color=color 
      rotate=140 tilt=70;
title2 '3D scatter plot with color, shape, and size of points 
        corresponding to shelf';
footnote 'Shelf: #1=Balloon(red), #2=Cube(purple), 
          #3=Cylinder(green), #4=Pyramid(blue)';
run;
 

 

What statements can be made about the data from the above plot?

Six different variables could be shown on a plot like this where the shape, size, and color of the plotting symbol correspond to different variables. 

 

4  Star Plots and Sun-Ray Plots

·      Each star (or sun) represents a particular experimental unit.

·      The center of the star denotes 0 (or a specified value – like the smallest value for a variable).

·      For each variable, a line or “ray” extends out from the center at a length corresponding to the variable value.

Some of its uses

·      Help detect outliers – stars that are very different from the others indicate possible outliers

·      Validate cluster analysis results – observations grouped within the same cluster should have similar stars.

 When using SAS for these plots, all variables should be standardized to make sure they are all measured on the same scale (unless they already are).  Please note that R is much easier to use to create these plots. 

  Example: Cereal data (cereal_ch3.sas)

  PROC GCHART can be used to construct the star plots.  Below is additional SAS code used to construct the plots.  Note that some data set manipulation is needed to get the data set in the proper form. 

 

*******************************************************;
*Create Star (Sun) plot                               *;
*  The SAS star plots in PROC GCHART were not designed*;
*  to handle the data in its present form.  The data  *;
*  needs to be transformed to the following form      *;
*       ID  variable  value                           *;
*        1   Sugar    0.4528440736                    *;  
*        1   Fat      -1.162340844                    *;
*        1   Sodium   0.1854699394                    *;
*        2   Sugar    -1.457523293                    *;
*      ...                                            *;
*       40   Sodium   0.4271883325                    *;
* before the plot can be created.  The SUMVAR option  *;
* in PROC GCHART sums only 1 value for each ID and    *;
* variable combination (SAS trick).                   *;
*******************************************************;
data star1;
  set stand;
  value = sugar;  variable='Sugar '; output;
  value = fat;    variable='Fat   '; output;
  value = sodium; variable='Sodium'; output;
  keep ID value variable;
run;
*Find the minimum sugar, fat, or sodium value;
*  Use this value for the STARMIN option in PROC GCHART;
proc means data=star1 min;
  var value;
run;
 
*Construct the sun (star) plots;
*  Notes: 1)The standardized data is used for the plot;
*         2)If the STARMIN option was not used, all values < 0 would have 
*           0 length rays since the default ray origin is 0.;
*         3)Instead of using the starmin option, 2.2800577 could be added;
*           to VALUE to ensure all of them are >=0;
*         4)The NOCONNECT option for STAR can be used to avoid having rays;
*           filled in on the plot;
*         5)The first color plotted is black (including the text).  The 
*          colors() statement tells SAS the order to cycle through colors;
*         6)fill=solid tells SAS how to fill the rays;
*         7)A “statistic” needs to be plotted.  Here, SUMVAR=value; 
*           specifies the statistic even though the sum of one; 
*           observation is being done;
*         8) noheading tells SAS not to print “sum of var=value”; 
goptions colors=(black red blue) htext=0.2 htitle=0.5;
proc gchart data=star1;
  title2 'Star (Sun) plots for the cereal data';
  star variable / noheading sumvar=value fill=solid 
                  across=5 down=3 group=ID starmin=-2.2800577; 
  footnote 'Ray direction: Northwest=Sodium, East=Fat, Southwest=Sugar';
run;
 
*All cereals are on one plot since the value=none option is used (more 
* room);
goptions colors=(black red blue) htext=0.4 htitle=0.5;
proc gchart data=star1;
  title2 'Star (Sun) plots for the cereal data';
  star variable / noheading value=none sumvar=value fill=solid 
                  across=10 down=4 group=ID  starmin=-2.2800577; 
  footnote 'Ray direction: Northwest=Sodium, East=Fat, Southwest=Sugar';
run;

*use this statement to remove the footnote from future output or plots;
footnote;
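The reshaping done by the star1 DATA step above (one row per cereal turned into one row per cereal and variable) can also be sketched with PROC TRANSPOSE.  This is an alternative illustration, not the approach used in cereal_ch3.sas; the names stand_sorted and star1_alt are hypothetical.

```sas
*Sketch only: reshape STAND to the ID / variable / value layout;
proc sort data=stand out=stand_sorted;
  by ID;
run;

proc transpose data=stand_sorted
               out=star1_alt(rename=(_name_=variable col1=value));
  by ID;
  var sugar fat sodium;
run;

proc print data=star1_alt(obs=6);
run;
```

With this version the values of variable are the data set variable names themselves (sugar, fat, sodium) rather than the capitalized labels typed in the DATA step.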

The PROC GCHART code here is complicated, and I am sort of “tricking” SAS into drawing the plot the way we want it.  Here is a brief discussion of some of the options in the STAR statement.  Some options are also explained in the code.

o  SUMVAR = value: This gives the variable that has the numerical values in it.  In this case, I called the variable “value” in the star1 data set.  Note that this is where the “trick” is.  SAS needs to summarize (like find a sum or a mean) a particular variable for GCHART to work.  In this case, I tell SAS to sum the “value” variable.

o  GROUP = id: This tells SAS to summarize (in this case sum) by this variable in the star1 data set.  Since each “id” is unique for a particular “variable” in the star1 data set, this causes SAS to sum just one value!

o  FILL = SOLID: specifies how to fill in the rays on each star

o  ACROSS = 10 and DOWN = 4: Tells SAS to put the plots in a 10×4 grid.  If there is not enough room on one page, SAS goes to additional pages in the GRAPH window (as done in the first plot).

o  NOHEADING: suppresses a plot title showing what values are being “summed” over

o  STARMIN = -2.28: Tells SAS what the center point of the star represents for all of the variables represented in the star.  Notice how important this is to specify correctly and how this can be a little limiting in how the plot is done.  R’s version of the star plot does not have these limits.

o  VALUE = NONE: This suppresses the values for each variable from being printed (used in the second PROC GCHART); if this is excluded, they are printed (as done in the first PROC GCHART)
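Rather than typing the minimum from the PROC MEANS output into STARMIN by hand, the value can be passed through a macro variable.  The sketch below is one possible way to do this; the macro variable name minval is hypothetical, and the minimum is rounded down to two decimals so that the star center is never larger than the smallest plotted value.

```sas
*Sketch only: compute the star center from the data;
proc sql noprint;
  select floor(min(value)*100)/100 into :minval
  from star1;
quit;
%put NOTE: star center (STARMIN) will be &minval;

proc gchart data=star1;
  title2 'Star (Sun) plots for the cereal data';
  star variable / noheading sumvar=value fill=solid
                  across=5 down=3 group=ID starmin=&minval;
run;
quit;
```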

Below are three plots generated from the first PROC GCHART.

Below is the plot generated from the second PROC GCHART.

Comments:

1)          Cereals 26 and 30 (Post Shredded Wheat Spoon Size and Food Club Frosted Shredded Wheat) appear to be low in sugar, fat, and sodium (relative to the other cereals). 

2)          Cereals              [CB5] appear to be high in fat.

3)          Cereals[CB6]              appear to be high in sugar.

4)          Cereals[CB7]              appear to be high in sodium.

5)          The researcher’s hypothesis before the data collection was that shelf 1 and 2 tend to have the higher sugar content cereals.  From this plot, what do you think? 

 

These types of plots can often be a starting point for answering the question.  Obviously, more sophisticated analyses would be needed before a formal answer could be given.

 See the Chapter 3 R supplement for a much easier way to do these plots in R.

 There are limits to using star plots.  The number of stars or faces can be quite large for a large data set since each star or face represents an experimental unit (=observation here). 

 Remember that the placekicking data set has 1,425 observations!  Maybe a star could be used to represent each placekicker by averaging over variables?  This idea of averaging over experimental units (when there is something logical to average over) is a great way to still use this type of plot for large samples.
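The averaging idea can be sketched with the cereal data, since the placekicking variables are not shown here.  The code below assumes that the stand data set still carries the shelf variable from the imported file; it averages the standardized values within each shelf and then puts the four shelf means into the ID / variable / value layout used for the star plots.  The names shelf_means and star_shelf are hypothetical.

```sas
*Sketch only: one set of mean values per shelf instead of per cereal;
proc means data=stand noprint nway;
  class shelf;
  var sugar fat sodium;
  output out=shelf_means mean=;
run;

data star_shelf;
  set shelf_means;
  length variable $6;
  value = sugar;  variable = 'Sugar';  output;
  value = fat;    variable = 'Fat';    output;
  value = sodium; variable = 'Sodium'; output;
  keep shelf value variable;
run;

proc print data=star_shelf;
run;
```

The star_shelf data set could then be given to PROC GCHART with GROUP=shelf and a STARMIN value no larger than the smallest shelf mean.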

 

Andrews’ Plots

 These plots represent each experimental unit through a combination of sine and cosine curves.  Below is an example.  Please see my Chapter 3 additional.doc file (available on the schedule web page) if you are interested in knowing more about the plots. 
The cereal_ch3.sas program also has the code needed to complete the plots for the cereal data set.  The parallel coordinate plots to be discussed later serve the same purpose as Andrews’ plots, but they are easier to interpret. 
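For reference, the usual definition of an Andrews curve for an observation (x1, x2, x3, ...) is f(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ... for t between -pi and pi.  The following is a minimal sketch of that formula for the three standardized cereal variables.  It is not the code from cereal_ch3.sas; the data set name andrews and the choice of 101 grid points are assumptions, and r=40 assumes the 40 cereals shown in the star plot comments.

```sas
*Sketch only: evaluate an Andrews curve for each cereal on a grid of t;
data andrews;
  set stand;
  pi = constant('pi');
  do i = 0 to 100;
    t = -pi + i*(2*pi/100);
    f = sugar/sqrt(2) + fat*sin(t) + sodium*cos(t);
    output;
  end;
  keep ID t f;
run;

*Draw one joined line per cereal;
symbol1 i=join v=none c=blue r=40;
proc gplot data=andrews;
  plot f*t = ID / nolegend frame;
  title2 'Andrews curves for the cereal data (sketch)';
run;
```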

  

Side-by-Side Scatter Plots

 See Chapter 2.

 Michael Friendly has a SAS program which invokes SAS/INSIGHT to construct side-by-side scatter plots (scatter plot matrix).  The program can be downloaded at

www.math.yorku.ca/SCS/sasmac/scatter.html

 


3.3       Plotting to Check for Multivariate Normality

If x has a multivariate normal distribution, the points on each scatter plot in a scatter plot matrix should form approximately an ellipse.  This is because the contours of a bivariate normal distribution
are ellipses (see Chapter 1). 

An “ad-hoc” way for checking a multivariate normality assumption is to check for ellipses in each scatter plot of a scatter plot matrix.  This does NOT ensure multivariate normality, but it gives at least some credibility to the assumption. 
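In SAS releases that include the ODS statistical graphics procedures (9.2 and later, which is newer than these notes), a scatter plot matrix with reference ellipses can be sketched with PROC SGSCATTER.  This is offered as an assumption-laden alternative to the SAS/INSIGHT macro mentioned earlier, using the cereal variables in set2.

```sas
*Sketch only: scatter plot matrix with 95% prediction ellipses;
proc sgscatter data=set2;
  matrix sugar fat sodium / ellipse=(type=predicted alpha=0.05)
                            diagonal=(histogram normal);
  title2 'Scatter plot matrix for the cereal data';
run;
```

Points that fall roughly inside an ellipse-shaped cloud in every panel are consistent with, but do not prove, multivariate normality.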

Johnson (1998) notes that ways to check for multivariate normality are limited.  He gives his own “ad-hoc” method by using chi-square probability plots.  Read this on your own. 
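A commonly used version of a chi-square probability plot compares the ordered squared Mahalanobis distances d2 = (x - xbar)' S^(-1) (x - xbar) with quantiles of a chi-square distribution with p degrees of freedom.  The PROC IML sketch below follows that general recipe and may differ in detail from Johnson's method.  It assumes a SAS/IML release that has CALL SORT, no missing values in the three variables, and the hypothetical output name chisq_plot.

```sas
proc iml;
  use set2;
  read all var {sugar fat sodium} into x;
  close set2;

  n = nrow(x);
  p = ncol(x);
  xbar = x[:, ];                       /* 1 x p row vector of means      */
  xc = x - repeat(xbar, n, 1);         /* centered data                  */
  s = (xc` * xc) / (n - 1);            /* sample covariance matrix       */
  d2 = vecdiag(xc * inv(s) * xc`);     /* squared Mahalanobis distances  */

  call sort(d2, 1);                    /* order the distances            */
  prob = (t(1:n) - 0.5) / n;           /* plotting positions             */
  q = cinv(prob, p);                   /* chi-square quantiles, p df     */

  result = d2 || q;
  create chisq_plot from result [colname={"d2" "q"}];
  append from result;
quit;

symbol1 v=dot c=blue i=none;
proc gplot data=chisq_plot;
  plot d2*q / frame grid;
  title2 'Chi-square probability plot (sketch)';
run;
```

If the data are approximately multivariate normal, the points should fall near a straight line through the origin with slope 1.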

 

5     Additional Topics Not in Johnson (1998)

Sampling data from a population characterized by a multivariate normal distribution

 

The PROC IML subroutine

CALL VNORMAL(series, mu, sigma, n, seed);

can be used to sample (generate) multivariate normal data.  In the above call,

·      series denotes the name of the matrix containing the output

·      mu denotes the mean vector

·      sigma denotes the covariance matrix

·      n denotes the sample size

·      seed denotes the seed used by SAS to randomly generate the data
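To give a feel for what such a routine does, one standard construction is to generate independent N(0,1) values z and transform them as x = mu + z*U, where U is the Cholesky root of sigma (U`*U = sigma).  The sketch below illustrates that construction with the mean vector and covariance matrix used in the example that follows.  It is not a description of how VNORMAL is implemented internally; it assumes a SAS/IML release with RANDSEED and RANDGEN, and the data set name sample is hypothetical.

```sas
proc iml;
  n = 1000;
  mu = {15 20};                        /* mean vector as a row           */
  sigma = {1 0.5,
           0.5 1.25};                  /* covariance matrix              */

  call randseed(155504946);
  z = j(n, 2, .);
  call randgen(z, "Normal");           /* independent N(0,1) values      */

  u = root(sigma);                     /* upper triangular, u`*u = sigma */
  x = repeat(mu, n, 1) + z * u;        /* rows are N(mu, sigma) draws    */

  mean_est = x[:, ];
  print mean_est;

  create sample from x [colname={"x1" "x2"}];
  append from x;
quit;
```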

Why is it important to know how to sample data from a population characterized by a multivariate normal distribution?[CB8] 

Example: vnorm_ex.sas

Data is sampled (n=1,000) from a population that is characterized by the same bivariate normal distribution examined in Chapter 1 (see graph_mult_normal.sas) where

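 $\mu = \begin{bmatrix} 15 \\ 20 \end{bmatrix}$, $\Sigma = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1.25 \end{bmatrix}$, and $\rho_{12} = 0.5/\sqrt{(1)(1.25)} \approx 0.45$.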

Below are 3D surface and contour plots of the actual bivariate normal distribution.

Below is the SAS code to generate the data and examine some summary measures.

proc iml;
  mu={15,20};
  sigma={1 0.5,
      0.5 1.25};
  *Note that the pop. corr. is 0.5/sqrt(1*1.25) = 0.45;
  *CALL VNORMAL( series, mu, sigma, n, seed);
  call vnormal(save, mu, sigma, 1000, 155504946);
  *print save;
  col={ "x1" "x2"};
  create set1 from save [colname=col];
  append from save;
quit;

title2 'Random sample from bivariate normal distribution';
proc print data=set1;
run;
 
title2 'Examine the estimated correlation';
proc corr data=set1;
  var x1 x2;
run;

proc gplot data=set1;
  plot x1*x2 / grid vaxis=axis1 haxis=axis2 frame;
  title2 'x1 vs. x2';
  symbol1 v=dot cv=blue h=0.1;
  axis1 label = (a=90 'x1')
        length = 10
        order = (10 to 30 by 5);
  axis2 label = ('x2')
        length = 8.71
        order = (10 to 30 by 5);
run;

Below is part of the SAS output:

 

                    Chris Bilder, STAT 5063                     1
        Random sample from bivariate normal distribution

                    Obs       x1         x2
                      1    14.2886    18.5389
                      2    16.9355    20.9535
                      3    16.4319    22.0318
                      4    16.7188    19.9707
                      5    16.1138    18.8472
                    ...
                    997    13.0716    18.3942
                    998    15.5042    18.5346
                    999    13.6406    18.7912
                   1000    15.7136    20.1452

                    Chris Bilder, STAT 5063                    12
                Examine the estimated correlation

                       The CORR Procedure

                2  Variables:    x1       x2

                       Simple Statistics
 Variable           N          Mean       Std Dev           Sum
 x1              1000      15.01474       1.04284         15015
 x2              1000      19.94536       1.08318         19945

                       Simple Statistics
              Variable       Minimum       Maximum
              x1            11.88741      18.64080
              x2            16.83301      23.75814

           Pearson Correlation Coefficients, N = 1000
                   Prob > |r| under H0: Rho=0

                               x1            x2
                 x1       1.00000       0.45296
                                         <.0001

                 x2       0.45296       1.00000
                           <.0001

 

Comments:

1)   Compare the summary statistics from PROC CORR to the population parameter values.

2)   The shape of the scatter plot points is in the form of an ellipse. 
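For the first comment, the population values implied by mu and sigma in the PROC IML code can be placed next to the sample values from the PROC CORR output.  The arithmetic below uses only those numbers (population means 15 and 20, variances 1 and 1.25, covariance 0.5).

```latex
\begin{aligned}
\mu_1 &= 15, & \bar{x}_1 &= 15.01474, \\
\mu_2 &= 20, & \bar{x}_2 &= 19.94536, \\
\sigma_1 &= \sqrt{1} = 1, & s_1 &= 1.04284, \\
\sigma_2 &= \sqrt{1.25} \approx 1.1180, & s_2 &= 1.08318, \\
\rho_{12} &= \frac{0.5}{\sqrt{(1)(1.25)}} \approx 0.4472, & r_{12} &= 0.45296.
\end{aligned}
```

Each sample statistic is close to its population counterpart, as expected with n = 1,000.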

 

Questions: What would a histogram for x1 or x2 individually look like?  How would one approximate x1 or x2’s probability distribution graphically using a histogram?  
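One possible way to explore these questions in SAS is sketched below: PROC UNIVARIATE draws a histogram for each variable and overlays a fitted normal density.  The code assumes the sampled data set set1 created above.

```sas
*Sketch only: histograms of x1 and x2 with fitted normal curves;
proc univariate data=set1 noprint;
  var x1 x2;
  histogram x1 x2 / normal;
  title2 'Histograms of the sampled x1 and x2';
run;
```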

Below is part of the SAS code and output used to do the same things as in the answers to the above questions, but using x1 and x2 at the same time.  First, here is what a 3D histogram looks like for x1 and x2.  The plot was done in S-Plus (you are not responsible for knowing how to create it).

 

What I would like SAS to do is plot a 3D surface over the histogram.  This can be done with the help of PROC KDE.  This allows one to create a “smoothed” estimate of the observed (sampled version) bivariate distribution, and this will be similar to the plots on p. 3.28 of the actual (population) bivariate distribution.

 

Don’t worry about exactly how PROC KDE works.  I only want you to understand that it helps find an estimate of f(x). 
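For readers who do want a rough picture, a generic bivariate kernel density estimate with a Gaussian product kernel has the form below, where $\phi$ is the standard normal density, $(x_{i1}, x_{i2})$ are the sampled points, and $h_1$, $h_2$ are bandwidths that control the smoothness.  This is the textbook form; how PROC KDE chooses the bandwidths is not covered here.

```latex
\hat{f}(x_1, x_2) = \frac{1}{n h_1 h_2} \sum_{i=1}^{n}
  \phi\!\left(\frac{x_1 - x_{i1}}{h_1}\right)
  \phi\!\left(\frac{x_2 - x_{i2}}{h_2}\right)
```

Each observation contributes a small normal-shaped bump, and the estimate is the average of those bumps.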

 

*Finds a bivariate estimate of the density and puts it into the out data 
*  set;
*Note the following from the help for PROC KDE:                        ;   
*  PROC KDE uses a Gaussian density as the kernel, and its assumed; 
*    variance determines the smoothness of the resulting estimate.;
*Since a Gaussian (normal) density is used, the resulting bivariate; 
*  density estimate will probably be closer to the bivariate normal ;
*  density than what it actually is;
proc kde data=set1 out=set2;
  var x1 x2;
run;

*Create 3D plot of the surface;
proc g3d data=set2;
  title2 '3D surface ';
  plot x1*x2=density / grid zmin=0 zmax=0.3 zticknum=5;
run;

*Create contour plot;
proc gcontour data=set2;
  plot x1*x2=density  / grid autolabel=(reveal) 
     haxis=axis1 vaxis=axis2 legend=legend1  
     levels=0.01 0.03 0.05 0.07 0.09 0.11 0.13 0.15;
  title2 'Contour plot';                                

  axis1 label = ('x2')
        length=10
        order = (10 to 30 by 5);
  axis2 label=(a=90 'x1')
        length=8.71
        order = (10 to 30 by 5);
  legend1 label=('f(x)')
          down=2;
run;
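The PROC KDE step above uses the VAR statement from the SAS release these notes were written for.  In SAS 9 and later, bivariate estimates are requested with the BIVAR statement instead.  The sketch below shows that form as I understand it; the output data set name set2_v9 is hypothetical, and the output variable names (value1, value2, density) should be confirmed with the PROC CONTENTS step before plotting.

```sas
*Sketch only: SAS 9 style request for the bivariate density estimate;
proc kde data=set1;
  bivar x1 x2 / out=set2_v9;
run;

*Check the variable names in the output data set;
proc contents data=set2_v9;
run;

proc g3d data=set2_v9;
  title2 '3D surface (SAS 9 style PROC KDE output)';
  plot value1*value2=density / grid zmin=0 zmax=0.3 zticknum=5;
run;
```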

From PROC G3D:

From PROC GCONTOUR:

Generated data:                                     Actual (Chapter 1):

Additional SAS code (excluded here) allows the 3D surface plot to be drawn on the same scale as the Chapter 1 corresponding plot. 

Sampled data:

Actual (Chapter 1):

 

Why should we expect them to be similar?

 See the Chapter 3 R supplement for a further analysis of this generated data. 

 Trellis plots (co-plots)

The plots allow for the viewing of multidimensional relationships between variables through conditioning.  Trellis plots were developed by AT&T Bell Labs.  For a more detailed description of Trellis graphics, see the Trellis graphics website at http://www.research.att.com/~rab/trellis, the S-Plus manuals, or the book Visualizing Data [CRB9] by William S. Cleveland.  Below are the basic concepts of Trellis graphs.

The picture to the right is a trellis [CRB10].  The important part of the picture is the squares that make up the trellis.  Imagine one plot (possibly a scatter plot) within each square and that the plots are from the same data set.  However, each plot represents a different subset (possibly overlapping) of the data set.  The subsets are determined by conditioning variables.

 Below is a trellis plot containing dot plots of barley yields for various barley varieties.  Notice the 6 different squares or panels of the trellis.  They each represent a particular farming location in Minnesota.  Therefore, the barley varieties vs. yield dot plots are represented conditionally on farming location.

 Here is a description from the Trellis graphics website of the story behind this data:

 The barley experiment was run in the 1930s. The data first appeared in a 1934 report published by the experimenters. Since then, the data have been analyzed and re-analyzed. R. A. Fisher presented the data for five of the sites in his classic book, The Design of Experiments. Publication in the book made the data famous, and many others subsequently analyzed them, usually to illustrate a new statistical method.

 Then in the early 1990s, the data were visualized by Trellis Graphics. The result was a big surprise. Through 60 years and many analyses, an important happening in the data had gone undetected. The above figure shows the happening, which occurs at Morris. For all other sites, 1931 produced a significantly higher overall yield than 1932. The reverse is true at Morris. But most importantly, the amount by which 1932 exceeds 1931 at Morris is similar to the amounts by which 1931 exceeds 1932 at the other sites. Either an extraordinary natural event, such as disease or a local weather anomaly, produced a strange coincidence, or the years for Morris were inadvertently reversed. More Trellis displays, a statistical modeling of the data, and some background checks on the experiment led to the conclusion that the data are in error. But it was Trellis displays such as the above figure that provided the ``Aha!'' which led to the conclusion.

 Example: cereal data (cereal_ch3.sas)

 I recommend using R (or S-Plus) to create trellis plots!  SAS can only do trellis plots for histograms or bar charts.  Below is the SAS code and output for an example that uses PROC UNIVARIATE for histograms.  Pay special attention to the form of the data set used for PROC UNIVARIATE.

 Maybe I should have standardized by shelf? 
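One way to act on that thought is sketched below: sort by shelf and let PROC STANDARD standardize within each shelf through a BY statement.  This is only an illustration of the alternative, not part of cereal_ch3.sas; the names set2_sorted and stand_shelf are hypothetical.

```sas
*Sketch only: standardize sugar, fat, and sodium within each shelf;
proc sort data=set2 out=set2_sorted;
  by shelf;
run;

proc standard data=set2_sorted out=stand_shelf mean=0 std=1;
  by shelf;
  var sugar fat sodium;
run;
```

Note that standardizing within shelf removes differences in shelf means, so the resulting panels could no longer show one shelf shifted relative to another.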

 

*************************************************;
* Trellis plot histograms conditioning on shelf *;
*   and variable (sugar, fat, sodium).          *;
* The data set needs to be in the same form as  *;
*   for the star plots, but also the shelf needs*;
*   to be included.                             *;
* Remember that star1 looks like the following: *;
*       ID  variable  value                     *;
*        1   Sugar    0.4528440736              *;
*        1   Fat      -1.162340844              *;
*        1   Sodium   0.1854699394              *;
*        2   Sugar    -1.457523293              *;
*      ...                                      *;
*       40   Sodium   0.4271883325              *;
*************************************************;

data trellis1;
  set star1;
  if ID >=1 and ID <=10 then shelf = 1;
  if ID >=11 and ID <=20 then shelf = 2;
  if ID >=21 and ID <=30 then shelf = 3;
  if ID >=31 and ID <=40 then shelf = 4;
run;

title2 "Trellis plot for the cereal data";
proc univariate data=trellis1;
  class shelf variable;
  var value;
  histogram value / normal(fill) cfill=yellow NROWS=4 
                    ncols=3;
run;
 

Comments:

1)   Standardized data is plotted (standardized earlier in program).

2)   Since only 10 observations are used for each panel, a histogram and normal approximation may yield poor results. 

3)   What does the plot suggest about the data set?[CB11] 

4)   PROC CAPABILITY in SAS/QC can also create a plot similar to the one above.  See my program for the code.

Again, R (and S-Plus) are much more versatile for Trellis plots.  See the Chapter 3 R supplement for further examples.
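SAS releases newer than these notes (9.2 and later) include PROC SGPANEL, which can condition other plot types on a classification variable.  The following is a minimal sketch, assuming the cereal data set set2 with its shelf variable, of a trellis-style display with one sugar vs. fat scatter plot per shelf.

```sas
*Sketch only: one scatter plot panel per shelf;
proc sgpanel data=set2;
  panelby shelf / columns=4;
  scatter x=fat y=sugar;
  title2 'Sugar vs. fat conditioned on shelf';
run;
```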

Final notes:

·      Michael Friendly’s website contains many nice SAS programs for graphing at www.math.yorku.ca/SCS/friendly.html.  See the section entitled “SAS System for Statistical Graphics”.  Some of these programs are from his SAS graphics book.

·      Java graphs from PROC GCHART, GCONTOUR, GMAP, GPLOT, and G3D – see www.sas.com/service/techtips/ts_qa/procgchart.html.

 

Example code using part of cereal_ch3.sas

ods html body='c:/chris/sas_info/myGraph.html';
goptions device=java;

proc gplot data=set2;
  bubble sugar*fat = sodium / vaxis=axis1 haxis=axis2 frame grid 
                              bcolor=blue bsize=2;
  title2 'Sugar vs. fat with bubble proportional to sodium';
  axis1 label = (a=90 'Sugar')
        length = 12;
  axis2 label = ('Fat')
        length = 12;
run;
ods html close;

·      When doing multiple plots in SAS, unexpected colors, symbols, or labels may sometimes appear.  This is because statements like SYMBOL and TITLE are global in SAS, which means they stay in effect until they are reused or “turned off”.  To turn off these statements, use the following code before a graphics PROC:

    goptions reset=all;
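When only some of the global statements should be cleared, RESET= also accepts individual statement names, and a global statement given with no arguments cancels itself.  The lines below are a sketch of those more selective forms.

```sas
*Sketch only: selective alternatives to reset=all;
goptions reset=symbol;             *cancel all SYMBOL statements;
goptions reset=(title footnote);   *cancel all titles and footnotes;
title2;                            *cancel TITLE2 and higher-numbered titles;
footnote;                          *cancel all footnotes;
```

Unlike reset=all, these forms leave other graphics options such as FTEXT= and HTEXT= in place.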

·      The Journal of Computational and Graphical Statistics (www.amstat.org/publications/jcgs/) often has articles on advances in statistical graphics. 

·      GGobi is a software package for multivariate data visualization.  It is available for free at http://www.ggobi.org/.  It can be used with R.


 [t1]Shredded wheat

 [CB2]What would be done if the third variable naturally has negative values? 

 [CRB3]If we have enough time – otherwise, take STAT 5073 (categorical data analysis) where this will be covered

 [t4]Can not find new link to list of colors

 [CB5]8=Capn Crunch’s Peanut Butter Crunch, 14=Capn Crunch’s Peanut Butter Crunch (same – just on different shelf!), 20=Oreo’s O’s, 25=Post morning traditions – Raison, Date, and Pecan

 [CB6]12=Kellogg’s Smacks, 16=Marshmallow Blasted Froot Loops

 [CB7]2=Post Toasties Corn Flakes, 3=Kellogg’s Corn Flakes, 10=Food Club Crispy Rice

 [CB8]Simulations – 1) Generate data from this distribution to make sure a particular statistical analysis method is working correctly.  If it is not, there may be a problem with the statistical
method or a programming error.  2) What happens in small sample sizes for a particular statistical procedure?  3) Power

 [CRB9]Ordered for library in Dec. 2000

 [CRB10]From Trellis graphics homepage

 [CB11]1) The shelf 1 sodium is shifted to the right of the other shelves 2) The shelf 2 sugar is shifted to the right of the other shelves
