
Generating a dummy variable (design) matrix with SAS

October 20, 2018

Source: http://support.sas.com/kb/23/217.html

For a specified model, there are several procedures that allow you to save the design matrix to a data set:

  • For models using GLM parameterization (also called indicator or dummy coding) of CLASS variables, you can use an ODS OUTPUT statement with PROC GLMMOD to save the design matrix to a data set. Note that modeling procedures such as GLM, MIXED, GLIMMIX and others use GLM parameterization. Specify the model in the MODEL statement and identify any categorical predictors in the CLASS statement. Note that PROC GLMMOD only offers GLM parameterization of CLASS variables. For example, the GLM statements below fit the indicated model and the GLMMOD statements that follow create a data set from the same design matrix that was used in PROC GLM. Use the ODS LISTING statements if you want to suppress display of the GLMMOD output in the Output window.

       proc glm data=a;
          class a b c;
          model y=a b c a*b;
          run;
    
       ods output designpoints=xmatrix;
       ods listing close;
       proc glmmod data=a;
          class a b c;
          model y=a b c a*b;
          run;
       ods listing;
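
    As a variation on the ODS OUTPUT approach, PROC GLMMOD can also write the design matrix directly with its OUTDESIGN= option, and the OUTPARM= option saves a companion data set that describes what each design column represents. The following is a minimal sketch that assumes the same data set a and variables y, a, b, and c used above; the data set names xmatrix and xparms are arbitrary choices for illustration.

       /* Sketch: write the design matrix and a column description */
       /* without printing the GLMMOD output                       */
       proc glmmod data=a outdesign=xmatrix outparm=xparms noprint;
          class a b c;
          model y=a b c a*b;
          run;

       /* Show which effect and CLASS levels each column stands for */
       proc print data=xparms;
          run;

    By default the design columns in xmatrix are named Col1, Col2, and so on, so printing xparms is a convenient way to identify them.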
    
  • For models that use GLM or other parameterizations, you can use the OUTDESIGN= option in PROC GLMSELECT or PROC LOGISTIC. These procedures can create design variables using any of several different parameterizations including GLM, reference, effects, polynomial, and others. Specify the model in the MODEL statement and identify any categorical predictors in the CLASS statement. Use the PARAM= option in the CLASS statement to select the parameterization. Each of the GLMSELECT and LOGISTIC steps below creates a data set containing the same design matrix as produced above by PROC GLMMOD (and as used internally by PROC GLM). Note that the OUTDESIGNONLY option in PROC LOGISTIC suppresses model fitting, so the procedure only creates the OUTDESIGN= data set. The SELECTION=NONE option in PROC GLMSELECT requests that the procedure fit the specified model rather than use a model selection method.

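       /* PROC LOGISTIC: write the GLM-coded design matrix to XMATRIX */
       proc logistic data=a outdesign=xmatrix outdesignonly;
          class a b c / param=glm;
          model y=a b c a*b;
          run;

       /* PROC GLMSELECT: same design matrix; this step also writes */
       /* XMATRIX, replacing the data set created above             */
       proc glmselect data=a outdesign=xmatrix;
          class a b c / param=glm;
          model y=a b c a*b / selection=none;
          run;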
    

    You can also use other coding methods. For example, these statements use effects coding for the categorical (CLASS) variables:

       proc logistic data=a outdesign=xmatrix outdesignonly;
          class a b c / param=effect;
          model y=a b c a*b;
          run;
    

    See the LOGISTIC and GLMSELECT documentation for information about the various coding methods that are available.
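
    As a further illustration, reference coding can be requested in the same way. The following is a minimal sketch that assumes the same data set a and variables used above; REF=FIRST makes the first ordered level of each CLASS variable the reference level, and the data set name xref is an arbitrary choice for illustration.

       /* Sketch: reference coding, first level of each CLASS */
       /* variable used as the reference level                */
       proc logistic data=a outdesign=xref outdesignonly;
          class a b c / param=ref ref=first;
          model y=a b c a*b;
          run;

    With reference coding, the main effect of a CLASS variable with k levels contributes k-1 design columns rather than the k columns produced by GLM parameterization.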

    
    
  • For models using GLM, reference, or effects parameterization, you can use PROC TRANSREG. Specify any categorical variables in the CLASS expansion. Use the ZERO= option to select a reference category or, as below, ZERO=NONE to specify GLM parameterization.
    Specify any continuous predictors, the response, and any other variables that you want transferred to the output data set in the ID statement. For example, the following statements create a data set containing the same design matrix as produced above by PROC
    GLMMOD (and as used internally by PROC GLM):

       proc transreg data=a design;
          model class(a b c a*b / zero=none);
          id y;
          output out=xmatrix;
          run;
    

    Effects coding can be done as follows:

       proc transreg data=a design;
          model class(a b c a*b / effects);
          id y;
          output out=xmatrix;
          run;
    

    Note that PROC TRANSREG automatically creates a macro variable, &_trgind, which contains a list of the names of the variables that it creates. You can use this macro variable in subsequent procedures to refer to the full model.
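
    As a sketch of how the macro variable might be used, the steps below build a reference-coded design matrix and then fit the full model with PROC REG. This assumes the same data set a with response y used above; ZERO=FIRST is chosen here so that the first level of each CLASS variable serves as the reference category, and the data set name xref is an arbitrary choice for illustration.

       /* Sketch: reference-coded design matrix written to XREF */
       proc transreg data=a design;
          model class(a b c a*b / zero=first);
          id y;
          output out=xref;
          run;

       /* &_trgind expands to the names of the coded variables */
       proc reg data=xref;
          model y = &_trgind;
          run;
          quit;

    Because each PROC TRANSREG step sets &_trgind again, use it before running another TRANSREG step, or copy its value to a macro variable of your own with %LET.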
