二项分布比例的置信区间计算

现在的位置: 首页 > 综合 > 正文

二项分布比例的置信区间计算

2018年10月22日 ⁄ 综合 ⁄ 共 7197字 ⁄ 字号小中大 ⁄ 评论关闭

之前一网友遇到类似问题，特查了相关文献，归结一下方法有二。

1.根据各方法的计算公式进行编辑公式，data 步就可以搞定。相关文献有：《A Comparison of Binomial Proportion Interval Estimation Methods 》、《Confidence Interval Calculation for Binomial Proportions》，这两篇文章都曾讨论比较了Wald、Wilson Score、Exact Clopper Pearson几种方法，另外《A SAS Program to Calculate and
Plot Widths of (1- )100% Confidence Intervals for Binomial Proportions》一文将其中的Clopper Pearson法写成了宏。

2.PROC FREQ语句也提供了计算基于二项分布置信区间的方法。《Are you computing Confidence Interval for binomial proportion using PROC FREQ? Be Careful!!》和《Confidence Intervals for the Binomial Proportion with Zero Frequency》作了相关讨论，proc freq语句官方文档也作了相关说明（http://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_freq_a0000000660.htm）

相关代码摘录如下。

/***********************************************************************
** Program: Propci.sas
** Author: Keith Dunnigan
**
** Purpose: Calculate the confidence interval for one proportion.
**
** Description: Calculate Wald, Wilson Score, and exact Clopper-Pearson
** confidence intervals for a variety of n, p, and alpha
** combinations for either a one or two-sided interval.
**
***********************************************************************/
%macro runprog;
data propci;
set parms;
if sided = 1 then do;
zalpha=probit(1-(alpha));
end;
else if sided = 2 then do;
zalpha=probit(1-(alpha/2));
end;
** Wald;
q = 1 - p;
stder = sqrt(p*q/n);
Wald_lcl = p - zalpha * stder;
Wald_ucl = p + zalpha * stder;
** Wilson Score;
part1 = p + ((zalpha**2)/(2*n));
part2 = sqrt( (p*q/n) + ((zalpha**2)/(4*n**2)) );
part3 = 1 + (zalpha**2)/n;
Wilson_lcl = (part1 - (zalpha * part2))/ part3;
Wilson_ucl = (part1 + (zalpha * part2))/ part3;
** Exact Clopper Pearson;
x = round (n*p,0.1);
* Calculate the lower limit.;
v1 = 2*(n-x+1);
v2 = 2*x;
if sided = 1 then do;
a = 1-(alpha);
end;
else if sided = 2 then do;
a = 1-(alpha/2);
end;
coef = (n-x+1)/x;
fscore = finv(a,v1,v2);
exact_lcl = 1/(1+coef*fscore);
* Calculate the upper limit.;
v11 = 2*(x+1);
v22 = 2*(n-x);
fscore2 = finv(a,v11,v22);
coef2 = (x+1)/(n-x);
numer = coef2*fscore2;
exact_ucl = numer/(1+numer);
run;
options nodate;
title 'Confidence Intervals for a Single Proportion';
proc print data = propci split = '_' noobs;
var n p alpha sided Wald_lcl Wald_ucl Wilson_lcl Wilson_ucl exact_lcl exact_ucl;
label wald_lcl = 'LCL_(Wald)'
wald_ucl = 'UCL_(Wald)'
wilson_lcl = 'LCL_(Wilson_Score)'
wilson_ucl = 'UCL_(Wilson_Score)'
sided = 'Sides_on_CI'
exact_lcl = 'LCL_(Exact)'
exact_ucl = 'UCL_(Exact)';
run;
%mend runprog;
data parms;
infile cards;
input n p alpha sided;
cards;
24 0.9 0.05 1
25 0.9 0.05 1
26 0.9 0.05 1
29 0.9 0.05 1
32 0.9 0.05 1
35 0.9 0.05 1
38 0.9 0.05 1
43 0.9 0.05 1
44 0.9 0.05 1
45 0.9 0.05 1
45 0.9 0.05 1
48 0.9 0.05 1
49 0.9 0.05 1
50 0.9 0.05 1
run;
%runprog;

**Some programs for calculating the Exact Binomial P value and Confidence Bound;

**Using SAS Proc FREQ;

data temp1;
  input batch bleeding $ count;
  datalines;
  1 have 2
  1 no 3
  2 have 3
  2 no 7
  3 have 4
  3 no 11
run;

proc freq;
  weight count;
  tables bleeding /  binomial (p=0.1) alpha=0.2 cl;  **p=1 option indicates the standard rate to compare with, 
                                                       here we assume the standard bleeding rate is 10%;
                                                     **Alpha=0.2 to obtain one side 90% lower limit;
  exact binomial;  *Obtain the exact p-value;
  by batch;
run;


* This Program Calculates EXACT Confidence Bounds
* on proportion of "successes" in a Binomial Distribution
* See Sachs, Applied Statistics, p. 333
*
* Author: Mark DeHaan   msd@inel.gov
* Date: 4/21/92
*
* Macro Variable Names & Definitions
*  nsuccess= number of Successes found in sample
*  sampsize= sample size
*  twosided= two sided confidence bounds (0=no, 1=yes)
*  conflev=  confidence level for estimates (e.g. .95)
******************************************************;
%macro p_conf(nsuccess,sampsize,twosided,conflev);
   Data null;
     phat=&nsuccess/&sampsize;
     if &twosided then do;  *compute a two-sided conf. bound;
        clevel=&conflev+((1-&conflev)/2);
        put;put;
        put "***Note: Two sided &conflev confidence bounds***";
        end;
     else do;
        clevel=&conflev;
        put;put;
        put "***Note: One sided &conflev confidence bounds***";
        end;
     f_low=finv(clevel,2*(&sampsize-&nsuccess+1),2*&nsuccess);
     f_up=finv(clevel,2*(&nsuccess+1),2*(&sampsize-&nsuccess));
     low_phat=&nsuccess/(&nsuccess+(&sampsize-&nsuccess+1)*f_low);
     up_phat=(&nsuccess+1)*f_up/(&sampsize-&nsuccess+(&nsuccess+1)*f_up);
     put "Number of successes= &nsuccess    Sample size= &sampsize";
     put phat= low_phat= up_phat=;
     put;put;
     run;
%mend;
 
%p_conf(7,20,0,.975)
%p_conf(7,20,1,.95)
%p_conf(2,5,0,.90)
%p_conf(2,5,1,.80) 
 
 
---------------------------------------------------------------
Date:         Tue, 5 Jun 90 09:14:57 GMT
Reply-To:     LESLIE DALY PhD <LDALY@IRLEARN>
Sender:       "SAS(r) Discussion" <SAS-L@OHSTVMA>
From:         LESLIE DALY PhD <LDALY@IRLEARN.BITNET>
Subject:      POISSON (AND bINOMIAL) CONFIDENCE INTERVALS
To:           Barry W Grau <U42054@UICVM.BITNET>
 
Stephen Hull asked for SAS code for a Poisson confidence interval.  I
attach below two short MACROS to get confidence intervals for both the
binomial and Poisson cases.  I have found these to be most useful.  The
easiest way to use them is to copy this mail into a file and delete
all except the MACROS.  %INCLUDE this file in the SAS program and
run the macro using %CIPOISS etc.  The MACROS are commented in full.
 
%macro cibinom(CI,n,k,P,PL,PR);
*****************************************************************
*                                                               *
*   MACRO FOR BINOMIAL CONFIDENCE INTERVALS                     *
*   (Leslie Daly   LDALY@IRLEARN.BITNET     VERSION   1.1)      *
*                                                               *
*   This SAS Macro calculates the confidence interval for a     *
*   binomial distribution.  The method is that given in the     *
*   Geigy Scientific Tables Vol 2 (Ed C Lentner) (Ciba-Geigy    *
*   Ltd, Basle, Switzerland) p221.  It is one of the most       *
*   accurate approximations available in a closed form.         *
*                                                               *
*   This reference should be consulted for an explanation of    *
*   the code below.                                             *
*                                                               *
*   USAGE:  %CIBINOM(CI,N,K,P,PL,PU);                           *
*                                                               *
*   INPUT PARAMETERS:  CI - Confidence level as a proportion    *
*                           (ie for 95% confidence interval     *
*                            CI should be 0.95)                 *
*                       N - Total number of trials              *
*                       K - Number of successes                 *
*   OUTPUT PARAMETERS;  P - Observed prob. of success (K/N)     *
*                      PL - Lower confidence limit              *
*                      PU - Upper confidence limit              *
*                                                               *
*                                                               *
*****************************************************************;
ALPHA = (1 - &CI)/2;
&P = &K/&N;
IF &K EQ 0 THEN DO;
   &PL = 0;
   &PR = 1 - 10**(LOG10(ALPHA)/&N);
END;
IF &K EQ &N THEN DO;
   &PL = 10**(LOG10(ALPHA)/&N);
   &PR = 1;
END;
IF &K GT 0 AND &K LT &N THEN DO;
calpha =  probit(1 - ALPHA);
a = (calpha**4)/18  +  (calpha**2)*(2*&N + 1)/6 + (&N + 1/3)**2  ;
AL = &K;
AR = &K + 1;
BL =  (CALPHA**4)/18  + (CALPHA**2)*(4*(&N - AL) + 3)/6
                      + 2*(AL*(3*&N + 1) - &N)/3  - 2/9;
CL =  (CALPHA**4)/18  + (AL - 1/3)**2  -  (CALPHA**2)*(2*AL - 1)/6;
&PL =  BL/(2*A) - SQRT( (BL/(2*A))**2 - CL/A );
BR =  (CALPHA**4)/18  + (CALPHA**2)*(4*(&N - AR) + 3)/6
                      + 2*(AR*(3*&N + 1) - &N)/3  - 2/9;
CR =  (CALPHA**4)/18  + (AR - 1/3)**2  -  (CALPHA**2)*(2*AR - 1)/6;
&PR =  BR/(2*A) + SQRT( (BR/(2*A))**2 - CR/A );
END;
%MEND CIBINOM;
 
 
%MACRO cipoiss(CI,K,KL,KR);
*****************************************************************
*                                                               *
*   MACRO FOR POISSON CONFIDENCE INTERVALS                      *
*   (Leslie Daly   LDALY@IRLEARN.BITNET     VERSION   1.1)      *
*                                                               *
*   This SAS Macro calculates the confidence interval for a     *
*   Poisson distribution.  The confidence lilmits are exact     *
*   and are based on the relationship between the chisquare     *
*   distribution and the Poisson.                               *
*   If CHI(df,P) represents the chisquuare value which on       *
*   df degrees of freedom cuts off an area P in the upper       *
*   tail of the distribution (ie CHI(1,0.05) = 3.84) then       *
*   for a count of x (observed) the 95% Confidence limits are:  *
*      Lower: 0.5*( CHI(2*X,0.975)                              *
*      Upper: 0.5*( CHI(2*X +2,0.025)                           *
*   with obvious adjustments for 99% limits etc.                *
*                                                               *
*   In SAS the chisquare value can be obtained using the        *
*   function GAMINV. In the notation above (watch the           *
*   probability levels)                                         *
*                                                               *
*        CHI(DF,P) = 2*GAMINV(1-P,0.5*DF)                       *
*                                                               *
*   hence the code below.                                       *
*                                                               *
*                                                               *
*   USAGE:             %CIPOISS(CI,K,KL,KR)                     *
*                                                               *
*                                                               *
*   INPUT PARAMETERS:  CI - Confidence level as a proportion    *
*                           (ie for 95% confidence interval     *
*                            CI should be 0.95)                 *
*                       K - Observed number of events           *
*   OUTPUT PARAMETERS; KL - Lower confidence limit              *
*                      KU - Upper confidence limit              *
*                                                               *
*                                                               *
*****************************************************************;
ALPHA = (1 - &CI)/2;
if &k ne 0 then do;
   &KR = gaminv (1 - ALPHA,&K+1);
   &KL = gaminv (ALPHA,&K);
END;
IF &K EQ 0 THEN DO;
   &KR = -LOG(ALPHA);
   &KL = 0;
END;
%MEND CIPOISS;

【上篇】SAS中的HASH语句及其常见应用
【下篇】Multivariate Data Plots

作者: forage

该日志由 forage 于6年前发表在综合分类下，最后更新于 2018年10月22日.
转载请注明: 二项分布比例的置信区间计算 | 学步园 +复制链接

抱歉!评论已关闭.

学步园

二项分布比例的置信区间计算

作者: forage

书签

最新文章New

本站推荐

返回首页