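Want to know how many users are online in BO each day, or how many are online in each hour? BO's auditing feature can provide this, but auditing degrades performance. If Apache sits in front as the load balancer and access logging is enabled, we can easily analyze the logs with awk and get the data we want. The code below covers my three requirements:
1. Count how many users come online each day
2. Count how many users are online in each hour
3. Measure the average cost of the report save action
gawk handles all three easily.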
1. Apache log format
The logs use the common format, which looks like this:
10.1.1.1 - - [09/Dec/2011:07:11:26 -1200] "GET /OpenDocument/opendoc/openDocument.jsp?iDocID=144758&boRefresh=Y HTTP/1.1" 200 3382
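To see how such a line maps onto awk fields, here is a minimal sketch, assuming awk's default whitespace field splitting. The function name show_fields is made up for illustration. It uses only POSIX awk features, so it is invoked as awk here; gawk works equally.

```shell
#!/bin/sh
# show_fields is a hypothetical helper name, used only for this illustration.
# It reads common-format log lines on stdin and prints the client IP,
# the hour of the request, and the request path.
show_fields() {
    awk '{
        ip   = $1                 # client address
        hour = substr($4, 14, 2)  # $4 looks like "[09/Dec/2011:07:11:26"
        url  = $7                 # request path, including the query string
        print ip, hour, url
    }'
}

line='10.1.1.1 - - [09/Dec/2011:07:11:26 -1200] "GET /OpenDocument/opendoc/openDocument.jsp?iDocID=144758&boRefresh=Y HTTP/1.1" 200 3382'
printf '%s\n' "$line" | show_fields
# prints: 10.1.1.1 07 /OpenDocument/opendoc/openDocument.jsp?iDocID=144758&boRefresh=Y
```

Because the timestamp contains a space before the UTC offset, the offset ends up in $5 and the HTTP method in $6, which is why the request path is $7.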
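2. The gawk program
The IP address and the timestamp are enough for the first two requirements, with each distinct client IP counted as one user. The requested JSP page tells us which operation the user performed, so all three requirements can be met. The code is as follows: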
#! /usr/bin/gawk -f
# $1 is the client IP
# $4 is the date, e.g. [09/Dec/2011:07:11:26
#   year:  substr($4,9,4)
#   month: Mons[substr($4,5,3)]
#   day:   substr($4,2,2)
#   time:  substr($4,14,8), with ":" replaced by " "
# $7 is the request URL

# Convert the $4 timestamp to epoch seconds. mktime interprets the time in
# the local time zone; the log's UTC offset is not used.
# The names after the extra spaces in the parameter list are local variables.
function getTime(date,    year, month, day, ltime) {
    year = substr(date,9,4)
    month = Mons[substr(date,5,3)]
    day = substr(date,2,2)
    ltime = substr(date,14,8)
    gsub(/:/," ",ltime)
    return mktime(year " " month " " day " " ltime)
}

BEGIN {
    Mons["Jan"] = 1; Mons["Feb"] = 2; Mons["Mar"] = 3; Mons["Apr"] = 4
    Mons["May"] = 5; Mons["Jun"] = 6; Mons["Jul"] = 7; Mons["Aug"] = 8
    Mons["Sep"] = 9; Mons["Oct"] = 10; Mons["Nov"] = 11; Mons["Dec"] = 12
}

{
    currIp = $1
    currDate = getTime($4)
    hour = substr($4,14,2)

    # distinct (hour, IP) pairs: users online per hour
    user[hour,currIp]
    # distinct IPs: users online today
    ip[currIp]

    # average report save time: a save starts at checkProcessSave.jsp and
    # ends at reportSaveAlert.html; start times are kept per IP so that
    # concurrent users do not overwrite each other
    if ($7 ~ /cdz_adv\/checkProcessSave\.jsp/) {
        startTime[currIp] = currDate
    }
    if ($7 ~ /reportSaveAlert\.html\?/ && (currIp in startTime)) {
        totalTime += currDate - startTime[currIp]
        times += 1
        delete startTime[currIp]
    }
}

END {
    print length(ip) " users on line today."

    # print users per hour
    for (i in user) {
        split(i,lists,SUBSEP)
        count[lists[1]] += 1
    }
    for (i in count)
        print i,":",count[i]," users"

    # print average report save time
    if (times > 0)
        print totalTime / times "s avg report saved time."
    else
        print "0 report is saved."
}
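The per-hour count can also be checked in isolation. The following is a minimal sketch rather than the script above: it counts each IP the first time it is seen within an hour, instead of splitting the combined keys in END, and pipes the result through sort because awk's for-in order is unspecified. The function name users_per_hour and the sample log lines are made up for illustration.

```shell
#!/bin/sh
# users_per_hour is a hypothetical name. It reads common-format log lines on
# stdin and prints "HH count" pairs, one per hour, sorted by hour.
users_per_hour() {
    awk '{
        hour = substr($4, 14, 2)
        # Count an IP only the first time it appears within a given hour.
        if (!((hour, $1) in seen)) {
            seen[hour, $1] = 1
            count[hour]++
        }
    }
    END {
        for (h in count) print h, count[h]
    }' | sort
}

# Made-up sample: two distinct clients at 07h, one client at 08h.
users_per_hour <<'EOF'
10.1.1.1 - - [09/Dec/2011:07:11:26 -1200] "GET /a.jsp HTTP/1.1" 200 100
10.1.1.1 - - [09/Dec/2011:07:15:02 -1200] "GET /b.jsp HTTP/1.1" 200 100
10.1.1.2 - - [09/Dec/2011:07:40:10 -1200] "GET /a.jsp HTTP/1.1" 200 100
10.1.1.1 - - [09/Dec/2011:08:01:00 -1200] "GET /a.jsp HTTP/1.1" 200 100
EOF
# prints:
# 07 2
# 08 1
```

Counting on first sight keeps only one pass over the data and avoids the split on SUBSEP; the result should match the hourly figures of the full script for the same input.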
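Once you have the raw data, you can build your own dashboard for analysis. Being a consultant is not easy: you have to know a bit of everything.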