
Reading very large XML files with SAXParser

March 12, 2014 ⁄ General

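Parsing XML with DOM requires loading the entire document into memory at once. If the document is large, you have to increase the JVM heap size, and if it is so large that it cannot fit in memory at all, the only option is to process it as a stream.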
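
As a minimal sketch of what stream-based processing looks like with the JDK's built-in SAX parser: the handler below only counts start tags, so memory use does not grow with the size of the document. The class and method names (`StreamingCountSketch`, `countElements`) are made up for illustration, and the in-memory string stands in for what would be a file stream in practice.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original article.
class StreamingCountSketch {

    /** Counts start tags while the parser streams through the input. */
    static long countElements(InputStream in) {
        final long[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++; // nothing is retained, so memory stays flat
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // For a real file, pass a (buffered) FileInputStream instead.
        String xml = "<NewDataSet><INTENT><INTENTID>8304885</INTENTID></INTENT></NewDataSet>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(countElements(in)); // prints 3
    }
}
```

The same call works unchanged on a multi-gigabyte file, because the parser hands each event to the handler and then discards it.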

Sample XML content:

  <NewDataSet xmlns="">
  <ExecResult diffgr:id="ExecResult1" msdata:rowOrder="0" diffgr:hasChanges="inserted" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <ResultValue>True</ResultValue>
  <ResultInfo>登录成功!</ResultInfo>
  <StatDateTime>2011-4-21</StatDateTime>
  </ExecResult>
  <INTENT diffgr:id="INTENT2" msdata:rowOrder="1" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <INTENTID>8304885</INTENTID>
  <ADDCODE>4407</ADDCODE>
  <SENDDATE>2011-04-15 21:47:52</SENDDATE>
  <NMEMBERID>300479</NMEMBERID>
  <SMEMBERID>167131</SMEMBERID>
  <THEITEMID>6642238</THEITEMID>
  <COMMODITYID>88491</COMMODITYID>
  <BATCH>400</BATCH>
  <PRICE>29.2200</PRICE>
  <INNERID>20209</INNERID>
  <PROJECTID>260</PROJECTID>
  </INTENT>
  <SENDPRODUCT diffgr:id="SENDPRODUCT4602" msdata:rowOrder="4601" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <THEITEMID>6360192</THEITEMID>
  <ADDCODE>4418</ADDCODE>
  <SENDQUANTITY>48</SENDQUANTITY>
  <BATCHTXT>10071002</BATCHTXT>
  <BILLNUMBER />
  <VALIDITYDATE>2012-1-9 0:00:00</VALIDITYDATE>
  <INVOICENUMBER />
  <SENDDATE>2011-04-15 17:40:10</SENDDATE>
  <PROJECTID>260</PROJECTID>
  <INNERID>3066</INNERID>
  <NMEMBERID>300370</NMEMBERID>
  <SMEMBERID>166980</SMEMBERID>
  <COMMONDITYID>49511</COMMONDITYID>
  <GUIDCODE />
  </SENDPRODUCT>
  <PUBLIC_MEMBER diffgr:id="PUBLIC_MEMBER1" msdata:rowOrder="0" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <MEMBERID>305659</MEMBERID>
  <MEMBERCLASSID>1</MEMBERCLASSID>
  <MEMBERCLASSNAME>医疗机构</MEMBERCLASSNAME>
  <MEMBERNAME>惠州市惠城区横沥医院</MEMBERNAME>
  <ADDCODE>441302</ADDCODE>
  <TELEPHONE>13652772813</TELEPHONE>
  <FAX>0752-3185010</FAX>
  <ZIP>516253</ZIP>
  <THEADDRESS>惠州市惠城区横沥镇解放路184号</THEADDRESS>
  <EMAIL>heng1958@163.com</EMAIL>
  <CONTACT>邹宣平</CONTACT>
  </PUBLIC_MEMBER>
  <PUBLIC_MEMBER_INFO diffgr:id="PUBLIC_MEMBER_INFO1" msdata:rowOrder="0" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <MEMBERID>26508</MEMBERID>
  <LICENSEID>粤HB20060033</LICENSEID>
  <REG_ADDRESS>广州增城市中新镇风光路178号</REG_ADDRESS>
  <DEPUTY>陈桂恩</DEPUTY>
  <PRESIDENT>陈桂恩</PRESIDENT>
  <QUANLITYPRINCIPAL />
  <SALESMODE />
  <PUBLICDEPARTMENT />
  <PROCESSSPAN>2010-12-31 00:00:00</PROCESSSPAN>
  </PUBLIC_MEMBER_INFO>
  <Product diffgr:id="Product478" msdata:rowOrder="477" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <THEITEMID>412</THEITEMID>
  <ADDCODE>4401</ADDCODE>
  <SENDQUANTITY>1</SENDQUANTITY>
  <BATCHTXT>11</BATCHTXT>
  <BILLNUMBER />
  <VALIDITYDATE>2</VALIDITYDATE>
  <INVOICENUMBER>FP201103</INVOICENUMBER>
  <SENDDATE>2011-03-28 17:45:26</SENDDATE>
  <PROJECTID>142</PROJECTID>
  <INNERID>1</INNERID>
  <NMEMBERID>166764</NMEMBERID>
  <SMEMBERID>156400</SMEMBERID>
  <COMMONDITYID>328000</COMMONDITYID>
  </Product>
  <COMMODITY diffgr:id="COMMODITY1" msdata:rowOrder="0" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <COMMODITYID>53829</COMMODITYID>
  <PROJECTID>260</PROJECTID>
  <TRADENAME>注射用亚叶酸钙</TRADENAME>
  <DOSE>注射用无菌粉末(冻干粉针)</DOSE>
  <SPEC>0.1g</SPEC>
  <PACKSPEC>1支/盒</PACKSPEC>
  <UNIT></UNIT>
  <PRODUCER_MEMBER_ID>155316</PRODUCER_MEMBER_ID>
  <PNAME>山西普德药业股份有限公司</PNAME>
  <APPROVE_NO>国药准字H14022464</APPROVE_NO>
  <DRUG_NAME>亚叶酸钙</DRUG_NAME>
  <DRUG_ID>5967</DRUG_ID>
  <DRUG_TYPE>1</DRUG_TYPE>
  <GMP_TYPE>1</GMP_TYPE>
  <GMP_NUMBER>G3662</GMP_NUMBER>
  <GMP_ENDDATE>2010-12-13 00:00:00</GMP_ENDDATE>
  <CONTROLID>1</CONTROLID>
  <INNERID>33824</INNERID>
  <THESTATE>1</THESTATE>
  <COMMODITYIDFORTC>53829</COMMODITYIDFORTC>
  <ISBASECOMM>0</ISBASECOMM>
  </COMMODITY>
  </NewDataSet>
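Reading XML with SAX takes a few tricks. Two examples: how to handle nodes that share the same name (such as MEMBERID, which appears under both PUBLIC_MEMBER and PUBLIC_MEMBER_INFO), and the fact that text content, Chinese text in particular, may be delivered in pieces, even one character at a time, rather than in a single call.
The following code explains: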
package cn.net.tc.yjj.core.utils;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import cn.net.tc.yjj.system.dto.Commodity;
import cn.net.tc.yjj.system.dto.Intent;
import cn.net.tc.yjj.system.dto.PublicMember;
import cn.net.tc.yjj.system.dto.PublicMemberInfo;
import cn.net.tc.yjj.system.dto.SendProduct;
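
The two points above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's original handler: the class name `RecordHandlerSketch` is hypothetical, and instead of filling the DTO classes imported above (whose fields are not shown) it collects plain `RECORD/FIELD=value` strings. Same-named nodes are told apart by remembering which record element encloses them, and text is accumulated in a `StringBuilder` because SAX is allowed to call `characters()` several times for a single text node.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration; a real one would populate DTOs.
class RecordHandlerSketch extends DefaultHandler {
    private final List<String> fields = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String currentRecord; // enclosing record element, e.g. INTENT
    private int depth;            // 1 = root, 2 = record, 3 = field

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        depth++;
        if (depth == 2) {
            currentRecord = qName; // remember the parent of the fields that follow
        }
        text.setLength(0); // start a fresh buffer for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per text node, so append, never assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth == 3) {
            // The parent name disambiguates fields such as MEMBERID.
            fields.add(currentRecord + "/" + qName + "=" + text.toString().trim());
        } else if (depth == 2) {
            currentRecord = null;
        }
        text.setLength(0);
        depth--;
    }

    static List<String> parse(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            RecordHandlerSketch handler = new RecordHandlerSketch();
            parser.parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.fields;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // The escaped string is the Chinese MEMBERCLASSNAME value from the sample.
        String xml = "<NewDataSet>"
                + "<PUBLIC_MEMBER><MEMBERID>305659</MEMBERID>"
                + "<MEMBERCLASSNAME>\u533b\u7597\u673a\u6784</MEMBERCLASSNAME></PUBLIC_MEMBER>"
                + "<PUBLIC_MEMBER_INFO><MEMBERID>26508</MEMBERID></PUBLIC_MEMBER_INFO>"
                + "</NewDataSet>";
        for (String field : parse(xml)) {
            System.out.println(field);
        }
    }
}
```

Run on the small input in `main`, this prints `PUBLIC_MEMBER/MEMBERID=305659`, `PUBLIC_MEMBER/MEMBERCLASSNAME=医疗机构` and `PUBLIC_MEMBER_INFO/MEMBERID=26508`, so the two MEMBERID values stay attached to their own records and the Chinese value arrives whole regardless of how the parser chunks it.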
