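Pull parsing gives the application full control over how a document is parsed. Android provides an API for pull parsing, consisting mainly of
org.xmlpull.v1.XmlPullParser; org.xmlpull.v1.XmlPullParserFactory;
Of these two types, XmlPullParser is the one you work with most; XmlPullParserFactory is a factory used to create XmlPullParser objects.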
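The application produces events by calling XmlPullParser.next() and related methods, and then handles each event. This is where pull differs from push parsing: with push, the parser generates events on its own and delivers them to the application through callbacks, whereas with pull an event is produced only when the application calls a parser method.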
Suppose the XML contains this: "<author country="United States">James Elliott</author>". Here author is the TAG, country is an ATTRIBUTE, and "James Elliott" is the TEXT.
To parse a document, first create an XmlPullParser object:
final XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(true);
final XmlPullParser parser = factory.newPullParser();
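Pull parsing is a traversal of the document. Each call to next(), nextTag(), nextToken() or nextText() advances through the document and leaves the parser positioned on some event; the parser cannot move backwards.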
Next, hand the document to the parser:
parser.setInput(new StringReader("<author country=\"United States\">James Elliott</author>"));
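At this point the document has only just been set, so the parser is positioned at the start of the document and the current event is START_DOCUMENT, which you can read with XmlPullParser.getEventType(). Calling next() then produces
- START_TAG, which tells the application that a tag has started; getName() returns "author". Another next() produces
- TEXT; getText() returns "James Elliott". Another next() produces
- END_TAG, which tells you that the tag has been fully processed. Another next() produces
- END_DOCUMENT, which tells you that the whole document has been processed.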
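The event sequence above can be tried off-device. XmlPullParser is not part of the standard JDK, so the sketch below uses a stand-in, not the Android API itself: the JDK's StAX reader (javax.xml.stream.XMLStreamReader), which follows the same pull model. Its START_ELEMENT, CHARACTERS and END_ELEMENT events correspond to START_TAG, TEXT and END_TAG, and getLocalName() plays the role of getName(). The class name PullEventTrace and the helper trace() are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; uses JDK StAX as a stand-in for XmlPullParser.
final class PullEventTrace {

    /** Walks the document with next() and records one line per event. */
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Deliver adjacent character data as a single TEXT event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));

            int eventType = reader.getEventType(); // START_DOCUMENT before the first next()
            while (true) {
                switch (eventType) {
                    case XMLStreamConstants.START_DOCUMENT:
                        events.add("START_DOCUMENT");
                        break;
                    case XMLStreamConstants.START_ELEMENT: {
                        StringBuilder line = new StringBuilder("START_TAG " + reader.getLocalName());
                        for (int i = 0; i < reader.getAttributeCount(); i++) {
                            line.append(' ').append(reader.getAttributeLocalName(i))
                                .append('=').append(reader.getAttributeValue(i));
                        }
                        events.add(line.toString());
                        break;
                    }
                    case XMLStreamConstants.CHARACTERS:
                        events.add("TEXT " + reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        events.add("END_TAG " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.END_DOCUMENT:
                        events.add("END_DOCUMENT");
                        return events;
                    default:
                        break; // comments, processing instructions, etc. are ignored here
                }
                eventType = reader.next(); // the application pulls the next event
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<author country=\"United States\">James Elliott</author>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

On Android the loop has the same shape: replace the reader with the XmlPullParser created earlier, compare against XmlPullParser.START_TAG, TEXT, END_TAG and END_DOCUMENT, and read attributes with getAttributeName(i) and getAttributeValue(i).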
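Besides next(), you can also use nextToken(), which reports finer-grained events such as COMMENT, CDSECT, DOCDECL and ENTITY_REF. If the program needs this lower-level information, drive the parse with nextToken() and handle the detailed events. Note that a TEXT event may contain nothing but white space, such as line breaks or spaces.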
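Two other methods are very handy: nextTag() and nextText().
nextTag(): it skips white space and then expects a START_TAG or an END_TAG, so whenever you know the next event is one of those, you can call nextTag() to jump straight to it; if it meets anything else, it throws an XmlPullParserException. It has two typical uses. At a START_TAG, if you know the tag contains child tags, nextTag() produces the START_TAG of the first child. At an END_TAG, if you know it is not the end of the document, nextTag() moves on to the START_TAG of the next tag (or to the END_TAG of the parent when there are no more siblings). In both cases next() would first produce a TEXT event containing only line breaks or other white space.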
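The difference between next() and nextTag() on indented XML can be seen in a small off-device sketch. Since XmlPullParser is not in the standard JDK, this uses the JDK's StAX XMLStreamReader as a stand-in; its nextTag() likewise skips white space and stops on a start or end tag. The class NextTagDemo, the helper eventAfterRoot() and the <book> wrapper element are assumptions made up for the example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; uses JDK StAX as a stand-in for XmlPullParser.
final class NextTagDemo {

    /** Describes the event the reader is currently positioned on. */
    static String describe(XMLStreamReader reader) {
        switch (reader.getEventType()) {
            case XMLStreamConstants.START_ELEMENT:
                return "START_TAG " + reader.getLocalName();
            case XMLStreamConstants.END_ELEMENT:
                return "END_TAG " + reader.getLocalName();
            case XMLStreamConstants.CHARACTERS:
                return reader.isWhiteSpace()
                        ? "TEXT (white space only)"
                        : "TEXT " + reader.getText();
            default:
                return "event " + reader.getEventType();
        }
    }

    /** Moves onto the root START_TAG, advances once more, and reports where the reader stopped. */
    static String eventAfterRoot(String xml, boolean useNextTag) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            reader.nextTag();     // START_DOCUMENT -> START_TAG of the root element
            if (useNextTag) {
                reader.nextTag(); // skips the line break and indentation
            } else {
                reader.next();    // stops on the white-space text first
            }
            return describe(reader);
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book>\n  <title>What Is Hibernate</title>\n</book>";
        System.out.println("next():    " + eventAfterRoot(xml, false));
        System.out.println("nextTag(): " + eventAfterRoot(xml, true));
    }
}
```

With next() the caller has to recognise and discard the white-space TEXT itself; with nextTag() that check disappears, at the price of an exception if the next event turns out to be real text.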
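nextText(): it may only be called at a START_TAG. If the next event is TEXT, the text content is returned; if the next event is END_TAG, meaning the element is empty, an empty string is returned. If the element contains a child tag instead, an XmlPullParserException is thrown. When the method returns, the parser is positioned on the END_TAG. For example:
<author>James Elliott</author> <author></author> <author/>
Calling nextText() at each START_TAG returns, in order:
"James Elliott"
"" (empty)
"" (empty)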
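This method is useful for tags that have no child tags. For example:
<title>What Is Hibernate</title> <author>James Elliott</author> <category>Web</category>
can be handled with the following code:
// Assumes the parser is already positioned on the first START_TAG (<title>).
int eventType = parser.getEventType();
while (eventType != XmlPullParser.END_TAG) {
    switch (eventType) {
    case XmlPullParser.START_TAG: {
        final String tag = parser.getName();
        // nextText() returns the content and leaves the parser on this tag's END_TAG.
        final String content = parser.nextText();
        Log.e(TAG, tag + ": [" + content + "]");
        // Move to the next sibling's START_TAG, or to the parent's END_TAG.
        eventType = parser.nextTag();
        break;
    }
    default:
        break;
    }
}
This is far more convenient than driving the parse with next(), and much more readable.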
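A runnable variant of that loop, combined with the empty-element cases of nextText(), might look like the sketch below. It is a stand-in rather than the Android API: XmlPullParser is not in the standard JDK, so it uses the JDK's StAX XMLStreamReader, where getElementText() plays the role of nextText() and also leaves the reader on the END_TAG. The class LeafElementDemo, the helper readLeaves() and the <item> wrapper element are assumptions introduced for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; uses JDK StAX as a stand-in for XmlPullParser.
final class LeafElementDemo {

    /** Reads every child of the root as "tag: [content]", assuming the children have no child tags. */
    static List<String> readLeaves(String xml) {
        List<String> lines = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            reader.nextTag();                 // onto the root START_TAG
            int eventType = reader.nextTag(); // first child START_TAG, or root END_TAG if it is empty
            while (eventType != XMLStreamConstants.END_ELEMENT) {
                String tag = reader.getLocalName();
                // Returns "" for <author></author> and <author/>; the reader stops on the END_TAG.
                String content = reader.getElementText();
                lines.add(tag + ": [" + content + "]");
                eventType = reader.nextTag(); // next sibling START_TAG, or root END_TAG
            }
            return lines;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML or unexpected child tag", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<item><title>What Is Hibernate</title> <author>James Elliott</author> "
                + "<category>Web</category> <author></author> <author/></item>";
        for (String line : readLeaves(xml)) {
            System.out.println(line);
        }
    }
}
```

The loop never inspects TEXT events directly: the text-reading call consumes the content and nextTag() skips the white space between siblings, which is what keeps the code short.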
Finally, here is a complete example Android program that parses XML:
import java.io.IOException;
import java.io.InputStream;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

import android.util.Log;

public class RssPullParser extends RssParser {
    private final String TAG = FeedSettings.GLOBAL_TAG;
    private InputStream mInputStream;

    public RssPullParser(InputStream is) {
        mInputStream = is;
    }

    public void parse() throws ReaderBaseException, XmlPullParserException, IOException {
        if (mInputStream == null) {
            throw new ReaderBaseException("no input source, did you initialize this class correctly?");
        }
        final XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        final XmlPullParser parser = factory.newPullParser();
        // setInput(InputStream, String) takes an encoding; null lets the parser detect it.
        parser.setInput(mInputStream, null);
        int eventType = parser.getEventType();
        if (eventType != XmlPullParser.START_DOCUMENT) {
            throw new ReaderBaseException("Not starting with 'start_document'");
        }
        eventType = parseRss(parser);
        if (eventType != XmlPullParser.END_DOCUMENT) {
            throw new ReaderBaseException("not ending with 'end_document', do you finish parsing?");
        }
        if (mInputStream != null) {
            mInputStream.close();
        } else {
            Log.e(TAG, "inputstream is null, XmlPullParser closed it??");
        }
    }

    /**
     * Parses the XML document. The current event must be START_DOCUMENT.
     * After this call the parser is positioned at END_DOCUMENT.
     *
     * @param parser the parser, positioned at START_DOCUMENT
     * @return the END_DOCUMENT event
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     * @throws IOException
     */
    private int parseRss(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException {
        int eventType = parser.getEventType();
        if (eventType != XmlPullParser.START_DOCUMENT) {
            throw new ReaderBaseException("not starting with 'start_document', is this a new document?");
        }
        Log.e(TAG, "starting document, are you aware of that!");
        eventType = parser.next();
        while (eventType != XmlPullParser.END_DOCUMENT) {
            switch (eventType) {
            case XmlPullParser.START_TAG: {
                Log.e(TAG, "start tag: '" + parser.getName() + "'");
                final String tagName = parser.getName();
                if (tagName.equals(RssFeed.TAG_RSS)) {
                    Log.e(TAG, "starting an RSS feed <<");
                    final int attrSize = parser.getAttributeCount();
                    for (int i = 0; i < attrSize; i++) {
                        Log.e(TAG, "attr '" + parser.getAttributeName(i) + "=" + parser.getAttributeValue(i) + "'");
                    }
                } else if (tagName.equals(RssFeed.TAG_CHANNEL)) {
                    Log.e(TAG, "\tstarting a Channel <<");
                    // Returns with the parser on the channel's END_TAG.
                    parseChannel(parser);
                }
                break;
            }
            case XmlPullParser.END_TAG: {
                Log.e(TAG, "end tag: '" + parser.getName() + "'");
                final String tagName = parser.getName();
                if (tagName.equals(RssFeed.TAG_RSS)) {
                    Log.e(TAG, ">> ending an RSS feed");
                } else if (tagName.equals(RssFeed.TAG_CHANNEL)) {
                    Log.e(TAG, "\t>> ending a Channel");
                }
                break;
            }
            default:
                break;
            }
            eventType = parser.next();
        }
        Log.e(TAG, "end of document, it is over");
        return parser.getEventType();
    }

    /**
     * Parses a channel. The parser MUST be at the START_TAG of a channel, otherwise an exception is thrown.
     * After this call the parser is positioned at the END_TAG of the channel.
     *
     * @param parser the parser, positioned at the channel's START_TAG
     * @return the END_TAG event of the channel
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     * @throws IOException
     */
    private int parseChannel(XmlPullParser parser) throws XmlPullParserException, ReaderBaseException, IOException {
        int eventType = parser.getEventType();
        String tagName = parser.getName();
        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_CHANNEL.equals(tagName)) {
            throw new ReaderBaseException("not start with 'start tag', is this a start of a channel?");
        }
        Log.e(TAG, "\tstarting " + tagName);
        eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
            case XmlPullParser.START_TAG: {
                final String tag = parser.getName();
                if (tag.equals(RssFeed.TAG_IMAGE)) {
                    parseImage(parser);
                } else if (tag.equals(RssFeed.TAG_ITEM)) {
                    parseItem(parser);
                } else {
                    final String content = parser.nextText();
                    Log.e(TAG, tag + ": [" + content + "]");
                }
                // now it SHOULD be at END_TAG, ensure it
                if (parser.getEventType() != XmlPullParser.END_TAG) {
                    throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?");
                }
                eventType = parser.nextTag();
                break;
            }
            default:
                break;
            }
        }
        Log.e(TAG, "\tending " + parser.getName());
        return parser.getEventType();
    }

    /**
     * Parses the image in a channel.
     * Precondition: position must be at START_TAG and tag MUST be 'image'
     * Postcondition: position is END_TAG of '/image'
     *
     * @throws IOException
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     */
    private int parseImage(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException {
        int eventType = parser.getEventType();
        String tag = parser.getName();
        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_IMAGE.equals(tag)) {
            throw new ReaderBaseException("not start with 'start tag', is this a start of an image?");
        }
        Log.e(TAG, "\t\tstarting image " + tag);
        eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
            case XmlPullParser.START_TAG:
                tag = parser.getName();
                Log.e(TAG, tag + ": [" + parser.nextText() + "]");
                // now it SHOULD be at END_TAG, ensure it
                if (parser.getEventType() != XmlPullParser.END_TAG) {
                    throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?");
                }
                eventType = parser.nextTag();
                break;
            default:
                break;
            }
        }
        Log.e(TAG, "\t\tending image " + parser.getName());
        return parser.getEventType();
    }

    /**
     * Parses an item in a channel.
     * Precondition: position must be at START_TAG and tag MUST be 'item'
     * Postcondition: position is END_TAG of '/item'
     *
     * @throws IOException
     * @throws XmlPullParserException
     * @throws ReaderBaseException
     */
    private int parseItem(XmlPullParser parser) throws XmlPullParserException, IOException, ReaderBaseException {
        int eventType = parser.getEventType();
        String tag = parser.getName();
        if (eventType != XmlPullParser.START_TAG || !RssFeed.TAG_ITEM.equals(tag)) {
            throw new ReaderBaseException("not start with 'start tag', is this a start of an item?");
        }
        Log.e(TAG, "\t\tstarting " + tag);
        eventType = parser.nextTag();
        while (eventType != XmlPullParser.END_TAG) {
            switch (eventType) {
            case XmlPullParser.START_TAG:
                tag = parser.getName();
                final String content = parser.nextText();
                Log.e(TAG, tag + ": [" + content + "]");
                // now it SHOULD be at END_TAG, ensure it
                if (parser.getEventType() != XmlPullParser.END_TAG) {
                    throw new ReaderBaseException("not ending with 'end tag', did you finish parsing sub item?");
                }
                eventType = parser.nextTag();
                break;
            default:
                break;
            }
        }
        Log.e(TAG, "\t\tending " + parser.getName());
        return parser.getEventType();
    }
}