
MATLAB Command Series: Reading and Writing XML with xmlread and xmlwrite

October 22, 2013 ⁄ General

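XML is a document format for organizing and presenting structured data. MATLAB provides two built-in functions for reading and writing XML documents: xmlread and xmlwrite. Calling them is easy; the hard part is working with the values they return and accept. Both functions are built on the DOM (Document Object Model): the output of xmlread and the input of xmlwrite are DOM nodes. In MATLAB these nodes are Java objects, so they are manipulated through their fields and methods. This article mainly covers those fields and methods. It first shows how to call the two functions and then goes into the relevant details of the DOM.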

1 Reading and Writing XML

Syntax:

DOMnode = xmlread(filename)
str = xmlwrite(DOMnode)
xmlwrite(filename,DOMnode)

These calls make reading and writing XML straightforward, but doing anything further with the parsed result requires a closer look at the DOM.
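The three call forms can be tried together in a small round trip. The following is a minimal sketch: the XML content and the temporary file names are made up for illustration, and it assumes a MATLAB release where xmlread and xmlwrite are available.

```matlab
% Write a tiny XML file to a temporary location (illustrative content).
xmlFile = [tempname '.xml'];
fid = fopen(xmlFile, 'w');
fprintf(fid, '%s', '<list><label>Profiler</label></list>');
fclose(fid);

% Form 1: parse the file into a DOM document node.
DOMnode = xmlread(xmlFile);

% Form 2: serialize the DOM node to a character vector.
str = xmlwrite(DOMnode);

% Form 3: write the DOM node to another file.
outFile = [tempname '.xml'];
xmlwrite(outFile, DOMnode);
outExists = (exist(outFile, 'file') == 2);

% The name of the root element, converted from a Java String to char.
rootElement = DOMnode.getDocumentElement;
rootName = char(rootElement.getNodeName);

% Clean up the temporary files.
delete(xmlFile);
delete(outFile);
```

Note that values such as node names come back as Java strings, so they are wrapped in char before being used as MATLAB text.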

2 XML DOM

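In the DOM, every component of an XML document (elements, text, attributes, comments) is represented as a node. Access to a node's properties and methods follows a standard interface, which is covered in the next section. The following sample illustrates the node types: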

<productinfo
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="http://www.mathworks.com/namespace/info/v1/info.xsd">

<!-- This is a sample info.xml file. -->

<list>

<listitem>
<label>Import Wizard</label>
<callback>uiimport</callback>
<icon>ApplicationIcon.GENERIC_GUI</icon>
</listitem>

<listitem>
<label>Profiler</label>
<callback>profile viewer</callback>
<icon>ApplicationIcon.PROFILER</icon>
</listitem>

</list>
</productinfo>

(1) Element nodes: correspond to tag names. In the sample above these are productinfo, list, listitem, label, callback, and icon.

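(2) Text nodes: the values contained in element nodes. In the sample above, the "Import Wizard" inside the first label element is a text node. Whitespace between tags, such as line breaks, is normally read as text nodes as well.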

(3) Attribute nodes: the name and value pairs inside the angle brackets of a start tag. In the sample above, xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" is an attribute node: xmlns:xsi is its name and http://www.w3.org/2001/XMLSchema-instance is its value.

(4) Comment nodes: comments in the XML document, that is, text enclosed between <!-- and -->, such as <!-- This is a sample info.xml file. -->

(5) Document nodes: correspond to the entire XML document. A document node can create new nodes of all the types above.
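The five node types can be seen side by side by letting a document node create the others. This is a minimal sketch, not part of the original sample: the element names, attribute, and comment text are invented, and it assumes the com.mathworks.xml.XMLUtils.createDocument helper that the MATLAB xmlwrite documentation uses. The numeric type codes are those defined by the DOM Node interface.

```matlab
% Create an empty document whose root element is <list>.
doc  = com.mathworks.xml.XMLUtils.createDocument('list');
root = doc.getDocumentElement;

% The document node acts as a factory for the other node types.
root.setAttribute('id', 'tools');          % attribute node on <list>
note  = doc.createComment(' sample ');     % comment node
label = doc.createElement('label');        % element node
txt   = doc.createTextNode('Profiler');    % text node

% Assemble the tree: <list id="tools"><!-- sample --><label>Profiler</label></list>
label.appendChild(txt);
root.appendChild(note);
root.appendChild(label);

attr = root.getAttributeNode('id');

% DOM type codes: document 9, element 1, text 3, attribute 2, comment 8.
types = [doc.getNodeType(), root.getNodeType(), txt.getNodeType(), ...
         attr.getNodeType(), note.getNodeType()];

% Serialize the result to check what was built.
str = xmlwrite(doc);
```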

3 DOM Properties and Methods

    The DOM defines a number of interfaces that describe different kinds of data. A summary follows:

Interface Summary
Attr The Attr interface represents an attribute in an Element object.
CDATASection CDATA sections are used to escape blocks of text containing characters that would otherwise be regarded as markup.
CharacterData The CharacterData interface extends Node with a set of attributes and methods for accessing character data in the DOM.
Comment This interface inherits from CharacterData and represents the content of a comment, i.e., all the characters between the starting ' <!--' and ending '-->'.
Document The Document interface represents the entire HTML or XML document.
DocumentFragment DocumentFragment is a "lightweight" or "minimal" Document object.
DocumentType Each Document has a doctype attribute whose value is either null or a DocumentType object.
DOMConfiguration The DOMConfiguration interface represents the configuration of a document and maintains a table of recognized parameters.
DOMError DOMError is an interface that describes an error.
DOMErrorHandler DOMErrorHandler is a callback interface that the DOM implementation can call when reporting errors that happens while processing XML data, or when doing some other processing (e.g.
DOMImplementation The DOMImplementation interface provides a number of methods for performing operations that are independent of any particular instance of the document object model.
DOMImplementationList The DOMImplementationList interface provides the abstraction of an ordered collection of DOM implementations, without defining or constraining how this collection is implemented.
DOMImplementationSource This interface permits a DOM implementer to supply one or more implementations, based upon requested features and versions, as specified in .
DOMLocator DOMLocator is an interface that describes a location (e.g.
DOMStringList The DOMStringList interface provides the abstraction of an ordered collection of DOMString values, without defining or constraining how this collection is implemented.
Element The Element interface represents an element in an HTML or XML document.
Entity This interface represents a known entity, either parsed or unparsed, in an XML document.
EntityReference EntityReference nodes may be used to represent an entity reference in the tree.
NamedNodeMap Objects implementing the NamedNodeMap interface are used to represent collections of nodes that can be accessed by name.
NameList The NameList interface provides the abstraction of an ordered collection of parallel pairs of name and namespace values (which could be null values), without defining or constraining how this collection is implemented.
Node The Node interface is the primary datatype for the entire Document Object Model.
NodeList The NodeList interface provides the abstraction of an ordered collection of nodes, without defining or constraining how this collection is implemented.
Notation This interface represents a notation declared in the DTD.
ProcessingInstruction The ProcessingInstruction interface represents a "processing instruction", used in XML as a way to keep processor-specific information in the text of the document.
Text The Text interface inherits from CharacterData and represents the textual content (termed character data in XML) of an Element or Attr.
TypeInfo The TypeInfo interface represents a type referenced from Element or Attr nodes, specified in the schemas associated with the document.
UserDataHandler When associating an object to a key on a node using Node.setUserData() the application can provide a handler that gets called when the node the object is associated to is being cloned, imported, or renamed.

 

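Because the DOM has so many interfaces, only the most commonly used ones are introduced briefly here; see reference 3 for the rest.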

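(1) The Node interface is the primary datatype of the whole DOM and its superinterface: many other interfaces, such as Document, Element, and Text, inherit from Node and therefore share its basic fields and methods. (IIOMetadataNode, which the Java API documentation lists alongside Node, is a class that implements Node, not its parent.)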

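Field:

  • static short COMMENT_NODE: denotes a Comment node
  • static short DOCUMENT_NODE: denotes a Document node
  • static short ELEMENT_NODE: denotes an Element node
  • static short TEXT_NODE: denotes a Text node

Method:

  • Node  getFirstChild(): returns the first child of this node, as a Node
  • Node  getLastChild(): returns the last child of this node, as a Node
  • String getNodeName(): returns the name of this node, as a String
  • short  getNodeType(): returns a code for the type of this node (compare it with the fields above)
  • String getTextContent(): returns the text content of this node and all its descendants
  • boolean hasAttributes(): checks whether this node (if it is an Element node) has any attributes
  • boolean hasChildNodes(): checks whether this node has any child nodes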
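The Node methods listed above can be exercised on a small parsed document. The sketch below uses a shortened, single-line variant of the sample listitem so that no whitespace text nodes appear between the tags; the file name is a temporary one chosen for illustration.

```matlab
% Write a compact listitem (no whitespace between tags) to a temporary file.
xmlFile = [tempname '.xml'];
fid = fopen(xmlFile, 'w');
fprintf(fid, '%s', ['<listitem><label>Import Wizard</label>' ...
                    '<callback>uiimport</callback></listitem>']);
fclose(fid);

doc  = xmlread(xmlFile);
item = doc.getDocumentElement;            % the <listitem> element

first = item.getFirstChild;               % <label>
last  = item.getLastChild;                % <callback>

firstName = char(first.getNodeName);      % tag name of the first child
lastName  = char(last.getNodeName);       % tag name of the last child
firstText = char(first.getTextContent);   % text inside <label>
allText   = char(item.getTextContent);    % text of all descendants, concatenated

itemType = item.getNodeType;              % 1, the value of ELEMENT_NODE
hasAttrs = item.hasAttributes;            % <listitem> carries no attributes
hasKids  = item.hasChildNodes;            % it does have child nodes

delete(xmlFile);
```

If the file were indented as in the sample of section 2, getFirstChild would return a whitespace text node (named #text) rather than the label element, which is a common source of confusion.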

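(2) The Document interface (inherits from Node) represents the entire document. It is the root of the XML document tree and provides the entry point for accessing the document's content; it is the type of object that xmlread returns:

  • DocumentType getDoctype(): returns the document type declaration (DTD) associated with the document, or null if there is none
  • NodeList    getElementsByTagName(String tagname): returns all Element nodes whose tag name is tagname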
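A short sketch of these two Document methods, again on an invented single-line file with no DTD. In MATLAB a Java null comes back as an empty array, so getDoctype is checked with isempty here. getDocumentElement, which returns the root element, is also part of the Document interface although it is not listed above.

```matlab
% Temporary file with two listitem entries and no DOCTYPE declaration.
xmlFile = [tempname '.xml'];
fid = fopen(xmlFile, 'w');
fprintf(fid, '%s', ['<list><listitem><label>Import Wizard</label></listitem>' ...
                    '<listitem><label>Profiler</label></listitem></list>']);
fclose(fid);

doc = xmlread(xmlFile);                        % a Document node

docType = doc.getDoctype;                      % empty: the file declares no DTD
noDtd   = isempty(docType);

labels    = doc.getElementsByTagName('label'); % searches the whole tree
numLabels = labels.getLength;

rootElement = doc.getDocumentElement;          % the <list> element
rootName    = char(rootElement.getNodeName);

delete(xmlFile);
```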

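(3) The NodeList interface represents an ordered collection of nodes and acts as a kind of container. It is a standalone interface and does not inherit from Node:

  • int  getLength(): the number of nodes in the NodeList
  • Node item(int index): returns the node at position index, counting from 0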
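Because item counts from 0 while MATLAB loops usually count from 1, the index has to be shifted by one when walking a NodeList. A minimal sketch, with illustrative file content and variable names:

```matlab
% Temporary file holding two callback entries.
xmlFile = [tempname '.xml'];
fid = fopen(xmlFile, 'w');
fprintf(fid, '%s', ['<list><callback>uiimport</callback>' ...
                    '<callback>profile viewer</callback></list>']);
fclose(fid);

doc       = xmlread(xmlFile);
callbacks = doc.getElementsByTagName('callback');   % a NodeList

n     = callbacks.getLength;
names = cell(1, n);
for k = 1:n
    node = callbacks.item(k - 1);          % NodeList is 0-based, the loop is 1-based
    names{k} = char(node.getTextContent);  % convert the Java String to char
end

delete(xmlFile);
```

The same count-1 pattern appears in the parseXML example in section 4.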

4 Example

function theStruct = parseXML(filename)
% PARSEXML Convert XML file to a MATLAB structure.
try
   tree = xmlread(filename);
catch
   error('Failed to read XML file %s.',filename);
end

% Recurse over child nodes. This could run into problems 
% with very deeply nested trees.
try
   theStruct = parseChildNodes(tree);
catch
   error('Unable to parse XML file %s.',filename);
end


% ----- Subfunction PARSECHILDNODES -----
function children = parseChildNodes(theNode)
% Recurse over node children.
children = [];
if theNode.hasChildNodes
   childNodes = theNode.getChildNodes;
   numChildNodes = childNodes.getLength;
   allocCell = cell(1, numChildNodes);

   children = struct(             ...
      'Name', allocCell, 'Attributes', allocCell,    ...
      'Data', allocCell, 'Children', allocCell);

    for count = 1:numChildNodes
        theChild = childNodes.item(count-1);
        children(count) = makeStructFromNode(theChild);
    end
end

% ----- Subfunction MAKESTRUCTFROMNODE -----
function nodeStruct = makeStructFromNode(theNode)
% Create structure of node info.

nodeStruct = struct(                        ...
   'Name', char(theNode.getNodeName),       ...
   'Attributes', parseAttributes(theNode),  ...
   'Data', '',                              ...
   'Children', parseChildNodes(theNode));

if any(strcmp(methods(theNode), 'getData'))
   nodeStruct.Data = char(theNode.getData); 
else
   nodeStruct.Data = '';
end

% ----- Subfunction PARSEATTRIBUTES -----
function attributes = parseAttributes(theNode)
% Create attributes structure.

attributes = [];
if theNode.hasAttributes
   theAttributes = theNode.getAttributes;
   numAttributes = theAttributes.getLength;
   allocCell = cell(1, numAttributes);
   attributes = struct('Name', allocCell, 'Value', ...
                       allocCell);

   for count = 1:numAttributes
      attrib = theAttributes.item(count-1);
      attributes(count).Name = char(attrib.getName);
      attributes(count).Value = char(attrib.getValue);
   end
end

References:

    1 MATLAB help documentation

    2 http://www.academictutorials.com/xml-dom/xmldom-introduction.asp

    3 http://download.oracle.com/javase/6/docs/api/
