Good day!

When working with this XML document:

    <?xml version="1.0" encoding="UTF-8"?>
    <response>
      <state>200</state>
      <error></error>
      <result>
        <user name="parent" title="NONE">
          <roles>
            <item>parent</item>
          </roles>
        </user>
      </result>
    </response>

I have a problem parsing it. Here is how I pull information from it:

(I use the DocumentBuilderFactory -> DocumentBuilder -> Document classes to work with the XML.)

    ...
    DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dbuilder = dfactory.newDocumentBuilder();
    byte[] bytes = xmltext.getBytes(); // xmltext is the XML text
    InputStream is = new ByteArrayInputStream(bytes);
    Document xmldoc = dbuilder.parse(is);
    xmldoc.getDocumentElement().normalize();
    ...
    NodeList d = xmldoc.getDocumentElement().getElementsByTagName("roles");
    String ut = d.item(0).getFirstChild().getNodeValue();
    ...

As a result, I get an empty string in the variable "ut", but it should contain the text "parent".

Why does this happen? Thanks.

  • What does Android have to do with this? I removed that tag. - a_gura

1 answer

Formatting characters (tabs, spaces, line breaks) are also children of a node: the parser stores them as text nodes. In this case, the first child of the roles element (d.item(0) in your code) is a text node holding the newline and a few indentation spaces, so getNodeValue() returns that whitespace. The next child is the item element itself. To avoid this, check while iterating over the children whether each node implements the Element interface; if it does, it is an element rather than whitespace text.
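
A minimal sketch of that check, reusing the question's parsing steps. The class name FirstRole and the method firstRole are made up for this illustration, and reading the text with getTextContent() is my assumption about what the asker wants, since "parent" is itself a text node inside item:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class FirstRole {
    // Returns the text of the first element child of the first <roles>,
    // or null if there is no <roles> or it has no element children.
    static String firstRole(String xmltext) {
        try {
            DocumentBuilder dbuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document xmldoc = dbuilder.parse(
                    new ByteArrayInputStream(xmltext.getBytes(StandardCharsets.UTF_8)));
            NodeList roles = xmldoc.getDocumentElement().getElementsByTagName("roles");
            if (roles.getLength() == 0) {
                return null;
            }
            NodeList children = roles.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Whitespace-only text nodes are not Elements, so they are skipped here.
                if (child instanceof Element) {
                    return child.getTextContent();
                }
            }
            return null;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<response>\n"
                + "  <roles>\n"
                + "    <item>parent</item>\n"
                + "  </roles>\n"
                + "</response>";
        System.out.println(firstRole(xml)); // prints "parent"
    }
}
```

Comparing child.getNodeType() with Node.ELEMENT_NODE would work just as well as instanceof here.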

  • Thank you very much! I did not know about this unpleasant detail of XML. I just wonder who thought of treating whitespace as a child of a node. After all, it is useless. - AseN
  • The authors of the XML specification, probably. By the way, if memory serves, you can configure the parser so that it skips all the whitespace characters. I do not remember exactly how; it may depend on the specific parser implementation. In any case, Google is at your fingertips, go for it! - fori1ton
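
One parser setting that matches the last comment is DocumentBuilderFactory.setIgnoringElementContentWhitespace(true). As far as I know it relies on a DTD to tell the parser which elements contain only other elements, and its documentation says the parser has to be in validating mode. The sketch below is an illustration under that assumption: the DOCTYPE is hypothetical (the original document has none), and the class name IgnoreWhitespace and the SAMPLE constant are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class IgnoreWhitespace {
    // The question's document plus a hypothetical internal DTD (an assumption, not in the original).
    static final String SAMPLE =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<!DOCTYPE response [\n"
            + "  <!ELEMENT response (state, error, result)>\n"
            + "  <!ELEMENT state (#PCDATA)>\n"
            + "  <!ELEMENT error (#PCDATA)>\n"
            + "  <!ELEMENT result (user)>\n"
            + "  <!ELEMENT user (roles)>\n"
            + "  <!ATTLIST user name CDATA #REQUIRED title CDATA #REQUIRED>\n"
            + "  <!ELEMENT roles (item*)>\n"
            + "  <!ELEMENT item (#PCDATA)>\n"
            + "]>\n"
            + "<response>\n"
            + "  <state>200</state>\n"
            + "  <error></error>\n"
            + "  <result>\n"
            + "    <user name=\"parent\" title=\"NONE\">\n"
            + "      <roles>\n"
            + "        <item>parent</item>\n"
            + "      </roles>\n"
            + "    </user>\n"
            + "  </result>\n"
            + "</response>\n";

    static String firstRole(String xmltext) {
        try {
            DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
            dfactory.setValidating(true);                        // the DTD is used during parsing
            dfactory.setIgnoringElementContentWhitespace(true);  // drop whitespace between elements
            DocumentBuilder dbuilder = dfactory.newDocumentBuilder();
            Document xmldoc = dbuilder.parse(
                    new ByteArrayInputStream(xmltext.getBytes(StandardCharsets.UTF_8)));
            NodeList d = xmldoc.getDocumentElement().getElementsByTagName("roles");
            // With the indentation dropped, the first child of <roles> is <item>,
            // and the first child of <item> is the text node holding its value.
            return d.item(0).getFirstChild().getFirstChild().getNodeValue();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstRole(SAMPLE));
    }
}
```

Without a DTD the parser cannot tell ignorable whitespace from meaningful text, so in that case checking for Element while iterating, as the answer suggests, is the simpler route.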