Tuesday, July 1, 2008

DOM automatically translates HTML entities into the characters they represent

When the content of an element is retrieved from the DOM parser with getTextContent(), the HTML entities in it have already been translated into the characters they represent.
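To make the behavior concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name EntityDecodeDemo, the helper rootText, and the sample markup are made up for illustration; the sample uses only the predefined XML entities (&lt;, &gt;, &amp;), since that is all a plain XML parser knows about without a DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityDecodeDemo {
    // Hypothetical helper: parses an XML string and returns the text
    // content of its root element.
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The entity references in the source come back as plain characters.
        System.out.println(rootText("<p>a &lt; b &amp;&amp; c &gt; d</p>"));
        // prints: a < b && c > d
    }
}
```

The entity references are resolved while the document is parsed, so by the time getTextContent() is called the text node holds only the decoded characters.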

So far, I haven't found a way to disable this automatic translation, so I am using org.apache.commons.lang.StringEscapeUtils to translate the special characters back into HTML entities.
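The re-escaping step might look roughly like the sketch below. To keep it runnable without extra jars, it uses a small hand-written helper (escapeBasic, a hypothetical name) as a stand-in for StringEscapeUtils; it covers only &, <, > and the double quote, which is far fewer characters than the Commons Lang class handles.

```java
public class ReEscapeDemo {
    // Hypothetical stand-in for StringEscapeUtils: escapes only the
    // characters that are significant in markup.
    static String escapeBasic(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input is the already-decoded text as returned by getTextContent().
        System.out.println(escapeBasic("a < b && c > d"));
        // prints: a &lt; b &amp;&amp; c &gt; d
    }
}
```

Re-escaping after the fact cannot tell which characters were written as entities in the original document and which were not, so the round trip does not necessarily reproduce the source exactly.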

This is really a workaround, not a solution.
