Tuesday 30 November 2010

XML pretty print without parsing

I was working on some XML generators, and one of the methods receives the XML as a string, which is then written to a file. The XML string is not formatted, however, so the output looks clumsy. Several pretty-print methods are available, but all of them parse the string into XML before formatting it. I did not want to use a heavyweight process like parsing just for formatting, so this piece of code uses a regular expression to format the string as XML.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String prettyPrintXMLAsString(String xmlString) {
    /* Remove new lines. Strings are immutable, so keep the returned value. */
    xmlString = xmlString.replaceAll("\n", "");
    StringBuffer xmlFinal = new StringBuffer();
    /* Group the xml tags: opening tag, text, closing tag, self-closing tag */
    Pattern p = Pattern
            .compile("(<[^/][^>]+>)?([^<]*)(</[^>]+>)?(<[^/][^>]+/>)?");
    Matcher m = p.matcher(xmlString);
    int tabCnt = 0;
    while (m.find()) {
        /* Every group is optional, so skip empty matches (e.g. at end of input) */
        if (m.group().isEmpty()) {
            continue;
        }
        /* Unmatched groups return null, so replace them with "" */
        String str1 = (null == m.group(1)) ? "" : m.group(1);
        String str2 = (null == m.group(2)) ? "" : m.group(2);
        String str3 = (null == m.group(3)) ? "" : m.group(3);
        String str4 = (null == m.group(4)) ? "" : m.group(4);

        printTabs(tabCnt, xmlFinal);
        if (!str1.equals("") && str3.equals("")) {
            ++tabCnt;
        }
        if (str1.equals("") && !str3.equals("")) {
            --tabCnt;
            /* Remove the last tab so the closing tag lines up with its opening tag */
            xmlFinal.deleteCharAt(xmlFinal.length() - 1);
        }

        xmlFinal.append(str1);
        xmlFinal.append(str2);
        xmlFinal.append(str3);
        /* Handle <mytag/> kind of tags */
        if (!str4.equals("")) {
            xmlFinal.append("\n");
            printTabs(tabCnt, xmlFinal);
            xmlFinal.append(str4);
        }
        xmlFinal.append("\n");
    }
    return xmlFinal.toString();
}

private static void printTabs(int cnt, StringBuffer buf) {
    for (int i = 0; i < cnt; i++) {
        buf.append("\t");
    }
}
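
To see what the four groups actually capture, here is a small sketch that runs the same pattern over a sample string and records each group per match. The class name XmlGroupDemo, the helpers describeMatches and orEmpty, and the sample input are assumptions made up for illustration; they are not part of the method above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class XmlGroupDemo {
    // Same pattern as in the post: opening tag, text, closing tag, self-closing tag.
    static final Pattern TAGS = Pattern
            .compile("(<[^/][^>]+>)?([^<]*)(</[^>]+>)?(<[^/][^>]+/>)?");

    // Hypothetical helper: returns one description line per non-empty match.
    static List<String> describeMatches(String xml) {
        List<String> out = new ArrayList<String>();
        Matcher m = TAGS.matcher(xml);
        while (m.find()) {
            if (m.group().isEmpty()) {
                continue; // every group is optional, so empty matches can occur
            }
            out.add("open=" + orEmpty(m.group(1))
                    + " text=" + orEmpty(m.group(2))
                    + " close=" + orEmpty(m.group(3))
                    + " selfClosing=" + orEmpty(m.group(4)));
        }
        return out;
    }

    // A group that did not take part in the match comes back as null.
    static String orEmpty(String s) {
        return s == null ? "" : s;
    }

    public static void main(String[] args) {
        for (String line : describeMatches("<root><item>text</item><empty/></root>")) {
            System.out.println(line);
        }
    }
}
```

For this input the sketch should report three matches: "<root>" alone as an opening tag, then "<item>text</item>" together with the trailing "<empty/>", and finally "</root>" alone as a closing tag. An opening tag with no closing tag in the same match is what makes the formatter indent one level deeper, and a lone closing tag is what brings it back.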




However, constructs like CDATA sections might not be handled correctly here.
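
A rough way to check the CDATA concern is to look at what the opening-tag group captures when the input contains a CDATA section. The sketch below is only a diagnostic under assumptions: the class name CdataCaptureDemo, the helper openingTagCaptures, and the sample input are hypothetical, and only the pattern itself comes from the code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CdataCaptureDemo {
    // Same pattern as in the post.
    static final Pattern TAGS = Pattern
            .compile("(<[^/][^>]+>)?([^<]*)(</[^>]+>)?(<[^/][^>]+/>)?");

    // Hypothetical helper: collects everything group 1 (the "opening tag" group) captures.
    static List<String> openingTagCaptures(String xml) {
        List<String> captured = new ArrayList<String>();
        Matcher m = TAGS.matcher(xml);
        while (m.find()) {
            if (m.group(1) != null) {
                captured.add(m.group(1));
            }
        }
        return captured;
    }

    public static void main(String[] args) {
        System.out.println(openingTagCaptures("<note><![CDATA[a < b]]></note>"));
    }
}
```

With this input the second capture is the whole CDATA section, because "<!" followed by anything up to the next ">" looks like an opening tag to the pattern. The formatter would then pair the CDATA section with "</note>" in the same match, so the indent counter raised by "<note>" is never lowered again. A CDATA section that contains ">" would be cut at that character instead.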