  1. #1
    munish is offline Member
    Join Date
    Jul 2009
    Posts
    37
    Rep Power
    0

    Parse XML SAX with no </> (end tags) for child nodes.

    Hi,

    I am trying to parse an XML-like file such as the one below:
    <top>

    <num> Number: 301
    <title> International Organized Crime

    <desc> Description:
    Identify organizations that participate in international criminal

    <narr> Narrative:
    A relevant document must as a minimum identify the organization and the
    type of illegal activity (e.g., Columbian cartel exporting cocaine).

    </top>


    <top>

    <num> Number: 302
    <title> Poliomyelitis and Post-Polio

    <desc> Description:
    Is the disease of Poliomyelitis (polio) under control in the
    world?

    <narr> Narrative:
    Relevant documents should contain data or outbreaks of the
    polio disease (large or small scale), medical protection

    </top>
    but I get the following error while parsing it with SAX:
    The element type "narr" must be terminated by the matching end-tag "</narr>"
    Code:
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class QueryExpantion {
        void run1() {
            try {
                SAXParserFactory factory = SAXParserFactory.newInstance();
                SAXParser saxParser = factory.newSAXParser();
                DefaultHandler handler = new DefaultHandler() {
                    // Flags recording which elements have been opened so far.
                    boolean top = false;
                    boolean desc = false;
                    boolean num = false;
                    boolean narr = false;
                    String topString = "";
                    String descString = "";
                    String numString = "";
                    String narrString = "";

                    public void startElement(String uri, String localName,
                            String qName, Attributes attributes)
                            throws SAXException {
                        if (qName.equalsIgnoreCase("top")) {
                            top = true;
                        }
                        if (qName.equalsIgnoreCase("desc")) {
                            desc = true;
                        }
                        if (qName.equalsIgnoreCase("num")) {
                            num = true;
                        }
                        if (qName.equalsIgnoreCase("narr")) {
                            narr = true;
                        }
                    }

                    // Stores the most recent chunk of text for every open element.
                    public void characters(char ch[], int start, int length)
                            throws SAXException {
                        if (top) {
                            topString = new String(ch, start, length);
                        }

                        if (desc) {
                            descString = new String(ch, start, length);
                        }
                        if (num) {
                            numString = new String(ch, start, length);
                        }

                        if (narr) {
                            narrString = new String(ch, start, length);
                        }
                    }

                    // Only </top> exists in the input, so all flags are reset here.
                    public void endElement(String uri, String localName,
                            String qName) throws SAXException {
                        if (qName.equalsIgnoreCase("top")) {
                            num = false;
                            desc = false;
                            narr = false;
                            top = false;
                        }
                    }
                };

                saxParser.parse("/home/munish/Documents/trec-demo-master/test-data/topics.301-450", handler);

            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
    So is it possible to parse this with SAX, or do I need to go for a different approach?

  2. #2
    masijade is offline Senior Member
    Join Date
    Jun 2008
    Posts
    2,571
    Rep Power
    9

    Re: Parse XML SAX with no </> (end tags) for child nodes.

    That is NOT well-formed XML. You might hope to use an XML parser to pick out the separate "top" groups, but the child elements will not parse with any XML parser. And because each group contains a bunch of opening tags with no matching closing tags, you probably can't even parse the "top" groups that way.
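
    Since an XML parser is ruled out, one option is to read the file as plain text. The following is only a minimal sketch, assuming each topic starts with a line containing just <top>, ends with a line containing just </top>, and every field tag (<num>, <title>, <desc>, <narr>) opens at the start of a line, as in the sample above. The class name TopicParser and its parse method are hypothetical names made up for this illustration, not part of any library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a line-based reader for the topic format, no XML parser involved.
public class TopicParser {
    // Matches an opening tag such as <num> or <title> at the start of a line,
    // capturing the tag name and any text that follows it on the same line.
    private static final Pattern OPEN_TAG = Pattern.compile("^<(\\w+)>\\s*(.*)$");

    /** Returns one field map (tag name -> text) per <top> ... </top> block. */
    public static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> topics = new ArrayList<>();
        Map<String, String> current = null; // topic being built, null outside <top>
        String field = null;                // tag whose text is currently being collected
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.equals("<top>")) {
                current = new LinkedHashMap<>();
                field = null;
                continue;
            }
            if (line.equals("</top>")) {
                if (current != null) {
                    topics.add(current);
                }
                current = null;
                field = null;
                continue;
            }
            if (current == null) {
                continue; // ignore anything outside a topic block
            }
            Matcher m = OPEN_TAG.matcher(line);
            if (m.matches()) {
                // A new tag implicitly closes the previous field.
                field = m.group(1);
                current.put(field, m.group(2).trim());
            } else if (field != null) {
                // Continuation line: append it to the field that is still open.
                current.merge(field, line, (a, b) -> a.isEmpty() ? b : a + " " + b);
            }
        }
        return topics;
    }

    public static void main(String[] args) {
        String sample = "<top>\n\n<num> Number: 301\n<title> International Organized Crime\n\n"
                + "<desc> Description:\nIdentify organizations that participate in international criminal\n\n"
                + "</top>\n";
        for (Map<String, String> topic : parse(sample)) {
            System.out.println(topic);
        }
    }
}
```

    The labels that follow each tag ("Number:", "Description:", "Narrative:") are left in the values here; stripping them would be an extra step once the fields are separated.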

  3. #3
    JosAH's Avatar
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,302
    Blog Entries
    7
    Rep Power
    20

    Re: Parse XML SAX with no </> (end tags) for child nodes.

    XML is not HTML; i.e. it is very 'strict' when it comes to matching open and close tags. In HTML you can make a mess of it (some Lisps had a similar 'feature' where a single ']' matched all open '('); thank the gods for XML's 'strictness'.
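
    To see that strictness in isolation, here is a small sketch that feeds two tiny strings to the standard JAXP SAX parser: one with a matching </num> and one without. The class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for this example; the parser calls themselves are the regular javax.xml.parsers API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper that reports whether a string is well-formed XML.
public class WellFormedCheck {
    /** Returns true if the text parses cleanly, false if SAX reports a parse error. */
    public static boolean isWellFormed(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler()); // default handler rethrows fatal errors
            return true;
        } catch (SAXParseException e) {
            // Raised for problems such as a missing end tag.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Every open tag is closed.
        System.out.println(isWellFormed("<top><num>301</num></top>"));
        // <num> is never closed, as in the topics file.
        System.out.println(isWellFormed("<top><num>301</top>"));
    }
}
```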

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass
