  1. #1
    DerekRaimann (Member)

    SAX parser not able to load XHTML files that have a DOCTYPE

    I have run into an issue where any XHTML file I try to parse with the following program causes it to freeze if the file contains a DOCTYPE declaration. Here is the code:

    Java Code:
    	import java.io.*;
    	import java.net.*;
    	import javax.xml.parsers.*;
    	import org.xml.sax.*;
    	import org.xml.sax.helpers.*;
    
    	/**
    	 * This program demonstrates how to use a SAX parser. The program prints all hyperlinks
    	 * of an XHTML file. <br>
    	 * Usage: java SAXTest filename
    	 * @version 1.00 2001-09-29
    	 * @author Cay Horstmann
    	 */
    	public class SAXTest {
    		
    		public static void main(String[] args) throws Exception {
    						
    			String file = "";
    			if (args.length == 0) {
    				System.out.println("Usage: java SAXTest filename");
    				System.exit(1);
    			} else file = args[0];
    
    			DefaultHandler handler = new DefaultHandler() {
    				public void startElement(String namespaceURI, String lname, String qname, 
    					Attributes attrs) throws SAXException {
    					if (lname.equals("a") && attrs != null) {
    						for (int i = 0; i < attrs.getLength(); i++) {
    							String aname = attrs.getLocalName(i);
    							if (aname.equals("href")) System.out.println(attrs.getValue(i));
    						}
    					}
    				}
    			};
    			SAXParserFactory factory = SAXParserFactory.newInstance();
    			factory.setNamespaceAware(true);
    			SAXParser saxParser = factory.newSAXParser();
    			saxParser.parse(file, handler);
    		}
    	}
    The program takes one parameter: the name of the file to be parsed. If the file has an <?xml ...?> declaration, make sure there is no whitespace before it! Leading whitespace there caused a whole series of errors to appear. Thanks for your help, I would really like to be able to get past this hangup!

    -Derek
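
The thread never identifies the cause, so what follows is only a sketch under an assumption: that the freeze comes from the parser trying to download the external DTD named in the DOCTYPE (for XHTML, usually a w3.org URL), which can stall for a long time. The class name SaxLinksNoDtd and the method extractLinks are hypothetical names chosen for illustration. The handler overrides resolveEntity so the parser receives an empty entity instead of going to the network.

```java
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant of SAXTest: same link extraction, but no DTD download.
public class SaxLinksNoDtd {

    /** Returns the href of every <a> element found in the given source. */
    public static List<String> extractLinks(InputSource source)
            throws ParserConfigurationException, SAXException, IOException {
        final List<String> links = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Assumption: the hang is caused by fetching the external DTD.
                // Returning an empty entity keeps the parser off the network.
                return new InputSource(new StringReader(""));
            }

            @Override
            public void startElement(String namespaceURI, String lname, String qname,
                    Attributes attrs) {
                if (lname.equals("a")) {
                    String href = attrs.getValue("href");
                    if (href != null) {
                        links.add(href);
                    }
                }
            }
        };

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // SAXParser.parse registers the handler as the entity resolver as well.
        factory.newSAXParser().parse(source, handler);
        return links;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("Usage: java SaxLinksNoDtd filename");
            System.exit(1);
        }
        InputSource source = new InputSource(new File(args[0]).toURI().toString());
        for (String link : extractLinks(source)) {
            System.out.println(link);
        }
    }
}
```

One caveat with this approach: because the DTD is replaced by an empty one, named entities that the XHTML DTD would normally declare (such as &nbsp;) are no longer defined, and the parser may report an error if the document uses them. Some parsers also offer a switch for skipping external DTDs, for example the Xerces feature http://apache.org/xml/features/nonvalidating/load-external-dtd, but whether it is honoured depends on the parser implementation in use.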

  2. #2
    DerekRaimann (Member)

    You are really on to something here. Could you elaborate on those exotic statements so I can get past this mysterious SAX error? How will power leveling my level 55 Shadow Priest that I no longer use because I quit playing WOW help me parse documents containing DOCTYPE in their contents? Thanks for your esoteric input!

    Sincerely,

    Derek Raimann

  3. #3
    DarrylBurke (Forum Police)

    Quote (originally posted by DerekRaimann):
    You are really on to something here. Could you elaborate on those exotic statements so I can get past this mysterious SAX error? How will power leveling my level 55 Shadow Priest that I no longer use because I quit playing WOW help me parse documents containing DOCTYPE in their contents? Thanks for your esoteric input!

    Sincerely,

    Derek Raimann

    A tip: never respond to spam. It disappears after a while and you're left looking silly ;)

    db

  4. #4
    DerekRaimann (Member)

    Understood; however, a universe devoid of silly people would be so crass ;)

    Yours truly,

    Derek Raimann
