  1. #1
    SnakeDoc is offline Senior Member
    Join Date
    Apr 2012
    Posts
    129
    Rep Power
    0

    Make Transformer Stream file output ???

    Hello!

    I am currently using DOM to parse a very large XML file (about 25MB). I'm aware that a SAX parser may be better for this since it does not load the entire XML into memory, but this is the way I've built it.

    My problem is that I am outputting a new XML file based on a bunch of logic I've constructed in my program. Currently it seems that my output is built in memory, and only when the program completes is the entire new XML file written out at once. I think it would be faster, and use considerably fewer resources, if I could stream the new XML while the program is running, but I am unsure how to go about this.

    My program is split up into multiple methods, so I'm not really sure how to make them all stream to the same file in the correct order.

    Some of my code:

    method that starts the creation of the XML:
    Java Code:
    public static void createXMLHead() {
    		try {
    			dbfac = DocumentBuilderFactory.newInstance();
    			docBuilder = dbfac.newDocumentBuilder();
    			doc = docBuilder.newDocument();
    			
    			root = doc.createElement("AmazonEnvelope");
    			doc.appendChild(root);
    			
    			Attr attr = doc.createAttribute("xmlns:xsi");
    			attr.setValue("http://www.w3.org/2001/XMLSchema-instance");
    			Attr attr2 = doc.createAttribute("xsi:noNamespaceSchemaLocation");
    			attr2.setValue("amzn-envelope.xsd");
    			
    			root.setAttributeNode(attr);
    			root.setAttributeNode(attr2);
    			
    			Element header = doc.createElement("Header");
    			root.appendChild(header);
    				
    				Element docVer = doc.createElement("DocumentVersion");
    				docVer.appendChild(doc.createTextNode("1.01"));
    				header.appendChild(docVer);
    				
    				Element merchIdent = doc.createElement("MerchantIdentifier");
    				merchIdent.appendChild(doc.createTextNode("1234566778"));
    				header.appendChild(merchIdent);
    				
    			Element messType = doc.createElement("MessageType");
    			messType.appendChild(doc.createTextNode("Inventory"));
    			root.appendChild(messType);
    			
    		} catch (Exception e) {
    			e.printStackTrace();
    		}
    	}
    The method that creates the body of the XML; it loops a number of times depending on how many items I parse out of the original XML:
    Java Code:
    public static void createXMLBody(String itemSku, String itemAvailability) {
    		numOfItems++;
    		
    		Element message = doc.createElement("Message");
    		root.appendChild(message);
    		
    			Element messageId = doc.createElement("MessageID");
    			messageId.appendChild(doc.createTextNode(Integer.toString(numOfItems)));
    			message.appendChild(messageId);
    			
    			Element opType = doc.createElement("OperationType");
    			opType.appendChild(doc.createTextNode("Update"));
    			message.appendChild(opType);
    			
    			Element inventory = doc.createElement("Inventory");
    			message.appendChild(inventory);
    			
    				Element sku = doc.createElement("SKU");
    				sku.appendChild(doc.createTextNode(itemSku));
    				inventory.appendChild(sku);
    				
    				Element quantity = doc.createElement("Quantity");
    				quantity.appendChild(doc.createTextNode(itemAvailability));
    				inventory.appendChild(quantity);
    	}
    And lastly, the method that saves the entire thing to an XML file:
    Java Code:
    public static void saveXML() {
        	String randomName = null;
        	String xmlName = null;
        	String AmzIUTimeStamp = null;
            try {
            	/////////////////
            	//Output the XML
    
            	//set up a transformer
            	TransformerFactory transfac = TransformerFactory.newInstance();
            	Transformer trans = transfac.newTransformer();
            	trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            	trans.setOutputProperty(OutputKeys.INDENT, "yes");
            	//trans.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
            	doc.setXmlStandalone(true);
            	// formulate XML name
            	// HH is the 24-hour clock; hh (12-hour) would repeat the same stamp twice a day
            	DateFormat df = new SimpleDateFormat("yyyyMMdd_HHmmss");
         	    df.setTimeZone(TimeZone.getTimeZone("PST"));
         	    AmzIUTimeStamp = df.format(new Date());
            	randomName = Long.toHexString(Double.doubleToLongBits(Math.random()));
            	xmlName = "AMZ_IU_" + AmzIUTimeStamp + "_" + randomName + ".xml";
            	//write the xml tree to the output file
            	StreamResult result = new StreamResult(new File("C:\\Documents and Settings\\username\\Desktop\\AMZ_TEST_DATA\\outbox\\" + xmlName));
            	DOMSource source = new DOMSource(doc);
            	trans.transform(source, result);
            } catch(Exception e) {
            	e.printStackTrace();
            }
    	}
    Can anyone suggest how to go about this correctly while still keeping my methods split up? Thanks!

  2. #2
    Fubarable is offline Moderator
    Join Date
    Jun 2008
    Posts
    19,316
    Blog Entries
    1
    Rep Power
    26

    Re: Make Transformer Stream file output ???

    Quote Originally Posted by SnakeDoc
    I am currently using DOM to parse a very large XML file (about 25MB). I'm aware that a SAX parser may be better for this since it does not load the entire XML into memory, but this is the way I've built it.

    My problem is that I am outputting a new XML file based on a bunch of logic I've constructed in my program. Currently it seems that my output is built in memory, and only when the program completes is the entire new XML file written out at once. I think it would be faster, and use considerably fewer resources, if I could stream the new XML while the program is running, but I am unsure how to go about this....
    I'm no XML guru, so please take anything I say with a grain of salt, but when you say "large XML" and "desire to stream", I can't help but think of a streaming XML parser such as SAX, as you've mentioned, or perhaps even better, StAX. I'm curious why you must use DOM here and not one of the streaming parsers. A change might kill two birds with one stone.
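
For readers who have not seen StAX, here is a minimal sketch of pull parsing with `XMLStreamReader`. The element name `sku`, the sample input, and the class name `StaxReadSketch` are hypothetical, since the structure of the original 25MB file is not shown in this thread. The point is only that elements are visited one at a time and the full tree is never held in memory.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class StaxReadSketch {
    // Collects the text of every <sku> element ("sku" is a made-up name for this sketch).
    static List<String> readSkus(Reader in) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        List<String> skus = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                // next() advances to the following parse event without building a tree
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "sku".equals(reader.getLocalName())) {
                    skus.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return skus;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><item><sku>A1</sku></item><item><sku>B2</sku></item></items>";
        System.out.println(readSkus(new StringReader(xml)));
    }
}
```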

  3. #3
    SnakeDoc is offline Senior Member

    Re: Make Transformer Stream file output ???

    Hello Fubarable!

    Thanks for the response...

    I would rather stick with DOM for now since I understand it decently, and SAX has a quite different API that is brand new to me. Besides, I have a few hundred lines written using DOM right now and don't really want to start over using SAX, lol. I'm not sure that using DOM is even really the issue here, since I'm trying to stream my output to a file instead of dumping it all at once. That is, as I create each line of my XML, I'd like it to go to the file instead of residing in memory and being written out once the method is complete. I'm using DocumentBuilderFactory and DocumentBuilder to create the file, coupled with Transformer, so I'm not sure whether that changes things.

    Thanks for any suggestions!

  4. #4
    Fubarable is offline Moderator

    Re: Make Transformer Stream file output ???

    Of course SAX is useful for streaming XML parsing, but I'm pretty sure that it is not meant for writing XML to a file. I'm not sure about StAX in this regard. Edit: on review of sources, it does appear that you *can* use StAX to output XML, although I don't think that it formats the output for pretty printing, but there are other utilities for that.
    Last edited by Fubarable; 06-29-2012 at 01:09 AM.
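
As a rough sketch of what StAX output could look like for the envelope in post #1: the class below shares one `XMLStreamWriter` across separate head, body, and finish methods, so each message reaches the target as soon as it is produced. The class and method names (`StaxEnvelopeWriter`, `writeHead`, `writeMessage`, `finish`) are made up for illustration; the element names and values are copied from the original code. The output is not indented, as noted above.

```java
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StaxEnvelopeWriter {
    private static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";
    private final XMLStreamWriter out;
    private int numOfItems = 0;

    StaxEnvelopeWriter(Writer target) throws XMLStreamException {
        out = XMLOutputFactory.newInstance().createXMLStreamWriter(target);
    }

    // Counterpart of createXMLHead(): opens the root and writes the header.
    void writeHead() throws XMLStreamException {
        out.writeStartDocument();
        out.writeStartElement("AmazonEnvelope");
        out.writeNamespace("xsi", XSI);
        out.writeAttribute("xsi", XSI, "noNamespaceSchemaLocation", "amzn-envelope.xsd");
        out.writeStartElement("Header");
        writeSimple("DocumentVersion", "1.01");
        writeSimple("MerchantIdentifier", "1234566778");
        out.writeEndElement(); // </Header>
        writeSimple("MessageType", "Inventory");
    }

    // Counterpart of createXMLBody(): one call per item, flushed immediately.
    void writeMessage(String itemSku, String itemAvailability) throws XMLStreamException {
        numOfItems++;
        out.writeStartElement("Message");
        writeSimple("MessageID", Integer.toString(numOfItems));
        writeSimple("OperationType", "Update");
        out.writeStartElement("Inventory");
        writeSimple("SKU", itemSku);
        writeSimple("Quantity", itemAvailability);
        out.writeEndElement(); // </Inventory>
        out.writeEndElement(); // </Message>
        out.flush();
    }

    // Counterpart of saveXML(): closes the root; the caller still closes the underlying Writer.
    void finish() throws XMLStreamException {
        out.writeEndElement(); // </AmazonEnvelope>
        out.writeEndDocument();
        out.flush();
        out.close();
    }

    private void writeSimple(String name, String text) throws XMLStreamException {
        out.writeStartElement(name);
        out.writeCharacters(text);
        out.writeEndElement();
    }

    public static void main(String[] args) throws Exception {
        StringWriter target = new StringWriter(); // a file Writer would be used in practice
        StaxEnvelopeWriter w = new StaxEnvelopeWriter(target);
        w.writeHead();
        w.writeMessage("ABC-1", "5");
        w.writeMessage("ABC-2", "0");
        w.finish();
        System.out.println(target);
    }
}
```

The methods stay split up as in the original program; they only need access to the same writer instance, and they must be called in document order.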

  5. #5
    SnakeDoc is offline Senior Member

    Re: Make Transformer Stream file output ???

    Hmm... yeah, I'm trying to figure out a way to stream my output to XML more than to change how I'm parsing the original file. I'm using DocumentBuilderFactory, DocumentBuilder, and Transformer to output a "clean" looking XML file based on the logic in my program, but my problem is that it only starts to write the file after it has been fully constructed in memory. Instead, I'd like it to start streaming the output to a file as it's being created, so basically write the file in real time. If I need to rewrite how I'm outputting my file, that's OK, but I'd really like to stay away from rewriting my parsing method because I don't think that has much to do with my output. It's just how I get data from the original file so that I can populate my output file, and since DOM reads the entire original file into memory before I parse it and populate my output, it shouldn't be part of the problem. Maybe I'm wrong, lol... I'm still very much a newbie!

    Thanks again for any advice! :)
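
One possible way to keep the DOM-building style and still write incrementally is to build each `Message` as a small detached DOM fragment and serialize it straight away with a `Transformer` that omits the XML declaration. The sketch below is an assumption about how that could be arranged, not something confirmed in this thread: the class name `FragmentStreamer` and its method names are hypothetical, the root start and end tags are written by hand, and the target `Writer` is assumed to use UTF-8 so that it matches the hand-written declaration.

```java
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class FragmentStreamer {
    private final Writer out;
    private final Document doc;      // used only as a node factory; nothing is appended to it
    private final Transformer trans;
    private int numOfItems = 0;

    FragmentStreamer(Writer out) throws Exception {
        this.out = out;
        this.doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        this.trans = TransformerFactory.newInstance().newTransformer();
        trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes"); // written once by hand
        trans.setOutputProperty(OutputKeys.INDENT, "yes");
    }

    void writeHead() throws Exception {
        out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        out.write("<AmazonEnvelope xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
                + " xsi:noNamespaceSchemaLocation=\"amzn-envelope.xsd\">\n");
        Element header = doc.createElement("Header");
        header.appendChild(textElement("DocumentVersion", "1.01"));
        header.appendChild(textElement("MerchantIdentifier", "1234566778"));
        emit(header);
        emit(textElement("MessageType", "Inventory"));
    }

    void writeMessage(String itemSku, String itemAvailability) throws Exception {
        numOfItems++;
        Element message = doc.createElement("Message");
        message.appendChild(textElement("MessageID", Integer.toString(numOfItems)));
        message.appendChild(textElement("OperationType", "Update"));
        Element inventory = doc.createElement("Inventory");
        inventory.appendChild(textElement("SKU", itemSku));
        inventory.appendChild(textElement("Quantity", itemAvailability));
        message.appendChild(inventory);
        emit(message); // the fragment is unreferenced after this and can be garbage collected
    }

    void finish() throws Exception {
        out.write("</AmazonEnvelope>\n");
        out.flush();
    }

    private Element textElement(String name, String text) {
        Element e = doc.createElement(name);
        e.appendChild(doc.createTextNode(text));
        return e;
    }

    // Serializes one detached fragment to the shared writer.
    private void emit(Element fragment) throws Exception {
        trans.transform(new DOMSource(fragment), new StreamResult(out));
        out.write("\n");
    }

    public static void main(String[] args) throws Exception {
        StringWriter target = new StringWriter(); // a UTF-8 file Writer would be used in practice
        FragmentStreamer fs = new FragmentStreamer(target);
        fs.writeHead();
        fs.writeMessage("ABC-1", "5");
        fs.finish();
        System.out.println(target);
    }
}
```

The trade-off is that the root element is no longer produced by the XML API, so its tags must be kept correct by hand, and the indentation of each fragment starts at column zero rather than nested under the root.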
