  1. #1
    sharath161 (Member, joined Aug 2014)

    Efficient xml parsing for count using Java hashmap?

    Hi all,

    Hope you are all doing well.
    I am fairly new to Java and learning slowly.
    I have a question and hope you will be able to have a look at it.

    I have an XML document like this:

    <ProductList>
    <Product Quality='Good' color='Blue'>
    <Details ItemID='1001'/>
    </Product>
    ...
    ..
    </ProductList>

    The <Product> tags can run into hundreds or thousands.
    Quality can have values like Good, Bad, Damaged. Color can also have various values (Blue, Red, ...).
    ItemID can repeat or can be different.

    So I want to group by ItemID, then by color, and get the Good count, the Bad count and the total count, like this:
    <ProductList>
    <ProductType ItemID="1001">
    <Product_Detail CountGood="1" CountBad="2" CountTotal="3" Type="Blue"/>
    </ProductType>
    ...
    ...
    </ProductList>

    I think using HashMaps (one for ItemID, one for color, one for quality) may do the work. But for this I would have to parse the XML (number of ItemIDs) x (number of colors) x (number of quality values) times.
    Please let me know if you think there is a better-performing algorithm for this in Java.

    Regards,
    Sarat.

  2. #2
    Tolls (Moderator, joined Apr 2009)

    If the output is more XML, personally I would look at an XSLT transformation.
    Please do not ask for code as refusal often offends.

    ** This space for rent **

  3. #3
    gimbal2 (Just a guy, Netherlands, joined Jun 2013)

    XSLT might be an option, although I would be clueless myself how to set it up such that the requirements are met. At least as a learning experience it is still worthwhile to pursue this in Java too.

    First of all: why are you already worrying about performance when you haven't even built it yet? First get it to work and then worry about performance, IF there is a problem you can actually fix. Most of your application's time will probably be spent parsing the horribly verbose XML data, and you can't do much about that.


    In any case you're using Java, so use objects rather than spreading everything over different hashmaps. To collect the required data and emit the XML you are aiming for, you'd need your own class which looks like this:

    ItemDetail
    - itemId
    - goodCount
    - badCount
    - type

    No totalCount property is needed as that is simply goodCount + badCount apparently (then why would you want to emit it in the XML? Whatever logic reads the XML can sum the values too). You could indeed generate a hashmap with the itemId as the key to get easy access to the object as you collect product data.
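
A minimal sketch of that class and of the collecting step, under a few assumptions. The field names follow the list above; the `record` and `totalCount` methods and the `ItemDetailDemo` class are invented for illustration. The map is nested (itemId, then color) because the requested output has one Product_Detail per color within each ItemID. Qualities other than Good and Bad (such as Damaged) are ignored here, which is only a guess at the intended behaviour.

```java
import java.util.Map;
import java.util.TreeMap;

/** Counts for one (itemId, type) pair; "type" is the color attribute. */
class ItemDetail {
    final String itemId;
    final String type;
    int goodCount;
    int badCount;

    ItemDetail(String itemId, String type) {
        this.itemId = itemId;
        this.type = type;
    }

    /** Derived rather than stored, as suggested above. */
    int totalCount() {
        return goodCount + badCount;
    }
}

class ItemDetailDemo {
    // itemId -> color -> counts; TreeMap keeps the iteration order predictable.
    final Map<String, Map<String, ItemDetail>> itemDetails = new TreeMap<>();

    /** Meant to be called once per <Product> element, so the input is read only once. */
    void record(String itemId, String color, String quality) {
        ItemDetail d = itemDetails
                .computeIfAbsent(itemId, k -> new TreeMap<>())
                .computeIfAbsent(color, k -> new ItemDetail(itemId, color));
        if ("Good".equals(quality)) {
            d.goodCount++;
        } else if ("Bad".equals(quality)) {
            d.badCount++;
        }
        // Assumption: any other quality (e.g. Damaged) is not counted.
    }

    public static void main(String[] args) {
        ItemDetailDemo demo = new ItemDetailDemo();
        demo.record("1001", "Blue", "Good");
        demo.record("1001", "Blue", "Bad");
        demo.record("1001", "Blue", "Bad");
        ItemDetail d = demo.itemDetails.get("1001").get("Blue");
        System.out.println(d.goodCount + " " + d.badCount + " " + d.totalCount()); // prints "1 2 3"
    }
}
```

Because every product updates exactly one ItemDetail, the cost is one lookup per product, not one parse per ItemID/color/quality combination.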

    Java Code:
    Map<String, ItemDetail> itemDetails = new HashMap<>();

    Having that, we have to think about how to actually get the data out of the XML. As you say, the XML data can be huge, so it is not a good idea to load the entire thing into memory at once using a DOM parser. That leaves running through the XML in a streaming fashion, which boils down to using SAX or StAX to parse it on the fly. I'd look into StAX; see "Why StAX?" in The Java™ Tutorials (Java API for XML Processing (JAXP) > Streaming API for XML).
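
For what it's worth, a single streaming pass with StAX might look roughly like this. It is only a sketch under assumptions: the class name `StaxProductCounter`, the `count` method and the `int[]` pair standing in for the ItemDetail object are made up for illustration, the element and attribute names are taken from the sample XML in the first post, and qualities other than Good and Bad are simply skipped.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxProductCounter {

    /** Returns itemId -> color -> {goodCount, badCount}, built in one pass. */
    static Map<String, Map<String, int[]>> count(Reader xml) throws XMLStreamException {
        Map<String, Map<String, int[]>> counts = new TreeMap<>();
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        String color = null;   // attributes of the enclosing <Product>
        String quality = null;
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = reader.getLocalName();
                if ("Product".equals(name)) {
                    color = reader.getAttributeValue(null, "color");
                    quality = reader.getAttributeValue(null, "Quality");
                } else if ("Details".equals(name)) {
                    String itemId = reader.getAttributeValue(null, "ItemID");
                    if (itemId == null || color == null) {
                        continue; // assumption: incomplete records are skipped
                    }
                    int[] c = counts.computeIfAbsent(itemId, k -> new TreeMap<>())
                                    .computeIfAbsent(color, k -> new int[2]);
                    if ("Good".equals(quality)) {
                        c[0]++;
                    } else if ("Bad".equals(quality)) {
                        c[1]++;
                    }
                }
            }
        } finally {
            reader.close();
        }
        return counts;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<ProductList>"
                + "<Product Quality='Good' color='Blue'><Details ItemID='1001'/></Product>"
                + "<Product Quality='Bad' color='Blue'><Details ItemID='1001'/></Product>"
                + "<Product Quality='Bad' color='Blue'><Details ItemID='1001'/></Product>"
                + "</ProductList>";
        count(new StringReader(xml)).forEach((itemId, byColor) ->
                byColor.forEach((color, c) -> System.out.println(
                        itemId + " " + color + " good=" + c[0] + " bad=" + c[1])));
    }
}
```

Only the current element is held in memory, so the input size does not matter much; the map grows with the number of distinct ItemID/color pairs, not with the number of products.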
    "Syntactic sugar causes cancer of the semicolon." -- Alan Perlis

  4. #4
    Tolls (Moderator)

    Originally posted by gimbal2:
    XSLT might be an option, although I would be clueless myself how to set it up such that the requirements are met. At least as a learning experience it is still worthwhile to pursue this in Java too.
    It is essentially a DOM parser with a lot of grouping (assuming the IDs aren't in order).
    I have done reporting work like this before, and the transformation is pretty zippy.
    I would (of course) have to actually look up the details, but it's not uncommon.

    The only potential drawback is it isn't scalable, as you get the inevitable DOM explosion. But a few thousand small records like the above? Shouldn't pose a problem.
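
A rough sketch of what such a grouping transformation could look like, run through the JDK's built-in `javax.xml.transform` API. This is an illustration under assumptions, not a tested production stylesheet: it uses XSLT 1.0 with Muenchian grouping (two `xsl:key` declarations), the class name `XsltGroupingSketch` and the key names `byItem` and `byItemColor` are invented, and CountTotal counts every product in the group, including qualities other than Good and Bad.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltGroupingSketch {

    // XSLT 1.0 stylesheet: one key per ItemID, one key per ItemID+color pair.
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
        "<xsl:output method='xml' omit-xml-declaration='yes'/>",
        "<xsl:key name='byItem' match='Product' use='Details/@ItemID'/>",
        "<xsl:key name='byItemColor' match='Product'",
        "         use='concat(Details/@ItemID, \"|\", @color)'/>",
        "<xsl:template match='/ProductList'>",
        "<ProductList>",
        // Outer loop: the first Product of each distinct ItemID.
        "<xsl:for-each select='Product[generate-id() =",
        "    generate-id(key(\"byItem\", Details/@ItemID)[1])]'>",
        "<xsl:variable name='id' select='Details/@ItemID'/>",
        "<ProductType ItemID='{$id}'>",
        // Inner loop: the first Product of each distinct color within that ItemID.
        "<xsl:for-each select='key(\"byItem\", $id)[generate-id() =",
        "    generate-id(key(\"byItemColor\", concat($id, \"|\", @color))[1])]'>",
        "<xsl:variable name='g' select='key(\"byItemColor\", concat($id, \"|\", @color))'/>",
        "<Product_Detail CountGood='{count($g[@Quality=\"Good\"])}'",
        "                CountBad='{count($g[@Quality=\"Bad\"])}'",
        "                CountTotal='{count($g)}' Type='{@color}'/>",
        "</xsl:for-each>",
        "</ProductType>",
        "</xsl:for-each>",
        "</ProductList>",
        "</xsl:template>",
        "</xsl:stylesheet>");

    /** Applies the stylesheet to the given XML text and returns the result as a string. */
    static String transform(String inputXml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<ProductList>"
                + "<Product Quality='Good' color='Blue'><Details ItemID='1001'/></Product>"
                + "<Product Quality='Bad' color='Blue'><Details ItemID='1001'/></Product>"
                + "<Product Quality='Bad' color='Blue'><Details ItemID='1001'/></Product>"
                + "</ProductList>";
        System.out.println(transform(xml));
    }
}
```

The transformer builds the whole input in memory before grouping, which is the scalability caveat mentioned above.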

  5. #5
    gimbal2 (Just a guy)

    Oh oops, I totally misread that. It's hundreds or thousands; I read hundreds OF thousands :/

  6. #6
    Tolls (Moderator)

    Now you've made me think maybe that "or" is a typo...:)
