Results 1 to 8 of 8
  1. #1
    tim's Avatar
    tim
    tim is offline Senior Member
    Join Date
    Dec 2007
    Posts
    435
    Rep Power
    7

    Default Incremental Deletion

    Hi everyone

    I'm looking for any ideas on Incremental Deletion with the min() aggregate. This is not a homework question if you're wondering. ;) Anyways, here it goes:

    Let's say that we have a set of numbers and a data structure is used to store the minimum value such it can be recalculated after incremental deletion. In other words, if I've got one billion values, I don't want to rescan all of them to find the minimum.

    I thought of storing the values in a binary search tree as the set grows. When a value is removed the minimum is just the left most leaf. It will not be O(logn) if it's not balanced however. I'm wondering if anyone has a better idea. ;)

    Thank you :p
    Tim
    Last edited by tim; 05-17-2010 at 03:27 PM.
    Eyes dwelling into the past are blind to what lies in the future. Step carefully.

  2. #2
    JosAH's Avatar
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,528
    Blog Entries
    7
    Rep Power
    20

    Default

    Store them in a SortedMap (a TreeMap) where the key is the number itself and the value is the frequency the number occurrred in your sequence. I wonder if it can store a billion numbers though ...

    kind regards,

    Jos

  3. #3
    tim's Avatar
    tim
    tim is offline Senior Member
    Join Date
    Dec 2007
    Posts
    435
    Rep Power
    7

    Default

    Hi Jos

    So the SortedMap implements a tree structure sorted on keys. And if an element is removed and the frequency is zero, what then? How would you find the new minimum? This kind of problem is in terms of data warehouses, so one billion should be reasonable for a parallel machine. ;)

    Thank you. ;)
    Tim
    Last edited by tim; 05-17-2010 at 02:47 PM.
    Eyes dwelling into the past are blind to what lies in the future. Step carefully.

  4. #4
    tim's Avatar
    tim
    tim is offline Senior Member
    Join Date
    Dec 2007
    Posts
    435
    Rep Power
    7

    Default

    Okay, we can have a frequency value to reduce the tree size. Now there will be a node for each unique value. Nodes can be stored in an AVL tree.

    Wikipedia
    In computer science, an AVL tree is a self-balancing binary search tree
    Looks like an improvement. Any other ideas? ;)
    Eyes dwelling into the past are blind to what lies in the future. Step carefully.

  5. #5
    JosAH's Avatar
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,528
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by tim View Post
    Okay, we can have a frequency value to reduce the tree size. Now there will be a node for each unique value. Nodes can be stored in an AVL tree.

    Wikipedia


    Looks like an improvement. Any other ideas? ;)
    Not needed; those TreeMaps are implemented by red-black trees. Also the API allows an efficient way to get the leftmost node (the smallest element). Those AVL trees are old fashioned ;-)

    kind regards,

    Jos

  6. #6
    tim's Avatar
    tim
    tim is offline Senior Member
    Join Date
    Dec 2007
    Posts
    435
    Rep Power
    7

    Default

    Hi Jos

    Ah, yes. Red-black trees have the balanced height property. A tree seems to be the way to go then. I was hoping for some simple method similar to binning or something, but for min/max problems, the leaves are handy. Keeping a sorted list is out of the question. :D But anyways.

    Thank you Jos
    Tim
    Eyes dwelling into the past are blind to what lies in the future. Step carefully.

  7. #7
    JosAH's Avatar
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,528
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by tim View Post
    Hi Jos

    Ah, yes. Red-black trees have the balanced height property. A tree seems to be the way to go then. I was hoping for some simple method similar to binning or something, but for min/max problems, the leaves are handy. Keeping a sorted list is out of the question. :D But anyways.

    Thank you Jos
    Tim
    You're welcome of course; if you don't mind: why do you need to keep a billion numbers?

    kind regards,

    Jos

  8. #8
    tim's Avatar
    tim
    tim is offline Senior Member
    Join Date
    Dec 2007
    Posts
    435
    Rep Power
    7

    Default

    Hi Jos

    I just thought of some hypothetical scenario. A data cube constructed from 1 TB of data will have large aggregates. I'm not sure if a billion could be in a cell, but the main issue is scalability.

    Thanks for the ideas. ;)
    Tim
    Eyes dwelling into the past are blind to what lies in the future. Step carefully.

Similar Threads

  1. regarding deletion of spoon-feeding post
    By Fubarable in forum Forum Lobby
    Replies: 7
    Last Post: 03-18-2010, 01:46 PM
  2. Replies: 7
    Last Post: 05-06-2009, 04:32 PM
  3. Deletion Servlet
    By Java Tip in forum Java Tip
    Replies: 0
    Last Post: 12-25-2007, 11:23 AM
  4. Massive Hit Deletion Help please
    By quickfingers in forum Java 2D
    Replies: 1
    Last Post: 12-24-2007, 07:49 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •