Results 1 to 9 of 9
  1. #1
    ennuages is offline Member
    Join Date
    Nov 2010
    Posts
    1
    Rep Power
    0

    Exclamation Picking a Collection and Design for querying 1G Point objects

    Hi everyone, any chance I could get some design tips?

    I have one million Point objects where the x and y values are floats with 2 decimal places. These points exist somewhere between a top left corner of (0,0) and a bottom right corner(400,200).

    When the user specifies a top left and bottom right Point(for a rectangle) (no slanted diaganol lines), I have to return all of my Point objects that are inside of the Rectangle specified by the user - in less than a second.

    I tried making 2 TreeSets -- one ordered by X and one ordered by Y, where they both use the same Point objects, (I'm avoiding instantiating double the points). The 1st treeset is instantiated without problems, but when my application attempts instantiating the 2nd TreeSet, I get an out-of Heap memory exception. If that were to have worked correctly instead, I was planning on using the "subset" method in TreeSet to get all the Point objects in the user's selected X range, .. and then vice-versa on the "Y" treeset for the points matching the user's specified Y rectangle values,.. and then I could take the intersection of those two result lists to arrive at a list with the points inside the user's specified rectangle.

    Tweaking the JVM memory is Not going to be an option.

    Is there a different design I should be using?
    Thanks so much

  2. #2
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    11,450
    Rep Power
    18

    Default

    Quote Originally Posted by ennuages View Post
    Tweaking the JVM memory is Not going to be an option.
    Why not?
    That seems a pretty silly restriction to place on this...

    Have you looked at where all you r memory is going?
    How big the TreeSets actually are?

  3. #3
    masijade is offline Senior Member
    Join Date
    Jun 2008
    Posts
    2,571
    Rep Power
    8

    Default

    Uhm, why not a list in a list (or a 2-dimensional array rather than a 2 dimensional list) with "non-used" points having a null reference? Then simply pull out the indexes that correspond to the x and y coordinates involved?

  4. #4
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    11,450
    Rep Power
    18

    Default

    How would you convert the float coordinates to int indexes?

  5. #5
    masijade is offline Senior Member
    Join Date
    Jun 2008
    Posts
    2,571
    Rep Power
    8

    Default

    Sorry, didn't notice float, but he said, specifically "with two decimal points", so x 100.

  6. #6
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    11,450
    Rep Power
    18

    Default

    Good point.
    I just took in the float aspect and missed the two decimal places bit.
    :)

    I do wonder how big a List that would make, though. If they're already suffering an OOM with two TreeSets containing only the Points that exist, then adding in (albeti null) references for 40000 x 20000...800 million references?

  7. #7
    JosAH's Avatar
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    12,999
    Blog Entries
    7
    Rep Power
    19

    Default

    Maybe the following will fit: for 20000x40000 points (== 8E8 points) a bitset (one bit per point) takes up 1E8 bytes (100 MB). The bit is zero when a point at that coordinate doesn't exist. If the space is (extremely?) sparse this probably won't be a suitable solution.

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  8. #8
    Tolls is offline Moderator
    Join Date
    Apr 2009
    Posts
    11,450
    Rep Power
    18

    Default

    We don't know what the space restrictions are, only that they are sufficiently low to prevent two TreeSets containing the same million objects.
    We know the million objects fit (since they can create one TreeSet), but an additional million references (plus whataver overhead there is for a TreeSet) kills it. So I suspect an additional 100Mb would also kill it.

  9. #9
    Join Date
    Nov 2010
    Posts
    1
    Rep Power
    0

    Default Using an array can be an efficient solution with memory and performance

    See the code below which uses both Treeset and array.

    import java.awt.Point;
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.Date;
    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.Random;
    import java.util.Set;
    import java.util.SortedSet;
    import java.util.StringTokenizer;
    import java.util.TreeSet;

    class ABC {
    public static final int TT_EOL = 10;
    }

    class TestSwitch {
    public static void main(String args[]) {
    System.out.println("Treeset *************");
    treeSetOper();
    System.out.println("Arry *************");
    arrayOper();

    }

    public static void treeSetOper(){


    TreeSet<Point> parray1 = new TreeSet<Point>(new XComparator());
    TreeSet<Point> parray2 = new TreeSet<Point>(new YComparator());

    for(int i = 0 ; i < 1000000 ; i++){
    Random r1 = new Random();
    Point p = new Point(r1.nextInt(40000),r1.nextInt(20000));
    parray1.add(p);
    }

    Iterator<Point> i = parray1.iterator();
    while(i.hasNext()){
    Point p = i.next();
    parray2.add(p);
    }
    Point inputTop = new Point(19980,9980);
    Point inputBottom = new Point(20020,10020);


    long before = System.currentTimeMillis();
    SortedSet<Point> subsetx = parray1.subSet(new Point(19980,0), new Point(20020,20000));
    SortedSet<Point> subsety = parray2.subSet(new Point(0,9980), new Point(40000,10020));
    Set<Point> intersection = new HashSet<Point>(subsetx);
    intersection.retainAll(subsety);
    long after = System.currentTimeMillis();
    System.out.println("Size : " + intersection.size());
    System.out.println("Time : " + (after-before));

    }

    public static void arrayOper(){

    long before = System.currentTimeMillis();
    Point[] parray1 = new Point[1000000];
    //Point[] parray2 = new Point[1000000];

    for(int i = 0 ; i < 1000000 ; i++){
    Random r1 = new Random();
    Point p = new Point(r1.nextInt(40000),r1.nextInt(20000));
    //Point p = new Point(1,1);
    parray1[i]=p;
    //parray2[i]=p;
    }



    //Arrays.sort(parray1,new XComparator());
    //Arrays.sort(parray2,new YComparator());

    Point inputTop = new Point(19980,9980);
    Point inputBottom = new Point(20020,10020);
    ArrayList<Point> intersection = new ArrayList<Point>();
    for(int i = 0 ; i < 1000000 ; i++){
    Point p = parray1[i];
    if((p.x >=19980 && p.x <= 20020) && (p.y >=9980 && p.y <= 10020)){

    intersection.add(p);
    }


    }

    long after = System.currentTimeMillis();
    System.out.println("Size : " + intersection.size());
    System.out.println("Time : " + (after-before));

    }


    }

    class XComparator implements Comparator<Point>{

    public int compare(Point o1, Point o2) {
    int retval = 0;
    int x = o1.x-o2.x;
    if(x < 0){
    return -1;

    }else if(x==0){


    }else if(x>0){
    return +1;
    }

    int y = o1.y-o2.y;
    if(y < 0){
    return -1;

    }else if(y==0){


    }else if(y>0){
    return +1;
    }


    return retval;
    }


    }

    class YComparator implements Comparator<Point>{

    public int compare(Point o1, Point o2) {

    int retval = 0;
    int y = o1.y-o2.y;
    if(y < 0){
    return -1;

    }else if(y==0){


    }else if(y>0){
    return +1;
    }

    int x = o1.x-o2.x;
    if(x < 0){
    return -1;

    }else if(x==0){


    }else if(x>0){
    return +1;
    }

    return retval;
    }


    }

Similar Threads

  1. Picking randomly
    By AbdulAziz Bader in forum New To Java
    Replies: 17
    Last Post: 06-02-2010, 12:39 PM
  2. Help needed with querying collections
    By Jawaharlal in forum Advanced Java
    Replies: 5
    Last Post: 04-28-2010, 04:43 PM
  3. iterating through a collection of objects
    By Scotty Boy in forum New To Java
    Replies: 0
    Last Post: 04-10-2008, 01:28 AM
  4. querying russian data from db problem
    By mr_empty in forum JDBC
    Replies: 0
    Last Post: 03-04-2008, 07:56 AM
  5. 3D Duke picking his nose and contemplating on Ruby
    By Mark in forum Reviews / Advertising
    Replies: 2
    Last Post: 12-28-2007, 03:00 PM

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •