1. Member
Join Date
Oct 2011
Posts
2
Rep Power
0

## Data Sorting Questions

I'm working with list of 100k Doubles with 8 digits to the right of the decimal point. All the numbers are between 0 and 1. My goal is to sort them as quickly as possible. I'm using a modified version of quicksort with insertion sort once the data is sorted to within blocks of 300. I thought I was doing well until I saw other people accomplishing the same task in under 10% of the time it's taking me. Even a standard mergesort (Arrays.sort()) is working somewhere near my time. I just don't see how I'm getting so dominated. Does anyone have any ideas for how to improve my sorting process?

2. Moderator
Join Date
Jul 2010
Location
California
Posts
1,642
Rep Power
6

## Re: Data Sorting Questions

Does anyone have any ideas for how to improve my sorting process?
Without code, not really.

Even a standard mergesort (Arrays.sort()) is working somewhere near my time
Read the API for Arrays...Arrays.sort() is a modified Quicksort. You have provided little information for anyone to analyze your algorithm (I presume you are writing the sort yourself...something you don't explicitly mention) as well as how you are benchmarking the sorts.

3. ## Re: Data Sorting Questions

Originally Posted by Ion4
I'm working with list of 100k Doubles with 8 digits to the right of the decimal point. All the numbers are between 0 and 1. My goal is to sort them as quickly as possible. I'm using a modified version of quicksort with insertion sort once the data is sorted to within blocks of 300. I thought I was doing well until I saw other people accomplishing the same task in under 10% of the time it's taking me. Even a standard mergesort (Arrays.sort()) is working somewhere near my time. I just don't see how I'm getting so dominated. Does anyone have any ideas for how to improve my sorting process?
Can you give some (percentages of) times? I used a heapsort algorithm (see one of my blogs here in the forum) in an interpreted functional language of mine and it was a breeze to sort 100,000 numbers; there must be an error in your sorting method.

kind regards,

Jos

4. Member
Join Date
Oct 2011
Posts
2
Rep Power
0

## Re: Data Sorting Questions

Originally Posted by doWhile
Read the API for Arrays...Arrays.sort() is a modified Quicksort. You have provided little information for anyone to analyze your algorithm (I presume you are writing the sort yourself...something you don't explicitly mention) as well as how you are benchmarking the sorts.
I am writing the sort myself. I guess I assumed Arrays.sort(Double[] a) and Arrays.sort(Object[] a) were the same but after checking the API you're right. One is quicksort and the other is merge sort. I feel like I could sort either as Strings or Doubles. I chose Doubles but I plan to use the same process on strings and see how efficient it is.

Originally Posted by JosAH
Can you give some (percentages of) times? I used a heapsort algorithm (see one of my blogs here in the forum) in an interpreted functional language of mine and it was a breeze to sort 100,000 numbers; there must be an error in your sorting method.

kind regards,

Jos
Running the sort on 100k number a specified number of times (which I'm not told) my result is around 400 milliseconds. Others accomplishing the same sort with correct outputs are sorting in 30-40 milliseconds. My process is an in-place quicksort until blocks are sorted within groups of 300. The quicksort is also using a median of three pivots modification to select a better pivot. Finally I call insertion sort on the entire array at this point since no number can be farther than 300 away from it's correct position. This was the best I could do using a combination of these sorts. I was going to try radix sort but I feel like the high constant here wouldn't make it worth sorting on the order of n. If you can provide any advice, that would be great. I'll give heapsort a try later today.

5. ## Re: Data Sorting Questions

Read my blog; it describes a heapsort implementation in great detail and has all the code in it (it isn't much). The implementation knows (almost) nothing about what it has to sort; that is left to an interface that has to supply the number of things to be sorted; it has to compare two things and it has to swap two things. If that is too slow for you, you can always optimize the interace out of the code (so it can only sort an array of doubles). The sorting algoritm itself is O(n*log(n)), always, so it is theoretically optimal.

kind regards,

Jos
Last edited by JosAH; 10-07-2011 at 08:43 PM.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•