Thread: toHexString optimization (in fact general optimization question)

1. Member
Join Date
Dec 2008
Posts
3
Rep Power
0


Hello,

I have to handle a situation where we export millions of rows after converting them according to user preferences.

Currently I'm working on a hexadecimal conversion that I want to optimize:

Java Code:
String result = Constants.HEX_PREFIX + Integer.toHexString(aInt);
result = result.toUpperCase();
I've looked at how the toHexString method is coded in the JDK (rt.jar, or src.zip):

Java Code:
final static char[] digits = {
    '0' , '1' , '2' , '3' , '4' , '5' , '6' , '7' , '8' , '9' , 'a' , 'b' ,
    'c' , 'd' , 'e' , 'f' , 'g' , 'h' , 'i' , 'j' , 'k' , 'l' , 'm' , 'n' ,
    'o' , 'p' , 'q' , 'r' , 's' , 't' , 'u' , 'v' , 'w' , 'x' , 'y' , 'z'
};

public static String toHexString(int i) {
    return toUnsignedString(i, 4);
}

private static String toUnsignedString(int i, int shift) {
    char[] buf = new char[32];
    int charPos = 32;
    int radix = 1 << shift;
    int mask = radix - 1;
    do {
        buf[--charPos] = digits[i & mask];
        i >>>= shift;
    } while (i != 0);

    return new String(buf, charPos, (32 - charPos));
}
My intent is to copy and modify this code according to our special conversion needs. The method would look like this:

Java Code:
// Upper-case hex digits; only the first 16 entries are ever indexed.
private static final char[] DIGITS = {'0', '1', '2', '3', '4', '5', '6', '7', '8',
    '9', 'A', 'B', 'C', 'D', 'E', 'F', 'g', 'h', 'i', 'j', 'k', 'l',
    'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'};

private static String toHexString(int i)
{
    char[] buf = new char[32];
    int charPos = 32;
    // Fill the buffer from the right, one hex digit (4 bits) at a time.
    do
    {
        buf[--charPos] = DIGITS[i & 0xF];
        i >>>= 4;
    }
    while (i != 0);

    return new String(buf, charPos, (32 - charPos));
}
The reason for this post is that I fear the JVM could optimize toHexString/toUnsignedString with a native method implementation (because doing this kind of conversion in "pure" Java code seems inefficient).

Is it possible that a JVM implementation could do that?

If that's the case, replacing toHexString() with a custom method would in fact slow the process down...

What is your opinion?

Thanks & regards.

PS: We are using the Oracle JRockit JVM, but I've looked in their rt.jar also, and it is the exact same method.

PS2: I know there are other possible optimizations, like adding the HEX_PREFIX directly in our toHexString, and using the char[] directly or passing a StringBuilder, but that isn't really the concern of this post.
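For readers curious about the StringBuilder variant mentioned in PS2, here is a minimal sketch of what it could look like. The class name HexAppender and the method appendHex are hypothetical, and the prefix is assumed to be "0x" because the thread never shows the value of Constants.HEX_PREFIX.

```java
// Hypothetical helper, not part of the original post.
final class HexAppender {
    // Assumption: the real Constants.HEX_PREFIX is not shown in the thread.
    private static final String HEX_PREFIX = "0x";
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

    private HexAppender() {}

    /** Appends the prefix and the unsigned upper-case hex digits of value to out. */
    static StringBuilder appendHex(StringBuilder out, int value) {
        out.append(HEX_PREFIX);
        // Number of hex digits needed: at least 1, at most 8.
        int digits = Math.max(1, (32 - Integer.numberOfLeadingZeros(value) + 3) / 4);
        // Emit digits from the most significant nibble down to the least.
        for (int shift = (digits - 1) * 4; shift >= 0; shift -= 4) {
            out.append(DIGITS[(value >>> shift) & 0xF]);
        }
        return out;
    }

    public static void main(String[] args) {
        StringBuilder row = new StringBuilder();
        appendHex(row, 255).append(';');
        appendHex(row, -1);
        System.out.println(row); // prints 0xFF;0xFFFFFFFF
    }
}
```

Writing straight into a shared StringBuilder avoids the intermediate char[] and String per value, which is where most of the saving would come from if there is any.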

2. In my opinion, there is no telling what the JVM will optimize away. Can you write this as a call to native code and find some asm to do it? I do not think the JVM will tinker with mov eax,val; and so on.

In general, and in my experience, the only thing to do is write a test harness as a module and unit test a very small portion of the code, hopefully on a crash box that is isolated from production platforms.

If I can figure out how to do JNI, I would do some testing for you. I have to figure out how to do CLI compiles anyway.

3. Member
Join Date
Dec 2008
Posts
3
Rep Power
0
Hmmm ...

In fact, I don't want to optimize the code using assembly myself.

I just thought that the JVM (JRockit) "could" do it in some situations, using some internal native method. In that case, it would be more efficient to use the original Integer.toHexString() (native/asm optimized transparently by the JVM) rather than my own custom optimized, but still Java, code...

As for the tests I could run, I don't think they can prove anything, even by calling the code millions of times, as I know the JVM takes some time to learn and decide on its optimizations in server mode.
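The warm-up concern can be reduced, though not removed, by running both versions for a while before timing them. Below is a rough sketch under that assumption; the class HexBench and its methods are hypothetical names, the prefix is left out so that only the conversion is compared, and a dedicated harness such as JMH would give more trustworthy numbers than a hand-rolled loop like this.

```java
import java.util.function.IntFunction;

// Hypothetical micro-benchmark sketch, not a rigorous measurement.
final class HexBench {
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();
    // Results are accumulated here so the JIT cannot discard the work.
    static long sink;

    /** Library version from the first post, without the prefix. */
    static String library(int i) {
        return Integer.toHexString(i).toUpperCase();
    }

    /** Custom version from the first post, with a buffer sized for 8 hex digits. */
    static String custom(int i) {
        char[] buf = new char[8];
        int charPos = 8;
        do {
            buf[--charPos] = DIGITS[i & 0xF];
            i >>>= 4;
        } while (i != 0);
        return new String(buf, charPos, 8 - charPos);
    }

    /** Converts count values and returns the total length of the produced strings. */
    static long run(IntFunction<String> fn, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += fn.apply(i * 31).length();
        }
        return total;
    }

    static long timeMillis(IntFunction<String> fn, int count) {
        long start = System.nanoTime();
        sink += run(fn, count);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int count = 1_000_000;
        // Warm-up rounds: give the JIT a chance to compile both paths first.
        for (int round = 0; round < 5; round++) {
            sink += run(HexBench::library, count);
            sink += run(HexBench::custom, count);
        }
        // Measured rounds: look at the spread, not at a single number.
        for (int round = 0; round < 5; round++) {
            System.out.println("library: " + timeMillis(HexBench::library, count)
                    + " ms, custom: " + timeMillis(HexBench::custom, count) + " ms");
        }
    }
}
```

The absolute figures depend on the JVM, its flags, and the machine, so only the relative difference across several rounds is worth reading.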

4. Why are you concerned about optimizing it?

Originally Posted by Donald Knuth
Premature optimization is the root of all evil.
I have a hard time imagining any use case where the time taken to convert to hex is significant relative to other tasks, such as talking to a database.

5. Senior Member
Join Date
Nov 2008
Posts
286
Rep Power
10
Firstly, to allay your fears: no, the JVM won't replace non-native methods with its own secret code. What it will do, of course, is compile Java code to native code as it sees fit. (I'm not too familiar with JRockit, but it probably has tuning parameters for, e.g., how many executions of a method need to occur before it's compiled. I would have thought that "millions of times" would be enough.) JVMs can replace your code with their own "built-in" versions in the case of certain native methods that are part of the JDK. For example, ByteBuffer get() and put() operations can be resolved to single machine instructions, as can various methods in Math (Sun's JVM performs these optimisations; I don't know about JRockit).

Now, as long as the code is running natively, you may have trouble actually measuring the CPU difference between the library version and your "optimised" version. As @fishtoprecords says, if in your code path you do anything other than "burn CPU", then the cost of generating a hex string (or several million hex strings) will probably pale into insignificance.

What your optimisation will do is lower object throughput: you generate fewer objects per hex string. But again, modern JVMs can generally handle an object throughput of millions and millions and millions of objects per second if they're simple things like strings.

If you profile your code and find that the hex conversion and other CPU-intensive tasks really ARE a bottleneck, then you might want to think about "optimising" in other ways, such as parallelising. But I'd wait to see the profile before hashing up too much of your code...
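As an illustration of the parallelising idea only, here is a small sketch using Java 8 parallel streams, which did not exist when this thread was written. The class ParallelHex and the method toHexAll are hypothetical names, and whether this helps at all depends on the batch size and on how many cores are free in a multi-user system.

```java
import java.util.Arrays;

// Hypothetical sketch of converting a batch of values in parallel.
final class ParallelHex {
    private ParallelHex() {}

    /** Converts every value to upper-case hex, keeping the input order. */
    static String[] toHexAll(int[] values) {
        return Arrays.stream(values)
                .parallel()
                .mapToObj(v -> Integer.toHexString(v).toUpperCase())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        int[] values = {10, 255, -1};
        System.out.println(Arrays.toString(toHexAll(values))); // prints [A, FF, FFFFFFFF]
    }
}
```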

6. low mtbf v greymeat between the human ears

Yep, it's like ftr and Neil state. I did not go into that (I was sure those issues would be reviewed), but what I see here on a deeper level is going to remarkable effort to get something to "full-tilt-boogie" when those who use the product are neither concerned with the matter nor will they allow your technical skills and deep tweaking to run. Surprisingly, even if it ran, ran well in testing and deployment, and proved robust under load with 100,000 hours of 000.00000000007654 statistical failure significance (and recoverable failure at that), they would still base decisions on matters more appropriate to Lorimar's production of "Dallas". I know, I have been there: a person who worked with K&R was going to get me a stint as an advisor on equipment selection at a major carrier, in Dallas no less. I tried to warn her; she had a Masters in CS but was not a Road Dog.

She took a 5-10 hour sweat your dress shirts out butt chewin.

This has got to be the most bizarre problem I know of in CS. I am throwing AI at it.

I have zero fantasies that that will be of any use.....

7. Member
Join Date
Dec 2008
Posts
3
Rep Power
0
Actually, I didn't measure the CPU usage directly. Instead, I export 1 million rows and compare the time (ms) taken to do that.

For now it's only research and development, not production, and only a draft, so of course these measurements are approximate, but I run the test a few times before concluding anything.

Of course we do a lot of other operations (multi-user system) that are very costly, but the export seems to take precedence over everything else, so I guess the less time it takes to finish, the sooner the other users can work.

Premature optimization is the root of all evil.
I agree. Usually I don't bother doing too much optimization (well-written code should be enough), but in this case there was a real problem, and I had to rewrite all the code. We do a lot of data conversion (int to formatted date/time, int to locale-formatted integer, telecom-specific values to convert, enums to translate, ...) using user preferences, and by rewriting the code I went from 1000 rows/second to 20000 rows/second.

With millions of rows to export, this improvement is good news...

But I must agree that most of the improvement seems to have come from fixing real burdens in the process (like reading the prefs for each row) rather than from tweaking toHexString. These last few days I've been completely obsessed with "optimization", and I hoped to gain a little more processing time with some final tips...

Thanks
