I'm trying to implement an effcient priority queue in Java with the VanEmdeBoas (VEB) layout (only the data layout, not the data structures). Can anyone help me?

My goal is to aim at cache performance: the memory access patterns should be pretty good, however after several attempts of optimizing my implementations, none has a better cache behavior and better running time than the native Java PriorityQueue, in fact, my implementation if a binary heap is better, which is weird because (I think) VEB should cause less cache misses than binary heaps, since it assumes a blocked nature and should cause more cache alignment.

I don't know if there are many more details I could advance: I'm using an explicit Java array[] to store the nodes.

Thanks in advance.