native impls of HashMap / HashSet / ArrayList
It was suggested by @jvasileff that the performance of HashMap is limited by the use of our cross-platform Array abstraction. If so, then it should be possible to improve performance on the JVM by creating a native("jvm") implementation of HashMap that uses a Java array directly. We should try that, and see how much of a difference it makes. If it makes a major difference, we would do the same for HashSet and ArrayList.
We should also try creating native("jvm") implementations of these classes which simply wrap Java's HashMap / HashSet / ArrayList, and see how that impacts performance.
See http://www.nurkiewicz.com/2014/04/hashmap-performance-improvements-in.html also, which discusses perf in case of high hash collisions and usage of trees rather than lists of buckets in the case. Perhaps related? No idea.
IMO the first thing to do would be to check the HashMap source code for algo difference before trying to imagine non-algo reasons.
What's the definition of "major"? It seems we can get pretty close to Java's impls without giving up cross platform code.
trees ... Perhaps related?
For the recent work, I'm guessing not, since we're testing with classes that have good hash implementations. But we could find out for sure by seeing if any bin winds up with more than 8 items.
One advantage Java has for Maps (not Sets) is that Map.Entry is an interface, so they can have the Entry be the actual cell instead of having a cell that holds an entry. http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/java/util/HashMap.java#805
Edit: although, perhaps we should store values directly in cells too, and only generate entries if/when we need them?
One advantage Java has for
Maps (notSets) is thatMap.Entryis an interface, so they can have theEntrybe the actual cell instead of having a cell that holds an entry.
What operations does this affect? Iteration of Entrys would be the only thing, no?
I think so.
I'd lean towards storing items directly in Cells, and generating Entrys when needed. I suppose we could also cache Entrys, but not sure that would be better. This would also give us the option of implementing HashSet in terms of HashMap, as the JDK does.
Moving to 1.2.3