
[mkgmap-dev] splitter - performance and memory

From Chris Miller chris.miller at kbcfp.com on Sun Jul 19 23:28:28 BST 2009

> It all sounds good. Well worth trying each idea out to see if it makes
> an improvement.

OK thanks, I'll go ahead with some of those changes then and post a patch 
or two once I have something useful for you to look at. I've already created 
a SplitIntList to replace the SplitIntMap and am running it now with the 
whole dataset, 5GB heap. Will let you know what happens... So far it's been 
running ~30 mins and appears to be using ~40% of the memory SplitIntMap was, 
which is about in line with what I'd expect. One CPU core is pegged at ~70%, 
I'm assuming due to the XML parsing though I haven't profiled it yet to verify 
that (I'm running against an uncompressed osm file). If that is indeed the 
case, the next thing I'll try is replacing SAX with XPP and see what improvement 
that brings.
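
To illustrate why a list of primitive ints can need far less memory than a 
map, here is a minimal sketch. It is not the actual SplitIntList from the 
patch: the class name ChunkedIntList, the chunk size and the append-only 
design are all assumptions made for the example. Values are stored in 
fixed-size int[] chunks, so each entry costs 4 bytes plus a small per-chunk 
overhead, and growing the list never copies existing data.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical append-only list of primitive ints (illustration only,
 * not the real SplitIntList). Data lives in fixed-size int[] chunks.
 */
class ChunkedIntList {
    // Assumed chunk size: 2^16 = 65,536 ints (256 KB) per chunk.
    private static final int CHUNK_BITS = 16;
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final List<int[]> chunks = new ArrayList<int[]>();
    private long size = 0;

    /** Appends a value, allocating a new chunk when the last one is full. */
    void add(int value) {
        int offset = (int) (size & CHUNK_MASK);
        if (offset == 0) {
            // Either the list is empty or the previous chunk just filled up.
            chunks.add(new int[CHUNK_SIZE]);
        }
        chunks.get(chunks.size() - 1)[offset] = value;
        size++;
    }

    /** Returns the value at the given position (0-based). */
    int get(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        int[] chunk = chunks.get((int) (index >>> CHUNK_BITS));
        return chunk[(int) (index & CHUNK_MASK)];
    }

    /** Number of values stored so far. */
    long size() {
        return size;
    }
}
```

With this layout the position in the list is the key, so it only helps when 
entries arrive in a usable order (for example, in file order) and nothing 
needs to be looked up by an arbitrary id.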

> Although the .get() is not used at the moment I think that counting
> the ways in each area may be more accurate, in which case you will
> need the node-ids again.

I'm at a bit of a disadvantage here as I don't yet really know much about 
the file format (though I'm looking at http://wiki.openstreetmap.org/wiki/OSM_Protocol_Version_0.6 
and the raw XML currently). You're saying that splitting based on the 
number of ways rather than the number of nodes would provide a more balanced 
split? Hmm OK, in which case it will probably make more sense in the long 
run to build an index on disk. I'll hopefully get around to looking at doing 
something like that for the second step anyway and it should be reusable 
for both.

> I also have plans to remove the limitation that an element can only be
> in four areas.

OK - I haven't dug deep enough into the code to understand this limitation 
yet but I don't think any of the changes I'm considering will affect this.

Cheers,
Chris





