
[mkgmap-dev] Memory limits for mkgmap and splitter

From Chris Miller chris.miller at kbcfp.com on Tue Aug 4 15:58:26 BST 2009

> What currently happens when not enough memory is available is of course
> that the heap is getting swapped in and out to disk by the OS's memory
> manager. This is so slow that it's not workable. Do you think that
> doing this intelligently in Splitter will be much faster? I.e. would
> it actually be of any real use to switch to temporary disk storage for
> multiple gigabytes of data?

Definitely. Having knowledge of the algorithm in use and having control over 
exactly what is written out and when will make a huge difference since care 
can be taken to keep disk access to a minimum. It's still going to come at
a cost, but it should be much less than letting the OS try to guess what
the best thing to do is.

> I assume the actual splitting uses the most memory of the two stages?

Yes, generating areas.list takes less than half the memory of the splitting 
stage (with the latest code). It's tricky to reduce the second stage memory 
usage much further, so swapping some information to disk is one of the few 
alternatives. Multiple parsing passes could be made over the planet file
instead; however, I can't see that offering very good performance.

> Ah, this might explain the 'POI only' tiles I have without good reason
> in the NE USA and Canada. Is there a possibility for a quick fix for
> this behavior, as this would be most welcome...? :-)

Not really... currently the limitation is that each node, way and relation
can only belong to a maximum of 4 areas. This is because during the split
each area is given an ID from 1-255 (8 bits) and up to 4 area IDs are squeezed
into a single 32-bit integer for each node/way/relation. This is really
important to save memory for the nodes and ways, so supporting more than 255
areas would require a lot more memory. With more than 255 areas, multiple
areas map to the same 8 bits and so get mixed up with each other without
warning - there's no bounds checking on the bit manipulation. About the only
'quick fix' is to refuse to split more than 255 areas, or to split them in
multiple full passes, which will take significantly longer. In the meantime
your best bet is to split your areas.list file into two by hand, making sure
there are fewer than 256 areas in each. Then run the splitter twice, once
for each area file you created.
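
To make the packing concrete, here is a minimal sketch of four 8-bit area
IDs sharing one 32-bit int, and of how an ID above 255 silently aliases a
lower one when nothing checks the range. The class and method names
(AreaPacking, pack, slot) are made up for illustration and are not taken
from the splitter source; the slot order is also an assumption.

```java
final class AreaPacking {
    // Hypothetical helper: packs up to four area IDs into one int, 8 bits
    // each. Slot 0 is the lowest byte. Deliberately has no range check,
    // to mirror the unchecked bit manipulation described above.
    static int pack(int... areaIds) {
        int packed = 0;
        for (int i = 0; i < areaIds.length && i < 4; i++) {
            packed |= (areaIds[i] & 0xFF) << (8 * i);
        }
        return packed;
    }

    // Reads the 8-bit area ID stored in the given slot (0-3).
    static int slot(int packed, int index) {
        return (packed >>> (8 * index)) & 0xFF;
    }

    public static void main(String[] args) {
        int packed = pack(3, 17, 200, 255);
        System.out.println(slot(packed, 0)); // 3
        System.out.println(slot(packed, 3)); // 255

        // Area 259 does not fit in 8 bits: 259 & 0xFF == 3, so it becomes
        // indistinguishable from area 3.
        System.out.println(slot(pack(259), 0)); // 3
    }
}
```

The last line shows the mix-up: once the 259th area is truncated to 8 bits
it reads back as area 3, with no error raised anywhere.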

Currently there's another limitation in that a relation can only appear in
a maximum of 4 areas - hence all the "Relation 123 in too many areas" messages
you probably see when splitting. The result of this is that the relation
only gets written to the first 4 areas it encounters and won't appear in
any additional ones. I think I can fix this one fairly easily; I'll take
a look in the next few days.
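
As a rough illustration of the "first 4 areas" behaviour, the sketch below
fills the four 8-bit slots in order and reports when a fifth area turns up.
PackedAreas and its methods are hypothetical names, and using 0 as the
"empty slot" marker is an assumption; the real splitter code may well be
structured differently.

```java
final class PackedAreas {
    private int packed; // four 8-bit slots; 0 is assumed to mean "empty"

    // Tries to record areaId (1-255). Returns false when all four slots
    // are already taken by other areas.
    boolean add(int areaId) {
        if (areaId < 1 || areaId > 255) {
            throw new IllegalArgumentException("area ID out of range: " + areaId);
        }
        for (int i = 0; i < 4; i++) {
            int shift = 8 * i;
            int current = (packed >>> shift) & 0xFF;
            if (current == areaId) {
                return true; // already recorded
            }
            if (current == 0) {
                packed |= areaId << shift; // claim the first free slot
                return true;
            }
        }
        return false;
    }

    // Number of slots currently in use.
    int count() {
        int n = 0;
        for (int i = 0; i < 4; i++) {
            if (((packed >>> (8 * i)) & 0xFF) != 0) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        PackedAreas relation123 = new PackedAreas();
        for (int areaId = 1; areaId <= 5; areaId++) {
            if (!relation123.add(areaId)) {
                System.out.println("Relation 123 in too many areas");
            }
        }
        System.out.println(relation123.count()); // 4
    }
}
```

Here the fifth area is simply dropped, which matches the symptom described:
the relation ends up in the first four areas only.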

Chris





