logo separator

[mkgmap-dev] Increasing the performance of whole-planet splits.

From Chris Miller chris_overseas at hotmail.com on Fri Jun 11 01:08:40 BST 2010

Hello Scott,

> I envisioned it as an alternate format that could exist in parallel
> with the osm, but also be smaller and faster. I still need to clean up
> the osmosis patches and turn them into a plugin.

OK sounds like we're both thinking along the same lines then. Note that the 
current cache is basically an alternate format too which is (mostly) interchangable 
with an .osm file. That's why I suggested swapping the current cache with 
your binary format as an option.

> What harder problems do you think the splitter needs solved?

One example is the current handling of ways and relations that cross tile 
boundaries. This is tackled via the overlap handling, which is really just 
a hack to try to hang on to enough information so mkgmap will always be able 
to eg cut ways correctly as they cross the boundaries. For the most part 
it works well, but there are still cases where it isn't enough. Ideally for 
a given tile we'd hang on to ALL the nodes that made up (certain) ways and 
relations, regardless of how far outside the tile the nodes fell. That would 
eliminate some problem cases that mkgmap currently has to guess at fixing, 
or can't deal with at all. Hanging on to all the nodes is hard, because on 
a dataset the size of the planet it's not easy to quickly find all the nodes 
(including their associated metadata) that belong to a given way or relation.

> The reason for this was the costs of initializing were showing up in
> the profile.

Do you mean the cost of initializing the Element instances? If so then yes, 
that's something that I'd caught previously in profiling too, hence why they 
were being reused and reset (prior to r109 that is :).

> May I suggest holding off on this for a bit? The branch that adds
> support for reading the binary file format refactors this interface so
> that node/way/rel objects are passed directly instead of using an
> interface that looks like an XML callbacks.

Sure thing, I'll get the other patches tested and checked in first.

> Good catch. Should I respin the patch or will you?

I've already made the change locally, and given that it's an improvement 
over r109 I'll check it anyway but I think it might be worth revisiting this 
later to see if it's worth recycling/pooling the Element objects somehow. 
The threading & buffering obviously make this trickier and maybe it'll prove 
not to be worth it.

> josm already includes a quadtree implementation. I wish it could be
> disentangled and re-used across the different OSM applications. It'd
> be a perfect solution here.

Not quite perfect I think... a quadtree doesn't allow overlapping rectangles 
whereas an r-tree does, which would allow us to deal with the overlap much 
more efficiently.

> Perfection isn't necessary.
> How about a 64x64 array. Divide the world into 4096 regions. Each
> region contains a list of all areas --- including their overlap ---
> that intersect it. Then, for each node, you only need to iterate the
> OSMWriters in that region. Getting the round-offs right would be
> tricky, but it should be good enough filter out almost all
> non-matching areas.

Agreed, what you describe would still be a good improvement. This is basically 
how the DensityMap works already so that could probably/possibly be tweaked 
and reused. I still prefer the raw performance (and challenge!) of the r-tree 
approach but realistically it'll be some time before I get around to implementing 
that, so would be more than happy to go with the above in the meantime.

> Thanks for a useful program. Glad I could help make it better.

Don't thank me, thank Steve for starting this project in the first place! 
I'm much like you, just got involved because I had an itch to scratch, and 
it looked like an interesting problem :)

Chris






More information about the mkgmap-dev mailing list