
[mkgmap-dev] Address search and index.

From WanMil wmgcnfg at web.de on Tue Feb 15 18:16:45 GMT 2011

>> I propose two solutions.
>> 1. Quick fix
>> Use the Locator.xml to merge different notations of the same country /
>> region name. This won't be perfect but will probably fix the most
>> obvious problems for a first release of the index branch.
> This will probably have to be human-driven to have any chance of
> success. But yes, it could be done. It would get rid of the
> inconsistencies, but the problem of working out to which village a given
> street belongs, when there are two villages nearly touching - that will
> still remain.

Yes, of course. It's intended to get the index branch working
without too much effort. It's not the premium solution.
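
To make the quick fix concrete, here is a minimal sketch in Python (not mkgmap code). It assumes a hand-maintained alias table in which every known notation maps to one canonical name. The table entries and the function name canonical_country are made up for illustration and do not reflect the actual Locator.xml format.

```python
# Hypothetical alias table: every known spelling maps to one canonical name.
# The entries are illustrative only and are not taken from Locator.xml.
COUNTRY_ALIASES = {
    "deutschland": "Germany",
    "germany": "Germany",
    "de": "Germany",
    "bundesrepublik deutschland": "Germany",
    "france": "France",
    "fr": "France",
}


def canonical_country(name):
    """Return the canonical name for a country notation.

    Unknown notations are returned unchanged (only trimmed), so nothing is
    lost and a human can review them and extend the table later.
    """
    key = " ".join(name.split()).lower()  # collapse whitespace, ignore case
    return COUNTRY_ALIASES.get(key, name.strip())
```

Returning unknown names unchanged fits the human-driven approach: the table only has to cover the most obvious inconsistencies for a first release.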

>> 2. OSM boundary data (the general solution)
>> It sounds great to use the OSM boundary data but there are some pitfalls
>> we need to work around. I'll list the pitfalls here. Maybe someone can
>> find an easy solution for them.
>> 1st problem: Splitter (as you already mentioned)
>> The tiles do not contain the full information for multipolygons that
>> exceed the tile bounds. I don't think that this will be easily fixable.
>> You would need to implement complete multipolygon handling in splitter
>> to decide which data must additionally be added to a single tile. That's a
>> big deal and will consume lots of resources.
> It may be possible to pre-process the planet file to divide the world
> into (say) 1° by 1° squares and pre-tag each one with any outer
> bounding-polygon information which applies to the whole of that tile.
> When a splitter produces an output file it will know that level of
> information instantly and will only have to work hard to "shrink" any
> bounding-polygon whose border actually crosses the area of interest.
> But we must be doing this (or something like this) already for
> coastlines, yes?
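
The pre-tagging idea quoted above could be sketched roughly as follows (Python, purely illustrative, not splitter or mkgmap code). It assumes each boundary is a single simple ring of (lon, lat) vertices and classifies one grid square as "inside", "crossing" or "outside". The function names are my own, and degenerate cases where the border only touches a square edge or corner are not handled.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test; poly is a list of (x, y) vertices of a simple ring."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def _orient(a, b, c):
    """Sign of the turn a -> b -> c (cross product of ab and ac)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p, q, r, s):
    """True if segments pq and rs properly cross (touching is ignored)."""
    return (_orient(p, q, r) * _orient(p, q, s) < 0
            and _orient(r, s, p) * _orient(r, s, q) < 0)


def classify_cell(lon, lat, poly, size=1.0):
    """Classify the square with lower-left corner (lon, lat) against poly."""
    corners = [(lon, lat), (lon + size, lat),
               (lon + size, lat + size), (lon, lat + size)]
    # A polygon vertex strictly inside the square means the border runs through it.
    for vx, vy in poly:
        if lon < vx < lon + size and lat < vy < lat + size:
            return "crossing"
    # So does any polygon edge that crosses one of the four square sides.
    n = len(poly)
    for i in range(n):
        a, b = poly[i], poly[(i + 1) % n]
        for j in range(4):
            if _segments_cross(a, b, corners[j], corners[(j + 1) % 4]):
                return "crossing"
    # No border in the square: it lies completely inside or completely outside.
    if all(point_in_polygon(cx, cy, poly) for cx, cy in corners):
        return "inside"
    return "outside"
```

Only the "crossing" squares would need the expensive cutting work; "inside" squares can simply be tagged with the boundary's name.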

I like the idea of creating smaller tiles with consistent information. I 
would not implement this in splitter. I have implemented the 
multipolygon algorithms for mkgmap and that was hard work until it 
reached the current quality. Handling boundary data is nothing other than
multipolygon handling. We don't need to reinvent the wheel and can use
mkgmap (at least the mkgmap codebase) to do that.
This is how it could work:
1. Use osmosis to filter the boundary information from a bigger planet file.
2a. Maybe use osmosis again to separate the different administration levels
of the boundary data. I think the boundary file will very quickly become
too large to handle.
2b. Another way to handle bigger boundary files would be to implement
temporary data storage in files. The current problem with mkgmap is that
it must read the entire file (points and ways) before it can start the
multipolygon processing. By swapping point and way information to disc it
might be possible to process bigger files. Mmmh ... sounds like a
database ... maybe a kind of very optimized database ...
3. mkgmap processes the boundary information and cuts that into tiles 
that are stored on disc. These boundary tiles need a format to which new 
data can be added step by step and that can be loaded quickly. I am not
sure if OSM or PBF is a good choice for that.
4. When mkgmap processes the tiles it can load the relevant boundary 
data from disc and assign the information to the tile data. mkgmap 
already has a QuadTree implementation which can be used (with minor
changes) to merge the boundary information with the tile data efficiently.
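
Steps 3 and 4 could look roughly like the following sketch (Python for brevity, not mkgmap code). It assumes one JSON-lines file per 1° tile, which is only one possible answer to the open format question: it can be appended to step by step and loaded per tile. The function names, the file naming and the record layout are hypothetical. The sketch does not cut the geometry; it only buckets each boundary by its bounding box, so the real multipolygon cutting is still missing.

```python
import json
import math
import os


def tile_key(lat, lon, size=1.0):
    """Row/column of the grid tile that contains the given point."""
    return (math.floor(lat / size), math.floor(lon / size))


def tiles_for_bbox(min_lat, min_lon, max_lat, max_lon, size=1.0):
    """All tile keys touched by a bounding box."""
    r0, c0 = tile_key(min_lat, min_lon, size)
    r1, c1 = tile_key(max_lat, max_lon, size)
    return [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


def append_boundary(store_dir, boundary, size=1.0):
    """Append one boundary record to every tile file its bounding box touches.

    `boundary` is a dict with a "points" list of [lat, lon] pairs.
    """
    lats = [p[0] for p in boundary["points"]]
    lons = [p[1] for p in boundary["points"]]
    os.makedirs(store_dir, exist_ok=True)
    for r, c in tiles_for_bbox(min(lats), min(lons), max(lats), max(lons), size):
        path = os.path.join(store_dir, f"{r}_{c}.jsonl")
        with open(path, "a", encoding="utf-8") as f:  # append mode: step by step
            f.write(json.dumps(boundary) + "\n")


def load_boundaries(store_dir, lat, lon, size=1.0):
    """Load only the boundary records stored for the tile containing (lat, lon)."""
    r, c = tile_key(lat, lon, size)
    path = os.path.join(store_dir, f"{r}_{c}.jsonl")
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

When a map tile is processed, only the files for the grid squares it covers need to be read, so memory use depends on the tile and not on the whole boundary file.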

By the way: the coastline file processing has no such mechanism. It
simply loads all coastline data from one file and keeps it in memory.
That's not a good choice for large areas. This could be tuned.

>> 2nd problem: Incomplete data
>> The boundary data has a similar structure to the coastline data. The
>> coastline processing is working now with mkgmap but the failure rate is
>> quite high. A single OSM data error can cause the complete
>> workflow to fail.
> Yeah - this is indeed an issue. But any map is only as good as its data.
> If the data is wrong it must be fixed.


> I would propose a suite of
> sanity-checker programs that should be periodically run over the planet
> file looking for broken polygons.

OK, that's not mkgmap's task. There are already some fine tools like the
WayCheck suite. Maybe they could be tuned to do that. In the end people
will place more value on valid boundary data if mkgmap makes use of it.
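
As an illustration of what such a checker could test (a sketch only, not how WayCheck or any existing tool works): a boundary is assembled from ways, and if the ways form closed rings, every way endpoint must be shared by an even number of way ends. The hypothetical function below reports the node ids where a ring is left open. Passing this test is necessary but not sufficient, since it does not detect self-intersections or wrong inner/outer roles.

```python
from collections import Counter


def open_ring_nodes(ways):
    """Return the sorted node ids at which the boundary is not closed.

    `ways` is a list of node-id lists belonging to one boundary.
    An empty result means all ways join up into closed rings.
    """
    ends = Counter()
    for nodes in ways:
        if len(nodes) < 2:
            continue  # degenerate way, nothing to connect
        if nodes[0] == nodes[-1]:
            continue  # a way that is already a closed ring
        ends[nodes[0]] += 1
        ends[nodes[-1]] += 1
    # An endpoint used an odd number of times marks a gap in a ring.
    return sorted(node for node, count in ends.items() if count % 2 == 1)
```

For example, the ways [1, 2, 3] and [3, 4] leave a gap between nodes 1 and 4, so both ids are reported.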

> It should be easier to maintain than
> the coastline data because the political (or administration??) bounding
> polygons should obey certain rules that could be checked automatically.

Can you give an example? I don't see why it should be easier than 
coastline checking.

>> 3rd problem: Amount of data
>> A solution for pitfall 1 (and 2) could be to provide quality checked
>> extra data containing boundary information only. This is already
>> available for the generate-sea processing. You can provide the coastline
>> data in a separate file. But the amount of data will be VERY high. I
>> don't think that it is a good thing to have minimum memory requirements
>> of some GB.
>> So in the end we would need to throw away the tile concept and implement
>> a database interface for mkgmap.
>> Maybe that's the solution?
> The outer boundary for a land-locked country will consist of a LOT of
> data, agreed. For island nations it will be vastly less because you can
> draw a crude polygon off-shore to encompass all the land area. And you'd
> have to do that, because you want to include all the minor off-shore
> islands. You want an extreme example, look at Greece! But the outer
> polygon for Greece might not be all that detailed yet still do the job
> correctly - except for the northern land-border of course which will be
> crazy.

Sounds easy although I have no idea how to put that into a good 
algorithm that does not exhaust common memory and processor configurations.
* How do you want to detect island nations?

> Steve

Have fun!
