
[mkgmap-dev] Address & city country name assignment.

From Colin Smale colin.smale at xs4all.nl on Wed Feb 16 12:48:52 GMT 2011

If only it were easy to extract all the boundaries as a set of polygons. 
The boundary relations will need to be flattened to simple polygons. It 
might help if we could assume that a given location can only be in one 
polygon at each admin_level. The problem would then reduce to "which 
polygons contain this point", for several thousand polygons, and a bit 
of postprocessing. The polygon list could be published regularly, and an 
optimised lookup algorithm could be implemented and re-used wherever 
it is needed.
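
To make the "which polygons contain this point" step concrete, here is a 
rough sketch in Python. It assumes the boundaries have already been 
flattened to simple rings of (lon, lat) pairs, and that at most one 
polygon per admin_level contains any given point. The names BoundaryIndex, 
add and lookup are made up for illustration and are not part of mkgmap. 
The linear scan with a bounding-box pre-check is only a stand-in for a 
properly optimised lookup (e.g. a spatial index).

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (lon, lat)


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray-casting test against a simple (flattened) polygon ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


class BoundaryIndex:
    """Hypothetical container for the published polygon list."""

    def __init__(self) -> None:
        # Each entry: (name, admin_level, bounding box, ring)
        self._entries = []

    def add(self, name: str, admin_level: int, ring: List[Point]) -> None:
        xs = [p[0] for p in ring]
        ys = [p[1] for p in ring]
        box = (min(xs), min(ys), max(xs), max(ys))
        self._entries.append((name, admin_level, box, ring))

    def lookup(self, pt: Point) -> Dict[int, str]:
        """Return {admin_level: name} for the polygons containing pt."""
        x, y = pt
        result: Dict[int, str] = {}
        for name, level, box, ring in self._entries:
            if level in result:
                continue  # assumption: one polygon per admin_level
            if not (box[0] <= x <= box[2] and box[1] <= y <= box[3]):
                continue  # cheap bounding-box rejection
            if point_in_polygon(pt, ring):
                result[level] = name
        return result
```

With a national polygon at admin_level 2 and a smaller polygon at 
admin_level 6 inside it, lookup() on a point in the smaller one would 
return both names keyed by level, which is the raw material for the 
postprocessing step.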

I once spent a small amount of time looking at this for national 
boundaries and I gave up at that time because it was getting too complex 
for the time I had available. For many countries it was OK, but others 
(e.g. France) have relations and subrelations and possibly 
even subsubrelations with reuse of the boundary ways at different 
levels. The point I am trying to make is that this is a generic problem 
which would benefit from a generic solution.

Another use for this might be for territory defaults - drive-on-left, 
maxspeed on motorways, pedestrians on footpaths, etc.


On 16/02/2011 13:09, Dermot McNally wrote:
> It's amusing and not particularly surprising how, as soon as we have
> searchable maps, we discover the importance of having better
> addressing information about locations. So far a lot of fundamental
> principles have been mentioned:
> * That using is_in information is easy, but not satisfactory, since
> it's often missing, inconsistent, poorly maintained and hard to use to
> infer a hierarchy of belonging (arbitrary bits of streets usually
> don't have it set, so how do you make a best guess of what nearby
> element should "own" it?)
> * That boundary polygons are increasingly present on our map, that
> they can solve most of the problems of is_in, that they are already
> succeeding is_in for other address-sensitive applications in OSM, but
> that they are very hard to process as part of how mkgmap processes the
> map.
> I am convinced that is_in is never going to give us satisfactory
> results, that we cannot trust the values entered in that field by
> mappers and that, the more boundary polygons are used to solve other
> problems, the less is_in will even be maintained. I have not been
> entering is_in in my mapping for at least two years, at most I will
> correct entries by others.
> Mkgmap needs to, at those parts of the process where address hierarchy
> information is currently inferred, be capable of querying an external
> source to find the required information. Because at least some of my
> ideas for a possible source are a little cumbersome, it would probably
> be ideal if a number of options are permitted, rather like how drawing
> the sea is managed. One of the address lookup "plugins" would probably
> be the existing simple one based on is_in, for users who want to avoid
> extra prerequisites.
> So if that's what a simple, poorly-functioning address plugin looks
> like, what would the best one look like? Right now, the ultimate OSM
> geocoder is Nominatim. It is capable of consuming a place name or
> co-ordinate (of a road segment, say) and deducing an address
> hierarchy. It already uses the best clues available to do this -
> including both boundary polygons and is_in tags. And because an entire
> hierarchy is deduced, it offers us the flexibility to index locations
> under more than one hierarchy element, as many commercial Garmin maps
> seem to. For instance, my current location might reasonably be
> searched for under any of the following names in the city field:
> Dublin (city of which my location is a suburb)
> Dublin (historical county where I am located)
> Dublin 15 (postal district)
> Blanchardstown (Historical village and focus of modern suburb)
> and there are even sub-parts of Blanchardstown, typically
> corresponding to old rural "townlands" that might be searched for:
> Corduff, Ongar, Carpenterstown.
> Only the most disciplined maintainer of is_in will capture enough
> information to permit matching on all of these elements and there is
> no way sufficient consistency will exist. So a Nominatim lookup is the
> way to go, as we export all of the problems to an externally
> maintained tool.
> The snag: Even though Mapquest, who currently host the biggest public
> Nominatim instance, are very generous with the level of API lookups
> they allow, there will be trouble if every mkgmap user performs
> thousands of Nominatim lookups when refreshing their Garmin maps. It
> will also be slow and bandwidth-intensive. This can be solved somewhat
> by having one's own instance of Nominatim, possibly containing only an
> interesting subset of the map. It would very likely prove worthwhile
> to define a cache file format into which to stuff those results of the
> query that mkgmap will require.
> If these cache files were maintained by country or bbox, they could be
> calculated centrally by people with sufficient hardware or expertise,
> then made available for download by normal users. This is a lot like
> what Steve suggests above, but without the expectation that mappers
> maintain the address file (because they just plain won't, and the
> required information is already available from Nominatim, so it would
> be a waste anyway).
> I'm interested in your comments on this. While doing what I describe
> certainly requires some hard work, it's all front-loaded; once we can
> find a working framework we never have to worry about it again. Well,
> not until Nominatim is superseded by an even more awesome geocoder.
> Dermot
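
The pluggable lookup with a downloadable cache, as described in the quoted 
message, could be sketched roughly as follows in Python. Everything here 
is an assumption for illustration: no cache file format has been defined, 
so the JSON layout (element id mapped to a list of place names, most 
specific first), the "way/1" style ids and the function names are all 
invented. No Nominatim calls are made; the cache is assumed to have been 
filled from Nominatim results by someone else beforehand.

```python
import json
from typing import Callable, Dict, List, Optional

# A resolver takes an element id and its tags and returns an address
# hierarchy (most specific first), or None if it has nothing to offer.
Resolver = Callable[[str, Dict[str, str]], Optional[List[str]]]


def make_cache_resolver(cache_text: str) -> Resolver:
    """Build a resolver from a precomputed cache (hypothetical JSON format)."""
    cache = json.loads(cache_text)

    def resolve(element_id: str, tags: Dict[str, str]) -> Optional[List[str]]:
        return cache.get(element_id)

    return resolve


def is_in_resolver(element_id: str, tags: Dict[str, str]) -> Optional[List[str]]:
    """Simple fallback: split the is_in tag on commas or semicolons."""
    raw = tags.get("is_in")
    if not raw:
        return None
    parts = [p.strip() for p in raw.replace(";", ",").split(",")]
    parts = [p for p in parts if p]
    return parts or None


def resolve_address(element_id: str, tags: Dict[str, str],
                    resolvers: List[Resolver]) -> List[str]:
    """Try each resolver in order and return the first non-empty hierarchy."""
    for resolver in resolvers:
        result = resolver(element_id, tags)
        if result:
            return result
    return []


# Example cache content, using the place names from the quoted message.
EXAMPLE_CACHE = json.dumps({
    "way/1": ["Corduff", "Blanchardstown", "Dublin 15", "Dublin"],
})
```

Putting the cache resolver first and is_in_resolver last means users who 
download a cache file get the richer hierarchy, while users without one 
still get the existing is_in behaviour.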
