
[mkgmap-dev] mkgmap crashing with non-OSM data

From Gerd Petermann gpetermann_muenchen at hotmail.com on Fri Dec 28 16:57:21 GMT 2018

Hi Patrick,

the TIGER data is full of wrong intervals. Sometimes even numbers are combined with addr:interpolation=odd, sometimes they produce duplicate numbers,
e.g. when one goes from 1500..1599 and another one from 1500..1740.
Those wrong intervals caused a lot of loops and also the error. With r4260 the performance should be better, but I have not yet worked on better detection of wrong data.
If you want to get an impression, I suggest enabling logging
for a single input file like the one in this thread.
See https://wiki.openstreetmap.org/wiki/Mkgmap/dev#Enabling_Debugging for details.
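
The two kinds of wrong intervals described above (a parity mismatch with addr:interpolation=odd/even, and two intervals that produce the same numbers) can be illustrated with a minimal sketch. This is not mkgmap's detection code; the function names and the (start, end, step) tuple layout are assumptions made for illustration only.

```python
def expand(start, end, step):
    """Return the set of house numbers an interval start..end would produce."""
    return set(range(start, end + 1, step))


def check_interval(start, end, interpolation):
    """List parity problems for one interval (hypothetical helper).

    Only addr:interpolation=odd and =even are checked; other values
    are accepted without complaint.
    """
    problems = []
    if interpolation in ("odd", "even"):
        wanted = 1 if interpolation == "odd" else 0
        for number in (start, end):
            if number % 2 != wanted:
                problems.append(
                    f"{number} does not match addr:interpolation={interpolation}"
                )
    return problems


def duplicates(a, b):
    """Return the sorted house numbers produced by both intervals.

    a and b are (start, end, step) tuples; step=1 is assumed here for
    the 1500..1599 / 1500..1740 example from the message.
    """
    return sorted(expand(*a) & expand(*b))


if __name__ == "__main__":
    print(check_interval(1500, 1598, "odd"))
    print(len(duplicates((1500, 1599, 1), (1500, 1740, 1))))
```

With a step of 1, the second call reports that all 100 numbers of the shorter interval are also produced by the longer one.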

You will find messages like these in the log:
INFO: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberGenerator  f:\dwnload\temp\test.osm: conflict caused by addr:interpolation way 107th Street West http://www.openstreetmap.org/way/-1960472598 40274..40382, step=2 and address element 40298(13) at 34.614524,-118.319053
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberGenerator  f:\dwnload\temp\test.osm: addr:interpolation way 107th Street West http://www.openstreetmap.org/way/-1960472598 40274..40382, step=2 is ignored, it produces 1 duplicate number(s) too far from existing nodes
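
To get an overview of how many intervals are affected, messages like these could be summarised with a small script. The following is a rough sketch that assumes the message layout is exactly as in the two lines above; the regular expression and the parse_line helper are hypothetical and not part of mkgmap.

```python
import re

# Matches the "addr:interpolation way <name> <url> <start>..<end>, step=<n>"
# part that both sample messages share. The layout is assumed from the two
# log lines quoted in this message.
INTERVAL_RE = re.compile(
    r"addr:interpolation way (?P<name>.+?) "
    r"https?://\S*?/way/(?P<way_id>-?\d+) "
    r"(?P<start>\d+)\.\.(?P<end>\d+), step=(?P<step>-?\d+)"
)

SAMPLE = (
    r"WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberGenerator  "
    r"f:\dwnload\temp\test.osm: addr:interpolation way 107th Street West "
    r"http://www.openstreetmap.org/way/-1960472598 40274..40382, step=2 "
    r"is ignored, it produces 1 duplicate number(s) too far from existing nodes"
)


def parse_line(line):
    """Return the interval details found in a log line, or None."""
    match = INTERVAL_RE.search(line)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "way_id": int(match.group("way_id")),
        "start": int(match.group("start")),
        "end": int(match.group("end")),
        "step": int(match.group("step")),
        "ignored": "is ignored" in line,
    }


if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Counting the entries with "ignored" set per way id would show which interpolation ways were dropped.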


From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on behalf of Patrick Simmons <linuxrocks123 at netscape.net>
Sent: Friday, 28 December 2018 07:49
To: mkgmap-dev at lists.mkgmap.org.uk
Subject: Re: [mkgmap-dev] mkgmap crashing with non-OSM data


Thanks for getting back to me.  And, btw, I'm very sorry about the quadruple-post.  I wrote and use my own email client and it crashed upon sending my message to the list, and for some reason I'm not getting copies of my own posts, so I thought it hadn't gone through.  Then I checked the archive and ... oops.

Re 1: TIGER is a product of the US federal government, so it is public domain: no license is needed to use it in any way for any purpose.

Re 2: I'd be interested to know what happens in places where the TIGER data conflicts with OSM.  I agree it would suck to erase the contributions of OSM mappers and would like to avoid that if at all possible.  Ideally in the case of a conflict you'd just get two items in the search results very close to each other and could pick the better one.  We can check what's happening pretty easily if you know of a place in the US where OSM has a street address that mkgmap-created maps normally index and that differs from what's in TIGER: just send me the address, and I'll search for it on my device loaded with my shiny new maps and see what I get.

Re performance issue: for the whole US, it was taking about 48 hours using 3 threads on an i5-4460S, and about 3.33GB of RAM per thread.  I had to limit the number of threads used to three instead of four so that it wouldn't overflow the Java heap with -Xmx10000M, which was all the memory I had.  The first time I tried to make the maps (about 2 weeks ago now), I did some rudimentary profiling to make sure it wasn't infinite looping, and I seem to recall the place where it was taking a long time was in ExtNumbers.java in the for loop on lines 1135-1146.
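
The thread limit follows from simple arithmetic on the figures above. As a quick sketch (treating 3.33GB as 3330MB, which is an approximation, and using a hypothetical helper name):

```python
def max_threads(heap_mb, per_thread_mb):
    """Largest number of threads whose combined memory fits in the heap."""
    return heap_mb // per_thread_mb


if __name__ == "__main__":
    heap_mb = 10000        # from -Xmx10000M
    per_thread_mb = 3330   # about 3.33GB per thread, as observed
    print(max_threads(heap_mb, per_thread_mb))  # three threads fit
    print(4 * per_thread_mb)                    # four would need 13320MB
```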

My guess is that the problem is more likely due to the added volume of data than to the mixture of the data.  My script should be generating XML for parallel street address ways that is similar to how street numbers might exist in normal OSM data, but it is generating 50GB of them, uncompressed.  You can download http://moongate.ydns.eu/tiger_versus_python/tiger_all.osc.xz if you'd like to take a look at it, but please wait about 3 hours after I send this email since my computer is currently generating and uploading an updated version of that file.
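
For readers unfamiliar with the kind of data being discussed, here is a rough sketch of how one address interpolation way might be written as OSM XML. It is not the script mentioned above: the function name, the negative ids, the coordinates and the exact set of tags are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def interpolation_way(way_id, first_node, last_node, street, scheme):
    """Build an <osm> element holding one addr:interpolation way.

    first_node and last_node are (node_id, lat, lon, housenumber) tuples.
    scheme is the addr:interpolation value, e.g. "odd" or "even".
    """
    root = ET.Element("osm", version="0.6")
    for node_id, lat, lon, number in (first_node, last_node):
        node = ET.SubElement(
            root, "node", id=str(node_id), lat=f"{lat:.6f}", lon=f"{lon:.6f}"
        )
        ET.SubElement(node, "tag", k="addr:housenumber", v=str(number))
        ET.SubElement(node, "tag", k="addr:street", v=street)
    way = ET.SubElement(root, "way", id=str(way_id))
    for node_id, *_ in (first_node, last_node):
        ET.SubElement(way, "nd", ref=str(node_id))
    ET.SubElement(way, "tag", k="addr:interpolation", v=scheme)
    return root


if __name__ == "__main__":
    # Hypothetical ids and coordinates; negative ids mark non-OSM objects.
    doc = interpolation_way(
        -1,
        (-101, 34.614000, -118.319000, 40274),
        (-102, 34.615000, -118.319000, 40382),
        "107th Street West",
        "even",
    )
    print(ET.tostring(doc, encoding="unicode"))
```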


On Thu, 27 Dec 2018 22:26:19 -0700 (MST), Gerd Petermann <gpetermann_muenchen at hotmail.com> wrote:
> Hi Patrick,
> thanks for reporting, I can reproduce the problem and I'll try to fix it.
> Two remarks:
> 1) Please make sure that the TIGER licence allows this mixing of data
> 2) Please note that TIGER data is not really a good source for addresses and
> the mixture of OSM data and TIGER data is likely to decrease the quality in
> those places where they differ
> The data shows a performance problem in mkgmap (probably caused by this
> mixture): it takes very long to calculate the address data.
> Gerd
> --
> Sent from: http://gis.19327.n8.nabble.com/Mkgmap-Development-f5324443.html
> _______________________________________________
> mkgmap-dev mailing list
> mkgmap-dev at lists.mkgmap.org.uk
> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
MailTask: The Email Manager
GPLv3 software, beta maturity