
[mkgmap-dev] Problems with latest splitter versions

From Michael Prinzing mipri at gmx.net on Sat Mar 5 00:12:38 GMT 2011

On Thu, 3 Mar 2011 21:16:26 -0600, Scott Crosby wrote:
>On Wed, Mar 2, 2011 at 5:36 PM, Michael Prinzing <mipri at gmx.net> wrote:

>> I am getting the splitter from svn, compiling and running it under
>> Windows XP SP3 using JDK 1.6.0_25. There are 2 problems if I am using a
>> splitter version newer than r161 (tried up to r167 so far).
>>
>>
>> 1.) germany.osm.pbf from geofabrik cannot be split
>> --------------------------------------------------
>>
>
>BTW, there's no need to do two passes. Unlike prior versions, this
>version of the splitter will handle as many areas as you want to throw
>at it. I've done 6000 at once on my 4GB ram desktop.

This is a great improvement, especially on a system with only 1.7GB of
RAM like mine.


>Frantisek Mantlik tracked down and fixed the problem with threads=1.

Yes, it is working fine now, thanks.



>> 2.) File containing contour data cannot be processed
>> ----------------------------------------------------
>>
>> When generating my maps, I usually add data to draw contour lines. The
>> file containing that data is available from
>> http://osm.arndnet.de/contourdata.osm.gz (not my site!). It is in OSM
>> format 0.5.
>>
>
>> When building a map, I am combining this data with the OSM data (please
>> note that the contour data file contains node IDs starting at 2^31, so
>> they collide with the IDs from "real" OSM data if used unchanged). But
>> even when I am trying to split this file without additional OSM data,
>> the same as above happens and the splitter stops at the end of the
>> first and only pass:
>
>This won't work. The splitter is written in Java, where most standard
>Java classes and collections don't support indices that are 64 bits.
>This means that IDs of more than 2**31 are too big and will be treated
>as negative by Java and blow up this code. The current array resize
>logic further limits this to node IDs less than somewhere around
>1,750,000,000, but I just committed a fix to trunk that I think will
>extend this range to 1--1,999,999,999. You could help me find and fix
>any other issues if you could supply a small file with node IDs
>around 1,950,000,000.

Well, there was a typo in the description in my initial post, sorry for
that. The correct version is:

The original file contains IDs starting at 2^30 = 1073741824. Since the
beginning of this year these have been colliding with the IDs OSM is
assigning to new nodes, so I decided to renumber the nodes in my file.
To keep the gap between the OSM IDs and my IDs as big as possible, and
to be able to make the change with a simple search and replace, I
replaced the leading '1' in every node ID with a '2', so my IDs now
start at 2073741824. The file contains about 18,000,000 nodes, so even
the highest ID is less than 2^31. I thought this would be safe, and the
old splitter r161 was able to process the file.
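
Just to spell out the arithmetic behind that (a minimal sketch, not
part of the splitter; the node count is only the rough figure mentioned
above):

public class IdRangeCheck {
    public static void main(String[] args) {
        long oldStart = 1073741824L;  // 2^30, first node ID in the original file
        long newStart = 2073741824L;  // leading '1' replaced by '2', i.e. oldStart + 1e9
        long nodeCount = 18000000L;   // rough number of nodes in the contour file
        long highestId = newStart + nodeCount - 1;

        System.out.println("highest renumbered ID: " + highestId);         // 2091741823
        System.out.println("Integer.MAX_VALUE    : " + Integer.MAX_VALUE); // 2147483647
        System.out.println("still below 2^31 - 1 : " + (highestId <= Integer.MAX_VALUE));
    }
}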

I've now tried again with the new splitter r170 and different node IDs,
and found the following:

If the node IDs start at 2073741824 (= 2^30 + 1e9) there is still an
exception. It happens immediately when the splitter begins to write out
the data. I posted an example yesterday, but with the new splitter
there is a different exception:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index (32402221) is greater than or equal to list size (31250001)
        at it.unimi.dsi.fastutil.objects.ObjectArrayList.get(ObjectArrayList.java:258)
        at uk.me.parabola.splitter.SparseInt2ShortMapInline.put(SparseInt2ShortMapInline.java:128)
        at uk.me.parabola.splitter.SparseInt2ShortMultiMap$Inner.put(SparseInt2ShortMultiMap.java:81)
        at uk.me.parabola.splitter.SparseInt2ShortMultiMap.put(SparseInt2ShortMultiMap.java:31)
        at uk.me.parabola.splitter.SplitProcessor.writeNode(SplitProcessor.java:209)
        at uk.me.parabola.splitter.SplitProcessor.processNode(SplitProcessor.java:118)
        at uk.me.parabola.splitter.OSMParser.endElement(OSMParser.java:243)
        at uk.me.parabola.splitter.AbstractXppParser.parse(AbstractXppParser.java:57)
        at uk.me.parabola.splitter.Main.processMap(Main.java:399)
        at uk.me.parabola.splitter.Main.writeAreas(Main.java:355)
        at uk.me.parabola.splitter.Main.split(Main.java:188)
        at uk.me.parabola.splitter.Main.start(Main.java:116)
        at uk.me.parabola.splitter.Main.main(Main.java:105)
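
Looking at the numbers in that trace, there seems to be a pattern (this
is only my guess from the figures, I have not read the map code):
31,250,001 * 64 is almost exactly 2e9, the upper end of the extended
range, and 32,402,221 * 64 is about 2,073,742,000, i.e. right where my
node IDs start. So the map apparently keeps its values in chunks of 64
IDs and sizes the chunk list for IDs up to roughly 2e9; my IDs land
just beyond that. A small sketch of the arithmetic (the chunk size of
64 is an assumption, not taken from the splitter source):

public class TraceArithmetic {
    public static void main(String[] args) {
        long failingNodeId = 2073742144L; // roughly where the put() blew up; my IDs start at 2073741824
        int assumedChunkSize = 64;        // assumption, guessed from the figures below
        int reportedListSize = 31250001;  // "list size" from the exception message

        long chunkIndex = failingNodeId / assumedChunkSize;           // 32402221, the reported "Index"
        long coveredIds = (long) reportedListSize * assumedChunkSize; // 2000000064, roughly 2e9

        System.out.println("chunk index for my IDs : " + chunkIndex);
        System.out.println("IDs covered by the map : " + coveredIds);
        // chunkIndex >= reportedListSize, hence the IndexOutOfBoundsException
    }
}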


Next I tried node IDs beginning at 1773741824. Again I get an
exception, but this time only after the splitter has written part of
the output, not immediately as above. This time the splitter uses a
huge amount of memory, so the exception is:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at it.unimi.dsi.fastutil.longs.LongArrays.ensureCapacity(LongArrays.java:107)
        at it.unimi.dsi.fastutil.longs.LongArrayList.ensureCapacity(LongArrayList.java:202)
        at it.unimi.dsi.fastutil.longs.LongArrayList.size(LongArrayList.java:271)
        at uk.me.parabola.splitter.SparseInt2ShortMapInline.resizeTo(SparseInt2ShortMapInline.java:97)
        at uk.me.parabola.splitter.SparseInt2ShortMapInline.put(SparseInt2ShortMapInline.java:125)
        at uk.me.parabola.splitter.SparseInt2ShortMultiMap$Inner.put(SparseInt2ShortMultiMap.java:81)
        at uk.me.parabola.splitter.SparseInt2ShortMultiMap$Inner.put(SparseInt2ShortMultiMap.java:79)
        at uk.me.parabola.splitter.SparseInt2ShortMultiMap.put(SparseInt2ShortMultiMap.java:31)
        at uk.me.parabola.splitter.SplitProcessor.writeWay(SplitProcessor.java:231)
        at uk.me.parabola.splitter.SplitProcessor.processWay(SplitProcessor.java:134)
        at uk.me.parabola.splitter.OSMParser.endElement(OSMParser.java:253)
        at uk.me.parabola.splitter.AbstractXppParser.parse(AbstractXppParser.java:57)
        at uk.me.parabola.splitter.Main.processMap(Main.java:399)
        at uk.me.parabola.splitter.Main.writeAreas(Main.java:355)
        at uk.me.parabola.splitter.Main.split(Main.java:188)
        at uk.me.parabola.splitter.Main.start(Main.java:116)
        at uk.me.parabola.splitter.Main.main(Main.java:105)

This looks as if the fix that should extend the possible range for the
IDs to 1,999,999,999 does not work yet. Unfortunately a few nodes and
ways are not sufficient to reproduce this, so I cannot provide a small
piece of data. Of course I could send you the whole file, which is
about 60MB in PBF format.
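
A rough back-of-envelope, using the same chunk-of-64 guess as above:
the trace shows the failing allocation is a long[] being grown in
ensureCapacity. If that array holds one long per 64-ID chunk, covering
IDs up to about 1.77e9 means roughly 27.7 million longs, i.e. a bit
over 200MB, and the resize needs the old and the new array at the same
time while copying. That alone would not explain why 1573741824 still
works, but it shows the resize quickly reaches hundreds of MB once the
IDs approach 2e9 (sketch only, the per-chunk layout is my assumption):

public class MemoryEstimate {
    public static void main(String[] args) {
        long maxNodeId = 1773741824L; // where my IDs start in this test
        int assumedChunkSize = 64;    // same assumption as in the sketch above
        long chunkSlots = maxNodeId / assumedChunkSize; // ~27.7 million
        long backingArrayBytes = chunkSlots * 8L;       // one long per slot
        System.out.println("chunk slots         : " + chunkSlots);
        System.out.println("backing long[] in MB: " + backingArrayBytes / (1024 * 1024));
        // plus a second array of the same size while ensureCapacity copies
    }
}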

If I use IDs beginning at 1573741824, everything works fine (even with
the same memory settings as before).

For now I can assign IDs that the splitter is able to handle, but
sooner or later this will become a problem. If OSM keeps growing as it
has in recent months, it will reach node IDs of 2e9 and above pretty
soon (next year). And the new version of srtm2osm also generates node
IDs from 2e9 upwards to avoid collisions with the OSM data. While
splitter r161 could handle such data, it cannot be processed with the
new splitter versions.
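
To illustrate the point from the quoted explanation above with plain
Java (this is not splitter code): once node IDs go past 2^31 - 1 they
no longer fit into an int, and a narrowing cast turns them into
negative numbers, which is exactly what an int-keyed map would see.

public class IdOverflow {
    public static void main(String[] args) {
        long osmNodeId = 2200000000L; // an ID OSM or srtm2osm may hand out soon (hypothetical value)
        int asInt = (int) osmNodeId;  // narrowing cast, as any int-based index would need
        System.out.println("long ID: " + osmNodeId);
        System.out.println("as int : " + asInt); // prints -2094967296
    }
}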


Michael





