[mkgmap-dev] New splitter version, big memory savings

From Felix Hartmann extremecarver at googlemail.com on Thu Sep 3 13:49:39 BST 2009

Really great!

What we would need now is a possibility to split by country, e.g. taking 
europe.osm.bz2 and splitting it into all the major states. This would 
avoid having to use the tiles from Geofabrik, which cannot be merged 
without breaking routing at the borders.

Has anyone any idea how we could do that?

This matters not only for OSM data but also for SRTM (.hgt) files, or 
.hgt files converted to OSM with srtm2osm.

Osmosis is not really the right tool for that (it breaks routing at the 
tile border), but it can use bounding polygons to cut out pieces.
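
To make the bounding-polygon idea concrete, here is a minimal Python 
sketch of the underlying test: keep only the nodes that fall inside a 
country outline. This is not Osmosis code. The function names and the 
(id, lon, lat) node layout are made up for illustration, and it only 
filters nodes, so it does nothing about ways that cross the border, 
which is exactly where the routing breaks.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray starting at the point.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def filter_nodes(nodes, polygon):
    """Keep the (id, lon, lat) nodes that lie inside the polygon."""
    return [node for node in nodes if point_in_polygon(node[1], node[2], polygon)]
```

Usage would be something like filter_nodes(nodes, outline), where 
outline is the list of (lon, lat) vertices read from a polygon file.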

Chris Miller wrote:
> I've just checked in a new version of the splitter (r84) that requires far 
> less memory and also performs slightly better during the first stage of the 
> split. As an example, it used to take about a 5GB heap to generate areas.list 
> for the whole planet, but now only takes around 300MB(!). An additional advantage 
> is that as the planet grows in size and complexity going forwards, the memory 
> required during the first stage will not increase.
>
> This change should finally mean that anyone is able to split the planet even 
> on a relatively low end machine (though be prepared for a long wait!). If 
> you try but run out of memory during the second stage of the split (i.e. after 
> areas.list has been generated), reduce the value of --max-areas. This will 
> reduce the memory required during that second stage, at the cost of additional 
> passes over the data (if you do require multiple passes then I highly recommend 
> you also use the --cache option, it can make a huge difference to performance).
>
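
As a rough illustration of that trade-off, and assuming each pass over 
the data handles at most --max-areas tiles, the number of passes can be 
estimated as below. The function name and the numbers in the usage note 
are made up for illustration, not taken from the splitter.

```python
import math


def passes_needed(num_areas, max_areas):
    """Estimate passes over the input if each pass writes at most max_areas tiles."""
    if max_areas < 1:
        raise ValueError("max_areas must be at least 1")
    return math.ceil(num_areas / max_areas)
```

For example, passes_needed(700, 255) gives 3, while halving the limit 
roughly doubles the number of passes.
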
> One possible downside to this new version is that the algorithm that decides how 
> to split up the map has been changed somewhat. This results in slightly different 
> tile layouts compared to the old algorithm. This shouldn't be too noticeable 
> unless perhaps it puts a tile boundary right through somewhere you care about 
> when the old algorithm didn't. Of course, it may also work in your favour 
> for the same reason! My experiments have shown the number of tiles generated 
> increases by about 2% with the new approach which I think is a small price 
> to pay for the huge memory saving.
>
> I've put some example kml files online here that show the before and after 
> effects of the change:
>
> UK  --max-nodes=1600000
> old splitter: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Fuk-original.kml&z=3
> new splitter: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Fuk-density.kml&z=3
>
> Europe  --max-nodes=1600000
> old: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Feurope-original.kml&z=3
> new: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Feurope-density.kml&z=3
>
> Europe  --max-nodes=300000
> old: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Feurope-300k-original.kml&z=3
> new: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Feurope-300k-density.kml&z=3
>
> Planet  --max-nodes=1600000
> old: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Fplanet-original.kml&z=3
> new: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Fplanet-density.kml&z=3
>
>
> If you aren't happy with the new tiles, or simply want to compare the new 
> with the old, there's a parameter --legacy-mode=true that will generate areas.list 
> using the old approach. Assuming there aren't any serious problems that come 
> to light I'll be removing that parameter in a future build.
>
>
> Where to from here? As some of you may have guessed, this new splitter is 
> based on a 'density map' as discussed in earlier mails. Currently it maps 
> the density of nodes only, and the generated map is only held in memory long 
> enough to calculate the tile boundaries. I intend to write this map out to 
> disk so it can be used by external tools, or reused on successive runs of 
> the splitter to allow extremely quick generation of areas.list with different 
> --max-nodes settings. After that I hope to tackle the quite difficult problem 
> (in terms of performance and memory overhead) of generating density maps 
> for ways and relations too. Once we have density maps for all three element 
> types it will hopefully be possible to generate tiles that are as big as 
> possible but still avoid giving 'Map too big' messages. Another related area 
> I'm starting to look at is new/improved algorithms for arranging the tiles 
> so they e.g. avoid putting boundaries through the middle of cities, or reduce 
> the overall number of tiles. Any ideas here would be appreciated.
>
> Enjoy!
> Chris
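
For anyone who wants to play with the density-map idea from the quoted 
mail, here is a minimal Python sketch. It is not the splitter's actual 
code: the grid cell size, the function names and the "cut the longer 
side where the node counts balance best" rule are all assumptions made 
for illustration.

```python
from collections import Counter


def build_density_map(nodes, cell_size):
    """Count nodes per grid cell; nodes is an iterable of (lon, lat)."""
    density = Counter()
    for lon, lat in nodes:
        density[(int(lon // cell_size), int(lat // cell_size))] += 1
    return density


def count(density, x0, y0, x1, y1):
    """Number of nodes in the cell range [x0, x1) x [y0, y1)."""
    return sum(c for (x, y), c in density.items()
               if x0 <= x < x1 and y0 <= y < y1)


def split(density, x0, y0, x1, y1, max_nodes):
    """Recursively cut an area into tiles holding at most max_nodes nodes.

    A single grid cell is never cut further, so a very dense cell can
    still exceed max_nodes. Empty areas produce no tile.
    """
    total = count(density, x0, y0, x1, y1)
    width, height = x1 - x0, y1 - y0
    if total <= max_nodes or (width == 1 and height == 1):
        return [(x0, y0, x1, y1)] if total > 0 else []
    if width >= height:
        # Cut vertically where the two halves are closest to equal.
        cut = min(range(x0 + 1, x1),
                  key=lambda c: abs(2 * count(density, x0, y0, c, y1) - total))
        return (split(density, x0, y0, cut, y1, max_nodes)
                + split(density, cut, y0, x1, y1, max_nodes))
    # Otherwise cut horizontally.
    cut = min(range(y0 + 1, y1),
              key=lambda c: abs(2 * count(density, x0, y0, x1, c) - total))
    return (split(density, x0, y0, x1, cut, max_nodes)
            + split(density, x0, cut, x1, y1, max_nodes))
```

Since the density map does not depend on max_nodes, it can be built 
once and passed to split() with different limits, which is the kind of 
reuse Chris describes for successive runs.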


