
[mkgmap-dev] splitter that generates problem list

From GerdP gpetermann_muenchen at hotmail.com on Sat Nov 3 20:35:19 GMT 2012

Hi all,

I am about to finish coding the new splitter algorithm.
I am quite happy with the results now, so I'd like to ask you to test
it.
Activate it with the parameter
--keep-complete

I've also changed the handling of the user-written problem file, so please
check this as well.

I've committed the changes as r212 on the problem-list branch.
When you test the --keep-complete parameter, please also try
--overlap=0.
Be careful when you use large input files like planet or europe on a 32-bit
machine!
Even if you use a split-file with only a few small tiles, the new algorithm
has to store a few bits for every node and way. I suggest using osmconvert
with the -b or -B parameter first to cut out a reasonable part of the planet.

Regarding performance:
Runtime is much higher for large input files because r212 with
--keep-complete has to read the input at least two times more often than
r202. I tried to keep heap requirements small, but you should expect
problems processing files > 1.5 GB on a 32-bit system.
With --overlap=0 and --keep-complete the size of the output files is almost
the same. 

Any comments are welcome.

Gerd 
P.S.
For those that are interested:
The problem list generator works like this (well, I hope it does):
- calculate the bbox of the used areas
- calculate additional "pseudo-areas" to cover the whole planet (these can
be very large)
- for each node: calculate the area that contains it and save it in a map
- for each way: get the area of each point, build the list of all areas
used by the way and save it in a map
- for each relation: get the areas of each member
Each relation or way that is written to more than one area is a potential
problem case, but only if at least one of the areas is a real writer area
(and not a pseudo-area) or if the bounding box of the areas intersects with
at least one real area.
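
The coarse pass described above could be sketched roughly as follows. This is
not the code from r212, only an illustration under my own assumptions: the
class ProblemListSketch, the Area type and all method names are hypothetical,
areas are treated as half-open rectangles in degrees, and relations are left
out because they would use the same test on the areas of their members.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the coarse problem-list pass; not the splitter's code. */
final class ProblemListSketch {
    /** Rectangle in degrees; real == false marks a pseudo-area. */
    static final class Area {
        final double minLat, minLon, maxLat, maxLon;
        final boolean real;

        Area(double minLat, double minLon, double maxLat, double maxLon, boolean real) {
            this.minLat = minLat; this.minLon = minLon;
            this.maxLat = maxLat; this.maxLon = maxLon;
            this.real = real;
        }

        /** Half-open test so that a node on a shared edge belongs to one area only. */
        boolean contains(double lat, double lon) {
            return lat >= minLat && lat < maxLat && lon >= minLon && lon < maxLon;
        }

        boolean intersects(Area o) {
            return minLat < o.maxLat && o.minLat < maxLat
                && minLon < o.maxLon && o.minLon < maxLon;
        }
    }

    private final List<Area> areas;
    private final Map<Long, Integer> nodeArea = new HashMap<>();

    ProblemListSketch(List<Area> areas) { this.areas = areas; }

    /** For each node: remember the index of the area that contains it. */
    void addNode(long id, double lat, double lon) {
        for (int i = 0; i < areas.size(); i++) {
            if (areas.get(i).contains(lat, lon)) { nodeArea.put(id, i); return; }
        }
    }

    /** For each way: collect the areas of all its nodes. */
    BitSet areasOfWay(long[] nodeIds) {
        BitSet used = new BitSet();
        for (long id : nodeIds) {
            Integer a = nodeArea.get(id);
            if (a != null) used.set(a);
        }
        return used;
    }

    /** More than one area, and a real area is used or lies within the bbox of the areas. */
    boolean isPotentialProblem(BitSet used) {
        if (used.cardinality() < 2) return false;
        double minLat = Double.MAX_VALUE, minLon = Double.MAX_VALUE;
        double maxLat = -Double.MAX_VALUE, maxLon = -Double.MAX_VALUE;
        for (int i = used.nextSetBit(0); i >= 0; i = used.nextSetBit(i + 1)) {
            Area a = areas.get(i);
            if (a.real) return true;
            minLat = Math.min(minLat, a.minLat); minLon = Math.min(minLon, a.minLon);
            maxLat = Math.max(maxLat, a.maxLat); maxLon = Math.max(maxLon, a.maxLon);
        }
        Area bbox = new Area(minLat, minLon, maxLat, maxLon, false);
        for (Area a : areas) {
            if (a.real && bbox.intersects(a)) return true;
        }
        return false;
    }

    /** Sample data: one real tile (index 0) surrounded by five pseudo-areas. */
    static ProblemListSketch demo() {
        ProblemListSketch s = new ProblemListSketch(Arrays.asList(
            new Area(0, 0, 10, 10, true),        // 0: real writer area
            new Area(-90, -180, 90, 0, false),   // 1: west
            new Area(-90, 10, 90, 100, false),   // 2: east 1
            new Area(-90, 100, 90, 180, false),  // 3: east 2
            new Area(-90, 0, 0, 10, false),      // 4: south
            new Area(10, 0, 90, 10, false)));    // 5: north
        s.addNode(1, 5, 5);     // inside the real tile
        s.addNode(2, 5, 20);    // east 1
        s.addNode(3, 5, -20);   // west
        s.addNode(4, 5, 120);   // east 2
        s.addNode(5, 50, 20);   // east 1
        return s;
    }

    public static void main(String[] args) {
        ProblemListSketch s = demo();
        System.out.println(s.isPotentialProblem(s.areasOfWay(new long[] {1, 2})));
        System.out.println(s.isPotentialProblem(s.areasOfWay(new long[] {3, 2})));
        System.out.println(s.isPotentialProblem(s.areasOfWay(new long[] {2, 4})));
    }
}
```

In the sample data, the way from node 3 to node 2 touches only pseudo-areas,
but their common bounding box covers the real tile, so it is still reported.
The way from node 2 to node 4 also touches two pseudo-areas, but nothing real
lies in between, so it is not.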
When this information is calculated, we have our problem list and can start
a more detailed analysis using the real coordinates. This is done in the
MultiTileProcessor. It starts by reading (and storing) all relations. This
requires a lot of heap. Next, it marks all elements (including sub relations)
of the problem rels as problem cases too (recursively). When this is done, it
marks the parent rels that contain problem rels as problem rels as well. Now
it releases the storage for all non-problem rels and starts to read the ways
that are marked as problem cases. For each way, it marks all nodes as
problem cases. Next, it starts to collect the coordinates of all problem
nodes (this also requires a lot of heap and is subject to tuning).
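
To illustrate the marking steps in this paragraph, here is a minimal sketch.
It is not taken from the MultiTileProcessor: the class ProblemMarkingSketch,
the Member type and the method names are made up for this example. It also
assumes that parent rels are only flagged themselves and that their other
members are not expanded, which the description above leaves open.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the marking phase; not the MultiTileProcessor code. */
final class ProblemMarkingSketch {
    enum Type { NODE, WAY, RELATION }

    static final class Member {
        final Type type;
        final long id;
        Member(Type type, long id) { this.type = type; this.id = id; }
    }

    final Map<Long, List<Member>> relations = new HashMap<>();
    final Set<Long> problemRels = new HashSet<>();
    final Set<Long> problemWays = new HashSet<>();
    final Set<Long> problemNodes = new HashSet<>();

    /** Marks the given rels and, recursively, all their members as problem cases. */
    void markDownwards(Set<Long> initialProblemRels) {
        Deque<Long> todo = new ArrayDeque<>(initialProblemRels);
        while (!todo.isEmpty()) {
            long relId = todo.pop();
            if (!problemRels.add(relId)) continue;   // already done, also stops cycles
            List<Member> members = relations.get(relId);
            if (members == null) continue;           // relation is not in the input
            for (Member m : members) {
                switch (m.type) {
                    case NODE: problemNodes.add(m.id); break;
                    case WAY: problemWays.add(m.id); break;
                    case RELATION: todo.push(m.id); break;
                }
            }
        }
    }

    /** Marks rels that contain a problem rel, repeated until nothing changes. */
    void markParents() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<Long, List<Member>> e : relations.entrySet()) {
                if (problemRels.contains(e.getKey())) continue;
                for (Member m : e.getValue()) {
                    if (m.type == Type.RELATION && problemRels.contains(m.id)) {
                        problemRels.add(e.getKey());
                        changed = true;
                        break;
                    }
                }
            }
        }
    }

    /** Frees the storage of all rels that are not problem cases. */
    void releaseNonProblemRelations() {
        relations.keySet().retainAll(problemRels);
    }

    /** Called while reading ways: all nodes of a problem way become problem nodes. */
    void markNodesOfWay(long wayId, long[] nodeIds) {
        if (!problemWays.contains(wayId)) return;
        for (long id : nodeIds) problemNodes.add(id);
    }
}
```

The visited check in markDownwards is what keeps the recursion finite when
relations reference each other in a cycle, which does happen in OSM data.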
Next, it reads all problem ways again and calculates the exact bbox of each
way as well as the real writer areas. For a closed way, it will also
recognize fully enclosed areas. Both the area info and the writer set
info are stored. Next, it processes the stored problem relations to calculate
the needed writers and stores them. To make sure that all elements of a
relation are written to all tiles, it combines the writer sets of the
relation with the previously calculated writers and stores them.
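
One possible way to compute such writer sets is sketched below. Again this is
only an illustration and not the committed code: WriterSetSketch and its
methods are hypothetical names, tiles are plain rectangles with x = lon and
y = lat, and the geometry tests come from java.awt.geom instead of whatever
the splitter really uses.

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch of the writer-set calculation; not the splitter's code. */
final class WriterSetSketch {
    /** Real writer areas as rectangles (x = lon, y = lat). */
    final List<Rectangle2D> tiles;

    WriterSetSketch(List<Rectangle2D> tiles) { this.tiles = tiles; }

    /** c = {lon0, lat0, lon1, lat1, ...}; closed if first and last point are equal. */
    BitSet writersForWay(double[] c) {
        BitSet writers = new BitSet();
        int n = c.length / 2;
        if (n == 0) return writers;

        // exact bbox of the way, used as a cheap pre-filter
        double minX = c[0], maxX = c[0], minY = c[1], maxY = c[1];
        Path2D.Double ring = new Path2D.Double();
        ring.moveTo(c[0], c[1]);
        for (int i = 1; i < n; i++) {
            double x = c[2 * i], y = c[2 * i + 1];
            minX = Math.min(minX, x); maxX = Math.max(maxX, x);
            minY = Math.min(minY, y); maxY = Math.max(maxY, y);
            ring.lineTo(x, y);
        }
        boolean closed = n > 3 && c[0] == c[2 * n - 2] && c[1] == c[2 * n - 1];

        for (int t = 0; t < tiles.size(); t++) {
            Rectangle2D tile = tiles.get(t);
            if (tile.getMaxX() < minX || tile.getMinX() > maxX
                    || tile.getMaxY() < minY || tile.getMinY() > maxY) {
                continue;                       // tile is outside the bbox of the way
            }
            for (int i = 0; i + 1 < n; i++) {   // does any segment touch the tile?
                if (tile.intersectsLine(c[2 * i], c[2 * i + 1],
                                        c[2 * i + 2], c[2 * i + 3])) {
                    writers.set(t);
                    break;
                }
            }
            if (closed && ring.contains(tile)) {
                writers.set(t);                 // tile lies completely inside the polygon
            }
        }
        return writers;
    }

    /** Relation writers = union of the member sets; each member gets the union too. */
    static BitSet combine(List<BitSet> memberWriters) {
        BitSet all = new BitSet();
        for (BitSet b : memberWriters) all.or(b);
        for (BitSet b : memberWriters) b.or(all);
        return all;
    }
}
```

The check for fully enclosed tiles matters for large closed ways such as
lakes or forests: a tile in the middle is not touched by any segment, but it
still needs the way to render the polygon.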
Finally, it reads the ways again to update the writer sets of the problem
nodes.
Now the normal split process starts to write the output files, and whenever
it finds a node or way that is in the special lists, it writes it to the
additional tiles as well.



