[mkgmap-dev] [PATCH v1] LocationHook speedup

Fri Dec 30 19:36:24 GMT 2011

Hi Gerd,

the results are a bit heterogeneous. In some cases the LocationHook 
finishes much quicker, in some cases it is slower. I have some ideas 
what can be improved, so I will change the patch and have a additional try.

> Hello WanMil,
>
> I also thought about reducing the CPU time in LocationHook, but with a
> different approach:
> I think a lot of time is used because the quadtree first adds elements to
> resultList, (a HashSet) and 2nd a loop takes all these elements to add tags.
>
> Two ideas:
> 1) Optimization:
> It might save time if we could pass a function pointer to the quadtree. The
> function does the tagging for one element, and we don't have to manage
> resultlists. I implemented that, but the program was slower. I assume that
> was caused by the fact that I did not implement the part that removes
> elements from the quadtree when they were "fully worked out".

I am not sure if managing the resultlists is slow. In the end it 
consists of adding elements to a HashSet (which I think should be quite 
fast?!)

>
> 2) An completely different approach:
> I did not yet fully understand the algorithmn in LocationHook, but I got the
> impression that it might be possible to create a dictionary with
> combinations of admin-level tags, and save a reference to that dictionary
> instead of adding similar tags to each element. This could save space and
> time, but would require additional logic in class Element. Do you think this
> is worth trying?

Of course you can try although I don't expect that it will improve much.

The algorithm in LocationHook works as follows:
1. All elements (points and lines) having at least one tag are added to 
the QuadTree. This is expensive - it takes 20-50% of the complete 
LocationHook time.
2. All preprocessed bounds that overlaps the tile boundaries are loaded 
from file. There is only little improvement possible (maybe zipping 
could improve disk access time).
3. The bounds are sorted by their admin_level tag, so that the greatest 
admin_level (or postcode) comes first.
4. For all elements of the preprocessed bounds the covered elements 
(points and lines) are searched. All matching elements are tagged with 
the preprocessed bounds. The preprocessed bounds also contains internal 
references (mkgmap:lies_in) which are tagged also (e.g. the bounds of 
the town Cologne contains the information that it is located in region 
North Rhine Westfalia and Germany).
5. Elements having all tags from the remaining preprocessed bounds are 
removed from the quadtree because no more information can be added to them.

Adding tags is a cheap method with a small memory footprint. At least 
the footprint is not much bigger than a reference to a dictionary. And 
you have to maintain the dictionary which I think is expensive. But I 
don't know exactly how you want to build up the dictionary so I cannot 
say for sure.

Maybe one the following things can contain an improvement:
* Analyzing the style rules might help to sort out some of the 
preprocessed boundaries
* Changing the parameters of the Quadtree (max node size 1000)
* The Quadtree query for a polygon (ElementQuadtreeNode: public 
Set<Element> get(ElementQuadTreePolygon polygon, Set<Element> 
resultList)) divides the polygon into bounds of the four subnodes. I 
expect that this is quite expensive. Maybe there is a better way to 
reduce the number of quadtree nodes that have to be queried.

WanMil

>
> Ciao,
> Gerd
>
> --
> View this message in context: http://gis.638310.n2.nabble.com/PATCH-v1-LocationHook-speedup-tp7135704p7137702.html
> Sent from the Mkgmap Development mailing list archive at Nabble.com.
> _______________________________________________
> mkgmap-dev mailing list
> mkgmap-dev at lists.mkgmap.org.uk
> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev