logo separator

[mkgmap-dev] SeaGenerator patch v1

From Gerd Petermann gpetermann_muenchen at hotmail.com on Fri Jan 11 21:00:37 GMT 2013

Hi,

attached is seaGen_v2.patch

> > the index for the precomp-sea method stores 131072 * 2 = 262.144 Strings
> > with String.intern() in a HashMap (one instance for each job).
> 
> That's a bit over estimated because the Strings are interned. Therefore 
> 131072+<number of sea tiles>+2= ~143000 Strings are used (over all 
> jobs). Of course the HashMap remains for each job with 131072 entries.

Yes, that's right. I forgot that only for mixed tiles a file name is saved.

> 
> > Attached is a patch that changes the index to use a byte[512][256] array >
> > 131072 bytes.
> >
> > It reduces run time and memory needs in mkgmap (when --precomp-sea is used)
> 
> Why does it reduce the run time? Did you measure it?
See also below.  think the most important point regarding runtime is that it avoids String.intern() 
calls. This has two effects: 
1) ~ 143000 calls take some time
2) the interned strings make it more likely the further String.intern() calls
have collisions in the HashMap that is used by intern(). That means that
the storing of the tags is slowed down a bit.

> Can you please give numbers. How much memory is saved? I don't want to 
> commit optimizations that change the readability of code without having 
> a good reason.

I ran mkgmap with 20 tiles from germany (max-nodes 1600000) using the parms and style
from toc-rox "Freizeitkarte". 
java -Xmx7000m  -agentlib:hprof=cpu=samples,depth=10 -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -jar d:\mkgmap_trunk\dist\mkgmap.jar --max-jobs=4 -c f:\osm\klaus\t.cfg -c template.args > m

The 1st version is r2443 , 
the 2nd is 2443 with the OSMId patch + the reduce_peak_mem_v1.patch (SeaGeneraor excluded)
the 3rd is like the 2nd plus the attached seaGen_v2.patch 

Numbers: (1st,2nd,3rd)
Total time 149 s / 142 s / 135s
Garbagecat reported these:
1st:
========================================
SUMMARY:
========================================
# GC Events: 176
GC Event Types: PARALLEL_SCAVENGE, PARALLEL_SERIAL_OLD
Max Heap Space: 4224448K
Max Heap Occupancy: 3490537K
Max Perm Space: 82688K
Max Perm Occupancy: 44234K
Throughput: 56%
Max Pause: 3279 ms
Total Pause: 65358 ms
First Timestamp: 343 ms
Last Timestamp: 147189 ms 

2nd: 
# GC Events: 185
GC Event Types: PARALLEL_SCAVENGE, PARALLEL_SERIAL_OLD
Max Heap Space: 3681024K
Max Heap Occupancy: 3190176K
Max Perm Space: 83968K
Max Perm Occupancy: 44876K
Throughput: 60%
Max Pause: 3163 ms
Total Pause: 57316 ms
First Timestamp: 320 ms
Last Timestamp: 142164 ms 
3rd:
# GC Events: 168
GC Event Types: PARALLEL_SCAVENGE, PARALLEL_SERIAL_OLD
Max Heap Space: 3807680K
Max Heap Occupancy: 3156373K
Max Perm Space: 34176K
Max Perm Occupancy: 31855K
Throughput: 60%
Max Pause: 2622 ms
Total Pause: 54084 ms
First Timestamp: 318 ms
Last Timestamp: 135052 ms 

> 
> >
> > One problem: The unpatched version would allow to have different formats
> > (*.o5m, *.osm.pbf, .osm.gz) in one sea directory.
> > The patched version assumes that all files have the same prefix  and
> > extension.
> > So, if you want to update a precompiled sea tile you have to make sure that
> > you keep the name and format.
> > I hope this small limit is no problem?
> 
> That's ok. An error should be printed if that's not the case.

OK. I've added a check to detect that.

> 
> Please rework the patch a bit.
> 1. Instead of instantiating an Area object to get minLat etc. you can 
> use the Utils.toMapUnit(double) method
> 2. Document what's in the byte array. Constants improve readability.
> 3. Maybe create an accessor method to the byte array so you avoid to 
> blow up the other code with index calculations.
> 4. Remove unused variables
> 5. Instead of x and y use e.g. latIndex and lonIndex as variable names 
> so that it is easy to read the code.

Good points. I did not find unused variables, but tried to do the rest.

> 
> By the way: using the ThreadLocal requires loading the index in each 
> thread which is not optimal. But at the time implementing it I didn't 
> want to bother about synchronizing the load procedure. This might be a 
> point where some run time can be saved.

Regarding memory there is no more reason to bother. 
Regarding performance it would be better to use a binary format to avoid
all the pattern.split() calls, but the profiler shows that the patched
version requires only ~0.2 % of the total run time, so I think it's
okay now.

Ciao,
Gerd



 		 	   		  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20130111/be42db45/attachment-0001.html 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: seaGen_v2.patch
Type: application/x-download
Size: 9537 bytes
Desc: not available
Url : http://lists.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20130111/be42db45/attachment-0001.bin 


More information about the mkgmap-dev mailing list