[mkgmap-dev] splitter memory usage

Sun Oct 30 23:07:14 GMT 2011

Hello Scott,

Thanks for the detailed analysis.
(I started looking at this because I "played" with the program on my netbook
and wasn't able to split even small files like the one for Niedersachsen,
which is why I thought the program was in error. In my job I have coded many
programs that handle mass data (on IBM mainframes), but I am a complete
beginner in Java.)

I think I now understand the idea behind your implementation, and I agree that
it is probably the best solution for a whole planet, where the ratio
(highest node id / number of ids) is rather small.

On the other hand, this ratio is going to get worse in the future for
"normal" splits (e.g. Germany), especially when such a high node id is also
saved in the 1st overflow map.

Interesting for me:
I tried to split europe.osm.pbf with default parameters and -Xmx2000m: r181
crashed with a GC message, my version finished.

Regarding a parameter:
I assume the program can decide which algorithm is better after the 1st
pass, but a parameter could also be used.
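
To make the idea of deciding after the 1st pass concrete, here is a minimal
sketch. It is not splitter code: the class StorageChooser, the two strategy
names and the threshold of 4.0 are assumptions made up for illustration, and a
real cut-off would have to be measured.

```java
// Hypothetical helper, not part of splitter: picks a storage strategy
// from two numbers that a first pass over the input could collect.
final class StorageChooser {
    enum Strategy { DENSE_ID_SPACE, SPARSE_ID_SPACE }

    // Assumed threshold for (highest node id / number of ids).
    // The value 4.0 is a placeholder, not a measured result.
    static final double MAX_DENSE_RATIO = 4.0;

    static Strategy choose(long highestNodeId, long nodeCount) {
        if (nodeCount <= 0) {
            // No nodes seen: nothing to optimise, use the sparse variant.
            return Strategy.SPARSE_ID_SPACE;
        }
        double ratio = (double) highestNodeId / nodeCount;
        return ratio <= MAX_DENSE_RATIO
                ? Strategy.DENSE_ID_SPACE
                : Strategy.SPARSE_ID_SPACE;
    }
}
```

With made-up numbers, choose(1_500_000_000L, 1_200_000_000L) gives a ratio of
1.25 and selects DENSE_ID_SPACE, while choose(1_500_000_000L, 20_000_000L)
gives a ratio of 75 and selects SPARSE_ID_SPACE. A command-line parameter
could simply override the result.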

Regarding my Storer class:
I think it reduces space. The normal approach would be to save each id in
the Int2Short HashMap, but I liked your trick with the chunks, as they save
space, reduce the problem of hash collisions and also help with the future
problem that node ids will exceed 2^31. My first change was to store each
chunkmask and each chunk in their own HashMaps, and that caused a lot of
overhead compared to the Storer.

Besides that:
Why is a new chunk initialized with 4 times 4 in chunkMake:
		Arrays.fill(out,(short)4);
I used this:
		Arrays.fill(out,(short)unassigned);
and I think it works fine. In the original code, the first 4 shorts in chunk
are never used. 
Correct?
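
To show what I mean by chunks filled with "unassigned", here is a small sketch.
It is only the simple variant that keeps each chunk in a HashMap (the one with
more overhead), not the Storer class and not the original code. The class name
ChunkedIdMap, the chunk size of 64 and the sentinel value are assumptions for
illustration, and ids are assumed to be non-negative.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: maps long node ids to short values by
// grouping consecutive ids into fixed-size chunks.
final class ChunkedIdMap {
    static final int CHUNK_SIZE = 64;                 // assumed ids per chunk
    static final short UNASSIGNED = Short.MIN_VALUE;  // assumed sentinel value

    // Key is (id / CHUNK_SIZE), so ids above 2^31 are no problem.
    private final Map<Long, short[]> chunks = new HashMap<>();

    void put(long id, short value) {
        if (value == UNASSIGNED) {
            throw new IllegalArgumentException("value collides with sentinel");
        }
        short[] chunk = chunks.computeIfAbsent(id / CHUNK_SIZE, k -> newChunk());
        chunk[(int) (id % CHUNK_SIZE)] = value;
    }

    short get(long id) {
        short[] chunk = chunks.get(id / CHUNK_SIZE);
        return chunk == null ? UNASSIGNED : chunk[(int) (id % CHUNK_SIZE)];
    }

    // Every slot of a new chunk starts as UNASSIGNED, so a lookup of an
    // id that was never stored can be told apart from a stored value.
    private static short[] newChunk() {
        short[] out = new short[CHUNK_SIZE];
        Arrays.fill(out, UNASSIGNED);
        return out;
    }
}
```

After put(5_000_000_000L, (short) 7), get(5_000_000_000L) returns 7, while
get(5_000_000_001L) (same chunk, slot never written) and get(42L) (no chunk
at all) both return UNASSIGNED.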

Ciao,
Gerd
