[mkgmap-dev] Splitter pbf vs o5m processing

From WanMil wmgcnfg at web.de on Thu Dec 13 14:27:07 GMT 2012

> Hi Steve,
> Steve Ratcliffe wrote
>> Hello Gerd
>>> no, it is not (yet). I plan to add o5m support to mkgmap soon. With my
>>> patch you can use splitter
>> As an aside, what do you think it is about the o5m format that makes
>> it quicker than pbf?
> Well, not easy to say. I think it's a combination of many small points:
> 1) pbf uses (by default) compressed blocks, so you have to unzip a complete
> block before you can use any information in it.
> 2) pbf read routines create a lot of temporary objects, which seems to
> stress the GC.
> 3) pbf doesn't allow skipping the processing of node tags or way tags, but
> splitter's read passes often don't need them. So, with pbf we create lists
> of tags and return them to the GC, with o5m we can simply skip them.
> To be fair, using the --drop-version parm in osmconvert removes a lot of
> info which is ignored by splitter and mkgmap. I never tried what effect it
> has to use pbf input that was created with this parm.
> When writing, o5m is probably only faster because it doesn't zip the data.
> As long as mkgmap doesn't understand o5m I see no benefit in using this.
> Maybe other computers show different results, especially if the CPU is
> much faster than mine and the disk access is slower.
> By the way: my patch also speeds up pbf reading a little bit.
> Ciao,
> Gerd

Hi Gerd

I've done some tests with the latest splitter version, r255.
I split the Geofabrik Europe extract in both pbf and o5m format.

As you pointed out, o5m processing is much quicker (8528s vs. 12939s).
I also observed that pbf seemed to use more memory than o5m, so I
activated GC logging and analysed the log with garbagecat.

The interesting values are
Throughput (share of run time not spent in GC pauses):
o5m: 94%
pbf: 61%
So 3400m seems to be too small for pbf processing of the Europe extract,
with the result that the GC runs constantly.

Total Pause:
o5m:  527816ms =  528s
pbf: 5093916ms = 5094s
Wow, so with pbf the GC requires 4566s more time.

Subtracting the GC time from the total processing time, o5m and pbf need
nearly the same time:
o5m:  8528s -  528s = 8000s
pbf: 12939s - 5094s = 7845s
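
The figures above can be re-checked with a few lines of Python. This is
only a sketch: it assumes the percentages are GC throughput in the sense of
(total time - GC pause time) / total time, and that the wall-clock times
are close enough to the JVM run time that garbagecat works with. The name
gc_summary is made up for this illustration.

```python
# Times in seconds, copied from the measurements in this message.
runs = {
    "o5m": {"total_s": 8528, "gc_pause_s": 528},
    "pbf": {"total_s": 12939, "gc_pause_s": 5094},
}


def gc_summary(total_s, gc_pause_s):
    """Return (seconds outside GC pauses, throughput in percent).

    Assumption: throughput = time outside GC pauses / total time.
    """
    net_s = total_s - gc_pause_s
    throughput_pct = 100.0 * net_s / total_s
    return net_s, throughput_pct


for name, run in runs.items():
    net_s, pct = gc_summary(run["total_s"], run["gc_pause_s"])
    print(f"{name}: net {net_s}s, throughput {pct:.0f}%")

# Extra GC pause time of pbf compared with o5m.
extra_gc_s = runs["pbf"]["gc_pause_s"] - runs["o5m"]["gc_pause_s"]
print(f"extra GC time for pbf: {extra_gc_s}s")
```

With the numbers above this reproduces the 94% and 61% values, the net
times of 8000s and 7845s, and the 4566s of additional GC time for pbf.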

Obviously, part of the difference in GC time can be explained by the points
you listed (pbf must unzip complete blocks and must read tags which are
thrown away immediately afterwards). But do you think the whole difference
can be explained by that?

I will send my log files directly to you because they are too big to be
posted on the mailing list.


