[mkgmap-dev] Using buffered io doubles splitter performance

From Chris Miller chris.miller at kbcfp.com on Sun Nov 15 19:06:35 GMT 2009

Thanks Jon, good catch! It seems the .bz2 buffering is where the biggest 
gains are to be had. I guess I missed that case because most of my profiling 
of the splitter has been with uncompressed osm files. The cache generation 
buffering is a bit more interesting: there is already some custom buffering 
(in LengthPrefixOutputStream) that was intended to replace the need for 
BufferedOutputStream, but it's not doing nearly as good a job as I'd hoped. 
It turns out that's because the buffer ends up being flushed quite frequently 
(every 100 bytes or so).
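
To illustrate why frequent flushing defeats a buffer, here is a minimal sketch 
using only the JDK. It is not the splitter's code: FlushDemo, CountingSink and 
underlyingWrites are hypothetical names, and BufferedOutputStream stands in for 
the custom buffering in LengthPrefixOutputStream. It counts how many write 
calls reach the underlying stream when 100-byte records are flushed one at a 
time, compared with letting an 8 KB buffer fill up first.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class FlushDemo {
    /** Hypothetical sink: discards data but counts the write calls that reach it. */
    static final class CountingSink extends OutputStream {
        int calls;
        @Override public void write(int b) { calls++; }
        @Override public void write(byte[] b, int off, int len) { calls++; }
    }

    /** Writes fixed-size records through an 8 KB buffer and returns the sink's call count. */
    static int underlyingWrites(int records, int recordSize, boolean flushEachRecord)
            throws IOException {
        CountingSink sink = new CountingSink();
        byte[] record = new byte[recordSize];
        try (OutputStream out = new BufferedOutputStream(sink, 8192)) {
            for (int i = 0; i < records; i++) {
                out.write(record);
                if (flushEachRecord) {
                    out.flush(); // pushes the ~100 buffered bytes straight through
                }
            }
        } // close() flushes whatever is still buffered
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("flush per record: " + underlyingWrites(1000, 100, true));
        System.out.println("flush on demand:  " + underlyingWrites(1000, 100, false));
    }
}
```

With a flush after every record, each record costs one call on the underlying 
stream, so the buffer saves nothing. Without the explicit flush, the same data 
goes out in a handful of large writes.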

I've committed your changes, but I'll probably look into further changes 
to the cache buffering. I'd like to eliminate the BufferedOutputStream so 
there are fewer memory copies taking place; this should speed up the cache 
generation even further.

Cheers,
Chris


JB> I noticed that the splitter processing does a large number of small
JB> reads & writes. The speed can be doubled by using buffered IO as per
JB> the attached patch.
JB> 
JB> Before:
JB> 
JB> $ time java -Xmx1500m -jar splitter/dist/splitter.jar
JB> --max-nodes=1000000 --cache=cache
JB> /store/planet/great_britain-20091114.osm.bz2
JB> 
JB> ...
JB> 
JB> Wrote 11,101,332 nodes, 1,485,442 ways, 54,180 relations
JB> 
JB> Time finished: Sat Nov 14 13:43:40 GMT 2009
JB> 
JB> Total time taken: 675s
JB> 
JB> real    11m15.561s
JB> user    6m17.064s
JB> sys     4m19.885s
JB> 
JB> After:
JB> 
JB> Wrote 11,101,332 nodes, 1,485,442 ways, 54,180 relations
JB> 
JB> Time finished: Sat Nov 14 14:17:37 GMT 2009
JB> 
JB> Total time taken: 305s
JB> 
JB> real    5m5.343s
JB> user    4m42.738s
JB> sys     0m8.555s
JB> 
JB> I did remember to delete the cache between the two runs.
JB> 
JB> Jon
JB> 
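
The effect of many small reads described in the quoted message can be sketched 
with the JDK alone. This is not the attached patch: SmallReadDemo, 
CountingSource and underlyingReads are hypothetical names, an in-memory array 
stands in for the input file, and the loop assumes a consumer that pulls one 
byte at a time. The sketch counts how many read calls reach the source with 
and without a BufferedInputStream in between.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SmallReadDemo {
    /** Hypothetical source: counts the read calls that reach the wrapped stream. */
    static final class CountingSource extends FilterInputStream {
        int calls;
        CountingSource(InputStream in) { super(in); }
        @Override public int read() throws IOException {
            calls++;
            return super.read();
        }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Consumes totalBytes one byte at a time and returns the source's call count. */
    static int underlyingReads(int totalBytes, boolean buffered) throws IOException {
        CountingSource source =
                new CountingSource(new ByteArrayInputStream(new byte[totalBytes]));
        try (InputStream in = buffered ? new BufferedInputStream(source, 8192) : source) {
            while (in.read() != -1) {
                // single-byte reads, as assumed for the consumer
            }
        }
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered: " + underlyingReads(100_000, false));
        System.out.println("buffered:   " + underlyingReads(100_000, true));
    }
}
```

Against a real file, each call that reaches the source is typically a system 
call, so the drop in sys time between the two quoted runs is consistent with 
this kind of reduction.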