logo separator

[mkgmap-dev] MDR building out-of-memory

From Ticker Berkin rwb-mkgmap at jagit.co.uk on Wed May 12 16:59:10 BST 2021

Hi Gerd

I did a bit more experimenting with LargeListSorter and Mdr7/11 and
have some changes - patch attached.

1/ explicit param to LargeListSorted to useCache
2/ do nothing if 0 or 1 records to sort
3/ reduce chunkSize to reduce memory usage, increase maxDepth so same
overall limit
4/ allocate temp "merged" with correct length
5/ useCache parameters and remarks on usage in Mdr7 & 11
6/ prevent a copy of allStreets
7/ share the collator and allow it and "sort" to be garbage collected
8/ A comments that sort of partial streets must use sortkey, even if
only sorting a few record. LargeListSorter is a handy way of doing this
for a simple/single sortKey per record.

Sortkey simply makes a byte array of the string converted into the
target charset and sort does a byte comparison on this.

collator.compare(str1, str2) ignores the shield chars (and possibly the
other prefix/suffix markers) during the comparison and has extra
dynamic processing looking to handle char aliases (PRIMARY strength
etc).

The comment at the top of Sort.java:

 * found that sorting with the sort keys and the collator gave
different results in some
 * cases. This implementation does not.

is not really correct.

An alternative collator string comparison could be provided that
behaves in the same way as sort.createSortKey. This would speed up 'on
-the-fly' sorts, maybe becoming feasible.

Ticker

On Wed, 2021-05-12 at 09:08 +0000, Gerd Petermann wrote:
> Hi Ticker,
> 
> I think the MultiSortKeys were introduced because the on-the-fly solution was far too slow, at least with the normal Java sort. Could well be that the problem was solved with Java 8 or newer releases.
> 
> I think there was a special case with maps containing huge numbers of equally named roads causing extreme run times.
> This depends on the style. Some styles add a name like "tr2" for each unnamed track with tracktype=grade2.
> 
> Gerd
> 
> ________________________________________
> Von: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> im Auftrag von Ticker Berkin <rwb-mkgmap at jagit.co.uk>
> Gesendet: Mittwoch, 12. Mai 2021 10:48
> An: Development list for mkgmap
> Betreff: Re: [mkgmap-dev] MDR building out-of-memory
> 
> Hi Gerd
> 
> Certainly no cache. Maybe reduce the chunk size, but this might
> increase copying.
> 
> It could be improved by doing a linear chunk split/sort then a multi
> -way merge. This would avoid lots of copying assuming the following:
> 
> Using the original array to store sorted chunks demands that another
> array of the same full size is needed for the final merge. If each
> sorted key chunk is converted to a object chunk and these merged, then
> although the same total size is needed, it is made of number of smaller
> arrays.
> 
> The most space efficient solution might be have Mdr11 "implements
> Comparable" and generate pairs of sortkeys on the fly and let the java
> sort take care of all the details.
> 
> The other use of LargeListSorter is Mdr7. I get a higher hit-rate
> (~50%) for the first/partialSorter (1046096 allstreets).
> 
> However, for the repeated fullNameSorter on the partial results, most
> of the lists are just 1 long, with very few more than 10. I guess this
> depends on --road-name-config/split-name and use of shields etc, but
> LargeListSorted seems overkill.
> 
> Ticker
> 
> 
> On Tue, 2021-05-11 at 19:15 +0000, Gerd Petermann wrote:
> > Hi Ticker,
> > 
> > I think the cache is not meant to improve run time, it is used to
> > deduplicate and thus reduce memory. Maybe it would be better to use a
> > smaller chunk size and no cache.
> > No idea why I didn't use
> > merged = new ArrayList<>(len);
> > 
> > Gerd
> > 
> > ________________________________________
> > Von: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> im Auftrag
> > von Ticker Berkin <rwb-mkgmap at jagit.co.uk>
> > Gesendet: Dienstag, 11. Mai 2021 19:19
> > An: Development list for mkgmap
> > Betreff: Re: [mkgmap-dev] MDR building out-of-memory
> > 
> > Hi Gerd
> > 
> > I've looked at this, and if Mdr5 space becomes a problem again, I'll
> > consider converting to it.
> > 
> > A couple of comments:
> > 
> > My map had 2909735 poi so the sort chunk size was ~727433. The cache
> > sizes after each chunk were 563095, 603595, 597239 & 605718. Is the
> > cache worth-while for this low hit-rate? Just running the Gmapsupp
> > combiner on existing tiles (without --route, so no streets), I got a
> > run time of 1 min 44 secs with the cache and 1 min 30 without!
> > However
> > most of this time is copying the tiles into gmapsupp.img so not an
> > accurate statistic.
> > 
> > You could pre-allocate List<> "merged" with the correct size.
> > 
> > Ticker
> > 
> > On Tue, 2021-05-11 at 14:40 +0000, Gerd Petermann wrote:
> > > Hi Ticker,
> > > 
> > > I've committed the patch as is. I've not seen big changes in
> > > performance, but I've used a different (already existing) set of
> > > files which was created with my own style. For me,
> > > Mdr11.preWriteImpl() is the most problematic part reg. OOM errors.
> > > 
> > > Maybe look at the code which uses LargeListSorter.
> > > 
> > > Gerd
> > > 
> > > ________________________________________
> > > Von: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> im Auftrag
> > > von Ticker Berkin <rwb-mkgmap at jagit.co.uk>
> > > Gesendet: Dienstag, 11. Mai 2021 13:27
> > > An: Development list for mkgmap
> > > Betreff: Re: [mkgmap-dev] MDR building out-of-memory
> > > 
> > > Hi Gerd
> > > 
> > > Here is updated version of patch.
> > > 
> > > Changes from last:
> > > 
> > > Uses your cache code for region and country (in 2 places). For
> > > British
> > > Isles, there are 190 regions and 7 countries, so I don't think the
> > > extra memory will be a problem and there should be some performance
> > > benefit.
> > > 
> > > Delays allocating cities until it can use sortKeys.size() for
> > > initial
> > > allocation. For above map this is 0.07% too big, so I don't think
> > > trimToSize() is worthwhile.
> > > 
> > > Shares the Sort object between the 4 methods.
> > > 
> > > Ticker
> > > _______________________________________________
> > > mkgmap-dev mailing list
> > > mkgmap-dev at lists.mkgmap.org.uk
> > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > _______________________________________________
> > mkgmap-dev mailing list
> > mkgmap-dev at lists.mkgmap.org.uk
> > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > _______________________________________________
> > mkgmap-dev mailing list
> > mkgmap-dev at lists.mkgmap.org.uk
> > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> _______________________________________________
> mkgmap-dev mailing list
> mkgmap-dev at lists.mkgmap.org.uk
> https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> _______________________________________________
> mkgmap-dev mailing list
> mkgmap-dev at lists.mkgmap.org.uk
> https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mdrSort.patch
Type: text/x-patch
Size: 6352 bytes
Desc: not available
URL: <http://www.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20210512/c1c08b3a/attachment-0001.bin>


More information about the mkgmap-dev mailing list