
[mkgmap-dev] java.lang.AssertionError while building index from unicode tiles

From Gerd Petermann gpetermann_muenchen at hotmail.com on Tue Oct 19 09:13:34 BST 2021

Hi Ticker,

Please remove the unrelated changes. I think we discussed them in May with the patch mdrSort.patch, under the subject "MDR building out-of-memory".

Gerd

________________________________________
From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on behalf of Ticker Berkin <rwb-mkgmap at jagit.co.uk>
Sent: Monday, 18 October 2021 16:36
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] java.lang.AssertionError while building index from unicode tiles

Hi Gerd

Here is the first version of the changes to improve MDR unicode handling
and stop the crash.

It always provides a PRIMARY strength sort value, both in the key used
for sorting and in direct comparison when using the collator. Previously
neither of these would have anything for a unicode character not
mentioned in the sort/cp65001.txt file.

In an attempt to stop ordering clashes between the specified sort and
the ones fudged from the actual unicode value, it orders anything
unknown after the known values. Unfortunately these can then become
larger than 2 bytes - and, as this is all the space available without
re-structuring, they have to wrap onto the known sort region. I only
found 1 character that did this and I don't know if it conflicted with
an existing sort.
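
A minimal sketch of the fallback described above, not the code in the patch. The class PrimaryKeySketch, its methods and the way the fudged value is derived (largest known value + 1 + code point) are all assumptions made for illustration; the only points taken from the text are that every character gets a PRIMARY value, that unknown characters are ordered after the known ones, and that anything too large for 2 bytes has to wrap back onto the known region.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; mkgmap's real sort tables are built differently.
class PrimaryKeySketch {
    // Stand-in for the primary values read from a sort description file.
    private final Map<Integer, Integer> known = new HashMap<>();
    private int maxKnown = 0;

    public void addKnown(int codePoint, int primary) {
        known.put(codePoint, primary);
        maxKnown = Math.max(maxKnown, primary);
    }

    /** Always returns a primary value in the range 1..0xFFFF. */
    public int primary(int codePoint) {
        Integer p = known.get(codePoint);
        if (p != null)
            return p;
        // Assumed fudge: order unknown characters after all known values.
        int fudged = maxKnown + 1 + codePoint;
        if (fudged > 0xFFFF) {
            // Only 2 bytes are available, so wrap back into 1..0xFFFF.
            // The wrapped value may clash with a known sort value.
            fudged = 1 + (fudged - 1) % 0xFFFF;
        }
        return fudged;
    }
}
```

With a small table, a character outside the Basic Multilingual Plane is enough to trigger the wrap in this sketch, which is where a clash with a specified sort value could occur.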

Regardless of the character set used, in all the places where sorting
is used for de-dupe, I've used the SECONDARY strength collator to
detect similar records instead of name.equals(lastName).
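
To illustrate the difference, here is a sketch that uses the JDK's java.text.Collator as a stand-in for mkgmap's own collator (that substitution, the class name DedupeSketch and the English locale are assumptions). After sorting, a record is dropped when it compares equal to the previous kept one at SECONDARY strength, so names that differ only at the tertiary level (for example, in case) count as duplicates, whereas name.equals(lastName) would keep both.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; uses the JDK collator, not mkgmap's Sort class.
class DedupeSketch {
    public static List<String> dedupe(List<String> names) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        // SECONDARY strength ignores tertiary differences such as case.
        collator.setStrength(Collator.SECONDARY);

        List<String> sorted = new ArrayList<>(names);
        sorted.sort(collator); // stable sort, so equal names keep input order

        List<String> result = new ArrayList<>();
        String lastName = null;
        for (String name : sorted) {
            // Collator comparison replaces name.equals(lastName).
            if (lastName != null && collator.compare(name, lastName) == 0)
                continue;
            result.add(name);
            lastName = name;
        }
        return result;
    }
}
```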

I also noticed that my source base included an optimisation for
LargeListSorter, its use of a key cache, and some tidy-up of this in
mdr7 & mdr11, so these are here as well.
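
For readers unfamiliar with the idea of a key cache, a generic sketch follows. It is not the LargeListSorter code: the class KeyCacheSortSketch is hypothetical and it uses the JDK's Collator and CollationKey rather than mkgmap's sort keys. The point is only that each distinct name has its sort key computed once and reused for every comparison during the sort.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not the mkgmap LargeListSorter.
class KeyCacheSortSketch {
    public static List<String> sort(List<String> names) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        // Cache: one CollationKey per distinct name, built on first use.
        Map<String, CollationKey> cache = new HashMap<>();
        Comparator<String> byCachedKey = Comparator.comparing(
                (String n) -> cache.computeIfAbsent(n, collator::getCollationKey));

        List<String> sorted = new ArrayList<>(names);
        sorted.sort(byCachedKey);
        return sorted;
    }
}
```

The cache helps most when many records share the same name, at the cost of keeping the keys in memory for the duration of the sort.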

Ticker


