
[mkgmap-dev] java.lang.AssertionError while building index from unicode tiles

From Ticker Berkin rwb-mkgmap at jagit.co.uk on Mon Oct 18 15:36:51 BST 2021

Hi Gerd

Here is the first version of the changes to improve MDR Unicode
handling and stop the crash.

It always provides a PRIMARY strength sort value, both in the key used
for sorting and in direct comparison when using the collator. Previously
neither of these would have anything for a Unicode character not
mentioned in the sort/cp65001.txt file.

In an attempt to stop ordering clashes between the specified sort values
and the ones fudged from the actual Unicode value, it orders anything
unknown after the known values. Unfortunately these can then become
larger than 2 bytes and, as this is all the space available without
re-structuring, they have to wrap onto the known sort region. I only
found one character that did this, and I don't know if it conflicted
with an existing sort.
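
To make the wrap-around concrete, here is a minimal sketch of the idea
and not the arithmetic used in the patch. The class name, both method
names and the notion of a single "last known primary" value are all
assumptions made up for illustration; the patch may derive the fallback
value quite differently.

```java
// Hypothetical illustration only: none of these names come from mkgmap.
final class FallbackPrimary {
    /** Largest value that fits in the 2 bytes available for a primary sort value. */
    static final int MAX_PRIMARY = 0xFFFF;

    /**
     * Place a character that has no entry in the sort file after the known
     * region by offsetting its code point from the last known primary value.
     * If the result does not fit in 2 bytes it wraps around, and may then
     * collide with a value already assigned to another character.
     */
    static int fallbackPrimary(int codePoint, int lastKnownPrimary) {
        int candidate = lastKnownPrimary + codePoint;
        if (candidate <= MAX_PRIMARY) {
            return candidate;               // fits: sorts after every known value
        }
        return candidate % (MAX_PRIMARY + 1); // overflow: wrap into the 2-byte range
    }

    /** True when the fallback value for this code point had to wrap. */
    static boolean overflows(int codePoint, int lastKnownPrimary) {
        return lastKnownPrimary + codePoint > MAX_PRIMARY;
    }
}
```

With an assumed last known primary of 0x2000, U+4E2D fits and keeps its
relative order, while U+FFFD overflows and lands back below 0x2000.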

Regardless of the character set used, in all the places where sorting
is used for de-dupe, I've used the SECONDARY strength collator to
detect similar records instead of name.equals(lastName).
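
As a rough sketch of that de-dupe pattern, the following uses the
standard java.text.Collator as a stand-in for the collator mkgmap
builds from its sort description. The class and method names are
assumptions for illustration and are not taken from the patch.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: not the mkgmap MDR code.
final class CollatorDedupe {
    /** A stand-in collator that ignores differences below SECONDARY strength. */
    static Collator secondaryCollator() {
        Collator c = Collator.getInstance(Locale.ENGLISH);
        c.setStrength(Collator.SECONDARY);
        return c;
    }

    /**
     * Sort the names with the collator, then keep a record only when it
     * differs from the previously kept one according to the same collator.
     * This replaces a name.equals(lastName) test with a collator comparison.
     */
    static List<String> dedupe(List<String> names, Collator collator) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(collator); // Collator implements Comparator<Object>

        List<String> kept = new ArrayList<>();
        String lastName = null;
        for (String name : sorted) {
            if (lastName == null || collator.compare(name, lastName) != 0) {
                kept.add(name);
                lastName = name;
            }
        }
        return kept;
    }
}
```

With java.text.Collator at SECONDARY strength, names differing only in
case collapse into one record, while names differing by an accent stay
separate. A plain equals() test would keep all of them.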

I also noticed that my source base included optimisation for
LargeListSorter and its use of a key cache, plus some tidy-up of this
in mdr7 & mdr11, so these are here as well.
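
For readers unfamiliar with the key-cache idea, here is a small sketch
using java.text.CollationKey. It only shows the general technique of
computing each sort key once and comparing the cached keys; the class
and method names are assumptions, and LargeListSorter's real internals
are not shown in this mail.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the LargeListSorter implementation.
final class KeyCachedSort {
    /**
     * Sort names by collation order, building each CollationKey once and
     * reusing it for every comparison instead of re-running the collator.
     */
    static List<String> sortWithKeyCache(List<String> names, Collator collator) {
        Map<String, CollationKey> cache = new HashMap<>();
        for (String name : names) {
            cache.computeIfAbsent(name, collator::getCollationKey);
        }

        List<String> sorted = new ArrayList<>(names);
        // CollationKey is Comparable, so the cached keys drive the sort.
        sorted.sort(Comparator.comparing((String n) -> cache.get(n)));
        return sorted;
    }
}
```

The trade-off is memory for speed: each distinct name holds a key for
the duration of the sort, which pays off when the same names are
compared many times.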

Ticker

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mdrUnicode.patch
Type: text/x-patch
Size: 18940 bytes
Desc: not available
URL: <http://www.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20211018/2eee90a0/attachment.bin>

