
[mkgmap-dev] java.lang.AssertionError while building index from unicode tiles

From Steve Ratcliffe steve at parabola.me.uk on Fri Oct 22 14:24:02 BST 2021

Hi Ticker

 > Problem is that resources/sort/cp65001.txt doesn't give ordering to
 > lots of characters; it looks like it covers only about 10,500 of the
 > 1,112,064 possible code-points. Many of these non-ordered characters
 > are being used by the names in the tile in question.

I used the program in extra/src/uk/me/parabola/util/CollationRules.java
to generate some of the tables.

This uses the file "allkeys.txt" which can be obtained
from https://www.unicode.org/Public/UCA/latest/allkeys.txt

The document explaining the Unicode collation rules that references
that file is http://www.unicode.org/reports/tr10/ (UTS #10). It includes
a section on programmatically deriving the weights for characters that
do not have explicit entries in the table.
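
As a rough illustration of that derivation, here is a minimal sketch of the general fallback case in UTS #10, where a code point with no entry in allkeys.txt gets two primary weights computed from its value. The class and method names are hypothetical and are not part of mkgmap or CollationRules.java. The sketch covers only the base used for otherwise unlisted code points (FBC0); the report assigns different bases to CJK ideographs and a few other scripts, which are ignored here.

```java
// Hypothetical helper, not mkgmap code: derives the pair of implicit
// primary weights for a code point that has no explicit table entry.
final class ImplicitWeights {
    private ImplicitWeights() {}

    // Base for the general fallback case. Ideographs and some other
    // scripts use different bases in UTS #10; they are not handled here.
    private static final int BASE_UNASSIGNED = 0xFBC0;

    /** Returns {AAAA, BBBB}, the two primary weights for the code point. */
    static int[] primaryWeights(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a code point: " + codePoint);
        }
        // The high bits select the first weight, above the chosen base.
        int aaaa = BASE_UNASSIGNED + (codePoint >> 15);
        // The low 15 bits form the second weight, with the top bit set
        // so that it can never be zero.
        int bbbb = (codePoint & 0x7FFF) | 0x8000;
        return new int[] {aaaa, bbbb};
    }
}
```

Because both weights grow with the code point, unlisted characters keep their code point order relative to one another while sorting after everything that has a smaller explicit weight.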

 > Assuming the actual ordering of unspecified code-points doesn't really
 > matter, I propose to change the logic slightly so undefined Unicode is
 > sorted on its 16-bit value after the range of known sorts.

I think that is a good initial approach to get things working.
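
To make the proposal concrete, here is a minimal sketch of one way it could look, assuming the known characters already have positive sort positions and that every other character is keyed on its 16-bit char value, offset past the largest known position. FallbackOrder and its methods are hypothetical names for illustration and do not reflect how mkgmap's sort code is actually structured.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not mkgmap code: characters with a defined
// position sort first; all others follow, ordered by their char value.
final class FallbackOrder {
    private final Map<Character, Integer> known = new HashMap<>();
    private int maxKnown = 0;

    /** Registers an explicit sort position (1 or greater) for a character. */
    void define(char c, int position) {
        if (position < 1) {
            throw new IllegalArgumentException("position must be >= 1");
        }
        known.put(c, position);
        maxKnown = Math.max(maxKnown, position);
    }

    /**
     * Returns the sort key for a character. Call this only after all
     * define() calls, since undefined keys depend on the largest position.
     */
    int sortKey(char c) {
        Integer pos = known.get(c);
        // Undefined characters are placed after the range of known sorts.
        return pos != null ? pos : maxKnown + 1 + c;
    }

    int compare(char a, char b) {
        return Integer.compare(sortKey(a), sortKey(b));
    }
}
```

With this scheme every character gets some key, so a name containing unlisted characters can still be indexed, though the relative order of those characters is arbitrary rather than linguistically meaningful.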

Steve


