[mkgmap-dev] character repertoires

Mon Feb 25 22:18:14 GMT 2013

On 13-02-25 21:03:40 CET, Steve Ratcliffe wrote:
> This is the current algorithm:
> 
> 1a. if ascii(no-code-page): all characters > 0x7f are transliterated
>      into ascii characters
> 1b. if code-page=1252: all characters > 0xff are transliterated into
>      latin1 characters.

I guess here’s the little weakness (which you also hint at yourself
elsewhere in your mail): "all characters > 0xff" is decided by their
Unicode code point, not by whether they’ve got any code point in the
target code page. :-)
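
To make the difference concrete, here is a minimal Java sketch contrasting
the two tests. The class and method names are made up for illustration and
are not taken from mkgmap; the only library call relied on is the standard
CharsetEncoder.canEncode(char). It assumes the JDK in use provides the
windows-1251 charset.

```java
import java.nio.charset.Charset;

public class ThresholdVsCodePage {
    // Threshold test as in step 1b: keep anything up to U+00FF.
    // (Hypothetical helper, named for this example only.)
    static boolean keptByThreshold(char c) {
        return c <= 0xff;
    }

    // Code page test: keep the character only if the target code page
    // actually has a code point for it.
    static boolean keptByCodePage(Charset cs, char c) {
        return cs.newEncoder().canEncode(c);
    }
}
```

With windows-1251 as the target, U+0416 (Cyrillic Zhe) is above 0xff and so
fails the threshold test although the code page can represent it, while
U+00E9 (e with acute) passes the threshold test although the code page
cannot represent it.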
I wonder how to improve the algorithm without making it much more CPU
intensive.
Does Java offer a fast code page mappability lookup?
If it were programmed in C (I haven’t written any Java code this
century), I might throw some RAM at it: initialize 64 KiB to zeroes
(to cover 16-bit Unicode), then set to 1 every entry whose Unicode code
point is the reverse mapping of a printable character code point of the
target code page.
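
The table idea above could look roughly like this in Java. This is only a
sketch under a few assumptions: the class name CodePageTable is invented,
the target is a single-byte code page, "printable" is approximated as
"not an ISO control character", and one boolean per code point is used
rather than a packed 64 KiB array.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

public class CodePageTable {
    // Build a lookup table over the 16-bit range: table[c] is true when
    // character c is the reverse mapping of some byte of the code page.
    public static boolean[] build(Charset cs) {
        boolean[] table = new boolean[0x10000];   // all false initially
        CharsetDecoder dec = cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        for (int b = 0; b < 256; b++) {
            try {
                CharBuffer cb = dec.decode(ByteBuffer.wrap(new byte[] {(byte) b}));
                if (cb.length() != 1) {
                    continue;                     // not a one-to-one mapping
                }
                char c = cb.get(0);
                // Skip control characters and the replacement character.
                if (!Character.isISOControl(c) && c != '\uFFFD') {
                    table[c] = true;
                }
            } catch (CharacterCodingException e) {
                // Byte b is unassigned in this code page; leave it unset.
            }
        }
        return table;
    }
}
```

After a one-off build per target code page, the per-character check is a
single array access, e.g. `table[ch]`, so the cost while writing labels
stays about the same as the current comparison against 0xff.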

> > Asian:
> > A map with CP1258 shows up with totally unlabeled streets, not even
> > anything from the ASCII range.
> 
> Strange - are labels correct in the file? If you run strings on the img
> do you see the ascii labels? If so then it is a device thing.

Yes – strings on the generated cp1258.img look pretty similar to the
output of `strings cp1251.img`.

You can all try it yourselves, using the attached little package.

rj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: codepages.tar.bz2
Type: application/octet-stream
Size: 2432 bytes
Desc: not available
Url : http://lists.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20130225/ddabb047/attachment.obj