[mkgmap-dev] New branch for default typ file

From Randolph J. Herber army.bronze.star at gmail.com on Tue Dec 17 21:20:49 GMT 2019

Dear Sirs:

There has been a thread of discussion about whether there should be a
/Byte Order Mark/ /(BOM)/ at the beginning of a UTF-8 file.

This discussion is complicated by the fact that the developers work
variously on Unix, Linux, BSD, iOS, Solaris and Windows. These operating
systems have UTF-8 handling libraries written at different times and to
different Unicode standards. Originally the Unicode standard said that
UTF-8 should *not* have a BOM character at the beginning of a file.
Later Unicode changed the standard so that a BOM is permissible, but
neither required nor recommended. Microsoft added a BOM to the beginning
of UTF-8 files before doing so was permissible, to ease the problem of
recognizing a UTF-8 file. This broke the other operating systems'
handling of UTF-8. Microsoft petitioned for the permissibility of a BOM
to avoid changing their file handling.

At this time, I believe that all programs should use Unicode and not
Microsoft code pages. I have had problems with Microsoft code pages
since the MS-DOS days.

Splitter and mkgmap are written in Java. Java still follows the original
Unicode standard of no BOM at the beginning of a UTF-8 text file. This
is a "will not fix" situation per the Java language developers. This
situation results in problems with Java, particularly in a Microsoft
Windows environment.
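
The behaviour described above can be observed with a small sketch. The
class and method names here are hypothetical and serve only as
illustration. The standard UTF-8 decoder keeps the BOM as an ordinary
U+FEFF character at the start of the decoded text:

```java
import java.nio.charset.StandardCharsets;

class BomDecodeDemo {
    /** Decodes bytes with the JDK's standard UTF-8 charset. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 encoding of U+FEFF; it is followed by 'a'.
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a'};
        String text = decodeUtf8(withBom);
        System.out.println(text.length());              // prints 2
        System.out.println(text.charAt(0) == '\uFEFF'); // prints true
    }
}
```

Any parser that expects the first character of the file to be
meaningful therefore sees U+FEFF instead.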

The code fragments below provide Java solutions for writing a BOM at the
beginning of UTF-8 text files so that Microsoft native text editors
can handle them and, on reading a text file, provide an automatic way of
ignoring an optional BOM by checking for the BOM after file opening.
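
A minimal sketch of the reading side, assuming the input is opened as a
character stream. The class BomTolerantReader and its methods are
hypothetical names, not existing splitter or mkgmap code. It peeks at
the first character and puts it back unless it is U+FEFF:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BomTolerantReader {
    /** Opens a UTF-8 reader and discards a leading U+FEFF if one is present. */
    static BufferedReader open(InputStream in) {
        try {
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            reader.mark(1);                 // remember the start of the text
            if (reader.read() != '\uFEFF') {
                reader.reset();             // no BOM: rewind to the first character
            }
            return reader;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience helper for the demo: first line of the data, BOM ignored. */
    static String firstLine(byte[] data) {
        try (BufferedReader reader = open(new ByteArrayInputStream(data))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] plain = "CodePage=65001\n".getBytes(StandardCharsets.UTF_8);
        byte[] withBom = "\uFEFFCodePage=65001\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstLine(plain));   // prints CodePage=65001
        System.out.println(firstLine(withBom)); // prints CodePage=65001
    }
}
```

With this approach a file reads the same whether or not the editor that
saved it added a BOM.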

A test for execution in a Windows environment is provided below if one 
decides to add a BOM only on Microsoft Windows.
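
A sketch of the writing side under the same caveat: BomAwareWriter,
isWindows and encode are hypothetical names. It assumes that the
os.name property begins with "Windows" on Microsoft systems, and it
writes U+FEFF first only when asked to:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class BomAwareWriter {
    /** Assumption: os.name starts with "Windows" on Microsoft Windows. */
    static boolean isWindows(String osName) {
        return osName != null
                && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Encodes text as UTF-8, optionally preceded by a BOM. */
    static byte[] encode(String text, boolean withBom) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            if (withBom) {
                out.write('\uFEFF'); // encoded by the writer as EF BB BF
            }
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        boolean addBom = isWindows(System.getProperty("os.name"));
        byte[] data = encode("CodePage=65001\n", addBom);
        System.out.println("BOM written: " + addBom + ", bytes: " + data.length);
    }
}
```

For a real file the ByteArrayOutputStream would be replaced by a
FileOutputStream; the rest stays the same.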

I have not downloaded the splitter and mkgmap sources and searched for
the appropriate places in their sources to apply the changes. I feel the
main splitter and mkgmap developers are better placed to make these
changes. This is the reason that I did not provide patches to the sources.

Randolph J. Herber.

On 12/14/2019 9:44 AM, Randolph J. Herber wrote:
>
> Dear Sirs:
>
> Re UTF-8
>
> Microsoft does this to handle another problem: Microsoft uses many
> different character set encodings (i.e., code pages, e.g., CP1252) and
> the BOM is used by Microsoft to indicate that the "code page" is
> UTF-8. Java was implemented to an older version of the Unicode
> standard that prohibited a UTF-8 BOM. The problem comes from
> moving back and forth across that cultural divide. Yes, this is painful.
>
> A solution to the reading issue from the Java side:
>
> https://stackoverflow.com/questions/1835430/byte-order-mark-screws-up-file-reading-in-java
>
> A solution for writing a UTF-8 BOM in Java:
>
> BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
>     new FileOutputStream(theFile), StandardCharsets.UTF_8));
> out.write('\ufeff');
>
> A check for execution in a Windows environment:
>
> String os = System.getProperty("os.name").toLowerCase(Locale.ROOT);
> boolean isWindows = os.startsWith("windows");
>
> Perhaps, on output, write the BOM in a Windows environment and treat
> the BOM as optional on input.
>
> http://www.unicode.org/faq/utf_bom.html
>
> Q: What are some of the differences between the UTFs?
>
> A: The following table summarizes some of the properties of each of 
> the UTFs.
>
> Name                        UTF-8   UTF-16   UTF-16BE    UTF-16LE       UTF-32   UTF-32BE    UTF-32LE
> Smallest code point         0000    0000     0000        0000           0000     0000        0000
> Largest code point          10FFFF  10FFFF   10FFFF      10FFFF         10FFFF   10FFFF      10FFFF
> Code unit size              8 bits  16 bits  16 bits     16 bits        32 bits  32 bits     32 bits
> Byte order                  N/A     <BOM>    big-endian  little-endian  <BOM>    big-endian  little-endian
> Fewest bytes per character  1       2        2           2              4        4           4
> Most bytes per character    4       4        4           4              4        4           4
>
> In the table <BOM> indicates that the byte order is determined by a 
> byte order mark, if present at the beginning of the data stream, 
> otherwise it is big-endian.
>
> http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf
>
> Table 2-4.  The Seven Unicode Encoding Schemes
>
> Encoding Scheme       Endian Order       BOM Allowed?
>
> UTF-8                 N/A                yes
>
> The remainder of the table omitted.
>
> https://docs.microsoft.com/en-us/windows/win32/intl/using-byte-order-marks
>
>
>   Using Byte Order Marks
>
>   * 05/30/2018
>
> Always prefix a Unicode plain text file with a byte order mark, which 
> informs an application receiving the file that the file is 
> byte-ordered. Available byte order marks are listed in the following 
> table. Because Unicode plain text is a sequence of 16-bit code values, 
> it is sensitive to the byte ordering used when the text is written.
>
> Note
>
> A byte order mark is not a control character that selects the byte 
> order of the text.
>
> Byte order mark   Description
> EF BB BF          UTF-8
> FF FE             UTF-16, little endian
> FE FF             UTF-16, big endian
> FF FE 00 00       UTF-32, little endian
> 00 00 FE FF       UTF-32, big endian
>
>
>
>
> On 12/14/2019 3:41 AM, Ticker Berkin wrote:
>> Hi Joris & Gerd
>>
>> Great to see the typ-files now in trunk and all the work in updating
>> mapnik.txt to the current default style. Next week I plan to go through
>> "20191209 mapnik update.pdf" and comment on it and possible changes to
>> the default style.
>>
>> Some other questions however:
>>
>> How do you see mapnik.txt now being maintained; will it be as a simple
>> .txt file with patches being supplied in the same way as other source
>> files, or will it be regenerated from your translation spreadsheet and
>> other sources? I'd prefer the simple text file approach, but this might
>> allow changes into the file which make it incompatible with the tools
>> Joris uses to enhance it.
>>
>> It is currently in UTF-8 format, with an appropriate BOM at the start of
>> the file. I don't know how the Java input libraries determine the
>> conversion rules to internal Unicode, but this file should be
>> consistent with all the others that contain characters outside the
>> simple 7-bit ASCII range (roadNameConfig.txt, default/inc/address)
>>
>> It contains the statement:
>>> CodePage=65001
>> This says the output should be Unicode, but the output encoding should
>> be the same as that of the associated map.
>>
>> Also the FID should be removed.
>>
>> Regards
>> Ticker
>>
>> On Tue, 2019-12-10 at 09:59 +0000, Gerd Petermann wrote:
>>> Hi Joris,
>>>
>>> the file mapnik.txt says "Based on mkgmap default style version:
>>> r4262"
>>> Is it the right file?
>>>
>>> reg. line type 0x0b: highway=motorway_link & (mkgmap:exit_hint=true |
>>> mkgmap:dest_hint=*)
>>> I want to look at the DestinationHook. If I got that right it should
>>> be OK to have a zero-length road with that type to get the wanted
>>> destination hint. In that case we don't have to care about rendering.
>>>
>>> Gerd
>>>
>>> ________________________________________
>>> From: Joris Bo<jorisbo at hotmail.com>
>>> Sent: Monday, 9 December 2019 20:45
>>> To: Development list for mkgmap; Gerd Petermann
>>> Subject: RE: [mkgmap-dev] New branch for default typ file
>>>
>>> Hi All,
>>>
>>> I don't think any changes are needed in mkgmap itself. When the
>>> draworder of bay is lower than water it will display correctly.
>>> See attached new typ-file for correct usage.
>>> Even better (but this is a change in default style): don't use
>>> natural = bay in polygons but only in points for displaying as name.
>>>
>>> Today I spent some time testing and repairing.
>>>
>>> The mapnik.txt in branch mkgmap-default-typ-r4268 was pretty old and
>>> also did not have the translations of all the languages anymore. It
>>> also lost draworder of a lot of polygons which made the bay-problem
>>> occur.
>>>
>>> I did a complete recheck of the most recent default-style in: mkgmap
>>> -r4386.zip and changed the typ-file accordingly.
>>>
>>> I downloaded a full europe-latest from Geofabrik today, built it as
>>> one big full Europe map with the default style of r4386 and with
>>> mkgmap r4386.jar. No errors occurred.
>>>
>>> I think it’s up to date again but some review and comments are always
>>> welcome.
>>>
>>> See typ-file in attachment,
>>>
>>> Kind regards,
>>> Joris
>>>
>>>
>>>
>>>
>>>
>>> -----Original message-----
>>> From: mkgmap-dev<mkgmap-dev-bounces at lists.mkgmap.org.uk>  On behalf of Pinns
>>> UK
>>> Sent: Monday, 9 December 2019 18:31
>>> To: Gerd Petermann<gpetermann_muenchen at hotmail.com>;
>>> mkgmap-dev at lists.mkgmap.org.uk
>>> Subject: Re: [mkgmap-dev] New branch for default typ file
>>>
>>> Hi Gerd
>>>
>>> Yes, you can do that with a draw level 1 higher than sea.
>>>
>>> Draw orders are defined at the beginning of a (txt) typ file just
>>> before the polygons
>>>
>>> using the following format
>>>
>>> Type=0x<type number>,<draworder>
>>>
>>> It is good practice to sort the draworders, as that is how they
>>> appear in a typ file
>>>
>>> [_drawOrder]
>>> Type=0x03,1
>>> Type=0x28,1
>>> Type=0x54,1
>>> Type=0x01,2
>>> Type=0x09,2
>>> Type=0x4E,2
>>> Type=0x10F1C,2
>>> etc etc
>>> [end]
>>> I have no idea what the draworder for sea is, but just make it one
>>> higher
>>>
>>> On 09/12/2019 16:41, Gerd Petermann wrote:
>>>> Hi Nick,
>>>>
>>>> I don't want to cut out islands from bay polygons, I thought about
>>>> a proper typ for 0x3d which somehow marks "calmer water"
>>>> and a draw order that puts this above water and below any land type
>>>> polygon.
>>>> Is that possible?
>>>>
>>>> Gerd
>>>>
>>>> ________________________________________
>>>> From: Pinns UK<osm at pinns.co.uk>
>>>> Sent: Monday, 9 December 2019 16:17
>>>> To: Gerd Petermann;mkgmap-dev at lists.mkgmap.org.uk
>>>> Subject: Re: AW: AW: [mkgmap-dev] New branch for default typ file
>>>>
>>>> Hi Gerd
>>>>
>>>> Yes, I suppose so
>>>>
>>>> On 09/12/2019 15:14, Gerd Petermann wrote:
>>>>> Hi Nick,
>>>>>
>>>>> my understanding is that you always have another water polygon,
>>>>> either ocean or natural=water.
>>>>>
>>>>> Gerd
>>>>>
>>>>> ________________________________________
>>>>> From: Pinns UK<osm at pinns.co.uk>
>>>>> Sent: Monday, 9 December 2019 16:04
>>>>> To: Gerd Petermann;mkgmap-dev at lists.mkgmap.org.uk
>>>>> Subject: Re: AW: [mkgmap-dev] New branch for default typ file
>>>>>
>>>>> Hi Gerd
>>>>>
>>>>> In case of 2) you need 2 polygons for doing each job; one showing
>>>>> 'water' and the other one not
>>>>>
>>>>> Ideally, mkgmap would check whether islands are in a 'bay' area
>>>>>
>>>>> In my area we have lots of natural=bay areas; fortunately they do
>>>>> not include islands
>>>>>
>>>>> On 09/12/2019 14:51, Gerd Petermann wrote:
>>>>>> Hi,
>>>>>>
>>>>>> thanks for the help.
>>>>>> I see two ways to handle a polygon with natural=bay:
>>>>>> 1) in points style with natural=bay & name=*  [....]
>>>>>> 2) in polygons (as it is now) with natural=bay [0x3d resolution
>>>>>> 18]
>>>>>>
>>>>>> In case of 1) we just need the option add-pois-to-areas. In case
>>>>>> of 2) we would want to render the water area covered by the bay
>>>>>> polygon differently, but not anything on the land or on islands.
>>>>>> Would that be possible?
>>>>>>
>>>>>> Gerd
>>>>>>
>>>>>>
>>>>>>
>>>>>> ________________________________________
>>>>>> From: mkgmap-dev<mkgmap-dev-bounces at lists.mkgmap.org.uk>  on
>>>>>> behalf
>>>>>> of Pinns UK<osm at pinns.co.uk>
>>>>>> Sent: Monday, 9 December 2019 15:42
>>>>>> To:mkgmap-dev at lists.mkgmap.org.uk
>>>>>> Subject: Re: [mkgmap-dev] New branch for default typ file
>>>>>>
>>>>>> Andrzej is correct about how transparency is defined
>>>>>>
>>>>>> Garmin regards all polygons with transparency as bitmaps, which
>>>>>> therefore require 2 colours.
>>>>>>
>>>>>> The bitmap needs to be shown below the xpm
>>>>>>
>>>>>> If a polygon is completely transparent then a second 'dummy'
>>>>>> colour is still needed
>>>>>>
>>>>>> Xpm="32 32 2 1"
>>>>>> "0 c none"
>>>>>> "1 c #C8C8C8"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> "00000000000000000000000000000000"
>>>>>> ;12345678901234567890123456789012
>>>>>> [end]
>>>>>>
>>>>>> On 09/12/2019 14:19, Andrzej Popowski wrote:
>>>>>>> Hi Gerd,
>>>>>>>
>>>>>>> I use TypViewer for creating typ files and I don't know XPM
>>>>>>> details. But looking at TypViewer output, I guess that
>>>>>>> transparent pixels are defined with color like that:
>>>>>>>
>>>>>>> "  c none"
>>>>>>>
>>>>>>> where space ' ' is used for marking pixels.
>>>>>>>
>>>>>>> Changing draw order instead of transparent graphics could be
>>>>>>> a solution too, but I'm not sure if covered polygon label
>>>>>>> would remain visible. And without label, there is not much
>>>>>>> use of this object.
>>>>>>>
>>>>>> _______________________________________________
>>>>>> mkgmap-dev mailing list
>>>>>> mkgmap-dev at lists.mkgmap.org.uk
>>>>>> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20191217/ce898492/attachment-0001.html>

