
[mkgmap-dev] Name Substitution not correctly working

From Felix Hartmann extremecarver at gmail.com on Sun Jul 27 14:54:04 BST 2014

1. Yes - had I set notepad to default to UTF-8, I probably would have
avoided the bug (as long as you don't use the "create new document" dialog
on right click in Windows - those files will always be in ANSI unless you
do some registry hacks).
And yes - the mkgmap style-file is in UTF-8 - but as a Windows user you
usually don't notice, because it is without BOM. As long as there is no
umlaut or other special character in it, notepad++, or probably most
Windows users' editors, will open the file as ANSI, because without any
such character the file is actually still identical. Were the mkgmap
style-file in UTF-8 with BOM, it would be clearer... (but I don't want
to start a with-or-without-BOM discussion here).
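
To illustrate why an editor cannot tell the two encodings apart for such a
file, here is a minimal sketch in Java. It assumes windows-1252 as the
"ANSI" code page (the real one depends on the Windows locale), and the
class and method names are made up for this example, not taken from mkgmap.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiSubsetDemo {
    // Assumption: "ANSI" means windows-1252 here; other locales use other code pages.
    static final Charset ANSI = Charset.forName("windows-1252");

    /** True when the text encodes to the same bytes in UTF-8 (no BOM) and in the assumed ANSI code page. */
    static boolean sameBytes(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] ansi = text.getBytes(ANSI);
        return Arrays.equals(utf8, ansi);
    }

    public static void main(String[] args) {
        // Pure ASCII line: both encodings produce identical bytes.
        System.out.println(sameBytes("highway=path { name '${name}' }")); // true
        // Line with umlauts (written as Unicode escapes so this source file stays ASCII).
        System.out.println(sameBytes("# check - \u00D6\u00C4\u00DC\u00E8")); // false
    }
}
```

So only once a non-ASCII character is in the file do the bytes on disk
differ, and only then does the choice of encoding start to matter.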

So right now only the address file in the style is quite safe - because 
recently there were some special characters added.
/mkgmap:country=POL & mkgmap:region!=* & mkgmap:admin_level4=* { set 
mkgmap:region='${mkgmap:admin_level4|subst:województwo =>}' }/


But as long as there is no working check - and the mkgmap default
style-file comes in UTF-8 without BOM - there is quite a big danger the bug
will happen to others too... (For my style I have now set it to UTF-8, plus,
for added security (though it won't matter), I added a line: /#this is a
UTF-8 check - ÖÄÜè/
so should any editor actually change the encoding to ANSI, I would
notice directly...) So such a line at the start could be an alternative
to UTF-8 with BOM.
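
Such a check line could in principle also be tested by a program. A rough
sketch in Java, assuming the check characters are ÖÄÜè and that "ANSI" is
windows-1252; the class and method names are hypothetical and this is not
something mkgmap does today.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CanaryCheck {
    // The check characters from the comment line (ÖÄÜè), as Unicode escapes.
    static final String CANARY = "\u00D6\u00C4\u00DC\u00E8";

    /** True if the raw file bytes, decoded as UTF-8, still contain the check characters. */
    static boolean canaryIntact(byte[] fileBytes) {
        // Malformed bytes are replaced with U+FFFD by this constructor, so a
        // file re-saved as ANSI no longer contains the original characters.
        String decoded = new String(fileBytes, StandardCharsets.UTF_8);
        return decoded.contains(CANARY);
    }

    public static void main(String[] args) {
        String line = "#this is a UTF-8 check - " + CANARY + "\n";
        byte[] savedAsUtf8 = line.getBytes(StandardCharsets.UTF_8);
        // Assumption: the editor's "ANSI" is windows-1252.
        byte[] savedAsAnsi = line.getBytes(Charset.forName("windows-1252"));
        System.out.println(canaryIntact(savedAsUtf8)); // true
        System.out.println(canaryIntact(savedAsAnsi)); // false
    }
}
```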


2. About the patch:
Mmmh - that patch goes a bit too far... - I think it actually stops on
errors in the input file (not the style) too (note the time stamp 30
seconds later):
14:49:25 china cn 6555 this is run101 starting to compile openmtmbap with mkgmap
Exception in thread "main" uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089): Bad character in input, file probably not in utf-8
         at uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)
         at uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)
         at uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)
         at uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)
         at uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:105)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:97)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)
         at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)
         at uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)
         at uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)
         at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)
         at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)
         at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)
Could Not Find C:\OpenMTBMap\maps\ovm_6555*.img
14:49:55 china cn 6555 Finished Compiling Openmtbmap - this is run101
mapsetbuilding failed - to few maxnodes??
Press any key to continue . . .


vs (input file in ANSI):
15:11:38 china cn 6555 this is run101 starting to compile openmtmbap with mkgmap
Exception in thread "main" uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089): Bad character in input, file probably not in utf-8
         at uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)
         at uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)
         at uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)
         at uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)
         at uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:105)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:97)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)
         at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)
         at uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)
         at uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)
         at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)
         at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)
         at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)
Could Not Find C:\OpenMTBMap\maps\ovm_6555*.img
15:11:42 china cn 6555 Finished Compiling Openmtbmap - this is run101
mapsetbuilding failed - to few maxnodes??



However, now that I once had a file in ANSI (even though I changed it back
to UTF-8), some residue in memory means I always get the error directly -
even with the default style...

C:\OpenMTBMap\maps>start /low /b /wait java -jar 
-XX:StringTableSize=100003 -Xms6000M -Xmx10300M c:\openmtbmap\mkgmap.jar 
--max-jobs=8 "--generate-sea" "--code-page=65001" 
"--precomp-sea=c:\openmtbmap\maps\sea.zip" --nsis --index 
--levels="0:24, 1:23, 2:22, 3:21, 4:20, 5:19, 6:18" 
--overview-levels="7:17, 8:16, 9:15, 10:14, 11:13, 12:12" 
--adjust-turn-headings --add-pois-to-areas 
--reduce-point-density=3.4 --reduce-point-density-polygon=6 
--housenumbers --link-pois-to-ways --ignore-turn-restrictions 
--polygon-size-limits="24:16, 23:14, 22:12, 21:11, 20:10, 19:9, 18:8, 17:7, 16:6, 15:5, 14:4, 13:3, 12:2, 11:0, 10:0" 
--description=openmtbmap_gcc --show-profiles=1 
--location-autofill=bounds,is_in,nearest 
--bounds=c:\openmtbmap\maps\bounds.zip --route --country-abbr=gcc 
--country-name=gcc-states 
--mapname=65560000 --family-id=6556 --product-id=1 
--series-name=openmtbmap_gcc-states_27.07.2014 
--family-name=mtbmap_gcc_27.07.2014 --tdbfile --overview-mapname=mapsetc 
--keep-going --area-name="gcc-states_27.07.2014_openmtbmap.org" -c 
e:\openmtbmap\maps\template.gcc-states 7*.img  1>NUL
Exception in thread "main" uk.me.parabola.mkgmap.scan.SyntaxException: Error: (stream:10089): Bad character in input, file probably not in utf-8
         at uk.me.parabola.mkgmap.scan.TokenScanner.readChar(TokenScanner.java:239)
         at uk.me.parabola.mkgmap.scan.TokenScanner.readTok(TokenScanner.java:189)
         at uk.me.parabola.mkgmap.scan.TokenScanner.fillTok(TokenScanner.java:154)
         at uk.me.parabola.mkgmap.scan.TokenScanner.ensureTok(TokenScanner.java:150)
         at uk.me.parabola.mkgmap.scan.TokenScanner.isEndOfFile(TokenScanner.java:111)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.read(SrtTextReader.java:145)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:105)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.<init>(SrtTextReader.java:97)
         at uk.me.parabola.mkgmap.srt.SrtTextReader.sortForCodepage(SrtTextReader.java:126)
         at uk.me.parabola.mkgmap.main.Main.getSort(Main.java:638)
         at uk.me.parabola.mkgmap.main.Main.processFilename(Main.java:246)
         at uk.me.parabola.mkgmap.CommandArgsReader$Filename.processArg(CommandArgsReader.java:256)
         at uk.me.parabola.mkgmap.CommandArgsReader.readArgs(CommandArgsReader.java:125)
         at uk.me.parabola.mkgmap.main.Main.mainStart(Main.java:134)
         at uk.me.parabola.mkgmap.main.Main.main(Main.java:105)


On 27.07.2014 12:32, Steve Ratcliffe wrote:
> On 26/07/14 18:43, Felix Hartmann wrote:
>> Okay - I used ANSI. Could there maybe be a check for this in the check
>> styles routine, or in general?
>> I do suppose that must have been the problem.
>
> Although it is not always possible to tell if a file is in the wrong
> encoding, it should have been in this case.  I see that the ì
> character gets converted to a unicode replacement character (0xfffd)
>
> If you had done:
>     echo 'Shì'
>
> it would have come out something like: Sh� (hope that works in email)
> and shown the problem.
yes - clearly. (and works in email somehow).
>
> There are a couple of ways to make bad characters an error, rather
> than getting replaced.  The attached patch allows them to
> be replaced and then throws an error when seen. This has the
> advantage of giving you file name and line number of the error.
> It might interfere with something valid, so give it a try.
>
> I don't use notepad++, but these links might be useful:
>
> http://superuser.com/questions/292086/how-can-i-enforce-so-notepad-uses-utf-8-every-time-i-create-a-new-file 
>
>
> http://stackoverflow.com/questions/5090845/change-the-default-encoding-for-notepad 
>
>
> ..Steve
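
The approach Steve describes (let the decoder replace bad bytes, then
treat the replacement character as an error and report where it was seen)
could look roughly like the sketch below. This is only an illustration in
plain Java, not the actual patch and not mkgmap's TokenScanner code; the
class and method names are invented. It also shows the caveat he mentions:
a file that legitimately contains U+FFFD would be flagged as well.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BadCharScanner {
    // The Unicode replacement character that the decoder substitutes for malformed bytes.
    static final char REPLACEMENT = '\uFFFD';

    /** Returns the 1-based number of the first line containing U+FFFD, or -1 if there is none. */
    static int firstBadLine(byte[] data) throws IOException {
        // InputStreamReader with an explicit charset replaces malformed input instead of throwing.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String line;
            int lineNumber = 0;
            while ((line = in.readLine()) != null) {
                lineNumber++;
                if (line.indexOf(REPLACEMENT) >= 0) {
                    return lineNumber;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        String text = "ok\nSh\u00EC\n"; // second line is "Shì"
        // Assumption: the file was saved as windows-1252 ("ANSI") by mistake.
        byte[] ansi = text.getBytes(Charset.forName("windows-1252"));
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(firstBadLine(ansi)); // 2
        System.out.println(firstBadLine(utf8)); // -1
    }
}
```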

-- 
keep on biking and discovering new trails

Felix
openmtbmap.org & www.velomap.org


