[mkgmap-dev] StandardCharsets and try (with-resources)

From Ticker Berkin rwb-mkgmap at jagit.co.uk on Sun Jan 19 18:30:12 GMT 2020

Hi Gerd

Here is a new version of the patch, with line.trim() restored and
ExitException thrown on IOException in sortForCodepage().
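
For readers without the patch to hand, here is a minimal sketch of the
kind of reader loop being discussed. It is not the mkgmap code: the
class and method names are hypothetical, and java.io.UncheckedIOException
stands in for mkgmap's ExitException. It trims each line, skips blank
lines and '#' comments, drops a leading BOM, and turns an IOException
into a thrown exception instead of retrying.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of mkgmap.
class TrimmedLineReader {
    // Returns the non-blank, non-comment lines of the source, trimmed.
    public static List<String> readEntries(Reader source) {
        List<String> entries = new ArrayList<>();
        // try-with-resources closes the reader even if readLine() fails
        try (BufferedReader br = new BufferedReader(source)) {
            boolean first = true;
            String line;
            while ((line = br.readLine()) != null) {
                // A BOM decodes to U+FEFF; trim() does not remove it.
                if (first && line.startsWith("\uFEFF")) {
                    line = line.substring(1);
                }
                first = false;
                line = line.trim();
                if (line.isEmpty() || line.startsWith("#")) {
                    continue; // blank line or comment
                }
                entries.add(line);
            }
        } catch (IOException e) {
            // Fail loudly (stand-in for ExitException) rather than
            // calling the same method again.
            throw new UncheckedIOException("cannot read entries", e);
        }
        return entries;
    }
}
```

The trim() step matters because trailing whitespace would otherwise
become part of the entry and no longer match what the caller expects.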

@mike - It is likely that this will fix your problem with the display
of option text with non-ASCII characters; with the previous code, mkgmap
*read* the text incorrectly unless your local charset was UTF-8.
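
A small sketch of the effect, assuming a UTF-8 encoded option file and
using ISO-8859-1 to play the part of a non-UTF-8 local charset. The
class name, method name and sample text are made up for illustration;
only the java.nio.charset and java.io calls are real.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of mkgmap.
class CharsetReadDemo {
    // Decodes the bytes with the given charset and returns the first line.
    public static String firstLine(byte[] data, Charset cs) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), cs))) {
            return br.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // \u00df is the German sharp s; it takes two bytes in UTF-8.
        byte[] utf8Bytes = "description=Stra\u00dfe".getBytes(StandardCharsets.UTF_8);

        // Explicit charset: the text round-trips unchanged.
        System.out.println(firstLine(utf8Bytes, StandardCharsets.UTF_8));

        // Wrong charset: each UTF-8 byte becomes its own character,
        // so the result is one character longer and garbled.
        System.out.println(firstLine(utf8Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

Passing StandardCharsets.UTF_8 explicitly makes the result independent
of the machine's default charset, which is the point of the change.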

Ticker

On Fri, 2020-01-17 at 17:04 +0000, Ticker Berkin wrote:
> Hi Gerd
> 
> The line.trim() deletion wasn't intended - I'll put it back. 
> 
> I think it best to change the sortForCodepage IOException handling to
> throw ExitException. Maybe they meant to return some default "Sort", ie
> sortForCodepage(1252), but this seems wrong.
> 
> I started looking at CombinedStyleFileLoader. It does its Input and
> Output in the default charset and I don't know if anyone uses it
> anymore, but I didn't want to change any of its behaviour, so I thought
> it best not to touch it.
> 
> Regarding a new class for files that use '#' for comments: some of
> these already use TokenScanner, which can be configured. The only other
> one that a quick grep finds is the character transliteration tables, so
> I don't think it is worth it at the moment.
> 
> Ticker
> 
> On Fri, 2020-01-17 at 16:20 +0000, Gerd Petermann wrote:
> > Hi Ticker,
> > 
> > - I think there is a small change in the handling of lines in
> > OsmMapDataSource.readDeleteTagsFile. The old code used
> > line = line.trim();
> > This is missing now. Is that intended?
> > 
> > - I also don't understand the line with your comment "// ??? I don't
> > understand this". Looks like an endless recursive call?
> > 
> > - You sometimes replaced FileReader, but not in
> > CombinedStyleFileLoader. Why not?
> > 
> > We have a few places where we read files which use "#" for comment
> > lines.  Would it help to create a class for that?
> > 
> > I made a few minor mods, see attachment.
> > 
> > Gerd
> > 
> > ________________________________________
> > From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on behalf
> > of Ticker Berkin <rwb-mkgmap at jagit.co.uk>
> > Sent: Friday, 17 January 2020 13:53
> > To: Development list for mkgmap
> > Subject: [mkgmap-dev] StandardCharsets and try (with-resources)
> > 
> > Hi Gerd
> > 
> > Attached patch
> > 
> > - uses StandardCharsets.* where possible.
> > 
> > - notes some usage of the java local DefaultCharset.
> > 
> > - changed a couple of these to force utf-8 instead.
> > 
> > - if --read-config file gives decoding errors, names the charset used
> > to read the file (ie DefaultCharset) instead of 'utf-8' in the error
> > message.
> > 
> > - accepts/ignores unicode BOM in more files
> > 
> > - uses try (open...) {} where possible in files changed for the above
> > reasons.
> > 
> > There is some code in mkgmap/srt/SrtTextReader.java:sortForCodepage()
> > that I don't understand; it would appear to get into a recursive loop
> > on IOException.
> > 
> > Ticker
> > 
> > On Tue, 2020-01-14 at 09:55 +0000, Gerd Petermann wrote:
> > > Hi Ticker,
> > > 
> > > yes, and every missing close() is a brain teaser ;)
> > > We have a few places where files are opened in one method and closed
> > > in another. This is likely to cause trouble in unit tests, esp. on
> > > Windows.
> > > Wherever possible we should use try-with-resources instead of
> > > Utils.closeFile() and add a comment like the one in SeaGenerator,
> > > zipFile = new ZipFile(precompSeaDir); // don't close here!
> > > when a file is intentionally kept open.
> > > 
> > > Gerd
> > > ________________________________________
> > > From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on
> > > behalf of Ticker Berkin <rwb-mkgmap at jagit.co.uk>
> > > Sent: Tuesday, 14 January 2020 10:43
> > > To: Development list for mkgmap
> > > Subject: Re: [mkgmap-dev] TYP files and character encoding
> > > 
> > > Hi Gerd
> > > 
> > > Here is an updated patch that closes the file, although I find many
> > > files in mkgmap that don't have an explicit close(), but I presume
> > > .finalize() will close them eventually.
> > > 
> > > I'll do another patch for other text file handling, using
> > > StandardCharsets where possible and fixing the TokenScanner message
> > > for bad characters if not utf-8 and, if reasonable, allowing a BOM
> > > even if the file is opened as utf-8 anyway.
> > > 
> > > Ticker
> > > 
> > > On Tue, 2020-01-14 at 08:21 +0000, Gerd Petermann wrote:
> > > > Hi Ticker,
> > > > 
> > > > thanks for the patch.
> > > > 
> > > > Please review TypCompiler.CharsetProbe.  BufferedReader br is not
> > > > closed. Is that intended?
> > > > 
> > > > I see that we have a mix of "utf-8" and "UTF-8" in the mkgmap
> > > > sources. I think it would be good to use StandardCharsets.UTF_8
> > > > where possible and unify the rest.
> _______________________________________________
> mkgmap-dev mailing list
> mkgmap-dev at lists.mkgmap.org.uk
> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: utf8_v3.patch
Type: text/x-patch
Size: 24368 bytes
Desc: not available
URL: <http://www.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20200119/eede4cbe/attachment-0001.bin>

