
[mkgmap-dev] How can we use prefix/suffix feature in road names?

From Gerd Petermann GPetermann_muenchen at hotmail.com on Fri Apr 14 05:57:16 BST 2017

Hi Carlos,

yes, at the moment I am not so much interested in the actual list, but maybe I can learn about special cases.

I am trying to find a format for the data that allows easy use and easy control of the feature.
I think I understand the needs for German, English and some Latin-based languages, but I
have no idea about the rest.
I have the feeling that we should group the list by language.
I assume that the common German prefixes like "Am, An, Auf, Bei, Zum, Zur" should be ignored in
the UK.
My current thinking: we can set up a file similar to LocatorConfig.xml which contains prefixes and suffixes grouped by
language, and add some information to LocatorConfig.xml to say which language(s) are used for road
names in each country.
Do you think that would work?

Gerd
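
A minimal sketch of the lookup such a pair of files could feed, assuming the data has already been loaded into two maps. The class name, the map names, the three-letter country codes and the sample entries for French and Portuguese are all hypothetical and only for illustration; nothing here is existing mkgmap code or an agreed file format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names exist in mkgmap.
final class AffixConfig {
    // Prefixes grouped by language (tiny sample; the German ones are from this thread).
    static final Map<String, List<String>> PREFIXES_BY_LANGUAGE = Map.of(
            "de", List.of("Am", "An", "Auf", "Bei", "Zum", "Zur"),
            "fr", List.of("Route du"),
            "pt", List.of("Rua de"));

    // Languages used for road names per country (assumed mapping and codes).
    static final Map<String, List<String>> LANGUAGES_BY_COUNTRY = Map.of(
            "DEU", List.of("de"),
            "CHE", List.of("de", "fr"),
            "GBR", List.of("en"));

    // Collect the prefixes of every language used in the given country.
    static List<String> prefixesForCountry(String country) {
        List<String> result = new ArrayList<>();
        for (String lang : LANGUAGES_BY_COUNTRY.getOrDefault(country, List.of())) {
            result.addAll(PREFIXES_BY_LANGUAGE.getOrDefault(lang, List.of()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(prefixesForCountry("CHE"));
        System.out.println(prefixesForCountry("GBR"));
    }
}
```

With this layout the German prefixes are never applied in the UK, because "GBR" only lists "en" and the sample contains no English entries.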
________________________________________
From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on behalf of Carlos Dávila <cdavilam at orangecorreo.es>
Sent: Thursday, 13 April 2017 17:10:12
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] How can we use prefix/suffix feature in road names?

Hi Gerd
I already have a list with some (~1000) prefixes for Spanish, Catalan,
Galician, French and Portuguese. It was obtained manually, checking the
list of roads (not only residential) shown in MapSource address search
box for Europe and Brazil maps (if I recall correctly). It's not exactly
a list of prefixes but a set of style rules to detect prefixes; the
actual prefixes can be easily extracted from it. If it can help, I can
send it to you.

On 13/04/17 at 16:35, Gerd Petermann wrote:
> Hi all,
>
> I've compiled two lists using this method:
> 1) Collect all names in OSM ways with highway=residential from europe.osm.pbf dated 2017-04-04. I used this command:
> osmfilter europe.o5m --ignore-dependencies --keep= --keep-ways="highway=residential" -o=residential_roads.o5m
>
> I've created a small Java program based on splitter to
> 2) Collect those names with at least one blank or apostrophe ( ' )
> 3) For each name: Find the position of a blank or apostrophe, create the prefix as the substring from start to position and the suffix as the substring from position to end. Set position to the next blank/apostrophe. Stop if none is found.
> Each calculated prefix / suffix is counted.
> Example : "Chemin de Piere Froide" gives
> prefixes "Chemin", "Chemin de", "Chemin de Piere" and
> suffixes "de Piere Froide", "Piere Froide", and "Froide"
>
> I've printed those strings with > 1000 occurrences, sorted by highest occurrence first.
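Steps 2) and 3) and the final report can be sketched roughly as below. This is not the actual program, only a reading of the description: the class and method names are made up, and it assumes that a blank belongs to neither part while an apostrophe stays with the prefix (the description does not say which). Like the original, it is case sensitive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the counting described above, not the original program.
final class AffixCounter {
    final Map<String, Integer> prefixCounts = new HashMap<>();
    final Map<String, Integer> suffixCounts = new HashMap<>();

    // Split the name at every blank or apostrophe and count both parts.
    void add(String name) {
        for (int pos = 0; pos < name.length(); pos++) {
            char c = name.charAt(pos);
            if (c != ' ' && c != '\'') {
                continue;
            }
            // Assumption: a blank is dropped, an apostrophe is kept in the prefix.
            String prefix = name.substring(0, c == ' ' ? pos : pos + 1);
            String suffix = name.substring(pos + 1);
            if (prefix.isEmpty() || suffix.isEmpty()) {
                continue;
            }
            prefixCounts.merge(prefix, 1, Integer::sum);
            suffixCounts.merge(suffix, 1, Integer::sum);
        }
    }

    // Entries with more than minCount occurrences, highest count first.
    static List<Map.Entry<String, Integer>> frequent(Map<String, Integer> counts, int minCount) {
        List<Map.Entry<String, Integer>> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > minCount) {
                result.add(e);
            }
        }
        result.sort((a, b) -> b.getValue().compareTo(a.getValue()));
        return result;
    }

    public static void main(String[] args) {
        AffixCounter counter = new AffixCounter();
        counter.add("Chemin de Piere Froide");
        System.out.println(frequent(counter.prefixCounts, 0));
        System.out.println(frequent(counter.suffixCounts, 0));
    }
}
```

For the real data the threshold would be 1000 rather than 0, and add() would be called once per collected road name.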
>
> It seems to me that we need some language experts to sort out which of the strings are useful prefix / suffix strings.
> I am pretty sure that the "prefix"
> "Rue Jean"     11476
> is not a good candidate but others with smaller numbers are okay, e.g.
> "Rua de"     5324
> ....
> "Route du"     3329
>
> The suffixes with multiple words are probably not useful, at least not in those languages that I know a little bit.
> Note that my algorithm is case sensitive.
>
> Maybe we can use these lists to set up a list of prefixes and suffixes ?
> I am now compiling those lists for a planet file from 2017-01-05.
>
> Gerd
>
>
>
> ________________________________________
> From: mkgmap-dev <mkgmap-dev-bounces at lists.mkgmap.org.uk> on behalf of Carlos Dávila <cdavilam at orangecorreo.es>
> Sent: Wednesday, 12 April 2017 23:08:45
> To: Development list for mkgmap
> Subject: Re: [mkgmap-dev] How can we use prefix/suffix feature in road names?
>
> On 12/04/17 at 21:48, Steve Ratcliffe wrote:
>> Hi
>>
>>> There is at least one visible effect of these 0x1e and 0x1f
>>> characters: When you zoom out MapSource removes the prefix / suffix
>>> part(s) from
>> In addition there is 0x1b which is like 0x1e, except that it does not
>> act as a space.  The only known example is "L'" being used as a
>> prefix:
>>
>>    Rue de L'Abbe Vincent
> There's also D': Rue D'Aberdeen, Allée D'Albert...
>> There is also 0x1c which is the non-spacing equivalent of 0x1f, I
>> don't know of any examples of that being used in street names.
>>
>> Since 0x1e and 0x1f are effectively spaces, I've created a patch to
>> make them sort along with and just after space.
>>
>> ..Steve
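
To make the four codes concrete, here is a small sketch of how a known prefix might be marked inside a road name. It is only an illustration of the description above: the class and method are hypothetical and not mkgmap code, and the reading that 0x1e/0x1b follow a prefix while 0x1f/0x1c precede a suffix is inferred from the "L'" example rather than stated outright.

```java
// Hypothetical sketch, not mkgmap code.
final class AffixMarkers {
    static final char PREFIX_SPACE = 0x1e;    // after a prefix, acts as a space
    static final char PREFIX_NO_SPACE = 0x1b; // after a prefix, no space (e.g. "L'")
    static final char SUFFIX_SPACE = 0x1f;    // before a suffix, acts as a space
    static final char SUFFIX_NO_SPACE = 0x1c; // before a suffix, no space

    // Return the name with a marker placed after the given prefix,
    // or the unchanged name if the prefix does not apply.
    static String markPrefix(String name, String prefix) {
        if (!name.startsWith(prefix) || name.length() == prefix.length()) {
            return name;
        }
        if (prefix.endsWith("'")) {
            // No blank follows the apostrophe, so insert the non-spacing code.
            return prefix + PREFIX_NO_SPACE + name.substring(prefix.length());
        }
        if (name.charAt(prefix.length()) == ' ') {
            // Replace the blank with the code that acts as a space.
            return prefix + PREFIX_SPACE + name.substring(prefix.length() + 1);
        }
        return name;
    }

    public static void main(String[] args) {
        String marked = markPrefix("Rue de L'Abbe Vincent", "Rue de L'");
        // Print the position of the marker instead of the raw control character.
        System.out.println(marked.indexOf(PREFIX_NO_SPACE));
    }
}
```

A suffix would be handled the same way with SUFFIX_SPACE or SUFFIX_NO_SPACE placed before it.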

_______________________________________________
mkgmap-dev mailing list
mkgmap-dev at lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

