[mkgmap-dev] [index] Automatic location completion

From Johann Gail johann.gail at gmx.de on Tue Mar 1 18:50:24 GMT 2011

On 01.03.2011 19:38, Johann Gail wrote:
>
>>> 1-The list of regions (state/country field) is much better than the one
>>> obtained with trunk. All those included are actual regions (some with
>>> two different names, e.g. Castilla la Mancha & Castilla-la Mancha).
>>> Trunk includes many names that are not actual regions of Spain, but
>>> provinces, cities or even villages.
>> That's fine! I don't understand why you get two different but similar
>> names. I think this is caused by addr: tags that don't use the same
>> spelling as the boundary multipolygons.
>> Do you know of any algorithm for detecting similar names? Something
>> like a "sounds-like(String cityname)" function? That would be necessary
>> to fix this.
>>
> Look for the SOUNDEX algorithm. It is described on at least the German
> and English Wikipedia. It was originally developed to find similar
> names in genealogy, but I think it could work well in your
> situation, maybe with slight modifications.
I searched a little more and found the Metaphone algorithm, a
successor to Soundex. A Java class for Metaphone is already
available at
http://commons.apache.org/codec/apidocs/org/apache/commons/codec/language/Metaphone.html

I have never used it, but it looks quite reasonable.
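
To make the idea concrete, here is a minimal sketch of a "sounds-like"
check built on classic Soundex. It is written in plain Java so that it
runs without the Commons Codec dependency. The class and method names
(SoundexSketch, phoneticKey, soundsLike) are hypothetical and not part
of mkgmap. Coding each word separately is my own assumption about what
a "slight modification" could look like: a single four-character code
would be too coarse for multi-word region names.

```java
// Sketch only: classic American Soundex plus a per-word key for
// multi-word names. All names here are hypothetical, not mkgmap API.
final class SoundexSketch {

    // Soundex digit for an upper-case ASCII letter.
    // '0' = vowel or Y (not coded, ends a run of equal digits),
    // '-' = H or W (not coded, does not end a run).
    private static char code(char c) {
        switch (c) {
            case 'B': case 'F': case 'P': case 'V': return '1';
            case 'C': case 'G': case 'J': case 'K':
            case 'Q': case 'S': case 'X': case 'Z': return '2';
            case 'D': case 'T': return '3';
            case 'L': return '4';
            case 'M': case 'N': return '5';
            case 'R': return '6';
            case 'H': case 'W': return '-';
            default: return '0';
        }
    }

    // Four-character Soundex code, or "" if the input has no A-Z letter.
    // Assumption: anything outside A-Z (spaces, hyphens, accented
    // letters) is simply ignored.
    static String soundex(String name) {
        StringBuilder out = new StringBuilder(4);
        char prev = '0';
        for (int i = 0; i < name.length() && out.length() < 4; i++) {
            char c = Character.toUpperCase(name.charAt(i));
            if (c < 'A' || c > 'Z') continue;
            char d = code(c);
            if (out.length() == 0) {
                out.append(c);          // first letter is kept as-is
                prev = d;
            } else if (d != '-') {
                if (d != '0' && d != prev) out.append(d);
                prev = d;
            }
        }
        if (out.length() == 0) return "";
        while (out.length() < 4) out.append('0');
        return out.toString();
    }

    // Per-word key: split on non-letters, code each word, join with spaces.
    static String phoneticKey(String name) {
        StringBuilder key = new StringBuilder();
        for (String word : name.split("[^\\p{L}]+")) {
            String s = soundex(word);
            if (s.isEmpty()) continue;
            if (key.length() > 0) key.append(' ');
            key.append(s);
        }
        return key.toString();
    }

    // Two names "sound alike" if their per-word keys are equal and non-empty.
    static boolean soundsLike(String a, String b) {
        String ka = phoneticKey(a);
        return !ka.isEmpty() && ka.equals(phoneticKey(b));
    }

    public static void main(String[] args) {
        System.out.println(phoneticKey("Castilla-la Mancha")); // C234 L000 M520
        System.out.println(soundsLike("Castilla la Mancha", "Castilla-la Mancha")); // true
    }
}
```

With this key, "Castilla la Mancha" and "Castilla-la Mancha" compare as
equal, while "Castilla y León" does not. A plain Soundex code over the
whole string would give C234 for all three. Note that Soundex was
designed around English pronunciation, so the matching quality for
Spanish names is untested here.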

Regards,
Johann



