[mkgmap-dev] Search addresses for latin countries (help on reg exp)

From Carlos Dávila cdavilam at orangecorreo.es on Mon Aug 5 23:02:54 BST 2013

On 05/08/13 23:09, Carlos Dávila wrote:
> On 05/08/13 19:42, Steve Ratcliffe wrote:
>> Hi
>>
>>> Folks, as you know – this comes up from time to time – address search is
>>> impractical in most Latin countries, where the street/square name usually
>>> starts with the type (Via, Viale, Corso, Piazza etc. [IT]; Avenida,
>>> Calle, Plaza etc. [ES]; Avenue, Boulevard, Rue, Place etc. [FR])
>>> followed by the full name of, usually, the person the street is named after.
>>> Moreover, the street names sometimes appear abbreviated (V.le,
>>> Av.da, Bld. etc.), sometimes the middle name is skipped, and sometimes the
>>> word “of” is used (Avenue de Bobigny, Corso del Popolo etc.).
>>>
>> The Garmin index format has a way of dealing with this problem and
>> earlier this year I made a branch that creates an index with the extra
>> information to show where the interesting part of the name starts.
>>
>> The latest version indexes every word in the name separately so you
>> could find 'corso del popolo' by typing 'corso', 'del' or 'popolo'.
>>
>> So this will always work for any language, but at the cost of a
>> much larger index.
>>
>> It would be great if someone could try it out as it is; then,
>> if it proves useful, it's more likely that someone would improve it
>> by devising a suitable way to cut down the useless entries.
>>
>> Download it as mkgmap-mixed-index-r2662.jar at the bottom of the
>> download page.
>>
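To make the trade-off concrete, here is a minimal Python sketch of the "index every word" idea described above. It is a toy in-memory dictionary, not mkgmap's code and not the Garmin index format; the function name `build_word_index` and the `skip_words` parameter are made up for illustration, and the stop-word list is only one guess at how the useless entries might be cut down.

```python
from collections import defaultdict


def build_word_index(street_names, skip_words=frozenset()):
    """Map every lower-cased word of each name to the names containing it.

    skip_words is a hypothetical stop list (street types, articles) used to
    keep uninteresting words out of the index.
    """
    index = defaultdict(set)
    for name in street_names:
        for word in name.lower().split():
            if word not in skip_words:
                index[word].add(name)
    return index


streets = ["Corso del Popolo", "Via Wolfgang Amadeus Mozart"]

# Indexing every word: each street is reachable from any of its words.
full = build_word_index(streets)
print(len(full))                 # 7 index keys
print(sorted(full["popolo"]))    # ['Corso del Popolo']

# Skipping street types and articles shrinks the index.
trimmed = build_word_index(streets, skip_words={"corso", "del", "via"})
print(len(trimmed))              # 4 index keys
```

Even in this tiny example the unfiltered index has almost twice as many keys, which is the size cost mentioned above.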
>>> So what is a simple Mozartstrasse in Austria would look like “Via
>>> Wolfgang Amadeus Mozart” in Italy or “Rue Wolfgang Amadeus Mozart” in
>>> France but possibly also “Av.da de Mozart” etc.
>>>
>>> Now, everyone knows the street/square by its last name and it would be
>>> much more practical to search by it: I’d like to have a style that just
>>> picks the last full word of the street/square name and puts it as a
>>> prefix, followed by a comma and the original name.
>>>
>>> This would really boost address search for Latin countries, so it
>>> might be a default style to add to IT, FR, ES, BR, MX… etc.
>>>
>>> Could you help me write that regular expression for the style?
>>>
>>> “str1 str2… strN” -> “strN, str1 str2… strN”
>>>
>>> Thanks!
>>>
>>> Enrico
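The transformation requested above can be sketched with a single regular expression. The example below uses Python's `re` module purely to illustrate the pattern; it is not mkgmap style-file syntax, and the function name `last_word_first` is made up for this sketch. It assumes words are separated by whitespace and leaves single-word names untouched.

```python
import re

# Group 1: everything up to and including the last whitespace.
# Group 2: the last word (a run of non-whitespace characters).
LAST_WORD = re.compile(r"^(.*\s)(\S+)$")


def last_word_first(name: str) -> str:
    """Turn 'str1 str2 ... strN' into 'strN, str1 str2 ... strN'."""
    # Names without whitespace do not match and are returned unchanged.
    return LAST_WORD.sub(r"\2, \1\2", name.strip())


print(last_word_first("Via Wolfgang Amadeus Mozart"))
# Mozart, Via Wolfgang Amadeus Mozart
print(last_word_first("Corso del Popolo"))
# Popolo, Corso del Popolo
print(last_word_first("Mozartstrasse"))
# Mozartstrasse
```

Because `.*` is greedy, group 1 extends to the last whitespace in the name, so group 2 is always the final word.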
> First result with the mixed-index branch, processing Spain with the
> default style:
> Total time taken: 391216ms vs 449649ms with r2661
> Index size: 29 MB vs 21.6 MB with r2661
> Apart from the numbers, the address search doesn't work for now.
> Entries in the index are not unique and are not ordered (see
> screenshot 1). When you type a letter, the search results don't change
> accordingly (screenshot 2). This is the console output, in case it is of
> any help:
> === FIRST
> t1=0, t2=55013
> first av 96203/24, last 0/12
> AVENIDA : 32380
> CAMINO : 14816
> PLAZA : 12864
> CARRETERA : 28180
> CALLE : 288500
> RÚA : 9130
> CARRER : 117140
> AVINGUDA : 11602
> === LAST
> KALEA : 9682
> AUZOA : 11604
I have compiled the same input data with the same command and, strangely,
it now seems to work better. Typing "C" in the search field selects all
streets whose name has "C" as the first letter after calle, avenida or
whatever (see screenshot), apart from the first 3 entries in the list.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MapSource3.png
Type: image/png
Size: 3601 bytes
Desc: not available
Url : http://lists.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20130806/ab801204/attachment.png 

