
[mkgmap-dev] Search addresses for latin countries (help on reg exp)

From Carlos Dávila cdavilam at orangecorreo.es on Mon Aug 5 22:09:12 BST 2013

On 05/08/13 19:42, Steve Ratcliffe wrote:
> Hi
>
>> Folks, as you know – this comes up from time to time – address search is
>> impractical in most Latin countries, where the street/square name usually
>> starts with the type (Via, Viale, Corso, Piazza etc. [IT]; Avenida,
>> Calle, Plaza etc. [ES]; Avenue, Boulevard, Rue, Place etc. [FR])
>> followed by the full name of, usually, the person the street is named after.
>> Moreover, the street names sometimes appear abbreviated (V.le,
>> Av.da, Bld. etc.), sometimes the middle name is skipped, and sometimes the
>> word “of” is used (Avenue de Bobigny, Corso del Popolo etc.).
>>
> The Garmin index format has a way of dealing with this problem and
> earlier this year I made a branch that creates an index with the extra
> information to show where the interesting part of the name starts.
>
> The latest version indexes every word in the name separately so you
> could find 'corso del popolo' by typing 'corso' , 'del' or 'popolo'.
>
> So this will always work for any language, but at the cost of a
> much larger index.
>
> It would be great if someone could try it out as it is; if it proves
> useful, it's more likely that someone would improve it by
> devising a suitable way to cut down the useless entries.
>
> Download it as mkgmap-mixed-index-r2662.jar at the bottom of the download
> page.
>
>> So what is a simple Mozartstrasse in Austria would look like “Via
>> Wolfgang Amadeus Mozart” in Italy or “Rue Wolfgang Amadeus Mozart” in
>> France but possibly also “Av.da de Mozart” etc.
>>
>> Now, everyone knows the street/square by its last name and it would be
>> much more practical to search by it:  I’d like to have a style that just
>> picks the last full word of the street/square name and puts it as a prefix
>> followed by a comma and the original name.
>>
>> This would really boost address search for Latin countries – so it might
>> be a default style to add to IT, FR, ES, BR, MX… etc.
>>
>> Could you help me on making that regular expression for the style?
>>
>> “str1 str2… strN” -> “strN, str1 str2… strN”
>>
>> Thanks!
>>
>> Enrico
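
For reference, the rewrite Enrico describes ("str1 str2… strN" -> "strN, str1 str2… strN") can be sketched with a plain regular expression. This is only an illustration in Python, not mkgmap style syntax; the helper name last_word_first is made up for the example, and it assumes that the "last full word" is simply the last whitespace-separated token.

```python
import re

# Hypothetical helper, not part of mkgmap.
# Group 1: everything up to and including the last whitespace character.
# Group 2: the last whitespace-separated word.
_LAST_WORD = re.compile(r"^(.*\s)(\S+)$")


def last_word_first(name: str) -> str:
    """Return 'strN, str1 str2 ... strN'; single-word names are left unchanged."""
    name = name.strip()
    match = _LAST_WORD.match(name)
    if match is None:
        # No whitespace in the name (e.g. "Mozartstrasse"): nothing to move.
        return name
    return f"{match.group(2)}, {name}"


if __name__ == "__main__":
    for street in ("Via Wolfgang Amadeus Mozart", "Av.da de Mozart", "Mozartstrasse"):
        print(last_word_first(street))
```

Names whose distinguishing part is not the last word (for example a trailing number or qualifier) would need extra rules, so treat this as a starting point only.
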
First results with the mixed-index branch, processing Spain with the default
style:
Total time taken: 391216 ms vs 449649 ms with r2661
Index size: 29 MB vs 21.6 MB with r2661
Apart from the numbers, address search doesn't work yet. Entries
in the index are not unique and are not sorted (see screenshot 1). When
you type a letter, the search results don't change accordingly (screenshot
2). This is the console output, in case it is of any help:
=== FIRST
t1=0, t2=55013
first av 96203/24, last 0/12
AVENIDA : 32380
CAMINO : 14816
PLAZA : 12864
CARRETERA : 28180
CALLE : 288500
RÚA : 9130
CARRER : 117140
AVINGUDA : 11602
=== LAST
KALEA : 9682
AUZOA : 11604
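
To make the expected behaviour concrete, here is a minimal sketch of a per-word index whose entries are unique and sorted, and whose results narrow as more letters are typed. It is plain Python for illustration only: it says nothing about the Garmin index format or about how the mixed-index branch is implemented, and the function names build_word_index and search are invented for the example.

```python
from collections import defaultdict


def build_word_index(names):
    """Map each upper-cased word to the sorted, de-duplicated names containing it."""
    index = defaultdict(set)
    for name in names:
        for word in name.upper().split():
            # A set drops repeated names, so each entry appears only once.
            index[word].add(name)
    # Emit keys and entries in sorted order.
    return {word: sorted(index[word]) for word in sorted(index)}


def search(index, typed):
    """Return the names that have any word starting with the typed letters."""
    typed = typed.upper()
    hits = set()
    for word, names in index.items():
        if word.startswith(typed):
            hits.update(names)
    return sorted(hits)


if __name__ == "__main__":
    idx = build_word_index(
        ["Corso del Popolo", "Corso del Popolo", "Avenue de Bobigny"]
    )
    print(list(idx))           # index keys, one per distinct word
    print(search(idx, "de"))   # matches both "DE" and "DEL"
    print(search(idx, "pop"))  # narrows to the one name containing "POPOLO"
```
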
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MapSource1.png
Type: image/png
Size: 3390 bytes
Desc: not available
Url : http://lists.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20130805/2170ba42/attachment-0002.png 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MapSource2.png
Type: image/png
Size: 5251 bytes
Desc: not available
Url : http://lists.mkgmap.org.uk/pipermail/mkgmap-dev/attachments/20130805/2170ba42/attachment-0003.png 

