
[mkgmap-dev] Search addresses for latin countries (help on reg exp)

From Enrico Liboni eliboni at gmail.com on Tue Aug 6 08:25:45 BST 2013

That's weird... we did the same tests and it failed, but now it seems it
is partially working for you... I'll give it another try tonight.

On Tue, Aug 6, 2013 at 12:02 AM, Carlos Dávila <cdavilam at orangecorreo.es> wrote:

> On 05/08/13 23:09, Carlos Dávila wrote:
>> On 05/08/13 19:42, Steve Ratcliffe wrote:
>>> Hi
>>>> Folks, as you know (this comes up from time to time), address search is
>>>> impractical in most Latin countries, where the street/square name usually
>>>> starts with the type (Via, Viale, Corso, Piazza etc. [IT]; Avenida,
>>>> Calle, Plaza etc. [ES]; Avenue, Boulevard, Rue, Place etc. [FR])
>>>> followed by the full name of, usually, the person the street is named after.
>>>> In addition, the street names sometimes appear abbreviated (V.le,
>>>> Av.da, Bld. etc.), sometimes the middle name is skipped, and sometimes the
>>>> word “of” is used (Avenue de Bobigny, Corso del Popolo etc.).
>>> The Garmin index format has a way of dealing with this problem, and
>>> earlier this year I made a branch that creates an index with the extra
>>> information to show where the interesting part of the name starts.
>>> The latest version indexes every word in the name separately, so you
>>> could find 'corso del popolo' by typing 'corso', 'del' or 'popolo'.
>>> So this will always work for any language, but at the cost of a
>>> much larger index.
>>> It would be great if someone could try it out as it is; then,
>>> if it proves useful, it's more likely that someone would improve it by
>>> devising a suitable way to cut down the useless entries.
>>> Download it as mkgmap-mixed-index-r2662.jar at the bottom of the download
>>> page.
>>>> So what is a simple Mozartstrasse in Austria would look like “Via
>>>> Wolfgang Amadeus Mozart” in Italy or “Rue Wolfgang Amadeus Mozart” in
>>>> France, but possibly also “Av.da de Mozart” etc.
>>>> Now, everyone knows the street/square by its last name and it would be
>>>> much more practical to search by it: I’d like to have a style that just
>>>> picks the last full word of the street/square name and puts it as a prefix,
>>>> followed by a comma and the original name.
>>>> This would really boost address search for Latin countries (so it might
>>>> be a default style to add for IT, FR, ES, BR, MX… etc.).
>>>> Could you help me write that regular expression for the style?
>>>> “str1 str2… strN” -> “strN, str1 str2… strN”
>>>> Thanks!
>>>> Enrico
>> First result with the mixed-index branch, processing Spain with the default
>> style:
>> Total time taken: 391216ms vs 449649ms with r2661
>> index size: 29 MB vs 21.6 MB with r2661
>> Apart from the numbers, the address search doesn't work for now. Entries
>> in the index are not unique and are not ordered (see screenshot 1). When
>> you type a letter, the search results don't change accordingly (screenshot 2).
>> This is the console output, if it is of any help:
>> === FIRST
>> t1=0, t2=55013
>> first av 96203/24, last 0/12
>> AVENIDA : 32380
>> CAMINO : 14816
>> PLAZA : 12864
>> CARRETERA : 28180
>> CALLE : 288500
>> RÚA : 9130
>> CARRER : 117140
>> AVINGUDA : 11602
>> === LAST
>> KALEA : 9682
>> AUZOA : 11604
> I have compiled the same input data with the same command and, strangely,
> now it seems to work better. Typing "C" in the search field selects all
> streets with a "C" as the first letter of their name after calle, avenida or
> whatever (see screenshot), apart from the first three entries in the list.
> _______________________________________________
> mkgmap-dev mailing list
> mkgmap-dev at lists.mkgmap.org.uk
> http://lists.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
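
For reference, the "strN, str1 str2… strN" transformation asked for earlier in the thread could be sketched roughly as below. This is plain Python, not mkgmap style syntax; the pattern and the helper name last_word_first are assumptions made for illustration only, and whether the mkgmap style language can express the same substitution is not established here.

```python
import re

# Hypothetical helper, not part of mkgmap: it only illustrates the
# requested rewrite "str1 str2 ... strN" -> "strN, str1 str2 ... strN".
# Group 1 is everything up to and including the last whitespace,
# group 2 is the last whitespace-free word.
LAST_WORD = re.compile(r"^(.*\s)(\S+)$")


def last_word_first(name: str) -> str:
    """Return the name prefixed by its last word and a comma."""
    name = name.strip()
    match = LAST_WORD.match(name)
    if match is None:
        # Single-word (or empty) names have no separate last word to move.
        return name
    return f"{match.group(2)}, {name}"
```

With this sketch, "Via Wolfgang Amadeus Mozart" becomes "Mozart, Via Wolfgang Amadeus Mozart" and "Corso del Popolo" becomes "Popolo, Corso del Popolo", while a single-word name such as "Mozartstrasse" is left unchanged. It simply takes the last whitespace-separated word, so it does nothing special for abbreviations or trailing punctuation.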