
[mkgmap-dev] mixed index branch merge

From Steve Ratcliffe steve at parabola.me.uk on Mon Feb 16 00:21:26 GMT 2015

Hi

There are some interesting comments here.

I did have code to count the number of times certain words appeared in
a name in an attempt to automatically create a stop word list for a map.
It turned out that it wasn't all that useful, for England at least.

From the numbers you get stop words such as 'The', 'Avenue' and
'Road', as you would expect.  However, many streets have names such as
'The Avenue', 'Avenue Road' and so on that consist entirely of
likely stop words. And these are not theoretical names that occur
infrequently; these are names of streets that I know.
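
As a rough illustration of the problem (a sketch only, not the code
that was in mkgmap; the function names, the threshold and the sample
names are all made up for the example), counting words and then
checking which names are made up entirely of frequent words looks
something like this:

```python
from collections import Counter

def candidate_stop_words(names, min_count):
    """Return every word that occurs at least min_count times (hypothetical threshold)."""
    counts = Counter(word for name in names for word in name.split())
    return {word for word, n in counts.items() if n >= min_count}

def is_all_stop_words(name, stop_words):
    """True if every word of the name is a candidate stop word."""
    return all(word in stop_words for word in name.split())

# Illustrative sample only, not real map data.
names = ["The Avenue", "Avenue Road", "Station Road", "The Green", "Mill Lane"]

stop_words = candidate_stop_words(names, min_count=2)
unindexable = [n for n in names if is_all_stop_words(n, stop_words)]
```

With this sample, 'The', 'Avenue' and 'Road' each occur twice and so
become stop words, which leaves 'The Avenue' and 'Avenue Road' with
nothing to index.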

I think we really need to be able to identify which parts of the
name are useful to index, rather than which parts are not.

So for England I think that the only rule required is to index from
the beginning of the name, as now.

For places where streets are named after people and there is
no word for 'street' included, and the street is generally
referred to by the second name, then adding entries
for all parts of the name will probably work.
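
A minimal sketch of what "entries for all parts of the name" could
mean, assuming an entry is just an offset into the name plus the text
from that offset onwards (the function name and the sample name are
hypothetical, and this is not how the mkgmap index is actually stored):

```python
def all_part_entries(name):
    """Return (offset, text-from-offset) for each word in the name."""
    entries = []
    pos = 0
    for word in name.split(" "):
        if word:  # skip the empty strings produced by repeated spaces
            entries.append((pos, name[pos:]))
        pos += len(word) + 1  # +1 for the space that followed the word
    return entries

# Made-up example name.
entries = all_part_entries("Jan Kowalski")
```

A search for either "Jan Kowalski" or just "Kowalski" would then find
the street, at the cost of one extra index entry per additional word.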

For places where there is a word for street at the beginning
then we have to step over that word and any following
prepositions etc.  So for France not just
"Rue", but any following "de", "des", "d'" etc.
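
The stepping-over rule could be sketched as below. The word lists are
assumptions for the example and certainly incomplete, and the function
name is made up; the point is only that elided forms such as "d'" have
to be handled as well as whole words.

```python
import re

# Assumed, incomplete word lists for illustration only.
STREET_WORDS = {"rue"}
# Whole-word fillers followed by whitespace, or an elided d' / l'.
FILLER_RE = re.compile(r"(?:(?:de|des|du|la|le|les)\s+|[dl]')", re.IGNORECASE)
FIRST_WORD_RE = re.compile(r"\s*(\w+)\s+")

def index_start(name):
    """Return the offset in name at which the indexed part begins."""
    first = FIRST_WORD_RE.match(name)
    if not first or first.group(1).lower() not in STREET_WORDS:
        return 0  # no street word at the front: index from the beginning
    pos = first.end()
    while True:
        filler = FILLER_RE.match(name, pos)
        if not filler:
            break
        pos = filler.end()
    # If nothing is left after skipping, fall back to the whole name.
    return pos if pos < len(name) else 0

examples = ["Rue de la Paix", "Rue d'Alsace", "Rue des Lilas", "Rue Delambre", "Avenue Foch"]
indexed = [name[index_start(name):] for name in examples]
```

Note that "Rue Delambre" is indexed at "Delambre" because "de" only
counts as a filler when it is a separate word.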

The required action does of course depend on language rather than
country, but we don't in general have the language, so we will have to
start out using the country (or perhaps region) and see how that goes.
I suspect it will work quite well, but if not we can think of
something else when the problems are better understood.

I guess we will start out having configurable rule types and
word lists, but we need to gather sensible defaults once
a working system is developed for each country.
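
To make the idea of configurable rule types and word lists concrete,
here is one possible shape for it. Everything in it is hypothetical:
the rule names, the class, the table keyed by country code and the
placeholder code "XX" are invented for the sketch, and it splits on
whole words only, so it does not handle elided forms like "d'".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexRule:
    kind: str = "beginning"  # "beginning", "all-parts" or "skip-prefix"
    prefix_words: frozenset = frozenset()
    filler_words: frozenset = frozenset()

# Hypothetical per-country defaults; "XX" stands for some country
# where every part of the name should be indexed.
RULES = {
    "GB": IndexRule("beginning"),
    "FR": IndexRule("skip-prefix",
                    frozenset({"rue"}),
                    frozenset({"de", "des", "du", "la", "le", "les"})),
    "XX": IndexRule("all-parts"),
}
DEFAULT_RULE = IndexRule()  # unknown country: index from the beginning

def index_keys(name, country):
    """Return the list of strings to index for a street name."""
    rule = RULES.get(country, DEFAULT_RULE)
    words = name.split()
    if rule.kind == "all-parts":
        return [" ".join(words[i:]) for i in range(len(words))]
    if (rule.kind == "skip-prefix" and len(words) > 1
            and words[0].lower() in rule.prefix_words):
        i = 1
        # Always leave at least one word to index.
        while i < len(words) - 1 and words[i].lower() in rule.filler_words:
            i += 1
        return [" ".join(words[i:])]
    return [name]
```

For example, index_keys("Rue de la Paix", "FR") gives ["Paix"], while
index_keys("The Avenue", "GB") leaves the name as it is.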

..Steve

