We work in Ukraine for the Ukrainian and Russian markets, so most of our websites are in Cyrillic.
And there is a mystery with search results in Russian.
Search works perfectly in the private space, but finds nothing at all when we submit
the search form via the public interface (with the search string typed in Cyrillic, of course).
Search works well in the 1.9.xx branch, but doesn't in 2.x.x.
It is the same with SPIP 3 beta. I filed a bug report a year or two ago.
We wrote our own search module for the public part of the site for our needs.
But what could be the source of the mystery? Does anyone have an idea or a solution?
I'm pretty sure it comes from this line in inc/recherche:
$recherche = trim(translitteration($recherche));
which converts the request to ASCII. That is most useful for removing
accented characters, but probably a killer for Cyrillic, as your
content is not in ASCII.
Do you work with the fulltext plugin or with the "native" SPIP search?
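To illustrate the suspicion above, here is a minimal sketch of why reducing the query to ASCII can break matching against non-ASCII content. The function toy_ascii_translit() is a hypothetical stand-in written for this example; it is not SPIP's translitteration() and only mimics the general idea. The sample strings are made up, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical stand-in for an ASCII transliteration step.
// This is NOT SPIP's translitteration(); it only mimics the idea of
// reducing a string to plain ASCII.
function toy_ascii_translit(string $s): string
{
    // Fold a few accented Latin letters to their base letter.
    $s = strtr($s, ['é' => 'e', 'è' => 'e', 'à' => 'a', 'ç' => 'c']);
    // Replace every remaining non-ASCII character with '?'.
    return preg_replace('/[^\x00-\x7F]/u', '?', $s);
}

// Made-up content and query, both in Cyrillic.
$content = 'Последние новости сайта';
$query   = 'новости';

// The raw Cyrillic query is found in the Cyrillic content.
$raw_hit = strpos($content, $query) !== false;

// After the ASCII step the query no longer contains any Cyrillic letter,
// so it cannot match content that is still stored in Cyrillic.
$ascii_query = toy_ascii_translit($query);
$ascii_hit   = strpos($content, $ascii_query) !== false;

var_dump($raw_hit, $ascii_query, $ascii_hit);
```

For accented Latin text the same step is harmless, since 'résumé' simply becomes 'resume'; the trouble only starts when the whole alphabet falls outside ASCII.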
On Thu, Dec 1, 2011 at 2:26 PM, Serge Markitanenko
<serge.markitanenko@gmail.com> wrote:
_______________________________________________
spip-en@rezo.net - http://listes.rezo.net/mailman/listinfo/spip-en
Hello to all!
I have been trying to overcome this problem.
You are right about translitteration(). I removed that line from inc/recherche and replaced it with:
include_spip('inc/filtres');
$recherche = corriger_caracteres($recherche);
$recherche = unicode_to_utf_8(html2unicode(charset2unicode($recherche)));
This was meant to fix the problem with Cyrillic characters.
In addition, I did the same with the highlighted parts (the translitteration_rapide() calls) in the next two lines:
? preg_match_all($preg, translitteration_rapide($t[$champ]), $regs, PREG_SET_ORDER)
: preg_match($preg, translitteration_rapide($t[$champ]))
Eventually my search engine started to work with Cyrillic characters. But one problem is left: the search is case-sensitive, and I don't know how to fix that.
Any ideas?
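One direction that might help with the case sensitivity, offered as a sketch only: in PHP's PCRE functions, the 'i' modifier on its own generally folds case only for ASCII letters, while combining it with the 'u' modifier (UTF-8 mode) lets it fold Cyrillic letters as well. How $preg is actually built in inc/recherche is not shown here, so the helper build_search_pattern() below is hypothetical and only illustrates the modifiers; the sample text is made up and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper, not part of SPIP: builds a case-insensitive,
// UTF-8-aware pattern from a plain search string.
function build_search_pattern(string $query): string
{
    // preg_quote() escapes regex metacharacters and the '/' delimiter.
    // 'i' = case-insensitive, 'u' = treat pattern and subject as UTF-8.
    return '/' . preg_quote($query, '/') . '/iu';
}

// Made-up content with the search word in upper case.
$text = 'Последние НОВОСТИ сайта';

// The lower-case query matches despite the different letter case.
$found = preg_match(build_search_pattern('новости'), $text) === 1;

var_dump($found);
```

If the pattern in inc/recherche is assembled along similar lines, adding 'u' next to an existing 'i' modifier may be enough; whether that fits the rest of the search code would need to be checked against the actual source.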