Product Version: 2.1.2
Summary: 0000903: amuleweb failed in CJK keywords
Description: No search results from CJK keywords, but other keywords work fine.
Operating System: Any
Kry (manager)
2006-05-30 07:19

DreamerC (reporter)
2006-06-03 00:49
edited on: 2006-06-03 01:07

CJK = Chinese, Japanese, and Korean

If you search with the browser set to UTF-8 or Big5 encoding, the search does not work correctly.
With Big5, the browser shows some results, but they are wrong.
With UTF-8, it shows nothing.

whoami (reporter)
2007-05-08 16:31
edited on: 2007-05-08 16:31

Aha. I was right to check before filing a new bug report.

I ran into the same problem as DreamerC, and I found a remedy just a minute ago.

The remedy is:
1) Set the "charset=" metadata of all .php and .html files in the php-default site template to UTF-8, not iso-8859-1, utf, or utf8. Any of those other values breaks the non-US letters.
2) On line 794 of php_core_lib.cpp in the webserver, change
794: wxString(char2unicode(search)), wxString(char2unicode(ext)), ...
to
wxString(UTF82unicode(search)), wxString(char2unicode(ext)), ...
3) Recompile, and the webserver shows the right results.
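
To illustrate why step 2 matters, here is a minimal standalone sketch (plain C++, no wxWidgets) of what happens when the same UTF-8 bytes are decoded one byte per character versus as UTF-8. The function names decode_latin1, decode_utf8 and example are hypothetical and are not part of the aMule source; the assumption is only that the browser sends the keyword as UTF-8 bytes, as step 1 arranges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Treat every byte as one character, as a single-byte charset such as
// ISO-8859-1 would. A multi-byte UTF-8 character falls apart into several
// unrelated code points.
std::vector<std::uint32_t> decode_latin1(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    for (unsigned char c : bytes) out.push_back(c);
    return out;
}

// Minimal UTF-8 decoder for illustration only. Malformed or truncated input
// yields U+FFFD; overlong forms and surrogates are not rejected.
std::vector<std::uint32_t> decode_utf8(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char c = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (c < 0x80)                { cp = c;        extra = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }

        if (i + extra >= bytes.size()) { out.push_back(0xFFFD); break; }

        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Usage example: the UTF-8 bytes of the two characters U+4E2D U+6587.
void example() {
    const std::string keyword = "\xE4\xB8\xAD\xE6\x96\x87";
    assert(decode_utf8(keyword).size() == 2);    // two real characters
    assert(decode_latin1(keyword).size() == 6);  // six bogus characters
}
```

With the byte-per-character decoding, the keyword handed to the search no longer matches any CJK file name, which would fit the "UTF-8 shows nothing" symptom reported above.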

Okay, I admit this remedy might need more testing, but at least it looks ok.

Kry (manager)
2007-05-09 16:02

Thanks for it, I'll touch it later.
whoami (reporter)
2007-05-09 17:47

Well, my "remedy" has worked okay so far, but amuleweb sometimes returns files without the keyword, although this is rare (about 1% of the total). I wonder whether the keyword IS there but the filename is too long to be shown in full in the filename column.

Anyway, there might be other places that need the "char~" to "UTF8~" replacement, but I cannot tell because my knowledge of the amuleweb source is too limited. Please do replace them if you find any. ;)

To test CJK search, you could grab some CJK text from a site in one of those countries and check whether the results contain the same characters as your guinea-pig text :)
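
The check described above could be automated roughly as follows. This is only a sketch under assumptions: the function name results_missing_keyword is hypothetical, the result names are assumed to be available as full, untruncated UTF-8 strings, and the match is a plain case-sensitive substring test on a single keyword.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return the result names that do NOT contain the keyword.
// Both sides are assumed to be UTF-8. A byte-level substring search is safe
// here because a valid UTF-8 keyword cannot match starting in the middle of
// another character.
std::vector<std::string> results_missing_keyword(
        const std::vector<std::string>& names,
        const std::string& keyword) {
    assert(!keyword.empty());  // an empty keyword would match everything
    std::vector<std::string> missing;
    for (const std::string& name : names) {
        if (name.find(keyword) == std::string::npos) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

Any name this returns is a candidate for the "files without the keyword" case mentioned earlier. If the list is empty when run against the full names, the odd results are more likely a display truncation than a search problem.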

