I want to search Russian (ISO-8859-5) text in PhpDig. How?


Ivan
09-26-2003, 04:08 AM
Please help me! I want to search Russian (ISO-8859-5) text in PhpDig. How do I do that?

I cannot understand what I must do in this part of the documentation:
{
5.3. Configure PhpDig encoding

Modify the following constant. PhpDig does not support multiple encodings: the chosen one applies to all indexed documents and to the admin interface.

define('PHPDIG_ENCODING','iso-8859-1'); // iso-8859-1 and iso-8859-2 supported

If you want PhpDig to support other encodings, you have to add array indexes to the following variables, following the example of the existing ones.

$phpdig_string_subst['iso-8859-1']
$phpdig_string_subst['iso-8859-2']
...

$phpdig_words_chars['iso-8859-1']
$phpdig_words_chars['iso-8859-2']
...
}

Thank You!

Charter
09-26-2003, 03:30 PM
Hi. Below are two listings of the ISO-8859-5 character set. To use them, try the following.

In the config file, the variable $phpdig_string_subst['iso-8859-1'] holds a base letter, then a colon, then the accented variants of that letter. For example, A:ÀÁÂÃÄÅ and so forth.

Try making $phpdig_string_subst['iso-8859-5'] in the same form as $phpdig_string_subst['iso-8859-1'] and $phpdig_string_subst['iso-8859-2'].

Also, in the config file are the $phpdig_words_chars['iso-8859-1'] and $phpdig_words_chars['iso-8859-2'] variables.

Try making a $phpdig_words_chars['iso-8859-5'] variable that lists the Russian letters (they have no accented forms).

Then set define('PHPDIG_ENCODING','iso-8859-5'); in the config file.
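The three steps above could look roughly like the sketch below. The value formats are assumptions modeled on the iso-8859-1 entries (comma-separated "letter:variants" pairs, and a character list starting with [:alnum:]), so compare them with the entries in your own config file first. The $ru_letters helper variable is hypothetical, and folding Io into Ie is only one possible choice for the substitution string.

```php
<?php
// Sketch only: additions to PhpDig's config file for ISO-8859-5.
// Assumption: the value formats mirror the existing iso-8859-1 entries.

// Build the letters from byte values so the result does not depend on
// the encoding the config file itself is saved in.
// 0xB0-0xCF = capital A..Ya, 0xD0-0xEF = small a..ya,
// 0xA1 = capital Io, 0xF1 = small io (see the table below).
$ru_letters = '';
foreach (range(0xB0, 0xEF) as $byte) {
    $ru_letters .= chr($byte);
}
$ru_letters .= chr(0xA1) . chr(0xF1);

// Characters allowed inside an indexed word.
$phpdig_words_chars['iso-8859-5'] = '[:alnum:]' . $ru_letters;

// "Base letter:variants" pairs. Here capital Io maps to capital Ie
// and small io maps to small ie (an illustrative choice).
$phpdig_string_subst['iso-8859-5'] =
    chr(0xB5) . ':' . chr(0xA1) . ',' . chr(0xD5) . ':' . chr(0xF1);

// Change the existing define() in the config file rather than adding
// a second one; a constant can only be defined once.
define('PHPDIG_ENCODING', 'iso-8859-5');
```

Generating the bytes with chr() avoids pasting Cyrillic text into a file that an editor might silently save as UTF-8 or windows-1251.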


The Russian alphabet in ISO-8859-5, displayed as the Latin-1 characters that share the same byte values:

Small letters:
Ð Ñ Ò Ó Ô Õ ñ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï

Capital letters:
° ± ² ³ ´ µ ¡ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï


The full upper half of ISO-8859-5 (bytes 160 to 255), again displayed as Latin-1 characters:

char dec col/row oct hex description
[_] 160 10/00 240 A0 No-break space
[¡] 161 10/01 241 A1 Cyrillic Io
[¢] 162 10/02 242 A2 Serbocroatian Dje
[£] 163 10/03 243 A3 Macedonian Gje
[¤] 164 10/04 244 A4 Ukrainian Ie
[¥] 165 10/05 245 A5 Macedonian Dze
[¦] 166 10/06 246 A6 Byelorussian-Ukrainian I
[§] 167 10/07 247 A7 Ukrainian Yi
[¨] 168 10/08 250 A8 Cyrillic Je
[©] 169 10/09 251 A9 Cyrillic Lje
[ª] 170 10/10 252 AA Cyrillic Nje
[«] 171 10/11 253 AB Serbocroatian Chje
[¬] 172 10/12 254 AC Macedonian Kje
[_] 173 10/13 255 AD Soft hyphen
[®] 174 10/14 256 AE Byelorussian Short U
[¯] 175 10/15 257 AF Cyrillic Dzhe
[°] 176 11/00 260 B0 Cyrillic A
[±] 177 11/01 261 B1 Cyrillic Be
[²] 178 11/02 262 B2 Cyrillic Ve
[³] 179 11/03 263 B3 Cyrillic Ghe
[´] 180 11/04 264 B4 Cyrillic De
[µ] 181 11/05 265 B5 Cyrillic Ie
[¶] 182 11/06 266 B6 Cyrillic Zhe
[·] 183 11/07 267 B7 Cyrillic Ze
[¸] 184 11/08 270 B8 Cyrillic I
[¹] 185 11/09 271 B9 Cyrillic Short I
[º] 186 11/10 272 BA Cyrillic Ka
[»] 187 11/11 273 BB Cyrillic El
[¼] 188 11/12 274 BC Cyrillic Em
[½] 189 11/13 275 BD Cyrillic En
[¾] 190 11/14 276 BE Cyrillic O
[¿] 191 11/15 277 BF Cyrillic Pe
[À] 192 12/00 300 C0 Cyrillic Er
[Á] 193 12/01 301 C1 Cyrillic Es
[Â] 194 12/02 302 C2 Cyrillic Te
[Ã] 195 12/03 303 C3 Cyrillic U
[Ä] 196 12/04 304 C4 Cyrillic Ef
[Å] 197 12/05 305 C5 Cyrillic Ha
[Æ] 198 12/06 306 C6 Cyrillic Tse
[Ç] 199 12/07 307 C7 Cyrillic Che
[È] 200 12/08 310 C8 Cyrillic Sha
[É] 201 12/09 311 C9 Cyrillic Shcha
[Ê] 202 12/10 312 CA Cyrillic Hard Sign
[Ë] 203 12/11 313 CB Cyrillic Yeri
[Ì] 204 12/12 314 CC Cyrillic Soft Sign
[Í] 205 12/13 315 CD Cyrillic E
[Î] 206 12/14 316 CE Cyrillic Yu
[Ï] 207 12/15 317 CF Cyrillic Ya
[Ð] 208 13/00 320 D0 Cyrillic a
[Ñ] 209 13/01 321 D1 Cyrillic be
[Ò] 210 13/02 322 D2 Cyrillic ve
[Ó] 211 13/03 323 D3 Cyrillic ghe
[Ô] 212 13/04 324 D4 Cyrillic de
[Õ] 213 13/05 325 D5 Cyrillic ie
[Ö] 214 13/06 326 D6 Cyrillic zhe
[×] 215 13/07 327 D7 Cyrillic ze
[Ø] 216 13/08 330 D8 Cyrillic i
[Ù] 217 13/09 331 D9 Cyrillic Short i
[Ú] 218 13/10 332 DA Cyrillic ka
[Û] 219 13/11 333 DB Cyrillic el
[Ü] 220 13/12 334 DC Cyrillic em
[Ý] 221 13/13 335 DD Cyrillic en
[Þ] 222 13/14 336 DE Cyrillic o
[ß] 223 13/15 337 DF Cyrillic pe
[à] 224 14/00 340 E0 Cyrillic er
[á] 225 14/01 341 E1 Cyrillic es
[â] 226 14/02 342 E2 Cyrillic te
[ã] 227 14/03 343 E3 Cyrillic u
[ä] 228 14/04 344 E4 Cyrillic ef
[å] 229 14/05 345 E5 Cyrillic ha
[æ] 230 14/06 346 E6 Cyrillic tse
[ç] 231 14/07 347 E7 Cyrillic che
[è] 232 14/08 350 E8 Cyrillic sha
[é] 233 14/09 351 E9 Cyrillic shcha
[ê] 234 14/10 352 EA Cyrillic hard sign
[ë] 235 14/11 353 EB Cyrillic yeri
[ì] 236 14/12 354 EC Cyrillic soft sign
[í] 237 14/13 355 ED Cyrillic e
[î] 238 14/14 356 EE Cyrillic yu
[ï] 239 14/15 357 EF Cyrillic ya
[ð] 240 15/00 360 F0 Numero sign
[ñ] 241 15/01 361 F1 Cyrillic io
[ò] 242 15/02 362 F2 Serbocroatian dje
[ó] 243 15/03 363 F3 Macedonian gje
[ô] 244 15/04 364 F4 Ukrainian ie
[õ] 245 15/05 365 F5 Macedonian dze
[ö] 246 15/06 366 F6 Byelorussian-Ukrainian i
[÷] 247 15/07 367 F7 Ukrainian yi
[ø] 248 15/08 370 F8 Cyrillic je
[ù] 249 15/09 371 F9 Cyrillic lje
[ú] 250 15/10 372 FA Cyrillic nje
[û] 251 15/11 373 FB Serbocroatian chje
[ü] 252 15/12 374 FC Macedonian kje
[ý] 253 15/13 375 FD Section sign
[þ] 254 15/14 376 FE Byelorussian short u
[ÿ] 255 15/15 377 FF Cyrillic dzhe

Go to http://www.columbia.edu/kermit/cyrillic.html to map the Latin-1 characters shown above to their Cyrillic counterparts.
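If you would rather check the mapping locally, a small script can print the Cyrillic letter behind each byte value. This is a minimal sketch that assumes PHP's iconv extension is available; the function name iso88595_to_utf8 is made up for this example and is not part of PhpDig.

```php
<?php
// Sketch: show which Cyrillic letters the ISO-8859-5 byte values stand for.
// Assumes the iconv extension is loaded.

// Convert a string of raw ISO-8859-5 bytes to UTF-8 for display.
function iso88595_to_utf8($bytes)
{
    $utf8 = iconv('ISO-8859-5', 'UTF-8', $bytes);
    return $utf8 === false ? '' : $utf8; // empty string if conversion fails
}

// Bytes 0xB0-0xCF are the capitals, 0xD0-0xEF the small letters.
$capitals = implode('', array_map('chr', range(0xB0, 0xCF)));
$smalls   = implode('', array_map('chr', range(0xD0, 0xEF)));

echo iso88595_to_utf8($capitals), "\n";
echo iso88595_to_utf8($smalls), "\n";

// Io sits outside the main ranges: 0xA1 (capital) and 0xF1 (small).
echo iso88595_to_utf8(chr(0xA1) . chr(0xF1)), "\n";
```

View the output in a UTF-8 terminal or browser; otherwise the letters will again appear as unrelated characters.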

EDIT: See this (http://www.phpdig.net/showthread.php?threadid=275) thread.