Multiple and Multibyte Support


Charter
01-25-2005, 09:49 PM
PhpDig with support for multiple encodings, including multibyte ones, is right around the corner, assuming time is available.

Below is a list of encodings that PhpDig should soon support. Non-UTF-8 encodings on the list will be converted to UTF-8, and the resulting UTF-8 data will then be written to the tables and files.
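To illustrate the convert-to-UTF-8 step described above, here is a minimal sketch in Python. It is not PhpDig code; the function name to_utf8 and the sample text are assumptions made purely for illustration.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode raw bytes from source_encoding and re-encode them as UTF-8."""
    if source_encoding.lower().replace("_", "-") in ("utf-8", "utf8"):
        # Already UTF-8: validate the bytes, then pass them through unchanged.
        raw.decode("utf-8")
        return raw
    # Raises UnicodeDecodeError if the bytes are not valid in source_encoding.
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


# Hypothetical page content stored as ISO-8859-1.
latin1_page = "café".encode("iso-8859-1")
utf8_page = to_utf8(latin1_page, "iso-8859-1")
```

The point of the design is that everything downstream (tables, files, search) only ever sees one encoding, so the conversion happens once, at the boundary where content comes in.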

Note that some encodings go by more than one name. For example, CP936 is closely related to GB2312, so it is up to you to cross-check your encodings, under all of their names, to see if they are on the list.
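One way to do that kind of cross-check is to ask an encoding registry whether two names resolve to the same codec. The sketch below uses Python's standard codecs module as the registry, which is an assumption for illustration only; PhpDig's own name matching may differ, and the helper names same_encoding and on_list are made up here.

```python
import codecs


def same_encoding(name_a: str, name_b: str) -> bool:
    """Return True if both names resolve to the same codec in Python's registry."""
    try:
        return codecs.lookup(name_a).name == codecs.lookup(name_b).name
    except LookupError:
        # At least one of the names is unknown to this registry.
        return False


def on_list(name: str, supported: list) -> bool:
    """Check whether name, or any alias of it, matches an entry in supported."""
    return any(same_encoding(name, entry) for entry in supported)


# "latin1" is not spelled that way on the list, but it resolves to iso-8859-1.
found = on_list("latin1", ["iso-8859-1", "koi8-r"])
```

Registries do not always agree on what counts as an alias. Python, for instance, resolves cp936 to GBK rather than to GB2312, so a near-match like that still has to be checked by hand.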

Also, when PhpDig with multiple and multibyte support is released, it will initially be an experimental version.

Likely additional requirements are as follows:

- MySQL 4.1.7+ with UTF-8 support and the ability to run SET queries
- PHP 4.3.10+ with mbstring, mbstr-enc-trans, and mbregex enabled
- Apache on Linux 2.4.26+ with .htaccess files allowed
- An understanding of iconv for converting files and tables
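For the file-conversion part of the last requirement, the sketch below shows roughly what a command such as iconv -f KOI8-R -t UTF-8 does to a text file, written in Python. The function name convert_file and the chunk size are assumptions for illustration; converting MySQL tables is a separate job and is not covered here.

```python
def convert_file(src_path, dst_path, source_encoding, chunk_chars=64 * 1024):
    """Re-encode a text file from source_encoding to UTF-8, chunk by chunk."""
    # newline="" disables newline translation so line endings are preserved.
    with open(src_path, "r", encoding=source_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_chars)  # reads decoded characters, not bytes
            if not chunk:
                break
            dst.write(chunk)
```

Writing to a separate destination file, rather than converting in place, leaves the original untouched if the source turns out not to be in the encoding you assumed.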


cp037 cp856 cp875 iso-8859-4 symbol windows-1257
cp1006 cp857 gsm0338 iso-8859-5 turkish windows-1258
cp1026 cp860 iso-8859-1 iso-8859-6 us-ascii x-mac-ce
cp424 cp861 iso-8859-10 iso-8859-7 us-ascii-quotes x-mac-cyrillic
cp437 cp862 iso-8859-11 iso-8859-8 windows-1250 x-mac-greek
cp500 cp863 iso-8859-13 iso-8859-9 windows-1251 x-mac-icelandic
cp737 cp864 iso-8859-14 koi8-r windows-1252 x-mac-roman
cp775 cp865 iso-8859-15 koi8-u windows-1253 zdingbat
cp850 cp866 iso-8859-16 mazovia windows-1254
cp852 cp869 iso-8859-2 nextstep windows-1255
cp855 cp874 iso-8859-3 stdenc windows-1256

ucs-4 utf-16le byte2be euc-tw
ucs-4be utf-7 byte2le cp950
ucs-4le utf7-imap byte4be big-5
ucs-2 utf-8 byte4le euc-kr
ucs-2be ascii base64 uhc
ucs-2le euc-jp html-entities iso-2022-kr
utf-32 sjis 7bit
utf-32be eucjp-win 8bit
utf-32le sjis-win euc-cn
utf-16 iso-2022-jp cp936
utf-16be jis hz

Mikolaj Jedrzejak kindly gave permission to incorporate ConvertCharset (http://mikolajj.republika.pl/) into PhpDig. The plan is to use this class to convert encodings not handled by PHP itself. Thanks to Mikolaj!
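The "use a separate converter for whatever the runtime cannot handle" plan can be pictured as a simple fallback. The Python sketch below tries the built-in codecs first and only then consults a lookup table. It says nothing about how ConvertCharset works internally; the encoding name x-demo and its two-entry table are entirely hypothetical.

```python
# Hypothetical table for a made-up single-byte encoding called "x-demo".
# Bytes below 0x80 are treated as ASCII; only the listed high bytes are defined.
FALLBACK_TABLES = {
    "x-demo": {0x80: "\u20ac", 0x81: "\u0105"},  # euro sign, a with ogonek
}


def decode_any(raw: bytes, encoding: str) -> str:
    """Decode with a built-in codec if one exists, else with a fallback table."""
    try:
        return raw.decode(encoding)
    except LookupError:
        pass  # unknown to the runtime, so try the fallback tables
    table = FALLBACK_TABLES.get(encoding.lower())
    if table is None:
        raise LookupError(f"no converter available for {encoding!r}")
    chars = []
    for byte in raw:
        if byte < 0x80:
            chars.append(chr(byte))
        else:
            # Undefined bytes become U+FFFD, the Unicode replacement character.
            chars.append(table.get(byte, "\ufffd"))
    return "".join(chars)


def to_utf8_with_fallback(raw: bytes, encoding: str) -> bytes:
    """Convert raw bytes in the given encoding to UTF-8 bytes."""
    return decode_any(raw, encoding).encode("utf-8")
```

Trying the built-in converter first keeps the common encodings on the fast, well-tested path and reserves the table lookup for the few names the runtime does not know.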

EDIT: PhpDig 1.8.8 RC1 supports multiple and multibyte encodings.