You're right, they are all different encodings of the same character set. The most common is probably Shift_JIS.
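To make that concrete, here is a small Python sketch. The three codec names and the sample word are my own picks for illustration (I'm assuming Shift_JIS, EUC-JP and ISO-2022-JP are the encodings we're talking about); it only shows that the same text gives different bytes in each encoding and decodes back to the same characters.

```python
# Encodings assumed to be the ones under discussion (illustrative choice).
ENCODINGS = ["shift_jis", "euc_jp", "iso2022_jp"]


def encode_all(text):
    """Return a dict mapping each encoding name to the encoded bytes of text."""
    return {name: text.encode(name) for name in ENCODINGS}


sample = "検索"  # sample word, chosen only for illustration
encoded = encode_all(sample)

for name, raw in encoded.items():
    # The bytes differ per encoding, but each decodes back to the same text.
    print(name, raw.hex(), raw.decode(name) == sample)
```

So converting between them should just be a decode with one codec followed by an encode with another.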
I suppose that developing such a utility wouldn't be a problem for me, but I'll look for something similar on the net first.
But how do we deal with the space issue?
PhpDig won't be able to index words. I can't think of any way around this.
Will it have to index each phrase separately as a single word?
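Here is a rough Python sketch of what I mean. It is not PhpDig's actual code; I'm just assuming an indexer that splits text on whitespace, and the function name and sample sentences are made up for the example.

```python
import re


def split_on_whitespace(text):
    """Naive tokenizer: split on runs of whitespace (an assumption,
    not PhpDig's real word-splitting logic)."""
    return [tok for tok in re.split(r"\s+", text) if tok]


english = "search engines index words"
japanese = "検索エンジンは単語を索引します"  # no spaces between words

# English splits into separate words.
print(split_on_whitespace(english))
# The Japanese phrase comes back as one single token.
print(split_on_whitespace(japanese))
```

With this kind of splitting, the whole Japanese phrase ends up as one "word" in the index, which is exactly the problem.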