
View Full Version : problem with chinese characters



paldo
02-11-2014, 01:15 PM
I want to add a Chinese page to my site. I set the charset to UTF-8, but after uploading to the server the Chinese characters don't display correctly. What am I doing wrong? When I test the page locally in my browser before uploading, the characters are displayed correctly.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>chinese characters</title>
</head>

<body>
<b>Chinese characters in UTF-8</b><br/>
Simplified characters: 简体中文网页<br/>
Traditional characters: 繁體中文網頁<br/>
</body>
</html>

Any idea? Thanks for your help.

djr33
02-11-2014, 01:17 PM
Check the encoding of the text file itself. The .htm/.txt/etc. file may be saved in some other encoding, which would distort your characters.

paldo
02-11-2014, 02:20 PM
I've read in different places that the correct charset for Chinese characters is UTF-8. I'm sorry, I don't understand what you mean.

djr33
02-11-2014, 03:28 PM
There are two different places where a character encoding is set:
1. You can declare it in the HTML code, like you have above.
2. You can save the file itself in that encoding (this is a property of how the file is written to disk, not code in the file).

Both must match for it to work.

In Notepad I believe you can use Save As... and select a new encoding if needed.

It's possible that the upload process is changing the encoding of the file, but that is unlikely and might be hard to fix, so see if the simpler answer works first.


Short version: character encodings are not stored in only one place; they must all match.
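
To make the "they must all match" point concrete, here is a minimal Python sketch. It is not from the thread: the helper name `decodes_as` and the choice of GBK as the "wrong" save encoding are assumptions made only for illustration. It checks whether the bytes of a saved file can actually be decoded with the charset the page declares.

```python
def decodes_as(raw: bytes, charset: str) -> bool:
    """Return True if the raw file bytes are valid in the given charset."""
    try:
        raw.decode(charset)
        return True
    except (UnicodeDecodeError, LookupError):
        # Invalid byte sequence, or a charset name Python does not know.
        return False


text = "简体中文网页"

# The same text written to disk in two different encodings (hypothetical case).
saved_as_utf8 = text.encode("utf-8")
saved_as_gbk = text.encode("gbk")

# The page declares charset=UTF-8, so only the first file matches.
utf8_file_matches = decodes_as(saved_as_utf8, "utf-8")
gbk_file_matches = decodes_as(saved_as_gbk, "utf-8")
```

For a real page you would read the bytes with `open(path, "rb").read()` and pass them to the same check. A file that fails it was not saved in the encoding its meta tag claims.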

paldo
02-11-2014, 03:44 PM
I'm using Dreamweaver CS5.5. I'm sorry, but I don't understand point number 2. Can you give me an example?

jscheuer1
02-11-2014, 04:24 PM
I've just been experimenting with this, and using any encoding for the file that supports these characters (Chinese, simplified Chinese, big or little endian), if the page is served as UTF-8, it still works. It might be that the server is overriding the charset declared on the page. When that happens you can get:


Simplified characters: €“*–‡‘页
Traditional characters: 繁”*–‡網

or:


Chinese characters in UTF-8
Simplified characters: ^I?
Traditional characters: c餤*

etc., if the file is saved in one of the supporting encodings but served as charset=iso-8859-1 or another ANSI-only charset.
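
The kind of garbling shown above can be reproduced with a short Python sketch. This is only an illustration under stated assumptions: the function name `misread` is hypothetical, and the exact garbage you see depends on which charset the server and browser really apply.

```python
def misread(text: str, actual: str = "utf-8", assumed: str = "iso-8859-1") -> str:
    """Encode text in its real encoding, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed)


original = "简体中文网页"

# What the reader gets when UTF-8 bytes are interpreted as iso-8859-1:
# every Chinese character turns into three unrelated single-byte characters.
garbled = misread(original)

# The bytes themselves were never damaged, so reversing the wrong step
# recovers the original text.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
```

Because the round trip restores the text, the file on the server is usually fine in this situation. Only the charset label applied to it is wrong.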

The server can override the charset. PHP and other methods can be used to force the server to use the preferred encoding or at least respect the declaration on the page. If you want more help, please include a link to the page on your site that contains the problematic code so we can check it out.
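
As a rough sketch of why the server wins: when the HTTP `Content-Type` header carries a charset, browsers use it in preference to the `<meta>` declaration in the page. The Python below models only that one rule. It ignores byte order marks and browser sniffing, and the function names are hypothetical.

```python
from email.message import Message
from typing import Optional


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # Returns the charset in lower case, or None if the header has none.
    return msg.get_content_charset()


def effective_charset(http_content_type: Optional[str], meta_charset: str) -> str:
    """Charset the browser would use: the HTTP header first, then the meta tag."""
    return charset_from_header(http_content_type) or meta_charset.lower()


# Server adds its own charset: the meta tag in the page is ignored.
overridden = effective_charset("text/html; charset=ISO-8859-1", "UTF-8")

# Server sends no charset: the declaration on the page is respected.
respected = effective_charset("text/html", "UTF-8")
```

You can see the header your server really sends in the browser's developer tools (network tab). If it names a charset other than UTF-8, that is the setting to change, for example with PHP's `header()` function or, on Apache, the `AddDefaultCharset` directive.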

molendijk
02-11-2014, 05:01 PM
If nothing helps, you can use this converter (http://mesdomaines.nu/eendracht/converters/convert_to_unicode.html) (to Unicode). Just put the Chinese text in it.
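
The thread does not say exactly what the converter outputs. Assuming it produces HTML numeric character references (an assumption, not something confirmed here), the same idea can be sketched in Python. The function name `to_numeric_references` is hypothetical.

```python
import html


def to_numeric_references(text: str) -> str:
    """Replace every non-ASCII character with an HTML reference like &#20013;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


converted = to_numeric_references("中文")

# A browser resolves the references back to the characters;
# html.unescape does the same thing here.
round_trip = html.unescape(converted)
```

The converted markup contains only ASCII characters, which are read the same way under UTF-8, iso-8859-1 and most other charsets. That would explain why it keeps working even when the server overrides the charset declared on the page.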

paldo
02-11-2014, 08:24 PM
Genius!!! It works your way. Many thanks.

paldo
02-11-2014, 08:39 PM
Thank you John. Most probably, as you said, the server is overriding the charset. I thought it might be the browser. I will contact the server provider and see if something can be changed.

I've used the converter suggested by Arie (molendijk) and it works perfectly.

Thank you to all of you for your help.