Character encoding problem



wkenny
03-23-2007, 07:37 PM
In the head of my document I have
<meta http-equiv="content-type" content="text/html; charset=UTF-8">

and in the body the text "años" but the ñ is being corrupted when I look at the page in IE6 or Firefox.

I have checked both browsers and their encoding is also UTF-8. However, if I change the browser encoding to Western European the "ñ" shows correctly. If I View Source in IE, the "ñ" also shows correctly but View Source in Firefox replaces the "ñ" with a question mark.

Can anybody tell me why the characters are not rendering correctly?

boxxertrumps
03-23-2007, 08:15 PM
Are you using its character code?
The codes usually look something like &amp;
http://www.asciitable.com/
So put
&#241;
in the source where you want the ñ.
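
For reference, here is a minimal Python sketch of what a numeric character reference for "ñ" looks like. Python is used purely for illustration (nobody in the thread used it), and the variable names are made up. It relies on the fact that HTML numeric references use the Unicode code point, which for "ñ" is 241, and not the position shown in an extended-ASCII chart.

```python
import html

text = "años"

# Replace every non-ASCII character with a numeric character reference.
escaped = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

# Decoding the references again gives back the original text.
round_trip = html.unescape(escaped)

# The named reference for the same character.
named = html.unescape("a&ntilde;os")

# Code 164 from an extended-ASCII chart is a different character in HTML.
not_n_tilde = html.unescape("&#164;")
```

Here `escaped` is `a&#241;os`, while `&#164;` decodes to the currency sign "¤".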

wkenny
03-23-2007, 08:23 PM
I am already using coding like this but think it should not be necessary. I am certain that sites in Spanish, for example, would be a nightmare to build if every non-English character had to be coded like that.

boxxertrumps
03-23-2007, 08:32 PM
It is necessary.
HTML is in English, and so are the character codes.

Also, ASCII doesn't support any characters other than the ones in the first picture in the link I gave you.

wkenny
03-23-2007, 08:49 PM
Yeah, I've just had a look at a Spanish site and they use the codes.

Thanks for that.

Twey
03-23-2007, 09:30 PM
> I am already using coding like this but think it should not be necessary.

You're quite right, it shouldn't be.

> <meta http-equiv="content-type" content="text/html; charset=UTF-8">

Be aware that this will be overridden if a character set is specified in the HTTP headers (which is the preferred place to specify the character set).

> I have checked both browsers and their encoding is also UTF-8. However, if I change the browser encoding to Western European the "ñ" shows correctly. If I View Source in IE, the "ñ" also shows correctly but View Source in Firefox replaces the "ñ" with a question mark.

You've specified UTF-8 (assuming that hasn't been overridden), but it sounds as though the file is actually in ISO-8859-1. Make sure you're saving the file as UTF-8 (it's an option in the save dialogue of Notepad; most other text editors will have something similar).
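
The mismatch described above can be reproduced in a few lines. This is only a sketch in Python (not something used in the thread), with illustrative variable names. It assumes the file on disk holds ISO-8859-1 bytes while the page declares UTF-8.

```python
# "años" as it would be stored on disk in each encoding.
latin1_bytes = "años".encode("iso-8859-1")  # ñ is the single byte 0xF1
utf8_bytes = "años".encode("utf-8")         # ñ is the two bytes 0xC3 0xB1

# A reader told "UTF-8" but given ISO-8859-1 bytes meets an invalid sequence
# and substitutes U+FFFD, the replacement character.
misread = latin1_bytes.decode("utf-8", errors="replace")

# Telling the reader the bytes are ISO-8859-1 decodes them correctly,
# which matches what happens when the browser is switched to Western European.
reread = latin1_bytes.decode("iso-8859-1")
```

`misread` comes out as "a", the replacement character, then "os", which is the kind of corruption the browsers show, while `reread` is the intended "años".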

wkenny
03-24-2007, 02:13 AM
Thanks, Twey, but how do I check or change HTTP headers? The pages are hosted on a service provider - do they decide what header is provided?

With regard to writing the files, I use VB6 to output the .htm file. However, I have verified that the output is not putting in any "funnies" by opening the .htm files in metapad.

Twey
03-24-2007, 02:25 PM
> Thanks, Twey, but how do I check or change HTTP headers? The pages are hosted on a service provider - do they decide what header is provided?

The headers sent can be checked with a browser plugin or extension, or by connecting to your webserver with a plain-text application such as telnet:
twey@peordh /home/twey $ telnet twey.co.uk 80
Trying 82.110.105.24...
Connected to twey.co.uk.
Escape character is '^]'.
HEAD / HTTP/1.1
Host: twey.co.uk
Connection: close

HTTP/1.1 200 OK
Date: Sat, 24 Mar 2007 14:23:24 GMT
Content-Type: text/html;charset=utf-8;
Expires: Sat, 24 Mar 2007 14:23:24 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Connection: close
Server: Apache/2.0.55 (Red Hat)
X-Powered-By: PHP/4.4.4
Pragma: no-cache
Set-Cookie: PHPSESSID=c2b041c0446210e8f8d6f58995933c02; path=/
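
The same check can be scripted instead of typed into telnet. What follows is a rough Python sketch, not part of the original session: the function names `fetch_content_type` and `charset_from_content_type` are made up for illustration, and it assumes the server answers plain HTTP on port 80 as in the transcript above.

```python
import http.client
from email.message import Message


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() lower-cases the value it finds.
    return msg.get_content_charset()


def fetch_content_type(host, path="/"):
    """Send a HEAD request and return the raw Content-Type header, or None."""
    conn = http.client.HTTPConnection(host, 80, timeout=10)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().getheader("Content-Type")
    finally:
        conn.close()
```

Calling `charset_from_content_type(fetch_content_type("twey.co.uk"))` would report whatever character set the server declares, if any. A result of `None` means the header does not name one, so the <meta> element is what the browser falls back on.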

> With regard to writing the files, I use VB6 to output the .htm file. However, I have verified that the output is not putting in any "funnies" by opening the .htm files in metapad.

There's nothing wrong with the file; it's just not UTF-8. I can't remember how to specify the character encoding to use when writing to a file in VB6, but an equally valid solution would be to change your headers or <meta> elements to use the correct encoding (ISO-8859-1) instead of UTF-8.
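
If changing the declaration is not an option, the files could also be converted after VB6 has written them. Here is a minimal Python sketch of that idea. It is an alternative that nobody in the thread actually tried, the helper names are hypothetical, and it assumes the files really are ISO-8859-1 (if VB6 writes Windows-1252 instead, the codec name would need to be "cp1252").

```python
from pathlib import Path


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 without touching line endings."""
    return data.decode("iso-8859-1").encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """Hypothetical helper: read src as ISO-8859-1, write dst as UTF-8."""
    Path(dst).write_bytes(latin1_to_utf8(Path(src).read_bytes()))
```

Working on bytes rather than text mode keeps the original line endings intact, so only the non-ASCII characters change.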

Also note that if you use <meta> elements to declare the character encoding, your character encoding of choice must be ASCII-compatible.
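
The reason is that the browser has to read the <meta> element before it knows which encoding the page uses, so the bytes of the tag itself must look the same as they would in ASCII. A quick Python sketch (illustrative only, with made-up variable names) shows which encodings satisfy that:

```python
meta = '<meta http-equiv="content-type" content="text/html; charset=UTF-8">'

ascii_bytes = meta.encode("ascii")

# UTF-8 and ISO-8859-1 encode ASCII characters exactly as ASCII does.
same_in_utf8 = meta.encode("utf-8") == ascii_bytes
same_in_latin1 = meta.encode("iso-8859-1") == ascii_bytes

# UTF-16 uses two bytes per character, so the tag is unreadable as ASCII.
same_in_utf16 = meta.encode("utf-16-le") == ascii_bytes
```

The first two comparisons are true and the last is false, which is why an encoding like UTF-16 has to be announced some other way, such as the HTTP header.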

wkenny
03-26-2007, 04:51 PM
Using the ISO-8859-1 encoding has solved the problem. Thanks a lot, Twey.