
Thread: utf-8 encoding mystery

  1. #1

    utf-8 encoding mystery

    I have a problem with the display of Spanish characters on some pages and not on others.

    Two examples:

    http://estateagentsespana.com/utftest1.htm
    http://estateagentsespana.com/utftest2.htm

    If you hover over the Spanish flag icon at the top right-hand corner of the page, the Alt text displays Spanish characters correctly in utftest2.htm, but displays them corrupted in utftest1.htm.

    Both files were edited using gedit in Ubuntu 16.04. The Spanish characters display correctly in the editor.

    Most strangely, if I copy/paste the entire <head> section from utftest2 to utftest1, the characters still show corrupted, whereas if I copy the entire <head> section from utftest1 to utftest2, the characters display correctly.

    It looks as if something outside the <head> section is causing problems, but I'm at a total loss as to what's going on here.

  2. #2

    Oh, it's something outside the head section. I have an idea, but there are too many variables I'm unable to check from here to know. I can tell you a few things that could be impacting this:

    1.) Not all servers will serve an HTML document as UTF-8 even if it has a meta tag directing them to do so. Some servers have a default override and serve in another encoding (often ISO-8859-1 or similar).
    2.) The characters you want can be represented in either UTF-8 (multibyte) or ISO-8859-1 and similar single-byte encodings, but the page itself must be saved in the same encoding in which it will be served (or a compatible one) for these 'extended' characters to render properly in the browser.
    3.) Text editors like gedit (fairly sure, haven't used that one in some time, but even Windows Notepad can do this) can save your document in a variety of encodings.
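
    To illustrate point 2, here is a minimal Python sketch of what a mismatch between the saved and the served encoding does to an accented character. The sample word is my own assumption, not text taken from either test page.

```python
# Illustrative sample text (an assumption, not taken from the test pages).
text = "Español"

latin1_bytes = text.encode("iso-8859-1")  # "ñ" is the single byte 0xF1
utf8_bytes = text.encode("utf-8")         # "ñ" is the two bytes 0xC3 0xB1

# Saved as ISO-8859-1 but read as UTF-8: 0xF1 followed by "o" is not a
# valid UTF-8 sequence, so the decoder substitutes U+FFFD for it.
saved_latin1_read_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# Saved as UTF-8 but read as ISO-8859-1: each of the two bytes is shown
# as its own character.
saved_utf8_read_latin1 = utf8_bytes.decode("iso-8859-1")

print(saved_latin1_read_utf8)
print(saved_utf8_read_latin1)
```

    The first case shows a replacement character where the "ñ" should be, and the second shows two unrelated characters in its place. Either way the page only renders properly when both sides agree on the encoding.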

    So one or more of these facts is probably at work here. Perhaps the document that works is saved in ISO-8859-1 and the server is serving it as that, while the document that doesn't work is true UTF-8. Or the server may be serving in UTF-8, but the problem document may be saved as ISO-8859-1 or another single-byte encoding. It might even be saved in a different multibyte encoding incompatible with UTF-8. There are likely other possibilities.

    Upon further examination using the W3C validator, it appears that the problem document was saved in your editor as ISO-8859-1 (most likely), or in some other single-byte or otherwise UTF-8-incompatible encoding. It also appears that both documents are being served as UTF-8. But I still wouldn't completely rule out another explanation or combination of issues.

    With yet further examination using IE11's encoding feature, it now appears almost certain that the problem page was saved in the editor as ISO-8859-1 (aka Latin-1), or possibly Western European (Windows), and that both pages are being served as UTF-8.
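
    The same two checks can be roughed out without a validator or browser. This is only a sketch under assumptions: the helper names are hypothetical, the byte strings stand in for the contents of the two files, and the header value is an example, not one captured from the server.

```python
from email.message import Message


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def charset_from_content_type(header_value: str):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


# Hypothetical stand-ins for the contents of the good and the bad file.
good_page = "título".encode("utf-8")
bad_page = "título".encode("iso-8859-1")

print(is_valid_utf8(good_page))
print(is_valid_utf8(bad_page))
print(charset_from_content_type("text/html; charset=UTF-8"))
```

    Note that a file containing only plain ASCII passes the UTF-8 check whichever of these encodings it was saved in, so the check only tells you something once accented characters are present.
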
    Last edited by jscheuer1; 08-21-2017 at 03:40 PM. Reason: add info
    - John

  3. The Following User Says Thank You to jscheuer1 For This Useful Post:

    wkennyspain (08-21-2017)

  4. #3

    Hi John,

    Many thanks as always for your help. Gedit was saving the 'bad' file as Western ISO-8859-15. Changing that to UTF-8 has solved the problem.
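
    For anyone who would rather script the same fix than change the encoding in gedit's save dialog, here is a minimal Python sketch. It assumes the file really is ISO-8859-15 at the moment; the function name and the sample file name are hypothetical, and it works on a temporary copy instead of a real page.

```python
import tempfile
from pathlib import Path


def reencode_to_utf8(path: Path, source_encoding: str = "iso-8859-15") -> None:
    """Rewrite a file in place as UTF-8, assuming it is in source_encoding."""
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))


# Demonstration on a throwaway file (hypothetical name and content).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.htm"
    sample.write_bytes("año".encode("iso-8859-15"))
    reencode_to_utf8(sample)
    converted = sample.read_bytes()

print(converted.decode("utf-8"))
```

    Running it on a file that is already UTF-8 would corrupt the accented characters, so it is worth checking the current encoding first.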

    On a side note, when trying to get to the bottom of the problem, I tried changing the encoding to ISO-8859-1. When I ran that through the W3C HTML validator, it raised a warning and advised me to change it to UTF-8.

    Thanks again
