The W3C does currently recommend the use of a BOM with UTF-16, but only for HTML5 (not for prior versions). I don't pretend to understand why.
Off the top of my head, I think that the BOM might have something to do with East Asian texts...?
I thought it would be good to note that the lack of a BOM is important for web publishing. When I looked for UTF-8 in the encodings list, I just went to the entry that said "UTF-8" without taking note of the others, which seemed irrelevant given that I had already found UTF-8.
Edit: Given my last point, I think it would be more useful for applications such as Notepad++ to label the options "UTF-8" and "UTF-8 with BOM," or "UTF-8 without BOM" and "UTF-8 with BOM," instead of "UTF-8 without BOM" and "UTF-8." Of course, I know nothing about Unicode outside of web publishing, so this could have disastrous effects on files being used in other fields.
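For anyone unsure which variant a file was saved as: the only difference between the two is three bytes (EF BB BF) at the very start of the file. Here is a minimal Python sketch for checking and stripping them. It is not part of my Notepad++ setup, and the helper names has_utf8_bom and strip_utf8_bom are just names I made up for illustration.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Python's "utf-8-sig" codec writes the BOM; plain "utf-8" does not.
with_bom = "<!DOCTYPE html>".encode("utf-8-sig")
without_bom = "<!DOCTYPE html>".encode("utf-8")
```

To check a real file, read it in binary mode (open(path, "rb").read()) and pass the result to has_utf8_bom.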
Update: I created this for myself, and thought I should post it here in case anyone else keeps making the same mistake of clicking "UTF-8" when they want UTF-8 without a BOM. It is a localization file for Notepad++ (tested in Notepad++ 6.1.3) that uses the first "more useful" example in the edit above. It should be placed in the localization directory of your Notepad++ installation folder, and can be used by choosing "English (customizable)" in the "Localization" section of the Preferences dialog.
Download