It was thus said that the Great Philip Pemberton once stated:
Gordon JC Pearce wrote:
Is anyone else seeing broken UTF-8 characters on the Jargon File? I'm
using Firefox 2.0.0.3 on Ubuntu Feisty.
Same here, with Firefox 2.0.0.3 on Windows 2000 SP4 and Firefox 2.0 on
Fedora Core 6.
Looks awfy like it's been pasted from an MS Word doc - surely not?
<meta name="generator" content="DocBook XSL Stylesheets V1.61.0"/>
So either the DocBook source or the DocBook->XHTML converter he's using is
FUBAR.
It's actually HTTP in this case. If you select the UTF-8 encoding (under
Firefox, "View -> Character Encodings -> UTF-8") you'll see the page
correctly. Internally, the page is declared as UTF-8 (in the <?xml ...?>
declaration) but Firefox *has* to accept the character set that Apache
sends (in this case ISO-8859-1, which is the Apache default, by the way) and
interpret the page that way. So even though the document itself specifies
UTF-8, because of the way conflicting character set information is resolved,
the HTTP server (in this case, Apache) wins (weird, I know, but that's the
way it is).
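
For the curious, here's a rough Python sketch of that precedence rule. It
is not what Firefox actually runs, just an illustration: the function names
are made up for this example, the sample body is invented, and the UTF-8
fallback is an assumption on my part. It picks the charset from the HTTP
Content-Type header first and only looks at the XML declaration if the
header doesn't name one.

```python
import re


def charset_from_content_type(header):
    """Return the charset parameter of a Content-Type value, or None."""
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', header or "", re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_xml_declaration(body):
    """Return the encoding named in a leading <?xml ...?> declaration, or None."""
    match = re.match(rb'<\?xml[^>]*encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', body)
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type, body, fallback="utf-8"):
    """Resolve conflicting charset information; the HTTP header wins.

    The fallback value is an assumption made for this sketch.
    """
    return (charset_from_content_type(content_type)
            or charset_from_xml_declaration(body)
            or fallback)


# A made-up page: declared as UTF-8 internally, containing curly quotes.
body = '<?xml version="1.0" encoding="UTF-8"?><p>\u201cquoted\u201d</p>'.encode("utf-8")

# The server labels it ISO-8859-1, so that is what gets used.
chosen = effective_charset("text/html; charset=ISO-8859-1", body)
garbled = body.decode(chosen)    # what the browser ends up displaying
correct = body.decode("utf-8")   # what you get after forcing UTF-8 by hand
```

With the header present, `chosen` comes out as ISO-8859-1 and the curly
quotes decode to junk; drop the charset from the header and the XML
declaration gets its say.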
-spc (So there you go)