On Thu, 2004-08-12 at 00:41, Vintage Computer Festival wrote:
On Wed, 11 Aug 2004, Sean 'Captain Napalm' Conner wrote:
It was thus said that the Great Vintage Computer Festival once stated:
XML is a more "current" technology, but I was trying to keep with the
platform neutrality by sticking to text-only and not assuming the use of
any other technology like XML.
XML is platform neutral because it's basically ASCII, right?
Nope. XML files can be represented in multiple character sets, possibly
including (but certainly not limited to):
<snip!>
Best decide this now.
Ok, I choose US-ASCII. This will be up for debate I'm sure, but surely
US-ASCII is the most widely deployed character set in the world currently?
*If* we've decided that it's sensible to use XML for this over some other
mechanism, then does it matter? I thought that to be compliant with the
XML spec, the XML document should say what version of the spec and what
character encoding it uses?
In other words, who cares what charset is used - people can use whatever
charset makes sense for them. It just needs to be spelled out that it's
mandatory to say which charset is in use for the archive to be valid.
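
As a rough illustration of that rule, here is a minimal Python sketch that
reads the encoding named in a document's XML declaration and treats an
archive as acceptable only if one is stated. The names declared_encoding
and archive_is_valid are hypothetical, and so is the "must declare" rule
itself; it is the policy proposed above, not something the XML spec
enforces in this form. The sketch also assumes an ASCII-compatible
encoding, so it will not recognise UTF-16 documents.

```python
import re

# Matches an XML declaration at the start of the document and captures the
# optional encoding name (group 3). Assumes an ASCII-compatible encoding.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*(["\'])[^"\']*\1'
    rb'(?:\s+encoding\s*=\s*(["\'])([A-Za-z][A-Za-z0-9._\-]*)\2)?'
)

UTF8_BOM = b"\xef\xbb\xbf"


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None."""
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    match = _XML_DECL.match(data)
    if match is None or match.group(3) is None:
        return None
    return match.group(3).decode("ascii")


def archive_is_valid(data: bytes) -> bool:
    """Hypothetical archive rule: the encoding must be stated explicitly."""
    return declared_encoding(data) is not None
```

A document starting with <?xml version="1.0" encoding="Shift_JIS"?> would
pass this check, while one with no declaration, or a declaration that
omits the encoding, would be rejected.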
Someone in Japan, say, may well want to fill in data fields in the
archive (such as description) using their native language. We shouldn't
stop them from doing this and force them to use a single-byte character
set such as ASCII.
I'd rather future generations stumble across an archive in Japanese and
have to translate it if necessary at the time, rather than say someone
in Japan who wasn't too good at English be forced to fill in data in
English and end up with ambiguous information.
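
To make that concrete, here is a small Python sketch of a Japanese
description surviving a round trip through an XML document that declares
its encoding. The element names archive and description are made up for
illustration, and UTF-8 is just one possible choice of encoding. It uses
the standard xml.etree.ElementTree module and needs Python 3.8 or later
for the xml_declaration argument.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive record with a description in Japanese.
description = "8ビットのパソコン"

root = ET.Element("archive")
ET.SubElement(root, "description").text = description

# Serialise to bytes, stating the encoding in the XML declaration.
data = ET.tostring(root, encoding="utf-8", xml_declaration=True)

# A reader who knows nothing but the bytes can still recover the text,
# because the parser honours the declared encoding.
parsed = ET.fromstring(data)
recovered = parsed.find("description").text
```

The point is only that the reader needs the declaration, not a shared
assumption about which charset everyone uses.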
cheers
Jules