On Fri, 2005-06-03 at 15:48 +0000, Jules Richardson wrote:
> If the PDF spec's on Adobe's web site then I haven't been able to find
> it yet...
Found it at:
http://partners.adobe.com/public/developer/pdf/index_reference.html
... via Google; searching Adobe's own website for any variation on
"PDF specification" found nothing!
Image XObject streams (section 4.8.4) seem to support an optional
"Metadata stream" that appeared in version 1.4 of the PDF specification;
prior to that metadata could only be attached to the doc as a whole, not
to individual parts.
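
For anyone who wants to see roughly what that looks like in the file, here's a minimal Python sketch that serialises an Image XObject whose dictionary points at a separate metadata stream via a /Metadata entry. The function names are my own invention, the XML payload is just a placeholder rather than real XMP, and there's no xref table or trailer, so this is only a fragment and not a usable PDF:

```python
def pdf_stream_object(obj_num, dict_entries, payload):
    """Serialise one indirect stream object: dictionary, then raw bytes."""
    header = "%d 0 obj\n<< %s /Length %d >>\nstream\n" % (
        obj_num, dict_entries, len(payload))
    return header.encode("ascii") + payload + b"\nendstream\nendobj\n"


def image_with_metadata(image_num, meta_num, width, height, gray_pixels, xml_bytes):
    """Return (image object, metadata object) as bytes.

    Assumes an uncompressed 8-bit greyscale image, purely to keep it short.
    """
    image_dict = (
        "/Type /XObject /Subtype /Image /Width %d /Height %d "
        "/ColorSpace /DeviceGray /BitsPerComponent 8 "
        "/Metadata %d 0 R"  # indirect reference to the metadata stream
    ) % (width, height, meta_num)
    meta_dict = "/Type /Metadata /Subtype /XML"
    return (
        pdf_stream_object(image_num, image_dict, gray_pixels),
        pdf_stream_object(meta_num, meta_dict, xml_bytes),
    )


# Hypothetical 2x2 image as object 5, with its metadata as object 6.
image_obj, meta_obj = image_with_metadata(
    5, 6, 2, 2, b"\x00\xff\xff\x00", b"<placeholder/>")
print(meta_obj.decode("ascii"))
```

The point is just that the metadata hangs off the individual image object rather than off the document as a whole.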
It looks like what appears in the metadata stream is fairly tightly
controlled though. What's missing is any defined way of saying what the
original image was, and therefore how to either convert image metadata
to PDF metadata format, or do the reverse when extracting image data
back out of the PDF file. In both cases that'd be dependent on what the
tools decided to do; image metadata preservation is totally up to the
tools and nothing in the spec says where or how it should be preserved.
I suppose that makes sense as far as Adobe are concerned; they consider
PDF an end format, and assume nobody would ever want to actually
deconstruct one into some other format. Likewise it's really beyond
their scope to say how to convert metadata when pixel images are recoded
to PDF's internal representation; as a group we'd likely have to come up
with a code of conduct for that if PDF's the way forward.
There is a reference to an XMP ("Extensible Metadata Platform") doc which
I'll have to try and get hold of to see what that's all about...
No wonder there are some awful PDF tools out there; I'd never realised
quite how complicated a format it is!
cheers
Jules