Keep in mind that I am not an advocate of DjVu; I'm just pointing out some
technical reasons why PDF is not perfect for everything. Read below:
Eric Smith wrote:
DjVu has other advantages, such as local/window/viewport decoding of
images with ludicrously high dimensions/resolutions, but I understand
your point.
I'm not sure I fully understand, but it doesn't sound like anything
that the PDF format can't support. I would rather invest effort into
improving the capabilities of free PDF viewer software such as xpdf
rather than pushing a different standard.
PDF cannot support local decoding (at least, not without some major redesign).
DjVu supports decoding any local region of a gigantic bitmap, say,
20000x20000. For example, you could pan around said image in a window without
having to download it all.
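
To make "local decoding" concrete, here is a minimal sketch of the general
idea only. It is not DjVu's actual format or codec: it assumes an 8-bit
greyscale image cut into independently compressed tiles, with zlib standing
in for a real image codec. The names encode_tiled, decode_region and TILE are
made up for illustration.

```python
import zlib

TILE = 64  # tile edge in pixels (an assumed value, not taken from any format)

def encode_tiled(pixels, width, height, tile=TILE):
    """Compress each tile on its own; returns {(tile_x, tile_y): bytes}."""
    tiles = {}
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            rows = [bytes(pixels[y][tx:tx + tile])
                    for y in range(ty, min(ty + tile, height))]
            tiles[(tx // tile, ty // tile)] = zlib.compress(b"".join(rows))
    return tiles

def decode_region(tiles, width, height, x0, y0, w, h, tile=TILE):
    """Decode only the tiles overlapping the viewport (x0, y0, w, h).

    The viewport is assumed to lie entirely inside the image.
    Returns (rows, number_of_tiles_decoded).
    """
    out = [bytearray(w) for _ in range(h)]
    decoded = 0
    for ty in range(y0 // tile, (y0 + h - 1) // tile + 1):
        for tx in range(x0 // tile, (x0 + w - 1) // tile + 1):
            raw = zlib.decompress(tiles[(tx, ty)])
            decoded += 1
            tw = min(tile, width - tx * tile)    # width of this tile
            th = min(tile, height - ty * tile)   # height of this tile
            cx0 = max(x0, tx * tile)             # column overlap with viewport
            cx1 = min(x0 + w, tx * tile + tw)
            for y in range(max(y0, ty * tile), min(y0 + h, ty * tile + th)):
                start = (y - ty * tile) * tw + (cx0 - tx * tile)
                out[y - y0][cx0 - x0:cx1 - x0] = raw[start:start + (cx1 - cx0)]
    return out, decoded
```

A viewer panning around the image would call decode_region for the visible
window and fetch only the tiles that window touches, rather than the whole
bitmap.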
Where are the tools to create DjVu-like PDF files?
I've been doing it myself with an experimental version of my "tumble"
program. With that, it's an entirely manual process. I have to split
the continuous-tone images into a separate layer or file using a
separate editor (such as Gimp). Then I use tumble to compose a page
with the background as G4 and the images as JPEG.
You're not exactly helping your argument if you have to write your own software
to do it *and* it's laborious :-)
I hope to automate the multi-color text problem in tumble using code
derived from Tim Shoppa's "timify.c". Automating the detection and
processing of continuous tone images is in my plans as well, but further
out. There don't seem to be any good published algorithms for image
detection, so I'll have to experiment with it. As a first step I plan
to do 2D FFTs on areas of the page; text and line art should predominantly
have DC and high frequencies, while continuous-tone images should have
a more even frequency distribution.
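
As a rough sketch of the frequency measurement described above (not tumble's
code, and not a tested classifier), the following computes how a small
block's spectral energy splits into DC, low and high bands. It uses a plain
DFT from the standard library rather than a real FFT, which is only
practical for small blocks. The band boundary at n // 4 and the names dft1,
dft2 and band_energy are assumptions made for illustration; any decision
thresholds would have to be found by experiment.

```python
import cmath

def dft1(seq):
    """Naive 1D discrete Fourier transform (O(n^2))."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dft2(block):
    """2D DFT of a square block: transform rows, then columns."""
    rows = [dft1(r) for r in block]
    n = len(rows)
    # Result is indexed [u][v]: horizontal frequency u, vertical frequency v.
    return [dft1([rows[y][u] for y in range(n)]) for u in range(n)]

def band_energy(block):
    """Return (dc, low, high) as fractions of the block's total spectral energy."""
    n = len(block)
    spec = dft2(block)
    dc = low = high = 0.0
    for u in range(n):
        for v in range(n):
            energy = abs(spec[u][v]) ** 2
            # Fold indices so n-1 counts as frequency 1, and so on.
            f = max(min(u, n - u), min(v, n - v))
            if f == 0:
                dc += energy
            elif f <= n // 4:   # assumed low/high boundary
                low += energy
            else:
                high += energy
    total = dc + low + high
    if total == 0:
        return 0.0, 0.0, 0.0
    return dc / total, low / total, high / total
```

For example, an 8x8 one-pixel checkerboard puts its energy only in the DC and
highest-frequency terms, while a smooth horizontal ramp puts most of its
non-DC energy in the low band.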
So far I'm doing this work by scanning a page twice, once in bilevel and
once in greyscale or color. I do that because the published algorithms
for converting greyscale text and line art to bilevel (thresholding) are
nowhere near as good as what's done in a good scanner. Picture Elements
makes a PCI card that can be used to do this (and even works with Linux),
but it's very expensive so I really want a software-only solution.
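
For reference, here is a minimal sketch of one well-known published global
thresholding method, Otsu's method, which picks the grey level that best
separates the histogram into two classes. It is shown only as an example of
the kind of published algorithm meant above; it is not what a scanner or the
Picture Elements card does. The names otsu_threshold and binarize are made up
for illustration, and the input is assumed to be a flat list of 8-bit grey
values.

```python
def otsu_threshold(pixels):
    """Return the grey level t that maximizes between-class variance.

    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of their grey values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break               # no pixels left for the light class
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def binarize(pixels, t):
    """Map each grey value to 0 (at or below t) or 1 (above t)."""
    return [1 if p > t else 0 for p in pixels]
```

A single global threshold like this ignores local contrast and uneven
illumination, which is one reason such methods can fall short of a scanner's
own bilevel output.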
All of this sounds like Pagis Pro, a 5+ year old program that interfaced with
scanners beyond 24-bit (most scanners are 30-bit, 36-bit, or more). It
automatically distinguished text from graphics and encoded each appropriately.
--
Jim Leonard (trixter(a)oldskool.org)
http://www.oldskool.org/
Want to help an ambitious games project?
http://www.mobygames.com/
Or check out some trippy MindCandy at
http://www.mindcandydvd.com/