I know we've been over this before. I just want to be sure nothing
has changed and that PDF is still the work of the Devil, so I'm
setting this down for posterity and comments by and for posterity.
It is folly to pretend there are not any good reasons why HTTP is the
most wildly successful Internet protocol. It is because of HTML, and
HTML is successful for reasons that can be stipulated, too.
From my point of view it's all about abstracting presentation from
content. Markup languages are good at that: Text is text. What isn't
text is structure. What isn't structure is presentation folderol.
When individuals post content on the Web, they do it for consumption
by uncounted millions of readers around the universe. Rationally they
concern themselves with making content available to the widest
possible audience. They have to organize their content, so they need
to provide some structure. It is clear nowadays that they don't have
to prescribe how their content ought to be presented. Leaving that up
to the reader works fine. It works to some extent for people who
don't read in the language of the author, for people who don't read,
and even for people who don't see to read, so any attempt to specify
presentation of content is more than likely to get in the way of its
wider distribution. Yet we have individuals who -- apparently
irrationally -- try to save themselves trouble by posting PDF, which
instead bundles presentation with content.
PDF documents are a take-it-or-leave-it proposition. Content is
obscured by layout cues, by background colors, patterns and images,
and even by font choice, ligatures, and hyphenation. It's like
untangling a captcha to copy anything from a PDF document onto the
clipboard. You can't clear away the presentation formatting to get at
the content and its structure.
People post PDFs because they have little respect (and no sympathy)
for their audience. What can one say really about authors who wish to
block the cutting of content to the clipboard from within their
readers' viewers, which many PDF authors do by specifying heightened
security settings for their work?
I use PDFs when I'm sending advertising material to quick-print stores
and newspapers. I want to be sure the content is exact, and I want to
nail down the layout. I want them to use my fonts, which they seem
inordinately reluctant to do. I don't use PDFs for anything else.
If you have brochures you want other people to print for you, you will
probably want to send them as PDFs. However, the situations in which
you would post such documents on the Web are much rarer than would be
indicated by the prevalence of PDF links that turn up in Google
searches. Usually you want people to read first and then print.
Posting PDFs instead of HTML means you've got that bassackwards.
If you think you can control your online readers' experience more
exactly by sending them PDFs, you are whistling Dixie. Many
prospective readers will not take the trouble to download and open a
PDF or even to download and install a PDF viewer if you tell them
beforehand that's what your links lead to.
I know people are going to point out that posting PDFs is preferable
to posting Microsoft Word documents and nearly as easy to do for naive
authors who are unfamiliar with HTML. This may be acceptable to
limited audiences of equally naive readers, but is unacceptable for
targeting broad swaths of the general public who are not so naive.
I will stipulate that posting PDFs is preferable to posting the
execrable HTML that Microsoft Word generates. Don't get me started.
That stuff may look good enough in Microsoft Internet Explorer. It
does not necessarily look very good in Firefox and other browsers, and
it takes a long time to retrieve and render. Furthermore, it probably
contains concealed information about the author and may even include
hidden deleted content and commentary metadata that should not be
published lest someone of less naivete should look for, discover, and
decipher it. It usually contains tons of formatting shifts into and
out of fonts of different sizes, weights, faces, styles, and colors,
each with null content. These are artifacts of cutting and pasting
portions of paragraphs with non-default formatting in and out, back
and forth, and through and through the document during composition.
The bloat that these presentation specifications introduce is not
after all related in any significant way to the final structure and
content of the document. Thus, inefficient efforts to specify
presentation can ultimately swamp the content.
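To make the bloat concrete, here is a minimal Python sketch, under
stated assumptions, of stripping the kind of null-content formatting
elements described above. The function name strip_empty_formatting
and the short list of tags are hypothetical choices made for
illustration, and a regular expression is only a rough stand-in for a
real HTML parser.

```python
import re

# Assumption: these presentation-only tags are safe to drop when empty.
# The list is illustrative, not exhaustive.
_EMPTY_TAG = re.compile(r"<(span|font|b|i|u)\b[^>]*></\1\s*>", re.IGNORECASE)

def strip_empty_formatting(html: str) -> str:
    """Repeatedly delete formatting elements that wrap nothing at all."""
    previous = None
    # Loop until a pass changes nothing, so that nested empties such as
    # <b><i></i></b> are removed from the inside out.
    while previous != html:
        previous = html
        html = _EMPTY_TAG.sub("", html)
    return html

sample = '<p>Hello <span style="color:red"></span><b><i></i></b>world</p>'
print(strip_empty_formatting(sample))
```

Elements that actually carry text are left alone. Only the shifts
into and out of formatting with nothing in between are discarded,
which is the point: none of that markup bears on the structure or
content of the document.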
Here then is the case for HTML against PDF:
o Hypertext URLs within PDF documents are a pale imitation of robust
support for the myriads of link types such as eMail addresses and
phone numbers provided by graphical browsers nowadays. Many PDF
authors do not make URL references "hot" so that the reader can
traverse them. Many PDF viewers cannot load and transfer control to
other external viewers such as Internet browsers to handle non-native
content provided by hot links. Heck, even the *gnome-terminal* and
almost all eMail and News clients can guess when a URL shows up in
text on the screen and allow traversal!
On the other hand, that's what Internet browsers do. If you are going
to structure your content as hypertext or even show a few URLs in it,
you should post it as HTML.
o PDF links cannot be presented by most browsers in the browser
window. Instead a separate viewer must be loaded to read and present
the document. In real-world implementations, downloading the document
is handled by the browser's retrieval mechanism, so the viewer doesn't
even start to load until the download is completely accomplished. The
reader sees no partial results and is, thus, unable to audit the state
or relevance of the document, but he must wait upon its entirety.
Also, the viewer will have a different look and feel from the browser,
and the reader will have to be aware of it and accommodate it. If
that were not enough to condemn the use of PDF links, consider that
there is no back button that will unequivocally quit the viewer and
return the reader to his previous position on the browser page.
On the other hand, HTML browsers handle HTML natively. This in and of
itself must commend the use of HTML to the exclusion of other document
formats.
o Most Web page designers do not impose presentation specifications
that prevent successful text wrapping on even the narrowest devices
such as cell phones and tablets. Thus the layout is deemed vertical,
and the only scrolling required is vertical, which is natural and
customary to readers who use browsers.
On the other hand, graphic artists and typesetters are obsessed with
casting brochures into side-by-side columns. Reading these in a PDF
viewer on even the widest devices requires the user to perform a
complex pirouette between horizontal and vertical scrolling, and, even
if it did not, simulating the column and page breaks on the screen still
presents an artificial and unnecessary distraction from the content.
Tablet devices are said to be ushering in a new paradigm where
horizontal scrolling (page turning or page flipping) is the norm. Let
it be said that tablet computing has been discredited more than once
in recent memory principally due to deficient human interfaces
(pointing devices and keyboards). Let's be frank that tablet devices
need documents to be formatted with specific page sizes and cannot
perform very well on those that are too wide or too deep. Evidently
publishers really believe that all tablet device users have eyes the
same size just as book readers do.
o Many mobile devices and especially those without sufficient RAM
storage cannot render PDF documents.
On the other hand, they do a terrific job of rendering HTML.
--
.. Be Seeing You,
.. Chuck Rhode, Sheboygan, WI, USA
.. Weather: http://LacusVeris.com/WX
.. 50° — Wind NNE 10 mph