%PDF-1.5
%
4069 0 obj
<< /Linearized 1 /L 16898026 /H [ 15415456 812 ] /O 4072 /E 16344460 /N 12 /T 16816520 >>
endobj
60000 0 obj
<< /Length 15413236 >>
stream
.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-
!!! If you see garbage below, make sure your text editor is set to use !!!
!!! UTF-8 encoding. For example, in Vim, switch with :e ++enc=utf-8 !!!
.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-
_ _ __ ISSUE 1 £1.99 JUNE 2021
_ | | __ _| |__ / /_ In this issue:
_| |_ | |/ _` | '_ \| '_ \ New Frontiers in PDF Accessibility
|_ _| | | (_| | |_) | (_) | Publishing in PDF: preliminary experimental
|_| |_|\__,_|_.__/ \___/ results
Fantasy Footbyte
Colophon
Tracking
█▀▀▀▀▀█ ▀▀▀█▄█▄ █ █▀▀▀▀▀█
█ ███ █ ▀▄█ ▀█▄▄ █ ███ █
█ ▀▀▀ █ ▄▀█▀██ █ ▀▀▀ █
▀▀▀▀▀▀▀ █▄▀ █ █ █ ▀▀▀▀▀▀▀
▀█▄██▄▀ ▄▀▄ ▀▀█▄▀ ▀▄▄▄▄▄▀
█▄▄▄▀▀▀ ▄▀ ▀ ██▀▀█▄██▄▄█
█▄ ▀▀▄▀██▀█ ▄▀▀▄█ ▀▀▄ ▄▀
█▀▄▄▀▀▀█▄██ ▄▄██▄ ▀▄▀██▀█
▀ ▀ ▀▀ █▄ █▀ ▀█▀▀▀█ ██
█▀▀▀▀▀█ █ ▄▀█▀▄█ ▀ █ ▀
█ ███ █ ███▀▄█ ▀▀▀█▀▀ ▄█
█ ▀▀▀ █ ▄ ▀█▄█ ▀ ▄▄█▀███
▀▀▀▀▀▀▀ ▀▀ ▀▀▀ ▀▀▀ ▀ ▀ 1LAB6ABnKev8dXGpHPCctyJfnUunJs13ah
Find the disk image as a PDF attachment.
.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-=-._.-
New Frontiers in PDF Accessibility
==================================
This file is both a valid PDF/A-3b document and a valid MP3 file containing a
dramatic reading of the content. It is also readable as plain text in any text
editor. It has been tested in applications such as Adobe Reader, Windows Media
Player, and Vim.
Why? Because sometimes you want to read Lab 6 with full colour and glorious
layout, and sometimes you want to read Lab 6 in a text mode terminal, and
sometimes you want to listen to it while swimming - and you don't want to
maintain multiple copies in separate files that might go missing.
How? Through the magic of binary polyglots!
While many PDF readers take a flexible approach to finding the 5 PDF header
bytes %PDF-, the spec requires that they occur at offset zero. Adobe Reader is
happy to search the first 1024 bytes for the magic string, but validation tools
are rightly stricter. Meanwhile, the MP3¹ file format has no overall header,
instead consisting of a sequence of frames for which players must search. I
suspect that this behaviour is helpful in the context of live streams that may
deliver partial frames and leading garbage such as HTTP headers; the player just
needs to find the frame sync bits and pick up from there.
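To make the two search strategies concrete, here is a minimal Python sketch.
The function names are hypothetical, and the sync check is deliberately naive:
it only looks for the 11 set bits that open an MPEG audio frame, whereas a real
player would also validate the remaining header fields before trusting a match.

```python
PDF_MAGIC = b"%PDF-"

def find_pdf_header(data: bytes, strict: bool = True) -> int:
    """Return the offset of %PDF-, or -1 if it is not where we expect it.

    strict=True models a validator (header must sit at offset zero);
    strict=False models a lenient reader scanning the first 1024 bytes.
    """
    if strict:
        return 0 if data.startswith(PDF_MAGIC) else -1
    return data.find(PDF_MAGIC, 0, 1024)

def find_frame_sync(data: bytes, start: int = 0) -> int:
    """Return the offset of the first candidate MP3 frame sync, or -1.

    A frame header starts with 11 set bits: a 0xFF byte followed by a
    byte whose top three bits are set. Leading garbage is skipped.
    """
    i = data.find(b"\xff", start)
    while i != -1 and i + 1 < len(data):
        if data[i + 1] & 0xE0 == 0xE0:
            return i
        i = data.find(b"\xff", i + 1)
    return -1
```

Calling find_frame_sync on the raw bytes of a polyglot gives the offset at
which a tolerant player could start decoding, however much PDF preamble
precedes it.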
This tolerance means we can hide the MP3 file away within a PDF stream object.
Placing non-PDF data here doesn't clash with PDF content because the PDF file
format contains pointers to explicit offsets that allow readers to seek straight
to the relevant objects. We still need to place this non-PDF data close to the
start of the file as some media players bail out if they search for too long
without locating any MP3 frames. Fortunately, there are very few mandatory PDF
preamble bytes, so we don't need to keep media players waiting too long.
Since PDF uses hard-coded pointers to objects, inserting data near the start
shifts every later object, and the file breaks unless all those pointers are
rebuilt to point to the new, higher offsets. Fortunately, qpdf is adept at
inserting extra text after the header, and with only a slight modification can
be convinced to add arbitrary binary data.
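As a toy model of the pointer rebuilding (not qpdf's actual code, and with
hypothetical names throughout), the sketch below shifts a table of object
offsets after an insertion and formats them as classic cross-reference entries.
It assumes a plain xref table; a real file also has to fix up startxref,
linearization hints and any cross-reference streams.

```python
from typing import Dict

def shift_offsets(offsets: Dict[int, int], insert_at: int,
                  inserted_len: int) -> Dict[int, int]:
    """Return new offsets after inserted_len bytes are added at insert_at.

    offsets maps object number -> byte offset. Objects at or after the
    insertion point move up; earlier objects stay where they were.
    """
    return {
        num: off + inserted_len if off >= insert_at else off
        for num, off in offsets.items()
    }

def xref_entry(offset: int, generation: int = 0) -> bytes:
    """Format one in-use entry of a classic xref table (always 20 bytes)."""
    return b"%010d %05d n \n" % (offset, generation)
```

For example, inserting 1000 bytes at offset 15 turns {1: 15, 2: 120} into
{1: 1015, 2: 1120}, and every one of those numbers has to be rewritten in the
file.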
What remains is to add the plain text representation of the document. PDFs treat
any line starting with a % as a comment, so we could just write the text as a
series of comment lines. That is an unnecessary constraint, though, when we can
wrap it in another stream object and keep the text visually cleaner. The flaw in
this plan is that if we add too much plain text, we risk the media players
losing interest. Fortunately, plain text is very lightweight, so there is
substantial headroom available in the file.
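Wrapping a payload in a stream object can be sketched as follows. This is an
illustration under assumptions, not the tooling used for this issue: the
function name is hypothetical and the object number is whatever is free in
your file. The one detail that matters is that /Length counts bytes, not
characters, and excludes the end-of-line marker before endstream.

```python
def wrap_in_stream(obj_num: int, payload: bytes) -> bytes:
    """Wrap payload in a minimal PDF stream object (generation 0).

    /Length is the number of payload bytes between the newline after
    "stream" and the newline before "endstream".
    """
    header = b"%d 0 obj\n<< /Length %d >>\nstream\n" % (obj_num, len(payload))
    return header + payload + b"\nendstream\nendobj\n"

# Encode first, then measure: "£" is one character but two UTF-8 bytes.
text = "ISSUE 1 £1.99"
obj = wrap_in_stream(60000, text.encode("utf-8"))
```

Here the text is 13 characters long but the object declares /Length 14, which
is why the payload must be encoded before its length is taken.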
It's true that the plain text content will be immediately followed by a wall of
binary gibberish that won't look good in a text editor - but it's trivial to
just not read any further.
It is important to adhere to the PDF spec precisely, as Adobe have a history of
blacklisting² polyglot techniques that bend the spec too far.
And hey presto! A universally accessible PDF/MP3/TXT document, all in one file.