Command Line
Some common tasks I do with PDFs.
Selecting a range of pages
If you want to make a new PDF that only has some pages, this is how to do it with pdftk:
pdftk input.pdf cat 1-4 8 output out.pdf
This takes input.pdf, extracts pages 1 through 4 and page 8, and writes a 5-page file called out.pdf.
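As a quick sanity check on the page arithmetic, here is a minimal sketch of a helper that counts how many pages a page list selects. The name count_pages is hypothetical and not part of pdftk, and it only understands single pages and ascending ranges.

```shell
# Count how many pages a simple pdftk-style page list selects.
# Handles only single pages ("8") and ascending ranges ("1-4");
# keywords such as "end", "even" or "odd" are not supported in this sketch.
count_pages() (
    total=0
    for spec in "$@"; do
        case "$spec" in
            *-*)
                first=${spec%-*}   # text before the dash
                last=${spec#*-}    # text after the dash
                total=$((total + last - first + 1))
                ;;
            *)
                total=$((total + 1))
                ;;
        esac
    done
    echo "$total"
)

count_pages 1-4 8   # prints 5
```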
Concatenating PDFs
pdftk is also useful for concatenating PDFs:
pdftk input1.pdf input2.pdf cat output out.pdf
You can also give each input a handle (here A and B) and use the handles to select specific pages from each file:
pdftk A=input1.pdf B=input2.pdf cat A1 B1 A1 B1 output out.pdf
This makes a four-page PDF: page 1 of each input, repeated twice. This is sometimes useful when I want to print junior-sized PDFs on letter paper.
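For longer page lists, typing out the handles by hand gets tedious. Here is a rough sketch of a helper that generates an alternating A/B page list. The name interleave_spec is made up for this example, and it assumes both inputs have at least as many pages as the count you pass.

```shell
# Print a pdftk page list that alternates pages from handles A and B:
# A1 B1 A2 B2 ... up to the given page count.
interleave_spec() (
    count=$1
    spec=""
    i=1
    while [ "$i" -le "$count" ]; do
        spec="$spec A$i B$i"
        i=$((i + 1))
    done
    # Drop the leading space left by the first append.
    echo "${spec# }"
)

# Hypothetical usage (left unquoted on purpose so the list splits into words):
# pdftk A=input1.pdf B=input2.pdf cat $(interleave_spec 3) output out.pdf
```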
SVG to PDF
This is covered in my Inkscape notes, but repeated here:
inkscape --export-type=pdf --export-filename=output.pdf input.svg
Word Doc to PDF
unoconv -T 30 -f pdf input.docx
Here, -T 30 sets a 30-second timeout. This is just a workaround for me, since otherwise the first time I try to launch unoconv it’ll typically time out as it takes a while to start up a server (I think that’s what’s going on).
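A different workaround, which is not something these notes originally used, would be to simply retry the command if the first attempt fails. A minimal sketch, with retry being a hypothetical helper name:

```shell
# Run a command up to N times, stopping at the first success.
# Returns 0 if any attempt succeeds, 1 if all of them fail.
retry() (
    attempts=$1
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        "$@" && exit 0
        n=$((n + 1))
    done
    exit 1
)

# Hypothetical usage: try the conversion at most twice.
# retry 2 unoconv -T 30 -f pdf input.docx
```

The function body runs in a subshell, so the exit calls only end the function and its variables do not leak into the calling shell.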
Setting metadata
You can use ExifTool to set the title (and author) of a PDF. I didn’t like the default title that Xournal++ set, so I wanted to overwrite it.
exiftool -Title="Some Title" -Author="Jane Whoever" -xmptitle= something.pdf
This is adapted from this askubuntu.com answer. Note that previous PDF metadata is still recoverable. You can reverse an update using:
exiftool -PDF-update:all= something.pdf
If you linearize the PDF, though, this is no longer reversible. You can do this with:
qpdf --linearize in.pdf out.pdf
There are other issues to consider with PDF metadata. Julien Voisin has an article about PDF metadata if you’re concerned about leaking information. He suggests using mat2, which you can use like so:
mat2 ./something.pdf
This’ll create a file called “something.cleaned.pdf”.
It’ll likely be a much larger file as text is converted to paths (at least, it seems this way since text is no longer selectable).
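If you want to script around the cleaned file (to compare sizes, say), the output name can be derived from the input name. This is a small sketch based only on the naming pattern described above. cleaned_name is a hypothetical helper, and it assumes the file name has an extension.

```shell
# Derive the name mat2 gives its output, following the
# "name.cleaned.ext" pattern described above.
# Assumes the file name contains an extension.
cleaned_name() {
    echo "${1%.*}.cleaned.${1##*.}"
}

cleaned_name ./something.pdf   # prints ./something.cleaned.pdf

# Hypothetical usage: compare the sizes of the original and cleaned files.
# ls -l something.pdf "$(cleaned_name something.pdf)"
```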
Rotate PDF
Here’s how to rotate a PDF 90° clockwise.
qpdf in.pdf out.pdf --rotate=+90
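The --rotate option also accepts a negative angle for counterclockwise rotation and an optional page range after a colon. Here is a small sketch that builds the option from a direction keyword. The helper rotate_opt and the keywords cw, ccw and flip are invented for this example.

```shell
# Build a qpdf --rotate option from a direction keyword
# and an optional page range (e.g. "1-3").
rotate_opt() (
    case "$1" in
        cw)   angle="+90" ;;    # clockwise
        ccw)  angle="-90" ;;    # counterclockwise
        flip) angle="+180" ;;   # upside down
        *)    echo "unknown direction: $1" >&2; exit 1 ;;
    esac
    if [ -n "${2:-}" ]; then
        echo "--rotate=$angle:$2"
    else
        echo "--rotate=$angle"
    fi
)

# Hypothetical usage: rotate only pages 1 through 3 counterclockwise.
# qpdf in.pdf out.pdf "$(rotate_opt ccw 1-3)"
```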
PDF booklets
I print out half-sized (or “junior”-sized) PDFs (like A5 vs A4) often. See junior-pdf.nix for a script that does this. It basically makes it easy to print a PDF that can be cut and bound easily.
Tools
- pdftk is a command line tool for PDFs
- Inkscape (see /notes/inkscape) can convert SVGs to PDF
- Xournal++ is good for handwritten annotations of PDFs (like for signing documents)
- unoconv converts formats like .docx to PDF.
- WeasyPrint (also see my notes on WeasyPrint) can generate PDFs from HTML and CSS (and is supposed to support more paged media CSS than typical web browsers)
- pandoc (see my pandoc notes) can do things like output markdown to PDF
- img2pdf can convert a sequence of JPEGs to PDF losslessly (see my image/document scanning notes for an example)
- ExifTool (see my ExifTool notes) for viewing and appending metadata
- QPDF can linearize a PDF.
- mat2 if you are concerned about remnant metadata