Today we'll be going over how to convert a directory full of jpg images to a PDF document.
This exercise requires ImageMagick[1] and PDFTK. Windows users will also need CYGWIN for the bash shell and Perl if they want the "rename" command to work.
ImageMagick is pretty standard on Linux, and is available for OS X and Windows. On Linux, PDFTK will need to be installed via your package manager; binaries can be downloaded for OS X and Windows.
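Before going further, it can save some head-scratching to confirm the tools are actually on your PATH. Here's a minimal sketch; the function name missing_tools is made up for this example, and it assumes the ImageMagick command is called "convert" (on newer ImageMagick 7 installs it may be "magick" instead).

```shell
#!/bin/bash
# missing_tools: print each named command that is not found on the PATH.
# (Hypothetical helper for illustration, not part of any of these packages.)
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools convert pdftk rename)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Not installed or not on PATH:" $missing
else
    echo "convert, pdftk and rename are all available."
fi
```

If anything is reported missing, install it before running the script further down.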
The most common response when asking Google "convert jpgs to pdf" is:
convert *.jpg output-name.pdf
and this is okay if you only have a few jpegs, and they're not very big.
The reason is that the system has to hold each and every JPG in memory while it also builds a temporary PDF as it converts and adds each one. If you've got 100 JPGs or a couple dozen huge JPGs, you can crash your system doing it this way.
I found part of this script on one of the helpful Linux forums and it's much easier on the system, as it converts each JPG individually to a PDF and then concatenates all the PDFs into one big one at the end of the operation.
To get the ball rolling, you'll need to get all your JPGs in one directory.
You'll then need to make sure they're named with number prefixes. If you have 99 or fewer JPGs, you can name them 01-blahblah.jpg through 99-blahblah.jpg. If you have several hundred, use 001-blahblah.jpg through 999-blahblah.jpg.
These numbers are the order in which your JPGs will be made into pages. It's very important to have a leading zero for single-digit numbers, because Linux and OS X sort file names character by character: "10" comes before "2", but not before "02".
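If your files are already numbered but without the leading zeros, the padding can be scripted. This is only a sketch under some assumptions: the helper name pad_name is invented for this example, the files are assumed to look like NUMBER-something.jpg, and the loop only prints the mv commands so you can check them before renaming anything.

```shell
#!/bin/bash
# pad_name: zero-pad the numeric prefix of a name like "7-cover.jpg".
# Usage: pad_name NAME WIDTH   (hypothetical helper for illustration)
pad_name() {
    local name="$1" width="$2" num rest
    num=${name%%-*}      # text before the first hyphen
    rest=${name#*-}      # text after the first hyphen
    case $num in
        ''|*[!0-9]*) return 1 ;;   # prefix is not a plain number
    esac
    # Strip leading zeros so printf does not read "08" as octal
    while [ "${#num}" -gt 1 ] && [ "${num#0}" != "$num" ]; do
        num=${num#0}
    done
    printf "%0${width}d-%s\n" "$num" "$rest"
}

# Preview only: print the mv commands instead of running them.
for f in [0-9]*-*.jpg; do
    [ -e "$f" ] || continue                 # glob matched nothing
    new=$(pad_name "$f" 3) || continue      # skip names that don't fit
    [ "$f" = "$new" ] || echo mv -- "$f" "$new"
done
```

Change the 3 to 2 if you have fewer than 100 pages, and remove the "echo" once the printed commands look right.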
Now here's the script that does it:
#!/bin/bash
for i in *.jpg ;
do convert "$i" "$i.pdf" ;
rename -v 's/\.jpg\.pdf$/.pdf/' "$i.pdf" ;
done
pdftk *.pdf cat output book.pdf
You'll need to tell the system (Linux and OS X) that the script is executable. To do so, run:
chmod +x scriptname
Line 1 tells the system it's a bash shell script
Line 2 tells bash to loop over every file ending in ".jpg"
Line 3 tells bash to call the ImageMagick 'convert' command and convert each JPG to a PDF
Line 4 tells bash to call the rename command and change the converted PDF filename from 01-blahblah.jpg.pdf to 01-blahblah.pdf (this step is aesthetic[2], and isn't necessary for the production of the final ebook - I'm just finicky about file names)
Line 5 tells bash the loop is finished: "You're done! Miller time!"
Line 6 tells bash to call pdftk and concatenate all the PDFs it finds in the directory into one file named 'book.pdf' (you can rename it after the script is done)
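For the curious, the same idea can be wrapped in a function that is a little more defensive. This is a sketch, not the script from the forum: the name jpgs_to_book is made up, and it assumes mktemp is available. It writes the page PDFs to a scratch directory so that an old book.pdf or stray PDFs in the folder don't end up in the output, and it names each page with ${f%.jpg}.pdf so no rename step is needed.

```shell
#!/bin/bash
# jpgs_to_book: convert every *.jpg in the current directory to its own PDF
# in a scratch directory, then join the pages into one output file.
# Usage: jpgs_to_book [OUTPUT.pdf]   (hypothetical name for illustration)
jpgs_to_book() {
    local out="${1:-book.pdf}" tmp f status
    # Load the matching file names into the positional parameters
    set -- *.jpg
    [ -e "$1" ] || { echo "no .jpg files here" >&2; return 1; }

    tmp=$(mktemp -d) || return 1
    for f in "$@"; do
        # Quoting keeps file names with spaces intact
        convert "$f" "$tmp/${f%.jpg}.pdf" || { rm -rf "$tmp"; return 1; }
    done
    # The scratch directory holds only the page PDFs, so this glob is safe
    pdftk "$tmp"/*.pdf cat output "$out"
    status=$?
    rm -rf "$tmp"
    return "$status"
}
```

Run it from the directory holding the JPGs, for example: jpgs_to_book my-ebook.pdf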
So there ya go, a method to make PDFs out of a bunch of JPGs without stressing your system.
[1] This script can be modified so that it works on any image type that ImageMagick recognizes (which is a bunch).
[2] If you don't want the rename part, just put a # in the front of that line (like line 1) before you run the script.
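As a rough illustration of footnote 1, here is one way the loop might be opened up to other image types. The helper name pdf_name and the list of extensions are assumptions for this example, and the loop only prints the convert commands rather than running them.

```shell
#!/bin/bash
# pdf_name: swap whatever extension a file has for ".pdf".
# (Hypothetical helper for illustration.)
pdf_name() {
    printf '%s.pdf\n' "${1%.*}"
}

# Preview the commands for a few image types; drop "echo" to run them.
for f in *.jpg *.png *.tif; do
    [ -e "$f" ] || continue        # this pattern matched nothing
    echo convert "$f" "$(pdf_name "$f")"
done
```

One thing to watch: 01-page.jpg and 01-page.png would both map to 01-page.pdf, so keep the number prefixes unique across types.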