I have PDF forms that I want to autopopulate with data from my Django web application and then offer to the user to download. What Python library would let me easily pre-populate PDF forms? These forms are intended to be printed out.
Reportlab is great if you’re generating very dynamic PDFs and need to programmatically control all of it: data and layout.
To just fill out forms in existing PDFs, reportlab is overkill and you’ll basically have to rebuild the PDF from scratch in reportlab instead of just taking a PDF with a form that’s already been made.
PDF forms work with FDF data. I ported a PHP FDF library to Python a while back when I had to do this and released it as fdfgen. I use that to generate an fdf file with the data for the form, then use pdftk to push the fdf into a PDF form and generate the output.
The whole process works like this:
You (or a designer) design the PDF in Acrobat or a similar tool, mark the form fields, and take note of the field names (I’m not sure exactly how this is done; our designer does this step). Let’s say your form has fields “name” and “telephone”.
Use fdfgen to create an FDF file:
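    from fdfgen import forge_fdf

    # (field name, value) pairs; the names must match the fields in the PDF form
    fields = [('name', 'John Smith'), ('telephone', '555-1234')]

    # the remaining list arguments are left empty
    fdf = forge_fdf("", fields, [], [], [])

    # binary mode, since current fdfgen releases return bytes
    with open("data.fdf", "wb") as fdf_file:
        fdf_file.write(fdf)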
Then you run pdftk to merge and flatten:
    pdftk form.pdf fill_form data.fdf output output.pdf flatten
and a filled out, flattened (meaning that there are no longer editable form fields) pdf will be in output.pdf.
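If you would rather run that pdftk step from Python (inside a Django view, say) than from a shell, something like the following should work. Treat it as a rough sketch: `build_pdftk_command` and `fill_pdf` are names made up for illustration, the file paths are placeholders, and it assumes pdftk is installed and on the PATH.

```python
import subprocess


def build_pdftk_command(form_path, fdf_path, output_path, flatten=True):
    """Return the pdftk argument list that fills form_path with fdf_path."""
    cmd = ["pdftk", form_path, "fill_form", fdf_path, "output", output_path]
    if flatten:
        # flatten removes the editable form fields from the result
        cmd.append("flatten")
    return cmd


def fill_pdf(form_path, fdf_path, output_path, flatten=True):
    """Run pdftk; raises subprocess.CalledProcessError if pdftk fails."""
    subprocess.run(
        build_pdftk_command(form_path, fdf_path, output_path, flatten),
        check=True,
    )


# Example (needs pdftk plus the form.pdf and data.fdf files from the steps above):
# fill_pdf("form.pdf", "data.fdf", "output.pdf")
```

Passing the arguments as a list rather than one shell string means file names with spaces don't need any quoting.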
It’s a bit complicated, and pdftk can be a pain to install (it requires a Java stack, and there are bugs on Ubuntu 9.10 that have to be worked around), but it’s the simplest process I’ve been able to come up with so far. The workflow is also convenient: our designers can make all the layout changes to the PDF they want, and as long as they don’t change the names of the fields, I can drop the new one in and everything keeps working.
I apologize for the lack of docs on fdfgen. forge_fdf() is really the only function you should need, and it has a docstring to explain the arguments. I’ve just never quite gotten around to doing more with it.
Also look at this code segment, which is a ready-made solution for creating a PDF view in Django and builds on Thraxil’s solution above. Thanks to GitHub user zyegfryed.
Also, take a gander at Outputting PDFs.
I had another thought (but it won’t help if you already have the PDF files, and I like @thraxil’s answer better).
Earlier this year I worked on a project where I generated “certificates of completion” for continuing education courses. One of the angles I looked at was trying to generate a PDF directly from an appropriately styled web page (something like a server-side “Print to PDF”).
One of the tools I found was wkhtmltopdf. It’s a self-contained WebKit browser that will turn a URL into a PDF, and with pretty good results.
The idea is that you use Django’s template engine to put together a page containing whatever you want (including images), pass its URL to wkhtmltopdf, grab the output and return it to the user.
I liked the approach because it’s really simple to implement (just open a pipe), you don’t have to worry about keeping the source PDF files accessible to the server, and you can redesign the PDFs by changing the HTML.
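For what it’s worth, the “open a pipe” part can be sketched roughly like this. It is not the code from that project: `build_wkhtmltopdf_command` and `url_to_pdf` are hypothetical names, the 60-second timeout is an arbitrary choice, and it assumes the wkhtmltopdf binary is on the PATH. Giving `-` as the output file tells wkhtmltopdf to write the PDF to stdout.

```python
import subprocess


def build_wkhtmltopdf_command(url):
    """Return the argument list; '-' as the output file means stdout."""
    return ["wkhtmltopdf", "--quiet", url, "-"]


def url_to_pdf(url, timeout=60):
    """Render the page at url with wkhtmltopdf and return the PDF as bytes."""
    result = subprocess.run(
        build_wkhtmltopdf_command(url),
        stdout=subprocess.PIPE,  # capture the PDF written to stdout
        check=True,              # raise CalledProcessError on a non-zero exit
        timeout=timeout,
    )
    return result.stdout


# Example (needs wkhtmltopdf installed and a reachable URL):
# pdf_bytes = url_to_pdf("http://localhost:8000/certificates/1/")
```

In a Django view you could then hand the bytes back with `HttpResponse(pdf_bytes, content_type="application/pdf")`.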