Using Flying Saucer and iText in Java to convert XHTML to PDF
The situation
Where I work, we were generating reports in XHTML for printing, because the styling had to be easily configurable. The problem with this approach is that the report looks different from browser to browser: IE and Chrome print two completely different reports, so to speak.
To avoid the differences between browsers and to reuse the existing report generation, I decided to render the print output on the server side. So I had HTML, and I wanted it to look the same on every computer. PDF is a good medium for this purpose, so I needed an HTML-to-PDF library for our Java system. I first tried iText by itself, but it did not apply the CSS. Browsing the web a bit further, I found the combination of Flying Saucer and iText, and this was a winning combination for us.
Flying Saucer is a Java library that renders XHTML/XML + CSS to screen, image, or PDF. For PDF output it depends on iText, a library for creating PDF files.
The code
In this blog post I provide some code snippets, not a full working example.
Maven dependencies
Add this to your pom.xml in the dependencies section:
<!-- Flying Saucer and iText -->
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itextpdf</artifactId>
    <version>5.1.3</version>
</dependency>
<dependency>
    <groupId>org.xhtmlrenderer</groupId>
    <artifactId>core-renderer</artifactId>
    <version>R8</version>
</dependency>
The converter
The converter is pretty straightforward. It reads an XHTML String and writes the PDF to a FileOutputStream. Please read the gotchas below, because there are some, well, gotchas.
import org.xhtmlrenderer.pdf.ITextRenderer;

import java.io.FileOutputStream;
import java.io.OutputStream;

public class HtmlToPdfConverter {

    /** Renders the given XHTML string to a PDF file at the path tempFile. */
    public void htmlStringToPdfStream(String html, String tempFile) {
        // try-with-resources closes the stream even if rendering fails
        try (OutputStream pdf = new FileOutputStream(tempFile)) {
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocumentFromString(html); // input must be well-formed XHTML
            renderer.layout();
            renderer.createPDF(pdf);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
Debugging/logging
To enable logging output from our converter, add this file to your system:
$user.home/.flyingsaucer/local.xhtmlrenderer.conf
See both the Flying Saucer User Guide and this example file for what you can put in there.
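As a rough sketch of how that file could be created from Java, the helper below writes a one-line config into .flyingsaucer under a given home directory. The class and method names are made up for illustration, and the property xr.util-logging.loggingEnabled is quoted from memory of Flying Saucer's default configuration, so please check it against the User Guide before relying on it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Flying Saucer.
class FlyingSaucerLogConfig {

    // Assumed property name; verify it against the Flying Saucer User Guide.
    static final String CONFIG_TEXT = "xr.util-logging.loggingEnabled=true\n";

    /** Writes homeDir/.flyingsaucer/local.xhtmlrenderer.conf and returns its path. */
    static Path write(Path homeDir) {
        Path confFile = homeDir.resolve(".flyingsaucer").resolve("local.xhtmlrenderer.conf");
        try {
            Files.createDirectories(confFile.getParent());
            // Creates the file, or overwrites an existing one
            Files.write(confFile, CONFIG_TEXT.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return confFile;
    }

    /** Convenience variant that uses the current user's home directory. */
    static Path writeToUserHome() {
        return write(Paths.get(System.getProperty("user.home")));
    }
}
```

Calling FlyingSaucerLogConfig.writeToUserHome() once, before the first render, should be enough. Note that it overwrites an existing config file.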
Gotchas
Valid XHTML
Flying Saucer generates errors and produces no output when the XHTML contains errors. An XHTML document is in fact a valid XML document, so all the XML rules apply. You can check your XHTML with the W3C Validator. Make sure tags are nested correctly, there are no block-level tags inside inline-level tags, and all special characters are escaped properly. HTML is not the same as XHTML or XML; if you want to parse an HTML document, you should preprocess it with JTidy or TagSoup first.
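A cheap first check before handing a string to the renderer is to see whether it parses as XML at all. The sketch below uses only the JDK's built-in SAX parser; the class name XhtmlCheck is made up for illustration. It only tests well-formedness (nesting, escaping), not XHTML rules such as block-level tags inside inline tags, so it is no replacement for the W3C Validator.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Flying Saucer.
class XhtmlCheck {

    /** Returns true if the string parses as well-formed XML, false otherwise. */
    static boolean isWellFormed(String xhtml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            try {
                // Avoid downloading the XHTML DTD. This feature belongs to the
                // Xerces-based parser bundled with the JDK.
                factory.setFeature(
                    "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            } catch (Exception ignored) {
                // Other parser implementations may not know this feature.
            }
            SAXParser parser = factory.newSAXParser();
            // Malformed input makes parse() throw, which lands in the catch below.
            parser.parse(new InputSource(new StringReader(xhtml)), new DefaultHandler());
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

For example, XhtmlCheck.isWellFormed("<p>a & b</p>") returns false because of the unescaped ampersand, while the same string with &amp; passes.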
Table layout with divs and CSS
One special case I encountered: I tried to make a table layout with divs, using the CSS properties display: table, table-header-group, table-row, table-cell, etc. (see here). Flying Saucer's implementation of these properties is very strict: everything has to be nested correctly. A table needs a row group (and a header/footer group), a row group needs rows, and a row needs cells.
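To make the nesting concrete, here is a small sketch of markup that spells out every level: table, header group, row group, row, and cell. The CSS class names and the Java wrapper class are made up for illustration, and I have not re-run this exact markup through Flying Saucer, so treat it as a starting point rather than a guaranteed fix.

```java
// Hypothetical example markup; class and method names are made up.
class DivTableExample {

    /** Builds a small XHTML document with a fully nested div-based table. */
    static String xhtml() {
        return "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><style type=\"text/css\">"
            + ".table { display: table; }"
            + ".head { display: table-header-group; }"
            + ".body { display: table-row-group; }"
            + ".row { display: table-row; }"
            + ".cell { display: table-cell; }"
            + "</style></head><body>"
            + "<div class=\"table\">"
            // header group -> row -> cells
            + "<div class=\"head\">"
            + "<div class=\"row\"><div class=\"cell\">Name</div><div class=\"cell\">Total</div></div>"
            + "</div>"
            // row group -> row -> cells
            + "<div class=\"body\">"
            + "<div class=\"row\"><div class=\"cell\">Report A</div><div class=\"cell\">42</div></div>"
            + "</div>"
            + "</div>"
            + "</body></html>";
    }
}
```

The resulting string can be passed to the converter's htmlStringToPdfStream method like any other XHTML input.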