Chapter 7: Frequently Asked Questions about pdfHTML

  • How to convert an ASP or JSP page to PDF?

  • How to parse multiple HTML files into one PDF?

  • Can pdfHTML render Base64 images to PDF?

  • Can I generate a PDF from a URL instead of from a file on disk?

  • Do we need a browser engine to render HTML+CSS to PDF?

  • Does my HTML have to be valid XML?

  • How to add metadata to a PDF using pdfHTML?

  • How do the measurement systems in HTML relate to the measurement system in PDF?

  • Can I convert an HTML form to a PDF?

  • How to render certain HTML entities (such as arrows) to PDF?

  • Why can't I embed a font due to licensing restrictions?

  • Which languages are supported in pdfHTML?

  • How to convert HTML containing Arabic/Hebrew characters to PDF?

  • Why is my PDF missing several characters?


Chapter 6: Using fonts in pdfHTML

Up until now, we haven't paid much attention to the fonts that were used when we converted HTML to PDF. We know that Helvetica is the default font used by iText when no font is specified (chapter 2), and we know that pdfHTML ships with some built-in fonts if you need to embed a font (chapter 4), but we haven't yet had a clear overview of which fonts are supported.

There are two things you need to know before reading this chapter:


Chapter 5: Custom tag workers and CSS appliers

In this chapter, we'll change two of the most important internal mechanisms of the pdfHTML add-on.

  • We'll override the default functionality that matches HTML tags with iText objects, more specifically the DefaultTagWorkerFactory mechanism, and

  • We'll override the default functionality that matches CSS styles to iText styles, more specifically the DefaultCssApplierFactory mechanism.
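
To make the idea of overriding such a factory more concrete, here is a minimal sketch of the general pattern in plain Java. It is not the pdfHTML API: `Worker`, `BaseFactory`, `QrFactory`, and the `<qr>` tag are hypothetical stand-ins invented for this illustration, and the sketch simply assumes that a factory asks an overridable hook first and falls back to its default mapping when the hook returns nothing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; none of these types belong to pdfHTML.
class TagWorkerSketch {

    /** Turns the text inside a tag into some output; here just a descriptive string. */
    interface Worker {
        String process(String content);
    }

    /** Plays the role of a default factory: maps tag names to workers. */
    static class BaseFactory {
        private final Map<String, Worker> defaults = new HashMap<>();

        public BaseFactory() {
            defaults.put("p", content -> "Paragraph(" + content + ")");
            defaults.put("h1", content -> "Header(" + content + ")");
        }

        /** Hook for subclasses; returning null means "no custom worker for this tag". */
        protected Worker customWorker(String tag) {
            return null;
        }

        /** Asks the hook first, then falls back to the default mapping. */
        public final Worker workerFor(String tag) {
            Worker custom = customWorker(tag);
            return custom != null ? custom : defaults.get(tag);
        }
    }

    /** Overrides the hook so that a made-up <qr> tag gets its own worker. */
    static class QrFactory extends BaseFactory {
        @Override
        protected Worker customWorker(String tag) {
            if ("qr".equals(tag)) {
                return content -> "QrCode(" + content + ")";
            }
            return null; // every other tag keeps its default worker
        }
    }

    public static void main(String[] args) {
        BaseFactory factory = new QrFactory();
        System.out.println(factory.workerFor("qr").process("https://example.com"));
        System.out.println(factory.workerFor("p").process("Hello World"));
    }
}
```

Only the custom tag is handled in the subclass; every other tag still goes through the default mapping, which is what keeps this kind of override small.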


Chapter 4: Creating reports using pdfHTML

Roughly speaking, there are three major ways to create PDF documents using iText:

  1. You can create a PDF document from scratch using iText objects such as Paragraph, Table, Cell, List, and so on. The advantage of this approach is that everything is programmable, and hence configurable just the way you want it. The disadvantage is that you need to program everything: even a small change, such as replacing one color with another, requires a developer to change the Java code of the application, recompile it, and so on.


Chapter 2: Defining styles with CSS

In the previous chapter, we looked at different snippets of Java code.

In this chapter, we'll use the same snippet for every example:

public void createPdf(String src, String dest) throws IOException {
    // Convert the HTML file at src into a PDF file at dest.
    HtmlConverter.convertToPdf(new File(src), new File(dest));
}

Instead of looking at different snippets of Java code, we'll look at different snippets of HTML and CSS.


Chapter 1: Hello HTML to PDF

In this chapter, we'll convert a simple HTML file to a PDF document in many different ways. The content of the HTML file will consist of a "Test" header, a "Hello World" paragraph, and an image representing the iText logo.
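
As a rough illustration of the input described here, the following sketch writes such an HTML file using only the Java standard library. The exact markup is an assumption: the "Test" header is rendered as an `<h1>`, and the image path `img/logo.png`, the class name `HelloHtmlSketch`, and the temporary file name are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of iText or pdfHTML.
class HelloHtmlSketch {

    /** Builds a small HTML page: a "Test" header, a "Hello World" paragraph, and a logo image. */
    static String buildHtml(String logoPath) {
        return "<html>\n"
            + "  <head><title>Test</title></head>\n"
            + "  <body>\n"
            + "    <h1>Test</h1>\n"
            + "    <p>Hello World</p>\n"
            + "    <img src=\"" + logoPath + "\" alt=\"iText logo\">\n"
            + "  </body>\n"
            + "</html>\n";
    }

    public static void main(String[] args) throws IOException {
        // Assumed relative image path; adjust it to wherever the logo actually lives.
        String html = buildHtml("img/logo.png");

        // Write the page to a temporary file so it can serve as conversion input.
        Path src = Files.createTempFile("hello", ".html");
        Files.write(src, html.getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + src);
    }
}
```

A file written this way could then be passed as the source file to an HTML to PDF conversion call.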

Structure of the examples

All the examples throughout this book will have a similar structure.



iText 7: Converting HTML to PDF with pdfHTML


In this tutorial, we'll learn how to convert HTML to PDF using pdfHTML, an add-on to iText 7. If you're new to iText, please jump to chapter 1 immediately. If you've been working with iText in the past, you might remember the old HTML to PDF functionality. If that's the case, you've either been using the obsolete HTMLWorker class (iText 2), or the old XML Worker add-on (iText 5).

Introduction

iText 7's new add-on pdfHTML is a tool that aims to greatly simplify HTML to PDF conversion in Java or .NET. This is a straightforward and uniform use case, so many users will get satisfactory results with the one-line code sample below. For more complex usage, you may need to provide some configuration to pdfHTML. In this post, I will attempt to explain why you may need to use the config options, and how to use them.

Basics

The default way to use pdfHTML is either one of two basic one-line code samples: HtmlConverter.convertToPdf(new File("input.html"), new...