iText 5.4.4

Tags: Release

When we released iText 5.4.3 a month and a half ago, we promised that we would continue working on PDF/A, more specifically: provide all the checks that ensure the file you're creating is either PDF/A-2 or PDF/A-3, just the way we did with PDF/A-1. However, that will have to wait for yet another release: one of our customers made us change our plans, and that's why iText 5.4.4 brings you something completely different that, in our opinion, is no less interesting.

Merging accessible files
In iText 5.3.4, we enhanced PdfCopy so that it would preserve the StructTreeRoot. Before that version, all structure information was lost when concatenating tagged PDF files. In plain language: since 5.3.4, you can merge tagged PDF files, and tagged PDF is essential for accessible files (in the US: files that are compliant with Section 508). Granted, the functionality was still experimental in version 5.3.4 and we fixed plenty of bugs in later versions, but one of our customers bumped into a very specific problem: they wanted to concatenate tagged PDF files containing form fields. As documented, PdfCopy didn't support merging forms. When dealing with forms, we used to advise the use of PdfCopyFields instead. Unfortunately, PdfCopyFields doesn't support merging tagged PDFs. Our customer faced a dilemma: either merge accessible files and lose the forms, or merge the forms and lose the accessibility. Neither choice was acceptable to our customer's customer, so we made their requirement our priority, resulting in iText 5.4.4. From now on, you can merge forms and preserve the tagged PDF structure when using the addDocument() method in PdfCopy. At the same time, we've deprecated PdfCopyFields.

Flattening accessible files
While one team was working on merging forms, one developer experimented with an alternative solution: what if it were possible to flatten a filled-out form, preserving not only the structure tree root, but also the reading order of the content in the content stream (the latter being overkill according to the specs)? You could then think of a scenario where you flatten the forms first and merge them afterwards, preserving Section 508 compliance throughout the workflow. We ended up with code that works for many forms, but not for all (the pitfalls are documented in the source code). We decided to ship this experimental code in the xtra package, so that those who need it can take a look at it and see if it meets their needs.

Accessibility
PDF/UA, Section 508 and accessibility: we've used those words frequently when announcing our plans for 2013. In this new release, you'll discover that we've fixed a number of issues related to tagged PDF and structure: table borders are now marked as artefacts, images that were tagged incorrectly in some cases are now tagged properly, links are now added to the structure tree correctly, and so on. The more customers join our accessibility efforts, the better the accessibility functionality gets. We're almost there!

Performance
Two customers informed us that the performance of the latest versions of iText was worse than the performance of some earlier versions. We discovered that the use of the Java UUID class in combination with some specific JVM implementations on Linux was indeed slowing iText down. We removed the dependency on UUID, fixing that problem completely. Unfortunately, our profiling tests also indicated that the performance of iText has indeed decreased in cases where PdfPTable is used. This can't be avoided because the PdfPTable code is now much more accurate than it used to be. The only way to improve the performance in this case is to use PdfPTable in a different way.
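
To illustrate the UUID fix: the announcement doesn't say what replaced UUID, so the following is only a sketch of one common substitute, a process-local counter. The class name ElementId is hypothetical and not part of iText. For background, UUID.randomUUID() draws its bits from SecureRandom, which on some Linux JVM configurations can stall while waiting for entropy; a counter avoids that, at the price of being unique only within one running JVM.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical identifier class (not an iText class): unique within one JVM,
// generated without java.util.UUID and therefore without SecureRandom.
final class ElementId implements Comparable<ElementId> {
    // Shared counter; AtomicLong keeps the increment thread-safe.
    private static final AtomicLong COUNTER = new AtomicLong();

    private final long value;

    ElementId() {
        this.value = COUNTER.incrementAndGet();
    }

    long value() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ElementId && ((ElementId) o).value == value;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(value);
    }

    @Override
    public int compareTo(ElementId other) {
        return Long.compare(value, other.value);
    }

    @Override
    public String toString() {
        return "ElementId#" + value;
    }
}
```

Such identifiers are only suitable where uniqueness inside a single process is enough, for instance as keys for objects created while one document is being written.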

Images
iText supports a wide variety of images, but even within one specific image type, there can be a wide variety of flavors. That's an understatement when looking at TIFF. Once again, we discovered a strange phenomenon: one specific type of TIFF file appeared as a pink image instead of a white image when added to a PDF using iText. That's fixed now. We also fixed a problem when manipulating existing PDFs that contain JBIG2 images, as well as existing PDFs in which the /Length value of an image stream is one byte off.

Invalid PDFs
The more customers we have, the more weird PDFs are sent to us: PDFs using names longer than 127 bytes, PDFs without a root dictionary, PDFs with a page tree that refers to page dictionaries that are null, and so on. In many cases, iText threw a NullPointerException because these PDFs are invalid and there's very little you can do about it. We've now started changing these NullPointerExceptions into InvalidPdfExceptions that tell users what is wrong with the PDF file. Note that this isn't always possible: in many cases, human eyes are needed to see what is wrong with the file.
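
The idea of trading a NullPointerException for a descriptive exception can be sketched as follows. This is not iText's actual code: the exception class is a hypothetical local stand-in modeled loosely on InvalidPdfException, the PdfSanityChecks helper and its method names are invented for illustration, and a plain Map stands in for a PDF dictionary.

```java
import java.io.IOException;
import java.util.Map;

// Hypothetical stand-in for iText's InvalidPdfException, declared locally
// so that this sketch compiles without the library.
class InvalidPdfSketchException extends IOException {
    InvalidPdfSketchException(String message) {
        super(message);
    }
}

// Hypothetical helper: checks a value before it is dereferenced and
// reports what is wrong instead of letting a NullPointerException escape.
final class PdfSanityChecks {
    // Name length limit mentioned in the text above.
    static final int MAX_NAME_LENGTH = 127;

    private PdfSanityChecks() {
    }

    // Returns the root dictionary, or explains why it cannot be used.
    static Map<?, ?> requireRoot(Map<String, ?> trailer) throws InvalidPdfSketchException {
        Object root = trailer.get("Root");
        if (root == null) {
            throw new InvalidPdfSketchException(
                "The trailer has no /Root entry, so the document has no root dictionary.");
        }
        if (!(root instanceof Map)) {
            throw new InvalidPdfSketchException("The /Root entry is not a dictionary.");
        }
        return (Map<?, ?>) root;
    }

    // Rejects names that exceed the length limit (counted in characters here).
    static void requireValidName(String name) throws InvalidPdfSketchException {
        if (name.length() > MAX_NAME_LENGTH) {
            throw new InvalidPdfSketchException(
                "Name of length " + name.length() + " exceeds the limit of " + MAX_NAME_LENGTH + ".");
        }
    }
}
```

A caller that catches the exception can show its message to the user, which is more helpful than a stack trace ending in a NullPointerException deep inside the parser.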

Summary
iText 5.4.4 brings interesting new functionality in the area of accessibility (PDF/UA), more specifically when forms are involved. The next release will again focus on archiving (PDF/A).

Read more about it in the changelog.