As the last quarter of 2022 rolls around, it’s time for our final release of the year for the iText 7 Suite PDF SDK. While there’s nothing quite as big to announce as the Android reference implementation from last time, we do have a few cool new things in iText 7 Suite 7.2.4 to tell you about. For the iText 7 Core PDF library there’s a nice convenience method improvement for creating tagged PDFs. We’ve also made a couple of changes relating to default color spaces for PDF/A documents and for PDF/A detection in the sign module.
As for the iText 7 add-ons we’re updating along with Core, pdfHTML now allows you to define colors using the CSS4 device-cmyk() function and we’ve added support for multithreaded conversions in pdfOffice. The pdfOptimizer add-on now supports Decode array entries in Image Dictionaries and we also have updates for pdfRender, pdfXFA and pdfOCR. And as usual, we’ve updated our PDF debugger iText RUPS to include all the updates to the Core library.
Head over to the release notes for the full rundown or read on for a summary of what’s new.
First of all, we’ll go over the changes to PDF tagging. We noticed that applying iText’s tagging defaults could result in pretty elaborate tag structures where your layouts were particularly advanced. We’ve now added a chainable setNeutralRole() method at the layout element level, which allows you to more easily mark the layout elements you wish to be “tag-neutral”.
As for the changes to PDF/A, iText 7 can now check for the Default Color Space of PDF/A documents. The PDF 1.7 specification states that when rendering colors specified in DeviceRGB or DeviceCMYK, and where no matching device independent default color space has been set, conforming readers should use the profile in the file’s PDF/A OutputIntent dictionary as the source color space.
We’ve also made an improvement to the sign module relating to the detection and initialization of PDF/A documents. Previous versions of iText 7 Core did not check whether the given document was a regular PDF or a PDF/A document. This prevented the PdfSigner class from initializing the document as a PdfADocument when that was required. Now the PdfSigner class checks whether a document claims PDF/A conformance, and treats it accordingly.
Note that this does count as a breaking change, since in earlier releases PdfSigner did not enforce PDF/A compliance on documents that claimed conformance. Now, an exception will be thrown if any content is added in a way that does not conform to the PDF/A standard the document claimed before the signing process started. See the example linked in the Core release notes for more information.
See the release notes and changelog for more details of other improvements and miscellaneous bugfixes.
As noted above, the big news for our add-on to create PDF from HTML is the addition of support for CSS4 device-cmyk colors. Until now pdfHTML only supported CSS RGB colors for direct conversion to PDF, although a workaround was possible. We’ve now added direct support for customers who want to use CMYK colors to streamline their workflow.
Note that this functionality is not available out of the box, since device-cmyk() currently only exists in CSS working drafts and so has very limited browser support. To enable it, you simply need to change the validation rules for pdfHTML; see the example linked from the release notes for more details.
For this release of our OCR add-on for iText 7, we have upgraded the underlying tess4j library to version 4.6.0, which uses version 4.1.3 of the Tesseract OCR engine and version 1.82.0 of the Leptonica image processing and analysis library.
pdfOffice is our add-on for high-quality native conversion of Microsoft Office (Word, Excel, and PowerPoint) documents to PDF.
In previous releases of pdfOffice, conversions were performed one by one, regardless of the number of threads calling the pdfOffice API. This release natively supports multiple threads, which should give a significant speedup in high-volume conversion situations. You’ll need to set things up correctly to benefit from this change, so see the example linked from the release notes.
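To give a rough idea of what calling a converter from multiple threads can look like, here is a minimal Java sketch built only on the standard java.util.concurrent classes. It does not show the pdfOffice API: the converter passed in is a placeholder, and the class and method names (ParallelConversionSketch, convertAll) are hypothetical. The setup pdfOffice actually needs is described in the example linked from the release notes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper: fans conversion jobs out over a fixed-size thread pool. */
class ParallelConversionSketch {

    /**
     * Applies the converter to every input on a pool of the given size and
     * returns the results in the same order as the inputs.
     */
    static <I, O> List<O> convertAll(List<I> inputs, Function<I, O> converter, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per input so the pool can work on them concurrently.
            List<Future<O>> futures = new ArrayList<>();
            for (I input : inputs) {
                Callable<O> task = () -> converter.apply(input);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order; get() blocks until each task is done.
            List<O> results = new ArrayList<>();
            for (Future<O> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for conversions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A conversion task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder converter: only renames the file; a real one would call the conversion API.
        Function<String, String> fakeConverter =
                name -> name.substring(0, name.lastIndexOf('.')) + ".pdf";
        List<String> outputs = convertAll(
                Arrays.asList("report.docx", "budget.xlsx", "deck.pptx"), fakeConverter, 3);
        System.out.println(outputs); // [report.pdf, budget.pdf, deck.pdf]
    }
}
```

A fixed-size pool keeps the number of simultaneous conversions bounded, which is usually what you want when each job is memory-hungry.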
In other changes, there’s a fix for Excel to PDF conversions, where in certain cases a PdfOfficeException would be thrown for files using the .xls, .xla, or .xlt extensions. Another fix relates to a bug where pdfOffice could crash when converting consecutive documents under certain circumstances.
For this release of our iText 7 add-on to reduce the file size of PDFs, we’ve added support for the Decode array when processing images. As noted in Table 89 of the PDF 1.7 specification, the Decode array is an optional entry for an Image Dictionary which defines how to map image samples into the range of values appropriate for the image’s color space. Previously, pdfOptimizer did not take this entry into account. This meant that if a PDF containing images with this entry in the Image Dictionary was processed by pdfOptimizer, then an error could be reported when subsequently opening the file in a PDF viewer.
A workaround for this issue was to simply remove such Decode entries before optimizing the PDF. However, pdfOptimizer now takes any Decode array entries into account when processing images.
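To make the Decode mapping more concrete, here is a minimal Java sketch of the linear interpolation the PDF specification describes for image samples: a raw sample in the range 0 to 2^n - 1 (for n bits per component) is mapped onto the interval given by a pair of Decode values. The class and method names are ours, for illustration only, and the sketch does not reflect how pdfOptimizer handles the entry internally.

```java
/** Illustrative only: maps raw image samples through one [dMin dMax] pair of a Decode array. */
class DecodeArrayDemo {

    /**
     * Linearly maps a raw sample in [0, 2^bitsPerComponent - 1] onto [dMin, dMax].
     * With dMin > dMax the mapping is inverted.
     */
    static double decode(int sample, int bitsPerComponent, double dMin, double dMax) {
        int maxSample = (1 << bitsPerComponent) - 1;
        if (sample < 0 || sample > maxSample) {
            throw new IllegalArgumentException("Sample out of range: " + sample);
        }
        return dMin + sample * (dMax - dMin) / maxSample;
    }

    public static void main(String[] args) {
        // Decode [0 1]: the largest 8-bit sample maps to the top of the range.
        System.out.println(decode(255, 8, 0.0, 1.0)); // 1.0
        // Decode [1 0]: the same sample now maps to the bottom of the range.
        System.out.println(decode(255, 8, 1.0, 0.0)); // 0.0
    }
}
```

This is why ignoring a non-default Decode array matters: an image stored with [1 0] would come out inverted if its samples were treated as if the array were [0 1].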
pdfRender is our iText 7 add-on to generate images from PDF documents. It is currently only available for Java, though a CLI version is also available.
In this update, we’ve fixed a bug where in certain circumstances a rendered image may not contain all text from a source PDF.
pdfXFA is our add-on for iText 7 which allows you to flatten dynamic XFA forms to static PDF. It also enables you to add a digital signature to converted XFA forms as additional security for further processing in PDF workflows or for archiving.
In this release, we’ve replaced our log4j dependency to resolve reported CVE vulnerabilities for this logging framework. While the vulnerabilities did not pose a serious threat unless an attacker had write access, we’ve now switched to using version 1.2.11 of the Logback Classic Module.
We have resolved an issue where subforms corresponding to pages could be skipped in some cases, resulting in dynamic content being placed on the wrong page.
We’ve also fixed a couple of bugs affecting document flattening. The flattening process could get stuck in an infinite loop when tabs were present in the XML, and we’ve resolved an issue where an XObject could shift position and affect the layout after flattening.
As always, the Java and C# source code for the iText 7 Core PDF library can be found on GitHub, together with all our other open-source projects.
That’s all for now, so we’ll see you for the next iText 7 Suite release in 2023!