The 4th of July celebrations are over, though the festivities at iText continue. We’re pleased to announce the availability of high-quality, native conversion of Microsoft Office files to PDF with our new pdfOffice add-on. With this addition to the iText 7 Suite, you get best-in-class conversion of Office documents, without the need for costly licensing subscriptions and conversion costs.
If you’re keen to learn more, tune in on Thursday, July 15th, for our dedicated webinar, where our product expert Cal Reynolds will show you how pdfOffice works.
Even though pdfOffice takes up a substantial part of the 7.1.16 release, we have many more new features in store for you with this release. Let’s go through the different releases in more detail.
iText 7 Suite Releases
There are a number of quite substantial enhancements which should please many of our users:
- Improvements to the SVG support in iText by making the TextLeafSvgNodeRenderer class more easily extendable.
- Refactoring the PdfSigner class responsible for digital signatures to simplify signature appearance customization, allowing developers to more easily use SVG images.
- Optimization of metadata parsing, plus fixes for a couple of bugs which could cause deadlocks and memory leaks.
- Added support for non-encrypted PDF files with encrypted attachments, and for ICCBased colorspaces.
This release brings reduced memory consumption, which you will notice even when processing small volumes.
Besides that, we added implicit validation and filtering of extracted symbols. This was an issue in previous versions, since Unicode allows symbols in a PDF that aren't supported by the XML standard, which could cause exceptions or malformed output files.
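To illustrate the kind of filtering described above, here is a minimal sketch in plain Java. It is not the code used in the release; the class and method names are hypothetical, and the only assumption is that "supported by the XML standard" means the Char production of XML 1.0.

```java
public final class XmlCharFilter {
    private XmlCharFilter() {}

    /** Returns true if the code point matches the XML 1.0 Char production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Drops every code point that an XML 1.0 document may not contain. */
    static String stripInvalidXmlChars(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints()
            .filter(XmlCharFilter::isValidXmlChar)
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical extracted text containing a NUL and a vertical tab.
        String extracted = "Total\u0000: 42\u000B EUR";
        System.out.println(stripInvalidXmlChars(extracted)); // prints "Total: 42 EUR"
    }
}
```

Whether invalid symbols should be dropped, replaced, or reported is a design choice; this sketch simply drops them.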
pdfHTML gets quite a few improvements with the newest release! They include:
- The addition of support for the attr() CSS function. This is used to retrieve the value of an attribute of the selected element and use it in the stylesheet. See the example linked on the release page for more information.
- We also focused on handling corner cases and bug fixes in the CSS Flexible Layout Box module support that was introduced in the previous release. However, the addition of support for flex-items with an intrinsic aspect ratio should be noted separately. As it is not clearly defined in the specification when and how an intrinsic aspect ratio should be handled, we did our best to offer as wide support of this property as possible in pdfHTML.
- Improvements to our Bézier curve approximation for rounded borders. This is a feature which many of you will enjoy, so don’t forget to check out the included example!
- Moreover, in this release you can expect a bunch of bugfixes and further performance improvements as usual.
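For readers wondering what a Bézier approximation of a rounded border involves, the sketch below shows the textbook way to approximate a quarter circle with a single cubic Bézier curve. It is not pdfHTML's implementation, which may well differ; the class and method names are hypothetical.

```java
public final class QuarterArcBezier {
    private QuarterArcBezier() {}

    /** Standard control-point offset for approximating a quarter circle. */
    static final double KAPPA = 4.0 * (Math.sqrt(2.0) - 1.0) / 3.0;

    /** Control points {x0,y0,...,x3,y3} for the arc from (r,0) to (0,r) around the origin. */
    static double[] controlPoints(double r) {
        return new double[] { r, 0, r, KAPPA * r, KAPPA * r, r, 0, r };
    }

    /** Evaluates the cubic Bezier curve defined by p at parameter t in [0,1]. */
    static double[] pointAt(double[] p, double t) {
        double u = 1 - t;
        double b0 = u * u * u, b1 = 3 * u * u * t, b2 = 3 * u * t * t, b3 = t * t * t;
        return new double[] {
            b0 * p[0] + b1 * p[2] + b2 * p[4] + b3 * p[6],
            b0 * p[1] + b1 * p[3] + b2 * p[5] + b3 * p[7]
        };
    }

    public static void main(String[] args) {
        double r = 10.0; // hypothetical border radius
        double[] p = controlPoints(r);
        // Print how far each sampled point is from the ideal circle.
        for (int i = 0; i <= 4; i++) {
            double[] pt = pointAt(p, i / 4.0);
            System.out.printf("t=%.2f deviation=%.6f%n", i / 4.0, Math.hypot(pt[0], pt[1]) - r);
        }
    }
}
```

With this constant the curve touches the true circle at both end points and at the midpoint, and stays within a small fraction of a percent of the radius in between.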
This is the first release of the pdfOCR add-on this year, and we’re pleased to share a significant improvement in image type detection. From now on, pdfOCR does not rely on the file extension to determine the image type. Instead, it detects the image type from the file's content, which prevents errors in OCR processes.
This will allow you to use files with unknown or incorrect extensions as an input, as long as they have the correct structure from a specifications point of view.
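As a rough illustration of content-based detection, the sketch below identifies a few common image formats from their well-known leading signature bytes. It is only an assumption of how such a check can look, not pdfOCR's actual code; the class, enum, and method names are hypothetical.

```java
public final class ImageTypeSniffer {
    private ImageTypeSniffer() {}

    enum ImageType { PNG, JPEG, GIF, BMP, TIFF, UNKNOWN }

    /** Guesses the image type from the first bytes of the file content. */
    static ImageType detect(byte[] head) {
        if (startsWith(head, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) return ImageType.PNG;
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) return ImageType.JPEG;
        if (startsWith(head, 'G', 'I', 'F', '8')) return ImageType.GIF;
        if (startsWith(head, 'I', 'I', 0x2A, 0x00)     // little-endian TIFF
                || startsWith(head, 'M', 'M', 0x00, 0x2A)) return ImageType.TIFF; // big-endian TIFF
        if (startsWith(head, 'B', 'M')) return ImageType.BMP;
        return ImageType.UNKNOWN;
    }

    /** Compares the leading bytes of data with the given unsigned signature values. */
    private static boolean startsWith(byte[] data, int... signature) {
        if (data.length < signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Content of a hypothetical file named "scan.jpg" that is really a PNG.
        byte[] head = { (byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A };
        System.out.println(detect(head)); // prints "PNG"
    }
}
```

In practice you would read only the first few bytes of the input file and pass them to `detect`, so the file name never enters the decision.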
As mentioned, with this release, we are adding pdfOffice to the iText 7 Suite. From now on, developers can programmatically convert MS Office documents, fully natively, into fully ISO-compliant PDF files. It can handle files created in Office 97 through Office 2019 and the latest Office 365 updates.
In this first release it can convert Word and PowerPoint files, with Excel support in development. Currently supported file types are .doc, .docx, .dotx, .docm, .dotm, .dot, .ppt, .pptx, .potx, .pptm, .potm, .ppsx, .ppsm, .pot, and .pps.
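If you want to pre-filter a batch of input files before handing them to a converter, a simple check against the extensions listed above could look like the sketch below. This is plain Java and does not call pdfOffice; the class and method names are hypothetical, and whether pdfOffice itself looks at the extension is not something this sketch assumes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public final class SupportedOfficeFormats {
    private SupportedOfficeFormats() {}

    /** Extensions listed as supported in this first release. */
    static final Set<String> EXTENSIONS = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
            "doc", "docx", "dotx", "docm", "dotm", "dot",
            "ppt", "pptx", "potx", "pptm", "potm", "ppsx", "ppsm", "pot", "pps")));

    /** Returns true if the file name ends with one of the listed extensions (case-insensitive). */
    static boolean isSupported(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(ext);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("Quarterly Report.DOCX")); // prints "true"
        System.out.println(isSupported("budget.xlsx"));           // prints "false"
    }
}
```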
You probably already know that PDF as a format has a lot of advantages over MS Office files. It is much more suited for archiving and distribution purposes, as files can be easily viewed using multiple free tools and from within web browsers. PDF also has the added benefit that files are guaranteed to appear the same, regardless of the application or environment.
Usage of pdfOffice is especially beneficial when combined with iText 7 Core or other add-ons in the iText 7 Suite, as this allows you to achieve many more possibilities. Some example use cases:
- For distribution, you may want to add watermarks (iText 7 Core) and redact sensitive data (pdfSweep).
- For archiving, you would definitely benefit from digitally signing the converted files to have proof of integrity, reducing the file size (pdfOptimizer) or converting them into images (pdfRender).
- For data extraction, pdfOffice+pdf2Data might be the best option.
Note: pdfOffice is currently only available for Java developers. It does not rely on any external software to perform the conversion; everything is handled natively. As such, all you need is the pdfOffice add-on and a valid license to enable high-fidelity conversion of all supported document formats.
As it is a brand-new product, we are still busy creating some documentation for it, although its usage is pretty straightforward... Don't believe us? Check out this example, and the other examples linked on the release page demonstrating adding annotations, redacting content, and digitally signing converted documents.
This release contains a fix for a bug which could cause exceptions in the CLI version while using the volume license library. pdfRender also benefits from some stability improvements in iText Core and from the fixing of a couple of performance-related bugs.
While there are no new features for pdfSweep with this release, we did fix an annoying bug that could cause a NullPointerException for TD and TL operators when analyzing a document stream.
Another bug fix in this release: for pdfXFA 2.0.11 we resolved an issue when handling XFA files which contain empty rich text fields.
As usual, our PDF debugging tool RUPS has been updated together with iText 7 Core, as it is one of the most important dependencies.
Contributions and pull requests
We always welcome contributions and pull requests from the community. If you have any, let us know.