For the last release of the year of our PDF library, we once again have quite a number of interesting new features in store, including a few that many of you asked for:
- Initial support for <symbol> in SVG, which benefits both Core and pdfHTML 3.0.2
- Changed how word wrapping is handled in Core for languages such as Thai, Khmer, Lao, and Myanmar
- In pdf2Data, the grouping of parsed values in the output XML file based on their y position is now possible, which is a feature that can be useful in numerous cases
- Myanmar (official language of Burma) is now supported in pdfCalligraph 2.0.8
- Extended the support of the CSS specification to add full support for the background properties in CSS, available now in pdfHTML 3.0.2
- pdfOCR sees a refinement of the symbol position based on the HOCR data that fixes output for Thai and some CJK fonts. In addition, it is now possible to configure image preprocessing.
- BufferedImage is now also supported as an output type in pdfRender, enabling you to create an in-memory image object when converting from PDF with PdfRender
- The newest version of pdfXFA now supports the setProperty declaration.
As for the rest of the changes, expect the typical assortment of performance improvements and convenience additions.
Note: As we recently announced, Java 7 and .NET Standard 1.6 are deprecated as of this release. For more information about what this means for developers, please see the linked articles.
iText 7 Suite Releases
Release iText 7.1.13 Core
The iText Core 7.1.13 release is the fourth quarterly release for 2020 of our innovative PDF library.
This Core release brings initial support for <symbol> in SVG, which benefits both Core and pdfHTML 3.0.2. As we've mentioned previously, we continue to work on increasing support for SVG since it is an important topic for us.
Improvements have been made to the way MemoryLimitsAwareHandler works, and we've also made important changes regarding how word-wrapping is handled in the context of languages that use special scripts, such as Thai, Khmer, Lao, and Myanmar.
Besides that, we investigated and fixed a potential vulnerability in the way iText used to parse XMP metadata, which affects both Core and the license library. We also improved PDF merging so that it no longer ignores OCG dictionaries, and added two new image filters (DCT and JPXDecode).
As for the rest of the changes, expect the typical assortment of performance improvements and convenience additions.
Release pdf2Data 2.1.9
In this release, we added a feature which a number of customers have requested.
This is the grouping of parsed values in the output XML file based on their y position, which is a feature that can be useful in numerous cases.
It is available in the Expert mode of the template editor as groupByTb: dataFieldName, yet you shouldn't need to be an expert to start using it.
We've provided an example showing how this works, but if you are familiar with the Expert mode of pdf2Data, you can probably just skip the example and start playing around with it yourself!
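To give a feel for what grouping by y position means, here is a minimal plain-Java sketch of the idea. It is not pdf2Data's implementation and does not use the groupByTb template syntax. The ParsedValue class, the groupByY method, and the tolerance parameter are all hypothetical names introduced purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GroupByYSketch {

    /** Hypothetical model of one parsed value and the y coordinate where it was found. */
    static final class ParsedValue {
        final String field;
        final String text;
        final double y;

        ParsedValue(String field, String text, double y) {
            this.field = field;
            this.text = text;
            this.y = y;
        }
    }

    /**
     * Groups values whose y coordinates lie within 'tolerance' of the first
     * value in the group. This is an assumed, simplified rule for illustration.
     */
    static List<List<ParsedValue>> groupByY(List<ParsedValue> values, double tolerance) {
        List<ParsedValue> sorted = new ArrayList<>(values);
        // PDF y coordinates grow upwards, so sort descending to read top to bottom.
        sorted.sort(Comparator.comparingDouble((ParsedValue v) -> v.y).reversed());

        List<List<ParsedValue>> groups = new ArrayList<>();
        List<ParsedValue> current = null;
        double groupY = 0;
        for (ParsedValue value : sorted) {
            // Start a new group when the value is too far from the group's first value.
            if (current == null || Math.abs(groupY - value.y) > tolerance) {
                current = new ArrayList<>();
                groups.add(current);
                groupY = value.y;
            }
            current.add(value);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<ParsedValue> values = Arrays.asList(
                new ParsedValue("description", "Widget", 700.0),
                new ParsedValue("price", "9.99", 700.4),
                new ParsedValue("description", "Gadget", 680.0),
                new ParsedValue("price", "19.99", 679.8));

        int row = 1;
        for (List<ParsedValue> group : groupByY(values, 1.0)) {
            StringBuilder line = new StringBuilder("row " + row++ + ":");
            for (ParsedValue value : group) {
                line.append(' ').append(value.field).append('=').append(value.text);
            }
            System.out.println(line);
        }
    }
}
```

In this sketch, values that sit on the same line of a table end up in the same group, which is the kind of structure the grouped XML output is meant to capture.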
We also improved the fontSize selector so that it now accepts a range instead of the exact value.
Release pdfCalligraph 2.0.8
We are pleased to introduce pdfCalligraph 2.0.8.
In this new version, Myanmar (the official language of Burma) has been added to the supported scripts.
This is another big step for pdfCalligraph, since Myanmar requires both processing and clustering mechanisms which differ from previously supported languages.
Check out our example linked on the pdfCalligraph page above for more details.
Besides that, we fixed the processing of Thai in mixed Thai-Latin-Thai strings and extended the test coverage of the entire library.
Release pdfHTML 3.0.2
In this release, we extended the support of the CSS specification to add full support of the background properties in CSS: background-repeat, background-position-x, background-position-y, background-blend-mode, background-clip, background-origin, etc.
Besides this, the release brings further improvements to SVG processing, thanks to the introduction of the <symbol> element which is used to define reusable symbols within the SVG content. Please take a look at the examples for this release to give you more insight and use cases.
Speaking of wider support for CSS, the object-fit property that is widely used in responsive HTML markup is now supported by pdfHTML 3.0.2 too.
Besides this, we continue to improve the processing of inline text by adding support for text-transform: capitalize, and we provide more control over the document conversion process by letting developers define custom resource resolving strategies by creating their own ResourceRetriever.
A number of bug fixes are also included.
Release pdfOCR 1.0.2
pdfOCR 1.0.2 is already the third release of our newest product.
It brings some important improvements which allow you to process documents more precisely. These are:
- Refinement of the symbol position based on the HOCR data that fixes output for Thai and some CJK fonts. This is especially important for our pdfCalligraph customers.
You can turn it on with: tesseract4OcrEngineProperties.setUseTxtToImproveHocrParsing(true);
- Configurable image preprocessing. This allows smoothing out fluctuations in a document's brightness, which gives better results for images taken by a camera.
You can pass the parameters which are described on http://www.leptonica.org/binarization.html using tesseract4OcrEngineProperties.setImagePreprocessingOption
Release pdfRender 1.0.2
The most important feature added in this release is a change to the supported output formats. Previously, you could render PDF pages to JPG, PNG or other image file formats, but now BufferedImage is also supported. This means it is now possible to create an in-memory image object when converting from PDF using PdfRender, which in some cases will be more efficient to work with. A number of customers asked for this, and we were happy to oblige.
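As a rough illustration of why an in-memory image can be convenient, the sketch below post-processes a BufferedImage and encodes it to PNG bytes without writing any intermediate file. It uses only standard JDK classes. The pdfRender call that would produce the image is deliberately not shown, so the placeholderPage method is a hypothetical stand-in for a rendered page, and all class and method names here are assumptions made for the example.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

public class InMemoryImageSketch {

    /** Hypothetical stand-in for a rendered page: a blank white image. */
    static BufferedImage placeholderPage(int width, int height) {
        BufferedImage page = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = page.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
        } finally {
            g.dispose();
        }
        return page;
    }

    /** Scales the page down to the given width, keeping the aspect ratio. */
    static BufferedImage thumbnail(BufferedImage page, int targetWidth) {
        int targetHeight = Math.max(1, page.getHeight() * targetWidth / page.getWidth());
        BufferedImage thumb = new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = thumb.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(page, 0, 0, targetWidth, targetHeight, null);
        } finally {
            g.dispose();
        }
        return thumb;
    }

    /** Encodes the image as PNG into a byte array, entirely in memory. */
    static byte[] toPng(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "png", out)) {
            throw new IOException("No PNG writer available");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        BufferedImage page = placeholderPage(595, 842); // roughly A4 at 72 dpi
        BufferedImage thumb = thumbnail(page, 100);
        byte[] png = toPng(thumb);
        System.out.println("thumbnail " + thumb.getWidth() + "x" + thumb.getHeight()
                + ", " + png.length + " PNG bytes");
    }
}
```

The same pattern applies to any step that consumes a BufferedImage directly, such as cropping, compositing, or handing the image to another library.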
pdfRender is a closed source add-on and is currently available for Java, and since version 1.0.1 a CLI version is also available. You will require an iText 7 Core license, a pdfRender license, and version 3.0.6 of the license key library.
You can download pdfRender from our Artifactory here and the API docs can be viewed here.
Release pdfXFA 2.0.8
In a nutshell, we added support for the setProperty declaration in this release and fixed a few bugs in XFA flattening. Please head over to the Changelog for details.
Having said that, we'd also like to point you to a couple of XFA-related blog posts that may be interesting to you:
iText Knowledge Base: release notes
For full details on the improvements, bug fixes, and installation details, head over to the iText 7.1.13 Release notes on the iText Knowledge Base.
Highlight: pull requests
This time round, we’d like to thank DinoAGW for his Pull Request and StackOverflow question mentioning an issue relating to pdfHTML and resizing a background-image (special thanks also to mkl), which contributed to us making improvements to Core and pdfHTML for this release.
We always welcome contributions to our code. If you have any, just let us know.
Continue to stay safe, and let’s take care of each other!