A non-exhaustive overview of differences in functionality between iText 2 and 5
1. DIGITAL SIGNATURES
This functionality has been completely rewritten so that iText can be used to create signatures that are legally binding in Europe and the US. We have implemented the PAdES standard (including Long-Term Validation and XFA Signing).
For this functionality, iText is an alternative to Adobe LiveCycle; read this paper for a third-party comparison.
In 2012, our founder wrote a detailed white paper about Digital Signatures for PDF documents, which is available as a free download.
2. XML WORKER
A generic tool for people who want to convert XML to PDF. The basic implementation converts XHTML+CSS to PDF. It is more accurate than the old HTMLWorker, but it is not meant to convert websites to PDF. It is more like a template system where people create simple templates using HTML and CSS (instead of the convoluted XSL-FO approach). They populate the HTML with data, and convert that HTML to PDF. XML Worker was the first step in the development of XFA Worker. See the "How to use XML Worker" video.
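The "populate the HTML with data" step can be sketched in plain Java. This is only an illustration, not part of XML Worker: the class name XhtmlTemplateSketch, the ${key} placeholder syntax and the sample data are all assumptions made for the example, and the conversion of the resulting XHTML to PDF is left out because it depends on the XML Worker library itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: fills ${key} placeholders in an XHTML template.
class XhtmlTemplateSketch {
    // The ${key} syntax is an assumption for this sketch, not an XML Worker feature.
    static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Escape characters that would otherwise break well-formed XHTML.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Single pass over the template; unknown placeholders are left untouched.
    static String fill(String template, Map<String, String> data) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = data.get(m.group(1));
            String replacement = (value == null) ? m.group() : escapeXml(value);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "<html><body><h1>Invoice ${number}</h1>"
                + "<p>Customer: ${customer}</p></body></html>";
        Map<String, String> data = new LinkedHashMap<>();
        data.put("number", "2013-042");
        data.put("customer", "Smith & Sons");
        // The filled XHTML is what would then be handed to the converter.
        System.out.println(fill(template, data));
    }
}
```

Escaping the data matters here because the converter expects well-formed XHTML; an unescaped "&" or "<" in a customer name would make the document unparseable.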
3. XFA WORKER
This is a closed source product created for people who want an alternative to Adobe LiveCycle for XFA form flattening. An XFA form is a dynamic type of interactive form, which can change based on user interaction or data input. Flattening an XFA form turns it into a non-interactive PDF, a final version of the filled form that should no longer be changed (e.g. for archiving).
Read more in this thread.
4. TAGGED PDF
PDF was originally designed as an end product, for visual representation. It is not a word processing format intended for further editing. Simplistically, PDF has a set of instructions to place content (text, images, etc.) at absolute positions on pages. Originally, it had no concept of document structure or structural elements, like headings, paragraphs, tables and lists. This makes it difficult to extract, process and reuse PDF content.
Tagged PDF (introduced in PDF 1.4, building on the logical structure facilities of PDF 1.3) added ways to store additional information to facilitate this. Adding the logical document structure (the structure tree) to the PDF is an important part of it.
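To make "instructions to place content at absolute positions" concrete, here is a minimal sketch in plain Java that builds the kind of text-showing operators found in a PDF page content stream. It does not use iText; the class name ContentStreamSketch and the font resource name /F1 are assumptions for the example, and a real page would also need a font dictionary that defines /F1.

```java
import java.util.Locale;

// Hypothetical helper that emits raw PDF text operators.
class ContentStreamSketch {
    // Backslashes and parentheses have special meaning in PDF literal strings.
    static String escapePdfString(String s) {
        return s.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)");
    }

    // BT/ET delimit a text object, Tf selects font and size,
    // Td moves to position (x, y) in points, Tj shows the string.
    static String showTextAt(String text, double x, double y, int fontSize) {
        return String.format(Locale.ROOT,
                "BT\n/F1 %d Tf\n%.2f %.2f Td\n(%s) Tj\nET\n",
                fontSize, x, y, escapePdfString(text));
    }

    public static void main(String[] args) {
        // One inch from the left edge, near the top of a US Letter page.
        System.out.print(showTextAt("Hello", 72, 720, 12));
    }
}
```

Nothing in this output says whether "Hello" is a heading, a paragraph or a table cell; it is only a string at a coordinate, which is why structure has to be added separately.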
Tagged PDF is important in the context of PDF/UA (Section 508: documents need to be accessible) and PDF/A (level A).
It is not impossible to create a Tagged PDF using iText 2, but it's extremely difficult to do it correctly and efficiently, because you need to create the structure at the lowest level (you need to be fluent in PDF syntax). In the most recent versions of iText, you can automate tagging when using iText's high-level objects (PdfPTable, Paragraph, ...).
We now have conformance checking for PDF/A-1, PDF/A-2, and PDF/A-3 (levels A and B).
5. IMPROVED DATA EXTRACTION
iText 5 contains improvements for extracting text and images from PDF. Using heuristics, we can reconstruct text from the textual content in the PDF page content. We don't have generic structure recognition (i.e. detecting paragraphs, lists, etc.) yet, but we have built a custom system for one of our customers.
6. IMPROVED MERGING AND COPYING
The process of merging and copying PDFs has been rewritten. Pdf(Smart)Copy has been improved to be able to process Tagged PDF and PDFs with AcroForms. In iText 2, you lose the StructTreeRoot and forms are broken.
7. SUPPORT FOR PDF DOCUMENTS UP TO 1 TBYTE
iText 2 only supports PDFs up to 2 GByte. The current iText version allows for PDFs that are up to 1 TByte.
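The text does not say where the 2 GByte limit comes from. One plausible reading, which is an assumption on our part, is that byte offsets were held in 32-bit signed integers. The arithmetic behind that reading can be sketched in plain Java (the class name OffsetLimitSketch is hypothetical):

```java
// Hypothetical illustration of 32-bit versus 64-bit byte offsets.
class OffsetLimitSketch {
    static final long GIB = 1024L * 1024 * 1024;
    static final long TIB = 1024L * GIB;

    // True if the offset can be stored in a Java int without overflow.
    static boolean fitsInInt(long offset) {
        return offset >= 0 && offset <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long twoGiB = 2 * GIB;
        System.out.println(fitsInInt(twoGiB - 1)); // true: last offset an int can hold
        System.out.println(fitsInInt(twoGiB));     // false: one byte too far
        System.out.println((int) twoGiB);          // overflows to -2147483648
        System.out.println(fitsInInt(TIB));        // false: 1 TByte needs a long
    }
}
```

A 64-bit long has far more range than 1 TByte requires, so under this assumption the newer limit would come from elsewhere in the implementation or the file format rather than from the integer type.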
8. IMPROVEMENTS THAT WE REFER TO AS "YATP"
YATP literally stands for "Yet Another TIFF Problem." TIFF is a standard that is "abused" by many TIFF producers. As a result, we have encountered some really strange TIFFs that couldn't be interpreted by iText. Every couple of months we need to provide a solution for similar problems. These are not limited to TIFF, but as TIFF is the most problematic format, we refer to these issues as YATP.
9. ITEXT FOR ANDROID AND GAE
iTextG: a version of iText that can be used on: