Digital Document Archiving with PDF/A
Future-proof digital preservation
To most people, a PDF is just a “digital printout”. However, PDF comes in a variety of subtypes and standards that may seem arcane to the everyday user but can be crucial to a professional. For instance, PDF/A is an ISO standard (ISO 19005) designed specifically for long-term archiving and preservation of electronic documents. PDF/A-3 allows users to embed files of any type, so the data and rich content a company needs can travel with the archived document, and PDF/A documents can be secured with digital signatures based on the PAdES standard. iText libraries handle this with ease, whereas most common PDF programs have no real way of dealing with it.
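A PDF/A file announces its conformance level in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The following is a minimal sketch in plain Python (not iText code) of how an archiving workflow might read that declaration. The function name `declared_pdfa_level` is hypothetical, and the sketch assumes the XMP packet is stored uncompressed and unencrypted in the file.

```python
import re

# Match pdfaid:part / pdfaid:conformance written either as XML elements
# (<pdfaid:part>3</pdfaid:part>) or as attributes (pdfaid:part="3").
_PART = re.compile(rb'pdfaid:part(?:>\s*|\s*=\s*["\'])(\d)')
_CONF = re.compile(rb'pdfaid:conformance(?:>\s*|\s*=\s*["\'])([A-Za-z])')


def declared_pdfa_level(pdf_bytes: bytes):
    """Return the declared PDF/A level (e.g. '3U'), or None if none is found.

    Hypothetical helper: it only reads the declaration in the XMP metadata.
    It does NOT validate that the file actually conforms to PDF/A.
    """
    part = _PART.search(pdf_bytes)
    if part is None:
        return None
    level = part.group(1).decode("ascii")
    conf = _CONF.search(pdf_bytes)
    if conf is not None:
        level += conf.group(1).decode("ascii").upper()
    return level
```

Calling `declared_pdfa_level(open("scan.pdf", "rb").read())` on a file (the file name is illustrative) returns the declared level or `None`. A declaration is only a claim, so a real archive would still run the file through a dedicated PDF/A validator.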
Optical Character Recognition or OCR
One of the major challenges in document management is dealing with inaccessible data locked away in non-editable documents. Scanning a document containing printed text does not make it editable or searchable since it produces an image-only PDF. Because scanned printed documents are “flat”, you need to make sure the text they contain becomes accessible. Without machine-readable text, scanned documents cannot be edited with word processors like Microsoft Word, nor can they be searched, indexed or processed for later use. Optical Character Recognition (OCR) can help to unlock this data.
The iText pdfOCR add-on lets you automate the OCR process and integrate it into your archiving workflow. It provides text recognition and converts scanned documents, PDFs and images into fully ISO-compliant PDF/A-3u files.
Universal document accessibility with PDF/UA
Reaching out to people with disabilities
Many governmental organizations are required by law (e.g. Section 508 in the United States) to make their documents accessible to people with visual impairments. PDF/UA (“Universal Accessibility”) was set up as a PDF standard to ensure accessibility for people with disabilities who use screen readers, screen magnifiers, joysticks and other assistive technologies to navigate and read electronic content. It helps assistive technologies convert PDF text to braille, or describe to users what an image depicts if they cannot see it. iText lets you add the right structural elements to your PDFs to achieve this type of compliance and extend the reach of your organization to people who are regularly underserved by this type of documentation.
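The structural elements mentioned above leave visible traces in a file: a structure tree (`/StructTreeRoot`), a `/Marked true` entry that flags the document as tagged, and a `pdfuaid:part` property in the XMP metadata that declares PDF/UA. The following rough sketch in plain Python (not iText code) looks for those traces. The function name `pdfua_markers` is hypothetical, and the byte scan assumes the relevant objects and metadata are stored uncompressed, which is often not the case for PDFs that use object streams.

```python
import re


def pdfua_markers(pdf_bytes: bytes) -> dict:
    """Report which accessibility markers appear in the raw bytes.

    Hypothetical helper: a heuristic scan, not a PDF/UA conformance check.
    """
    return {
        # XMP declaration, written as an element or as an attribute.
        "declares_pdfua": re.search(
            rb'pdfuaid:part(?:>\s*|\s*=\s*["\'])\d', pdf_bytes
        ) is not None,
        # The structure tree holds the tags (headings, paragraphs, figures).
        "has_structure_tree": b"/StructTreeRoot" in pdf_bytes,
        # The MarkInfo dictionary flags the document as tagged.
        "marked_as_tagged": re.search(rb"/Marked\s+true", pdf_bytes) is not None,
    }
```

A file in which all three values are `True` at least claims to be tagged and PDF/UA compliant. Whether the tags are meaningful (for example, whether images carry useful alternative text) still needs a dedicated checker and human review.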
Easily convert HTML and CSS into standards-compliant PDFs that are accessible, searchable and usable for indexing.
Securely remove content from your PDF files quickly and efficiently.
Ensure compatibility with archival and legacy workflow requirements by generating images from PDFs.
Achieve the right ligatures, calligraphy and typesetting for special and non-Latin scripts (e.g. Arabic and Devanagari).
iText 7 Core
Learn more about the backbone of embeddable PDF functionalities.
Recognize text in scanned documents, PDFs and images and convert them to editable PDFs.
iText pdfOffice is an add-on for the iText 7 Core PDF library which enables high-quality native conversion of Microsoft Office documents to PDF.
Still have questions about PDF solutions for archiving & accessibility?
We're happy to help! Send us your questions, and we'll get back to you as soon as possible.