iText pdfSweep

pdfSweep is an iText Core add-on for Java and C# (.NET) that removes (redacts) information from a PDF document in a reliable and secure way.

How it works

With just a few lines of code you can use the powerful PDF redaction capabilities of pdfSweep to irretrievably remove content. The following example will find and redact all instances of the word "Alice" in a document, regardless of casing:

// Java (SRC and DEST are the input and output file paths)
try (PdfDocument pdf = new PdfDocument(new PdfReader(SRC), new PdfWriter(DEST))) {
    // Match "Alice" in any casing and paint each redacted area pink
    final ICleanupStrategy cleanupStrategy = new RegexBasedCleanupStrategy(Pattern.compile("Alice", Pattern.CASE_INSENSITIVE)).setRedactionColor(ColorConstants.PINK);
    PdfCleaner.autoSweepCleanUp(pdf, cleanupStrategy);
}

// C# (.NET)
PdfDocument pdf = new PdfDocument(new PdfReader(SRC), new PdfWriter(DEST));
ICleanupStrategy cleanupStrategy = new RegexBasedCleanupStrategy(new Regex(@"Alice", RegexOptions.IgnoreCase)).SetRedactionColor(ColorConstants.PINK);
PdfCleaner.AutoSweepCleanUp(pdf, cleanupStrategy);
// Close the document so the redacted result is written to DEST
pdf.Close();

The original PDF

An unredacted page from Alice in Wonderland

The redacted PDF

A redacted page from Alice in Wonderland

Key features

Core capabilities of the iText pdfSweep redaction tool

pdfSweep steps in when you edit a PDF document with iText Core's document stamping and watermarking tools. After you place a digital "blackout bar" over sensitive text, an image or part of an image, pdfSweep changes the document's rendering instructions so that the hidden content can no longer be extracted. This works for both text and images, affording you full information security.
Given these advantages and the data security pdfSweep offers, you may be surprised that it takes only five lines of code to integrate it into your document workflow.

Automatic removal of words and phrases

Remove text from a document, based on patterns like regular expressions.

Customized removal areas

Define your own removal areas to remove content wherever necessary, just like placing a digital black bar.

Secure and reliable removal

pdfSweep changes more than the visual appearance that is rendered when viewing or printing the PDF document. It also takes care of the underlying rendering instructions and data structures to ensure the removed information is not retrievable.

Partial removal of text and images

When content is partially covered by a redaction area, it is only partially removed, allowing you to remove selected parts of text and images.
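
To make the idea of partial coverage concrete, here is a minimal geometric sketch in plain Java. It does not use pdfSweep and does not describe how pdfSweep works internally. The class `PartialOverlapSketch`, the `Box` type and the `coveredPart` method are hypothetical names introduced only for illustration. The sketch assumes axis-aligned rectangles and computes which part of a content box falls inside a redaction area, which is the part that would be removed.

```java
final class PartialOverlapSketch {

    // Axis-aligned rectangle; (x, y) is the lower-left corner, as in PDF user space.
    static final class Box {
        final double x;
        final double y;
        final double width;
        final double height;

        Box(double x, double y, double width, double height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        double area() {
            return width * height;
        }
    }

    // Returns the part of the content box that lies inside the redaction box,
    // or null when the two boxes do not overlap.
    static Box coveredPart(Box content, Box redaction) {
        double left = Math.max(content.x, redaction.x);
        double bottom = Math.max(content.y, redaction.y);
        double right = Math.min(content.x + content.width, redaction.x + redaction.width);
        double top = Math.min(content.y + content.height, redaction.y + redaction.height);
        if (right <= left || top <= bottom) {
            return null; // no overlap, so nothing would be removed
        }
        return new Box(left, bottom, right - left, top - bottom);
    }

    public static void main(String[] args) {
        Box image = new Box(0, 0, 100, 100);      // a 100 x 100 image
        Box redaction = new Box(50, 0, 100, 100); // covers the right half of the image
        Box covered = coveredPart(image, redaction);
        System.out.println(covered.area() / image.area()); // prints 0.5
    }
}
```

In this example only the right half of the image lies under the redaction area, so only that half would be affected and the left half would remain visible.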


Why use iText pdfSweep?

pdfSweep is a highly efficient PDF tool for confidential data redaction.

Remove content from your digital documents irretrievably instead of just covering it up. You can also redact text, images, parts of images or drawings for complete confidentiality. iText pdfSweep complies with GDPR for data redaction.

Flexible options

Use recurring data or data fields to automate redaction throughout any volume of documents, with a set of predefined patterns for common data such as social security numbers, account numbers and ID numbers. You can also define custom redaction areas using coordinates to redact any content within them.
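
As an illustration of pattern-based matching, the following plain-Java sketch shows what a regular expression for US social security numbers could look like and which substrings it would select. The pattern, the class name `SensitivePatternSketch` and the helper `findMatches` are assumptions made for this example. They are not pdfSweep's predefined patterns, which are not listed here. A compiled `Pattern` of this kind could then be passed to `RegexBasedCleanupStrategy` in the same way as the "Alice" pattern in the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SensitivePatternSketch {

    // Hypothetical pattern for numbers written as 123-45-6789.
    // Illustrative only; real data may need a stricter or broader expression.
    static final Pattern SSN = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    // Returns every substring of the text that the pattern matches, in order.
    static List<String> findMatches(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String sample = "Applicant SSN: 123-45-6789, phone 555-0100, backup SSN 987-65-4321.";
        System.out.println(findMatches(SSN, sample)); // prints [123-45-6789, 987-65-4321]
    }
}
```

Testing a pattern against sample text like this before running it over real documents helps confirm that it selects the intended data and nothing else, such as the phone number in the sample.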

