One of the most interesting features added to the PDF format since its inception back in 1993 is the ability to create portable collections, more commonly known as PDF portfolios. Portable collections were introduced in the PDF 1.7 specification (and expanded in PDF 2.0) and allow multiple files to be bundled into a single PDF. Although “portable collection” is the name defined in the specification, for convenience we’ll be using “PDF portfolio” for the rest of this article.
Can't I just combine files into a PDF?
This feature offers similar functionality to combining files into a single PDF but differs in one major respect. Simply combining files means that all the files will be converted to PDF, whereas creating a PDF portfolio preserves the files in their original file format, and you can edit or modify them in their native application without removing them from the portfolio. Note that if the portfolio is signed with a digital signature, editing the documents will break the signature, since it covers the whole PDF, including the PDF portfolio and its files. We’ll look at digital signatures and how they work in PDF portfolios later in this article.
PDF includes features such as "embedded file streams" (PDF 1.3) and "associated files" (PDF/A-3 and PDF 2.0) which allow the containment and characterization of arbitrary content (such as files commonly found in email attachments) within the PDF file. As noted in this article from the PDF Association, the PDF standard includes embedded-file, metadata, navigation, data-protection and accessibility/reuse features in an ISO-standardized, vendor-independent specification. In much the same way that PDF documents can be a container for other types of data, PDF portfolios are themselves a data container format that enables you to collect many different file types together in a single file.
What can I use PDF portfolios for?
There are many business use cases and applications where PDF portfolios could be ideal. For example, loan application requests where there are forms to fill out and read-only disclosures, or packets for new employees containing information such as health insurance forms and company policy documents in different formats.
They can also be used for non-business applications, such as art students who need to submit a portfolio for college. Using a PDF portfolio, they can easily incorporate original images, photographs and videos into a single file without needing to worry about compression artifacts affecting the perception of their work. Unlike a combined PDF, where all files are converted to PDF, files contained within the PDF portfolio remain untouched and can easily be viewed with a supported application.
PDF portfolios offer a number of benefits, depending on your use case. For example, imagine you run a construction company that is building a house. There might be various documents relating to the project, such as CAD drawings, pictures, Word documents such as .doc and .docx files, .xls and .xlsx spreadsheets for the budget etc. All these files could be neatly packaged into a PDF portfolio for convenience, so everything relating to the project can be shared easily with anyone that needs it.
But PDF portfolios are not just a convenient container format; they also have significant security benefits. Let’s imagine your construction company is contracted to build a government facility, such as a prison. The files relating to this type of project would be similar, but now you are required to meet much more stringent confidentiality and security standards.
For PDF documents that relate to the project you can use PDF digital signatures to ensure they are secured. But how do you digitally sign all the other project files?
Well, there are a number of options to consider. You could convert them to PDF first and then use your existing digital signature process to secure them, but that presumes you are able to convert the files to PDF in the first place, and also that you no longer want to edit them. Alternatively, maybe you could convert them to some other format that supports digital signatures, again, assuming that conversion will be accurate and practical for your workflow.
Other solutions might be to add a signature outside of the file format; for example, some download sites publish a checksum (hash) next to the file so that its integrity can be checked. Or you could use a document management system or other controlled environment to handle file security and integrity. Both these options have the drawback of hindering easy data sharing though, since you’d need separate tools to verify the signatures, or have to give external people access to your document management system.
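To make the checksum option concrete, here is a minimal sketch of how a recipient might verify a downloaded file against a published SHA-256 value, using only the Java standard library. The class and method names (FileChecksum, sha256Hex, matches) are our own illustration and not part of iText or any other tool mentioned here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only.
final class FileChecksum {

    // Streams the file through SHA-256 and returns the digest as lowercase hex.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Compares the computed digest with the value published next to the download.
    static boolean matches(Path file, String publishedHex)
            throws IOException, NoSuchAlgorithmException {
        return sha256Hex(file).equalsIgnoreCase(publishedHex.trim());
    }
}
```

This also shows the drawback described above: the published hash lives outside the file, so every recipient needs both the hash and a tool like this one, for each file separately.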
However, we believe using a PDF portfolio would be a simpler and more elegant solution. Crucially, a PDF digital signature applied to the portfolio covers all the files it contains, whatever their file format. So, if any of the files in your portfolio are changed (e.g. the specifications for the building are revised, or the spreadsheet for your budget is updated), you only need to generate a single new digital signature for the portfolio to maintain security for the files contained within.
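The idea that one signature protects every contained file can be modelled in a few lines of plain Java. The sketch below is conceptual only: it packs some named byte arrays into a single container, signs the container bytes, and shows that changing any one file invalidates the signature. It is not how a PDF signature is encoded (a real PDF signature is stored inside the PDF itself), and the class name ContainerSignatureSketch, the packing format and the file names are all assumptions made for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.LinkedHashMap;
import java.util.Map;

// Conceptual model only; a stand-in for a signed portfolio, not real PDF signing.
final class ContainerSignatureSketch {

    // Packs name/content pairs into one byte array (hypothetical format).
    static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                out.writeUTF(file.getKey());
                out.writeInt(file.getValue().length);
                out.write(file.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Produces one signature over the whole container.
    static byte[] sign(byte[] container, PrivateKey key) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(key);
        signer.update(container);
        return signer.sign();
    }

    // Returns true only if the container bytes are exactly what was signed.
    static boolean verify(byte[] container, byte[] signature, PublicKey key)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(key);
        verifier.update(container);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair keys = generator.generateKeyPair();

        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("budget.xlsx", "total=100000".getBytes(StandardCharsets.UTF_8));
        files.put("spec.docx", "revision A".getBytes(StandardCharsets.UTF_8));

        byte[] signature = sign(pack(files), keys.getPrivate());
        System.out.println("untouched: " + verify(pack(files), signature, keys.getPublic()));

        // Editing a single contained file changes the container bytes.
        files.put("budget.xlsx", "total=250000".getBytes(StandardCharsets.UTF_8));
        System.out.println("after edit: " + verify(pack(files), signature, keys.getPublic()));
    }
}
```

The same property is what makes a signed portfolio convenient: there is one signature to check regardless of how many files, or which formats, are inside, and one signature to regenerate after a legitimate revision.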
Even if you don’t require the level of security provided by a digital signature, using PDF portfolios is still a great idea. You can set a password for the entire portfolio, or for individual PDFs contained within the portfolio if you prefer. Since PDF 2.0, PDF portfolios are also used as the supporting technology for the “unencrypted wrapper document” concept. When a document uses non-standard encryption, the user is typically confronted with a non-working document and little information as to why. With an “unencrypted wrapper document”, the files with custom encryption are added as embedded files, and a cover page can be used to explain what to do or what needs to be installed to handle the custom encryption. If you’d like to read about this concept in more detail, see section 7.6.7 in the PDF 2.0 specification.
Sharing the project files is also simple since everything is contained within a single file. Once the PDF portfolio is opened in a supported viewer, then each contained file can be easily opened in the corresponding application without affecting any of the other files.
As mentioned above, cover pages can be used in PDF portfolios to display information about the contents of the portfolio, or for other purposes. For instance, creating a PDF portfolio in Acrobat adds a standard cover page which can be displayed in unsupported viewers to advise the user to open the PDF portfolio in a supported viewer instead.
Opening such a PDF portfolio in a supported viewer such as Foxit Reader will display this warning, even though files contained within the portfolio can be accessed without any problems. This is simply because Acrobat sets the cover page as the default view of the portfolio for non-Acrobat PDF viewers. If you wish, you can also see this page in Acrobat by simply going to View > Portfolio > Cover Sheet.
Cover pages are just standard PDF pages (or even complete documents), and creating a PDF portfolio in iText 7 allows you to use a custom image, text or both. You can also select whether the cover page is the initial view of the portfolio, and whether to display the portfolio in Detail, Tiled or Hidden format.
Creating PDF portfolios with iText
But how do you create a PDF portfolio? Below is a quick example showing how you can use iText 7 Core to create a PDF portfolio containing a PDF, a .csv spreadsheet and a JPEG image, first in Java and then in C#:
/*
    This file is part of the iText (R) project.
    Copyright (c) 1998-2020 iText Group NV
    Authors: iText Software.

    For more information, please contact iText Software at this address:
    sales@itextpdf.com
 */
/**
 * Example written by Bruno Lowagie in answer to:
 * http://stackoverflow.com/questions/27063677/use-of-relative-path-for-anchor-method-using-itext-for-pdf-generation
 */
package com.itextpdf.samples.sandbox.collections;

import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.kernel.pdf.collection.PdfCollection;
import com.itextpdf.kernel.pdf.filespec.PdfFileSpec;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.Paragraph;

import java.io.File;
import java.io.IOException;

public class PortableCollection {
    public static final String DEST = "./target/sandbox/collections/portable_collection.pdf";

    public static final String DATA = "./src/test/resources/data/united_states.csv";
    public static final String HELLO = "./src/test/resources/pdfs/hello.pdf";
    public static final String IMG = "./src/test/resources/img/berlin2013.jpg";

    public static void main(String[] args) throws Exception {
        File file = new File(DEST);
        file.getParentFile().mkdirs();

        new PortableCollection().manipulatePdf(DEST);
    }

    protected void manipulatePdf(String dest) throws Exception {
        PdfDocument pdfDoc = new PdfDocument(new PdfWriter(dest));
        Document doc = new Document(pdfDoc);
        doc.add(new Paragraph("Portable collection"));

        PdfCollection collection = new PdfCollection();
        collection.setView(PdfCollection.TILE);
        pdfDoc.getCatalog().setCollection(collection);

        addFileAttachment(pdfDoc, DATA, "united_states.csv");
        addFileAttachment(pdfDoc, HELLO, "hello.pdf");
        addFileAttachment(pdfDoc, IMG, "berlin2013.jpg");

        doc.close();
    }

    // This method adds file attachment to the pdf document
    private void addFileAttachment(PdfDocument document, String attachmentPath, String fileName) throws IOException {
        String embeddedFileName = fileName;
        String embeddedFileDescription = fileName;
        String fileAttachmentKey = fileName;

        // the 5th argument is the mime-type of the embedded file;
        // the 6th argument is the AFRelationship key value.
        PdfFileSpec fileSpec = PdfFileSpec.createEmbeddedFileSpec(document, attachmentPath, embeddedFileDescription,
                embeddedFileName, null, null);
        document.addFileAttachment(fileAttachmentKey, fileSpec);
    }
}
/*
    This file is part of the iText (R) project.
    Copyright (c) 1998-2020 iText Group NV
    Authors: iText Software.

    For more information, please contact iText Software at this address:
    sales@itextpdf.com
 */
using System;
using System.IO;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Collection;
using iText.Kernel.Pdf.Filespec;
using iText.Layout;
using iText.Layout.Element;

namespace iText.Samples.Sandbox.Collections
{
    public class PortableCollection
    {
        public static readonly String DEST = "results/sandbox/collections/portable_collection.pdf";

        public static readonly String DATA = "../../../resources/data/united_states.csv";
        public static readonly String HELLO = "../../../resources/pdfs/hello.pdf";
        public static readonly String IMG = "../../../resources/img/berlin2013.jpg";

        public static void Main(String[] args)
        {
            FileInfo file = new FileInfo(DEST);
            file.Directory.Create();

            new PortableCollection().ManipulatePdf(DEST);
        }

        protected void ManipulatePdf(String dest)
        {
            PdfDocument pdfDoc = new PdfDocument(new PdfWriter(dest));
            Document doc = new Document(pdfDoc);
            doc.Add(new Paragraph("Portable collection"));

            PdfCollection collection = new PdfCollection();
            collection.SetView(PdfCollection.TILE);
            pdfDoc.GetCatalog().SetCollection(collection);

            AddFileAttachment(pdfDoc, DATA, "united_states.csv");
            AddFileAttachment(pdfDoc, HELLO, "hello.pdf");
            AddFileAttachment(pdfDoc, IMG, "berlin2013.jpg");

            doc.Close();
        }

        // This method adds file attachment to the pdf document
        private void AddFileAttachment(PdfDocument document, String attachmentPath, String fileName)
        {
            String embeddedFileName = fileName;
            String embeddedFileDescription = fileName;
            String fileAttachmentKey = fileName;

            // the 5th argument is the mime-type of the embedded file;
            // the 6th argument is the AFRelationship key value.
            PdfFileSpec fileSpec = PdfFileSpec.CreateEmbeddedFileSpec(document, attachmentPath, embeddedFileDescription,
                embeddedFileName, null, null);
            document.AddFileAttachment(fileAttachmentKey, fileSpec);
        }
    }
}
You can download this example PDF portfolio here.
Creating a PDF portfolio with a custom cover page list
Alternatively, you can use the embedded compiler below to generate your own PDF portfolio with iText. In this example, we'll demonstrate an additional benefit of using iText to create a PDF portfolio: you can generate a cover page which lists all the files contained within the portfolio.
To change the files/description, you just need to edit the following code:
Map<String, String> portfolioEntries = Stream.of(new String[][] {
        { "/uploads/test.docx", "My word document" }
}).collect(Collectors.toMap(data -> data[0], data -> data[1]));
If you don't want to change anything else, simply click the Upload File button to choose a file to upload, change the filename/file type specified in "/uploads/test.docx" to match your document’s name, and then click Execute to run. You can then download the resulting PDF portfolio.
To remove an uploaded file, click the x displayed next to the file name.
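If you add several files to the entry map shown above, it may help to know that the two-argument Collectors.toMap makes no guarantee about the type or iteration order of the map it returns. Assuming the cover page lists files in the order the map is iterated (we have not confirmed this for the embedded sample), a variant that collects into a LinkedHashMap keeps the entries in the order you wrote them. The class name PortfolioEntries and every path other than "/uploads/test.docx" are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
final class PortfolioEntries {

    // Builds a path -> description map that preserves the order of the rows
    // and rejects the same path being listed twice.
    static Map<String, String> of(String[][] rows) {
        return Arrays.stream(rows).collect(Collectors.toMap(
                row -> row[0],
                row -> row[1],
                (first, second) -> {
                    throw new IllegalArgumentException("Duplicate file path in portfolio entries");
                },
                LinkedHashMap::new));
    }

    public static void main(String[] args) {
        Map<String, String> portfolioEntries = of(new String[][] {
                { "/uploads/test.docx", "My word document" },
                { "/uploads/budget.xlsx", "Project budget" },   // hypothetical file
                { "/uploads/site-plan.jpg", "Site plan photo" } // hypothetical file
        });
        // Iteration follows the order in which the rows were listed.
        portfolioEntries.forEach((path, description) ->
                System.out.println(path + " : " + description));
    }
}
```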
Conclusion
We hope you have found this deep-dive into PDF portfolios useful. Even though the feature has been an established part of the PDF specification for over 10 years, it's something that's often overlooked when looking to combine files together in PDF and we think many people could benefit from using them.