I have a program that calls PdfPageLabels.getPageLabelFormats() and PdfPageLabels.getPageLabels() on the same PdfReader object on successive lines of my code:
PdfPageLabels.PdfPageLabelFormat[] pplf = PdfPageLabels.getPageLabelFormats(reader);
String[] labs = PdfPageLabels.getPageLabels(reader);
I have an example: a 150 MB PDF file that appears to have 4670 labels via getPageLabels(), but only 1 via getPageLabelFormats().
So my question is: Under what circumstances could the two calls return arrays of different lengths?
The difference between both methods is simple:
getPageLabels() returns the label of every page in an array. If your PDF has 4670 pages, you will get an array with 4670 String values.
getPageLabelFormats() returns an array with the formats that are used in the document. It doesn't return a PdfPageLabelFormat instance for every page. In many cases, there is only one page label format used throughout the document.
For example, suppose you have a document with an intro of five pages, numbered i, ii, iii, iv and v, followed by a hundred pages numbered 1 to 100.
In this case, getPageLabels() should return an array with 105 String values. The getPageLabelFormats() method, however, will only return two PdfPageLabelFormat values because we are only using two page label formats:
one saying that the first physical page starts a range of lowercase Roman numerals, starting with i.
one saying that the sixth physical page starts a range of Arabic numerals, starting with 1.
Only the start of each format is needed: physical pages 2 to 5 have the same format as physical page 1; physical pages 7 to 105 have the same format as physical page 6.
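To make the relationship between the two arrays concrete, here is a minimal sketch in plain Java that does not use iText at all. The LabelRange class and the expand() and toRoman() helpers are hypothetical names invented for this illustration; they only mimic the idea of a page label format (a start page, a numbering style, and a first number) and are not how iText implements it. The sketch assumes the ranges are sorted by physical page and that the first range starts at physical page 1.

```java
class PageLabelSketch {

    // Hypothetical stand-in for one page label format:
    // the physical page where it starts, its style, and its first number.
    static class LabelRange {
        final int physicalPage;  // 1-based physical page where the range begins
        final boolean roman;     // true = lowercase Roman, false = Arabic
        final int logicalStart;  // number shown on the first page of the range

        LabelRange(int physicalPage, boolean roman, int logicalStart) {
            this.physicalPage = physicalPage;
            this.roman = roman;
            this.logicalStart = logicalStart;
        }
    }

    // Converts a positive integer to lowercase Roman numerals.
    static String toRoman(int n) {
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"m", "cm", "d", "cd", "c", "xc", "l", "xl", "x", "ix", "v", "iv", "i"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.toString();
    }

    // Expands a few ranges into one label per physical page.
    // Assumes ranges are sorted and the first one starts at page 1.
    static String[] expand(LabelRange[] ranges, int pageCount) {
        String[] labels = new String[pageCount];
        int r = 0;
        for (int page = 1; page <= pageCount; page++) {
            // Switch to the next range once its first physical page is reached.
            while (r + 1 < ranges.length && ranges[r + 1].physicalPage <= page) {
                r++;
            }
            int number = ranges[r].logicalStart + (page - ranges[r].physicalPage);
            labels[page - 1] = ranges[r].roman ? toRoman(number) : Integer.toString(number);
        }
        return labels;
    }

    public static void main(String[] args) {
        LabelRange[] ranges = {
            new LabelRange(1, true, 1),   // pages 1-5: i, ii, iii, iv, v
            new LabelRange(6, false, 1)   // pages 6-105: 1 to 100
        };
        String[] labels = expand(ranges, 105);
        System.out.println(ranges.length + " formats, " + labels.length + " labels");
        System.out.println(labels[0] + " " + labels[4] + " " + labels[5] + " " + labels[104]);
    }
}
```

Two range entries are enough to produce all 105 labels, which is the same reason a 4670-page file with a single numbering scheme can report 4670 labels but only 1 format.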