How to find internal links in a PDF file?

I am using iTextSharp to find internal links in a PDF file. I already have this working for external links:

 

//Get the current page
PdfDictionary PageDictionary = R.GetPageN(page);
//Get all of the annotations for the current page
PdfArray Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);
//Make sure we have something
if ((Annots == null) || (Annots.Size == 0)) {
    Console.WriteLine("nothing");
}
//Loop through each annotation
if (Annots != null) {
    foreach (PdfObject A in Annots.ArrayList) {
        //Convert the itext-specific object as a generic PDF object
        PdfDictionary AnnotationDictionary =
            (PdfDictionary)PdfReader.GetPdfObject(A);
        //Make sure this annotation has a link
        if (!AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.LINK))
            continue;
        //Make sure this annotation has an ACTION
        if (AnnotationDictionary.Get(PdfName.A) == null)
            continue;
        //Get the ACTION for the current annotation
        PdfDictionary AnnotationAction =
            AnnotationDictionary.GetAsDict(PdfName.A);
        // Test if it is a URI action (there are many other types of actions,
        // some of which might mimic URI, such as JavaScript,
        // but those need to be handled separately)
        if (AnnotationAction.Get(PdfName.S).Equals(PdfName.URI)) {
            PdfString Destination = AnnotationAction.GetAsString(PdfName.URI);
            string url1 = Destination.ToString();
        }
    }
}
Posted on StackOverflow on Feb 22, 2014 by Ashwani

You've already done most of the work.

In iText 7 for Java, your code would look like this:

//Get the current page
PdfPage pdfPage = pdfDoc.getPage(page);
//Get all of the annotations for the current page
List<PdfAnnotation> annots = pdfPage.getAnnotations();
//Make sure we have something
if ((annots == null) || (annots.size() == 0)) {
    System.out.println("nothing");
}
//Loop through each annotation
else {
    for (PdfAnnotation a : annots) {
        //Make sure this annotation is a link; skip everything else
        if (!PdfName.Link.equals(a.getSubtype()))
            continue;
        //Make sure this annotation has an ACTION (the /A entry of its dictionary)
        PdfDictionary annotAction = a.getPdfObject().getAsDictionary(PdfName.A);
        if (annotAction != null) {
            //annotAction now holds the ACTION for the current annotation
            // Test if it is a URI action (there are many other types of actions,
            // some of which might mimic URI, such as JavaScript,
            // but those need to be handled separately)
            if (annotAction.get(PdfName.S).equals(PdfName.URI) ||
                annotAction.get(PdfName.S).equals(PdfName.GoToR)) {
                    //do something with external links
                    //(only URI actions have a /URI entry; it is absent for GoToR)
                    PdfString destination = annotAction.getAsString(PdfName.URI);
                    if (destination != null) {
                        String url1 = destination.toString();
                    }
            }
            else if (annotAction.get(PdfName.S).equals(PdfName.GoTo) ||
                annotAction.get(PdfName.S).equals(PdfName.GoToE)) {
                    //do something with internal links
            }
        }
    }
}

As you can see, you don't need to fetch the array of annotations yourself and convert each annotation object to a PdfDictionary, as was necessary in iText 5. Just use the built-in methods.

Please take a look at the following screenshot:

[Screenshot: internal view of the PDF]

You see the /Annots array of a page. You are already parsing that array in your code, and you skip every annotation that doesn't have /Subtype /Link or that lacks an /A key, which is excellent.
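
The filter described above can be sketched without any PDF library. This is only an illustration under stated assumptions: each annotation dictionary is modeled as a plain `Map` whose keys are the PDF names without the leading slash, and `AnnotationFilterSketch` and `linkAnnotationsWithAction` are hypothetical names, not part of iText.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the filter; each Map stands in for one annotation dictionary.
// Hypothetical helper for illustration only, not an iText class.
final class AnnotationFilterSketch {

    // Keeps only the entries that have /Subtype /Link and carry an /A (action) entry.
    static List<Map<String, Object>> linkAnnotationsWithAction(List<Map<String, Object>> annots) {
        List<Map<String, Object>> result = new ArrayList<>();
        if (annots == null) {
            return result; // a page without /Annots yields nothing
        }
        for (Map<String, Object> annot : annots) {
            if (!"Link".equals(annot.get("Subtype"))) {
                continue; // not a link annotation
            }
            if (annot.get("A") == null) {
                continue; // link annotation without an action
            }
            result.add(annot);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> link = new HashMap<>();
        link.put("Subtype", "Link");
        link.put("A", "placeholder for an action dictionary");

        Map<String, Object> note = new HashMap<>();
        note.put("Subtype", "Text");

        List<Map<String, Object>> annots = new ArrayList<>();
        annots.add(link);
        annots.add(note);

        System.out.println("kept: " + linkAnnotationsWithAction(annots).size());
    }
}
```

Note that the PDF specification also allows a link annotation to carry a /Dest entry instead of /A; like the code in the question, this sketch skips such annotations.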

Currently you're only looking for values of /S that are of type /URI. You say you're already done with external links, but that's not true: you should also look for entries where /S is /GoToR (remote goto). If you want internal links, you need to look for /S values equal to /GoTo, /GoToE, and (in the future) /GoToDp. Maybe you also want to remove the /JavaScript actions, because they can also be used to jump to a specific page.
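
As a rough sketch of that classification, the following plain-Java helper maps the name found under /S to a link category. It assumes you have already extracted that name as a string (with or without the leading slash); `LinkActionKinds`, `Kind`, and `classify` are hypothetical names introduced for illustration and are not part of iText.

```java
// Hypothetical helper, not an iText class: sorts the value of /S into categories.
final class LinkActionKinds {

    enum Kind { EXTERNAL, INTERNAL, SCRIPT, OTHER }

    static Kind classify(String actionType) {
        if (actionType == null) {
            return Kind.OTHER;
        }
        // Accept both "/GoTo" and "GoTo".
        String name = actionType.startsWith("/") ? actionType.substring(1) : actionType;
        switch (name) {
            case "URI":
            case "GoToR":
                return Kind.EXTERNAL; // web link or jump to another document
            case "GoTo":
            case "GoToE":
            case "GoToDp":
                return Kind.INTERNAL; // the /S values listed above for internal links
            case "JavaScript":
                return Kind.SCRIPT;   // may also jump to a page; handle separately
            default:
                return Kind.OTHER;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"/URI", "/GoToR", "/GoTo", "/GoToE", "/JavaScript", "/Launch"};
        for (String s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

Keeping the mapping in one place means the loop over the annotations only has to branch on the returned category.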



