Why do the X,Y coordinates (returned by iText) differ for the same text sitting in the same spot on different PDF pages?

Good morning, everyone! I've encountered an issue with iText that might be a bug and am posting it here as suggested by the iText guidelines.

If we look at this file - South Africa Tariff - we'll see that its pages are in landscape orientation (Rotation=90) and show the same text (Date: 2022-04-29) in the same spot (the top-left corner of the page). For this test, pages 1, 2 and 3 suffice.

Now, if we implement our own TextExtractionStrategy and inspect the X,Y coordinates of the character "D" (from the above-mentioned text), we'll see that they differ from page to page:

  • p.1 (X=32.07, Y=36.0),
  • p.2 (X=559.0, Y=32.07),
  • p.3 (X=562.93, Y=559.0),

where:

  • X is retrieved from getDescentLine().getStartPoint().get(0),
  • Y is retrieved from getDescentLine().getStartPoint().get(1).

All 3 pages have the same CropBox [0,0,595,842].
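
One observation from the numbers themselves, offered as arithmetic rather than a diagnosis: each page's values appear to match the previous page's values after one more application of the map (x, y) -> (595 - y, x), where 595 is the CropBox width. The sketch below checks this in plain Java with no iText dependency. The class name RotationCheck and the helper rotate are hypothetical names made up for this illustration, and the choice of that particular map is an assumption based only on the three data points above.

```java
// Hypothetical helper for illustration only; it does not call iText.
class RotationCheck {

    // Assumed map: (x, y) -> (width - y, x), i.e. a quarter turn
    // followed by a shift by the CropBox width.
    static double[] rotate(double[] p, double width) {
        return new double[] { width - p[1], p[0] };
    }

    public static void main(String[] args) {
        double width = 595.0;                  // CropBox width from the post
        double[] page1 = { 32.07, 36.0 };      // "D" as reported on p.1
        double[] page2 = rotate(page1, width); // compare with p.2: (559.0, 32.07)
        double[] page3 = rotate(page2, width); // compare with p.3: (562.93, 559.0)
        System.out.printf("p.2 predicted: X=%.2f, Y=%.2f%n", page2[0], page2[1]);
        System.out.printf("p.3 predicted: X=%.2f, Y=%.2f%n", page3[0], page3[1]);
    }
}
```

If the predicted values agree with the reported ones, the page-to-page difference looks like a rotation being applied once more on each page rather than random drift, which may help narrow down where to look.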

I would expect to see the same X,Y across all pages. Other products - xpdf, PDFBox and iTextSharp - do show consistent coordinates for "D" on all pages.

Does this look like a bug in iText to you? If anyone from the iText team picks this up and needs more info, I'd be happy to help.

Kind regards, Sit Anko.
