I have approximately 1,000 image-based PDFs and am trying to find out whether there is an automated way to identify redactions made to the page images, understood here as blacked-out areas of the image.
Think for example of redactions in Freedom of Information requests: https://www.muckrock.com/news/archives/2016/mar/14/muckrocks-redaction-hall-shame/
Any suggestions would be very welcome.
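To make the idea concrete, here is a rough sketch of the kind of check I have in mind. It is not a tested solution. It assumes each page has already been rendered to a grayscale pixel grid (a 2D list of 0-255 values) by some other tool, and the function name `find_redaction_boxes` and the thresholds `dark_max`, `min_area`, and `min_fill` are placeholders I made up for illustration. The approach is to group dark pixels into connected regions and keep only regions that are large and almost completely fill their bounding box, which is what a solid redaction rectangle should look like.

```python
from collections import deque


def find_redaction_boxes(gray, dark_max=40, min_area=200, min_fill=0.9):
    """Return (top, left, bottom, right) boxes of large, solid dark regions.

    gray is a 2D list of 0-255 grayscale values for one page image.
    All thresholds are illustrative guesses and need tuning on real scans.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or gray[y][x] > dark_max:
                continue
            # Flood-fill one 4-connected region of dark pixels.
            seen[y][x] = True
            queue = deque([(y, x)])
            count = 0
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                count += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and gray[ny][nx] <= dark_max):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Keep regions that are big and nearly fill their bounding box.
            box_area = (bottom - top + 1) * (right - left + 1)
            if count >= min_area and count / box_area >= min_fill:
                boxes.append((top, left, bottom, right))
    return boxes


# Synthetic 100x100 white page with one black bar and one thin dark line.
page = [[255] * 100 for _ in range(100)]
for row in range(20, 40):
    for col in range(10, 60):
        page[row][col] = 0          # 20x50 solid block, redaction-like
for col in range(10, 41):
    page[70][col] = 0               # 1-pixel line, too small to count
boxes = find_redaction_boxes(page)
print(boxes)
```

On the synthetic page only the solid block is reported and the thin line is ignored. Real scans are noisier (gray bars, skew, JPEG artifacts), so the thresholds would probably need adjusting, and a pure-Python loop like this would be slow on full-resolution pages.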
Here is the process for redacting documents in Adobe Acrobat DC, which might be useful for thinking through how the PDFs were created. As for identifying redaction boxes, I asked around for you but am still waiting for an answer: