An ingested PDF does not have transcript metadata even though the collection has a Full Text Search metadata field.

Applies to

CONTENTdm
Project Client

Answer

This happens because the PDF being ingested contains no embedded text. PDF files created by scanning physical documents often behave this way: the scanning process creates an image of each page and does not automatically run Optical Character Recognition (OCR) on those images.

To embed text, process the PDF with OCR software before ingesting the file into Project Client. Alternatively, export the PDF as a series of image files and ingest those images as a compound object so that Project Client's integrated OCR functionality can process them.
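Before ingesting, it can help to confirm whether a PDF actually contains embedded text. The following is a minimal sketch, not a CONTENTdm or Project Client feature. It assumes the third-party pypdf library is installed, and the function names (pages_missing_text, extract_page_texts) are hypothetical names chosen for illustration.

```python
from typing import Iterable, List, Optional


def pages_missing_text(page_texts: Iterable[Optional[str]], min_chars: int = 1) -> List[int]:
    """Return the 1-based numbers of pages whose extracted text,
    after trimming whitespace, is shorter than min_chars."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len((text or "").strip()) < min_chars
    ]


def extract_page_texts(pdf_path: str) -> List[str]:
    """Extract the embedded text of each page using pypdf.

    Assumption: pypdf is installed (pip install pypdf). The import is
    kept inside the function so pages_missing_text works without it.
    """
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() reads embedded text only; it does not perform OCR,
    # so an image-only scanned page yields an empty string.
    return [page.extract_text() or "" for page in reader.pages]
```

For example, pages_missing_text(extract_page_texts("scan.pdf")), where scan.pdf is a placeholder file name, returns the page numbers that have no embedded text. If every page is listed, the PDF most likely needs OCR before ingest.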

For more on adding PDF files in CONTENTdm, see Work with PDF files.
