We have released a new version (6.0.251) of the Solid Framework SDK. It now includes full support for image processing and optical character recognition (OCR), which allows scanned PDF files to be converted to editable Word documents. Solid Framework uses the Microsoft Office Document Imaging (MODI) API, part of Microsoft Office, to provide the OCR capability.
When converting PDF files to Office documents, you can control when OCR is used by setting TextRecoveryType to one of the following values:
Always - All pages are rendered to images and processed as scanned pages.
Automatic - Pages that contain scanned, text-like images are detected automatically and processed with OCR.
Default - Same as Automatic.
Never - No scanned page processing. Scanned pages are converted as images.
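To make the difference between the four settings concrete, here is a minimal Python sketch of the decision they describe. It is not Solid Framework code: the enum only mirrors the setting names listed above, and should_run_ocr and its looks_scanned argument are hypothetical names introduced for illustration. How the SDK actually detects scanned pages is not covered here.

```python
from enum import Enum


class TextRecoveryType(Enum):
    """Hypothetical mirror of the four settings above (not the SDK's own type)."""
    ALWAYS = "Always"
    AUTOMATIC = "Automatic"
    DEFAULT = "Default"
    NEVER = "Never"


def should_run_ocr(setting: TextRecoveryType, looks_scanned: bool) -> bool:
    """Return True if a page would go through OCR under the given setting.

    looks_scanned is an assumed input: the outcome of whatever check
    decides that a page contains scanned, text-like images.
    """
    if setting is TextRecoveryType.ALWAYS:
        return True  # every page is rendered to an image and recognized
    if setting is TextRecoveryType.NEVER:
        return False  # scanned pages are kept as images
    # Automatic and Default behave the same: OCR only pages detected as scanned
    return looks_scanned
```

Under this reading, Always is the only setting that sends an ordinary text page through OCR, and Never is the only one that leaves a scanned page as an image.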