OCR PDF
Run OCR on scanned PDFs so you can search, copy, and select the text inside.
Accepted input: scanned PDFs and images (PDF, JPG, PNG, WEBP, TIFF, BMP), up to 30 files at once.
Languages
Choose the main language in your scan. Add a second only for bilingual documents (for example English + Arabic). Maximum two languages.
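For readers who script OCR themselves, the two-language rule above might be sketched like this. It assumes the engine is Tesseract, whose command line accepts several languages joined with "+" in its -l option (for example eng+ara). The function name tesseract_lang_arg and the MAX_LANGUAGES constant are illustrative and not part of this tool.

```python
MAX_LANGUAGES = 2  # mirrors the "maximum two languages" rule described above


def tesseract_lang_arg(languages):
    """Build a value for Tesseract's -l option, e.g. 'eng+ara'.

    `languages` is a list of Tesseract language codes. The helper is a
    hypothetical illustration, not this tool's actual implementation.
    """
    # Strip whitespace, skip empty entries, and drop duplicates in order.
    codes = [code.strip() for code in languages if code.strip()]
    unique = list(dict.fromkeys(codes))

    if not unique:
        raise ValueError("choose at least one language")
    if len(unique) > MAX_LANGUAGES:
        raise ValueError(f"at most {MAX_LANGUAGES} languages are allowed")

    # Put the main language first, then the optional second one.
    return "+".join(unique)


print(tesseract_lang_arg(["eng"]))          # eng
print(tesseract_lang_arg(["eng", "ara"]))   # eng+ara
```

Listing the main language first keeps the argument in the same order as the choice described above.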
About OCR PDF
A scanned PDF looks like a text document but is actually a series of images. You can't select a sentence, copy a number, or search for a word anywhere in it. OCR — optical character recognition — reads those images and adds a hidden text layer behind them. After OCR, the file looks exactly the same but now has real, selectable, searchable text.
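As a rough illustration of that step outside this page, the sketch below builds a Tesseract command line that produces a searchable PDF from a single scanned image. It assumes the tesseract executable is installed and on the PATH. The helper name build_ocr_command and the file name invoice.png are made up for the example. The block only builds and prints the command; it does not run it.

```python
from pathlib import Path


def build_ocr_command(scan_path, lang="eng"):
    """Return the argument list for a Tesseract run that writes a searchable PDF.

    Hypothetical helper: it only assembles the command and runs nothing.
    """
    scan = Path(scan_path)
    # Tesseract takes an output *base* name and appends ".pdf" itself.
    output_base = scan.with_suffix("")
    # The trailing "pdf" selects Tesseract's PDF output: the page image
    # with an invisible text layer placed over it.
    return ["tesseract", str(scan), str(output_base), "-l", lang, "pdf"]


command = build_ocr_command("invoice.png")
print(" ".join(command))  # tesseract invoice.png invoice -l eng pdf
```

To run it for real, pass the list to subprocess.run(command, check=True). Tesseract reads image files, so a scanned PDF would first need its pages exported as images.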
Results depend on scan quality. Bright, high-resolution, straight pages give the engine clear shapes to work from, while dark, tilted, or low-resolution scans produce more misreads. Rotate sideways pages before running OCR: the engine reads upright text far more reliably than text turned 90 or 180 degrees.
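One way to find sideways pages automatically is an orientation check before OCR. The sketch below assumes Tesseract's orientation and script detection mode (tesseract page.png stdout --psm 0), which reports a suggested rotation on a line beginning "Rotate:". The sample report is illustrative, not real output from this tool, and the function name rotation_from_osd is hypothetical.

```python
import re


def rotation_from_osd(osd_text):
    """Extract the suggested rotation in degrees from an OSD report.

    Returns None when the report has no "Rotate:" line. The report
    layout is an assumption based on Tesseract's --psm 0 output.
    """
    match = re.search(r"^Rotate:\s*(\d+)\s*$", osd_text, flags=re.MULTILINE)
    if match is None:
        return None
    return int(match.group(1)) % 360


# Illustrative report for a page scanned sideways.
sample_report = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 2.58
Script: Latin
Script confidence: 1.11
"""

angle = rotation_from_osd(sample_report)
print(angle)                              # 90
print(rotation_from_osd("no data here"))  # None
```

A result of 0 means the page is already upright. Any other value is the rotation to apply with an image tool before running OCR.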
A searchable PDF stays useful long after the scan. You can convert it to Word or Excel for editing, find specific information quickly, and index it in document management systems. Even for a modest collection of scanned documents, OCR turns a folder of opaque image files into an archive you can search.