PDF to Markdown
OCR · Scanned PDF

Scanned PDF to Markdown

Extract text from scanned or image-only PDFs and get clean Markdown. OCR runs entirely in your browser — your file never leaves your device.

100% private — files never leave your browser

Image or scanned PDF · OCR in your browser · Max 50 MB

The first run downloads a language model (roughly 2–15 MB). After that, OCR works offline; your file is always processed locally.

Built-in OCR

Reads text from scanned and image-only PDFs that ordinary converters can't handle, using Tesseract running in your browser.

100+ languages

Recognize English, Chinese, Japanese, Korean, Spanish, French, German, Russian and many more.

Private by design

OCR runs locally via WebAssembly. Only the language model is fetched — your document never leaves your device.

Frequently asked questions

What is a scanned PDF?

A scanned or image-only PDF has no selectable text layer; it is essentially a set of page pictures. Converters that only read the text layer return nothing, so OCR is needed to read the text.
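
As a rough illustration of that distinction, a converter could decide whether a page needs OCR by checking how much selectable text its text layer yields. The sketch below is an assumption, not this tool's actual logic: the `PageText` shape, the function names, and the 20-character threshold are all hypothetical.

```typescript
// Hypothetical shape: the text a PDF parser extracted from one page's text layer.
interface PageText {
  pageNumber: number;
  text: string;
}

// Assumed threshold: pages with fewer non-whitespace characters are treated as image-only.
const MIN_CHARS_PER_PAGE = 20;

// Returns the numbers of the pages that appear to have no usable text layer.
function pagesNeedingOcr(pages: PageText[], minChars: number = MIN_CHARS_PER_PAGE): number[] {
  return pages
    .filter((page) => page.text.replace(/\s+/g, "").length < minChars)
    .map((page) => page.pageNumber);
}

// In this sketch a document counts as image-only when every page lacks a text layer.
function isImageOnly(pages: PageText[]): boolean {
  return pages.length > 0 && pagesNeedingOcr(pages).length === pages.length;
}
```

A mixed document, where only some pages are scans, would send just the pages returned by `pagesNeedingOcr` through OCR and read the rest directly.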

Is the OCR free and private?

Yes. Recognition runs entirely in your browser using WebAssembly. Only the language model is downloaded from a CDN; your PDF is never uploaded.

Which languages are supported?

Before you start, choose from English, Simplified and Traditional Chinese, Japanese, Korean, Spanish, French, German, Russian and more.
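
For readers curious how a language choice reaches the OCR engine: Tesseract identifies each language model by a short code and accepts several codes joined with `+`. The sketch below maps the languages named above to their standard Tesseract codes. The `LANGUAGE_CODES` table and the `toTesseractLangs` helper are hypothetical names used for illustration, not this tool's code.

```typescript
// Standard Tesseract traineddata codes for the languages named above.
const LANGUAGE_CODES: Record<string, string> = {
  "English": "eng",
  "Simplified Chinese": "chi_sim",
  "Traditional Chinese": "chi_tra",
  "Japanese": "jpn",
  "Korean": "kor",
  "Spanish": "spa",
  "French": "fra",
  "German": "deu",
  "Russian": "rus",
};

// Builds a Tesseract language string such as "eng+jpn" from display names.
function toTesseractLangs(names: string[]): string {
  const codes = names.map((name) => {
    const code = LANGUAGE_CODES[name];
    if (code === undefined) {
      throw new Error(`Unsupported language: ${name}`);
    }
    return code;
  });
  // Drop duplicates while keeping the order in which languages were chosen.
  return Array.from(new Set(codes)).join("+");
}
```

Choosing only the languages a document actually contains keeps the model download small.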

Why is OCR slower than a normal conversion?

OCR analyzes each page image pixel by pixel to recognize characters, which is heavier than reading an existing text layer. Larger documents take longer.
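
To give a sense of scale, the sketch below estimates how many pixels OCR has to examine. It assumes pages are rendered to images at a fixed resolution; the 300 DPI figure and the function names are illustrative assumptions, not this tool's settings.

```typescript
// PDF page sizes are measured in points; 72 points equal one inch.
const POINTS_PER_INCH = 72;

// Number of pixels in one page rendered at the given resolution.
function pagePixelCount(widthPt: number, heightPt: number, dpi: number): number {
  const widthPx = Math.round((widthPt / POINTS_PER_INCH) * dpi);
  const heightPx = Math.round((heightPt / POINTS_PER_INCH) * dpi);
  return widthPx * heightPx;
}

// Total pixels for a document whose pages all share the same size.
function documentPixelCount(pageCount: number, widthPt: number, heightPt: number, dpi: number): number {
  return pageCount * pagePixelCount(widthPt, heightPt, dpi);
}

// US Letter is 612 x 792 points (8.5 x 11 inches).
const letterAt300 = pagePixelCount(612, 792, 300); // 2550 * 3300 = 8,415,000 pixels
```

Under these assumptions a ten-page Letter document comes to roughly 84 million pixels, which is why page count dominates processing time.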
