Extracting Text, Tables, and Mathematical Formulas with OCR

When reading PDF documents and books, you will often encounter scanned files, and extracting text from them is troublesome. Even in ordinary PDF files, tables and math formulas are difficult to copy and extract.

To solve this problem, Lattics uses an advanced AI algorithm. By analyzing the structure of the document, it can extract text, tables, math formulas, code blocks, and graphics from PDF documents with high accuracy. The recognized content is automatically converted to Lattics cards, where the text, tables, math formulas, and code blocks can be edited.

Using OCR is straightforward. While reading a PDF file, select the entire page or part of it, then choose the OCR option in the pop-up menu. The AI algorithm recognizes and extracts the selected content and saves it as a card. The PDF document metadata is saved to the card as well, including the title of the paper or book, the author, the publication date, and the page number of the excerpt. This metadata becomes the reference information for the bibliography. Some PDF files do not carry document metadata; in that case, you can fill in the metadata manually in the extended interface of the card.
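
For readers who think in terms of data, the metadata described above can be pictured as a small record attached to each card. The following Python sketch is purely illustrative: the class name, field names, and reference format are assumptions made for this example and do not describe Lattics' actual storage format or any Lattics API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExcerptMetadata:
    """Hypothetical record of the metadata saved with an OCR card."""
    title: Optional[str] = None      # name of the paper or book
    author: Optional[str] = None
    published: Optional[str] = None  # publication date, kept as free text
    page: Optional[int] = None       # page number of the excerpt

    def missing_fields(self) -> List[str]:
        # Fields the PDF did not supply and that would need manual entry.
        return [name for name, value in vars(self).items() if value is None]

    def reference(self) -> str:
        # Join whatever is known into a simple reference line
        # (the format is an assumption, not a Lattics citation style).
        parts = [
            self.author,
            f"({self.published})" if self.published else None,
            self.title,
            f"p. {self.page}" if self.page is not None else None,
        ]
        return ", ".join(p for p in parts if p)


# A PDF without a publication date: the gap is reported, then filled by hand.
meta = ExcerptMetadata(title="Example Book", author="A. Author", page=12)
print(meta.missing_fields())
meta.published = "2020"
print(meta.reference())
```

The sketch mirrors the workflow in the text: whatever the PDF provides is stored with the excerpt, and any field left empty is the part you would supplement manually.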

Note:

Lattics' OCR uses an AI algorithm, so some recognition errors can still occur. The algorithm is specifically trained for high-precision recognition of academic papers, business documents, and printed books; other types of documents may have more recognition errors.
OCR supports 41 languages, including English, French, German, Japanese, Korean, Italian, Spanish, Portuguese, Simplified Chinese, Traditional Chinese, Russian, Ukrainian, Dutch, Swedish, Polish, Turkish, Hungarian, Latin, and Indonesian.
Vertical text in Simplified Chinese, Traditional Chinese, and Japanese documents will also be supported in the future.