Document Image Recognition (PDF Recognition)
SimpleTex document image recognition currently supports Chinese and English. It is the OCR interface used for PDF recognition in the formula and chart enhanced mode. [Note: This interface may change at any time and is currently for testing and reference only]
API Call Method
Lightweight model API address: https://server.simpletex.net/api/doc_ocr
Model version:
SimpleTex Doc OCR V1
Interface method:
POST
Request parameters:
Header: Authentication parameters (UAT or APP information)
Body: Multipart/form-data
Other notes: Currently, this interface only supports uploading one file at a time and does not support batch calls.
Parameter details
Parameter Name | Parameter Type | Required | Description | Example |
---|---|---|---|---|
file | File | Yes | Binary image file of a single PDF page, in png or jpg format. Batch processing is not supported; only one image can be uploaded per request | / |
inline_formula_wrapper | String JSON array | No | Used to modify the wrapper symbols for inline formulas in markdown. Enter in JSON format. If the format is incorrect, default wrapper symbols will be used | ["$","$"] |
isolated_formula_wrapper | String JSON array | No | Used to modify the wrapper symbols for isolated line formulas in markdown. Enter in JSON format. If the format is incorrect, default wrapper symbols will be used | ["$$","$$"] |
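As an illustration of the two optional wrapper parameters, the sketch below builds the extra form fields as JSON-encoded strings. The helper name `build_wrapper_fields` and the LaTeX-style delimiters are assumptions chosen for this example; only the field names and the JSON-array format come from the table above.

```python
import json


def build_wrapper_fields(inline=None, isolated=None):
    """Return form fields for the optional formula-wrapper parameters.

    Each argument is an (opening, closing) pair of delimiter strings, or None
    to leave that parameter unset so the service uses its default.
    """
    fields = {}
    for name, pair in (("inline_formula_wrapper", inline),
                       ("isolated_formula_wrapper", isolated)):
        if pair is None:
            continue
        opening, closing = pair  # raises ValueError unless exactly two items
        fields[name] = json.dumps([opening, closing])
    return fields


# Hypothetical choice: LaTeX-style \( \) and \[ \] instead of $ and $$.
fields = build_wrapper_fields(inline=("\\(", "\\)"), isolated=("\\[", "\\]"))
print(fields)
```

The resulting dictionary could be passed as `data=fields` alongside `files=` in `requests.post`, so that the values travel as ordinary multipart form fields.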
Pricing
Monthly API Calls (times) | Price (CNY/page) |
---|---|
<1000 | Free |
1000+ | 0.02 |
Concurrency Limits | Default Free Quota |
---|---|
Request Processing Concurrency | 1 |
Normal Request QPS | 1 |
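With the default quota of one request per second and one request in flight, a client that loops over many pages may want to space out its calls. The sketch below shows one minimal way to do that on the client side. The class name `MinIntervalThrottle` and its injectable clock are assumptions for illustration and are not part of the SimpleTex API.

```python
import time


class MinIntervalThrottle:
    """Block so that consecutive calls start at least `interval` seconds apart."""

    def __init__(self, interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock   # injectable so the behaviour can be tested
        self._sleep = sleep
        self._last = None     # start time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage: call throttle.wait() immediately before each upload, one page at a time.
throttle = MinIntervalThrottle(interval=1.0)
```

Calling the uploads sequentially from a single thread also keeps the client within the processing concurrency of 1.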
Response Example
Single file upload
{
  "status": true,  // Whether the API call was successful
  "res": {  // Call result
    "content": "..."  // Markdown content
  },
  "request_id": "tr_16755479007123063412063155819"  // Request ID
}
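A client will usually want to check `status` before reading `res.content`. The sketch below shows one way to do this for a decoded response. The names `DocOcrError` and `extract_markdown` are hypothetical, and since the failure format is not documented here, the code simply assumes that a failed call has `status` set to something other than `true`.

```python
import json


class DocOcrError(RuntimeError):
    """Raised when a response does not look like a successful call."""


def extract_markdown(payload):
    """Return res.content from a decoded response, or raise DocOcrError."""
    if not isinstance(payload, dict) or payload.get("status") is not True:
        request_id = payload.get("request_id") if isinstance(payload, dict) else None
        # Assumption: an unsuccessful call is signalled by status != true.
        raise DocOcrError(f"doc_ocr call failed (request_id={request_id})")
    try:
        return payload["res"]["content"]
    except (KeyError, TypeError) as exc:
        raise DocOcrError("response has no res.content field") from exc


# Example with a response shaped like the sample above (comments removed).
sample = json.loads(
    '{"status": true, "res": {"content": "# Title"}, "request_id": "tr_1"}'
)
print(extract_markdown(sample))
```

Including the `request_id` in the error message makes it easier to refer to a specific failed call later.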
Sample Code
The following code converts a PDF file to a Markdown file. It uses PyMuPDF to read the PDF and render its pages, Pillow (PIL) for image processing, requests to upload each page image, and tqdm to display a progress bar.
Install the required libraries (PyMuPDF, requests, Pillow, tqdm) with pip:
pip install PyMuPDF requests Pillow tqdm
Detailed code
import io

import fitz  # PyMuPDF
import requests
from PIL import Image
from tqdm import tqdm

UAT = "xxxxx"  # User Authorization Token


def pillow_image_to_file_binary(image):
    """Serialize a Pillow image to PNG bytes."""
    bytes_io = io.BytesIO()
    image.save(bytes_io, format='PNG')
    return bytes_io.getvalue()


def convert_pdf_to_images(pdf_binary, dpi=100):
    """Render every page of the PDF to a Pillow image."""
    doc = fitz.open(stream=pdf_binary, filetype="pdf")
    images = []
    for i in range(doc.page_count):
        page = doc[i]
        pixmap = page.get_pixmap(dpi=dpi)
        image = Image.frombytes("RGB", (pixmap.width, pixmap.height), pixmap.samples)
        images.append(image)
    return images


def pdf_ocr(image):
    """Upload one page image and return the recognized Markdown."""
    api_url = "https://server.simpletex.net/api/doc_ocr/"
    header = {"token": UAT}  # Authentication information, using UAT method here
    img_file = {"file": pillow_image_to_file_binary(image)}
    res = requests.post(api_url, files=img_file, data={}, headers=header).json()  # Use requests library to upload files
    print(res)
    return res["res"]["content"]


if __name__ == '__main__':
    pdf_path = 'test.pdf'  # Input PDF file
    with open(pdf_path, 'rb') as f:
        file_binary = f.read()
    images = convert_pdf_to_images(file_binary)
    final_markdown_content = ""
    for image in tqdm(images):  # One request per page; batch upload is not supported
        final_markdown_content += pdf_ocr(image) + "\n"
    with open("test.md", "w", encoding="utf-8") as f:  # Save the final markdown file
        f.write(final_markdown_content)
    print(final_markdown_content)  # Output the final markdown content
We plan to add an asynchronous service for uploading PDF files directly. During the current testing period, each user automatically receives 1000 free recognitions per day.