This document explains how to use the SimpleTex Open Platform services and provides important information. If you have any questions, please feel free to contact the SimpleTex team.
Open Platform API Documentation
I. Open Platform Authentication
- After registering a regular SimpleTex account and enabling the open platform account in the user center, you can access the open platform either through a User Access Token (UAT) or by signing the request body using the APP ID and APP Secret of the open platform application.
- The UAT method is only for development purposes and should not be used in production environments. Additionally, if your application needs to be deployed on the client side rather than on your own server, do not directly expose the core API authentication interfaces within the application, but instead make requests through temporary application authorization tokens.
1. UAT Authentication
Access through User Authorization Token, the simplest and quickest method
How to obtain: Go to the User Center (https://simpletex.net/user/center), create it in the "User Authorization Token" menu
Request method: Place the user authorization token in the request header under the field name token, such as header={"token":"XXXXX"}
Note: To ensure the security of your application and account, please do not use this method for requests in production environments.
2. APP Authentication
Access through the Open Platform application method (APP method, a more secure approach)
How to obtain: Go to the User Center (https://simpletex.net/user/center), after enabling the open platform function, create it in the "Application List" menu. Upon creation, you will receive the APP ID and APP Secret for the new application. Note that the APP Secret is important sensitive information and is only displayed once. If lost, please create a new application.
Request method:
The authentication method for open platform applications requires signing the POST form data (i.e., the form key-value pairs, excluding the binary file part). The signature is computed as follows:
(1) Generate a 16-character random string (digits and uppercase/lowercase letters) and place it in the random-str field of the header.
(2) Get the current timestamp (accurate to the second) and place it in the timestamp field of the header.
(3) Place the APP ID in the app-id field of the header.
(4) Take the remaining keys in the data, sort all keys in ascending string order (a-z), and join each key and its value with &, such as key1=xxx&key2=xxx&...&keyn=xxx (the random-str, timestamp, and app-id fields from steps 1-3 must also be included in this string).
(5) Append the APP Secret to the end of the string from step 4 as &secret=xxx (regardless of the key sort order), giving key1=xxx&key2=xxx&...&keyn=xxx&secret=xxx
(6) Hash the string from step 5 with the MD5 algorithm and take its hexadecimal string representation (32 characters in length).
(7) Place this signature string in the header under the field name sign.
(8) The final header for authentication should have the format: header={ "app-id":xxx, "random-str":xxx, "timestamp":xxx, "sign":xxx }
(Note: Do not include the APP Secret in any other part of the business request. It is only used to generate the signature that proves your identity and must not be included in the request body.)
(9) At this point, all the information required for APP authentication is ready.
(10) Example
- Original data information: use_batch=True (in this example, we use the data shown below)
{ "use_batch": True }
- Obtain the other required information:
{ 'timestamp': '1675550577', 'random-str': 'mSkYSY28N4WkvidB', 'app-id': '19X4f10YM1Va894nvFl89ikY' } // For testing purposes only
- For this example, the APP Secret is fu4Wfmna4153DFN12ctBsPqgVI3vvGGK, so the string to be signed is (fields ordered as app-id, random-str, timestamp, use_batch, secret):
app-id=19X4f10YM1Va894nvFl89ikY&random-str=mSkYSY28N4WkvidB&timestamp=1675550577&use_batch=True&secret=fu4Wfmna4153DFN12ctBsPqgVI3vvGGK
- Computing the MD5 signature of this string gives 5f271e1deccd95d467c7dd430ca2c8b1 (you can use the website http://tool.pfan.cn/md5 for testing, or search for an online MD5 tool)
- The final header information is:
{ 'timestamp': '1675550577', 'random-str': 'mSkYSY28N4WkvidB', 'app-id': '19X4f10YM1Va894nvFl89ikY', 'sign': '5f271e1deccd95d467c7dd430ca2c8b1' }
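The signing steps (1)-(8) can be sketched as a small deterministic helper that takes the random string and timestamp as arguments, which makes it easy to compare against the worked example. This is a minimal sketch, not the official implementation: the function names build_pre_sign_string and sign_headers are hypothetical, and it assumes that Python's default string sort matches the "a-z, ascending" order described above (true for the lowercase keys used here).

```python
import hashlib


def build_pre_sign_string(form_data, app_id, random_str, timestamp, app_secret):
    """Build the string to be signed (steps 1-5). Hypothetical helper name."""
    # Non-file form fields, converted to strings the same way str() renders them
    fields = {str(key): str(value) for key, value in form_data.items()}
    # The three header fields take part in the signature as well
    fields.update({
        "app-id": app_id,
        "random-str": random_str,
        "timestamp": str(timestamp),
    })
    # Sort keys in ascending string order and join as key=value pairs
    pairs = [f"{key}={fields[key]}" for key in sorted(fields)]
    # The secret is always appended last, outside the sort order
    return "&".join(pairs) + "&secret=" + app_secret


def sign_headers(form_data, app_id, random_str, timestamp, app_secret):
    """Return the authentication header dict (steps 6-8)."""
    pre_sign = build_pre_sign_string(form_data, app_id, random_str, timestamp, app_secret)
    sign = hashlib.md5(pre_sign.encode("utf-8")).hexdigest()
    return {
        "app-id": app_id,
        "random-str": random_str,
        "timestamp": str(timestamp),
        "sign": sign,
    }


# Values taken from the worked example above (test credentials only)
example_string = build_pre_sign_string(
    {"use_batch": True},
    "19X4f10YM1Va894nvFl89ikY",
    "mSkYSY28N4WkvidB",
    1675550577,
    "fu4Wfmna4153DFN12ctBsPqgVI3vvGGK",
)
example_headers = sign_headers(
    {"use_batch": True},
    "19X4f10YM1Va894nvFl89ikY",
    "mSkYSY28N4WkvidB",
    1675550577,
    "fu4Wfmna4153DFN12ctBsPqgVI3vvGGK",
)
print(example_string)
print(example_headers["sign"])
```

The printed string can be compared with the string to be signed in the example, and the printed signature with the sign value in the final header.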
II. API Response Information
1. Response Structure
In the API response, status indicates whether the request was successful, other result information is placed in the res field, and request_id contains the ID of this request. The standard return format is:
{
  "status": true/false, // Whether the interface was successfully called
  "res": { // Call result
    ...
  },
  "request_id": "tr_xxxxxxxxxx" // Request ID
}
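A client can check this envelope once, before touching the result. The following is a minimal sketch under stated assumptions: the names SimpleTexAPIError and unwrap_response are hypothetical, and since the layout of res on failure is not described here, the sketch only attaches whatever res contains to the error message.

```python
class SimpleTexAPIError(Exception):
    """Hypothetical exception for responses whose status is not true."""


def unwrap_response(payload):
    """Return (res, request_id) from a decoded response, or raise on failure."""
    if not isinstance(payload, dict) or "status" not in payload:
        raise ValueError("unexpected response shape: missing 'status'")
    request_id = payload.get("request_id")
    if payload["status"] is not True:
        # The content of 'res' on failure is an assumption; it is passed through as-is
        raise SimpleTexAPIError(f"request {request_id} failed: {payload.get('res')}")
    return payload.get("res"), request_id


res, request_id = unwrap_response(
    {"status": True, "res": {"latex": "a^{2}-b^{2}"}, "request_id": "tr_xxxxxxxxxx"}
)
print(res["latex"], request_id)
```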
2. Error Codes
errType Name | HTTP Status Code | Description |
---|---|---|
api_not_find | 404 | API or corresponding version not found |
req_method_error | 405 | Incorrect request method (such as GET, POST) |
req_unauthorized | 401 | Authentication failed (any detailed error will result in this response, please check carefully!) |
resource_no_valid | 402 | No resources available to call the interface, such as no resource package or insufficient account balance |
image_missing | 413 | No image file uploaded |
image_oversize | 413 | Image file too large |
sever_closed | 503 | Server not started/under maintenance |
server_error | 500 | Internal server error |
exceed_max_qps | 429 | Exceeded maximum QPS, please try again later |
exceed_max_ccy | 429 | Exceeded maximum concurrent requests, please try again later |
server_inference_error | 500 | Server inference error |
image_proc_error | 500 | Error processing uploaded image |
invalid_param | 500 | Server error caused by invalid parameters |
too_many_file | 500 | Server error caused by too many files |
no_file_error | 500 | Server error caused by no files |
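When handling these errors in a client, it can help to separate transient conditions from ones that need a code or account change. The sketch below treats 429 (exceed_max_qps, exceed_max_ccy) and 503 (sever_closed) as retryable; that classification and the helper names should_retry and backoff_delays are assumptions for illustration, not behavior specified by the platform.

```python
# HTTP status codes from the table above that are assumed to be transient
RETRYABLE_STATUS = {429, 503}


def should_retry(http_status):
    """Return True when waiting and repeating the request may succeed."""
    return http_status in RETRYABLE_STATUS


def backoff_delays(attempts=4, base_seconds=1.0):
    """Return exponentially growing wait times in seconds, one per retry attempt."""
    return [base_seconds * (2 ** i) for i in range(attempts)]


for status in (401, 429, 503):
    print(status, should_retry(status))
print(backoff_delays(3, 0.5))
```

Errors such as 401 (req_unauthorized) or 402 (resource_no_valid) are left as non-retryable here because repeating the same request would not change the outcome.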
III. Sample Code
- Python sample code
1. UAT Authentication Method
import json
import requests

SIMPLETEX_UAT = "xxxxx"
api_url = "https://server.simpletex.net/xxxxx"  # API endpoint address
data = {}  # Request parameter data (non-file parameters), fill in as needed, refer to each API's parameter description
header = {"token": SIMPLETEX_UAT}  # Authentication information, using UAT method here
file = [("file", ("test.png", open("test.png", 'rb')))]  # Request file, field name is generally "file"
res = requests.post(api_url, files=file, data=data, headers=header)  # Use requests library to upload files
print(json.loads(res.text))
- Python sample code
2. APP Authentication Method
import datetime
import hashlib
import json
from random import Random

import requests

SIMPLETEX_APP_ID = "xxxxx"
SIMPLETEX_APP_SECRET = "xxxxxxxxxxxxxxx"


def random_str(randomlength=16):
    # Generate a random string of digits and uppercase/lowercase letters
    result = ''
    chars = 'AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz0123456789'
    length = len(chars) - 1
    random = Random()
    for i in range(randomlength):
        result += chars[random.randint(0, length)]
    return result


def get_req_data(req_data, appid, secret):
    # Build the authentication header and sign the non-file form data
    header = {}
    header["timestamp"] = str(int(datetime.datetime.now().timestamp()))
    header["random-str"] = random_str(16)
    header["app-id"] = appid
    pre_sign_string = ""
    sorted_keys = list(req_data.keys()) + list(header)
    sorted_keys.sort()
    for key in sorted_keys:
        if pre_sign_string:
            pre_sign_string += "&"
        if key in header:
            pre_sign_string += key + "=" + str(header[key])
        else:
            pre_sign_string += key + "=" + str(req_data[key])
    # The secret is appended last and never sent in the request itself
    pre_sign_string += "&secret=" + secret
    header["sign"] = hashlib.md5(pre_sign_string.encode()).hexdigest()
    return header, req_data


img_file = {"file": open("./image/1.png", 'rb')}
data = {}  # Request parameter data (non-file parameters), fill in as needed, refer to each interface's parameter description
header, data = get_req_data(data, SIMPLETEX_APP_ID, SIMPLETEX_APP_SECRET)
res = requests.post("https://server.simpletex.net/xxxx", files=img_file, data=data, headers=header)
print(json.loads(res.text))
Open Capabilities
I. SimpleTex Formula Recognition
The formula recognition model comes in lightweight and standard versions. The lightweight model is faster, while the standard model performs slightly better. Choose between them by testing on your own scenario.
The specialized formula recognition currently supports recognition of various texts in more than 80 languages, as well as LaTeX symbols, matrices, chemical structure formulas, and complex equations. It supports both handwritten and printed text recognition. If you need to recognize document-type images, please use the SimpleTex General Image Recognition API. Online demo: https://simpletex.net/ai/latex_ocr
Pricing Models: Pay-as-you-go and Prepaid Resource Packages
There are currently two billing methods: pay-as-you-go and prepaid resource packages for call deductions. The deduction order is Free Resource Package -> Paid Resource Package (sorted by expiration time, with earlier expiration dates prioritized) -> Pay-as-you-go
Service billing only charges for successful calls. Users can check their usage and related orders on the open platform. Note: Each successful file/image calculation in a Batch call counts as a separate call.
API Pay-as-you-go (Note: the prices listed below are for reference only and are not the official prices that will apply once SimpleTex formally launches the service. If you have any questions, please contact us.)
1. Lightweight Formula Recognition Model API
Service Pricing (Lightweight)
Monthly Usage (Times) | Price (CNY/Time) |
---|---|
<1000 | Free |
1000-5000 | 0.04 |
>5000 | 0.01 |
API Speed Limits
Limit Type | Default Free Quota |
---|---|
Request Processing Concurrency | 5 |
Regular Request QPS | 5 |
Batch Request QPS/Concurrency | 25 |
API Usage Method
API Endpoint: https://server.simpletex.net/api/latex_ocr_turbo
Model Version: SimpleTex V2.5
Request Method: POST
Request Parameters:
Header: Authentication parameters (UAT or APP information)
Body: Multipart/form-data
Parameter Details
Parameter Name | Parameter Type | Required | Description | Example |
---|---|---|---|---|
file | File | Yes | Valid image binary file information, including png/jpg formats. If batch requests are enabled, file names cannot be duplicated, otherwise results for files with the same name will conflict and overwrite each other | / |
2. Standard Formula Recognition Model API
Service Pricing (Standard)
Monthly Usage (Times) | Price (CNY/Time) |
---|---|
<1000 | Free |
1000-5000 | 0.05 |
>5000 | 0.02 |
Records are cleared at 00:00 on the 1st of each month, and billing is done according to usage tiers. The lightweight model provides 2000 free calls daily, while the standard model provides 500 free calls daily.
Prepaid Resource Packages (Due to business adjustments, please contact us for pricing)
If you need resource packages of other specifications, please contact us. Resource packages are non-refundable, so please estimate a reasonable number of uses before purchasing. If you need to upgrade the number of calls, please purchase a new resource package or use the pay-as-you-go billing method.
API Speed Limits
Limit Type | Default Free Quota |
---|---|
Request Processing Concurrency | 2 |
Regular Request QPS | 2 |
Batch Request QPS/Concurrency | 10 |
QPS refers to the number of requests per second, and request processing concurrency refers to how many threads the server has simultaneously processing user requests. QPS can be expanded by purchasing QPS add-on packages. For a Batch request, each individual object to be processed within the Batch will occupy one Batch request QPS quota and use the same amount of request processing concurrency.
Example: If an interface has QPS and concurrency both set to 1, and assuming a request takes 0.3s for server calculation and response, the maximum request speed is limited by QPS to 1 request/second. If a request takes 3s for server calculation and response, the maximum request speed is limited by concurrency to 1 request/3s. (According to Little's Law, concurrency = QPS * average interface processing time)
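The example above can be expressed as a small calculation: the sustainable request rate is the smaller of the QPS limit and concurrency divided by the per-request processing time. This is an illustrative sketch; the function name max_requests_per_second is hypothetical.

```python
def max_requests_per_second(qps_limit, concurrency_limit, seconds_per_request):
    """Upper bound on sustained throughput under both limits (Little's Law)."""
    if seconds_per_request <= 0:
        raise ValueError("seconds_per_request must be positive")
    # Concurrency allows at most concurrency / latency completed requests per second
    concurrency_bound = concurrency_limit / seconds_per_request
    return min(qps_limit, concurrency_bound)


fast = max_requests_per_second(1, 1, 0.3)  # bounded by QPS
slow = max_requests_per_second(1, 1, 3.0)  # bounded by concurrency
print(fast, slow)
```

With QPS and concurrency both set to 1, the first call is capped by QPS at 1 request/second and the second by concurrency at 1 request per 3 seconds, matching the two cases described above.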
For businesses with high request speed requirements, please contact us to select and add QPS resource packages. If you have special requirements, please contact us.
API Usage Method
API Endpoint: https://server.simpletex.net/api/latex_ocr
Model Version: SimpleTex V2.5
Request Method: POST
Request Parameters:
Header: Authentication parameters (UAT or APP information)
Body: Multipart/form-data
Parameter Details
Parameter Name | Parameter Type | Required | Description | Example |
---|---|---|---|---|
file | File | Yes | Valid image binary file information, including png/jpg formats. If batch requests are enabled, file names cannot be duplicated, otherwise results for files with the same name will conflict and overwrite each other | / |
3. API Response Examples
Single File Upload
{
  "status": true, // Whether the API call was successful
  "res": { // Call result
    "latex": "a^{2}-b^{2}", // LaTeX information, more information will be available in this section in the future
    "conf": 0.95 // Confidence level
  },
  "request_id": "tr_16755479007123063412063155819" // Request ID
}
Multiple Files Upload
{
  "status": true, // Whether the API call was successful
  "res": { // Call result
    "stats": { // Success and failure call statistics
      "fail": 0,
      "success": 2
    },
    "fail_res": {}, // Error information for failed image calls
    "success_res": { // Result information for successfully recognized images
      "test_1.png": {
        "latex": "a^{2}-b^{2}",
        "conf": 0.95 // Confidence level
      },
      "test_2.png": {
        "latex": "a^{3}+b^{3}",
        "conf": 0.90
      }
    }
  },
  "request_id": "tr_16755477466238226695895375638" // Request ID
}
Special Return Values
- [EMPTY]: Image is empty
- [DOCIMG]: Image is a document type; it is recommended to use the general image recognition interface, as the formula model cannot output results
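Both response shapes can be normalized into one mapping from file name to LaTeX. The sketch below is a hedged illustration: the helper name collect_latex and the placeholder key "single" are hypothetical, and it assumes the special return values appear in the latex field, which is not stated explicitly above.

```python
# Special markers listed above, assumed to be returned in the "latex" field
SPECIAL_VALUES = {
    "[EMPTY]": "image is empty",
    "[DOCIMG]": "document-type image; use the general image recognition interface",
}


def collect_latex(res):
    """Split a single-file or multi-file `res` object into usable and flagged results."""
    if "success_res" in res:
        # Multi-file shape: results are keyed by uploaded file name
        items = {name: item["latex"] for name, item in res["success_res"].items()}
    else:
        # Single-file shape: no file name is returned, so use a placeholder key
        items = {"single": res["latex"]}
    usable, flagged = {}, {}
    for name, latex in items.items():
        if latex in SPECIAL_VALUES:
            flagged[name] = SPECIAL_VALUES[latex]
        else:
            usable[name] = latex
    return usable, flagged


batch_res = {
    "stats": {"fail": 0, "success": 2},
    "fail_res": {},
    "success_res": {
        "test_1.png": {"latex": "a^{2}-b^{2}", "conf": 0.95},
        "test_2.png": {"latex": "a^{3}+b^{3}", "conf": 0.90},
    },
}
usable, flagged = collect_latex(batch_res)
print(usable)
print(flagged)
```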
II. SimpleTex General Image Recognition
SimpleTex General Image Recognition currently supports recognition of various text in over 80 languages as well as LaTeX symbols, matrices, and complex equations. It supports tables, mixed text layouts, document pages, double-column papers, and common handwritten/printed text recognition.
1. API Usage Method
Lightweight Model API Endpoint: https://server.simpletex.net/api/simpletex_ocr
Model Version: SimpleTex General OCR V1
Request Method: POST
Request Parameters:
Header: Authentication parameters (UAT or APP information)
Body: Multipart/form-data
Parameter Details
Parameter Name | Parameter Type | Required | Description | Example |
---|---|---|---|---|
file | File | Yes | Valid image binary file information, including png/jpg formats. Batch upload is not supported, only one image can be uploaded at a time | / |
rec_mode | String | No | Can be set to "auto", "document", or "formula" to specify the type of image recognition. "auto" will automatically detect the type, "document" will return markdown document results, and "formula" will return LaTeX results | "auto" |
enable_img_rot | Boolean | No | When enabled, the model will automatically correct the orientation of the uploaded image based on 0°, 90°, 180°, 270°. Disabled by default | "false" |
inline_formula_wrapper | JSON String Array | No | Used to modify the wrapper symbols for inline formulas in markdown. Input in JSON format. Default wrapper symbols will be used if format is incorrect | ["$","$"] |
isolated_formula_wrapper | JSON String Array | No | Used to modify the wrapper symbols for isolated formulas in markdown. Input in JSON format. Default wrapper symbols will be used if format is incorrect | ["$$","$$"] |
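The optional parameters in the table above are sent as ordinary form fields next to the file. A minimal sketch of assembling them follows; the helper name build_ocr_form is hypothetical, and encoding the boolean as the string "true"/"false" is an assumption based on the example value shown in the table.

```python
import json

ALLOWED_REC_MODES = ("auto", "document", "formula")


def build_ocr_form(rec_mode="auto", enable_img_rot=False,
                   inline_wrapper=None, isolated_wrapper=None):
    """Build the non-file form fields for a general image recognition request."""
    if rec_mode not in ALLOWED_REC_MODES:
        raise ValueError(f"rec_mode must be one of {ALLOWED_REC_MODES}")
    form = {
        "rec_mode": rec_mode,
        # Assumed string encoding of the boolean flag
        "enable_img_rot": "true" if enable_img_rot else "false",
    }
    # Wrapper pairs are passed as JSON string arrays, e.g. ["$","$"]
    if inline_wrapper is not None:
        form["inline_formula_wrapper"] = json.dumps(list(inline_wrapper))
    if isolated_wrapper is not None:
        form["isolated_formula_wrapper"] = json.dumps(list(isolated_wrapper))
    return form


form = build_ocr_form("document", inline_wrapper=("$", "$"), isolated_wrapper=("$$", "$$"))
print(form)
```

The resulting dict can be passed as the data argument of requests.post together with the file; with APP authentication, these same fields are the ones that enter the signature.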
2. API Pricing
Monthly Usage (Times) | Price (USD/Page) |
---|---|
<1000 | Free |
1000-5000 | 0.1 |
>5000 | 0.04 |
Concurrency Limits | Default Free Quota |
---|---|
Request Processing Concurrency | 1 |
Regular Request QPS | 1 |
During the current testing period, 50 free recognitions are automatically provided daily.
III. SimpleTex Document Image Recognition (PDF Recognition)
SimpleTex Document Image Recognition currently supports recognition in both Chinese and English languages. It is the OCR interface used for PDF file recognition in the formula and chart enhancement mode. [Note: This interface may change at any time and is currently only provided for testing and reference purposes]
1. API Usage Method
Lightweight Model API Endpoint: https://server.simpletex.net/api/doc_ocr
Model Version: SimpleTex Doc OCR V1
Request Method: POST
Request Parameters:
Header: Authentication parameters (UAT or APP information)
Body: Multipart/form-data
Additional Notes: Currently, this API only supports uploading one file at a time, batch processing is not supported.
Parameter Details
Parameter Name | Parameter Type | Required | Description | Example |
---|---|---|---|---|
file | File | Yes | Valid image binary file information of PDF pages, including png/jpg formats. Batch upload is not supported, only one image can be uploaded at a time | / |
inline_formula_wrapper | JSON String Array | No | Used to modify the wrapper symbols for inline formulas in markdown. Input in JSON format. Default wrapper symbols will be used if format is incorrect | ["$","$"] |
isolated_formula_wrapper | JSON String Array | No | Used to modify the wrapper symbols for isolated formulas in markdown. Input in JSON format. Default wrapper symbols will be used if format is incorrect | ["$$","$$"] |
2. API Pricing (New Lower Prices for 2025!)
Monthly Usage (Times) | Price (USD/Page) |
---|---|
<1000 | Free |
1000+ | 0.003 |
Concurrency Limits | Default Free Quota |
---|---|
Request Processing Concurrency | 1 |
Regular Request QPS | 1 |
3. API Response Example
Single File Upload
{
  "status": true, // Whether the API call was successful
  "res": { // Response result
    "content": "..." // Markdown content
  },
  "request_id": "tr_16755479007123063412063155819" // Request ID
}
4. PDF Recognition Example Code
The following code can be used to convert PDF files to Markdown files. It uses the PyMuPDF library for reading PDF files, the PIL library for image processing, the requests library for file uploading, and the tqdm library for progress bar display.
First, install the required libraries PyMuPDF, requests, Pillow, tqdm etc. using pip with the following command:
pip install PyMuPDF requests Pillow tqdm
Detailed Code
import io

import fitz  # PyMuPDF
import requests
from PIL import Image
from tqdm import tqdm

UAT = "xxxxx"  # User Authorization Token


def pillow_image_to_file_binary(image):
    # Encode a PIL image as PNG bytes for upload
    bytes_io = io.BytesIO()
    image.save(bytes_io, format='PNG')
    return bytes_io.getvalue()


def convert_pdf_to_images(pdf_binary, dpi=100):
    # Render each PDF page to a PIL image at the given DPI
    doc = fitz.open("pdf", pdf_binary)
    images = []
    for i in range(doc.page_count):
        page = doc[i]
        image = page.get_pixmap(dpi=dpi)
        image = Image.frombytes("RGB", [image.width, image.height], image.samples)
        images.append(image)
    return images


def pdf_ocr(image):
    # Upload one page image and return the recognized markdown content
    api_url = "https://server.simpletex.net/api/doc_ocr/"
    header = {"token": UAT}  # Authentication info, using UAT method here
    img_file = {"file": pillow_image_to_file_binary(image)}
    res = requests.post(api_url, files=img_file, data={}, headers=header).json()  # Use requests library to upload file
    print(res)
    return res["res"]["content"]


if __name__ == '__main__':
    pdf_path = 'test.pdf'  # Input PDF file
    with open(pdf_path, 'rb') as f:
        file_binary = f.read()
    images = convert_pdf_to_images(file_binary)
    final_markdown_content = ""
    for image in tqdm(images):
        final_markdown_content += pdf_ocr(image) + "\n"
    with open("test.md", "w", encoding="utf-8") as f:
        f.write(final_markdown_content)  # Save the final markdown file
    print(final_markdown_content)  # Output the final markdown content
Direct PDF file asynchronous upload service will be supported in the future. During the current testing period, 1000 free recognitions are automatically granted daily.
Other open capabilities will be gradually released (General OCR, Word Embedding, Chatbot, Chinese-English Translation, Layout Analysis)