Process Document (Sync)
Synchronously process a document and extract structured data with the DocuRift API
Upload and process a document synchronously to extract structured data. This endpoint waits for processing to complete before returning the response.
POST /v1/documents/process
Overview
The synchronous processing endpoint is ideal for:
- Single document processing
- Real-time applications requiring immediate results
- Documents under 10 pages
- Interactive user workflows
For large documents or batch processing, use the async endpoint instead.
Request
Headers
| Parameter | Type | Description |
|---|---|---|
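| X-API-Key (required) | string | Your DocuRift API key (format: frc_xxxxx) |
| Content-Type (required) | string | Must be multipart/form-data for file uploads. Most HTTP clients set this header, including the boundary, automatically when given form data |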
Body Parameters
| Parameter | Type | Description |
|---|---|---|
| file (required) | File | Document file to process. Supported formats: PDF, PNG, JPG, JPEG, WEBP, TIFF. Maximum size: 50MB |
| documentType | string | Type of document for optimized extraction. Default: generic |
| language | string | Primary language of the document (ISO 639-1 code). Default: en |
| extractTables | boolean | Enable table extraction and structuring. Default: true |
| ocrFallback | boolean | Use OCR for scanned documents or images. Default: true |
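The optional parameters travel as ordinary multipart form fields next to the file. The following is a minimal sketch of building them, not an official client: the helper name `build_form_fields` is hypothetical, and sending booleans as the lowercase strings `true` and `false` is an assumption, since the table above does not specify the wire encoding.

```python
def build_form_fields(document_type="generic", language="en",
                      extract_tables=True, ocr_fallback=True):
    """Return the non-file form fields as strings, using the documented defaults."""
    def as_flag(value):
        # Assumption: the API accepts lowercase "true"/"false" for booleans.
        return "true" if value else "false"

    return {
        "documentType": document_type,
        "language": language,
        "extractTables": as_flag(extract_tables),
        "ocrFallback": as_flag(ocr_fallback),
    }


# Example: an invoice with table extraction switched off
fields = build_form_fields(document_type="invoice", extract_tables=False)
print(fields)
```

With the `requests` library, a dictionary like this can be passed as the `data=` argument of `requests.post`, alongside `files=` for the document itself.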
Supported Document Types
| Type | Description | Use Case |
|------|-------------|----------|
| generic | Auto-detect document type | Unknown or mixed documents |
| invoice | Commercial invoices | Accounts payable automation |
| bill_of_lading | Bills of lading (BOL) | Shipping and cargo tracking |
| packing_list | Packing lists | Inventory and fulfillment |
| customs_declaration | Customs forms | Import/export compliance |
| certificate_of_origin | Origin certificates | Trade compliance |
| delivery_order | Delivery orders | Last-mile logistics |
| freight_invoice | Freight/shipping invoices | Logistics billing |
| air_waybill | Air cargo waybills | Air freight tracking |
| sea_waybill | Ocean cargo waybills | Ocean freight tracking |
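Checking the `documentType` value locally can avoid a round trip that ends in `INVALID_DOCUMENT_TYPE`. This is a small sketch built from the table above; the names `SUPPORTED_DOCUMENT_TYPES` and `check_document_type` are hypothetical, and it assumes the values are case-sensitive exactly as listed.

```python
# Values copied from the Supported Document Types table.
SUPPORTED_DOCUMENT_TYPES = frozenset({
    "generic", "invoice", "bill_of_lading", "packing_list",
    "customs_declaration", "certificate_of_origin", "delivery_order",
    "freight_invoice", "air_waybill", "sea_waybill",
})


def check_document_type(document_type):
    """Return the type unchanged if it is listed, otherwise raise ValueError."""
    if document_type not in SUPPORTED_DOCUMENT_TYPES:
        raise ValueError(
            f"Unsupported documentType {document_type!r}; "
            f"expected one of {sorted(SUPPORTED_DOCUMENT_TYPES)}"
        )
    return document_type


print(check_document_type("bill_of_lading"))
```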
Supported File Formats
| Format | Extension | Max Size | Notes |
|--------|-----------|----------|-------|
| PDF | .pdf | 50MB | Single or multi-page |
| PNG | .png | 50MB | High-resolution recommended |
| JPEG | .jpg, .jpeg | 50MB | Minimum 150 DPI recommended |
| WebP | .webp | 50MB | Modern format supported |
| TIFF | .tiff, .tif | 50MB | Multi-page TIFF supported |
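A client-side pre-check against the limits above can catch obvious problems before uploading. The sketch below is only a heuristic under stated assumptions: `validate_upload` is a hypothetical helper, "50MB" is interpreted as 50 × 1024 × 1024 bytes, and the check looks at the file extension, whereas the server may inspect the actual file type.

```python
import os

MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # Assumption: "50MB" means 50 MiB
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".webp", ".tiff", ".tif"}


def validate_upload(file_name, size_bytes):
    """Return a list of problems found; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(
            f"file is {size_bytes} bytes, above the {MAX_FILE_SIZE_BYTES}-byte limit"
        )
    return problems


print(validate_upload("letter.doc", 60 * 1024 * 1024))
```

For a file on disk, the size can be obtained with `os.path.getsize(path)`.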
Code Examples
cURL
curl -X POST https://api.docurift.com/v1/documents/process \
  -H "X-API-Key: frc_your_api_key_here" \
  -F "file=@invoice.pdf" \
  -F "documentType=invoice"
Python
import requests
import os

API_KEY = os.getenv('DOCURIFT_API_KEY')
API_URL = 'https://api.docurift.com/v1'

def process_document(file_path, document_type='generic'):
    """Process a document synchronously."""
    headers = {
        'X-API-Key': API_KEY
    }
    with open(file_path, 'rb') as f:
        files = {'file': f}
        data = {'documentType': document_type}
        response = requests.post(
            f'{API_URL}/documents/process',
            headers=headers,
            files=files,
            data=data
        )
    response.raise_for_status()
    return response.json()

# Example usage
result = process_document('invoice.pdf', 'invoice')
print(f"Document ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")
print(f"Confidence: {result['data']['confidence']}")

# Access extracted data
extracted = result['data']['extractedData']
print(f"Invoice Number: {extracted.get('invoiceNumber')}")
print(f"Total Amount: {extracted.get('totalAmount')}")
JavaScript (Node.js)
import fs from 'fs';
import FormData from 'form-data';
import fetch from 'node-fetch';

const API_KEY = process.env.DOCURIFT_API_KEY;
const API_URL = 'https://api.docurift.com/v1';

async function processDocument(filePath, documentType = 'generic') {
  const form = new FormData();
  form.append('file', fs.createReadStream(filePath));
  form.append('documentType', documentType);

  const response = await fetch(`${API_URL}/documents/process`, {
    method: 'POST',
    headers: {
      'X-API-Key': API_KEY,
      ...form.getHeaders()
    },
    body: form
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(error.error.message);
  }
  return response.json();
}

// Example usage
const result = await processDocument('invoice.pdf', 'invoice');
console.log('Document ID:', result.data.id);
console.log('Status:', result.data.status);
console.log('Confidence:', result.data.confidence);

// Access extracted data
const extracted = result.data.extractedData;
console.log('Invoice Number:', extracted.invoiceNumber);
console.log('Total Amount:', extracted.totalAmount);
JavaScript (Browser)
async function processDocument(file, documentType = 'generic') {
  const formData = new FormData();
  formData.append('file', file);
  formData.append('documentType', documentType);

  const response = await fetch('https://api.docurift.com/v1/documents/process', {
    method: 'POST',
    headers: {
      'X-API-Key': 'frc_your_api_key_here'
    },
    body: formData
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(error.error.message);
  }
  return response.json();
}

// Example with file input
const fileInput = document.getElementById('fileInput');
fileInput.addEventListener('change', async (event) => {
  const file = event.target.files[0];
  try {
    const result = await processDocument(file, 'invoice');
    console.log('Extracted data:', result.data.extractedData);
  } catch (error) {
    console.error('Processing failed:', error.message);
  }
});
Response
Success Response (200 OK)
{
  "success": true,
  "data": {
    "id": "doc_abc123xyz456",
    "organizationId": "org_xyz789",
    "fileName": "invoice.pdf",
    "fileType": "application/pdf",
    "fileSize": 245678,
    "documentType": "invoice",
    "status": "completed",
    "pagesProcessed": 2,
    "confidence": 0.96,
    "extractedData": {
      "invoiceNumber": "INV-2024-00123",
      "invoiceDate": "2024-01-15",
      "dueDate": "2024-02-15",
      "currency": "USD",
      "vendor": {
        "name": "Acme Shipping Co.",
        "address": "123 Harbor Blvd, Los Angeles, CA 90021",
        "taxId": "12-3456789",
        "email": "billing@acmeshipping.com",
        "phone": "+1-555-123-4567"
      },
      "customer": {
        "name": "Global Imports Inc.",
        "address": "456 Trade St, New York, NY 10001",
        "taxId": "98-7654321"
      },
      "lineItems": [
        {
          "description": "Ocean Freight - Container 20ft",
          "quantity": 2,
          "unitPrice": 1500.00,
          "total": 3000.00,
          "hsCode": "8609.00"
        },
        {
          "description": "Documentation Fee",
          "quantity": 1,
          "unitPrice": 150.00,
          "total": 150.00
        }
      ],
      "subtotal": 3150.00,
      "taxRate": 0.08,
      "taxAmount": 252.00,
      "totalAmount": 3402.00,
      "paymentTerms": "Net 30",
      "notes": "Payment due within 30 days of invoice date"
    },
    "metadata": {
      "processingTimeMs": 2340,
      "modelVersion": "v2.1.0",
      "pageConfidences": [0.97, 0.95]
    },
    "createdAt": "2024-01-26T10:30:00Z",
    "processedAt": "2024-01-26T10:30:02Z"
  }
}
Response Fields
| Parameter | Type | Description |
|---|---|---|
| id | string | Unique document identifier (format: doc_xxxxx) |
| organizationId | string | Organization that owns this document |
| fileName | string | Original uploaded file name |
| fileType | string | MIME type of the uploaded file |
| fileSize | number | File size in bytes |
| documentType | string | Document type used for processing |
| status | string | Processing status: completed, failed |
| pagesProcessed | number | Number of pages processed (affects credit usage) |
| confidence | number | Overall extraction confidence score (0-1) |
| extractedData | object | Structured data extracted from the document |
| metadata | object | Processing metadata including timing and model version |
| createdAt | string | ISO 8601 timestamp when document was uploaded |
| processedAt | string | ISO 8601 timestamp when processing completed |
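To illustrate how these fields fit together, the sketch below pulls a few of them out of a parsed success response and derives the elapsed time from the two timestamps. `DocumentSummary` and `summarize` are hypothetical names, and the sample dictionary is a trimmed copy of the success response shown earlier.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DocumentSummary:
    id: str
    status: str
    pages_processed: int
    confidence: float
    processing_seconds: float


def summarize(body):
    """Pull a few documented fields out of a parsed success response."""
    data = body["data"]
    # The sample timestamps end in "Z"; swap it for an explicit UTC offset
    # so datetime.fromisoformat accepts it on Python versions before 3.11.
    created = datetime.fromisoformat(data["createdAt"].replace("Z", "+00:00"))
    processed = datetime.fromisoformat(data["processedAt"].replace("Z", "+00:00"))
    return DocumentSummary(
        id=data["id"],
        status=data["status"],
        pages_processed=data["pagesProcessed"],
        confidence=data["confidence"],
        processing_seconds=(processed - created).total_seconds(),
    )


sample = {
    "success": True,
    "data": {
        "id": "doc_abc123xyz456",
        "status": "completed",
        "pagesProcessed": 2,
        "confidence": 0.96,
        "createdAt": "2024-01-26T10:30:00Z",
        "processedAt": "2024-01-26T10:30:02Z",
    },
}
print(summarize(sample))
```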
Confidence Scores
| Score Range | Quality | Recommendation |
|-------------|---------|----------------|
| 0.95 - 1.00 | Excellent | High confidence, minimal review needed |
| 0.85 - 0.94 | Good | Minor fields may need verification |
| 0.70 - 0.84 | Fair | Review important fields |
| Below 0.70 | Low | Manual review recommended |
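One way to act on these bands is to map the score to a review level in code. This is a sketch with hypothetical helper names. It assumes scores that fall between the listed ranges (for example 0.945) belong to the lower band, that `metadata.pageConfidences` is ordered by page, and that the same 0.70 cut-off is sensible per page, none of which the table states.

```python
def review_level(confidence):
    """Map an overall confidence score (0-1) to a quality band from the table."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be between 0 and 1, got {confidence}")
    if confidence >= 0.95:
        return "Excellent"
    if confidence >= 0.85:
        return "Good"
    if confidence >= 0.70:
        return "Fair"
    return "Low"


def pages_needing_review(page_confidences, threshold=0.70):
    """Return 1-based page numbers whose confidence falls below the threshold."""
    return [
        page for page, score in enumerate(page_confidences, start=1)
        if score < threshold
    ]


print(review_level(0.96))
print(pages_needing_review([0.97, 0.65, 0.95]))
```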
Error Responses
400 Bad Request
{
  "success": false,
  "error": {
    "code": "INVALID_FILE_TYPE",
    "message": "File type 'application/msword' is not supported. Supported types: PDF, PNG, JPG, JPEG, WEBP, TIFF"
  }
}
401 Unauthorized
{
  "success": false,
  "error": {
    "code": "INVALID_API_KEY",
    "message": "Invalid API key. 20 attempts remaining before IP block."
  }
}
402 Payment Required
{
  "success": false,
  "error": {
    "code": "INSUFFICIENT_CREDITS",
    "message": "Insufficient credits. Document requires 3 pages but only 1 available."
  }
}
413 Payload Too Large
{
  "success": false,
  "error": {
    "code": "FILE_TOO_LARGE",
    "message": "File size 55MB exceeds maximum allowed size of 50MB"
  }
}
429 Too Many Requests
{
  "success": false,
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded. Please retry after 60 seconds."
  }
}
Error Codes Reference
| Code | HTTP Status | Description | Solution |
|------|-------------|-------------|----------|
| INVALID_FILE_TYPE | 400 | Unsupported file format | Use PDF, PNG, JPG, WEBP, or TIFF |
| INVALID_DOCUMENT_TYPE | 400 | Unknown document type | Check supported document types |
| FILE_TOO_LARGE | 413 | File exceeds 50MB limit | Compress or split the document |
| INVALID_API_KEY | 401 | API key invalid or expired | Verify API key in dashboard |
| INSUFFICIENT_CREDITS | 402 | Not enough page credits | Purchase more credits |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests | Implement backoff and retry |
| PROCESSING_FAILED | 500 | Internal processing error | Retry or contact support |
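The error body can be turned into a typed exception so calling code can branch on the error code. The sketch below is one possible shape, not part of any official SDK: `DocuRiftError` and `error_from_body` are hypothetical names, and the set of retryable codes is inferred from the Solution column above (backoff for `RATE_LIMIT_EXCEEDED`, retry for `PROCESSING_FAILED`).

```python
# Codes whose documented solution involves retrying.
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "PROCESSING_FAILED"}


class DocuRiftError(Exception):
    """Hypothetical exception carrying the API's error code and message."""

    def __init__(self, code, message, http_status):
        super().__init__(f"{code} (HTTP {http_status}): {message}")
        self.code = code
        self.message = message
        self.http_status = http_status

    @property
    def retryable(self):
        return self.code in RETRYABLE_CODES


def error_from_body(body, http_status):
    """Build a DocuRiftError from a parsed error response body."""
    error = body.get("error") or {}
    return DocuRiftError(
        code=error.get("code", "UNKNOWN"),
        message=error.get("message", ""),
        http_status=http_status,
    )


example = error_from_body(
    {"success": False,
     "error": {"code": "FILE_TOO_LARGE",
               "message": "File size 55MB exceeds maximum allowed size of 50MB"}},
    413,
)
print(example, "| retryable:", example.retryable)
```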
Best Practices
Optimize for Accuracy
- Use high-resolution images: Minimum 150 DPI for scanned documents
- Specify document type: Improves extraction accuracy by 10-15%
- Ensure good lighting: Avoid shadows and glare in photos
- Keep documents straight: Minimal rotation for best results
Handle Errors Gracefully
import time

import requests

def process_with_retry(file_path, max_retries=3):
    """Process document with exponential backoff retry."""
    # Relies on process_document() from the Python example under Code Examples
    for attempt in range(max_retries):
        try:
            return process_document(file_path, 'invoice')
        except requests.HTTPError as e:
            if e.response.status_code == 429:
                # Rate limited - wait and retry
                wait_time = 2 ** attempt * 10
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
            elif e.response.status_code >= 500:
                # Server error - retry
                time.sleep(2 ** attempt)
            else:
                # Client error - don't retry
                raise
    raise RuntimeError("Max retries exceeded")
Processing Time
Synchronous processing typically completes in 2-10 seconds depending on document size and complexity. For documents over 10 pages, consider using the async endpoint.
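The page-count guidance can be expressed as a small routing helper. This is only a sketch: `choose_mode` is a hypothetical name, a document of exactly 10 pages is treated as synchronous (the guidance only says "over 10 pages"), and the timeout values are an assumed safety margin above the typical 2-10 second range, not documented limits.

```python
SYNC_PAGE_LIMIT = 10  # Documents over this many pages go to the async endpoint

# Assumption: (connect, read) timeouts in seconds, in the tuple form accepted
# by the requests library's timeout argument.
SYNC_TIMEOUT = (5, 60)


def choose_mode(page_count):
    """Return "sync" or "async" for a document with the given number of pages."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return "sync" if page_count <= SYNC_PAGE_LIMIT else "async"


for pages in (3, 10, 25):
    print(pages, "pages ->", choose_mode(pages))
```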
Credit Usage
Each page processed consumes 1 page credit from your balance. Multi-page PDFs consume credits equal to the number of pages.
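Because credits map one-to-one to pages, a batch can be checked against the available balance before uploading, which avoids an `INSUFFICIENT_CREDITS` response. The helpers below are hypothetical, and the page counts and balance are assumed to be known to the caller; this page does not describe an endpoint for reading the balance.

```python
def credits_required(page_counts):
    """Total page credits for a set of documents, at one credit per page."""
    return sum(page_counts)


def can_afford(page_counts, available_credits):
    """True if the balance covers every page in the batch."""
    return credits_required(page_counts) <= available_credits


# Mirrors the 402 example: a 3-page document with 1 credit available
print(can_afford([3], available_credits=1))
```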
Related Endpoints
- Process Document (Async) - For large documents and batch processing
- Get Document - Retrieve a processed document
- List Documents - List all processed documents