Process Document (Sync)

Synchronously process a document and extract structured data with DocuRift API

Upload and process a document synchronously to extract structured data. This endpoint waits for processing to complete before returning the response.

POST /v1/documents/process

Overview

The synchronous processing endpoint is ideal for:

  • Single document processing
  • Real-time applications requiring immediate results
  • Documents under 10 pages
  • Interactive user workflows

For large documents or batch processing, use the async endpoint instead.

Request

Headers

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| X-API-Key | string | Yes | Your DocuRift API key (format: frc_xxxxx) |
| Content-Type | string | Yes | Must be multipart/form-data for file uploads |

Body Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| file | File | Yes | | Document file to process. Supported formats: PDF, PNG, JPG, JPEG, WEBP, TIFF. Maximum size: 50MB |
| documentType | string | No | generic | Type of document for optimized extraction |
| language | string | No | en | Primary language of the document (ISO 639-1 code) |
| extractTables | boolean | No | true | Enable table extraction and structuring |
| ocrFallback | boolean | No | true | Use OCR for scanned documents or images |
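
The optional parameters travel as ordinary form fields next to the file. The following is a minimal sketch of assembling them in Python. The helper name `build_form_fields` is hypothetical, and sending booleans as lowercase "true"/"false" strings is an assumption, since this page does not state how boolean form values are encoded.

```python
def build_form_fields(document_type="generic", language="en",
                      extract_tables=True, ocr_fallback=True):
    """Return the non-file form fields, using the documented defaults."""
    def as_flag(value):
        # Assumption: booleans are sent as lowercase "true"/"false" strings.
        return "true" if value else "false"

    return {
        "documentType": document_type,
        "language": language,
        "extractTables": as_flag(extract_tables),
        "ocrFallback": as_flag(ocr_fallback),
    }


# Example: a German invoice with table extraction switched off
fields = build_form_fields(document_type="invoice", language="de",
                           extract_tables=False)
print(fields)
```

With the `requests` library, a dictionary like this can be passed as `data=fields` alongside `files={'file': f}`.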

Supported Document Types

| Type | Description | Use Case |
|------|-------------|----------|
| generic | Auto-detect document type | Unknown or mixed documents |
| invoice | Commercial invoices | Accounts payable automation |
| bill_of_lading | Bills of lading (BOL) | Shipping and cargo tracking |
| packing_list | Packing lists | Inventory and fulfillment |
| customs_declaration | Customs forms | Import/export compliance |
| certificate_of_origin | Origin certificates | Trade compliance |
| delivery_order | Delivery orders | Last-mile logistics |
| freight_invoice | Freight/shipping invoices | Logistics billing |
| air_waybill | Air cargo waybills | Air freight tracking |
| sea_waybill | Ocean cargo waybills | Ocean freight tracking |
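
Checking the document type on the client avoids spending a request on an INVALID_DOCUMENT_TYPE error. The sketch below copies the type names from the table above. The helper `normalize_document_type` is hypothetical, and trimming and lowercasing the input is a convenience added here, not documented API behavior.

```python
SUPPORTED_DOCUMENT_TYPES = frozenset({
    "generic", "invoice", "bill_of_lading", "packing_list",
    "customs_declaration", "certificate_of_origin", "delivery_order",
    "freight_invoice", "air_waybill", "sea_waybill",
})


def normalize_document_type(value=None):
    """Return a supported documentType, defaulting to 'generic'."""
    if value is None:
        return "generic"
    # Convenience only: the API may or may not accept other casings itself.
    candidate = value.strip().lower()
    if candidate not in SUPPORTED_DOCUMENT_TYPES:
        raise ValueError(f"Unsupported documentType: {value!r}")
    return candidate


print(normalize_document_type(" Invoice "))
print(normalize_document_type())
```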

Supported File Formats

| Format | Extension | Max Size | Notes |
|--------|-----------|----------|-------|
| PDF | .pdf | 50MB | Single or multi-page |
| PNG | .png | 50MB | High-resolution recommended |
| JPEG | .jpg, .jpeg | 50MB | Minimum 150 DPI recommended |
| WebP | .webp | 50MB | Modern format supported |
| TIFF | .tiff, .tif | 50MB | Multi-page TIFF supported |
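
A pre-flight check on extension and size can catch INVALID_FILE_TYPE and FILE_TOO_LARGE before uploading. This is a rough sketch: `check_upload` is a hypothetical helper, it inspects only the file extension while the server reports a MIME type, and reading 50MB as 50 * 1024 * 1024 bytes is an assumption.

```python
import os

# Assumption: "50MB" is interpreted as 50 MiB; the exact byte limit is not documented here.
MAX_FILE_BYTES = 50 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".webp", ".tiff", ".tif"}


def check_upload(file_name, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file size {size_bytes} bytes exceeds {MAX_FILE_BYTES} bytes")
    return problems


print(check_upload("invoice.PDF", 245678))
print(check_upload("letter.doc", 1024))
```

For a file on disk, `os.path.getsize(path)` supplies the `size_bytes` argument.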

Code Examples

cURL

curl -X POST https://api.docurift.com/v1/documents/process \
-H "X-API-Key: frc_your_api_key_here" \
-F "file=@invoice.pdf" \
-F "documentType=invoice"

Python

process_document.py
import requests
import os

API_KEY = os.getenv('DOCURIFT_API_KEY')
API_URL = 'https://api.docurift.com/v1'

def process_document(file_path, document_type='generic'):
    """Process a document synchronously."""
    headers = {
        'X-API-Key': API_KEY
    }

    with open(file_path, 'rb') as f:
        files = {'file': f}
        data = {'documentType': document_type}

        response = requests.post(
            f'{API_URL}/documents/process',
            headers=headers,
            files=files,
            data=data
        )

    response.raise_for_status()
    return response.json()

# Example usage
result = process_document('invoice.pdf', 'invoice')

print(f"Document ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")
print(f"Confidence: {result['data']['confidence']}")

# Access extracted data
extracted = result['data']['extractedData']
print(f"Invoice Number: {extracted.get('invoiceNumber')}")
print(f"Total Amount: {extracted.get('totalAmount')}")

JavaScript (Node.js)

processDocument.js
import fs from 'fs';
import FormData from 'form-data';
import fetch from 'node-fetch';

const API_KEY = process.env.DOCURIFT_API_KEY;
const API_URL = 'https://api.docurift.com/v1';

async function processDocument(filePath, documentType = 'generic') {
  const form = new FormData();
  form.append('file', fs.createReadStream(filePath));
  form.append('documentType', documentType);

  const response = await fetch(`${API_URL}/documents/process`, {
    method: 'POST',
    headers: {
      'X-API-Key': API_KEY,
      ...form.getHeaders()
    },
    body: form
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(error.error.message);
  }

  return response.json();
}

// Example usage
const result = await processDocument('invoice.pdf', 'invoice');

console.log('Document ID:', result.data.id);
console.log('Status:', result.data.status);
console.log('Confidence:', result.data.confidence);

// Access extracted data
const extracted = result.data.extractedData;
console.log('Invoice Number:', extracted.invoiceNumber);
console.log('Total Amount:', extracted.totalAmount);

JavaScript (Browser)

browser.js
async function processDocument(file, documentType = 'generic') {
  const formData = new FormData();
  formData.append('file', file);
  formData.append('documentType', documentType);

  // The browser sets the multipart Content-Type header (with boundary) itself.
  const response = await fetch('https://api.docurift.com/v1/documents/process', {
    method: 'POST',
    headers: {
      // Placeholder only: a key embedded in client-side code is visible to every visitor.
      'X-API-Key': 'frc_your_api_key_here'
    },
    body: formData
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(error.error.message);
  }

  return response.json();
}

// Example with file input
const fileInput = document.getElementById('fileInput');
fileInput.addEventListener('change', async (event) => {
  const file = event.target.files[0];

  try {
    const result = await processDocument(file, 'invoice');
    console.log('Extracted data:', result.data.extractedData);
  } catch (error) {
    console.error('Processing failed:', error.message);
  }
});

Response

Success Response (200 OK)

response.json
{
"success": true,
"data": {
  "id": "doc_abc123xyz456",
  "organizationId": "org_xyz789",
  "fileName": "invoice.pdf",
  "fileType": "application/pdf",
  "fileSize": 245678,
  "documentType": "invoice",
  "status": "completed",
  "pagesProcessed": 2,
  "confidence": 0.96,
  "extractedData": {
    "invoiceNumber": "INV-2024-00123",
    "invoiceDate": "2024-01-15",
    "dueDate": "2024-02-15",
    "currency": "USD",
    "vendor": {
      "name": "Acme Shipping Co.",
      "address": "123 Harbor Blvd, Los Angeles, CA 90021",
      "taxId": "12-3456789",
      "email": "billing@acmeshipping.com",
      "phone": "+1-555-123-4567"
    },
    "customer": {
      "name": "Global Imports Inc.",
      "address": "456 Trade St, New York, NY 10001",
      "taxId": "98-7654321"
    },
    "lineItems": [
      {
        "description": "Ocean Freight - Container 20ft",
        "quantity": 2,
        "unitPrice": 1500.00,
        "total": 3000.00,
        "hsCode": "8609.00"
      },
      {
        "description": "Documentation Fee",
        "quantity": 1,
        "unitPrice": 150.00,
        "total": 150.00
      }
    ],
    "subtotal": 3150.00,
    "taxRate": 0.08,
    "taxAmount": 252.00,
    "totalAmount": 3402.00,
    "paymentTerms": "Net 30",
    "notes": "Payment due within 30 days of invoice date"
  },
  "metadata": {
    "processingTimeMs": 2340,
    "modelVersion": "v2.1.0",
    "pageConfidences": [0.97, 0.95]
  },
  "createdAt": "2024-01-26T10:30:00Z",
  "processedAt": "2024-01-26T10:30:02Z"
}
}
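
Extracted totals can be cross-checked before they reach downstream systems. The sketch below assumes the invoice field names shown in the sample response (`lineItems`, `subtotal`, `taxRate`, `taxAmount`, `totalAmount`); other document types will expose different fields. The helper `check_invoice_totals` and the 0.01 tolerance are illustrative choices, not part of the API.

```python
from decimal import Decimal


def check_invoice_totals(extracted, tolerance=Decimal("0.01")):
    """Return a list of arithmetic inconsistencies found in extracted invoice data."""
    def money(value):
        # Go through str() so that floats such as 0.08 convert cleanly.
        return Decimal(str(value))

    issues = []
    line_sum = sum((money(item["total"]) for item in extracted.get("lineItems", [])),
                   Decimal("0"))
    subtotal = money(extracted["subtotal"])
    tax_amount = money(extracted["taxAmount"])

    if abs(line_sum - subtotal) > tolerance:
        issues.append(f"line items sum to {line_sum}, subtotal is {subtotal}")
    if abs(subtotal * money(extracted["taxRate"]) - tax_amount) > tolerance:
        issues.append(f"taxAmount {tax_amount} does not match subtotal * taxRate")
    if abs(subtotal + tax_amount - money(extracted["totalAmount"])) > tolerance:
        issues.append("totalAmount does not equal subtotal + taxAmount")
    return issues


# Values taken from the sample response above
sample = {
    "lineItems": [{"total": 3000.00}, {"total": 150.00}],
    "subtotal": 3150.00,
    "taxRate": 0.08,
    "taxAmount": 252.00,
    "totalAmount": 3402.00,
}
print(check_invoice_totals(sample))
```

An empty list means the three checks passed; any entry is a reason to route the document to manual review.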

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| id | string | Unique document identifier (format: doc_xxxxx) |
| organizationId | string | Organization that owns this document |
| fileName | string | Original uploaded file name |
| fileType | string | MIME type of the uploaded file |
| fileSize | number | File size in bytes |
| documentType | string | Document type used for processing |
| status | string | Processing status: completed or failed |
| pagesProcessed | number | Number of pages processed (affects credit usage) |
| confidence | number | Overall extraction confidence score (0-1) |
| extractedData | object | Structured data extracted from the document |
| metadata | object | Processing metadata including timing and model version |
| createdAt | string | ISO 8601 timestamp when document was uploaded |
| processedAt | string | ISO 8601 timestamp when processing completed |

Confidence Scores

| Score Range | Quality | Recommendation |
|-------------|---------|----------------|
| 0.95 - 1.00 | Excellent | High confidence, minimal review needed |
| 0.85 - 0.94 | Good | Minor fields may need verification |
| 0.70 - 0.84 | Fair | Review important fields |
| Below 0.70 | Low | Manual review recommended |
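
The bands above translate directly into a routing rule. In this sketch, scores that fall between two listed ranges (for example 0.945) are assigned to the lower band, which is an assumption. `pages_needing_review` further assumes that `metadata.pageConfidences` is ordered by page, starting at page 1. Both helper names are hypothetical.

```python
def confidence_band(score):
    """Map a confidence score in [0, 1] to the quality label from the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be between 0 and 1, got {score}")
    if score >= 0.95:
        return "Excellent"
    if score >= 0.85:
        return "Good"
    if score >= 0.70:
        return "Fair"
    return "Low"


def pages_needing_review(page_confidences, threshold=0.85):
    """Return 1-based page numbers whose confidence is below the threshold."""
    return [number for number, score in enumerate(page_confidences, start=1)
            if score < threshold]


print(confidence_band(0.96))
print(pages_needing_review([0.97, 0.95]))
```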

Error Responses

400 Bad Request

error_400.json
{
"success": false,
"error": {
  "code": "INVALID_FILE_TYPE",
  "message": "File type 'application/msword' is not supported. Supported types: PDF, PNG, JPG, JPEG, WEBP, TIFF"
}
}

401 Unauthorized

error_401.json
{
"success": false,
"error": {
  "code": "INVALID_API_KEY",
  "message": "Invalid API key. 20 attempts remaining before IP block."
}
}

402 Payment Required

error_402.json
{
"success": false,
"error": {
  "code": "INSUFFICIENT_CREDITS",
  "message": "Insufficient credits. Document requires 3 pages but only 1 available."
}
}

413 Payload Too Large

error_413.json
{
"success": false,
"error": {
  "code": "FILE_TOO_LARGE",
  "message": "File size 55MB exceeds maximum allowed size of 50MB"
}
}

429 Too Many Requests

error_429.json
{
"success": false,
"error": {
  "code": "RATE_LIMIT_EXCEEDED",
  "message": "Rate limit exceeded. Please retry after 60 seconds."
}
}

Error Codes Reference

| Code | HTTP Status | Description | Solution |
|------|-------------|-------------|----------|
| INVALID_FILE_TYPE | 400 | Unsupported file format | Use PDF, PNG, JPG, WEBP, or TIFF |
| INVALID_DOCUMENT_TYPE | 400 | Unknown document type | Check supported document types |
| FILE_TOO_LARGE | 413 | File exceeds 50MB limit | Compress or split the document |
| INVALID_API_KEY | 401 | API key invalid or expired | Verify API key in dashboard |
| INSUFFICIENT_CREDITS | 402 | Not enough page credits | Purchase more credits |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests | Implement backoff and retry |
| PROCESSING_FAILED | 500 | Internal processing error | Retry or contact support |
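
Because every error shares the same envelope, a single helper can decide whether a failure is worth retrying. The sketch below treats only RATE_LIMIT_EXCEEDED and PROCESSING_FAILED as retryable, following the Solution column above. `describe_error` and the "UNKNOWN" fallback are hypothetical.

```python
# Codes whose documented solution involves retrying
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "PROCESSING_FAILED"}


def describe_error(body):
    """Summarize a parsed JSON error response as code, message, and retryable flag."""
    error = body.get("error") or {}
    code = error.get("code", "UNKNOWN")  # fallback label is an assumption
    return {
        "code": code,
        "message": error.get("message", ""),
        "retryable": code in RETRYABLE_CODES,
    }


rate_limited = {
    "success": False,
    "error": {
        "code": "RATE_LIMIT_EXCEEDED",
        "message": "Rate limit exceeded. Please retry after 60 seconds.",
    },
}
print(describe_error(rate_limited))
```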

Best Practices

Optimize for Accuracy

  1. Use high-resolution images: Minimum 150 DPI for scanned documents
  2. Specify document type: Improves extraction accuracy by 10-15%
  3. Ensure good lighting: Avoid shadows and glare in photos
  4. Keep documents straight: Minimal rotation for best results
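
When a scan carries no resolution metadata, its DPI can be estimated from the pixel dimensions and the physical page size. This is a rough sketch: `estimated_dpi` is a hypothetical helper, and the page size in inches has to be supplied by the caller.

```python
MIN_RECOMMENDED_DPI = 150  # from the recommendation above


def estimated_dpi(pixel_width, pixel_height, page_width_in, page_height_in):
    """Estimate scan resolution as the lower of the horizontal and vertical DPI."""
    return min(pixel_width / page_width_in, pixel_height / page_height_in)


# A US Letter page (8.5 x 11 in) scanned at 1275 x 1650 pixels
dpi = estimated_dpi(1275, 1650, 8.5, 11)
print(dpi, dpi >= MIN_RECOMMENDED_DPI)
```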

Handle Errors Gracefully

error_handling.py
import time

import requests

# Reuses process_document() from process_document.py above.

def process_with_retry(file_path, max_retries=3):
    """Process a document, retrying with exponential backoff on 429 and 5xx responses."""
    for attempt in range(max_retries):
        try:
            return process_document(file_path, 'invoice')
        except requests.HTTPError as e:
            status = e.response.status_code
            if status == 429:
                # Rate limited: back off for 10s, 20s, 40s, ...
                wait_time = 2 ** attempt * 10
            elif status >= 500:
                # Server error: back off for 1s, 2s, 4s, ...
                wait_time = 2 ** attempt
            else:
                # Other client errors will not succeed on retry
                raise
            if attempt == max_retries - 1:
                # Out of attempts: surface the last error instead of sleeping
                raise
            print(f"Request failed with {status}. Waiting {wait_time}s...")
            time.sleep(wait_time)
    raise RuntimeError("max_retries must be at least 1")

Processing Time

Synchronous processing typically completes in 2-10 seconds depending on document size and complexity. For documents over 10 pages, consider using the async endpoint.
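
The page-count guidance can be encoded as a small routing helper. This page leaves it ambiguous whether a document of exactly 10 pages belongs on the sync endpoint, so the sketch assumes it does. `choose_mode` and both timeout values are illustrative, and the helper returns only a label because the async endpoint is not described here.

```python
SYNC_PAGE_LIMIT = 10  # assumption: 10 pages or fewer stays on the sync endpoint

# (connect, read) timeouts in seconds. The read value leaves headroom above the
# typical 2-10 second processing time; both numbers are illustrative.
REQUEST_TIMEOUT = (5, 30)


def choose_mode(page_count):
    """Return 'sync' or 'async' for a document with the given page count."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return "sync" if page_count <= SYNC_PAGE_LIMIT else "async"


print(choose_mode(2))
print(choose_mode(25))
```

With the `requests` library, the tuple can be passed as `timeout=REQUEST_TIMEOUT` so a stalled request does not block an interactive workflow indefinitely.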

Credit Usage

Each page processed consumes 1 page credit from your balance. Multi-page PDFs consume credits equal to the number of pages.
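
Since one page costs one credit, the credit cost of a batch is the sum of its page counts. The sketch below shows that arithmetic and a simple in-order plan that stops at the first document the balance cannot cover. The helper names are hypothetical, and the balance is assumed to be known to the caller, since this page does not describe how to query it.

```python
def credits_required(page_counts):
    """Total page credits needed: one credit per page."""
    return sum(page_counts)


def plan_batch(page_counts, balance):
    """Return (indexes of documents that fit, credits left), processing in order."""
    processed = []
    remaining = balance
    for index, pages in enumerate(page_counts):
        if pages > remaining:
            # Sending this document would fail with INSUFFICIENT_CREDITS
            break
        remaining -= pages
        processed.append(index)
    return processed, remaining


print(credits_required([2, 3, 1]))
print(plan_batch([2, 3, 1], balance=4))
```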