Process Document (Sync)

Upload and process a document synchronously to extract structured data. This endpoint waits for processing to complete before returning the response.

POST /v1/documents/process

Overview

The synchronous processing endpoint is ideal for:

Single document processing
Real-time applications requiring immediate results
Documents under 10 pages
Interactive user workflows

For large documents or batch processing, use the async endpoint instead.
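
The guidance above can be turned into a small routing helper. This is only a sketch: the function name and the "sync"/"async" labels are illustrative and are not values the API accepts, and it assumes a document of exactly 10 pages should go to the async endpoint, since the text only says sync suits documents under 10 pages.

```python
SYNC_MAX_PAGES = 10  # from the overview: sync suits documents under 10 pages


def choose_processing_mode(page_count: int, document_count: int = 1) -> str:
    """Return 'sync' or 'async' (illustrative labels, not API values)."""
    if page_count < 1 or document_count < 1:
        raise ValueError("page_count and document_count must be positive")
    # Batches and large documents go to the async endpoint.
    if document_count > 1 or page_count >= SYNC_MAX_PAGES:
        return "async"
    return "sync"


print(choose_processing_mode(2))       # single short document
print(choose_processing_mode(25))      # large document
print(choose_processing_mode(3, 40))   # batch of documents
```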

Request

Headers

Parameter	Type	Description
`X-API-Key` (required)	`string`	Your DocuRift API key (format: frc_xxxxx)
`Content-Type` (required)	`string`	Must be multipart/form-data for file uploads
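
A minimal sketch of building the request headers, assuming the `frc_` prefix shown above is a reliable client-side sanity check (the helper name is hypothetical). `Content-Type` is left out on purpose: HTTP clients such as `requests` generate the multipart/form-data header, including its boundary, when they are given a file to upload, and setting it by hand usually drops the boundary.

```python
from typing import Dict


def build_headers(api_key: str) -> Dict[str, str]:
    """Return request headers for DocuRift, checking the key format first."""
    if not api_key or not api_key.startswith("frc_"):
        raise ValueError("expected an API key of the form frc_xxxxx")
    # Content-Type is intentionally omitted so the HTTP client can add
    # multipart/form-data with the correct boundary itself.
    return {"X-API-Key": api_key}


print(build_headers("frc_your_api_key_here"))
```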

Body Parameters

Parameter	Type	Description
`file` (required)	`File`	Document file to process. Supported formats: PDF, PNG, JPG, JPEG, WEBP, TIFF. Maximum size: 50MB
`documentType`	`string`	Type of document, used to optimize extraction. Default: `generic`
`language`	`string`	Primary language of the document (ISO 639-1 code). Default: `en`
`extractTables`	`boolean`	Enable table extraction and structuring. Default: `true`
`ocrFallback`	`boolean`	Use OCR for scanned documents or images. Default: `true`

Document Types

| Type | Description | Use Case |
|------|-------------|----------|
| generic | Auto-detect document type | Unknown or mixed documents |
| invoice | Commercial invoices | Accounts payable automation |
| bill_of_lading | Bills of lading (BOL) | Shipping and cargo tracking |
| packing_list | Packing lists | Inventory and fulfillment |
| customs_declaration | Customs forms | Import/export compliance |
| certificate_of_origin | Origin certificates | Trade compliance |
| delivery_order | Delivery orders | Last-mile logistics |
| freight_invoice | Freight/shipping invoices | Logistics billing |
| air_waybill | Air cargo waybills | Air freight tracking |
| sea_waybill | Ocean cargo waybills | Ocean freight tracking |
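
The optional body parameters can be assembled and checked before the upload. The sketch below is an illustration rather than part of any DocuRift SDK: the helper name is hypothetical, and it assumes boolean fields are sent as the lowercase strings "true" and "false", which the reference does not specify for multipart forms.

```python
from typing import Dict

# Document types listed in the table above.
DOCUMENT_TYPES = {
    "generic", "invoice", "bill_of_lading", "packing_list",
    "customs_declaration", "certificate_of_origin", "delivery_order",
    "freight_invoice", "air_waybill", "sea_waybill",
}


def build_form_fields(document_type: str = "generic", language: str = "en",
                      extract_tables: bool = True,
                      ocr_fallback: bool = True) -> Dict[str, str]:
    """Build the non-file multipart fields, using the documented defaults."""
    if document_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown documentType: {document_type!r}")
    if len(language) != 2 or not language.isalpha():
        raise ValueError("language must be a two-letter ISO 639-1 code")
    return {
        "documentType": document_type,
        "language": language.lower(),
        # Assumption: booleans are serialized as lowercase strings.
        "extractTables": "true" if extract_tables else "false",
        "ocrFallback": "true" if ocr_fallback else "false",
    }


print(build_form_fields("invoice", extract_tables=False))
```

The returned dictionary can be passed as the `data` argument of `requests.post` alongside `files`.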

Supported File Formats

| Format | Extension | Max Size | Notes |
|--------|-----------|----------|-------|
| PDF | .pdf | 50MB | Single or multi-page |
| PNG | .png | 50MB | High-resolution recommended |
| JPEG | .jpg, .jpeg | 50MB | Minimum 150 DPI recommended |
| WebP | .webp | 50MB | Modern format supported |
| TIFF | .tiff, .tif | 50MB | Multi-page TIFF supported |
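
Checking the extension and size locally avoids a round trip that would end in INVALID_FILE_TYPE or FILE_TOO_LARGE. This is a sketch under two assumptions: the helper name is hypothetical, and "50MB" is read as 50 × 1024 × 1024 bytes, which the reference does not define precisely. The server may also inspect the file content, not just its name.

```python
import os

# Extensions from the table above.
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".webp", ".tiff", ".tif"}
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: 50MB interpreted as 50 MiB


def validate_upload(path: str) -> int:
    """Return the file size in bytes, or raise ValueError if it would be rejected."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file extension: {extension or '(none)'}")
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("file is empty")
    if size > MAX_FILE_BYTES:
        raise ValueError(f"file is {size} bytes; the limit is {MAX_FILE_BYTES}")
    return size
```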

Code Examples

cURL

curl -X POST https://api.docurift.com/v1/documents/process \
-H "X-API-Key: frc_your_api_key_here" \
-F "file=@invoice.pdf" \
-F "documentType=invoice"

Python

process_document.py

import requests
import os

API_KEY = os.getenv('DOCURIFT_API_KEY')
API_URL = 'https://api.docurift.com/v1'

def process_document(file_path, document_type='generic'):
    """Process a document synchronously."""
    headers = {
        'X-API-Key': API_KEY
    }

    with open(file_path, 'rb') as f:
        files = {'file': f}
        data = {'documentType': document_type}

        response = requests.post(
            f'{API_URL}/documents/process',
            headers=headers,
            files=files,
            data=data,
            timeout=60  # requests has no default timeout; 60s is an illustrative value
        )

    # Raise requests.HTTPError for 4xx/5xx responses
    response.raise_for_status()
    return response.json()

# Example usage
result = process_document('invoice.pdf', 'invoice')

print(f"Document ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")
print(f"Confidence: {result['data']['confidence']}")

# Access extracted data
extracted = result['data']['extractedData']
print(f"Invoice Number: {extracted.get('invoiceNumber')}")
print(f"Total Amount: {extracted.get('totalAmount')}")

JavaScript (Node.js)

processDocument.js

import fs from 'fs';
import FormData from 'form-data';
import fetch from 'node-fetch';

const API_KEY = process.env.DOCURIFT_API_KEY;
const API_URL = 'https://api.docurift.com/v1';

async function processDocument(filePath, documentType = 'generic') {
  const form = new FormData();
  form.append('file', fs.createReadStream(filePath));
  form.append('documentType', documentType);

  const response = await fetch(`${API_URL}/documents/process`, {
    method: 'POST',
    headers: {
      'X-API-Key': API_KEY,
      ...form.getHeaders()
    },
    body: form
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(error.error.message);
  }

  return response.json();
}

// Example usage
const result = await processDocument('invoice.pdf', 'invoice');

console.log('Document ID:', result.data.id);
console.log('Status:', result.data.status);
console.log('Confidence:', result.data.confidence);

// Access extracted data
const extracted = result.data.extractedData;
console.log('Invoice Number:', extracted.invoiceNumber);
console.log('Total Amount:', extracted.totalAmount);

JavaScript (Browser)

browser.js

async function processDocument(file, documentType = 'generic') {
  const formData = new FormData();
  formData.append('file', file);
  formData.append('documentType', documentType);

  const response = await fetch('https://api.docurift.com/v1/documents/process', {
    method: 'POST',
    headers: {
      'X-API-Key': 'frc_your_api_key_here'
    },
    body: formData
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(error.error.message);
  }

  return response.json();
}

// Example with file input
const fileInput = document.getElementById('fileInput');
fileInput.addEventListener('change', async (event) => {
  const file = event.target.files[0];

  try {
    const result = await processDocument(file, 'invoice');
    console.log('Extracted data:', result.data.extractedData);
  } catch (error) {
    console.error('Processing failed:', error.message);
  }
});

Response

Success Response (200 OK)

response.json

{
  "success": true,
  "data": {
    "id": "doc_abc123xyz456",
    "organizationId": "org_xyz789",
    "fileName": "invoice.pdf",
    "fileType": "application/pdf",
    "fileSize": 245678,
    "documentType": "invoice",
    "status": "completed",
    "pagesProcessed": 2,
    "confidence": 0.96,
    "extractedData": {
      "invoiceNumber": "INV-2024-00123",
      "invoiceDate": "2024-01-15",
      "dueDate": "2024-02-15",
      "currency": "USD",
      "vendor": {
        "name": "Acme Shipping Co.",
        "address": "123 Harbor Blvd, Los Angeles, CA 90021",
        "taxId": "12-3456789",
        "email": "billing@acmeshipping.com",
        "phone": "+1-555-123-4567"
      },
      "customer": {
        "name": "Global Imports Inc.",
        "address": "456 Trade St, New York, NY 10001",
        "taxId": "98-7654321"
      },
      "lineItems": [
        {
          "description": "Ocean Freight - Container 20ft",
          "quantity": 2,
          "unitPrice": 1500.00,
          "total": 3000.00,
          "hsCode": "8609.00"
        },
        {
          "description": "Documentation Fee",
          "quantity": 1,
          "unitPrice": 150.00,
          "total": 150.00
        }
      ],
      "subtotal": 3150.00,
      "taxRate": 0.08,
      "taxAmount": 252.00,
      "totalAmount": 3402.00,
      "paymentTerms": "Net 30",
      "notes": "Payment due within 30 days of invoice date"
    },
    "metadata": {
      "processingTimeMs": 2340,
      "modelVersion": "v2.1.0",
      "pageConfidences": [0.97, 0.95]
    },
    "createdAt": "2024-01-26T10:30:00Z",
    "processedAt": "2024-01-26T10:30:02Z"
  }
}

Response Fields

Field	Type	Description
`id`	`string`	Unique document identifier (format: doc_xxxxx)
`organizationId`	`string`	Organization that owns this document
`fileName`	`string`	Original uploaded file name
`fileType`	`string`	MIME type of the uploaded file
`fileSize`	`number`	File size in bytes
`documentType`	`string`	Document type used for processing
`status`	`string`	Processing status: completed, failed
`pagesProcessed`	`number`	Number of pages processed (affects credit usage)
`confidence`	`number`	Overall extraction confidence score (0-1)
`extractedData`	`object`	Structured data extracted from the document
`metadata`	`object`	Processing metadata including timing and model version
`createdAt`	`string`	ISO 8601 timestamp when document was uploaded
`processedAt`	`string`	ISO 8601 timestamp when processing completed
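
Because `extractedData` comes from a model, it can be worth cross-checking the numbers before posting them to an accounting system. The sketch below assumes the invoice field names shown in the sample response (`lineItems`, `subtotal`, `taxAmount`, `totalAmount`); other document types may use different fields, and the helper name and the 0.01 tolerance are illustrative choices.

```python
from decimal import Decimal
from typing import List


def reconcile_invoice(extracted: dict, tolerance: Decimal = Decimal("0.01")) -> List[str]:
    """Return a list of arithmetic discrepancies; an empty list means consistent."""
    def dec(value) -> Decimal:
        # Go through str() so floats like 150.0 convert without binary noise.
        return Decimal(str(value))

    issues = []
    line_sum = Decimal("0")
    for index, item in enumerate(extracted.get("lineItems", [])):
        expected = dec(item["quantity"]) * dec(item["unitPrice"])
        if abs(expected - dec(item["total"])) > tolerance:
            issues.append(f"lineItems[{index}]: quantity x unitPrice != total")
        line_sum += dec(item["total"])

    if "subtotal" in extracted:
        if abs(line_sum - dec(extracted["subtotal"])) > tolerance:
            issues.append("sum of line item totals != subtotal")

    if all(key in extracted for key in ("subtotal", "taxAmount", "totalAmount")):
        computed = dec(extracted["subtotal"]) + dec(extracted["taxAmount"])
        if abs(computed - dec(extracted["totalAmount"])) > tolerance:
            issues.append("subtotal + taxAmount != totalAmount")
    return issues
```

For the sample response above, the line items sum to 3150.00 and 3150.00 + 252.00 equals 3402.00, so the function returns an empty list.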

Confidence Scores

| Score Range | Quality | Recommendation |
|-------------|---------|----------------|
| 0.95 - 1.00 | Excellent | High confidence, minimal review needed |
| 0.85 - 0.94 | Good | Minor fields may need verification |
| 0.70 - 0.84 | Fair | Review important fields |
| Below 0.70 | Low | Manual review recommended |
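
The table above maps naturally onto a routing function for review queues. This sketch assumes the ranges are half-open (for example, 0.945 counts as Good, since the table leaves a gap between 0.94 and 0.95). The function names, the tier labels, and the 0.85 per-page threshold are illustrative.

```python
from typing import List


def confidence_tier(score: float) -> str:
    """Map a confidence score (0-1) to the quality tier from the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if score >= 0.95:
        return "excellent"
    if score >= 0.85:
        return "good"
    if score >= 0.70:
        return "fair"
    return "low"


def pages_needing_review(page_confidences: List[float],
                         threshold: float = 0.85) -> List[int]:
    """Return 1-based page numbers whose confidence is below the threshold."""
    return [page for page, score in enumerate(page_confidences, start=1)
            if score < threshold]


print(confidence_tier(0.96))
print(pages_needing_review([0.97, 0.95]))
```

`pages_needing_review` is meant for the `metadata.pageConfidences` array, so a single weak page in an otherwise confident document still gets flagged.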

Error Responses

400 Bad Request

error_400.json

{
  "success": false,
  "error": {
    "code": "INVALID_FILE_TYPE",
    "message": "File type 'application/msword' is not supported. Supported types: PDF, PNG, JPG, JPEG, WEBP, TIFF"
  }
}

401 Unauthorized

error_401.json

{
  "success": false,
  "error": {
    "code": "INVALID_API_KEY",
    "message": "Invalid API key. 20 attempts remaining before IP block."
  }
}

402 Payment Required

error_402.json

{
  "success": false,
  "error": {
    "code": "INSUFFICIENT_CREDITS",
    "message": "Insufficient credits. Document requires 3 pages but only 1 available."
  }
}

413 Payload Too Large

error_413.json

{
  "success": false,
  "error": {
    "code": "FILE_TOO_LARGE",
    "message": "File size 55MB exceeds maximum allowed size of 50MB"
  }
}

429 Too Many Requests

error_429.json

{
  "success": false,
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded. Please retry after 60 seconds."
  }
}

Error Codes Reference

| Code | HTTP Status | Description | Solution |
|------|-------------|-------------|----------|
| INVALID_FILE_TYPE | 400 | Unsupported file format | Use PDF, PNG, JPG, WEBP, or TIFF |
| INVALID_DOCUMENT_TYPE | 400 | Unknown document type | Check supported document types |
| FILE_TOO_LARGE | 413 | File exceeds 50MB limit | Compress or split the document |
| INVALID_API_KEY | 401 | API key invalid or expired | Verify API key in dashboard |
| INSUFFICIENT_CREDITS | 402 | Not enough page credits | Purchase more credits |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests | Implement backoff and retry |
| PROCESSING_FAILED | 500 | Internal processing error | Retry or contact support |
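
Since every error shares the same `{"success": false, "error": {"code", "message"}}` envelope, it can be parsed into one exception type that also records whether a retry makes sense. This is a sketch, not part of an official client: `DocuRiftError`, `parse_error`, and the `UNKNOWN_ERROR` placeholder are hypothetical names, and treating only RATE_LIMIT_EXCEEDED and PROCESSING_FAILED as retryable is an interpretation of the Solution column above.

```python
import json

# Codes whose suggested solution in the table involves retrying.
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "PROCESSING_FAILED"}


class DocuRiftError(Exception):
    """An error response from the API, with its machine-readable code."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{code} (HTTP {status}): {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_error(status: int, body: str) -> DocuRiftError:
    """Turn an error response body into a DocuRiftError."""
    try:
        error = json.loads(body)["error"]
        return DocuRiftError(status, error["code"], error["message"])
    except (ValueError, KeyError, TypeError):
        # Body was not the documented envelope (for example, a proxy error page).
        return DocuRiftError(status, "UNKNOWN_ERROR", body[:200])
```

A caller would typically raise the returned exception and let retry logic branch on `retryable` instead of matching HTTP status codes by hand.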

Best Practices

Optimize for Accuracy

Use high-resolution images: Minimum 150 DPI for scanned documents
Specify document type: Improves extraction accuracy by 10-15%
Ensure good lighting: Avoid shadows and glare in photos
Keep documents straight: Minimal rotation for best results

Handle Errors Gracefully

error_handling.py

import time

import requests

# Assumes process_document() from process_document.py above is defined or imported.

def process_with_retry(file_path, max_retries=3):
    """Process a document, retrying with exponential backoff on 429 and 5xx."""
    for attempt in range(max_retries):
        try:
            return process_document(file_path, 'invoice')
        except requests.HTTPError as e:
            status = e.response.status_code
            if status == 429:
                # Rate limited: wait 10s, 20s, 40s, ...
                wait_time = 2 ** attempt * 10
            elif status >= 500:
                # Server error: wait 1s, 2s, 4s, ...
                wait_time = 2 ** attempt
            else:
                # Other client errors will not succeed on retry
                raise
            if attempt == max_retries - 1:
                # Out of attempts: surface the last HTTP error
                raise
            print(f"HTTP {status}. Retrying in {wait_time}s...")
            time.sleep(wait_time)
    raise RuntimeError("max_retries must be at least 1")

Processing Time

Synchronous processing typically completes in 2-10 seconds depending on document size and complexity. For documents over 10 pages, consider using the async endpoint.

Credit Usage

Each page processed consumes 1 page credit from your balance. Multi-page PDFs consume credits equal to the number of pages.
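
The one-credit-per-page rule makes it straightforward to estimate usage before submitting a set of documents. The helpers below are a sketch with hypothetical names. They assume page counts are known ahead of time (for example, from a local PDF library), and they do not cover how failed requests are billed, which this page does not state.

```python
from typing import Iterable


def credits_required(page_counts: Iterable[int]) -> int:
    """Total page credits needed: one credit per page across all documents."""
    total = 0
    for pages in page_counts:
        if pages < 1:
            raise ValueError("every document must have at least one page")
        total += pages
    return total


def credit_shortfall(page_counts: Iterable[int], balance: int) -> int:
    """Credits missing for the given documents; 0 if the balance suffices."""
    return max(0, credits_required(page_counts) - balance)


print(credits_required([2, 3, 1]))   # three documents, six pages in total
print(credit_shortfall([3], 1))      # the 402 example above: 3 pages, 1 credit
```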

Process Document (Async) - For large documents and batch processing
Get Document - Retrieve a processed document
List Documents - List all processed documents