Document Types

Supported document types and their extraction schemas

DocuRift supports multiple document types, each with specialized extraction models optimized for that specific format. Our machine learning models are trained on millions of real-world documents, enabling high-accuracy extraction across a wide variety of layouts, formats, and languages.

Overview

Choosing the correct document type is crucial for accurate extraction. Each document type has a specialized model that understands the unique structure and fields of that format. For example, an invoice model knows to look for vendor information, line items, totals, and payment terms, while a bill of lading model focuses on shipper details, cargo descriptions, and port information.

When you specify the correct document type, DocuRift:

Applies the most accurate extraction model for that format
Returns structured data with fields specific to that document type
Provides higher confidence scores through targeted extraction
Reduces errors by understanding the document's context

If you're unsure about the document type, use the generic type, which performs OCR and returns raw text and tables. For best results, specify the exact document type whenever possible.

Supported Document Types

Generic Document

General-purpose OCR for any document

Invoice

Extract billing data from invoices

Bill of Lading

Parse maritime shipping documents

Proof of Delivery

Extract delivery confirmation data

Packing List

Process package manifests

Customs Declaration

Extract customs form data

Freight Invoice

Parse freight billing documents

Delivery Challan

Process delivery challans

Lorry Receipt

Extract transport receipt data

Airway Bill

Parse air cargo documents

Choosing the Right Type

Not sure which document type to use? This guide will help you select the best option based on your document's purpose:

| If Your Document Is... | Use Type | Typical Fields Extracted |
|------------------------|----------|--------------------------|
| Unknown or mixed format | generic | Raw text, tables |
| Commercial billing | invoice | Vendor, line items, totals, dates |
| Ocean shipping | bill_of_lading | Shipper, consignee, cargo, ports |
| Delivery confirmation | proof_of_delivery | Recipient, signature, timestamp |
| Package contents | packing_list | Items, quantities, weights |
| Import/export forms | customs_declaration | HS codes, values, duties |
| Shipping charges | freight_invoice | Charges, routes, carriers |
| Goods transfer | delivery_challan | Sender, receiver, goods |
| Truck transport | lorry_receipt | Vehicle, driver, consignment |
| Air cargo | airway_bill | AWB number, routing, weights |
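
As a rough client-side sketch of the selection rule above, the snippet below normalizes a requested type and falls back to generic when it is not one of the identifiers in the table. The helper name `resolve_document_type` is hypothetical, and how the API itself treats unknown values is not stated here, so this is only a guard you might run before sending a request.

```python
from typing import Optional

# Identifiers taken from the "Use Type" column of the table above.
SUPPORTED_TYPES = frozenset({
    "generic",
    "invoice",
    "bill_of_lading",
    "proof_of_delivery",
    "packing_list",
    "customs_declaration",
    "freight_invoice",
    "delivery_challan",
    "lorry_receipt",
    "airway_bill",
})


def resolve_document_type(requested: Optional[str]) -> str:
    """Return a supported identifier, falling back to "generic".

    Hypothetical helper: it normalizes case, spaces, and hyphens on the
    client side and does not call the DocuRift API.
    """
    if not requested:
        return "generic"
    candidate = requested.strip().lower().replace(" ", "_").replace("-", "_")
    return candidate if candidate in SUPPORTED_TYPES else "generic"


print(resolve_document_type("Bill of Lading"))  # bill_of_lading
print(resolve_document_type("receipt"))         # generic
```

Falling back to generic rather than raising an error mirrors the guidance that generic is the safe choice for unknown or mixed formats. A stricter client could raise instead.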

Credit Costs by Type

Different document types require different levels of processing complexity. Here's how credits are calculated:

| Type | Credits per Page | Why |
|------|-----------------|-----|
| Generic | 1 | Basic OCR processing |
| Invoice | 2 | Structured extraction with line items |
| Bill of Lading | 3 | Complex multi-party documents |
| Customs Declaration | 3 | Regulatory compliance fields |
| All Others | 2 | Standard structured extraction |

Credits are charged per page processed. Multi-page documents are charged based on the total number of pages. For example, a 5-page invoice would cost 10 credits (5 pages × 2 credits).
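
The per-page arithmetic can be sketched in a few lines. The function name `estimate_credits` is hypothetical, and the mapping from the table's display names to type identifiers is assumed from the type selection table earlier on this page. This is a local estimate only, not a call to any billing endpoint.

```python
# Credits per page, copied from the table above. The identifiers are
# assumed to correspond to the display names shown in that table.
CREDITS_PER_PAGE = {
    "generic": 1,
    "invoice": 2,
    "bill_of_lading": 3,
    "customs_declaration": 3,
}
DEFAULT_CREDITS_PER_PAGE = 2  # the "All Others" row


def estimate_credits(document_type: str, pages: int) -> int:
    """Estimate credits as (credits per page) x (number of pages)."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate = CREDITS_PER_PAGE.get(document_type, DEFAULT_CREDITS_PER_PAGE)
    return rate * pages


print(estimate_credits("invoice", 5))         # 10, as in the example above
print(estimate_credits("bill_of_lading", 2))  # 6
```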

Example Request

To process a document, specify the file and the document type in your API request:

curl -X POST https://api.docurift.com/v1/documents/process \
  -H "X-API-Key: your_api_key" \
  -F "file=@document.pdf" \
  -F "documentType=invoice"

The API will return structured JSON with all extracted fields, confidence scores, and metadata. See each document type's page for the specific fields and schema returned.

Accuracy and Confidence

Each extracted field includes a confidence score between 0 and 1. Higher scores indicate greater certainty in the extraction. We recommend:

0.9 and above: High confidence, suitable for automated processing
0.7 to below 0.9: Medium confidence, consider human review
Below 0.7: Low confidence, manual verification recommended
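
One way to apply these thresholds in code is sketched below. The shape of `fields` (a mapping from field name to a dict with `value` and `confidence`) is an assumption made for illustration, not the documented response schema, and the sample values are invented. The sketch places a score of exactly 0.9 in the high tier.

```python
from typing import Dict, List


def review_tier(confidence: float) -> str:
    """Map a confidence score in [0, 1] to a handling tier."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.9:
        return "auto"    # high confidence: automated processing
    if confidence >= 0.7:
        return "review"  # medium confidence: human review
    return "manual"      # low confidence: manual verification


def triage(fields: Dict[str, dict]) -> Dict[str, List[str]]:
    """Group field names by tier. The `fields` shape is hypothetical."""
    tiers: Dict[str, List[str]] = {"auto": [], "review": [], "manual": []}
    for name, field in fields.items():
        tiers[review_tier(field["confidence"])].append(name)
    return tiers


# Invented sample data, for illustration only.
sample = {
    "vendor": {"value": "Acme Logistics", "confidence": 0.97},
    "total": {"value": "1250.00", "confidence": 0.82},
    "due_date": {"value": "2024-03-01", "confidence": 0.55},
}
print(triage(sample))
# {'auto': ['vendor'], 'review': ['total'], 'manual': ['due_date']}
```

Check the actual field names and nesting against the schema on each document type's page before adapting this.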

Need a New Document Type?

If you need support for a document type not listed here, contact us. We're always expanding our supported formats based on customer needs. Our team can work with you to build custom extraction models for your specific document types.