
Invoice

Extract structured data from invoices and bills including vendor details, line items, totals, and payment information.

Overview

The Invoice document type is optimized for extracting structured financial data from invoices, bills, and similar commercial documents. It automatically identifies and extracts vendor and customer information, itemized charges, tax calculations, and payment details.

This document type is ideal for:

  • Accounts payable automation
  • Invoice reconciliation workflows
  • Expense management systems
  • Financial data entry automation
  • Audit trail documentation

Extracted Fields

| Field | Type | Description |
|-------|------|-------------|
| invoiceNumber | string | Unique invoice identifier |
| invoiceDate | string | Date the invoice was issued (ISO 8601) |
| dueDate | string | Payment due date (ISO 8601) |
| vendor | object | Vendor/seller company information |
| customer | object | Customer/buyer company information |
| lineItems | array | Itemized products or services |
| subtotal | number | Sum before tax |
| tax | number | Tax amount |
| total | number | Final amount due |
| currency | string | ISO 4217 currency code |

Example Request

POST /v1/documents/process
curl -X POST 'https://api.docurift.com/v1/documents/process' \
  -H 'X-API-Key: your_api_key' \
  -F 'file=@invoice.pdf' \
  -F 'documentType=invoice'
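
For callers who are not using curl, the same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper name `build_process_request` is hypothetical, the file part is sent as `application/octet-stream` as an assumption, and the function only builds the request so that it can be inspected before anything is sent.

```python
import uuid
import urllib.request

API_URL = "https://api.docurift.com/v1/documents/process"


def build_process_request(file_bytes: bytes, filename: str, api_key: str,
                          document_type: str = "invoice") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST mirroring the curl call."""
    boundary = uuid.uuid4().hex
    # First part: the documentType form field. Second part: the file header.
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="documentType"\r\n\r\n'
        f"{document_type}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"  # assumed content type
    ).encode("utf-8")
    # Closing boundary terminates the multipart body.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=head + file_bytes + tail,
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and read the response body as JSON.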

Example Response

{
  "success": true,
  "data": {
    "id": "doc_inv_9c2d4e6f8a1b3c5d",
    "documentType": "invoice",
    "result": {
      "invoiceNumber": "INV-2024-00847",
      "invoiceDate": "2024-11-15",
      "dueDate": "2024-12-15",
      "vendor": {
        "name": "TechSupply Solutions Inc.",
        "address": "1250 Innovation Drive, Suite 400",
        "city": "San Francisco",
        "state": "CA",
        "postalCode": "94107",
        "country": "USA",
        "taxId": "94-3847291",
        "email": "billing@techsupply.com",
        "phone": "+1 (415) 555-0192"
      },
      "customer": {
        "name": "Acme Corporation",
        "address": "500 Market Street, Floor 12",
        "city": "San Francisco",
        "state": "CA",
        "postalCode": "94105",
        "country": "USA",
        "taxId": "94-2938475",
        "email": "ap@acmecorp.com",
        "phone": "+1 (415) 555-0847"
      },
      "lineItems": [
        {
          "lineNumber": 1,
          "description": "Enterprise Software License - Annual",
          "quantity": 50,
          "unit": "licenses",
          "unitPrice": 299.00,
          "amount": 14950.00,
          "sku": "ESL-2024-ENT"
        },
        {
          "lineNumber": 2,
          "description": "Premium Support Package",
          "quantity": 1,
          "unit": "package",
          "unitPrice": 2500.00,
          "amount": 2500.00,
          "sku": "SUP-PREM-12M"
        },
        {
          "lineNumber": 3,
          "description": "Implementation Services",
          "quantity": 40,
          "unit": "hours",
          "unitPrice": 175.00,
          "amount": 7000.00,
          "sku": "SVC-IMPL"
        }
      ],
      "subtotal": 24450.00,
      "tax": 2017.13,
      "total": 26467.13,
      "currency": "USD"
    },
    "confidence": 0.97,
    "processingTimeMs": 1890
  }
}
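
A response shaped like the one above could be unwrapped along the following lines. This is a sketch under assumptions: the example only shows `success: true`, so treating a false or missing flag as a failure is a guess, `confidence` is assumed to be a score between 0 and 1, and the 0.90 review threshold and the function name `unwrap_invoice` are illustrative choices rather than documented behavior.

```python
import json
from decimal import Decimal


def unwrap_invoice(raw: str, min_confidence: float = 0.90):
    """Return (result, needs_review) from a raw JSON response body."""
    # parse_float=Decimal keeps monetary values exact instead of binary floats.
    payload = json.loads(raw, parse_float=Decimal)
    if not payload.get("success"):
        # Assumption: a false or missing flag means processing failed.
        raise ValueError("processing did not succeed")
    data = payload["data"]
    if data.get("documentType") != "invoice":
        raise ValueError(f"unexpected document type: {data.get('documentType')!r}")
    # Flag low-confidence extractions for manual review (threshold is arbitrary).
    needs_review = float(data.get("confidence", 0)) < min_confidence
    return data["result"], needs_review
```

Parsing amounts as `Decimal` is a design choice that avoids rounding surprises when the totals are compared later.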

Field Definitions

invoiceNumber

The unique identifier assigned to the invoice by the vendor. This may include prefixes, dates, or sequential numbers depending on the vendor's numbering system.

invoiceDate

The date when the invoice was created or issued, formatted as an ISO 8601 date string (YYYY-MM-DD).

dueDate

The date by which payment is expected, formatted as an ISO 8601 date string (YYYY-MM-DD). This field may be null if the invoice does not state a due date.
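
Because both dates arrive as YYYY-MM-DD strings and `dueDate` may be null, client code has to handle the missing case before doing date arithmetic. A minimal sketch follows; the helper names are hypothetical, and treating a missing `invoiceDate` the same way is a defensive assumption rather than documented behavior.

```python
from datetime import date
from typing import Optional


def parse_iso_date(value: Optional[str]) -> Optional[date]:
    """Convert a YYYY-MM-DD string to a date; pass None through unchanged."""
    return date.fromisoformat(value) if value else None


def payment_terms_days(result: dict) -> Optional[int]:
    """Days between invoiceDate and dueDate, or None if either is missing."""
    issued = parse_iso_date(result.get("invoiceDate"))
    due = parse_iso_date(result.get("dueDate"))
    if issued is None or due is None:
        return None
    return (due - issued).days
```

For the example response, the two dates are 30 days apart.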

vendor

An object containing the seller's business information:

| Property | Type | Description |
|----------|------|-------------|
| name | string | Company or business name |
| address | string | Street address |
| city | string | City name |
| state | string | State or province |
| postalCode | string | ZIP or postal code |
| country | string | Country name or code |
| taxId | string | Tax identification number (EIN, VAT, etc.) |
| email | string | Contact email address |
| phone | string | Contact phone number |

customer

An object containing the buyer's business information with the same structure as the vendor object.

lineItems

An array of individual charges on the invoice. Each item contains:

| Property | Type | Description |
|----------|------|-------------|
| lineNumber | number | Sequential line number |
| description | string | Product or service description |
| quantity | number | Number of units |
| unit | string | Unit of measure |
| unitPrice | number | Price per unit |
| amount | number | Total for this line (quantity × unitPrice) |
| sku | string | Product SKU or code (if available) |
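
Since each line's `amount` is described as quantity times `unitPrice`, that relationship can serve as a per-line sanity check on the extraction. The sketch below assumes a one-cent tolerance and a hypothetical helper name; a flagged line is only a candidate for review, because the invoice itself may carry line-level discounts or rounding.

```python
from decimal import Decimal


def mismatched_lines(line_items, tolerance=Decimal("0.01")):
    """Return the lineNumber of every item whose amount != quantity * unitPrice."""
    bad = []
    for item in line_items:
        # str() first, so floats like 299.0 convert to exact decimal values.
        expected = Decimal(str(item["quantity"])) * Decimal(str(item["unitPrice"]))
        actual = Decimal(str(item["amount"]))
        if abs(expected - actual) > tolerance:
            bad.append(item["lineNumber"])
    return bad
```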

subtotal

The sum of all line item amounts before tax is applied.

tax

The total tax amount. This may represent a single tax rate or the combined total of multiple taxes (sales tax, VAT, etc.).

total

The final amount due, equal to subtotal plus tax (and any additional fees if applicable).
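
The relationships between line items, `subtotal`, `tax`, and `total` can be checked together. The following is a minimal sketch with a hypothetical function name; it reports gaps instead of raising, because a non-zero `total_gap` may simply reflect additional fees that appear on the invoice.

```python
from decimal import Decimal


def _d(value) -> Decimal:
    """Convert a JSON number to Decimal via str() to avoid float artifacts."""
    return Decimal(str(value))


def reconcile_totals(result: dict) -> dict:
    """Report how far subtotal and total are from their expected values."""
    line_sum = sum((_d(item["amount"]) for item in result["lineItems"]), Decimal("0"))
    subtotal, tax, total = _d(result["subtotal"]), _d(result["tax"]), _d(result["total"])
    return {
        # Zero when the line item amounts add up to the subtotal.
        "subtotal_gap": subtotal - line_sum,
        # Zero when total = subtotal + tax; positive if extra fees were charged.
        "total_gap": total - (subtotal + tax),
    }
```

With the example response, the line amounts 14950.00, 2500.00, and 7000.00 sum to 24450.00, and 24450.00 plus 2017.13 gives 26467.13, so both gaps are zero.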

currency

The three-letter ISO 4217 currency code (e.g., "USD", "EUR", "GBP").
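
A lightweight check on the extracted code might look like this. It is a sketch under assumptions: the function name and the three status strings are hypothetical, and the set of expected currencies is something the caller supplies; the shape test only confirms three uppercase letters and does not prove the code is a registered ISO 4217 entry.

```python
import re

# Three uppercase ASCII letters: the shape of an ISO 4217 alphabetic code.
_CODE_SHAPE = re.compile(r"[A-Z]{3}")


def check_currency(code, expected):
    """Classify an extracted currency code as 'ok', 'unexpected', or 'malformed'."""
    if code is None or not _CODE_SHAPE.fullmatch(code):
        return "malformed"
    return "ok" if code in expected else "unexpected"
```

For example, `check_currency("USD", {"USD", "EUR"})` returns `"ok"`, while a code outside the caller's set is reported as `"unexpected"` so it can be compared against the source document.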

Best Practices

  1. Clear Document Scans: Ensure all text is legible, especially numbers and decimal points, which are critical for financial accuracy.

  2. Complete Invoices: Process complete invoice documents rather than partial pages to ensure all fields can be correlated correctly.

  3. Currency Consistency: If your invoices use multiple currencies, verify the extracted currency code matches the document.

  4. Line Item Validation: Verify that the sum of the line item amounts equals the extracted subtotal.

  5. Tax Handling: For invoices with multiple tax rates, the tax field represents the total. Use the generic document type if you need individual tax breakdowns.

  6. Address Parsing: The address components are parsed individually. For addresses with non-standard formats, some fields may be combined or require manual verification.

  7. Date Formats: Regardless of the original format (MM/DD/YYYY, DD/MM/YYYY, etc.), dates are normalized to ISO 8601 format in the response.

  8. Vendor Identification: Consider storing extracted vendor information to build a vendor database that can assist with future invoice matching.
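
The vendor database suggested in item 8 could start as a simple in-memory index. The sketch below is one possible approach, not a prescribed design: `vendor_key` and `VendorIndex` are hypothetical names, and preferring `taxId` over the normalized name as the matching key is an assumption that may not suit vendors who omit tax IDs or share one across entities.

```python
import re


def vendor_key(vendor: dict) -> str:
    """Derive a stable lookup key: tax ID if present, else the normalized name."""
    tax_id = re.sub(r"[^0-9A-Za-z]", "", vendor.get("taxId") or "")
    if tax_id:
        return "tax:" + tax_id.upper()
    # Lowercase the name and collapse punctuation/whitespace runs to one space.
    name = re.sub(r"[^a-z0-9]+", " ", (vendor.get("name") or "").lower()).strip()
    return "name:" + name


class VendorIndex:
    """In-memory vendor store keyed by vendor_key()."""

    def __init__(self):
        self._by_key = {}

    def upsert(self, vendor: dict) -> bool:
        """Store the vendor; return True if it was not seen before."""
        key = vendor_key(vendor)
        is_new = key not in self._by_key
        self._by_key[key] = vendor
        return is_new
```

A later invoice from the same vendor then maps to the same key even when the tax ID is formatted differently.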