smart-extract-documentation
v4.1.1

SmartExtract JSON API

The SmartExtract JSON API extracts data from financial documents and returns it as JSON. Each endpoint is available in a production and a staging environment.

For Production environment:

Extract single data set from a document

POST https://api.clik.ai/smart-extract-api/api/account/v1/extraction/document

This endpoint extracts a single data set from a document. It uses HTTP Basic authentication, with the API key and secret key as credentials. You can create an API key from the SmartExtract dashboard.

The authorization token must belong to an API key that has the Data Extraction role.

Headers

| Name | Type | Description |
| --- | --- | --- |
| Authorization* | String | Basic <base64 encoded credentials> |
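As a rough illustration of the Authorization header described above, the following sketch builds the header value in Python. It assumes the conventional HTTP Basic layout, with the API key as the username and the secret key as the password joined by a colon. The page does not spell out that ordering, and `basic_auth_header` is a hypothetical helper name.

```python
import base64


def basic_auth_header(api_key: str, secret_key: str) -> dict:
    """Build an HTTP Basic Authorization header.

    Assumption: the API key is the username and the secret key is the
    password, joined with ":" as in standard HTTP Basic authentication.
    """
    credentials = f"{api_key}:{secret_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Placeholder values; real keys come from the SmartExtract dashboard.
    print(basic_auth_header("my-api-key", "my-secret-key"))
```

The returned dictionary can be passed as the headers of whichever HTTP client you use.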

Request Body

Fields marked * are required.

| Name | Type | Description |
| --- | --- | --- |
| assetType* | String | One of the Valid Asset Types. |
| documentType* | String | One of the Valid Document Types. |
| periodFrom | String | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if documentType is a type of operating statement. |
| sheet | Number | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to the first sheet. |
| to | Number | Page number up to which data should be extracted (valid for PDF files). Defaults to the last page. |
| from | Number | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to the first page. |
| file* | String | File to extract data from, encoded as a DataUrl. |
| fileName* | String | Name of the file. |
| periodTo | String | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if documentType is a type of operating statement. |
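The request body fields above could be assembled along these lines. This is a minimal sketch, assuming the DataUrl is a standard base64 `data:` URL and that optional fields may simply be omitted. The helper names `to_data_url` and `build_single_payload` are hypothetical, and the asset and document type values in the usage example are placeholders, not real values from the valid-type lists.

```python
import base64
import mimetypes


def to_data_url(content: bytes, file_name: str) -> str:
    """Encode raw file bytes as a base64 data URL."""
    mime, _ = mimetypes.guess_type(file_name)
    mime = mime or "application/octet-stream"
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_single_payload(content, file_name, asset_type, document_type,
                         period_from=None, period_to=None,
                         page_from=None, page_to=None, sheet=None) -> dict:
    """Assemble the body for a single data set extraction."""
    payload = {
        "file": to_data_url(content, file_name),
        "fileName": file_name,
        "assetType": asset_type,
        "documentType": document_type,
    }
    optional = {
        "periodFrom": period_from,  # YYYY-MM-DD
        "periodTo": period_to,      # YYYY-MM-DD
        "from": page_from,          # PDF only, 1-based
        "to": page_to,              # PDF only
        "sheet": sheet,             # XLSX only, 1-based
    }
    # Leave out anything not supplied so the documented defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


if __name__ == "__main__":
    # "ASSET_TYPE" and "DOCUMENT_TYPE" are placeholders.
    body = build_single_payload(b"%PDF-1.4 ...", "statement.pdf",
                                "ASSET_TYPE", "DOCUMENT_TYPE",
                                page_from=1, page_to=3)
    print(sorted(body))
```

Omitting unset optional fields, rather than sending nulls, keeps the request aligned with the defaults listed in the table.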

Successful response:

{
    "taggingData": { /* Data passed in request */ },
    "data": { // Data extracted from the document
      "meta": { /* metadata object */ },
      "source": { /* raw text data parsed as a 2D data array */ },
      "extracted": { /* normalized extracted data */ }
    }
}

Error response when the credentials are invalid:

{
    "status": "error",
    "error": "Invalid credentials provided"
}

Error response when extraction fails:

{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
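A caller will usually want to tell the success shape apart from the two error shapes shown above. The sketch below does that by checking for `"status": "error"`; it assumes the real responses are plain JSON without the explanatory comments used in the samples. `ExtractionError` and `parse_single_response` are hypothetical names.

```python
import json


class ExtractionError(Exception):
    """Raised when the API reports an error payload."""


def parse_single_response(body: str) -> dict:
    """Return the normalized extracted data, or raise on an error payload."""
    parsed = json.loads(body)
    if parsed.get("status") == "error":
        detail = parsed.get("error", "unknown error")
        if parsed.get("message"):
            detail = f"{detail} ({parsed['message']})"
        raise ExtractionError(detail)
    # Success shape: {"taggingData": {...}, "data": {"meta", "source", "extracted"}}
    return parsed["data"]["extracted"]
```

The raw `source` array and the `meta` object remain available under `data` if the normalized view is not enough.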

Extract multiple data sets from a document

POST https://api.clik.ai/smart-extract-api/api/account/v1/extraction/document/multiple

This endpoint extracts multiple data sets from a single document. It uses HTTP Basic authentication, with the API key and secret key as credentials. You can create an API key from the SmartExtract dashboard.

The authorization token must belong to an API key that has the Data Extraction role.

Request Body

Fields marked * are required.

| Name | Type | Description |
| --- | --- | --- |
| file* | String | File to extract data from, encoded as a DataUrl. |
| taggingData* | TaggingData[] | An array of tagging data objects, one for each data set to extract. |
| fileName* | String | Name of the file. |
| taggingData[0]* | Object | Tagging data object representing a data set to extract. |
| taggingData[0].periodTo | String | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if documentType is a type of operating statement. |
| taggingData[0].periodFrom | String | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if documentType is a type of operating statement. |
| taggingData[0].sheet | Number | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to the first sheet. |
| taggingData[0].to | Number | Page number up to which data should be extracted (valid for PDF files). Defaults to the last page. |
| taggingData[0].from | Number | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to the first page. |
| taggingData[0].documentType* | String | One of the Valid Document Types. |
| taggingData[0].assetType* | String | One of the Valid Asset Types. |
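For the multiple data set request, each entry in `taggingData` carries its own types, period, and page or sheet range, while `file` and `fileName` are sent once. A minimal sketch of that structure follows; `tagging_entry` and `build_multiple_payload` are hypothetical helpers, and the type strings in the usage example are placeholders.

```python
def tagging_entry(asset_type, document_type, period_from=None, period_to=None,
                  page_from=None, page_to=None, sheet=None) -> dict:
    """Describe one data set to extract from the shared document."""
    entry = {"assetType": asset_type, "documentType": document_type}
    optional = {
        "periodFrom": period_from,
        "periodTo": period_to,
        "from": page_from,
        "to": page_to,
        "sheet": sheet,
    }
    entry.update({k: v for k, v in optional.items() if v is not None})
    return entry


def build_multiple_payload(file_data_url: str, file_name: str, entries) -> dict:
    """Assemble the body for a multiple data set extraction."""
    entries = list(entries)
    if not entries:
        raise ValueError("taggingData needs at least one entry")
    return {"file": file_data_url, "fileName": file_name, "taggingData": entries}


if __name__ == "__main__":
    # Two data sets from different page ranges of the same PDF.
    body = build_multiple_payload(
        "data:application/pdf;base64,...",  # placeholder DataUrl
        "package.pdf",
        [
            tagging_entry("ASSET_TYPE", "DOCUMENT_TYPE_A", page_from=1, page_to=2),
            tagging_entry("ASSET_TYPE", "DOCUMENT_TYPE_B", page_from=3, page_to=5),
        ],
    )
    print(len(body["taggingData"]))
```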

Successful response:

{
    "data": [
      {
        "taggingData": { /* Data passed in request */ },
        "data": { // Data extracted from the document
          "meta": { /* metadata object */ },
          "source": { /* raw text data parsed as a 2D data array */ },
          "extracted": { /* normalized extracted data */ }
        }
      }
      // ... more data sets, one for each tagging data object
    ]
}

Error response when the credentials are invalid:

{
    "status": "error",
    "error": "Invalid credentials provided"
}

Error response when extraction fails:

{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
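Since the multiple data set response wraps one result per tagging data object in a top-level `data` array, a caller can pair each result with the tagging data that produced it. The sketch below assumes the response is plain JSON in the shape shown above; `extracted_by_data_set` is a hypothetical helper.

```python
import json


def extracted_by_data_set(body: str) -> list:
    """Return (taggingData, extracted) pairs from a multiple-extraction response."""
    parsed = json.loads(body)
    if parsed.get("status") == "error":
        raise RuntimeError(parsed.get("error", "unknown error"))
    return [
        (item["taggingData"], item["data"]["extracted"])
        for item in parsed["data"]
    ]
```

Pairing on the echoed `taggingData` avoids relying on the results arriving in request order, which this page does not state.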

For Staging environment:

Extract single data set from a document

POST https://api.clik.ai/smart-extract-stg-api/api/account/v1/extraction/document

This endpoint extracts a single data set from a document. It uses HTTP Basic authentication, with the API key and secret key as credentials. You can create an API key from the SmartExtract dashboard.

The authorization token must belong to an API key that has the Data Extraction role.
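Because the staging and production endpoints differ only in one path segment, client code can select the URL from a single setting. The following sketch builds, but does not send, a POST request with Python's standard library. Sending the body as JSON with a `Content-Type: application/json` header is an assumption, since this page does not state the request encoding. `extraction_url` and `build_request` are hypothetical helpers.

```python
import json
import urllib.request

BASE_URLS = {
    "production": "https://api.clik.ai/smart-extract-api/api/account/v1/extraction",
    "staging": "https://api.clik.ai/smart-extract-stg-api/api/account/v1/extraction",
}


def extraction_url(environment: str, multiple: bool = False) -> str:
    """Return the single or multiple data set endpoint for an environment."""
    if environment not in BASE_URLS:
        raise ValueError(f"unknown environment: {environment!r}")
    suffix = "/document/multiple" if multiple else "/document"
    return BASE_URLS[environment] + suffix


def build_request(environment: str, payload: dict, authorization: str,
                  multiple: bool = False) -> urllib.request.Request:
    """Prepare (but do not send) the POST request."""
    return urllib.request.Request(
        extraction_url(environment, multiple),
        data=json.dumps(payload).encode("utf-8"),  # assumed JSON body
        headers={
            "Authorization": authorization,  # e.g. "Basic <base64 credentials>"
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; testing against staging first keeps trial documents out of production.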

Headers

| Name | Type | Description |
| --- | --- | --- |
| Authorization* | String | Basic <base64 encoded credentials> |

Request Body

Fields marked * are required.

| Name | Type | Description |
| --- | --- | --- |
| assetType* | String | One of the Valid Asset Types. |
| documentType* | String | One of the Valid Document Types. |
| periodFrom | String | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if documentType is a type of operating statement. |
| sheet | Number | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to the first sheet. |
| to | Number | Page number up to which data should be extracted (valid for PDF files). Defaults to the last page. |
| from | Number | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to the first page. |
| file* | String | File to extract data from, encoded as a DataUrl. |
| fileName* | String | Name of the file. |
| periodTo | String | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if documentType is a type of operating statement. |

Successful response:

{
    "taggingData": { /* Data passed in request */ },
    "data": { // Data extracted from the document
      "meta": { /* metadata object */ },
      "source": { /* raw text data parsed as a 2D data array */ },
      "extracted": { /* normalized extracted data */ }
    }
}

Error response when the credentials are invalid:

{
    "status": "error",
    "error": "Invalid credentials provided"
}

Error response when extraction fails:

{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}

Extract multiple data sets from a document

POST https://api.clik.ai/smart-extract-stg-api/api/account/v1/extraction/document/multiple

This endpoint extracts multiple data sets from a single document. It uses HTTP Basic authentication, with the API key and secret key as credentials. You can create an API key from the SmartExtract dashboard.

The authorization token must belong to an API key that has the Data Extraction role.

Request Body

Fields marked * are required.

| Name | Type | Description |
| --- | --- | --- |
| file* | String | File to extract data from, encoded as a DataUrl. |
| taggingData* | TaggingData[] | An array of tagging data objects, one for each data set to extract. |
| fileName* | String | Name of the file. |
| taggingData[0]* | Object | Tagging data object representing a data set to extract. |
| taggingData[0].periodTo | String | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if documentType is a type of operating statement. |
| taggingData[0].periodFrom | String | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if documentType is a type of operating statement. |
| taggingData[0].sheet | Number | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to the first sheet. |
| taggingData[0].to | Number | Page number up to which data should be extracted (valid for PDF files). Defaults to the last page. |
| taggingData[0].from | Number | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to the first page. |
| taggingData[0].documentType* | String | One of the Valid Document Types. |
| taggingData[0].assetType* | String | One of the Valid Asset Types. |

Successful response:

{
    "data": [
      {
        "taggingData": { /* Data passed in request */ },
        "data": { // Data extracted from the document
          "meta": { /* metadata object */ },
          "source": { /* raw text data parsed as a 2D data array */ },
          "extracted": { /* normalized extracted data */ }
        }
      }
      // ... more data sets, one for each tagging data object
    ]
}

Error response when the credentials are invalid:

{
    "status": "error",
    "error": "Invalid credentials provided"
}

Error response when extraction fails:

{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}


Field value references:

assetType: one of the Valid Asset Types.
documentType: one of the Valid Document Types.
file: the file to extract data from, encoded as a DataUrl.