# SmartExtract JSON API

The SmartExtract JSON API extracts data from financial documents and returns it in JSON format.

#### For Production environment:

## Extract single data set from a document

<mark style="color:green;">`POST`</mark> `https://api.clik.ai/smart-extract-api/api/account/v1/extraction/document`

This endpoint extracts a single data set from a document. It uses HTTP Basic authentication, with an API key and secret key as the credentials. You can create an API key from the SmartExtract dashboard.

The credentials must belong to an API key that has the `Data Extraction` role applied to it.

#### Headers

| Name                                            | Type   | Description                          |
| ----------------------------------------------- | ------ | ------------------------------------ |
| Authorization<mark style="color:red;">\*</mark> | String | `Basic <base64-encoded credentials>` |
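
As a minimal sketch of how the `Authorization` value can be built: the standard HTTP Basic scheme base64-encodes `username:password`. Treating the API key as the username and the secret key as the password is an assumption here, and `basic_auth_header` and the sample key values are hypothetical names used only for illustration.

```python
import base64


def basic_auth_header(api_key: str, secret_key: str) -> dict:
    """Build an HTTP Basic Authorization header from an API key pair."""
    # Standard HTTP Basic scheme: base64("username:password").
    # Assumption: API key = username, secret key = password.
    raw = f"{api_key}:{secret_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials; real values come from the SmartExtract dashboard.
headers = basic_auth_header("my-api-key", "my-secret-key")
```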

#### Request Body

| Name                                           | Type   | Description                                                                                                                                      |
| ---------------------------------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| assetType<mark style="color:red;">\*</mark>    | String | One of the [Valid Asset Types](https://clik-ai.gitbook.io/smart-extract-documentation/appendix#asset-type)                                       |
| documentType<mark style="color:red;">\*</mark> | String | One of the [Valid Document Types](https://clik-ai.gitbook.io/smart-extract-documentation/appendix#document-type)                                 |
| periodFrom                                     | String | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if `documentType` is a type of operating statement. |
| sheet                                          | Number | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to first sheet.                                    |
| to                                             | Number | Page number to which data should be extracted (valid for PDF files). Defaults to last page.                                                      |
| from                                           | Number | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to first page.                               |
| file<mark style="color:red;">\*</mark>         | String | File to extract data from encoded as a [DataUrl](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs)                     |
| fileName<mark style="color:red;">\*</mark>     | String | Name of the file                                                                                                                                 |
| periodTo                                       | String | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if `documentType` is a type of operating statement.   |
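
The following sketch shows one way to assemble a request from the fields in the table above. It assumes the body is sent as JSON with a `Content-Type: application/json` header, which this page does not state. The helper names, the sample file bytes, and the `<ASSET_TYPE>` / `<DOCUMENT_TYPE>` placeholders are illustrative only; real values come from the appendix lists.

```python
import base64
import json
import mimetypes
import urllib.request

URL = "https://api.clik.ai/smart-extract-api/api/account/v1/extraction/document"


def to_data_url(content: bytes, file_name: str) -> str:
    """Encode raw file bytes as a base64 data URL."""
    mime, _ = mimetypes.guess_type(file_name)
    mime = mime or "application/octet-stream"  # fallback for unknown extensions
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_request(content, file_name, asset_type, document_type,
                  authorization, optional=None):
    """Assemble (but do not send) a single-data-set extraction request."""
    body = {
        "assetType": asset_type,
        "documentType": document_type,
        "file": to_data_url(content, file_name),
        "fileName": file_name,
    }
    # Optional fields (from, to, sheet, periodFrom, periodTo) are passed as a
    # dict because `from` is a reserved word in Python.
    body.update(optional or {})
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": authorization,
            "Content-Type": "application/json",  # assumption: JSON body
        },
        method="POST",
    )


request = build_request(
    b"%PDF-1.4 example bytes",   # stand-in for real file contents
    "statement.pdf",
    "<ASSET_TYPE>",              # placeholder: a value from Valid Asset Types
    "<DOCUMENT_TYPE>",           # placeholder: a value from Valid Document Types
    "Basic <base64-encoded credentials>",
    optional={"from": 1, "to": 3},
)
```

Passing `request` to `urllib.request.urlopen` would perform the call; that step is left out so the sketch runs without network access.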

{% tabs %}
{% tab title="200: OK " %}

```javascript
{
    "taggingData": { /* Data passed in request */ },
    "data": { // Data extracted from the document
      "meta": { /* meta data object */ },
      "source" : { /* raw text data found parsed as a 2D data array */ },
      "extracted": { /* normalized extracted data */ } 
    }
}
```

{% endtab %}

{% tab title="401: Unauthorized " %}

```javascript
{
    "status": "error",
    "error": "Invalid credentials provided"
}
```

{% endtab %}

{% tab title="500: Internal Server Error " %}

```javascript
{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
```

{% endtab %}
{% endtabs %}
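
A client might handle these responses along the following lines. This is a sketch that assumes every response body is valid JSON shaped like the samples above; `ExtractionError`, `parse_single_response`, and the `rows` key in the sample body are hypothetical.

```python
import json


class ExtractionError(Exception):
    """Raised when the API reports a failed request."""


def parse_single_response(status_code: int, body_text: str) -> dict:
    """Return the normalized extracted data, or raise on an error response."""
    payload = json.loads(body_text)
    if status_code != 200 or payload.get("status") == "error":
        # `message` and `error` mirror the fields in the sample error bodies.
        detail = payload.get("message") or payload.get("error") or "unknown error"
        raise ExtractionError(f"HTTP {status_code}: {detail}")
    return payload["data"]["extracted"]


# Hypothetical success body; the contents of `extracted` are made up.
ok_body = '{"taggingData": {}, "data": {"meta": {}, "source": {}, "extracted": {"rows": []}}}'
extracted = parse_single_response(200, ok_body)
```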

## Extract multiple data sets from a document

<mark style="color:green;">`POST`</mark> `https://api.clik.ai/smart-extract-api/api/account/v1/extraction/document/multiple`

This endpoint extracts multiple data sets from a single document. It uses HTTP Basic authentication, with an API key and secret key as the credentials. You can create an API key from the SmartExtract dashboard.

The credentials must belong to an API key that has the `Data Extraction` role applied to it.

#### Request Body

| Name                                                           | Type           | Description                                                                                                                                      |
| -------------------------------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| file<mark style="color:red;">\*</mark>                         | String         | File to extract data from encoded as a [DataUrl](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs)                     |
| taggingData<mark style="color:red;">\*</mark>                  | TaggingData\[] | An array of tagging data objects representing each data set to extract                                                                           |
| fileName<mark style="color:red;">\*</mark>                     | String         | Name of the file                                                                                                                                 |
| taggingData\[0]<mark style="color:red;">\*</mark>              | Object         | Tagging data object representing a data set to extract                                                                                           |
| taggingData\[0].periodTo                                       | String         | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if `documentType` is a type of operating statement.   |
| taggingData\[0].periodFrom                                     | String         | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if `documentType` is a type of operating statement. |
| taggingData\[0].sheet                                          | Number         | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to first sheet.                                    |
| taggingData\[0].to                                             | Number         | Page number to which data should be extracted (valid for PDF files). Defaults to last page.                                                      |
| taggingData\[0].from                                           | Number         | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to first page.                               |
| taggingData\[0].documentType<mark style="color:red;">\*</mark> | String         | One of the [Valid Document Types](https://clik-ai.gitbook.io/smart-extract-documentation/appendix#document-type)                                 |
| taggingData\[0].assetType<mark style="color:red;">\*</mark>    | String         | One of the [Valid Asset Types](https://clik-ai.gitbook.io/smart-extract-documentation/appendix#asset-type)                                       |
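
The nested `taggingData` structure can be sketched as below: one object per data set, all sharing a single `file`. The helper names, the placeholder type values, and the truncated data URL are illustrative assumptions, not part of the API.

```python
def tagging_entry(asset_type, document_type, page_from=None, page_to=None,
                  sheet=None, period_from=None, period_to=None) -> dict:
    """Build one taggingData object, omitting optional fields left unset."""
    entry = {"assetType": asset_type, "documentType": document_type}
    optional = {
        "from": page_from,
        "to": page_to,
        "sheet": sheet,
        "periodFrom": period_from,
        "periodTo": period_to,
    }
    entry.update({key: value for key, value in optional.items() if value is not None})
    return entry


def build_multiple_body(file_data_url: str, file_name: str, entries: list) -> dict:
    """Build the request body for extracting several data sets from one file."""
    if not entries:
        raise ValueError("taggingData needs at least one entry")
    return {"file": file_data_url, "fileName": file_name, "taggingData": list(entries)}


body = build_multiple_body(
    "data:application/pdf;base64,JVBERi0xLjQ=",  # stand-in data URL
    "package.pdf",
    [
        # Placeholders: use values from the appendix lists.
        tagging_entry("<ASSET_TYPE>", "<DOCUMENT_TYPE>", page_from=1, page_to=2),
        tagging_entry("<ASSET_TYPE>", "<DOCUMENT_TYPE>", page_from=3, page_to=5,
                      period_from="2023-01-01", period_to="2023-12-31"),
    ],
)
```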

{% tabs %}
{% tab title="200: OK " %}

```javascript
{
    "data": [
      {
        "taggingData": { /* Data passed in request */ },
        "data": { // Data extracted from the document
          "meta": { /* meta data object */ },
          "source" : { /* raw text data found parsed as a 2D data array */ },
          "extracted": { /* normalized extracted data */ } 
        }
      }
      // .. more datasets for each tagging data
    ]
}
```

{% endtab %}

{% tab title="401: Unauthorized " %}

```javascript
{
    "status": "error",
    "error": "Invalid credentials provided"
}
```

{% endtab %}

{% tab title="500: Internal Server Error " %}

```javascript
{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
```

{% endtab %}
{% endtabs %}
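
Because each item in the `data` array echoes the `taggingData` it was produced from, a client can pair results with requests without relying on array order. A minimal sketch, with a hypothetical helper name and made-up sample values:

```python
def extracted_by_dataset(payload: dict) -> list:
    """Return (taggingData, extracted) pairs from a multiple-extraction response."""
    pairs = []
    for item in payload["data"]:
        pairs.append((item["taggingData"], item["data"]["extracted"]))
    return pairs


# Hypothetical response following the 200 sample; field contents are made up.
sample = {
    "data": [
        {"taggingData": {"from": 1, "to": 2},
         "data": {"meta": {}, "source": {}, "extracted": {"rows": ["a"]}}},
        {"taggingData": {"from": 3, "to": 5},
         "data": {"meta": {}, "source": {}, "extracted": {"rows": ["b"]}}},
    ]
}
pairs = extracted_by_dataset(sample)
```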

#### For Staging environment:

## Extract single data set from a document

<mark style="color:green;">`POST`</mark> `https://api.clik.ai/smart-extract-stg-api/api/account/v1/extraction/document`

This endpoint extracts a single data set from a document. It uses HTTP Basic authentication, with an API key and secret key as the credentials. You can create an API key from the SmartExtract dashboard.

The credentials must belong to an API key that has the `Data Extraction` role applied to it.
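
The staging URLs on this page differ from the production ones only in the path prefix (`smart-extract-stg-api` instead of `smart-extract-api`), so a client can switch environments with a small lookup. The function and dictionary names below are hypothetical; the URLs are the ones listed on this page.

```python
BASE_URLS = {
    "production": "https://api.clik.ai/smart-extract-api",
    "staging": "https://api.clik.ai/smart-extract-stg-api",
}

PATHS = {
    "single": "/api/account/v1/extraction/document",
    "multiple": "/api/account/v1/extraction/document/multiple",
}


def endpoint_url(environment: str, kind: str) -> str:
    """Return the full endpoint URL for an environment and extraction kind."""
    return BASE_URLS[environment] + PATHS[kind]


staging_single = endpoint_url("staging", "single")
```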

#### Headers

| Name                                            | Type   | Description                          |
| ----------------------------------------------- | ------ | ------------------------------------ |
| Authorization<mark style="color:red;">\*</mark> | String | `Basic <base64-encoded credentials>` |

#### Request Body

| Name                                           | Type   | Description                                                                                                                                      |
| ---------------------------------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| assetType<mark style="color:red;">\*</mark>    | String | One of the [Valid Asset Types](https://clik-ai.gitbook.io/smart-extract-documentation/appendix#asset-type)                                       |
| documentType<mark style="color:red;">\*</mark> | String | One of the [Valid Document Types](https://clik-ai.gitbook.io/smart-extract-documentation/appendix#document-type)                                 |
| periodFrom                                     | String | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if `documentType` is a type of operating statement. |
| sheet                                          | Number | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to first sheet.                                    |
| to                                             | Number | Page number to which data should be extracted (valid for PDF files). Defaults to last page.                                                      |
| from                                           | Number | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to first page.                               |
| file<mark style="color:red;">\*</mark>         | String | File to extract data from encoded as a [DataUrl](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs)                     |
| fileName<mark style="color:red;">\*</mark>     | String | Name of the file                                                                                                                                 |
| periodTo                                       | String | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if `documentType` is a type of operating statement.   |

{% tabs %}
{% tab title="200: OK " %}

```javascript
{
    "taggingData": { /* Data passed in request */ },
    "data": { // Data extracted from the document
      "meta": { /* meta data object */ },
      "source" : { /* raw text data found parsed as a 2D data array */ },
      "extracted": { /* normalized extracted data */ } 
    }
}
```

{% endtab %}

{% tab title="401: Unauthorized " %}

```javascript
{
    "status": "error",
    "error": "Invalid credentials provided"
}
```

{% endtab %}

{% tab title="500: Internal Server Error " %}

```javascript
{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
```

{% endtab %}
{% endtabs %}

## Extract multiple data sets from a document

<mark style="color:green;">`POST`</mark> `https://api.clik.ai/smart-extract-stg-api/api/account/v1/extraction/document/multiple`

This endpoint extracts multiple data sets from a single document. It uses HTTP Basic authentication, with an API key and secret key as the credentials. You can create an API key from the SmartExtract dashboard.

The credentials must belong to an API key that has the `Data Extraction` role applied to it.

#### Request Body

| Name                                                           | Type           | Description                                                                                                                                      |
| -------------------------------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| file<mark style="color:red;">\*</mark>                         | String         | File to extract data from encoded as a [DataUrl](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs)                     |
| taggingData<mark style="color:red;">\*</mark>                  | TaggingData\[] | An array of tagging data objects representing each data set to extract                                                                           |
| fileName<mark style="color:red;">\*</mark>                     | String         | Name of the file                                                                                                                                 |
| taggingData\[0]<mark style="color:red;">\*</mark>              | Object         | Tagging data object representing a data set to extract                                                                                           |
| taggingData\[0].periodTo                                       | String         | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if `documentType` is a type of operating statement.   |
| taggingData\[0].periodFrom                                     | String         | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if `documentType` is a type of operating statement. |
| taggingData\[0].sheet                                          | Number         | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to first sheet.                                    |
| taggingData\[0].to                                             | Number         | Page number to which data should be extracted (valid for PDF files). Defaults to last page.                                                      |
| taggingData\[0].from                                           | Number         | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to first page.                               |
| taggingData\[0].documentType<mark style="color:red;">\*</mark> | String         | One of the [Valid Document Types](https://clik-ai.gitbook.io/smart-extract-documentation/appendix#document-type)                                 |
| taggingData\[0].assetType<mark style="color:red;">\*</mark>    | String         | One of the [Valid Asset Types](https://clik-ai.gitbook.io/smart-extract-documentation/appendix#asset-type)                                       |

{% tabs %}
{% tab title="200: OK " %}

```javascript
{
    "data": [
      {
        "taggingData": { /* Data passed in request */ },
        "data": { // Data extracted from the document
          "meta": { /* meta data object */ },
          "source" : { /* raw text data found parsed as a 2D data array */ },
          "extracted": { /* normalized extracted data */ } 
        }
      }
      // .. more datasets for each tagging data
    ]
}
```

{% endtab %}

{% tab title="401: Unauthorized " %}

```javascript
{
    "status": "error",
    "error": "Invalid credentials provided"
}
```

{% endtab %}

{% tab title="500: Internal Server Error " %}

```javascript
{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
```

{% endtab %}
{% endtabs %}
