# SmartExtract JSON API

The SmartExtract JSON API extracts data from financial documents and returns it in JSON format.

#### For Production environment:

## Extract single data set from a document

<mark style="color:green;">`POST`</mark> `https://api.clik.ai/smart-extract-api/api/account/v1/extraction/document`

This API endpoint allows you to extract a single data set from a document. The endpoint uses basic authentication with the API key and secret key as credentials. You can create an API key from the SmartExtract dashboard.

The authorization token must belong to an API key that has the `Data Extraction` role applied to it.

#### Headers

| Name                                            | Type   | Description                          |
| ----------------------------------------------- | ------ | ------------------------------------ |
| Authorization<mark style="color:red;">\*</mark> | String | Basic <*base64 encoded credentials>* |
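
As a rough illustration of the header above, the following Node.js sketch builds a Basic `Authorization` value. It assumes the credentials are joined as `apiKey:secretKey` before base64 encoding, which is the usual HTTP Basic convention but is not spelled out on this page. The function name `buildAuthHeader` and the sample key values are hypothetical.

```javascript
// Build an HTTP Basic Authorization header value.
// Assumption: credentials are "<apiKey>:<secretKey>" (standard Basic scheme).
function buildAuthHeader(apiKey, secretKey) {
  const credentials = `${apiKey}:${secretKey}`;
  const encoded = Buffer.from(credentials, "utf8").toString("base64");
  return `Basic ${encoded}`;
}

// Placeholder values; real keys come from the SmartExtract dashboard.
const header = buildAuthHeader("my-api-key", "my-secret-key");
console.log(header);
```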

#### Request Body

| Name                                           | Type   | Description                                                                                                                                      |
| ---------------------------------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| assetType<mark style="color:red;">\*</mark>    | String | One of the [Valid Asset Types](/smart-extract-documentation/appendix.md#asset-type)                                                              |
| documentType<mark style="color:red;">\*</mark> | String | One of the [Valid Document Types](/smart-extract-documentation/appendix.md#document-type)                                                        |
| periodFrom                                     | String | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if `documentType` is a type of operating statement. |
| sheet                                          | Number | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to first sheet.                                    |
| to                                             | Number | Page number to which data should be extracted (valid for PDF files). Defaults to last page.                                                      |
| from                                           | Number | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to first page.                               |
| file<mark style="color:red;">\*</mark>         | String | File to extract data from encoded as a [DataUrl](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs)                     |
| fileName<mark style="color:red;">\*</mark>     | String | Name of the file                                                                                                                                 |
| periodTo                                       | String | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if `documentType` is a type of operating statement.   |
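
The fields in the table above could be assembled as in the following sketch. The helper names `toDataUrl` and `buildSingleRequest`, the sample file bytes, and the `<ASSET_TYPE>` / `<DOCUMENT_TYPE>` placeholders are all illustrative and are not part of the API. Real values must come from the appendix lists.

```javascript
// Encode raw file bytes as a data URL: data:<mime>;base64,<payload>
function toDataUrl(bytes, mimeType) {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString("base64")}`;
}

// Assemble the request body for the single data set endpoint.
// Optional fields are only included when the caller provides them.
function buildSingleRequest({ file, fileName, assetType, documentType, ...optional }) {
  const body = { file, fileName, assetType, documentType };
  for (const key of ["periodFrom", "periodTo", "from", "to", "sheet"]) {
    if (optional[key] !== undefined) body[key] = optional[key];
  }
  return body;
}

const body = buildSingleRequest({
  file: toDataUrl(Buffer.from("%PDF-1.4 sample"), "application/pdf"),
  fileName: "statement.pdf",
  assetType: "<ASSET_TYPE>",       // placeholder: use a value from Valid Asset Types
  documentType: "<DOCUMENT_TYPE>", // placeholder: use a value from Valid Document Types
  periodFrom: "2023-01-01",
  periodTo: "2023-12-31",
  from: 1,
  to: 3,
});
console.log(JSON.stringify(body, null, 2));
```

The resulting object would then be serialized and sent in the `POST` request. Sending it as a JSON body is an assumption based on the API's name; this page does not state the expected content type.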

{% tabs %}
{% tab title="200: OK " %}

```javascript
{
    "taggingData": { /* Data passed in request */ },
    "data": { // Data extracted from the document
      "meta": { /* meta data object */ },
      "source" : { /* raw text data found parsed as a 2D data array */ },
      "extracted": { /* normalized extracted data */ } 
    }
}
```

{% endtab %}

{% tab title="401: Unauthorized " %}

```javascript
{
    "status": "error",
    "error": "Invalid credentials provided"
}
```

{% endtab %}

{% tab title="500: Internal Server Error " %}

```javascript
{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
```

{% endtab %}
{% endtabs %}
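
A minimal sketch of how a client might interpret these responses, using only the shapes shown in the tabs above. The function name `readExtraction` is hypothetical, and the contents of `meta`, `source`, and `extracted` are left opaque because this page does not describe them.

```javascript
// Interpret a parsed response from the single data set endpoint,
// based only on the response shapes documented above.
function readExtraction(statusCode, payload) {
  if (statusCode === 200) {
    const { meta, source, extracted } = payload.data;
    return { taggingData: payload.taggingData, meta, source, extracted };
  }
  // 401 and 500 bodies both carry "status" and "error"; "message" is optional.
  const detail = payload.message
    ? `${payload.error} (${payload.message})`
    : payload.error;
  throw new Error(`SmartExtract request failed with HTTP ${statusCode}: ${detail}`);
}

// Example with a stubbed success payload (field contents are placeholders).
const result = readExtraction(200, {
  taggingData: { fileName: "statement.pdf" },
  data: { meta: {}, source: {}, extracted: { rows: [] } },
});
console.log(Object.keys(result));
```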

## Extract multiple data sets from a document

<mark style="color:green;">`POST`</mark> `https://api.clik.ai/smart-extract-api/api/account/v1/extraction/document/multiple`

This API endpoint allows you to extract multiple data sets from a single document. The endpoint uses basic authentication with the API key and secret key as credentials. You can create an API key from the SmartExtract dashboard.

The authorization token must belong to an API key that has the `Data Extraction` role applied to it.

#### Request Body

| Name                                                           | Type           | Description                                                                                                                                      |
| -------------------------------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| file<mark style="color:red;">\*</mark>                         | String         | File to extract data from encoded as a [DataUrl](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs)                     |
| taggingData<mark style="color:red;">\*</mark>                  | TaggingData\[] | An array of tagging data objects representing each data set to extract                                                                           |
| fileName<mark style="color:red;">\*</mark>                     | String         | Name of the file                                                                                                                                 |
| taggingData\[0]<mark style="color:red;">\*</mark>              | Object         | Tagging data object representing a data set to extract                                                                                           |
| taggingData\[0].periodTo                                       | String         | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if `documentType` is a type of operating statement.   |
| taggingData\[0].periodFrom                                     | String         | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if `documentType` is a type of operating statement. |
| taggingData\[0].sheet                                          | Number         | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to first sheet.                                    |
| taggingData\[0].to                                             | Number         | Page number to which data should be extracted (valid for PDF files). Defaults to last page.                                                      |
| taggingData\[0].from                                           | Number         | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to first page.                               |
| taggingData\[0].documentType<mark style="color:red;">\*</mark> | String         | One of the [Valid Document Types](/smart-extract-documentation/appendix.md#document-type)                                                        |
| taggingData\[0].assetType<mark style="color:red;">\*</mark>    | String         | One of the [Valid Asset Types](/smart-extract-documentation/appendix.md#asset-type)                                                              |
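
To illustrate the nested structure in the table above, the sketch below builds a request body with two tagging data objects and checks the fields marked as required. The helper names, the placeholder type values, and the client-side validation are assumptions for illustration; the server's own validation rules are not described here.

```javascript
// Check that each tagging data object carries the fields marked required above.
function validateTaggingData(taggingData) {
  const problems = [];
  taggingData.forEach((item, index) => {
    for (const field of ["assetType", "documentType"]) {
      if (!item[field]) problems.push(`taggingData[${index}].${field} is required`);
    }
  });
  return problems;
}

// Assemble the request body for the multiple data sets endpoint.
function buildMultipleRequest(file, fileName, taggingData) {
  const problems = validateTaggingData(taggingData);
  if (problems.length > 0) throw new Error(problems.join("; "));
  return { file, fileName, taggingData };
}

const request = buildMultipleRequest(
  "data:application/pdf;base64,JVBERi0xLjQ=", // short sample data URL
  "package.pdf",
  [
    // Placeholders: use values from the appendix lists.
    { assetType: "<ASSET_TYPE>", documentType: "<DOCUMENT_TYPE>", from: 1, to: 4 },
    { assetType: "<ASSET_TYPE>", documentType: "<DOCUMENT_TYPE>", from: 5 },
  ]
);
console.log(request.taggingData.length);
```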

{% tabs %}
{% tab title="200: OK " %}

```javascript
{
    "data": [
      {
        "taggingData": { /* Data passed in request */ },
        "data": { // Data extracted from the document
          "meta": { /* meta data object */ },
          "source" : { /* raw text data found parsed as a 2D data array */ },
          "extracted": { /* normalized extracted data */ } 
        }
      }
      // .. more datasets for each tagging data
    ]
}
```

{% endtab %}

{% tab title="401: Unauthorized " %}

```javascript
{
    "status": "error",
    "error": "Invalid credentials provided"
}
```

{% endtab %}

{% tab title="500: Internal Server Error " %}

```javascript
{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
```

{% endtab %}
{% endtabs %}
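
Because the success response wraps one entry per tagging data object in a `data` array, a client might flatten it as sketched below. The function name `listDataSets` is hypothetical. The sketch also assumes that the echoed `taggingData` includes the `documentType` sent in the request, which the response comment suggests but does not guarantee.

```javascript
// Pair each returned data set with the tagging data that produced it.
// Assumption: the echoed taggingData contains the documentType from the request.
function listDataSets(payload) {
  return payload.data.map((entry, index) => ({
    index,
    documentType: entry.taggingData.documentType,
    extracted: entry.data.extracted,
  }));
}

// Stubbed payload following the documented shape (contents are placeholders).
const dataSets = listDataSets({
  data: [
    { taggingData: { documentType: "<TYPE_A>" }, data: { meta: {}, source: {}, extracted: { a: 1 } } },
    { taggingData: { documentType: "<TYPE_B>" }, data: { meta: {}, source: {}, extracted: { b: 2 } } },
  ],
});
console.log(dataSets.map((d) => d.documentType));
```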

#### For Staging environment:

## Extract single data set from a document

<mark style="color:green;">`POST`</mark> `https://api.clik.ai/smart-extract-stg-api/api/account/v1/extraction/document`

This API endpoint allows you to extract a single data set from a document. The endpoint uses basic authentication with the API key and secret key as credentials. You can create an API key from the SmartExtract dashboard.

The authorization token must belong to an API key that has the `Data Extraction` role applied to it.
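
Since the staging and production endpoints differ only in their base path, a client could select the URL with a small helper like the sketch below. The URLs are taken from this page; the names `BASE_URLS` and `extractionUrl` and the `environment` / `multiple` parameters are hypothetical.

```javascript
// Base paths as listed on this page for each environment.
const BASE_URLS = {
  production: "https://api.clik.ai/smart-extract-api",
  staging: "https://api.clik.ai/smart-extract-stg-api",
};

// Return the extraction endpoint for an environment.
// Pass { multiple: true } for the multiple data sets endpoint.
function extractionUrl(environment, { multiple = false } = {}) {
  const base = BASE_URLS[environment];
  if (!base) throw new Error(`Unknown environment: ${environment}`);
  const path = "/api/account/v1/extraction/document";
  return multiple ? `${base}${path}/multiple` : `${base}${path}`;
}

console.log(extractionUrl("staging"));
console.log(extractionUrl("production", { multiple: true }));
```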

#### Headers

| Name                                            | Type   | Description                          |
| ----------------------------------------------- | ------ | ------------------------------------ |
| Authorization<mark style="color:red;">\*</mark> | String | Basic <*base64 encoded credentials>* |

#### Request Body

| Name                                           | Type   | Description                                                                                                                                      |
| ---------------------------------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| assetType<mark style="color:red;">\*</mark>    | String | One of the [Valid Asset Types](/smart-extract-documentation/appendix.md#asset-type)                                                              |
| documentType<mark style="color:red;">\*</mark> | String | One of the [Valid Document Types](/smart-extract-documentation/appendix.md#document-type)                                                        |
| periodFrom                                     | String | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if `documentType` is a type of operating statement. |
| sheet                                          | Number | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to first sheet.                                    |
| to                                             | Number | Page number to which data should be extracted (valid for PDF files). Defaults to last page.                                                      |
| from                                           | Number | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to first page.                               |
| file<mark style="color:red;">\*</mark>         | String | File to extract data from encoded as a [DataUrl](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs)                     |
| fileName<mark style="color:red;">\*</mark>     | String | Name of the file                                                                                                                                 |
| periodTo                                       | String | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if `documentType` is a type of operating statement.   |

{% tabs %}
{% tab title="200: OK " %}

```javascript
{
    "taggingData": { /* Data passed in request */ },
    "data": { // Data extracted from the document
      "meta": { /* meta data object */ },
      "source" : { /* raw text data found parsed as a 2D data array */ },
      "extracted": { /* normalized extracted data */ } 
    }
}
```

{% endtab %}

{% tab title="401: Unauthorized " %}

```javascript
{
    "status": "error",
    "error": "Invalid credentials provided"
}
```

{% endtab %}

{% tab title="500: Internal Server Error " %}

```javascript
{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
```

{% endtab %}
{% endtabs %}

## Extract multiple data sets from a document

<mark style="color:green;">`POST`</mark> `https://api.clik.ai/smart-extract-stg-api/api/account/v1/extraction/document/multiple`

This API endpoint allows you to extract multiple data sets from a single document. The endpoint uses basic authentication with the API key and secret key as credentials. You can create an API key from the SmartExtract dashboard.

The authorization token must belong to an API key that has the `Data Extraction` role applied to it.

#### Request Body

| Name                                                           | Type           | Description                                                                                                                                      |
| -------------------------------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| file<mark style="color:red;">\*</mark>                         | String         | File to extract data from encoded as a [DataUrl](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs)                     |
| taggingData<mark style="color:red;">\*</mark>                  | TaggingData\[] | An array of tagging data objects representing each data set to extract                                                                           |
| fileName<mark style="color:red;">\*</mark>                     | String         | Name of the file                                                                                                                                 |
| taggingData\[0]<mark style="color:red;">\*</mark>              | Object         | Tagging data object representing a data set to extract                                                                                           |
| taggingData\[0].periodTo                                       | String         | Date in YYYY-MM-DD format representing the end period for an operating statement. Required if `documentType` is a type of operating statement.   |
| taggingData\[0].periodFrom                                     | String         | Date in YYYY-MM-DD format representing the start period for an operating statement. Required if `documentType` is a type of operating statement. |
| taggingData\[0].sheet                                          | Number         | Sheet number to extract data from (valid for XLSX files). Sheet index starts from 1. Defaults to first sheet.                                    |
| taggingData\[0].to                                             | Number         | Page number to which data should be extracted (valid for PDF files). Defaults to last page.                                                      |
| taggingData\[0].from                                           | Number         | Page number to start data extraction from (valid for PDF files). Page index starts from 1. Defaults to first page.                               |
| taggingData\[0].documentType<mark style="color:red;">\*</mark> | String         | One of the [Valid Document Types](/smart-extract-documentation/appendix.md#document-type)                                                        |
| taggingData\[0].assetType<mark style="color:red;">\*</mark>    | String         | One of the [Valid Asset Types](/smart-extract-documentation/appendix.md#asset-type)                                                              |

{% tabs %}
{% tab title="200: OK " %}

```javascript
{
    "data": [
      {
        "taggingData": { /* Data passed in request */ },
        "data": { // Data extracted from the document
          "meta": { /* meta data object */ },
          "source" : { /* raw text data found parsed as a 2D data array */ },
          "extracted": { /* normalized extracted data */ } 
        }
      }
      // .. more datasets for each tagging data
    ]
}
```

{% endtab %}

{% tab title="401: Unauthorized " %}

```javascript
{
    "status": "error",
    "error": "Invalid credentials provided"
}
```

{% endtab %}

{% tab title="500: Internal Server Error " %}

```javascript
{
    "status": "error",
    "error": "Error: Data Extraction Failed",
    "message": "Oops! something went wrong."
}
```

{% endtab %}
{% endtabs %}

