SmartExtractSimple API

The SmartExtractSimple API provides a much simpler and cleaner interface than the SmartExtract API, and in most cases it is all you need. It is a wrapper on top of SmartExtract that hides the event-handling API behind a simpler promise-based API.

Usage

Creating an Instance

To start a SmartExtract session, either to extract a document or to edit previously extracted data, first create a SmartExtractSimple instance.

Production:

import { SmartExtractSimple } from '@clik-ai/smart-extract';

const smex = new SmartExtractSimple({
  baseUrl: 'https://app.clik.ai/smart-extract',
});

Staging:

import { SmartExtractSimple } from '@clik-ai/smart-extract';

const smex = new SmartExtractSimple({
  baseUrl: 'https://app.clik.ai/smart-extract-stg',
});

| Parameter | Required | Description |
| --- | --- | --- |
| baseUrl | No | Points to the SmartExtract service base URL. You normally do not need to provide this value; it defaults to the correct URL. |

Extracting Document Data

You can start a document data extraction session by calling the promise-based extractDocumentData API. The promise it returns resolves with the extracted data, or with null if the user clicked the Cancel button.

const data = await smex.extractDocumentData({
  session: {
    mountNode: $('#smexWrapper')[0],
    sessionAuthToken: token,
    closeOnComplete: false,
  },
  file: fileDataUrl,
  fileName: file.name,
  options: {
    disableRetry: true,
    shouldPreProcessData: false,
    preProcessFunction: someFunctionToPreProcessExtractedData,
    
    // ...
  }
});

// Note:
// -----
//
// If user clicked 'Cancel'
// data = null 

// If user clicked 'Save'
// data = {
//   meta: {
//     assetType: '<The asset type of the document>',
//     documentType: '<The document type>',
//     osPeriod: [new Date('<start-date>'), new Date('<end-date>')],
//     fileName: '<The document file name>',
//     ...
//   },
//   workbookData: {...},
//   documentData: {
//     // plain text data as detected in the document
//     source: {
//       rows: [
//         // First value is S.N. on each row signifying a unique row-id for each row in the document
//         ['S.N.', '', '', '', ....], // Each row is an array of all text-tokens detected in the row
//         [1010, ..., ..., ..., ...],
//         // ... rest of the data rows
//       ],
//     },
//     // extracted document data as a list of objects
//     extracted: {
//       rows: [
//         { '<column-name-1>': '<column-value>', '<column-name-2>': '<column-value>', /* ... ,*/}
//         // ... rest of the data rows
//       ]
//     }
//   },
// }
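
The note above describes two possible outcomes: null when the user cancels, or an object with meta and documentData when the user saves. As a minimal sketch of consuming that value, the helper below reports whether anything was saved and which columns appear in documentData.extracted.rows. The name summarizeExtraction and the shape of its return value are hypothetical and not part of the SmartExtract package.

```javascript
// Hypothetical helper (not part of @clik-ai/smart-extract): summarise the
// value that extractDocumentData resolves with.
function summarizeExtraction(data) {
  // The promise resolves with null when the user clicked 'Cancel'.
  if (data === null) {
    return { saved: false, rowCount: 0, columns: [] };
  }
  // Optional chaining guards against a branch missing from the result.
  const rows = data.documentData?.extracted?.rows ?? [];
  // Collect column names across all rows, since rows may not share every key.
  const columns = [...new Set(rows.flatMap((row) => Object.keys(row)))];
  return {
    saved: true,
    documentType: data.meta?.documentType,
    rowCount: rows.length,
    columns,
  };
}
```

You could call summarizeExtraction(data) right after awaiting extractDocumentData to decide whether there is anything to persist.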

| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| session | Yes | object | Configuration options for the SmartExtract session. |
| session.mountNode | Yes | DOM Element | The DOM element where the SmartExtract iframe will be mounted. |
| session.sessionAuthToken | Yes | string | The session auth token obtained from the authentication API. |
| session.closeOnComplete | No | boolean | If true, the SmartExtract session ends and the iframe element is removed from the DOM when the user clicks the Save or Cancel button. Defaults to false. |
| file | Yes | string | The PDF or XLSX file encoded as a Data URL string. |
| fileName | Yes | string | The file name. Make sure the file name has the correct extension, or your users may not see a proper file preview. |
| options | No | object | Options to configure data extraction. See the Styling and Customisation section for details on how to style and customise the extraction form and spreadsheet view. |
| options.disableRetry | No | boolean | If true, the returned promise is rejected when data extraction fails. If false, SmartExtract shows a retry option so that the user can retry the extraction. |
| options.shouldPreProcessData | No | boolean | If true, the function provided via the preProcessFunction option is called with the extracted data as its parameter. This data can be mutated and then sent back to SmartExtract to be rendered in the spreadsheet. If false, preProcessFunction is ignored, i.e. SmartExtract will not send the data back for pre-processing. |
| options.preProcessFunction | No | Function | SmartExtract passes the extracted data to this function, which can mutate the data before it is loaded into the spreadsheet. The function should return a promise that resolves to the updated data. |
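
Two of the arguments above usually need some preparation: file must be a Data URL string, and options.preProcessFunction must return a promise that resolves to the updated data. The sketch below shows one possible way to prepare both. The names blobToDataUrl and trimStringsDeep are hypothetical helpers, not part of the SmartExtract package, and trimming whitespace is only an illustrative transformation. Because the exact shape of the data passed to preProcessFunction is not documented here, the transformation walks whatever structure it receives instead of assuming specific keys. In a browser, FileReader.readAsDataURL is a common alternative to the manual encoding shown.

```javascript
// Hypothetical helper: encode a Blob or File as a Data URL string,
// suitable for the `file` argument.
async function blobToDataUrl(blob) {
  const bytes = new Uint8Array(await blob.arrayBuffer());
  let binary = '';
  const chunkSize = 0x8000; // convert in chunks to keep argument lists small
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  const mimeType = blob.type || 'application/octet-stream';
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// Hypothetical transformation: trim every string found in nested arrays
// and plain objects, without assuming a particular data shape.
function trimStringsDeep(value) {
  if (typeof value === 'string') return value.trim();
  if (Array.isArray(value)) return value.map((item) => trimStringsDeep(item));
  const isPlainObject = value !== null
    && typeof value === 'object'
    && Object.getPrototypeOf(value) === Object.prototype;
  if (isPlainObject) {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [key, trimStringsDeep(inner)]),
    );
  }
  return value; // numbers, booleans, null, Dates, etc. are left untouched
}

// Declared async, so it returns a promise that resolves to the updated data.
async function preProcessFunction(extractedData) {
  return trimStringsDeep(extractedData);
}
```

With these in place, you would pass file: await blobToDataUrl(file) and options: { shouldPreProcessData: true, preProcessFunction } to extractDocumentData.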

Editing Previously Extracted Data

The SmartExtract component does more than extract data from documents: your users can also continue editing the data in the SmartExtract widget from where they left off. The editDocumentData API lets you pass in previously saved data and continue editing it in the SmartExtract component.

In edit mode, the user is taken directly to the spreadsheet screen, where they can edit the data. Clicking 'Save' or 'Cancel' triggers the 'data' or 'cancel' event, respectively.

const data = await smex.editDocumentData({
  session: {
    mountNode: $('#smexWrapper')[0],
    sessionAuthToken: token,
    closeOnComplete: false,
  },
  data: {
    workbookData: {...},
    meta: {...},
  }
});

| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| session | Yes | object | Configuration options for the SmartExtract session. |
| session.mountNode | Yes | DOM Element | The DOM element where the SmartExtract iframe will be mounted. |
| session.sessionAuthToken | Yes | string | The session auth token obtained from the authentication API. |
| session.closeOnComplete | No | boolean | If true, the SmartExtract session ends and the iframe element is removed from the DOM when the user clicks the Save or Cancel button. Defaults to false. |
| data | Yes | object | The data object returned by the extractDocumentData API. |
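
If the saved data is stored between sessions, for example as JSON, it has to be restored before it is passed to editDocumentData. The sketch below shows one possible round trip under several assumptions: only workbookData and meta are kept (mirroring the sample call above), workbookData is JSON-serialisable, and meta.osPeriod is the only field holding Date objects. The names serializeSavedData and restoreSavedData are hypothetical and not part of the SmartExtract package.

```javascript
// Hypothetical helper: serialise the saved result for storage.
// JSON.stringify converts Date objects to ISO 8601 strings.
function serializeSavedData(saved) {
  return JSON.stringify({ workbookData: saved.workbookData, meta: saved.meta });
}

// Hypothetical helper: parse the stored JSON and turn meta.osPeriod
// (assumed to be the only Date-valued field) back into Date objects.
function restoreSavedData(json) {
  const restored = JSON.parse(json);
  if (Array.isArray(restored.meta?.osPeriod)) {
    restored.meta.osPeriod = restored.meta.osPeriod.map((iso) => new Date(iso));
  }
  return restored;
}
```

The restored object could then be passed as the data argument of editDocumentData. Whether the widget requires Date objects rather than ISO strings in osPeriod is not stated here, so treat the conversion as a precaution.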

Ending SmartExtract Session

If you passed the closeOnComplete flag as false, the SmartExtract session remains open when the user clicks Save or Cancel. It is then your responsibility to call the endSession API to end the session. Calling this API removes the iframe from the DOM and cleans up any event listeners that were added.

smex.endSession()

It's important to end the SmartExtract session when you are done with extraction. Ending the session clears any attached event handlers. Setting the closeOnComplete flag to true takes care of this in most cases, i.e. when the extraction flow completes with a success, cancel, or error event. However, if you end the session externally, e.g. by hiding a popup dialog to which the SmartExtract iframe was attached, you must call the endSession API to ensure that everything is cleaned up correctly.
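
When closeOnComplete is false, one way to make sure endSession is always called is to wrap the session in try/finally. The sketch below is a minimal illustration, not part of the SmartExtract package: runSession and the status values are hypothetical names. It relies on the behaviour described earlier for extractDocumentData (null on cancel, rejection on failure when disableRetry is true) and assumes editDocumentData behaves the same way.

```javascript
// Hypothetical wrapper: run one SmartExtract call and always end the session.
// `start` receives the SmartExtractSimple instance and returns the promise
// from extractDocumentData or editDocumentData.
async function runSession(smex, start) {
  try {
    const data = await start(smex);
    // null means the user clicked 'Cancel'.
    return data === null ? { status: 'cancelled' } : { status: 'saved', data };
  } catch (error) {
    // Reached when the promise is rejected, e.g. a failed extraction
    // with disableRetry set to true.
    return { status: 'failed', error };
  } finally {
    // Removes the iframe and cleans up event listeners in every case.
    smex.endSession();
  }
}
```

A call might look like runSession(smex, (s) => s.extractDocumentData(request)), where request is the argument object shown in the extraction example.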

