Once the document is uploaded and tagging is in progress, users can apply parsing to retrieve the data in the document. For better parsing quality, users should follow these best practices:

1. Original document format

The Parser uses the original document's formatting as a pattern for processing the document data. Results are better when the original document has structured content. Pay special attention to information hierarchy (h1, h2, etc.), lists, bold vs. light formatting, table structures, etc.

2. Tagging Hierarchy

When creating a template or tagging a document for the first time, it’s important to think about the data hierarchy.

When tagging the document, users should always validate the data in the Hierarchy and Association table on the left side of the screen.

3. Tagging objects

It’s recommended to tag the entirety of the text as objects. Partially tagged objects might result in errors when the data is exported.

Example of how to tag the text:

Original text

In this example, the sequence numbers were tagged separately, just before the object names. It’s important to tag the whole object description while avoiding unnecessary spaces.

The following image shows incorrect tagging that can result in data errors. In this example, sequence numbers are merged with object names, and unnecessary spacing is tagged at the end of the paragraph.
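The difference between correct and incorrect tag boundaries can also be illustrated in code. The sketch below is only an illustration, not part of the Parser: the function name `split_tag_spans` and the sample text are hypothetical, and it assumes sequence numbers look like `1`, `1.2`, or `3.`. It shows the intended outcome, where the sequence number and the object text are kept separate and no leading or trailing spaces are included.

```python
import re

def split_tag_spans(line: str):
    """Return (sequence_number, object_text) with surrounding whitespace removed.

    Hypothetical helper for illustration only; it mirrors the recommended
    tagging: sequence number tagged separately, whole object text tagged,
    no unnecessary spaces.
    """
    match = re.match(r"\s*(\d+(?:\.\d+)*\.?)\s+(.*\S)\s*$", line)
    if match is None:
        # No leading sequence number: the whole trimmed line is the object.
        return None, line.strip()
    return match.group(1), match.group(2)

print(split_tag_spans("1.2 Install the bracket   "))
# ('1.2', 'Install the bracket')
```

An incorrect tagging would correspond to keeping the raw string `"1.2 Install the bracket   "` as a single object, with the sequence number merged in and the trailing spaces included.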

Example of how to correctly tag cells in a table for parsing:

Users tag only the first row of cells.

After clicking Parse, all tables with the same structure are tagged according to what was indicated in the first row.
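As a conceptual sketch of this behavior (not the Parser's actual implementation), the idea of propagating first-row tags can be expressed as follows. The function name `apply_first_row_tags`, the tag names, and the sample data are hypothetical, and "same structure" is assumed here to mean "same number of columns".

```python
def apply_first_row_tags(first_row_tags, tables):
    """Tag every row of each table whose column count matches the tagged first row.

    Hypothetical illustration: `first_row_tags` holds one tag name per column,
    and `tables` is a list of tables, each a list of rows of cell strings.
    """
    tagged = []
    for table in tables:
        if not table or len(table[0]) != len(first_row_tags):
            continue  # Different structure: left untagged in this sketch.
        for row in table:
            # Pair each cell with the tag of its column, trimming stray spaces.
            tagged.append(dict(zip(first_row_tags, (cell.strip() for cell in row))))
    return tagged

tables = [
    [["A1", "Bolt "], ["A2", "Nut"]],  # two columns: matches the tagged row
    [["only one column"]],             # different structure: skipped
]
result = apply_first_row_tags(["part_id", "part_name"], tables)
print(result)
```

In this sketch, only the two-column table is tagged, because it is the only one that matches the structure of the tagged first row.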
