Once the document is uploaded and tagging is in progress, users can run the parser to retrieve the data in the document. For better parsing quality, users should follow these best practices:
1. Original document format
The Parser uses the original document's formatting as a pattern for processing the document data. Results are better when the original document has structured content. Pay special attention to: information hierarchy (h1, h2, etc.), lists, bold vs. light formatting, table structures, etc.
2. Tagging Hierarchy
When creating a template or tagging a document for the first time, it’s important to think about the data hierarchy.
When tagging the document, users should always validate the data in the Hierarchy and Association table on the left of the screen.
3. Tagging objects
It’s recommended to tag the entire text as objects. Partially tagged objects might result in errors when the data is exported.
Example of how to tag the text:
Original text
In this example, the sequence numbers were tagged separately, just before the object names. It’s important to tag the whole object description while avoiding unnecessary spaces.
The following image shows incorrect tagging that can result in data errors. In this example, sequence numbers are merged with object names and unnecessary spacing is tagged at the end of the paragraph.
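The two mistakes described above can be illustrated with a small check on the tagged text. This is only a sketch on plain strings, not part of the Parser: the function name `check_tag` and the pattern used to recognise a sequence number are assumptions made for illustration.

```python
import re

# Hypothetical pattern for a leading sequence number such as "1.", "2.3" or "4)"
# followed by more text. Real documents may number items differently.
SEQUENCE_PREFIX = re.compile(r"^\d+(?:\.\d+)*[.)]?\s+")


def check_tag(tagged_text: str) -> list[str]:
    """Return a list of problems found in one tagged span (empty if none)."""
    problems = []
    # Unnecessary spacing tagged at the start or end of the span.
    if tagged_text != tagged_text.strip():
        problems.append("leading or trailing whitespace")
    # A sequence number tagged together with the object name.
    if SEQUENCE_PREFIX.match(tagged_text.strip()):
        problems.append("sequence number merged with object name")
    return problems


print(check_tag("1.2 Payment terms "))  # both problems reported
print(check_tag("Payment terms"))       # no problems
```

The check is a heuristic: an object name that legitimately starts with a number followed by a space would also be flagged, so the result is a prompt for review rather than a verdict.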
Example of how to correctly tag cells in a table for parsing:
Users only need to tag the first row of cells.
After clicking Parse, all tables with the same structure are tagged according to what was indicated in the first row.
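The idea of tagging the first row once and applying it to every matching table can be sketched as follows. This is not the Parser's implementation: the function `propagate_row_tags` is hypothetical, and "same structure" is assumed here to mean that every row has the same number of columns as the tagged first row.

```python
def propagate_row_tags(first_row_tags, tables):
    """Apply the column tags chosen on the first row to every matching table.

    first_row_tags: tag name for each column, in order.
    tables: list of tables, each a list of rows, each row a list of cell strings.
    Tables whose rows do not match the tagged column count are skipped
    (an assumption about what "same structure" means).
    """
    tagged_tables = []
    for table in tables:
        if not table or any(len(row) != len(first_row_tags) for row in table):
            continue  # different structure: left untagged
        tagged_tables.append([dict(zip(first_row_tags, row)) for row in table])
    return tagged_tables


tags = ["sequence", "name", "amount"]
tables = [
    [["1", "Rent", "100"], ["2", "Power", "40"]],  # matches the tagged row
    [["a", "b"]],                                  # different column count
]
result = propagate_row_tags(tags, tables)
print(result)
```

In this sketch, the second table is left untagged because its rows have two columns instead of three, which mirrors why consistent table structure in the original document matters.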