When we work with invoices or other documents, we often need to extract a table from them. This is most common with invoices, because an invoice contains an item table whose rows we usually call line items. The line items show each item and the amount for which the invoice was generated.
In this post, I'll share how we can extract line items from a document using character string elements inside a repeating group, much like we would extract data with the Table element. To extract line items, first find the header labels. The header section sits at the top of the table and contains static text such as serial number, description, quantity, unit price, and amount. We can treat this section as the header and use these static texts as reference points for our table. To define this region, use a static text element and search for any of the header names. See the image of the table in the invoice document below.
Once we have the header static text, we look for the footer item. Why do we need a footer element to extract the table? The footer lets us properly bound the region from which we extract the table data. If we don't define a footer element, the lower edge of the table is hard to define and we can end up extracting the wrong data. For the footer element, we can use any static field placed immediately after the table data, such as Subtotal, Grand Total, Total Amount Due, or GST Amount. To capture the footer, we can again use a static text element and search for the footer text.
Once the header and footer regions are defined, we can start capturing the data from the table.
Now add a repeating group element and give it any name you like. In my case, I am naming it LineItem.
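To make the idea of bounding the table between a header and a footer concrete, here is a minimal sketch in plain Python. It is not how the layout tool works internally; it only illustrates the same logic on a list of text lines. The label lists, the function names `find_first` and `table_region`, and the sample invoice are all assumptions made up for this illustration.

```python
from typing import List, Optional, Sequence

# Hypothetical label lists; adjust them to the static text in your own documents.
HEADER_LABELS = ("description", "quantity", "unit price", "amount")
FOOTER_LABELS = ("subtotal", "grand total", "total amount due", "gst")


def find_first(lines: Sequence[str], labels: Sequence[str], start: int = 0) -> Optional[int]:
    """Return the index of the first line at or after `start` containing any label."""
    for i in range(start, len(lines)):
        text = lines[i].lower()
        if any(label in text for label in labels):
            return i
    return None


def table_region(lines: Sequence[str]) -> List[str]:
    """Return the lines strictly between the header line and the footer line."""
    header = find_first(lines, HEADER_LABELS)
    if header is None:
        return []
    # The footer is searched only below the header, like a static text
    # element that is constrained to sit under the table.
    footer = find_first(lines, FOOTER_LABELS, start=header + 1)
    if footer is None:
        return []  # without a footer the lower edge of the table is unknown
    return list(lines[header + 1:footer])


# Made-up sample text standing in for a recognized invoice page.
sample = [
    "Invoice No. 1001",
    "S.No  Description  Quantity  Unit Price  Amount",
    "1  Widget A  2  10.00  20.00",
    "2  Widget B  1  1,250.50  1,250.50",
    "Subtotal  1,270.50",
    "Thank you",
]

for row in table_region(sample):
    print(row)
```

Note that the sketch returns nothing when the footer is missing. That mirrors the point above: without a footer, anything below the header could be mistaken for table data.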
After adding the repeating group, add a character string element inside it. Name the element after the column you want to extract. I want to extract only two columns, description and amount, so I'll name the first character string element Description. In the character string tab, I am going to leave the regular expression and alphabet options blank: the description column holds alphanumeric text, so the defaults are fine. Now go to the relations tab and set the relations your document requires. In my case, you can see the relations I set to extract the description column.
Next, add one more character string element to the repeating group for the amount column and repeat the same steps we performed for the description column. The only difference is in the character string tab, where we add digits, the comma, and the dot to the character set, as you can see in the image below.
Then set the relations for the amount column in the relations tab. You can see how I have set them in the image below.
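As a rough illustration of what the two character string elements do on each repeating instance, here is a small Python sketch. It is an analogy, not the tool's own mechanism. It assumes the columns in each row are separated by two or more spaces, that the description is the second column, and that the amount is the last one. The names `parse_line_items`, `AMOUNT_CHARS`, and `COLUMN_GAP` are made up for this example.

```python
import re
from typing import Dict, List

# Amount character set from the post: digits, comma, and dot only.
AMOUNT_CHARS = re.compile(r"[0-9.,]+")
# Assumption for this sketch: columns are separated by two or more spaces.
COLUMN_GAP = re.compile(r"\s{2,}")


def parse_line_items(rows: List[str],
                     description_col: int = 1,
                     amount_col: int = -1) -> List[Dict[str, str]]:
    """Build one {'description', 'amount'} record per row that looks like a line item."""
    items = []
    for row in rows:
        columns = COLUMN_GAP.split(row.strip())
        if len(columns) <= description_col:
            continue  # not enough columns to be a table row
        amount = columns[amount_col]
        # Reject the row if the amount holds characters outside the set.
        if not AMOUNT_CHARS.fullmatch(amount):
            continue
        items.append({"description": columns[description_col], "amount": amount})
    return items


# Made-up rows; the last one is not a line item and is skipped.
rows = [
    "1  Widget A  2  10.00  20.00",
    "2  Widget B  1  1,250.50  1,250.50",
    "Notes: handle with care",
]

for item in parse_line_items(rows):
    print(item["description"], item["amount"])
```

Restricting the amount to a small character set plays the same role here as in the character string tab: a row whose last column contains letters is not treated as a line item.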
After adding the two character string elements to the repeating group, match the document and you'll see the table items captured in the repeating group. Once the repeating group is capturing the table successfully, you can assign the elements to blocks.
To assign the elements, first add a repeating group block to the blocks. In this block, pass the repeating group element with all instances selected in the repeating instance option. Once the repeating group block is added and linked to the repeating group element, you can add the description and amount blocks inside it.
For more details on how I did it, you can watch the video below.