AI+ Tool and Solution Catalogue

Solution Description

The Generic Rule-based Entity Extraction Network Platform (GREEN) is an innovative platform developed to address the challenges of extracting information from forms and documents. Traditional methods rely on aligning images with predefined templates, but variations between the image and template can lead to incorrect cropping and recognition errors. Additionally, predefined fields limit recognition accuracy when characters are written outside their bounds.

GREEN has two objectives: to extract information from documents flexibly, even when no template is available, and to remain extensible across different document information extraction scenarios. The platform first recognizes all characters in the image and groups them into words. Language model-based contextual understanding is then applied to classify the words and determine whether they form sentences. Each grouped text block is assigned a 2D position and a type, such as "name", "date", "number", "HKID" or "address". Spatial linkage information is then used to establish relationships between text blocks. Users can define rules based on the relative positions of the recognized text blocks to identify specific areas of interest.
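The idea of rules based on relative positions can be illustrated with a minimal Python sketch. This is not GREEN's actual implementation or API: the names TextBlock, Rule, apply_rule and extract, the "label" type, the coordinate convention (top-left origin, x to the right, y downwards) and the sample values are all assumptions made for illustration. The sketch assumes that recognition and classification have already produced typed text blocks with 2D positions, and shows how a rule such as "the date directly below the label Date of Birth" might select one of them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TextBlock:
    """Hypothetical recognized text block: text, type and bounding box."""
    text: str
    kind: str   # e.g. "name", "date", "number", "HKID", "address", "label"
    x: float    # left edge
    y: float    # top edge (y grows downwards)
    w: float    # width
    h: float    # height


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: pick a block of `target_kind` relative to an anchor."""
    field: str          # name of the output field
    anchor_text: str    # text that identifies the anchor block
    direction: str      # "right" or "below"
    target_kind: str    # required type of the selected block


def _overlaps(a_start: float, a_len: float, b_start: float, b_len: float) -> bool:
    # True when the two 1D intervals share some extent.
    return a_start < b_start + b_len and b_start < a_start + a_len


def _in_direction(anchor: TextBlock, cand: TextBlock, direction: str) -> bool:
    if direction == "right":
        # Starts past the anchor's right edge and shares its text line.
        return (cand.x >= anchor.x + anchor.w
                and _overlaps(anchor.y, anchor.h, cand.y, cand.h))
    if direction == "below":
        # Starts past the anchor's bottom edge and shares its column.
        return (cand.y >= anchor.y + anchor.h
                and _overlaps(anchor.x, anchor.w, cand.x, cand.w))
    raise ValueError(f"unsupported direction: {direction}")


def apply_rule(rule: Rule, blocks: List[TextBlock]) -> Optional[str]:
    # The anchor is the first block whose text contains the anchor text.
    anchor = next(
        (b for b in blocks if rule.anchor_text.lower() in b.text.lower()), None
    )
    if anchor is None:
        return None
    candidates = [
        b for b in blocks
        if b is not anchor
        and b.kind == rule.target_kind
        and _in_direction(anchor, b, rule.direction)
    ]
    if not candidates:
        return None
    # Choose the candidate closest to the anchor (Manhattan distance of corners).
    nearest = min(candidates, key=lambda b: abs(b.x - anchor.x) + abs(b.y - anchor.y))
    return nearest.text


def extract(rules: List[Rule], blocks: List[TextBlock]) -> Dict[str, Optional[str]]:
    return {rule.field: apply_rule(rule, blocks) for rule in rules}


# Made-up sample data standing in for the recognition output.
BLOCKS = [
    TextBlock("Name:", "label", 10, 10, 50, 12),
    TextBlock("Chan Tai Man", "name", 70, 10, 90, 12),
    TextBlock("Date of Birth", "label", 10, 40, 90, 12),
    TextBlock("01-01-1990", "date", 10, 58, 70, 12),
    TextBlock("02-02-2020", "date", 200, 10, 70, 12),
]
RULES = [
    Rule("applicant_name", "name", "right", "name"),
    Rule("birth_date", "date of birth", "below", "date"),
]

if __name__ == "__main__":
    print(extract(RULES, BLOCKS))
```

Because the rule refers to an anchor and a direction rather than to fixed pixel coordinates, the same rule would still select the intended block if the whole form were shifted or scaled, which is the kind of flexibility a template-free approach aims for.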

Use Case

By employing GREEN, organizations can streamline information extraction, improve accuracy, and adapt to changing document formats. The solution helps them extract information from a variety of documents efficiently, raising productivity and reducing manual effort.

Presentation Videos

If any government department would like to obtain additional information about the AI solution, please contact Smart LAB.