Table Understanding

PDF Transcript:

Lymba can extract knowledge from text, but what about tables? Tables appear in financial reports, technical specifications, presentations, and more. A table is a great way to communicate information concisely, but if tables can’t be found and parsed, the knowledge they hold cannot be leveraged. K-Extractor Table Extraction now makes this possible.

Table Understanding begins in the Document Preprocessing step of the NLP Pipeline with basic layout recognition. The system determines the characteristics of each page, identifying the header, the left and right margins, the body, and the footer. 

 

Basic processing includes:

  • Text rectangle recognition 

  • Line recognition 

  • White space recognition 

Then, these elements are interpreted using multiple features: 

  • Font size and style 

  • Position and size 

  • Relative position 

  • Repetitions from page to page 
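
As an illustration of the last feature, text that repeats from page to page is a strong hint that it belongs to a header or footer rather than the body. The following is a minimal sketch of that idea, not Lymba's implementation: the function name `find_repeated_lines`, the `min_fraction` threshold, and the sample pages are all hypothetical.

```python
from collections import Counter


def find_repeated_lines(pages, min_fraction=0.8):
    """Return text lines that appear on at least min_fraction of the pages.

    `pages` is a list of pages, each a list of text lines (hypothetical input
    format; a real system would also compare position and font).
    """
    counts = Counter()
    for lines in pages:
        # Count each distinct line once per page.
        counts.update(set(lines))
    threshold = min_fraction * len(pages)
    return {text for text, n in counts.items() if n >= threshold}


# Made-up sample document with a running header and footer.
pages = [
    ["Acme Annual Report", "Revenue grew in Q1.", "Confidential"],
    ["Acme Annual Report", "Costs by region", "Confidential"],
    ["Acme Annual Report", "Outlook", "Confidential"],
]
repeated = find_repeated_lines(pages)
print(sorted(repeated))
```

Lines flagged this way are candidates for the header or footer; in practice the other features in the list (font, position, size) would be combined with repetition before deciding.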

 

In the Table Processing stage, the system recognizes the presence of tables, their borders, their structure, and their internal hierarchy. 
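
One way to picture structure recognition is to group the text rectangles found during basic processing into columns wherever a vertical band of white space separates them. The sketch below assumes each rectangle is a hypothetical `(x_left, x_right, text)` tuple and uses an arbitrary `min_gap`; it is a simplified illustration and does not describe how K-Extractor actually detects columns.

```python
def group_into_columns(boxes, min_gap=10.0):
    """Group text rectangles into columns separated by horizontal white space.

    Each box is (x_left, x_right, text). A box joins the current column when
    it overlaps that column or starts less than min_gap to its right.
    """
    columns = []
    for x_left, x_right, text in sorted(boxes):
        if columns and x_left - columns[-1]["right"] < min_gap:
            col = columns[-1]
            col["right"] = max(col["right"], x_right)
            col["cells"].append(text)
        else:
            # A wide enough white-space gap starts a new column.
            columns.append({"left": x_left, "right": x_right, "cells": [text]})
    return columns


# Made-up rectangles for a two-column table.
boxes = [
    (10, 60, "Region"), (12, 50, "North"),
    (120, 170, "Revenue"), (125, 150, "4.2"),
]
for col in group_into_columns(boxes):
    print(col["left"], col["right"], col["cells"])
```

This ignores vertical position, borders, and nested headers, all of which a real table recognizer would need in order to recover rows and the internal hierarchy.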

How does Lymba process tables into graph databases for future querying?

The file’s content is then represented as RDF/TriX triples that can be loaded into a graph database. Combined with our NLP modules, this lets Lymba represent knowledge extracted from both text and tables in a unified manner. 
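
To make the triple representation concrete, here is a minimal sketch that turns a small table into subject/predicate/object triples and serializes them as TriX XML using only the Python standard library. The base URI, the row-per-subject modeling, and the sample table are assumptions made for illustration; the schema Lymba actually emits is not described here.

```python
import xml.etree.ElementTree as ET

TRIX_NS = "http://www.w3.org/2004/03/trix/trix-1/"
BASE = "http://example.org/table/"  # hypothetical base URI


def table_to_triples(header, rows):
    """Model each row as a subject and each column as a predicate."""
    triples = []
    for i, row in enumerate(rows, start=1):
        subject = f"{BASE}row{i}"
        for column, value in zip(header, row):
            predicate = BASE + column.strip().lower().replace(" ", "_")
            triples.append((subject, predicate, value))
    return triples


def triples_to_trix(triples):
    """Serialize (uri, uri, literal) triples as a single-graph TriX document."""
    root = ET.Element("TriX", xmlns=TRIX_NS)
    graph = ET.SubElement(root, "graph")
    for s, p, o in triples:
        triple = ET.SubElement(graph, "triple")
        ET.SubElement(triple, "uri").text = s
        ET.SubElement(triple, "uri").text = p
        ET.SubElement(triple, "plainLiteral").text = o
    return ET.tostring(root, encoding="unicode")


# Made-up table content.
header = ["Region", "Revenue"]
rows = [["North", "4.2"], ["South", "3.1"]]
triples = table_to_triples(header, rows)
xml_text = triples_to_trix(triples)
print(xml_text)
```

Because every cell becomes a triple with the same shape as triples extracted from running text, both kinds of knowledge can sit in the same graph and be queried together.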

Lymba provides a natural language querying tool, so that a user can ask a question in plain English. 

Organizations can now make quicker, more thoughtful decisions with access to their company’s data resources, regardless of file format. 

Imagine what you could do with table extraction
