Table Understanding
PDF Transcript:
Lymba can extract knowledge from text, but what about tables? Tables appear in financial reports, technical specifications, presentations, and many other documents. A table is a great way to communicate information concisely, but if it can’t be found, the knowledge it holds cannot be leveraged. K-Extractor Table Extraction now makes that possible.
Table Understanding begins in the Document Preprocessing step of the NLP Pipeline with basic layout recognition. The system determines the characteristics of each page, identifying the header, the left and right margins, the body, and the footer.
Basic processing includes:
Text rectangle recognition
Line recognition
White space recognition
Then, these elements are interpreted using multiple features:
Font size and style
Position and size
Relative position
Repetitions from page to page
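As a rough illustration of how position and page-to-page repetition could separate headers and footers from the body, here is a minimal Python sketch. It is not Lymba's implementation: the TextRect type, the classify_regions function, and the band and min_repeat thresholds are all hypothetical names and values chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRect:
    """Hypothetical text rectangle; coordinates are measured from the top-left of the page."""
    page: int
    text: str
    x: float
    y: float
    width: float
    height: float


def classify_regions(rects, page_height, band=0.12, min_repeat=0.6):
    """Return one label ('header', 'footer', or 'body') per rectangle, in input order.

    Assumption: a rectangle belongs to the header or footer when the same text
    recurs at the same vertical position on at least `min_repeat` of the pages
    and lies within the top or bottom `band` fraction of the page.
    """
    pages = {r.page for r in rects}
    # Count the number of distinct pages on which each (text, y) pair occurs.
    seen = Counter()
    for _page, text, y in {(r.page, r.text, round(r.y)) for r in rects}:
        seen[(text, y)] += 1

    labels = []
    for r in rects:
        repeated = seen[(r.text, round(r.y))] / len(pages) >= min_repeat
        if repeated and r.y < band * page_height:
            labels.append("header")
        elif repeated and r.y + r.height > (1 - band) * page_height:
            labels.append("footer")
        else:
            labels.append("body")
    return labels
```

A real system would also weigh font size and style, as listed above; this sketch uses only position and repetition to keep the idea visible.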
In the Table Processing stage, the system recognizes the presence of tables, their borders, their structure, and their internal hierarchy.
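To make the idea of structure recognition concrete, here is a minimal Python sketch that rebuilds a grid of rows and columns from positioned text cells by clustering their coordinates. This is an assumed, simplified alignment-based approach, not the method K-Extractor actually uses; the cluster, nearest, and build_grid names and the tol tolerance are hypothetical.

```python
def cluster(values, tol):
    """Group 1-D coordinates whose sorted neighbours lie within `tol`; return the group centres."""
    centres = []
    group = []
    for v in sorted(values):
        if group and v - group[-1] > tol:
            centres.append(sum(group) / len(group))
            group = []
        group.append(v)
    if group:
        centres.append(sum(group) / len(group))
    return centres


def nearest(centres, v):
    """Index of the centre closest to coordinate `v`."""
    return min(range(len(centres)), key=lambda i: abs(centres[i] - v))


def build_grid(cells, tol=5.0):
    """Arrange (x, y, text) cells into a list of rows, each a list of strings.

    Assumption: cells in the same column share roughly the same left edge and
    cells in the same row share roughly the same top edge.
    """
    cells = list(cells)
    cols = cluster([x for x, _, _ in cells], tol)
    rows = cluster([y for _, y, _ in cells], tol)
    grid = [["" for _ in cols] for _ in rows]
    for x, y, text in cells:
        grid[nearest(rows, y)][nearest(cols, x)] = text
    return grid
```

Recognizing borders and internal hierarchy (for example, merged header cells) would need line and white-space information as well, which this sketch leaves out.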
How does Lymba process tables into graph databases for future querying?
The file’s content is then represented as RDF triples in TriX format and can be loaded into a graph database. When combined with our NLP modules, Lymba can represent knowledge extracted from text and tables in a unified manner.
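As a hedged illustration of what turning a table into RDF triples and writing them out as TriX might look like, here is a minimal Python sketch using only the standard library. The http://example.org/table/ base URI, the row and column URI scheme, and the table_to_triples and to_trix functions are assumptions made for the example; they do not describe Lymba's actual data model. The TriX namespace and element names (TriX, graph, triple, uri, plainLiteral) follow the published TriX format.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

TRIX_NS = "http://www.w3.org/2004/03/trix/trix-1/"
BASE = "http://example.org/table/"  # hypothetical base URI for the example


def table_to_triples(header, rows):
    """Return (subject_uri, predicate_uri, literal) triples for a simple table.

    Assumption: the first column identifies the row, and every other column
    becomes a predicate named after its header.
    """
    triples = []
    for row in rows:
        subject = BASE + "row/" + quote(row[0])
        for name, value in zip(header[1:], row[1:]):
            triples.append((subject, BASE + "column/" + quote(name), value))
    return triples


def to_trix(triples):
    """Serialize the triples as a single unnamed TriX graph and return the XML string."""
    ET.register_namespace("", TRIX_NS)
    root = ET.Element(f"{{{TRIX_NS}}}TriX")
    graph = ET.SubElement(root, f"{{{TRIX_NS}}}graph")
    for s, p, o in triples:
        triple = ET.SubElement(graph, f"{{{TRIX_NS}}}triple")
        ET.SubElement(triple, f"{{{TRIX_NS}}}uri").text = s
        ET.SubElement(triple, f"{{{TRIX_NS}}}uri").text = p
        ET.SubElement(triple, f"{{{TRIX_NS}}}plainLiteral").text = o
    return ET.tostring(root, encoding="unicode")
```

For example, to_trix(table_to_triples(["Year", "Revenue"], [["2022", "10"]])) produces one triple stating that the row for 2022 has a Revenue of "10"; the resulting document could then be loaded into any graph database that accepts TriX.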
Lymba provides a natural language querying tool, so that a user can ask a question in plain English.
Organizations can now make quicker, more thoughtful decisions with access to their company’s data resources, regardless of file format.
Imagine what you could do with table extraction.