CONTEXT & ISSUES
Throughout the life cycle of industrial and nuclear installations, operators produce a large quantity of documents in a variety of formats and media.
This information covers many different parameters, and consulting it is essential for maintenance, audits, upgrades, extensions, dismantling operations, etc. The text analysis tools currently in use (electronic document management, EDM) are insufficient to process, structure and exploit it.
New tools allow the aggregation and structuring of all forms of textual data, in paper or digital form, which are then exploited as an aid to decision making at operational and commercial levels.
DeepFinder is an open-source AI solution for natural language processing (NLP), developed by Assystem. It extracts relevant data from the documents (texts, images, plans) accumulated over the life of a complex industrial installation.
Thus, DeepFinder allows:
- Classification of documents by domain
- Syntactic and semantic indexing of documents for rapid data retrieval
- Natural-language querying of documents for accurate answers
- Creation of domain-specific ontologies
- Secure access to sensitive documents
CHARACTERISTICS OF THE SOLUTION
The DeepFinder solution combines traditional techniques for capturing and classifying textual documents (EDM, search engine) with ontologies and with recent Deep Learning and Artificial Intelligence (AI) technologies applied to natural language processing (NLP).
The solution uses an innovative indexing engine that can search millions of documents in a few seconds. A Question Answering model, trained on more than 10,000 nuclear question-answer pairs, can find an exact answer to a user's natural-language query regardless of the chosen domain.
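The source does not describe how DeepFinder's indexing engine is built. As a rough illustration of the general idea behind fast document search, the Python sketch below builds a toy inverted index (each term maps to the documents that contain it) and ranks documents by how many query terms they match. Every name here (`tokenize`, `build_index`, `search`, the sample documents) is hypothetical and is not part of DeepFinder.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Rank documents by the number of distinct query tokens they contain."""
    scores = defaultdict(int)
    for token in set(tokenize(query)):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical sample documents, for illustration only.
documents = {
    "doc-1": "Pump P-101 maintenance report for the cooling circuit",
    "doc-2": "Audit of valve inspection procedures",
    "doc-3": "Cooling circuit upgrade plan and pump replacement schedule",
}
index = build_index(documents)
results = search(index, "cooling pump maintenance")
print(results)
```

A lookup only touches the index entries for the query terms, not every document, which is why this kind of structure scales to large collections. A Question Answering model would typically then read only the top-ranked documents to extract an answer; that step is not shown here.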
DeepFinder can be deployed across multiple data catalogues, with user management that restricts each user's access to specific catalogues. Once deployed, DeepFinder synchronises the data at a defined frequency.
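Restricting access per catalogue can be pictured as a simple filter applied before any search result is returned. The Python sketch below is a minimal illustration under assumed names (`User`, `visible_documents`, the catalogue labels); it does not reflect DeepFinder's actual user-management model, which the source does not detail.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """A user and the catalogues they are allowed to read (hypothetical model)."""
    name: str
    catalogues: set = field(default_factory=set)


def visible_documents(user, documents):
    """Return the ids of documents whose catalogue the user may access."""
    return [
        doc_id
        for doc_id, catalogue in sorted(documents.items())
        if catalogue in user.catalogues
    ]


# Hypothetical document-to-catalogue assignment, for illustration only.
documents = {
    "doc-1": "maintenance",
    "doc-2": "audit",
    "doc-3": "maintenance",
}
technician = User("technician", {"maintenance"})
auditor = User("auditor", {"audit", "maintenance"})

print(visible_documents(technician, documents))
print(visible_documents(auditor, documents))
```

Filtering on the catalogue rather than on individual documents keeps the permission model small: granting a user a new catalogue exposes all of its documents at once.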
- Dematerialise and transform paper or digital documents into machine-readable text and data
- Classify documents by subject area and key terms
- Search and structure the information and data contained in documents by their semantics, using Question Answering systems (search engine)
- Provide a single point of entry to this mass of documents
- Share information between actors
- Manage several file catalogues
- Secure the storage of, and access to, documents and data
- Exploitation of archived data
- Quick access to data
- Relevant analysis of the data
- Ease of use
- Reduction of uncertainties