Large Language Models (LLMs) are currently at the forefront of AI innovation and are among its most powerful and fascinating applications.
LLMs combine the following capabilities into a single technology:
- Deep language understanding and generation, covering syntax, grammar, synonyms, and so on. This extends beyond human languages to programming and query languages such as SQL.
- Extensive, albeit imprecise, world knowledge: facts, places, people, scientific concepts.
- Planning ability to solve problems step by step.
The integration of these three capabilities theoretically positions LLMs as one of the key technologies for supporting Data Management activities. An assistant based on these capabilities can:
- Understand the functional context of a problem.
- Analyze data structures.
- Produce SQL and other code to manipulate and transform data.
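As an illustration of the "analyze data structures" step, here is a minimal sketch that reads table and column definitions from a database and renders them as plain text an assistant could be given. It assumes a SQLite database purely for convenience; the `customers` and `orders` tables and the `describe_schema` helper are hypothetical names invented for this example.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> str:
    """Return a plain-text summary of every table and its columns."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        described = ", ".join(f"{name} {ctype}" for _, name, ctype, *_ in columns)
        lines.append(f"{table}({described})")
    return "\n".join(lines)


# Hypothetical tables, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    """
)
schema_text = describe_schema(conn)
print(schema_text)
```

The resulting text is only technical metadata. It says nothing about what "customer" or "amount" mean to the business, which is the gap discussed next.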
However, there are limitations to address. Despite continued progress, LLMs still hallucinate: they occasionally produce plausible-looking output that is incorrect or does not match what was asked.
The situation therefore presents both an opportunity, to revolutionize how people interact with data through AI, and a requirement: the AI must be given all the necessary context (business information, vocabularies, functional and technical metadata) to answer questions precisely.
To fully seize these opportunities, it is essential to organize business concepts and metadata into a Knowledge Graph. This coherent knowledge map serves as the foundation for instructing AI.
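To make the idea concrete, the sketch below stores a few business concepts and their metadata as subject/relation/object triples and uses them to assemble context for a question before it is sent to a model. It is an illustrative toy under stated assumptions, not a reference design: the triples, the relation names (`is_a`, `defined_as`, `synonym`, `stored_in`), the table paths, and the helper functions are all hypothetical, and no actual LLM call is made.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object) linking business concepts
# to each other and to the physical tables/columns that hold their data.
TRIPLES = [
    ("Net Revenue", "is_a", "KPI"),
    ("Net Revenue", "defined_as", "gross sales minus returns and discounts"),
    ("Net Revenue", "stored_in", "sales.fact_orders.net_amount"),
    ("Customer", "is_a", "Business Concept"),
    ("Customer", "synonym", "Client"),
    ("Customer", "stored_in", "crm.dim_customer"),
]


def build_graph(triples):
    """Index the triples by subject so each concept lists its outgoing edges."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def context_for(question, graph):
    """Collect facts about every concept (or synonym) mentioned in the question."""
    text = question.lower()
    facts = []
    for concept, edges in graph.items():
        names = [concept] + [obj for rel, obj in edges if rel == "synonym"]
        # Naive substring match; a real system would need proper entity linking.
        if any(name.lower() in text for name in names):
            facts.extend(f"{concept} {rel} {obj}" for rel, obj in edges)
    return facts


def build_prompt(question, graph):
    """Prepend the matching facts to the question as explicit context."""
    facts = context_for(question, graph)
    context = "\n".join(f"- {fact}" for fact in facts) or "- (no matching concept)"
    return f"Context:\n{context}\n\nQuestion: {question}"


graph = build_graph(TRIPLES)
prompt = build_prompt("What was net revenue per client last month?", graph)
print(prompt)
```

Because "client" is recorded as a synonym of "Customer", the question picks up the Customer facts even though it never uses that word. This is the kind of vocabulary alignment the Knowledge Graph is meant to provide.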
Moreover, it’s possible to use AI in reverse: organizing and rationalizing the corporate vocabulary of business concepts, fostering a shared knowledge base across the organization and shared KPIs/metrics among various workgroups.
Constructing the Knowledge Graph extends and enriches the mission of Data Governance from a new perspective, with AI as a new interlocutor and tool to enable:
- Exploration and use of data through AI-based Self-Service BI tools.
- Rationalization of business concepts and domains.
- Making explicit the meaningful relationships between business concepts and the data held in systems.