Modules

Dataherald is built on a modular architecture and provides standardized, extensible interfaces so that anyone can replace any module with their own implementation. This section outlines these modules.

Introduction

The system is built on the following modules. You can implement your own version of any module and select it in place of the default implementation through the .env file. In many cases the codebase already includes multiple implementations to choose from. We encourage the community to build their own modules and submit them for inclusion in the codebase.
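As a rough illustration of how an implementation can be chosen from configuration, the sketch below resolves a class from a dotted path stored in an environment variable. It is not Dataherald's actual loader: the helper names `load_class` and `resolve_module`, the variable name `EXAMPLE_MODULE`, and the standard-library classes used as stand-ins are all assumptions made so the example runs on its own.

```python
import importlib
import os


def load_class(dotted_path: str) -> type:
    """Import and return the class named by a path like 'package.module.ClassName'."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


def resolve_module(env_var: str, default: str) -> type:
    """Return the class configured in the environment, or the default if unset."""
    return load_class(os.environ.get(env_var, default))


# Hypothetical variable name and stand-in classes from the standard library,
# used only so the example runs without Dataherald installed.
os.environ["EXAMPLE_MODULE"] = "collections.OrderedDict"
impl = resolve_module("EXAMPLE_MODULE", default="builtins.dict")
print(impl.__name__)
```

With this pattern, swapping an implementation means changing one configuration value rather than editing the code that uses the module.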

System Modules

The following are the core modules which make up the Dataherald engine.

  • Context Store: Stores the relevant business and data context used in few-shot prompting.

  • Text-to-SQL Engine: The module that translates natural language questions to SQL.

  • API Server: The server that exposes the API.

  • Evaluator: The module that assigns a confidence score to the generated SQL.

  • Vector Store: The vector store which stores application embeddings.

  • Database Integration: Used to store and persist application data.

  • Finetuning: The module that is responsible for finetuning LLMs on ground truth Question/SQL pairs.
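To make the idea of a replaceable module concrete, here is a minimal sketch of an interface with one toy implementation, using the Evaluator as the example. The class names, the `score` method, and its signature are hypothetical and are not taken from the Dataherald codebase; the scoring rule is deliberately trivial.

```python
from abc import ABC, abstractmethod


class Evaluator(ABC):
    """Hypothetical interface: assigns a confidence score to generated SQL."""

    @abstractmethod
    def score(self, question: str, sql: str) -> float:
        """Return a confidence between 0.0 and 1.0."""


class KeywordEvaluator(Evaluator):
    """Toy implementation: trusts only statements that start with SELECT."""

    def score(self, question: str, sql: str) -> float:
        return 1.0 if sql.lstrip().upper().startswith("SELECT") else 0.0


def run_evaluation(evaluator: Evaluator, question: str, sql: str) -> float:
    # The caller depends only on the interface, so any implementation can be swapped in.
    return evaluator.score(question, sql)


confidence = run_evaluation(
    KeywordEvaluator(),
    "How many users signed up?",
    "SELECT COUNT(*) FROM users",
)
print(confidence)
```

Because `run_evaluation` accepts any `Evaluator`, a custom implementation only needs to subclass the interface and provide `score`.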

System Architecture

The following diagram illustrates the overall system architecture.

[Figure: Dataherald Architecture]