Text-to-SQL Engine#
The Text-to-SQL agent is a core module that translates the natural language question received through the /question
endpoint into SQL. The implementation can leverage business and data logic stored in the Context Store module
to generate accurate SQL given the DB schema. Currently, the following NL-to-SQL implementations are included in the codebase:
Langchain SQL Agent
- A wrapper around the Langchain SQLAgent
Langchain SQL Chain
- A wrapper around the Langchain SQLChain
LlamaIndex SQL Generator
- A wrapper around the LlamaIndex SQL Generator
Dataherald SQL Agent
- Our in-house Natural Language-to-SQL agent, which uses in-context learning
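In-context learning, in this setting, generally means placing worked Question/SQL pairs in the prompt ahead of the new question. The following is a minimal sketch of that idea only, not the Dataherald implementation; the names Example and build_prompt, the prompt layout, and the sample schema are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """A hypothetical Question/SQL pair used as an in-context example."""
    question: str
    sql: str


def build_prompt(schema: str, examples: List[Example], question: str) -> str:
    """Assemble a few-shot prompt: schema, worked examples, then the new question."""
    parts = [f"Database schema:\n{schema}", ""]
    for ex in examples:
        parts.append(f"Question: {ex.question}")
        parts.append(f"SQL: {ex.sql}")
        parts.append("")
    # The prompt ends with an open "SQL:" slot for the model to complete.
    parts.append(f"Question: {question}")
    parts.append("SQL:")
    return "\n".join(parts)


examples = [Example("How many orders are there?", "SELECT COUNT(*) FROM orders;")]
prompt = build_prompt(
    "CREATE TABLE orders (id INTEGER, total REAL);",
    examples,
    "What is the total revenue?",
)
print(prompt)
```

The resulting string would then be sent to whichever language model the agent is configured to use.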
Dataherald SQL Agent#
The dataherald_sqlagent is an agent that outperforms the Langchain SQL Agent by 12%-250% in our benchmarking. It does this by leveraging up to
7 tools to generate valid SQL:
QuerySQLDataBaseTool: Executes a given SQL query on the database and returns the string representation of the results.
GetCurrentTimeTool: Provides access to the current date and time, allowing the agent to address temporal questions, such as “how much did the income increase this month?”
TablesSQLDatabaseTool: Returns a list of table names, accompanied by a relevance score indicating their potential relevance to the given question.
SchemaSQLDatabaseTool: Grants access to the schema of the tables, aiding the agent in locating relevant columns.
InfoRelevantColumns: Offers additional information about specific columns, which can include descriptions from data analysts, categories, and sample rows, enriching the context for query generation.
ColumnEntityChecker: Facilitates entity searches within specific columns, presenting a list of syntactically relevant entities from the selected column.
GetFewShotExamples: Allows the agent to request relevant Question/SQL pairs dynamically. The agent can ask for more examples based on question complexity, fostering adaptive learning.
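To make the idea of agent tools more concrete, here is a rough sketch of two of them as plain Python functions collected in a registry. It is an illustration under assumptions rather than the project's code: the function names, the TOOLS dictionary, and the use of difflib for the "syntactically relevant" entity match are all choices made for this example.

```python
import difflib
from datetime import datetime
from typing import Dict, List


def get_current_time() -> str:
    """Return the current date and time as a string the agent can read."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


def check_column_entities(entity: str, column_values: List[str], limit: int = 3) -> List[str]:
    """Return values from a column that are syntactically close to `entity`."""
    # Compare case-insensitively, then map matches back to the original values.
    lowered: Dict[str, str] = {}
    for value in column_values:
        lowered.setdefault(value.lower(), value)
    matches = difflib.get_close_matches(entity.lower(), list(lowered), n=limit, cutoff=0.6)
    return [lowered[m] for m in matches]


# Hypothetical registry: the agent would pick a tool by name at each step.
TOOLS = {
    "get_current_time": get_current_time,
    "check_column_entities": check_column_entities,
}

cities = ["Los Angeles", "Las Vegas", "San Francisco", "Boston"]
print(TOOLS["check_column_entities"]("los angles", cities))
print(TOOLS["get_current_time"]())
```

A misspelled entity such as "los angles" is resolved to the stored value before it is used in a WHERE clause, which is the kind of correction an entity-checking tool is meant to enable.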
Abstract SQLGenerator Class#
Base class that all SQL generation classes inherit from.
SQLGenerator#
This base class defines the common structure for SQL generation classes.
- create_sql_query_status(db, query, response, top_k)
Creates a SQL query status using provided parameters.
- Parameters:
db (SQLDatabase) – The SQL database instance.
query (str) – The SQL query.
response (Response) – The Response instance.
top_k (int) – The number of results to return.
- Returns:
The updated Response instance with the SQL query status.
- Return type:
Response
- generate_response(user_question, database_connection, context=None)
Generates a response to a user question using the given database connection and optional context.
- Parameters:
user_question (Question) – The user’s natural language question.
database_connection (DatabaseConnection) – The database connection information.
context (List[dict], optional) – Additional context information.
- Returns:
The Response containing the generated response.
- Return type:
Response
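The interface described above could be sketched roughly as follows. This is a self-contained illustration, not the actual class: Question, DatabaseConnection, and Response are simplified stand-ins with assumed fields, sqlite3.Connection stands in for SQLDatabase, the status strings are invented, and FixedQueryGenerator is a hypothetical subclass that returns a hard-coded query instead of calling a model.

```python
import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


# Stand-in types: the real classes are not shown here, so every field is an assumption.
@dataclass
class Question:
    question: str


@dataclass
class DatabaseConnection:
    uri: str


@dataclass
class Response:
    sql_query: str = ""
    sql_generation_status: str = "NONE"  # hypothetical status field
    sql_query_result: Optional[list] = None
    error_message: Optional[str] = None


class SQLGenerator(ABC):
    """Sketch of a base class exposing the two documented methods."""

    def create_sql_query_status(self, db: sqlite3.Connection, query: str,
                                response: Response, top_k: int) -> Response:
        # Run the query and record either the first top_k rows or the error.
        response.sql_query = query
        try:
            response.sql_query_result = db.execute(query).fetchmany(top_k)
            response.sql_generation_status = "VALID"
        except sqlite3.Error as exc:
            response.sql_generation_status = "INVALID"
            response.error_message = str(exc)
        return response

    @abstractmethod
    def generate_response(self, user_question: Question,
                          database_connection: DatabaseConnection,
                          context: Optional[List[dict]] = None) -> Response:
        """Subclasses turn the question into SQL and return a Response."""


class FixedQueryGenerator(SQLGenerator):
    """Toy subclass: returns a fixed query where a real one would call an LLM."""

    def __init__(self, db: sqlite3.Connection) -> None:
        self.db = db

    def generate_response(self, user_question, database_connection, context=None):
        sql = "SELECT name FROM users ORDER BY name"
        return self.create_sql_query_status(self.db, sql, Response(), top_k=2)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT)")
db.executemany("INSERT INTO users VALUES (?)", [("Ada",), ("Bob",), ("Cy",)])

generator = FixedQueryGenerator(db)
response = generator.generate_response(Question("List the users"), DatabaseConnection(":memory:"))
print(response.sql_generation_status, response.sql_query_result)
```

Keeping query execution and status recording in the base class means each concrete generator only has to decide how the SQL string is produced.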