API#
The Dataherald Engine exposes RESTful APIs that can be used to:
🔌 Connect to and manage connections to databases
🔑 Add context to the engine through scanning the databases, adding database level instructions, adding descriptions to tables and columns and adding golden records
🙋♀️ Ask natural language questions from the relational data
Our APIs have resource-oriented URL built around standard HTTP response codes and verbs. The core resources are described below.
Database Connections#
The database-connections object allows you to define connections to your relational data stores.
Related endpoints are:
Create database connection –
POST api/v1/database-connectionsList database connections –
GET api/v1/database-connectionsUpdate a database connection –
PUT api/v1/database-connections/{db_connection_id}
Database connection resource example:
{
"alias": "string",
"use_ssh": false,
"connection_uri": "string",
"path_to_credentials_file": "string",
"llm_api_key": "string",
"ssh_settings": {
"db_name": "string",
"host": "string",
"username": "string",
"password": "string",
"remote_host": "string",
"remote_db_name": "string",
"remote_db_password": "string",
"private_key_password": "string",
"db_driver": "string"
}
}
Table Descriptions#
The table-descriptions object is used to add context about the tables and columns in the relational database.
These are then used to help the LLM build valid SQL to answer natural language questions.
Related endpoints are:
Scan table description –
POST api/v1/table-descriptions/sync-schemasUpdate a table description –
PATCH api/v1/table-descriptions/{table_description_id}List table description –
GET api/v1/table-descriptionsGet a description –
GET api/v1/table-descriptions/{table_description_id}Refresh table descriptions –
GET api/v1/table-descriptions/refresh
Table description resource example:
{
"columns": [{}],
"db_connection_id": "string",
"description": "string",
"examples": [{}],
"table_name": "string",
"table_schema": "string"
}
Database Instructions#
The database-instructions object is used to set constraints on the SQL that is generated by the LLM.
These are then used to help the LLM build valid SQL to answer natural language questions based on your business rules.
Related endpoints are:
Add database instructions –
POST api/v1/{db_connection_id}/instructionsList database instructions –
GET api/v1/{db_connection_id}/instructionsUpdate database instructions –
PUT api/v1/{db_connection_id}/instructions/{instruction_id}Delete database instructions –
DELETE api/v1/{db_connection_id}/instructions/{instruction_id}
Instruction resource example:
{
"db_connection_id": "string",
"instruction": "string",
}
Prompts#
The prompt object is used to ask questions or pass any natural language text to the LLM.
Related endpoints are:
Create a prompt –
POST api/v1/promptsList prompts –
GET api/v1/promptsGet a prompt –
GET api/v1/prompts/{prompt_id}
prompt resource example:
{
"id": "str"
"db_connection_id": "str"
"text": "str"
"created_at": "datetime",
"metadata": "dict | None"
}
SQL generations#
Given a prompt, Dataherald AI agents can generate SQL queries to efficiently answer the question or provide the necessary information.
Related endpoints are:
create a SQL generation for a given prompt –
POST api/v1/prompts/{prompt_id}/sql-generationscreate a prompt and a SQL generation –
POST api/v1/prompts/sql-generationslist SQL generations –
GET api/v1/sql-generationsget a SQL generation –
GET api/v1/sql-generations/{sql_generation_id}get a SQL a csv file of SQL query results –
GET api/v1/sql-generations/{sql_generation_id}/csv-fileexecute a SQL generation –
GET api/v1/sql-generations/{sql_generation_id}/execute
SQL generation resource example:
{
"id": "str"
"prompt_id": "str"
"finetuning_id": "str"
"evaluate": "bool"
"llm_config": {
"llm_name": "str",
"api_base": "str"
},
"sql": "str"
"status": "str"
"completed_at": "datetime"
"tokens_used": "int"
"confidence_score": "float"
"error": "str"
"created_at": "datetime",
"metadata": "dict | None"
}
NL generations#
For each SQL generation, the LLMs can generate a natural language response based on the SQL query results.
Related endpoints are:
create a NL generation for a given SQL generation –
POST api/v1/sql-generations/{sql_generation_id}/nl-generationscreate a SQL generation and a NL generation for a given prompt –
POST api/v1/prompts/{prompt_id}/sql-generations/nl-generationscreate a prompt and a SQL generation and a NL generation –
POST api/v1/prompts/sql-generations/nl-generationslist NL generations –
GET api/v1/nl-generationsget a NL generation –
GET api/v1/nl-generations/{nl_generation_id}
Nl generation resource example:
{
"id": "str"
"sql_generation_id": "str"
"llm_config": {
"llm_name": "str",
"api_base": "str"
},
"text": "str"
"created_at": "datetime",
"metadata": "dict | None"
}
Finetuning jobs#
The finetuning object is used to finetune the LLM to your data. This is an asynchronous process that uploads your golden records to model provider servers and creates a finetuning job.
The finetuned model is going to be used inside an agent for generating SQL queries.
Related endpoints are:
Finetuning job create –
POST api/v1/finetuningsFinetuning job get –
GET api/v1/finetunings/{finetuning_id}Finetuning job cancel –
POST api/v1/finetunings/{finetuning_id}/cancelFinetuning job delete –
DELETE api/v1/finetunings/{finetuning_id}Finetuning job list –
GET api/v1/finetunings
Finetuning resource example:
{
"id": "finetuing-job-id",
"db_connection_id": "database_connection_id",
"alias": "model name",
"status": "finetuning_job_status", // Possible values: queued, running, succeeded, validating_files, failed, or cancelled
"error": "The error message if the job failed", // Optional, default is None
"base_llm": {
"model_provider": "model_provider_name", // Currently, only 'openai'
"model_name": "model_name", // Supported: gpt-3.5-turbo, gpt-4
"model_parameters": {
"n_epochs": "int or string", // Optional, default 3
"batch_size": "int or string", // Optional, default 1
"learning_rate_multiplier": "int or string" // Optional, default "auto"
}
},
"finetuning_file_id": "File ID for finetuning file",
"finetuning_job_id": "Finetuning job ID",
"model_id": "Model ID after finetuning",
"created_at": "datetime",
"golden_sqls": "array[ids]", // Default is None, meaning use all golden records
"metadata": "dict | None" // Optional, default None
}
Error Codes#
Certain errors are accompanied by an error code and an explanatory message. These errors trigger an HTTP 400 response code.
DB connection error code response example:
{
"error_code": "invalid_database_uri_format",
"message": "Invalid URI format: foo",
"description": null,
"detail": {
"alias": "foo",
"use_ssh": false,
"connection_uri": "gdfgdgAABl5e-dfg_-wErFJdFZeVXwnmew_dfg__WU-dfgdfa=="
}
}