DuckDB Integration Documentation

What the Integration Does

The DuckDB integration allows seamless interaction with DuckDB databases directly through the Fluo Platform. This is primarily used for the management, transformation, and querying of data stored in a DuckDB database.

Short Description

Purpose: Facilitate data analytics and management tasks using DuckDB within the Fluo ecosystem.
Typical Use Cases: Creating, querying, and manipulating tables in DuckDB, exporting and importing data to and from various formats such as CSV and Parquet.

Example Scenarios

Create tables from local files or S3 paths.
Execute SQL queries to analyze data.
Perform full-text search operations on tables.
Export tables to a specified format and location.

Capabilities

Features

Table Management: Create, describe, and summarize tables.
Data Querying: Run and inspect SQL queries.
Data Import/Export: Load data from local files/S3 and export tables to paths.
Full Text Search: Create indexes and perform searches within data tables.

What the Integration Enables

Show Tables: List all tables currently in the DuckDB database.
Run Queries: Execute customized SQL statements and return results.
Create/Export Tables: Create tables from file paths, or export tables to formats such as Parquet.

Input/Output Schemas for Each Capability

show_tables: Takes a boolean. Returns a list of table names.
describe_table: Takes a table name (string). Returns a description of the table structure.
run_query: Takes a SQL query (string). Returns the result in the form of a formatted string.
create_table_from_path: Takes a file path, optional table name, and a replace flag.

Limitations

Performance may vary with the dataset size.
Requires proper setup of local file paths and permissions for S3 access.

Setup & Configuration

Prerequisites

An environment with DuckDB installed. In a Python environment, run pip install duckdb to install it.

Authentication

No explicit authentication is needed for the integration itself, but file paths must be correct, and S3 paths require the appropriate access permissions (see Limitations).

Step-by-step Guide

Ensure DuckDB is installed in your environment.
Configure the database path and any initial SQL commands.
Use the Fluo interface to input local/S3 paths for data operations.

Testing Connection

Verify the connection by running a simple SHOW TABLES; query and checking that the expected tables are listed.

How to Use in Agents

Example Prompts or Configurations

Configure the agent to call the integration with specific dataset paths or SQL queries for processing.

Example Configuration

config:
  db_path: "path/to/db"
  run_queries: True
  create_tables: True

Best Practices

Validate your queries with inspect_query before executing them.
Regularly back up your database when performing write operations.

Reference Section

API Endpoints

Not exposed publicly in the UI, handled through Fluo commands.

Schema Definitions

Defined by DuckDB specifications.

More Information

Refer to the DuckDB documentation for further details.
