DuckDB Integration Documentation
What the Integration Does
The DuckDB integration lets you work with DuckDB databases directly from the Fluo Platform. It is primarily used to manage, transform, and query data stored in a DuckDB database.
Short Description
- Purpose: Facilitate data analytics and management tasks using DuckDB within the Fluo ecosystem.
- Typical Use Cases: Creating, querying, and manipulating tables in DuckDB; importing and exporting data in formats such as CSV and Parquet.
Example Scenarios
- Create tables from local files or S3 paths.
- Execute SQL queries to analyze data.
- Perform full-text search operations on tables.
- Export tables to a specified format and location.
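The create, query, and export scenarios above can be sketched with DuckDB's Python API. This is a minimal illustration, not the Fluo integration's own code: the `run_scenario` helper, the `sales` table, and the file names are hypothetical, and the sketch assumes the `duckdb` Python package is installed.

```python
import csv
from pathlib import Path


def run_scenario(workdir: Path) -> list[tuple]:
    """Create a table from a CSV file, query it, and export it to Parquet."""
    import duckdb  # requires: pip install duckdb

    # Hypothetical sample data standing in for a real local file.
    csv_path = workdir / "sales.csv"
    with csv_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["region", "amount"])
        writer.writerows([["north", 10], ["south", 25], ["north", 5]])

    con = duckdb.connect()  # in-memory database
    try:
        # Create a table from a local file path.
        con.execute(
            f"CREATE TABLE sales AS SELECT * FROM read_csv_auto('{csv_path.as_posix()}')"
        )
        # Run an analytical SQL query.
        totals = con.execute(
            "SELECT region, SUM(amount) AS total FROM sales "
            "GROUP BY region ORDER BY region"
        ).fetchall()
        # Export the table to Parquet at a chosen location.
        out_path = workdir / "sales.parquet"
        con.execute(f"COPY sales TO '{out_path.as_posix()}' (FORMAT PARQUET)")
    finally:
        con.close()
    return totals
```

Calling `run_scenario(Path(tmp))` inside a `tempfile.TemporaryDirectory()` block returns the per-region totals and leaves a `sales.parquet` file in that directory. Reading from an S3 path additionally requires DuckDB's S3 support and credentials to be configured, which this sketch does not cover.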
Capabilities
Features
- Table Management: Create, describe, and summarize tables.
- Data Querying: Run and inspect SQL queries.
- Data Import/Export: Load data from local files/S3 and export tables to paths.
- Full Text Search: Create indexes and perform searches within data tables.
What the Integration Enables
- Show Tables: List all tables currently in the DuckDB database.
- Run Queries: Execute customized SQL statements and return results.
- Create/Export Tables: Create tables from file paths, or export tables to formats such as Parquet.
Input/Output Schemas for Each Capability
- show_tables: Takes a boolean. Returns a list of table names.
- describe_table: Takes a table name (string). Returns a description of the table structure.
- run_query: Takes a SQL query (string). Returns the result as a formatted string.
- create_table_from_path: Takes a file path, an optional table name, and a replace flag.
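To make the schemas above more concrete, the following sketch shows one way each capability could map onto a DuckDB SQL statement. The mapping is an assumption for illustration only: the source does not document how Fluo implements these capabilities, and the helper names (`quote_ident`, `quote_literal`, `describe_table_sql`, `create_table_from_path_sql`) are hypothetical. Defaulting the table name to the file stem is also an assumption, and the boolean accepted by show_tables is not described in the source, so it is omitted here.

```python
from pathlib import PurePosixPath
from typing import Optional


def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def show_tables_sql() -> str:
    """SQL that lists the tables in the current database."""
    return "SHOW TABLES"


def describe_table_sql(table_name: str) -> str:
    """SQL that returns the column names and types of a table."""
    return f"DESCRIBE {quote_ident(table_name)}"


def create_table_from_path_sql(
    path: str, table_name: Optional[str] = None, replace: bool = False
) -> str:
    """SQL that creates a table from a file, optionally replacing an existing one."""
    # Assumption: default the table name to the file stem ("sales" for sales.csv).
    name = table_name or PurePosixPath(path).stem
    create = "CREATE OR REPLACE TABLE" if replace else "CREATE TABLE"
    # DuckDB picks the reader (CSV, Parquet, ...) from the file extension.
    return f"{create} {quote_ident(name)} AS SELECT * FROM {quote_literal(path)}"
```

For example, `create_table_from_path_sql("data/sales.csv", replace=True)` produces `CREATE OR REPLACE TABLE "sales" AS SELECT * FROM 'data/sales.csv'`. Quoting identifiers and literals keeps unusual table names or paths from breaking the statement.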
Limitations
- Performance may vary with the dataset size.
- Requires valid local file paths and, for S3 sources, appropriate access permissions.
Setup & Configuration
Prerequisites
- An environment with DuckDB installed. In a Python environment, run pip install duckdb to install it.
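One way to confirm the prerequisite is to check for the package from Python before configuring anything else. The `duckdb_version` helper below is a hypothetical name used for illustration; it relies only on the standard library and does not import DuckDB itself.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def duckdb_version() -> Optional[str]:
    """Return the installed duckdb package version, or None if it is missing."""
    if importlib.util.find_spec("duckdb") is None:
        return None
    try:
        return metadata.version("duckdb")
    except metadata.PackageNotFoundError:
        return None


version = duckdb_version()
if version:
    print(f"duckdb {version} is installed")
else:
    print("duckdb is not installed; run: pip install duckdb")
```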
Authentication
- No explicit authentication is needed, but file paths must be set correctly.
Step-by-step Guide
- Ensure DuckDB is installed in your environment.
- Configure the database path and any initial SQL commands.
- Use the Fluo interface to enter local or S3 paths for data operations.
Testing Connection
- Verify the connection by running a simple SHOW TABLES; query and confirming that the expected tables are listed.
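Outside the Fluo interface, the same check can be approximated with DuckDB's Python API. The `list_tables` helper is a hypothetical name, and the sketch assumes the `duckdb` package is installed; it is not how Fluo itself runs the check.

```python
def list_tables(db_path: str = ":memory:") -> list[str]:
    """Open the database at db_path and return the names of its tables."""
    import duckdb  # requires: pip install duckdb

    con = duckdb.connect(db_path)
    try:
        # SHOW TABLES returns one row per table, with the name in the first column.
        return [row[0] for row in con.execute("SHOW TABLES").fetchall()]
    finally:
        con.close()
```

An empty list means the connection works but the database has no tables yet, which is expected for a new database file.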
How to Use in Agents
Example Prompts or Configurations
- Configure the agent to call the integration with specific dataset paths or SQL queries for processing.
Example Configuration
    config:
      db_path: "path/to/db"
      run_queries: True
      create_tables: True
Best Practices
- Validate your queries with inspect_query before executing them.
- Regularly back up your database when performing write operations.
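Both practices can be approximated directly against DuckDB. The source does not describe how inspect_query works, so the sketch below uses DuckDB's EXPLAIN statement as a stand-in: it plans a query without running it. The helper names `query_is_valid` and `backup_database` are hypothetical, and the file-copy backup assumes no connection is writing to the database while the copy is made.

```python
import shutil
from pathlib import Path


def query_is_valid(con, sql: str) -> bool:
    """Return True if DuckDB can plan the query; EXPLAIN does not execute it."""
    import duckdb  # requires: pip install duckdb

    try:
        con.execute(f"EXPLAIN {sql}")
        return True
    except duckdb.Error:
        # Covers syntax errors as well as unknown tables or columns.
        return False


def backup_database(db_path: Path, backup_dir: Path) -> Path:
    """Copy the database file into backup_dir and return the path of the copy."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / db_path.name
    shutil.copy2(db_path, target)
    return target
```

A typical pattern would be to call `backup_database` before a batch of write operations and to run `query_is_valid` on each statement before executing it. DuckDB's EXPORT DATABASE statement is an alternative when a portable copy is preferred over a file copy.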
Reference Section
API Endpoints
- Not exposed publicly in the UI; handled through Fluo commands.
Schema Definitions
- Defined by DuckDB specifications.
More Information
- Refer to the DuckDB documentation for further details.