# ChartTool LangChain Agent Usage Guide

## Overview

ChartTool is a data analysis and visualization tool that reads multiple data formats, creates a range of chart types, and exports data flexibly. Through the LangChain adapter, each ChartTool feature is exposed as an independent tool that a ReAct Agent can invoke.

## Available Tools List

In a LangChain ReAct Agent, ChartTool is exposed as the following 3 independent tools:

1. **`chart_read_data`** - Data reading and analysis
2. **`chart_visualize`** - Data visualization chart creation
3. **`chart_export_data`** - Data format conversion and export

---

## 1. chart_read_data

### Function Description

Read data files in various formats, perform basic analysis, and return data summary information.

### Supported File Formats

- **CSV** (`.csv`)
- **Excel** (`.xlsx`, `.xls`)
- **JSON** (`.json`)
- **Parquet** (`.parquet`)
- **Feather** (`.feather`)
- **SPSS** (`.sav`, `.por`)
- **SAS** (`.sas7bdat`)

### LangChain Invocation Method

```python
# agent_executor is assumed to be an AgentExecutor already configured
# with the ChartTool tools.

# Basic invocation
result = agent_executor.invoke({
    "input": "Use chart_read_data to read file /path/to/data.csv"
})

# Complete parameter invocation
result = agent_executor.invoke({
    "input": """Use chart_read_data tool with the following parameters:
    file_path: /path/to/data.xlsx
    nrows: 1000
    sheet_name: Sheet1
    export_format: json
    export_path: /tmp/analysis_results.json
    """
})
```

### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `file_path` | str | ✅ | Complete path to the data file |
| `nrows` | int | ❌ | Limit the number of rows to read (default: read all) |
| `sheet_name` | str/int | ❌ | Excel worksheet name or index (default: 0) |
| `export_format` | str | ❌ | Export format: json/csv/html/excel/markdown |
| `export_path` | str | ❌ | Export file save path |

### Use Cases

- 🔍 **Data Exploration**: Quickly understand the structure and basic information of data files
- 📊 **Data Overview**: View data types, row count, column names, and other metadata
- 🔄 **Format Conversion**: Read data and convert to other formats
- 📋 **Data Preview**: View the first few rows of data

### Return Result

```json
{
  "variables": ["column1", "column2", "column3"],
  "observations": 1000,
  "dtypes": {
    "column1": "object",
    "column2": "int64",
    "column3": "float64"
  },
  "memory_usage": 0.25,
  "preview": [
    {"column1": "value1", "column2": 123, "column3": 45.67},
    {"column1": "value2", "column2": 456, "column3": 78.90}
  ],
  "exported_to": "/tmp/analysis_results.json"
}
```

The `exported_to` field is present only if an export was specified.

### Agent Invocation Example

```
Human: I want to analyze this sales data file /data/sales.csv to see how many rows of data it contains

Agent: I'll help you analyze the sales data file.

Action: chart_read_data
Action Input: {"file_path": "/data/sales.csv"}

Observation: {
  "variables": ["date", "product", "sales", "region"],
  "observations": 5000,
  "dtypes": {"date": "object", "product": "object", "sales": "int64", "region": "object"},
  "memory_usage": 0.15,
  "preview": [...]
}

Thought: Data successfully read, containing 5000 records with 4 columns: date, product, sales, and region.

Final Answer: Your sales data file contains 5000 records with 4 fields: date, product, sales, and region. The data size is approximately 0.15MB.
```

---

## 2. chart_visualize

### Function Description

Create various types of visualization charts from data files, with support for multiple chart styles and custom configurations.
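Before handing a chart request to the agent, it can help to check it locally. The sketch below is not part of ChartTool: `build_visualize_input` and `SUPPORTED_PLOT_TYPES` are hypothetical names, and the rules simply mirror this guide's chart-type and parameter tables (required `file_path` and `plot_type`, seven plot type values).

```python
import json

# Plot type values as listed in this guide's "Supported Chart Types" table.
SUPPORTED_PLOT_TYPES = {
    "histogram", "boxplot", "scatter", "bar", "line", "heatmap", "pair",
}


def build_visualize_input(file_path: str, plot_type: str, **options) -> str:
    """Return a JSON string usable as an Action Input for chart_visualize.

    Hypothetical helper: it only checks the two required parameters and
    drops options left as None. ChartTool performs its own validation.
    """
    if not file_path:
        raise ValueError("file_path is required")
    if plot_type not in SUPPORTED_PLOT_TYPES:
        raise ValueError(f"Unsupported plot_type: {plot_type!r}")
    payload = {"file_path": file_path, "plot_type": plot_type}
    # Keep only the optional parameters that were actually set.
    payload.update({k: v for k, v in options.items() if v is not None})
    return json.dumps(payload)


action_input = build_visualize_input(
    "/data/sales.csv", "scatter", x="price", y="sales", hue=None
)
print(action_input)
```

Catching an unsupported `plot_type` this way avoids spending an agent step on a call that would fail anyway.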
### Supported Chart Types

| Chart Type | Value | Use Case |
|-----------|-------|----------|
| Histogram | `histogram` | Single variable distribution analysis |
| Box Plot | `boxplot` | Distribution comparison, outlier detection |
| Scatter Plot | `scatter` | Two-variable relationship analysis |
| Bar Chart | `bar` | Categorical data comparison |
| Line Chart | `line` | Time series analysis, trend observation |
| Heatmap | `heatmap` | Correlation matrix visualization |
| Pair Plot | `pair` | Multi-variable relationship matrix |

### LangChain Invocation Method

```python
# agent_executor is assumed to be an AgentExecutor already configured
# with the ChartTool tools.

# Basic visualization
result = agent_executor.invoke({
    "input": """Use chart_visualize to create a scatter plot:
    file_path: /data/sales.csv
    plot_type: scatter
    x: price
    y: sales
    title: Price vs Sales Relationship Chart
    """
})

# Advanced visualization configuration
result = agent_executor.invoke({
    "input": """Use chart_visualize tool:
    file_path: /data/multi_vars.csv
    plot_type: heatmap
    variables: ["var1", "var2", "var3", "var4"]
    title: Variable Correlation Analysis
    figsize: [12, 8]
    output_path: /charts/correlation_heatmap.png
    dpi: 300
    """
})
```

### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `file_path` | str | ✅ | Data file path |
| `plot_type` | str | ✅ | Chart type (see supported chart types table) |
| `x` | str | ❌ | Column name for X-axis |
| `y` | str | ❌ | Column name for Y-axis |
| `hue` | str | ❌ | Column name for color encoding |
| `variables` | List[str] | ❌ | List of variables to include in visualization |
| `title` | str | ❌ | Chart title |
| `figsize` | List[int] | ❌ | Chart size [width, height] |
| `output_path` | str | ❌ | Chart save path |
| `dpi` | int | ❌ | Image resolution (default: 100) |
| `export_format` | str | ❌ | Export format: json/csv/html/excel/markdown |
| `export_path` | str | ❌ | Export result save path |

### Use Cases

#### 📈 **Trend Analysis**

```
plot_type: line
x: date
y: sales
Use case: Time series analysis, trend observation
```
#### 📊 **Distribution Analysis**

```
plot_type: histogram
x: age
Use case: Single variable distribution shape analysis
```

#### 🔗 **Relationship Analysis**

```
plot_type: scatter
x: advertising_spend
y: revenue
hue: channel
Use case: Explore relationships and correlations between variables
```

#### 🔥 **Correlation Analysis**

```
plot_type: heatmap
variables: ["var1", "var2", "var3"]
Use case: Multi-variable correlation matrix visualization
```

#### 📦 **Categorical Comparison**

```
plot_type: boxplot
x: department
y: salary
Use case: Distribution comparison across different groups
```

### Return Result

```json
{
  "plot_type": "scatter",
  "output_path": "/tmp/chart_exports/plot_a1b2c3d4.png",
  "variables": ["price", "sales", "region"],
  "title": "Price vs Sales Relationship Chart",
  "exported_to": "/charts/analysis_results.json"
}
```

The `exported_to` field is present only if an export was specified.

### Agent Invocation Example

```
Human: Help me create a chart showing sales performance by different regions using sales data

Agent: I'll create a visualization chart showing sales performance by different regions for you.

Action: chart_visualize
Action Input: {
  "file_path": "/data/sales.csv",
  "plot_type": "bar",
  "x": "region",
  "y": "sales",
  "title": "Sales Performance Comparison by Region",
  "figsize": [12, 6]
}

Observation: {
  "plot_type": "bar",
  "output_path": "/tmp/chart_exports/plot_xy123.png",
  "variables": ["region", "sales"],
  "title": "Sales Performance Comparison by Region"
}

Thought: Chart successfully created, showing sales performance comparison by different regions.

Final Answer: I've created a bar chart showing sales performance comparison by different regions. The chart has been saved to /tmp/chart_exports/plot_xy123.png. This chart clearly shows the sales data differences across regions.
```

---

## 3. chart_export_data

### Function Description

Convert and export data files to different formats, with support for variable selection and format customization.
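To make "variable selection" and "format conversion" concrete, here is a minimal standard-library sketch of the idea: read CSV text, keep only the requested columns, and emit JSON records. It is illustrative only and does not reflect ChartTool's actual implementation; `export_columns_as_json` is a hypothetical name, and the error text is borrowed from this guide's error table.

```python
import csv
import io
import json


def export_columns_as_json(csv_text: str, variables=None) -> str:
    """Convert CSV text to a JSON array of records, keeping only `variables`.

    If `variables` is None or empty, all columns are exported.
    Values stay as strings because the csv module does not infer types.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    selected = list(variables) if variables else list(columns)
    # Reject column names that are not in the header row.
    missing = [name for name in selected if name not in columns]
    if missing:
        raise ValueError(f"Variables not found in dataset: {missing}")
    records = [{name: row[name] for name in selected} for row in reader]
    return json.dumps(records, indent=2)


sample = "product,sales,region\nWidget,120,North\nGadget,95,South\n"
print(export_columns_as_json(sample, variables=["product", "sales"]))
```

Checking the requested columns against the header before converting is the same precaution the guide recommends for the agent: read the data first, then select columns that are known to exist.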
### Supported Export Formats

| Format | Value | Description | Use Case |
|--------|-------|-------------|----------|
| JSON | `json` | Structured data format | API interaction, data exchange |
| CSV | `csv` | Comma-separated values format | Excel compatibility, simple analysis |
| HTML | `html` | Web table format | Report display, web integration |
| Excel | `excel` | Excel workbook format | Office software compatibility |
| Markdown | `markdown` | Markdown table format | Documentation writing, GitHub display |

### LangChain Invocation Method

```python
# agent_executor is assumed to be an AgentExecutor already configured
# with the ChartTool tools.

# Basic export
result = agent_executor.invoke({
    "input": """Use chart_export_data to export data:
    file_path: /data/sales.csv
    format: excel
    export_path: /exports/sales_data.xlsx
    """
})

# Selective export of specific variables
result = agent_executor.invoke({
    "input": """Use chart_export_data tool:
    file_path: /data/large_dataset.csv
    format: json
    variables: ["id", "name", "score"]
    export_path: /exports/filtered_data.json
    """
})
```

### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `file_path` | str | ✅ | Source data file path |
| `format` | str | ✅ | Export format (json/csv/html/excel/markdown) |
| `variables` | List[str] | ❌ | List of column names to export (if not specified, exports all columns) |
| `export_path` | str | ❌ | Export file save path (if not specified, auto-generated) |
| `export_format` | str | ❌ | Result metadata export format |

### Use Cases

#### 📄 **Format Conversion**

```
Source file: data.xlsx → Target format: CSV
Use case: Data migration between different systems
```

#### 🎯 **Selective Data Export**

```
variables: ["name", "score", "grade"]
Use case: Extract specific fields for subsequent analysis
```

#### 📊 **Report Generation**

```
format: html or markdown
Use case: Create readable data reports
```

#### 💾 **Data Backup**

```
format: excel or json
Use case: Data archiving and long-term storage
```

### Return Result

```json
{
  "format": "excel",
  "path": "/exports/sales_data.xlsx",
  "rows": 5000,
  "columns": 4,
  "variables": ["date", "product", "sales", "region"],
  "exported_to": "/metadata/export_summary.json"
}
```

The `exported_to` field is present only if a metadata export was specified.

### Agent Invocation Example

```
Human: Convert sales data to Excel format, keeping only product name and sales columns

Agent: I'll help you convert the sales data to Excel format with only the two columns you specified.

Action: chart_export_data
Action Input: {
  "file_path": "/data/sales.csv",
  "format": "excel",
  "variables": ["product", "sales"],
  "export_path": "/exports/product_sales.xlsx"
}

Observation: {
  "format": "excel",
  "path": "/exports/product_sales.xlsx",
  "rows": 5000,
  "columns": 2,
  "variables": ["product", "sales"]
}

Thought: Data successfully converted to Excel format with only product name and sales columns.

Final Answer: Data conversion complete! I've converted the sales data to Excel format, saved to /exports/product_sales.xlsx. The exported data contains 5000 records with only the two columns you requested: product name and sales.
```

---

## Complete Workflow Example

### Scenario: Complete Data Analysis Workflow

```
Human: I have a customer data file customer_data.csv, please help me analyze the data structure, create an age distribution chart, and export the results as a report

Agent: I'll help you complete this data analysis workflow.

Action: chart_read_data
Action Input: {"file_path": "/data/customer_data.csv"}

Observation: {
  "variables": ["customer_id", "age", "income", "city"],
  "observations": 2500,
  "dtypes": {"customer_id": "int64", "age": "int64", "income": "float64", "city": "object"},
  "preview": [...]
}

Thought: Data structure analysis complete, now creating age distribution histogram.

Action: chart_visualize
Action Input: {
  "file_path": "/data/customer_data.csv",
  "plot_type": "histogram",
  "x": "age",
  "title": "Customer Age Distribution Chart",
  "output_path": "/reports/age_distribution.png"
}

Observation: {
  "plot_type": "histogram",
  "output_path": "/reports/age_distribution.png",
  "variables": ["age"],
  "title": "Customer Age Distribution Chart"
}

Thought: Age distribution chart created, now exporting analysis results as HTML report.

Action: chart_export_data
Action Input: {
  "file_path": "/data/customer_data.csv",
  "format": "html",
  "export_path": "/reports/customer_analysis.html"
}

Observation: {
  "format": "html",
  "path": "/reports/customer_analysis.html",
  "rows": 2500,
  "columns": 4,
  "variables": ["customer_id", "age", "income", "city"]
}

Thought: Complete analysis workflow finished.

Final Answer: Analysis complete! Your customer data contains 2500 records with 4 fields. I have:
1. Analyzed the data structure and basic information
2. Created an age distribution histogram (saved to /reports/age_distribution.png)
3. Exported a complete HTML data report (saved to /reports/customer_analysis.html)

The data shows customer age, income, and city distribution, which can be used for further business analysis.
```

---

## Advanced Usage Tips

The snippets below assume `agent_executor` is an AgentExecutor already configured with the ChartTool tools, as in the earlier invocation examples.

### 1. Batch Analysis of Multiple Files

```python
# Analyze multiple files sequentially
files = ["/data/q1.csv", "/data/q2.csv", "/data/q3.csv"]
for path in files:
    agent_executor.invoke({
        "input": f"Use chart_read_data to analyze file {path}"
    })
```

### 2. Create Comparison Charts

```python
# Multi-variable comparison visualization
agent_executor.invoke({
    "input": """Use chart_visualize to create a comparison chart:
    file_path: /data/performance.csv
    plot_type: bar
    x: team
    y: score
    hue: quarter
    title: Team Performance Comparison by Quarter
    """
})
```

### 3. Correlation Analysis

```python
# Create correlation heatmap
agent_executor.invoke({
    "input": """Use chart_visualize for correlation analysis:
    file_path: /data/features.csv
    plot_type: heatmap
    variables: ["feature1", "feature2", "feature3", "target"]
    title: Feature Correlation Matrix
    """
})
```

### 4. Custom Output Configuration

```python
# High-resolution chart export
agent_executor.invoke({
    "input": """Use chart_visualize to create a high-quality chart:
    file_path: /data/data.csv
    plot_type: line
    x: month
    y: revenue
    title: Monthly Revenue Trend
    figsize: [16, 10]
    dpi: 300
    output_path: /reports/high_res_chart.png
    """
})
```

---

## Error Handling

### Common Errors and Solutions

| Error Type | Cause | Solution |
|-----------|-------|----------|
| `File not found` | File path does not exist | Check that the file path is correct |
| `Variables not found in dataset` | Specified column names do not exist | Use `chart_read_data` first to view the available column names |
| `Extension not allowed` | Unsupported file format | Check the supported formats list |
| `Error creating visualization` | Chart creation failed | Check data types and parameter combinations |

### Best Practices

1. **📋 Read Data First**: Use `chart_read_data` to understand the data structure
2. **🎯 Clarify Objectives**: Choose a chart type that matches the purpose of the analysis
3. **🔍 Verify Column Names**: Ensure the specified column names exist in the data
4. **📐 Reasonable Configuration**: Adjust chart size and resolution based on data characteristics
5. **💾 Plan Paths**: Organize the output directory structure sensibly

---

## Configuration Options

ChartTool supports the following configuration options (set during tool initialization):

```python
config = {
    "export_dir": "/custom/export/path",      # Custom export directory
    "plot_dpi": 150,                          # Default chart resolution
    "plot_figsize": [12, 8],                  # Default chart size
    "allowed_extensions": [".csv", ".xlsx"],  # Limit allowed file formats
}
```

---

## Performance Optimization Recommendations

1. **🚀 Large Dataset Processing**: Use the `nrows` parameter to limit rows for sampling analysis
2. **💾 Memory Optimization**: Release memory promptly after processing large datasets to avoid excessive memory usage
3. **📁 File Organization**: Plan export directories sensibly to avoid file clutter
4. **🎨 Chart Optimization**: Choose DPI and size based on how the chart will be used
5. **🔄 Cache Utilization**: Repeated reads of the same file use the cache to improve performance

---

## Summary

ChartTool provides complete data analysis workflow support through the LangChain adapter:

- ✅ **Data Reading**: Supports 9 file extensions with flexible reading options
- ✅ **Visualization Creation**: 7 chart types with rich customization options
- ✅ **Data Export**: 5 export formats to meet different usage needs
- ✅ **Error Handling**: Complete input validation and exception handling
- ✅ **Performance Optimization**: Built-in cache, performance monitoring, and security checks

With these tools, a LangChain ReAct Agent can run a complete workflow from data reading and analysis through visualization to result export.
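The invocation examples throughout this guide share one shape: a tool name followed by `key: value` lines. A small helper can generate that text from a dictionary. This is a sketch under assumptions: `format_tool_prompt` is a hypothetical name, and the exact prompt wording is a choice made here, since the agent only needs natural-language instructions.

```python
import json


def format_tool_prompt(tool_name: str, params: dict) -> str:
    """Render a tool name and parameters in the `key: value` style used in this guide."""
    lines = [f"Use {tool_name} tool with the following parameters:"]
    for key, value in params.items():
        # Lists and dicts are written as JSON so they read like ["a", "b"].
        rendered = json.dumps(value) if isinstance(value, (list, dict)) else value
        lines.append(f"{key}: {rendered}")
    return "\n".join(lines)


prompt = format_tool_prompt(
    "chart_visualize",
    {
        "file_path": "/data/features.csv",
        "plot_type": "heatmap",
        "variables": ["feature1", "target"],
    },
)
print(prompt)
# The result could then be passed as {"input": prompt} to an agent executor
# that is already configured with the ChartTool tools.
```

Building prompts from a dictionary keeps batch runs consistent, for example when the same chart is requested for several files.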