Image Tool Configuration Guide

Overview

The Image Tool provides image processing capabilities including loading, OCR text extraction, metadata retrieval, resizing, and filtering. It can be configured via environment variables using the IMAGE_TOOL_ prefix or through programmatic configuration when initializing the tool.

Using .env Files in Your Project

When using aiecs as a dependency in your project, you can store configuration in a .env file for convenience. The Image Tool reads from environment variables that are already loaded into the process, so you need to load the .env file in your application before importing aiecs tools.

Setting Up .env Files

1. Install python-dotenv:

pip install python-dotenv

2. Create a .env file in your project root:

# .env file in your project root
IMAGE_TOOL_MAX_FILE_SIZE_MB=50
IMAGE_TOOL_ALLOWED_EXTENSIONS=[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]
IMAGE_TOOL_TESSERACT_POOL_SIZE=2
IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng

3. Load the .env file in your application:

# main.py or app.py - at the top of your entry point
from dotenv import load_dotenv

# Load environment variables from .env file
# This must be done BEFORE importing aiecs tools
load_dotenv()

# Now import and use aiecs tools
from aiecs.tools.task_tools.image_tool import ImageTool

# The tool will automatically use the environment variables
image_tool = ImageTool()

Multiple Environment Files

You can use different .env files for different environments:

import os
from dotenv import load_dotenv

# Load environment-specific configuration
env = os.getenv('APP_ENV', 'development')

if env == 'production':
    load_dotenv('.env.production')
elif env == 'staging':
    load_dotenv('.env.staging')
else:
    load_dotenv('.env.development')

from aiecs.tools.task_tools.image_tool import ImageTool
image_tool = ImageTool()

Example .env.production:

# Production settings - strict limits for security
IMAGE_TOOL_MAX_FILE_SIZE_MB=20
IMAGE_TOOL_ALLOWED_EXTENSIONS=[".jpg",".jpeg",".png"]
IMAGE_TOOL_TESSERACT_POOL_SIZE=4

Example .env.development:

# Development settings - relaxed limits for testing
IMAGE_TOOL_MAX_FILE_SIZE_MB=100
IMAGE_TOOL_ALLOWED_EXTENSIONS=[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]
IMAGE_TOOL_TESSERACT_POOL_SIZE=1

Best Practices for .env Files

Never commit .env files to version control - Add .env to your .gitignore:

# .gitignore
.env
.env.local
.env.*.local
.env.production
.env.staging

Provide a template - Create .env.example with documented dummy values:

# .env.example
# Image Tool Configuration

# Maximum file size in megabytes
IMAGE_TOOL_MAX_FILE_SIZE_MB=50

# Allowed image file extensions (JSON array)
IMAGE_TOOL_ALLOWED_EXTENSIONS=[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]

# Number of Tesseract OCR processes
IMAGE_TOOL_TESSERACT_POOL_SIZE=2

# Default OCR language (e.g., 'eng', 'chi_sim', 'eng+chi_sim')
IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng

Document your variables - Add comments explaining each setting
Use load_dotenv() early - Call it at the very top of your entry point, before any aiecs imports
Format complex types correctly:
- Integers: plain numbers without quotes, e.g. 50 or 100
- Lists: JSON array format with double quotes, e.g. [".jpg",".png"]
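One way to keep a .env.example template useful is to check at startup that every key it lists is actually set. The sketch below is a minimal, stdlib-only illustration: `keys_in_template` and `missing_keys` are hypothetical helper names (not part of aiecs or python-dotenv), and the parser only understands simple KEY=VALUE lines.

```python
def keys_in_template(template_text):
    """Return the variable names declared in a .env-style template."""
    keys = []
    for raw in template_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        # Tolerate an optional leading "export ".
        if key.startswith("export "):
            key = key[len("export "):].strip()
        keys.append(key)
    return keys


def missing_keys(template_text, environ):
    """List template keys that are absent from the given environment mapping."""
    return [key for key in keys_in_template(template_text) if key not in environ]
```

Calling `missing_keys(open('.env.example').read(), os.environ)` after `load_dotenv()` would report any variable the template documents but the environment lacks.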

Configuration Options

1. Max File Size (MB)

Environment Variable: IMAGE_TOOL_MAX_FILE_SIZE_MB

Type: Integer

Default: 50

Description: Maximum allowed file size in megabytes. Files larger than this limit will be rejected during validation for security and performance reasons.

Common Values:

10 - Conservative limit for public APIs
20 - Moderate limit for web applications
50 - Default (balanced)
100 - Generous limit for internal tools

Example:

export IMAGE_TOOL_MAX_FILE_SIZE_MB=20

Security Note: Keep this value as low as practical for your use case to prevent memory exhaustion attacks.

2. Allowed Extensions

Environment Variable: IMAGE_TOOL_ALLOWED_EXTENSIONS

Type: List[str]

Default: ['.jpg', '.jpeg', '.png', '.bmp', '.tiff', '.gif']

Description: List of allowed image file extensions. This is a critical security feature that prevents processing of unauthorized or potentially malicious file types.

Format: JSON array string with double quotes

Supported Formats:

.jpg, .jpeg - JPEG images
.png - PNG images
.bmp - Bitmap images
.tiff - TIFF images
.gif - GIF images

Example:

# Strict - Only common web formats
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'

# Lenient - All supported formats
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]'

Security Note: Only allow extensions that your application actually needs to process.

3. Tesseract Pool Size

Environment Variable: IMAGE_TOOL_TESSERACT_POOL_SIZE

Type: Integer

Default: 2

Description: Number of Tesseract OCR processes to maintain in the pool for parallel text extraction. Higher values allow more concurrent OCR operations but consume more system resources.

Common Values:

1 - Single process (development/testing)
2 - Default (balanced)
4 - Higher concurrency for production
8 - Maximum concurrency for high-load scenarios

Example:

export IMAGE_TOOL_TESSERACT_POOL_SIZE=4

Performance Note: Set this value based on the number of concurrent OCR requests you expect and the CPU cores available. Each process consumes memory and CPU.

Requirement: Tesseract must be installed on the system for OCR functionality to work.

4. Default OCR Language

Environment Variable: IMAGE_TOOL_DEFAULT_OCR_LANGUAGE

Type: String

Default: eng

Description: Default language code for OCR text extraction. This value is used when the lang parameter is not specified in the ocr() method call. It accepts a single language code (e.g., eng, chi_sim) or several codes joined with the + separator (e.g., eng+chi_sim).

Common Language Codes:

eng - English (default)
chi_sim - Simplified Chinese
chi_tra - Traditional Chinese
spa - Spanish
fra - French
jpn - Japanese
deu - German

Multi-Language Support: You can specify multiple languages using the + separator. Tesseract will try to recognize text in any of the specified languages:

eng+chi_sim - English and Simplified Chinese
eng+spa+fra - English, Spanish, and French

Examples:

# Single language (English)
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng

# Single language (Simplified Chinese)
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=chi_sim

# Multi-language (English + Simplified Chinese)
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng+chi_sim

Usage:

from aiecs.tools.task_tools.image_tool import ImageTool

# Initialize with default language from config
image_tool = ImageTool()

# Uses default_ocr_language from config (e.g., 'eng+chi_sim')
text = image_tool.ocr('image.png')

# Override default language for this call
text = image_tool.ocr('image.png', lang='chi_sim')

# Use multi-language for this call
text = image_tool.ocr('image.png', lang='eng+jpn')

Note: Make sure the corresponding Tesseract language data packs are installed on your system. See the “Language Data” section below for installation instructions.
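Because a multi-language value such as eng+chi_sim needs a data pack for every code it names, it can help to check the value before running OCR. The following is a small sketch under stated assumptions: `missing_languages` is a hypothetical helper, and the `installed` set is supplied by the caller (for example, collected from the output of `tesseract --list-langs`).

```python
def missing_languages(lang_spec, installed):
    """Return the codes in a '+'-joined language spec that are not installed."""
    requested = [code for code in lang_spec.split("+") if code]
    return [code for code in requested if code not in installed]


# Example: only English and the orientation pack are installed.
print(missing_languages("eng+chi_sim", {"eng", "osd"}))
```

An empty result means every requested language has a matching data pack.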

Usage Examples

Example 1: Basic Environment Configuration

# Set custom limits and pool size
export IMAGE_TOOL_MAX_FILE_SIZE_MB=30
export IMAGE_TOOL_TESSERACT_POOL_SIZE=4
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'

# Run your application
python app.py

Example 2: Security-Focused Configuration

# Strict limits for public-facing applications
export IMAGE_TOOL_MAX_FILE_SIZE_MB=10
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'
export IMAGE_TOOL_TESSERACT_POOL_SIZE=2

Example 3: High-Performance Configuration

# Optimized for internal high-throughput processing
export IMAGE_TOOL_MAX_FILE_SIZE_MB=100
export IMAGE_TOOL_TESSERACT_POOL_SIZE=8
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png",".bmp",".tiff"]'

Example 4: Programmatic Configuration

from aiecs.tools.task_tools.image_tool import ImageTool

# Initialize with custom configuration
image_tool = ImageTool(config={
    'max_file_size_mb': 30,
    'allowed_extensions': ['.jpg', '.jpeg', '.png'],
    'tesseract_pool_size': 4,
    'default_ocr_language': 'eng+chi_sim'  # Multi-language support
})

Example 5: Mixed Configuration

Environment variables are used as defaults, but can be overridden programmatically:

# Set environment defaults
export IMAGE_TOOL_MAX_FILE_SIZE_MB=50

# Override for a specific instance (Python)
from aiecs.tools.task_tools.image_tool import ImageTool

image_tool = ImageTool(config={
    'max_file_size_mb': 20  # This overrides the environment variable
})

Example 6: Dynamic Configuration Update

from aiecs.tools.task_tools.image_tool import ImageTool

# Initialize with defaults
image_tool = ImageTool()

# Update configuration at runtime
image_tool.update_config({
    'max_file_size_mb': 100,
    'tesseract_pool_size': 6,  # Pool will be reinitialized
    'default_ocr_language': 'chi_sim'  # Change default language
})

Configuration Priority

When the Image Tool is initialized, configuration values are resolved in the following order (highest to lowest priority):

Programmatic config - Values passed to the constructor
Environment variables - Values set via IMAGE_TOOL_* variables
Default values - Built-in defaults as specified above
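The resolution order above can be pictured as three layers merged from lowest to highest priority. This is only an illustrative sketch, not the tool's actual implementation: `resolve_config` and `DEFAULTS` are hypothetical names, and the defaults are copied from the Configuration Options section.

```python
import json
import os

# Built-in defaults, as listed in this guide.
DEFAULTS = {
    "max_file_size_mb": 50,
    "allowed_extensions": [".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".gif"],
    "tesseract_pool_size": 2,
    "default_ocr_language": "eng",
}


def resolve_config(overrides=None, environ=None, prefix="IMAGE_TOOL_"):
    """Merge defaults, then environment variables, then programmatic overrides."""
    environ = os.environ if environ is None else environ
    resolved = dict(DEFAULTS)
    for key, default in DEFAULTS.items():
        raw = environ.get(prefix + key.upper())
        if raw is None:
            continue  # No environment value: keep the default.
        if isinstance(default, int):
            resolved[key] = int(raw)
        elif isinstance(default, list):
            resolved[key] = json.loads(raw)
        else:
            resolved[key] = raw
    # Programmatic values win over everything else.
    resolved.update(overrides or {})
    return resolved
```

With IMAGE_TOOL_MAX_FILE_SIZE_MB=20 in the environment and `{'max_file_size_mb': 30}` passed in, the resolved value is 30; settings not mentioned in either place keep their defaults.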

Data Type Parsing

Integer Values

Integers should be provided as plain digits, with no quotes or units:

export IMAGE_TOOL_MAX_FILE_SIZE_MB=50
export IMAGE_TOOL_TESSERACT_POOL_SIZE=4

List Values

Lists must be provided as JSON array strings with double quotes:

# Correct
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".png",".gif"]'

# Incorrect (will not parse)
export IMAGE_TOOL_ALLOWED_EXTENSIONS=".jpg,.png,.gif"

Important: Use single quotes for the shell, double quotes for JSON:

export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'
#                                    ^                      ^
#                                    Single quotes for shell
#                                       ^     ^     ^
#                                       Double quotes for JSON
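To see why the comma-separated form is rejected, it helps to parse both forms with a JSON parser. The sketch below assumes the list is decoded with Python's standard `json` module, which the guide does not state explicitly; `parse_extensions` is a hypothetical helper. Note that the shell strips the outer single quotes, so the process only ever sees the JSON text.

```python
import json


def parse_extensions(raw):
    """Decode a JSON array of strings, raising ValueError on anything else."""
    value = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(value, list) or not all(isinstance(x, str) for x in value):
        raise ValueError("expected a JSON array of strings")
    return value


print(parse_extensions('[".jpg",".png",".gif"]'))

for bad in (".jpg,.png,.gif", "['.jpg','.png']"):
    try:
        parse_extensions(bad)
    except ValueError as exc:
        print(f"rejected {bad!r}: {exc}")
```

Both rejected inputs fail for the same reason: neither is valid JSON, since JSON strings require double quotes and arrays require brackets.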

Validation

Automatic Type Validation

Pydantic automatically validates configuration values:

max_file_size_mb must be a positive integer
allowed_extensions must be a list of strings
tesseract_pool_size must be a positive integer
default_ocr_language must be a non-empty string
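The rules above can be expressed compactly in code. The sketch below uses a plain dataclass as a stand-in so it runs without extra dependencies; it is not the tool's Pydantic model, the class name `ImageToolSettings` is hypothetical, and a real Pydantic model may also coerce types, which this sketch does not attempt.

```python
from dataclasses import dataclass, field


@dataclass
class ImageToolSettings:
    max_file_size_mb: int = 50
    allowed_extensions: list = field(
        default_factory=lambda: [".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".gif"]
    )
    tesseract_pool_size: int = 2
    default_ocr_language: str = "eng"

    def __post_init__(self):
        # Both numeric settings must be positive integers (bool is excluded).
        for name in ("max_file_size_mb", "tesseract_pool_size"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer")
        exts = self.allowed_extensions
        if not isinstance(exts, list) or not all(isinstance(e, str) for e in exts):
            raise ValueError("allowed_extensions must be a list of strings")
        lang = self.default_ocr_language
        if not isinstance(lang, str) or not lang.strip():
            raise ValueError("default_ocr_language must be a non-empty string")
```

Constructing `ImageToolSettings(tesseract_pool_size=0)` raises a ValueError, while `ImageToolSettings()` yields the documented defaults.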

File Validation

When processing images, the tool validates:

File existence - File must exist at the specified path
File extension - Must be in allowed_extensions list
File size - Must not exceed max_file_size_mb limit
File integrity - Must be a valid image file
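The first three checks are easy to reproduce with the standard library, which can be useful for rejecting a file before handing it to the tool. This is a hedged sketch, not the tool's code: `validate_image_path` is a hypothetical helper, it raises built-in exceptions rather than the tool's own error types, and it omits the integrity check, which requires actually decoding the image.

```python
from pathlib import Path


def validate_image_path(path, allowed_extensions, max_file_size_mb):
    """Check existence, extension, and size; return the path as a Path object."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"File not found: {p}")
    ext = p.suffix.lower()
    if ext not in allowed_extensions:
        raise ValueError(f"Extension '{ext}' not allowed")
    size_mb = p.stat().st_size / (1024 * 1024)
    if size_mb > max_file_size_mb:
        raise ValueError(f"File too large: {size_mb:.1f}MB, max {max_file_size_mb}MB")
    return p
```

The extension comparison lowercases the suffix, so the allowed list is assumed to contain lowercase entries such as '.jpg'.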

Security Validation

The tool includes multiple security layers:

Extension whitelist prevents processing unauthorized file types
File size limits prevent memory exhaustion
Path normalization prevents directory traversal attacks
Output path validation prevents overwriting existing files
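The guide does not show how the tool implements path normalization or output path validation. As a general illustration of the two ideas, the sketch below resolves a user-supplied path against a base directory and refuses to reuse an existing output path; `resolve_inside` and `check_output_path` are hypothetical helpers, and the base-directory restriction is an assumption of this example.

```python
import os


def resolve_inside(base_dir, user_path):
    """Normalize user_path and ensure the result stays under base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    # After normalization, '..' segments and symlinks can no longer hide an escape.
    if os.path.commonpath([base, target]) != base:
        raise PermissionError(f"Path escapes base directory: {user_path}")
    return target


def check_output_path(path):
    """Refuse to write over a file that already exists."""
    if os.path.exists(path):
        raise FileExistsError(f"Output path already exists: {path}")
    return path
```

A path such as '../../etc/passwd' normalizes to a location outside the base directory and is rejected, while 'img/photo.jpg' resolves normally.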

Tesseract Setup

The Image Tool uses Tesseract OCR for text extraction. Follow these steps to set it up:

Installation

Ubuntu/Debian:

sudo apt-get update
sudo apt-get install tesseract-ocr

macOS:

brew install tesseract

Windows: Download and install from: https://github.com/UB-Mannheim/tesseract/wiki

Language Data

Install additional language packs as needed:

# English (usually included by default)
sudo apt-get install tesseract-ocr-eng

# Chinese
sudo apt-get install tesseract-ocr-chi-sim tesseract-ocr-chi-tra

# Spanish
sudo apt-get install tesseract-ocr-spa

# French
sudo apt-get install tesseract-ocr-fra

Verify Installation

tesseract --version

You should see output like:

tesseract 4.1.1
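The same check can be done from Python, which is handy in a startup script. The sketch below uses `shutil.which` to locate the binary and `tesseract --list-langs` to list installed language data. The helper names are hypothetical, and the parser assumes the listing starts with a header line beginning "List of available languages", followed by one language code per line.

```python
import shutil
import subprocess


def parse_list_langs(output):
    """Extract language codes from the text printed by 'tesseract --list-langs'."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("List of available languages"):
            continue  # Skip blanks and the header line.
        langs.append(line)
    return langs


def installed_languages():
    """Return installed language codes, or None if tesseract is not on PATH."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--list-langs"], capture_output=True, text=True, check=False
    )
    # Some builds print the listing to stderr instead of stdout.
    return parse_list_langs(result.stdout or result.stderr)
```

A return value of None corresponds to the "Tesseract not found" case; a list without the code you need means the matching language pack still has to be installed.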

Using OCR with Different Languages

Method 1: Configure Default Language

Set the default language via environment variable or config:

# Set default to Chinese
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=chi_sim

from aiecs.tools.task_tools.image_tool import ImageTool

image_tool = ImageTool()

# Uses configured default (chi_sim)
text = image_tool.ocr('chinese_image.png')

# Override for specific call
text = image_tool.ocr('english_image.png', lang='eng')

Method 2: Multi-Language Support

Enable multi-language recognition by configuring multiple languages:

# Set default to English + Chinese
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng+chi_sim

from aiecs.tools.task_tools.image_tool import ImageTool

image_tool = ImageTool()

# Uses configured default (eng+chi_sim) - recognizes both English and Chinese
text = image_tool.ocr('mixed_image.png')

# Use different multi-language combination for specific call
text = image_tool.ocr('image.png', lang='eng+jpn+spa')

Method 3: Per-Call Language Specification

Specify language for each OCR call:

from aiecs.tools.task_tools.image_tool import ImageTool

image_tool = ImageTool()

# No lang argument: uses the configured default (eng unless overridden)
text = image_tool.ocr('image.png')

# Chinese
text = image_tool.ocr('chinese_image.png', lang='chi_sim')

# Spanish
text = image_tool.ocr('spanish_image.png', lang='spa')

# Multi-language
text = image_tool.ocr('mixed_image.png', lang='eng+chi_sim')

Operations Supported

The Image Tool supports the following operations:

1. Load

Load an image and return its dimensions and color mode.

info = image_tool.load('photo.jpg')
# Returns: {'size': (width, height), 'mode': 'RGB'}

2. OCR (Optical Character Recognition)

Extract text from an image.

text = image_tool.ocr('document.png')
# Returns: extracted text as string

3. Metadata

Retrieve image metadata including EXIF data.

metadata = image_tool.metadata('photo.jpg', include_exif=True)
# Returns: {'size': tuple, 'mode': str, 'exif': dict}

4. Resize

Resize an image to specified dimensions.

result = image_tool.resize('input.jpg', 'output.jpg', width=800, height=600)
# Returns: {'success': True, 'output_path': 'output.jpg'}

5. Filter

Apply image filters (blur, sharpen, edge_enhance).

result = image_tool.filter('input.jpg', 'output.jpg', filter_type='sharpen')
# Returns: {'success': True, 'output_path': 'output.jpg'}

Troubleshooting

Issue: File size validation fails

Error: File too large: 75.3MB, max 50MB

Solution:

# Increase max file size limit
export IMAGE_TOOL_MAX_FILE_SIZE_MB=100

Issue: Extension not allowed

Error: Extension '.webp' not allowed

Solution:

# Add the extension to allowed list (if safe)
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png",".webp"]'

Issue: Tesseract not found

Error: Tesseract not found; OCR will be disabled

Solution:

# Install Tesseract
sudo apt-get install tesseract-ocr  # Ubuntu/Debian
brew install tesseract              # macOS

# Verify installation
tesseract --version

Issue: OCR returns empty or garbled text

Causes: Poor image quality, wrong language, unsupported format

Solutions:

Ensure image has good contrast and resolution
Configure default language: export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=chi_sim
Specify correct language per call: ocr('image.png', lang='chi_sim')
Use multi-language format: ocr('image.png', lang='eng+chi_sim')
Pre-process image (increase contrast, remove noise)
Install appropriate language data packs

Issue: Pool size too small for concurrent requests

Error: No Tesseract processes available

Solution:

# Increase pool size
export IMAGE_TOOL_TESSERACT_POOL_SIZE=8
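When raising the pool size is not an option, an alternative is to cap concurrency on the caller's side so that no more OCR calls run at once than the pool can serve. This is a caller-side sketch using a standard semaphore; `BoundedRunner` is a hypothetical class, and how the tool itself behaves when the pool is exhausted is not specified beyond the error shown above.

```python
import threading


class BoundedRunner:
    """Run callables with at most `limit` executing at the same time."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)

    def run(self, func, *args, **kwargs):
        # Blocks until a slot is free, then releases it when func returns or raises.
        with self._slots:
            return func(*args, **kwargs)


# Hypothetical usage, with the limit matching IMAGE_TOOL_TESSERACT_POOL_SIZE:
#   runner = BoundedRunner(2)
#   text = runner.run(image_tool.ocr, 'image.png')
```

Extra requests wait for a free slot instead of failing, trading latency for stability under bursts.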

Issue: List parsing error

Error: Configuration parsing fails for allowed_extensions

Solution:

# Use proper JSON array syntax with double quotes
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".png"]'

# NOT: ['.jpg','.png'] or .jpg,.png

Issue: Memory issues with large images

Causes: Large image files consuming too much memory

Solutions:

Reduce max_file_size_mb limit
Implement image downsampling before processing
Reduce tesseract_pool_size to free memory
Monitor and increase system memory if needed
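For the downsampling suggestion, the target size can be computed so the longest side fits a chosen limit while the aspect ratio is preserved. The helper below is a hypothetical sketch (`fit_within` is not part of the tool); it only does the arithmetic and leaves the actual resizing to the resize operation.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) down so the longest side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # Already small enough: never upscale.
    scale = max_side / longest
    # Keep each dimension at least 1 pixel after rounding.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000, 2000))  # (2000, 1500)
```

The resulting pair can then be passed as width and height to `image_tool.resize('input.jpg', 'output.jpg', width=..., height=...)`.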

Best Practices

Security

Minimize allowed extensions - Only allow file types you actually need
Set conservative file size limits - Use smallest practical value
Validate file content - Don’t trust extensions alone; the tool’s integrity check (via Pillow) rejects files that are not valid images
Sanitize file paths - Tool automatically normalizes paths
Use output path validation - Prevents overwriting existing files

Performance

Tune pool size - Match tesseract_pool_size to expected concurrent OCR load
Optimize file sizes - Compress images before processing
Cache results - Leverage BaseTool’s built-in caching
Monitor resources - Watch memory and CPU usage under load
Use appropriate formats - PNG for text, JPEG for photos

Development vs Production

Development:

IMAGE_TOOL_MAX_FILE_SIZE_MB=100
IMAGE_TOOL_TESSERACT_POOL_SIZE=1
IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]'
IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng

Production:

IMAGE_TOOL_MAX_FILE_SIZE_MB=20
IMAGE_TOOL_TESSERACT_POOL_SIZE=4
IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'
IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng+chi_sim  # Multi-language support

Error Handling

Always wrap image operations in try-except blocks:

from aiecs.tools.task_tools.image_tool import ImageTool, FileOperationError, SecurityError

image_tool = ImageTool()

try:
    result = image_tool.load('photo.jpg')
except FileOperationError as e:
    print(f"File operation failed: {e}")
except SecurityError as e:
    print(f"Security validation failed: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")

Support

For issues or questions about Image Tool configuration:

Check the tool source code for implementation details
Review Pillow and Tesseract documentation for specific functionality
Consult the main documentation for architecture overview
Test with simple images first to isolate configuration vs. image quality issues
