Image Tool Configuration Guide

Overview

The Image Tool provides image processing capabilities including loading, OCR text extraction, metadata retrieval, resizing, and filtering. It can be configured via environment variables using the IMAGE_TOOL_ prefix or through programmatic configuration when initializing the tool.

Using .env Files in Your Project

When using aiecs as a dependency in your project, you can store configuration in a .env file for convenience. The Image Tool reads from environment variables that are already loaded into the process, so you need to load the .env file in your application before importing aiecs tools.

Setting Up .env Files

1. Install python-dotenv:

pip install python-dotenv

2. Create a .env file in your project root:

# .env file in your project root
IMAGE_TOOL_MAX_FILE_SIZE_MB=50
IMAGE_TOOL_ALLOWED_EXTENSIONS=[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]
IMAGE_TOOL_TESSERACT_POOL_SIZE=2
IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng

3. Load the .env file in your application:

# main.py or app.py - at the top of your entry point
from dotenv import load_dotenv

# Load environment variables from .env file
# This must be done BEFORE importing aiecs tools
load_dotenv()

# Now import and use aiecs tools
from aiecs.tools.task_tools.image_tool import ImageTool

# The tool will automatically use the environment variables
image_tool = ImageTool()
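As a quick sanity check, you can list the IMAGE_TOOL_ settings that are visible in the process environment after load_dotenv() has run. The sketch below uses only the standard library; the helper name image_tool_env is hypothetical and is not part of aiecs.

```python
import os


def image_tool_env(environ=None):
    """Return the IMAGE_TOOL_* variables visible in the given environment."""
    environ = os.environ if environ is None else environ
    prefix = "IMAGE_TOOL_"
    return {name: value for name, value in environ.items() if name.startswith(prefix)}


if __name__ == "__main__":
    # Run after load_dotenv(); an empty listing suggests the .env file was not loaded.
    for name, value in sorted(image_tool_env().items()):
        print(f"{name}={value}")
```

If the listing is empty, check that load_dotenv() is called before the aiecs import and that the .env file is in the directory you expect.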

Multiple Environment Files

You can use different .env files for different environments:

import os
from dotenv import load_dotenv

# Load environment-specific configuration
env = os.getenv('APP_ENV', 'development')

if env == 'production':
    load_dotenv('.env.production')
elif env == 'staging':
    load_dotenv('.env.staging')
else:
    load_dotenv('.env.development')

from aiecs.tools.task_tools.image_tool import ImageTool
image_tool = ImageTool()

Example .env.production:

# Production settings - strict limits for security
IMAGE_TOOL_MAX_FILE_SIZE_MB=20
IMAGE_TOOL_ALLOWED_EXTENSIONS=[".jpg",".jpeg",".png"]
IMAGE_TOOL_TESSERACT_POOL_SIZE=4

Example .env.development:

# Development settings - relaxed limits for testing
IMAGE_TOOL_MAX_FILE_SIZE_MB=100
IMAGE_TOOL_ALLOWED_EXTENSIONS=[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]
IMAGE_TOOL_TESSERACT_POOL_SIZE=1

Best Practices for .env Files

  1. Never commit .env files to version control - Add .env to your .gitignore:

    # .gitignore
    .env
    .env.local
    .env.*.local
    .env.production
    .env.staging
    
  2. Provide a template - Create .env.example with documented dummy values:

    # .env.example
    # Image Tool Configuration
    
    # Maximum file size in megabytes
    IMAGE_TOOL_MAX_FILE_SIZE_MB=50
    
    # Allowed image file extensions (JSON array)
    IMAGE_TOOL_ALLOWED_EXTENSIONS=[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]
    
    # Number of Tesseract OCR processes
    IMAGE_TOOL_TESSERACT_POOL_SIZE=2
    
    # Default OCR language (e.g., 'eng', 'chi_sim', 'eng+chi_sim')
    IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng
    
  3. Document your variables - Add comments explaining each setting

  4. Use load_dotenv() early - Call it at the very top of your entry point, before any aiecs imports

  5. Format complex types correctly:

    • Integers: plain numbers without quotes, for example 50 or 100

    • Lists: Use JSON array format with double quotes: [".jpg",".png"]

Configuration Options

1. Max File Size (MB)

Environment Variable: IMAGE_TOOL_MAX_FILE_SIZE_MB

Type: Integer

Default: 50

Description: Maximum allowed file size in megabytes. Files larger than this limit will be rejected during validation for security and performance reasons.

Common Values:

  • 10 - Conservative limit for public APIs

  • 20 - Moderate limit for web applications

  • 50 - Default (balanced)

  • 100 - Generous limit for internal tools

Example:

export IMAGE_TOOL_MAX_FILE_SIZE_MB=20

Security Note: Keep this value as low as practical for your use case to prevent memory exhaustion attacks.

2. Allowed Extensions

Environment Variable: IMAGE_TOOL_ALLOWED_EXTENSIONS

Type: List[str]

Default: ['.jpg', '.jpeg', '.png', '.bmp', '.tiff', '.gif']

Description: List of allowed image file extensions. This is a critical security feature that prevents processing of unauthorized or potentially malicious file types.

Format: JSON array string with double quotes

Supported Formats:

  • .jpg, .jpeg - JPEG images

  • .png - PNG images

  • .bmp - Bitmap images

  • .tiff - TIFF images

  • .gif - GIF images

Example:

# Strict - Only common web formats
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'

# Lenient - All supported formats
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]'

Security Note: Only allow extensions that your application actually needs to process.

3. Tesseract Pool Size

Environment Variable: IMAGE_TOOL_TESSERACT_POOL_SIZE

Type: Integer

Default: 2

Description: Number of Tesseract OCR processes to maintain in the pool for parallel text extraction. Higher values allow more concurrent OCR operations but consume more system resources.

Common Values:

  • 1 - Single process (development/testing)

  • 2 - Default (balanced)

  • 4 - Higher concurrency for production

  • 8 - Maximum concurrency for high-load scenarios

Example:

export IMAGE_TOOL_TESSERACT_POOL_SIZE=4

Performance Note: Set based on expected concurrent OCR requests and available CPU cores. Each process consumes memory and CPU.

Requirement: Tesseract must be installed on the system for OCR functionality to work.
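For the performance note above, one possible starting point is to cap the expected number of concurrent OCR requests at the number of CPU cores. This is a rule-of-thumb sketch rather than behavior of the tool; suggest_pool_size is a hypothetical helper, and the right value still depends on available memory and measured load.

```python
import os


def suggest_pool_size(expected_concurrent_requests, cpu_count=None):
    """Suggest a pool size: at least 1, at most the number of CPU cores."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(expected_concurrent_requests, cpu_count))


print(suggest_pool_size(8, cpu_count=4))  # prints 4
```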

4. Default OCR Language

Environment Variable: IMAGE_TOOL_DEFAULT_OCR_LANGUAGE

Type: String

Default: eng

Description: Default language code for OCR text extraction. This value is used when the lang parameter is not specified in the ocr() method call. Supports single language codes (e.g., eng, chi_sim) or multi-language format using + separator (e.g., eng+chi_sim).

Common Language Codes:

  • eng - English (default)

  • chi_sim - Simplified Chinese

  • chi_tra - Traditional Chinese

  • spa - Spanish

  • fra - French

  • jpn - Japanese

  • deu - German

Multi-Language Support: You can specify multiple languages using the + separator. Tesseract will try to recognize text in any of the specified languages:

  • eng+chi_sim - English and Simplified Chinese

  • eng+spa+fra - English, Spanish, and French

Examples:

# Single language (English)
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng

# Single language (Simplified Chinese)
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=chi_sim

# Multi-language (English + Simplified Chinese)
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng+chi_sim

Usage:

from aiecs.tools.task_tools.image_tool import ImageTool

# Initialize with default language from config
image_tool = ImageTool()

# Uses default_ocr_language from config (e.g., 'eng+chi_sim')
text = image_tool.ocr('image.png')

# Override default language for this call
text = image_tool.ocr('image.png', lang='chi_sim')

# Use multi-language for this call
text = image_tool.ocr('image.png', lang='eng+jpn')

Note: Make sure the corresponding Tesseract language data packs are installed on your system. See the “Language Data” section below for installation instructions.
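To check a language setting against the packs Tesseract can see, the language list reported by tesseract --list-langs can be compared with the codes in a value such as eng+chi_sim. The following is a minimal sketch using only the standard library; all three function names are hypothetical, and the parsing assumes the usual one-code-per-line output with a "List of available languages" header.

```python
import shutil
import subprocess


def parse_list_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines()]
    # Skip blank lines and the "List of available languages ..." header.
    return {line for line in lines if line and not line.startswith("List of")}


def missing_languages(lang_spec, installed):
    """Return the codes in a spec such as 'eng+chi_sim' that are not installed."""
    return [code for code in lang_spec.split("+") if code not in installed]


def installed_languages():
    """Ask the local Tesseract binary which language packs it can see."""
    if shutil.which("tesseract") is None:
        return set()
    result = subprocess.run(
        ["tesseract", "--list-langs"], capture_output=True, text=True, check=False
    )
    # The list may arrive on stdout or stderr depending on the version, so read both.
    return parse_list_langs(result.stdout + "\n" + result.stderr)


print(missing_languages("eng+chi_sim", installed_languages()))
```

An empty list means every requested language is available; any code that is printed needs its language pack installed.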

Usage Examples

Example 1: Basic Environment Configuration

# Set custom limits and pool size
export IMAGE_TOOL_MAX_FILE_SIZE_MB=30
export IMAGE_TOOL_TESSERACT_POOL_SIZE=4
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'

# Run your application
python app.py

Example 2: Security-Focused Configuration

# Strict limits for public-facing applications
export IMAGE_TOOL_MAX_FILE_SIZE_MB=10
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'
export IMAGE_TOOL_TESSERACT_POOL_SIZE=2

Example 3: High-Performance Configuration

# Optimized for internal high-throughput processing
export IMAGE_TOOL_MAX_FILE_SIZE_MB=100
export IMAGE_TOOL_TESSERACT_POOL_SIZE=8
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png",".bmp",".tiff"]'

Example 4: Programmatic Configuration

from aiecs.tools.task_tools.image_tool import ImageTool

# Initialize with custom configuration
image_tool = ImageTool(config={
    'max_file_size_mb': 30,
    'allowed_extensions': ['.jpg', '.jpeg', '.png'],
    'tesseract_pool_size': 4,
    'default_ocr_language': 'eng+chi_sim'  # Multi-language support
})

Example 5: Mixed Configuration

Environment variables are used as defaults, but can be overridden programmatically:

# Set environment defaults (shell)
export IMAGE_TOOL_MAX_FILE_SIZE_MB=50

# Override for a specific instance (Python)
from aiecs.tools.task_tools.image_tool import ImageTool

image_tool = ImageTool(config={
    'max_file_size_mb': 20  # This overrides the environment variable
})

Example 6: Dynamic Configuration Update

from aiecs.tools.task_tools.image_tool import ImageTool

# Initialize with defaults
image_tool = ImageTool()

# Update configuration at runtime
image_tool.update_config({
    'max_file_size_mb': 100,
    'tesseract_pool_size': 6,  # Pool will be reinitialized
    'default_ocr_language': 'chi_sim'  # Change default language
})

Configuration Priority

When the Image Tool is initialized, configuration values are resolved in the following order (highest to lowest priority):

  1. Programmatic config - Values passed to the constructor

  2. Environment variables - Values set via IMAGE_TOOL_* variables

  3. Default values - Built-in defaults as specified above
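The resolution order above can be pictured as three layers merged in turn. The sketch below is an illustration only, not the tool's implementation: resolve_config and DEFAULTS are hypothetical names, and the string-to-value conversion is an assumption based on the formats described in this guide.

```python
import json
import os

# Built-in defaults as listed in the Configuration Options section.
DEFAULTS = {
    "max_file_size_mb": 50,
    "allowed_extensions": [".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".gif"],
    "tesseract_pool_size": 2,
    "default_ocr_language": "eng",
}


def resolve_config(overrides=None, environ=None):
    """Merge defaults, IMAGE_TOOL_* variables, and explicit overrides."""
    environ = os.environ if environ is None else environ
    resolved = dict(DEFAULTS)  # lowest priority
    for key, default in DEFAULTS.items():
        raw = environ.get("IMAGE_TOOL_" + key.upper())
        if raw is None:
            continue
        if isinstance(default, int):
            resolved[key] = int(raw)
        elif isinstance(default, list):
            resolved[key] = json.loads(raw)
        else:
            resolved[key] = raw
    resolved.update(overrides or {})  # highest priority
    return resolved


env = {"IMAGE_TOOL_MAX_FILE_SIZE_MB": "30", "IMAGE_TOOL_TESSERACT_POOL_SIZE": "4"}
print(resolve_config({"max_file_size_mb": 20}, env))
```

In this example max_file_size_mb comes from the programmatic override (20), tesseract_pool_size from the environment (4), and the remaining values from the defaults.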

Data Type Parsing

Integer Values

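Integers should be provided as plain decimal digits, without quotes or units. Environment values are always strings, so they are parsed into integers when the configuration is loaded: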

export IMAGE_TOOL_MAX_FILE_SIZE_MB=50
export IMAGE_TOOL_TESSERACT_POOL_SIZE=4

List Values

Lists must be provided as JSON array strings with double quotes:

# Correct
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".png",".gif"]'

# Incorrect (will not parse)
export IMAGE_TOOL_ALLOWED_EXTENSIONS=".jpg,.png,.gif"

Important: Use single quotes for the shell, double quotes for JSON:

export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'
#                                    ^                      ^
#                                    Single quotes for shell
#                                       ^     ^     ^
#                                       Double quotes for JSON
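To see why the quoting matters, you can run candidate values through a JSON parser before exporting them. This sketch assumes list values are decoded as JSON, as the format above implies; parse_extension_list is a hypothetical helper, not part of aiecs.

```python
import json


def parse_extension_list(raw):
    """Parse a JSON array string such as '[".jpg",".png"]' into a list of strings."""
    value = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad syntax
    if not isinstance(value, list) or not all(isinstance(item, str) for item in value):
        raise ValueError(f"expected a JSON array of strings, got: {raw!r}")
    return value


for candidate in ['[".jpg",".png"]', ".jpg,.png", "['.jpg','.png']"]:
    try:
        print(candidate, "->", parse_extension_list(candidate))
    except ValueError as exc:
        print(candidate, "-> rejected:", type(exc).__name__)
```

Only the first candidate is accepted. The comma-separated form and the single-quoted form are both invalid JSON.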

Validation

Automatic Type Validation

Pydantic automatically validates configuration values:

  • max_file_size_mb must be a positive integer

  • allowed_extensions must be a list of strings

  • tesseract_pool_size must be a positive integer

  • default_ocr_language must be a non-empty string
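The four rules above can be illustrated with a plain-Python stand-in. The tool itself uses Pydantic; the dataclass below is a hypothetical sketch with standard-library checks, intended only to show what each rule accepts and rejects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageToolSettings:
    """Hypothetical stand-in that mirrors the four rules listed above."""

    max_file_size_mb: int = 50
    allowed_extensions: List[str] = field(
        default_factory=lambda: [".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".gif"]
    )
    tesseract_pool_size: int = 2
    default_ocr_language: str = "eng"

    def __post_init__(self):
        for name in ("max_file_size_mb", "tesseract_pool_size"):
            value = getattr(self, name)
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer")
        if not isinstance(self.allowed_extensions, list) or not all(
            isinstance(ext, str) for ext in self.allowed_extensions
        ):
            raise ValueError("allowed_extensions must be a list of strings")
        if not isinstance(self.default_ocr_language, str) or not self.default_ocr_language:
            raise ValueError("default_ocr_language must be a non-empty string")


try:
    ImageToolSettings(max_file_size_mb=0)
except ValueError as exc:
    print(exc)  # max_file_size_mb must be a positive integer
```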

File Validation

When processing images, the tool validates:

  1. File existence - File must exist at the specified path

  2. File extension - Must be in allowed_extensions list

  3. File size - Must not exceed max_file_size_mb limit

  4. File integrity - Must be a valid image file
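The first three checks can be approximated with the standard library, which is useful for rejecting files before they reach the tool. This is a sketch under assumptions: check_image_file is a hypothetical helper, the case-insensitive extension match and the 1 MB = 1024 × 1024 bytes conversion are choices made here, and the integrity check is omitted because it needs an image decoder such as Pillow.

```python
import os


def check_image_file(path, allowed_extensions, max_file_size_mb):
    """Run the first three checks above; return a list of problems (empty if none)."""
    if not os.path.isfile(path):
        return [f"File not found: {path}"]
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in allowed_extensions:
        problems.append(f"Extension '{extension}' not allowed")
    size_mb = os.path.getsize(path) / (1024 * 1024)
    if size_mb > max_file_size_mb:
        problems.append(f"File too large: {size_mb:.1f}MB, max {max_file_size_mb}MB")
    return problems


print(check_image_file("photo.jpg", [".jpg", ".jpeg", ".png"], 50))
```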

Security Validation

The tool includes multiple security layers:

  • Extension whitelist prevents processing unauthorized file types

  • File size limits prevent memory exhaustion

  • Path normalization prevents directory traversal attacks

  • Output path validation prevents overwriting existing files
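Two of these layers, path normalization and output path validation, follow a common pattern that can be sketched with the standard library. The functions below are hypothetical illustrations of that pattern and do not reproduce the tool's internal checks.

```python
import os


def resolve_inside(base_dir, user_path):
    """Normalize user_path against base_dir and refuse paths that escape it."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"Path escapes base directory: {user_path}")
    return target


def check_output_path(path):
    """Refuse to write over a file that already exists."""
    if os.path.exists(path):
        raise FileExistsError(f"Output file already exists: {path}")
    return path


try:
    resolve_inside("uploads", "../secrets.txt")
except ValueError as exc:
    print(exc)  # Path escapes base directory: ../secrets.txt
```

Resolving the path first and comparing afterwards means that inputs containing .. segments or symlinks are judged by where they actually point.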

Tesseract Setup

The Image Tool uses Tesseract OCR for text extraction. Follow these steps to set it up:

Installation

Ubuntu/Debian:

sudo apt-get update
sudo apt-get install tesseract-ocr

macOS:

brew install tesseract

Windows: Download and install from: https://github.com/UB-Mannheim/tesseract/wiki

Language Data

Install additional language packs as needed:

# English (usually included by default)
sudo apt-get install tesseract-ocr-eng

# Chinese
sudo apt-get install tesseract-ocr-chi-sim tesseract-ocr-chi-tra

# Spanish
sudo apt-get install tesseract-ocr-spa

# French
sudo apt-get install tesseract-ocr-fra

Verify Installation

tesseract --version

You should see output like:

tesseract 4.1.1
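The same check can be run from Python, for example at application startup. This is a minimal sketch using only the standard library; tesseract_version is a hypothetical helper name.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if not on PATH."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--version"], capture_output=True, text=True, check=False
    )
    # The banner may be written to stdout or stderr depending on the version.
    lines = (result.stdout + result.stderr).strip().splitlines()
    return lines[0] if lines else None


print(tesseract_version() or "Tesseract not found on PATH")
```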

Using OCR with Different Languages

Method 1: Configure Default Language

Set the default language via environment variable or config:

# Set default to Simplified Chinese (shell)
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=chi_sim

# Then, in Python:
from aiecs.tools.task_tools.image_tool import ImageTool

image_tool = ImageTool()

# Uses configured default (chi_sim)
text = image_tool.ocr('chinese_image.png')

# Override for specific call
text = image_tool.ocr('english_image.png', lang='eng')

Method 2: Multi-Language Support

Enable multi-language recognition by configuring multiple languages:

# Set default to English + Simplified Chinese (shell)
export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng+chi_sim

# Then, in Python:
from aiecs.tools.task_tools.image_tool import ImageTool

image_tool = ImageTool()

# Uses configured default (eng+chi_sim) - recognizes both English and Chinese
text = image_tool.ocr('mixed_image.png')

# Use different multi-language combination for specific call
text = image_tool.ocr('image.png', lang='eng+jpn+spa')

Method 3: Per-Call Language Specification

Specify language for each OCR call:

from aiecs.tools.task_tools.image_tool import ImageTool

image_tool = ImageTool()

# No lang argument: uses the configured default ('eng' unless overridden)
text = image_tool.ocr('image.png')

# Chinese
text = image_tool.ocr('chinese_image.png', lang='chi_sim')

# Spanish
text = image_tool.ocr('spanish_image.png', lang='spa')

# Multi-language
text = image_tool.ocr('mixed_image.png', lang='eng+chi_sim')

Operations Supported

The Image Tool supports the following operations:

1. Load

Load an image and return its dimensions and color mode.

info = image_tool.load('photo.jpg')
# Returns: {'size': (width, height), 'mode': 'RGB'}

2. OCR (Optical Character Recognition)

Extract text from an image.

text = image_tool.ocr('document.png')
# Returns: extracted text as string

3. Metadata

Retrieve image metadata including EXIF data.

metadata = image_tool.metadata('photo.jpg', include_exif=True)
# Returns: {'size': tuple, 'mode': str, 'exif': dict}

4. Resize

Resize an image to specified dimensions.

result = image_tool.resize('input.jpg', 'output.jpg', width=800, height=600)
# Returns: {'success': True, 'output_path': 'output.jpg'}

5. Filter

Apply image filters (blur, sharpen, edge_enhance).

result = image_tool.filter('input.jpg', 'output.jpg', filter_type='sharpen')
# Returns: {'success': True, 'output_path': 'output.jpg'}

Troubleshooting

Issue: File size validation fails

Error: File too large: 75.3MB, max 50MB

Solution:

# Increase max file size limit
export IMAGE_TOOL_MAX_FILE_SIZE_MB=100

Issue: Extension not allowed

Error: Extension '.webp' not allowed

Solution:

# Add the extension to the allowed list (only if your application needs it and it is safe to process)
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png",".webp"]'

Issue: Tesseract not found

Error: Tesseract not found; OCR will be disabled

Solution:

# Install Tesseract
sudo apt-get install tesseract-ocr  # Ubuntu/Debian
brew install tesseract              # macOS

# Verify installation
tesseract --version

Issue: OCR returns empty or garbled text

Causes: Poor image quality, wrong language, unsupported format

Solutions:

  1. Ensure image has good contrast and resolution

  2. Configure default language: export IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=chi_sim

  3. Specify correct language per call: ocr('image.png', lang='chi_sim')

  4. Use multi-language format: ocr('image.png', lang='eng+chi_sim')

  5. Pre-process image (increase contrast, remove noise)

  6. Install appropriate language data packs

Issue: Pool size too small for concurrent requests

Error: No Tesseract processes available

Solution:

# Increase pool size
export IMAGE_TOOL_TESSERACT_POOL_SIZE=8

Issue: List parsing error

Error: Configuration parsing fails for allowed_extensions

Solution:

# Use proper JSON array syntax with double quotes
export IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".png"]'

# NOT: ['.jpg','.png'] or .jpg,.png

Issue: Memory issues with large images

Causes: Large image files consuming too much memory

Solutions:

  1. Reduce max_file_size_mb limit

  2. Implement image downsampling before processing

  3. Reduce tesseract_pool_size to free memory

  4. Monitor and increase system memory if needed
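For the downsampling step, target dimensions that preserve the aspect ratio can be computed ahead of time. The helper below is a hypothetical sketch in pure Python; the max_side threshold is an assumption to tune for your workload.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000, 2000))  # prints (2000, 1500)
```

The resulting values can then be passed as width and height to the resize operation shown under Operations Supported.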

Best Practices

Security

  1. Minimize allowed extensions - Only allow file types you actually need

  2. Set conservative file size limits - Use the smallest practical value

  3. Validate file content - Don’t trust extensions alone (Pillow rejects files that are not valid images)

  4. Sanitize file paths - The tool automatically normalizes paths

  5. Use output path validation - Prevents overwriting existing files

Performance

  1. Tune pool size - Match tesseract_pool_size to expected concurrent OCR load

  2. Optimize file sizes - Compress images before processing

  3. Cache results - Leverage BaseTool’s built-in caching

  4. Monitor resources - Watch memory and CPU usage under load

  5. Use appropriate formats - PNG for text, JPEG for photos

Development vs Production

Development:

IMAGE_TOOL_MAX_FILE_SIZE_MB=100
IMAGE_TOOL_TESSERACT_POOL_SIZE=1
IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png",".bmp",".tiff",".gif"]'
IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng

Production:

IMAGE_TOOL_MAX_FILE_SIZE_MB=20
IMAGE_TOOL_TESSERACT_POOL_SIZE=4
IMAGE_TOOL_ALLOWED_EXTENSIONS='[".jpg",".jpeg",".png"]'
IMAGE_TOOL_DEFAULT_OCR_LANGUAGE=eng+chi_sim  # Multi-language support

Error Handling

Always wrap image operations in try-except blocks:

from aiecs.tools.task_tools.image_tool import ImageTool, FileOperationError, SecurityError

image_tool = ImageTool()

try:
    result = image_tool.load('photo.jpg')
except FileOperationError as e:
    print(f"File operation failed: {e}")
except SecurityError as e:
    print(f"Security validation failed: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")

Support

For issues or questions about Image Tool configuration:

  • Check the tool source code for implementation details

  • Review Pillow and Tesseract documentation for specific functionality

  • Consult the main documentation for architecture overview

  • Test with simple images first to isolate configuration vs. image quality issues