Document Writer Tool Configuration Guide
Overview
The Document Writer Tool provides comprehensive capabilities for writing documents in various formats with production-grade features including atomic writes, content validation, security scanning, automatic backup, versioning, and cloud storage integration. It supports multiple document formats (TXT, JSON, CSV, XML, Markdown, HTML, YAML, PDF, DOCX, XLSX, Binary), various write modes (create, overwrite, append, update, backup_write, version_write, insert, replace, delete), and advanced edit operations. The tool integrates with Google Cloud Storage (GCS) for cloud-based document storage and provides enterprise-level security and validation features. The tool can be configured via environment variables using the DOC_WRITER_ prefix or through programmatic configuration when initializing the tool.
Using .env Files in Your Project
When using aiecs as a dependency in your project, you can store configuration in a .env file for convenience. The Document Writer Tool reads from environment variables that are already loaded into the process, so you need to load the .env file in your application before importing aiecs tools.
Setting Up .env Files
1. Install python-dotenv:
pip install python-dotenv
2. Create a .env file in your project root:
# .env file in your project root
DOC_WRITER_TEMP_DIR=/path/to/temp
DOC_WRITER_BACKUP_DIR=/path/to/backups
DOC_WRITER_OUTPUT_DIR=/path/to/output
DOC_WRITER_MAX_FILE_SIZE=104857600
DOC_WRITER_MAX_BACKUP_VERSIONS=10
DOC_WRITER_DEFAULT_ENCODING=utf-8
DOC_WRITER_ENABLE_BACKUP=true
DOC_WRITER_ENABLE_VERSIONING=true
DOC_WRITER_ENABLE_CONTENT_VALIDATION=true
DOC_WRITER_ENABLE_SECURITY_SCAN=true
DOC_WRITER_ATOMIC_WRITE=true
DOC_WRITER_VALIDATION_LEVEL=basic
DOC_WRITER_TIMEOUT_SECONDS=60
DOC_WRITER_AUTO_BACKUP=true
DOC_WRITER_ATOMIC_WRITES=true
DOC_WRITER_DEFAULT_FORMAT=md
DOC_WRITER_VERSION_CONTROL=true
DOC_WRITER_SECURITY_SCAN=true
DOC_WRITER_ENABLE_CLOUD_STORAGE=true
DOC_WRITER_GCS_BUCKET_NAME=aiecs-documents
DOC_WRITER_GCS_PROJECT_ID=your-project-id
3. Load the .env file in your application:
# main.py or app.py - at the top of your entry point
from dotenv import load_dotenv
# Load environment variables from .env file
# This must be done BEFORE importing aiecs tools
load_dotenv()
# Now import and use aiecs tools
from aiecs.tools.docs.document_writer_tool import DocumentWriterTool
# The tool will automatically use the environment variables
writer_tool = DocumentWriterTool()
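To confirm that the .env file was loaded before the tool is created, it can help to list the DOC_WRITER_ variables visible to the process. The following is a minimal sketch using only the standard library. The helper name doc_writer_env is hypothetical and not part of aiecs, and the lower-cased key names are an assumption based on the programmatic configuration keys shown later in this guide.

```python
import os

PREFIX = "DOC_WRITER_"


def doc_writer_env(environ=None):
    """Return the DOC_WRITER_* variables found in the given mapping.

    The prefix is stripped and the rest lower-cased, so DOC_WRITER_TEMP_DIR
    becomes temp_dir (assumed to match the programmatic config keys).
    """
    environ = os.environ if environ is None else environ
    return {
        name[len(PREFIX):].lower(): value
        for name, value in environ.items()
        if name.startswith(PREFIX)
    }


if __name__ == "__main__":
    # Print what the process can actually see after load_dotenv() has run.
    for key, value in sorted(doc_writer_env().items()):
        print(f"{key} = {value}")
```

If the output is empty, load_dotenv() was probably called too late or pointed at the wrong file.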
Multiple Environment Files
You can use different .env files for different environments:
import os
from dotenv import load_dotenv
# Load environment-specific configuration
env = os.getenv('APP_ENV', 'development')
if env == 'production':
    load_dotenv('.env.production')
elif env == 'staging':
    load_dotenv('.env.staging')
else:
    load_dotenv('.env.development')
from aiecs.tools.docs.document_writer_tool import DocumentWriterTool
writer_tool = DocumentWriterTool()
Example .env.production:
# Production settings - optimized for security and performance
DOC_WRITER_TEMP_DIR=/app/temp/writer
DOC_WRITER_BACKUP_DIR=/app/backups/documents
DOC_WRITER_OUTPUT_DIR=/app/output/documents
DOC_WRITER_MAX_FILE_SIZE=209715200
DOC_WRITER_MAX_BACKUP_VERSIONS=20
DOC_WRITER_DEFAULT_ENCODING=utf-8
DOC_WRITER_ENABLE_BACKUP=true
DOC_WRITER_ENABLE_VERSIONING=true
DOC_WRITER_ENABLE_CONTENT_VALIDATION=true
DOC_WRITER_ENABLE_SECURITY_SCAN=true
DOC_WRITER_ATOMIC_WRITE=true
DOC_WRITER_VALIDATION_LEVEL=enterprise
DOC_WRITER_TIMEOUT_SECONDS=120
DOC_WRITER_AUTO_BACKUP=true
DOC_WRITER_ATOMIC_WRITES=true
DOC_WRITER_DEFAULT_FORMAT=md
DOC_WRITER_VERSION_CONTROL=true
DOC_WRITER_SECURITY_SCAN=true
DOC_WRITER_ENABLE_CLOUD_STORAGE=true
DOC_WRITER_GCS_BUCKET_NAME=prod-aiecs-documents
DOC_WRITER_GCS_PROJECT_ID=production-project-id
Example .env.development:
# Development settings - more permissive for testing
DOC_WRITER_TEMP_DIR=./temp/writer
DOC_WRITER_BACKUP_DIR=./backups/documents
DOC_WRITER_OUTPUT_DIR=./output/documents
DOC_WRITER_MAX_FILE_SIZE=52428800
DOC_WRITER_MAX_BACKUP_VERSIONS=5
DOC_WRITER_DEFAULT_ENCODING=utf-8
DOC_WRITER_ENABLE_BACKUP=false
DOC_WRITER_ENABLE_VERSIONING=false
DOC_WRITER_ENABLE_CONTENT_VALIDATION=false
DOC_WRITER_ENABLE_SECURITY_SCAN=false
DOC_WRITER_ATOMIC_WRITE=true
DOC_WRITER_VALIDATION_LEVEL=none
DOC_WRITER_TIMEOUT_SECONDS=30
DOC_WRITER_AUTO_BACKUP=false
DOC_WRITER_ATOMIC_WRITES=true
DOC_WRITER_DEFAULT_FORMAT=md
DOC_WRITER_VERSION_CONTROL=false
DOC_WRITER_SECURITY_SCAN=false
DOC_WRITER_ENABLE_CLOUD_STORAGE=false
DOC_WRITER_GCS_BUCKET_NAME=dev-aiecs-documents
DOC_WRITER_GCS_PROJECT_ID=development-project-id
Best Practices for .env Files
Never commit .env files to version control - Add .env to your .gitignore:
# .gitignore
.env
.env.local
.env.*.local
.env.production
.env.staging
Provide a template - Create .env.example with documented dummy values:
# .env.example
# Document Writer Tool Configuration
# Temporary directory for document processing
DOC_WRITER_TEMP_DIR=/path/to/temp
# Directory for document backups
DOC_WRITER_BACKUP_DIR=/path/to/backups
# Default output directory for documents
DOC_WRITER_OUTPUT_DIR=/path/to/output
# Maximum file size in bytes (100MB)
DOC_WRITER_MAX_FILE_SIZE=104857600
# Maximum number of backup versions to keep
DOC_WRITER_MAX_BACKUP_VERSIONS=10
# Default text encoding for documents
DOC_WRITER_DEFAULT_ENCODING=utf-8
# Whether to enable automatic backup functionality
DOC_WRITER_ENABLE_BACKUP=true
# Whether to enable document versioning
DOC_WRITER_ENABLE_VERSIONING=true
# Whether to enable content validation
DOC_WRITER_ENABLE_CONTENT_VALIDATION=true
# Whether to enable security scanning
DOC_WRITER_ENABLE_SECURITY_SCAN=true
# Whether to use atomic write operations
DOC_WRITER_ATOMIC_WRITE=true
# Content validation level
DOC_WRITER_VALIDATION_LEVEL=basic
# Operation timeout in seconds
DOC_WRITER_TIMEOUT_SECONDS=60
# Whether to automatically backup before write operations
DOC_WRITER_AUTO_BACKUP=true
# Whether to use atomic write operations
DOC_WRITER_ATOMIC_WRITES=true
# Default document format
DOC_WRITER_DEFAULT_FORMAT=md
# Whether to enable version control
DOC_WRITER_VERSION_CONTROL=true
# Whether to enable security scanning
DOC_WRITER_SECURITY_SCAN=true
# Whether to enable cloud storage integration
DOC_WRITER_ENABLE_CLOUD_STORAGE=true
# Google Cloud Storage bucket name
DOC_WRITER_GCS_BUCKET_NAME=aiecs-documents
# Google Cloud Storage project ID (optional)
DOC_WRITER_GCS_PROJECT_ID=your-project-id
Document your variables - Add comments explaining each setting
Use load_dotenv() early - Call it at the very top of your entry point, before any aiecs imports
Format values correctly:
Strings: Plain text: utf-8, /path/to/dir
Integers: Plain numbers: 104857600, 60
Booleans: true or false
Configuration Options
1. Temp Directory
Environment Variable: DOC_WRITER_TEMP_DIR
Type: String
Default: os.path.join(tempfile.gettempdir(), 'document_writer')
Description: Temporary directory used for document processing operations. This directory stores intermediate files, temporary processing results, and processing artifacts.
Example:
export DOC_WRITER_TEMP_DIR="/app/temp/writer"
Security Note: Ensure the directory has appropriate permissions and is not accessible via web servers.
2. Backup Directory
Environment Variable: DOC_WRITER_BACKUP_DIR
Type: String
Default: os.path.join(tempfile.gettempdir(), 'document_backups')
Description: Directory where document backups are stored. This directory contains backup copies of documents before modifications.
Example:
export DOC_WRITER_BACKUP_DIR="/app/backups/documents"
Backup Strategy: Backups are organized by date and document type for easy retrieval.
3. Output Directory
Environment Variable: DOC_WRITER_OUTPUT_DIR
Type: Optional[String]
Default: None
Description: Default output directory for created documents. When set, documents are written to this directory unless a specific path is provided.
Example:
export DOC_WRITER_OUTPUT_DIR="/app/output/documents"
Organization: Consider organizing by project, date, or document type.
4. Max File Size
Environment Variable: DOC_WRITER_MAX_FILE_SIZE
Type: Integer
Default: 100 * 1024 * 1024 (100MB)
Description: Maximum file size in bytes for document writing operations. Files larger than this will be rejected to prevent memory issues.
Common Values:
50 * 1024 * 1024 - 50MB (small documents)
100 * 1024 * 1024 - 100MB (default)
200 * 1024 * 1024 - 200MB (large documents)
500 * 1024 * 1024 - 500MB (very large documents)
Example:
export DOC_WRITER_MAX_FILE_SIZE=209715200
Memory Note: Larger values allow bigger files but use more memory during processing.
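Because the environment variable takes a plain byte count, the values above have to be multiplied out before they are exported. A small sketch of that arithmetic follows; the helper name mb_to_bytes is hypothetical, and 1 MB is taken as 1024 * 1024 bytes to match the defaults in this guide.

```python
def mb_to_bytes(megabytes):
    """Convert a size in MB (1 MB = 1024 * 1024 bytes here) to bytes."""
    return megabytes * 1024 * 1024


# Print the byte counts for the common values listed above.
for size_mb in (50, 100, 200, 500):
    print(f"{size_mb}MB -> DOC_WRITER_MAX_FILE_SIZE={mb_to_bytes(size_mb)}")
```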
5. Max Backup Versions
Environment Variable: DOC_WRITER_MAX_BACKUP_VERSIONS
Type: Integer
Default: 10
Description: Maximum number of backup versions to keep for each document. Older backups are automatically cleaned up.
Common Values:
5 - 5 versions (minimal storage)
10 - 10 versions (default)
20 - 20 versions (extensive history)
50 - 50 versions (maximum history)
Example:
export DOC_WRITER_MAX_BACKUP_VERSIONS=20
Storage Note: Higher values provide more history but use more storage space.
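The cleanup of older backups can be pictured as "sort by age, keep the newest N". The sketch below illustrates that idea only and is not the tool's actual implementation. The function name prune_backups and the "<stem>.<suffix>" file naming are assumptions made for the example.

```python
import os
import tempfile
from pathlib import Path


def prune_backups(backup_dir, stem, max_versions):
    """Delete the oldest backups of one document, keeping max_versions files.

    Assumes backups are named "<stem>.<anything>" inside backup_dir, which is
    a hypothetical layout chosen for this sketch. Returns the removed paths.
    """
    backups = sorted(
        Path(backup_dir).glob(f"{stem}.*"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    removed = []
    for old in backups[max_versions:]:
        old.unlink()
        removed.append(old)
    return removed


def demo():
    """Create five fake backups, keep the two newest, return what is left."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(5):
            path = Path(tmp) / f"report.{i}.bak"
            path.write_text(f"version {i}")
            os.utime(path, (i, i))  # distinct, increasing modification times
        prune_backups(tmp, "report", max_versions=2)
        return sorted(p.name for p in Path(tmp).iterdir())
```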
6. Default Encoding
Environment Variable: DOC_WRITER_DEFAULT_ENCODING
Type: String
Default: "utf-8"
Description: Default text encoding for document writing operations. This encoding is used when no specific encoding is specified.
Supported Encodings:
utf-8 - UTF-8 encoding (default, most common)
utf-16 - UTF-16 encoding
ascii - ASCII encoding
gbk - GBK encoding (Chinese)
auto - Automatic encoding detection
Example:
export DOC_WRITER_DEFAULT_ENCODING=utf-8
Encoding Note: UTF-8 is recommended for international text support.
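Most of the names above are ordinary Python codec names, which can be checked against the standard codec registry before they are placed in a configuration file. This is a sketch with a hypothetical helper name, is_python_codec. Note that auto is not a Python codec, so it is presumably interpreted by the tool itself (an assumption, not confirmed here).

```python
import codecs


def is_python_codec(name):
    """Return True if Python's codec registry recognises the encoding name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# "auto" is reported as False because it is not a codec Python knows about.
for name in ("utf-8", "utf-16", "ascii", "gbk", "auto"):
    print(f"{name}: {is_python_codec(name)}")
```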
7. Enable Backup
Environment Variable: DOC_WRITER_ENABLE_BACKUP
Type: Boolean
Default: True
Description: Whether to enable automatic backup functionality. When enabled, the tool creates backup copies before making modifications.
Values:
true - Enable backup functionality (default)
false - Disable backup functionality
Example:
export DOC_WRITER_ENABLE_BACKUP=true
Backup Note: Essential for data protection and recovery.
8. Enable Versioning
Environment Variable: DOC_WRITER_ENABLE_VERSIONING
Type: Boolean
Default: True
Description: Whether to enable document versioning. When enabled, the tool tracks document versions and maintains version history.
Values:
true - Enable versioning (default)
false - Disable versioning
Example:
export DOC_WRITER_ENABLE_VERSIONING=true
Versioning Note: Provides document history and rollback capabilities.
9. Enable Content Validation
Environment Variable: DOC_WRITER_ENABLE_CONTENT_VALIDATION
Type: Boolean
Default: True
Description: Whether to enable content validation. When enabled, the tool validates document content before writing.
Values:
true - Enable content validation (default)
false - Disable content validation
Example:
export DOC_WRITER_ENABLE_CONTENT_VALIDATION=true
Validation Note: Ensures document integrity and format compliance.
10. Enable Security Scan
Environment Variable: DOC_WRITER_ENABLE_SECURITY_SCAN
Type: Boolean
Default: True
Description: Whether to enable security scanning. When enabled, the tool scans documents for security threats and malicious content.
Values:
true - Enable security scanning (default)
false - Disable security scanning
Example:
export DOC_WRITER_ENABLE_SECURITY_SCAN=true
Security Note: Essential for enterprise environments and compliance.
11. Atomic Write
Environment Variable: DOC_WRITER_ATOMIC_WRITE
Type: Boolean
Default: True
Description: Whether to use atomic write operations. When enabled, writes are atomic (all-or-nothing) to prevent partial writes.
Values:
true - Enable atomic writes (default)
false - Disable atomic writes
Example:
export DOC_WRITER_ATOMIC_WRITE=true
Atomic Note: Prevents data corruption from interrupted writes.
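A common way to obtain all-or-nothing behaviour is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that general technique using only the standard library. It is not taken from the tool's source, and the function name atomic_write_text is hypothetical.

```python
import os
import tempfile


def atomic_write_text(path, text, encoding="utf-8"):
    """Write text to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the final rename stays
    # on one file system, which is what lets os.replace swap it in one step.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push data to disk before renaming
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the partial temp file; the original file is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def demo():
    """Write twice to the same target and report content and leftovers."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "document.txt")
        atomic_write_text(target, "first")
        atomic_write_text(target, "second")
        with open(target, encoding="utf-8") as handle:
            return handle.read(), sorted(os.listdir(tmp))
```

If the process is interrupted before os.replace runs, the target keeps its previous content, which is the property the setting describes.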
12. Validation Level
Environment Variable: DOC_WRITER_VALIDATION_LEVEL
Type: String
Default: "basic"
Description: Content validation level for document writing operations. Determines the depth and strictness of validation.
Supported Levels:
none - No validation
basic - Basic validation (format, size) - default
strict - Strict validation (content, structure)
enterprise - Enterprise validation (security, compliance)
Example:
export DOC_WRITER_VALIDATION_LEVEL=strict
Validation Note: Higher levels provide more security but may impact performance.
13. Timeout Seconds
Environment Variable: DOC_WRITER_TIMEOUT_SECONDS
Type: Integer
Default: 60
Description: Operation timeout in seconds for document writing operations. Operations that exceed this timeout will be cancelled.
Common Values:
30 - 30 seconds (fast operations)
60 - 60 seconds (default)
120 - 120 seconds (slow operations)
300 - 300 seconds (very slow operations)
Example:
export DOC_WRITER_TIMEOUT_SECONDS=120
Timeout Note: Increase for large files or slow storage systems.
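The effect of an operation timeout can be illustrated with asyncio.wait_for, which cancels the awaited coroutine once the limit is exceeded. Whether the tool uses asyncio internally is not stated in this guide, so treat the following purely as a sketch of the behaviour. The names write_with_timeout, slow_write, and fast_write are hypothetical.

```python
import asyncio


async def write_with_timeout(operation, timeout_seconds):
    """Await operation(), cancelling it if it runs longer than the limit."""
    try:
        return await asyncio.wait_for(operation(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return None  # this sketch reports a timeout as None


async def slow_write():
    await asyncio.sleep(0.5)  # stand-in for a slow storage call
    return "written"


async def fast_write():
    return "written"
```

Calling asyncio.run(write_with_timeout(slow_write, 0.05)) gives up after 50 ms, while the same call with fast_write completes normally.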
14. Auto Backup
Environment Variable: DOC_WRITER_AUTO_BACKUP
Type: Boolean
Default: True
Description: Whether to automatically backup documents before write operations. When enabled, backups are created automatically.
Values:
true - Enable auto backup (default)
false - Disable auto backup
Example:
export DOC_WRITER_AUTO_BACKUP=true
Auto Backup Note: Provides automatic data protection.
15. Atomic Writes
Environment Variable: DOC_WRITER_ATOMIC_WRITES
Type: Boolean
Default: True
Description: Whether to use atomic write operations. This is a duplicate of atomic_write for compatibility.
Values:
true - Enable atomic writes (default)
false - Disable atomic writes
Example:
export DOC_WRITER_ATOMIC_WRITES=true
16. Default Format
Environment Variable: DOC_WRITER_DEFAULT_FORMAT
Type: String
Default: "md"
Description: Default document format for writing operations. This format is used when no specific format is specified.
Supported Formats:
txt - Plain text format
json - JSON format
csv - CSV format
xml - XML format
md - Markdown format (default)
html - HTML format
yaml - YAML format
pdf - PDF format
docx - Microsoft Word format
xlsx - Microsoft Excel format
binary - Binary format
Example:
export DOC_WRITER_DEFAULT_FORMAT=html
Format Note: Choose based on your primary use case.
17. Version Control
Environment Variable: DOC_WRITER_VERSION_CONTROL
Type: Boolean
Default: True
Description: Whether to enable version control. This is a duplicate of enable_versioning for compatibility.
Values:
true - Enable version control (default)
false - Disable version control
Example:
export DOC_WRITER_VERSION_CONTROL=true
18. Security Scan
Environment Variable: DOC_WRITER_SECURITY_SCAN
Type: Boolean
Default: True
Description: Whether to enable security scanning. This is a duplicate of enable_security_scan for compatibility.
Values:
true - Enable security scanning (default)
false - Disable security scanning
Example:
export DOC_WRITER_SECURITY_SCAN=true
19. Enable Cloud Storage
Environment Variable: DOC_WRITER_ENABLE_CLOUD_STORAGE
Type: Boolean
Default: True
Description: Whether to enable cloud storage integration. When enabled, the tool can store documents in Google Cloud Storage.
Values:
true - Enable cloud storage (default)
false - Disable cloud storage
Example:
export DOC_WRITER_ENABLE_CLOUD_STORAGE=true
Cloud Note: Requires proper GCS configuration and credentials.
20. GCS Bucket Name
Environment Variable: DOC_WRITER_GCS_BUCKET_NAME
Type: String
Default: "aiecs-documents"
Description: Google Cloud Storage bucket name for storing documents. This bucket is used for cloud-based document storage.
Example:
export DOC_WRITER_GCS_BUCKET_NAME="my-document-bucket"
Bucket Requirements:
Bucket must exist and be accessible
Proper permissions must be configured
Bucket name must be globally unique
21. GCS Project ID
Environment Variable: DOC_WRITER_GCS_PROJECT_ID
Type: Optional[String]
Default: None
Description: Google Cloud Storage project ID for authentication and billing. This is optional if using default project credentials.
Example:
export DOC_WRITER_GCS_PROJECT_ID="my-gcp-project"
Authentication Note: Can be omitted if using default project credentials or service account.
Usage Examples
Example 1: Basic Environment Configuration
# Set basic writing parameters
export DOC_WRITER_TEMP_DIR="/app/temp/writer"
export DOC_WRITER_BACKUP_DIR="/app/backups/documents"
export DOC_WRITER_MAX_FILE_SIZE=104857600
export DOC_WRITER_DEFAULT_ENCODING=utf-8
export DOC_WRITER_ATOMIC_WRITE=true
# Run your application
python app.py
Example 2: Enterprise Configuration
# Optimized for enterprise use
export DOC_WRITER_ENABLE_BACKUP=true
export DOC_WRITER_ENABLE_VERSIONING=true
export DOC_WRITER_ENABLE_CONTENT_VALIDATION=true
export DOC_WRITER_ENABLE_SECURITY_SCAN=true
export DOC_WRITER_VALIDATION_LEVEL=enterprise
export DOC_WRITER_MAX_BACKUP_VERSIONS=20
export DOC_WRITER_ENABLE_CLOUD_STORAGE=true
export DOC_WRITER_GCS_BUCKET_NAME="enterprise-documents"
Example 3: Development Configuration
# Development-friendly settings
export DOC_WRITER_TEMP_DIR="./temp/writer"
export DOC_WRITER_BACKUP_DIR="./backups/documents"
export DOC_WRITER_MAX_FILE_SIZE=52428800
export DOC_WRITER_ENABLE_BACKUP=false
export DOC_WRITER_ENABLE_VERSIONING=false
export DOC_WRITER_ENABLE_CONTENT_VALIDATION=false
export DOC_WRITER_ENABLE_SECURITY_SCAN=false
export DOC_WRITER_VALIDATION_LEVEL=none
export DOC_WRITER_ENABLE_CLOUD_STORAGE=false
Example 4: Programmatic Configuration
from aiecs.tools.docs.document_writer_tool import DocumentWriterTool
# Initialize with custom configuration
writer_tool = DocumentWriterTool(config={
    'temp_dir': '/app/temp/writer',
    'backup_dir': '/app/backups/documents',
    'output_dir': '/app/output/documents',
    'max_file_size': 104857600,
    'max_backup_versions': 10,
    'default_encoding': 'utf-8',
    'enable_backup': True,
    'enable_versioning': True,
    'enable_content_validation': True,
    'enable_security_scan': True,
    'atomic_write': True,
    'validation_level': 'basic',
    'timeout_seconds': 60,
    'auto_backup': True,
    'atomic_writes': True,
    'default_format': 'md',
    'version_control': True,
    'security_scan': True,
    'enable_cloud_storage': True,
    'gcs_bucket_name': 'my-document-bucket',
    'gcs_project_id': 'my-gcp-project'
})
Example 5: Mixed Configuration
Environment variables are used as defaults, but can be overridden programmatically:
# Set environment defaults (shell)
export DOC_WRITER_MAX_FILE_SIZE=104857600
export DOC_WRITER_ENABLE_BACKUP=true
# Override for a specific instance (Python)
from aiecs.tools.docs.document_writer_tool import DocumentWriterTool
writer_tool = DocumentWriterTool(config={
    'max_file_size': 209715200,  # This overrides the environment variable
    'enable_backup': False       # This overrides the environment variable
})
Configuration Priority
When the Document Writer Tool is initialized, configuration values are resolved in the following order (highest to lowest priority):
Programmatic config - Values passed to the constructor
Environment variables - Values set via DOC_WRITER_* variables
Default values - Built-in defaults as specified above
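The resolution order above can be sketched as a three-layer merge. This is an illustration of the stated priority, not the tool's own code. The function name resolve_config is hypothetical, and only integer and string settings are included because coercing booleans needs separate handling.

```python
import os

# A few of the documented defaults; only int and str values are used here.
DEFAULTS = {"max_file_size": 104857600, "timeout_seconds": 60, "default_format": "md"}


def resolve_config(overrides=None, environ=None, defaults=DEFAULTS, prefix="DOC_WRITER_"):
    """Merge settings: constructor values > environment variables > defaults."""
    environ = os.environ if environ is None else environ
    resolved = dict(defaults)
    for key, default in defaults.items():
        raw = environ.get(prefix + key.upper())
        if raw is not None:
            # Coerce the environment string to the type of the default value.
            resolved[key] = type(default)(raw)
    # Programmatic values are applied last, so they win over everything else.
    resolved.update(overrides or {})
    return resolved
```

For example, with DOC_WRITER_MAX_FILE_SIZE=52428800 in the environment and {'max_file_size': 209715200} passed in, the resolved value is 209715200.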
Data Type Parsing
String Values
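Strings are provided as plain text. In a shell, surrounding quotes are removed before the value reaches the process, so they are only needed when the value contains spaces or special characters: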
export DOC_WRITER_DEFAULT_ENCODING=utf-8
export DOC_WRITER_TEMP_DIR=/path/to/temp
Integer Values
Integers should be provided as numeric strings:
export DOC_WRITER_MAX_FILE_SIZE=104857600
export DOC_WRITER_TIMEOUT_SECONDS=60
Boolean Values
Booleans should be provided as lowercase strings:
export DOC_WRITER_ENABLE_BACKUP=true
export DOC_WRITER_ATOMIC_WRITE=false
Optional Values
Optional values can be omitted or set to an empty string:
# Omit optional value
# DOC_WRITER_OUTPUT_DIR not set
# DOC_WRITER_GCS_PROJECT_ID not set
# Or set to empty string
export DOC_WRITER_OUTPUT_DIR=""
export DOC_WRITER_GCS_PROJECT_ID=""
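Since every environment variable arrives as a string, the rules in this section amount to a few small conversions. The sketch below shows them with hypothetical helper names. The extra boolean spellings (1, yes, on and their opposites) and the mapping of an empty string to None are assumptions for illustration, since this guide only documents true and false.

```python
TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off"}


def parse_bool(raw):
    """Convert a string such as "true" or "false" to a bool."""
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean value: {raw!r}")


def parse_int(raw):
    """Convert a numeric string such as "104857600" to an int."""
    return int(raw.strip())


def parse_optional(raw):
    """Treat a missing or empty value as None, otherwise keep the string."""
    if raw is None or raw.strip() == "":
        return None
    return raw
```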
Validation
Automatic Type Validation
Pydantic’s BaseSettings automatically validates configuration values:
temp_dir must be a non-empty string
backup_dir must be a non-empty string
output_dir must be a string or None
max_file_size must be a positive integer
max_backup_versions must be a positive integer
default_encoding must be a valid encoding string
All boolean fields must be boolean values
validation_level must be a valid validation level
timeout_seconds must be a positive integer
default_format must be a valid format string
gcs_bucket_name must be a non-empty string
gcs_project_id must be a string or None
Runtime Validation
When writing documents, the tool validates:
Directory accessibility - Temp, backup, and output directories must be accessible
File size limits - Documents must not exceed max_file_size
Cloud storage - GCS bucket must be accessible if enabled
Content validation - Document content must pass validation if enabled
Security scanning - Documents must pass security scan if enabled
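The first two checks in this list can be reproduced in application code to fail early with a readable message. The following is a sketch under stated assumptions: the function name preflight and the wording of the messages are hypothetical, and the tool's own checks may differ.

```python
import os
import tempfile


def preflight(content, directories, max_file_size, encoding="utf-8"):
    """Return a list of problems found before attempting a write."""
    problems = []
    for directory in directories:
        if not os.path.isdir(directory):
            problems.append(f"missing directory: {directory}")
        elif not os.access(directory, os.W_OK):
            problems.append(f"directory not writable: {directory}")
    # Compare the encoded size, because the limit is expressed in bytes.
    size = len(content.encode(encoding))
    if size > max_file_size:
        problems.append(f"content is {size} bytes, limit is {max_file_size}")
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(preflight("Hello, World!", [tmp], max_file_size=104857600))
```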
Document Formats
The Document Writer Tool supports various document formats:
Text Formats
TXT - Plain text format
JSON - JavaScript Object Notation
CSV - Comma-Separated Values
XML - Extensible Markup Language
Markdown - Markdown format
HTML - HyperText Markup Language
YAML - YAML Ain’t Markup Language
Document Formats
PDF - Portable Document Format
DOCX - Microsoft Word format
XLSX - Microsoft Excel format
Binary Formats
Binary - Raw binary data
Write Modes
Basic Modes
Create - Create new file, fail if exists
Overwrite - Overwrite existing file
Append - Append to existing file
Update - Update existing file (smart merge)
Advanced Modes
Backup Write - Backup before writing
Version Write - Versioned writing
Insert - Insert at specified position
Replace - Replace specified content
Delete - Delete specified content
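Three of the basic modes correspond closely to Python's built-in file modes, which may help when reasoning about what each one does to an existing file. The mapping below is an analogy for illustration, not the tool's implementation, and the names OPEN_MODES and write_text are hypothetical.

```python
import os
import tempfile

# Hypothetical mapping from the basic write modes to Python's open() modes.
OPEN_MODES = {"create": "x", "overwrite": "w", "append": "a"}


def write_text(path, text, mode="create", encoding="utf-8"):
    """Write text using one of the basic modes; "create" fails if path exists."""
    with open(path, OPEN_MODES[mode], encoding=encoding) as handle:
        handle.write(text)


def read_text(path, encoding="utf-8"):
    with open(path, encoding=encoding) as handle:
        return handle.read()


def demo():
    """Exercise create, append, a failing create, and overwrite in turn."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "note.txt")
        write_text(path, "a", mode="create")
        write_text(path, "b", mode="append")
        after_append = read_text(path)
        try:
            write_text(path, "c", mode="create")
            create_failed = False
        except FileExistsError:
            create_failed = True
        write_text(path, "d", mode="overwrite")
        return after_append, create_failed, read_text(path)
```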
Edit Operations
Text Formatting
Bold - Bold text formatting
Italic - Italic text formatting
Underline - Underline text formatting
Strikethrough - Strikethrough text formatting
Highlight - Highlight text formatting
Text Operations
Insert Text - Insert text at position
Delete Text - Delete specified text
Replace Text - Replace specified text
Copy Text - Copy text to clipboard
Cut Text - Cut text to clipboard
Paste Text - Paste text from clipboard
Advanced Operations
Find Replace - Find and replace text
Insert Line - Insert new line
Delete Line - Delete specified line
Move Line - Move line to new position
Encoding Types
Standard Encodings
UTF-8 - UTF-8 encoding (default, most common)
UTF-16 - UTF-16 encoding
ASCII - ASCII encoding
GBK - GBK encoding (Chinese)
Special Encodings
Auto - Automatic encoding detection
Validation Levels
Validation Types
None - No validation
Basic - Basic validation (format, size)
Strict - Strict validation (content, structure)
Enterprise - Enterprise validation (security, compliance)
Cloud Storage
Google Cloud Storage Integration
The Document Writer Tool supports Google Cloud Storage for:
Document Storage - Store documents in cloud storage
Backup Storage - Store backups in cloud storage
Version Storage - Store document versions in cloud storage
Distributed Access - Access documents from multiple locations
GCS Configuration
Required Setup:
Create a GCS bucket
Configure authentication (service account or default credentials)
Set appropriate permissions
Configure the tool with bucket name and project ID
Authentication Methods:
Service Account Key
Default Application Credentials
Workload Identity
User Account Credentials
Cloud Storage Benefits
Scalability - Handle large volumes of documents
Reliability - High availability and durability
Performance - Fast access to stored documents
Cost Efficiency - Pay only for storage used
Operations Supported
The Document Writer Tool supports comprehensive document writing operations:
Basic Writing
write_document - Write document to file
write_text - Write text content
write_json - Write JSON content
write_csv - Write CSV content
write_xml - Write XML content
write_markdown - Write Markdown content
write_html - Write HTML content
write_yaml - Write YAML content
Advanced Writing
write_with_backup - Write with automatic backup
write_with_versioning - Write with version control
write_atomic - Atomic write operation
write_secure - Write with security validation
write_cloud - Write to cloud storage
Document Operations
create_document - Create new document
update_document - Update existing document
append_document - Append to document
overwrite_document - Overwrite document
delete_document - Delete document
Edit Operations
edit_text - Edit text content
format_text - Format text (bold, italic, etc.)
find_replace - Find and replace text
insert_content - Insert content at position
delete_content - Delete specified content
Backup and Versioning
create_backup - Create document backup
restore_backup - Restore from backup
list_backups - List available backups
create_version - Create document version
list_versions - List document versions
restore_version - Restore document version
Validation and Security
validate_content - Validate document content
scan_security - Scan for security threats
check_permissions - Check write permissions
validate_format - Validate document format
Cloud Operations
upload_to_cloud - Upload document to cloud
download_from_cloud - Download document from cloud
sync_with_cloud - Sync with cloud storage
list_cloud_documents - List cloud documents
Batch Operations
batch_write - Write multiple documents
batch_backup - Backup multiple documents
batch_validate - Validate multiple documents
batch_upload - Upload multiple documents
Troubleshooting
Issue: Directory not accessible
Error: PermissionError when accessing directories
Solutions:
# Set accessible directories
export DOC_WRITER_TEMP_DIR="/accessible/temp/path"
export DOC_WRITER_BACKUP_DIR="/accessible/backup/path"
export DOC_WRITER_OUTPUT_DIR="/accessible/output/path"
# Or create directories with proper permissions
mkdir -p /path/to/directories
chmod 755 /path/to/directories
Issue: File too large
Error: WriteError for files exceeding size limit
Solutions:
# Increase file size limit
export DOC_WRITER_MAX_FILE_SIZE=209715200
# Or use cloud storage for large files
export DOC_WRITER_ENABLE_CLOUD_STORAGE=true
Issue: Backup creation fails
Error: StorageError during backup operations
Solutions:
Check backup directory permissions
Ensure sufficient disk space
Verify backup directory path
Check backup version limits
Issue: Validation fails
Error: ValidationError during content validation
Solutions:
# Disable validation for testing
export DOC_WRITER_ENABLE_CONTENT_VALIDATION=false
export DOC_WRITER_VALIDATION_LEVEL=none
# Or use less strict validation
export DOC_WRITER_VALIDATION_LEVEL=basic
Issue: Security scan fails
Error: SecurityError during security scanning
Solutions:
# Disable security scanning for testing
export DOC_WRITER_ENABLE_SECURITY_SCAN=false
export DOC_WRITER_SECURITY_SCAN=false
# Or check security scan configuration
Issue: Cloud storage not working
Error: GCS integration fails
Solutions:
Verify GCS credentials
Check bucket permissions
Ensure bucket exists
Verify project ID
# Disable cloud storage if not needed
export DOC_WRITER_ENABLE_CLOUD_STORAGE=false
Issue: Atomic write fails
Error: WriteError during atomic operations
Solutions:
# Disable atomic writes for testing
export DOC_WRITER_ATOMIC_WRITE=false
export DOC_WRITER_ATOMIC_WRITES=false
# Or check file system support for atomic operations
Issue: Timeout errors
Error: Operations timeout
Solutions:
# Increase timeout
export DOC_WRITER_TIMEOUT_SECONDS=120
# Or optimize file size and operations
export DOC_WRITER_MAX_FILE_SIZE=52428800
Best Practices
Performance Optimization
File Size Management - Set appropriate file size limits
Timeout Configuration - Configure timeouts based on operations
Cloud Storage Usage - Use cloud storage for large files
Backup Strategy - Implement efficient backup strategies
Batch Operations - Use batch operations for multiple documents
Data Protection
Backup Strategy - Enable automatic backups
Version Control - Use versioning for important documents
Atomic Operations - Use atomic writes for data integrity
Validation - Enable content validation
Security Scanning - Enable security scanning
Error Handling
Graceful Degradation - Handle write failures gracefully
Retry Logic - Implement retry for transient failures
Fallback Strategies - Provide fallback write methods
Error Logging - Log errors for debugging
User Feedback - Provide clear error messages
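Retry logic for transient failures is usually a small loop with a growing delay between attempts. The sketch below uses OSError as a stand-in so that it runs without aiecs installed. In real use the retried exception would more likely be the StorageError mentioned in this guide, which is an assumption about how that class behaves. The function name retry is hypothetical.

```python
import time


def retry(operation, attempts=3, base_delay=0.1, retry_on=(OSError,)):
    """Call operation(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            # Wait base_delay, then 2x, then 4x, and so on.
            time.sleep(base_delay * 2 ** (attempt - 1))
```

A write call can then be wrapped as retry(lambda: writer_tool.write_document(...)), with non-transient errors such as validation failures left out of retry_on so they surface immediately.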
Security
Content Validation - Validate all document content
Security Scanning - Scan for security threats
Access Control - Control access to directories
Cloud Security - Secure cloud storage access
Input Sanitization - Sanitize all inputs
Resource Management
Memory Usage - Monitor memory consumption
Disk Space - Manage temp and backup directories
Network Usage - Optimize cloud operations
Processing Time - Set reasonable timeouts
Cleanup - Regular cleanup of temp files
Integration
Tool Dependencies - Ensure required tools are available
API Compatibility - Maintain API compatibility
Error Propagation - Properly propagate errors
Logging Integration - Integrate with logging systems
Monitoring - Monitor tool performance
Development vs Production
Development:
DOC_WRITER_TEMP_DIR=./temp/writer
DOC_WRITER_BACKUP_DIR=./backups/documents
DOC_WRITER_OUTPUT_DIR=./output/documents
DOC_WRITER_MAX_FILE_SIZE=52428800
DOC_WRITER_MAX_BACKUP_VERSIONS=5
DOC_WRITER_ENABLE_BACKUP=false
DOC_WRITER_ENABLE_VERSIONING=false
DOC_WRITER_ENABLE_CONTENT_VALIDATION=false
DOC_WRITER_ENABLE_SECURITY_SCAN=false
DOC_WRITER_VALIDATION_LEVEL=none
DOC_WRITER_TIMEOUT_SECONDS=30
DOC_WRITER_ENABLE_CLOUD_STORAGE=false
Production:
DOC_WRITER_TEMP_DIR=/app/temp/writer
DOC_WRITER_BACKUP_DIR=/app/backups/documents
DOC_WRITER_OUTPUT_DIR=/app/output/documents
DOC_WRITER_MAX_FILE_SIZE=209715200
DOC_WRITER_MAX_BACKUP_VERSIONS=20
DOC_WRITER_ENABLE_BACKUP=true
DOC_WRITER_ENABLE_VERSIONING=true
DOC_WRITER_ENABLE_CONTENT_VALIDATION=true
DOC_WRITER_ENABLE_SECURITY_SCAN=true
DOC_WRITER_VALIDATION_LEVEL=enterprise
DOC_WRITER_TIMEOUT_SECONDS=120
DOC_WRITER_ENABLE_CLOUD_STORAGE=true
DOC_WRITER_GCS_BUCKET_NAME=prod-documents
DOC_WRITER_GCS_PROJECT_ID=production-project
Error Handling
Always wrap document writing operations in try-except blocks:
from aiecs.tools.docs.document_writer_tool import (
    DocumentWriterTool,
    DocumentWriterError,
    WriteError,
    ValidationError,
    SecurityError,
    WritePermissionError,
    ContentValidationError,
    StorageError,
)
writer_tool = DocumentWriterTool()
try:
    result = writer_tool.write_document(
        content="Hello, World!",
        file_path="document.txt",
        format="txt",
        mode="create"
    )
except WriteError as e:
    print(f"Write operation failed: {e}")
except ValidationError as e:
    print(f"Validation failed: {e}")
except SecurityError as e:
    print(f"Security scan failed: {e}")
except WritePermissionError as e:
    print(f"Write permission denied: {e}")
except ContentValidationError as e:
    print(f"Content validation failed: {e}")
except StorageError as e:
    print(f"Storage operation failed: {e}")
except DocumentWriterError as e:
    print(f"Document writer error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
Dependencies
Core Dependencies
# Install core dependencies
pip install pydantic pydantic-settings python-dotenv
# Install document processing dependencies
pip install python-docx openpyxl python-pptx
# Install PDF processing dependencies
pip install reportlab
# Install cloud storage dependencies
pip install google-cloud-storage
Optional Dependencies
# For advanced document processing
pip install PyPDF2 pdfplumber
# For image processing
pip install pillow
# For advanced validation
pip install jsonschema
# For security scanning
pip install python-magic
Verification
# Test dependency availability
try:
    import pydantic
    from pydantic_settings import BaseSettings
    import docx
    import openpyxl
    import reportlab
    print("Core dependencies available")
except ImportError as e:
    print(f"Missing dependency: {e}")
# Test cloud storage availability
try:
    from google.cloud import storage
    print("Cloud storage available")
except ImportError:
    print("Cloud storage not available")
# Test document processing availability
try:
    import docx
    import openpyxl
    import reportlab
    print("Document processing available")
except ImportError as e:
    print(f"Document processing not available: {e}")
Support
For issues or questions about Document Writer Tool configuration:
Check the tool source code for implementation details
Review external tool documentation for specific features
Consult the main aiecs documentation for architecture overview
Test with simple documents first to isolate configuration vs. writing issues
Monitor directory permissions and disk space
Verify cloud storage configuration and credentials
Ensure proper file size and timeout limits
Check validation and security scan settings
Validate document format support