APISource Tool - Configuration Reference

Table of Contents

Configuration Overview
Configuration Parameters
API Credentials
Performance Settings
Feature Flags
Provider-Specific Configuration
Environment Variables
Configuration Examples
Validation and Testing

1. Configuration Overview

1.1 Configuration Methods

The APISource Tool supports multiple configuration methods:

Dictionary Configuration:

from aiecs.tools.apisource import APISourceTool

config = {
    'fred_api_key': 'YOUR_KEY',
    'cache_ttl': 300,
    'enable_fallback': True
}
tool = APISourceTool(config)

Environment Variables:

import os

from aiecs.tools.apisource import APISourceTool

os.environ['APISOURCE_FRED_API_KEY'] = 'YOUR_KEY'
os.environ['APISOURCE_CACHE_TTL'] = '300'

tool = APISourceTool()  # Auto-loads from environment
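
Loading from the environment implies two steps: mapping `APISOURCE_`-prefixed names to configuration keys, and converting the string values to the types listed in the parameter reference. The following is a minimal sketch of that idea, not the tool's actual loader. The names `load_env_config`, `_to_bool`, and `_CASTS` are hypothetical, and the lowercase-after-prefix mapping is an assumption based on the examples above.

```python
import os
from typing import Any, Callable, Dict, Mapping, Optional

PREFIX = "APISOURCE_"


def _to_bool(raw: str) -> bool:
    # Assumption: these spellings count as "true"; everything else is false.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Hypothetical cast table; keys not listed here stay as strings.
_CASTS: Dict[str, Callable[[str], Any]] = {
    "cache_ttl": int,
    "default_timeout": int,
    "max_retries": int,
    "enable_rate_limiting": _to_bool,
    "enable_fallback": _to_bool,
}


def load_env_config(environ: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect APISOURCE_* variables into a typed configuration dict."""
    environ = os.environ if environ is None else environ
    config: Dict[str, Any] = {}
    for name, raw in environ.items():
        if not name.startswith(PREFIX):
            continue  # ignore unrelated variables
        key = name[len(PREFIX):].lower()
        config[key] = _CASTS.get(key, str)(raw)
    return config
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real process environment.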

Configuration File:

import json

from aiecs.tools.apisource import APISourceTool

with open('apisource_config.json') as f:
    config = json.load(f)

tool = APISourceTool(config)

Pydantic Model:

from aiecs.tools.apisource import APISourceTool
from aiecs.tools.apisource.tool import Config

config = Config(
    fred_api_key='YOUR_KEY',
    cache_ttl=300,
    enable_fallback=True
)
tool = APISourceTool(config)

1.2 Configuration Priority

When multiple configuration sources are present, the priority is:

Explicit parameters (highest priority)
Configuration dictionary/object
Environment variables
Default values (lowest priority)
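
The priority order above can be pictured as layered dictionary updates, applied from lowest to highest priority so that later layers overwrite earlier ones. This is a rough sketch under that assumption only. `resolve_config` and `DEFAULTS` are hypothetical names, and the defaults shown are a subset copied from the parameter table.

```python
from typing import Any, Dict, Mapping, Optional

# Subset of the documented defaults, for illustration.
DEFAULTS: Dict[str, Any] = {
    "cache_ttl": 300,
    "default_timeout": 30,
    "max_retries": 3,
    "enable_fallback": True,
}


def resolve_config(
    config: Optional[Mapping[str, Any]] = None,
    env: Optional[Mapping[str, Any]] = None,
    **explicit: Any,
) -> Dict[str, Any]:
    """Merge sources so that higher-priority layers win."""
    resolved = dict(DEFAULTS)        # 4. default values (lowest)
    resolved.update(env or {})       # 3. environment variables
    resolved.update(config or {})    # 2. configuration dictionary/object
    resolved.update(explicit)        # 1. explicit parameters (highest)
    return resolved
```

For example, `resolve_config(config={"cache_ttl": 600}, env={"cache_ttl": 60})` keeps the dictionary value of 600, because the dictionary outranks the environment.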

2. Configuration Parameters

2.1 Complete Parameter Reference

Parameter	Type	Default	Description
`fred_api_key`	str	None	FRED API key
`newsapi_api_key`	str	None	News API key
`guardian_api_key`	str	None	The Guardian API key
`census_api_key`	str	None	Census Bureau API key
`congress_api_key`	str	None	Congress.gov API key
`cache_ttl`	int	300	Cache TTL in seconds
`default_timeout`	int	30	Request timeout in seconds
`max_retries`	int	3	Maximum retry attempts
`enable_rate_limiting`	bool	True	Enable rate limiting
`enable_fallback`	bool	True	Enable provider fallback
`enable_data_fusion`	bool	True	Enable data fusion
`enable_query_enhancement`	bool	True	Enable query enhancement
`enable_intelligent_cache`	bool	True	Enable intelligent caching
`log_level`	str	'INFO'	Logging level
`metrics_enabled`	bool	True	Enable metrics collection

2.2 Parameter Details

cache_ttl

Type: Integer
Default: 300 (5 minutes)
Range: 0-86400 (0 = no cache, 86400 = 24 hours)
Description: Time-to-live for cached results in seconds
Recommendation:
- Development: 60-300 seconds
- Production: 300-3600 seconds
- High-frequency data: 60-300 seconds
- Static data: 3600-86400 seconds
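
To make the TTL semantics concrete, here is a small illustrative cache in which `ttl=0` disables caching and entries expire once their age reaches the TTL. It is a sketch of the general technique, not the tool's cache implementation. `TTLCache` and its injectable `clock` argument are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny time-to-live cache; ttl is in seconds, 0 means 'do not cache'."""

    def __init__(self, ttl: int = 300, clock: Callable[[], float] = time.monotonic):
        if not 0 <= ttl <= 86400:
            raise ValueError("ttl must be within 0-86400 seconds")
        self.ttl = ttl
        self._clock = clock
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def set(self, key: Hashable, value: Any) -> None:
        if self.ttl == 0:
            return  # caching disabled
        self._store[key] = (self._clock() + self.ttl, value)

    def get(self, key: Hashable) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return value
```

The `clock` parameter exists only so that expiry can be exercised deterministically in tests.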

default_timeout

Type: Integer
Default: 30 seconds
Range: 5-300 seconds
Description: Maximum time to wait for API response
Recommendation:
- Fast APIs (FRED, News): 10-30 seconds
- Slow APIs (World Bank): 30-60 seconds
- Batch operations: 60-120 seconds

max_retries

Type: Integer
Default: 3
Range: 0-10
Description: Maximum number of retry attempts for failed requests
Recommendation:
- Production: 3-5 retries
- Development: 1-2 retries
- Critical operations: 5-10 retries
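
The documented ranges for `cache_ttl`, `default_timeout`, and `max_retries` can be checked before a configuration is handed to the tool. The helper below is a sketch of such a pre-flight check. `validate_ranges` and `RANGES` are hypothetical names, and whether the tool itself enforces these bounds is not stated here.

```python
from typing import Any, Dict, List, Mapping, Tuple

# Inclusive ranges taken from the parameter details above.
RANGES: Dict[str, Tuple[int, int]] = {
    "cache_ttl": (0, 86400),
    "default_timeout": (5, 300),
    "max_retries": (0, 10),
}


def validate_ranges(config: Mapping[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty means all values fit."""
    errors: List[str] = []
    for key, (low, high) in RANGES.items():
        if key not in config:
            continue  # absent keys fall back to defaults
        value = config[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(f"{key}: expected int, got {type(value).__name__}")
        elif not low <= value <= high:
            errors.append(f"{key}: {value} is outside {low}-{high}")
    return errors
```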

enable_rate_limiting

Type: Boolean
Default: True
Description: Enable automatic rate limiting to prevent API quota exhaustion
Recommendation: Always True in production

enable_fallback

Type: Boolean
Default: True
Description: Enable automatic failover to alternative providers
Recommendation: True for high-availability applications

enable_data_fusion

Type: Boolean
Default: True
Description: Enable intelligent merging of multi-provider results
Recommendation: True for search operations

enable_query_enhancement

Type: Boolean
Default: True
Description: Enable automatic parameter completion from query text
Recommendation: True for AI agent integration

enable_intelligent_cache

Type: Boolean
Default: True
Description: Enable intent-aware cache TTL strategies
Recommendation: True for optimal performance

3. API Credentials

3.1 FRED API Key

Obtaining the Key:

Visit https://fred.stlouisfed.org/docs/api/api_key.html
Register for a free account
Request an API key

Configuration:

# Method 1: Direct configuration (Python)
tool = APISourceTool({'fred_api_key': 'YOUR_FRED_KEY'})

# Method 2: Environment variable (shell)
export APISOURCE_FRED_API_KEY="YOUR_FRED_KEY"

# Method 3: Configuration file (JSON)
{
    "fred_api_key": "YOUR_FRED_KEY"
}

Rate Limits:

Free tier: 120 requests per minute
No daily limit

3.2 News API Key

Obtaining the Key:

Visit https://newsapi.org/register
Choose a plan (Free tier available)
Get your API key

Configuration:

tool = APISourceTool({'newsapi_api_key': 'YOUR_NEWS_KEY'})

Rate Limits:

Free tier: 100 requests per day
Developer tier: 250 requests per day
Business tier: 250,000 requests per day

3.3 The Guardian API Key

Obtaining the Key:

Visit https://open-platform.theguardian.com/access/
Register for a free account
Request an API key

Configuration:

# Method 1: Direct configuration (Python)
tool = APISourceTool({'guardian_api_key': 'YOUR_GUARDIAN_KEY'})

# Method 2: Environment variable (shell)
export GUARDIAN_API_KEY="YOUR_GUARDIAN_KEY"

# Method 3: Configuration file (JSON)
{
    "guardian_api_key": "YOUR_GUARDIAN_KEY"
}

Rate Limits:

Free tier: 5,000 requests per day
Developer tier: 15,000 requests per day
Higher tiers available for commercial use

Important API Rules:

API Key Required: All API requests require an API key
Rate Limiting: Free tier allows 5,000 requests per day
Attribution: Must acknowledge The Guardian when displaying content
Data Freshness: Content is updated in real-time
Commercial Use: Contact The Guardian for commercial licensing

API Documentation:

API Documentation: https://open-platform.theguardian.com/documentation/
Content API: https://open-platform.theguardian.com/documentation/search
Tags API: https://open-platform.theguardian.com/documentation/tag
Sections API: https://open-platform.theguardian.com/documentation/section

Available Operations:

search_content: Search all Guardian content with advanced filtering
get_item: Get a specific content item by ID
get_tags: Get all tags or filter by type
search_tags: Search for tags by query
get_sections: Get all Guardian sections
get_edition: Get content for a specific edition (UK, US, AU, International)

Example Usage:

# Search for articles about climate change
result = tool.query(
    provider='guardian',
    operation='search_content',
    params={
        'q': 'climate change',
        'section': 'environment',
        'page_size': 10,
        'show_fields': 'headline,body,thumbnail'
    }
)

# Get all sections
result = tool.query(
    provider='guardian',
    operation='get_sections',
    params={}
)

# Search for tags
result = tool.query(
    provider='guardian',
    operation='search_tags',
    params={'q': 'technology', 'page_size': 10}
)

# Get US edition content
result = tool.query(
    provider='guardian',
    operation='get_edition',
    params={'edition': 'us', 'page_size': 20}
)

3.4 Census Bureau API Key

Obtaining the Key:

Visit https://api.census.gov/data/key_signup.html
Fill out the request form
Receive key via email

Configuration:

tool = APISourceTool({'census_api_key': 'YOUR_CENSUS_KEY'})

Rate Limits:

500 requests per IP per day (without key)
Higher limits with API key

3.4 Congress.gov API

Obtaining the Key (API key required):

Visit https://api.congress.gov/sign-up/
Fill out the registration form
Receive key via email

Configuration:

tool = APISourceTool({'congress_api_key': 'YOUR_CONGRESS_KEY'})

Rate Limits:

Reasonable usage limits with API key
Data updated regularly from official sources

Available Operations:

search_bills: Search for bills and resolutions
get_bill: Get detailed bill information
list_members: List members of Congress
get_member: Get member details
list_committees: List congressional committees
get_committee: Get committee details
search_amendments: Search for amendments
get_amendment: Get amendment details

3.5 OpenStates API

API Key Required:

config = {
    'openstates_api_key': 'YOUR_API_KEY'
}
tool = APISourceTool(config)

Configuration:

config = {
    'openstates_api_key': 'YOUR_API_KEY',
    'openstates_config': {
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)

Environment Variables:

export OPENSTATES_API_KEY="your_api_key_here"
export OPENSTATES_TIMEOUT=30
export OPENSTATES_RATE_LIMIT=10
export OPENSTATES_MAX_BURST=20

Obtaining an API Key:

Visit https://openstates.org/accounts/profile/
Register for a free account
Generate an API key from your profile
Copy the API key to your configuration

Rate Limits:

Free tier: Reasonable usage limits
Be respectful of the free service
Recommended: Max 10 requests per second

Important API Rules:

API Key Required: Must register for a free API key
Rate Limiting: Be respectful - implement reasonable delays between requests
Attribution: Acknowledge OpenStates.org when using the data
Data Freshness: Data is updated regularly from official state sources

API Documentation:

API v3 Documentation: https://docs.openstates.org/api-v3/
Interactive API Docs: https://v3.openstates.org/docs/
About OpenStates: https://openstates.org/about/

Available Operations:

search_bills: Search for state bills and resolutions with advanced filtering
get_bill: Get detailed information about a specific bill by ID
search_people: Search for state legislators with filtering options
get_person: Get detailed information about a specific legislator
list_jurisdictions: List all available state jurisdictions
get_jurisdiction: Get detailed information about a specific jurisdiction

Example Usage:

# Search for bills in California
result = tool.query(
    provider='openstates',
    operation='search_bills',
    params={'jurisdiction': 'CA', 'session': '2023', 'per_page': 10}
)

# Get current legislators from Texas
result = tool.query(
    provider='openstates',
    operation='search_people',
    params={'jurisdiction': 'TX', 'current': True, 'per_page': 10}
)

# List all available jurisdictions
result = tool.query(
    provider='openstates',
    operation='list_jurisdictions',
    params={'per_page': 52}
)

3.6 World Bank API

No API Key Required:

# World Bank API is publicly accessible
tool = APISourceTool()  # No key needed for World Bank

Rate Limits:

No official rate limit
Recommended: Max 10 requests per second

3.7 Alpha Vantage API Key

Obtaining the Key:

Visit https://www.alphavantage.co/support/#api-key
Register for a free account
Get your API key

Configuration:

tool = APISourceTool({'alphavantage_api_key': 'YOUR_ALPHAVANTAGE_KEY'})

Rate Limits:

Free tier: 5 API requests per minute, 500 per day
Premium tiers available with higher limits

3.8 REST Countries API

No API Key Required:

# REST Countries API is publicly accessible
tool = APISourceTool()  # No key needed for REST Countries

Rate Limits:

No official rate limit
Recommended: Max 10 requests per second

3.9 ExchangeRate-API

No API Key Required (Free Tier):

# ExchangeRate-API free tier works without key
tool = APISourceTool()  # No key needed for free tier

Optional API Key for Enhanced Features:

tool = APISourceTool({'exchangerate_api_key': 'YOUR_EXCHANGERATE_KEY'})

Rate Limits:

Free tier: 1,500 requests per month
Standard tier: Higher limits with API key

3.10 Open Library API

No API Key Required:

# Open Library API is completely free and open
tool = APISourceTool()  # No key needed for Open Library

Rate Limits:

No official rate limit
Recommended: Max 10 requests per second
Be respectful of the free service

3.11 Metropolitan Museum of Art (The Met) API

No API Key Required:

# The Met Museum API is completely free and open
tool = APISourceTool()  # No key needed for Met Museum

Configuration (Optional):

config = {
    'metmuseum_config': {
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20    # Maximum burst size
    }
}
tool = APISourceTool(config)

Environment Variables:

export METMUSEUM_TIMEOUT=30
export METMUSEUM_RATE_LIMIT=10
export METMUSEUM_MAX_BURST=20

Rate Limits:

No official rate limit
Recommended: Max 10 requests per second
Be respectful of the free service
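
The `rate_limit` and `max_burst` settings read naturally as the two parameters of a token bucket: a refill rate in requests per second and a bucket capacity. The sketch below illustrates that interpretation. It is an assumption about how the two values interact, not a description of the tool's internal limiter, and `TokenBucket` is a hypothetical name.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: refills at rate_limit tokens/second, holds at most max_burst."""

    def __init__(
        self,
        rate_limit: float = 10,
        max_burst: int = 20,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.rate_limit = rate_limit
        self.max_burst = max_burst
        self._clock = clock
        self._tokens = float(max_burst)  # start with a full bucket
        self._last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self._clock()
        refill = (now - self._last) * self.rate_limit
        self._tokens = min(float(self.max_burst), self._tokens + refill)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

With the values shown above (10 and 20), a client could send 20 requests back to back and would then be held to roughly 10 per second.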

Important API Rules:

No API Key Required: Completely free and open access
Rate Limiting: Be respectful - implement reasonable delays between requests
Data Coverage: Access to 470,000+ artworks from The Met collection
Images: Many objects include high-resolution images (check isPublicDomain flag)
Attribution: Acknowledge The Metropolitan Museum of Art when using the data

API Documentation:

API Documentation: https://metmuseum.github.io/
GitHub Repository: https://github.com/metmuseum/openaccess
Open Access Initiative: https://www.metmuseum.org/about-the-met/policies-and-documents/open-access

Supported Operations:

search_objects - Search for art objects with advanced filtering (query, department, date range, etc.)
get_object - Get detailed information about a specific art object by ID
get_departments - Get list of all departments at The Met
get_objects_by_department - Get all objects in a specific department
search_by_artist - Search for artworks by artist name
search_by_medium - Search for artworks by medium (Paintings, Sculpture, etc.)
search_by_culture - Search for artworks by culture or civilization
search_highlight_objects - Search for highlighted/featured objects
download_image - Download high-resolution images from The Met collection

Example Usage:

# Search for artworks by Van Gogh
result = tool.query(
    provider='metmuseum',
    operation='search_by_artist',
    params={'artist_name': 'Vincent van Gogh', 'has_images': True, 'limit': 10}
)

# Get detailed object information
result = tool.query(
    provider='metmuseum',
    operation='get_object',
    params={'object_id': 436535}  # Wheat Field with Cypresses
)

# Search for Egyptian art
result = tool.query(
    provider='metmuseum',
    operation='search_by_culture',
    params={'culture': 'Egyptian', 'has_images': True, 'limit': 20}
)

# Get all departments
result = tool.query(
    provider='metmuseum',
    operation='get_departments',
    params={}
)

# Search with date range filter
result = tool.query(
    provider='metmuseum',
    operation='search_objects',
    params={
        'q': 'impressionism',
        'has_images': True,
        'date_begin': 1860,
        'date_end': 1900,
        'limit': 15
    }
)

# Download image by object ID
result = tool.query(
    provider='metmuseum',
    operation='download_image',
    params={'object_id': 436535}  # Downloads primary image
)
print(f"Image saved to: {result['data']['output_path']}")

# Download image by direct URL
result = tool.query(
    provider='metmuseum',
    operation='download_image',
    params={
        'image_url': 'https://images.metmuseum.org/CRDImages/ep/original/DP-42549-001.jpg',
        'output_path': './artwork.jpg'  # Optional custom path
    }
)

Data Fields Available:

Object metadata: title, artist, date, medium, dimensions
Department and classification information
Geographic and cultural origin
High-resolution images (primaryImage, additionalImages)
Exhibition history and provenance
Related artworks and references
Public domain status (isPublicDomain)
Gallery information (isOnView, GalleryNumber)

3.12 CoinGecko API

No API Key Required:

# CoinGecko API is free for basic usage
tool = APISourceTool()  # No key needed for free tier

Rate Limits:

Free tier: 10-50 calls/minute (varies by endpoint)
Pro tier available with API key for higher limits

3.12 OpenWeatherMap API

Obtaining the Key:

Visit https://openweathermap.org/api
Sign up for a free account
Generate an API key from your account dashboard

Configuration:

tool = APISourceTool({'openweathermap_api_key': 'YOUR_OPENWEATHERMAP_KEY'})

Rate Limits:

Free tier: 60 calls/minute, 1,000,000 calls/month
Various paid tiers available

3.13 Wikipedia API

No API Key Required:

# Wikipedia API is completely free and open
tool = APISourceTool()  # No key needed for Wikipedia

Configuration with User-Agent (REQUIRED):

config = {
    'wikipedia_config': {
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
tool = APISourceTool(config)

Rate Limits:

Maximum: 200 requests per second
Recommended: 10 requests per second (default in configuration)
Be respectful of the free service

API Rules (https://www.mediawiki.org/wiki/API:Etiquette):

User-Agent Header REQUIRED: Must include a unique User-Agent header with:
- Application name and version
- Contact URL or email address
- Format: "AppName/Version (URL; contact@email.com)"
Rate Limiting: Limit to 200 requests/second maximum
Caching: Cache responses when possible to reduce load
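
The User-Agent format described above can be assembled with a small helper so that every provider configuration uses the same string. This is an illustrative sketch. `build_user_agent` is a hypothetical function and is not part of the tool.

```python
def build_user_agent(app_name: str, version: str, url: str, email: str) -> str:
    """Build a User-Agent in the form 'AppName/Version (URL; contact@email.com)'."""
    fields = {"app_name": app_name, "version": version, "url": url, "email": email}
    for label, value in fields.items():
        if not value.strip():
            raise ValueError(f"{label} must not be empty")
    return f"{app_name}/{version} ({url}; {email})"


# Placeholder values; substitute your own project URL and contact address.
config = {
    "wikipedia_config": {
        "user_agent": build_user_agent(
            "AIECS-APISource",
            "2.0",
            "https://github.com/your-org/aiecs",
            "your-email@example.com",
        )
    }
}
```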

API Documentation:

MediaWiki Action API: https://www.mediawiki.org/wiki/API:Main_page
REST API: https://en.wikipedia.org/api/rest_v1/
API Etiquette: https://www.mediawiki.org/wiki/API:Etiquette

3.14 GitHub API

API Key Recommended:

config = {
    'github_api_key': 'YOUR_GITHUB_TOKEN'
}
tool = APISourceTool(config)

Environment Variable:

export GITHUB_API_KEY="your_github_personal_access_token"

Rate Limits:

Authenticated: 5,000 requests per hour
Unauthenticated: 60 requests per hour
Strongly recommended to use authentication for higher limits

Obtaining an API Key:

Visit https://github.com/settings/tokens
Click “Generate new token” → “Generate new token (classic)”
Select scopes based on your needs:
- public_repo - Access public repositories
- repo - Full control of private repositories (if needed)
- user - Read user profile data
Generate and copy the token

API Documentation:

REST API: https://docs.github.com/en/rest
Authentication: https://docs.github.com/en/rest/authentication
Rate Limiting: https://docs.github.com/en/rest/rate-limit

3.13 arXiv API

No API Key Required:

# arXiv API is completely free and open
tool = APISourceTool()  # No key needed for arXiv

Configuration (Optional):

config = {
    'arxiv_config': {
        'timeout': 30,
        'rate_limit': 0.33,  # ~3 second delays between requests (1/3 req/s)
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
tool = APISourceTool(config)

Important API Rules:

Rate Limiting: Be respectful - implement 3 second delays between requests
Max Results: Limited to 30,000 results in slices of at most 2,000 at a time
Caching: Cache responses when possible to reduce server load
User-Agent: Set a descriptive User-Agent header
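
The result limits above (30,000 results in total, at most 2,000 per slice) translate into a simple paging plan. The sketch below computes the offset and size of each slice. `result_slices` is a hypothetical helper, and how the pairs map onto the provider's paging parameters is left to the caller. A caller following the rules above would also pause about 3 seconds between slices.

```python
from typing import Iterator, Tuple

MAX_TOTAL = 30_000  # overall cap from the rules above
MAX_SLICE = 2_000   # largest slice per request


def result_slices(wanted: int, slice_size: int = MAX_SLICE) -> Iterator[Tuple[int, int]]:
    """Yield (start, count) pairs covering up to `wanted` results."""
    if not 1 <= slice_size <= MAX_SLICE:
        raise ValueError(f"slice_size must be within 1-{MAX_SLICE}")
    total = min(wanted, MAX_TOTAL)  # never plan past the overall cap
    for start in range(0, total, slice_size):
        yield start, min(slice_size, total - start)
```

Requesting 4,500 results produces three slices, the last one covering the remaining 500.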

API Documentation:

API User Manual: https://info.arxiv.org/help/api/user-manual.html
API Basics: https://info.arxiv.org/help/api/basics.html
arXiv Categories: https://arxiv.org/category_taxonomy

3.14 PubMed/NCBI E-utilities API

API Key Optional but Recommended:

# Works without API key (3 requests/second limit)
tool = APISourceTool()

# With API key (10 requests/second limit)
config = {
    'pubmed_api_key': 'YOUR_PUBMED_API_KEY'
}
tool = APISourceTool(config)

Environment Variable:

export PUBMED_API_KEY="your_ncbi_api_key"

Configuration (Optional):

config = {
    'pubmed_config': {
        'api_key': 'YOUR_API_KEY',  # Optional but recommended
        'timeout': 30,
        'rate_limit': 3,  # 3 req/s without key, 10 with key
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
tool = APISourceTool(config)

Rate Limits:

Without API key: 3 requests per second
With API key: 10 requests per second
API key strongly recommended for better service

Obtaining an API Key:

Visit https://www.ncbi.nlm.nih.gov/account/
Register for a free NCBI account
Go to Settings → API Key Management
Generate a new API key

Important API Rules:

Rate Limiting: Max 3 requests/second without API key, 10 with API key
User-Agent: Set a descriptive User-Agent header with email
Caching: Cache responses when possible to reduce server load
API Key: Recommended for higher rate limits and better service

API Documentation:

E-utilities Quick Start: https://www.ncbi.nlm.nih.gov/books/NBK25500/
E-utilities API Guide: https://www.ncbi.nlm.nih.gov/books/NBK25501/
PubMed Help: https://pubmed.ncbi.nlm.nih.gov/help/

Supported Operations:

search_papers: Search for papers by query string
get_paper_by_id: Get paper metadata by PubMed ID (PMID)
search_by_author: Search for papers by author name
get_paper_details: Get detailed paper information including abstract and citations

3.15 CrossRef API

No API Key Required:

# CrossRef API is completely free and open
tool = APISourceTool()  # No key needed for CrossRef

Configuration (Optional):

config = {
    'crossref_config': {
        'mailto': 'your-email@example.com',  # For polite pool access (better rate limits)
        'timeout': 30,
        'rate_limit': 10,
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
tool = APISourceTool(config)

Environment Variable:

export CROSSREF_MAILTO="your-email@example.com"

Important API Rules:

Rate Limiting: Use polite pool (include mailto parameter) for better rate limits
User-Agent: Set a descriptive User-Agent header
Caching: Cache responses when possible to reduce server load
Attribution: Acknowledge CrossRef when using the data

API Documentation:

REST API Documentation: https://www.crossref.org/documentation/retrieve-metadata/rest-api/
API Etiquette: https://github.com/CrossRef/rest-api-doc#etiquette
Metadata Plus: https://www.crossref.org/services/metadata-delivery/

Supported Operations:

get_work_by_doi: Get metadata for a work by its DOI
search_works: Search for works by query string
get_journal_works: Get works published in a specific journal by ISSN
search_funders: Search for funders in the Open Funder Registry
get_funder_works: Get works associated with a specific funder

3.16 Semantic Scholar API

No API Key Required:

# Semantic Scholar API is completely free and open
tool = APISourceTool()  # No key needed for Semantic Scholar

Configuration (Optional):

config = {
    'semanticscholar_config': {
        'timeout': 30,
        'rate_limit': 1,  # Requests per second (recommended for sustained use)
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
tool = APISourceTool(config)

Environment Variables:

export SEMANTICSCHOLAR_TIMEOUT=30
export SEMANTICSCHOLAR_RATE_LIMIT=1
export SEMANTICSCHOLAR_MAX_BURST=5

Rate Limits:

Free tier: 1 request per second recommended (100 requests per 5 minutes)
Higher limits available upon request

Important API Rules:

Rate Limiting: Recommended 1 request per second for sustained use
Max Results: Limited to 100 results per request for search, use pagination for more
Caching: Cache responses when possible to reduce server load
User-Agent: Set a descriptive User-Agent header

API Documentation:

API Documentation: https://api.semanticscholar.org/api-docs/
Academic Graph API: https://www.semanticscholar.org/product/api
API Tutorial: https://www.semanticscholar.org/product/api/tutorial

Supported Operations:

search_papers: Search for papers by query string
get_paper: Get paper details by ID (S2 ID, DOI, arXiv ID, etc.)
get_paper_authors: Get authors of a specific paper
get_paper_citations: Get papers that cite this paper
get_paper_references: Get papers referenced by this paper
get_author: Get author details by ID
get_author_papers: Get papers by a specific author

3.17 CORE API Key

Obtaining the Key:

Visit https://core.ac.uk/services/api
Register for a free account
Request an API key from your account dashboard

Configuration:

# Method 1: Direct configuration (Python)
tool = APISourceTool({'core_api_key': 'YOUR_CORE_KEY'})

# Method 2: Environment variable (shell)
export CORE_API_KEY="YOUR_CORE_KEY"

# Method 3: Configuration file (JSON)
{
    "core_api_key": "YOUR_CORE_KEY"
}

Rate Limits:

Free tier: Reasonable usage with rate limiting
Contact CORE for higher limits if needed

Features:

Access to millions of open access research papers
Search by query, DOI, or title
Full metadata including authors, abstract, citations
Support for pagination

3.18 USPTO API Key

Obtaining the Key:

Visit https://developer.uspto.gov/
Register for a free developer account
Request an API key from your account dashboard

Configuration:

# Method 1: Direct configuration (Python)
tool = APISourceTool({'uspto_api_key': 'YOUR_USPTO_KEY'})

# Method 2: Environment variable (shell)
export USPTO_API_KEY="YOUR_USPTO_KEY"

# Method 3: Configuration file (JSON)
{
    "uspto_api_key": "YOUR_USPTO_KEY"
}

Rate Limits:

Free tier: Reasonable usage with rate limiting
Contact USPTO for higher limits if needed

Features:

Search US patents by query, inventor, or assignee
Get detailed patent information by patent number
Access to comprehensive US patent database
Full metadata including inventors, assignees, classifications, citations

3.19 SEC EDGAR API

No API Key Required:

# SEC EDGAR API is publicly accessible
# User-Agent header is REQUIRED
config = {
    'secedgar_config': {
        'user_agent': 'YourCompanyName contact@example.com'
    }
}
tool = APISourceTool(config)

Environment Variable:

export SECEDGAR_USER_AGENT="YourCompanyName contact@example.com"

Configuration with User-Agent (REQUIRED):

config = {
    'secedgar_config': {
        'user_agent': 'AIECS-APISource contact@example.com',
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)

Rate Limits:

Maximum: 10 requests per second
SEC may block access if rules are not followed
Be respectful of the free service

API Rules (https://www.sec.gov/os/accessing-edgar-data):

User-Agent Header REQUIRED: Must include:
- Company or individual name
- Contact email address
- Format: "CompanyName contact@email.com"
Rate Limiting: Limit to 10 requests per second maximum
Caching: Cache responses when possible to reduce load
Fair Access: SEC monitors usage and may block non-compliant access

API Documentation:

API Overview: https://www.sec.gov/search-filings/edgar-application-programming-interfaces
Accessing EDGAR Data: https://www.sec.gov/os/accessing-edgar-data
Data Sets: https://www.sec.gov/data-research/sec-markets-data

Features:

Company submissions and filing history
XBRL financial data and concepts
Company facts across all filings
No API key required - completely free

Supported Operations:

get_company_submissions - Get company filing history by CIK
get_company_concept - Get XBRL concept data for specific metrics
get_company_facts - Get all XBRL facts for a company

Example CIKs:

Apple Inc.: 0000320193
Tesla Inc.: 0001318605
Microsoft Corp.: 0000789019
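
The example CIKs are written as 10-digit, zero-padded strings. If a CIK arrives as an integer or an unpadded string, it can be normalized to that form first. This is a minimal sketch. `normalize_cik` is a hypothetical helper, and whether the tool pads CIKs automatically is not stated here.

```python
from typing import Union


def normalize_cik(cik: Union[int, str]) -> str:
    """Return the CIK as a 10-digit, zero-padded string."""
    digits = str(cik).strip()
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)
```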

3.20 Stack Exchange API

API Key Optional (Recommended):

# Stack Exchange API works without key but has lower rate limits
# API key strongly recommended for production use
config = {
    'stackexchange_config': {
        'api_key': 'YOUR_STACKEXCHANGE_API_KEY'
    }
}
tool = APISourceTool(config)

Environment Variable:

export STACKEXCHANGE_API_KEY="your_api_key_here"

Get Your API Key:

Visit https://stackapps.com/apps/oauth/register
Register your application
Copy your API key

Configuration:

config = {
    'stackexchange_config': {
        'api_key': 'YOUR_API_KEY',  # Optional but recommended
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)

Rate Limits:

Without API key: 300 requests per day
With API key: 10,000 requests per day
Respect the backoff field in API responses

API Rules (https://api.stackexchange.com/docs/throttle):

API Key Recommended: Increases daily quota from 300 to 10,000 requests
Backoff: Respect the backoff field in responses when present
Compression: API returns gzip compressed responses by default
Attribution: Required when displaying Stack Exchange content
Fair Use: Follow the API terms of service
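
Respecting the `backoff` field means remembering, per API method, how long to pause after a response that carried it. The sketch below tracks that state from already-parsed response dictionaries. `BackoffTracker` is a hypothetical helper, and it assumes `backoff` is a number of seconds, consistent with the throttling documentation linked above.

```python
import time
from typing import Any, Callable, Dict, Mapping


class BackoffTracker:
    """Remember, per method, until when requests should be paused."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._blocked_until: Dict[str, float] = {}

    def record(self, method: str, response: Mapping[str, Any]) -> None:
        """Inspect a parsed response and note any backoff it requests."""
        backoff = response.get("backoff")
        if backoff:
            self._blocked_until[method] = self._clock() + float(backoff)

    def wait_needed(self, method: str) -> float:
        """Seconds the caller should still wait before calling `method` again."""
        return max(0.0, self._blocked_until.get(method, 0.0) - self._clock())
```

A caller would check `wait_needed('search_questions')` before each request and sleep for the returned number of seconds when it is non-zero.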

API Documentation:

API Documentation: https://api.stackexchange.com/docs
Authentication: https://api.stackexchange.com/docs/authentication
Throttling: https://api.stackexchange.com/docs/throttle

Features:

Search questions across Stack Exchange network
Get detailed question and answer data
Search for users and their profiles
Browse tags and their statistics
Access all Stack Exchange sites (Stack Overflow, Server Fault, Super User, etc.)
Rich metadata including votes, views, acceptance status, and bounties

Supported Operations:

search_questions - Search for questions by query and tags
get_question - Get detailed information about a specific question
get_answers - Get answers for a specific question
search_users - Search for users by name
get_tags - Get tags and their statistics
get_sites - Get all sites in the Stack Exchange network

Popular Sites:

Stack Overflow: stackoverflow
Server Fault: serverfault
Super User: superuser
Ask Ubuntu: askubuntu
Mathematics: math

3.21 OpenCorporates API

API Key Required:

config = {
    'opencorporates_api_key': 'YOUR_OPENCORPORATES_API_KEY'
}
tool = APISourceTool(config)

Environment Variable:

export OPENCORPORATES_API_KEY="your_opencorporates_api_key"

Rate Limits:

Free tier: 200 requests per month, 50 requests per day
Open data projects: Free with share-alike attribution
Paid plans: Available for commercial use without restrictions

Obtaining an API Key:

Visit https://opencorporates.com/api_accounts/new
Register for a free account
Choose your plan (free for open data projects)
Get your API key from the dashboard

Features:

Search for companies by name across 140+ jurisdictions
Get detailed company information by jurisdiction and company number
Search for company officers (directors, agents)
Access company filings and statutory documents
Get jurisdiction information
Access to 200+ million companies worldwide

API Documentation:

API Reference: https://api.opencorporates.com/documentation/API-Reference
API Accounts: https://opencorporates.com/api_accounts/new
About OpenCorporates: https://opencorporates.com/info/about

3.22 CourtListener (Free Law Project) API

API Key Required:

config = {
    'courtlistener_api_key': 'YOUR_COURTLISTENER_API_KEY'
}
tool = APISourceTool(config)

Environment Variable:

export COURTLISTENER_API_KEY="your_courtlistener_api_key"

Configuration (Optional):

config = {
    'courtlistener_api_key': 'YOUR_API_KEY',
    'courtlistener_config': {
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)

Rate Limits:

Free tier: 5,000 requests per hour for authenticated users
Higher limits available upon request
Be respectful of the free service

Obtaining an API Key:

Visit https://www.courtlistener.com/sign-in/register/
Register for a free account
Go to your profile settings
Generate an API key
Copy and store the API key securely

Features:

Search legal opinions and case law from federal and state courts
Access court dockets and case filings (RECAP archive)
Search judges and judicial information
Access oral argument audio recordings
Explore legal citations and citation networks
Search court information
Access to millions of legal opinions and PACER data
Full metadata including case names, judges, courts, dates, citations

Supported Operations:

search_opinions - Search for legal opinions and case law with advanced filtering
get_opinion - Get detailed information about a specific legal opinion
search_dockets - Search for court dockets and case filings
get_docket - Get detailed information about a specific docket
search_judges - Search for judges and judicial information
get_judge - Get detailed information about a specific judge
search_oral_arguments - Search for oral argument audio recordings
get_oral_argument - Get detailed information about a specific oral argument
search_citations - Search for legal citations and citation networks
get_citation - Get detailed information about a specific citation
search_courts - Search for court information
get_court - Get detailed information about a specific court

Important API Rules:

API Key Required: Must register for a free API key
Rate Limiting: Default is 5,000 requests per hour for authenticated users
Attribution: Acknowledge Free Law Project when using the data
Data Freshness: Data is updated regularly from court sources and PACER
Fair Use: Follow the API terms of service

API Documentation:

REST API Documentation: https://www.courtlistener.com/help/api/rest/
Interactive API Docs: https://www.courtlistener.com/api/rest-info/
About Free Law Project: https://free.law/
Coverage Information: https://www.courtlistener.com/coverage/

Example Usage:

# Search for opinions about constitutional law
result = tool.query(
    provider='courtlistener',
    operation='search_opinions',
    params={'q': 'first amendment', 'court': 'scotus', 'page_size': 10}
)

# Search for dockets in a specific court
result = tool.query(
    provider='courtlistener',
    operation='search_dockets',
    params={'court': 'dcd', 'docket_number': '20-cv', 'page_size': 5}
)

# Search for judges
result = tool.query(
    provider='courtlistener',
    operation='search_judges',
    params={'name': 'Sotomayor', 'court': 'scotus'}
)

# Search for oral arguments
result = tool.query(
    provider='courtlistener',
    operation='search_oral_arguments',
    params={'court': 'scotus', 'case_name': 'Brown', 'page_size': 5}
)

# Get court information
result = tool.query(
    provider='courtlistener',
    operation='get_court',
    params={'court_id': 'scotus'}
)

Popular Court IDs:

Supreme Court: scotus
9th Circuit Court of Appeals: ca9
2nd Circuit Court of Appeals: ca2
D.C. District Court: dcd
Southern District of New York: nysd
Northern District of California: cand

3.23 GBIF (Global Biodiversity Information Facility) API

No API Key Required:

# GBIF API is completely free and open
tool = APISourceTool()  # No key needed for GBIF

Configuration (Optional):

config = {
    'gbif_config': {
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,   # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
    }
}
tool = APISourceTool(config)

Environment Variables:

export GBIF_TIMEOUT=30
export GBIF_RATE_LIMIT=10
export GBIF_MAX_BURST=20
export GBIF_USER_AGENT="AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)"
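
How these variables are turned into `gbif_config` is not spelled out here. A minimal sketch, assuming each variable maps to the same-named lowercase key (the helper `gbif_config_from_env` is hypothetical, not part of the tool):

```python
import os


def gbif_config_from_env(env=None):
    """Build a config dict from GBIF_* variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Variable name -> (config key, converter). This mapping is an assumption.
    fields = {
        'GBIF_TIMEOUT': ('timeout', int),
        'GBIF_RATE_LIMIT': ('rate_limit', float),
        'GBIF_MAX_BURST': ('max_burst', int),
        'GBIF_USER_AGENT': ('user_agent', str),
    }
    gbif_config = {}
    for var, (key, convert) in fields.items():
        if var in env:
            gbif_config[key] = convert(env[var])
    return {'gbif_config': gbif_config}
```

Variables that are not set are simply left out, so the tool's own defaults would still apply to them.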

Rate Limits:

No official rate limit
Recommended: Max 10 requests per second
Be respectful of the free service

Important API Rules:

No API Key Required: Completely free and open access
Rate Limiting: Be respectful - implement reasonable delays between requests
Data Coverage: Access to 2+ billion species occurrence records
Attribution: Acknowledge GBIF when using the data
Fair Use: Do not abuse the free service with excessive requests

API Documentation:

API Reference: https://techdocs.gbif.org/en/openapi/
Species API: https://techdocs.gbif.org/en/openapi/v1/species
Occurrence API: https://techdocs.gbif.org/en/openapi/v1/occurrence
Dataset API: https://techdocs.gbif.org/en/openapi/v1/dataset
About GBIF: https://www.gbif.org/what-is-gbif

Features:

Search for species by name or taxonomic criteria
Match scientific names to GBIF’s taxonomic backbone
Search occurrence records with geographic and temporal filters
Access dataset metadata and publishing information
Get vernacular (common) names in multiple languages
Explore taxonomic hierarchies and relationships
Access to 2+ billion biodiversity occurrence records
Rich metadata including coordinates, dates, basis of record

Supported Operations:

search_species - Search for species by name or other criteria
get_species_by_key - Get detailed species information by GBIF key
match_species_name - Match a scientific name to GBIF taxonomy
search_occurrences - Search for species occurrence records
get_occurrence_by_key - Get detailed occurrence record by key
search_datasets - Search for datasets in GBIF
get_dataset_by_key - Get detailed dataset information by key
get_species_vernacular_names - Get common/vernacular names for a species
get_species_children - Get direct children taxa of a species
get_species_parents - Get parent taxa hierarchy for a species
get_occurrence_count - Get count of occurrence records matching criteria
search_organizations - Search for publishing organizations

Example Usage:

# Search for species
result = tool.query(
    provider='gbif',
    operation='search_species',
    params={'q': 'Panthera leo', 'rank': 'SPECIES', 'limit': 10}
)

# Match a scientific name
result = tool.query(
    provider='gbif',
    operation='match_species_name',
    params={'name': 'Panthera leo', 'kingdom': 'Animalia'}
)

# Search for occurrence records
result = tool.query(
    provider='gbif',
    operation='search_occurrences',
    params={
        'taxonKey': 5219404,  # Panthera leo
        'country': 'KE',      # Kenya
        'year': '2020',
        'limit': 50
    }
)

# Get occurrence count
result = tool.query(
    provider='gbif',
    operation='get_occurrence_count',
    params={'country': 'US', 'year': '2020'}
)

# Get vernacular names
result = tool.query(
    provider='gbif',
    operation='get_species_vernacular_names',
    params={'key': 5219404}  # Panthera leo
)

# Search datasets
result = tool.query(
    provider='gbif',
    operation='search_datasets',
    params={'q': 'birds', 'type': 'OCCURRENCE', 'limit': 10}
)

# Get species details
result = tool.query(
    provider='gbif',
    operation='get_species_by_key',
    params={'key': 5219404}  # Panthera leo
)

# Get taxonomic children
result = tool.query(
    provider='gbif',
    operation='get_species_children',
    params={'key': 5219404, 'limit': 20}
)

# Search organizations
result = tool.query(
    provider='gbif',
    operation='search_organizations',
    params={'country': 'US', 'limit': 10}
)

Data Fields Available:

Species metadata: scientific name, rank, kingdom, phylum, class, order, family, genus
Occurrence data: coordinates, date, basis of record, dataset key
Dataset information: title, description, publishing organization, license
Vernacular names: common names in multiple languages
Taxonomic hierarchy: parent and child taxa
Geographic information: country, locality, coordinates
Temporal information: year, month, day, event date
Data quality: coordinate uncertainty, identification confidence

Common Taxonomic Ranks:

KINGDOM
PHYLUM
CLASS
ORDER
FAMILY
GENUS
SPECIES
SUBSPECIES

Common Basis of Record Values:

HUMAN_OBSERVATION
PRESERVED_SPECIMEN
FOSSIL_SPECIMEN
LIVING_SPECIMEN
MACHINE_OBSERVATION
MATERIAL_SAMPLE
OBSERVATION
OCCURRENCE

Use Cases:

Biodiversity research and analysis
Species distribution mapping
Conservation planning
Environmental impact assessments
Citizen science data exploration
Taxonomic research and validation
Dataset discovery and metadata retrieval
Geographic occurrence analysis

4. Performance Settings

4.1 Caching Configuration

config = {
    # Basic caching
    'cache_ttl': 300,  # 5 minutes
    
    # Intelligent caching (intent-aware TTL)
    'enable_intelligent_cache': True,
    
    # Cache backend (optional)
    'cache_backend': 'redis',  # 'memory' or 'redis'
    'redis_url': 'redis://localhost:6379/0'
}

Intelligent Cache TTL Strategies:

Recent data queries: 60 seconds
Historical data: 3600 seconds (1 hour)
Metadata queries: 86400 seconds (24 hours)
Search queries: 300 seconds (5 minutes)
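
The strategy list above can be read as a lookup from query intent to TTL. A rough sketch, assuming the intent labels `recent`, `historical`, `metadata`, and `search` (these names are illustrative; the tool's internal labels may differ):

```python
# TTLs in seconds, taken from the strategy list; the intent names are assumed.
INTENT_TTL = {
    'recent': 60,
    'historical': 3600,
    'metadata': 86400,
    'search': 300,
}


def ttl_for_intent(intent, cache_ttl=300, enable_intelligent_cache=True):
    """Pick a TTL for a query, falling back to the basic cache_ttl."""
    if not enable_intelligent_cache:
        return cache_ttl
    # Unknown intents use the basic TTL rather than failing.
    return INTENT_TTL.get(intent, cache_ttl)
```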

4.2 Timeout Configuration

config = {
    # Global timeout
    'default_timeout': 30,
    
    # Provider-specific timeouts
    'provider_timeouts': {
        'fred': 15,
        'worldbank': 45,
        'newsapi': 20,
        'census': 30
    }
}
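
One plausible reading of this configuration is that a provider-specific timeout wins and `default_timeout` covers every other provider. A small sketch under that assumption (`resolve_timeout` is a hypothetical helper):

```python
def resolve_timeout(config, provider):
    """Return the timeout in seconds that would apply to a provider."""
    default = config.get('default_timeout', 30)  # 30 is the documented default
    return config.get('provider_timeouts', {}).get(provider, default)


config = {
    'default_timeout': 30,
    'provider_timeouts': {'fred': 15, 'worldbank': 45},
}
fred_timeout = resolve_timeout(config, 'fred')        # provider-specific value
guardian_timeout = resolve_timeout(config, 'guardian')  # falls back to default
```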

4.3 Retry Configuration

config = {
    'max_retries': 3,
    'retry_backoff_factor': 2.0,  # Exponential backoff multiplier
    'retry_jitter': True,          # Add random jitter to prevent thundering herd
    'retry_on_status_codes': [429, 500, 502, 503, 504]
}

Retry Delay Calculation:

delay = base_delay * (backoff_factor ** attempt) + random_jitter

Example (assuming base_delay = 1.0s and backoff_factor = 2.0, with attempt counted from 0 in the formula):

Attempt 1: 1.0s + jitter
Attempt 2: 2.0s + jitter
Attempt 3: 4.0s + jitter
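
The delay schedule can be reproduced with a few lines of Python. This is a sketch only: `base_delay`, `max_jitter`, and the zero-based attempt counter are assumptions chosen so that the first delay is 1.0s, as in the example.

```python
import random


def retry_delays(max_retries=3, base_delay=1.0, backoff_factor=2.0,
                 jitter=True, max_jitter=0.5, rng=None):
    """Return the wait (in seconds) before each retry attempt."""
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(max_retries):  # attempt is 0, 1, 2, ...
        delay = base_delay * (backoff_factor ** attempt)
        if jitter:
            # Random jitter spreads out clients that fail at the same moment.
            delay += rng.uniform(0, max_jitter)
        delays.append(delay)
    return delays
```

With `jitter=False` the function returns the bare exponential schedule, which is convenient for tests.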

5. Feature Flags

5.1 Query Enhancement

config = {
    'enable_query_enhancement': True,
    'query_enhancement_config': {
        'confidence_threshold': 0.5,  # Min confidence for auto-enhancement
        'max_enhancements': 5,         # Max parameters to add
        'preserve_explicit_params': True  # Don't override user params
    }
}
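
To illustrate how the three settings could interact, here is a sketch that merges suggested parameters into a query. The suggestion format (name, value, confidence) and the function `apply_enhancements` are assumptions for illustration, not the tool's internal API.

```python
def apply_enhancements(params, suggestions, confidence_threshold=0.5,
                       max_enhancements=5, preserve_explicit_params=True):
    """Merge suggested parameters into params; suggestions are (name, value, confidence)."""
    enhanced = dict(params)
    added = 0
    # Consider the most confident suggestions first.
    for name, value, confidence in sorted(suggestions, key=lambda s: s[2], reverse=True):
        if added >= max_enhancements:
            break
        if confidence < confidence_threshold:
            continue  # too uncertain to apply automatically
        if preserve_explicit_params and name in params:
            continue  # never override what the user set explicitly
        enhanced[name] = value
        added += 1
    return enhanced
```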

5.2 Fallback Strategy

config = {
    'enable_fallback': True,
    'fallback_config': {
        'max_fallback_attempts': 2,
        'fallback_timeout_multiplier': 1.5,  # Increase timeout for fallback
        'preserve_quality_threshold': 0.7     # Min quality for fallback result
    }
}
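
The exact fallback algorithm is not described here. The sketch below shows one way the three settings might combine: the primary provider is tried first, fallbacks get a longer timeout, and a fallback result is accepted only if its quality score meets the threshold. The `fetch` callable, the quality score, and `demo_fetch` are all hypothetical.

```python
def query_with_fallback(providers, fetch, timeout=30, max_fallback_attempts=2,
                        fallback_timeout_multiplier=1.5,
                        preserve_quality_threshold=0.7):
    """Try the primary provider, then up to max_fallback_attempts fallbacks."""
    candidates = providers[:1 + max_fallback_attempts]
    for index, provider in enumerate(candidates):
        is_fallback = index > 0
        attempt_timeout = timeout * fallback_timeout_multiplier if is_fallback else timeout
        try:
            data, quality = fetch(provider, attempt_timeout)
        except Exception:
            continue  # provider failed, move on to the next candidate
        if is_fallback and quality < preserve_quality_threshold:
            continue  # fallback answered, but not well enough
        return {'provider': provider, 'data': data, 'timeout': attempt_timeout}
    return None


def demo_fetch(provider, timeout):
    """Stand-in for a real provider call: 'fred' is down, others answer."""
    if provider == 'fred':
        raise TimeoutError('primary provider unavailable')
    return {'value': 1}, 0.8


result = query_with_fallback(['fred', 'worldbank'], demo_fetch)
```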

5.3 Data Fusion

config = {
    'enable_data_fusion': True,
    'data_fusion_config': {
        'default_strategy': 'best_quality',  # 'best_quality', 'merge_all', 'consensus'
        'quality_weight': 0.6,
        'freshness_weight': 0.3,
        'completeness_weight': 0.1
    }
}
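
The three weights sum to 1.0, which suggests a weighted score per result. A minimal sketch of how a `best_quality` style selection could use them, assuming each result carries `quality`, `freshness`, and `completeness` values between 0 and 1 (these field names are illustrative):

```python
def fusion_score(result, quality_weight=0.6, freshness_weight=0.3,
                 completeness_weight=0.1):
    """Weighted score of one provider result."""
    return (quality_weight * result['quality']
            + freshness_weight * result['freshness']
            + completeness_weight * result['completeness'])


def pick_best(results, **weights):
    """Return the result with the highest weighted score."""
    return max(results, key=lambda r: fusion_score(r, **weights))


results = [
    {'provider': 'fred', 'quality': 0.9, 'freshness': 0.2, 'completeness': 0.5},
    {'provider': 'worldbank', 'quality': 0.7, 'freshness': 0.9, 'completeness': 0.9},
]
best = pick_best(results)
```

Raising `quality_weight` relative to `freshness_weight` shifts the choice toward the more reliable but older result.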

5.4 Rate Limiting

config = {
    'enable_rate_limiting': True,
    'rate_limit_config': {
        'fred': {
            'tokens_per_second': 2.0,  # 120 per minute
            'max_tokens': 10
        },
        'newsapi': {
            'tokens_per_second': 0.001,  # ~86 per day (0.001 x 86,400 s)
            'max_tokens': 5
        },
        'census': {
            'tokens_per_second': 0.005,  # ~432 per day (0.005 x 86,400 s)
            'max_tokens': 10
        }
    }
}
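
The parameter names `tokens_per_second` and `max_tokens` suggest a token-bucket limiter. The class below is a generic sketch of that technique, not the tool's actual implementation; the injectable `clock` argument is an assumption added to make it testable.

```python
import time


class TokenBucket:
    """Generic token bucket: refills continuously, allows bursts up to max_tokens."""

    def __init__(self, tokens_per_second, max_tokens, clock=time.monotonic):
        self.rate = tokens_per_second
        self.capacity = max_tokens
        self.tokens = float(max_tokens)  # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def try_acquire(self, n=1):
        """Take n tokens if available; return False instead of blocking."""
        now = self.clock()
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

With the `fred` settings (2.0 tokens per second, 10 max tokens), ten requests can go out back to back, after which one request is admitted every half second.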

6. Provider-Specific Configuration

6.1 FRED Provider

config = {
    'fred_api_key': 'YOUR_KEY',
    'fred_config': {
        'base_url': 'https://api.stlouisfed.org/fred',
        'timeout': 15,
        'default_file_type': 'json',
        'default_frequency': 'a',  # Annual
        'default_units': 'lin'     # Linear
    }
}

6.2 World Bank Provider

config = {
    'worldbank_config': {
        'base_url': 'https://api.worldbank.org/v2',
        'timeout': 45,
        'default_format': 'json',
        'default_per_page': 50,
        'default_language': 'en'
    }
}

6.3 News API Provider

config = {
    'newsapi_api_key': 'YOUR_KEY',
    'newsapi_config': {
        'base_url': 'https://newsapi.org/v2',
        'timeout': 20,
        'default_language': 'en',
        'default_page_size': 20,
        'default_sort_by': 'publishedAt'
    }
}

6.4 The Guardian Provider

config = {
    'guardian_api_key': 'YOUR_KEY',
    'guardian_config': {
        'base_url': 'https://content.guardianapis.com',
        'timeout': 30,
        'rate_limit': 5,
        'max_burst': 10
    }
}

Features:

Search all Guardian content with advanced filtering
Get specific content items by ID
Browse and search tags (keywords, contributors, series, etc.)
Get all sections
Filter by section, tag, date range
Support for multiple editions (UK, US, AU, International)
Rich metadata including headlines, body text, thumbnails, tags

Supported Operations:

search_content - Search all Guardian content with advanced filtering options
get_item - Get a specific content item by ID
get_tags - Get all tags or filter by type
search_tags - Search for tags by query
get_sections - Get all Guardian sections
get_edition - Get content for a specific edition

Important Configuration Notes:

API Key Required: Must register for a free API key
Rate Limits: Free tier allows 5,000 requests per day
Attribution: Must acknowledge The Guardian when displaying content
Content Fields: Use show_fields parameter to request specific fields (headline, body, thumbnail, etc.)
Tags: Use show_tags parameter to include tag metadata (keyword, contributor, etc.)

Example Usage:

# Search for articles about technology
result = tool.query(
    provider='guardian',
    operation='search_content',
    params={
        'q': 'artificial intelligence',
        'section': 'technology',
        'from_date': '2024-01-01',
        'page_size': 10,
        'show_fields': 'headline,body,thumbnail',
        'show_tags': 'keyword,contributor'
    }
)

# Get all sections
result = tool.query(
    provider='guardian',
    operation='get_sections',
    params={}
)

# Search for tags related to climate
result = tool.query(
    provider='guardian',
    operation='search_tags',
    params={'q': 'climate', 'page_size': 10}
)

# Get US edition content
result = tool.query(
    provider='guardian',
    operation='get_edition',
    params={'edition': 'us', 'page_size': 20}
)

API Documentation:

API Overview: https://open-platform.theguardian.com/documentation/
Content Search: https://open-platform.theguardian.com/documentation/search
Tags API: https://open-platform.theguardian.com/documentation/tag
Sections API: https://open-platform.theguardian.com/documentation/section

6.5 Census Provider

config = {
    'census_api_key': 'YOUR_KEY',
    'census_config': {
        'base_url': 'https://api.census.gov/data',
        'timeout': 30,
        'default_year': 2021,
        'default_dataset': 'acs/acs5'
    }
}

6.5 Congress Provider

config = {
    'congress_api_key': 'YOUR_KEY',
    'congress_config': {
        'base_url': 'https://api.congress.gov/v3',
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}

Available Operations:

search_bills: Search for bills and resolutions by congress number and type
get_bill: Get detailed information about a specific bill
list_members: List members of Congress by congress number and chamber
get_member: Get detailed information about a specific member
list_committees: List congressional committees
get_committee: Get detailed information about a specific committee
search_amendments: Search for amendments to bills
get_amendment: Get detailed information about a specific amendment

Example Usage:

# Search for bills in the 118th Congress
result = tool.execute('search_bills', {
    'congress': 118,
    'bill_type': 'hr',
    'limit': 10
})

# Get specific bill details
result = tool.execute('get_bill', {
    'congress': 118,
    'bill_type': 'hr',
    'bill_number': 1
})

# List House members in 118th Congress
result = tool.execute('list_members', {
    'congress': 118,
    'chamber': 'house',
    'limit': 20
})

6.6 OpenStates Provider

config = {
    'openstates_api_key': 'YOUR_API_KEY',  # REQUIRED
    'openstates_config': {
        'base_url': 'https://v3.openstates.org',
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}

Available Operations:

search_bills: Search for state bills and resolutions with advanced filtering
get_bill: Get detailed information about a specific bill by ID
search_people: Search for state legislators with filtering options
get_person: Get detailed information about a specific legislator
list_jurisdictions: List all available state jurisdictions
get_jurisdiction: Get detailed information about a specific jurisdiction

Example Usage:

# Search for bills in California
result = tool.query(
    provider='openstates',
    operation='search_bills',
    params={'jurisdiction': 'CA', 'session': '2023', 'per_page': 10}
)

# Search for bills by subject
result = tool.query(
    provider='openstates',
    operation='search_bills',
    params={
        'jurisdiction': 'NY',
        'subject': 'Education',
        'per_page': 5
    }
)

# Get current legislators from Texas
result = tool.query(
    provider='openstates',
    operation='search_people',
    params={'jurisdiction': 'TX', 'current': True, 'per_page': 10}
)

# List all available jurisdictions
result = tool.query(
    provider='openstates',
    operation='list_jurisdictions',
    params={'per_page': 52}
)

# Get specific bill details
result = tool.query(
    provider='openstates',
    operation='get_bill',
    params={'bill_id': 'ocd-bill/...'}
)

Important Configuration Notes:

API Key Required: Must register for a free API key at https://openstates.org/accounts/profile/
Rate Limit: The free tier has usage limits; the default client-side setting is 10 req/s
Attribution: Acknowledge OpenStates.org when using the data
Data Freshness: Data is updated regularly from official state sources
Coverage: All 50 U.S. states plus DC and Puerto Rico

API Documentation:

API v3 Documentation: https://docs.openstates.org/api-v3/
Interactive API Docs: https://v3.openstates.org/docs/
About OpenStates: https://openstates.org/about/

6.7 Alpha Vantage Provider

config = {
    'alphavantage_api_key': 'YOUR_KEY',
    'alphavantage_config': {
        'base_url': 'https://www.alphavantage.co/query',
        'timeout': 30,
        'default_datatype': 'json'
    }
}

6.6 REST Countries Provider

config = {
    'restcountries_config': {
        'base_url': 'https://restcountries.com/v3.1',
        'timeout': 30
    }
}

6.7 ExchangeRate Provider

config = {
    'exchangerate_api_key': 'YOUR_KEY',  # Optional
    'exchangerate_config': {
        'base_url': 'https://api.exchangerate-api.com/v4',
        'timeout': 30
    }
}

6.8 Open Library Provider

config = {
    'openlibrary_config': {
        'base_url': 'https://openlibrary.org',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20    # Maximum burst size
    }
}

6.9 Metropolitan Museum of Art (The Met) Provider

config = {
    'metmuseum_config': {
        'base_url': 'https://collectionapi.metmuseum.org/public/collection/v1',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20    # Maximum burst size
    }
}

Features:

Search art objects with comprehensive filtering
Get detailed object information including high-resolution images
Browse by department, artist, medium, culture
Access to 470,000+ artworks from The Met collection
Rich metadata including provenance, exhibition history
Public domain images available for many works

Supported Operations:

search_objects - Search for art objects with advanced filtering
get_object - Get detailed information about a specific art object
get_departments - Get list of all departments
get_objects_by_department - Get objects in a specific department
search_by_artist - Search for artworks by artist name
search_by_medium - Search for artworks by medium
search_by_culture - Search for artworks by culture
search_highlight_objects - Search for highlighted/featured objects
download_image - Download high-resolution images from The Met collection

Important Configuration Notes:

No API Key Required: Completely free and open access
Rate Limit: No official limit, recommended 10 req/s
Attribution: Acknowledge The Metropolitan Museum of Art when using the data
Images: Many objects include high-resolution images (check isPublicDomain flag)
Data Quality: Objects carry comprehensive metadata suitable for frontend analysis

Example Usage:

# Search for impressionist paintings
result = tool.query(
    provider='metmuseum',
    operation='search_objects',
    params={
        'q': 'impressionism',
        'has_images': True,
        'date_begin': 1860,
        'date_end': 1900,
        'limit': 20
    }
)

# Get specific artwork details
result = tool.query(
    provider='metmuseum',
    operation='get_object',
    params={'object_id': 436535}
)

# Search by artist
result = tool.query(
    provider='metmuseum',
    operation='search_by_artist',
    params={'artist_name': 'Vincent van Gogh', 'has_images': True, 'limit': 10}
)

# Get all departments
result = tool.query(
    provider='metmuseum',
    operation='get_departments',
    params={}
)

# Download artwork images
result = tool.query(
    provider='metmuseum',
    operation='download_image',
    params={'object_id': 436535, 'output_path': './vangogh.jpg'}
)

Image Download Feature: The Met Museum provider's download_image operation downloads high-resolution images, either by object ID or from a direct image URL:

# Download by object ID (automatically fetches primary image)
result = tool.query(
    provider='metmuseum',
    operation='download_image',
    params={'object_id': 436535}
)
# Returns: {'success': True, 'output_path': '/tmp/...jpg', 'file_size': 1234567}

# Download by direct URL with custom path
result = tool.query(
    provider='metmuseum',
    operation='download_image',
    params={
        'image_url': 'https://images.metmuseum.org/CRDImages/ep/original/DP-42549-001.jpg',
        'output_path': './my_artwork.jpg'
    }
)

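# Batch download from search results
import os

search_result = tool.query(
    provider='metmuseum',
    operation='search_objects',
    params={'q': 'van gogh', 'has_images': True, 'limit': 5}
)

# Create the output directory first; download_image may not create it
os.makedirs('./images', exist_ok=True)

for obj in search_result['data']['objects']: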
    if obj.get('primaryImage'):
        download_result = tool.query(
            provider='metmuseum',
            operation='download_image',
            params={
                'image_url': obj['primaryImage'],
                'output_path': f"./images/{obj['objectID']}.jpg"
            }
        )

API Documentation:

API Documentation: https://metmuseum.github.io/
GitHub Repository: https://github.com/metmuseum/openaccess
Open Access Initiative: https://www.metmuseum.org/about-the-met/policies-and-documents/open-access

6.10 CoinGecko Provider

config = {
    'coingecko_config': {
        'base_url': 'https://api.coingecko.com/api/v3',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second (free tier)
        'max_burst': 20    # Maximum burst size
    }
}

Note: CoinGecko free tier does not require an API key. For higher rate limits and additional features, consider the Pro API.

6.10 OpenWeatherMap Provider

config = {
    'openweathermap_api_key': 'YOUR_KEY',
    'openweathermap_config': {
        'base_url': 'https://api.openweathermap.org/data/2.5',
        'geo_url': 'https://api.openweathermap.org/geo/1.0',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20    # Maximum burst size
    }
}

Obtaining the Key:

Visit https://openweathermap.org/api
Sign up for a free account
Generate an API key from your account dashboard

6.11 Wikipedia Provider

config = {
    'wikipedia_config': {
        'base_url': 'https://en.wikipedia.org/w/api.php',
        'rest_base_url': 'https://en.wikipedia.org/api/rest_v1',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second (max 200 allowed)
        'max_burst': 20,   # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'  # REQUIRED
    }
}

Features:

Article search by title or content
Page summaries and extracts
Full page content retrieval
Random article discovery
Page metadata and information

Important Configuration Notes:

No API Key Required: Wikipedia API is completely free and open
User-Agent REQUIRED: Must set a unique User-Agent with contact information
Rate Limit: Maximum 200 req/s allowed, default config uses 10 req/s
API Etiquette: Follow https://www.mediawiki.org/wiki/API:Etiquette

6.12 GitHub Provider

config = {
    'github_api_key': 'YOUR_GITHUB_TOKEN',  # Recommended for higher rate limits
    'github_config': {
        'base_url': 'https://api.github.com',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,   # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs)'
    }
}

Features:

Repository information and statistics
Search repositories, users, and code
User profiles and activity
Repository issues and pull requests
Organization data

Supported Operations:

get_repository - Get detailed repository information
search_repositories - Search for repositories
get_user - Get user profile information
search_users - Search for users
get_repository_issues - Get repository issues
get_repository_pulls - Get repository pull requests
search_code - Search for code across repositories

Important Configuration Notes:

API Key Recommended: Use a Personal Access Token for 5,000 req/hour (vs 60 unauthenticated)
Rate Limits: Authenticated: 5,000/hour, Unauthenticated: 60/hour
Token Scopes: Use minimal scopes needed (e.g., public_repo for public data)
API Version: Uses GitHub REST API v3 with application/vnd.github+json accept header
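
For reference, the headers implied by these notes look roughly like the following. The helper `github_headers` is hypothetical; the `Accept` value comes from the note above and the `Authorization: Bearer` form is the one GitHub documents for personal access tokens.

```python
def github_headers(token=None,
                   user_agent='AIECS-APISource/2.0 (https://github.com/your-org/aiecs)'):
    """Build request headers for the GitHub REST API (illustrative helper)."""
    headers = {
        'Accept': 'application/vnd.github+json',
        'User-Agent': user_agent,
    }
    if token:
        # Authenticated requests get the higher 5,000 requests/hour limit.
        headers['Authorization'] = f'Bearer {token}'
    return headers
```

Without a token the request is still valid but falls under the 60 requests/hour unauthenticated limit.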

Obtaining the Key:

Visit https://github.com/settings/tokens
Generate new token (classic)
Select appropriate scopes
Copy and store the token securely

6.13 arXiv Provider

config = {
    'arxiv_config': {
        'base_url': 'http://export.arxiv.org/api/query',
        'timeout': 30,
        'rate_limit': 0.33,  # Requests per second (~3 second delays between requests)
        'max_burst': 2,      # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
    }
}

Features:

Search papers by query (all fields)
Get paper by arXiv ID
Search by author name
Search by category (e.g., cs.AI, math.CO)
Pagination support
Full metadata including authors, abstract, categories, PDF links

Important Configuration Notes:

No API Key Required: arXiv API is completely free and open
Rate Limit: Be respectful - implement 3 second delays between requests
Max Results: Limited to 30,000 results in slices of at most 2,000 at a time
Caching: Strongly recommended to cache responses to reduce server load
API Etiquette: Follow https://info.arxiv.org/help/api/user-manual.html
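
The result limits above translate into a simple paging plan. The sketch below computes `(start, max_results)` pairs, which correspond to the arXiv API's `start` and `max_results` query parameters; the helper name `arxiv_slices` is an assumption.

```python
def arxiv_slices(total_wanted, slice_size=2000, hard_cap=30000):
    """Split a large request into (start, max_results) pages within the limits."""
    if slice_size < 1:
        raise ValueError('slice_size must be at least 1')
    total = min(total_wanted, hard_cap)   # never ask past the 30,000 cap
    size = min(slice_size, 2000)          # never ask for more than 2,000 at once
    return [(start, min(size, total - start)) for start in range(0, total, size)]
```

A caller would issue one request per pair and pause about 3 seconds between them, in line with the rate limit note.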

Obtaining the Key:

No API key required - completely free and open access

API Documentation:

API User Manual: https://info.arxiv.org/help/api/user-manual.html
API Basics: https://info.arxiv.org/help/api/basics.html
Category Taxonomy: https://arxiv.org/category_taxonomy

6.14 PubMed Provider

config = {
    'pubmed_api_key': 'YOUR_NCBI_API_KEY',  # Optional but recommended
    'pubmed_config': {
        'base_url': 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils',
        'timeout': 30,
        'rate_limit': 3,     # Requests per second (3 without key, 10 with key)
        'max_burst': 5,      # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
    }
}

Features:

Search biomedical and life sciences literature
Get paper metadata by PubMed ID (PMID)
Search by author name
Get detailed paper information including abstracts
Access to 35+ million citations from MEDLINE, PubMed, and other databases
Full metadata including authors, journal, DOI, publication date

Supported Operations:

search_papers - Search for papers by query string
get_paper_by_id - Get paper metadata by PMID
search_by_author - Search for papers by author name
get_paper_details - Get detailed paper information including abstract

Important Configuration Notes:

API Key Optional but Recommended: Increases rate limit from 3 to 10 requests/second
Rate Limits: 3 req/s without API key, 10 req/s with API key
User-Agent: Should include contact email for NCBI to reach you if needed
Caching: Strongly recommended to cache responses to reduce server load
API Etiquette: Follow NCBI E-utilities guidelines
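
Because the allowed rate depends on whether a key is present, it can be convenient to derive `rate_limit` from the key. A small sketch, assuming the configuration keys shown at the top of this section (`pubmed_settings` is a hypothetical helper):

```python
def pubmed_settings(api_key=None):
    """Return a config dict whose rate_limit matches the documented NCBI limits."""
    config = {
        'pubmed_config': {
            'base_url': 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils',
            'timeout': 30,
            # 3 req/s without a key, 10 req/s with one
            'rate_limit': 10 if api_key else 3,
            'max_burst': 5,
        }
    }
    if api_key:
        config['pubmed_api_key'] = api_key
    return config
```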

Obtaining the Key:

Visit https://www.ncbi.nlm.nih.gov/account/
Register for a free NCBI account
Go to Settings → API Key Management
Generate a new API key

API Documentation:

E-utilities Quick Start: https://www.ncbi.nlm.nih.gov/books/NBK25500/
E-utilities API Guide: https://www.ncbi.nlm.nih.gov/books/NBK25501/
PubMed Help: https://pubmed.ncbi.nlm.nih.gov/help/

6.15 CrossRef Provider

config = {
    'crossref_config': {
        'base_url': 'https://api.crossref.org',
        'mailto': 'your-email@example.com',  # For polite pool access
        'timeout': 30,
        'rate_limit': 10,    # Requests per second
        'max_burst': 20,     # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}

Features:

Get work metadata by DOI
Search for scholarly works
Get works from specific journals by ISSN
Search for funders in Open Funder Registry
Get works funded by specific funders
Access to extensive scholarly metadata including citations, references, authors, affiliations

Supported Operations:

get_work_by_doi - Get metadata for a work by its DOI
search_works - Search for works by query string with pagination and sorting
get_journal_works - Get works published in a specific journal by ISSN
search_funders - Search for funders in the Open Funder Registry
get_funder_works - Get works associated with a specific funder

Important Configuration Notes:

No API Key Required: CrossRef API is completely free and open
Polite Pool: Provide an email address (mailto parameter) for better rate limits
User-Agent: Set a descriptive User-Agent header with contact information
Caching: Strongly recommended to cache responses to reduce server load
Attribution: Acknowledge CrossRef when using the data in publications
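
Polite pool access works by sending the contact address as a `mailto` query parameter. The sketch below builds such a URL for the `/works` endpoint; the helper name and the `rows` default are assumptions.

```python
from urllib.parse import urlencode


def crossref_works_url(query, mailto=None, rows=20,
                       base_url='https://api.crossref.org'):
    """Build a /works search URL, adding mailto for polite pool access."""
    params = {'query': query, 'rows': rows}
    if mailto:
        params['mailto'] = mailto
    return f"{base_url}/works?{urlencode(params)}"


url = crossref_works_url('deep learning', mailto='your-email@example.com', rows=5)
```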

Obtaining Access:

No API key required - completely free and open access
Optional: Register email for polite pool access (better rate limits)

API Documentation:

REST API Documentation: https://www.crossref.org/documentation/retrieve-metadata/rest-api/
API Etiquette: https://github.com/CrossRef/rest-api-doc#etiquette
Metadata Plus: https://www.crossref.org/services/metadata-delivery/

6.16 Semantic Scholar Provider

config = {
    'semanticscholar_config': {
        'base_url': 'https://api.semanticscholar.org/graph/v1',
        'timeout': 30,
        'rate_limit': 1,     # Requests per second (recommended for sustained use)
        'max_burst': 5,      # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}

Features:

Search for academic papers by query
Get paper metadata by ID (S2 ID, DOI, arXiv ID, etc.)
Get paper authors, citations, and references
Get author information and publications
Access to extensive academic paper database with citation data
Support for multiple paper ID formats (S2 ID, DOI, arXiv ID, PubMed ID, etc.)

Supported Operations:

search_papers - Search for papers by query string
get_paper - Get paper details by ID (S2 ID, DOI, arXiv ID, etc.)
get_paper_authors - Get authors of a specific paper
get_paper_citations - Get papers that cite this paper
get_paper_references - Get papers referenced by this paper
get_author - Get author details by ID
get_author_papers - Get papers by a specific author

Important Configuration Notes:

No API Key Required: Semantic Scholar API is completely free and open
Rate Limit: Recommended at most 1 request per second for sustained use; note that a cap of 100 requests per 5 minutes averages to roughly 1 request every 3 seconds
Max Results: Limited to 100 results per request for search, use pagination for more
User-Agent: Set a descriptive User-Agent header with contact information
Caching: Strongly recommended to cache responses to reduce server load
Paper IDs: Supports multiple ID formats (S2 ID, DOI, arXiv ID, PubMed ID, etc.)
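
The different ID formats are distinguished by a prefix on the paper ID. A short sketch of building a paper lookup URL, assuming the prefixes `DOI:`, `ARXIV:`, and `PMID:` used by the Academic Graph API (the helper `s2_paper_url` and its `kind` labels are illustrative):

```python
# Prefix per ID type; a plain S2 paper ID takes no prefix.
S2_ID_PREFIXES = {'s2': '', 'doi': 'DOI:', 'arxiv': 'ARXIV:', 'pubmed': 'PMID:'}


def s2_paper_url(identifier, kind='s2',
                 base_url='https://api.semanticscholar.org/graph/v1'):
    """Build the paper lookup URL for a given ID type."""
    if kind not in S2_ID_PREFIXES:
        raise ValueError(f'unsupported ID type: {kind}')
    return f"{base_url}/paper/{S2_ID_PREFIXES[kind]}{identifier}"
```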

Obtaining Access:

No API key required - completely free and open access
Optional: Contact Semantic Scholar for higher rate limits if needed

API Documentation:

API Documentation: https://api.semanticscholar.org/api-docs/
Academic Graph API: https://www.semanticscholar.org/product/api
API Tutorial: https://www.semanticscholar.org/product/api/tutorial

6.17 CORE Provider

config = {
    'core_api_key': 'YOUR_CORE_API_KEY',  # Required
    'core_config': {
        'base_url': 'https://api.core.ac.uk/v3',
        'timeout': 30,
        'rate_limit': 10,    # Requests per second
        'max_burst': 20,     # Maximum burst size
    }
}

Features:

Search for open access research papers
Get work metadata by CORE ID
Search by DOI
Search by title
Access to millions of open access research papers
Full metadata including authors, abstract, publication date, citations

Supported Operations:

search_works - Search for works by query string
get_work - Get work details by CORE ID
search_by_doi - Search for works by DOI
search_by_title - Search for works by title

Important Configuration Notes:

API Key Required: CORE API requires an API key for access
Rate Limit: Free tier allows reasonable usage with rate limiting
Max Results: Limited to 100 results per request for search, use pagination for more
Caching: Strongly recommended to cache responses to reduce server load
Attribution: Acknowledge CORE when using the data in publications

Obtaining the Key:

Visit https://core.ac.uk/services/api
Register for a free account
Request an API key from your account dashboard

API Documentation:

API Documentation: https://core.ac.uk/documentation/api
API Services: https://core.ac.uk/services/api
About CORE: https://core.ac.uk/about

6.18 USPTO Provider

config = {
    'uspto_api_key': 'YOUR_USPTO_API_KEY',  # Required
    'uspto_config': {
        'base_url': 'https://developer.uspto.gov/ibd-api/v1',
        'timeout': 30,
        'rate_limit': 10,    # Requests per second
        'max_burst': 20,     # Maximum burst size
    }
}

Features:

Search for US patents by query
Get patent details by patent number
Search patents by inventor name
Search patents by assignee (company/organization)
Access to comprehensive US patent database
Full metadata including title, abstract, inventors, assignees, classifications, citations

Supported Operations:

search_patents - Search for patents by query string
get_patent - Get patent details by patent number/ID
search_by_inventor - Search for patents by inventor name
search_by_assignee - Search for patents by assignee name

Important Configuration Notes:

API Key Required: USPTO API requires an API key for access
Rate Limit: Free tier allows reasonable usage with rate limiting
Max Results: Pagination supported for large result sets
Caching: Strongly recommended to cache responses to reduce server load
Attribution: Acknowledge USPTO when using patent data in publications

Obtaining the Key:

Visit https://developer.uspto.gov/
Register for a free developer account
Request an API key from your account dashboard

API Documentation:

API Catalog: https://developer.uspto.gov/api-catalog
Patent Search API: https://developer.uspto.gov/api-catalog/patent-search-api
Developer Portal: https://developer.uspto.gov/

6.19 SEC EDGAR Provider

config = {
    'secedgar_config': {
        'base_url': 'https://data.sec.gov',
        'user_agent': 'YourCompanyName contact@example.com',  # REQUIRED
        'timeout': 30,
        'rate_limit': 10,    # Requests per second (max allowed by SEC)
        'max_burst': 20,     # Maximum burst size
    }
}

Features:

Get company submissions and filing history by CIK
Access XBRL financial data and concepts
Retrieve company facts across all filings
Search company filings (10-K, 10-Q, 8-K, etc.)
Download actual filing documents (10-K, 10-Q, 8-K full text)
Calculate financial ratios automatically
Get formatted financial statements
Access insider trading data (Form 4)
Access to comprehensive SEC filing database
Full metadata including company info, filing dates, XBRL tags

Supported Operations:

Basic Data Retrieval:

get_company_submissions - Get company filing history and submission data
get_company_concept - Get XBRL concept data for specific financial metrics
get_company_facts - Get all XBRL facts for a company

Filing Document Access:

search_filings - Search for filings by CIK and form type
get_filings_by_type - Get recent filings of a specific form type
get_filing_documents - Get filing document URLs and metadata
get_filing_text - Download full text of filing documents

Financial Analysis:

calculate_financial_ratios - Calculate common financial ratios (P/E, ROE, ROA, etc.)
get_financial_statement - Get formatted financial statements (balance sheet, income statement, cash flow)

Corporate Governance:

get_insider_transactions - Get insider trading transactions (Form 4 filings)

Important Configuration Notes:

No API Key Required: SEC EDGAR API is completely free and open
User-Agent REQUIRED: Must include company/individual name and contact email
- Format: "CompanyName contact@email.com"
- SEC will block access if User-Agent is missing or generic
Rate Limit: Maximum 10 requests per second (enforced by SEC)
CIK Format: Central Index Key must be 10 digits with leading zeros (e.g., “0000320193”)
Caching: Strongly recommended to cache responses to reduce server load
Fair Access: SEC monitors usage and may block non-compliant access
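
The CIK and User-Agent rules above are easy to get wrong, so it can help to check both before issuing requests. A minimal sketch; the helper names `normalize_cik` and `looks_like_sec_user_agent` are illustrative and not part of the tool, and the e-mail check is deliberately loose:

```python
import re


def normalize_cik(cik) -> str:
    """Return the CIK as a 10-digit, zero-padded string."""
    digits = str(cik).strip()
    if not re.fullmatch(r"[0-9]{1,10}", digits):
        raise ValueError(f"Not a valid CIK: {cik!r}")
    return digits.zfill(10)


def looks_like_sec_user_agent(user_agent: str) -> bool:
    """Loose check for the 'CompanyName contact@email.com' shape."""
    parts = user_agent.strip().rsplit(" ", 1)
    if len(parts) != 2 or not parts[0].strip():
        return False
    # Last token must look like an e-mail address.
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", parts[1]) is not None


cik = normalize_cik(320193)
```

Passing the result of `normalize_cik` as the `cik` parameter avoids lookups failing on an unpadded value.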

Example Usage:

# 1. Get Apple Inc. filings (CIK: 0000320193)
result = tool.query(
    provider='secedgar',
    operation='get_company_submissions',
    params={'cik': '0000320193'}
)

# 2. Search for specific form type (10-K annual reports)
result = tool.query(
    provider='secedgar',
    operation='search_filings',
    params={
        'cik': '0000320193',
        'form_type': '10-K',
        'limit': 5
    }
)

# 3. Get Apple's Assets data from XBRL
result = tool.query(
    provider='secedgar',
    operation='get_company_concept',
    params={
        'cik': '0000320193',
        'taxonomy': 'us-gaap',
        'tag': 'Assets'
    }
)

# 4. Calculate financial ratios
result = tool.query(
    provider='secedgar',
    operation='calculate_financial_ratios',
    params={'cik': '0000320193'}
)
# Returns: current_ratio, debt_to_equity, profit_margin, ROA, ROE, etc.

# 5. Get formatted balance sheet
result = tool.query(
    provider='secedgar',
    operation='get_financial_statement',
    params={
        'cik': '0000320193',
        'statement_type': 'balance_sheet',
        'period': 'annual'
    }
)

# 6. Get insider transactions (Form 4)
result = tool.query(
    provider='secedgar',
    operation='get_insider_transactions',
    params={
        'cik': '0000320193',
        'start_date': '2024-01-01'
    }
)

# 7. Download filing document text
result = tool.query(
    provider='secedgar',
    operation='get_filing_text',
    params={
        'cik': '0000320193',
        'accession_number': '0000320193-23-000077'
    }
)

Common CIKs:

Apple Inc.: 0000320193
Tesla Inc.: 0001318605
Microsoft Corp.: 0000789019
Amazon.com Inc.: 0001018724
Alphabet Inc.: 0001652044

Finding CIKs:

Company Search: https://www.sec.gov/edgar/searchedgar/companysearch.html
CIK Lookup Tool: https://www.sec.gov/cgi-bin/browse-edgar

API Documentation:

API Overview: https://www.sec.gov/search-filings/edgar-application-programming-interfaces
Accessing EDGAR Data: https://www.sec.gov/os/accessing-edgar-data
XBRL Data Sets: https://www.sec.gov/dera/data/financial-statement-data-sets.html
Company Submissions: https://data.sec.gov/submissions/
XBRL API: https://data.sec.gov/api/xbrl/

6.20 Stack Exchange Provider

config = {
    'stackexchange_config': {
        'base_url': 'https://api.stackexchange.com/2.3',
        'api_key': 'YOUR_API_KEY',  # Optional but recommended
        'timeout': 30,
        'rate_limit': 10,    # Requests per second
        'max_burst': 20,     # Maximum burst size
    }
}

Features:

Search questions across Stack Exchange network
Get detailed question and answer information
Search for users and their profiles
Browse tags and their statistics
Access all Stack Exchange sites
Rich metadata including votes, views, acceptance status

Supported Operations:

Question Operations:

search_questions - Search for questions by query and tags
get_question - Get detailed information about a specific question
get_answers - Get answers for a specific question

User and Tag Operations:

search_users - Search for users by name
get_tags - Get tags and their statistics
get_sites - Get all sites in the Stack Exchange network

Important Notes:

API Key Optional: Works without key but has much lower rate limits (300 vs 10,000 requests/day)
Compression: API returns gzip compressed responses by default
Backoff: Respect the backoff field in responses when present
Attribution: Required when displaying Stack Exchange content
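
Honoring the backoff field can be sketched as below. This assumes the raw Stack Exchange response wrapper (a dict that may carry an integer `backoff` number of seconds) is available to the caller; whether and how the tool exposes it is not covered here, and `wait_for_backoff` is an illustrative name:

```python
import time


def wait_for_backoff(response: dict, sleep=time.sleep) -> int:
    """Sleep for the number of seconds in the response's 'backoff' field.

    Returns the number of seconds waited (0 when the field is absent).
    The sleep function is injectable so the helper can be tested quickly.
    """
    seconds = int(response.get("backoff") or 0)
    if seconds > 0:
        sleep(seconds)
    return seconds
```

Calling this after every response, before sending the next request to the same method, keeps the client within the throttling rules.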

Example Usage:

# 1. Search for Python questions on Stack Overflow
result = tool.query(
    provider='stackexchange',
    operation='search_questions',
    params={
        'site': 'stackoverflow',
        'q': 'python async',
        'tagged': 'python',
        'sort': 'votes',
        'pagesize': 10
    }
)

# 2. Get a specific question by ID
result = tool.query(
    provider='stackexchange',
    operation='get_question',
    params={
        'question_id': 11227809,
        'site': 'stackoverflow'
    }
)

# 3. Get answers for a question
result = tool.query(
    provider='stackexchange',
    operation='get_answers',
    params={
        'question_id': 11227809,
        'site': 'stackoverflow',
        'sort': 'votes',
        'pagesize': 5
    }
)

# 4. Search for users
result = tool.query(
    provider='stackexchange',
    operation='search_users',
    params={
        'site': 'stackoverflow',
        'inname': 'Jon Skeet',
        'pagesize': 10
    }
)

# 5. Get popular Python tags
result = tool.query(
    provider='stackexchange',
    operation='get_tags',
    params={
        'site': 'stackoverflow',
        'inname': 'python',
        'sort': 'popular',
        'pagesize': 20
    }
)

# 6. Get all Stack Exchange sites
result = tool.query(
    provider='stackexchange',
    operation='get_sites',
    params={'pagesize': 50}
)

Popular Sites:

Stack Overflow: stackoverflow
Server Fault: serverfault
Super User: superuser
Ask Ubuntu: askubuntu
Mathematics: math
Unix & Linux: unix

API Documentation:

API Documentation: https://api.stackexchange.com/docs
Authentication: https://api.stackexchange.com/docs/authentication
Throttling: https://api.stackexchange.com/docs/throttle
Register App: https://stackapps.com/apps/oauth/register

6.21 Hacker News Provider

config = {
    'hackernews_config': {
        'base_url': 'https://hn.algolia.com/api/v1',
        'timeout': 30,
        'rate_limit': 10,    # Requests per second
        'max_burst': 20,     # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}

Features:

Search Hacker News stories by keywords
Search comments by keywords
Search items sorted by date (most recent first)
Get item details by ID (story, comment, poll, etc.)
Get user information by username
Full metadata including title, author, points, comments, URL
Pagination support for large result sets

Supported Operations:

search_stories - Search for stories by keywords (sorted by relevance)
search_comments - Search for comments by keywords
search_by_date - Search for items sorted by date (most recent first)
get_item - Get item details by ID (story, comment, poll, etc.)
get_user - Get user information by username

Important Configuration Notes:

No API Key Required: Hacker News Algolia API is completely free and open
Rate Limiting: Be respectful - implement reasonable delays between requests
Max Results: Limited to 1000 results per query (pagination available)
User-Agent: Set a descriptive User-Agent header for API etiquette
Caching: Strongly recommended to cache responses to reduce server load

Example Usage:

# 1. Search for Python-related stories
result = tool.query(
    provider='hackernews',
    operation='search_stories',
    params={
        'query': 'python',
        'hits_per_page': 20
    }
)

# 2. Search for stories with minimum comments
result = tool.query(
    provider='hackernews',
    operation='search_stories',
    params={
        'query': 'AI',
        'num_comments': 50,  # Minimum 50 comments
        'hits_per_page': 10
    }
)

# 3. Search comments about machine learning
result = tool.query(
    provider='hackernews',
    operation='search_comments',
    params={
        'query': 'machine learning',
        'hits_per_page': 20
    }
)

# 4. Get recent AI stories sorted by date
result = tool.query(
    provider='hackernews',
    operation='search_by_date',
    params={
        'query': 'AI',
        'tags': 'story',
        'hits_per_page': 20
    }
)

# 5. Get specific item details
result = tool.query(
    provider='hackernews',
    operation='get_item',
    params={'item_id': 1}  # The first HN story ever posted
)

# 6. Get user information
result = tool.query(
    provider='hackernews',
    operation='get_user',
    params={'username': 'pg'}  # Paul Graham
)

Common Tags:

story - Filter for stories only
comment - Filter for comments only
poll - Filter for polls only
author_pg - Filter by author (e.g., Paul Graham)
Combine tags: story,author_pg - Stories by Paul Graham
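
The tag values above can be assembled programmatically. A small sketch (the helper name `build_tags` is illustrative, not part of the tool) that produces the comma-separated form shown in the list:

```python
from typing import Optional

ITEM_TYPES = {"story", "comment", "poll"}


def build_tags(item_type: Optional[str] = None, author: Optional[str] = None) -> str:
    """Build a comma-separated tags value such as 'story,author_pg'."""
    parts = []
    if item_type:
        if item_type not in ITEM_TYPES:
            raise ValueError(f"Unknown item type: {item_type!r}")
        parts.append(item_type)
    if author:
        # Author filters use the 'author_<username>' form.
        parts.append(f"author_{author}")
    return ",".join(parts)


tags = build_tags("story", "pg")
```

The resulting string can be passed as the `tags` parameter of `search_by_date`, as in example 4 above.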

Obtaining Access:

No API key required - completely free and open access

API Documentation:

API Documentation: https://hn.algolia.com/api
Hacker News Official: https://news.ycombinator.com/
Search Interface: https://hn.algolia.com/

6.22 OpenCorporates Provider

config = {
    'opencorporates_api_key': 'YOUR_OPENCORPORATES_API_KEY',  # Required
    'opencorporates_config': {
        'base_url': 'https://api.opencorporates.com/v0.4',
        'timeout': 30,
        'rate_limit': 10,    # Requests per second
        'max_burst': 20,     # Maximum burst size
    }
}

Features:

Search for companies by name across 140+ jurisdictions worldwide
Get detailed company information by jurisdiction code and company number
Search for company officers (directors, agents, secretaries)
Get officer details and their company affiliations
Access company filings and statutory documents
Get jurisdiction information and codes
Access to 200+ million companies from official registers
Full metadata including company status, address, incorporation date, officers

Supported Operations:

Company Operations:

search_companies - Search for companies by name or other criteria
get_company - Get detailed information about a specific company by jurisdiction and company number
get_company_filings - Get statutory filings for a specific company

Officer Operations:

search_officers - Search for company officers (directors, agents) by name
get_officer - Get detailed information about a specific officer by ID

Jurisdiction Operations:

list_jurisdictions - Get list of all available jurisdictions

Important Configuration Notes:

API Key Required: OpenCorporates API requires an API key for all requests
Rate Limits: Free tier allows 200 requests/month, 50 requests/day
Open Data: Free for open data projects with share-alike attribution
Paid Plans: Available for commercial use without share-alike restrictions
Jurisdiction Codes: Use standard codes like ‘us_ca’ (California), ‘gb’ (UK), ‘de’ (Germany)
Caching: Strongly recommended to cache responses to reduce API usage
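
With only 50 requests per day and 200 per month on the free tier, it can be worth tracking usage on the client side before a request is sent. A minimal in-memory sketch; `QuotaBudget` is an illustrative class, and counting by calendar month is an assumption, since how OpenCorporates defines the monthly window is not stated here:

```python
from datetime import date


class QuotaBudget:
    """Track requests against a daily and a monthly allowance."""

    def __init__(self, per_day: int = 50, per_month: int = 200):
        self.per_day = per_day
        self.per_month = per_month
        self._day_counts = {}  # date -> number of requests made that day

    def try_consume(self, today: date) -> bool:
        """Record one request if both allowances permit it."""
        used_today = self._day_counts.get(today, 0)
        # Assumption: the monthly allowance follows the calendar month.
        used_month = sum(
            count
            for day, count in self._day_counts.items()
            if (day.year, day.month) == (today.year, today.month)
        )
        if used_today >= self.per_day or used_month >= self.per_month:
            return False
        self._day_counts[today] = used_today + 1
        return True
```

A caller would check `try_consume(date.today())` before each query and serve cached data when it returns False.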

Example Usage:

# 1. Search for companies by name
result = tool.query(
    provider='opencorporates',
    operation='search_companies',
    params={
        'q': 'Apple Inc',
        'jurisdiction_code': 'us_ca',  # Optional: filter by jurisdiction
        'per_page': 10
    }
)

# 2. Get specific company details
result = tool.query(
    provider='opencorporates',
    operation='get_company',
    params={
        'jurisdiction_code': 'us_ca',
        'company_number': 'C0806592'  # Apple Inc.
    }
)

# 3. Search for officers
result = tool.query(
    provider='opencorporates',
    operation='search_officers',
    params={
        'q': 'John Smith',
        'jurisdiction_code': 'gb',  # Optional: filter by jurisdiction
        'per_page': 10
    }
)

# 4. Get company filings
result = tool.query(
    provider='opencorporates',
    operation='get_company_filings',
    params={
        'jurisdiction_code': 'us_ca',
        'company_number': 'C0806592',
        'per_page': 20
    }
)

# 5. List all jurisdictions
result = tool.query(
    provider='opencorporates',
    operation='list_jurisdictions',
    params={}
)

Common Jurisdiction Codes:

United States (California): us_ca
United States (Delaware): us_de
United Kingdom: gb
Germany: de
France: fr
Canada (Ontario): ca_on
Australia: au

Example Companies:

Apple Inc. (US-CA): jurisdiction_code='us_ca', company_number='C0806592'
Google LLC (US-DE): jurisdiction_code='us_de', company_number='5908224'
Microsoft Corporation (US-WA): jurisdiction_code='us_wa', company_number='600413485'

Obtaining the Key:

Visit https://opencorporates.com/api_accounts/new
Register for a free account
Choose your plan (free for open data projects)
Get your API key from the dashboard

API Documentation:

API Reference: https://api.opencorporates.com/documentation/API-Reference
API Accounts: https://opencorporates.com/api_accounts/new
About OpenCorporates: https://opencorporates.com/info/about
Open Data Licence: https://api.opencorporates.com/documentation/Open-Data-Licence

6.23 GDELT Project Provider

config = {
    'gdelt_config': {
        'doc_base_url': 'https://api.gdeltproject.org/api/v2/doc/doc',
        'geo_base_url': 'https://api.gdeltproject.org/api/v2/geo/geo',
        'timeout': 30,
        'rate_limit': 10,    # Requests per second
        'max_burst': 20,     # Maximum burst size
    }
}

Features:

Search global news articles across 100+ languages
Timeline analysis of news coverage volume and tone
Geographic mapping of news coverage
Image search with visual recognition
Theme-based search using Global Knowledge Graph
Emotional tone analysis of news coverage
Source country analysis
Real-time updates every 15 minutes

Supported Operations:

Article Search Operations:

search_articles - Search global news articles with advanced filtering
get_article_list - Get detailed list of articles with full metadata
search_by_theme - Search using GDELT’s Global Knowledge Graph themes

Timeline Operations:

get_timeline - Get timeline of news coverage volume
get_timeline_volume - Get volume timeline with raw counts or percentages
get_timeline_tone - Get timeline showing average emotional tone over time
get_timeline_lang - Get timeline broken down by language
get_timeline_source_country - Get timeline broken down by source country

Analysis Operations:

get_tone_chart - Analyze emotional tone distribution of coverage
get_top_themes - Get top themes and topics from matching articles

Geographic Operations:

get_geo_map - Get geographic map of locations mentioned in news
get_source_country_map - Map which countries are reporting on a topic

Image Operations:

search_images - Search news images using visual recognition

Important Configuration Notes:

No API Key Required: GDELT Project API is completely free and open
Rate Limiting: Be respectful - implement reasonable delays between requests
Data Coverage: Monitors news in 100+ languages from around the world
Real-time Updates: Data updated every 15 minutes
Attribution: Acknowledge GDELT Project when using the data
Fair Use: Do not abuse the free service with excessive requests
Caching: Strongly recommended to cache responses to reduce server load

Example Usage:

# 1. Search for climate change articles
result = tool.query(
    provider='gdelt',
    operation='search_articles',
    params={
        'query': 'climate change',
        'timespan': '7d',
        'max_records': 50,
        'source_lang': 'english'
    }
)

# 2. Get timeline of AI coverage
result = tool.query(
    provider='gdelt',
    operation='get_timeline',
    params={
        'query': 'artificial intelligence',
        'timespan': '30d',
        'mode': 'timelinevol'
    }
)

# 3. Analyze tone of election coverage
result = tool.query(
    provider='gdelt',
    operation='get_tone_chart',
    params={
        'query': 'election',
        'timespan': '7d'
    }
)

# 4. Search for protest images
result = tool.query(
    provider='gdelt',
    operation='search_images',
    params={
        'query': 'protest',
        'timespan': '7d',
        'image_tag': 'protest',
        'max_records': 20
    }
)

# 5. Get geographic map of earthquake coverage
result = tool.query(
    provider='gdelt',
    operation='get_geo_map',
    params={
        'query': 'earthquake',
        'mode': 'country',
        'timespan': '24h'
    }
)

# 6. Search by theme (Global Knowledge Graph)
result = tool.query(
    provider='gdelt',
    operation='search_by_theme',
    params={
        'theme': 'ENV_CLIMATECHANGE',
        'timespan': '7d',
        'max_records': 50
    }
)

# 7. Get source country map
result = tool.query(
    provider='gdelt',
    operation='get_source_country_map',
    params={
        'query': 'technology',
        'timespan': '24h'
    }
)

# 8. Get timeline with tone analysis
result = tool.query(
    provider='gdelt',
    operation='get_timeline_tone',
    params={
        'query': 'economy',
        'timespan': '30d',
        'smoothing': 5
    }
)

Common GKG Themes:

ENV_CLIMATECHANGE - Climate change and global warming
TERROR - Terrorism and extremism
HEALTH - Health and medical topics
ECON_INFLATION - Economic inflation
ECON_STOCKMARKET - Stock market and finance
TAX_FNCACT_STUDENT - Student finance and education
WB_* - World Bank indicators (e.g., WB_1987_POVERTY_HEADCOUNT)

Timespan Formats:

Hours: 1h, 6h, 12h, 24h
Days: 1d, 3d, 7d
Weeks: 1week, 2weeks
Months: 1month, 3months, 6months

Query Operators:

Phrase search: "exact phrase"
Boolean AND: term1 term2 or term1 AND term2
Boolean OR: term1 OR term2
Boolean NOT: -term or NOT term
Grouping: (term1 OR term2) AND term3
Theme search: theme:TERROR
Domain filter: domain:nytimes.com
Source language: sourcelang:english
Source country: sourcecountry:us
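
The operators above can be combined into a single query string. A small sketch of one way to assemble it; `build_gdelt_query` and its argument names are illustrative and not part of the tool:

```python
from typing import Iterable, Optional


def build_gdelt_query(
    phrase: Optional[str] = None,
    any_of: Iterable[str] = (),
    exclude: Iterable[str] = (),
    domain: Optional[str] = None,
    source_lang: Optional[str] = None,
    source_country: Optional[str] = None,
) -> str:
    """Combine the listed operators into one space-separated query."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')  # exact phrase search
    any_terms = list(any_of)
    if len(any_terms) == 1:
        parts.append(any_terms[0])
    elif any_terms:
        # OR alternatives are grouped in parentheses.
        parts.append("(" + " OR ".join(any_terms) + ")")
    parts.extend(f"-{term}" for term in exclude)  # Boolean NOT
    if domain:
        parts.append(f"domain:{domain}")
    if source_lang:
        parts.append(f"sourcelang:{source_lang}")
    if source_country:
        parts.append(f"sourcecountry:{source_country}")
    return " ".join(parts)


query = build_gdelt_query(
    phrase="climate change",
    any_of=["flood", "drought"],
    exclude=["sports"],
    domain="nytimes.com",
    source_lang="english",
)
# query == '"climate change" (flood OR drought) -sports domain:nytimes.com sourcelang:english'
```

The resulting string is what would go in the `query` parameter of operations such as `search_articles`.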

API Documentation:

DOC API 2.0: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/
GEO API 2.0: https://blog.gdeltproject.org/gdelt-geo-2-0-api-debuts/
Global Knowledge Graph: https://blog.gdeltproject.org/announcing-the-global-knowledge-graph/
GDELT Project: https://www.gdeltproject.org/
Query Guide: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/

6.24 DuckDuckGo Zero-Click Info Provider

config = {
    'duckduckgo_config': {
        'base_url': 'https://api.duckduckgo.com/',
        'timeout': 30,
        'rate_limit': 10,    # Requests per second
        'max_burst': 20,     # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
    }
}

tool = APISourceTool(config)

Environment Variables:

export DUCKDUCKGO_TIMEOUT=30
export DUCKDUCKGO_RATE_LIMIT=10
export DUCKDUCKGO_MAX_BURST=20
export DUCKDUCKGO_USER_AGENT="AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)"

Supported Operations:

Instant Answer Operations:

get_instant_answer - Get instant answer for a query with all available data
get_abstract - Get article abstract/summary from Wikipedia and other sources
get_definition - Get definition for a term
get_related_topics - Get related topics and disambiguation
get_infobox - Get structured infobox data for an entity

Important Configuration Notes:

No API Key Required: DuckDuckGo Instant Answer API is completely free and open
Rate Limiting: Be respectful - implement reasonable delays between requests
Caching: Strongly recommended to cache responses to reduce server load
User-Agent: Set a descriptive User-Agent header for API etiquette
No Scraping: This is an Instant Answer API, not a full search results API
Attribution: Consider attributing results to DuckDuckGo when displaying them
Data Sources: Primarily Wikipedia, but also includes other curated sources

Example Usage:

# 1. Get instant answer for a query
result = tool.query(
    provider='duckduckgo',
    operation='get_instant_answer',
    params={
        'query': 'Python programming language',
        'no_html': True
    }
)

# 2. Get abstract for an entity
result = tool.query(
    provider='duckduckgo',
    operation='get_abstract',
    params={'query': 'Albert Einstein'}
)

# 3. Get definition for a term
result = tool.query(
    provider='duckduckgo',
    operation='get_definition',
    params={'query': 'algorithm'}
)

# 4. Get related topics (disambiguation)
result = tool.query(
    provider='duckduckgo',
    operation='get_related_topics',
    params={'query': 'Python'}
)

# 5. Get infobox data for an entity
result = tool.query(
    provider='duckduckgo',
    operation='get_infobox',
    params={'query': 'Steve Jobs'}
)

Response Data Structure:

Instant Answer Response:

{
    'heading': 'Python (programming language)',
    'abstract': 'Python is a high-level, general-purpose programming language...',
    'abstract_source': 'Wikipedia',
    'abstract_url': 'https://en.wikipedia.org/wiki/Python_(programming_language)',
    'answer': '',  # Direct answer if available
    'answer_type': '',
    'definition': '',  # Definition if available
    'image': 'https://duckduckgo.com/i/...',
    'type': 'A',  # Answer type: A=Article, D=Disambiguation, etc.
    'has_infobox': True,
    'has_related_topics': True
}

Related Topics Response:

{
    'heading': 'Python',
    'related_topics': [
        {
            'type': 'topic',
            'text': 'Python (programming language) A high-level...',
            'url': 'https://duckduckgo.com/Python_(programming_language)',
            'icon': '/i/7eec482b.png'
        },
        {
            'type': 'category',
            'name': 'Snakes',
            'topics': [...]
        }
    ],
    'total_topics': 15
}
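
Because `related_topics` mixes plain topics with nested categories, consumers usually want a flat list. A minimal sketch, assuming the entries inside a category's `topics` list have the same shape as top-level topic entries (the sample above elides them); `iter_topics` is an illustrative helper:

```python
from typing import Iterator


def iter_topics(response: dict) -> Iterator[dict]:
    """Yield every topic, tagging each with its category name (or None)."""
    for entry in response.get("related_topics", []):
        if entry.get("type") == "category":
            for topic in entry.get("topics", []):
                yield {**topic, "category": entry.get("name")}
        else:
            yield {**entry, "category": None}


sample = {
    "heading": "Python",
    "related_topics": [
        {"type": "topic", "text": "Python (programming language)", "url": "https://duckduckgo.com/a"},
        {
            "type": "category",
            "name": "Snakes",
            "topics": [{"type": "topic", "text": "Pythonidae", "url": "https://duckduckgo.com/b"}],
        },
    ],
}
flat = list(iter_topics(sample))
```

Here `flat` contains two topic dicts, the second one carrying `'category': 'Snakes'`.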

Use Cases:

Quick facts and information retrieval
Entity disambiguation (e.g., “Python” could be programming language, snake, etc.)
Topic exploration and related content discovery
Knowledge base enrichment
Instant answers for common queries
Structured data extraction from infoboxes

API Documentation:

API Endpoint: https://api.duckduckgo.com/
API Format: https://api.duckduckgo.com/?q=query&format=json
DuckDuckGo: https://duckduckgo.com/
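
For reference, the request shape in "API Format" can be reproduced with the standard library. This sketch only builds the URL and does not go through the tool; `instant_answer_url` is an illustrative helper:

```python
from urllib.parse import urlencode

DUCKDUCKGO_ENDPOINT = "https://api.duckduckgo.com/"


def instant_answer_url(query: str, no_html: bool = True) -> str:
    """Build an Instant Answer request URL that asks for JSON output."""
    params = {"q": query, "format": "json"}
    if no_html:
        params["no_html"] = 1  # ask the API to strip HTML from text fields
    return DUCKDUCKGO_ENDPOINT + "?" + urlencode(params)


url = instant_answer_url("Python programming language")
# url == 'https://api.duckduckgo.com/?q=Python+programming+language&format=json&no_html=1'
```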

7. Environment Variables

7.1 Variable Reference

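Core configuration parameters can be set via environment variables with the APISOURCE_ prefix. Several provider-specific settings listed below (for example SECEDGAR_USER_AGENT, STACKEXCHANGE_API_KEY, and the per-provider timeout and rate-limit variables) use the provider name as the prefix instead: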

# API Keys
export APISOURCE_FRED_API_KEY="your_fred_key"
export APISOURCE_NEWSAPI_API_KEY="your_news_key"
export APISOURCE_CENSUS_API_KEY="your_census_key"
export APISOURCE_CONGRESS_API_KEY="your_congress_key"
export APISOURCE_ALPHAVANTAGE_API_KEY="your_alphavantage_key"
export APISOURCE_EXCHANGERATE_API_KEY="your_exchangerate_key"  # Optional
export APISOURCE_OPENWEATHERMAP_API_KEY="your_openweathermap_key"
export APISOURCE_GITHUB_API_KEY="your_github_token"  # Recommended
export APISOURCE_PUBMED_API_KEY="your_ncbi_api_key"  # Optional but recommended
export CROSSREF_MAILTO="your-email@example.com"  # Optional but recommended for polite pool
export APISOURCE_CORE_API_KEY="your_core_api_key"  # Required
export APISOURCE_USPTO_API_KEY="your_uspto_api_key"  # Required
export SECEDGAR_USER_AGENT="YourCompanyName contact@example.com"  # REQUIRED for SEC EDGAR
export STACKEXCHANGE_API_KEY="your_stackexchange_api_key"  # Optional but recommended
export OPENCORPORATES_API_KEY="your_opencorporates_api_key"  # Required

# Provider-specific Configuration
export SEMANTICSCHOLAR_TIMEOUT=30
export SEMANTICSCHOLAR_RATE_LIMIT=1
export SEMANTICSCHOLAR_MAX_BURST=5
export CORE_TIMEOUT=30
export CORE_RATE_LIMIT=10
export CORE_MAX_BURST=20
export USPTO_TIMEOUT=30
export USPTO_RATE_LIMIT=10
export USPTO_MAX_BURST=20
export SECEDGAR_TIMEOUT=30
export SECEDGAR_RATE_LIMIT=10
export SECEDGAR_MAX_BURST=20
export STACKEXCHANGE_TIMEOUT=30
export STACKEXCHANGE_RATE_LIMIT=10
export STACKEXCHANGE_MAX_BURST=20
export OPENCORPORATES_TIMEOUT=30
export OPENCORPORATES_RATE_LIMIT=10
export OPENCORPORATES_MAX_BURST=20
export HACKERNEWS_TIMEOUT=30
export HACKERNEWS_RATE_LIMIT=10
export HACKERNEWS_MAX_BURST=20
export GDELT_TIMEOUT=30
export GDELT_RATE_LIMIT=10
export GDELT_MAX_BURST=20
export DUCKDUCKGO_TIMEOUT=30
export DUCKDUCKGO_RATE_LIMIT=10
export DUCKDUCKGO_MAX_BURST=20
export DUCKDUCKGO_USER_AGENT="AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)"

# Performance
export APISOURCE_CACHE_TTL="300"
export APISOURCE_DEFAULT_TIMEOUT="30"
export APISOURCE_MAX_RETRIES="3"

# Feature Flags
export APISOURCE_ENABLE_RATE_LIMITING="true"
export APISOURCE_ENABLE_FALLBACK="true"
export APISOURCE_ENABLE_DATA_FUSION="true"
export APISOURCE_ENABLE_QUERY_ENHANCEMENT="true"

# Logging
export APISOURCE_LOG_LEVEL="INFO"
export APISOURCE_METRICS_ENABLED="true"

7.2 Loading from .env File

# .env file
APISOURCE_FRED_API_KEY=your_fred_key
APISOURCE_NEWSAPI_API_KEY=your_news_key
APISOURCE_CACHE_TTL=300
APISOURCE_ENABLE_FALLBACK=true

# Load with python-dotenv
from dotenv import load_dotenv

from aiecs.tools.apisource import APISourceTool

load_dotenv()

# Tool automatically picks up environment variables
tool = APISourceTool()

8. Configuration Examples

8.1 Development Configuration

{
    "fred_api_key": "YOUR_FRED_KEY",
    "newsapi_api_key": "YOUR_NEWS_KEY",
    "cache_ttl": 60,
    "default_timeout": 30,
    "max_retries": 1,
    "enable_rate_limiting": false,
    "enable_fallback": true,
    "enable_data_fusion": true,
    "enable_query_enhancement": true,
    "log_level": "DEBUG",
    "metrics_enabled": true
}

8.2 Production Configuration

{
    "fred_api_key": "${FRED_API_KEY}",
    "newsapi_api_key": "${NEWSAPI_API_KEY}",
    "census_api_key": "${CENSUS_API_KEY}",
    "congress_api_key": "${CONGRESS_API_KEY}",
    "cache_ttl": 600,
    "default_timeout": 30,
    "max_retries": 5,
    "enable_rate_limiting": true,
    "enable_fallback": true,
    "enable_data_fusion": true,
    "enable_query_enhancement": true,
    "enable_intelligent_cache": true,
    "log_level": "INFO",
    "metrics_enabled": true,
    "cache_backend": "redis",
    "redis_url": "redis://redis:6379/0"
}
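
JSON itself has no variable expansion, so the `${...}` placeholders above must be resolved by whatever loads the file. Whether APISourceTool does this on its own is not stated here; the following sketch resolves them at load time and fails loudly when a variable is missing (`expand_placeholders` is an illustrative helper):

```python
import json
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value, environ=os.environ):
    """Recursively replace ${NAME} in strings with values from environ."""
    if isinstance(value, str):
        def replace(match):
            name = match.group(1)
            if name not in environ:
                raise KeyError(f"Environment variable {name} is not set")
            return environ[name]
        return _PLACEHOLDER.sub(replace, value)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, environ) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, environ) for item in value]
    return value  # numbers, booleans and None pass through unchanged


raw = json.loads('{"fred_api_key": "${FRED_API_KEY}", "cache_ttl": 600}')
config = expand_placeholders(raw, environ={"FRED_API_KEY": "abc123"})
```

The expanded dict can then be passed to `APISourceTool(config)` as in section 1.1.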

8.3 High-Volume Configuration

{
    "fred_api_key": "${FRED_API_KEY}",
    "cache_ttl": 3600,
    "default_timeout": 15,
    "max_retries": 3,
    "enable_rate_limiting": true,
    "enable_fallback": true,
    "enable_data_fusion": false,
    "enable_query_enhancement": false,
    "enable_intelligent_cache": true,
    "log_level": "WARNING",
    "metrics_enabled": true,
    "rate_limit_config": {
        "fred": {
            "tokens_per_second": 1.5,
            "max_tokens": 5
        }
    }
}
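
The names `tokens_per_second` and `max_tokens` suggest a token-bucket limiter: tokens refill at a steady rate up to a cap, and each request spends one. The sketch below shows that technique under this assumption; it is not the tool's actual implementation:

```python
import time


class TokenBucket:
    """Token bucket: refill at a fixed rate, allow bursts up to max_tokens."""

    def __init__(self, tokens_per_second: float, max_tokens: int, clock=time.monotonic):
        self.rate = tokens_per_second
        self.capacity = max_tokens
        self._clock = clock  # injectable so behavior can be tested without waiting
        self._tokens = float(max_tokens)  # start with a full bucket
        self._last = clock()

    def try_acquire(self) -> bool:
        """Spend one token if available; return False when the bucket is empty."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


bucket = TokenBucket(tokens_per_second=1.5, max_tokens=5)
```

With the values from the configuration above, up to 5 requests can go out back to back, after which throughput settles at roughly 1.5 requests per second.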

8.4 Minimal Configuration

{
    "fred_api_key": "YOUR_FRED_KEY"
}

All other parameters use defaults.

9. Validation and Testing

9.1 Configuration Validation

from aiecs.tools.apisource.tool import Config

# Validate configuration
try:
    config = Config(
        fred_api_key='YOUR_KEY',
        cache_ttl=300,
        max_retries=3
    )
    print("Configuration valid!")
except ValueError as e:
    print(f"Configuration error: {e}")

9.2 Testing Configuration

from aiecs.tools.apisource import APISourceTool

# Create tool with configuration (`config` is the object validated in 9.1)
tool = APISourceTool(config)

# Test provider connectivity
providers = tool.list_providers()
for provider in providers:
    print(f"Provider: {provider['name']}")
    print(f"Health: {provider['health']['status']}")
    print(f"Score: {provider['health']['score']}\n")

# Test a simple query
try:
    result = tool.query(
        provider='fred',
        operation='get_series_info',
        params={'series_id': 'GDP'}
    )
    print("Configuration working correctly!")
except Exception as e:
    print(f"Configuration issue: {e}")

9.3 Configuration Best Practices

Use Environment Variables for Secrets:

import os
config = {
    'fred_api_key': os.getenv('FRED_API_KEY'),
    'newsapi_api_key': os.getenv('NEWSAPI_API_KEY')
}

Validate Before Deployment:

def validate_config(config):
    required_keys = ['fred_api_key']
    for key in required_keys:
        if not config.get(key):
            raise ValueError(f"Missing required config: {key}")
    return True

Use Different Configs for Different Environments:

import json
import os

env = os.getenv('ENVIRONMENT', 'development')
config_file = f'config.{env}.json'

with open(config_file) as f:
    config = json.load(f)

Monitor Configuration Impact:

# Check metrics after configuration changes
metrics = tool.get_metrics()
print(f"Success rate: {metrics['overall']['success_rate']}")
print(f"Avg response time: {metrics['overall']['avg_response_time']}")

Document Version: 2.0
Last Updated: 2025-10-18
Maintainer: AIECS Tools Team