> **Deprecated**: removed in AIECS 2.0

# APISource Tool - Configuration Reference

## Table of Contents

1. [Configuration Overview](#configuration-overview)
2. [Configuration Parameters](#configuration-parameters)
3. [API Credentials](#api-credentials)
4. [Performance Settings](#performance-settings)
5. [Feature Flags](#feature-flags)
6. [Provider-Specific Configuration](#provider-specific-configuration)
7. [Environment Variables](#environment-variables)
8. [Configuration Examples](#configuration-examples)
9. [Validation and Testing](#validation-and-testing)

---

## 1. Configuration Overview

### 1.1 Configuration Methods

The APISource Tool supports multiple configuration methods:

1. **Dictionary Configuration**:

   ```python
   from aiecs.tools.apisource import APISourceTool

   config = {
       'fred_api_key': 'YOUR_KEY',
       'cache_ttl': 300,
       'enable_fallback': True
   }
   tool = APISourceTool(config)
   ```

2. **Environment Variables**:

   ```python
   import os

   os.environ['APISOURCE_FRED_API_KEY'] = 'YOUR_KEY'
   os.environ['APISOURCE_CACHE_TTL'] = '300'

   tool = APISourceTool()  # Auto-loads from environment
   ```

3. **Configuration File**:

   ```python
   import json

   with open('apisource_config.json') as f:
       config = json.load(f)

   tool = APISourceTool(config)
   ```

4. **Pydantic Model**:

   ```python
   from aiecs.tools.apisource.tool import Config

   config = Config(
       fred_api_key='YOUR_KEY',
       cache_ttl=300,
       enable_fallback=True
   )
   tool = APISourceTool(config)
   ```

### 1.2 Configuration Priority

When multiple configuration sources are present, the priority is:

1. **Explicit parameters** (highest priority)
2. **Configuration dictionary/object**
3. **Environment variables**
4. **Default values** (lowest priority)

---

## 2. Configuration Parameters

### 2.1 Complete Parameter Reference

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `fred_api_key` | str | None | FRED API key |
| `newsapi_api_key` | str | None | News API key |
| `guardian_api_key` | str | None | The Guardian API key |
| `census_api_key` | str | None | Census Bureau API key |
| `congress_api_key` | str | None | Congress.gov API key |
| `cache_ttl` | int | 300 | Cache TTL in seconds |
| `default_timeout` | int | 30 | Request timeout in seconds |
| `max_retries` | int | 3 | Maximum retry attempts |
| `enable_rate_limiting` | bool | True | Enable rate limiting |
| `enable_fallback` | bool | True | Enable provider fallback |
| `enable_data_fusion` | bool | True | Enable data fusion |
| `enable_query_enhancement` | bool | True | Enable query enhancement |
| `enable_intelligent_cache` | bool | True | Enable intelligent caching |
| `log_level` | str | 'INFO' | Logging level |
| `metrics_enabled` | bool | True | Enable metrics collection |

### 2.2 Parameter Details

#### cache_ttl

- **Type**: Integer
- **Default**: 300 (5 minutes)
- **Range**: 0-86400 (0 = no cache, 86400 = 24 hours)
- **Description**: Time-to-live for cached results in seconds
- **Recommendation**:
  - Development: 60-300 seconds
  - Production: 300-3600 seconds
  - High-frequency data: 60-300 seconds
  - Static data: 3600-86400 seconds

#### default_timeout

- **Type**: Integer
- **Default**: 30 seconds
- **Range**: 5-300 seconds
- **Description**: Maximum time to wait for API response
- **Recommendation**:
  - Fast APIs (FRED, News): 10-30 seconds
  - Slow APIs (World Bank): 30-60 seconds
  - Batch operations: 60-120 seconds

#### max_retries

- **Type**: Integer
- **Default**: 3
- **Range**: 0-10
- **Description**: Maximum number of retry attempts for failed requests
- **Recommendation**:
  - Production: 3-5 retries
  - Development: 1-2 retries
  - Critical operations: 5-10 retries

#### enable_rate_limiting

- **Type**: Boolean
- **Default**: True
- **Description**: Enable automatic rate limiting to prevent API quota exhaustion
- **Recommendation**: Always True in production

#### enable_fallback

- **Type**: Boolean
- **Default**: True
- **Description**: Enable automatic failover to alternative providers
- **Recommendation**: True for high-availability applications

#### enable_data_fusion

- **Type**: Boolean
- **Default**: True
- **Description**: Enable intelligent merging of multi-provider results
- **Recommendation**: True for search operations

#### enable_query_enhancement

- **Type**: Boolean
- **Default**: True
- **Description**: Enable automatic parameter completion from query text
- **Recommendation**: True for AI agent integration

#### enable_intelligent_cache

- **Type**: Boolean
- **Default**: True
- **Description**: Enable intent-aware cache TTL strategies
- **Recommendation**: True for optimal performance

---

## 3. API Credentials

### 3.1 FRED API Key

**Obtaining the Key**:

1. Visit https://fred.stlouisfed.org/docs/api/api_key.html
2. Register for a free account
3. Request an API key

**Configuration**:

Method 1: Direct configuration:

```python
tool = APISourceTool({'fred_api_key': 'YOUR_FRED_KEY'})
```

Method 2: Environment variable:

```bash
export APISOURCE_FRED_API_KEY="YOUR_FRED_KEY"
```

Method 3: Configuration file:

```json
{
  "fred_api_key": "YOUR_FRED_KEY"
}
```

**Rate Limits**:

- Free tier: 120 requests per minute
- No daily limit

### 3.2 News API Key

**Obtaining the Key**:

1. Visit https://newsapi.org/register
2. Choose a plan (Free tier available)
3. Get your API key

**Configuration**:

```python
tool = APISourceTool({'newsapi_api_key': 'YOUR_NEWS_KEY'})
```

**Rate Limits**:

- Free tier: 100 requests per day
- Developer tier: 250 requests per day
- Business tier: 250,000 requests per day

### 3.3 The Guardian API Key

**Obtaining the Key**:

1. Visit https://open-platform.theguardian.com/access/
2. Register for a free account
3. Request an API key

**Configuration**:

Method 1: Direct configuration:

```python
tool = APISourceTool({'guardian_api_key': 'YOUR_GUARDIAN_KEY'})
```

Method 2: Environment variable:

```bash
export GUARDIAN_API_KEY="YOUR_GUARDIAN_KEY"
```

Method 3: Configuration file:

```json
{
  "guardian_api_key": "YOUR_GUARDIAN_KEY"
}
```

**Rate Limits**:

- Free tier: 5,000 requests per day
- Developer tier: 15,000 requests per day
- Higher tiers available for commercial use

**Important API Rules**:

1. **API Key Required**: All API requests require an API key
2. **Rate Limiting**: Free tier allows 5,000 requests per day
3. **Attribution**: Must acknowledge The Guardian when displaying content
4. **Data Freshness**: Content is updated in real-time
5. **Commercial Use**: Contact The Guardian for commercial licensing

**API Documentation**:

- API Documentation: https://open-platform.theguardian.com/documentation/
- Content API: https://open-platform.theguardian.com/documentation/search
- Tags API: https://open-platform.theguardian.com/documentation/tag
- Sections API: https://open-platform.theguardian.com/documentation/section

**Available Operations**:

- `search_content`: Search all Guardian content with advanced filtering
- `get_item`: Get a specific content item by ID
- `get_tags`: Get all tags or filter by type
- `search_tags`: Search for tags by query
- `get_sections`: Get all Guardian sections
- `get_edition`: Get content for a specific edition (UK, US, AU, International)

**Example Usage**:

```python
# Search for articles about climate change
result = tool.query(
    provider='guardian',
    operation='search_content',
    params={
        'q': 'climate change',
        'section': 'environment',
        'page_size': 10,
        'show_fields': 'headline,body,thumbnail'
    }
)

# Get all sections
result = tool.query(
    provider='guardian',
    operation='get_sections',
    params={}
)

# Search for tags
result = tool.query(
    provider='guardian',
    operation='search_tags',
    params={'q': 'technology', 'page_size': 10}
)

# Get US edition content
result = tool.query(
    provider='guardian',
    operation='get_edition',
    params={'edition': 'us', 'page_size': 20}
)
```

### 3.4 Census Bureau API Key

**Obtaining the Key**:

1. Visit https://api.census.gov/data/key_signup.html
2. Fill out the request form
3. Receive key via email

**Configuration**:

```python
tool = APISourceTool({'census_api_key': 'YOUR_CENSUS_KEY'})
```

**Rate Limits**:

- 500 requests per IP per day (without key)
- Higher limits with API key

### 3.4 Congress.gov API

**API Key Required**:

1. Visit https://api.congress.gov/sign-up/
2. Fill out the registration form
3. Receive key via email

**Configuration**:

```python
tool = APISourceTool({'congress_api_key': 'YOUR_CONGRESS_KEY'})
```

**Rate Limits**:

- Reasonable usage limits with API key
- Data updated regularly from official sources

**Available Operations**:

- `search_bills`: Search for bills and resolutions
- `get_bill`: Get detailed bill information
- `list_members`: List members of Congress
- `get_member`: Get member details
- `list_committees`: List congressional committees
- `get_committee`: Get committee details
- `search_amendments`: Search for amendments
- `get_amendment`: Get amendment details

### 3.5 OpenStates API

**API Key Required**:

```python
config = {
    'openstates_api_key': 'YOUR_API_KEY'
}
tool = APISourceTool(config)
```

**Configuration**:

```python
config = {
    'openstates_api_key': 'YOUR_API_KEY',
    'openstates_config': {
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)
```

**Environment Variables**:

```bash
export OPENSTATES_API_KEY="your_api_key_here"
export OPENSTATES_TIMEOUT=30
export OPENSTATES_RATE_LIMIT=10
export OPENSTATES_MAX_BURST=20
```

**Obtaining an API Key**:

1. Visit https://openstates.org/accounts/profile/
2. Register for a free account
3. Generate an API key from your profile
4. Copy the API key to your configuration

**Rate Limits**:

- Free tier: Reasonable usage limits
- Be respectful of the free service
- Recommended: Max 10 requests per second

**Important API Rules**:

1. **API Key Required**: Must register for a free API key
2. **Rate Limiting**: Be respectful - implement reasonable delays between requests
3. **Attribution**: Acknowledge OpenStates.org when using the data
4. **Data Freshness**: Data is updated regularly from official state sources

**API Documentation**:

- API v3 Documentation: https://docs.openstates.org/api-v3/
- Interactive API Docs: https://v3.openstates.org/docs/
- About OpenStates: https://openstates.org/about/

**Available Operations**:

- `search_bills`: Search for state bills and resolutions with advanced filtering
- `get_bill`: Get detailed information about a specific bill by ID
- `search_people`: Search for state legislators with filtering options
- `get_person`: Get detailed information about a specific legislator
- `list_jurisdictions`: List all available state jurisdictions
- `get_jurisdiction`: Get detailed information about a specific jurisdiction

**Example Usage**:

```python
# Search for bills in California
result = tool.query(
    provider='openstates',
    operation='search_bills',
    params={'jurisdiction': 'CA', 'session': '2023', 'per_page': 10}
)

# Get current legislators from Texas
result = tool.query(
    provider='openstates',
    operation='search_people',
    params={'jurisdiction': 'TX', 'current': True, 'per_page': 10}
)

# List all available jurisdictions
result = tool.query(
    provider='openstates',
    operation='list_jurisdictions',
    params={'per_page': 52}
)
```

### 3.6 World Bank API

**No API Key Required**:

```python
# World Bank API is publicly accessible
tool = APISourceTool()  # No key needed for World Bank
```

**Rate Limits**:

- No official rate limit
- Recommended: Max 10 requests per second

### 3.7 Alpha Vantage API Key

**Obtaining the Key**:

1. Visit https://www.alphavantage.co/support/#api-key
2. Register for a free account
3. Get your API key

**Configuration**:

```python
tool = APISourceTool({'alphavantage_api_key': 'YOUR_ALPHAVANTAGE_KEY'})
```

**Rate Limits**:

- Free tier: 5 API requests per minute, 500 per day
- Premium tiers available with higher limits

### 3.8 REST Countries API

**No API Key Required**:

```python
# REST Countries API is publicly accessible
tool = APISourceTool()  # No key needed for REST Countries
```

**Rate Limits**:

- No official rate limit
- Recommended: Max 10 requests per second

### 3.9 ExchangeRate-API

**No API Key Required (Free Tier)**:

```python
# ExchangeRate-API free tier works without key
tool = APISourceTool()  # No key needed for free tier
```

**Optional API Key for Enhanced Features**:

```python
tool = APISourceTool({'exchangerate_api_key': 'YOUR_EXCHANGERATE_KEY'})
```

**Rate Limits**:

- Free tier: 1,500 requests per month
- Standard tier: Higher limits with API key

### 3.10 Open Library API

**No API Key Required**:

```python
# Open Library API is completely free and open
tool = APISourceTool()  # No key needed for Open Library
```

**Rate Limits**:

- No official rate limit
- Recommended: Max 10 requests per second
- Be respectful of the free service

### 3.11 Metropolitan Museum of Art (The Met) API

**No API Key Required**:

```python
# The Met Museum API is completely free and open
tool = APISourceTool()  # No key needed for Met Museum
```

**Configuration (Optional)**:

```python
config = {
    'metmuseum_config': {
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20    # Maximum burst size
    }
}
tool = APISourceTool(config)
```

**Environment Variables**:

```bash
export METMUSEUM_TIMEOUT=30
export METMUSEUM_RATE_LIMIT=10
export METMUSEUM_MAX_BURST=20
```

**Rate Limits**:

- No official rate limit
- Recommended: Max 10 requests per second
- Be respectful of the free service

**Important API Rules**:

1. **No API Key Required**: Completely free and open access
2. **Rate Limiting**: Be respectful - implement reasonable delays between requests
3. **Data Coverage**: Access to 470,000+ artworks from The Met collection
4. **Images**: Many objects include high-resolution images (check isPublicDomain flag)
5. **Attribution**: Acknowledge The Metropolitan Museum of Art when using the data

**API Documentation**:

- API Documentation: https://metmuseum.github.io/
- GitHub Repository: https://github.com/metmuseum/openaccess
- Open Access Initiative: https://www.metmuseum.org/about-the-met/policies-and-documents/open-access

**Supported Operations**:

- `search_objects` - Search for art objects with advanced filtering (query, department, date range, etc.)
- `get_object` - Get detailed information about a specific art object by ID
- `get_departments` - Get list of all departments at The Met
- `get_objects_by_department` - Get all objects in a specific department
- `search_by_artist` - Search for artworks by artist name
- `search_by_medium` - Search for artworks by medium (Paintings, Sculpture, etc.)
- `search_by_culture` - Search for artworks by culture or civilization
- `search_highlight_objects` - Search for highlighted/featured objects
- `download_image` - Download high-resolution images from The Met collection

**Example Usage**:

```python
# Search for artworks by Van Gogh
result = tool.query(
    provider='metmuseum',
    operation='search_by_artist',
    params={'artist_name': 'Vincent van Gogh', 'has_images': True, 'limit': 10}
)

# Get detailed object information
result = tool.query(
    provider='metmuseum',
    operation='get_object',
    params={'object_id': 436535}  # Wheat Field with Cypresses
)

# Search for Egyptian art
result = tool.query(
    provider='metmuseum',
    operation='search_by_culture',
    params={'culture': 'Egyptian', 'has_images': True, 'limit': 20}
)

# Get all departments
result = tool.query(
    provider='metmuseum',
    operation='get_departments',
    params={}
)

# Search with date range filter
result = tool.query(
    provider='metmuseum',
    operation='search_objects',
    params={
        'q': 'impressionism',
        'has_images': True,
        'date_begin': 1860,
        'date_end': 1900,
        'limit': 15
    }
)

# Download image by object ID
result = tool.query(
    provider='metmuseum',
    operation='download_image',
    params={'object_id': 436535}  # Downloads primary image
)
print(f"Image saved to: {result['data']['output_path']}")

# Download image by direct URL
result = tool.query(
    provider='metmuseum',
    operation='download_image',
    params={
        'image_url': 'https://images.metmuseum.org/CRDImages/ep/original/DP-42549-001.jpg',
        'output_path': './artwork.jpg'  # Optional custom path
    }
)
```

**Data Fields Available**:

- Object metadata: title, artist, date, medium, dimensions
- Department and classification information
- Geographic and cultural origin
- High-resolution images (primaryImage, additionalImages)
- Exhibition history and provenance
- Related artworks and references
- Public domain status (isPublicDomain)
- Gallery information (isOnView, GalleryNumber)

### 3.12 CoinGecko API

**No API Key Required**:

```python
# CoinGecko API is free for basic usage
tool = APISourceTool()  # No key needed for free tier
```

**Rate Limits**:

- Free tier: 10-50 calls/minute (varies by endpoint)
- Pro tier available with API key for higher limits

### 3.12 OpenWeatherMap API

**Obtaining the Key**:

1. Visit https://openweathermap.org/api
2. Sign up for a free account
3. Generate an API key from your account dashboard

**Configuration**:

```python
tool = APISourceTool({'openweathermap_api_key': 'YOUR_OPENWEATHERMAP_KEY'})
```

**Rate Limits**:

- Free tier: 60 calls/minute, 1,000,000 calls/month
- Various paid tiers available

### 3.13 Wikipedia API

**No API Key Required**:

```python
# Wikipedia API is completely free and open
tool = APISourceTool()  # No key needed for Wikipedia
```

**Configuration with User-Agent (REQUIRED)**:

```python
config = {
    'wikipedia_config': {
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; iretbl@gmail.com)'
    }
}
tool = APISourceTool(config)
```

**Rate Limits**:

- Maximum: 200 requests per second
- Recommended: 10 requests per second (default in configuration)
- Be respectful of the free service

**API Rules** (https://www.mediawiki.org/wiki/API:Etiquette):

1. **User-Agent Header REQUIRED**: Must include a unique User-Agent header with:
   - Application name and version
   - Contact URL or email address
   - Format: `"AppName/Version (URL; contact@email.com)"`
2. **Rate Limiting**: Limit to 200 requests/second maximum
3. **Caching**: Cache responses when possible to reduce load

**API Documentation**:

- MediaWiki Action API: https://www.mediawiki.org/wiki/API:Main_page
- REST API: https://en.wikipedia.org/api/rest_v1/
- API Etiquette: https://www.mediawiki.org/wiki/API:Etiquette

### 3.14 GitHub API

**API Key Recommended**:

```python
config = {
    'github_api_key': 'YOUR_GITHUB_TOKEN'
}
tool = APISourceTool(config)
```

**Environment Variable**:

```bash
export GITHUB_API_KEY="your_github_personal_access_token"
```

**Rate Limits**:

- **Authenticated**: 5,000 requests per hour
- **Unauthenticated**: 60 requests per hour
- Strongly recommended to use authentication for higher limits

**Obtaining an API Key**:

1. Visit https://github.com/settings/tokens
2. Click "Generate new token" → "Generate new token (classic)"
3. Select scopes based on your needs:
   - `public_repo` - Access public repositories
   - `repo` - Full control of private repositories (if needed)
   - `user` - Read user profile data
4. Generate and copy the token

**API Documentation**:

- REST API: https://docs.github.com/en/rest
- Authentication: https://docs.github.com/en/rest/authentication
- Rate Limiting: https://docs.github.com/en/rest/rate-limit

### 3.13 arXiv API

**No API Key Required**:

```python
# arXiv API is completely free and open
tool = APISourceTool()  # No key needed for arXiv
```

**Configuration (Optional)**:

```python
config = {
    'arxiv_config': {
        'timeout': 30,
        'rate_limit': 0.33,  # ~3 second delays between requests (1/3 req/s)
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; iretbl@gmail.com)'
    }
}
tool = APISourceTool(config)
```

**Important API Rules**:

1. **Rate Limiting**: Be respectful - implement 3 second delays between requests
2. **Max Results**: Limited to 30,000 results in slices of at most 2,000 at a time
3. **Caching**: Cache responses when possible to reduce server load
4. **User-Agent**: Set a descriptive User-Agent header

**API Documentation**:

- API User Manual: https://info.arxiv.org/help/api/user-manual.html
- API Basics: https://info.arxiv.org/help/api/basics.html
- arXiv Categories: https://arxiv.org/category_taxonomy

### 3.14 PubMed/NCBI E-utilities API

**API Key Optional but Recommended**:

```python
# Works without API key (3 requests/second limit)
tool = APISourceTool()

# With API key (10 requests/second limit)
config = {
    'pubmed_api_key': 'YOUR_PUBMED_API_KEY'
}
tool = APISourceTool(config)
```

**Environment Variable**:

```bash
export PUBMED_API_KEY="your_ncbi_api_key"
```

**Configuration (Optional)**:

```python
config = {
    'pubmed_config': {
        'api_key': 'YOUR_API_KEY',  # Optional but recommended
        'timeout': 30,
        'rate_limit': 3,  # 3 req/s without key, 10 with key
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; iretbl@gmail.com)'
    }
}
tool = APISourceTool(config)
```

**Rate Limits**:

- **Without API key**: 3 requests per second
- **With API key**: 10 requests per second
- API key strongly recommended for better service

**Obtaining an API Key**:

1. Visit https://www.ncbi.nlm.nih.gov/account/
2. Register for a free NCBI account
3. Go to Settings → API Key Management
4. Generate a new API key

**Important API Rules**:

1. **Rate Limiting**: Max 3 requests/second without API key, 10 with API key
2. **User-Agent**: Set a descriptive User-Agent header with email
3. **Caching**: Cache responses when possible to reduce server load
4. **API Key**: Recommended for higher rate limits and better service

**API Documentation**:

- E-utilities Quick Start: https://www.ncbi.nlm.nih.gov/books/NBK25500/
- E-utilities API Guide: https://www.ncbi.nlm.nih.gov/books/NBK25501/
- PubMed Help: https://pubmed.ncbi.nlm.nih.gov/help/

**Supported Operations**:

- `search_papers`: Search for papers by query string
- `get_paper_by_id`: Get paper metadata by PubMed ID (PMID)
- `search_by_author`: Search for papers by author name
- `get_paper_details`: Get detailed paper information including abstract and citations

### 3.15 CrossRef API

**No API Key Required**:

```python
# CrossRef API is completely free and open
tool = APISourceTool()  # No key needed for CrossRef
```

**Configuration (Optional)**:

```python
config = {
    'crossref_config': {
        'mailto': 'your-email@example.com',  # For polite pool access (better rate limits)
        'timeout': 30,
        'rate_limit': 10,
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
tool = APISourceTool(config)
```

**Environment Variable**:

```bash
export CROSSREF_MAILTO="your-email@example.com"
```

**Important API Rules**:

1. **Rate Limiting**: Use polite pool (include mailto parameter) for better rate limits
2. **User-Agent**: Set a descriptive User-Agent header
3. **Caching**: Cache responses when possible to reduce server load
4. **Attribution**: Acknowledge CrossRef when using the data

**API Documentation**:

- REST API Documentation: https://www.crossref.org/documentation/retrieve-metadata/rest-api/
- API Etiquette: https://github.com/CrossRef/rest-api-doc#etiquette
- Metadata Plus: https://www.crossref.org/services/metadata-delivery/

**Supported Operations**:

- `get_work_by_doi`: Get metadata for a work by its DOI
- `search_works`: Search for works by query string
- `get_journal_works`: Get works published in a specific journal by ISSN
- `search_funders`: Search for funders in the Open Funder Registry
- `get_funder_works`: Get works associated with a specific funder

### 3.16 Semantic Scholar API

**No API Key Required**:

```python
# Semantic Scholar API is completely free and open
tool = APISourceTool()  # No key needed for Semantic Scholar
```

**Configuration (Optional)**:

```python
config = {
    'semanticscholar_config': {
        'timeout': 30,
        'rate_limit': 1,  # Requests per second (recommended for sustained use)
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
tool = APISourceTool(config)
```

**Environment Variables**:

```bash
export SEMANTICSCHOLAR_TIMEOUT=30
export SEMANTICSCHOLAR_RATE_LIMIT=1
export SEMANTICSCHOLAR_MAX_BURST=5
```

**Rate Limits**:

- Free tier: 1 request per second recommended (100 requests per 5 minutes)
- Higher limits available upon request

**Important API Rules**:

1. **Rate Limiting**: Recommended 1 request per second for sustained use
2. **Max Results**: Limited to 100 results per request for search, use pagination for more
3. **Caching**: Cache responses when possible to reduce server load
4. **User-Agent**: Set a descriptive User-Agent header

**API Documentation**:

- API Documentation: https://api.semanticscholar.org/api-docs/
- Academic Graph API: https://www.semanticscholar.org/product/api
- API Tutorial: https://www.semanticscholar.org/product/api/tutorial

**Supported Operations**:

- `search_papers`: Search for papers by query string
- `get_paper`: Get paper details by ID (S2 ID, DOI, arXiv ID, etc.)
- `get_paper_authors`: Get authors of a specific paper
- `get_paper_citations`: Get papers that cite this paper
- `get_paper_references`: Get papers referenced by this paper
- `get_author`: Get author details by ID
- `get_author_papers`: Get papers by a specific author

### 3.17 CORE API Key

**Obtaining the Key**:

1. Visit https://core.ac.uk/services/api
2. Register for a free account
3. Request an API key from your account dashboard

**Configuration**:

Method 1: Direct configuration:

```python
tool = APISourceTool({'core_api_key': 'YOUR_CORE_KEY'})
```

Method 2: Environment variable:

```bash
export CORE_API_KEY="YOUR_CORE_KEY"
```

Method 3: Configuration file:

```json
{
  "core_api_key": "YOUR_CORE_KEY"
}
```

**Rate Limits**:

- Free tier: Reasonable usage with rate limiting
- Contact CORE for higher limits if needed

**Features**:

- Access to millions of open access research papers
- Search by query, DOI, or title
- Full metadata including authors, abstract, citations
- Support for pagination

### 3.18 USPTO API Key

**Obtaining the Key**:

1. Visit https://developer.uspto.gov/
2. Register for a free developer account
3. Request an API key from your account dashboard

**Configuration**:

Method 1: Direct configuration:

```python
tool = APISourceTool({'uspto_api_key': 'YOUR_USPTO_KEY'})
```

Method 2: Environment variable:

```bash
export USPTO_API_KEY="YOUR_USPTO_KEY"
```

Method 3: Configuration file:

```json
{
  "uspto_api_key": "YOUR_USPTO_KEY"
}
```

**Rate Limits**:

- Free tier: Reasonable usage with rate limiting
- Contact USPTO for higher limits if needed

**Features**:

- Search US patents by query, inventor, or assignee
- Get detailed patent information by patent number
- Access to comprehensive US patent database
- Full metadata including inventors, assignees, classifications, citations

### 3.19 SEC EDGAR API

**No API Key Required**:

```python
# SEC EDGAR API is publicly accessible
# User-Agent header is REQUIRED
config = {
    'secedgar_config': {
        'user_agent': 'YourCompanyName contact@example.com'
    }
}
tool = APISourceTool(config)
```

**Environment Variable**:

```bash
export SECEDGAR_USER_AGENT="YourCompanyName contact@example.com"
```

**Configuration with User-Agent (REQUIRED)**:

```python
config = {
    'secedgar_config': {
        'user_agent': 'AIECS-APISource contact@example.com',
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)
```

**Rate Limits**:

- Maximum: 10 requests per second
- SEC may block access if rules are not followed
- Be respectful of the free service

**API Rules** (https://www.sec.gov/os/accessing-edgar-data):

1. **User-Agent Header REQUIRED**: Must include:
   - Company or individual name
   - Contact email address
   - Format: `"CompanyName contact@email.com"`
2. **Rate Limiting**: Limit to 10 requests per second maximum
3. **Caching**: Cache responses when possible to reduce load
4. **Fair Access**: SEC monitors usage and may block non-compliant access

**API Documentation**:

- API Overview: https://www.sec.gov/search-filings/edgar-application-programming-interfaces
- Accessing EDGAR Data: https://www.sec.gov/os/accessing-edgar-data
- Data Sets: https://www.sec.gov/data-research/sec-markets-data

**Features**:

- Company submissions and filing history
- XBRL financial data and concepts
- Company facts across all filings
- No API key required - completely free

**Supported Operations**:

- `get_company_submissions` - Get company filing history by CIK
- `get_company_concept` - Get XBRL concept data for specific metrics
- `get_company_facts` - Get all XBRL facts for a company

**Example CIKs**:

- Apple Inc.: 0000320193
- Tesla Inc.: 0001318605
- Microsoft Corp.: 0000789019

### 3.20 Stack Exchange API

**API Key Optional (Recommended)**:

```python
# Stack Exchange API works without key but has lower rate limits
# API key strongly recommended for production use
config = {
    'stackexchange_config': {
        'api_key': 'YOUR_STACKEXCHANGE_API_KEY'
    }
}
tool = APISourceTool(config)
```

**Environment Variable**:

```bash
export STACKEXCHANGE_API_KEY="your_api_key_here"
```

**Get Your API Key**:

1. Visit https://stackapps.com/apps/oauth/register
2. Register your application
3. Copy your API key

**Configuration**:

```python
config = {
    'stackexchange_config': {
        'api_key': 'YOUR_API_KEY',  # Optional but recommended
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)
```

**Rate Limits**:

- Without API key: 300 requests per day
- With API key: 10,000 requests per day
- Respect the backoff field in API responses

**API Rules** (https://api.stackexchange.com/docs/throttle):

1. **API Key Recommended**: Increases daily quota from 300 to 10,000 requests
2. **Backoff**: Respect the backoff field in responses when present
3. **Compression**: API returns gzip compressed responses by default
4. **Attribution**: Required when displaying Stack Exchange content
5. **Fair Use**: Follow the API terms of service

**API Documentation**:

- API Documentation: https://api.stackexchange.com/docs
- Authentication: https://api.stackexchange.com/docs/authentication
- Throttling: https://api.stackexchange.com/docs/throttle

**Features**:

- Search questions across Stack Exchange network
- Get detailed question and answer data
- Search for users and their profiles
- Browse tags and their statistics
- Access all Stack Exchange sites (Stack Overflow, Server Fault, Super User, etc.)
- Rich metadata including votes, views, acceptance status, and bounties

**Supported Operations**:

- `search_questions` - Search for questions by query and tags
- `get_question` - Get detailed information about a specific question
- `get_answers` - Get answers for a specific question
- `search_users` - Search for users by name
- `get_tags` - Get tags and their statistics
- `get_sites` - Get all sites in the Stack Exchange network

**Popular Sites**:

- Stack Overflow: `stackoverflow`
- Server Fault: `serverfault`
- Super User: `superuser`
- Ask Ubuntu: `askubuntu`
- Mathematics: `math`

### 3.21 OpenCorporates API

**API Key Required**:

```python
config = {
    'opencorporates_api_key': 'YOUR_OPENCORPORATES_API_KEY'
}
tool = APISourceTool(config)
```

**Environment Variable**:

```bash
export OPENCORPORATES_API_KEY="your_opencorporates_api_key"
```

**Rate Limits**:

- Free tier: 200 requests per month, 50 requests per day
- Open data projects: Free with share-alike attribution
- Paid plans: Available for commercial use without restrictions

**Obtaining an API Key**:

1. Visit https://opencorporates.com/api_accounts/new
2. Register for a free account
3. Choose your plan (free for open data projects)
4. Get your API key from the dashboard

**Features**:

- Search for companies by name across 140+ jurisdictions
- Get detailed company information by jurisdiction and company number
- Search for company officers (directors, agents)
- Access company filings and statutory documents
- Get jurisdiction information
- Access to 200+ million companies worldwide

**API Documentation**:

- API Reference: https://api.opencorporates.com/documentation/API-Reference
- API Accounts: https://opencorporates.com/api_accounts/new
- About OpenCorporates: https://opencorporates.com/info/about

### 3.22 CourtListener (Free Law Project) API

**API Key Required**:

```python
config = {
    'courtlistener_api_key': 'YOUR_COURTLISTENER_API_KEY'
}
tool = APISourceTool(config)
```

**Environment Variable**:

```bash
export COURTLISTENER_API_KEY="your_courtlistener_api_key"
```

**Configuration (Optional)**:

```python
config = {
    'courtlistener_api_key': 'YOUR_API_KEY',
    'courtlistener_config': {
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)
```

**Rate Limits**:

- Free tier: 5,000 requests per hour for authenticated users
- Higher limits available upon request
- Be respectful of the free service

**Obtaining an API Key**:

1. Visit https://www.courtlistener.com/sign-in/register/
2. Register for a free account
3. Go to your profile settings
4. Generate an API key
5. Copy and store the API key securely

**Features**:

- Search legal opinions and case law from federal and state courts
- Access court dockets and case filings (RECAP archive)
- Search judges and judicial information
- Access oral argument audio recordings
- Explore legal citations and citation networks
- Search court information
- Access to millions of legal opinions and PACER data
- Full metadata including case names, judges, courts, dates, citations

**Supported Operations**:

- `search_opinions` - Search for legal opinions and case law with advanced filtering
- `get_opinion` - Get detailed information about a specific legal opinion
- `search_dockets` - Search for court dockets and case filings
- `get_docket` - Get detailed information about a specific docket
- `search_judges` - Search for judges and judicial information
- `get_judge` - Get detailed information about a specific judge
- `search_oral_arguments` - Search for oral argument audio recordings
- `get_oral_argument` - Get detailed information about a specific oral argument
- `search_citations` - Search for legal citations and citation networks
- `get_citation` - Get detailed information about a specific citation
- `search_courts` - Search for court information
- `get_court` - Get detailed information about a specific court

**Important API Rules**:

1. **API Key Required**: Must register for a free API key
2. **Rate Limiting**: Default is 5,000 requests per hour for authenticated users
3. **Attribution**: Acknowledge Free Law Project when using the data
4. **Data Freshness**: Data is updated regularly from court sources and PACER
5. **Fair Use**: Follow the API terms of service

**API Documentation**:

- REST API Documentation: https://www.courtlistener.com/help/api/rest/
- Interactive API Docs: https://www.courtlistener.com/api/rest-info/
- About Free Law Project: https://free.law/
- Coverage Information: https://www.courtlistener.com/coverage/

**Example Usage**:

```python
# Search for opinions about constitutional law
result = tool.query(
    provider='courtlistener',
    operation='search_opinions',
    params={'q': 'first amendment', 'court': 'scotus', 'page_size': 10}
)

# Search for dockets in a specific court
result = tool.query(
    provider='courtlistener',
    operation='search_dockets',
    params={'court': 'dcd', 'docket_number': '20-cv', 'page_size': 5}
)

# Search for judges
result = tool.query(
    provider='courtlistener',
    operation='search_judges',
    params={'name': 'Sotomayor', 'court': 'scotus'}
)

# Search for oral arguments
result = tool.query(
    provider='courtlistener',
    operation='search_oral_arguments',
    params={'court': 'scotus', 'case_name': 'Brown', 'page_size': 5}
)

# Get court information
result = tool.query(
    provider='courtlistener',
    operation='get_court',
    params={'court_id': 'scotus'}
)
```

**Popular Court IDs**:

- Supreme Court: `scotus`
- 9th Circuit Court of Appeals: `ca9`
- 2nd Circuit Court of Appeals: `ca2`
- D.C.
District Court: `dcd` - Southern District of New York: `nysd` - Northern District of California: `cand` ### 3.23 GBIF (Global Biodiversity Information Facility) API **No API Key Required**: ```python # GBIF API is completely free and open tool = APISourceTool() # No key needed for GBIF ``` **Configuration (Optional)**: ```python config = { 'gbif_config': { 'timeout': 30, 'rate_limit': 10, # Requests per second 'max_burst': 20, # Maximum burst size 'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)' } } tool = APISourceTool(config) ``` **Environment Variables**: ```bash export GBIF_TIMEOUT=30 export GBIF_RATE_LIMIT=10 export GBIF_MAX_BURST=20 export GBIF_USER_AGENT="AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)" ``` **Rate Limits**: - No official rate limit - Recommended: Max 10 requests per second - Be respectful of the free service **Important API Rules**: 1. **No API Key Required**: Completely free and open access 2. **Rate Limiting**: Be respectful - implement reasonable delays between requests 3. **Data Coverage**: Access to 2+ billion species occurrence records 4. **Attribution**: Acknowledge GBIF when using the data 5. 
**Fair Use**: Do not abuse the free service with excessive requests **API Documentation**: - API Reference: https://techdocs.gbif.org/en/openapi/ - Species API: https://techdocs.gbif.org/en/openapi/v1/species - Occurrence API: https://techdocs.gbif.org/en/openapi/v1/occurrence - Dataset API: https://techdocs.gbif.org/en/openapi/v1/dataset - About GBIF: https://www.gbif.org/what-is-gbif **Features**: - Search for species by name or taxonomic criteria - Match scientific names to GBIF's taxonomic backbone - Search occurrence records with geographic and temporal filters - Access dataset metadata and publishing information - Get vernacular (common) names in multiple languages - Explore taxonomic hierarchies and relationships - Access to 2+ billion biodiversity occurrence records - Rich metadata including coordinates, dates, basis of record **Supported Operations**: - `search_species` - Search for species by name or other criteria - `get_species_by_key` - Get detailed species information by GBIF key - `match_species_name` - Match a scientific name to GBIF taxonomy - `search_occurrences` - Search for species occurrence records - `get_occurrence_by_key` - Get detailed occurrence record by key - `search_datasets` - Search for datasets in GBIF - `get_dataset_by_key` - Get detailed dataset information by key - `get_species_vernacular_names` - Get common/vernacular names for a species - `get_species_children` - Get direct children taxa of a species - `get_species_parents` - Get parent taxa hierarchy for a species - `get_occurrence_count` - Get count of occurrence records matching criteria - `search_organizations` - Search for publishing organizations **Example Usage**: ```python # Search for species result = tool.query( provider='gbif', operation='search_species', params={'q': 'Panthera leo', 'rank': 'SPECIES', 'limit': 10} ) # Match a scientific name result = tool.query( provider='gbif', operation='match_species_name', params={'name': 'Panthera leo', 'kingdom': 'Animalia'} ) 
# Search for occurrence records result = tool.query( provider='gbif', operation='search_occurrences', params={ 'taxonKey': 5219404, # Panthera leo 'country': 'KE', # Kenya 'year': '2020', 'limit': 50 } ) # Get occurrence count result = tool.query( provider='gbif', operation='get_occurrence_count', params={'country': 'US', 'year': '2020'} ) # Get vernacular names result = tool.query( provider='gbif', operation='get_species_vernacular_names', params={'key': 5219404} # Panthera leo ) # Search datasets result = tool.query( provider='gbif', operation='search_datasets', params={'q': 'birds', 'type': 'OCCURRENCE', 'limit': 10} ) # Get species details result = tool.query( provider='gbif', operation='get_species_by_key', params={'key': 5219404} # Panthera leo ) # Get taxonomic children result = tool.query( provider='gbif', operation='get_species_children', params={'key': 5219404, 'limit': 20} ) # Search organizations result = tool.query( provider='gbif', operation='search_organizations', params={'country': 'US', 'limit': 10} ) ``` **Data Fields Available**: - Species metadata: scientific name, rank, kingdom, phylum, class, order, family, genus - Occurrence data: coordinates, date, basis of record, dataset key - Dataset information: title, description, publishing organization, license - Vernacular names: common names in multiple languages - Taxonomic hierarchy: parent and child taxa - Geographic information: country, locality, coordinates - Temporal information: year, month, day, event date - Data quality: coordinate uncertainty, identification confidence **Common Taxonomic Ranks**: - KINGDOM - PHYLUM - CLASS - ORDER - FAMILY - GENUS - SPECIES - SUBSPECIES **Common Basis of Record Values**: - HUMAN_OBSERVATION - PRESERVED_SPECIMEN - FOSSIL_SPECIMEN - LIVING_SPECIMEN - MACHINE_OBSERVATION - MATERIAL_SAMPLE - OBSERVATION - OCCURRENCE **Use Cases**: - Biodiversity research and analysis - Species distribution mapping - Conservation planning - Environmental impact assessments - 
Citizen science data exploration
- Taxonomic research and validation
- Dataset discovery and metadata retrieval
- Geographic occurrence analysis

---

## 4. Performance Settings

### 4.1 Caching Configuration

```python
config = {
    # Basic caching
    'cache_ttl': 300,  # 5 minutes

    # Intelligent caching (intent-aware TTL)
    'enable_intelligent_cache': True,

    # Cache backend (optional)
    'cache_backend': 'redis',  # 'memory' or 'redis'
    'redis_url': 'redis://localhost:6379/0'
}
```

**Intelligent Cache TTL Strategies**:
- **Recent data queries**: 60 seconds
- **Historical data**: 3600 seconds (1 hour)
- **Metadata queries**: 86400 seconds (24 hours)
- **Search queries**: 300 seconds (5 minutes)

### 4.2 Timeout Configuration

```python
config = {
    # Global timeout
    'default_timeout': 30,

    # Provider-specific timeouts
    'provider_timeouts': {
        'fred': 15,
        'worldbank': 45,
        'newsapi': 20,
        'census': 30
    }
}
```

### 4.3 Retry Configuration

```python
config = {
    'max_retries': 3,
    'retry_backoff_factor': 2.0,  # Exponential backoff multiplier
    'retry_jitter': True,  # Add random jitter to prevent thundering herd
    'retry_on_status_codes': [429, 500, 502, 503, 504]
}
```

**Retry Delay Calculation**:

```
delay = base_delay * (backoff_factor ** attempt) + random_jitter
```

Example, assuming `base_delay` = 1.0s, `backoff_factor` = 2.0, and an exponent that starts at 0 for the first retry:
- Attempt 1: 1.0s + jitter
- Attempt 2: 2.0s + jitter
- Attempt 3: 4.0s + jitter

---

## 5. Feature Flags

### 5.1 Query Enhancement

```python
config = {
    'enable_query_enhancement': True,
    'query_enhancement_config': {
        'confidence_threshold': 0.5,  # Min confidence for auto-enhancement
        'max_enhancements': 5,  # Max parameters to add
        'preserve_explicit_params': True  # Don't override user params
    }
}
```

### 5.2 Fallback Strategy

```python
config = {
    'enable_fallback': True,
    'fallback_config': {
        'max_fallback_attempts': 2,
        'fallback_timeout_multiplier': 1.5,  # Increase timeout for fallback
        'preserve_quality_threshold': 0.7  # Min quality for fallback result
    }
}
```

### 5.3 Data Fusion

```python
config = {
    'enable_data_fusion': True,
    'data_fusion_config': {
        'default_strategy': 'best_quality',  # 'best_quality', 'merge_all', 'consensus'
        'quality_weight': 0.6,
        'freshness_weight': 0.3,
        'completeness_weight': 0.1
    }
}
```

### 5.4 Rate Limiting

```python
config = {
    'enable_rate_limiting': True,
    'rate_limit_config': {
        'fred': {
            'tokens_per_second': 2.0,  # 120 per minute
            'max_tokens': 10
        },
        'newsapi': {
            'tokens_per_second': 0.001,  # ~86 per day (stays under 100 per day)
            'max_tokens': 5
        },
        'census': {
            'tokens_per_second': 0.005,  # ~432 per day (stays under 500 per day)
            'max_tokens': 10
        }
    }
}
```

---

## 6.
Provider-Specific Configuration ### 6.1 FRED Provider ```python config = { 'fred_api_key': 'YOUR_KEY', 'fred_config': { 'base_url': 'https://api.stlouisfed.org/fred', 'timeout': 15, 'default_file_type': 'json', 'default_frequency': 'a', # Annual 'default_units': 'lin' # Linear } } ``` ### 6.2 World Bank Provider ```python config = { 'worldbank_config': { 'base_url': 'https://api.worldbank.org/v2', 'timeout': 45, 'default_format': 'json', 'default_per_page': 50, 'default_language': 'en' } } ``` ### 6.3 News API Provider ```python config = { 'newsapi_api_key': 'YOUR_KEY', 'newsapi_config': { 'base_url': 'https://newsapi.org/v2', 'timeout': 20, 'default_language': 'en', 'default_page_size': 20, 'default_sort_by': 'publishedAt' } } ``` ### 6.4 The Guardian Provider ```python config = { 'guardian_api_key': 'YOUR_KEY', 'guardian_config': { 'base_url': 'https://content.guardianapis.com', 'timeout': 30, 'rate_limit': 5, 'max_burst': 10 } } ``` **Features**: - Search all Guardian content with advanced filtering - Get specific content items by ID - Browse and search tags (keywords, contributors, series, etc.) - Get all sections - Filter by section, tag, date range - Support for multiple editions (UK, US, AU, International) - Rich metadata including headlines, body text, thumbnails, tags **Supported Operations**: - `search_content` - Search all Guardian content with advanced filtering options - `get_item` - Get a specific content item by ID - `get_tags` - Get all tags or filter by type - `search_tags` - Search for tags by query - `get_sections` - Get all Guardian sections - `get_edition` - Get content for a specific edition **Important Configuration Notes**: - **API Key Required**: Must register for a free API key - **Rate Limits**: Free tier allows 5,000 requests per day - **Attribution**: Must acknowledge The Guardian when displaying content - **Content Fields**: Use `show_fields` parameter to request specific fields (headline, body, thumbnail, etc.) 
- **Tags**: Use `show_tags` parameter to include tag metadata (keyword, contributor, etc.) **Example Usage**: ```python # Search for articles about technology result = tool.query( provider='guardian', operation='search_content', params={ 'q': 'artificial intelligence', 'section': 'technology', 'from_date': '2024-01-01', 'page_size': 10, 'show_fields': 'headline,body,thumbnail', 'show_tags': 'keyword,contributor' } ) # Get all sections result = tool.query( provider='guardian', operation='get_sections', params={} ) # Search for tags related to climate result = tool.query( provider='guardian', operation='search_tags', params={'q': 'climate', 'page_size': 10} ) # Get US edition content result = tool.query( provider='guardian', operation='get_edition', params={'edition': 'us', 'page_size': 20} ) ``` **API Documentation**: - API Overview: https://open-platform.theguardian.com/documentation/ - Content Search: https://open-platform.theguardian.com/documentation/search - Tags API: https://open-platform.theguardian.com/documentation/tag - Sections API: https://open-platform.theguardian.com/documentation/section ### 6.5 Census Provider ```python config = { 'census_api_key': 'YOUR_KEY', 'census_config': { 'base_url': 'https://api.census.gov/data', 'timeout': 30, 'default_year': 2021, 'default_dataset': 'acs/acs5' } } ``` ### 6.5 Congress Provider ```python config = { 'congress_api_key': 'YOUR_KEY', 'congress_config': { 'base_url': 'https://api.congress.gov/v3', 'timeout': 30, 'rate_limit': 10, 'max_burst': 20 } } ``` **Available Operations**: - `search_bills`: Search for bills and resolutions by congress number and type - `get_bill`: Get detailed information about a specific bill - `list_members`: List members of Congress by congress number and chamber - `get_member`: Get detailed information about a specific member - `list_committees`: List congressional committees - `get_committee`: Get detailed information about a specific committee - `search_amendments`: Search for 
amendments to bills - `get_amendment`: Get detailed information about a specific amendment **Example Usage**: ```python # Search for bills in the 118th Congress result = tool.execute('search_bills', { 'congress': 118, 'bill_type': 'hr', 'limit': 10 }) # Get specific bill details result = tool.execute('get_bill', { 'congress': 118, 'bill_type': 'hr', 'bill_number': 1 }) # List House members in 118th Congress result = tool.execute('list_members', { 'congress': 118, 'chamber': 'house', 'limit': 20 }) ``` ### 6.6 OpenStates Provider ```python config = { 'openstates_api_key': 'YOUR_API_KEY', # REQUIRED 'openstates_config': { 'base_url': 'https://v3.openstates.org', 'timeout': 30, 'rate_limit': 10, 'max_burst': 20 } } ``` **Available Operations**: - `search_bills`: Search for state bills and resolutions with advanced filtering - `get_bill`: Get detailed information about a specific bill by ID - `search_people`: Search for state legislators with filtering options - `get_person`: Get detailed information about a specific legislator - `list_jurisdictions`: List all available state jurisdictions - `get_jurisdiction`: Get detailed information about a specific jurisdiction **Example Usage**: ```python # Search for bills in California result = tool.query( provider='openstates', operation='search_bills', params={'jurisdiction': 'CA', 'session': '2023', 'per_page': 10} ) # Search for bills by subject result = tool.query( provider='openstates', operation='search_bills', params={ 'jurisdiction': 'NY', 'subject': 'Education', 'per_page': 5 } ) # Get current legislators from Texas result = tool.query( provider='openstates', operation='search_people', params={'jurisdiction': 'TX', 'current': True, 'per_page': 10} ) # List all available jurisdictions result = tool.query( provider='openstates', operation='list_jurisdictions', params={'per_page': 52} ) # Get specific bill details result = tool.query( provider='openstates', operation='get_bill', params={'bill_id': 'ocd-bill/...'} ) ``` 
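The `rate_limit` and `max_burst` keys in the provider configs above read like token-bucket parameters: a refill rate in requests per second and a bucket capacity. The following is a minimal sketch of that idea under that assumption. The `TokenBucket` class and its `try_acquire` method are hypothetical names used for illustration; they are not taken from AIECS and may not match how the tool enforces limits internally.

```python
import time


class TokenBucket:
    """Illustrative token bucket (hypothetical; not the AIECS implementation)."""

    def __init__(self, rate_limit, max_burst, clock=time.monotonic):
        self.rate = float(rate_limit)      # tokens refilled per second
        self.capacity = float(max_burst)   # most tokens the bucket can hold
        self.tokens = float(max_burst)     # start with a full bucket
        self.clock = clock                 # injectable clock, handy for tests
        self.last = clock()

    def try_acquire(self):
        """Take one token if available; return False if the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With `rate_limit=10` and `max_burst=20`, a client could send up to 20 requests back to back, after which requests are admitted at roughly 10 per second as the bucket refills.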
**Important Configuration Notes**: - **API Key Required**: Must register for a free API key at https://openstates.org/accounts/profile/ - **Rate Limit**: Free tier with reasonable usage limits (default: 10 req/s) - **Attribution**: Acknowledge OpenStates.org when using the data - **Data Freshness**: Data is updated regularly from official state sources - **Coverage**: All 50 U.S. states plus DC and Puerto Rico **API Documentation**: - API v3 Documentation: https://docs.openstates.org/api-v3/ - Interactive API Docs: https://v3.openstates.org/docs/ - About OpenStates: https://openstates.org/about/ ### 6.7 Alpha Vantage Provider ```python config = { 'alphavantage_api_key': 'YOUR_KEY', 'alphavantage_config': { 'base_url': 'https://www.alphavantage.co/query', 'timeout': 30, 'default_datatype': 'json' } } ``` ### 6.6 REST Countries Provider ```python config = { 'restcountries_config': { 'base_url': 'https://restcountries.com/v3.1', 'timeout': 30 } } ``` ### 6.7 ExchangeRate Provider ```python config = { 'exchangerate_api_key': 'YOUR_KEY', # Optional 'exchangerate_config': { 'base_url': 'https://api.exchangerate-api.com/v4', 'timeout': 30 } } ``` ### 6.8 Open Library Provider ```python config = { 'openlibrary_config': { 'base_url': 'https://openlibrary.org', 'timeout': 30, 'rate_limit': 10, # Requests per second 'max_burst': 20 # Maximum burst size } } ``` ### 6.9 Metropolitan Museum of Art (The Met) Provider ```python config = { 'metmuseum_config': { 'base_url': 'https://collectionapi.metmuseum.org/public/collection/v1', 'timeout': 30, 'rate_limit': 10, # Requests per second 'max_burst': 20 # Maximum burst size } } ``` **Features**: - Search art objects with comprehensive filtering - Get detailed object information including high-resolution images - Browse by department, artist, medium, culture - Access to 470,000+ artworks from The Met collection - Rich metadata including provenance, exhibition history - Public domain images available for many works **Supported 
Operations**: - `search_objects` - Search for art objects with advanced filtering - `get_object` - Get detailed information about a specific art object - `get_departments` - Get list of all departments - `get_objects_by_department` - Get objects in a specific department - `search_by_artist` - Search for artworks by artist name - `search_by_medium` - Search for artworks by medium - `search_by_culture` - Search for artworks by culture - `search_highlight_objects` - Search for highlighted/featured objects - `download_image` - Download high-resolution images from The Met collection **Important Configuration Notes**: - **No API Key Required**: Completely free and open access - **Rate Limit**: No official limit, recommended 10 req/s - **Attribution**: Acknowledge The Metropolitan Museum of Art when using the data - **Images**: Many objects include high-resolution images (check isPublicDomain flag) - **Data Quality**: Comprehensive metadata for frontend analysis needs **Example Usage**: ```python # Search for impressionist paintings result = tool.query( provider='metmuseum', operation='search_objects', params={ 'q': 'impressionism', 'has_images': True, 'date_begin': 1860, 'date_end': 1900, 'limit': 20 } ) # Get specific artwork details result = tool.query( provider='metmuseum', operation='get_object', params={'object_id': 436535} ) # Search by artist result = tool.query( provider='metmuseum', operation='search_by_artist', params={'artist_name': 'Vincent van Gogh', 'has_images': True, 'limit': 10} ) # Get all departments result = tool.query( provider='metmuseum', operation='get_departments', params={} ) # Download artwork images result = tool.query( provider='metmuseum', operation='download_image', params={'object_id': 436535, 'output_path': './vangogh.jpg'} ) ``` **Image Download Feature**: The Met Museum provider includes a powerful `download_image` operation that allows you to download high-resolution images: ```python # Download by object ID (automatically fetches 
primary image) result = tool.query( provider='metmuseum', operation='download_image', params={'object_id': 436535} ) # Returns: {'success': True, 'output_path': '/tmp/...jpg', 'file_size': 1234567} # Download by direct URL with custom path result = tool.query( provider='metmuseum', operation='download_image', params={ 'image_url': 'https://images.metmuseum.org/CRDImages/ep/original/DP-42549-001.jpg', 'output_path': './my_artwork.jpg' } ) # Batch download from search results search_result = tool.query( provider='metmuseum', operation='search_objects', params={'q': 'van gogh', 'has_images': True, 'limit': 5} ) for obj in search_result['data']['objects']: if obj.get('primaryImage'): download_result = tool.query( provider='metmuseum', operation='download_image', params={ 'image_url': obj['primaryImage'], 'output_path': f"./images/{obj['objectID']}.jpg" } ) ``` **API Documentation**: - API Documentation: https://metmuseum.github.io/ - GitHub Repository: https://github.com/metmuseum/openaccess - Open Access Initiative: https://www.metmuseum.org/about-the-met/policies-and-documents/open-access ### 6.10 CoinGecko Provider ```python config = { 'coingecko_config': { 'base_url': 'https://api.coingecko.com/api/v3', 'timeout': 30, 'rate_limit': 10, # Requests per second (free tier) 'max_burst': 20 # Maximum burst size } } ``` **Note**: CoinGecko free tier does not require an API key. For higher rate limits and additional features, consider the Pro API. ### 6.10 OpenWeatherMap Provider ```python config = { 'openweathermap_api_key': 'YOUR_KEY', 'openweathermap_config': { 'base_url': 'https://api.openweathermap.org/data/2.5', 'geo_url': 'https://api.openweathermap.org/geo/1.0', 'timeout': 30, 'rate_limit': 10, # Requests per second 'max_burst': 20 # Maximum burst size } } ``` **Obtaining the Key**: 1. Visit https://openweathermap.org/api 2. Sign up for a free account 3. 
Generate an API key from your account dashboard

### 6.11 Wikipedia Provider

```python
config = {
    'wikipedia_config': {
        'base_url': 'https://en.wikipedia.org/w/api.php',
        'rest_base_url': 'https://en.wikipedia.org/api/rest_v1',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second (max 200 allowed)
        'max_burst': 20,  # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'  # REQUIRED
    }
}
```

**Features**:
- Article search by title or content
- Page summaries and extracts
- Full page content retrieval
- Random article discovery
- Page metadata and information

**Important Configuration Notes**:
- **No API Key Required**: Wikipedia API is completely free and open
- **User-Agent REQUIRED**: Must set a unique User-Agent with contact information
- **Rate Limit**: Maximum 200 req/s allowed, default config uses 10 req/s
- **API Etiquette**: Follow https://www.mediawiki.org/wiki/API:Etiquette

### 6.12 GitHub Provider

```python
config = {
    'github_api_key': 'YOUR_GITHUB_TOKEN',  # Recommended for higher rate limits
    'github_config': {
        'base_url': 'https://api.github.com',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,  # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs)'
    }
}
```

**Features**:
- Repository information and statistics
- Search repositories, users, and code
- User profiles and activity
- Repository issues and pull requests
- Organization data

**Supported Operations**:
- `get_repository` - Get detailed repository information
- `search_repositories` - Search for repositories
- `get_user` - Get user profile information
- `search_users` - Search for users
- `get_repository_issues` - Get repository issues
- `get_repository_pulls` - Get repository pull requests
- `search_code` - Search for code across repositories

**Important Configuration Notes**:
- **API Key Recommended**: Use a Personal Access Token to raise the rate limit
- **Rate Limits**: 5,000 requests/hour authenticated, 60 requests/hour unauthenticated
- **Token Scopes**: Use minimal scopes needed (e.g., `public_repo` for public data)
- **API Version**: Uses GitHub REST API v3 with `application/vnd.github+json` accept header

**Obtaining the Key**:
1. Visit https://github.com/settings/tokens
2. Generate new token (classic)
3. Select appropriate scopes
4. Copy and store the token securely

### 6.13 arXiv Provider

```python
config = {
    'arxiv_config': {
        'base_url': 'http://export.arxiv.org/api/query',
        'timeout': 30,
        'rate_limit': 0.33,  # Requests per second (~3 second delays between requests)
        'max_burst': 2,  # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
```

**Features**:
- Search papers by query (all fields)
- Get paper by arXiv ID
- Search by author name
- Search by category (e.g., cs.AI, math.CO)
- Pagination support
- Full metadata including authors, abstract, categories, PDF links

**Important Configuration Notes**:
- **No API Key Required**: arXiv API is completely free and open
- **Rate Limit**: Wait about 3 seconds between requests (the default `rate_limit` of 0.33 reflects this)
- **Max Results**: Limited to 30,000 results in slices of at most 2,000 at a time
- **Caching**: Strongly recommended to cache responses to reduce server load
- **API Etiquette**: Follow https://info.arxiv.org/help/api/user-manual.html

**Obtaining the Key**:
- No API key required - completely free and open access

**API Documentation**:
- API User Manual: https://info.arxiv.org/help/api/user-manual.html
- API Basics: https://info.arxiv.org/help/api/basics.html
- Category Taxonomy: https://arxiv.org/category_taxonomy

### 6.14 PubMed Provider

```python
config = {
    'pubmed_api_key': 'YOUR_NCBI_API_KEY',  # Optional but recommended
    'pubmed_config': {
        'base_url': 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils',
        'timeout': 30,
        'rate_limit': 3,  # Requests per second (3 without key, 10 with key)
        'max_burst': 5,  # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
```

**Features**:
- Search biomedical and life sciences literature
- Get paper metadata by PubMed ID (PMID)
- Search by author name
- Get detailed paper information including abstracts
- Access to 35+ million citations from MEDLINE, PubMed, and other databases
- Full metadata including authors, journal, DOI, publication date

**Supported Operations**:
- `search_papers` - Search for papers by query string
- `get_paper_by_id` - Get paper metadata by PMID
- `search_by_author` - Search for papers by author name
- `get_paper_details` - Get detailed paper information including abstract

**Important Configuration Notes**:
- **API Key Optional but Recommended**: Increases rate limit from 3 to 10 requests/second
- **Rate Limits**: 3 req/s without API key, 10 req/s with API key
- **User-Agent**: Should include a contact email so NCBI can reach you if needed
- **Caching**: Strongly recommended to cache responses to reduce server load
- **API Etiquette**: Follow NCBI E-utilities guidelines

**Obtaining the Key**:
1. Visit https://www.ncbi.nlm.nih.gov/account/
2. Register for a free NCBI account
3. Go to Settings → API Key Management
4.
Generate a new API key **API Documentation**: - E-utilities Quick Start: https://www.ncbi.nlm.nih.gov/books/NBK25500/ - E-utilities API Guide: https://www.ncbi.nlm.nih.gov/books/NBK25501/ - PubMed Help: https://pubmed.ncbi.nlm.nih.gov/help/ ### 6.15 CrossRef Provider ```python config = { 'crossref_config': { 'base_url': 'https://api.crossref.org', 'mailto': 'your-email@example.com', # For polite pool access 'timeout': 30, 'rate_limit': 10, # Requests per second 'max_burst': 20, # Maximum burst size 'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)' } } ``` **Features**: - Get work metadata by DOI - Search for scholarly works - Get works from specific journals by ISSN - Search for funders in Open Funder Registry - Get works funded by specific funders - Access to extensive scholarly metadata including citations, references, authors, affiliations **Supported Operations**: - `get_work_by_doi` - Get metadata for a work by its DOI - `search_works` - Search for works by query string with pagination and sorting - `get_journal_works` - Get works published in a specific journal by ISSN - `search_funders` - Search for funders in the Open Funder Registry - `get_funder_works` - Get works associated with a specific funder **Important Configuration Notes**: - **No API Key Required**: CrossRef API is completely free and open - **Polite Pool**: Provide an email address (mailto parameter) for better rate limits - **User-Agent**: Set a descriptive User-Agent header with contact information - **Caching**: Strongly recommended to cache responses to reduce server load - **Attribution**: Acknowledge CrossRef when using the data in publications **Obtaining Access**: - No API key required - completely free and open access - Optional: Register email for polite pool access (better rate limits) **API Documentation**: - REST API Documentation: https://www.crossref.org/documentation/retrieve-metadata/rest-api/ - API Etiquette: 
https://github.com/CrossRef/rest-api-doc#etiquette - Metadata Plus: https://www.crossref.org/services/metadata-delivery/ ### 6.16 Semantic Scholar Provider ```python config = { 'semanticscholar_config': { 'base_url': 'https://api.semanticscholar.org/graph/v1', 'timeout': 30, 'rate_limit': 1, # Requests per second (recommended for sustained use) 'max_burst': 5, # Maximum burst size 'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)' } } ``` **Features**: - Search for academic papers by query - Get paper metadata by ID (S2 ID, DOI, arXiv ID, etc.) - Get paper authors, citations, and references - Get author information and publications - Access to extensive academic paper database with citation data - Support for multiple paper ID formats (S2 ID, DOI, arXiv ID, PubMed ID, etc.) **Supported Operations**: - `search_papers` - Search for papers by query string - `get_paper` - Get paper details by ID (S2 ID, DOI, arXiv ID, etc.) - `get_paper_authors` - Get authors of a specific paper - `get_paper_citations` - Get papers that cite this paper - `get_paper_references` - Get papers referenced by this paper - `get_author` - Get author details by ID - `get_author_papers` - Get papers by a specific author **Important Configuration Notes**: - **No API Key Required**: Semantic Scholar API is completely free and open - **Rate Limit**: Recommended 1 request per second for sustained use (100 requests per 5 minutes) - **Max Results**: Limited to 100 results per request for search, use pagination for more - **User-Agent**: Set a descriptive User-Agent header with contact information - **Caching**: Strongly recommended to cache responses to reduce server load - **Paper IDs**: Supports multiple ID formats (S2 ID, DOI, arXiv ID, PubMed ID, etc.) 
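
The Semantic Scholar notes above quote two budgets (1 request per second, and 100 requests per 5 minutes) alongside a cap of 100 results per search request. These are not the same rate, so it can help to work out the numbers before choosing `rate_limit`. The sketch below is plain arithmetic and does not call the API. The helper names `sustained_rate` and `page_offsets` are hypothetical, and the offset/limit pagination style is an assumption rather than something this reference specifies.

```python
from typing import List, Tuple


def sustained_rate(requests: int, window_seconds: float) -> float:
    """Requests per second that would exactly use up a quota window."""
    return requests / window_seconds


def page_offsets(total_results: int, page_size: int = 100) -> List[Tuple[int, int]]:
    """Return (offset, limit) pairs that cover total_results with capped pages."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return [
        (offset, min(page_size, total_results - offset))
        for offset in range(0, total_results, page_size)
    ]


# 100 requests per 5 minutes works out to about one request every 3 seconds.
rate = sustained_rate(100, 5 * 60)

# Fetching 250 results at 100 per request takes three requests.
pages = page_offsets(250)
```

Under these assumptions, a `rate_limit` of 1 would spend a 100-request window in under two minutes, so a value closer to 0.33 may be the safer choice if the 5-minute quota is the one being enforced.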
**Obtaining Access**:
- No API key required - completely free and open access
- Optional: Contact Semantic Scholar for higher rate limits if needed

**API Documentation**:
- API Documentation: https://api.semanticscholar.org/api-docs/
- Academic Graph API: https://www.semanticscholar.org/product/api
- API Tutorial: https://www.semanticscholar.org/product/api/tutorial

### 6.17 CORE Provider

```python
config = {
    'core_api_key': 'YOUR_CORE_API_KEY',  # Required
    'core_config': {
        'base_url': 'https://api.core.ac.uk/v3',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,  # Maximum burst size
    }
}
```

**Features**:
- Search for open access research papers
- Get work metadata by CORE ID
- Search by DOI
- Search by title
- Access to millions of open access research papers
- Full metadata including authors, abstract, publication date, citations

**Supported Operations**:
- `search_works` - Search for works by query string
- `get_work` - Get work details by CORE ID
- `search_by_doi` - Search for works by DOI
- `search_by_title` - Search for works by title

**Important Configuration Notes**:
- **API Key Required**: CORE API requires an API key for access
- **Rate Limit**: Free tier allows reasonable usage with rate limiting
- **Max Results**: Limited to 100 results per request for search, use pagination for more
- **Caching**: Strongly recommended to cache responses to reduce server load
- **Attribution**: Acknowledge CORE when using the data in publications

**Obtaining the Key**:
1. Visit https://core.ac.uk/services/api
2. Register for a free account
3. Request an API key from your account dashboard

**API Documentation**:
- API Documentation: https://core.ac.uk/documentation/api
- API Services: https://core.ac.uk/services/api
- About CORE: https://core.ac.uk/about

### 6.18 USPTO Provider

```python
config = {
    'uspto_api_key': 'YOUR_USPTO_API_KEY',  # Required
    'uspto_config': {
        'base_url': 'https://developer.uspto.gov/ibd-api/v1',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,  # Maximum burst size
    }
}
```

**Features**:
- Search for US patents by query
- Get patent details by patent number
- Search patents by inventor name
- Search patents by assignee (company/organization)
- Access to comprehensive US patent database
- Full metadata including title, abstract, inventors, assignees, classifications, citations

**Supported Operations**:
- `search_patents` - Search for patents by query string
- `get_patent` - Get patent details by patent number/ID
- `search_by_inventor` - Search for patents by inventor name
- `search_by_assignee` - Search for patents by assignee name

**Important Configuration Notes**:
- **API Key Required**: USPTO API requires an API key for access
- **Rate Limit**: Free tier allows reasonable usage with rate limiting
- **Max Results**: Pagination supported for large result sets
- **Caching**: Strongly recommended to cache responses to reduce server load
- **Attribution**: Acknowledge USPTO when using patent data in publications

**Obtaining the Key**:
1. Visit https://developer.uspto.gov/
2. Register for a free developer account
3. Request an API key from your account dashboard

**API Documentation**:
- API Catalog: https://developer.uspto.gov/api-catalog
- Patent Search API: https://developer.uspto.gov/api-catalog/patent-search-api
- Developer Portal: https://developer.uspto.gov/

### 6.19 SEC EDGAR Provider

```python
config = {
    'secedgar_config': {
        'base_url': 'https://data.sec.gov',
        'user_agent': 'YourCompanyName contact@example.com',  # REQUIRED
        'timeout': 30,
        'rate_limit': 10,  # Requests per second (max allowed by SEC)
        'max_burst': 20,  # Maximum burst size
    }
}
```

**Features**:
- Get company submissions and filing history by CIK
- Access XBRL financial data and concepts
- Retrieve company facts across all filings
- Search company filings (10-K, 10-Q, 8-K, etc.)
- Download actual filing documents (10-K, 10-Q, 8-K full text)
- Calculate financial ratios automatically
- Get formatted financial statements
- Access insider trading data (Form 4)
- Access to comprehensive SEC filing database
- Full metadata including company info, filing dates, XBRL tags

**Supported Operations**:

*Basic Data Retrieval:*
- `get_company_submissions` - Get company filing history and submission data
- `get_company_concept` - Get XBRL concept data for specific financial metrics
- `get_company_facts` - Get all XBRL facts for a company

*Filing Document Access:*
- `search_filings` - Search for filings by CIK and form type
- `get_filings_by_type` - Get recent filings of a specific form type
- `get_filing_documents` - Get filing document URLs and metadata
- `get_filing_text` - Download full text of filing documents

*Financial Analysis:*
- `calculate_financial_ratios` - Calculate common financial ratios (P/E, ROE, ROA, etc.)
- `get_financial_statement` - Get formatted financial statements (balance sheet, income statement, cash flow)

*Corporate Governance:*
- `get_insider_transactions` - Get insider trading transactions (Form 4 filings)

**Important Configuration Notes**:
- **No API Key Required**: SEC EDGAR API is completely free and open
- **User-Agent REQUIRED**: Must include company/individual name and contact email
  - Format: `"CompanyName contact@email.com"`
  - SEC will block access if User-Agent is missing or generic
- **Rate Limit**: Maximum 10 requests per second (enforced by SEC)
- **CIK Format**: Central Index Key must be 10 digits with leading zeros (e.g., "0000320193")
- **Caching**: Strongly recommended to cache responses to reduce server load
- **Fair Access**: SEC monitors usage and may block non-compliant access

**Example Usage**:
```python
# 1. Get Apple Inc. filings (CIK: 0000320193)
result = tool.query(
    provider='secedgar',
    operation='get_company_submissions',
    params={'cik': '0000320193'}
)

# 2. Search for specific form type (10-K annual reports)
result = tool.query(
    provider='secedgar',
    operation='search_filings',
    params={
        'cik': '0000320193',
        'form_type': '10-K',
        'limit': 5
    }
)

# 3. Get Apple's Assets data from XBRL
result = tool.query(
    provider='secedgar',
    operation='get_company_concept',
    params={
        'cik': '0000320193',
        'taxonomy': 'us-gaap',
        'tag': 'Assets'
    }
)

# 4. Calculate financial ratios
result = tool.query(
    provider='secedgar',
    operation='calculate_financial_ratios',
    params={'cik': '0000320193'}
)
# Returns: current_ratio, debt_to_equity, profit_margin, ROA, ROE, etc.

# 5. Get formatted balance sheet
result = tool.query(
    provider='secedgar',
    operation='get_financial_statement',
    params={
        'cik': '0000320193',
        'statement_type': 'balance_sheet',
        'period': 'annual'
    }
)

# 6. Get insider transactions (Form 4)
result = tool.query(
    provider='secedgar',
    operation='get_insider_transactions',
    params={
        'cik': '0000320193',
        'start_date': '2024-01-01'
    }
)

# 7. Download filing document text
result = tool.query(
    provider='secedgar',
    operation='get_filing_text',
    params={
        'cik': '0000320193',
        'accession_number': '0000320193-23-000077'
    }
)
```

**Common CIKs**:
- Apple Inc.: 0000320193
- Tesla Inc.: 0001318605
- Microsoft Corp.: 0000789019
- Amazon.com Inc.: 0001018724
- Alphabet Inc.: 0001652044

**Finding CIKs**:
- Company Search: https://www.sec.gov/edgar/searchedgar/companysearch.html
- CIK Lookup Tool: https://www.sec.gov/cgi-bin/browse-edgar

**API Documentation**:
- API Overview: https://www.sec.gov/search-filings/edgar-application-programming-interfaces
- Accessing EDGAR Data: https://www.sec.gov/os/accessing-edgar-data
- XBRL Data Sets: https://www.sec.gov/dera/data/financial-statement-data-sets.html
- Company Submissions: https://data.sec.gov/submissions/
- XBRL API: https://data.sec.gov/api/xbrl/

### 6.20 Stack Exchange Provider

```python
config = {
    'stackexchange_config': {
        'base_url': 'https://api.stackexchange.com/2.3',
        'api_key': 'YOUR_API_KEY',  # Optional but recommended
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,  # Maximum burst size
    }
}
```

**Features**:
- Search questions across Stack Exchange network
- Get detailed question and answer information
- Search for users and their profiles
- Browse tags and their statistics
- Access all Stack Exchange sites
- Rich metadata including votes, views, acceptance status

**Supported Operations**:

*Question Operations:*
- `search_questions` - Search for questions by query and tags
- `get_question` - Get detailed information about a specific question
- `get_answers` - Get answers for a specific question

*User and Tag Operations:*
- `search_users` - Search for users by name
- `get_tags` - Get tags and their statistics
- `get_sites` - Get all sites in the Stack Exchange network

**Important Notes**:
- **API Key Optional**: Works without key but has much lower rate limits (300 vs 10,000 requests/day)
- **Compression**: API returns gzip compressed
responses by default
- **Backoff**: Respect the backoff field in responses when present
- **Attribution**: Required when displaying Stack Exchange content

**Example Usage**:
```python
# 1. Search for Python questions on Stack Overflow
result = tool.query(
    provider='stackexchange',
    operation='search_questions',
    params={
        'site': 'stackoverflow',
        'q': 'python async',
        'tagged': 'python',
        'sort': 'votes',
        'pagesize': 10
    }
)

# 2. Get a specific question by ID
result = tool.query(
    provider='stackexchange',
    operation='get_question',
    params={
        'question_id': 11227809,
        'site': 'stackoverflow'
    }
)

# 3. Get answers for a question
result = tool.query(
    provider='stackexchange',
    operation='get_answers',
    params={
        'question_id': 11227809,
        'site': 'stackoverflow',
        'sort': 'votes',
        'pagesize': 5
    }
)

# 4. Search for users
result = tool.query(
    provider='stackexchange',
    operation='search_users',
    params={
        'site': 'stackoverflow',
        'inname': 'Jon Skeet',
        'pagesize': 10
    }
)

# 5. Get popular Python tags
result = tool.query(
    provider='stackexchange',
    operation='get_tags',
    params={
        'site': 'stackoverflow',
        'inname': 'python',
        'sort': 'popular',
        'pagesize': 20
    }
)

# 6. Get all Stack Exchange sites
result = tool.query(
    provider='stackexchange',
    operation='get_sites',
    params={'pagesize': 50}
)
```

**Popular Sites**:
- Stack Overflow: `stackoverflow`
- Server Fault: `serverfault`
- Super User: `superuser`
- Ask Ubuntu: `askubuntu`
- Mathematics: `math`
- Unix & Linux: `unix`

**API Documentation**:
- API Documentation: https://api.stackexchange.com/docs
- Authentication: https://api.stackexchange.com/docs/authentication
- Throttling: https://api.stackexchange.com/docs/throttle
- Register App: https://stackapps.com/apps/oauth/register

### 6.21 Hacker News Provider

```python
config = {
    'hackernews_config': {
        'base_url': 'https://hn.algolia.com/api/v1',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,  # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
```

**Features**:
- Search Hacker News stories by keywords
- Search comments by keywords
- Search items sorted by date (most recent first)
- Get item details by ID (story, comment, poll, etc.)
- Get user information by username
- Full metadata including title, author, points, comments, URL
- Pagination support for large result sets

**Supported Operations**:
- `search_stories` - Search for stories by keywords (sorted by relevance)
- `search_comments` - Search for comments by keywords
- `search_by_date` - Search for items sorted by date (most recent first)
- `get_item` - Get item details by ID (story, comment, poll, etc.)
- `get_user` - Get user information by username

**Important Configuration Notes**:
- **No API Key Required**: Hacker News Algolia API is completely free and open
- **Rate Limiting**: Be respectful - implement reasonable delays between requests
- **Max Results**: Limited to 1000 results per query (pagination available)
- **User-Agent**: Set a descriptive User-Agent header for API etiquette
- **Caching**: Strongly recommended to cache responses to reduce server load

**Example Usage**:
```python
# 1. Search for Python-related stories
result = tool.query(
    provider='hackernews',
    operation='search_stories',
    params={
        'query': 'python',
        'hits_per_page': 20
    }
)

# 2. Search for stories with minimum comments
result = tool.query(
    provider='hackernews',
    operation='search_stories',
    params={
        'query': 'AI',
        'num_comments': 50,  # Minimum 50 comments
        'hits_per_page': 10
    }
)

# 3. Search comments about machine learning
result = tool.query(
    provider='hackernews',
    operation='search_comments',
    params={
        'query': 'machine learning',
        'hits_per_page': 20
    }
)

# 4. Get recent AI stories sorted by date
result = tool.query(
    provider='hackernews',
    operation='search_by_date',
    params={
        'query': 'AI',
        'tags': 'story',
        'hits_per_page': 20
    }
)

# 5. Get specific item details
result = tool.query(
    provider='hackernews',
    operation='get_item',
    params={'item_id': 1}  # The first HN story ever posted
)

# 6. Get user information
result = tool.query(
    provider='hackernews',
    operation='get_user',
    params={'username': 'pg'}  # Paul Graham
)
```

**Common Tags**:
- `story` - Filter for stories only
- `comment` - Filter for comments only
- `poll` - Filter for polls only
- `author_pg` - Filter by author (e.g., Paul Graham)
- Combine tags: `story,author_pg` - Stories by Paul Graham

**Obtaining Access**:
- No API key required - completely free and open access

**API Documentation**:
- API Documentation: https://hn.algolia.com/api
- Hacker News Official: https://news.ycombinator.com/
- Search Interface: https://hn.algolia.com/

### 6.22 OpenCorporates Provider

```python
config = {
    'opencorporates_api_key': 'YOUR_OPENCORPORATES_API_KEY',  # Required
    'opencorporates_config': {
        'base_url': 'https://api.opencorporates.com/v0.4',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,  # Maximum burst size
    }
}
```

**Features**:
- Search for companies by name across 140+ jurisdictions worldwide
- Get detailed company information by jurisdiction code and company number
- Search for company officers (directors, agents, secretaries)
- Get officer details and their company affiliations
- Access company filings and statutory documents
- Get jurisdiction information and codes
- Access to 200+ million companies from official registers
- Full metadata including company status, address, incorporation date, officers

**Supported Operations**:

*Company Operations:*
- `search_companies` - Search for companies by name or other criteria
- `get_company` - Get detailed information about a specific company by jurisdiction and company number
- `get_company_filings` - Get statutory filings for a specific company

*Officer Operations:*
- `search_officers` - Search for company officers (directors, agents) by name
- `get_officer` - Get detailed information about a specific officer by ID

*Jurisdiction Operations:*
- `list_jurisdictions` - Get list of all available jurisdictions

**Important
Configuration Notes**:
- **API Key Required**: OpenCorporates API requires an API key for all requests
- **Rate Limits**: Free tier allows 200 requests/month, 50 requests/day
- **Open Data**: Free for open data projects with share-alike attribution
- **Paid Plans**: Available for commercial use without share-alike restrictions
- **Jurisdiction Codes**: Use standard codes like 'us_ca' (California), 'gb' (UK), 'de' (Germany)
- **Caching**: Strongly recommended to cache responses to reduce API usage

**Example Usage**:
```python
# 1. Search for companies by name
result = tool.query(
    provider='opencorporates',
    operation='search_companies',
    params={
        'q': 'Apple Inc',
        'jurisdiction_code': 'us_ca',  # Optional: filter by jurisdiction
        'per_page': 10
    }
)

# 2. Get specific company details
result = tool.query(
    provider='opencorporates',
    operation='get_company',
    params={
        'jurisdiction_code': 'us_ca',
        'company_number': 'C0806592'  # Apple Inc.
    }
)

# 3. Search for officers
result = tool.query(
    provider='opencorporates',
    operation='search_officers',
    params={
        'q': 'John Smith',
        'jurisdiction_code': 'gb',  # Optional: filter by jurisdiction
        'per_page': 10
    }
)

# 4. Get company filings
result = tool.query(
    provider='opencorporates',
    operation='get_company_filings',
    params={
        'jurisdiction_code': 'us_ca',
        'company_number': 'C0806592',
        'per_page': 20
    }
)

# 5. List all jurisdictions
result = tool.query(
    provider='opencorporates',
    operation='list_jurisdictions',
    params={}
)
```

**Common Jurisdiction Codes**:
- United States (California): `us_ca`
- United States (Delaware): `us_de`
- United Kingdom: `gb`
- Germany: `de`
- France: `fr`
- Canada (Ontario): `ca_on`
- Australia: `au`

**Example Companies**:
- Apple Inc. (US-CA): jurisdiction_code='us_ca', company_number='C0806592'
- Google LLC (US-DE): jurisdiction_code='us_de', company_number='5908224'
- Microsoft Corporation (US-WA): jurisdiction_code='us_wa', company_number='600413485'

**Obtaining the Key**:
1. Visit https://opencorporates.com/api_accounts/new
2. Register for a free account
3. Choose your plan (free for open data projects)
4. Get your API key from the dashboard

**API Documentation**:
- API Reference: https://api.opencorporates.com/documentation/API-Reference
- API Accounts: https://opencorporates.com/api_accounts/new
- About OpenCorporates: https://opencorporates.com/info/about
- Open Data Licence: https://api.opencorporates.com/documentation/Open-Data-Licence

### 6.23 GDELT Project Provider

```python
config = {
    'gdelt_config': {
        'doc_base_url': 'https://api.gdeltproject.org/api/v2/doc/doc',
        'geo_base_url': 'https://api.gdeltproject.org/api/v2/geo/geo',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,  # Maximum burst size
    }
}
```

**Features**:
- Search global news articles across 100+ languages
- Timeline analysis of news coverage volume and tone
- Geographic mapping of news coverage
- Image search with visual recognition
- Theme-based search using Global Knowledge Graph
- Emotional tone analysis of news coverage
- Source country analysis
- Real-time updates every 15 minutes

**Supported Operations**:

*Article Search Operations:*
- `search_articles` - Search global news articles with advanced filtering
- `get_article_list` - Get detailed list of articles with full metadata
- `search_by_theme` - Search using GDELT's Global Knowledge Graph themes

*Timeline Operations:*
- `get_timeline` - Get timeline of news coverage volume
- `get_timeline_volume` - Get volume timeline with raw counts or percentages
- `get_timeline_tone` - Get timeline showing average emotional tone over time
- `get_timeline_lang` - Get timeline broken down by language
- `get_timeline_source_country` - Get timeline broken down by source country

*Analysis Operations:*
- `get_tone_chart` - Analyze emotional tone distribution of coverage
- `get_top_themes` - Get top themes and topics from matching articles

*Geographic Operations:*
- `get_geo_map` - Get geographic map of
locations mentioned in news
- `get_source_country_map` - Map which countries are reporting on a topic

*Image Operations:*
- `search_images` - Search news images using visual recognition

**Important Configuration Notes**:
- **No API Key Required**: GDELT Project API is completely free and open
- **Rate Limiting**: Be respectful - implement reasonable delays between requests
- **Data Coverage**: Monitors news in 100+ languages from around the world
- **Real-time Updates**: Data updated every 15 minutes
- **Attribution**: Acknowledge GDELT Project when using the data
- **Fair Use**: Do not abuse the free service with excessive requests
- **Caching**: Strongly recommended to cache responses to reduce server load

**Example Usage**:
```python
# 1. Search for climate change articles
result = tool.query(
    provider='gdelt',
    operation='search_articles',
    params={
        'query': 'climate change',
        'timespan': '7d',
        'max_records': 50,
        'source_lang': 'english'
    }
)

# 2. Get timeline of AI coverage
result = tool.query(
    provider='gdelt',
    operation='get_timeline',
    params={
        'query': 'artificial intelligence',
        'timespan': '30d',
        'mode': 'timelinevol'
    }
)

# 3. Analyze tone of election coverage
result = tool.query(
    provider='gdelt',
    operation='get_tone_chart',
    params={
        'query': 'election',
        'timespan': '7d'
    }
)

# 4. Search for protest images
result = tool.query(
    provider='gdelt',
    operation='search_images',
    params={
        'query': 'protest',
        'timespan': '7d',
        'image_tag': 'protest',
        'max_records': 20
    }
)

# 5. Get geographic map of earthquake coverage
result = tool.query(
    provider='gdelt',
    operation='get_geo_map',
    params={
        'query': 'earthquake',
        'mode': 'country',
        'timespan': '24h'
    }
)

# 6. Search by theme (Global Knowledge Graph)
result = tool.query(
    provider='gdelt',
    operation='search_by_theme',
    params={
        'theme': 'ENV_CLIMATECHANGE',
        'timespan': '7d',
        'max_records': 50
    }
)

# 7. Get source country map
result = tool.query(
    provider='gdelt',
    operation='get_source_country_map',
    params={
        'query': 'technology',
        'timespan': '24h'
    }
)

# 8. Get timeline with tone analysis
result = tool.query(
    provider='gdelt',
    operation='get_timeline_tone',
    params={
        'query': 'economy',
        'timespan': '30d',
        'smoothing': 5
    }
)
```

**Common GKG Themes**:
- `ENV_CLIMATECHANGE` - Climate change and global warming
- `TERROR` - Terrorism and extremism
- `HEALTH` - Health and medical topics
- `ECON_INFLATION` - Economic inflation
- `ECON_STOCKMARKET` - Stock market and finance
- `TAX_FNCACT_STUDENT` - Student finance and education
- `WB_*` - World Bank indicators (e.g., WB_1987_POVERTY_HEADCOUNT)

**Timespan Formats**:
- Hours: `1h`, `6h`, `12h`, `24h`
- Days: `1d`, `3d`, `7d`
- Weeks: `1week`, `2weeks`
- Months: `1month`, `3months`, `6months`

**Query Operators**:
- Phrase search: `"exact phrase"`
- Boolean AND: `term1 term2` or `term1 AND term2`
- Boolean OR: `term1 OR term2`
- Boolean NOT: `-term` or `NOT term`
- Grouping: `(term1 OR term2) AND term3`
- Theme search: `theme:TERROR`
- Domain filter: `domain:nytimes.com`
- Source language: `sourcelang:english`
- Source country: `sourcecountry:us`

**API Documentation**:
- DOC API 2.0: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/
- GEO API 2.0: https://blog.gdeltproject.org/gdelt-geo-2-0-api-debuts/
- Global Knowledge Graph: https://blog.gdeltproject.org/announcing-the-global-knowledge-graph/
- GDELT Project: https://www.gdeltproject.org/
- Query Guide: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/

---

### 6.24 DuckDuckGo Zero-Click Info Provider

```python
config = {
    'duckduckgo_config': {
        'base_url': 'https://api.duckduckgo.com/',
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20,  # Maximum burst size
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
    }
}

tool = APISourceTool(config)
```

**Environment Variables**:
```bash
export DUCKDUCKGO_TIMEOUT=30
export DUCKDUCKGO_RATE_LIMIT=10
export DUCKDUCKGO_MAX_BURST=20
export DUCKDUCKGO_USER_AGENT="AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)"
```

**Supported Operations**:

*Instant Answer Operations:*
- `get_instant_answer` - Get instant answer for a query with all available data
- `get_abstract` - Get article abstract/summary from Wikipedia and other sources
- `get_definition` - Get definition for a term
- `get_related_topics` - Get related topics and disambiguation
- `get_infobox` - Get structured infobox data for an entity

**Important Configuration Notes**:
- **No API Key Required**: DuckDuckGo Instant Answer API is completely free and open
- **Rate Limiting**: Be respectful - implement reasonable delays between requests
- **Caching**: Strongly recommended to cache responses to reduce server load
- **User-Agent**: Set a descriptive User-Agent header for API etiquette
- **No Scraping**: This is an Instant Answer API, not a full search results API
- **Attribution**: Consider attributing results to DuckDuckGo when displaying them
- **Data Sources**: Primarily Wikipedia, but also includes other curated sources

**Example Usage**:
```python
# 1. Get instant answer for a query
result = tool.query(
    provider='duckduckgo',
    operation='get_instant_answer',
    params={
        'query': 'Python programming language',
        'no_html': True
    }
)

# 2. Get abstract for an entity
result = tool.query(
    provider='duckduckgo',
    operation='get_abstract',
    params={'query': 'Albert Einstein'}
)

# 3. Get definition for a term
result = tool.query(
    provider='duckduckgo',
    operation='get_definition',
    params={'query': 'algorithm'}
)

# 4. Get related topics (disambiguation)
result = tool.query(
    provider='duckduckgo',
    operation='get_related_topics',
    params={'query': 'Python'}
)

# 5. Get infobox data for an entity
result = tool.query(
    provider='duckduckgo',
    operation='get_infobox',
    params={'query': 'Steve Jobs'}
)
```

**Response Data Structure**:

*Instant Answer Response:*
```python
{
    'heading': 'Python (programming language)',
    'abstract': 'Python is a high-level, general-purpose programming language...',
    'abstract_source': 'Wikipedia',
    'abstract_url': 'https://en.wikipedia.org/wiki/Python_(programming_language)',
    'answer': '',  # Direct answer if available
    'answer_type': '',
    'definition': '',  # Definition if available
    'image': 'https://duckduckgo.com/i/...',
    'type': 'A',  # Answer type: A=Article, D=Disambiguation, etc.
    'has_infobox': True,
    'has_related_topics': True
}
```

*Related Topics Response:*
```python
{
    'heading': 'Python',
    'related_topics': [
        {
            'type': 'topic',
            'text': 'Python (programming language) A high-level...',
            'url': 'https://duckduckgo.com/Python_(programming_language)',
            'icon': '/i/7eec482b.png'
        },
        {
            'type': 'category',
            'name': 'Snakes',
            'topics': [...]
        }
    ],
    'total_topics': 15
}
```

**Use Cases**:
- Quick facts and information retrieval
- Entity disambiguation (e.g., "Python" could be programming language, snake, etc.)
- Topic exploration and related content discovery
- Knowledge base enrichment
- Instant answers for common queries
- Structured data extraction from infoboxes

**API Documentation**:
- API Endpoint: https://api.duckduckgo.com/
- API Format: https://api.duckduckgo.com/?q=query&format=json
- DuckDuckGo: https://duckduckgo.com/

---

## 7. Environment Variables

### 7.1 Variable Reference

Core configuration parameters can be set via environment variables with the `APISOURCE_` prefix. Several provider-specific settings use the provider name as the prefix instead, as listed below:

```bash
# API Keys
export APISOURCE_FRED_API_KEY="your_fred_key"
export APISOURCE_NEWSAPI_API_KEY="your_news_key"
export APISOURCE_CENSUS_API_KEY="your_census_key"
export APISOURCE_CONGRESS_API_KEY="your_congress_key"
export APISOURCE_ALPHAVANTAGE_API_KEY="your_alphavantage_key"
export APISOURCE_EXCHANGERATE_API_KEY="your_exchangerate_key"  # Optional
export APISOURCE_OPENWEATHERMAP_API_KEY="your_openweathermap_key"
export APISOURCE_GITHUB_API_KEY="your_github_token"  # Recommended
export APISOURCE_PUBMED_API_KEY="your_ncbi_api_key"  # Optional but recommended
export CROSSREF_MAILTO="your-email@example.com"  # Optional but recommended for polite pool
export APISOURCE_CORE_API_KEY="your_core_api_key"  # Required
export APISOURCE_USPTO_API_KEY="your_uspto_api_key"  # Required
export SECEDGAR_USER_AGENT="YourCompanyName contact@example.com"  # REQUIRED for SEC EDGAR
export STACKEXCHANGE_API_KEY="your_stackexchange_api_key"  # Optional but recommended
export OPENCORPORATES_API_KEY="your_opencorporates_api_key"  # Required

# Provider-specific Configuration
export SEMANTICSCHOLAR_TIMEOUT=30
export SEMANTICSCHOLAR_RATE_LIMIT=1
export SEMANTICSCHOLAR_MAX_BURST=5
export CORE_TIMEOUT=30
export CORE_RATE_LIMIT=10
export CORE_MAX_BURST=20
export USPTO_TIMEOUT=30
export USPTO_RATE_LIMIT=10
export USPTO_MAX_BURST=20
export SECEDGAR_TIMEOUT=30
export SECEDGAR_RATE_LIMIT=10
export SECEDGAR_MAX_BURST=20
export STACKEXCHANGE_TIMEOUT=30
export STACKEXCHANGE_RATE_LIMIT=10
export STACKEXCHANGE_MAX_BURST=20
export OPENCORPORATES_TIMEOUT=30
export OPENCORPORATES_RATE_LIMIT=10
export OPENCORPORATES_MAX_BURST=20
export HACKERNEWS_TIMEOUT=30
export HACKERNEWS_RATE_LIMIT=10
export HACKERNEWS_MAX_BURST=20
export GDELT_TIMEOUT=30
export GDELT_RATE_LIMIT=10
export GDELT_MAX_BURST=20
export DUCKDUCKGO_TIMEOUT=30
export DUCKDUCKGO_RATE_LIMIT=10
export DUCKDUCKGO_MAX_BURST=20
export DUCKDUCKGO_USER_AGENT="AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)"

# Performance
export APISOURCE_CACHE_TTL="300"
export APISOURCE_DEFAULT_TIMEOUT="30"
export APISOURCE_MAX_RETRIES="3"

# Feature Flags
export APISOURCE_ENABLE_RATE_LIMITING="true"
export APISOURCE_ENABLE_FALLBACK="true"
export APISOURCE_ENABLE_DATA_FUSION="true"
export APISOURCE_ENABLE_QUERY_ENHANCEMENT="true"

# Logging
export APISOURCE_LOG_LEVEL="INFO"
export APISOURCE_METRICS_ENABLED="true"
```

### 7.2 Loading from .env File

```bash
# .env file
APISOURCE_FRED_API_KEY=your_fred_key
APISOURCE_NEWSAPI_API_KEY=your_news_key
APISOURCE_CACHE_TTL=300
APISOURCE_ENABLE_FALLBACK=true
```

```python
# Load with python-dotenv
from dotenv import load_dotenv
load_dotenv()

# Tool automatically picks up environment variables
tool = APISourceTool()
```

---

## 8. Configuration Examples

### 8.1 Development Configuration

```json
{
  "fred_api_key": "YOUR_FRED_KEY",
  "newsapi_api_key": "YOUR_NEWS_KEY",
  "cache_ttl": 60,
  "default_timeout": 30,
  "max_retries": 1,
  "enable_rate_limiting": false,
  "enable_fallback": true,
  "enable_data_fusion": true,
  "enable_query_enhancement": true,
  "log_level": "DEBUG",
  "metrics_enabled": true
}
```

### 8.2 Production Configuration

```json
{
  "fred_api_key": "${FRED_API_KEY}",
  "newsapi_api_key": "${NEWSAPI_API_KEY}",
  "census_api_key": "${CENSUS_API_KEY}",
  "congress_api_key": "${CONGRESS_API_KEY}",
  "cache_ttl": 600,
  "default_timeout": 30,
  "max_retries": 5,
  "enable_rate_limiting": true,
  "enable_fallback": true,
  "enable_data_fusion": true,
  "enable_query_enhancement": true,
  "enable_intelligent_cache": true,
  "log_level": "INFO",
  "metrics_enabled": true,
  "cache_backend": "redis",
  "redis_url": "redis://redis:6379/0"
}
```

### 8.3 High-Volume Configuration

```json
{
  "fred_api_key": "${FRED_API_KEY}",
  "cache_ttl": 3600,
  "default_timeout": 15,
  "max_retries": 3,
  "enable_rate_limiting": true,
  "enable_fallback": true,
  "enable_data_fusion": false,
  "enable_query_enhancement": false,
  "enable_intelligent_cache": true,
  "log_level": "WARNING",
  "metrics_enabled": true,
  "rate_limit_config": {
    "fred": {
      "tokens_per_second": 1.5,
      "max_tokens": 5
    }
  }
}
```

### 8.4 Minimal Configuration

```json
{
  "fred_api_key": "YOUR_FRED_KEY"
}
```

All other parameters use defaults.

---

## 9. Validation and Testing

### 9.1 Configuration Validation

```python
from aiecs.tools.apisource.tool import Config

# Validate configuration
try:
    config = Config(
        fred_api_key='YOUR_KEY',
        cache_ttl=300,
        max_retries=3
    )
    print("Configuration valid!")
except ValueError as e:
    print(f"Configuration error: {e}")
```

### 9.2 Testing Configuration

```python
from aiecs.tools.apisource import APISourceTool

# Create tool with configuration (`config` is the dict or Config object built earlier)
tool = APISourceTool(config)

# Test provider connectivity
providers = tool.list_providers()
for provider in providers:
    print(f"Provider: {provider['name']}")
    print(f"Health: {provider['health']['status']}")
    print(f"Score: {provider['health']['score']}\n")

# Test a simple query
try:
    result = tool.query(
        provider='fred',
        operation='get_series_info',
        params={'series_id': 'GDP'}
    )
    print("Configuration working correctly!")
except Exception as e:
    print(f"Configuration issue: {e}")
```

### 9.3 Configuration Best Practices

1. **Use Environment Variables for Secrets**:
   ```python
   import os

   config = {
       'fred_api_key': os.getenv('FRED_API_KEY'),
       'newsapi_api_key': os.getenv('NEWSAPI_KEY')
   }
   ```

2. **Validate Before Deployment**:
   ```python
   def validate_config(config):
       required_keys = ['fred_api_key']
       for key in required_keys:
           if not config.get(key):
               raise ValueError(f"Missing required config: {key}")
       return True
   ```

3. **Use Different Configs for Different Environments**:
   ```python
   import json
   import os

   env = os.getenv('ENVIRONMENT', 'development')
   config_file = f'config.{env}.json'

   with open(config_file) as f:
       config = json.load(f)
   ```

4. **Monitor Configuration Impact**:
   ```python
   # Check metrics after configuration changes
   metrics = tool.get_metrics()
   print(f"Success rate: {metrics['overall']['success_rate']}")
   print(f"Avg response time: {metrics['overall']['avg_response_time']}")
   ```

---

**Document Version**: 2.0
**Last Updated**: 2025-10-18
**Maintainer**: AIECS Tools Team