Deprecated — removed in AIECS 2.0
APISource Tool - Configuration Reference
1. Configuration Overview
1.1 Configuration Methods
The APISource Tool supports multiple configuration methods:
Dictionary Configuration:
from aiecs.tools.apisource import APISourceTool
config = {
    'fred_api_key': 'YOUR_KEY',
    'cache_ttl': 300,
    'enable_fallback': True
}
tool = APISourceTool(config)
Environment Variables:
import os
os.environ['APISOURCE_FRED_API_KEY'] = 'YOUR_KEY'
os.environ['APISOURCE_CACHE_TTL'] = '300'
tool = APISourceTool() # Auto-loads from environment
Configuration File:
import json
with open('apisource_config.json') as f:
    config = json.load(f)
tool = APISourceTool(config)
Pydantic Model:
from aiecs.tools.apisource.tool import Config
config = Config(
    fred_api_key='YOUR_KEY',
    cache_ttl=300,
    enable_fallback=True
)
tool = APISourceTool(config)
1.2 Configuration Priority
When multiple configuration sources are present, the priority is:
Explicit parameters (highest priority)
Configuration dictionary/object
Environment variables
Default values (lowest priority)
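The priority order above can be illustrated with a small merge sketch. This is not the tool's actual resolution code: `resolve_config` and `DEFAULTS` are hypothetical names, and the default values are the ones listed in this reference.

```python
from collections import ChainMap

# Hypothetical defaults, copied from the values listed in this reference.
DEFAULTS = {"cache_ttl": 300, "default_timeout": 30, "max_retries": 3}


def resolve_config(explicit=None, config=None, env=None):
    """Merge the four sources; earlier layers win over later ones."""
    # ChainMap searches its mappings left to right, so list order is priority order.
    layers = [explicit or {}, config or {}, env or {}, DEFAULTS]
    return dict(ChainMap(*layers))


resolved = resolve_config(
    explicit={"cache_ttl": 60},                      # highest priority
    config={"cache_ttl": 600, "max_retries": 5},
    env={"default_timeout": 10, "max_retries": 1},
)
print(resolved)
```

Here `cache_ttl` comes from the explicit parameters, `max_retries` from the configuration dictionary, and `default_timeout` from the environment layer.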
2. Configuration Parameters
2.1 Complete Parameter Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| fred_api_key | str | None | FRED API key |
| newsapi_api_key | str | None | News API key |
| guardian_api_key | str | None | The Guardian API key |
| census_api_key | str | None | Census Bureau API key |
| congress_api_key | str | None | Congress.gov API key |
| cache_ttl | int | 300 | Cache TTL in seconds |
| default_timeout | int | 30 | Request timeout in seconds |
| max_retries | int | 3 | Maximum retry attempts |
| enable_rate_limiting | bool | True | Enable rate limiting |
| enable_fallback | bool | True | Enable provider fallback |
| enable_data_fusion | bool | True | Enable data fusion |
| enable_query_enhancement | bool | True | Enable query enhancement |
| enable_intelligent_cache | bool | True | Enable intelligent caching |
| | str | 'INFO' | Logging level |
| | bool | True | Enable metrics collection |
2.2 Parameter Details
cache_ttl
Type: Integer
Default: 300 (5 minutes)
Range: 0-86400 (0 = no cache, 86400 = 24 hours)
Description: Time-to-live for cached results in seconds
Recommendation:
Development: 60-300 seconds
Production: 300-3600 seconds
High-frequency data: 60-300 seconds
Static data: 3600-86400 seconds
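To make the TTL semantics concrete, here is a minimal time-to-live cache sketch. `TTLCache` is a hypothetical class written for illustration and is not the tool's cache implementation; it assumes, as stated above, that a TTL of 0 disables caching.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self._clock = clock      # injectable so the example is deterministic
        self._store = {}         # key -> (expiry_time, value)

    def set(self, key, value):
        if self.ttl <= 0:        # a TTL of 0 means "no cache"
            return
        self._store[key] = (self._clock() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry
            return default
        return value


# Demonstration with a fake clock instead of real waiting.
now = [0.0]
cache = TTLCache(ttl=300, clock=lambda: now[0])
cache.set("series:GDP", {"value": 1})
fresh = cache.get("series:GDP")   # still within 300 seconds
now[0] = 301.0
stale = cache.get("series:GDP")   # expired, so the default (None) is returned
```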
default_timeout
Type: Integer
Default: 30 seconds
Range: 5-300 seconds
Description: Maximum time to wait for API response
Recommendation:
Fast APIs (FRED, News): 10-30 seconds
Slow APIs (World Bank): 30-60 seconds
Batch operations: 60-120 seconds
max_retries
Type: Integer
Default: 3
Range: 0-10
Description: Maximum number of retry attempts for failed requests
Recommendation:
Production: 3-5 retries
Development: 1-2 retries
Critical operations: 5-10 retries
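As an illustration of what a retry budget means in practice, the sketch below retries a failing call with exponential backoff. The backoff schedule and the `call_with_retries` helper are assumptions for this example; this reference does not state how the tool spaces its retries.

```python
import time


def call_with_retries(func, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on an exception, retry up to max_retries more times."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= max_retries:
                raise                           # retry budget exhausted
            sleep(base_delay * 2 ** attempt)    # 0.5s, 1s, 2s, ...
            attempt += 1


# Demonstration: a call that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


delays = []                                     # record delays instead of sleeping
result = call_with_retries(flaky, max_retries=3, sleep=delays.append)
```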
enable_rate_limiting
Type: Boolean
Default: True
Description: Enable automatic rate limiting to prevent API quota exhaustion
Recommendation: Always True in production
enable_fallback
Type: Boolean
Default: True
Description: Enable automatic failover to alternative providers
Recommendation: True for high-availability applications
enable_data_fusion
Type: Boolean
Default: True
Description: Enable intelligent merging of multi-provider results
Recommendation: True for search operations
enable_query_enhancement
Type: Boolean
Default: True
Description: Enable automatic parameter completion from query text
Recommendation: True for AI agent integration
enable_intelligent_cache
Type: Boolean
Default: True
Description: Enable intent-aware cache TTL strategies
Recommendation: True for optimal performance
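The numeric ranges given in this section (cache_ttl 0-86400, default_timeout 5-300, max_retries 0-10) can be checked before a configuration dictionary is handed to the tool. The `validate_config` helper below is a hypothetical sketch; whether the tool performs the same checks internally is not stated here.

```python
# Ranges copied from the parameter details; the checker itself is hypothetical.
RANGES = {
    "cache_ttl": (0, 86400),
    "default_timeout": (5, 300),
    "max_retries": (0, 10),
}


def validate_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name not in config:
            continue  # missing keys are assumed to fall back to defaults
        value = config[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{name} must be an integer, got {value!r}")
        elif not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}-{high}")
    return problems


print(validate_config({"cache_ttl": 300, "default_timeout": 2}))
```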
3. API Credentials
3.1 FRED API Key
Obtaining the Key:
Visit https://fred.stlouisfed.org/docs/api/api_key.html
Register for a free account
Request an API key
Configuration:
# Method 1: Direct configuration
tool = APISourceTool({'fred_api_key': 'YOUR_FRED_KEY'})
# Method 2: Environment variable
export APISOURCE_FRED_API_KEY="YOUR_FRED_KEY"
# Method 3: Configuration file
{
    "fred_api_key": "YOUR_FRED_KEY"
}
Rate Limits:
Free tier: 120 requests per minute
No daily limit
3.2 News API Key
Obtaining the Key:
Visit https://newsapi.org/register
Choose a plan (Free tier available)
Get your API key
Configuration:
tool = APISourceTool({'newsapi_api_key': 'YOUR_NEWS_KEY'})
Rate Limits:
Free tier: 100 requests per day
Developer tier: 250 requests per day
Business tier: 250,000 requests per day
3.3 The Guardian API Key
Obtaining the Key:
Visit https://open-platform.theguardian.com/access/
Register for a free account
Request an API key
Configuration:
# Method 1: Direct configuration
tool = APISourceTool({'guardian_api_key': 'YOUR_GUARDIAN_KEY'})
# Method 2: Environment variable
export GUARDIAN_API_KEY="YOUR_GUARDIAN_KEY"
# Method 3: Configuration file
{
    "guardian_api_key": "YOUR_GUARDIAN_KEY"
}
Rate Limits:
Free tier: 5,000 requests per day
Developer tier: 15,000 requests per day
Higher tiers available for commercial use
Important API Rules:
API Key Required: All API requests require an API key
Rate Limiting: Free tier allows 5,000 requests per day
Attribution: Must acknowledge The Guardian when displaying content
Data Freshness: Content is updated in real-time
Commercial Use: Contact The Guardian for commercial licensing
API Documentation:
API Documentation: https://open-platform.theguardian.com/documentation/
Content API: https://open-platform.theguardian.com/documentation/search
Tags API: https://open-platform.theguardian.com/documentation/tag
Sections API: https://open-platform.theguardian.com/documentation/section
Available Operations:
search_content: Search all Guardian content with advanced filtering
get_item: Get a specific content item by ID
get_tags: Get all tags or filter by type
search_tags: Search for tags by query
get_sections: Get all Guardian sections
get_edition: Get content for a specific edition (UK, US, AU, International)
Example Usage:
# Search for articles about climate change
result = tool.query(
    provider='guardian',
    operation='search_content',
    params={
        'q': 'climate change',
        'section': 'environment',
        'page_size': 10,
        'show_fields': 'headline,body,thumbnail'
    }
)
# Get all sections
result = tool.query(
    provider='guardian',
    operation='get_sections',
    params={}
)
# Search for tags
result = tool.query(
    provider='guardian',
    operation='search_tags',
    params={'q': 'technology', 'page_size': 10}
)
# Get US edition content
result = tool.query(
    provider='guardian',
    operation='get_edition',
    params={'edition': 'us', 'page_size': 20}
)
3.4 Census Bureau API Key
Obtaining the Key:
Visit https://api.census.gov/data/key_signup.html
Fill out the request form
Receive key via email
Configuration:
tool = APISourceTool({'census_api_key': 'YOUR_CENSUS_KEY'})
Rate Limits:
500 requests per IP per day (without key)
Higher limits with API key
3.4 Congress.gov API
API Key Required:
Visit https://api.congress.gov/sign-up/
Fill out the registration form
Receive key via email
Configuration:
tool = APISourceTool({'congress_api_key': 'YOUR_CONGRESS_KEY'})
Rate Limits:
Reasonable usage limits with API key
Data updated regularly from official sources
Available Operations:
search_bills: Search for bills and resolutions
get_bill: Get detailed bill information
list_members: List members of Congress
get_member: Get member details
list_committees: List congressional committees
get_committee: Get committee details
search_amendments: Search for amendments
get_amendment: Get amendment details
3.5 OpenStates API
API Key Required:
config = {
    'openstates_api_key': 'YOUR_API_KEY'
}
tool = APISourceTool(config)
Configuration:
config = {
    'openstates_api_key': 'YOUR_API_KEY',
    'openstates_config': {
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)
Environment Variables:
export OPENSTATES_API_KEY="your_api_key_here"
export OPENSTATES_TIMEOUT=30
export OPENSTATES_RATE_LIMIT=10
export OPENSTATES_MAX_BURST=20
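To show how these variables line up with the dictionary keys used above, here is a small sketch that builds the equivalent configuration dictionary by hand. The `openstates_config_from_env` helper is hypothetical; the tool may already perform this mapping on its own.

```python
import os


def openstates_config_from_env(environ=os.environ):
    """Build a config dict from OPENSTATES_* variables, skipping unset ones."""
    config = {}
    api_key = environ.get("OPENSTATES_API_KEY")
    if api_key:
        config["openstates_api_key"] = api_key

    provider = {}
    # Environment values are always strings, so convert them explicitly.
    for var, key, cast in [
        ("OPENSTATES_TIMEOUT", "timeout", int),
        ("OPENSTATES_RATE_LIMIT", "rate_limit", float),
        ("OPENSTATES_MAX_BURST", "max_burst", int),
    ]:
        raw = environ.get(var)
        if raw is not None:
            provider[key] = cast(raw)
    if provider:
        config["openstates_config"] = provider
    return config


example_env = {
    "OPENSTATES_API_KEY": "your_api_key_here",
    "OPENSTATES_TIMEOUT": "30",
    "OPENSTATES_RATE_LIMIT": "10",
    "OPENSTATES_MAX_BURST": "20",
}
print(openstates_config_from_env(example_env))
```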
Obtaining an API Key:
Visit https://openstates.org/accounts/profile/
Register for a free account
Generate an API key from your profile
Copy the API key to your configuration
Rate Limits:
Free tier: Reasonable usage limits
Be respectful of the free service
Recommended: Max 10 requests per second
Important API Rules:
API Key Required: Must register for a free API key
Rate Limiting: Be respectful - implement reasonable delays between requests
Attribution: Acknowledge OpenStates.org when using the data
Data Freshness: Data is updated regularly from official state sources
API Documentation:
API v3 Documentation: https://docs.openstates.org/api-v3/
Interactive API Docs: https://v3.openstates.org/docs/
About OpenStates: https://openstates.org/about/
Available Operations:
search_bills: Search for state bills and resolutions with advanced filtering
get_bill: Get detailed information about a specific bill by ID
search_people: Search for state legislators with filtering options
get_person: Get detailed information about a specific legislator
list_jurisdictions: List all available state jurisdictions
get_jurisdiction: Get detailed information about a specific jurisdiction
Example Usage:
# Search for bills in California
result = tool.query(
    provider='openstates',
    operation='search_bills',
    params={'jurisdiction': 'CA', 'session': '2023', 'per_page': 10}
)
# Get current legislators from Texas
result = tool.query(
    provider='openstates',
    operation='search_people',
    params={'jurisdiction': 'TX', 'current': True, 'per_page': 10}
)
# List all available jurisdictions
result = tool.query(
    provider='openstates',
    operation='list_jurisdictions',
    params={'per_page': 52}
)
3.6 World Bank API
No API Key Required:
# World Bank API is publicly accessible
tool = APISourceTool() # No key needed for World Bank
Rate Limits:
No official rate limit
Recommended: Max 10 requests per second
3.7 Alpha Vantage API Key
Obtaining the Key:
Visit https://www.alphavantage.co/support/#api-key
Register for a free account
Get your API key
Configuration:
tool = APISourceTool({'alphavantage_api_key': 'YOUR_ALPHAVANTAGE_KEY'})
Rate Limits:
Free tier: 5 API requests per minute, 500 per day
Premium tiers available with higher limits
3.8 REST Countries API
No API Key Required:
# REST Countries API is publicly accessible
tool = APISourceTool() # No key needed for REST Countries
Rate Limits:
No official rate limit
Recommended: Max 10 requests per second
3.9 ExchangeRate-API
No API Key Required (Free Tier):
# ExchangeRate-API free tier works without key
tool = APISourceTool() # No key needed for free tier
Optional API Key for Enhanced Features:
tool = APISourceTool({'exchangerate_api_key': 'YOUR_EXCHANGERATE_KEY'})
Rate Limits:
Free tier: 1,500 requests per month
Standard tier: Higher limits with API key
3.10 Open Library API
No API Key Required:
# Open Library API is completely free and open
tool = APISourceTool() # No key needed for Open Library
Rate Limits:
No official rate limit
Recommended: Max 10 requests per second
Be respectful of the free service
3.11 Metropolitan Museum of Art (The Met) API
No API Key Required:
# The Met Museum API is completely free and open
tool = APISourceTool() # No key needed for Met Museum
Configuration (Optional):
config = {
    'metmuseum_config': {
        'timeout': 30,
        'rate_limit': 10,  # Requests per second
        'max_burst': 20    # Maximum burst size
    }
}
tool = APISourceTool(config)
Environment Variables:
export METMUSEUM_TIMEOUT=30
export METMUSEUM_RATE_LIMIT=10
export METMUSEUM_MAX_BURST=20
Rate Limits:
No official rate limit
Recommended: Max 10 requests per second
Be respectful of the free service
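The rate_limit and max_burst settings read naturally as the two parameters of a token bucket. The sketch below shows that interpretation; it is an assumption about how the values interact, and `TokenBucket` is a hypothetical class rather than the tool's limiter.

```python
import time


class TokenBucket:
    """Allow bursts of up to max_burst requests, refilled at rate_limit per second."""

    def __init__(self, rate_limit, max_burst, clock=time.monotonic):
        self.rate = rate_limit            # tokens added per second
        self.capacity = max_burst         # maximum number of stored tokens
        self.tokens = float(max_burst)    # start with a full bucket
        self._clock = clock
        self._last = clock()

    def try_acquire(self):
        now = self._clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self._last) * self.rate)
        self._last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Demonstration with a fake clock.
now = [0.0]
bucket = TokenBucket(rate_limit=10, max_burst=20, clock=lambda: now[0])
burst = sum(bucket.try_acquire() for _ in range(25))     # only the first 20 pass
now[0] += 0.5                                            # half a second refills 5 tokens
refilled = sum(bucket.try_acquire() for _ in range(10))  # 5 of these 10 pass
```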
Important API Rules:
No API Key Required: Completely free and open access
Rate Limiting: Be respectful - implement reasonable delays between requests
Data Coverage: Access to 470,000+ artworks from The Met collection
Images: Many objects include high-resolution images (check isPublicDomain flag)
Attribution: Acknowledge The Metropolitan Museum of Art when using the data
API Documentation:
API Documentation: https://metmuseum.github.io/
GitHub Repository: https://github.com/metmuseum/openaccess
Open Access Initiative: https://www.metmuseum.org/about-the-met/policies-and-documents/open-access
Supported Operations:
search_objects - Search for art objects with advanced filtering (query, department, date range, etc.)
get_object - Get detailed information about a specific art object by ID
get_departments - Get list of all departments at The Met
get_objects_by_department - Get all objects in a specific department
search_by_artist - Search for artworks by artist name
search_by_medium - Search for artworks by medium (Paintings, Sculpture, etc.)
search_by_culture - Search for artworks by culture or civilization
search_highlight_objects - Search for highlighted/featured objects
download_image - Download high-resolution images from The Met collection
Example Usage:
# Search for artworks by Van Gogh
result = tool.query(
    provider='metmuseum',
    operation='search_by_artist',
    params={'artist_name': 'Vincent van Gogh', 'has_images': True, 'limit': 10}
)
# Get detailed object information
result = tool.query(
    provider='metmuseum',
    operation='get_object',
    params={'object_id': 436535}  # Wheat Field with Cypresses
)
# Search for Egyptian art
result = tool.query(
    provider='metmuseum',
    operation='search_by_culture',
    params={'culture': 'Egyptian', 'has_images': True, 'limit': 20}
)
# Get all departments
result = tool.query(
    provider='metmuseum',
    operation='get_departments',
    params={}
)
# Search with date range filter
result = tool.query(
    provider='metmuseum',
    operation='search_objects',
    params={
        'q': 'impressionism',
        'has_images': True,
        'date_begin': 1860,
        'date_end': 1900,
        'limit': 15
    }
)
# Download image by object ID
result = tool.query(
    provider='metmuseum',
    operation='download_image',
    params={'object_id': 436535}  # Downloads primary image
)
print(f"Image saved to: {result['data']['output_path']}")
# Download image by direct URL
result = tool.query(
    provider='metmuseum',
    operation='download_image',
    params={
        'image_url': 'https://images.metmuseum.org/CRDImages/ep/original/DP-42549-001.jpg',
        'output_path': './artwork.jpg'  # Optional custom path
    }
)
Data Fields Available:
Object metadata: title, artist, date, medium, dimensions
Department and classification information
Geographic and cultural origin
High-resolution images (primaryImage, additionalImages)
Exhibition history and provenance
Related artworks and references
Public domain status (isPublicDomain)
Gallery information (isOnView, GalleryNumber)
3.12 CoinGecko API
No API Key Required:
# CoinGecko API is free for basic usage
tool = APISourceTool() # No key needed for free tier
Rate Limits:
Free tier: 10-50 calls/minute (varies by endpoint)
Pro tier available with API key for higher limits
3.12 OpenWeatherMap API
Obtaining the Key:
Visit https://openweathermap.org/api
Sign up for a free account
Generate an API key from your account dashboard
Configuration:
tool = APISourceTool({'openweathermap_api_key': 'YOUR_OPENWEATHERMAP_KEY'})
Rate Limits:
Free tier: 60 calls/minute, 1,000,000 calls/month
Various paid tiers available
3.13 Wikipedia API
No API Key Required:
# Wikipedia API is completely free and open
tool = APISourceTool() # No key needed for Wikipedia
Configuration with User-Agent (REQUIRED):
config = {
    'wikipedia_config': {
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
    }
}
tool = APISourceTool(config)
Rate Limits:
Maximum: 200 requests per second
Recommended: 10 requests per second (default in configuration)
Be respectful of the free service
API Rules (https://www.mediawiki.org/wiki/API:Etiquette):
User-Agent Header REQUIRED: Must include a unique User-Agent header with:
Application name and version
Contact URL or email address
Format:
"AppName/Version (URL; contact@email.com)"
Rate Limiting: Limit to 200 requests/second maximum
Caching: Cache responses when possible to reduce load
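The User-Agent format described above can be assembled with a small helper so that no required part is forgotten. `build_user_agent` is a hypothetical function for illustration, and the application name, URL, and email are placeholders.

```python
def build_user_agent(app_name, version, url, contact_email):
    """Return a string in the form: AppName/Version (URL; contact@email.com)"""
    parts = [
        ("app_name", app_name),
        ("version", version),
        ("url", url),
        ("contact_email", contact_email),
    ]
    for label, value in parts:
        # Every part is required by the format, so refuse empty values.
        if not value or not str(value).strip():
            raise ValueError(f"{label} must not be empty")
    return f"{app_name}/{version} ({url}; {contact_email})"


user_agent = build_user_agent(
    "MyApp", "1.0", "https://example.com/myapp", "contact@example.com"
)
config = {"wikipedia_config": {"user_agent": user_agent}}
print(user_agent)
```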
API Documentation:
MediaWiki Action API: https://www.mediawiki.org/wiki/API:Main_page
REST API: https://en.wikipedia.org/api/rest_v1/
API Etiquette: https://www.mediawiki.org/wiki/API:Etiquette
3.14 GitHub API
API Key Recommended:
config = {
    'github_api_key': 'YOUR_GITHUB_TOKEN'
}
tool = APISourceTool(config)
Environment Variable:
export GITHUB_API_KEY="your_github_personal_access_token"
Rate Limits:
Authenticated: 5,000 requests per hour
Unauthenticated: 60 requests per hour
Strongly recommended to use authentication for higher limits
Obtaining an API Key:
Visit https://github.com/settings/tokens
Click “Generate new token” → “Generate new token (classic)”
Select scopes based on your needs:
public_repo - Access public repositories
repo - Full control of private repositories (if needed)
user - Read user profile data
Generate and copy the token
API Documentation:
REST API: https://docs.github.com/en/rest
Authentication: https://docs.github.com/en/rest/authentication
Rate Limiting: https://docs.github.com/en/rest/rate-limit
3.13 arXiv API
No API Key Required:
# arXiv API is completely free and open
tool = APISourceTool() # No key needed for arXiv
Configuration (Optional):
config = {
    'arxiv_config': {
        'timeout': 30,
        'rate_limit': 0.33,  # ~3 second delays between requests (1/3 req/s)
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
    }
}
tool = APISourceTool(config)
Important API Rules:
Rate Limiting: Be respectful - implement 3 second delays between requests
Max Results: Limited to 30,000 results in slices of at most 2,000 at a time
Caching: Cache responses when possible to reduce server load
User-Agent: Set a descriptive User-Agent header
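The result limits above imply that a large result set has to be fetched in slices. The sketch below only plans the slices; `arxiv_slices` is a hypothetical helper, and the pair it yields corresponds to the `start` and `max_results` query parameters of the arXiv API. How the tool itself pages through results is not described here.

```python
def arxiv_slices(total_wanted, slice_size=2000, hard_cap=30000):
    """Yield (start, max_results) pairs that respect the documented limits."""
    if slice_size < 1:
        raise ValueError("slice_size must be at least 1")
    remaining = min(total_wanted, hard_cap)   # never plan past 30,000 results
    start = 0
    while remaining > 0:
        size = min(slice_size, 2000, remaining)  # at most 2,000 per slice
        yield start, size
        start += size
        remaining -= size


plan = list(arxiv_slices(4500))
print(plan)
# A caller would pause about 3 seconds between slices, e.g. with time.sleep(3).
```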
API Documentation:
API User Manual: https://info.arxiv.org/help/api/user-manual.html
API Basics: https://info.arxiv.org/help/api/basics.html
arXiv Categories: https://arxiv.org/category_taxonomy
3.14 PubMed/NCBI E-utilities API
API Key Optional but Recommended:
# Works without API key (3 requests/second limit)
tool = APISourceTool()
# With API key (10 requests/second limit)
config = {
    'pubmed_api_key': 'YOUR_PUBMED_API_KEY'
}
tool = APISourceTool(config)
Environment Variable:
export PUBMED_API_KEY="your_ncbi_api_key"
Configuration (Optional):
config = {
    'pubmed_config': {
        'api_key': 'YOUR_API_KEY',  # Optional but recommended
        'timeout': 30,
        'rate_limit': 3,  # 3 req/s without key, 10 with key
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
    }
}
tool = APISourceTool(config)
Rate Limits:
Without API key: 3 requests per second
With API key: 10 requests per second
API key strongly recommended for better service
Obtaining an API Key:
Visit https://www.ncbi.nlm.nih.gov/account/
Register for a free NCBI account
Go to Settings → API Key Management
Generate a new API key
Important API Rules:
Rate Limiting: Max 3 requests/second without API key, 10 with API key
User-Agent: Set a descriptive User-Agent header with email
Caching: Cache responses when possible to reduce server load
API Key: Recommended for higher rate limits and better service
API Documentation:
E-utilities Quick Start: https://www.ncbi.nlm.nih.gov/books/NBK25500/
E-utilities API Guide: https://www.ncbi.nlm.nih.gov/books/NBK25501/
PubMed Help: https://pubmed.ncbi.nlm.nih.gov/help/
Supported Operations:
search_papers: Search for papers by query string
get_paper_by_id: Get paper metadata by PubMed ID (PMID)
search_by_author: Search for papers by author name
get_paper_details: Get detailed paper information including abstract and citations
3.15 CrossRef API
No API Key Required:
# CrossRef API is completely free and open
tool = APISourceTool() # No key needed for CrossRef
Configuration (Optional):
config = {
    'crossref_config': {
        'mailto': 'your-email@example.com',  # For polite pool access (better rate limits)
        'timeout': 30,
        'rate_limit': 10,
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
tool = APISourceTool(config)
Environment Variable:
export CROSSREF_MAILTO="your-email@example.com"
Important API Rules:
Rate Limiting: Use polite pool (include mailto parameter) for better rate limits
User-Agent: Set a descriptive User-Agent header
Caching: Cache responses when possible to reduce server load
Attribution: Acknowledge CrossRef when using the data
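For readers curious what the polite-pool rule looks like on the wire, the sketch below builds a CrossRef works query URL that carries the mailto parameter. `crossref_works_url` is a hypothetical helper and only constructs the URL; it is not how the tool is known to issue requests.

```python
from urllib.parse import urlencode


def crossref_works_url(query, mailto, rows=5):
    """Build a works search URL that identifies the caller via mailto."""
    params = {"query": query, "rows": rows, "mailto": mailto}
    # urlencode percent-encodes reserved characters such as '@'.
    return "https://api.crossref.org/works?" + urlencode(params)


url = crossref_works_url("open access", "your-email@example.com")
print(url)
```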
API Documentation:
REST API Documentation: https://www.crossref.org/documentation/retrieve-metadata/rest-api/
API Etiquette: https://github.com/CrossRef/rest-api-doc#etiquette
Metadata Plus: https://www.crossref.org/services/metadata-delivery/
Supported Operations:
get_work_by_doi: Get metadata for a work by its DOI
search_works: Search for works by query string
get_journal_works: Get works published in a specific journal by ISSN
search_funders: Search for funders in the Open Funder Registry
get_funder_works: Get works associated with a specific funder
3.16 Semantic Scholar API
No API Key Required:
# Semantic Scholar API is completely free and open
tool = APISourceTool() # No key needed for Semantic Scholar
Configuration (Optional):
config = {
    'semanticscholar_config': {
        'timeout': 30,
        'rate_limit': 1,  # Requests per second (recommended for sustained use)
        'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
    }
}
tool = APISourceTool(config)
Environment Variables:
export SEMANTICSCHOLAR_TIMEOUT=30
export SEMANTICSCHOLAR_RATE_LIMIT=1
export SEMANTICSCHOLAR_MAX_BURST=5
Rate Limits:
Free tier: 1 request per second recommended (100 requests per 5 minutes)
Higher limits available upon request
Important API Rules:
Rate Limiting: Recommended 1 request per second for sustained use
Max Results: Limited to 100 results per request for search, use pagination for more
Caching: Cache responses when possible to reduce server load
User-Agent: Set a descriptive User-Agent header
API Documentation:
API Documentation: https://api.semanticscholar.org/api-docs/
Academic Graph API: https://www.semanticscholar.org/product/api
API Tutorial: https://www.semanticscholar.org/product/api/tutorial
Supported Operations:
search_papers: Search for papers by query string
get_paper: Get paper details by ID (S2 ID, DOI, arXiv ID, etc.)
get_paper_authors: Get authors of a specific paper
get_paper_citations: Get papers that cite this paper
get_paper_references: Get papers referenced by this paper
get_author: Get author details by ID
get_author_papers: Get papers by a specific author
3.17 CORE API Key
Obtaining the Key:
Visit https://core.ac.uk/services/api
Register for a free account
Request an API key from your account dashboard
Configuration:
# Method 1: Direct configuration
tool = APISourceTool({'core_api_key': 'YOUR_CORE_KEY'})
# Method 2: Environment variable
export CORE_API_KEY="YOUR_CORE_KEY"
# Method 3: Configuration file
{
    "core_api_key": "YOUR_CORE_KEY"
}
Rate Limits:
Free tier: Reasonable usage with rate limiting
Contact CORE for higher limits if needed
Features:
Access to millions of open access research papers
Search by query, DOI, or title
Full metadata including authors, abstract, citations
Support for pagination
3.18 USPTO API Key
Obtaining the Key:
Visit https://developer.uspto.gov/
Register for a free developer account
Request an API key from your account dashboard
Configuration:
# Method 1: Direct configuration
tool = APISourceTool({'uspto_api_key': 'YOUR_USPTO_KEY'})
# Method 2: Environment variable
export USPTO_API_KEY="YOUR_USPTO_KEY"
# Method 3: Configuration file
{
    "uspto_api_key": "YOUR_USPTO_KEY"
}
Rate Limits:
Free tier: Reasonable usage with rate limiting
Contact USPTO for higher limits if needed
Features:
Search US patents by query, inventor, or assignee
Get detailed patent information by patent number
Access to comprehensive US patent database
Full metadata including inventors, assignees, classifications, citations
3.19 SEC EDGAR API
No API Key Required:
# SEC EDGAR API is publicly accessible
# User-Agent header is REQUIRED
config = {
    'secedgar_config': {
        'user_agent': 'YourCompanyName contact@example.com'
    }
}
tool = APISourceTool(config)
Environment Variable:
export SECEDGAR_USER_AGENT="YourCompanyName contact@example.com"
Configuration with User-Agent (REQUIRED):
config = {
    'secedgar_config': {
        'user_agent': 'AIECS-APISource contact@example.com',
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)
Rate Limits:
Maximum: 10 requests per second
SEC may block access if rules are not followed
Be respectful of the free service
API Rules (https://www.sec.gov/os/accessing-edgar-data):
User-Agent Header REQUIRED: Must include:
Company or individual name
Contact email address
Format:
"CompanyName contact@email.com"
Rate Limiting: Limit to 10 requests per second maximum
Caching: Cache responses when possible to reduce load
Fair Access: SEC monitors usage and may block non-compliant access
API Documentation:
API Overview: https://www.sec.gov/search-filings/edgar-application-programming-interfaces
Accessing EDGAR Data: https://www.sec.gov/os/accessing-edgar-data
Data Sets: https://www.sec.gov/data-research/sec-markets-data
Features:
Company submissions and filing history
XBRL financial data and concepts
Company facts across all filings
No API key required - completely free
Supported Operations:
get_company_submissions - Get company filing history by CIK
get_company_concept - Get XBRL concept data for specific metrics
get_company_facts - Get all XBRL facts for a company
Example CIKs:
Apple Inc.: 0000320193
Tesla Inc.: 0001318605
Microsoft Corp.: 0000789019
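The example CIKs are written as 10-digit, zero-padded strings. A minimal sketch of that normalization, together with the required User-Agent header, follows. Both helper names are hypothetical, and the URL is the SEC's public submissions endpoint rather than anything confirmed about the tool's internals.

```python
def normalize_cik(cik):
    """Return the CIK as a 10-digit, zero-padded string."""
    digits = str(cik).strip()
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


def submissions_request(cik, user_agent):
    """Describe a request for a company's filing history (no network call)."""
    return {
        "url": f"https://data.sec.gov/submissions/CIK{normalize_cik(cik)}.json",
        "headers": {"User-Agent": user_agent},  # required by the SEC
    }


request = submissions_request(320193, "YourCompanyName contact@example.com")
print(request["url"])
```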
3.20 Stack Exchange API
API Key Optional (Recommended):
# Stack Exchange API works without key but has lower rate limits
# API key strongly recommended for production use
config = {
    'stackexchange_config': {
        'api_key': 'YOUR_STACKEXCHANGE_API_KEY'
    }
}
tool = APISourceTool(config)
Environment Variable:
export STACKEXCHANGE_API_KEY="your_api_key_here"
Get Your API Key:
Visit https://stackapps.com/apps/oauth/register
Register your application
Copy your API key
Configuration:
config = {
    'stackexchange_config': {
        'api_key': 'YOUR_API_KEY',  # Optional but recommended
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)
Rate Limits:
Without API key: 300 requests per day
With API key: 10,000 requests per day
Respect the backoff field in API responses
API Rules (https://api.stackexchange.com/docs/throttle):
API Key Recommended: Increases daily quota from 300 to 10,000 requests
Backoff: Respect the backoff field in responses when present
Compression: API returns gzip compressed responses by default
Attribution: Required when displaying Stack Exchange content
Fair Use: Follow the API terms of service
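Respecting the backoff field means pausing for the indicated number of seconds before sending another request. The sketch below assumes the response has already been parsed into a dictionary; `wait_for_backoff` is a hypothetical helper, and the tool may handle this for you.

```python
import time


def wait_for_backoff(response, sleep=time.sleep):
    """Pause for response['backoff'] seconds if the field is present.

    Returns the number of seconds waited (0 when no backoff was requested).
    """
    seconds = response.get("backoff")
    if seconds:
        sleep(seconds)
        return seconds
    return 0


# Demonstration: record the requested pause instead of actually sleeping.
slept = []
waited = wait_for_backoff({"items": [], "backoff": 10}, sleep=slept.append)
skipped = wait_for_backoff({"items": []}, sleep=slept.append)
```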
API Documentation:
API Documentation: https://api.stackexchange.com/docs
Authentication: https://api.stackexchange.com/docs/authentication
Throttling: https://api.stackexchange.com/docs/throttle
Features:
Search questions across Stack Exchange network
Get detailed question and answer data
Search for users and their profiles
Browse tags and their statistics
Access all Stack Exchange sites (Stack Overflow, Server Fault, Super User, etc.)
Rich metadata including votes, views, acceptance status, and bounties
Supported Operations:
search_questions - Search for questions by query and tags
get_question - Get detailed information about a specific question
get_answers - Get answers for a specific question
search_users - Search for users by name
get_tags - Get tags and their statistics
get_sites - Get all sites in the Stack Exchange network
Popular Sites:
Stack Overflow: stackoverflow
Server Fault: serverfault
Super User: superuser
Ask Ubuntu: askubuntu
Mathematics: math
3.21 OpenCorporates API
API Key Required:
config = {
    'opencorporates_api_key': 'YOUR_OPENCORPORATES_API_KEY'
}
tool = APISourceTool(config)
Environment Variable:
export OPENCORPORATES_API_KEY="your_opencorporates_api_key"
Rate Limits:
Free tier: 200 requests per month, 50 requests per day
Open data projects: Free with share-alike attribution
Paid plans: Available for commercial use without restrictions
Obtaining an API Key:
Visit https://opencorporates.com/api_accounts/new
Register for a free account
Choose your plan (free for open data projects)
Get your API key from the dashboard
Features:
Search for companies by name across 140+ jurisdictions
Get detailed company information by jurisdiction and company number
Search for company officers (directors, agents)
Access company filings and statutory documents
Get jurisdiction information
Access to 200+ million companies worldwide
API Documentation:
API Reference: https://api.opencorporates.com/documentation/API-Reference
API Accounts: https://opencorporates.com/api_accounts/new
About OpenCorporates: https://opencorporates.com/info/about
3.22 CourtListener (Free Law Project) API
API Key Required:
config = {
    'courtlistener_api_key': 'YOUR_COURTLISTENER_API_KEY'
}
tool = APISourceTool(config)
Environment Variable:
export COURTLISTENER_API_KEY="your_courtlistener_api_key"
Configuration (Optional):
config = {
    'courtlistener_api_key': 'YOUR_API_KEY',
    'courtlistener_config': {
        'timeout': 30,
        'rate_limit': 10,
        'max_burst': 20
    }
}
tool = APISourceTool(config)
Rate Limits:
Free tier: 5,000 requests per hour for authenticated users
Higher limits available upon request
Be respectful of the free service
Obtaining an API Key:
Visit https://www.courtlistener.com/sign-in/register/
Register for a free account
Go to your profile settings
Generate an API key
Copy and store the API key securely
Features:
Search legal opinions and case law from federal and state courts
Access court dockets and case filings (RECAP archive)
Search judges and judicial information
Access oral argument audio recordings
Explore legal citations and citation networks
Search court information
Access to millions of legal opinions and PACER data
Full metadata including case names, judges, courts, dates, citations
Supported Operations:
search_opinions - Search for legal opinions and case law with advanced filtering
get_opinion - Get detailed information about a specific legal opinion
search_dockets - Search for court dockets and case filings
get_docket - Get detailed information about a specific docket
search_judges - Search for judges and judicial information
get_judge - Get detailed information about a specific judge
search_oral_arguments - Search for oral argument audio recordings
get_oral_argument - Get detailed information about a specific oral argument
search_citations - Search for legal citations and citation networks
get_citation - Get detailed information about a specific citation
search_courts - Search for court information
get_court - Get detailed information about a specific court
Important API Rules:
API Key Required: Must register for a free API key
Rate Limiting: Default is 5,000 requests per hour for authenticated users
Attribution: Acknowledge Free Law Project when using the data
Data Freshness: Data is updated regularly from court sources and PACER
Fair Use: Follow the API terms of service
API Documentation:
REST API Documentation: https://www.courtlistener.com/help/api/rest/
Interactive API Docs: https://www.courtlistener.com/api/rest-info/
About Free Law Project: https://free.law/
Coverage Information: https://www.courtlistener.com/coverage/
Example Usage:
# Search for opinions about constitutional law
result = tool.query(
    provider='courtlistener',
    operation='search_opinions',
    params={'q': 'first amendment', 'court': 'scotus', 'page_size': 10}
)
# Search for dockets in a specific court
result = tool.query(
    provider='courtlistener',
    operation='search_dockets',
    params={'court': 'dcd', 'docket_number': '20-cv', 'page_size': 5}
)
# Search for judges
result = tool.query(
    provider='courtlistener',
    operation='search_judges',
    params={'name': 'Sotomayor', 'court': 'scotus'}
)
# Search for oral arguments
result = tool.query(
    provider='courtlistener',
    operation='search_oral_arguments',
    params={'court': 'scotus', 'case_name': 'Brown', 'page_size': 5}
)
# Get court information
result = tool.query(
    provider='courtlistener',
    operation='get_court',
    params={'court_id': 'scotus'}
)
Popular Court IDs:
Supreme Court: scotus
9th Circuit Court of Appeals: ca9
2nd Circuit Court of Appeals: ca2
D.C. District Court: dcd
Southern District of New York: nysd
Northern District of California: cand
3.23 GBIF (Global Biodiversity Information Facility) API
No API Key Required:
# GBIF API is completely free and open
tool = APISourceTool() # No key needed for GBIF
Configuration (Optional):
config = {
'gbif_config': {
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
}
}
tool = APISourceTool(config)
Environment Variables:
export GBIF_TIMEOUT=30
export GBIF_RATE_LIMIT=10
export GBIF_MAX_BURST=20
export GBIF_USER_AGENT="AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)"
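How these variables map onto gbif_config is not spelled out here. The following is a minimal sketch that assumes a one-to-one mapping and the default values shown above; load_gbif_config is a hypothetical helper, not part of APISourceTool.

```python
import os

def load_gbif_config(env=None):
    """Build a gbif_config dict from GBIF_* variables (hypothetical helper)."""
    env = os.environ if env is None else env
    config = {
        'timeout': int(env.get('GBIF_TIMEOUT', '30')),
        'rate_limit': float(env.get('GBIF_RATE_LIMIT', '10')),
        'max_burst': int(env.get('GBIF_MAX_BURST', '20')),
    }
    # Only set a User-Agent when one was provided explicitly.
    user_agent = env.get('GBIF_USER_AGENT')
    if user_agent:
        config['user_agent'] = user_agent
    return {'gbif_config': config}

# A plain dict stands in for os.environ here.
config = load_gbif_config({'GBIF_TIMEOUT': '45', 'GBIF_RATE_LIMIT': '5'})
```

Unset variables fall back to the defaults, so an empty environment yields the same values as the optional configuration above (minus the User-Agent).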
Rate Limits:
No official rate limit
Recommended: Max 10 requests per second
Be respectful of the free service
Important API Rules:
No API Key Required: Completely free and open access
Rate Limiting: Be respectful - implement reasonable delays between requests
Data Coverage: Access to 2+ billion species occurrence records
Attribution: Acknowledge GBIF when using the data
Fair Use: Do not abuse the free service with excessive requests
API Documentation:
API Reference: https://techdocs.gbif.org/en/openapi/
Species API: https://techdocs.gbif.org/en/openapi/v1/species
Occurrence API: https://techdocs.gbif.org/en/openapi/v1/occurrence
Dataset API: https://techdocs.gbif.org/en/openapi/v1/dataset
About GBIF: https://www.gbif.org/what-is-gbif
Features:
Search for species by name or taxonomic criteria
Match scientific names to GBIF’s taxonomic backbone
Search occurrence records with geographic and temporal filters
Access dataset metadata and publishing information
Get vernacular (common) names in multiple languages
Explore taxonomic hierarchies and relationships
Access to 2+ billion biodiversity occurrence records
Rich metadata including coordinates, dates, basis of record
Supported Operations:
search_species - Search for species by name or other criteria
get_species_by_key - Get detailed species information by GBIF key
match_species_name - Match a scientific name to GBIF taxonomy
search_occurrences - Search for species occurrence records
get_occurrence_by_key - Get detailed occurrence record by key
search_datasets - Search for datasets in GBIF
get_dataset_by_key - Get detailed dataset information by key
get_species_vernacular_names - Get common/vernacular names for a species
get_species_children - Get direct children taxa of a species
get_species_parents - Get parent taxa hierarchy for a species
get_occurrence_count - Get count of occurrence records matching criteria
search_organizations - Search for publishing organizations
Example Usage:
# Search for species
result = tool.query(
provider='gbif',
operation='search_species',
params={'q': 'Panthera leo', 'rank': 'SPECIES', 'limit': 10}
)
# Match a scientific name
result = tool.query(
provider='gbif',
operation='match_species_name',
params={'name': 'Panthera leo', 'kingdom': 'Animalia'}
)
# Search for occurrence records
result = tool.query(
provider='gbif',
operation='search_occurrences',
params={
'taxonKey': 5219404, # Panthera leo
'country': 'KE', # Kenya
'year': '2020',
'limit': 50
}
)
# Get occurrence count
result = tool.query(
provider='gbif',
operation='get_occurrence_count',
params={'country': 'US', 'year': '2020'}
)
# Get vernacular names
result = tool.query(
provider='gbif',
operation='get_species_vernacular_names',
params={'key': 5219404} # Panthera leo
)
# Search datasets
result = tool.query(
provider='gbif',
operation='search_datasets',
params={'q': 'birds', 'type': 'OCCURRENCE', 'limit': 10}
)
# Get species details
result = tool.query(
provider='gbif',
operation='get_species_by_key',
params={'key': 5219404} # Panthera leo
)
# Get taxonomic children
result = tool.query(
provider='gbif',
operation='get_species_children',
params={'key': 5219404, 'limit': 20}
)
# Search organizations
result = tool.query(
provider='gbif',
operation='search_organizations',
params={'country': 'US', 'limit': 10}
)
Data Fields Available:
Species metadata: scientific name, rank, kingdom, phylum, class, order, family, genus
Occurrence data: coordinates, date, basis of record, dataset key
Dataset information: title, description, publishing organization, license
Vernacular names: common names in multiple languages
Taxonomic hierarchy: parent and child taxa
Geographic information: country, locality, coordinates
Temporal information: year, month, day, event date
Data quality: coordinate uncertainty, identification confidence
Common Taxonomic Ranks:
KINGDOM
PHYLUM
CLASS
ORDER
FAMILY
GENUS
SPECIES
SUBSPECIES
Common Basis of Record Values:
HUMAN_OBSERVATION
PRESERVED_SPECIMEN
FOSSIL_SPECIMEN
LIVING_SPECIMEN
MACHINE_OBSERVATION
MATERIAL_SAMPLE
OBSERVATION
OCCURRENCE
Use Cases:
Biodiversity research and analysis
Species distribution mapping
Conservation planning
Environmental impact assessments
Citizen science data exploration
Taxonomic research and validation
Dataset discovery and metadata retrieval
Geographic occurrence analysis
4. Performance Settings
4.1 Caching Configuration
config = {
# Basic caching
'cache_ttl': 300, # 5 minutes
# Intelligent caching (intent-aware TTL)
'enable_intelligent_cache': True,
# Cache backend (optional)
'cache_backend': 'redis', # 'memory' or 'redis'
'redis_url': 'redis://localhost:6379/0'
}
Intelligent Cache TTL Strategies:
Recent data queries: 60 seconds
Historical data: 3600 seconds (1 hour)
Metadata queries: 86400 seconds (24 hours)
Search queries: 300 seconds (5 minutes)
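The strategy list above amounts to a lookup from query intent to TTL. A minimal sketch, assuming the intent has already been classified; the intent labels and the ttl_for_intent helper are hypothetical names chosen for illustration.

```python
# TTLs in seconds, taken from the strategy list above.
INTENT_TTLS = {
    'recent': 60,        # recent data queries
    'historical': 3600,  # historical data
    'metadata': 86400,   # metadata queries
    'search': 300,       # search queries
}

def ttl_for_intent(intent, default_ttl=300):
    """Return the TTL for a query intent, falling back to the basic cache_ttl."""
    return INTENT_TTLS.get(intent, default_ttl)
```

An unrecognized intent falls back to default_ttl, which mirrors the basic cache_ttl setting.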
4.2 Timeout Configuration
config = {
# Global timeout
'default_timeout': 30,
# Provider-specific timeouts
'provider_timeouts': {
'fred': 15,
'worldbank': 45,
'newsapi': 20,
'census': 30
}
}
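One plausible reading of this configuration is that a provider-specific timeout overrides the global default. The sketch below shows that lookup; resolve_timeout is a hypothetical helper and the precedence is an assumption.

```python
def resolve_timeout(config, provider):
    """Provider-specific timeout if set, otherwise the global default."""
    overrides = config.get('provider_timeouts', {})
    return overrides.get(provider, config.get('default_timeout', 30))

config = {
    'default_timeout': 30,
    'provider_timeouts': {'fred': 15, 'worldbank': 45},
}
```

With this config, 'fred' resolves to 15 seconds while a provider without an override, such as 'gbif', resolves to 30.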
4.3 Retry Configuration
config = {
'max_retries': 3,
'retry_backoff_factor': 2.0, # Exponential backoff multiplier
'retry_jitter': True, # Add random jitter to prevent thundering herd
'retry_on_status_codes': [429, 500, 502, 503, 504]
}
Retry Delay Calculation:
delay = base_delay * (backoff_factor ** (attempt - 1)) + random_jitter
Example (base_delay = 1.0s, backoff_factor = 2.0, attempts numbered from 1):
Attempt 1: 1.0s + jitter
Attempt 2: 2.0s + jitter
Attempt 3: 4.0s + jitter
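A minimal sketch of this calculation, numbering attempts from 1 so that the first retry waits base_delay, which is what the example values imply. The base_delay default, the max_jitter bound, and the function name are assumptions for illustration.

```python
import random

def retry_delay(attempt, base_delay=1.0, backoff_factor=2.0,
                jitter=True, max_jitter=0.5):
    """Delay in seconds before retry number `attempt` (1-based)."""
    delay = base_delay * (backoff_factor ** (attempt - 1))
    if jitter:
        # Random jitter spreads out clients that failed at the same moment.
        delay += random.uniform(0, max_jitter)
    return delay

# Without jitter the delays double on each attempt.
delays = [retry_delay(n, jitter=False) for n in (1, 2, 3)]
```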
5. Feature Flags
5.1 Query Enhancement
config = {
'enable_query_enhancement': True,
'query_enhancement_config': {
'confidence_threshold': 0.5, # Min confidence for auto-enhancement
'max_enhancements': 5, # Max parameters to add
'preserve_explicit_params': True # Don't override user params
}
}
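To make the three settings concrete, here is a sketch of how they could interact when suggested parameters are merged into a query. The shape of the suggestions (name, value, confidence) and the apply_enhancements function are hypothetical; the actual enhancement logic is not described in this reference.

```python
def apply_enhancements(params, suggestions, confidence_threshold=0.5,
                       max_enhancements=5, preserve_explicit_params=True):
    """Merge suggested parameters into user params.

    `suggestions` is a list of (name, value, confidence) tuples,
    a hypothetical shape chosen for this sketch.
    """
    enhanced = dict(params)
    added = 0
    # Consider the most confident suggestions first.
    for name, value, confidence in sorted(suggestions, key=lambda s: -s[2]):
        if added >= max_enhancements:
            break
        if confidence < confidence_threshold:
            continue  # below the minimum confidence
        if preserve_explicit_params and name in params:
            continue  # never override what the user set explicitly
        enhanced[name] = value
        added += 1
    return enhanced

result = apply_enhancements(
    {'q': 'gdp', 'limit': 10},
    [('limit', 100, 0.9), ('frequency', 'a', 0.8), ('units', 'lin', 0.3)],
)
```

Here the explicit limit is kept, frequency is added, and units is dropped because its confidence is below the threshold.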
5.2 Fallback Strategy
config = {
'enable_fallback': True,
'fallback_config': {
'max_fallback_attempts': 2,
'fallback_timeout_multiplier': 1.5, # Increase timeout for fallback
'preserve_quality_threshold': 0.7 # Min quality for fallback result
}
}
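The sketch below illustrates one way these fallback settings could combine: fallbacks get a longer timeout, and a fallback result is accepted only if it meets the quality floor. The fetch callable, the 'quality' field, and run_with_fallback are all hypothetical; this is not the tool's actual fallback implementation.

```python
def run_with_fallback(providers, fetch, base_timeout=30, max_fallback_attempts=2,
                      fallback_timeout_multiplier=1.5,
                      preserve_quality_threshold=0.7):
    """Try the primary provider, then up to max_fallback_attempts fallbacks.

    `fetch(provider, timeout)` is a hypothetical callable returning a dict
    with a 'quality' score in [0, 1], or raising on failure.
    """
    primary, *fallbacks = providers
    attempts = [(primary, base_timeout)] + [
        (name, base_timeout * fallback_timeout_multiplier)
        for name in fallbacks[:max_fallback_attempts]
    ]
    for index, (provider, timeout) in enumerate(attempts):
        try:
            result = fetch(provider, timeout)
        except Exception:
            continue  # move on to the next provider
        # The quality floor applies to fallback results only.
        if index == 0 or result['quality'] >= preserve_quality_threshold:
            return provider, result
    return None, None
```

With base_timeout=30 and a multiplier of 1.5, each fallback attempt is given 45 seconds.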
5.3 Data Fusion
config = {
'enable_data_fusion': True,
'data_fusion_config': {
'default_strategy': 'best_quality', # 'best_quality', 'merge_all', 'consensus'
'quality_weight': 0.6,
'freshness_weight': 0.3,
'completeness_weight': 0.1
}
}
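The three weights sum to 1.0, which suggests a weighted score per provider result. A minimal sketch under that assumption; the input field names, the [0, 1] score range, and fusion_score itself are hypothetical, and the sample values are made up.

```python
def fusion_score(result, quality_weight=0.6, freshness_weight=0.3,
                 completeness_weight=0.1):
    """Weighted score of one provider result; inputs assumed to be in [0, 1]."""
    return (quality_weight * result['quality']
            + freshness_weight * result['freshness']
            + completeness_weight * result['completeness'])

# Made-up sample scores for two providers.
candidates = [
    {'provider': 'fred', 'quality': 0.9, 'freshness': 0.5, 'completeness': 1.0},
    {'provider': 'worldbank', 'quality': 0.8, 'freshness': 1.0, 'completeness': 0.8},
]
best = max(candidates, key=fusion_score)
```

In this sample the fresher result wins (about 0.86 against 0.79) even though its quality score is lower.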
5.4 Rate Limiting
config = {
'enable_rate_limiting': True,
'rate_limit_config': {
'fred': {
'tokens_per_second': 2.0, # 120 per minute
'max_tokens': 10
},
'newsapi': {
'tokens_per_second': 0.001, # ~100 per day
'max_tokens': 5
},
'census': {
'tokens_per_second': 0.005, # ~500 per day
'max_tokens': 10
}
}
}
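The tokens_per_second and max_tokens settings describe a token bucket. The sketch below shows how such a limiter typically behaves; the class and method names are assumptions for illustration and are not taken from the APISource implementation.

```python
import time

class TokenBucket:
    """Token bucket: refills at tokens_per_second up to max_tokens."""

    def __init__(self, tokens_per_second, max_tokens, clock=time.monotonic):
        self.rate = tokens_per_second
        self.capacity = max_tokens
        self.tokens = float(max_tokens)  # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        """Take one token if available; return False when rate limited."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

fred_bucket = TokenBucket(tokens_per_second=2.0, max_tokens=10)
```

max_tokens bounds the burst size and tokens_per_second bounds the sustained rate. At 0.001 tokens per second a bucket refills about 86 tokens per day (0.001 x 86,400), and at 0.005 about 432.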
6. Provider-Specific Configuration
6.1 FRED Provider
config = {
'fred_api_key': 'YOUR_KEY',
'fred_config': {
'base_url': 'https://api.stlouisfed.org/fred',
'timeout': 15,
'default_file_type': 'json',
'default_frequency': 'a', # Annual
'default_units': 'lin' # Linear
}
}
6.2 World Bank Provider
config = {
'worldbank_config': {
'base_url': 'https://api.worldbank.org/v2',
'timeout': 45,
'default_format': 'json',
'default_per_page': 50,
'default_language': 'en'
}
}
6.3 News API Provider
config = {
'newsapi_api_key': 'YOUR_KEY',
'newsapi_config': {
'base_url': 'https://newsapi.org/v2',
'timeout': 20,
'default_language': 'en',
'default_page_size': 20,
'default_sort_by': 'publishedAt'
}
}
6.4 The Guardian Provider
config = {
'guardian_api_key': 'YOUR_KEY',
'guardian_config': {
'base_url': 'https://content.guardianapis.com',
'timeout': 30,
'rate_limit': 5,
'max_burst': 10
}
}
Features:
Search all Guardian content with advanced filtering
Get specific content items by ID
Browse and search tags (keywords, contributors, series, etc.)
Get all sections
Filter by section, tag, date range
Support for multiple editions (UK, US, AU, International)
Rich metadata including headlines, body text, thumbnails, tags
Supported Operations:
search_content - Search all Guardian content with advanced filtering options
get_item - Get a specific content item by ID
get_tags - Get all tags or filter by type
search_tags - Search for tags by query
get_sections - Get all Guardian sections
get_edition - Get content for a specific edition
Important Configuration Notes:
API Key Required: Must register for a free API key
Rate Limits: Free tier allows 5,000 requests per day
Attribution: Must acknowledge The Guardian when displaying content
Content Fields: Use the show_fields parameter to request specific fields (headline, body, thumbnail, etc.)
Tags: Use the show_tags parameter to include tag metadata (keyword, contributor, etc.)
Example Usage:
# Search for articles about technology
result = tool.query(
provider='guardian',
operation='search_content',
params={
'q': 'artificial intelligence',
'section': 'technology',
'from_date': '2024-01-01',
'page_size': 10,
'show_fields': 'headline,body,thumbnail',
'show_tags': 'keyword,contributor'
}
)
# Get all sections
result = tool.query(
provider='guardian',
operation='get_sections',
params={}
)
# Search for tags related to climate
result = tool.query(
provider='guardian',
operation='search_tags',
params={'q': 'climate', 'page_size': 10}
)
# Get US edition content
result = tool.query(
provider='guardian',
operation='get_edition',
params={'edition': 'us', 'page_size': 20}
)
API Documentation:
API Overview: https://open-platform.theguardian.com/documentation/
Content Search: https://open-platform.theguardian.com/documentation/search
Tags API: https://open-platform.theguardian.com/documentation/tag
Sections API: https://open-platform.theguardian.com/documentation/section
6.5 Census Provider
config = {
'census_api_key': 'YOUR_KEY',
'census_config': {
'base_url': 'https://api.census.gov/data',
'timeout': 30,
'default_year': 2021,
'default_dataset': 'acs/acs5'
}
}
6.5 Congress Provider
config = {
'congress_api_key': 'YOUR_KEY',
'congress_config': {
'base_url': 'https://api.congress.gov/v3',
'timeout': 30,
'rate_limit': 10,
'max_burst': 20
}
}
Available Operations:
search_bills: Search for bills and resolutions by congress number and type
get_bill: Get detailed information about a specific bill
list_members: List members of Congress by congress number and chamber
get_member: Get detailed information about a specific member
list_committees: List congressional committees
get_committee: Get detailed information about a specific committee
search_amendments: Search for amendments to bills
get_amendment: Get detailed information about a specific amendment
Example Usage:
# Search for bills in the 118th Congress
result = tool.execute('search_bills', {
'congress': 118,
'bill_type': 'hr',
'limit': 10
})
# Get specific bill details
result = tool.execute('get_bill', {
'congress': 118,
'bill_type': 'hr',
'bill_number': 1
})
# List House members in 118th Congress
result = tool.execute('list_members', {
'congress': 118,
'chamber': 'house',
'limit': 20
})
6.6 OpenStates Provider
config = {
'openstates_api_key': 'YOUR_API_KEY', # REQUIRED
'openstates_config': {
'base_url': 'https://v3.openstates.org',
'timeout': 30,
'rate_limit': 10,
'max_burst': 20
}
}
Available Operations:
search_bills: Search for state bills and resolutions with advanced filtering
get_bill: Get detailed information about a specific bill by ID
search_people: Search for state legislators with filtering options
get_person: Get detailed information about a specific legislator
list_jurisdictions: List all available state jurisdictions
get_jurisdiction: Get detailed information about a specific jurisdiction
Example Usage:
# Search for bills in California
result = tool.query(
provider='openstates',
operation='search_bills',
params={'jurisdiction': 'CA', 'session': '2023', 'per_page': 10}
)
# Search for bills by subject
result = tool.query(
provider='openstates',
operation='search_bills',
params={
'jurisdiction': 'NY',
'subject': 'Education',
'per_page': 5
}
)
# Get current legislators from Texas
result = tool.query(
provider='openstates',
operation='search_people',
params={'jurisdiction': 'TX', 'current': True, 'per_page': 10}
)
# List all available jurisdictions
result = tool.query(
provider='openstates',
operation='list_jurisdictions',
params={'per_page': 52}
)
# Get specific bill details
result = tool.query(
provider='openstates',
operation='get_bill',
params={'bill_id': 'ocd-bill/...'}
)
Important Configuration Notes:
API Key Required: Must register for a free API key at https://openstates.org/accounts/profile/
Rate Limit: Free tier with reasonable usage limits (default: 10 req/s)
Attribution: Acknowledge OpenStates.org when using the data
Data Freshness: Data is updated regularly from official state sources
Coverage: All 50 U.S. states plus DC and Puerto Rico
API Documentation:
API v3 Documentation: https://docs.openstates.org/api-v3/
Interactive API Docs: https://v3.openstates.org/docs/
About OpenStates: https://openstates.org/about/
6.7 Alpha Vantage Provider
config = {
'alphavantage_api_key': 'YOUR_KEY',
'alphavantage_config': {
'base_url': 'https://www.alphavantage.co/query',
'timeout': 30,
'default_datatype': 'json'
}
}
6.6 REST Countries Provider
config = {
'restcountries_config': {
'base_url': 'https://restcountries.com/v3.1',
'timeout': 30
}
}
6.7 ExchangeRate Provider
config = {
'exchangerate_api_key': 'YOUR_KEY', # Optional
'exchangerate_config': {
'base_url': 'https://api.exchangerate-api.com/v4',
'timeout': 30
}
}
6.8 Open Library Provider
config = {
'openlibrary_config': {
'base_url': 'https://openlibrary.org',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20 # Maximum burst size
}
}
6.9 Metropolitan Museum of Art (The Met) Provider
config = {
'metmuseum_config': {
'base_url': 'https://collectionapi.metmuseum.org/public/collection/v1',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20 # Maximum burst size
}
}
Features:
Search art objects with comprehensive filtering
Get detailed object information including high-resolution images
Browse by department, artist, medium, culture
Access to 470,000+ artworks from The Met collection
Rich metadata including provenance, exhibition history
Public domain images available for many works
Supported Operations:
search_objects - Search for art objects with advanced filtering
get_object - Get detailed information about a specific art object
get_departments - Get list of all departments
get_objects_by_department - Get objects in a specific department
search_by_artist - Search for artworks by artist name
search_by_medium - Search for artworks by medium
search_by_culture - Search for artworks by culture
search_highlight_objects - Search for highlighted/featured objects
download_image - Download high-resolution images from The Met collection
Important Configuration Notes:
No API Key Required: Completely free and open access
Rate Limit: No official limit, recommended 10 req/s
Attribution: Acknowledge The Metropolitan Museum of Art when using the data
Images: Many objects include high-resolution images (check isPublicDomain flag)
Data Quality: Comprehensive metadata for frontend analysis needs
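Checking the isPublicDomain flag before reusing an image can be done with a simple filter over object records. A minimal sketch: public_domain_images is a hypothetical helper, the sample records are made up, and the field names objectID, primaryImage and isPublicDomain follow The Met's object records.

```python
def public_domain_images(objects):
    """Return (objectID, primaryImage) pairs for public-domain objects with an image."""
    return [
        (obj['objectID'], obj['primaryImage'])
        for obj in objects
        if obj.get('isPublicDomain') and obj.get('primaryImage')
    ]

# Made-up records for illustration only.
sample = [
    {'objectID': 1, 'isPublicDomain': True, 'primaryImage': 'https://example.org/1.jpg'},
    {'objectID': 2, 'isPublicDomain': False, 'primaryImage': 'https://example.org/2.jpg'},
    {'objectID': 3, 'isPublicDomain': True, 'primaryImage': ''},
]
usable = public_domain_images(sample)
```

Only the first record passes: the second is not public domain and the third has no image URL.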
Example Usage:
# Search for impressionist paintings
result = tool.query(
provider='metmuseum',
operation='search_objects',
params={
'q': 'impressionism',
'has_images': True,
'date_begin': 1860,
'date_end': 1900,
'limit': 20
}
)
# Get specific artwork details
result = tool.query(
provider='metmuseum',
operation='get_object',
params={'object_id': 436535}
)
# Search by artist
result = tool.query(
provider='metmuseum',
operation='search_by_artist',
params={'artist_name': 'Vincent van Gogh', 'has_images': True, 'limit': 10}
)
# Get all departments
result = tool.query(
provider='metmuseum',
operation='get_departments',
params={}
)
# Download artwork images
result = tool.query(
provider='metmuseum',
operation='download_image',
params={'object_id': 436535, 'output_path': './vangogh.jpg'}
)
Image Download Feature:
The Met Museum provider's download_image operation downloads high-resolution images, either by object ID or by direct image URL:
# Download by object ID (automatically fetches primary image)
result = tool.query(
provider='metmuseum',
operation='download_image',
params={'object_id': 436535}
)
# Returns: {'success': True, 'output_path': '/tmp/...jpg', 'file_size': 1234567}
# Download by direct URL with custom path
result = tool.query(
provider='metmuseum',
operation='download_image',
params={
'image_url': 'https://images.metmuseum.org/CRDImages/ep/original/DP-42549-001.jpg',
'output_path': './my_artwork.jpg'
}
)
# Batch download from search results
search_result = tool.query(
provider='metmuseum',
operation='search_objects',
params={'q': 'van gogh', 'has_images': True, 'limit': 5}
)
for obj in search_result['data']['objects']:
    if obj.get('primaryImage'):
        download_result = tool.query(
            provider='metmuseum',
            operation='download_image',
            params={
                'image_url': obj['primaryImage'],
                'output_path': f"./images/{obj['objectID']}.jpg"
            }
        )
API Documentation:
API Documentation: https://metmuseum.github.io/
GitHub Repository: https://github.com/metmuseum/openaccess
Open Access Initiative: https://www.metmuseum.org/about-the-met/policies-and-documents/open-access
6.10 CoinGecko Provider
config = {
'coingecko_config': {
'base_url': 'https://api.coingecko.com/api/v3',
'timeout': 30,
'rate_limit': 10, # Requests per second (free tier)
'max_burst': 20 # Maximum burst size
}
}
Note: CoinGecko free tier does not require an API key. For higher rate limits and additional features, consider the Pro API.
6.10 OpenWeatherMap Provider
config = {
'openweathermap_api_key': 'YOUR_KEY',
'openweathermap_config': {
'base_url': 'https://api.openweathermap.org/data/2.5',
'geo_url': 'https://api.openweathermap.org/geo/1.0',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20 # Maximum burst size
}
}
Obtaining the Key:
Visit https://openweathermap.org/api
Sign up for a free account
Generate an API key from your account dashboard
6.11 Wikipedia Provider
config = {
'wikipedia_config': {
'base_url': 'https://en.wikipedia.org/w/api.php',
'rest_base_url': 'https://en.wikipedia.org/api/rest_v1',
'timeout': 30,
'rate_limit': 10, # Requests per second (max 200 allowed)
'max_burst': 20, # Maximum burst size
'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)' # REQUIRED
}
}
Features:
Article search by title or content
Page summaries and extracts
Full page content retrieval
Random article discovery
Page metadata and information
Important Configuration Notes:
No API Key Required: Wikipedia API is completely free and open
User-Agent REQUIRED: Must set a unique User-Agent with contact information
Rate Limit: Maximum 200 req/s allowed, default config uses 10 req/s
API Etiquette: Follow https://www.mediawiki.org/wiki/API:Etiquette
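The User-Agent values in this reference follow the pattern name/version (project URL; contact email). A small sketch that assembles such a string; build_user_agent is a hypothetical helper and the values are placeholders to replace with your own.

```python
def build_user_agent(app, version, url, contact):
    """Format '<app>/<version> (<url>; <contact>)', the shape used in the config above."""
    if '@' not in contact:
        raise ValueError('contact should be an email address')
    return f'{app}/{version} ({url}; {contact})'

user_agent = build_user_agent(
    'AIECS-APISource', '2.0',
    'https://github.com/your-org/aiecs', 'contact@example.com',
)
```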
6.12 GitHub Provider
config = {
'github_api_key': 'YOUR_GITHUB_TOKEN', # Recommended for higher rate limits
'github_config': {
'base_url': 'https://api.github.com',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs)'
}
}
Features:
Repository information and statistics
Search repositories, users, and code
User profiles and activity
Repository issues and pull requests
Organization data
Supported Operations:
get_repository - Get detailed repository information
search_repositories - Search for repositories
get_user - Get user profile information
search_users - Search for users
get_repository_issues - Get repository issues
get_repository_pulls - Get repository pull requests
search_code - Search for code across repositories
Important Configuration Notes:
API Key Recommended: Use a Personal Access Token for 5,000 req/hour (vs 60 unauthenticated)
Rate Limits: Authenticated: 5,000/hour, Unauthenticated: 60/hour
Token Scopes: Use minimal scopes needed (e.g., public_repo for public data)
API Version: Uses GitHub REST API v3 with application/vnd.github+json accept header
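The hourly quotas translate into much lower sustained rates than the 10 requests per second configured above, which is best read as a burst ceiling. A short worked calculation; sustained_rate is a hypothetical helper.

```python
def sustained_rate(requests_per_hour):
    """Convert an hourly quota into an average requests-per-second budget."""
    return requests_per_hour / 3600

authenticated = sustained_rate(5000)   # about 1.39 requests per second
unauthenticated = sustained_rate(60)   # one request per minute

# Time to exhaust the authenticated quota at a steady 10 req/s.
seconds_to_exhaust = 5000 / 10
```

At a steady 10 requests per second, an authenticated client would use its full hourly quota in 500 seconds, so long-running jobs should budget closer to 1.4 requests per second.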
Obtaining the Key:
Visit https://github.com/settings/tokens
Generate new token (classic)
Select appropriate scopes
Copy and store the token securely
6.13 arXiv Provider
config = {
'arxiv_config': {
'base_url': 'http://export.arxiv.org/api/query',
'timeout': 30,
'rate_limit': 0.33, # Requests per second (~3 second delays between requests)
'max_burst': 2, # Maximum burst size
'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
}
}
Features:
Search papers by query (all fields)
Get paper by arXiv ID
Search by author name
Search by category (e.g., cs.AI, math.CO)
Pagination support
Full metadata including authors, abstract, categories, PDF links
Important Configuration Notes:
No API Key Required: arXiv API is completely free and open
Rate Limit: Be respectful - implement 3 second delays between requests
Max Results: Limited to 30,000 results in slices of at most 2,000 at a time
Caching: Strongly recommended to cache responses to reduce server load
API Etiquette: Follow https://info.arxiv.org/help/api/user-manual.html
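The result limits above can be turned into a paging plan. The sketch below yields (start, max_results) pairs, the two paging parameters of the arXiv query API, while respecting the 2,000-per-slice and 30,000-total limits; arxiv_slices is a hypothetical helper and is not part of the provider.

```python
def arxiv_slices(total, slice_size=2000, hard_cap=30000):
    """Yield (start, max_results) pairs covering `total` results within arXiv's limits."""
    total = min(total, hard_cap)
    for start in range(0, total, slice_size):
        yield start, min(slice_size, total - start)

slices = list(arxiv_slices(4500))
```

Fetching 4,500 results takes three slices (2,000, 2,000 and 500). With the recommended 3 second delay between requests, the full 30,000-result cap takes 15 requests.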
Obtaining Access:
No API key required - completely free and open access
API Documentation:
API User Manual: https://info.arxiv.org/help/api/user-manual.html
API Basics: https://info.arxiv.org/help/api/basics.html
Category Taxonomy: https://arxiv.org/category_taxonomy
6.14 PubMed Provider
config = {
'pubmed_api_key': 'YOUR_NCBI_API_KEY', # Optional but recommended
'pubmed_config': {
'base_url': 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils',
'timeout': 30,
'rate_limit': 3, # Requests per second (3 without key, 10 with key)
'max_burst': 5, # Maximum burst size
'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
}
}
Features:
Search biomedical and life sciences literature
Get paper metadata by PubMed ID (PMID)
Search by author name
Get detailed paper information including abstracts
Access to 35+ million citations from MEDLINE, PubMed, and other databases
Full metadata including authors, journal, DOI, publication date
Supported Operations:
search_papers - Search for papers by query string
get_paper_by_id - Get paper metadata by PMID
search_by_author - Search for papers by author name
get_paper_details - Get detailed paper information including abstract
Important Configuration Notes:
API Key Optional but Recommended: Increases rate limit from 3 to 10 requests/second
Rate Limits: 3 req/s without API key, 10 req/s with API key
User-Agent: Should include contact email for NCBI to reach you if needed
Caching: Strongly recommended to cache responses to reduce server load
API Etiquette: Follow NCBI E-utilities guidelines
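The effect of the optional key can be sketched as follows: the key is passed as the api_key query parameter of an E-utilities request, and the client-side rate limit is raised from 3 to 10 requests per second. pubmed_request is a hypothetical helper; how the provider builds its requests internally is not documented here.

```python
from urllib.parse import urlencode

BASE_URL = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils'

def pubmed_request(term, api_key=None, retmax=20):
    """Return (url, rate_limit) for an ESearch query against PubMed."""
    params = {'db': 'pubmed', 'term': term, 'retmax': retmax, 'retmode': 'json'}
    if api_key:
        params['api_key'] = api_key
    # 3 req/s without a key, 10 req/s with one (see the notes above).
    rate_limit = 10 if api_key else 3
    return f'{BASE_URL}/esearch.fcgi?{urlencode(params)}', rate_limit

url, rate_limit = pubmed_request('crispr cas9')
```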
Obtaining the Key:
Visit https://www.ncbi.nlm.nih.gov/account/
Register for a free NCBI account
Go to Settings → API Key Management
Generate a new API key
API Documentation:
E-utilities Quick Start: https://www.ncbi.nlm.nih.gov/books/NBK25500/
E-utilities API Guide: https://www.ncbi.nlm.nih.gov/books/NBK25501/
PubMed Help: https://pubmed.ncbi.nlm.nih.gov/help/
6.15 CrossRef Provider
config = {
'crossref_config': {
'base_url': 'https://api.crossref.org',
'mailto': 'your-email@example.com', # For polite pool access
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
}
}
Features:
Get work metadata by DOI
Search for scholarly works
Get works from specific journals by ISSN
Search for funders in Open Funder Registry
Get works funded by specific funders
Access to extensive scholarly metadata including citations, references, authors, affiliations
Supported Operations:
get_work_by_doi - Get metadata for a work by its DOI
search_works - Search for works by query string with pagination and sorting
get_journal_works - Get works published in a specific journal by ISSN
search_funders - Search for funders in the Open Funder Registry
get_funder_works - Get works associated with a specific funder
Important Configuration Notes:
No API Key Required: CrossRef API is completely free and open
Polite Pool: Provide an email address (mailto parameter) for better rate limits
User-Agent: Set a descriptive User-Agent header with contact information
Caching: Strongly recommended to cache responses to reduce server load
Attribution: Acknowledge CrossRef when using the data in publications
Obtaining Access:
No API key required - completely free and open access
Optional: Register email for polite pool access (better rate limits)
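Polite pool access works by identifying yourself on each request, for example through the mailto query parameter. A minimal sketch of building such a request URL for the works endpoint; crossref_works_url is a hypothetical helper, and the provider may attach the address differently (for instance in the User-Agent header).

```python
from urllib.parse import urlencode

def crossref_works_url(query, mailto=None, rows=20,
                       base_url='https://api.crossref.org'):
    """Build a /works search URL, adding mailto for polite pool access."""
    params = {'query': query, 'rows': rows}
    if mailto:
        params['mailto'] = mailto
    return f'{base_url}/works?{urlencode(params)}'

url = crossref_works_url('deep learning', mailto='your-email@example.com')
```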
API Documentation:
REST API Documentation: https://www.crossref.org/documentation/retrieve-metadata/rest-api/
API Etiquette: https://github.com/CrossRef/rest-api-doc#etiquette
Metadata Plus: https://www.crossref.org/services/metadata-delivery/
6.16 Semantic Scholar Provider
config = {
'semanticscholar_config': {
'base_url': 'https://api.semanticscholar.org/graph/v1',
'timeout': 30,
'rate_limit': 1, # Requests per second (recommended for sustained use)
'max_burst': 5, # Maximum burst size
'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
}
}
Features:
Search for academic papers by query
Get paper metadata by ID (S2 ID, DOI, arXiv ID, etc.)
Get paper authors, citations, and references
Get author information and publications
Access to extensive academic paper database with citation data
Support for multiple paper ID formats (S2 ID, DOI, arXiv ID, PubMed ID, etc.)
Supported Operations:
search_papers - Search for papers by query string
get_paper - Get paper details by ID (S2 ID, DOI, arXiv ID, etc.)
get_paper_authors - Get authors of a specific paper
get_paper_citations - Get papers that cite this paper
get_paper_references - Get papers referenced by this paper
get_author - Get author details by ID
get_author_papers - Get papers by a specific author
Important Configuration Notes:
No API Key Required: Semantic Scholar API is completely free and open
Rate Limit: Recommended 1 request per second for sustained use (100 requests per 5 minutes)
Max Results: Limited to 100 results per request for search, use pagination for more
User-Agent: Set a descriptive User-Agent header with contact information
Caching: Strongly recommended to cache responses to reduce server load
Paper IDs: Supports multiple ID formats (S2 ID, DOI, arXiv ID, PubMed ID, etc.)
Obtaining Access:
No API key required - completely free and open access
Optional: Contact Semantic Scholar for higher rate limits if needed
API Documentation:
API Documentation: https://api.semanticscholar.org/api-docs/
Academic Graph API: https://www.semanticscholar.org/product/api
API Tutorial: https://www.semanticscholar.org/product/api/tutorial
6.17 CORE Provider
config = {
'core_api_key': 'YOUR_CORE_API_KEY', # Required
'core_config': {
'base_url': 'https://api.core.ac.uk/v3',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
}
}
Features:
Search for open access research papers
Get work metadata by CORE ID
Search by DOI
Search by title
Access to millions of open access research papers
Full metadata including authors, abstract, publication date, citations
Supported Operations:
search_works - Search for works by query string
get_work - Get work details by CORE ID
search_by_doi - Search for works by DOI
search_by_title - Search for works by title
Important Configuration Notes:
API Key Required: CORE API requires an API key for access
Rate Limit: Free tier allows reasonable usage with rate limiting
Max Results: Limited to 100 results per request for search, use pagination for more
Caching: Strongly recommended to cache responses to reduce server load
Attribution: Acknowledge CORE when using the data in publications
Obtaining the Key:
Visit https://core.ac.uk/services/api
Register for a free account
Request an API key from your account dashboard
API Documentation:
API Documentation: https://core.ac.uk/documentation/api
API Services: https://core.ac.uk/services/api
About CORE: https://core.ac.uk/about
6.18 USPTO Provider
config = {
'uspto_api_key': 'YOUR_USPTO_API_KEY', # Required
'uspto_config': {
'base_url': 'https://developer.uspto.gov/ibd-api/v1',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
}
}
Features:
Search for US patents by query
Get patent details by patent number
Search patents by inventor name
Search patents by assignee (company/organization)
Access to comprehensive US patent database
Full metadata including title, abstract, inventors, assignees, classifications, citations
Supported Operations:
search_patents - Search for patents by query string
get_patent - Get patent details by patent number/ID
search_by_inventor - Search for patents by inventor name
search_by_assignee - Search for patents by assignee name
Important Configuration Notes:
API Key Required: USPTO API requires an API key for access
Rate Limit: Free tier allows reasonable usage with rate limiting
Max Results: Pagination supported for large result sets
Caching: Strongly recommended to cache responses to reduce server load
Attribution: Acknowledge USPTO when using patent data in publications
Obtaining the Key:
Visit https://developer.uspto.gov/
Register for a free developer account
Request an API key from your account dashboard
API Documentation:
API Catalog: https://developer.uspto.gov/api-catalog
Patent Search API: https://developer.uspto.gov/api-catalog/patent-search-api
Developer Portal: https://developer.uspto.gov/
6.19 SEC EDGAR Provider
config = {
'secedgar_config': {
'base_url': 'https://data.sec.gov',
'user_agent': 'YourCompanyName contact@example.com', # REQUIRED
'timeout': 30,
'rate_limit': 10, # Requests per second (max allowed by SEC)
'max_burst': 20, # Maximum burst size
}
}
Features:
Get company submissions and filing history by CIK
Access XBRL financial data and concepts
Retrieve company facts across all filings
Search company filings (10-K, 10-Q, 8-K, etc.)
Download actual filing documents (10-K, 10-Q, 8-K full text)
Calculate financial ratios automatically
Get formatted financial statements
Access insider trading data (Form 4)
Access to comprehensive SEC filing database
Full metadata including company info, filing dates, XBRL tags
Supported Operations:
Basic Data Retrieval:
get_company_submissions - Get company filing history and submission data
get_company_concept - Get XBRL concept data for specific financial metrics
get_company_facts - Get all XBRL facts for a company
Filing Document Access:
search_filings - Search for filings by CIK and form type
get_filings_by_type - Get recent filings of a specific form type
get_filing_documents - Get filing document URLs and metadata
get_filing_text - Download full text of filing documents
Financial Analysis:
calculate_financial_ratios - Calculate common financial ratios (P/E, ROE, ROA, etc.)
get_financial_statement - Get formatted financial statements (balance sheet, income statement, cash flow)
Corporate Governance:
get_insider_transactions- Get insider trading transactions (Form 4 filings)
Important Configuration Notes:
No API Key Required: SEC EDGAR API is completely free and open
User-Agent REQUIRED: Must include company/individual name and contact email
Format: "CompanyName contact@email.com"
SEC will block access if User-Agent is missing or generic
Rate Limit: Maximum 10 requests per second (enforced by SEC)
CIK Format: Central Index Key must be 10 digits with leading zeros (e.g., “0000320193”)
Caching: Strongly recommended to cache responses to reduce server load
Fair Access: SEC monitors usage and may block non-compliant access
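Because the CIK must be exactly 10 digits, it helps to normalize values before passing them as the cik parameter. A minimal sketch: normalize_cik and submissions_url are hypothetical helpers, and the URL shape is that of the data.sec.gov submissions endpoint.

```python
def normalize_cik(cik):
    """Return a CIK as a 10-digit, zero-padded string."""
    digits = str(cik).strip()
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f'invalid CIK: {cik!r}')
    return digits.zfill(10)

def submissions_url(cik, base_url='https://data.sec.gov'):
    """URL of the submissions JSON for a company."""
    return f'{base_url}/submissions/CIK{normalize_cik(cik)}.json'

apple = normalize_cik(320193)
```

Integers and unpadded strings are both accepted, while ticker symbols such as 'AAPL' are rejected because they are not CIKs.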
Example Usage:
# 1. Get Apple Inc. filings (CIK: 0000320193)
result = tool.query(
provider='secedgar',
operation='get_company_submissions',
params={'cik': '0000320193'}
)
# 2. Search for specific form type (10-K annual reports)
result = tool.query(
provider='secedgar',
operation='search_filings',
params={
'cik': '0000320193',
'form_type': '10-K',
'limit': 5
}
)
# 3. Get Apple's Assets data from XBRL
result = tool.query(
provider='secedgar',
operation='get_company_concept',
params={
'cik': '0000320193',
'taxonomy': 'us-gaap',
'tag': 'Assets'
}
)
# 4. Calculate financial ratios
result = tool.query(
provider='secedgar',
operation='calculate_financial_ratios',
params={'cik': '0000320193'}
)
# Returns: current_ratio, debt_to_equity, profit_margin, ROA, ROE, etc.
# 5. Get formatted balance sheet
result = tool.query(
provider='secedgar',
operation='get_financial_statement',
params={
'cik': '0000320193',
'statement_type': 'balance_sheet',
'period': 'annual'
}
)
# 6. Get insider transactions (Form 4)
result = tool.query(
provider='secedgar',
operation='get_insider_transactions',
params={
'cik': '0000320193',
'start_date': '2024-01-01'
}
)
# 7. Download filing document text
result = tool.query(
provider='secedgar',
operation='get_filing_text',
params={
'cik': '0000320193',
'accession_number': '0000320193-23-000077'
}
)
Common CIKs:
Apple Inc.: 0000320193
Tesla Inc.: 0001318605
Microsoft Corp.: 0000789019
Amazon.com Inc.: 0001018724
Alphabet Inc.: 0001652044
Finding CIKs:
Company Search: https://www.sec.gov/edgar/searchedgar/companysearch.html
CIK Lookup Tool: https://www.sec.gov/cgi-bin/browse-edgar
API Documentation:
API Overview: https://www.sec.gov/search-filings/edgar-application-programming-interfaces
Accessing EDGAR Data: https://www.sec.gov/os/accessing-edgar-data
XBRL Data Sets: https://www.sec.gov/dera/data/financial-statement-data-sets.html
Company Submissions: https://data.sec.gov/submissions/
XBRL API: https://data.sec.gov/api/xbrl/
6.20 Stack Exchange Provider
config = {
'stackexchange_config': {
'base_url': 'https://api.stackexchange.com/2.3',
'api_key': 'YOUR_API_KEY', # Optional but recommended
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
}
}
Features:
Search questions across Stack Exchange network
Get detailed question and answer information
Search for users and their profiles
Browse tags and their statistics
Access all Stack Exchange sites
Rich metadata including votes, views, acceptance status
Supported Operations:
Question Operations:
search_questions - Search for questions by query and tags
get_question - Get detailed information about a specific question
get_answers - Get answers for a specific question
User and Tag Operations:
search_users - Search for users by name
get_tags - Get tags and their statistics
get_sites - Get all sites in the Stack Exchange network
Important Notes:
API Key Optional: Works without key but has much lower rate limits (300 vs 10,000 requests/day)
Compression: API returns gzip compressed responses by default
Backoff: Respect the backoff field in responses when present
Attribution: Required when displaying Stack Exchange content
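The backoff note above can be illustrated with a short sketch. It assumes the caller has access to the parsed Stack Exchange response wrapper as a dict. Whether `tool.query` exposes that wrapper is not stated here, and the helper names are hypothetical.

```python
import time


def backoff_seconds(response):
    """Return the number of seconds to wait, based on the `backoff` field."""
    return max(0, int(response.get("backoff", 0)))


def wait_if_needed(response, sleep=time.sleep):
    """Sleep for the requested backoff, if any, and return the delay used."""
    delay = backoff_seconds(response)
    if delay:
        sleep(delay)
    return delay


# A wrapper without a backoff field requires no wait.
print(wait_if_needed({"items": []}))
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.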
Example Usage:
# 1. Search for Python questions on Stack Overflow
result = tool.query(
provider='stackexchange',
operation='search_questions',
params={
'site': 'stackoverflow',
'q': 'python async',
'tagged': 'python',
'sort': 'votes',
'pagesize': 10
}
)
# 2. Get a specific question by ID
result = tool.query(
provider='stackexchange',
operation='get_question',
params={
'question_id': 11227809,
'site': 'stackoverflow'
}
)
# 3. Get answers for a question
result = tool.query(
provider='stackexchange',
operation='get_answers',
params={
'question_id': 11227809,
'site': 'stackoverflow',
'sort': 'votes',
'pagesize': 5
}
)
# 4. Search for users
result = tool.query(
provider='stackexchange',
operation='search_users',
params={
'site': 'stackoverflow',
'inname': 'Jon Skeet',
'pagesize': 10
}
)
# 5. Get popular Python tags
result = tool.query(
provider='stackexchange',
operation='get_tags',
params={
'site': 'stackoverflow',
'inname': 'python',
'sort': 'popular',
'pagesize': 20
}
)
# 6. Get all Stack Exchange sites
result = tool.query(
provider='stackexchange',
operation='get_sites',
params={'pagesize': 50}
)
Popular Sites:
Stack Overflow: stackoverflow
Server Fault: serverfault
Super User: superuser
Ask Ubuntu: askubuntu
Mathematics: math
Unix & Linux: unix
API Documentation:
API Documentation: https://api.stackexchange.com/docs
Authentication: https://api.stackexchange.com/docs/authentication
Throttling: https://api.stackexchange.com/docs/throttle
Register App: https://stackapps.com/apps/oauth/register
6.21 Hacker News Provider
config = {
'hackernews_config': {
'base_url': 'http://hn.algolia.com/api/v1',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; your-email@example.com)'
}
}
Features:
Search Hacker News stories by keywords
Search comments by keywords
Search items sorted by date (most recent first)
Get item details by ID (story, comment, poll, etc.)
Get user information by username
Full metadata including title, author, points, comments, URL
Pagination support for large result sets
Supported Operations:
search_stories - Search for stories by keywords (sorted by relevance)
search_comments - Search for comments by keywords
search_by_date - Search for items sorted by date (most recent first)
get_item - Get item details by ID (story, comment, poll, etc.)
get_user - Get user information by username
Important Configuration Notes:
No API Key Required: Hacker News Algolia API is completely free and open
Rate Limiting: Be respectful - implement reasonable delays between requests
Max Results: Limited to 1000 results per query (pagination available)
User-Agent: Set a descriptive User-Agent header for API etiquette
Caching: Strongly recommended to cache responses to reduce server load
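To show how the 1000-result cap interacts with pagination, here is a small sketch that computes which pages are reachable for a given page size. It assumes zero-based page numbers, as in Algolia's search API. The function name is hypothetical, and the name of the tool's page parameter is not shown in this section.

```python
import math

MAX_RESULTS = 1000  # per-query cap stated in the notes above


def page_numbers(total_hits, hits_per_page):
    """Return the zero-based page numbers reachable under the result cap."""
    if hits_per_page <= 0:
        raise ValueError("hits_per_page must be positive")
    reachable = min(total_hits, MAX_RESULTS)
    return list(range(math.ceil(reachable / hits_per_page)))


print(page_numbers(45, 20))
```

For queries that match more than 1000 items, narrowing the query (for example by tag or date) is the only way to reach the remaining results.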
Example Usage:
# 1. Search for Python-related stories
result = tool.query(
provider='hackernews',
operation='search_stories',
params={
'query': 'python',
'hits_per_page': 20
}
)
# 2. Search for stories with minimum comments
result = tool.query(
provider='hackernews',
operation='search_stories',
params={
'query': 'AI',
'num_comments': 50, # Minimum 50 comments
'hits_per_page': 10
}
)
# 3. Search comments about machine learning
result = tool.query(
provider='hackernews',
operation='search_comments',
params={
'query': 'machine learning',
'hits_per_page': 20
}
)
# 4. Get recent AI stories sorted by date
result = tool.query(
provider='hackernews',
operation='search_by_date',
params={
'query': 'AI',
'tags': 'story',
'hits_per_page': 20
}
)
# 5. Get specific item details
result = tool.query(
provider='hackernews',
operation='get_item',
params={'item_id': 1} # The first HN story ever posted
)
# 6. Get user information
result = tool.query(
provider='hackernews',
operation='get_user',
params={'username': 'pg'} # Paul Graham
)
Common Tags:
story - Filter for stories only
comment - Filter for comments only
poll - Filter for polls only
author_pg - Filter by author (e.g., Paul Graham)
Combine tags: story,author_pg - Stories by Paul Graham
Obtaining Access:
No API key required - completely free and open access
API Documentation:
API Documentation: https://hn.algolia.com/api
Hacker News Official: https://news.ycombinator.com/
Search Interface: https://hn.algolia.com/
6.22 OpenCorporates Provider
config = {
'opencorporates_api_key': 'YOUR_OPENCORPORATES_API_KEY', # Required
'opencorporates_config': {
'base_url': 'https://api.opencorporates.com/v0.4',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
}
}
Features:
Search for companies by name across 140+ jurisdictions worldwide
Get detailed company information by jurisdiction code and company number
Search for company officers (directors, agents, secretaries)
Get officer details and their company affiliations
Access company filings and statutory documents
Get jurisdiction information and codes
Access to 200+ million companies from official registers
Full metadata including company status, address, incorporation date, officers
Supported Operations:
Company Operations:
search_companies - Search for companies by name or other criteria
get_company - Get detailed information about a specific company by jurisdiction and company number
get_company_filings - Get statutory filings for a specific company
Officer Operations:
search_officers - Search for company officers (directors, agents) by name
get_officer - Get detailed information about a specific officer by ID
Jurisdiction Operations:
list_jurisdictions - Get list of all available jurisdictions
Important Configuration Notes:
API Key Required: OpenCorporates API requires an API key for all requests
Rate Limits: Free tier allows 200 requests/month, 50 requests/day
Open Data: Free for open data projects with share-alike attribution
Paid Plans: Available for commercial use without share-alike restrictions
Jurisdiction Codes: Use standard codes like ‘us_ca’ (California), ‘gb’ (UK), ‘de’ (Germany)
Caching: Strongly recommended to cache responses to reduce API usage
Example Usage:
# 1. Search for companies by name
result = tool.query(
provider='opencorporates',
operation='search_companies',
params={
'q': 'Apple Inc',
'jurisdiction_code': 'us_ca', # Optional: filter by jurisdiction
'per_page': 10
}
)
# 2. Get specific company details
result = tool.query(
provider='opencorporates',
operation='get_company',
params={
'jurisdiction_code': 'us_ca',
'company_number': 'C0806592' # Apple Inc.
}
)
# 3. Search for officers
result = tool.query(
provider='opencorporates',
operation='search_officers',
params={
'q': 'John Smith',
'jurisdiction_code': 'gb', # Optional: filter by jurisdiction
'per_page': 10
}
)
# 4. Get company filings
result = tool.query(
provider='opencorporates',
operation='get_company_filings',
params={
'jurisdiction_code': 'us_ca',
'company_number': 'C0806592',
'per_page': 20
}
)
# 5. List all jurisdictions
result = tool.query(
provider='opencorporates',
operation='list_jurisdictions',
params={}
)
Common Jurisdiction Codes:
United States (California): us_ca
United States (Delaware): us_de
United Kingdom: gb
Germany: de
France: fr
Canada (Ontario): ca_on
Australia: au
Example Companies:
Apple Inc. (US-CA): jurisdiction_code='us_ca', company_number='C0806592'
Google LLC (US-DE): jurisdiction_code='us_de', company_number='5908224'
Microsoft Corporation (US-WA): jurisdiction_code='us_wa', company_number='600413485'
Obtaining the Key:
Visit https://opencorporates.com/api_accounts/new
Register for a free account
Choose your plan (free for open data projects)
Get your API key from the dashboard
API Documentation:
API Reference: https://api.opencorporates.com/documentation/API-Reference
API Accounts: https://opencorporates.com/api_accounts/new
About OpenCorporates: https://opencorporates.com/info/about
Jurisdiction Codes: https://api.opencorporates.com/documentation/Open-Data-Licence
6.23 GDELT Project Provider
config = {
'gdelt_config': {
'doc_base_url': 'https://api.gdeltproject.org/api/v2/doc/doc',
'geo_base_url': 'https://api.gdeltproject.org/api/v2/geo/geo',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
}
}
Features:
Search global news articles across 100+ languages
Timeline analysis of news coverage volume and tone
Geographic mapping of news coverage
Image search with visual recognition
Theme-based search using Global Knowledge Graph
Emotional tone analysis of news coverage
Source country analysis
Real-time updates every 15 minutes
Supported Operations:
Article Search Operations:
search_articles - Search global news articles with advanced filtering
get_article_list - Get detailed list of articles with full metadata
search_by_theme - Search using GDELT's Global Knowledge Graph themes
Timeline Operations:
get_timeline - Get timeline of news coverage volume
get_timeline_volume - Get volume timeline with raw counts or percentages
get_timeline_tone - Get timeline showing average emotional tone over time
get_timeline_lang - Get timeline broken down by language
get_timeline_source_country - Get timeline broken down by source country
Analysis Operations:
get_tone_chart - Analyze emotional tone distribution of coverage
get_top_themes - Get top themes and topics from matching articles
Geographic Operations:
get_geo_map - Get geographic map of locations mentioned in news
get_source_country_map - Map which countries are reporting on a topic
Image Operations:
search_images - Search news images using visual recognition
Important Configuration Notes:
No API Key Required: GDELT Project API is completely free and open
Rate Limiting: Be respectful - implement reasonable delays between requests
Data Coverage: Monitors news in 100+ languages from around the world
Real-time Updates: Data updated every 15 minutes
Attribution: Acknowledge GDELT Project when using the data
Fair Use: Do not abuse the free service with excessive requests
Caching: Strongly recommended to cache responses to reduce server load
Example Usage:
# 1. Search for climate change articles
result = tool.query(
provider='gdelt',
operation='search_articles',
params={
'query': 'climate change',
'timespan': '7d',
'max_records': 50,
'source_lang': 'english'
}
)
# 2. Get timeline of AI coverage
result = tool.query(
provider='gdelt',
operation='get_timeline',
params={
'query': 'artificial intelligence',
'timespan': '30d',
'mode': 'timelinevol'
}
)
# 3. Analyze tone of election coverage
result = tool.query(
provider='gdelt',
operation='get_tone_chart',
params={
'query': 'election',
'timespan': '7d'
}
)
# 4. Search for protest images
result = tool.query(
provider='gdelt',
operation='search_images',
params={
'query': 'protest',
'timespan': '7d',
'image_tag': 'protest',
'max_records': 20
}
)
# 5. Get geographic map of earthquake coverage
result = tool.query(
provider='gdelt',
operation='get_geo_map',
params={
'query': 'earthquake',
'mode': 'country',
'timespan': '24h'
}
)
# 6. Search by theme (Global Knowledge Graph)
result = tool.query(
provider='gdelt',
operation='search_by_theme',
params={
'theme': 'ENV_CLIMATECHANGE',
'timespan': '7d',
'max_records': 50
}
)
# 7. Get source country map
result = tool.query(
provider='gdelt',
operation='get_source_country_map',
params={
'query': 'technology',
'timespan': '24h'
}
)
# 8. Get timeline with tone analysis
result = tool.query(
provider='gdelt',
operation='get_timeline_tone',
params={
'query': 'economy',
'timespan': '30d',
'smoothing': 5
}
)
Common GKG Themes:
ENV_CLIMATECHANGE - Climate change and global warming
TERROR - Terrorism and extremism
HEALTH - Health and medical topics
ECON_INFLATION - Economic inflation
ECON_STOCKMARKET - Stock market and finance
TAX_FNCACT_STUDENT - Student finance and education
WB_* - World Bank indicators (e.g., WB_1987_POVERTY_HEADCOUNT)
Timespan Formats:
Hours: 1h, 6h, 12h, 24h
Days: 1d, 3d, 7d
Weeks: 1week, 2weeks
Months: 1month, 3months, 6months
Query Operators:
Phrase search: "exact phrase"
Boolean AND: term1 term2 or term1 AND term2
Boolean OR: term1 OR term2
Boolean NOT: -term or NOT term
Grouping: (term1 OR term2) AND term3
Theme search: theme:TERROR
Domain filter: domain:nytimes.com
Source language: sourcelang:english
Source country: sourcecountry:us
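The query operators listed above can be combined programmatically. The sketch below assembles a query string from the phrase, OR-group, negation, and filter forms shown in the list. `build_gdelt_query` is a hypothetical helper, not part of the tool, and it does not escape quotes or parentheses inside terms.

```python
def build_gdelt_query(all_of=(), any_of=(), none_of=(), phrase=None, **filters):
    """Assemble a GDELT query string from the operators listed above."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')                      # phrase search
    parts.extend(all_of)                                 # implicit AND
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")    # grouped OR
    parts.extend(f"-{term}" for term in none_of)         # NOT
    parts.extend(f"{name}:{value}" for name, value in filters.items())
    return " ".join(parts)


query = build_gdelt_query(
    phrase="climate change",
    any_of=["flood", "drought"],
    none_of=["sports"],
    sourcelang="english",
)
print(query)
```

The resulting string would be passed as the 'query' parameter in the examples above.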
API Documentation:
DOC API 2.0: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/
GEO API 2.0: https://blog.gdeltproject.org/gdelt-geo-2-0-api-debuts/
Global Knowledge Graph: https://blog.gdeltproject.org/announcing-the-global-knowledge-graph/
GDELT Project: https://www.gdeltproject.org/
Query Guide: https://blog.gdeltproject.org/gdelt-doc-2-0-api-debuts/
6.24 DuckDuckGo Zero-Click Info Provider
config = {
'duckduckgo_config': {
'base_url': 'https://api.duckduckgo.com/',
'timeout': 30,
'rate_limit': 10, # Requests per second
'max_burst': 20, # Maximum burst size
'user_agent': 'AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)'
}
}
tool = APISourceTool(config)
Environment Variable:
export DUCKDUCKGO_TIMEOUT=30
export DUCKDUCKGO_RATE_LIMIT=10
export DUCKDUCKGO_MAX_BURST=20
export DUCKDUCKGO_USER_AGENT="AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)"
Supported Operations:
Instant Answer Operations:
get_instant_answer - Get instant answer for a query with all available data
get_abstract - Get article abstract/summary from Wikipedia and other sources
get_definition - Get definition for a term
get_related_topics - Get related topics and disambiguation
get_infobox - Get structured infobox data for an entity
Important Configuration Notes:
No API Key Required: DuckDuckGo Instant Answer API is completely free and open
Rate Limiting: Be respectful - implement reasonable delays between requests
Caching: Strongly recommended to cache responses to reduce server load
User-Agent: Set a descriptive User-Agent header for API etiquette
No Scraping: This is an Instant Answer API, not a full search results API
Attribution: Consider attributing results to DuckDuckGo when displaying them
Data Sources: Primarily Wikipedia, but also includes other curated sources
Example Usage:
# 1. Get instant answer for a query
result = tool.query(
provider='duckduckgo',
operation='get_instant_answer',
params={
'query': 'Python programming language',
'no_html': True
}
)
# 2. Get abstract for an entity
result = tool.query(
provider='duckduckgo',
operation='get_abstract',
params={'query': 'Albert Einstein'}
)
# 3. Get definition for a term
result = tool.query(
provider='duckduckgo',
operation='get_definition',
params={'query': 'algorithm'}
)
# 4. Get related topics (disambiguation)
result = tool.query(
provider='duckduckgo',
operation='get_related_topics',
params={'query': 'Python'}
)
# 5. Get infobox data for an entity
result = tool.query(
provider='duckduckgo',
operation='get_infobox',
params={'query': 'Steve Jobs'}
)
Response Data Structure:
Instant Answer Response:
{
'heading': 'Python (programming language)',
'abstract': 'Python is a high-level, general-purpose programming language...',
'abstract_source': 'Wikipedia',
'abstract_url': 'https://en.wikipedia.org/wiki/Python_(programming_language)',
'answer': '', # Direct answer if available
'answer_type': '',
'definition': '', # Definition if available
'image': 'https://duckduckgo.com/i/...',
'type': 'A', # Answer type: A=Article, D=Disambiguation, etc.
'has_infobox': True,
'has_related_topics': True
}
Related Topics Response:
{
'heading': 'Python',
'related_topics': [
{
'type': 'topic',
'text': 'Python (programming language) A high-level...',
'url': 'https://duckduckgo.com/Python_(programming_language)',
'icon': '/i/7eec482b.png'
},
{
'type': 'category',
'name': 'Snakes',
'topics': [...]
}
],
'total_topics': 15
}
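Because the related-topics list mixes plain topics with nested categories, consumers usually need to flatten it. The sketch below walks the structure shown above. The shape of the entries inside a category's 'topics' list is elided in the example, so this code assumes they look like top-level topic entries. The function name is hypothetical.

```python
def flatten_related_topics(response):
    """Return a flat list of topics, each annotated with its category name."""
    flat = []
    for entry in response.get("related_topics", []):
        if entry.get("type") == "category":
            # Assumption: nested entries have the same keys as plain topics.
            for topic in entry.get("topics", []):
                flat.append({"category": entry.get("name"), **topic})
        else:
            flat.append({"category": None, **entry})
    return flat


sample = {
    "heading": "Python",
    "related_topics": [
        {"type": "topic", "text": "Python (programming language) A high-level...",
         "url": "https://duckduckgo.com/Python_(programming_language)"},
        {"type": "category", "name": "Snakes",
         "topics": [{"type": "topic", "text": "Pythonidae",
                     "url": "https://duckduckgo.com/Pythonidae"}]},
    ],
}
for topic in flatten_related_topics(sample):
    print(topic["category"], "|", topic["text"])
```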
Use Cases:
Quick facts and information retrieval
Entity disambiguation (e.g., “Python” could be programming language, snake, etc.)
Topic exploration and related content discovery
Knowledge base enrichment
Instant answers for common queries
Structured data extraction from infoboxes
API Documentation:
API Endpoint: https://api.duckduckgo.com/
API Format: https://api.duckduckgo.com/?q=query&format=json
DuckDuckGo: https://duckduckgo.com/
7. Environment Variables
7.1 Variable Reference
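Global parameters and most API keys can be set via environment variables with the APISOURCE_ prefix. Several provider-specific variables below (for example CROSSREF_MAILTO, SECEDGAR_USER_AGENT, and the per-provider timeout and rate-limit settings) are listed without that prefix: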
# API Keys
export APISOURCE_FRED_API_KEY="your_fred_key"
export APISOURCE_NEWSAPI_API_KEY="your_news_key"
export APISOURCE_CENSUS_API_KEY="your_census_key"
export APISOURCE_CONGRESS_API_KEY="your_congress_key"
export APISOURCE_ALPHAVANTAGE_API_KEY="your_alphavantage_key"
export APISOURCE_EXCHANGERATE_API_KEY="your_exchangerate_key" # Optional
export APISOURCE_OPENWEATHERMAP_API_KEY="your_openweathermap_key"
export APISOURCE_GITHUB_API_KEY="your_github_token" # Recommended
export APISOURCE_PUBMED_API_KEY="your_ncbi_api_key" # Optional but recommended
export CROSSREF_MAILTO="your-email@example.com" # Optional but recommended for polite pool
export APISOURCE_CORE_API_KEY="your_core_api_key" # Required
export APISOURCE_USPTO_API_KEY="your_uspto_api_key" # Required
export SECEDGAR_USER_AGENT="YourCompanyName contact@example.com" # REQUIRED for SEC EDGAR
export STACKEXCHANGE_API_KEY="your_stackexchange_api_key" # Optional but recommended
export OPENCORPORATES_API_KEY="your_opencorporates_api_key" # Required
# Provider-specific Configuration
export SEMANTICSCHOLAR_TIMEOUT=30
export SEMANTICSCHOLAR_RATE_LIMIT=1
export SEMANTICSCHOLAR_MAX_BURST=5
export CORE_TIMEOUT=30
export CORE_RATE_LIMIT=10
export CORE_MAX_BURST=20
export USPTO_TIMEOUT=30
export USPTO_RATE_LIMIT=10
export USPTO_MAX_BURST=20
export SECEDGAR_TIMEOUT=30
export SECEDGAR_RATE_LIMIT=10
export SECEDGAR_MAX_BURST=20
export STACKEXCHANGE_TIMEOUT=30
export STACKEXCHANGE_RATE_LIMIT=10
export STACKEXCHANGE_MAX_BURST=20
export OPENCORPORATES_TIMEOUT=30
export OPENCORPORATES_RATE_LIMIT=10
export OPENCORPORATES_MAX_BURST=20
export HACKERNEWS_TIMEOUT=30
export HACKERNEWS_RATE_LIMIT=10
export HACKERNEWS_MAX_BURST=20
export GDELT_TIMEOUT=30
export GDELT_RATE_LIMIT=10
export GDELT_MAX_BURST=20
export DUCKDUCKGO_TIMEOUT=30
export DUCKDUCKGO_RATE_LIMIT=10
export DUCKDUCKGO_MAX_BURST=20
export DUCKDUCKGO_USER_AGENT="AIECS-APISource/2.0 (https://github.com/your-org/aiecs; contact@example.com)"
# Performance
export APISOURCE_CACHE_TTL="300"
export APISOURCE_DEFAULT_TIMEOUT="30"
export APISOURCE_MAX_RETRIES="3"
# Feature Flags
export APISOURCE_ENABLE_RATE_LIMITING="true"
export APISOURCE_ENABLE_FALLBACK="true"
export APISOURCE_ENABLE_DATA_FUSION="true"
export APISOURCE_ENABLE_QUERY_ENHANCEMENT="true"
# Logging
export APISOURCE_LOG_LEVEL="INFO"
export APISOURCE_METRICS_ENABLED="true"
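Environment variables are always strings, so values such as "300" or "true" need converting before they match the types in the parameter reference. The following sketch illustrates one way to do that for APISOURCE_-prefixed variables. It is not the tool's actual loader, and `config_from_env` is a hypothetical name.

```python
import os


def config_from_env(environ, prefix="APISOURCE_"):
    """Collect prefixed variables into a config dict with basic type coercion."""
    config = {}
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue
        key = name[len(prefix):].lower()
        lowered = raw.strip().lower()
        if key.endswith("_api_key"):
            config[key] = raw              # keys stay strings, even if numeric
        elif lowered in ("true", "false"):
            config[key] = lowered == "true"
        elif lowered.isdigit():
            config[key] = int(lowered)
        else:
            config[key] = raw
    return config


# Usage: build a dictionary configuration from the current environment.
config = config_from_env(os.environ)
```

The resulting dictionary has the same shape as the dictionary configuration shown in section 1.1.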
7.2 Loading from .env File
# .env file
APISOURCE_FRED_API_KEY=your_fred_key
APISOURCE_NEWSAPI_API_KEY=your_news_key
APISOURCE_CACHE_TTL=300
APISOURCE_ENABLE_FALLBACK=true
# Load with python-dotenv
from dotenv import load_dotenv
load_dotenv()
# Tool automatically picks up environment variables
tool = APISourceTool()
8. Configuration Examples
8.1 Development Configuration
{
"fred_api_key": "YOUR_FRED_KEY",
"newsapi_api_key": "YOUR_NEWS_KEY",
"cache_ttl": 60,
"default_timeout": 30,
"max_retries": 1,
"enable_rate_limiting": false,
"enable_fallback": true,
"enable_data_fusion": true,
"enable_query_enhancement": true,
"log_level": "DEBUG",
"metrics_enabled": true
}
8.2 Production Configuration
{
"fred_api_key": "${FRED_API_KEY}",
"newsapi_api_key": "${NEWSAPI_API_KEY}",
"census_api_key": "${CENSUS_API_KEY}",
"congress_api_key": "${CONGRESS_API_KEY}",
"cache_ttl": 600,
"default_timeout": 30,
"max_retries": 5,
"enable_rate_limiting": true,
"enable_fallback": true,
"enable_data_fusion": true,
"enable_query_enhancement": true,
"enable_intelligent_cache": true,
"log_level": "INFO",
"metrics_enabled": true,
"cache_backend": "redis",
"redis_url": "redis://redis:6379/0"
}
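Note that json.load does not expand "${FRED_API_KEY}"-style placeholders by itself, and this section does not say whether the tool substitutes them. A minimal sketch of doing the substitution before passing the config on, using the standard library's string.Template (the function name is hypothetical):

```python
import json
from string import Template


def load_config_with_env(text, environ):
    """Parse JSON text, then substitute ${VAR} placeholders in string values."""
    def expand(value):
        if isinstance(value, str):
            # Raises KeyError if the referenced variable is not set.
            return Template(value).substitute(environ)
        if isinstance(value, dict):
            return {k: expand(v) for k, v in value.items()}
        if isinstance(value, list):
            return [expand(v) for v in value]
        return value

    return expand(json.loads(text))


raw = '{"fred_api_key": "${FRED_API_KEY}", "cache_ttl": 600}'
print(load_config_with_env(raw, {"FRED_API_KEY": "abc"}))
```

Failing on an unset variable is deliberate here: a missing key surfaces at startup rather than as an authentication error on the first request. In a real deployment, os.environ would be passed as the second argument.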
8.3 High-Volume Configuration
{
"fred_api_key": "${FRED_API_KEY}",
"cache_ttl": 3600,
"default_timeout": 15,
"max_retries": 3,
"enable_rate_limiting": true,
"enable_fallback": true,
"enable_data_fusion": false,
"enable_query_enhancement": false,
"enable_intelligent_cache": true,
"log_level": "WARNING",
"metrics_enabled": true,
"rate_limit_config": {
"fred": {
"tokens_per_second": 1.5,
"max_tokens": 5
}
}
}
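The parameter names tokens_per_second and max_tokens suggest a token-bucket rate limiter, although this section does not describe the implementation. Under that assumption, the sketch below shows how the two values would interact: max_tokens bounds the burst size and tokens_per_second sets the sustained rate. The class is illustrative only and takes the current time as an argument so the behavior is deterministic.

```python
class TokenBucket:
    """Illustrative token bucket; not the tool's actual rate limiter."""

    def __init__(self, tokens_per_second, max_tokens, now=0.0):
        self.rate = tokens_per_second
        self.capacity = max_tokens
        self.tokens = float(max_tokens)   # start with a full bucket
        self.last = now

    def try_acquire(self, now):
        """Refill based on elapsed time, then take one token if available."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(tokens_per_second=1.5, max_tokens=5)
burst = [bucket.try_acquire(now=0.0) for _ in range(6)]
print(burst)
```

With the values from the configuration above, five requests can be sent back to back, after which requests are admitted at roughly 1.5 per second.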
8.4 Minimal Configuration
{
"fred_api_key": "YOUR_FRED_KEY"
}
All other parameters use defaults.
9. Validation and Testing
9.1 Configuration Validation
from aiecs.tools.apisource.tool import Config
# Validate configuration
try:
config = Config(
fred_api_key='YOUR_KEY',
cache_ttl=300,
max_retries=3
)
print("Configuration valid!")
except ValueError as e:
print(f"Configuration error: {e}")
9.2 Testing Configuration
from aiecs.tools.apisource import APISourceTool
# Create tool with the configuration validated in 9.1
tool = APISourceTool(config)
# Test provider connectivity
providers = tool.list_providers()
for provider in providers:
print(f"Provider: {provider['name']}")
print(f"Health: {provider['health']['status']}")
print(f"Score: {provider['health']['score']}\n")
# Test a simple query
try:
result = tool.query(
provider='fred',
operation='get_series_info',
params={'series_id': 'GDP'}
)
print("Configuration working correctly!")
except Exception as e:
print(f"Configuration issue: {e}")
9.3 Configuration Best Practices
Use Environment Variables for Secrets:
import os
config = {
'fred_api_key': os.getenv('FRED_API_KEY'),
'newsapi_api_key': os.getenv('NEWSAPI_KEY')
}
Validate Before Deployment:
def validate_config(config):
required_keys = ['fred_api_key']
for key in required_keys:
if not config.get(key):
raise ValueError(f"Missing required config: {key}")
return True
Use Different Configs for Different Environments:
import json
import os
env = os.getenv('ENVIRONMENT', 'development')
config_file = f'config.{env}.json'
with open(config_file) as f:
config = json.load(f)
Monitor Configuration Impact:
# Check metrics after configuration changes
metrics = tool.get_metrics()
print(f"Success rate: {metrics['overall']['success_rate']}")
print(f"Avg response time: {metrics['overall']['avg_response_time']}")
Document Version: 2.0
Last Updated: 2025-10-18
Maintainer: AIECS Tools Team