> "Sitting on your own data while the world's data streams past you is like running a business with blinders on."
In This Chapter
- Opening Scenario: The Monday Morning Gap
- 21.1 What APIs Are and Why They Matter
- 21.2 REST API Concepts
- 21.3 The requests Library
- 21.4 Query Parameters and Headers
- 21.5 Authentication
- 21.6 Storing API Credentials Safely
- 21.7 Working with JSON Responses
- 21.8 Pagination
- 21.9 Rate Limiting and Retry Logic
- 21.10 Error Handling for API Calls
- 21.11 Real Business API Examples
- 21.12 Building a Reusable API Client
- 21.13 Practical Summary: A Complete API Data Pull
- Chapter Summary
Chapter 21: Working with APIs and External Data Services
Opening Scenario: The Monday Morning Gap
Priya Okonkwo has built something real. Over the past several months she's automated Acme Corp's weekly sales reports, cleaned their messy data, and produced visualizations that Sandra Chen actually uses in her executive presentations. The tools live on her laptop. The data lives in CSV files.
Then Sandra asks a question Priya can't answer: "Our biggest competitor just hired a VP of eCommerce. What does that mean for our pricing?"
Priya's internal data says nothing about competitors. It says nothing about exchange rates for Acme's growing international accounts. It says nothing about economic signals, industry news, or the world outside Acme's own spreadsheets.
This is the gap that APIs fill.
An API (Application Programming Interface) is a structured way to request data and services from an external system over the internet. Thousands of organizations — governments, financial data providers, news outlets, weather services, logistics companies — publish their data through APIs. Your Python code can call those APIs the same way it calls a function in your own program.
After this chapter, Priya's toolkit will reach beyond Acme's internal data. Yours will too.
21.1 What APIs Are and Why They Matter
The word "API" is used loosely in software to mean many things. In this chapter, we mean specifically web APIs: services you call over the internet using HTTP requests.
The conceptual model is simple:
- Your Python program sends an HTTP request to a URL (the endpoint)
- The server processes your request and sends back an HTTP response
- The response contains data — almost always formatted as JSON
- Your program parses that JSON and uses the data
This is exactly how a web browser works, except instead of a human clicking links, your code is making the requests programmatically and processing the results automatically.
Why APIs Matter for Business Work
Access to real-time data. Financial markets, exchange rates, weather, shipping rates: this data changes minute to minute. No CSV file stays current. An API call returns the provider's latest figures at the moment you ask.
Access to data you couldn't collect yourself. Company profiles, news archives, demographic data, commodity prices: organizations that specialize in collecting this data sell or share it through APIs. Buying API access is usually far cheaper than collecting the data yourself.
Automation. Pulling data manually from a website is a job. Pulling it via API takes a few lines of Python. Once you have the code, every future data pull costs almost nothing in time, although the provider's usage limits and fees still apply.
Integration. Modern business software — CRMs, accounting systems, project tools — exposes APIs. Your Python code can read from your CRM, push data to your accounting system, and trigger actions in your project management tool.
The Business API Ecosystem
A few categories worth knowing:
- Financial data: Alpha Vantage, Yahoo Finance, Quandl (now Nasdaq Data Link), FRED (Federal Reserve)
- Currency exchange: Open Exchange Rates, ExchangeRate-API, Fixer.io
- Weather and geography: OpenWeatherMap, Open-Meteo, OpenStreetMap
- Company and business data: Clearbit, OpenCorporates, Companies House (UK)
- News: NewsAPI, The Guardian API, New York Times API
- Government and public data: US Census Bureau, Bureau of Labor Statistics, World Bank
- E-commerce and shipping: Shopify, FedEx, UPS, USPS APIs
Many of these have free tiers adequate for business analysis. We'll use several throughout this chapter.
21.2 REST API Concepts
Most web APIs you'll encounter are REST APIs (Representational State Transfer). REST isn't a protocol or a standard — it's an architectural style with a few defining characteristics:
Stateless: Each request contains all the information the server needs. The server doesn't remember previous requests from your program.
Resource-based: Data is organized into "resources" (customers, orders, products) and you interact with them through URLs.
Standard HTTP methods: The same HTTP verbs used by browsers are used for API operations.
HTTP Methods
| Method | What It Does | Business Example |
|---|---|---|
| GET | Retrieve data | Fetch a list of orders |
| POST | Create new data | Submit a new customer record |
| PUT | Replace existing data | Update an entire order record |
| PATCH | Partial update | Change only the status field of an order |
| DELETE | Remove data | Delete a draft record |
For business data analysis, you'll use GET the vast majority of the time. POST is useful when submitting data to services (like sending emails or triggering reports). PUT, PATCH, and DELETE become relevant when you're building integrations that write back to systems.
Endpoints
An endpoint is a specific URL that represents a resource or operation. Examples:
GET https://api.example.com/v1/products
GET https://api.example.com/v1/products/42
GET https://api.example.com/v1/orders?status=pending&region=west
POST https://api.example.com/v1/orders
Notice the structure: base URL, version (v1), resource name, optional resource ID, optional query parameters.
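The parts named above can be pulled out of a URL with the standard library. This is a small sketch, not part of any API's tooling: the helper name describe_endpoint is made up for illustration, and it assumes the path follows the /version/resource/id layout used in the examples.

```python
from urllib.parse import urlsplit, parse_qs


def describe_endpoint(url):
    """Split an endpoint URL into base URL, version, resource, ID, and query.

    Hypothetical helper: assumes the path is laid out as /version/resource/id.
    """
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    return {
        "base_url": f"{parts.scheme}://{parts.netloc}",
        "version": segments[0] if segments else None,
        "resource": segments[1] if len(segments) > 1 else None,
        "resource_id": segments[2] if len(segments) > 2 else None,
        # parse_qs returns lists of values; keep the first value of each key
        "query": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


print(describe_endpoint("https://api.example.com/v1/products/42"))
print(describe_endpoint("https://api.example.com/v1/orders?status=pending&region=west"))
```

Real APIs do not always follow this layout, so treat the positions as a convention to check against each API's documentation.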
HTTP Status Codes
The server's response always includes a status code — a three-digit number that tells you whether the request succeeded or failed, and why.
| Code | Meaning | Your Action |
|---|---|---|
| 200 | OK: request succeeded | Use the data |
| 201 | Created: resource was created | Success for POST requests |
| 400 | Bad Request: your request was malformed | Check your parameters |
| 401 | Unauthorized: authentication failed | Check your API key |
| 403 | Forbidden: authenticated but no permission | Check your account tier |
| 404 | Not Found: resource doesn't exist | Check the URL and resource ID |
| 429 | Too Many Requests: rate limited | Wait and retry |
| 500 | Internal Server Error: server problem | Retry later |
| 503 | Service Unavailable: server down | Retry later |
Your code must handle these status codes. Assuming every response is a 200 is how you end up processing error messages as if they were data.
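One way to make that handling explicit is to turn the status code into a decision before touching the response body. The sketch below is illustrative only: the function name classify_status and its category labels are made up here, and the grouping simply follows the table above.

```python
def classify_status(status_code):
    """Map an HTTP status code to the action suggested in the table above.

    Hypothetical helper; the category names are illustrative, not a standard.
    """
    if 200 <= status_code < 300:
        return "success"            # safe to use the data
    if status_code == 429 or status_code >= 500:
        return "retry"              # rate limited or server-side problem
    if status_code in (401, 403):
        return "check_credentials"  # API key or account permissions
    if 400 <= status_code < 500:
        return "fix_request"        # bad parameters, wrong URL, missing resource
    return "unexpected"             # 1xx/3xx are rarely seen by API client code


for code in (200, 201, 404, 429, 503):
    print(code, classify_status(code))
```

A caller would only parse the body when the result is "success" and would log or raise for everything else.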
21.3 The requests Library
Python's built-in urllib can make HTTP requests, but it's verbose and awkward. The requests library is the standard tool for any Python developer working with APIs.
Install it:
pip install requests
Your First API Call
Let's start with the simplest possible example — calling a public API that requires no authentication:
import requests
# Call the Open-Meteo weather API (completely free, no API key required)
# Fetching weather for Chicago (Acme Corp's headquarters city)
response = requests.get(
"https://api.open-meteo.com/v1/forecast",
params={
"latitude": 41.8781,
"longitude": -87.6298,
"current_weather": True,
}
)
print(f"Status code: {response.status_code}")
print(f"Response data: {response.json()}")
Run that and you'll see a 200 status code and a JSON dictionary with Chicago's current weather.
Let's unpack what happened:
- requests.get(url, params=...) sent an HTTP GET request to that URL with the query parameters appended: ?latitude=41.8781&longitude=-87.6298&current_weather=True (requests writes Python's True into the URL as the text True)
- The server returned an HTTP response
- response.status_code gives us the numeric status code
- response.json() parses the response body as JSON and returns a Python dictionary
The Response Object
The response object from any requests call contains everything the server sent back:
response = requests.get("https://api.open-meteo.com/v1/forecast", params={...})
# Status code
print(response.status_code) # 200
# Response body as text
print(response.text) # Raw JSON string
# Response body parsed as Python dictionary/list
data = response.json() # Python dict — use this for data work
# Response headers (metadata about the response)
print(response.headers) # dict-like object
# Whether the request succeeded (True for 2xx status codes)
print(response.ok) # True or False
# Raise an exception if the status code indicates an error
response.raise_for_status() # Raises HTTPError for 4xx, 5xx
requests.get() — Detailed Reference
import requests
response = requests.get(
url="https://api.example.com/data",
# Query parameters — appended to URL as ?key=value&key2=value2
params={
"start_date": "2024-01-01",
"end_date": "2024-12-31",
"format": "json",
},
# HTTP headers — metadata sent with the request
headers={
"Authorization": "Bearer your_token_here",
"Accept": "application/json",
"User-Agent": "AcmeCorp-Analytics/1.0",
},
# Timeout in seconds — ALWAYS set this in production code
timeout=30,
)
The params dictionary is the cleanest way to handle query parameters. requests handles URL encoding automatically — you don't need to worry about escaping special characters or formatting the URL string.
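To see what that encoding looks like without sending a request, the standard library's urlencode can be applied to a params-style dictionary. This is a sketch under the assumption that the result matches what requests would append for simple string and number values; the parameter names and values are made up for illustration.

```python
from urllib.parse import urlencode

# Made-up parameters containing characters that are not safe in a URL
params = {
    "q": "office supplies & furniture",
    "from": "2024-01-01",
    "page": 1,
}

# Spaces become "+", and "&" inside a value becomes "%26" so it cannot be
# mistaken for the separator between parameters
query_string = urlencode(params)
print(f"https://api.example.com/v1/search?{query_string}")
```

Building the same string by hand with f-strings is where stray spaces and unescaped ampersands usually creep in, which is the reason to pass a dictionary instead.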
requests.post() — Sending Data
When you need to create a resource or send data to an API:
import requests
import json
new_record = {
"customer_id": "CUST-4821",
"contact_email": "buyer@client-corp.com",
"deal_value": 45000,
"stage": "Proposal",
}
response = requests.post(
url="https://api.crm-example.com/v1/deals",
headers={
"Authorization": "Bearer your_token",
"Content-Type": "application/json",
},
# Send Python dict as JSON body
json=new_record, # requests serializes this automatically
timeout=30,
)
print(response.status_code) # 201 if created successfully
created_deal = response.json()
print(f"Created deal ID: {created_deal['id']}")
21.4 Query Parameters and Headers
Query Parameters in Depth
Query parameters filter, sort, and configure what an API returns. Every API documents its own parameter names — read the documentation for each API you use.
Common patterns:
# Date range filtering
params = {
"from": "2024-01-01",
"to": "2024-12-31",
}
# Pagination
params = {
"page": 1,
"per_page": 100,
"offset": 0,
"limit": 50,
}
# Field selection — only return the fields you need
params = {
"fields": "id,name,price,category",
}
# Sorting
params = {
"sort_by": "date",
"sort_order": "desc",
}
# Search
params = {
"q": "office supplies",
"category": "furniture",
}
Headers in Depth
HTTP headers carry metadata that's separate from the URL and the message body. You'll use request headers for three main purposes:
Authentication (covered in section 21.5):
headers = {
"Authorization": "Bearer eyJhbGciOiJ...",
"X-API-Key": "your-api-key",
}
Content negotiation — telling the server what format you want back:
headers = {
"Accept": "application/json",
"Accept-Language": "en-US",
}
Identifying your application — good practice and sometimes required:
headers = {
"User-Agent": "AcmeCorp-Analytics/1.0 (priya@acme.com)",
}
21.5 Authentication
Most useful APIs require authentication. The API needs to know who you are to enforce rate limits, track usage, and control access. There are several authentication patterns you'll encounter.
API Keys in Headers
The most common pattern for data APIs:
import requests
api_key = "your_api_key_here"
response = requests.get(
"https://newsapi.org/v2/everything",
headers={
"X-Api-Key": api_key, # Header name varies by API — check docs
},
params={
"q": "office supplies industry",
"from": "2024-01-01",
"language": "en",
},
timeout=30,
)
API Keys in Query Parameters
Some APIs accept the key as a URL parameter:
response = requests.get(
"https://www.alphavantage.co/query",
params={
"function": "OVERVIEW",
"symbol": "SPLS",
"apikey": "your_api_key", # Key in params
},
timeout=30,
)
Bearer Token Authentication
Bearer tokens are commonly used with OAuth 2.0 systems (like connecting to Google, Salesforce, or Microsoft APIs):
access_token = "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..."
response = requests.get(
"https://api.service.com/v1/data",
headers={
"Authorization": f"Bearer {access_token}",
},
timeout=30,
)
HTTP Basic Authentication
Older APIs sometimes use username/password:
response = requests.get(
"https://api.legacy-service.com/data",
auth=("your_username", "your_password"),
timeout=30,
)
requests handles the base64 encoding required for basic auth automatically.
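For reference, this is roughly what that encoding produces. The sketch builds the header by hand with the standard library; the helper name basic_auth_header is made up for illustration, and the credentials are the well-known example pair from the HTTP Basic authentication specification.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header used by HTTP Basic authentication.

    Hypothetical helper shown for illustration; with requests you would
    normally just pass auth=(username, password).
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


print(basic_auth_header("Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption: anyone who sees the header can decode it, so Basic authentication should only be sent over HTTPS.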
21.6 Storing API Credentials Safely
Never hardcode API keys in your scripts. This is the single most important security rule in this chapter. API keys embedded in code get committed to version control, shared in emails, and posted to forums. Even if you only share the script internally, rotating a compromised key means finding every copy.
The standard solution is environment variables: key-value pairs that live in the operating system's environment, not in your code.
Setting Environment Variables
Windows (Command Prompt, permanent):
setx ALPHA_VANTAGE_API_KEY "your_key_here"
setx NEWS_API_KEY "your_key_here"
setx EXCHANGE_RATE_API_KEY "your_key_here"
Windows (PowerShell, permanent):
[Environment]::SetEnvironmentVariable("ALPHA_VANTAGE_API_KEY", "your_key_here", "User")
macOS/Linux (add to ~/.bashrc or ~/.zshrc):
export ALPHA_VANTAGE_API_KEY="your_key_here"
export NEWS_API_KEY="your_key_here"
Using a .env File with python-dotenv
For development, a .env file is more convenient than system environment variables:
pip install python-dotenv
Create a .env file in your project directory:
# .env — NEVER commit this file to version control
ALPHA_VANTAGE_API_KEY=your_key_here
NEWS_API_KEY=your_key_here
EXCHANGE_RATE_API_KEY=your_key_here
Add .env to your .gitignore:
.env
*.env
Load it in your Python code:
import os
from dotenv import load_dotenv
# Load .env file into os.environ
load_dotenv()
# Now access credentials from environment
alpha_vantage_key = os.environ.get("ALPHA_VANTAGE_API_KEY")
news_api_key = os.environ.get("NEWS_API_KEY")
if not alpha_vantage_key:
    raise ValueError(
        "ALPHA_VANTAGE_API_KEY not found in environment. "
        "Set it in your .env file or system environment."
    )
The os.environ.get("KEY_NAME") pattern returns None if the variable doesn't exist, rather than raising an exception. Checking explicitly and raising a descriptive error is better than letting the program fail mysteriously later.
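When a script needs several keys, the check-and-raise pattern can be wrapped in a small helper. This is a sketch only: the name require_env is made up for illustration, and it treats an empty or whitespace-only value the same as a missing one, which is an assumption you may not want in every project.

```python
import os


def require_env(name):
    """Return the environment variable's value or raise a descriptive error.

    Hypothetical helper: blank values are treated as missing.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise ValueError(
            f"{name} not found in environment. "
            "Set it in your .env file or system environment."
        )
    return value


# Typical use near the top of a script, after load_dotenv():
# news_api_key = require_env("NEWS_API_KEY")
```

Failing at startup with the variable's name in the message is easier to diagnose than a 401 response several function calls later.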
21.7 Working with JSON Responses
JSON (JavaScript Object Notation) is the lingua franca of web APIs. It maps directly to Python data types:
| JSON | Python |
|---|---|
| object { } | dict |
| array [ ] | list |
| string "..." | str |
| number | int or float |
| true / false | True / False |
| null | None |
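The mapping is easy to confirm with the standard library's json module, which performs the same conversion that response.json() relies on. The JSON text below is made up for illustration.

```python
import json

# A made-up JSON document containing every type from the table
raw = '{"name": "Acme", "orders": [12, 7.5], "active": true, "parent": null}'

record = json.loads(raw)

for key, value in record.items():
    print(f"{key!r}: {value!r} ({type(value).__name__})")
```

Note that JSON null arrives as None, so a key can be present and still hold no usable value.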
Navigating Nested Structures
Real API responses are nested. Here's a realistic example from a financial data API:
response = requests.get(
"https://www.alphavantage.co/query",
params={
"function": "OVERVIEW",
"symbol": "AAPL",
"apikey": os.environ.get("ALPHA_VANTAGE_API_KEY"),
},
timeout=30,
)
data = response.json()
# data is a large nested dictionary
# Access top-level fields directly
company_name = data["Name"] # "Apple Inc"
industry = data["Industry"] # "Electronic Computers"
market_cap = data["MarketCapitalization"] # "2800000000000"
pe_ratio = data["PERatio"] # "28.5"
print(f"{company_name}: {industry}, P/E: {pe_ratio}")
A more complex nested example — currency exchange rates:
# ExchangeRate-API response structure (keyed v6 endpoint):
# {
#     "result": "success",
#     "base_code": "USD",
#     "conversion_rates": {
#         "EUR": 0.9234,
#         "GBP": 0.7891,
#         "JPY": 149.82,
#         "CAD": 1.3621,
#         ...
#     }
# }
api_key = os.environ.get("EXCHANGE_RATE_API_KEY")
response = requests.get(
    f"https://v6.exchangerate-api.com/v6/{api_key}/latest/USD",
    timeout=30,
)
data = response.json()
# Check if the API returned success
if data["result"] != "success":
    raise RuntimeError(f"API returned error: {data.get('error-type', 'unknown')}")
# The keyed v6 endpoint names this field "conversion_rates";
# the keyless open.er-api.com endpoint calls it "rates"
exchange_rates = data["conversion_rates"]  # This is a dict: {"EUR": 0.9234, ...}
# Get specific rates
eur_rate = exchange_rates["EUR"]
gbp_rate = exchange_rates["GBP"]
cad_rate = exchange_rates.get("CAD")  # .get() returns None instead of raising KeyError
print(f"1 USD = {eur_rate:.4f} EUR")
print(f"1 USD = {gbp_rate:.4f} GBP")
Handling Missing Fields Defensively
Not every response will contain every field you expect. Use .get() with defaults:
# Risky — raises KeyError if "revenue" is missing
revenue = data["financials"]["annual"]["revenue"]
# Safe — returns None if any level is missing
financials = data.get("financials", {})
annual = financials.get("annual", {})
revenue = annual.get("revenue")
# Or use a helper function for deeply nested access
def safe_get(d, *keys, default=None):
    """Safely navigate nested dictionaries."""
    for key in keys:
        if isinstance(d, dict):
            d = d.get(key, default)
        else:
            return default
    return d
revenue = safe_get(data, "financials", "annual", "revenue")
Processing Lists of Records
Many API responses return lists of objects — one item per record:
# A typical paginated list response:
# {
# "status": "ok",
# "totalResults": 847,
# "articles": [
# {"title": "...", "publishedAt": "...", "source": {...}},
# {"title": "...", "publishedAt": "...", "source": {...}},
# ...
# ]
# }
response = requests.get(
"https://newsapi.org/v2/everything",
headers={"X-Api-Key": news_api_key},
params={
"q": "office supplies wholesale distribution",
"language": "en",
"sortBy": "publishedAt",
"pageSize": 20,
},
timeout=30,
)
data = response.json()
articles = data.get("articles", [])
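for article in articles:
    # A field can be present but null (None), so "or" supplies the fallback;
    # a .get() default only applies when the key is missing entirely
    title = article.get("title") or "No title"
    source_name = (article.get("source") or {}).get("name") or "Unknown source"
    published = (article.get("publishedAt") or "")[:10]  # Just the date part
    url = article.get("url") or ""
    print(f"[{published}] {source_name}: {title}")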
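Once each record is reduced to a flat dictionary, it is a short step to a file a spreadsheet can open. The sketch below uses invented sample articles rather than a live response, and the helper name article_to_row is made up for illustration; it writes to an in-memory buffer so it runs without touching the disk.

```python
import csv
import io

# Invented sample records shaped like the list response above,
# including null fields and a missing source
sample_articles = [
    {
        "title": "Paper prices rise",
        "publishedAt": "2024-03-05T14:20:00Z",
        "source": {"id": None, "name": "Trade Weekly"},
    },
    {"title": None, "publishedAt": None, "source": None},
]


def article_to_row(article):
    """Flatten one nested article dictionary into a flat row (hypothetical helper)."""
    source = article.get("source") or {}
    return {
        "published": (article.get("publishedAt") or "")[:10],
        "source": source.get("name") or "Unknown source",
        "title": article.get("title") or "No title",
    }


rows = [article_to_row(article) for article in sample_articles]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["published", "source", "title"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

To write a real file, the StringIO buffer would be replaced by open("articles.csv", "w", newline="", encoding="utf-8").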
21.8 Pagination
APIs don't return all results at once. If a database has 50,000 orders and you request them all, the response would be enormous, slow, and potentially crash the server. Instead, APIs use pagination: they return results in pages, and you make multiple requests to get all the data.
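Before looking at specific APIs, the basic loop can be tried against a stand-in data source. Everything in this sketch is hypothetical: FAKE_ORDERS and fake_fetch imitate a paginated endpoint in memory, and paginate stops when a page comes back shorter than the page size, which is one common stopping rule but not the only one.

```python
import math

# Stand-in for a server-side table of 250 orders
FAKE_ORDERS = list(range(1, 251))


def fake_fetch(page_number, page_size):
    """Imitate one paginated API call (hypothetical, in-memory)."""
    start = (page_number - 1) * page_size
    return FAKE_ORDERS[start:start + page_size]


def pages_needed(total_records, page_size):
    """How many requests it takes to read everything."""
    return math.ceil(total_records / page_size)


def paginate(fetch_page, page_size):
    """Call fetch_page(page_number, page_size) until a short page arrives."""
    all_records = []
    page_number = 1
    while True:
        batch = fetch_page(page_number, page_size)
        all_records.extend(batch)
        if len(batch) < page_size:  # short (or empty) page: nothing left
            break
        page_number += 1
    return all_records


print(pages_needed(50_000, 100))           # requests needed for 50,000 orders
print(len(paginate(fake_fetch, 100)))      # records collected from the fake API
```

Passing the fetch function in as an argument keeps the loop testable without a network connection; in real code it would wrap a requests.get call.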
Page-Based Pagination
The most common pattern — request page 1, page 2, and so on:
import requests
import time
def fetch_all_news_articles(api_key, query, max_pages=10):
    """
    Fetch all matching news articles, handling pagination automatically.
    Returns a list of all article dictionaries.
    """
    all_articles = []
    page_number = 1
    page_size = 100  # Max per page for this API

    while page_number <= max_pages:
        response = requests.get(
            "https://newsapi.org/v2/everything",
            headers={"X-Api-Key": api_key},
            params={
                "q": query,
                "language": "en",
                "sortBy": "publishedAt",
                "page": page_number,
                "pageSize": page_size,
            },
            timeout=30,
        )
        response.raise_for_status()
        data = response.json()
        articles_this_page = data.get("articles", [])

        # If this page is empty, we've retrieved everything
        if not articles_this_page:
            break
        all_articles.extend(articles_this_page)

        # Check if we've retrieved all available results
        total_results = data.get("totalResults", 0)
        if len(all_articles) >= total_results:
            break

        page_number += 1
        # Polite delay between requests (more on this in 21.9)
        time.sleep(0.5)

    return all_articles
# Usage
industry_articles = fetch_all_news_articles(
api_key=news_api_key,
query="office supplies distribution",
max_pages=5,
)
print(f"Retrieved {len(industry_articles)} articles")
Cursor-Based Pagination
Some APIs use cursors instead of page numbers — each response includes a "cursor" token pointing to the next page:
def fetch_with_cursor_pagination(base_url, headers, params, cursor_field="next_cursor"):
    """
    Fetch all results using cursor-based pagination.
    """
    all_results = []
    cursor = None
    # Work on a copy so the caller's params dict is not modified
    params = dict(params or {})

    while True:
        if cursor:
            params["cursor"] = cursor
        response = requests.get(base_url, headers=headers, params=params, timeout=30)
        response.raise_for_status()
        data = response.json()
        results = data.get("results", data.get("data", []))
        all_results.extend(results)

        # Get next cursor; if None, we're done
        cursor = data.get(cursor_field)
        if not cursor:
            break
        time.sleep(0.25)

    return all_results
Offset-Based Pagination
Some APIs use offset and limit parameters:
def fetch_with_offset_pagination(base_url, headers, params, batch_size=100):
    """
    Fetch all results using offset/limit pagination.
    """
    all_results = []
    offset = 0
    # Work on a copy so the caller's params dict is not modified
    params = dict(params or {})

    while True:
        params.update({"limit": batch_size, "offset": offset})
        response = requests.get(base_url, headers=headers, params=params, timeout=30)
        response.raise_for_status()
        data = response.json()
        batch = data.get("results", [])
        all_results.extend(batch)

        # If we got fewer results than requested, we've reached the end
        if len(batch) < batch_size:
            break
        offset += batch_size
        time.sleep(0.25)

    return all_results
21.9 Rate Limiting and Retry Logic
APIs enforce rate limits: a maximum number of requests per second, minute, or day. Exceed them and you'll get 429 responses. Good production code handles rate limits gracefully rather than crashing.
Understanding Rate Limit Headers
Many APIs tell you your current usage in response headers:
response = requests.get(url, headers=headers, params=params, timeout=30)
# Common rate limit header patterns (vary by API):
requests_remaining = response.headers.get("X-RateLimit-Remaining")
rate_limit_reset = response.headers.get("X-RateLimit-Reset")
retry_after = response.headers.get("Retry-After")
if requests_remaining:
    print(f"Requests remaining this window: {requests_remaining}")
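The Retry-After header deserves a little care because HTTP allows it in two forms: a number of seconds or a calendar date. The sketch below handles both with the standard library; the helper name retry_after_seconds is made up for illustration, and returning None for anything it cannot parse is an assumption about how the caller wants to fall back.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Convert a Retry-After header value to seconds to wait, or None.

    Hypothetical helper. Accepts "120" (seconds) or an HTTP date such as
    "Wed, 21 Oct 2015 07:28:00 GMT".
    """
    if not header_value:
        return None
    header_value = header_value.strip()
    if header_value.isdigit():
        return float(header_value)
    try:
        retry_at = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        return None  # unrecognized format: let the caller use its own backoff
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Never return a negative wait if the date is already in the past
    return max(0.0, (retry_at - now).total_seconds())


print(retry_after_seconds("120"))
```

The optional now argument exists only so the date branch can be checked against a fixed clock.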
Exponential Backoff
When you hit a rate limit or a transient server error, don't retry immediately — wait, then retry. Exponential backoff increases the wait time with each retry:
import requests
import time
import logging
logger = logging.getLogger(__name__)
def api_request_with_retry(
    url,
    method="GET",
    headers=None,
    params=None,
    json_body=None,
    max_retries=3,
    initial_backoff_seconds=1.0,
    timeout=30,
):
    """
    Make an API request with automatic retry and exponential backoff.

    Retries on:
    - 429 (rate limited)
    - 500, 502, 503, 504 (server errors)
    - Connection errors and timeouts
    """
    method = method.upper()
    if method not in ("GET", "POST"):
        raise ValueError(f"Unsupported HTTP method: {method}")

    retryable_statuses = (429, 500, 502, 503, 504)
    backoff_seconds = initial_backoff_seconds

    for attempt_number in range(max_retries + 1):
        is_last_attempt = attempt_number == max_retries
        try:
            response = requests.request(
                method,
                url,
                headers=headers,
                params=params,
                json=json_body if method == "POST" else None,
                timeout=timeout,
            )
        except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as error:
            # Network-level failure: retry unless we are out of attempts
            if is_last_attempt:
                raise
            logger.warning(
                f"{type(error).__name__} on attempt {attempt_number + 1}. "
                f"Retrying in {backoff_seconds:.1f}s."
            )
            time.sleep(backoff_seconds)
            backoff_seconds *= 2
            continue

        # Success: any non-error status, not only 200 and 201
        if response.ok:
            return response

        # Client errors (4xx except 429) won't fix themselves, and after the
        # last attempt there is nothing left to try: raise HTTPError
        if response.status_code not in retryable_statuses or is_last_attempt:
            response.raise_for_status()

        # Rate limited: respect a numeric Retry-After header if present
        # (the header may also be a date, which falls back to the backoff)
        wait_time = backoff_seconds
        if response.status_code == 429:
            retry_after = response.headers.get("Retry-After", "").strip()
            if retry_after.isdigit():
                wait_time = float(retry_after)

        logger.warning(
            f"Status {response.status_code} on attempt {attempt_number + 1}. "
            f"Retrying in {wait_time:.1f}s."
        )
        time.sleep(wait_time)
        backoff_seconds *= 2
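To make the doubling concrete, the waits can be computed ahead of time. This sketch is separate from the function above: the name backoff_schedule and the cap_seconds ceiling are illustrative additions, the cap being a common safeguard so that a long run of failures does not produce waits of several minutes.

```python
def backoff_schedule(initial_seconds=1.0, factor=2.0, max_retries=4, cap_seconds=30.0):
    """Return the wait before each retry: initial * factor**attempt, capped.

    Hypothetical helper for illustration only.
    """
    return [
        min(cap_seconds, initial_seconds * factor ** attempt)
        for attempt in range(max_retries)
    ]


print(backoff_schedule())               # waits for 4 retries starting at 1 second
print(backoff_schedule(max_retries=7))  # later waits are held at the cap
```

Many production clients also add a small random amount to each wait so that several programs that failed together do not all retry at the same instant; that refinement is left out here to keep the numbers predictable.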
21.10 Error Handling for API Calls
Production API code needs to handle three categories of failure:
- Network failures: No internet connection, DNS failures, timeouts
- HTTP errors: 4xx (your fault) and 5xx (their fault)
- Data errors: Unexpected response format, missing fields, type mismatches
import requests
import logging

logger = logging.getLogger(__name__)


def fetch_exchange_rates(api_key, base_currency="USD"):
    """
    Fetch current exchange rates from ExchangeRate-API.
    Returns dict of {currency_code: rate} or raises an exception.
    """
    url = f"https://v6.exchangerate-api.com/v6/{api_key}/latest/{base_currency}"

    try:
        response = requests.get(url, timeout=15)
    except requests.exceptions.Timeout as e:
        # Checked first: a connect timeout is both a Timeout and a ConnectionError
        logger.error("Exchange rate API request timed out after 15 seconds")
        raise TimeoutError(
            "Exchange rate API did not respond within 15 seconds. Try again."
        ) from e
    except requests.exceptions.ConnectionError as e:
        logger.error(f"Cannot connect to exchange rate API: {e}")
        raise ConnectionError(
            "Exchange rate API is unreachable. Check internet connection."
        ) from e

    # Check HTTP status
    if response.status_code == 401:
        raise PermissionError(
            "Invalid API key. Check EXCHANGE_RATE_API_KEY environment variable."
        )
    if response.status_code == 429:
        raise RuntimeError(
            "Exchange rate API rate limit exceeded. Upgrade plan or wait."
        )
    if not response.ok:
        raise RuntimeError(
            f"Exchange rate API returned unexpected status: {response.status_code}"
        )

    # Parse JSON (the decode errors raised by response.json() are ValueErrors)
    try:
        data = response.json()
    except ValueError as e:
        logger.error(f"Exchange rate API returned non-JSON response: {response.text[:200]}")
        raise ValueError("Exchange rate API returned malformed response") from e

    # Validate expected structure
    if data.get("result") != "success":
        error_type = data.get("error-type", "unknown")
        raise ValueError(f"Exchange rate API returned error: {error_type}")

    # The keyed v6 endpoint names the rates field "conversion_rates"
    rates = data.get("conversion_rates")
    if not rates or not isinstance(rates, dict):
        raise ValueError("Exchange rate API response missing 'conversion_rates' dictionary")
    return rates
21.11 Real Business API Examples
Currency Exchange Rates for International Sales
Acme Corp has started selling to Canadian and European customers. Priya needs to convert their foreign revenue to USD for the quarterly roll-up.
import os
import requests
from datetime import datetime
def get_exchange_rates(base_currency="USD"):
    """
    Fetch current exchange rates from ExchangeRate-API's open endpoint.
    open.er-api.com is free without registration for basic use, so no key
    is read here. The keyed free tier at https://www.exchangerate-api.com
    allows 1,500 requests/month.
    """
    url = f"https://open.er-api.com/v6/latest/{base_currency}"
    response = requests.get(url, timeout=15)
    response.raise_for_status()
    data = response.json()
    if data.get("result") != "success":
        raise ValueError(f"API error: {data}")
    return data["rates"]
def convert_international_sales(sales_records, exchange_rates):
    """
    Convert foreign currency sales amounts to USD.

    Args:
        sales_records: list of dicts with 'amount', 'currency', 'customer'
        exchange_rates: dict from get_exchange_rates()

    Returns:
        list of records with added 'amount_usd' field
    """
    converted_records = []
    for record in sales_records:
        currency = record["currency"].upper()
        amount = record["amount"]
        if currency == "USD":
            amount_usd = amount
        elif currency in exchange_rates:
            # exchange_rates["EUR"] = 0.9234 means 1 USD = 0.9234 EUR
            # So to get USD: divide the foreign amount by the rate
            amount_usd = amount / exchange_rates[currency]
        else:
            print(f"Warning: Unknown currency '{currency}' for {record['customer']}")
            amount_usd = None
        converted_records.append({
            **record,
            # Compare with None so a legitimate 0.00 sale is not dropped
            "amount_usd": round(amount_usd, 2) if amount_usd is not None else None,
            "conversion_date": datetime.now().strftime("%Y-%m-%d"),
        })
    return converted_records
# Example usage
acme_international_sales = [
{"customer": "Maple Office Supplies Ltd", "amount": 12500.00, "currency": "CAD"},
{"customer": "Euro Bürobedarf GmbH", "amount": 8400.00, "currency": "EUR"},
{"customer": "Pacific Office Co", "amount": 3200.00, "currency": "USD"},
{"customer": "Manchester Business Supplies", "amount": 5600.00, "currency": "GBP"},
]
rates = get_exchange_rates("USD")
converted = convert_international_sales(acme_international_sales, rates)
total_usd = sum(r["amount_usd"] for r in converted if r["amount_usd"] is not None)
print("\nInternational Sales Conversion Summary")
print(f"{'Customer':<35} {'Orig Amount':>12} {'Currency':>8} {'USD Amount':>12}")
print("-" * 72)
for record in converted:
    usd = f"${record['amount_usd']:,.2f}" if record["amount_usd"] is not None else "N/A"
    print(f"{record['customer']:<35} {record['amount']:>12,.2f} {record['currency']:>8} {usd:>12}")
print("-" * 72)
print(f"{'TOTAL USD':>58} ${total_usd:>10,.2f}")
Weather Data for Logistics Planning
Acme's warehouse operations team needs weather alerts for their regional distribution centers:
import requests
def get_weather_for_warehouse(city_name, latitude, longitude):
    """
    Fetch current weather using Open-Meteo (free, no API key required).
    https://open-meteo.com/
    """
    response = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={
            "latitude": latitude,
            "longitude": longitude,
            "current_weather": True,
            "hourly": "precipitation_probability,snow_depth",
            "forecast_days": 2,
            "timezone": "America/Chicago",
        },
        timeout=15,
    )
    response.raise_for_status()
    data = response.json()
    current = data["current_weather"]
    weather_code = current["weathercode"]

    # WMO weather code interpretation (simplified)
    severe_weather_codes = {
        55: "Heavy drizzle",
        65: "Heavy rain",
        75: "Heavy snowfall",
        82: "Violent rain showers",
        95: "Thunderstorm",
        99: "Thunderstorm with heavy hail",
    }
    is_severe = weather_code in severe_weather_codes

    return {
        "city": city_name,
        "temperature_c": current["temperature"],
        "wind_speed_kmh": current["windspeed"],
        "weather_code": weather_code,
        "is_severe": is_severe,
        "condition": severe_weather_codes.get(weather_code, "Normal conditions"),
    }
# Acme's regional distribution centers
acme_warehouses = [
("Chicago HQ", 41.8781, -87.6298),
("Cleveland East", 41.4993, -81.6944),
("Memphis South", 35.1495, -90.0490),
("Denver West", 39.7392, -104.9903),
]
print("Acme Corp Warehouse Weather Status")
print("=" * 55)
for city, lat, lon in acme_warehouses:
    weather = get_weather_for_warehouse(city, lat, lon)
    alert = " *** WEATHER ALERT ***" if weather["is_severe"] else ""
    print(
        f"{weather['city']:<20} {weather['temperature_c']:>5.1f}°C "
        f"Wind: {weather['wind_speed_kmh']:>5.1f} km/h "
        f"{weather['condition']}{alert}"
    )
Public Financial Data with Alpha Vantage
Alpha Vantage's free tier provides company financial data:
import os
import requests
def get_company_overview(ticker_symbol):
"""
Fetch company overview from Alpha Vantage free API.
Free tier: 25 requests/day, 5 requests/minute.
Sign up at https://www.alphavantage.co/support/#api-key
"""
api_key = os.environ.get("ALPHA_VANTAGE_API_KEY")
if not api_key:
raise ValueError("Set ALPHA_VANTAGE_API_KEY environment variable")
response = requests.get(
"https://www.alphavantage.co/query",
params={
"function": "OVERVIEW",
"symbol": ticker_symbol,
"apikey": api_key,
},
timeout=20,
)
response.raise_for_status()
data = response.json()
# Alpha Vantage returns empty dict for invalid symbols
if not data or "Name" not in data:
raise ValueError(f"No data found for ticker: {ticker_symbol}")
return {
"ticker": ticker_symbol,
"name": data.get("Name"),
"industry": data.get("Industry"),
"sector": data.get("Sector"),
"market_cap": data.get("MarketCapitalization"),
"pe_ratio": data.get("PERatio"),
"revenue_ttm": data.get("RevenueTTM"),
"profit_margin": data.get("ProfitMargin"),
"description": data.get("Description", "")[:200] + "...",
}
def get_stock_quote(ticker_symbol):
    """
    Get current stock quote from Alpha Vantage.
    """
    api_key = os.environ.get("ALPHA_VANTAGE_API_KEY")
    # Fail early with a clear message instead of sending a request without a key
    if not api_key:
        raise ValueError("Set ALPHA_VANTAGE_API_KEY environment variable")
    response = requests.get(
        "https://www.alphavantage.co/query",
        params={
            "function": "GLOBAL_QUOTE",
            "symbol": ticker_symbol,
            "apikey": api_key,
        },
        timeout=20,
    )
    response.raise_for_status()
    data = response.json()
    quote_data = data.get("Global Quote", {})
    if not quote_data:
        raise ValueError(f"No quote data for {ticker_symbol}")
    return {
        "ticker": ticker_symbol,
        "price": float(quote_data.get("05. price", 0)),
        "change": float(quote_data.get("09. change", 0)),
        "change_pct": quote_data.get("10. change percent", "0%"),
        "volume": int(quote_data.get("06. volume", 0)),
        "latest_trading_day": quote_data.get("07. latest trading day"),
    }
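The overview fields above are passed through exactly as the API returns them. Alpha Vantage generally delivers numeric fields such as MarketCapitalization and PERatio as strings, and a missing value may arrive as a placeholder rather than a number. The following is a minimal sketch of a conversion helper under that assumption. The name parse_numeric_field, the placeholder values it checks for, and the sample dictionary are all hypothetical illustrations, not part of the Alpha Vantage API or a real response.

```python
def parse_numeric_field(value):
    """Convert an API string field to float, or None if missing or not numeric."""
    if value is None:
        return None
    # Tolerate surrounding whitespace and a trailing percent sign
    text = str(value).strip().rstrip("%")
    # Assumed placeholders for "no value"; adjust to what the API actually sends
    if text in ("", "None", "-"):
        return None
    try:
        return float(text)
    except ValueError:
        return None


# Made-up values shaped like the dict returned by get_company_overview()
sample_overview = {
    "market_cap": "1250000000",
    "pe_ratio": "None",
    "profit_margin": "0.187",
}

cleaned = {key: parse_numeric_field(value) for key, value in sample_overview.items()}
print(cleaned)
```

Converting once, right after the fetch, keeps later arithmetic (ratios, comparisons, sorting) from tripping over string values.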
News API for Business Monitoring
Tracking industry news for competitive intelligence:
import os
import requests
from datetime import datetime, timedelta
def fetch_industry_news(search_query, days_back=7, max_articles=20):
    """
    Fetch recent news articles using NewsAPI.
    Free tier: 100 requests/day, developer use only.
    Sign up at https://newsapi.org/register
    """
    api_key = os.environ.get("NEWS_API_KEY")
    if not api_key:
        raise ValueError("Set NEWS_API_KEY environment variable")
    from_date = (datetime.now() - timedelta(days=days_back)).strftime("%Y-%m-%d")
    response = requests.get(
        "https://newsapi.org/v2/everything",
        headers={"X-Api-Key": api_key},
        params={
            "q": search_query,
            "from": from_date,
            "language": "en",
            "sortBy": "relevancy",
            "pageSize": max_articles,
        },
        timeout=20,
    )
    response.raise_for_status()
    data = response.json()
    if data.get("status") != "ok":
        raise ValueError(f"NewsAPI error: {data.get('message', 'Unknown error')}")
    articles = []
    for article in data.get("articles", []):
        articles.append({
            "title": article.get("title"),
            "source": (article.get("source") or {}).get("name"),
            # These fields can be null in the response, and .get() only applies
            # its default when the key is missing, so fall back to "" explicitly
            "published_at": (article.get("publishedAt") or "")[:10],
            "description": (article.get("description") or "")[:200],
            "url": article.get("url"),
        })
    return articles
# Usage for Priya's competitive monitoring
competitor_news = fetch_industry_news(
"office supplies distribution wholesale",
days_back=7,
max_articles=10,
)
print(f"Industry News — Last 7 Days ({len(competitor_news)} articles)")
print("=" * 65)
for article in competitor_news:
    print(f"\n[{article['published_at']}] {article['source']}")
    print(f" {article['title']}")
    if article['description']:
        print(f" {article['description'][:100]}...")
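News searches often return the same story more than once, for example when a wire report is republished by several outlets. A minimal sketch of one way to collapse such repeats before printing, assuming a case-insensitive title match is a good enough signal. The function name deduplicate_articles and the sample articles are hypothetical illustrations, not NewsAPI features or real data.

```python
def deduplicate_articles(articles):
    """Keep the first article for each normalized title; skip untitled entries."""
    seen_titles = set()
    unique = []
    for article in articles:
        # Normalize so "Acme Expands" and " acme expands " count as the same story
        title_key = (article.get("title") or "").strip().lower()
        if not title_key or title_key in seen_titles:
            continue
        seen_titles.add(title_key)
        unique.append(article)
    return unique


# Made-up articles in the shape returned by fetch_industry_news()
sample_articles = [
    {"title": "Paper Prices Rise Again", "source": "Outlet A"},
    {"title": "paper prices rise again ", "source": "Outlet B"},
    {"title": "New Distributor Enters Market", "source": "Outlet C"},
    {"title": None, "source": "Outlet D"},
]

unique_articles = deduplicate_articles(sample_articles)
for article in unique_articles:
    print(article["source"], "-", article["title"])
```

Matching on titles is deliberately simple. Matching on the url field instead would be stricter but would miss republished copies hosted at different addresses.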
21.12 Building a Reusable API Client
Once you're making calls to multiple APIs, repeating authentication, retry logic, and error handling in every script is wasteful. A reusable API client class encapsulates these concerns once:
import os
import time
import logging
import requests
from typing import Any
logger = logging.getLogger(__name__)
class APIClient:
    """
    Reusable HTTP API client with authentication, retry logic, and error handling.
    Designed for business data retrieval where reliability matters more than speed.
    """
    def __init__(
        self,
        base_url,
        api_key=None,
        api_key_header="X-Api-Key",
        bearer_token=None,
        default_timeout=30,
        max_retries=3,
        rate_limit_pause_seconds=1.0,
    ):
        self.base_url = base_url.rstrip("/")
        self.default_timeout = default_timeout
        self.max_retries = max_retries
        self.rate_limit_pause_seconds = rate_limit_pause_seconds
        # Build default headers
        self.default_headers = {"Accept": "application/json"}
        if api_key:
            self.default_headers[api_key_header] = api_key
        if bearer_token:
            self.default_headers["Authorization"] = f"Bearer {bearer_token}"
        # Use a requests.Session for connection pooling (faster for multiple calls)
        self.session = requests.Session()
        self.session.headers.update(self.default_headers)
    def get(self, endpoint, params=None, extra_headers=None):
        """
        Make a GET request. Returns parsed JSON response.
        Args:
            endpoint: URL path relative to base_url (e.g., "/v1/rates/USD")
            params: dict of query parameters
            extra_headers: additional headers for this specific request
        Returns:
            Parsed JSON as dict or list
        Raises:
            requests.HTTPError for non-retryable HTTP errors
            ConnectionError for network failures after all retries
        """
        url = f"{self.base_url}{endpoint}"
        headers = extra_headers or {}
        return self._request_with_retry("GET", url, params=params, headers=headers)
    def post(self, endpoint, json_body=None, params=None, extra_headers=None):
        """Make a POST request. Returns parsed JSON response."""
        url = f"{self.base_url}{endpoint}"
        headers = extra_headers or {}
        return self._request_with_retry(
            "POST", url, params=params, headers=headers, json_body=json_body
        )
    def _request_with_retry(self, method, url, params=None, headers=None, json_body=None):
        """Internal method: execute request with retry/backoff logic."""
        backoff = self.rate_limit_pause_seconds
        for attempt in range(self.max_retries + 1):
            try:
                response = self.session.request(
                    method=method,
                    url=url,
                    params=params,
                    headers=headers,
                    json=json_body,
                    timeout=self.default_timeout,
                )
                # Success: any status below 400 (covers 200, 201, 202, 204, ...),
                # so responses such as 202 Accepted are not re-sent by mistake
                if response.ok:
                    if response.status_code == 204 or not response.text:
                        return {}
                    return response.json()
                # Rate limited: wait and retry while attempts remain
                if response.status_code == 429 and attempt < self.max_retries:
                    retry_after = response.headers.get("Retry-After")
                    try:
                        wait = float(retry_after) if retry_after else backoff * 2
                    except ValueError:
                        # Retry-After may be an HTTP date rather than a number of seconds
                        wait = backoff * 2
                    logger.warning(f"Rate limited. Waiting {wait:.1f}s (attempt {attempt + 1})")
                    time.sleep(wait)
                    backoff = min(backoff * 2, 60)
                    continue
                # Retryable server errors
                if response.status_code in (500, 502, 503, 504) and attempt < self.max_retries:
                    logger.warning(
                        f"Server error {response.status_code}. "
                        f"Retrying in {backoff:.1f}s (attempt {attempt + 1})"
                    )
                    time.sleep(backoff)
                    backoff = min(backoff * 2, 60)
                    continue
                # All other errors (and retryable ones on the final attempt): raise immediately
                response.raise_for_status()
            except requests.exceptions.Timeout:
                if attempt < self.max_retries:
                    logger.warning(f"Timeout. Retrying in {backoff:.1f}s (attempt {attempt + 1})")
                    time.sleep(backoff)
                    backoff = min(backoff * 2, 60)
                else:
                    raise
            except requests.exceptions.ConnectionError:
                if attempt < self.max_retries:
                    logger.warning(
                        f"Connection error. Retrying in {backoff:.1f}s (attempt {attempt + 1})"
                    )
                    time.sleep(backoff)
                    backoff = min(backoff * 2, 60)
                else:
                    raise
        # Safety net in case the loop ends without returning or raising
        raise RuntimeError(f"All {self.max_retries + 1} attempts exhausted for: {url}")
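To get a feel for how long the client may spend waiting, it can help to list the pauses the doubling rule produces. The sketch below mirrors the backoff used for server errors, timeouts, and connection errors in the class above: start at rate_limit_pause_seconds, double after each failed attempt, and cap at 60 seconds. The rate-limit branch may wait differently because it honors a Retry-After header when one is present. The function name backoff_schedule is a hypothetical helper for illustration and is not part of APIClient.

```python
def backoff_schedule(initial_pause=1.0, max_retries=3, cap=60.0):
    """Return the pause (in seconds) taken before each retry."""
    waits = []
    backoff = initial_pause
    for _ in range(max_retries):
        waits.append(backoff)
        # Double the pause for the next retry, but never exceed the cap
        backoff = min(backoff * 2, cap)
    return waits


default_waits = backoff_schedule()
slow_waits = backoff_schedule(initial_pause=10.0, max_retries=5)

print("Default client pauses:", default_waits, "total:", sum(default_waits))
print("Slower configuration:", slow_waits, "total:", sum(slow_waits))
```

With the default settings the pauses add up to only a few seconds. A larger initial pause or more retries pushes the worst case into minutes, which matters when a script runs on a schedule.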
With this client, calling any API becomes clean and consistent:
# Exchange rate client
exchange_client = APIClient(
    base_url="https://open.er-api.com",
    default_timeout=15,
)
rates_data = exchange_client.get("/v6/latest/USD")
exchange_rates = rates_data["rates"]
# News client
news_client = APIClient(
    base_url="https://newsapi.org",
    api_key=os.environ.get("NEWS_API_KEY"),
    api_key_header="X-Api-Key",
)
news_data = news_client.get("/v2/everything", params={"q": "office supplies", "pageSize": 10})
articles = news_data.get("articles", [])
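Once exchange_rates is in hand, converting a USD amount is a dictionary lookup and a multiplication, since each rate is expressed as units of the target currency per 1 USD. A minimal sketch, assuming that rate convention. The function name convert_from_usd is hypothetical, and the sample rates are made-up round numbers for illustration, not live market values.

```python
def convert_from_usd(amount_usd, currency, rates):
    """Convert a USD amount using a {currency_code: units_per_usd} mapping."""
    if currency not in rates:
        raise KeyError(f"No exchange rate available for {currency}")
    # Round to two decimals for display; keep full precision for accounting work
    return round(amount_usd * rates[currency], 2)


# Illustrative rates only, in the same shape as rates_data["rates"]
sample_rates = {"USD": 1.0, "EUR": 0.9, "CAD": 1.35}

invoice_usd = 1000
for code in ("EUR", "CAD"):
    print(f"{invoice_usd} USD = {convert_from_usd(invoice_usd, code, sample_rates)} {code}")
```

Raising on an unknown currency code, rather than returning zero, makes a missing rate visible instead of silently producing a wrong total.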
21.13 Practical Summary: A Complete API Data Pull
Here's how a real business data pull comes together, using everything from this chapter:
"""
acme_market_intelligence.py
Pulls industry news, exchange rates, and weather data from public APIs.
Combines them into a market intelligence summary for Priya's Monday report.
Requirements:
pip install requests python-dotenv
Environment variables required:
NEWS_API_KEY — from newsapi.org (free tier)
(Exchange rates and weather are free/no key required)
"""
import os
import csv
import logging
from datetime import datetime
from dotenv import load_dotenv
import requests
# Load environment variables from .env file
load_dotenv()
# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)
logger = logging.getLogger(__name__)
def fetch_exchange_rates():
    """Fetch USD exchange rates — free, no API key required."""
    logger.info("Fetching exchange rates...")
    response = requests.get("https://open.er-api.com/v6/latest/USD", timeout=15)
    response.raise_for_status()
    data = response.json()
    return data["rates"]
def fetch_industry_news(query, max_articles=10):
    """Fetch industry news via NewsAPI."""
    api_key = os.environ.get("NEWS_API_KEY")
    if not api_key:
        logger.warning("NEWS_API_KEY not set; skipping news fetch")
        return []
    logger.info(f"Fetching news for: '{query}'")
    response = requests.get(
        "https://newsapi.org/v2/everything",
        headers={"X-Api-Key": api_key},
        params={
            "q": query,
            "language": "en",
            "sortBy": "publishedAt",
            "pageSize": max_articles,
        },
        timeout=20,
    )
    response.raise_for_status()
    data = response.json()
    return [
        {
            "title": a.get("title", ""),
            "source": a.get("source", {}).get("name", ""),
            "published": a.get("publishedAt", "")[:10],
            "url": a.get("url", ""),
        }
        for a in data.get("articles", [])
    ]
def save_exchange_rates_to_csv(rates, output_path):
    """Save exchange rates snapshot to CSV for historical tracking."""
    today = datetime.now().strftime("%Y-%m-%d")
    currencies_of_interest = ["EUR", "GBP", "CAD", "JPY", "AUD", "MXN", "CHF"]
    rows = []
    for currency in currencies_of_interest:
        if currency in rates:
            rows.append({
                "date": today,
                "base_currency": "USD",
                "target_currency": currency,
                "rate": rates[currency],
            })
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["date", "base_currency", "target_currency", "rate"])
        writer.writeheader()
        writer.writerows(rows)
    logger.info(f"Exchange rates saved to {output_path}")
def generate_market_intelligence_report(exchange_rates, news_articles, output_path):
    """Write a plain-text market intelligence summary."""
    today = datetime.now().strftime("%Y-%m-%d")
    with open(output_path, "w", encoding="utf-8") as f:
        f.write("ACME CORP MARKET INTELLIGENCE BRIEF\n")
        f.write(f"Generated: {today}\n")
        f.write("=" * 60 + "\n\n")
        f.write("EXCHANGE RATES (USD Base)\n")
        f.write("-" * 40 + "\n")
        for currency in ["EUR", "GBP", "CAD", "JPY", "AUD"]:
            if currency in exchange_rates:
                f.write(f" 1 USD = {exchange_rates[currency]:.4f} {currency}\n")
        f.write("\n")
        f.write(f"INDUSTRY NEWS ({len(news_articles)} articles)\n")
        f.write("-" * 40 + "\n")
        for article in news_articles[:5]:
            f.write(f"\n[{article['published']}] {article['source']}\n")
            f.write(f" {article['title']}\n")
    logger.info(f"Market intelligence report saved to {output_path}")
if __name__ == "__main__":
    exchange_rates = fetch_exchange_rates()
    news_articles = fetch_industry_news(
        "office supplies wholesale distribution",
        max_articles=10,
    )
    save_exchange_rates_to_csv(exchange_rates, "acme_exchange_rates_today.csv")
    generate_market_intelligence_report(
        exchange_rates,
        news_articles,
        "acme_market_intelligence.txt",
    )
    print("\nMarket intelligence pull complete.")
    print(f" Exchange rates: {len(exchange_rates)} currencies")
    print(f" News articles: {len(news_articles)} articles")
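The script above opens its CSV in "w" mode, so each run replaces the previous snapshot. If the goal is a history that grows week over week, one option is to append instead and write the header only when the file is new. The following is a minimal sketch of that variation. The function name append_rates_to_history, the file name, and the sample rows are hypothetical and use made-up values. It is an alternative to consider, not the script's own behavior.

```python
import csv
import os
import tempfile

FIELDNAMES = ["date", "base_currency", "target_currency", "rate"]


def append_rates_to_history(rows, history_path):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    needs_header = not os.path.exists(history_path) or os.path.getsize(history_path) == 0
    with open(history_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if needs_header:
            writer.writeheader()
        writer.writerows(rows)


# Demonstration in a temporary directory, simulating two weekly runs
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "acme_exchange_rate_history.csv")
    week_one = [{"date": "2024-01-01", "base_currency": "USD",
                 "target_currency": "EUR", "rate": 0.91}]
    week_two = [{"date": "2024-01-08", "base_currency": "USD",
                 "target_currency": "EUR", "rate": 0.92}]
    append_rates_to_history(week_one, path)
    append_rates_to_history(week_two, path)
    with open(path, newline="", encoding="utf-8") as f:
        history = list(csv.DictReader(f))

print(f"{len(history)} rows in history")
```

Running the pull twice on the same day would add duplicate rows under this approach, so a production version would likely check for an existing entry for the date first.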
Chapter Summary
APIs transform your Python code from a tool that processes internal data into a system that connects to the world. The key ideas from this chapter:
- APIs are structured data services accessed over HTTP, returning JSON you can work with directly in Python
- The requests library makes HTTP calls straightforward: requests.get() for fetching data, response.json() for parsing it
- Authentication usually means API keys in headers or query params, always stored in environment variables and never hardcoded
- Error handling is non-negotiable in production code: check status codes, handle network failures, validate response structure
- Pagination is the norm for large datasets — design your fetch functions to handle multi-page responses from the start
- Rate limiting is a constraint, not an obstacle — exponential backoff handles it cleanly
- A reusable APIClient class eliminates repetition across your API-consuming scripts
The next chapter introduces scheduling: once you've built scripts like these, you'll want them to run automatically on a schedule. Chapter 22 shows you how.
Next: Chapter 22 — Scheduling and Task Automation