Class 22

Application Programming Interfaces

Objectives for today

  • Collect data from the web using an API.
  • Write data to a file.

An important component of many scientific applications is data collection and data analysis. Today we’ll look at some ways to get data from the internet. Download this starter file.

The Internet

API stands for “Application Programming Interface”. Before introducing the APIs that we communicate with over the internet, let’s talk about the internet itself. The internet is a global network of networks: machines connected to one another by wires (WiFi is like invisible wires). To communicate with one another, these machines need a few things:

  1. Identification: each machine is identified by an internet protocol (IP) address. The common (IPv4) form is x.x.x.x, with each x being a number between 0 and 255. Visit the website https://whatismyipaddress.com/ to determine your computer’s IP address.
  2. Protocol: a protocol defines how computers send requests to each other and respond to them. The requests and responses are sent in the form of packets.
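
As a small illustration of the identification step, Python's standard ipaddress module can check whether a string is a well-formed address of the x.x.x.x form. This is only a sketch, and the helper name is_valid_ipv4 is made up for this example.

```python
import ipaddress

def is_valid_ipv4(text):
    """Return True if text looks like x.x.x.x with each x between 0 and 255."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        # Raised for out-of-range numbers, missing parts, letters, etc.
        return False

print(is_valid_ipv4("140.233.20.9"))    # True
print(is_valid_ipv4("140.233.20.256"))  # False: 256 is out of range
```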

Open up a browser and, in the URL bar where you would usually type the name of a website, enter http://140.233.20.9.

The Middlebury Computer Science website! In fact, every website you visit is hosted on some machine with an IP address like this. But it would be annoying to type in a bunch of numbers whenever we want to go to a website. The Domain Name System (DNS) solves this problem: it is pretty much a look-up table from the readable host names in URLs to IP addresses, and your internet service provider (ISP) typically provides the server that performs the look-up. For example, the host name in http://www.middlebury.edu/academics/cs maps to 140.233.20.9.
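
We can try a DNS look-up ourselves from Python. The sketch below uses socket.gethostbyname from the standard library; the helper name lookup_ip is made up for this example. Note that it takes a host name (www.middlebury.edu), not a full URL, and that the answer can change if a website moves to a different machine.

```python
import socket

def lookup_ip(host_name):
    """Ask DNS for the IPv4 address of a host name (not a full URL)."""
    return socket.gethostbyname(host_name)

# Example call (needs a network connection, so it is left commented out):
# print(lookup_ip("www.middlebury.edu"))
```

If the input is already an IP address, gethostbyname simply returns it unchanged.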

When you enter a URL in your browser, you are actually making a request to the computer that hosts the website. Your computer is called the client and the host computer (where the website lives) is called the server. Specifically, your browser makes a HyperText Transfer Protocol (HTTP) request to the server. The server then returns a response, typically an HTML file that gets rendered in your browser.

Instead of requesting the home page for a website, we can make more interesting requests to computers. This is like calling a function on another computer, so we need to specify the values for parameters we want to pass in. Furthermore, we may need to authenticate our access to the API. We will use the Python requests package to make requests to an API. The general form of making a request is

import requests

# base_url is the base URL such as http://some_website.com
# path_to_api specifies the path/endpoint to the specific API we want to use (more details soon)
response = requests.get('base_url/path_to_api?input1&input2&input3&key=YOUR_API_KEY')

The command above will get a response from the API at http://some_website.com/path_to_api with inputs input1, input2 and input3, where the API key is YOUR_API_KEY. This key will be a mix of numbers and letters. It is a way for us to (1) authenticate our request and (2) keep track of how many calls we make to the API. Note that a question mark (?) separates the path from the inputs, and the ampersand (&) is used to delimit inputs to the API call, similar to how we use commas (,) to delimit arguments passed to functions. Each input is usually written as name=value, as with key=YOUR_API_KEY.
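
To make the general form concrete, here is a minimal sketch that assembles such a URL with urlencode from the standard library. The function name build_url and the input values "a" and "b" are placeholders invented for this example; in a real URL the inputs usually follow a question mark as name=value pairs.

```python
from urllib.parse import urlencode

def build_url(base_url, path_to_api, inputs):
    """Join the base URL, the endpoint path and the name=value inputs."""
    query = urlencode(inputs)  # joins the pairs with "&"
    return f"{base_url}/{path_to_api}?{query}"

url = build_url(
    "http://some_website.com",
    "path_to_api",
    {"input1": "a", "input2": "b", "key": "YOUR_API_KEY"},
)
print(url)
# http://some_website.com/path_to_api?input1=a&input2=b&key=YOUR_API_KEY
```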

As a quick example which doesn’t require an API key, let’s translate a message using the FunTranslations API (here using the Yoda translator). Note that for inputs that have spaces, we need to replace them with %20 which encodes a space.

import requests

endpoint = 'https://api.funtranslations.com/translate/yoda.json'
message = "computer science is my favorite subject".replace(" ", "%20") # replace spaces
url = endpoint + "?text=" + message

response = requests.get(url)

print(response.content)

The translated message is somewhere in there. In fact, what we’re seeing is JSON describing the response from the server. JSON stands for JavaScript Object Notation, which is a common format for exchanging data. Once parsed, we can access information in JSON just like we access key-value pairs in a dictionary. There is a Python module for parsing this representation (specifically check out the json.loads function).

import json

content_json = json.loads(response.content)
print(content_json)
translated_message = content_json["contents"]["translated"]
print(translated_message)
My favorite subject,  computer science is
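
To practice without using up any API calls, we can run json.loads on a hand-written sample. The sample below is not a real server response: it only mimics the "contents" and "translated" keys used above, and the "text" key is made up.

```python
import json

# A hand-written sample, not a real response from the API
sample = b'{"contents": {"translated": "Hello, there", "text": "hello there"}}'

data = json.loads(sample)  # bytes (or a string) in, nested dictionaries out
print(type(data))                      # <class 'dict'>
print(data["contents"]["translated"])  # Hello, there
```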

Although the FunTranslations API does not require a key, it limits the number of calls from a specific IP address to 60 per day and 5 per hour, so be mindful of how many times you make a request to this API.
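
The Yoda example replaced spaces with %20 by hand. Spaces are not the only characters that need this treatment (& and ? have special meanings in a URL, for example), so a safer sketch is to let quote from the standard library do the encoding. The second message is made up for illustration.

```python
from urllib.parse import quote

message = "computer science is my favorite subject"
encoded = quote(message)  # encodes spaces and other special characters
print(encoded)
# computer%20science%20is%20my%20favorite%20subject

print(quote("cats & dogs?"))
# cats%20%26%20dogs%3F
```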

Retrieving weather data using OpenWeatherMap

The next API we will use is set up by OpenWeatherMap, which will allow us to collect weather data. In this case, however, we need to provide a key.

To get started, click Create an Account here. Once you have an account (and your email is verified) click the dropdown at the top-right where your username is displayed and select My API keys (or click here).

Copy the long combination of characters under the Key field to the API_KEY constant in get_weather.py. We are using the free version of OpenWeatherMap’s API, which puts a limit of 60 calls to the API per minute. This should be enough for our exercises today, but please be mindful of how many times you are running your program.

We communicate with OpenWeatherMap the same way we communicated with FunTranslations: we just define the URL with the appropriate inputs and, this time, with our API key.

Let’s write a get_current_temperature function in which we will get the current temperature for some input ZIP code (which will be a command-line argument to our program).

Every API is different in terms of how to call it and how the response is formatted. We should look at some documentation! Investigate the documentation here:

https://openweathermap.org/current#zip

It looks like we need to specify the ZIP code (via zip) and then a country code, and then our API key (via appid). For example:

https://api.openweathermap.org/data/2.5/weather?zip=05753,us&appid=abc123456789xyz

Of course, abc123456789xyz is not really our API key, so you would replace it with the one associated with your OpenWeatherMap account.
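
To see how this URL is put together, we can take it apart with urlparse and parse_qs from the standard library. This sketch only inspects the example URL; it does not contact OpenWeatherMap.

```python
from urllib.parse import urlparse, parse_qs

url = "https://api.openweathermap.org/data/2.5/weather?zip=05753,us&appid=abc123456789xyz"

parts = urlparse(url)
print(parts.netloc)  # api.openweathermap.org
print(parts.path)    # /data/2.5/weather

inputs = parse_qs(parts.query)  # maps each input name to a list of values
print(inputs["zip"])    # ['05753,us']
print(inputs["appid"])  # ['abc123456789xyz']
```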

Now, suppose we did:

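import requests

API_KEY = "abc123456789xyz" # replace with your key
# BASE_URL stops before the endpoint name so that "/weather" can be appended below
BASE_URL = "https://api.openweathermap.org/data/2.5"
zip_code = "05753" # Middlebury
response = requests.get(f"{BASE_URL}/weather?zip={zip_code},us&appid={API_KEY}")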

print(response.content)
b'{"coord":{"lon":-73.1716,"lat":43.9919},"weather":[{"id":800,"main":"Clear","description":"clear sky","icon":"01d"}],"base":"stations","main":{"temp":280.87,"feels_like":277.44,"temp_min":277.66,"temp_max":281.41,"pressure":1022,"humidity":53,"sea_level":1022,"grnd_level":997},"visibility":10000,"wind":{"speed":6.17,"deg":190},"clouds":{"all":0},"dt":1764084432,"sys":{"type":1,"id":3062,"country":"US","sunrise":1764072049,"sunset":1764105546},"timezone":-18000,"id":0,"name":"Middlebury","cod":200}'

Note the use of f-strings to format the URL in a convenient way (as was done in Programming Assignment 8).

We could investigate the format of the response but, again, some documentation might be helpful. We can also add units=imperial to the URL to get the results in Fahrenheit (the default is Kelvin).
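
The "temp" value of 280.87 in the response above is in Kelvin. As a sanity check on what units=imperial should give us, here is a small sketch that converts it by hand. The sample below copies only a few fields from the "main" part of that response, and the function name kelvin_to_fahrenheit is made up for this example.

```python
import json

# A few fields copied from the "main" part of the response shown above
sample = b'{"main": {"temp": 280.87, "feels_like": 277.44, "humidity": 53}}'

def kelvin_to_fahrenheit(kelvin):
    """Convert a temperature from Kelvin to degrees Fahrenheit."""
    return (kelvin - 273.15) * 9 / 5 + 32

temp_k = json.loads(sample)["main"]["temp"]
print(round(kelvin_to_fahrenheit(temp_k), 1))  # 45.9
```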

Complete the get_current_temperature function which takes in a zip_code (string) and returns the current temperature in Fahrenheit.

def get_current_temperature(zip_code):
    """
    Get current temperature from online API

    Args:
        zip_code: Zip code as string

    Returns:
        Current temperature as float
    """
    url = f"{BASE_URL}/weather?zip={zip_code},us&appid={API_KEY}&units=imperial"
    response = requests.get(url)
    # Use an "assert" to make sure we received a valid response
    # If not, response.content will be printed
    assert response.ok, response.content
    data = json.loads(response.content)
    return data["main"]["temp"]

Getting the complete 5 day / 3 hour forecast

As a final exercise, extend your get_weather program to get the forecasted weather over 5 days (with data points every 3 hours) and then plot it. Since we may still want our get_weather program to retrieve the current weather, we’re going to add a second command-line argument that will allow us to switch between current and forecast. For example,

python get_weather.py 05753 current

should simply print out the current temperature in Middlebury. However, running

python get_weather.py 05753 forecast

should plot the forecasted temperature over five days (using matplotlib). I’m withholding some details about what to do so that you can practice researching, designing and implementing some things on your own (though I would suggest working in pairs for this exercise).
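
One possible way to handle the two command-line arguments is sketched below. It only covers reading and checking the arguments (the forecast itself is still up to you), and the function name parse_args and the usage message are made up for this example.

```python
import sys

def parse_args(argv):
    """Return (zip_code, mode) from a list shaped like sys.argv."""
    if len(argv) != 3 or argv[2] not in ("current", "forecast"):
        raise ValueError("usage: python get_weather.py ZIP_CODE current|forecast")
    return argv[1], argv[2]

# In get_weather.py this would be called as parse_args(sys.argv)
print(parse_args(["get_weather.py", "05753", "forecast"]))
# ('05753', 'forecast')
```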

You should start by looking up how the API can be used here. Other functions you may find useful:

  1. plt.gca().set_xticks to set the values used for x-axis ticks.
  2. plt.gca().set_xticklabels to set the labels associated with each x-tick value.
  3. datetime.datetime.fromtimestamp to convert a UTC (Unix) timestamp (what you get in the "dt" field in the response) into a more readable date and time.
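
As a quick illustration of the third function, the sketch below converts the "dt" value from the current-weather response shown earlier. The helper name to_label and the label format are choices made for this example. Passing tz=timezone.utc gives the time in UTC; without it, fromtimestamp uses your computer's local time zone.

```python
from datetime import datetime, timezone

def to_label(timestamp):
    """Turn seconds since Jan 1, 1970 into a short date-and-time label (UTC)."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%m/%d %H:%M")

print(to_label(1764084432))  # 11/25 15:27
```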

A possible implementation will be posted here after class.

Writing the data to a file

What if we wanted to write the data to a file instead of plotting it? We can open a file just like we opened one for reading, but this time we will pass in 'w' to write the data. Let’s write our own CSV (comma-separated values) to a file:

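with open("forecast.csv", "w") as f:
    f.write("UTC,Time,Temperature\n")
    # times, labels and temps are assumed to be the equal-length lists
    # built in the forecast exercise
    for i in range(len(temps)):
        f.write(f"{times[i]},{labels[i]},{temps[i]}\n")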
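
Writing the commas ourselves works as long as none of the values contain a comma. As an alternative sketch, the standard csv module takes care of such cases by adding quotes where needed. The values and the file name forecast_sample.csv below are made up so that the example runs on its own.

```python
import csv

# Made-up sample values standing in for the forecast data
times = [1764084432, 1764095232]
labels = ["11/25 15:27", "11/25 18:27"]
temps = [45.9, 44.1]

# newline="" lets the csv module control the line endings itself
with open("forecast_sample.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["UTC", "Time", "Temperature"])
    for row in zip(times, labels, temps):
        writer.writerow(row)
```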

Now, our data is saved so we can plot it without having to call the API all the time (and potentially go over the rate limit):

import datascience as ds
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt

table = ds.Table.read_table("forecast.csv")
plt.plot(table["UTC"], table["Temperature"])
plt.xlabel('Time')
plt.ylabel('Temperature (F)')
plt.gca().set_xticks(table["UTC"][::4])
plt.gca().set_xticklabels(table["Time"][::4])
plt.xticks(rotation=30) # rotate after the tick labels have been set
plt.savefig("forecast.png", bbox_inches='tight')
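
If the datascience package is not available, the same file can be read with the standard csv module. The sketch below uses a made-up two-row sample held in memory (via io.StringIO) in place of open("forecast.csv") so that it runs on its own.

```python
import csv
import io

# A made-up sample in the same layout as forecast.csv
sample_file = io.StringIO(
    "UTC,Time,Temperature\n"
    "1764084432,11/25 15:27,45.9\n"
    "1764095232,11/25 18:27,44.1\n"
)

rows = list(csv.DictReader(sample_file))  # one dictionary per data row
temps = [float(row["Temperature"]) for row in rows]  # values are read as strings
print(rows[0]["Time"])  # 11/25 15:27
print(max(temps))       # 45.9
```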