Map of active volcanoes#

Here is an example of plotting a map of active volcanoes using data online at Oregon State University. This example was written by Meghan Miller, of ANU.

[Figure: map of active volcanoes (volcano_map.png)]

At the end of this script you will produce a map just like the one shown here.

Resources you will use.#

This notebook makes use of a couple of packages that might come in handy another time. The maps are made with cartopy, a mapping library originally developed at the Met Office in the UK (and which happens to be really good at plotting satellite data). The pandas package is a data-analysis library that is very good at manipulating tables of different types of data: selecting, sorting, refining and so on.

Notes on the data source#

The data that we are going to use come from this web page. It is a human-readable table, and it is only the first of several pages.

We will show you how to read the first page, but you can also try this:

  • Can you figure out how to read the next one or two pages, or all the pages? (hint: click on the link and look at the URL)

  • Can you see how to merge all the tables into one? (hint: pandas has a concat function to combine a list of dataframes)
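The `pd.concat` hint can be tried out on small hand-made tables before touching the website. The sketch below uses two made-up dataframes (placeholder names, not real volcano records) standing in for two pages of the table. Passing `ignore_index=True` renumbers the rows of the combined table.

```python
import pandas as pd

# Two small stand-in tables with the same columns, as if read from two pages.
# The rows are placeholders for illustration only, not real volcano records.
page_1 = pd.DataFrame({"Volcano Name": ["Alpha", "Beta"],
                       "Type": ["Stratovolcano", "Shield"]})
page_2 = pd.DataFrame({"Volcano Name": ["Gamma"],
                       "Type": ["Caldera"]})

# pd.concat takes a list of dataframes and stacks them row-wise.
# ignore_index=True gives the result a fresh 0..n-1 index.
combined = pd.concat([page_1, page_2], ignore_index=True)

print(len(combined))
print(combined)
```

Without `ignore_index=True` each page would keep its own row numbers, so the combined index would contain repeated labels.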

%matplotlib inline

import json

import cartopy.crs as ccrs

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

chartinfo = 'Author: Meghan Miller | Data: Volcano World - volcano.oregonstate.edu'
import cartopy
cartopy.__version__

This next section reads the data from the Oregon State University database. This URL is actually a script that returns the table of volcanoes in various forms. That is not a problem here because it returns a valid web page, but not every library that reads HTML is configured to work with this kind of URL.

page_source = "https://volcano.oregonstate.edu/volcano_table?sort_by=title&sort_order=ASC"

This function from the pandas package will read all the tables in a web page and turn them into dataframes.

tables = pd.read_html(page_source)
print("There is/are {} table/s on this web page".format(len(tables)))

In this case it is not necessary to search through the various tables to find the one we want, but in general you would need to check. For example, the page header or footer might be built as a table to lay out the information, and we don’t want to use that for our map!
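When a page does return several tables, one way to check is to look for the column names you expect. The helper below is a minimal sketch, not part of the original notebook: `pick_table` is a hypothetical name, and the two dataframes stand in for the list that `pd.read_html` would return.

```python
import pandas as pd

def pick_table(tables, required_columns):
    """Return the first dataframe that contains every required column."""
    for table in tables:
        # Column labels may not be strings (e.g. integers for header-less tables).
        columns = {str(c) for c in table.columns}
        if set(required_columns) <= columns:
            return table
    raise ValueError("no table has columns {}".format(sorted(required_columns)))

# Stand-in for the list returned by pd.read_html:
# a layout table (no useful header) followed by a data table.
layout_table = pd.DataFrame({0: ["Home", "About"]})
data_table = pd.DataFrame({"Type": ["Stratovolcano"],
                           "Latitude (dd)": [0.0],
                           "Longitude (dd)": [0.0]})

wanted = ["Type", "Latitude (dd)", "Longitude (dd)"]
chosen = pick_table([layout_table, data_table], wanted)
print(list(chosen.columns))
```

Raising an error when nothing matches is a deliberate choice here: it is easier to notice than silently plotting the wrong table.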

df_volc = tables[0]
print(type(df_volc))
# pdurl = 'https://volcano.oregonstate.edu/volcano_table?sort_by=title&sort_order=ASC'
# xpath = '//table'
# tree = html.parse(pdurl)
# tables = tree.xpath(xpath)

# table_dfs = []
# for idx in range(4, len(tables)):
#     df = pd.read_html(html.tostring(tables[idx]), header=0)[0]
#     table_dfs.append(df)
df_volc['Type'].value_counts()

Clean up the data to make sure that typos and missing information are not propagated into your database. This doesn’t seem to be needed in this particular case, but in other instances you could use this technique to replace definitions, map to a new terminology and so on.

def cleanup_type(s):
    if not isinstance(s, str):
        return s
    s = s.replace('?', '').replace('  ', ' ')
    s = s.replace('volcanoes', 'volcano')
    s = s.replace('volcanoe', 'Volcano')
    s = s.replace('cones', 'cone')
    s = s.replace('Calderas', 'Caldera')
    return s.strip().title()

df_volc['Type'] = df_volc['Type'].map(cleanup_type)
df_volc['Type'].value_counts()
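Mapping to a new terminology can also be done with a dictionary instead of a chain of string replacements. The sketch below is illustrative only: the `BROAD_CLASS` grouping is a hypothetical choice, and the series is a small made-up sample rather than the real table.

```python
import pandas as pd

# Hypothetical grouping of detailed types into broader classes.
BROAD_CLASS = {
    "Stratovolcano": "Composite",
    "Shield": "Shield",
    "Caldera": "Caldera",
}

# Small made-up sample of type labels.
types = pd.Series(["Stratovolcano", "Shield", "Maar", "Caldera"])

# map() looks each value up in the dictionary; values without an entry
# become NaN, so fillna() assigns them to a catch-all class.
broad = types.map(BROAD_CLASS).fillna("Other")
print(broad.tolist())
```

A dictionary keeps the old and new terms side by side, which makes the mapping easy to review and extend.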

Now drop the rows that have missing values and determine the number of volcanoes left in the database.

df_volc.dropna(inplace=True)
len(df_volc)
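Called with no arguments, `dropna` removes a row if any column is missing. If only some columns matter for the map, the `subset` argument restricts the check to those columns. The sketch below compares the two on a made-up table; the column names follow the ones used in this notebook.

```python
import numpy as np
import pandas as pd

# Made-up rows: one complete, one missing its type, one missing its latitude.
demo = pd.DataFrame({
    "Type": ["Stratovolcano", None, "Shield"],
    "Latitude (dd)": [10.0, 20.0, np.nan],
    "Longitude (dd)": [30.0, 40.0, 50.0],
})

# Drop a row if anything at all is missing.
dropped_any = demo.dropna()

# Drop a row only if a coordinate is missing.
dropped_coords = demo.dropna(subset=["Latitude (dd)", "Longitude (dd)"])

print(len(demo), len(dropped_any), len(dropped_coords))
```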

Now select the volcanoes that are at or above sea level.

df_volc = df_volc[df_volc['Elevation (m)'] >= 0]
len(df_volc)
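The comparison above assumes that the elevation column was read as numbers. Scraped tables sometimes contain text in a numeric column, in which case pandas keeps the whole column as strings and the comparison fails. The sketch below uses made-up values to show one way to guard against that with `pd.to_numeric`.

```python
import pandas as pd

# Made-up elevations as they might arrive from a scraped table: all strings,
# one of which is not a number.
demo = pd.DataFrame({"Elevation (m)": ["1200", "-350", "Unknown", "0"]})

# errors="coerce" turns anything that cannot be parsed into NaN.
demo["Elevation (m)"] = pd.to_numeric(demo["Elevation (m)"], errors="coerce")

# Comparisons with NaN are False, so unparseable rows are filtered out as well.
above_sea_level = demo[demo["Elevation (m)"] >= 0]
print(above_sea_level["Elevation (m)"].tolist())
```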

Make a nice table of the first 10 volcanoes from the information that you grabbed from the Oregon State University volcano website.

print(len(df_volc))
df_volc.head(10)

Determine the number of each type of volcano in this list and output this information to the screen.

df_volc['Type'].value_counts()
# Reassign instead of using inplace=True: df_volc is now a filtered selection
df_volc = df_volc.dropna()
len(df_volc)
df = df_volc[df_volc['Type'] == 'Stratovolcano']

Create a simple scatter plot map of the stratovolcanoes.

fig=plt.figure(figsize=(12,8))
ax = fig.add_subplot(1,1,1, projection=ccrs.Mollweide())
ax.stock_img()
ax.annotate('Stratovolcanoes of the world | ' + chartinfo, xy=(0, -1.04), xycoords='axes fraction')
ax.scatter(df['Longitude (dd)'].array,df['Latitude (dd)'].array, color='red', linewidth=1, marker='^', transform=ccrs.PlateCarree())

plt.show()
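To show more than one volcano type on the same map, one option is to call `scatter` once per type, each with its own colour. The sketch below is an assumption-laden illustration rather than part of the original notebook: `TYPE_COLOURS`, `split_by_type` and `plot_types` are hypothetical names, the colour choices are arbitrary, and the demo rows are made up. The plotting function imports matplotlib and cartopy only when it is called.

```python
import pandas as pd

# Hypothetical colour per volcano type; any matplotlib colour names work.
TYPE_COLOURS = {"Stratovolcano": "red", "Shield": "blue", "Caldera": "green"}

def split_by_type(df, colours=TYPE_COLOURS, type_col="Type"):
    """Return a list of (type, colour, sub-dataframe) for types that occur in df."""
    groups = []
    for vtype, colour in colours.items():
        subset = df[df[type_col] == vtype]
        if not subset.empty:
            groups.append((vtype, colour, subset))
    return groups

def plot_types(df):
    """Draw one scatter layer per volcano type on a Mollweide map."""
    import matplotlib.pyplot as plt
    import cartopy.crs as ccrs

    fig = plt.figure(figsize=(12, 8))
    ax = fig.add_subplot(1, 1, 1, projection=ccrs.Mollweide())
    ax.stock_img()
    for vtype, colour, subset in split_by_type(df):
        # transform=PlateCarree() says the data are plain longitude/latitude.
        ax.scatter(subset["Longitude (dd)"].to_numpy(),
                   subset["Latitude (dd)"].to_numpy(),
                   color=colour, marker="^", label=vtype,
                   transform=ccrs.PlateCarree())
    ax.legend(loc="lower left")
    return fig

# Made-up rows to show the grouping step on its own.
demo = pd.DataFrame({
    "Type": ["Stratovolcano", "Shield", "Stratovolcano"],
    "Latitude (dd)": [10.0, 20.0, -5.0],
    "Longitude (dd)": [30.0, 40.0, 100.0],
})
groups = split_by_type(demo)
print([(vtype, colour, len(subset)) for vtype, colour, subset in groups])
```

With the real data, `plot_types(df_volc)` followed by `plt.show()` would draw the map; types missing from `TYPE_COLOURS` are simply not plotted.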

More volcanoes#

Can you complete the following to get a full map of all the volcanoes?

# Fix this !

page_source = "???"

tables = pd.read_html(page_source)
print("There is/are {} table/s on this web page".format(len(tables)))
# Can you add this to the previous data frame with pd.concat ?

df_volc_1 = tables[0]
df_volc_1