Lab 8: Introduction to scatter plots and the Gutenberg-Richter Relationship#

This lesson will introduce scatter plots and how to make them in several different ways. We will start by importing the ANSS earthquake catalog which you can optionally download here. OR, for your convenience this file is also provided in the “data” folder within the LAB-week8 folder.

The catalog provided is for magnitude >= 5 events occurring between 1997 and 2009 worldwide. For easy import, it is best to download a CSV version.

For additional details see the publication: John M. Aiken, Chastity Aiken, Fabrice Cotton; A Python Library for Teaching Computation to Seismology Students. Seismological Research Letters ; 89 (3): 1165–1171. doi: https://doi.org/10.1785/0220170246

Here you need to import the familiar toolkit pandas that we used in Lab 6 to import data from Oregon State University volcano database. As well as the plotting library we have used before matplotlib.

%matplotlib inline 

The next cell will read the csv data from the data folder previously downloaded for you. Then the following cell will display the first 5 lines (head) of the table from the .csv file.

# read the data, set space delimiter, and get rid of the dash line in 1st line after header
df = pd.read_csv('data/anss.csv', delim_whitespace=True, skiprows=[1])
df.head()

What are these columns?

Some are self explanatory but some are not:

  • MagT - magnitude type, the way the magnitude was calculated

  • Nst - the number of stations used to locate the earthquake

  • Gap - azimuthal gap of stations used to locate the earthquake

  • RMS - root mean square error for the location

  • SRC - network organization that reported the event

  • EventID - ID number assigned to the event

df.describe()

Scatter plots are used to look at how two variables compare. For example, does the magnitude of the earthquake correlate in some way to the number of stations used to locate it?

Exercise 1: Make a plot of Magnitude vs. Number of stations#

We got it started for you here, but you need to fill in the brackets to actually make a scatter plot for the magnitude (x-axis) and number of stations (y-axis)

df.plot()

What is this showing??? We can clean this plot up a lot though. Lets start by looking at the machinery that creates it.

fig, ax = plt.subplots(1, figsize=(8,5))

df.plot(ax=ax, kind='scatter', x='Mag', y='Nst', alpha=0.15, edgecolor='None'
       ,color='violet')

ax.set_ylim(0, 1000)
ax.set_xlim(4.5, 10)
ax.grid(True)
ax.set_title('Does the number of stations change for the magnitude?')

ax.set_ylabel('Number of Stations')
ax.set_xlabel('Magnitude')

Exercise 2: Make Gutenberg-Richter plots#

In seismology, the Gutenberg–Richter law expresses the relationship between the magnitude and total number of earthquakes (that is the distribution) in any given region and time period of at least that magnitude.

\(log_{10}N = a -bM\)

or

\(N = 10^{a-bM}\)

Where:

  • \(N\) is the number of events having a magnitude, \(\geq M\)

  • \(a\) and \(b\) are constants.

To create the Gutenberg-Richter distribution of the earthquake magnitudes from the ANSS catalog (aka the .cvs file!) we imported above, we are going to use the familar toolkit numpy.

# Try importing here
import numpy as np
df.Mag.values.round(1)
import numpy as np

hist, edges = np.histogram(a=df.Mag.values.round(1), bins=101, range=(0,10))
#cummulative histogram
chist = np.cumsum(hist[::-1])[::-1]

Look at what the variable “edges” is

edges

Look at what ‘hist’ is

hist

We got you started but you need to plot the data …#

fig, ax = plt.subplots()

#try plotting the data ('edges' variable here)
ax.plot(edges[:-1], chist, marker='^', color='blue')

ax.set_ylabel('N')
ax.set_xlabel('Magnitude')
ax.set_title('Gutenberg-Richter Distribution')

We quite frequently want to plot the fit of the a and b values of the Gutenberg-Richter distribution. To do this, we are going to use the equation below which calculates the maximum likelihood of the distribution and returns the parameters:

  • a - the “productivity” of the distribution, but really just the y-intercept

  • b - the “mean magnitude” but really it’s just the slope

  • bstdev - the standard deviation of b

  • length - the number of earthquakes used to calculate the values

def fmd_values(magnitudes, bin_width=0.1):
    """
    params magnitudes : numpy.array
    params bin_width : float
    
    returns a,b,bstd, n-values if above the earthquake count threshold
    else returns np.nans
    """
    length = magnitudes.shape[0]
    minimum = magnitudes.min()
    average = magnitudes.mean()
    b_value = (1 / (average - (minimum - (bin_width/2)))) * np.log10(np.exp(1))
    square_every_value = np.vectorize(lambda x: x**2)
    b_stddev = square_every_value((magnitudes - average).sum()) / (length * (length - 1))
    b_stddev = 2.3 * np.sqrt(b_stddev) * b_value**2
    a_value = np.log10(length) + b_value * minimum
    
    return a_value, b_value, b_stddev, length
fmd_values(df.Mag.values)
a, b, bstd, n = fmd_values(df.Mag.values)

Now that we have our values that represent the y-intercept (a value) and the slope (b value).

Exercise 3: You need to plot a fitted line to the data. Remember the equation for Gutenberg-Richter …#

We’ve started it for you, but you need to work through the calcuations … in particular defining ‘y’ and actually making the plot.

x = np.linspace(0,10, 1000)
y = 
fig, ax = plt.subplots()

ax.plot(x,y)

ax.set_yscale('log')
ax.set_ylabel('N')
ax.set_xlabel('Magnitude')
ax.set_xlim(4.5, 10)
ax.set_ylim(1e0, 1e5)
ax.set_title('Gutenberg-Richter Distribution')

This fit looks pretty good! I hope!