Lists, tuples and dictionaries
So far, all our variables have been single numbers. However, many types of data may consist of more than one piece of information. For example, an experiment might generate measurements of pressure and temperature, each recorded at a number of different times. While it would be possible to define new variables for each number separately, it quickly becomes difficult to keep track of them all. Ideally, we would like to have a structured way of storing related information. Lists, tuples and dictionaries are some of the tools that Python provides for doing this.
Lists
A list is defined by using square brackets, `[` and `]`, and consists of a sequence of entries separated by commas, e.g.,
a = [3, 4, 5]
The individual entries do not need to have the same type. Also, we can have lists inside lists, e.g.,
a = [3.2, 'hello', [4, 8, 17.1]]
although in practice, list elements tend to be more homogeneous than the example above.
We can access individual entries in the list by adding an index to the variable name. In Python, we have to remember a quirk: indices are counted from 0 rather than from 1 (Python uses 0-based indexing). Thus, to access the first entry in the list `a`, we use `a[0]`; the second entry is `a[1]`. We can use these entries in calculations.
Thus:
a = [3.2, 'hello', [4, 8, 17.1]]
print(a[0]) # Prints '3.2'
print(a[1]) # Prints 'hello'
print(a[2]) # Prints '[4, 8, 17.1]'
print(2+a[0]) # Prints '5.2'
Where an entry in a list is itself a list, we can use a second index to access the elements of this:
a = [3.2, 'hello', [4, 8, 17.1]]
print(a[2]) # Prints '[4, 8, 17.1]'
print(a[2][0]) # Prints '4'
print(a[2][1]**2 - a[2][2]) # Prints '46.9'
Negative indices count backwards from the end of the list, so the last element is accessible as `a[-1]`, the one before it as `a[-2]`, and so on.
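As a small illustration of negative indexing (the list contents and variable names here are arbitrary choices, not part of the original text):

```python
# Negative indices count backwards from the end of the list.
a = [5, 6, 7, 8, 9]

last = a[-1]         # last element
second_last = a[-2]  # element before the last
first = a[-len(a)]   # the same element as a[0]

print(last)         # Prints '9'
print(second_last)  # Prints '8'
print(first)        # Prints '5'
```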
We can also ask for a range of entries. To do this, we specify the index of the first entry we want, and an index one larger than that of the last entry we want, separated by a colon:
a = [5, 6, 7, 8, 9]
print(a[1:3]) # Prints '[6, 7]'
Again, this is somewhat quirky. However, it does mean that if we want to extract `n` entries starting at index `i0`, we simply need to ask for `a[i0:i0+n]`. By default, we get every element in the requested range. However, if we only want every \(k\)-th element, we can specify `k` after a second colon, e.g.,
a = [5, 6, 7, 8, 9, 10, 11, 12]
print(a[1:6:2]) # Prints '[6, 8, 10]'
Finally, we can use shorthands such as `a[:4]` (everything from the start up to element `a[3]`, inclusive), `a[3:]` (everything from `a[3]` to the end) and `a[::2]` (every second element, starting with `a[0]`).
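The slicing rules above can be checked with a short sketch. The list contents and the names `first_four`, `from_third`, `every_second` and `window` are chosen purely for illustration:

```python
a = [5, 6, 7, 8, 9, 10, 11, 12]

first_four = a[:4]     # elements a[0] to a[3]
from_third = a[3:]     # elements a[3] to the end
every_second = a[::2]  # a[0], a[2], a[4], ...

# Extract n entries starting at index i0
i0, n = 2, 3
window = a[i0:i0 + n]

print(first_four)    # Prints '[5, 6, 7, 8]'
print(from_third)    # Prints '[8, 9, 10, 11, 12]'
print(every_second)  # Prints '[5, 7, 9, 11]'
print(window)        # Prints '[7, 8, 9]'
```

Because the end index is excluded, `len(a[i0:i0 + n])` is simply `n` whenever the requested range lies inside the list.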
We can change the values stored within an element of a list:
a = [1, 2, 3]
print(a[1]) # Prints '2'
a[1] = 17
print(a) # Prints '[1, 17, 3]'
Lists are therefore said to be ‘mutable’. However, note that we need to explicitly change the list entry:
x = 3
a = [1, 2, x]
print(a) # Prints '[1, 2, 3]'
x = 5
print(a) # Still prints '[1, 2, 3]'
A list variable stores a reference to the list, rather than the list itself. Because lists are mutable, this matters in practice: two different variable names can refer to the same list in memory, so a change made through one name is visible through the other. This is a common (and counter-intuitive) source of errors. For example:
a = [1, 2, 3]
b = a # 'a' and 'b' both refer to the same object
print(a) # Prints '[1, 2, 3]'
b[1] = 17
print(a) # Prints '[1, 17, 3]'
Notice that the list referred to by `a` has changed, even though we have not explicitly altered it! This cannot happen with immutable values such as numbers, which are never modified in place.
If you wish to create a copy of a list, you will need to use the list's `copy` method:
a = [1, 2, 3]
b = a.copy()
b[1] = 17
print(a) # Prints '[1, 2, 3]'
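One caveat worth knowing, sketched here as an aside rather than as part of the original text: `a.copy()` copies only the outer list, so any lists nested inside it are still shared. The standard library's `copy.deepcopy` function copies the nested lists too. The names `shallow` and `deep` are illustrative.

```python
import copy

a = [1, 2, [3, 4]]
shallow = a.copy()       # new outer list, but the inner list is shared
deep = copy.deepcopy(a)  # the inner list is copied as well

shallow[0] = 99     # only changes 'shallow'
shallow[2][0] = 17  # changes the inner list that 'a' also refers to
deep[2][1] = -1     # only changes 'deep'

print(a)        # Prints '[1, 2, [17, 4]]'
print(shallow)  # Prints '[99, 2, [17, 4]]'
print(deep)     # Prints '[1, 2, [3, -1]]'
```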
If we ‘add’ two lists together, we create a new list containing the elements from each:
a = [1, 2, 3]
b = [4, 5, 6]
print(a + b) # Prints '[1, 2, 3, 4, 5, 6]'
One can ‘grow’ a list:
a = [1, 2, 3]
a += [4]
print(a) # Prints '[1, 2, 3, 4]'
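However, if you need to build a large list, repeatedly concatenating lists can be an inefficient way of doing it. In particular, writing `a = a + [4]` copies the whole list every time, so a program that does this for many elements will probably end up being very slow. The `append` method described below adds an element in place and is the usual choice.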
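As a minimal sketch of building a list element by element, the following uses the `append` method (listed among the list functions below) inside a `for` loop, and then a list comprehension as an equivalent alternative. The choice of squares and the names `squares` and `squares_comp` are assumptions made for illustration only.

```python
# Build the squares of 0..9 in two equivalent ways.
squares = []
for i in range(10):
    squares.append(i**2)  # adds one element to the existing list

squares_comp = [i**2 for i in range(10)]  # list comprehension

print(squares)                  # Prints '[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]'
print(squares == squares_comp)  # Prints 'True'
```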
Python provides a number of standard functions for working with lists. A few of the more important ones are detailed below, where we suppose `a` is a list:
- `len(a)` returns the length of `a`, i.e. the number of elements it contains,
- `max(a)` returns the largest element of `a`,
- `min(a)` returns the smallest element of `a`,
- `sum(a)` returns the sum of the elements of `a`,
- `a.sort()` sorts the elements of `a`,
- `a.append(x)` adds a new element at the end of the list, containing the contents of `x`,
- `a.insert(i, x)` adds a new element containing `x` at location `a[i]`, shifting later elements 'backwards' by one,
- `a.remove(x)` finds the first occurrence of `x` in the list and removes it,
- `a.count(x)` counts the number of occurrences of `x` in the list,
- `a.index(x)` returns the index of the first occurrence of `x` in the list,
- `a.reverse()` flips the order of elements in the list, and
- `a.pop(i)` removes and returns the element at index `i` in the list (or, if no argument is provided, the last element in the list).
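The functions listed above can be tried out on a small example. The starting values are arbitrary, and the comments track the state of the list after each call:

```python
a = [4, 1, 3, 1]

print(len(a), max(a), min(a), sum(a))  # Prints '4 4 1 9'

a.append(7)        # a is now [4, 1, 3, 1, 7]
a.insert(1, 9)     # a is now [4, 9, 1, 3, 1, 7]
a.remove(1)        # removes the first 1: [4, 9, 3, 1, 7]
print(a.count(1))  # Prints '1'
print(a.index(3))  # Prints '2'

a.sort()           # a is now [1, 3, 4, 7, 9]
a.reverse()        # a is now [9, 7, 4, 3, 1]
last = a.pop()     # removes and returns 1
second = a.pop(1)  # removes and returns 7
print(a)           # Prints '[9, 4, 3]'
```

Note that `a.sort()` and `a.reverse()` modify the list in place and return `None`, so writing `a = a.sort()` would lose the list.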
Tuples
A tuple is rather similar to a list, but is defined using round brackets, `(` and `)`, e.g.,
a = (3, 'x', (1, 2, 3))
Again, tuples can contain multiple elements, and the elements need not all share the same type. The key difference is that tuples are immutable, so that once a tuple is created it cannot subsequently be changed. (It can be extended – but this is effectively the creation of a new, larger tuple!) As with lists, the individual elements of a tuple can be accessed using `[index]` after the tuple name, e.g.
a = (3, 4, 5)
print(a[1]) # Prints '4'
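To make the immutability concrete, here is a short sketch (the names `b` and `assignment_failed` are illustrative). Assigning to an element raises a `TypeError`, while 'extending' a tuple produces a new object and leaves the original untouched:

```python
a = (3, 4, 5)

assignment_failed = False
try:
    a[1] = 17  # tuples do not support item assignment
except TypeError as err:
    assignment_failed = True
    print("Cannot modify a tuple:", err)

b = a + (6,)   # creates a new tuple; note the trailing comma in (6,)
print(a)       # Prints '(3, 4, 5)'
print(b)       # Prints '(3, 4, 5, 6)'
print(a is b)  # Prints 'False'
```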
Again, Python provides a number of functions for working with a tuple `a`, including:
- `len(a)`
- `min(a)`
- `max(a)`
- `a.index(x)`
- `a.count(x)`

all of which are similar to their counterparts for lists.
A tuple can be converted into a list, and vice versa:
a = [3, 4, 5]
b = (6, 7, 8)
ta = tuple(a)
lb = list(b)
When should you use a tuple, and when should you use a list? To some extent, this is a matter of style and preference. In general, the advice is that all the entries in a list should be 'the same kind of data', whereas the entries in a tuple are likely to represent different entities. For example, if we have an experimental dataset that consists of observations of pressure and temperature at a sequence of time points, it might be appropriate to represent each observation as a tuple of `(pressure, temperature)`, and then the sequence of observations as a list of these tuples `[(p1, T1), (p2, T2), (p3, T3), ...]`.
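A sketch of this layout, using made-up numbers (the values, their units and the name `observations` are hypothetical):

```python
# Hypothetical observations stored as (pressure, temperature) tuples
observations = [(101.3, 293.0), (101.1, 294.5)]
observations.append((100.8, 296.0))  # the list of observations can grow

# Unpack one observation into two separate variables
pressure, temperature = observations[0]

print(len(observations))   # Prints '3'
print(pressure)            # Prints '101.3'
print(temperature)         # Prints '293.0'
print(observations[2][1])  # Prints '296.0'
```

Each observation stays fixed once recorded (it is a tuple), while the sequence of observations remains free to change (it is a list).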
There is a key difference between tuples and lists, though. An immutable data type can be used as a key in a dictionary. So we can use a tuple as a tag for other kinds of data. We haven’t explored what a key for a dictionary means, so let’s do that now.
Dictionaries
Dictionaries provide a mechanism for storing data labelled with a keyword or other information. To create a dictionary, we use curly brackets, and associate each piece of data with a label:
a = {<label1>: <data1>, <label2>: <data2>}
Another syntax to create dictionaries is:
a = dict(<label1>=<data1>, <label2>=<data2>)
For example, if we wished to store an object’s mass, location and length, we might create a dictionary as follows:
a = {'mass': 17.3, 'length': 0.7, 'location': (3, 1, 8)}
or
a = dict(mass=17.3, length=0.7, location=(3, 1, 8))
Note that in this example, the `location` key is storing a tuple. This illustrates that you can store many things in dictionaries, including arrays, lists, tuples or other objects.
Now, we can use the keywords as an index:
print(a['mass'])
print(a['length'])
inertia = 0.5 * a['mass'] * a['length']**2
print(inertia)
Often keywords will be text strings, but they do not need to be:
y = (3, 4, 5)
a = {2: '13.5', y: [6, 7, 8]}
print(a[2])
print(a[y])
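The following sketch contrasts a tuple key with a list key. The dictionary `readings` and its grid-coordinate keys are hypothetical. A tuple works as a label, whereas a list, being mutable, is rejected with a `TypeError`:

```python
# Hypothetical grid coordinates used as dictionary keys
readings = {(0, 0): 1.5, (0, 1): 2.5}
print(readings[(0, 1)])  # Prints '2.5'

list_key_failed = False
try:
    bad = {[0, 0]: 1.5}  # a list is mutable, so it cannot be a key
except TypeError as err:
    list_key_failed = True
    print("Lists cannot be keys:", err)
```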
You can add new entries to a dictionary by simply assigning to the indexed entry you wish to create:
a = {'name': 'Bob'}
print(a['name']) # Prints 'Bob'
# print(a['age']) would not work at this point
a['age'] = 28
print(a) # Now contains both 'name' and 'age'.
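If it is not known in advance whether a label exists, the `in` operator and the dictionary's `get` method avoid the error mentioned in the comment above. This is a minimal sketch. The default value `0` is an arbitrary choice for illustration:

```python
a = {'name': 'Bob'}

print('name' in a)  # Prints 'True'
print('age' in a)   # Prints 'False'

age = a.get('age')                # returns None instead of raising KeyError
age_or_default = a.get('age', 0)  # returns the supplied default, 0
print(age)             # Prints 'None'
print(age_or_default)  # Prints '0'

a['age'] = 28
print(a.get('age', 0))  # Prints '28'
```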
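Again, Python provides a number of functions that work with dictionaries, many of which we have already encountered, including:
- `len(a)`
- `max(a)`
- `min(a)`
- `a.copy()`
- `a.pop(key)`

Note that `max(a)` and `min(a)` compare the labels rather than the stored data, and that, unlike its list counterpart, `a.pop(key)` must be told which label to remove.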
Some methods particular to dictionaries are:
- `a.keys()` returns a list-like object containing all the ‘labels’ within the dictionary `a`,
- `a.values()` returns a list-like object containing all the ‘values’ within the dictionary `a`, and
- `a.items()` returns a list-like object containing `(label, data)` tuples for each of the entries in the dictionary.
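A short sketch of these three methods follows. Wrapping each result in `list(...)` is only done here to make the contents easy to print. The loop at the end shows the common pattern of unpacking each `(label, data)` tuple:

```python
a = {'mass': 17.3, 'length': 0.7}

labels = list(a.keys())
values = list(a.values())
pairs = list(a.items())

print(labels)  # Prints "['mass', 'length']"
print(values)  # Prints '[17.3, 0.7]'
print(pairs)   # Prints "[('mass', 17.3), ('length', 0.7)]"

# Loop over the entries, unpacking each (label, data) tuple
for label, data in a.items():
    print(label, data)
```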
Dictionaries provide a useful mechanism for storing miscellaneous, unstructured information, such as metadata. The use of human-readable labels makes it easy to remember what each entry represents.
Some properties of the first few elements in the periodic table are given below. Melting and boiling points are determined at atmospheric pressure.
Element | Symbol | Atomic Number | Melting point (K) | Boiling point (K) |
---|---|---|---|---|
Hydrogen | H | 1 | 14 | 20 |
Helium | He | 2 | 1 | 4 |
Lithium | Li | 3 | 453 | 1603 |
Beryllium | Be | 4 | 1560 | 2742 |
Boron | B | 5 | 2349 | 4200 |
Carbon | C | 6 | 3915 | 3915 |
Nitrogen | N | 7 | 63 | 77 |
Oxygen | O | 8 | 54 | 90 |
Fluorine | F | 9 | 53 | 85 |
Neon | Ne | 10 | 25 | 27 |
In a previous exercise, you wrote code to calculate mortgage repayments. Different banks are offering different interest rates, with special introductory rates for the first two years of the mortgage:
Bank name | Years 1 & 2 | Year 3 onwards |
---|---|---|
ANZ | 2.3% | 4.1% |
Bank of Australia | 0.1% | 5% |
Commonwealth Bank | 3.5% | 3.8% |
Westpac | 3.7% | 3.7% |
Sets
Finally, we briefly mention the set data type. This is not commonly encountered, but can be very useful. It implements the mathematical concept of a set, an unordered collection of unique objects. A set can be created from a tuple or list:
a = [3, 4, 5]
s = set(a)
print(s) # prints {3, 4, 5}
Duplicates are ignored:
a = [3, 3, 5]
s = set(a)
print(s) # prints {3, 5}
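One practical use of this behaviour, sketched here with made-up data (the name `measurements` is hypothetical), is counting the distinct values in a list and testing whether a value is present:

```python
measurements = [3, 3, 5, 4, 5]
unique = set(measurements)

print(len(unique))     # Prints '3'
print(4 in unique)     # Prints 'True'
print(7 in unique)     # Prints 'False'
print(sorted(unique))  # Prints '[3, 4, 5]'
```

Since a set is unordered, `sorted(unique)` is used to obtain a list with a predictable order.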
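We can test whether one set is a subset of another set (that is, whether every element of the first set is also in the second) using the `<=` operator; `<` additionally requires the two sets to differ (a proper subset). The `>=` and `>` operators perform the corresponding superset tests: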
set([3, 4]) < set([3, 4, 5]) # True
set([3, 4, 5]) < set([3, 4, 5]) # False
set([3, 4, 5]) <= set([3, 4, 5]) # True
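The superset operators work the same way in the other direction. In this sketch the names `small` and `large` are illustrative, and `issubset` is the method form of `<=`:

```python
small = set([3, 4])
large = set([3, 4, 5])

print(large > small)          # Prints 'True': large is a proper superset
print(large >= large)         # Prints 'True'
print(large > large)          # Prints 'False'
print(small.issubset(large))  # Prints 'True', same as small <= large
print(set([3, 6]) <= large)   # Prints 'False': 6 is not in large
```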
We can find the union of two sets - the set of all elements in either set - using the `|` operator, and the intersection - all elements shared between the sets - using the `&` operator:
a = set([3, 4])
b = set([4, 5, 6])
print(a | b) # prints {3, 4, 5, 6}
print(a & b) # prints {4}
We can also use the operator `-` to remove the elements of one set from another set:
a = set([3, 4])
b = set([4, 5, 6])
print(b - a) # prints {5, 6}
print(b^a) # prints {3, 5, 6}
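There is also a `^` operator (the symmetric difference, used in the last line of the example above), which is defined such that `a^b` is equivalent to `(a|b) - (a&b)`, i.e. the elements that belong to exactly one of the two sets.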
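The stated equivalence can be verified directly. The names `sym` and `by_hand` are illustrative:

```python
a = set([3, 4])
b = set([4, 5, 6])

sym = a ^ b                  # symmetric difference
by_hand = (a | b) - (a & b)  # union minus intersection

print(sym == by_hand)  # Prints 'True'
print(sym == b ^ a)    # Prints 'True': the order of the operands does not matter
```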
::: {.callout-tip collapse="true" icon="false"}
## Exercise 8 - Do you understand sets?
Try the examples given above to make sure you have some experience of how the `set` construct works in Python.