Dictionaries
A dictionary is a Python data structure that can store data as key-value pairs. The syntax is:
dict1 = {key1: value1, key2: value2, key3: value3, ...}
Keys can be any immutable (hashable) object, most commonly strings or numbers, and values can be anything: strings, numbers, lists, arrays, other dictionaries, etc. Keys must be unique: if you assign to the same key twice, the second value replaces the first.
rocks_dict = {"basalt": 1, "granite": 2,
              "marl": 3, "gneiss": 4,
              "shale": 5}
print(rocks_dict)
{'basalt': 1, 'granite': 2, 'marl': 3, 'gneiss': 4, 'shale': 5}
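The uniqueness rule can be illustrated with a minimal sketch. The dictionary names and the density values below are illustrative assumptions, not data from this course:

```python
# Hypothetical example: the key "basalt" appears twice in the literal,
# so the second value silently replaces the first
densities = {"basalt": 2.9, "granite": 2.7, "basalt": 3.0}
print(densities)       # {'basalt': 3.0, 'granite': 2.7}
print(len(densities))  # 2

# Values can have mixed types, and keys can be numbers as well as strings
sample = {"name": "granite",
          "minerals": ["quartz", "feldspar", "mica"],
          7: "numeric key"}
print(sample["minerals"][0])  # quartz
print(sample[7])              # numeric key
```

Note that Python does not raise an error for the repeated key, so a duplicate in a long dictionary literal can go unnoticed.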
We can access and modify values based on their key:
# Access value with key 'basalt'
print(rocks_dict["basalt"])
# Create a new key 'sandstone' with value 6
rocks_dict["sandstone"] = 6
print(rocks_dict)
# Add another key/value pair to the dictionary
rocks_dict.update({"schist": 7})
print(rocks_dict)
# Remove the 'sandstone' entry with del
del rocks_dict["sandstone"]
print(rocks_dict)
# Remove the 'schist' entry with pop()
rocks_dict.pop("schist")
print(rocks_dict)
1
{'basalt': 1, 'granite': 2, 'marl': 3, 'gneiss': 4, 'shale': 5, 'sandstone': 6}
{'basalt': 1, 'granite': 2, 'marl': 3, 'gneiss': 4, 'shale': 5, 'sandstone': 6, 'schist': 7}
{'basalt': 1, 'granite': 2, 'marl': 3, 'gneiss': 4, 'shale': 5, 'schist': 7}
{'basalt': 1, 'granite': 2, 'marl': 3, 'gneiss': 4, 'shale': 5}
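Accessing or deleting a key that does not exist raises a `KeyError`. As a small sketch of how this can be avoided, the built-in `get()` and `pop()` methods both accept a default value; the dictionary is re-created here so that the example is self-contained:

```python
rocks_dict = {"basalt": 1, "granite": 2, "marl": 3, "gneiss": 4, "shale": 5}

# get() returns None (or a chosen default) instead of raising KeyError
missing = rocks_dict.get("sandstone")       # None
fallback = rocks_dict.get("sandstone", 0)   # 0

# pop() returns the removed value and also accepts a default for missing keys
removed = rocks_dict.pop("shale")           # 5
not_there = rocks_dict.pop("schist", None)  # None, no KeyError raised

print(missing, fallback, removed, not_there)  # None 0 5 None
print(rocks_dict)  # {'basalt': 1, 'granite': 2, 'marl': 3, 'gneiss': 4}
```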
We can also search and iterate over keys:
# Search if key 'granite' exists
if "granite" in rocks_dict:
    print(rocks_dict["granite"])
# Iterate over keys in rocks_dict
for key in rocks_dict:
    print(key, rocks_dict[key])
2
basalt 1
granite 2
marl 3
gneiss 4
shale 5
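Iterating over keys and looking up each value works, but a dictionary also provides the `items()`, `keys()` and `values()` methods. The sketch below re-creates `rocks_dict` so it can run on its own; `codes_to_rocks` is a name introduced only for this illustration:

```python
rocks_dict = {"basalt": 1, "granite": 2, "marl": 3, "gneiss": 4, "shale": 5}

# items() yields (key, value) pairs, avoiding a lookup inside the loop
for rock, code in rocks_dict.items():
    print(rock, code)

# keys() and values() give views that can be converted to lists
rock_names = list(rocks_dict.keys())
rock_codes = list(rocks_dict.values())
print(rock_names)  # ['basalt', 'granite', 'marl', 'gneiss', 'shale']
print(rock_codes)  # [1, 2, 3, 4, 5]

# A dictionary comprehension can build the inverse mapping (code -> rock)
codes_to_rocks = {code: rock for rock, code in rocks_dict.items()}
print(codes_to_rocks[2])  # granite
```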
Exercises#
Countries per continent: This question is very similar to the one in the File Handling exercises. Change the following code so that the result is a dictionary:
from pandas import read_csv
df = read_csv('Data\\CountryContinent.csv')
continents = df['Continent_Name'].unique()  # list of continent names from the file
res = [[continent, 0] for continent in continents]  # initial list, not counted yet
for index, row in df.iterrows():
    if row["Three_Letter_Country_Code"] != "nan":
        for j in range(len(res)):
            if row["Continent_Name"] == res[j][0]:  # find correct continent
                res[j][1] += 1  # increase country count by 1
print(res)
[['Asia', 58], ['Europe', 57], ['Antarctica', 5], ['Africa', 58], ['Oceania', 27], ['North America', 43], ['South America', 14]]
Answer
from pandas import read_csv
df = read_csv('Data\\CountryContinent.csv')
continents = df['Continent_Name'].unique()
res = {}
for i in continents:
    res[i] = 0
for index, row in df.iterrows():
    if row["Three_Letter_Country_Code"] != "nan":
        res[row["Continent_Name"]] += 1
print(res)
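The same counting pattern can be sketched without the CSV file. The `rows` list below is hypothetical data standing in for the file contents, and `counts` is a name chosen for this illustration. Using `get()` with a default of 0 removes the need to initialise every continent beforehand:

```python
# Hypothetical (country code, continent) pairs standing in for the CSV rows
rows = [("NOR", "Europe"), ("KEN", "Africa"), ("FRA", "Europe"),
        ("PER", "South America"), (None, "Asia")]

counts = {}
for code, continent in rows:
    if code is not None:  # skip rows without a country code
        # get() returns 0 the first time a continent is seen
        counts[continent] = counts.get(continent, 0) + 1

print(counts)  # {'Europe': 2, 'Africa': 1, 'South America': 1}
```

With this approach a continent only appears in the result once at least one country has been counted for it, which differs from the answer above, where every continent starts at 0.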