Python: Tuples, Dictionaries and Sets¶

Tuple¶

A tuple is written with parentheses (). Like a list it can store multiple values, but it is immutable: once the tuple is created, its items cannot be added, removed, replaced or reordered.

rgb = (93, 217, 117)

type(rgb)

tuple

# Loop over a tuple:
for item in rgb:
    print(item)

93
217
117

# We can access items through their index:
rgb[1]

# It's not possible to change the values.
rgb[1] = 216

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_8042/3999292401.py in <module>
      1 # It's not possible to change the values.
----> 2 rgb[1] = 216

TypeError: 'tuple' object does not support item assignment

Change tuple values through list conversion¶

It’s not possible to change a tuple directly. Instead, it can be converted to a list, modified, and then converted back into a tuple:

print(rgb)
# Convert tuple to list.
rgb = list(rgb)
# Change value.
rgb[1] = 216
# Convert list to tuple.
rgb = tuple(rgb)
print(rgb)

(93, 217, 117)
(93, 216, 117)
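As an alternative to the list round trip, a new tuple can be assembled from slices of the old one. This is only a minimal sketch; the names color and new_color are made up for illustration and are not part of the example above.

```python
# Hypothetical example values, independent of rgb above.
color = (93, 217, 117)

# Build a new tuple from slices of the old one plus the new value.
# (216,) is a one-element tuple; the trailing comma is required.
new_color = color[:1] + (216,) + color[2:]

print(new_color)  # (93, 216, 117)
print(color)      # (93, 217, 117), the original tuple is untouched
```

Either way a new tuple object is created; the original is never modified.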

Task: Create a new tuple with values of your choice.
Convert it to a list and modify the list (append, remove).
Convert it back to a tuple and iterate over it.

rose = ('a', 'rose')
rose = list(rose)
rose += ['is a rose']*2
rose = tuple(rose)
for item in rose:
    print(item)

a
rose
is a rose
is a rose

Data types inside a tuple¶

# A tuple can contain any data type and also mixed data (like a list).
mixed_tuple = (93, '217', 117.0, ['a', 'list', 'inside', 'a', 'tuple'], ('a', 'tuple', 'inside', 'a', 'tuple'))
print(mixed_tuple)

(93, '217', 117.0, ['a', 'list', 'inside', 'a', 'tuple'], ('a', 'tuple', 'inside', 'a', 'tuple'))

for item in mixed_tuple:
    print(item)

93
217
117.0
['a', 'list', 'inside', 'a', 'tuple']
('a', 'tuple', 'inside', 'a', 'tuple')

'217' in mixed_tuple

True
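Immutability applies to the tuple itself, not to mutable objects stored inside it. The following sketch (with the made-up name nested) shows that a list inside a tuple can still be changed:

```python
# Hypothetical example: a tuple holding a number and a list.
nested = (93, ['a', 'list'])

# Replacing an item of the tuple is not allowed ...
try:
    nested[0] = 94
except TypeError as err:
    print('TypeError:', err)

# ... but the list stored inside the tuple is still an ordinary, mutable list.
nested[1].append('changed')
print(nested)  # (93, ['a', 'list', 'changed'])
```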

Unpacking a tuple¶

# Unpacking a tuple into single variables:
r, g, b = rgb
print(r)
print(g)
print(b)

93
216
117

This is useful, for example, when a function returns multiple values that should be assigned to separate variables:

def return_multiple_values():
    a = 10
    b = a*2
    
    # Return values as a tuple:
    return (a, b) # The parentheses are optional: return a, b

# Unpack return values in two separate variables:
outer, inner = return_multiple_values()
print(outer)
print(inner)

10
20
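Unpacking also works when the number of items is not known in advance, and it can be used to swap variables. A short sketch with made-up values:

```python
# Hypothetical example values.
values = (93, 217, 117, 255)

# A starred name collects all remaining items in a list.
first, *rest = values
print(first)  # 93
print(rest)   # [217, 117, 255]

# Swapping two variables packs and unpacks a tuple in one line.
left, right = 1, 2
left, right = right, left
print(left, right)  # 2 1
```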

Task: Write a function that generates three random color values between 0 and 255, and store the returned values in three independent variables.

import random

def generate_rgb():
    # Generate three values between 0 and 255 in a list
    # (randrange excludes the stop value 256).
    rgb_list = [random.randrange(0, 256) for _ in range(3)]
    # Convert to tuple.
    rgb_tuple = tuple(rgb_list)

    return rgb_tuple

r, g, b = generate_rgb()

print(r)
print(g)
print(b)

86
123
77

Dictionaries¶

A dictionary consists of key:value pairs. Each key is unique within a dictionary. The syntax is:

{'key':'value', 'key':'value', 'key':'value'}

d = {'x':'value1', 'y':'value2'}

type(d)

dict

# get keys
d.keys()

dict_keys(['x', 'y'])

# get values
d.values()

dict_values(['value1', 'value2'])

# get value from key
d.get('x')
# or
d['x']

'value1'
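The two lookups differ when the key does not exist: get() returns None (or a default you supply), while square brackets raise a KeyError. A minimal sketch, using the made-up name example:

```python
# Hypothetical dictionary with the same shape as d above.
example = {'x': 'value1', 'y': 'value2'}

# get() returns None for a missing key ...
missing = example.get('q')
print(missing)  # None

# ... or a default value passed as second argument.
fallback = example.get('q', 'n/a')
print(fallback)  # n/a

# Square brackets raise a KeyError for a missing key.
try:
    example['q']
except KeyError as err:
    print('KeyError:', err)
```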

for key in d.keys():
    print(d.get(key))

value1
value2

Add and remove items¶

d['z'] = 'zebra'
print(d.keys())
print(d.get('z'))

dict_keys(['x', 'y', 'z'])
zebra

d.pop('z')
print(d.keys())

dict_keys(['x', 'y'])

Loop dictionaries¶

for key in d.keys():
    print(key)

x
y

for value in d.values():
    print(value)

value1
value2

for key, value in d.items():
    print(key, '-', value)

x - value1
y - value2

More about dictionaries¶

# A (shallow) copy of a dictionary can be made in one of the following two ways:

d_copy = d.copy()
# or
d_copy = dict(d)
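Both copy() and dict() create a shallow copy: the new dictionary is independent, but mutable values such as lists are shared with the original. The sketch below (all names are made up) contrasts this with copy.deepcopy from the standard library:

```python
import copy

# Hypothetical dictionary with a mutable value.
original = {'colors': ['red', 'green']}

shallow = original.copy()       # New dict, but the same list object inside.
deep = copy.deepcopy(original)  # New dict and a new list.

original['colors'].append('blue')

print(shallow['colors'])  # ['red', 'green', 'blue']
print(deep['colors'])     # ['red', 'green']
```

For dictionaries that hold only immutable values, such as the strings in d, a shallow copy is sufficient.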

Task: Create a copy of d.
Change a value.
Remove a key.
Add a new key:value pair.
Change the key of that pair.
Loop over the dictionary and print the pairs.

# Create a copy of d.
d_new = dict(d)
# Change a value.
d_new['x'] = 'updated value'
# Remove a key.
d_new.pop('y')
# Add a new key:value pair.
d_new['z'] = 'a new key:value pair'
# Change its key.
d_new['z2'] = d_new.pop('z')
# Print it.
for key, value in d_new.items():
    print(key, ':', value)

x : updated value
z2 : a new key:value pair

Markov text generator with dictionary¶

To illustrate the usage of a dictionary we’ll create a very simple Markov chain.
A Markov chain describes, for each current value, a probability distribution over the possible next values. This can be used for text generation, for example as a next-word suggestion tool on a mobile phone.

The probabilities are estimated from a data set, for example a text.

# Text corpus for the probabilities:
txt = '''The quick brown fox jumps over the lazy dog. The lazy programmer jumps over the fire fox.'''
txt = txt.lower().split()

Loop over txt and create key:value pairs for each word.
Each unique word is stored as a key, and every word that directly follows it in the text is appended to the list stored as its value.

dictionary = {}
debug = True

for i in range(len(txt)-1):
    key = txt[i]
    value = txt[i+1]
    
    # Check if key exists:
    if key in dictionary:
        # Then append the value to its list of values.
        dictionary[key].append(value)
        if debug:
            print(key, '\tin dictionary,', value, 'added as value')
    
    # Else create the key and a list which holds the value.
    else:
        dictionary[key] = [value]
        if debug:
            print(key, '\tadded as key to dictionary,', value, 'added as value')

the 	added as key to dictionary, quick added as value
quick 	added as key to dictionary, brown added as value
brown 	added as key to dictionary, fox added as value
fox 	added as key to dictionary, jumps added as value
jumps 	added as key to dictionary, over added as value
over 	added as key to dictionary, the added as value
the 	in dictionary, lazy added as value
lazy 	added as key to dictionary, dog. added as value
dog. 	added as key to dictionary, the added as value
the 	in dictionary, lazy added as value
lazy 	in dictionary, programmer added as value
programmer 	added as key to dictionary, jumps added as value
jumps 	in dictionary, over added as value
over 	in dictionary, the added as value
the 	in dictionary, fire added as value
fire 	added as key to dictionary, fox. added as value

dictionary

{'the': ['quick', 'lazy', 'lazy', 'fire'],
 'quick': ['brown'],
 'brown': ['fox'],
 'fox': ['jumps'],
 'jumps': ['over', 'over'],
 'over': ['the', 'the'],
 'lazy': ['dog.', 'programmer'],
 'dog.': ['the'],
 'programmer': ['jumps'],
 'fire': ['fox.']}
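The same structure can be built without the explicit if/else by using collections.defaultdict, which creates an empty list the first time a key is accessed. This is an alternative sketch, not the approach used above; the shortened sample sentence and the names words and followers are made up:

```python
from collections import defaultdict

# Hypothetical, shortened sample text.
words = 'the quick brown fox jumps over the lazy dog.'.split()

# Missing keys are created automatically with an empty list as value.
followers = defaultdict(list)

# zip pairs each word with the word that follows it.
for current, following in zip(words, words[1:]):
    followers[current].append(following)

print(followers['the'])  # ['quick', 'lazy']
```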

Based on this very small dictionary we can start the text generation. It starts with a given input. This input is used as a key to get all possible next words from the corpus:

inp_ = 'the'

# Get all values for the key inp_:
possibilities = dictionary[inp_]
possibilities

['quick', 'lazy', 'lazy', 'fire']

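In the text above, the word 'the' is followed by 'quick', 'lazy', 'lazy' and 'fire', which are the possible next words for the input 'the'. Because 'lazy' appears twice in the list, it is twice as likely to be picked as each of the other words.
We can then pick one of them at random: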

import random

random.choice(possibilities)

'lazy'
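Because random.choice picks uniformly from the list, repeated words are picked more often. The implied probabilities can be made explicit with collections.Counter. A small sketch, using the made-up names next_words and probabilities:

```python
from collections import Counter

# The followers of 'the' from the dictionary above, repeated here
# so that the example is self-contained.
next_words = ['quick', 'lazy', 'lazy', 'fire']

# Count how often each word occurs and divide by the total.
counts = Counter(next_words)
total = len(next_words)
probabilities = {word: count / total for word, count in counts.items()}

print(probabilities)  # {'quick': 0.25, 'lazy': 0.5, 'fire': 0.25}
```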

The picked word then becomes the next key, and the lookup is repeated.

gen_txt = ['the'] # Input / generated text as a list.

# Loop until the last word ends with a dot.
while not gen_txt[-1].endswith('.'):
    
    # Pick a value. Key is the last item from the list.
    new_token = random.choice(dictionary[gen_txt[-1]])
    
    # Append the picked value to the list.
    # This value is the key in the next iteration of the loop.
    gen_txt.append(new_token)
    
# Join list to string and print it.
gen_txt = ' '.join(gen_txt)
print(gen_txt)

the lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.
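The loop above relies on every picked word being a key in the dictionary and on a word ending with a dot eventually being picked. A more defensive variant might look like the sketch below; the function generate, its parameters max_words and seed, and the tiny chain are assumptions made for illustration.

```python
import random

def generate(chain, start, max_words=30, seed=None):
    """Generate text from a dict that maps each word to a list of followers."""
    rng = random.Random(seed)  # Own generator, so a seed makes runs repeatable.
    words = [start]

    # Stop at a word ending with a dot or when the length limit is reached.
    while not words[-1].endswith('.') and len(words) < max_words:
        candidates = chain.get(words[-1])
        if not candidates:
            # The last word has no known followers: stop instead of failing.
            break
        words.append(rng.choice(candidates))

    return ' '.join(words)

# Hypothetical tiny chain for demonstration.
chain = {'the': ['quick', 'lazy'], 'quick': ['fox.'], 'lazy': ['fox.']}
print(generate(chain, 'the', seed=1))
```

The length limit guards against chains that loop forever, and get() avoids a KeyError for words that never appear as a key.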

For more about generating text with a Markov chain see hands-on text generators.

Set¶

A set is the fourth built-in data type for storing collections of data, next to list, tuple and dictionary. Unlike a list, a set is unordered and contains only one occurrence of each item. A non-empty set is created with {} and comma-separated values.

data = {4, 29, 'two words', 4}
print(data)
print(type(data))

{'two words', 4, 29}
<class 'set'>

The value 4 appears twice in the set literal but only once in the resulting set. Furthermore, the items in a set are unordered, so the printed order may differ from the order in which they were written.
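Two details of the {} syntax are easy to trip over: empty braces create a dictionary rather than a set, and set items must be hashable, so a list cannot be an item while a tuple can. A short sketch with made-up names:

```python
# Empty braces create a dictionary; an empty set needs set().
empty_dict = {}
empty_set = set()
print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>

# A list is mutable and therefore not hashable: it cannot be a set item.
try:
    bad = {[1, 2], 3}
except TypeError as err:
    print('TypeError:', err)

# A tuple of hashable values works.
ok = {(1, 2), 3}
print(len(ok))  # 2
```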

set()¶

A set can be created by passing a sequence of data to the set() constructor.

txt = '''The quick brown fox jumps over the lazy dog. The lazy programmer jumps over the fire fox.'''

txt_list = txt.split()
print('list:', txt_list)
print(len(txt_list), 'items\n')

# Transform the list into a set:
txt_set = set(txt_list)
print('set:', txt_set)
print(len(txt_set), 'items')

list: ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog.', 'The', 'lazy', 'programmer', 'jumps', 'over', 'the', 'fire', 'fox.']
17 items

set: {'The', 'programmer', 'jumps', 'over', 'lazy', 'fox.', 'brown', 'the', 'dog.', 'quick', 'fox', 'fire'}
12 items

Access items¶

Items in a set don’t have an index. Instead we have to iterate over them.

for item in txt_set:
    print(item)

fox.
The
fox
over
lazy
dog.
programmer
the
quick
fire
jumps
brown
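Besides iterating, the most common way to query a set is a membership test with in. If a predictable order is needed, sorted() returns the items as a sorted list. A minimal sketch with a made-up set:

```python
# Hypothetical example set.
words = {'fox', 'dog', 'the'}

# Membership test.
print('fox' in words)  # True
print('cat' in words)  # False

# sorted() returns a new, ordered list; the set itself stays unordered.
print(sorted(words))  # ['dog', 'fox', 'the']
```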

Add items¶

txt_set.add('squirrel')

Remove items¶

txt_set.remove('fox.')
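remove() raises a KeyError if the item is not in the set. When it is unclear whether the item exists, discard() can be used instead, as it does nothing for a missing item. A short sketch with a made-up set:

```python
# Hypothetical example set.
animals = {'fox', 'dog'}

# discard() silently ignores items that are not in the set.
animals.discard('cat')

# remove() raises a KeyError for items that are not in the set.
try:
    animals.remove('cat')
except KeyError as err:
    print('KeyError:', err)

print(len(animals))  # 2
```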

Join multiple sets¶

Sets can be joined with either update() or union(). update() adds the items of another set to the existing set in place, while union() returns a new combined set and leaves both original sets unchanged.

# Create two sets with overlapping content:
set_a = set(range(5, 10))
set_b = set(range(7, 13))

# Combine them to a new set.
set_c = set_a.union(set_b)
print(set_c)

# Write set_b into set_a.
set_a.update(set_b) # This happens in place.
print(set_a)

{5, 6, 7, 8, 9, 10, 11, 12}
{5, 6, 7, 8, 9, 10, 11, 12}

More methods for set¶

set_a = set(range(5, 10))
set_b = set(range(7, 13))

# Calculate the difference

diff_a = set_a.difference(set_b)
print(diff_a)

diff_b = set_b.difference(set_a)
print(diff_b)

# Calculate the intersection

inter = set_a.intersection(set_b)
print(inter)

{5, 6}
{10, 11, 12}
{8, 9, 7}
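The same operations are also available as operators, which is often shorter when both operands are sets. The sketch below uses the made-up names a and b with the same values as above and sorts the results only to get a predictable print order:

```python
a = set(range(5, 10))   # 5, 6, 7, 8, 9
b = set(range(7, 13))   # 7, 8, 9, 10, 11, 12

print(sorted(a | b))  # union: [5, 6, 7, 8, 9, 10, 11, 12]
print(sorted(a & b))  # intersection: [7, 8, 9]
print(sorted(a - b))  # difference: [5, 6]
print(sorted(a ^ b))  # symmetric difference: [5, 6, 10, 11, 12]
```

The symmetric difference contains the items that are in exactly one of the two sets.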

You can inspect more methods with help(set).

Summary¶

It’s not necessary to know all the methods of every collection type (list, tuple, dictionary, set). It’s more important to know that these types exist and how they differ, so that you can choose the best type for your task. Most of the time this will be a list. But if, for example, you need to remove duplicate items from a list, you can convert it into a set and back into a list (note that the original order is not guaranteed to be preserved). A tuple is useful for returning multiple values from a function, and a dictionary is useful when dealing with key:value pairs of data.
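The remark about removing duplicates can be sketched as follows. The set round trip is the approach described above; dict.fromkeys is shown as an alternative that keeps the first occurrence of each item in its original position. The list words is made up for illustration.

```python
# Hypothetical list with duplicates.
words = ['the', 'lazy', 'the', 'fox', 'lazy']

# Via a set: duplicates are removed, but the order is arbitrary.
unique_unordered = list(set(words))
print(sorted(unique_unordered))  # ['fox', 'lazy', 'the']

# Via dictionary keys: duplicates are removed and the order is kept,
# because dictionaries preserve insertion order (Python 3.7 and later).
unique_ordered = list(dict.fromkeys(words))
print(unique_ordered)  # ['the', 'lazy', 'fox']
```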
