OOP (object oriented programming) / class

Object-oriented programming (OOP) is a design principle. Its basis is the class, a blueprint for a digital object:

  • we can store data (properties) in an object

  • we can perform actions on this data through functions that belong to the object (methods)

  • we can create numerous instances of the same class, but all are independent

For example, all strings belong to the class str. Each string is an independent instance of that class. It stores data (the text) and provides methods that process this data and return the result.

a = 'the quick brown fox'
b = 'jumps over the lazy dog'

print(type(a))
print(a)
<class 'str'>
the quick brown fox
print(a.title())
print(b.swapcase())
The Quick Brown Fox
JUMPS OVER THE LAZY DOG
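
That every string is an instance of str, and that its methods come from the class, can be checked directly. This is a small sketch; calling the method as str.title(a) is only shown to illustrate where the method is defined, not as a recommended style:

```python
a = 'the quick brown fox'
b = 'jumps over the lazy dog'

# Both strings are instances of the same class
print(isinstance(a, str), isinstance(b, str))  # True True

# a.title() is looked up on the class str and applied to the instance a
print(a.title() == str.title(a))  # True

# Calling a method on one instance leaves the other instance untouched
upper_b = b.upper()
print(a)  # the quick brown fox
```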

For some classes, such as list, the methods modify the data in place:
l = a.split()
print(l)
l.sort()
print(l)
print(l.pop())
print(l)
['the', 'quick', 'brown', 'fox']
['brown', 'fox', 'quick', 'the']
the
['brown', 'fox', 'quick']
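
The difference between the string and the list examples above comes from mutability: strings are immutable, so their methods return a new object, while lists are mutable, so methods such as sort() and pop() change the list itself. A minimal sketch, with variable names chosen for illustration only:

```python
text = 'the quick brown fox'
words = text.split()

# String methods return a new string; the original is unchanged
upper = text.upper()
print(text)    # the quick brown fox

# list.sort() changes the list in place and returns None
result = words.sort()
print(result)  # None
print(words)   # ['brown', 'fox', 'quick', 'the']

# sorted() leaves its input untouched and returns a new list instead
original = ['b', 'a']
copy_sorted = sorted(original)
print(original, copy_sorted)  # ['b', 'a'] ['a', 'b']
```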

Creating a custom class

Programming languages based on the principle of OOP have some classes built in already. Furthermore, you can define your own custom blueprint and create objects based on it.
A class is defined with the keyword class followed by a name of your choice (the same naming rules apply as for variables and functions).
Most classes define a method called __init__. It is not strictly required, but when it is present, it’s executed whenever a new object of that class is created.
Typically a class holds data, which can be passed in when an instance is created.

# Define a custom class
# By convention its name is written in CapWords style
class CustomClass:
    # __init__ initializes a new object (optional, but almost always defined)
    def __init__(self, x, y):
        # self refers to the object itself,
        # the remaining parameters carry additional data

        # Assign the input data (x and y) to the properties
        # of the object
        self.x = x
        self.y = y

Create instances of that class

A new object is created by calling the class like a function and assigning the result to a variable.

a = CustomClass('some data', 'another data point')
print(type(a))
<class '__main__.CustomClass'>
# Access the data with the dot notation
a.x
'some data'
a.y
'another data point'
# A second instance of the same class. This does not affect the object a.
b = CustomClass(1, 2)
b.y
2

It’s possible to change the values with the same notation:

b.y = 3
b.y
3
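
To make the independence of instances concrete, here is a small sketch. The class Point and its attribute names are hypothetical stand-ins for CustomClass, so that the block runs on its own:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
q = Point(1, 2)

# Changing an attribute on one instance leaves the other untouched
q.y = 3
print(p.y, q.y)  # 2 3

# Two instances are distinct objects, even if they were created with equal data
print(p is q)    # False

# vars() shows the data currently stored on an instance
print(vars(q))   # {'x': 1, 'y': 3}
```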

Additional internal functions

Objects become powerful through methods, which are functions defined inside the class:

class CustomClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def sum_(self):
        '''Return the sum of the parameters.'''
        return self.x + self.y
# Create an instance:
a = CustomClass(1, 4)

# Call the method sum_
a.sum_()
5
b = CustomClass('hello', ' world!')
b.sum_()
'hello world!'
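
Because sum_ simply applies the + operator, it works for any pair of values that support it and fails for pairs that don't. A small sketch of this behaviour, with the class repeated so that the block is self-contained:

```python
class CustomClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def sum_(self):
        '''Return the sum of the parameters.'''
        return self.x + self.y


# Lists support +, so sum_ concatenates them
c = CustomClass([1, 2], [3])
print(c.sum_())  # [1, 2, 3]

# Incompatible types raise a TypeError when sum_ is called,
# not when the object is created
d = CustomClass(1, 'one')
try:
    d.sum_()
except TypeError as err:
    print('TypeError:', err)
```

If such errors should surface earlier, one option would be to validate the types inside __init__.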

__str__(self)

A well-defined class should produce meaningful output when we print an instance of it.

print(b)
<__main__.CustomClass object at 0x7f56b382bca0>

This is not yet meaningful. We have to define the output through an internal method called __str__, which is called when the object is passed to print() or str().

class CustomClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __str__(self):
        ''' __str__ is called when you convert an object into
        a string (through print(), str()).'''
        return f'{self.x}, {self.y}'
        
    def sum_(self):
        '''Return the sum of the parameters.'''
        return self.x + self.y
b = CustomClass('hello', ' world!')
print(b)
hello,  world!
str(b)
'hello,  world!'
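
__str__ is also used whenever the object is formatted into a string, for instance inside an f-string or by str.format(). A short sketch, with the class repeated so that the block runs on its own:

```python
class CustomClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return f'{self.x}, {self.y}'


b = CustomClass('hello', ' world!')

# f-strings and str.format() fall back to __str__ by default
print(f'Content: {b}')
print('Content: {}'.format(b))
```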

__repr__(self)

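Optionally, we can define a method called __repr__. It should return an unambiguous representation aimed at developers, ideally one that looks like the code needed to recreate the object: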

class CustomClass:
    '''A custom class for ...'''
    
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __str__(self):
        ''' __str__ is called when you convert an object into
        a string (through print(), str()).'''
        return f'{self.x}, {self.y}'
    
    def __repr__(self):
        '''Used for a more elaborate output (for debugging).'''
        return (f'{self.__class__.__name__}('
                f'{self.x!r}, {self.y!r})')
        
    def sum_(self):
        '''Return the sum of the parameters.'''
        return self.x + self.y
b = CustomClass('hello', ' world!')
print(b)
print(repr(b))
hello,  world!
CustomClass('hello', ' world!')
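
Two details about __repr__ are worth knowing: if a class defines __repr__ but no __str__, str() and print() fall back to __repr__, and containers such as lists always display their elements with __repr__. The sketch below uses a hypothetical class OnlyRepr to show both:

```python
class OnlyRepr:
    '''Hypothetical class that defines __repr__ but not __str__.'''

    def __init__(self, x):
        self.x = x

    def __repr__(self):
        return f'{self.__class__.__name__}({self.x!r})'


o = OnlyRepr('hello')

# Without __str__, str() and print() use __repr__
print(str(o))   # OnlyRepr('hello')

# Lists show their elements through __repr__
print([o, o])   # [OnlyRepr('hello'), OnlyRepr('hello')]
```

For this reason it is often suggested to write __repr__ first and add __str__ only when a separate, reader-friendly form is needed.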

__doc__

The docstring of a class or function is accessible through the attribute __doc__:

CustomClass.__doc__
'A custom class for ...'
CustomClass.sum_.__doc__
'Return the sum of the parameters.'
# It's possible to access the docstring through instances of the class as well:
b.sum_.__doc__
'Return the sum of the parameters.'
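
__doc__ returns the docstring as stored, and it is None when no docstring was written. For multi-line docstrings, inspect.getdoc() from the standard library returns a cleaned-up version without the source indentation. A brief sketch using two hypothetical functions:

```python
import inspect


def documented():
    '''Summary line.

    More detail on an indented line.
    '''


def undocumented():
    pass


# The raw attribute may still contain the source indentation,
# depending on the Python version
print(repr(documented.__doc__))

# inspect.getdoc() strips the indentation and surrounding blank lines
print(inspect.getdoc(documented))

# Without a docstring the attribute is None
print(undocumented.__doc__)  # None
```

In an interactive session, help(documented) displays the same information in a formatted way.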

You can list all attributes and methods of a class with dir():

dir(CustomClass)
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'sum_']
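
Most of the names in this list are inherited from Python's base class object. To see only the names a class adds itself, the list can be filtered. This is a sketch with the class repeated for completeness; filtering on the prefix '__' is a simple convention, not an official API:

```python
class CustomClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def sum_(self):
        return self.x + self.y


# Names defined on the class, without the double-underscore names
public = [name for name in dir(CustomClass) if not name.startswith('__')]
print(public)  # ['sum_']

# On an instance, the data attributes set in __init__ show up as well
b = CustomClass(1, 2)
instance_names = [name for name in dir(b) if not name.startswith('__')]
print(instance_names)  # ['sum_', 'x', 'y']
```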

Class: ShuffleNouns

As a more useful example we’ll wrap the code from the shuffle nouns function into a class.

# Shuffle nouns written as a function
def shuffle_nouns_function(txt):
    from nltk import pos_tag, word_tokenize
    import random
    
    # Store words + tags in a list
    # like [('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN')]
    tags = pos_tag(word_tokenize(txt))
    
    # Store all nouns in a new list
    nouns = [word[0] for word in tags if word[1] == 'NN']
    
    # Shuffle nouns
    random.shuffle(nouns)

    # Empty variable for the output
    out = ''

    for word, tag in tags:
        # Add a leading space before alphanumeric tokens (not before punctuation)
        if word.isalnum():
            out += ' '

        # Replace noun     
        if tag == 'NN':
            # pick the last noun from the shuffled list
            # and remove it from the list
            word = nouns.pop()

        # Add the word (or punct) to the output
        out += word

    out = out.strip()
    
    return out
s = 'A lazy programmer jumps over a quick brown fox jumps over a lazy programmer jumps over a fire fox.'

shuffle_nouns_function(s)
'A lazy fox jumps over a quick programmer fox jumps over a lazy fire jumps over a programmer brown.'

To benefit from the principle of OOP, we'll divide the code from the function into two parts:
  • the data (the input string)

  • the method(s) (shuffle and return the input string)

This separation makes it possible

  • to initialize the object with the input string, which is analyzed once and stored as usable data inside the object itself

  • to call the shuffle method multiple times without having to pass in or analyze the data again

# Shuffle nouns written as a class
class ShuffleNouns:
    # Initialize object, analyze txt
    def __init__(self, txt):
        
        from nltk import pos_tag, word_tokenize
        
        # Store words + tags in a list
        # like [('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN')]
        self.tags = pos_tag(word_tokenize(txt))

        # Store all nouns in a new list
        self.nouns = [word[0] for word in self.tags if word[1] == 'NN']

        
    def shuffle(self):
        
        import random
        
        # Create a local copy of the nouns
        nouns = self.nouns.copy()
        
        # Shuffle nouns
        random.shuffle(nouns)

        # Empty variable for the output
        out = ''

        for word, tag in self.tags:
            # Add a leading space before alphanumeric tokens (not before punctuation)
            if word.isalnum():
                out += ' '

            # Replace noun     
            if tag == 'NN':
                # pick the last noun from the shuffled list
                # and remove it from the list
                word = nouns.pop()

            # Add the word (or punct) to the output
            out += word

        out = out.strip()

        return out
s = 'A lazy programmer jumps over a quick brown fox jumps over a lazy programmer jumps over a fire fox.'
# Initialize the object
text = ShuffleNouns(s)
# Call the shuffle method
text.shuffle()
'A lazy fox jumps over a quick programmer brown jumps over a lazy programmer jumps over a fox fire.'
# Call shuffle again. It's not necessary to pass in or analyze the string again.
text.shuffle()
'A lazy fox jumps over a quick programmer brown jumps over a lazy programmer jumps over a fox fire.'
for i in range(2):
    print(text.shuffle())
A lazy fire jumps over a quick brown fox jumps over a lazy fox jumps over a programmer programmer.
A lazy fox jumps over a quick programmer brown jumps over a lazy programmer jumps over a fox fire.
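
The benefit of analyzing the text once in __init__ can be checked without NLTK. The sketch below uses a hypothetical class TaggedShuffler that takes already tagged tokens (a hand-written list standing in for the output of pos_tag) and an optional seed parameter. Both are assumptions made for illustration. It shows that shuffle() works on a copy, so the stored nouns stay intact between calls:

```python
import random


class TaggedShuffler:
    '''Hypothetical variant of ShuffleNouns that takes pre-tagged tokens.'''

    def __init__(self, tags, seed=None):
        self.tags = tags
        self.nouns = [word for word, tag in tags if tag == 'NN']
        # A private random generator makes the results reproducible with a seed
        self._rng = random.Random(seed)

    def shuffle(self):
        # Work on a copy, because pop() would empty the stored list
        nouns = self.nouns.copy()
        self._rng.shuffle(nouns)

        out = ''
        for word, tag in self.tags:
            if word.isalnum():
                out += ' '
            if tag == 'NN':
                word = nouns.pop()
            out += word
        return out.strip()


# Hand-written tags, not real NLTK output
tags = [('A', 'DT'), ('lazy', 'JJ'), ('programmer', 'NN'), ('jumps', 'VBZ'),
        ('over', 'IN'), ('a', 'DT'), ('fire', 'NN'), ('fox', 'NN'), ('.', '.')]

text = TaggedShuffler(tags, seed=42)
first = text.shuffle()
second = text.shuffle()
print(first)
print(second)

# The stored nouns are unchanged after two calls
print(text.nouns)  # ['programmer', 'fire', 'fox']
```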

Class: Markov

In the following use case we’ll define a very simple (and not fully elaborated) class for generating text with a Markov chain. First we’ll add a parameter to pass a text corpus into the class:

class Markov():
    '''Generate a text with a simple one-word based Markov chain.'''
    def __init__(self, txt):
        self.txt = txt  # Holds the text corpus
txt = '''The quick brown fox jumps over the lazy dog. The lazy programmer jumps over the fire fox.'''

m = Markov(txt)
# Check that the text is available:
print(m.txt)
The quick brown fox jumps over the lazy dog. The lazy programmer jumps over the fire fox.

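Next the class needs a dictionary that maps each word to the list of words following it in the corpus. Words that follow more often appear more often in the list, so the list implicitly encodes the transition probabilities. Keeping the dictionary inside the class results in a clean and organized program, because we don’t have to deal with it outside of the functionality of the Markov chain.
The dictionary will be created internally, so we don’t need to add a parameter for it to __init__: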

class Markov():
    '''Generate a text with a simple one-word based Markov chain.'''
    def __init__(self, txt):
        self.txt = txt  # Holds the text corpus.
        self.dictionary = {}  # Holds the dictionary for probabilities.

Then we need a method to create that dictionary based on the text corpus:

class Markov():
    '''Generate a text with a simple one-word based Markov chain.'''
    def __init__(self, txt, txt_lower=False):
        # Holds the text corpus (optionally stored in lowercase).
        self.txt = txt.lower() if txt_lower else txt
        self.dictionary = {}  # Maps each word to the list of words following it.

    def create_dictionary(self):
        # Split txt into a list of lowercase tokens
        # (the keys are always lowercase, regardless of txt_lower):
        txt = self.txt.lower().split()

        # Reset the dictionary so repeated calls don't duplicate entries.
        self.dictionary = {}

        for i in range(len(txt)-1):

            # The current token (i) is the key.
            key = txt[i]

            # The token right after it (i+1) is the corresponding value.
            value = txt[i+1]

            # First check if the key exists in the dictionary already.
            if key in self.dictionary:
                # If yes, append the value to the list.
                self.dictionary[key].append(value)

            # Else insert the new key + the value in form of a [list].
            else:
                self.dictionary[key] = [value]
m = Markov(txt)
m.create_dictionary()
m.dictionary
{'the': ['quick', 'lazy', 'lazy', 'fire'],
 'quick': ['brown'],
 'brown': ['fox'],
 'fox': ['jumps'],
 'jumps': ['over', 'over'],
 'over': ['the', 'the'],
 'lazy': ['dog.', 'programmer'],
 'dog.': ['the'],
 'programmer': ['jumps'],
 'fire': ['fox.']}
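
The key check in create_dictionary can also be expressed with collections.defaultdict, and collections.Counter turns a successor list into explicit probabilities. The following is a sketch of that alternative, not part of the Markov class above; the function names build_successors and probabilities are hypothetical:

```python
from collections import Counter, defaultdict


def build_successors(txt):
    '''Map each lowercase token to the list of tokens following it.'''
    tokens = txt.lower().split()
    successors = defaultdict(list)
    # zip pairs each token with its right neighbour
    for key, value in zip(tokens, tokens[1:]):
        successors[key].append(value)
    return dict(successors)


def probabilities(successor_list):
    '''Convert a list of successors into relative frequencies.'''
    counts = Counter(successor_list)
    total = len(successor_list)
    return {token: n / total for token, n in counts.items()}


txt = ('The quick brown fox jumps over the lazy dog. '
       'The lazy programmer jumps over the fire fox.')

d = build_successors(txt)
print(d['the'])                 # ['quick', 'lazy', 'lazy', 'fire']
print(probabilities(d['the']))  # {'quick': 0.25, 'lazy': 0.5, 'fire': 0.25}
```

Storing the plain lists, as the class does, keeps random.choice() usable without any further conversion.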

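The last part is a method to generate a sentence. Starting from the input, it keeps appending a randomly chosen successor of the last word until a token ends with a period: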

class Markov():
    '''Generate a text with a simple one-word based Markov chain.'''

    def __init__(self, txt, txt_lower=False):
        # Holds the text corpus (optionally stored in lowercase).
        self.txt = txt.lower() if txt_lower else txt
        self.dictionary = {}  # Maps each word to the list of words following it.

    def create_dictionary(self):
        # Split txt into a list of lowercase tokens
        # (the keys are always lowercase, regardless of txt_lower):
        txt = self.txt.lower().split()

        # Reset the dictionary so repeated calls don't duplicate entries.
        self.dictionary = {}

        for i in range(len(txt)-1):

            # The current token (i) is the key.
            key = txt[i]

            # The token right after it (i+1) is the corresponding value.
            value = txt[i+1]

            # First check if the key exists in the dictionary already.
            if key in self.dictionary:
                # If yes, append the value to the list.
                self.dictionary[key].append(value)

            # Else insert the new key + the value in form of a [list].
            else:
                self.dictionary[key] = [value]
                
    def generate_sentence(self, inp_):
        import random
        # Transform input into a list
        gen_txt = inp_.split()
        
        while not gen_txt[-1].endswith('.'):
            new_token = random.choice(self.dictionary[gen_txt[-1].lower()])
            gen_txt.append(new_token)
            
        # Return generated text as string:
        return ' '.join(gen_txt)
m = Markov(txt)
m.create_dictionary()
new_text = m.generate_sentence('The')
print(new_text)
The quick brown fox jumps over the quick brown fox jumps over the lazy dog.
for i in range(3):
    print(m.generate_sentence('The'))
The fire fox.
The quick brown fox jumps over the fire fox.
The quick brown fox jumps over the quick brown fox jumps over the lazy dog.
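
generate_sentence assumes that every generated word has a successor in the dictionary and that a token ending with a period is eventually reached. A more defensive variant might look like the sketch below. The standalone function generate, its parameter max_tokens and the small hand-written dictionary are assumptions for illustration, not part of the class above:

```python
import random


def generate(dictionary, start, max_tokens=50, rng=random):
    '''Extend start word by word until a period, a dead end or max_tokens.'''
    gen_txt = start.split()
    if not gen_txt:
        raise ValueError('start must contain at least one word')

    while not gen_txt[-1].endswith('.') and len(gen_txt) < max_tokens:
        successors = dictionary.get(gen_txt[-1].lower())
        if not successors:
            # Dead end: the last word has no known successor
            break
        gen_txt.append(rng.choice(successors))

    return ' '.join(gen_txt)


# Hand-written dictionary with exactly one successor per key,
# so the results are deterministic
dictionary = {'the': ['fire'], 'fire': ['fox.'], 'loop': ['loop']}

print(generate(dictionary, 'The'))                 # The fire fox.
print(generate(dictionary, 'Unknown'))             # Unknown
print(generate(dictionary, 'loop', max_tokens=5))  # loop loop loop loop loop
```

The length limit guards against corpora in which a cycle of words never leads to a period.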

When are classes useful?

  • If you want to keep data and the methods that work on it in the same object

  • If you need multiple independent instances of the same class

  • If you want to write more sophisticated, modular and reusable code (for example to share it as a library)