Functions

A function is a named block of code that is executed as a unit whenever it is called.

The image below shows an algorithm as the step between an input and an output. Functions commonly have an input through which data is fed in. The data is then processed inside the function, and the result is output to the outside of the function.

algorithm.jpg
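As a minimal sketch of the input, processing, output idea in the diagram (the function name `shout` and the sample text are made up for illustration):

```python
def shout(text):
    # Input: the text arrives through the parameter
    result = text.upper() + '!'  # Processing happens inside the function
    return result                # Output: the result leaves the function

loud = shout('a rose is a rose')
print(loud)  # A ROSE IS A ROSE!
```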

A typical use-case could be the Stein’sche Grammar that we’ve implemented in the previous notebook:


# Code outside of a function:

import spacy

nlp = spacy.load('en_core_web_sm')

txt = 'Quotation marks, commas, nouns and adjectives are not necessary nor interesting, so they should be avoided. Exclamation marks (!) are ugly?'

# Analyze text with the spacy object
doc = nlp(txt)
    
stein = '' # Empty string for the new text.

# Remove all unwanted tokens
for token in doc:
    if token.pos_ not in ['NOUN', 'ADJ'] and str(token) not in [',', '!', '?']:
        # Add the token, preceded by a space
        # unless the token is a PUNCT or stein is still empty
        if token.pos_ != 'PUNCT' and stein != '':
            stein += ' '
        stein += str(token)
        
print(stein)

We can put all the code inside a function. Then we can pass some text as a string to the function and receive the processed text in return.

Example: Stein’sche Grammar

# Code inside a function:

def stein_grammar(data):
    '''Process a text according to the grammar of Gertrude Stein.'''
    import spacy
    
    nlp = spacy.load('en_core_web_sm')

    # Analyze text with the spacy object
    doc = nlp(data)

    stein = '' # Empty string for the new text.

    # Remove all unwanted tokens
    for token in doc:
        if token.pos_ not in ['NOUN', 'ADJ'] and str(token) not in [',', '!', '?']:
            # Add the token, preceded by a space
            # unless the token is a PUNCT or stein is still empty
            if token.pos_ != 'PUNCT' and stein != '':
                stein += ' '
            stein += str(token)

    # Return processed input
    return stein
s = 'Quotation marks, commas, nouns and adjectives are not necessary nor interesting, so they should be avoided. Exclamation marks (!) are ugly?'

# Call the function, insert the data and store the returned data in a new variable
res = stein_grammar(data=s)

print(res)
and are not nor so they should be avoided.() are

Syntax of a function:

Like a variable, a function has a name, and we use that name to call it. When we initialize a variable, we just write the name and assign data to it, like

txt = 'The quick brown fox jumps over the lazy dog'
len_txt = len(txt)

When we define a function, we start with the keyword def (for definition), followed by the name of the function (which we choose ourselves), followed by () and a :. On the following line(s) we write the code of the function. If the result(s) of the function should be passed to the outside, this is done with the keyword return followed by the data.

def my_function():
    # code inside
    result = '...' # local variable result
    return result

Everything inside the function is indented by one level (four spaces by convention). As soon as a line of code is no longer indented, it’s outside of the function. Data is passed into the function through the (). We can specify variables for the input data and then assign inputs to those variables:

def my_function(data):
    # data is available here
    output = data  # placeholder: replace with the processed data
    return output

def stein_grammar(s):
    # process input (s)
    # return processed data
    return s

Feed data into a function

Multiple input variables are separated by commas:

def my_function(val1, val2):
    # process data, e.g.
    data = val1 + val2
    return data

res = my_function(val1=5, val2=2)
print(res)
7

It’s not necessary to pass the data by variable name. It’s enough to pass the values themselves, and they are assigned to the variables of the function in the order in which we pass them:

res = my_function('a rose', ' is a rose')
print(res)
res = my_function('is a rose', ' a rose')
print(res)
a rose is a rose
is a rose a rose
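Both styles can also be combined in one call. A short sketch, repeating the definition of my_function so the block runs on its own (the variable names `res_mixed` and `res_swapped` are made up here):

```python
def my_function(val1, val2):
    return val1 + val2

# Positional value first, named value second
res_mixed = my_function('a rose', val2=' is a rose')
print(res_mixed)    # a rose is a rose

# With names only, the order of the arguments no longer matters
res_swapped = my_function(val2=' is a rose', val1='a rose')
print(res_swapped)  # a rose is a rose
```

Note that Python requires positional values to come before named ones in a call.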

It’s possible to define default values for input variables of a function. If the value is not specified when calling the function, the default is used:

def my_function(val1='a rose', val2=' is a rose'):
    data = val1 + val2
    return data

print(my_function())

print(my_function('x', 'y'))
a rose is a rose
xy
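Defaults can also be replaced one at a time. A small sketch (the replacement strings are arbitrary examples):

```python
def my_function(val1='a rose', val2=' is a rose'):
    return val1 + val2

res_first = my_function('a tulip')          # replaces val1 only
res_second = my_function(val2=' is a fox')  # replaces val2 only

print(res_first)   # a tulip is a rose
print(res_second)  # a rose is a fox
```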

Example: Burroughs Replacements

Move the lines of code from the previous chapter

txt = txt.replace(' is ', ' ').replace(' to be ', ' ')
txt = txt.replace(' the ', ' a ').replace('The ', 'A ')
txt = txt.replace(' or ', ' and ').replace('Or ', 'And ')

into a function called burroughs_replacements.

def burroughs_replacements(txt):
    # txt is a local variable that does not
    # conflict with the global 'txt' variable
    txt = txt.replace(' is ', ' ').replace(' to be ', ' ')
    txt = txt.replace(' the ', ' a ').replace('The ', 'A ')
    txt = txt.replace(' or ', ' and ').replace('Or ', 'And ')
    
    return txt

txt = '''The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.

We will generate this and more exciting texts in the seminar »Ghostwriter« by means of code.  Sample texts: Screenplay, Concept for a work of art, Digital poetry, Invented words, Advertising slogans, Shopping list, Pop song, Theory, Code.

Most text generation processes use existing text as material for new text. In the course of the seminar, everyone will create/download their own body of text to be used as the basis for new text. The goal is to write (program) a machine author and use it to generate texts. In addition to our own production, we will look at works from the field of digital/electronic literature and, in accordance with the title, also discuss authorship.
'''.replace('\n', ' ').replace('  ', ' ')

res = burroughs_replacements(txt)
print(res)
A lazy programmer jumps over a quick brown fox jumps over a lazy programmer jumps over a fire fox. We will generate this and more exciting texts in a seminar »Ghostwriter« by means of code. Sample texts: Screenplay, Concept for a work of art, Digital poetry, Invented words, Advertising slogans, Shopping list, Pop song, Theory, Code. Most text generation processes use existing text as material for new text. In a course of a seminar, everyone will create/download their own body of text used as a basis for new text. A goal to write (program) a machine author and use it to generate texts. In addition to our own production, we will look at works from a field of digital/electronic literature and, in accordance with a title, also discuss authorship. 

Example: translate poem (#file.read())

def translate_poem(text):
    # Original author: Julia Nakotte: #file.read() (2021)
    # Adapted to a working translate library
    # Added return functionality

    from textblob import TextBlob
    from textblob import Word
    import translators as ts
    import random

    output = '' # Store result instead of printing it
    
    for a in range(3):
        title = ts.google(random.choice(text.split()), to_language =  "en")
        titel = ts.google((f"{title}"), to_language = "de")
        output += "\033[1m" + f"{titel}" + "\033[0m" + "\n"

        for t in range(5):
            # Nouns: tags starting with "N" (pos[1] == "N" would also match "IN", i.e. prepositions)
            RandomN = random.choice([w for (w, pos) in TextBlob(text).tags if pos[0] == "N"])
            RandomV = random.choice([w for (w, pos) in TextBlob(text).tags if pos[0] == "V"])
            RandomA = random.choice([w for (w, pos) in TextBlob(text).tags if pos[0] == "J"])
            RandomAv = random.choice([w for (w, pos) in TextBlob(text).tags if pos[0] == "R"])
            RandomP = random.choice([w for (w, pos) in TextBlob(text).tags if pos[0] == "P"])

            wörter = RandomN, RandomV, RandomA, RandomAv, RandomP # , "\n" moved to the end
            poem = ts.google(" ".join(random.sample(wörter, k = len(wörter))), to_language =  "en")
            uta = ts.google((f"{poem}"), to_language = "ja")
            gedicht = ts.google((f"{uta}"), to_language = "de")
            output += f"{gedicht}\n"

        output += 6*"\n"
        
    return output

Example: shuffle nouns

The following function takes an (English) text as input and returns the same text, except that the nouns are shuffled around. Procedure:

  • create a list of (word, POS tag) pairs

  • create a second list with all the nouns of the input text

  • shuffle this list

  • iterate over the list of all words

    • if a word is a noun: pick a new noun from the shuffled list and append it to the output

    • if it is not a noun: append it directly to the output

def shuffle_nouns(txt):
    from nltk import pos_tag, word_tokenize
    import random
    
    # Store words + tags in a list
    # like [('The', 'DT'), ('quick', 'JJ'), ('brown', 'NN')]
    tags = pos_tag(word_tokenize(txt))
    
    # Store all nouns in a new list
    nouns = [word[0] for word in tags if word[1] == 'NN']
    
    # Shuffle nouns
    random.shuffle(nouns)

    # Empty variable for the output
    out = ''

    for word, tag in tags:
        # Add a space if the word is alnum
        if word.isalnum():
            out += ' '

        # Replace noun     
        if tag == 'NN':
            # pick the last noun from the shuffled list
            # and remove it from the list
            word = nouns.pop()

        # Add the word (or punct) to the output
        out += word

    out = out.strip()
    
    return out
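To try the procedure without installing nltk, the tagging step can be replaced by a hand-written list of tags. The following is only a sketch under that assumption: the function name `shuffle_tagged_nouns`, the `seed` parameter and the sample tags are made up for illustration.

```python
import random

def shuffle_tagged_nouns(tags, seed=None):
    '''Shuffle the nouns in a list of (word, tag) pairs.'''
    rng = random.Random(seed)  # own random generator, optionally seeded

    # Store all nouns in a new list and shuffle it
    nouns = [word for word, tag in tags if tag == 'NN']
    rng.shuffle(nouns)

    out = ''
    for word, tag in tags:
        # Add a space before words, but not before punctuation
        if word.isalnum():
            out += ' '
        # Replace a noun with one from the shuffled list
        if tag == 'NN':
            word = nouns.pop()
        out += word

    return out.strip()

# Hand-written tags (an assumption; pos_tag would normally produce these)
tagged = [('The', 'DT'), ('fox', 'NN'), ('jumps', 'VBZ'), ('over', 'IN'),
          ('the', 'DT'), ('dog', 'NN'), ('.', '.')]

shuffled = shuffle_tagged_nouns(tagged, seed=1)
print(shuffled)
```

Depending on the shuffle, the nouns either stay in place or swap positions.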

Scope of functions/variables

When a function is defined in the global scope of the program, it’s available everywhere in the program, and we can call it as many times as we like. (We could also override it, but that is rarely useful.)

# Process the data returned from burroughs_replacements with stein_grammar.
res = stein_grammar(res) 
print(res)
A jumps over a jumps over a jumps over a. We will generate this and more in a » Ghostwriter« by of.: Screenplay for a of Digital Invented Theory Code. use existing as for. In a of a everyone will create / download their of used as a for. A to write() a and use it to generate. In to our we will look at from a of / and in with a also discuss.
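Overriding a function means defining a function with the same name again. A minimal sketch (the function `greet` is made up for illustration):

```python
def greet():
    return 'hello'

first = greet()

# Defining a function with the same name replaces the earlier one
def greet():
    return 'goodbye'

second = greet()

print(first, second)  # hello goodbye
```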

Functions inside functions

# Multiple text processing functions inside one main function:

def text_processing(data):
    
    res = burroughs_replacements(data)
    
    res = translate_poem(res)
    
    res = stein_grammar(res)
        
    return res
txt = '''The lazy programmer jumps over the quick brown fox jumps over the lazy programmer jumps over the fire fox.

We will generate this and more exciting texts in the seminar »Ghostwriter« by means of code.  Sample texts: Screenplay, Concept for a work of art, Digital poetry, Invented words, Advertising slogans, Shopping list, Pop song, Theory, Code.

Most text generation processes use existing text as material for new text. In the course of the seminar, everyone will create/download their own body of text to be used as the basis for new text. The goal is to write (program) a machine author and use it to generate texts. In addition to our own production, we will look at works from the field of digital/electronic literature and, in accordance with the title, also discuss authorship.
'''.replace('\n', ' ').replace('  ', ' ')

txt = text_processing(txt)

print(txt)
Pop0 m 
 Digitale Verwendung die wir besitzen 
 Wir auch Programmartik. 
 Außerdem -Nachricht 
 Außerdem es so schnell wie wir 
 Darüber Verwendung Text 






 Feld0 m 
 Mehr Slogans springen 
 Der Autor 
 Auch bestehende 
 Bitte schreiben Sie mehr 
 Neue Wörter die wir auch beobachten 






 m 
 Digital / Look 
 Theorie springt 
 Mein Sprung Textnachrichten 
 Sie werden Materialien diskutieren 
 Außerdem werden wir das erzeugen was wir besitzen 

(The function stein_grammar no longer works properly here, because the text to process is now in German and the language model for spaCy is made for English. If stein_grammar came first, however, the function translate_poem would have to work with reduced material, which may result in interesting text as well.)
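One way to experiment with the order of the processing steps is to pass the functions themselves as a list. This is only a sketch: `apply_steps` and the two stand-in steps are made up here and are much simpler than the functions above.

```python
def apply_steps(data, steps):
    '''Run data through each function in steps, in the given order.'''
    for step in steps:
        data = step(data)
    return data

# Two stand-in steps (hypothetical, for illustration only)
def replace_the(text):
    return text.replace('the', 'a')

def upper(text):
    return text.upper()

a = apply_steps('the fox', [replace_the, upper])
b = apply_steps('the fox', [upper, replace_the])

print(a)  # A FOX
print(b)  # THE FOX
```

The second call leaves the article unchanged, because after upper() there is no lowercase 'the' left to replace. The order of the steps matters.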

Variables inside/outside functions

x = 'global variable' # Global variable

def print_x():
    print(f'this x is a {x}')
    
print_x()
this x is a global variable
x = 'global variable' # Global variable

def print_x():
    x = 'local variable'
    print(f'this x is a {x}')
    
print_x()
print(f'value of the global variable: {x}')
this x is a local variable
value of the global variable: global variable
x = 'global variable' # Global variable

def print_x():
    global x # The global keyword references the global variable instead of the local one
    x = 'global variable accessed from within a local scope'
    print(f'this x is a {x}')
    
print_x()
print(f'value of the global variable: {x}')
this x is a global variable accessed from within a local scope
value of the global variable: global variable accessed from within a local scope

The same applies to functions: it is possible to define local functions inside a function and also to call global functions from within a function.
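A short sketch of a local function next to a global one (all names here are made up for illustration):

```python
def shout(text):
    # Global function: available everywhere in the program
    return text.upper()

def process(text):
    def add_marks(s):
        # Local function: only available inside process()
        return '»' + s + '«'

    # A global and a local function used together
    return add_marks(shout(text))

result = process('ghostwriter')
print(result)  # »GHOSTWRITER«

# Outside of process(), the local function does not exist
try:
    add_marks('x')
except NameError:
    print('add_marks is not defined outside of process()')
```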

Return

A function such as print_x() above demonstrates that it’s not required to return values from a function. On the other hand, it’s possible to define multiple return statements inside a function. As soon as one of these is executed, the rest of the code inside the function is skipped.

def multiple_returns(data):
    if isinstance(data, str):
        return data.swapcase()
    elif isinstance(data, int):
        return 'A' + ' rose is a' * abs(data)
    else:
        return str(data) + ' is less than ' + int(data+1)*'🍍'
    
data = [45.626725, -19, 'Ghostwriter']

for d in data:
    print(multiple_returns(d))
    print()
45.626725 is less than 🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍🍍

A rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a rose is a

gHOSTWRITER
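Both points can be checked with two small functions, which are made up for illustration. A function without a return statement hands back the value None, and a return inside a loop leaves the function immediately:

```python
def no_return(text):
    print(text)  # prints, but hands nothing back

value = no_return('a rose')
print(value)  # None

def first_vowel(text):
    for ch in text:
        if ch in 'aeiou':
            return ch  # leaves the function, the loop stops here
    return None        # only reached if no vowel was found

print(first_vowel('ghost'))  # o
```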

When are functions useful?

When code grows larger and more complicated, it’s good practice to split it into individual blocks of code that work independently.
Next we’ll use the text from example.md as an example. We applied several processing steps to the text until it was ready to be used for the final text generation. It’s a good idea to put all these steps into one function:

  • it’s cleaner and better structured

  • it’s easy to use the steps multiple times in the same program

with open('example.md', 'r') as f:
    txt = f.read()
    
txt = txt.splitlines()

txt = [line for line in txt if not line.startswith('#')]

excerpt = txt[3]
print(excerpt)
"our brains [...] contain tiny events (neuron firings) and larger events (patterns of neuron frings), and the latter presumably somehow have <i>representational</i> qualities, allowing us to register and also to remember things that happen outside of our crania. Such internalization of the outer world in symbolic patterns in a brain is a pretty far-fetched idea, when you think about it, and yet we know it somehow came to exist, thanks to the pressures of evolution." (46)
def clean_paragraph(paragraph):
    '''This function returns a cleaned paragraph.'''
    txt = paragraph

    # Remove [...]
    txt = txt.replace('[...]', '')
    
    # Remove <i>, </i>, <b>, </b>
    # This could be done better with regex (later)
    txt = txt.replace('<i>','').replace('</i>','').replace('<b>','').replace('</b>','')
    
    # Remove space before a dot
    txt = txt.replace(' .', '.')
    
    # Remove page number in parentheses at the end
    index = txt.rfind('(')
    if index != -1:  # rfind returns -1 if there is no '('
        txt = txt[:index]
    
    # Remove double quotation marks
    txt = txt.replace('"', '')
    
    # Remove multiple spaces
    txt = txt.split()
    txt = ' '.join(txt)
    
    # Capitalize the sentences
    # Split string into list.
    txt = txt.split('.')
    # Remove leading or trailing spaces and capitalize each sentence
    # (note: capitalize() also lowercases the rest of the sentence)
    txt = [s.strip().capitalize() for s in txt]
    txt = '. '.join(txt)
    
    # Return the cleaned text
    return txt
cleaned_txt = clean_paragraph(excerpt)
print(cleaned_txt)
Our brains contain tiny events (neuron firings) and larger events (patterns of neuron frings), and the latter presumably somehow have representational qualities, allowing us to register and also to remember things that happen outside of our crania. Such internalization of the outer world in symbolic patterns in a brain is a pretty far-fetched idea, when you think about it, and yet we know it somehow came to exist, thanks to the pressures of evolution. 
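The comment in clean_paragraph mentions that the tags could be removed with regex. A possible sketch with Python's re module follows. The helper names `strip_tags` and `strip_page_number` are made up here, and the page-number pattern assumes the number is written as digits in parentheses at the very end.

```python
import re

def strip_tags(text):
    # </?[ib]> matches <i>, </i>, <b> and </b>
    return re.sub(r'</?[ib]>', '', text)

def strip_page_number(text):
    # Matches a number in parentheses at the end, e.g. ' (46)'
    return re.sub(r'\s*\(\d+\)\s*$', '', text)

sample = '"patterns have <i>representational</i> <b>qualities</b>." (46)'
sample = strip_page_number(strip_tags(sample))
print(sample)  # "patterns have representational qualities."
```

Unlike rfind('('), the page-number pattern leaves other parentheses in the paragraph untouched.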

Call a function multiple times

cleaned_txt = '' # empty string 

# Iterate through all paragraphs
for p in txt:
    # Clean each paragraph and add it to cleaned_txt
    cleaned_txt += clean_paragraph(p)
    
print(cleaned_txt)
Dealing with brains as multi-level systems is essential if we are to make even the slightest progress in analyzing elusive mental phenomena such as perception, concepts, thinking, consciousness, »i«, free will, and so forth. Our brains contain tiny events (neuron firings) and larger events (patterns of neuron frings), and the latter presumably somehow have representational qualities, allowing us to register and also to remember things that happen outside of our crania. Such internalization of the outer world in symbolic patterns in a brain is a pretty far-fetched idea, when you think about it, and yet we know it somehow came to exist, thanks to the pressures of evolution. I begin with the simple fact that living beings, having been shaped by evolution, have survival as their most fundamental, automatic, and built-in goal. To enhance the chances of its survival, any living being must be able to react flexibly to events that take place in its environment. This means it must develop the ability to sense and to categorize, however rudimentarily, the goings-on in its immediate environment (most earthbound beings can pretty safely ignore comets crashing on jupiter). Once the ability to sense external goings-on has developed, however, there ensues a curious side effect that will have vital and radical consequences. This is the fact that the living being's ability to sense certain aspects of its environment flips around and endows the being with the ability to sense certain aspects of itself. Indeed, thinking about how one might tackle such an engineering challenge is a helpful way of simultaneously envisioning the process of perception in the brain of a living creature and its counterpart in the cognitive system of an artificial mind (or an alien creature, for that matter). 
A creature that thinks knows next to nothing of the subtrate allowing its thinking to happen, but nonetheless it knows all about its symbolic interpretation of the world, and knows very intimately something it calls »i«. A human brain is a representational system that knows no bounds in terms of extensibility or flexibility of its categories. The closing of the strange loop of human selfhood is deeply dependent upon the level-changing leap that is perception, which means categorization, and therefore, the richer and more powerful an organism's categorization equipment is, the more realized and rich will be its self. Through language, other people's bodies can become flexible extensions of our own bodies. A novel is not a specific sequence of words, because if it were, it could only be written in one language, in one culture. No, a novel is a pattern -- a particular collection of characters, events, moods, tones, jokes, allusions, and much more. And so a novel is an abstraction. The cells inside a brain are not the bearers of its consciousness; the bearers of consciousness are patterns. The pattern of organization is what matters, not the substance. . . . Symbol-level brain activity that mirrors external events is consciousnessMy brain is constantly seeking to label, to categorize, to find precedents and analogues – in other words, to simplify while not letting essence slip away. Category assignments go right to the core of thinking, they are determinant of our attitude toward each thing in the world. We human beings are unpredictable self-writing poems -- vague, metaphorical, ambiguous, and sometimes exceedingly beautiful. 
with open('example_cleaned.txt', 'w') as f:
    f.write(cleaned_txt)
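In the output above, a paragraph without a final period runs straight into the next one ("consciousnessMy brain"), because += adds no separator. A sketch of an alternative that joins the cleaned paragraphs with a space. The function `clean` is a simplified stand-in for clean_paragraph, and the sample paragraphs are made up.

```python
def clean(paragraph):
    # Stand-in for clean_paragraph(): only collapses whitespace
    return ' '.join(paragraph.split())

paragraphs = ['  A novel is   a pattern ', '', 'My brain seeks labels.']

# Clean every paragraph, skip empty results, join with single spaces
cleaned = ' '.join(c for c in (clean(p) for p in paragraphs) if c)
print(cleaned)  # A novel is a pattern My brain seeks labels.
```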