Introduction VI - Introduction to Python II

Before we get started…

We’re going to be continuing with the Python lesson, this time dealing with some more complex concepts such as operators, comparisons and data structures!

Make sure to have everything installed to follow along or use the interactive mode.

Note on interactive Mode

As this website is build on Jupyter Notebooks you can click also on the small rocket at the top of this website, select Live code (and wait a bit) and this site will become interactive.

Launch MyBinder

Following you can try to run the code cells below, by clicking on the “run” button, that appears beneath them.

Some functionality of this notebooks can’t be demonstrated using the live code implementation you can therefore either download the course content or open this notebook via Binder, i.e. by clicking on the rocket and select Binder. This will open an online Jupyter-lab session where you can find this notebook by follow the folder strcuture that will be opened on the right hand side via lecture -> content and click on intro_jupyter.ipynb.

If you’ve followed the Setup and installed conda and jupyter, you can simply open a notebook yourself by either:

A. Opening the Anaconda application and selecting the Jupyter Notebooks tile

B. Or opening a terminal/shell type jupyter notebook and hit enter. If you’re not automatically directed to a webpage copy the URL (https://....) printed in the terminal and paste it in your browser

Goals 📍

  • learn basic and efficient usage of the python programming language

    • building blocks of & operations in python

    • operators & comparisons

    • strings, lists, tuples & dictionaries

Recap of the last session

Before we dive into new endeavors, let’s briefly recap the things we’ve talked about so far. If some of the aspects within the recap are either new or fuzzy to you, please have a quick look at the respective part of the last session again.

What is Python?

  • Python is a free, open source programming language

  • Specifically, it’s a widely used/very flexible, high-level, general-purpose, dynamic programming language

Why use Python?

  • Easy to learn

  • offers a comprehensive standard library

  • exceptional external libraries/packages

  • (Relatively) good performance

  • Many languages—particularly R—now interface seamlessly with Python

  • amazing community

  • looks great on every CV

Modules

Most of the functionality in Python is provided by modules. To use a module in a Python program it first has to be imported. A module can be imported using the import statement.

Assuming you want to import the entire pandas module to do some data exploration, wrangling and statistics, how would you do that?

Note: You can run the following lines of code via the live code implementation by clicking on the rocket at the top of the page and selecting Live code. If you’ve opened this page via Binder as described above or downloaded the course materials follow along with the next couple of excerices. You can also open a new notebook as described above and copy the following code cells into your notebook.

# Pleae write your solution in this cell

As this might be a bit hard to navigate, specifically for finding/referencing functions. Thus, it might be a good idea to provide a respective access name. For example, could you show how you would provide the pandas module the access name pd?

# Please write your solution in this cell

During your analyzes you recognize that some of the analyses you want to run require functions from the statistics module pingouin. Is there a way to only import the functions you want from this module, e.g. the wilcoxon test from within pingouin.nonparametric?

# Please write your solution in this cell

Variables and data types

  • in programming variables are things that store values

  • in Python, we declare a variable by assigning it a value with the = sign

    • name = value

    • code variables are not like math variables

      • in mathematics = refers to equality (statement of truth), e.g. y = 10x + 2

      • in coding = refers to assignments, e.g. x = x + 1

    • Variables are pointers, not data stores!

  • Python supports a variety of data types and structures:

    • booleans

    • numbers (ints, floats, etc.)

    • strings

    • lists

    • dictionaries

    • many others!

  • We don’t specify a variable’s type at assignment

Assignment

(Not your homework assignment but the operator in python.)

The assignment operator in Python is =. Python is a dynamically typed language, so we do not need to specify the type of a variable when we create one.

Assigning a value to a new variable creates the variable:

Within your analyzes you need to create a variable called n_students and assign it the value 21, how would that work?

# Please write your solution in this cell

Quickly after you realize that the value should actually be 20. What options do you have to change the value of this variable?

# Please write your solution in this cell

During the analyzes you noticed that the data type of n_students changed. How can you find out the data type?

# Please write your solution in this cell

Is there a way to change the data type of n_students to something else, e.g. float?

# Please write your solution in this cell

Along the way you want to create another variable, this time called acquisition_time and the value December. How would we do that and what data type would that be?

# Please write your solution here

As a final step you want to create two variables that indicate that the outcome of a statistical test is either significant or not. How would you do that for the following example: for outcome_anova it’s true that the result was significant and for outcome_ancova it’s false that the result was significant?

# Please write your solution in this cell 

Alright, thanks for taking the time to go through this recap. Again: if you could solve/answer all questions, you should have the information/knowledge needed for this session.

Here’s agagin what we’ll focus on in this block:

Operators and comparisons

One of the most basic utilizations of python might be simple arithmetic operations and comparisons. operators and comparisons in python work as one would expect:

  • Arithmetic operators available in python: +, -, *, /, ** power, % modulo

  • comparisons available in python: <, >, >= (greater or equal), <= (less or equal), == (equal), != (not equal) and is (identical)

Obviously, these operators and comparisons can be used within tremendously complex analyzes and actually build their basis.

Lets check them out further via a few quick examples, starting with operators:

[1 + 2, 
 1 - 2,
 1 * 2,
 1 / 2,
 1 ** 2,
 1 % 2]
[3, -1, 2, 0.5, 1, 1]

In Python 2.7 (aka the legacy version), what kind of division (/) is executed, depends on the type of the numbers involved. If all numbers are integers, the division will be an integer division, otherwise, it will be a float division. In Python 3 this has been changed and fractions aren’t lost when dividing integers (for integer division you can use another operator, //). In Python 3 the following two operations will give the same result (in Python 2 the first one will be treated as an integer division). It’s thus important to remember that the data type of division outcomes will always be float.

print(1 / 2)
print(1 / 2.0)
print(1//2)
0.5
0.5
0

Python also respects arithemic rules, like the sequence of +/- and *//.

1 + 2/4
1.5
1 + 2 + 3/4
3.75

The same holds true for () and operators:

(1 + 2)/4
0.75
(1 + 2 + 3)/4
1.5

Thus, always watch out for how you define arithmetic operations!

Just as a reminder: the power operator in python is ** and not ^:

2 ** 2
4

This arithmetic operations also show some “handy” properties in combination with assignments, specifically you can apply these operations and modify the value of a given variable “in-place”. This means that you don’t have to assign a given variable a new value via an additional line like so:

a = 2
a = a * 2
print(a)
4

but you can shortcut the command a = a * 2 to a *= 2. This also works with other operators: +=, -= and /=.

b = 3
b *= 3
print(b)
9

Interestingly, we meet booleans again. This time in the form of operators. So booleans can not only be referred to as a data type but also operators. Whereas the data type entails the values True and False, the operators are spelled out as the words and, not, or. They therefore allow us to evaluate if

  • something and something else is the case

  • something is not the case

  • something or something else is the case

How about we check this on an example, i.e. the significance of our test results from the recap:

outcome_anova = True
outcome_ancova = False
outcome_anova and outcome_ancova
False
not outcome_ancova
True
outcome_anova or outcome_ancova
True
outcome_anova and not outcome_ancova
True
outcome_anova and or outcome_ancova
  File "<ipython-input-25-14a735ac7bd0>", line 1
    outcome_anova and or outcome_ancova
                       ^
SyntaxError: invalid syntax

While the “classic” operators appear to be rather simple and the “boolean” operators rather abstract, a sufficient understanding of both is very important to efficiently utilize the python programming language. However, don’t worry: going forward we’ll have plenty of opportunities to get familiarized with them.

After spending a look at operators, it’s time to check out comparisons in more detail. Again, most of them might seem familiar and work as you would expect. Here’s the list again:

  • Comparisons in python:

      `>`, `<`, `>=` (greater or equal), `<=` (less or equal), `==` (equal),
      `!=` (not equal) and `is` (identical)
    

The first four are the “classics” and something you might remember from your math classes in high school. Nevertheless, it’s worth to check how they exactly work in python.

If we compare numerical values, we obtain booleans that indicate if the comparisons is True or False. Lets start with the “classics”.

2 > 1, 2 < 1
2 > 2, 2 < 2
2 >= 2, 2 <= 2

So far so good and no major surprises. Now lets have a look at those comparisons that might be less familiar. At first, ==. You might think: “What, a double assignment?” but actually == is the equal comparison and thus compares two variables, numbers, etc., evaluating if they are equal to each other.

1 == 1
outcome_anova == outcome_ancova
'This course' == "cool"
1 == 1 == 2

One interesting thing to mention here is that equal values of different data types, i.e. integers and floats, are still evaluated as equal by ==:

1 == 1.0

Contrarily to evaluating if two or more things are equal via ==, we can utilize != to evaluate if two are more things are not equal. The behavior of these comparison concerning the outcome is however identical: we get booleans.

2 != 3
outcome_anova = True 
outcome_ancova = False
outcome_anova != outcome_ancova 
1 != 1 != 2

There’s actually one very specific comparison that only works for one data type: string comparison. The string comparison is reflected by the word in and evaluates if a string is part of another string. For example, you can evaluate if a word or certain string pattern is part of another string. Two fantastic beings are going to help showcasing this!

Please welcome, the Wombat & the Capybara.

"cool" in "Wombats are cool"
"ras ar" in "Wombats and capybaras are cool"

The string comparison can also be combined with the boolean operator to evaluate if a string or string pattern is not part of another string.

"stupid" not in "Wombats and capybaras"
"ugly" not in "Wombats and capybaras"

Before we finish the operators & comparison part, it’s important to outline one important aspects that you’ve actually already seen here and there but was never mentioned/explained in detail: operators & comparisons work directly on variables, that is their values. For example, if we want to change the number of a variable called n_lectures from 5 to 6, we can simply run:

n_lectures = 5
n_lectures = n_lectures + 1 
n_lectures

or use the shortcut as seen before

n_lectures = 5
n_lectures += 1
n_lectures

Be careful though, as += means toadd to the value stored in the variable, while =+ is understood as assigning a positive value to a variable

n_lectures =+ 1
n_lectures

This works with other types and operators/comparisons too, for example strings and ==:

'Wombats' == 'Capybaras'

Exercise 4.1

You want to compute the mean of the following reaction times: 1.2, 1.0, 1.5, 1.9, 1.3, 1.2, 1.7. Is there a way to achieve that using operators?

# Please write your solution here

Spoiler: there are of course many existing functions for all sorts of equations and statistics so you don’t have to write it yourself every time. For example, we could also compute the mean using numpy’s mean function:

# Please write your solution here

Exercise 4.2

Having computed the mean, you need to compare it to a reference value. The latter can be possibly found in a string that entails data from a previous analyses. If that’s the case, the string should contain the words "mean reaction time". Is there a way to evaluate this?

The string would be: “In contrast to the majority of the prior studies the mean reaction time of the here described analyses was 1.2.”

# Please write your solution here

Exercise 4.3

Having found the reference value, that is 1.2 we can compare it to our mean. Specifically, you want to know if the mean is less or equal than 1.2. The outcome of the comparison should then be the value of a new variable called mean_reference_comp.

# Please write your solution here

Fantastic work folks, really really great! Time for a quick party!


via GIPHY

Having already checked out modules, help & descriptions, variables and data types, operators and comparisons, we will continue with the final section of the first block of our python introduction. More precisely, we will advance to new, more complex data types and structures: strings, lists, tuples and dictionaries.

Strings, List and dictionaries

So far, we’ve explored integers, float, strings and boolean as fundamental types. However, there are a few more that are equally important within the python programing language and allow you to easily achieve complex behavoir and ease up your everyday programming life: strings, lists, tuples and dictionaries.


Strings

Wait, what? Why are we talking about strings again? Well, actually, strings are more than a “just” fundamental type. There are quite a few things you can do with strings that we haven’t talked about yet. However, first things first: strings contain text:

statement = "The wombat and the capybara are equally cute. However, while the wombat lives in Australia, the capybara can be found in South America."
statement
'The wombat and the capybara are equally cute. However, while the wombat lives in Australia, the capybara can be found in South America.'
type(statement)
str

So, what else can we do? For example, we can get the length of the string which reflects the number of characters in the string. The respective len function is one of the python functions that’s always available to you, even without importing it. Notably, len can operate on various data types which we will explore later.

len(statement)
135

The string data types also allows us to replace parts of it, i.e. substrings, with a different string. The respective syntax is string.replace("substring_to_replace", "replacement_string"), that is, .replace searches for "substring_to_replace" and replaces it with "replacement_string". If we for example want to state that wombats and capybaras are awesome instead of cute, we could do the following:

statement.replace("cute", "awesome")
'The wombat and the capybara are equally awesome. However, while the wombat lives in Australia, the capybara can be found in South America.'

Importantly, strings are not replaced in-place but require a new variable assignment.

statement
'The wombat and the capybara are equally cute. However, while the wombat lives in Australia, the capybara can be found in South America.'
statement_2 = statement.replace("cute", "awesome")
statement_2
'The wombat and the capybara are equally awesome. However, while the wombat lives in Australia, the capybara can be found in South America.'

We can also index a string using string[x] to get the character at the specified index:

statement_2
'The wombat and the capybara are equally awesome. However, while the wombat lives in Australia, the capybara can be found in South America.'
statement_2[1]
'h'

Pump the breaks right there: why do we get h when we specify 1 as index? Shouldn’t this get us the first index and thus T?

via GIPHY

HEADS UP EVERYONE: INDEXING IN PYTHON STARTS AT 0


This means that the first index is 0, the second index 1, the third index 2, etc. . This holds true independent of the data type and is one of the major confusions when folks start programming in python, so always watch out!

statement_2
'The wombat and the capybara are equally awesome. However, while the wombat lives in Australia, the capybara can be found in South America.'
statement_2[0]
'T'
statement_2[1]
'h'
statement_2[2]
'e'

If we want to get more than one character of a string we can use the following syntax string[start:stop] which extracts characters between index start and stop. This technique is called slicing.

statement_2[4:10]
'wombat'

If we omit either (or both) of start or stop from [start:stop], the default is the beginning and the end of the string, respectively:

statement_2[:10]
'The wombat'
statement_2[10:]
' and the capybara are equally awesome. However, while the wombat lives in Australia, the capybara can be found in South America.'
statement_2[:]
'The wombat and the capybara are equally awesome. However, while the wombat lives in Australia, the capybara can be found in South America.'

We can also define the step size using the syntax [start:end:step] (the default value for step is 1, as we saw above):

statement_2[::1]
'The wombat and the capybara are equally awesome. However, while the wombat lives in Australia, the capybara can be found in South America.'
statement_2[::2]
'Tewma n h ayaaaeeulyaeoe oee,wietewma ie nAsrla h ayaacnb on nSuhAeia'

String formatting

Besides operating on strings we can also apply different formatting styles. More precisely, this refers to different ways of displaying strings. The main function we’ll explore regarding this will be the print function. Comparable to len, it’s one of the python functions that’s always available to you, even without import.

For example, if we print strings added with +, they are concatenated without space:

print("The" + "results" + "were" + "significant")
Theresultsweresignificant

The print function concatenates strings differently, depending how the inputs are specified. If we just provide all strings without anything else, they will be concatenated without spaces:

print("The" "results" "were" "significant")
Theresultsweresignificant

If we provide strings separated by ,, they will be concatenated with spaces:

print("The", "results", "were", "significant")
The results were significant

Interestingly, the print function converts all inputs to strings, no matter their actual type:

print("The", "results", "were", "significant", 0.049, False)
The results were significant 0.049 False

Another very cool and handy option that we can specify placeholders which will be filled with an input according to a given formatting style. Python has two string formatting styles. An example of the old style is below, the placeholder or specifier %.3f transforms the input number into a string, that corresponds to a floating point number with 3 decimal places and the specifier %d transforms the input number into a string, corresponding to a decimal number.

print("The results were significant at %.3f" %(0.049))
The results were significant at 0.049
print("The results were significant at %d" %(0.049))
The results were significant at 0

As you can see, you have to be very careful with string formatting as important information might otherwise get lost!

We can achieve the same outcome using the new style string formatting which uses {} followed by .format().

print("The results were significant at {:.3f}" .format(0.049))
The results were significant at 0.049
print("The results were significant at {}" .format(0.049))
The results were significant at 0.049

There are other ways to be more explicit when formatting strings and we cann even use multiple inputs for one string

print("We are hobbits of the Shire. {} is my name, and this is {}.".format("Frodo Baggins", "Samwise Gamgee"))
We are hobbits of the Shire. Frodo Baggins is my name, and this is Samwise Gamgee.
print("We are hobbits of the Shire. {0} is my name, and this is {1}.".format("Frodo Baggins", "Samwise Gamgee"))
We are hobbits of the Shire. Frodo Baggins is my name, and this is Samwise Gamgee.
print("We are hobbits of the Shire. {fname} is my name, and this is {sname}.".format(fname = "Frodo Baggins", sname = "Samwise Gamgee"))
We are hobbits of the Shire. Frodo Baggins is my name, and this is Samwise Gamgee.

If you would like to include line-breaks and/or tabs in your strings, you can use \n and \t respectively:

print("Geez, there are some many things \nPython can do with \t strings.")
Geez, there are some many things 
Python can do with 	 strings.

We can of course also combine the different string formatting options:

print("Animal: {}\nHabitat: {}\nRating: {}".format("Wombat", "Australia", 5))
Animal: Wombat
Habitat: Australia
Rating: 5

Single Quote

You can specify strings using single quotes such as 'Quote me on this'. All white space i.e. spaces and tabs, within the quotes, are preserved as-is.

Double Quotes

Strings in double quotes work exactly the same way as strings in single quotes. An example is "What's your name?".

Triple Quotes

You can specify multi-line strings using triple quotes - (""" or '''). You can use single quotes and double quotes freely within the triple quotes. An example is:

'''I'm the first line. Check how line-breaks are shown in the second line.
Do you see the line-break?
"What's going on here?," you might ask.
Well, "that's just how triple quotes work."
'''
'I\'m the first line. Check how line-breaks are shown in the second line.\nDo you see the line-break?\n"What\'s going on here?," you might ask.\nWell, "that\'s just how triple quotes work."\n'
print('''I'm the first line. Check how line-breaks are shown in the second line.
Do you see the line-break?
"What's going on here?," you might ask.
Well, "that's just how triple quotes work."
''')
I'm the first line. Check how line-breaks are shown in the second line.
Do you see the line-break?
"What's going on here?," you might ask.
Well, "that's just how triple quotes work."
'''
Frodo: We are hobbits of the Shire. Frodo Baggins is my name, and this is Samwise Gamgee.
Faramir: Your bodyguard?
Sam: His gardener. 
'''
'\nFrodo: We are hobbits of the Shire. Frodo Baggins is my name, and this is Samwise Gamgee.\nFaramir: Your bodyguard?\nSam: His gardener. \n'
print('''
Frodo: We are hobbits of the Shire. Frodo Baggins is my name, and this is Samwise Gamgee.
Faramir: Your bodyguard?
Sam: His gardener. 
''')
Frodo: We are hobbits of the Shire. Frodo Baggins is my name, and this is Samwise Gamgee.
Faramir: Your bodyguard?
Sam: His gardener. 
Exercise 5.1

Create two variables called info_wombat and info_capybara and provide them the following values respectively:

“The wombat is quadrupedal marsupial and can weigh up to 35 kg.”

“The capybara is the largest rodent on earth. Its relatives include the guinea pig and the chinchilla.”

Once created, please verify that the type is string.

# Please write your solution here
Exercise 5.2

Compute the length and print within the strings “The wombat information has [insert length here] characters.” and “The capybara information has [insert length here] characters.” After that, please compare the length of the strings and print if they are equal.

# Please write your solution here
Exercise 5.3

Get the following indices from the info_wombat and info_capybara respectively: 4-10 and 4-12. Replace the resulting word in info_wombat with capybara and the resulting word in info_capybara with wombat.

List

Next up: lists. In general, lists are very similar to strings. One crucial difference is that list elements (things in the list) can be of any type: integers, floats, strings, etc..

Additionally, types can be freely mixed within a list, that is, each element of a list can be of a different type. Lists are among the data types and structures you’ll work with almost every time you do something in python. They are super handy and comparably to strings, have a lot of “in-built” functionality.


The basic syntax for creating lists in python is [...]:

[1,2,3,4]
[1, 2, 3, 4]
type([1,2,3,4])
list

You can of course also set lists as the value of a variable. For example, we can create a list with our reaction times from before:

reaction_times = [1.2, 1.0, 1.5, 1.9, 1.3, 1.2, 1.7]
print(type(reaction_times))
print(reaction_times)
<class 'list'>
[1.2, 1.0, 1.5, 1.9, 1.3, 1.2, 1.7]

Going back to the comparison with strings, we can use the same index and slicing techniques to manipulate lists as we could use on strings: list[index], list[start:stop].

print(reaction_times)
print(reaction_times[1:3])
print(reaction_times[::2])
[1.2, 1.0, 1.5, 1.9, 1.3, 1.2, 1.7]
[1.0, 1.5]
[1.2, 1.5, 1.3, 1.7]
via GIPHY

HEADS UP EVERYONE: INDEXING IN PYTHON STILL STARTS AT 0


This means that the first index is 0, the second index 1, the third index 2, etc. . This holds true independent of the data type.

Thus, to get the first index of our reaction_times, we have to do the following:

reaction_times[0]
1.2

There’s another important aspect related to index and slicing. Have a look at the following example that should get us the reaction times from index 1 to 4:

print(reaction_times)
print(reaction_times[1:4])
[1.2, 1.0, 1.5, 1.9, 1.3, 1.2, 1.7]
[1.0, 1.5, 1.9]

Isn’t there something missing, specifically the last index we wanted to grasp, i.e. 4?

via GIPHY

HEADS UP EVERYONE: SLICING IN PYTHON EXCLUDES THE “STOP” INDEX


This means that the slicing technique gives you everything up to the stop index but does not include the stop index itself. For example, reaction_times[1:4] will return the list elements from index 1 up to 4 but not the fourth index. This holds true independent of the data type and is one of the major confusions when folks start programming in python, so always watch out!

So, to get to list elements from index 0 - 4, including 4, we have to do the following:

print(reaction_times)
print(reaction_times[0:5])
[1.2, 1.0, 1.5, 1.9, 1.3, 1.2, 1.7]
[1.2, 1.0, 1.5, 1.9, 1.3]

As mentioned before, elements in a list do not all have to be of the same type:

mixed_list = [1, 'a', 4.0, 'What is happening?']

print(mixed_list)
[1, 'a', 4.0, 'What is happening?']

Another nice thing to know: python lists can be inhomogeneous and arbitrarily nested, meaning that we can define and access lists within lists. Is this list-ception??

nested_list = [1, [2, [3, [4, [5]]]]]

print("our nest list looks like this: %s" %nested_list)
print("the length of our nested list is: %s" %len(nested_list))
print("the first index of our nested_list is: %s" %nested_list[0])
print("the second index of our nested_list is: %s" %nested_list[1])
print("the second index of our nested_list is a list we can index again via nested_list[1][1]: %s" %nested_list[1][1])
our nest list looks like this: [1, [2, [3, [4, [5]]]]]
the length of our nested list is: 2
the first index of our nested_list is: 1
the second index of our nested_list is: [2, [3, [4, [5]]]]
the second index of our nested_list is a list we can index again via nested_list[1][1]: [3, [4, [5]]]

Lets have a look at another list. Assuming you obtain data describing the favorite artists and snacks of a sample population, we can put the respective responses in lists for easy handling:

# everyone name a band/artist
artists = ["Totorro", "American Football", "TTNG", "Delta Sleep", "Clever Girl", "Toe", "Covet", "Chon"]
# everone 2nd name your favourite snack
snacks = ["Dark chocolate", "Oreos", "Trail Mix", "Coconut milk ice cream",
          "Vegan cupcakes", "Seaweed Snacks", "Pocky"]
# everyone 3rd name your favourite animals
animals = ["Red panda", "Sloth", "Quokka", "Fennec fox", "Axolotl", "Slow loris", "Kangaroo"]

So lets check what we can do with these lists. At first, here they are again:

print('The favorite artists were: %s' %artists)
print('\n')
print('The favorite snacks were: %s'%snacks)
print('\n')
print('The favorite animals were: %s' %animals)
The favorite artists were: ['Totorro', 'American Football', 'TTNG', 'Delta Sleep', 'Clever Girl', 'Toe', 'Covet', 'Chon']


The favorite snacks were: ['Dark chocolate', 'Oreos', 'Trail Mix', 'Coconut milk ice cream', 'Vegan cupcakes', 'Seaweed Snacks', 'Pocky']


The favorite animals were: ['Red panda', 'Sloth', 'Quokka', 'Fennec fox', 'Axolotl', 'Slow loris', 'Kangaroo']

Initially we might want to count how many responses there were. We can achieve this via our old friend the len function. If we want to also check if we got responses from all 17 participants of our sample population, we can directly use comparisons.

print('Regarding artists there were %s responses' %len(artists))
print('We got responses from all participants: %s' %str(len(artists)==17))
Regarding artists there were 8 responses
We got responses from all participants: False

We can do the same for snacks and animals:

print('Regarding snacks there were %s responses' %len(snacks))
print('We got responses from all participants: %s' %str(len(snacks)==17))
Regarding snacks there were 7 responses
We got responses from all participants: False
print('Regarding animals there were %s responses' %len(animals))
print('We got responses from all participants: %s' %str(len(animals)==17))
Regarding animals there were 7 responses
We got responses from all participants: False

Another thing we might want to check is the number of unique responses, that is if some values in our list appear multiple times and we might also want to get these values.

In python we have various ways to do this, most often you’ll see (and most likely use) set and numpy’s unique functions. While the first is another example of in-built python functions that don’t need to be imported, the second is a function of the numpy module. They however can achieve the same goal, that is getting us the number of unique values.

print('There are %s unique responses regarding artists' %len(set(artists)))
There are 8 unique responses regarding artists
import numpy as np
print('There are %s unique responses regarding artists' %len(np.unique(artists)))
There are 8 unique responses regarding artists

The functions themselves will also give us the list of unique values:

import numpy as np
print('The unique responses regarding artists were:  %s' %np.unique(artists))
The unique responses regarding artists were:  ['American Football' 'Chon' 'Clever Girl' 'Covet' 'Delta Sleep' 'TTNG'
 'Toe' 'Totorro']

Doing the same for snacks and animals again is straightforward:

print('There are %s unique responses regarding snacks' %len(np.unique(snacks)))
print('The unique responses regarding snacks were:  %s' %np.unique(snacks))
There are 7 unique responses regarding snacks
The unique responses regarding snacks were:  ['Coconut milk ice cream' 'Dark chocolate' 'Oreos' 'Pocky'
 'Seaweed Snacks' 'Trail Mix' 'Vegan cupcakes']
print('There are %s unique responses regarding animals' %len(np.unique(animals)))
print('The unique responses regarding animals were:  %s' %np.unique(animals))
There are 7 unique responses regarding animals
The unique responses regarding animals were:  ['Axolotl' 'Fennec fox' 'Kangaroo' 'Quokka' 'Red panda' 'Sloth'
 'Slow loris']

In-built functions

As indicated before, lists have a great set of in-built functions that allow to perform various operations/transformations on them: sort, append, insert and remove.

Please note, as these functions are part of the data typelist”, you don’t prepend (sort(list)) but append them: list.sort(), list.append(), list.insert() and list.remove().

Lets start with sort which, as you might have expected, will sort our list.

artists.sort()
artists
['American Football',
 'Chon',
 'Clever Girl',
 'Covet',
 'Delta Sleep',
 'TTNG',
 'Toe',
 'Totorro']

Please note that our list is modified/changed in-place. While it’s nice to not have to do a new assignment, this can become problematic if the index is of relevance!

.sort() also allows you to specify how the list should be sorted: ascending or descending. This is controlled via the reverse argument of the .sort() function. By default, lists are sorted in an descending order. If you want to sort your list in an ascending order you have to set the reverse argument to True: list.sort(reverse=True).

artists.sort(reverse=True)
artists
['Totorro',
 'Toe',
 'TTNG',
 'Delta Sleep',
 'Covet',
 'Clever Girl',
 'Chon',
 'American Football']

We of course also want to sort our lists of snacks and animals:

snacks.sort()
snacks
['Coconut milk ice cream',
 'Dark chocolate',
 'Oreos',
 'Pocky',
 'Seaweed Snacks',
 'Trail Mix',
 'Vegan cupcakes']
animals.sort()
animals
['Axolotl',
 'Fennec fox',
 'Kangaroo',
 'Quokka',
 'Red panda',
 'Sloth',
 'Slow loris']

Lets assume we got new data from two participants and thus need to update our list, we can simply use .append() to, well, append or add these new entries:

artists.append('Taylor Swift')
artists
['Totorro',
 'Toe',
 'TTNG',
 'Delta Sleep',
 'Covet',
 'Clever Girl',
 'Chon',
 'American Football',
 'Taylor Swift']

We obviously do the same for snacks and animals again:

snacks.append('Wasabi Peanuts')
artists
['Totorro',
 'Toe',
 'TTNG',
 'Delta Sleep',
 'Covet',
 'Clever Girl',
 'Chon',
 'American Football',
 'Taylor Swift']
animals.append('bear')
animals
['Axolotl',
 'Fennec fox',
 'Kangaroo',
 'Quokka',
 'Red panda',
 'Sloth',
 'Slow loris',
 'bear']

Should the index of the new value be important, you have to use .insert as .append will only ever, you guessed it: append.

The .insert functions takes two arguments, the index where a new value should be inserted and the value that should be inserted: list.insert(index, value). The index of the subsequent values will shift +1 accordingly. Assuming, we want to add a new value at the third index of each of our lists, this how we would do that:

artists.insert(2, 'The sound of animals fighting')
print(artists)
['Totorro', 'Toe', 'The sound of animals fighting', 'TTNG', 'Delta Sleep', 'Covet', 'Clever Girl', 'Chon', 'American Football', 'Taylor Swift']
snacks.insert(2, 'PB&J')
print(snacks)
['Coconut milk ice cream', 'Dark chocolate', 'PB&J', 'Oreos', 'Pocky', 'Seaweed Snacks', 'Trail Mix', 'Vegan cupcakes', 'Wasabi Peanuts']
animals.insert(2, 'Ocelot')
print(animals)
['Axolotl', 'Fennec fox', 'Ocelot', 'Kangaroo', 'Quokka', 'Red panda', 'Sloth', 'Slow loris', 'bear']

If you want to change the value of a list element (e.g. you noticed an error and need to change the value), you can do that directly by assigning a new value to the element at the given index:

print('The element at index 5 of the list of snacks is: %s\n' %snacks[5])
snacks[5] = 'Apples'
print('It is now %s\n' %snacks[5])
print(snacks)
The element at index 5 of the list of snacks is: Seaweed Snacks

It is now Apples

['Coconut milk ice cream', 'Dark chocolate', 'PB&J', 'Oreos', 'Pocky', 'Apples', 'Trail Mix', 'Vegan cupcakes', 'Wasabi Peanuts']

Please note that this is final and the original value overwritten. The characteristic of modifying lists by assigning new values to elements in the list is called mutable in technical jargon.

If you want to remove an element of a given list (e.g. you noticed there are unwanted duplicates, etc.), you basically have two options list.remove(element) and del list[index] and which one you have to use depends on the goal.

As you can see .remove(element) expects the element that should be removed from the list, that is the element with the specified value. For example, if we want to remove duplicates, we can do the following:

artists_2 = artists * 2
artists_2.remove("Taylor Swift")
print(artists_2)
['Totorro', 'Toe', 'The sound of animals fighting', 'TTNG', 'Delta Sleep', 'Covet', 'Clever Girl', 'Chon', 'American Football', 'Totorro', 'Toe', 'The sound of animals fighting', 'TTNG', 'Delta Sleep', 'Covet', 'Clever Girl', 'Chon', 'American Football', 'Taylor Swift']

As you can see, only the first duplicate is removed and not both, this is because .remove(element) only removes the first element of the specified value. Thus, if there are more elements you want to remove, the del function could be more handy.

You may have noticed that the del function expects the index of the element that should be removed from the list. Thus, if we, for example, want to remove the duplicate , we can also achieve that via the following:

del artists[6]
print(artists)
['Totorro', 'Toe', 'The sound of animals fighting', 'TTNG', 'Delta Sleep', 'Covet', 'Chon', 'American Football', 'Taylor Swift']

If there are multiple elements you want to remove, you can make use of slicing again.

For example, if our snack list has multiple elements with the value chocolate. Nothing against chocolate, but you might want to have this value only once in our list. To achieve that, we basically combine slicing and del via indicating what indices should be removed:

del snacks[3:5]
print(snacks)
['Coconut milk ice cream', 'Dark chocolate', 'PB&J', 'Apples', 'Trail Mix', 'Vegan cupcakes', 'Wasabi Peanuts']

Lists play a very important role in python and are for example used in loops and other flow control structures (discussed in the next session). There are a number of convenient functions for generating lists of various types, for example, the range function (note that in Python 3 range creates a generator, so you have to use the list function to convert the output to a list). Creating a list that ranges from 10 to 50 advancing in steps of 2 is as easy as:

start = 10
stop = 50
step = 2

list(range(start, stop, step))
[10,
 12,
 14,
 16,
 18,
 20,
 22,
 24,
 26,
 28,
 30,
 32,
 34,
 36,
 38,
 40,
 42,
 44,
 46,
 48]

This can be very handy if you want to create participant lists and/or condition lists for your experiment and/or analyzes.

Exercise 6.1

Create a variable called rare_animals that stores the following values: ‘Addax’, ‘Black-footed Ferret’, ‘Northern Bald Ibis’, ‘Cross River Gorilla’, ‘Saola’, ‘Amur Leopard’, ‘Philippine Crocodile’, ‘Sumatran Rhino’, ‘South China Tiger’, ‘Vaquita’. After that, please count how many elements the list has and provide the info within the following statement: “There are [insert number of elements here] animals in the list.”

# Please write your solution here

Exercise 6.2

Please add ‘Manatee’ to the list and subsequently evaluate how many unique elements the list has.

# Please write your solution here

Exercise 6.3

Learning that the manatee is not endangered anymore, please remove it from the list. Unfortunately, we have to add “Giant Panda”. Could you please do that at index 3.

# Please write your solution here

Tuples

Tuples (from doubles, triples … n‑tuples) are like lists, except that they cannot be modified once created, that is they are immutable.


In python, tuples are created using the syntax (..., ..., ...), or even ..., ...:

point = (10, 20, 'Whoa another thing')
print(type(point))
print(point)
<class 'tuple'>
(10, 20, 'Whoa another thing')

Elements of tuples can also be referenced via their respective index:

print(point[0])
print(point[1])
print(point[2])
10
20
Whoa another thing

However, as mentioned above, if we try to assign a new value to an element in a tuple we get an error:

try:
    point[0] = 20
except(TypeError) as er:
    print("TypeError:", er)
else:
    raise
TypeError: 'tuple' object does not support item assignment

Thus, tuples also don’t have the set of functions to modify elements lists do.

Exercise 7.1

Please create a tuple called deep_thought with the following values: ‘answer’, 42.

# Please write your solution here

Dictionaries

Dictionaries are also like lists, except that each element is a key-value pair. That is, elements or entries of the dictionary, the values, can be accessed via their respective key and not via indexing, slicing, etc. .


The syntax for dictionaries is {key1 : value1, ...}.

Dictionaries are fantastic if you need to organize your data in a highly structured way where a precise mapping of a multiple of types is crucial and lists might be insufficient. For example, we want to have the information we accessed above for our lists in a detailed and holistic manner. Instead of specifying multiple lists and variables, we could also create a dictionary that comprises all of that information.

artists_info = {"n_responses" :        len(artists),
              "n_responses_unique" : len(set(artists)),
              "responses" :          artists}

print(type(artists_info))
print(artists_info)
<class 'dict'>
{'n_responses': 9, 'n_responses_unique': 9, 'responses': ['Totorro', 'Toe', 'The sound of animals fighting', 'TTNG', 'Delta Sleep', 'Covet', 'Chon', 'American Football', 'Taylor Swift']}

See how great this is? We have everything in one place and can access the information we need/want via the respective key.

artists_info['n_responses']
9
artists_info['n_responses_unique']
9
artists_info['responses']
['Totorro',
 'Toe',
 'The sound of animals fighting',
 'TTNG',
 'Delta Sleep',
 'Covet',
 'Chon',
 'American Football',
 'Taylor Swift']

As you can see and mentioned before: like lists, each value of a dictionary can entail various types: integer, float, string and even data types: lists, tuples, etc. . However, in contrast to lists, we can access entries only via their key name and not indices:

artists_info[1]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-112-104df88282a5> in <module>
----> 1 artists_info[1]

KeyError: 1

If you try that, you’ll get a KeyError indicating that the key whose value you want to access doesn’t exist. If you’re uncertain about the keys in your dictionary, you can get a list of all of them via dictionary.keys():

artists_info.keys()
dict_keys(['n_responses', 'n_responses_unique', 'responses'])

Comparably, if you want to get all the values, you can use dictionary.values() to obtain a respective list:

artists_info.values()
dict_values([9, 9, ['Totorro', 'Toe', 'The sound of animals fighting', 'TTNG', 'Delta Sleep', 'Covet', 'Chon', 'American Football', 'Taylor Swift']])

Assuming you want to add new information to your dictionary, i.e. a new key, this can directly be done via dictionary[new_key] = value. For example, you run some stats on our list of artists and determined that this selection is significantly awesome with a p value of 0.000001, we can easily add this to our dictionary:

artists_info['artist_selection_awesome'] = True
artists_info['artist_selection_awesome_p_value'] = 0.000001
artists_info
{'n_responses': 9,
 'n_responses_unique': 9,
 'responses': ['Totorro',
  'Toe',
  'The sound of animals fighting',
  'TTNG',
  'Delta Sleep',
  'Covet',
  'Chon',
  'American Football',
  'Taylor Swift'],
 'artist_selection_awesome': True,
 'artist_selection_awesome_p_value': 1e-06}

As with lists, the value of a given element, here entry can be modified and deleted. This means they are mutable. If we for example forgot to correct our p value for multiple comparisons, we can simply overwrite the original with corrected one:

artists_info['artist_selection_awesome_p_value'] = 0.001
artists_info
{'n_responses': 9,
 'n_responses_unique': 9,
 'responses': ['Totorro',
  'Toe',
  'The sound of animals fighting',
  'TTNG',
  'Delta Sleep',
  'Covet',
  'Chon',
  'American Football',
  'Taylor Swift'],
 'artist_selection_awesome': True,
 'artist_selection_awesome_p_value': 0.001}

Assuming we then want to delete the p value from our dictionary because we remembered that the sole focus on significance and p values brought the entirety of science down to it’s knees, we can achieve this via:

del artists_info['artist_selection_awesome_p_value']
artists_info
{'n_responses': 9,
 'n_responses_unique': 9,
 'responses': ['Totorro',
  'Toe',
  'The sound of animals fighting',
  'TTNG',
  'Delta Sleep',
  'Covet',
  'Chon',
  'American Football',
  'Taylor Swift'],
 'artist_selection_awesome': True}

Exercise 8.1

Oh damn, we completely forgot to create a comparable dictionary for our snacks list. How would create one that follows the example from the movie list? NB: you can skip the p value right away:

# Please write your solution here

Exercise 8.2

Obviously, we would love to do the same thing for our list of animals again!

# Please write your solution here

Everything is an object in Python

  • All of these data types are actually just objects in python

  • Everything is an object in python!

  • The operations you can perform with a variable depend on the object’s definition

  • E.g., the multiplication operator * is defined for some objects but not others


Achknowledgments


Michael Ernst
Phd student - Fiebach Lab, Neurocognitive Psychology at Goethe-University Frankfurt

Peer Herholz (he/him)
Research affiliate - NeuroDataScience lab at MNI/MIT
Member - BIDS, ReproNim, Brainhack, Neuromod, OHBM SEA-SIG, UNIQUE

logo logo   @peerherholz

TLDR