Python: the basics¶

Using Jupyter notebooks: a quick tour¶

Insert -> Insert Cell Below

Type Python code in the cell, eg:

print("Hello Jupyter !")

Shift-Enter to run the contents of the cell

When the text to the left of the cell reads In [*] (with an asterisk rather than a number), the cell is still running. It's usually best to wait until one cell has finished running before running the next.

print("Hello Jupyter !")

Hello Jupyter !

In Jupyter, just typing the name of a variable in the cell prints its representation:

message = "Hello again !"
message

'Hello again !'

# A 'hash' symbol denotes a comment
# This is a comment. Anything after the 'hash' symbol on the line is ignored by the Python interpreter

print("No comment")  # comment

No comment

Variables and data types¶

Integers, floats, strings¶

a = 5

a

5

type(a)

int

Adding a decimal point creates a float

b = 5.0

b

5.0

type(b)

float

int and float are collectively called 'numeric' types

(There are also other numeric types, like complex for complex numbers. Hexadecimal is not a separate type: a literal like 0xff is just another way of writing an int.)
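A quick sketch of the numeric types mentioned above. The variable names are made up for this example and are not used elsewhere in the notebook:

```python
# Illustrative names only; none of these appear elsewhere in this notebook
whole = 255          # int
from_hex = 0xff      # still an int, just written in hexadecimal
fraction = 2.5       # float
z = 3 + 4j           # complex

print(type(whole), type(from_hex), type(fraction), type(z))
print(whole == from_hex)   # True: same value, different notation
print(abs(z))              # 5.0, the magnitude of the complex number
```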

Challenge¶

What is the type of the variable letters defined below ?

letters = "ABACBS"

Strings¶

some_words = "Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇"

some_words

'Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇'

type(some_words)

str

more_words = 'You can use "single" quotes'
more_words

'You can use "single" quotes'

triple_quoted_multiline = """In the last years of the nineteenth century,
human affairs were being watched from the timeless worlds of space.
Nobody would have believed that we were being scrutinized as a ....

.. etc ..
"""

print(triple_quoted_multiline)

In the last years of the nineteenth century,
human affairs were being watched from the timeless worlds of space.
Nobody would have believed that we were being scrutinized as a ....

.. etc ..

# You can substitute variables into a string like this.
# The variables listed after the string replace each `{0}`, `{1}` etc, in order

formatted = "{0} and BTW, did I mention that {1}".format(more_words, some_words)
print(formatted)

# The example above is 'new-style' string formatting. 
# You may also see 'old-style' (C-style) string formatting in examples, which looks like: 

oldskool = "%s and BTW, did I mention that %s" % (more_words, some_words)

# There are lots of fancy ways to format numbers in strings (eg number of decimal places, scientific notation)
# that we won't go into today. See: https://pyformat.info/

You can use "single" quotes and BTW, did I mention that Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇
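You may also come across a third style, the 'f-string', which needs Python 3.6 or newer. The sketch below uses made-up values (name and pi_ish are not from this notebook) and also shows one simple number format, two decimal places:

```python
# Hypothetical values, chosen only for this example
name = "Jupyter"
pi_ish = 3.14159265

# An f-string evaluates the expressions inside {} in place
greeting = f"Hello {name} !"

# A format spec after the colon controls number formatting
rounded = f"{pi_ish:.2f}"

# The same format spec also works with str.format()
also_rounded = "{0:.2f}".format(pi_ish)

print(greeting)
print(rounded, also_rounded)
```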

Operators¶

+ - * / % ** //

+= *= -= /=

# int + int = int
a = 5
a + 1

6

# float + int = float
b = 5.0
b + 1

6.0

a + b

10.0

some_words = "Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇"
a = 6
a + some_words

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-d1919f666d68> in <module>()
      1 some_words = "Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇"
      2 a = 6
----> 3 a + some_words

TypeError: unsupported operand type(s) for +: 'int' and 'str'

str(a) + " " + some_words

'6 Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇'

# Multiplication
a * 10

60

# Division
a / 2

3.0

# Power
a**2

36

# Modulus - divide as whole numbers and return the remainder
a % 2

0

# Shorthand: operators with assignment
a += 1
a

7
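The operators listed at the top of this section that were not shown above (// and the other assignment shorthands) can be sketched like this. The variable n is used here only so that a is left alone:

```python
n = 7

print(n / 2)    # true division always gives a float: 3.5
print(n // 2)   # floor division rounds down to a whole number: 3
print(n % 2)    # remainder: 1

# Augmented assignment works for the other operators too
n -= 2          # n is now 5
n *= 3          # n is now 15
n /= 2          # n is now 7.5 (division turns it into a float)
print(n)
```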

Lists and sequence types¶

Lists¶

numbers = [2, 4, 6, 8, 10]
numbers

[2, 4, 6, 8, 10]

len(numbers)

5

# Lists can contain multiple data types
mixed_list = ["asdf", 2, 3.142]
mixed_list

['asdf', 2, 3.142]

list_of_lists = [mixed_list, numbers, ['a', 'b', 'c']]
list_of_lists

[['asdf', 2, 3.142], [2, 4, 6, 8, 10], ['a', 'b', 'c']]

numbers[0]

2

numbers[3]

8

numbers[3] = numbers[3] * 100
numbers

[2, 4, 6, 800, 10]

numbers.append(12)
numbers

[2, 4, 6, 800, 10, 12]

numbers.extend([14, 16, 18])
numbers

[2, 4, 6, 800, 10, 12, 14, 16, 18]

# The '+' operator joins lists like list.extend(), but it returns a new list and leaves the original unchanged
numbers + [100, 200, 300, 400]

[2, 4, 6, 800, 10, 12, 14, 16, 18, 100, 200, 300, 400]
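A small sketch of the difference between + and extend(), using a throwaway list called evens (not part of the notebook above):

```python
evens = [2, 4, 6]

combined = evens + [8, 10]   # builds a new list
print(evens)                 # [2, 4, 6], unchanged
print(combined)              # [2, 4, 6, 8, 10]

returned = evens.extend([8, 10])   # modifies evens in place
print(evens)                       # [2, 4, 6, 8, 10]
print(returned)                    # None: extend() does not return the list
```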

Tuples¶

tuples_are_immutable = ("bar", 100, 200, "foo")
tuples_are_immutable

('bar', 100, 200, 'foo')

tuples_are_immutable[1]

100

tuples_are_immutable[1] = 666

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-c91965b0815a> in <module>()
----> 1 tuples_are_immutable[1] = 666

TypeError: 'tuple' object does not support item assignment

Sets¶

unique_items = set([1, 1, 2, 2, 3, 4, 1, 2, 3, 4])
# or curly brackets
# unique_items = {1, 1, 2, 2, 3, 4, 1, 2, 3, 4}
unique_items

{1, 2, 3, 4}
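Sets are also handy for membership tests and for comparing collections. A minimal sketch, with two made-up sets (the order in which a set prints is not guaranteed, so no output is shown for the combined sets):

```python
# Hypothetical example sets
first = {1, 2, 3, 4}
second = {3, 4, 5}

print(3 in first)        # True: membership test
print(first & second)    # intersection: items in both
print(first | second)    # union: items in either
print(first - second)    # difference: items only in first

first.add(4)             # adding an existing item changes nothing
print(len(first))        # 4
```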

Slicing¶

numbers = [2, 4, 6, 8, 10, 12]

# list[start:end]
# start is inclusive, end isn't

numbers[0:3]

[2, 4, 6]

numbers[4:7]

[10, 12]

numbers[:3] # omitting start implies 0 (the very start)

[2, 4, 6]

numbers[3:] # omitting end means to the very end, ie len(numbers)

[8, 10, 12]

numbers[-1:] # negative values count back from the end of the list

[12]

numbers[:-1]

[2, 4, 6, 8, 10]

# you can also specify a step size
# list[start:end:step]

numbers[0:6:2]

[2, 6, 10]

# [:] is a shorthand for copying a list.
# Equivalent to:
# n_copy = list(numbers)

n_copy = numbers[:]
n_copy

[2, 4, 6, 8, 10, 12]

n_copy[3] = 800  # changing the copy leaves the original list untouched
n_copy

[2, 4, 6, 800, 10, 12]

numbers

[2, 4, 6, 8, 10, 12]
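A few more slicing patterns, sketched with a made-up list called letters_list. It also shows one way to check that [:] really produces a separate list:

```python
letters_list = ['a', 'b', 'c', 'd', 'e', 'f']

print(letters_list[-2:])    # last two items: ['e', 'f']
print(letters_list[1::2])   # every second item from index 1: ['b', 'd', 'f']
print(letters_list[::-1])   # a negative step walks backwards, giving a reversed copy

copy_of = letters_list[:]
print(copy_of == letters_list)  # True: same contents
print(copy_of is letters_list)  # False: a different list object
```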

Challenge¶

Given the list: ['banana', 'cherry', 'strawberry', 'orange']

Return a list of just the red fruits.

Dictionaries¶

Dictionaries store a mapping of key-value pairs. Since Python 3.7 they keep their items in insertion order (in older versions they were unordered).

Other programming languages might call this a 'hash', 'hashtable' or 'hashmap'.

pairs = {'Apple': 1, 'Orange': 2, 'Pear': 4}
pairs

{'Apple': 1, 'Orange': 2, 'Pear': 4}

pairs['Orange']

2

pairs['Orange'] = 16
pairs

{'Apple': 1, 'Orange': 16, 'Pear': 4}

pairs.items()
# list(pairs.items())

dict_items([('Apple', 1), ('Orange', 16), ('Pear', 4)])

pairs.values()
# list(pairs.values())

dict_values([1, 16, 4])

pairs.keys()
# list(pairs.keys())

dict_keys(['Apple', 'Orange', 'Pear'])

len(pairs)

3

dict_of_dicts = {'first': {1:2, 2: 4, 4: 8, 8: 16}, 'second': {'a': 2.2, 'b': 4.4}}
dict_of_dicts

{'first': {1: 2, 2: 4, 4: 8, 8: 16}, 'second': {'a': 2.2, 'b': 4.4}}
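Some other common dictionary operations, as a rough sketch. The stock dictionary is invented for this example:

```python
stock = {'Apple': 1, 'Orange': 2}   # hypothetical example data

stock['Pear'] = 4                   # assigning to a new key adds it
print('Pear' in stock)              # True: 'in' tests the keys
print(stock.get('Banana', 0))       # 0: a default instead of a KeyError
del stock['Apple']                  # remove a key and its value
print(list(stock.keys()))           # ['Orange', 'Pear'] on Python 3.7+
```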

Functions¶

Functions wrap up reusable pieces of code - the DRY ("Don't Repeat Yourself") principle

Significant whitespace: the body of the function is indicated by indenting by 4 spaces

(We also use these indented blocks for if/else, for and while statements .. later !)

return statements immediately return a value (or None if no value is given)

Any code in the function after the return statement does not get executed.

def square(x):
    return x**2

def hyphenate(a, b):
    return a + '-' + b
    print("We will never get here")

print(square(16), hyphenate('python', 'esque'))

256 python-esque
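To illustrate the point about None: a function with no return statement, or with a bare return, hands back None. The two functions below are made up for this sketch:

```python
def shout(text):
    print(text.upper() + " !")
    # no return statement here, so the function returns None

def bare_return(text):
    return
    print(text)   # never runs, because it comes after the return

result = shout("hello")
print(result)                   # None
print(bare_return("ignored"))   # None
```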

Indentation and whitespace¶

Python uses spaces at the start of a line to indicate a 'block' of code.
A new block of code should be indented by four spaces.
For a function, all the indented code is part of the function.
This also applies to loops like for and while and conditionals like if

(Indenting/dedenting by four spaces in Python is equivalent to opening { and closing } curly brackets in languages like Java, Javascript, C, C++, C# etc)

(You can technically use tab characters, but please don't. The official Python style guide prefers spaces https://www.python.org/dev/peps/pep-0008/).
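A small sketch of how indentation levels nest. The describe function is hypothetical, and it uses an if/else block, which is covered in the Conditionals section:

```python
def describe(n):
    # this line is indented once: it belongs to the function
    if n % 2 == 0:
        # indented twice: belongs to the if block inside the function
        kind = "even"
    else:
        kind = "odd"
    # back to one level: still inside the function, but outside the if/else
    return str(n) + " is " + kind

# no indentation: outside the function
print(describe(4))
print(describe(7))
```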

# Functions can return multiple values (just return a tuple and unpack it)
def lengths(a, b, c):
    return len(a), len(b), len(c)

x, y, z = lengths("long", "longer", "LONGEREST")
print(x, y, z)

4 6 9

def split_at(seq, residue='K'):
    """
    Takes a protein sequence (as a string) and splits it at each K residue,
    or the residue specified in the `residue` keyword argument. Split point
    residue is discarded.
    
    Returns a list of strings.
    """
    return seq.split(residue)

split_at('MILKGROGDRINKPINEAPPLE')

['MIL', 'GROGDRIN', 'PINEAPPLE']

# Functions can have an indeterminate number of arguments and keyword arguments using * and **
import math

def vector_magnitude(x, y, *args, **kwargs):
    
    # print(args)    # args is a tuple
    # print(kwargs)  # kwargs is a dictionary
    
    scale = kwargs.get('scale', 1)
    
    vector = [x,y] + list(args)
    return math.sqrt(sum(v**2 for v in vector)) * scale

print(vector_magnitude(1, 2, 4, 8, m=2))  # 'm' ends up in kwargs but is never used, so scale stays at 1

9.219544457292887
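To see what actually ends up in args and kwargs, here is a minimal sketch with a made-up function that simply hands them back:

```python
def show_arguments(first, *args, **kwargs):
    # args collects any extra positional arguments into a tuple
    # kwargs collects any extra keyword arguments into a dictionary
    return first, args, kwargs

first, extra, named = show_arguments(1, 2, 4, 8, scale=2)
print(first)   # 1
print(extra)   # (2, 4, 8)
print(named)   # {'scale': 2}
```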

Conditionals¶

a = 10
b = 0
a > 1

True

if a > 1:
    print("a is greater than one")

a is greater than one

word = 'Bird'

# Note: Double equals for a conditional vs single equals for assignment !
if word == 'Bird':
    print('Bird is the word.')
    
if word != 'Girt':
    print('The word is not girt.')

Bird is the word.
The word is not girt.

if 'ird' in word:
    print("'ird' is in Bird.")
    
letters = ['B', 'i', 'r', 'd']
if 'i' in letters:
    print("'i' is in letters.")

'ird' is in Bird.
'i' is in letters.

Protip: Long lines can be split across two or more using a backslash ('\')

This can make your code more readable.

There should be nothing after the backslash, including whitespace.

Try to keep lines to 79 characters or fewer for a PEP-8 style bonus.

if 'I' not in 'team' or \
   'I' not in 'TEAM':
    print("There is no 'I' in team (or TEAM).")

There is no 'I' in team (or TEAM).
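An alternative to the backslash, and the one PEP-8 prefers, is to wrap the expression in parentheses: anything inside (), [] or {} can run over several lines. A sketch, using a made-up variable team_name:

```python
team_name = 'team'   # hypothetical example value

# The parentheses let the condition run over several lines, no backslash needed
if ('I' not in team_name and
        'I' not in team_name.upper()):
    print("There is no 'I' in team (or TEAM).")

# The same works inside [], {} and function calls
total = sum([
    1,
    2,
    3,
])
print(total)   # 6
```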

# Boolean logic
# True and True => True
a > 1 and b <= 0

True

# True or False => True
a > 1 or b > 1

True

if a > 100:
    print("a is greater than one hundred")
elif a > 50:
    print("a is greater than fifty but not greater than one hundred")
else:
    print("a is fifty or less")
    
# Before Python 3.10 there was no case/switch statement in Python - you just used if/elif/elif/else
# (Python 3.10 added the match statement)

a is fifty or less

# Truthiness
if a:
    print("A non-zero int is truthy")

if not (a - 10):
    print("The int 0 is 'falsey' ... not False => True !")

if '' or [] or () or dict():
    print("We will never see this since an empty string, list, tuple and dict are all 'falsey'")
    
if "    ":
    print("A non-empty string, even whitespace, is 'truthy'")

A non-zero int is truthy
The int 0 is 'falsey' ... not False => True !
A non-empty string, even whitespace, is 'truthy'
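If you are unsure whether a value counts as truthy or falsey, the built-in bool() will tell you. A quick sketch (the names falsey_values and truthy_values are just for this example):

```python
# Each of these is treated as False in a condition
falsey_values = [bool(0), bool(0.0), bool(''), bool([]), bool(None)]
print(falsey_values)

# Each of these is treated as True in a condition
truthy_values = [bool(-1), bool(' '), bool([0])]
print(truthy_values)
```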

Loops¶

A for loop works on sequence types, generators and other iterables

(this includes lists, tuples, strings and dictionaries)

for letter in "ABCD..meh":
    print(letter)

A
B
C
D
.
.
m
e
h

ts = [('Z', 99), ('Y', 98), ('X', 97)]

for t in ts:
    print(t)
    
# using tuple unpacking
for m, n in ts:
    print(m, n)

('Z', 99)
('Y', 98)
('X', 97)
Z 99
Y 98
X 97

# for on dictionary.items()
d = {'A': 1, 'B': 2, 'C': 3}

for item in d.items():
    # print(type(item))
    print(item)

('A', 1)
('B', 2)
('C', 3)

for k, v in d.items():
    print(k, v)

A 1
B 2
C 3
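Three other patterns you are likely to meet with for loops, sketched with made-up data: range() for counting, enumerate() for getting the position of each item, and looping over a dictionary directly (which gives its keys):

```python
residues = ['M', 'I', 'L', 'K']   # hypothetical example data

# range(n) yields 0, 1, ... n-1
for i in range(3):
    print(i, end=' ')
print()

# enumerate() pairs each item with its position
for index, residue in enumerate(residues):
    print(index, residue)

# looping over a dictionary directly yields its keys
d = {'A': 1, 'B': 2, 'C': 3}
for key in d:
    print(key, d[key])
```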

while loops keep looping while their condition is true:

while some_condition:
    do_stuff()

Note: If the condition for your while loop never becomes False, the loop will run forever (in Jupyter you can do Kernel -> Interrupt to break out of the infinite loop).

a = 0
while a < 16:
    print(a, end=' ')
    a += 1

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

break immediately exits a loop

continue immediately starts the next iteration of the loop

Any code inside the loop after a break or continue is skipped.

a = 0
while True:
    a += 1
    
    if a > 16:
        break
        print('We will never see this.')
    
    if a % 2:
        continue
        print('We will also never see this.')
        
    print(a, end=' ')

2 4 6 8 10 12 14 16
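break and continue work the same way inside a for loop. A small sketch that finds the position of the first K in a made-up sequence:

```python
sequence = 'MILKGROG'   # hypothetical example data
position = None

for i, residue in enumerate(sequence):
    if residue == 'M':
        continue          # skip the rest of this iteration
    if residue == 'K':
        position = i
        break             # stop looping at the first K

print(position)   # 3
```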

List comprehensions¶

List comprehensions are a shorthand way to loop over a sequence, transform the items and build a new list.

# Instead of doing
new_list = []
for i in range(0,11):
    new_list.append(i**2)

new_list

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

# Use a list comprehension instead
new_list = [i**2 for i in range(0,11)]
new_list

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

# You can also `filter` values using an if statement inside the list comprehension
new_list = [i**2 for i in range(0,11) if i < 4]
new_list

[0, 1, 4, 9]
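Comprehensions are not limited to numbers. A sketch with a made-up list of words, first transforming every item and then transforming and filtering together:

```python
words = ['banana', 'cherry', 'kiwi', 'fig']   # hypothetical example data

# transform every item
upper_words = [w.upper() for w in words]

# transform and filter at the same time
long_lengths = [len(w) for w in words if len(w) > 4]

print(upper_words)    # ['BANANA', 'CHERRY', 'KIWI', 'FIG']
print(long_lengths)   # [6, 6]
```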

End part 1. Stand up and stretch for a moment.