
Introduction

In this exercise, you'll learn how to use the python interpreter to write and test functions iteratively, using import and reload to load and then refresh function definitions stored in a file.

What's a function?

In python, as in most programming languages, you can define functions. Functions are new commands that work a lot like UNIX commands in that you can type them into the interpreter with different arguments and observe the results.

Functions can also return values, meaning: you can use a function to create a variable or change the value of a variable that already exists.

To define a function in python, you use the keyword def (for define), followed by the name of the function, followed by an argument list, which contains zero or more arguments that will be passed in by the code that calls the function. (Calling a function is also called invoking the function.)

The function then contains a block of code, which is indented relative to the def keyword; the block of code typically performs manipulations on the variables that were passed in.

If the function returns a value, then the last statement executed in the function's code block is a return statement that specifies the value to return.

For example, here is a function called foo that accepts one argument, adds a number to it, and returns the result:

Code Block
>>> def foo(a):
...     return a + 5
...
>>> foo(2)
7
>>> b = foo(2)
>>> b
7

Note that when we invoke the function in the interpreter without assigning the result to a variable, the interpreter prints the return value. When we assign the result to a variable, the interpreter doesn't print anything. This is because an assignment doesn't return a value. All it does is change the python environment by creating a new variable.

Use the dir command to view names of variables that have been defined, like so.

Code Block
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'b', 'foo']

Note that we have defined two names: b, a variable, and foo, the function we created. The names that begin and end with double underscores are created by python itself.

Functions can return values

Functions are useful mainly because they can accept data in the form of arguments, perform calculations on those arguments, and then return values.

When you define a function, you include an argument list that names the variables the function accepts as arguments. The argument list can also be empty.

For example, here is a function that returns a string:

Code Block
>>> def kindofuseless():
...     return "hello"
...
>>> kindofuseless()
'hello'
>>> result = kindofuseless()
>>> result
'hello'
>>>

Here is a function that accepts two arguments:

Code Block
>>> def addsThem(a,b):
...     result = a+b
...     return result
...
>>> addsThem(1,2)
3
>>> result = addsThem(5,6)
>>> result
11

Functions can have side effects (good ones and bad)

Both functions in the previous section return a value using the return keyword. However, you can write functions that don't return values. They might still do something, such as add an element to a list or print something to stdout, but unless they explicitly return a value using the return keyword, their return value is None.

Code Block
>>> def nothing():
...     return None
...
>>> nothing()
>>>
>>> def nothingAgain():
...     print "Hello"
...
>>> nothingAgain()
Hello
>>> result = nothingAgain()
Hello
>>> result
>>>

Note how the function nothingAgain contains one statement: a print statement. The print statement does nothing except write its argument to the stdout stream. It doesn't produce a value, and if you try to assign the result of a print statement to a variable, python reports a syntax error. Like the print statement it contains, nothingAgain does something useful, but only via a side effect, meaning: it makes something happen, but doesn't return a value.
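As a minimal sketch of this point, the following compares a function that has a return statement with one that does not. The names withReturn and withoutReturn are made up for this illustration, and the parenthesized form of print is used only so that the sketch runs under both python 2 and python 3.

```python
def withReturn(a):
    # Explicit return: the caller receives a + 5.
    return a + 5

def withoutReturn(a):
    # No return statement: the sum is computed and then discarded,
    # so the caller receives None.
    a + 5

x = withReturn(2)
y = withoutReturn(2)
print(x)          # 7
print(y is None)  # True
```

The interpreter prints nothing when an expression evaluates to None, which is why typing result after calling nothingAgain shows no output.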

Here is another example of a function that does something via a side effect.

Code Block
>>> def appendAnItem(lst,item):
...     lst.append(item)
...
>>> lst = [1,2,3]
>>> lst
[1, 2, 3]
>>> appendAnItem(lst,4)
>>> lst
[1, 2, 3, 4]

Observe that the function, when passed a list and an item, adds the item to the list. Nothing gets returned, but the list changes.

Note

When passed a mutable data structure like a list or dictionary, a function can make a change to the data structure, and the change stays in effect even after the function ends.
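The sketch below contrasts changing a data structure in place with assigning a new value to the argument name inside the function. The function names addKey and rebind are hypothetical and chosen only for this illustration.

```python
def addKey(d, key, value):
    # Changes the dictionary that the caller passed in.
    d[key] = value

def rebind(lst):
    # Builds a new list and assigns it to the local name only;
    # the caller's list is left untouched.
    lst = lst + [99]

info = {'FirstName': 'Mary'}
addKey(info, 'LastName', 'Stevens')
print(sorted(info.keys()))   # ['FirstName', 'LastName']

nums = [1, 2, 3]
rebind(nums)
print(nums)                  # [1, 2, 3]
```

Only operations that modify the object itself, such as append or item assignment, are visible to the caller after the function ends.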

Exercises

The following exercises will give you experience using major constructs important in writing python programs.

Setting up

  • Open an editor and create a file called functions.py.
  • Save it in a location where python can find it; it should either reside in the same directory where you invoke python or in a directory listed in your PYTHONPATH environment variable.
  • Open two terminals. One terminal will be your interpreter window running python, and the other window will be your UNIX shell.
    • You can use your VM, which should have python already installed. Note the version may be quite old (e.g., 2.4.3) but installing a newer version of python is not necessary for this assignment.

Note that in this class, we're using python 2, not python 3.

Write, test, and run a simple function

Write a simple function countLetters that returns the number of letters in a string.

Add the following code to your functions.py file and replace result = 0 with a statement that calculates the length of the string.

Code Block
def countLetters(s):
    "Count the number of letters in a string"
    result = 0
    return result

Save the file. Next, launch python in one of your terminals. Import your functions into the python environment by typing after the python prompt:

Code Block
>>> import functions

If no error is printed, then the functions module was imported into the environment. However, if functions.py is not in the current directory or in a directory listed in your PYTHONPATH, you will see an error like this:

Code Block
>>> import functions
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named functions

To fix this, create and export an environment variable called PYTHONPATH. PYTHONPATH is an environment variable consisting of a colon-separated list of directories where python searches for module files such as functions.py.
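One way to check where python is searching is to inspect sys.path from within the interpreter. The following is a minimal sketch using only the standard sys and os modules; the directory /home/student/exercises is a made-up example, not a location this exercise requires.

```python
import os
import sys

# The directories python searches for modules, in order.
searchPath = sys.path
print(searchPath)

# The PYTHONPATH setting, if any (None when the variable is not set).
pythonPath = os.environ.get('PYTHONPATH')
print(pythonPath)

# A directory can also be added for the current session only.
# '/home/student/exercises' is a hypothetical location for functions.py.
exampleDir = '/home/student/exercises'
if exampleDir not in sys.path:
    sys.path.append(exampleDir)
```

A change made with sys.path.append lasts only until the interpreter exits, whereas an exported PYTHONPATH applies to every python session started from that shell.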

If your file contains a syntax error, you will see a different message that depends on the error. This is called a compile-time error because it occurs when python compiles the code, before any of it runs in the interpreter.
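To illustrate the difference between an error caught at compile time and one that appears only when the code runs, here is a small sketch that uses the built-in compile function on source code held in strings. The variable names and the deliberately broken sources are invented for this example.

```python
# A function definition with a missing colon after the argument list.
badSource = "def countLetters(s)\n    return 0\n"

try:
    # compile() parses the source without running it.
    compile(badSource, 'functions.py', 'exec')
    compileResult = 'compiled'
except SyntaxError:
    compileResult = 'SyntaxError'
print(compileResult)   # SyntaxError

# This source is valid syntax, so it compiles, but it refers to a
# name that does not exist. The problem surfaces only when f is called.
laterSource = "def f():\n    return undefinedName\n"
code = compile(laterSource, '<example>', 'exec')
namespace = {}
exec(code, namespace)

try:
    namespace['f']()
    runResult = 'ran'
except NameError:
    runResult = 'NameError'
print(runResult)       # NameError
```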

After importing the functions module, test your function by typing

Code Block
>>> astring = "ATGC"
>>> functions.countLetters(astring)
4

Note

To invoke a function belonging to a module, type the name of the module, a period, and then the name of the function with parentheses surrounding its argument list.

If you get an error, edit your code, save the file, and reload the changed code into the interpreter by typing

Code Block
>>> reload(functions)

Try to run your new function as before and repeat until you get the correct answer and no errors.
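The edit, save, and reload cycle can be sketched in a single script. The example below writes a small module to a temporary directory in place of editing it by hand. The module name reloaddemo, the function double, and the helper writeModule are all hypothetical. In python 2, reload is a built-in function; the import at the top is needed only so that the same sketch also runs under python 3.

```python
import os
import sys
import tempfile

try:
    from importlib import reload   # python 3.4 and later
except ImportError:
    pass                           # python 2: reload is a builtin

# Skip writing cached bytecode (python 2.6 and later) so that each
# reload reads the source file again.
sys.dont_write_bytecode = True

def writeModule(path, body):
    # Stand-in for editing and saving the file in a text editor.
    f = open(path, 'w')
    f.write(body)
    f.close()

workDir = tempfile.mkdtemp()
sys.path.insert(0, workDir)
modulePath = os.path.join(workDir, 'reloaddemo.py')

# First version of the file: the function has a bug.
writeModule(modulePath, "def double(x):\n    return x + 1\n")
import reloaddemo
before = reloaddemo.double(5)

# "Edit" the file to fix the bug, then reload to pick up the change.
writeModule(modulePath, "def double(x):\n    # fixed\n    return x * 2\n")
reload(reloaddemo)
after = reloaddemo.double(5)

print(before)   # 6
print(after)    # 10
```

Without the call to reload, the interpreter would keep using the first definition of double, because a second import of a module that is already loaded does not read the file again.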

Program more functions

sameString

This function should accept two arguments (strings) and return a boolean value, True or False.

If the strings are the same, regardless of the case, it should return True. If they are different, it should return False.

Write the function, import it into your python interpreter, and then test it as in the following example:

Code Block
>>> s1 = 'Hello how are you'
>>> s2 = 'Hello HOW are YOU'
>>> s3 = 'Fine thank you'
>>> functions.sameString(s1,s2)
True
>>> result = functions.sameString(s1,s2)
>>> result
True
>>> result = functions.sameString(s2,s3)
>>> result
False
>>>

Tip

Think of other ways to test your code.
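One possible approach is to collect several checks in a function and use assert statements, which raise an AssertionError when a condition is false. The sketch below applies the idea to the addsThem function from earlier; the name runChecks and the particular test cases are assumptions made for illustration.

```python
def addsThem(a, b):
    # Same behavior as the addsThem example shown earlier.
    result = a + b
    return result

def runChecks():
    # Each assert raises AssertionError if its condition is false.
    assert addsThem(1, 2) == 3
    assert addsThem(0, 0) == 0              # edge case: zeros
    assert addsThem(-1, 1) == 0             # edge case: negative number
    assert addsThem('AT', 'GC') == 'ATGC'   # + also joins strings
    return 'all checks passed'

status = runChecks()
print(status)   # all checks passed
```

The same pattern carries over to sameString: consider empty strings, strings of different lengths, and strings that differ only in case.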

makeComp

A dictionary is a data structure that uses keys to store values.

For example, here's how you would create a dictionary from scratch in the interpreter and add values to it.

Code Block
>>> d = {}
>>> d['FirstName']='Mary'
>>> d['LastName']='Stevens'
>>> d
{'LastName': 'Stevens', 'FirstName': 'Mary'}
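Once a dictionary exists, values are retrieved by key. The following sketch, written as a script rather than an interpreter session, shows a lookup, a membership test, and a loop over the keys. The key 'MiddleName' and the variable names are made up for this example.

```python
d = {}
d['FirstName'] = 'Mary'
d['LastName'] = 'Stevens'

# Look up a value by its key.
first = d['FirstName']

# Test whether a key is present before using it.
hasMiddle = 'MiddleName' in d

# Visit every key and its value; sorting makes the order predictable.
pairs = []
for key in sorted(d.keys()):
    pairs.append((key, d[key]))

print(first)      # Mary
print(hasMiddle)  # False
print(pairs)      # [('FirstName', 'Mary'), ('LastName', 'Stevens')]
```

Looking up a key that is not in the dictionary, such as d['MiddleName'] here, raises a KeyError.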

Write a function called makeComp that creates and returns a dictionary where keys are letters representing the four nucleotide bases of DNA and the values are their complementary bases. Allow both upper and lower case keys.

Code Block
>>> d = functions.makeComp()
>>> d['a']
't'
>>> d['A']
'T'
>>> d['c']
'g'
>>> d['g']
'c'
>>> d['a']
't'

reverseString

Write a function reverseString that accepts a string and then returns a new string that is the reverse of the first string.

For example, if given 'atcg', reverseString will return 'gcta'.

Code Block
>>> functions.reverseString('ATCG')
'GCTA'

findStrings

Write a function findStrings that returns a list of indices for all instances of a given query sequence of letters in a larger target string.

The first argument should be the query string and the second argument should be the string to search (the target).

For example, findStrings('AAA','AAATCAAAATC') should return the list [0,5,6].

Write a function that uses another function

Functions are most useful when you use them in other functions. Most programs are divided into multiple functions that call one another in order to process data or perform a larger task.

Each function performs a part of the work, and often there is a so-called main function that calls each of the other functions in order.

This type of programming is called procedural programming, and it is very common in bioinformatics because much of what we do involves processing data in orderly steps.
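A minimal sketch of this style is shown below: three small functions each perform one step, and a main function calls them in order. All of the function names and the comma-separated input format are invented for this illustration.

```python
def parseNumbers(text):
    # Step 1: turn a comma-separated string into a list of integers.
    result = []
    for piece in text.split(','):
        result.append(int(piece))
    return result

def squareAll(numbers):
    # Step 2: square every number.
    result = []
    for n in numbers:
        result.append(n * n)
    return result

def total(numbers):
    # Step 3: add the numbers up.
    result = 0
    for n in numbers:
        result = result + n
    return result

def main(text):
    # Calls each step in order, passing the output of one to the next.
    numbers = parseNumbers(text)
    squares = squareAll(numbers)
    return total(squares)

answer = main('1,2,3')
print(answer)   # 14
```

Because each step is its own function, each one can be imported and tested separately in the interpreter before main is run.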

For example, here are two functions. The second function invokes the first function, assigns its return value to a variable, uses that variable to perform a computation, and then returns the result.

Code Block
>>> def first(x):
...     return x*x
...
>>> first(10)
100
>>> def second(x):
...     y = first(x)
...     return y - x
...
>>> second(10)
90

reverseComplement

Write a function reverseComplement that accepts a string representing a DNA sequence and returns the reverse complement.

The letters of the returned string should be the same case as their complements in the input string. For example, if given the string 'AtCG' then reverseComplement should return 'CGaT'.

For full credit, reverseComplement should invoke makeComp and use the output to create and then return the reverse complement.

Turn it in

To turn in the program, put it in a Web-accessible directory on your VM and email Dr. Loraine the URL.

Note that starting next week, you'll turn in all your assignments using subversion, a version control system that is widely used in bioinformatics and in many other fields of computer science.