In this exercise, you'll learn how to use the python interpreter to write and test functions iteratively, using the import and reload commands to import and then refresh function definitions stored in a file.
What's a function?
In python, as in most programming languages, you can define functions: new commands that work a lot like UNIX commands in that you can type them into the interpreter with different arguments and observe the results.
Functions can also return values, meaning: you can use a function to create a variable or change the value of a variable that already exists.
To define a function in python, you use the keyword def (for define), followed by the name of the function, followed by an argument list, which contains zero or more arguments that will be passed in by the code that calls the function. (Calling a function is also called invoking the function.)
The function then contains a block of code, which is indented relative to the def keyword; the block of code typically performs manipulations on the variables that were passed in.
If the function returns a value, then the last statement the function executes is a return statement, which hands that value back to the caller.
For example, here is a function called foo that accepts one argument, adds a number to it, and returns the result:
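A minimal sketch of such a function follows. The number added (5) and the test value (2) are arbitrary choices for illustration, and the comments describe what the interactive interpreter would show.

```python
def foo(x):
    # add a number to the argument and hand the sum back to the caller
    return x + 5

foo(2)      # typed at the >>> prompt, the interpreter prints 7
b = foo(2)  # assignment: nothing is printed, but b now refers to 7
```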
Note that when we invoke the function in the interpreter without assigning the result to a variable, the interpreter prints the return value. When we assign the result to a variable, the interpreter doesn't print anything. This is because an assignment doesn't return a value. All it does is change the python environment by creating a new variable.
Use the dir command to view names of variables that have been defined, like so.
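As a sketch, assuming a function foo and a variable b like those described above have been defined, calling dir with no arguments returns a list of the names currently defined:

```python
def foo(x):
    return x + 5

b = foo(2)

names = dir()   # list of names defined in the current scope
print(names)    # includes 'b' and 'foo', plus entries python itself defines
```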
Note that the list includes the two names we defined: b and foo, the function we created.
Functions can return values
Functions are useful mainly because they can accept data in the form of arguments, perform calculations on those arguments, and then return values.
The way this works is that when you define a function, you include an argument list that can include names of variables that are the function arguments. The argument list can also be empty.
For example, here is a function that returns a string:
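A minimal sketch, in which the function name getGreeting and the string it returns are illustrative choices:

```python
def getGreeting():
    # no arguments: the argument list is empty
    return 'hello, world'

message = getGreeting()   # message is now the string 'hello, world'
```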
Here is a function that accepts two arguments:
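A minimal sketch, in which the function name addNumbers and the argument names x and y are illustrative choices:

```python
def addNumbers(x, y):
    # x and y are the names the function uses for its two arguments
    return x + y

total = addNumbers(3, 4)   # total is now 7
```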
Functions can have side effects (good ones and bad)
Both functions return a value using the return keyword. However, you can write functions that don't return values. They might still do something, such as add an element to a list or print something to stdout, but unless they explicitly return a value using the return keyword, their return value is None.
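The sketch below shows two such functions. The name nothing and the printed message are assumptions made for illustration. The print call is written with parentheses around a single string so that the same code also runs under python 3.

```python
def nothing():
    # does no work at all; pass is a placeholder statement
    pass

def nothingAgain():
    # prints to stdout but has no return statement
    print("nothing to return")

a = nothing()        # a is None
c = nothingAgain()   # prints the message; c is None
```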
Note how the function nothingAgain contains one statement: a print statement. The print statement does nothing except print its argument to the stdout stream. It doesn't return a value, and if you try to assign the output of a print statement to a variable, python reports a syntax error. The function is therefore an example of one that does something useful, but only via a side effect, meaning: it makes something happen, but doesn't return a value.
Here is another example of a function that does something via a side effect.
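A minimal sketch, in which the function name addToList and the sample list are illustrative choices:

```python
def addToList(items, item):
    # append changes the list the caller passed in; nothing is returned
    items.append(item)

bases = ['A', 'C', 'G']
result = addToList(bases, 'T')
# bases is now ['A', 'C', 'G', 'T'] and result is None
```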
Observe that the function, when passed a list and an item, adds the item to the list. Nothing gets returned, but the list changes.
The following exercises will give you experience using major constructs important in writing python programs.
- Open an editor and create a file called functions.py.
- Save it in a location where python can find it; it should either reside in the same directory where you invoke python or in a directory listed in your PYTHONPATH environment variable.
- Open two terminals. One terminal will be your interpreter window running python, and the other window will be your UNIX shell.
- You can use your VM, which should have python already installed. Note the version may be quite old (e.g., 2.4.3) but installing a newer version of python is not necessary for this assignment.
Note that in this class, we're using python 2, not python 3.
Write, test, and run a simple function
Write a simple function countLetters that returns the number of letters in a string.
Add the following code to your functions.py file and replace result = 0 with a statement that calculates the length of the string.
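A skeleton consistent with these instructions might look like the following. The parameter name s is an assumption, and the function returns 0 for every input until you replace the placeholder.

```python
def countLetters(s):
    # placeholder: replace the next line with a statement that
    # calculates the length of the string s
    result = 0
    return result
```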
Save the file. Next, launch python in one of your terminals. Import your functions into the python environment by typing import functions (without the .py extension) after the python prompt.
If no error is printed, then the functions module was imported into the environment. However, if functions.py is in neither the current directory nor a directory listed in your PYTHONPATH, python reports an ImportError saying that there is no module named functions.
To fix this, create and export an environment variable called PYTHONPATH. PYTHONPATH is an environment variable consisting of a colon-separated list of directories where python searches for module files such as functions.py.
If your file contains a syntax error, you will see a different message that depends on the error. This is called a compile time error because it takes place when the python code is "compiled" to run in the interpreter.
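After importing the functions module, test your function by calling it on a string whose length you know, for example functions.countLetters('hello').
If you get an error, edit your code, save the file, and reload the changed code into the interpreter by typing reload(functions).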
Try to run your new function as before and repeat until you get the correct answer and no errors.
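The edit, save, reload, and test cycle can be sketched as a self-contained script. Everything named here is hypothetical: a module called demo_functions, a function called triple, and a temporary directory standing in for the directory that holds your file. The save helper plays the role of your editor. The try block is only needed so the script also runs under python 3, where reload lives in importlib; in python 2 it is a built-in.

```python
import os
import shutil
import sys
import tempfile

try:
    from importlib import reload   # python 3
except ImportError:
    pass                           # python 2: reload is a built-in

sys.dont_write_bytecode = True     # make every reload re-read the source file

# stand-in for the directory that holds your module file
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'demo_functions.py')

def save(source):
    # plays the role of saving the file in your editor
    f = open(path, 'w')
    f.write(source)
    f.close()

# first draft, with a bug: it doubles instead of tripling
save("def triple(x):\n    return x * 2\n")

sys.path.insert(0, workdir)        # same effect as listing workdir in PYTHONPATH
import demo_functions
first = demo_functions.triple(3)   # 6, the wrong answer

# fix the bug, save, reload, and test again
save("def triple(x):\n    return x * 3\n")
reload(demo_functions)
second = demo_functions.triple(3)  # 9, the right answer

shutil.rmtree(workdir)             # remove the temporary directory
```

Note that the functions are always called through the module name, as in demo_functions.triple, so that the call picks up the reloaded definition.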
Program more functions
This function should accept two arguments (strings) and return a boolean value, True or False.
If the strings are the same, regardless of the case, it should return True. If they are different, it should return False.
Write the function, import it into your python interpreter, and then test it as in the following example:
A dictionary is a data structure that uses keys to store values.
For example, here's how you would create a dictionary from scratch in the interpreter and add values to it.
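A minimal sketch, in which the dictionary name aminoAcids and its contents (two codons mapped to the amino acids they encode) are illustrative choices:

```python
aminoAcids = {}             # start with an empty dictionary
aminoAcids['ATG'] = 'Met'   # add a value under the key 'ATG'
aminoAcids['TGG'] = 'Trp'   # add a second key and value

first = aminoAcids['ATG']      # look up a value by its key: 'Met'
hasKey = 'TGG' in aminoAcids   # True, because the key exists
```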
Write a function called makeComp that creates and returns a dictionary where keys are letters representing the four nucleotide bases of DNA and the values are their complementary bases. Allow both upper and lower case keys.
Write a function stringReverse that accepts a string and then returns a new string that is the reverse of the first string.
For example, if given 'atcg', stringReverse will return 'gcta'.
Write a function findStrings that returns a list of indices for all instances of a given query sequence of letters in a larger target string.
The first argument should be the query string and the second argument should be the string to search (the target).
For example, findStrings('AAA','AAATCAAAATC') should return the list [0,5,6].
Write a function that uses another function
Functions are most useful when you use them in other functions. Most programs are divided into multiple functions that call one another in order to process data or carry out a larger task.
Each function performs a part of the work, and often there is a so-called main function that calls each of the other functions in order.
This type of programming is called procedural programming, and it is very common in bioinformatics because much of what we do involves processing data in orderly steps.
For example, here are two functions. The second function invokes the first function, assigns its return value to a variable, uses that variable to perform a computation, and then returns the result.
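A minimal sketch of this pattern follows. The function names countGC and gcFraction are illustrative choices, and the sketch assumes the sequence is non-empty and written in upper case.

```python
def countGC(seq):
    # count the G and C letters in an upper-case DNA sequence
    return seq.count('G') + seq.count('C')

def gcFraction(seq):
    # invoke the first function and keep its return value in a variable
    gc = countGC(seq)
    # use that variable in a computation, then return the result;
    # float() avoids integer division in python 2
    fraction = float(gc) / len(seq)
    return fraction

value = gcFraction('ATGC')   # 0.5
```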
Write a function reverseComplement that accepts a string representing a DNA sequence and returns the reverse complement.
The letters of the returned string should be the same case as their complements in the input string. For example, if given the string 'AtCG' then reverseComplement should return 'CGaT'.
For full credit, reverseComplement should invoke makeComp and use the output to create and then return the reverse complement.
Turn it in
To turn in the program, put it in a Web-accessible directory on your VM and email Dr. Loraine the URL.
Note that starting next week, you'll turn in all your assignments using subversion, a version control system that is widely used in bioinformatics and in many other fields of computer science.