Python training exercise 8
Introduction
So far we've been writing 'sequential' code, basically following the flow of the code from the top to the bottom of the program. Sometimes, however, you want to re-use a piece of code elsewhere without copying and pasting it. You can do this with functions; a function holds a block of code that can be called from other places. Functions are essential for larger projects and code maintenance - if there's a problem with that piece of code, for example, you only have to fix it in one place.
Exercises
Functions
We've already been using built-in Python functions, for example abs() or len(). The name of the function followed by round brackets ( ) is the general syntax when calling a function. You can define them yourself in the following way:
```python
def myAbsFunc(someValue):
    if someValue < 0:
        someValue = -someValue
    return someValue

print(abs(-10))
print(myAbsFunc(-10))
```
So here we've emulated the Python built-in abs() function with myAbsFunc(). Within a function you can use return to 'send back' a value, which can then be used somewhere else in the code.
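As a small sketch of what 'used somewhere else' can mean: a returned value can be stored in a variable, used directly in an expression, or passed straight into another function. The function is the one from the example above; the variable names distanceFromZero and totalDistance are made up for illustration.

```python
def myAbsFunc(someValue):
    # Return the absolute value of someValue
    if someValue < 0:
        someValue = -someValue
    return someValue

# Store the returned value in a variable
distanceFromZero = myAbsFunc(-10)

# Use the returned values directly in an expression
totalDistance = myAbsFunc(-3) + myAbsFunc(4)

# Pass the returned value straight into another function
largestValue = max(myAbsFunc(-7), 5)

print(distanceFromZero, totalDistance, largestValue)
```

If a function reaches its end without a return, Python hands back None instead.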
Functions can also make code more 'readable': if you give a function a name that is easy to understand, it's clear what is happening without having to examine the code inside it. Important: save the program below in a file called functions1.py.
```python
def getMeanValue(valueList):
    """
    Calculate the mean (average) value from a list of values.
    Input: list of integers/floats
    Output: mean value
    """
    valueTotal = 0.0

    for value in valueList:
        valueTotal += value

    numberValues = len(valueList)

    return (valueTotal/numberValues)

meanValue = getMeanValue([4,6,77,3,67,54,6,5])

print(meanValue)
print(getMeanValue([3443,434,34343456,32434,34,34341,23]))
```
Note that it is good style to add a comment (in this case a multi-line string, known as a docstring) to the top of the function that describes what it does, what it takes as input and what it produces as output. This is especially important for more complex functions.
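One practical benefit of putting the description in a triple-quoted string directly under the def line is that Python stores it on the function, so it can be read back later. The sketch below uses a shortened getMeanValue() that relies on the built-in sum() instead of the loop, purely to keep the example brief.

```python
def getMeanValue(valueList):
    """
    Calculate the mean (average) value from a list of values.
    Input: list of integers/floats
    Output: mean value
    """
    # Shortened version for this sketch: sum() replaces the explicit loop
    return sum(valueList) / len(valueList)

# The description is stored on the function object itself
print(getMeanValue.__doc__)

# In an interactive Python session, help(getMeanValue) shows the same text
```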
You can call functions from within other functions, and in fact anywhere in the code where a value is expected, for example in a condition:
```python
def getMeanValue(valueList):
    """
    Calculate the mean (average) value from a list of values.
    Input: list of integers/floats
    Output: mean value
    """
    valueTotal = 0.0

    for value in valueList:
        valueTotal += value

    numberValues = len(valueList)

    return (valueTotal/numberValues)

def compareMeanValueOfLists(valueList1,valueList2):
    """
    Compare the mean values of two lists of values.
    Input: valueList1, valueList2
    Output: Text describing which of the valueLists has the highest average value
    """
    meanValueList1 = getMeanValue(valueList1)
    meanValueList2 = getMeanValue(valueList2)

    if meanValueList1 == meanValueList2:
        outputText = "The mean values are the same ({:.2f}).".format(meanValueList1)
    elif meanValueList1 > meanValueList2:
        outputText = "List1 has a higher average ({:.2f}) than list2 ({:.2f}).".format(meanValueList1,meanValueList2)
    else:
        # No need to compare again, only possibility left
        outputText = "List2 has a higher average ({:.2f}) than list1 ({:.2f}).".format(meanValueList2,meanValueList1)

    return outputText

valueList1 = [4,6,77,3,67,54,6,5]
valueList2 = [5,5,76,5,65,56,4,5]

print(compareMeanValueOfLists(valueList1,valueList2))

if getMeanValue(valueList1) > 1:
    print("The mean value of list 1 is greater than 1.")
```
Download this matrix file and save it in your directory. Then write a function that reads a matrix file in this format, reorders the rows by the values in a given column, and prints out the result. The function should take a file name and a column number as arguments. A possible answer:
```python
def sortMatrixByColumn(fileName,columnNumber):

    #
    # Read the tab-delimited file and store the values
    #

    fin = open(fileName)
    lines = fin.readlines()
    fin.close()

    #
    # Convert the data from the file into a Python list
    #

    matrix = []

    for matrixRow in lines:
        # Tab-delimited, so split line by \t - this will give a list of strings
        matrixColumns = matrixRow.rstrip().split("\t")

        # Add a row to the matrix
        matrix.append([])

        # Add the columns, but convert the strings from the file into a float
        for matrixValue in matrixColumns:
            matrix[-1].append(float(matrixValue))

    #
    # Now sort by column - but have to track the row number as well!
    #

    selectedColumnValues = []

    for rowNumber in range(len(matrix)):
        selectedColumnValues.append((matrix[rowNumber][columnNumber],rowNumber))

    selectedColumnValues.sort()

    #
    # Now print out the new matrix - the column value is now not interesting
    # we want the row number!!
    #

    for (columnValue,rowNumber) in selectedColumnValues:
        columnValueStrings = []
        for value in matrix[rowNumber]:
            columnValueStrings.append("{:.3f}".format(value))
        print("\t".join(columnValueStrings))

sortMatrixByColumn("matrix.txt",3)
```
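The sorting step in this answer tracks (value, row number) pairs by hand. As an alternative sketch, not part of the original answer, the built-in sorted() with a key function can reorder the rows directly. This assumes the matrix has already been read into a list of lists of floats; sortRowsByColumn is a hypothetical helper name and exampleMatrix is made-up data, not the contents of the matrix file.

```python
def sortRowsByColumn(matrix, columnNumber):
    # Return a new list of rows ordered by the value in the given column
    # (columns are counted from 0, as in the answer above)
    return sorted(matrix, key=lambda row: row[columnNumber])

# Made-up example data, not the contents of the matrix file
exampleMatrix = [
    [3.0, 9.5, 1.0],
    [1.0, 2.5, 7.0],
    [2.0, 4.5, 0.5],
]

for row in sortRowsByColumn(exampleMatrix, 1):
    print("\t".join("{:.3f}".format(value) for value in row))
```

Because sorted() is stable, rows with equal values in the chosen column keep their original order, which matches what the (value, row number) pairs produce.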
Modify the program that reads in the TestFile.pdb file so that it uses separate functions to 1. get the title, 2. extract the information from an ATOM line and 3. calculate the distance to the reference coordinate. A possible answer:
```python
def getTitle(line,cols):

    # Gets the title

    title = line.replace(cols[0],'')
    title = title.strip()

    return ("The title is '%s'" % title)

def getAtomInfo(cols):

    # Get relevant information from an ATOM line and convert to the right type

    atomSerial = int(cols[1])
    atomName = cols[2]
    residueNumber = int(cols[5])
    x = float(cols[6])
    y = float(cols[7])
    z = float(cols[8])

    return (atomSerial,atomName,residueNumber,x,y,z)

def calculateDistance(coordinate1,coordinate2):

    # Calculate the distance between two 3 dimensional coordinates

    return ((coordinate1[0] - coordinate2[0]) ** 2 + (coordinate1[1] - coordinate2[1]) ** 2 + (coordinate1[2] - coordinate2[2]) ** 2 ) ** 0.5

# Open the file
fileHandle = open("TestFile.pdb")

# Read all the lines in the file (as separated by a newline character), and store them in the lines list
# Each element in this list corresponds to one line of the file!
lines = fileHandle.readlines()

# Close the file
fileHandle.close()

# Initialise some information
searchCoordinate = (-8.7,-7.7,4.7)
modelNumber = None

# Loop over the lines, and do some basic string manipulations
for line in lines:

    line = line.strip()  # Remove starting and trailing spaces/tabs/newlines

    # Only do something if it's not an empty line
    if line:
        cols = line.split()  # Split the line by white spaces; depending on the format this could be commas, ...

        # Print off the title
        if cols[0] == 'TITLE':
            print(getTitle(line,cols))

        # Track the model number
        elif cols[0] == 'MODEL':
            modelNumber = int(cols[1])

        # For atom lines, calculate the distance
        elif cols[0] == 'ATOM':
            (atomSerial,atomName,residueNumber,x,y,z) = getAtomInfo(cols)

            # Calculate the distance
            distance = calculateDistance((x,y,z),searchCoordinate)

            if distance < 2.0:
                print("Model {}, residue {}, atom {} (serial {}) is {:.2f} away from reference.".format(modelNumber,residueNumber,atomName,atomSerial,distance))
```

Compared to the original program it is much easier to see what is going on here. Also, the calculateDistance() function is no longer embedded in this particular bit of code, so it can be re-used anywhere you need to calculate the distance between two 3D coordinates.
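To illustrate that kind of re-use, the sketch below calls the same calculateDistance() from an unrelated piece of code that finds which of several points lies closest to a reference point. The helper findClosestPoint() and the coordinates are made up for this example and are not part of the PDB exercise.

```python
def calculateDistance(coordinate1, coordinate2):
    # Calculate the distance between two 3 dimensional coordinates
    return ((coordinate1[0] - coordinate2[0]) ** 2 +
            (coordinate1[1] - coordinate2[1]) ** 2 +
            (coordinate1[2] - coordinate2[2]) ** 2) ** 0.5

def findClosestPoint(referencePoint, points):
    # Hypothetical helper: return the point closest to referencePoint
    # together with its distance
    closestPoint = None
    closestDistance = None

    for point in points:
        distance = calculateDistance(referencePoint, point)
        if closestDistance is None or distance < closestDistance:
            closestPoint = point
            closestDistance = distance

    return (closestPoint, closestDistance)

# Made-up coordinates for illustration
points = [(1.0, 2.0, 2.0), (0.0, 3.0, 4.0), (6.0, 0.0, 8.0)]
(closestPoint, closestDistance) = findClosestPoint((0.0, 0.0, 0.0), points)

print("Closest point is {} at distance {:.2f}".format(closestPoint, closestDistance))
```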
Keywords in functions
In the functions so far we've been using parameters - values that are passed in and are required for the function to work. You can also give keywords (keyword arguments) to a function; these are optional, because each one is given a default value in the function definition. You only set a keyword when you want something other than its default; consider this example:
```python
def getBeerColour(nameOfBeer,printColour=False):
    """
    Get the colour of a type of beer
    Input: name of the beer
           optional keyword printColour: if True, will print the colour within this function
    Output: colour of the beer
    """
    colourOfBeer = 'unknown'

    if nameOfBeer.upper() in ('DUVEL','JUPILER','WESTMALLE TRIPEL'):
        colourOfBeer = 'blond'
    elif nameOfBeer.upper() in ('PALM',):
        colourOfBeer = 'amber'
    elif nameOfBeer.upper() in ('KASTEELBIER','CHIMAY BLEUE'):
        colourOfBeer = 'dark'

    if printColour:
        print("Colour of beer '{}' is {}!".format(nameOfBeer,colourOfBeer))

    return colourOfBeer

print(getBeerColour('Duvel'))
getBeerColour('Palm',printColour=True)
```
Using these keywords makes the function a lot more flexible - you can make the function do things (or not) depending on them.
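A function can have several keywords, and the caller can set any subset of them by name, in any order. The sketch below shows this with a hypothetical helper formatValue(); the function and its keywords numberDecimals and unit are made up for illustration.

```python
def formatValue(value, numberDecimals=2, unit=''):
    # Hypothetical helper: value is required, the two keywords have defaults
    text = "{:.{precision}f}".format(value, precision=numberDecimals)
    if unit:
        text += " " + unit
    return text

# Both defaults are used
print(formatValue(3.14159))

# Only one keyword is set
print(formatValue(3.14159, numberDecimals=4))

# The first keyword is skipped, the second one is set
print(formatValue(3.14159, unit='nm'))

# Keywords can be given in any order
print(formatValue(3.14159, unit='nm', numberDecimals=1))
```

In a function definition, parameters without a default value have to come before the ones with a default value.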