##Review exercises (10 minutes)
a = 1; b = 2; c = 2.0; d = 'abcdefg'; e = '1'
1 / 2
a / b
a / c
float(a) / b
a / 'b'
a + b
d + e
d + 'e'
if 'd' in d:
    print 'found d'
if d in 'd':
    print 'found abcdefg'
else:
    print 'the variable d was not in the string "d"'
##Exercises 4: slices (20 minutes).
Try some values of your own to see if you can work out how negative indices and omitted start and end arguments behave.
'gene_1_UAUCCUA_0.3' as a variable and write slice notation to retrieve the third character (the 'two-eth' character; remember that the first item is item number 0). This should give back
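As a rough illustration of the indexing rules described above, the sketch below slices the example string in a few ways. The variable names are made up for illustration, and `print(...)` is written with parentheses so the snippet runs under Python 3 as well as Python 2.

```python
# The example string from the exercise above; the variable name is illustrative.
gene = 'gene_1_UAUCCUA_0.3'

third_char = gene[2]     # counting starts at 0, so index 2 is the third character
last_char = gene[-1]     # negative indices count backwards from the end
first_four = gene[:4]    # an omitted start means "from the beginning"
last_three = gene[-3:]   # an omitted end means "through the end"
sequence = gene[7:14]    # the start index is included, the end index is excluded

print(third_char)
print(last_char)
print(first_four)
print(last_three)
print(sequence)
```

Changing the numbers and predicting the result before running each line is a quick way to check your understanding.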
##Exercises 5: lists and splitting things (20-30 minutes)
x and then try this command:
x2 = x.split('_')
x2? Note the [ and ] symbols and the commas, as well as the extra quote marks. These features show that x2 is a list (that's what the brackets on the ends denote) composed of several strings (commas delimit the individual elements of the list). The split command that created this list from a string will be covered in part 5.
x2, try these commands to compare the slice notation properties of a list with the slice notation properties of a string.
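One possible comparison is sketched below. It assumes `x` holds the string `'gene_1_UAUCCUA_0.3'` from the slicing exercise (an assumption; the original assignment is not shown here), and the other variable names are illustrative only.

```python
# Assumed value of x; the original assignment is not shown in this section.
x = 'gene_1_UAUCCUA_0.3'
x2 = x.split('_')        # a list of four strings

string_item = x[0]       # indexing a string gives a single character
list_item = x2[0]        # indexing a list gives a whole element
string_slice = x[0:2]    # slicing a string gives a shorter string
list_slice = x2[0:2]     # slicing a list gives a shorter list
nested_char = x2[2][0]   # index the list first, then the string inside it

print(string_item)
print(list_item)
print(string_slice)
print(list_slice)
print(nested_char)
```

The slice syntax is identical in both cases; only the kind of item that comes back differs.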
list variable we created (
split method can be generalized as follows:
some_list = 'some_string'.split(some_delimiter). There are two inputs to this split statement (the string that needs splitting and the delimiter used to split it) and one output (the resulting list). Try the following commands to familiarize yourself with how split breaks a string up into a list.
x = 'abcdefgh'.split('c')
a = 'hi, my name is Bob, this is Una'
z = a.split(' ')
l = a.split() # this is a very useful trait of split, look up what happens when you split with empty parentheses
caveman = a.split(' is ') # notice that multiple characters can be used as the delimiter
'gene1 gene2 gene3'
'gene1, gene2, gene3'
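The choice of delimiter matters more than it might seem. The sketch below compares a few delimiters on the comma-separated string above and on a hypothetical string with irregular spacing (`spaced` is invented for this example).

```python
record = 'gene1, gene2, gene3'
by_comma_space = record.split(', ')   # the delimiter is removed entirely
by_comma = record.split(',')          # the spaces stay attached to the pieces

# Hypothetical string with uneven runs of spaces, for illustration only.
spaced = 'gene1  gene2   gene3'
by_one_space = spaced.split(' ')      # every single space is a split point
by_whitespace = spaced.split()        # runs of whitespace count as one delimiter

print(by_comma_space)
print(by_comma)
print(by_one_space)
print(by_whitespace)
```

Splitting on a single space leaves empty strings wherever two spaces sit side by side, which is why calling `split()` with no argument is often the more convenient choice for whitespace-separated data.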
##15 minute break
##Exercises 6: more properties of lists and 'methods' like split (20-30 minutes)
x = [1, 2, 3]
y = 'ABCDE'
x = 'ab'
x[1:3] = 'R'
y = 'L'
y = y[0:1] + 'hello' + y[2:]
'ATG') as your delimiter. Next, we need to fix each individual ORF in the list (as each ORF is now missing its start codon). To fix each ORF, replace each element in the list with the start codon plus the element (hint: use the string concatenation operator '
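The general idea can be sketched on a short made-up sequence. The sequence and variable names below are hypothetical and are not the ones used in the exercise; the point is only that a list element can be replaced by assigning to its index.

```python
# Hypothetical sequence containing two start codons; not the exercise data.
seq = 'CCATGAAATAGGGATGCCCTGA'

# Splitting removes the delimiter, so each piece after the first has lost its 'ATG'.
pieces = seq.split('ATG')
print(pieces)

# Lists can be changed in place: replace each element with 'ATG' plus the element.
pieces[1] = 'ATG' + pieces[1]
pieces[2] = 'ATG' + pieces[2]
print(pieces)
```

The first element is whatever came before the first start codon, so it is left unchanged here.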
.something is a method of whatever came before the dot. Methods are a major feature of the Python programming language. We’ve been using the split method of strings, which operates on a string and returns a list. Usually, a method uses whatever came before the dot as the input to operate on, takes whatever is in parentheses as parameters, and returns some value that the user can store in a variable. Importantly, these three components can be supplied in very creative ways so long as they evaluate to values that the Python method knows what to do with (in this case the method requires an input string and uses a second string as a delimiter). Try these bizarre-looking exercises to test this.
x = ['abcdefgh', 'cd']
y = x.split(x)
y = x.split(x)
y = x.split(x)
y = x.split(x)[x.index('c')]
y = x.split(x).split('g')
In general, you can make your code compact by putting the content of one operation into the input fields of the next, or readable by storing each step as a variable. Here is an alternate version of the final statement:
first_string = x
delimiter = x
first_list = first_string.split(delimiter)
new_string = first_list
final_list = new_string.split('g')
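The compact and step-by-step styles can be compared side by side in a runnable form. The index expressions in this sketch (`x[0]`, `x[1]`, and `[1]` on the intermediate list) are assumptions chosen so that every step receives a string; they are not confirmed by the text above.

```python
x = ['abcdefgh', 'cd']

# Compact: each result feeds straight into the next operation.
# The indices used here are assumptions made for this sketch.
compact = x[0].split(x[1])[1].split('g')

# Readable: the same steps, each stored in its own variable.
first_string = x[0]                          # the string to split
delimiter = x[1]                             # the string to split it by
first_list = first_string.split(delimiter)   # a list of two strings
new_string = first_list[1]                   # the piece after the delimiter
final_list = new_string.split('g')

print(compact)
print(final_list)
```

Both versions produce the same list; the second is longer, but each intermediate value can be printed and inspected when something goes wrong.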
interesting_genes portion of the string (with split), split this portion of the string by 'gene5' to only look at the part that comes after 'gene5' (with a second split), and return the value associated with gene5 (with a third split)
'boring_genes; gene1:2.6, gene2:3.8, interesting_genes; gene4:1.9, gene5:8.2, gene6:9.1'
if statement so that it would find the gene5 expression level regardless of whether gene5 was in the interesting_genes or boring_genes, and so that it would report back which set of genes gene5 was in.
len to explore the properties of your new nested list.
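One way the chained splits and the if statement might fit together is sketched below, using the gene string from this section. The variable names and the exact delimiters (`'interesting_genes; '`, `'gene5:'`, `','`) are choices made for this sketch, and other delimiters would work just as well.

```python
genes = 'boring_genes; gene1:2.6, gene2:3.8, interesting_genes; gene4:1.9, gene5:8.2, gene6:9.1'

# Three splits: isolate the interesting_genes portion, keep what follows
# 'gene5:', then cut off everything after the next comma.
interesting = genes.split('interesting_genes; ')[1]
after_gene5 = interesting.split('gene5:')[1]
level = after_gene5.split(',')[0]
print(level)

# Everything before the 'interesting_genes' label belongs to boring_genes,
# so checking that portion tells us which set gene5 is in.
boring_part = genes.split('interesting_genes')[0]
if 'gene5' in boring_part:
    group = 'boring_genes'
else:
    group = 'interesting_genes'

# Splitting the whole string finds the level no matter which set gene5 is in.
level_any_group = genes.split('gene5:')[1].split(',')[0]
print('gene5 is in ' + group + ' with level ' + level_any_group)
```

Note that the level comes back as a string; `float(level)` would convert it to a number if you needed to do arithmetic with it.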
##Exercises 7: loops and nested lists (20-30 minutes).
for hamster_plan in 'ALMJKLKJ':
    print hamster_plan
for horse_vitamin in ['frosted', 'berry', 'cereal']:
    print horse_vitamin
'ALMJKLKJ'? In other words, what is each hamster_plan? What is being iterated through in the list ['frosted', 'berry', 'cereal']? What is each horse_vitamin? Do you notice the difference between the type of data retrieved by the hamster_plan loop (which is going through a string) and the horse_vitamin loop (which is going through a list)? What will bean_juice be if you nest the loops as below:
for horse_vitamin in ['frosted', 'berry', 'cereal']:
    for bean_juice in horse_vitamin:
        print bean_juice
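To make the difference easier to inspect, the sketch below runs the same three loops but collects each loop variable into a list instead of printing it straight away. The collecting lists (`characters`, `words`, `letters`) and the use of the list `append` method are additions for this illustration.

```python
characters = []
for hamster_plan in 'ALMJKLKJ':
    characters.append(hamster_plan)   # one character of the string per pass

words = []
for horse_vitamin in ['frosted', 'berry', 'cereal']:
    words.append(horse_vitamin)       # one whole string from the list per pass

letters = []
for horse_vitamin in ['frosted', 'berry', 'cereal']:
    for bean_juice in horse_vitamin:
        letters.append(bean_juice)    # one character of the current word per pass

print(characters)
print(words)
print(letters)
```

The inner loop runs to completion for each word before the outer loop moves on, so `letters` holds every character of every word, in order.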
'ALSQRWQT' and prints each character.
'found Q' every time the character is
x in the line preceding your loop from above (the part 4 loop) at some initial value of your choosing, and putting x = x + 1 within the loop. What happens to x as you go through the loop? Use this nifty property to print out the letter number where 'Q' was found whenever
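A minimal sketch of the counter idea follows, assuming the counter starts at 0 so that it lines up with Python's own indexing. The names `sequence`, `letter`, and `positions` are illustrative.

```python
sequence = 'ALSQRWQT'

x = 0             # counter, set before the loop starts
positions = []    # illustrative extra: remember where each 'Q' was seen
for letter in sequence:
    if letter == 'Q':
        print('found Q at letter number ' + str(x))
        positions.append(x)
    x = x + 1     # runs on every pass, whether or not a 'Q' was found

print(positions)
```

Starting the counter at 1 instead would report positions the way people usually count, at the cost of no longer matching the index you would use in `sequence[...]`.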
x (below) to figure out how many lists are in x. Use a loop to print each of the lists that make up
x = [[['gene1', 'heart'], ['gene2', 'brain']], [['gene4', 'appendix'], ['gene5', 'stomach'], ['gene6', 'esophagus']]]
x, see how many items the component list has (with the len function), and make a loop that prints out what those items are.
x). If you have time, try looping through all elements of the slices instead of the full list (replacing x in the outermost loop with
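The nested list above can be explored with `len` and nested loops along the following lines. The collecting variables and the particular slice (`x[1][1:]`) are arbitrary choices for this sketch.

```python
x = [[['gene1', 'heart'], ['gene2', 'brain']],
     [['gene4', 'appendix'], ['gene5', 'stomach'], ['gene6', 'esophagus']]]

print(len(x))                  # number of lists at the outermost level

group_sizes = []
gene_names = []
for group in x:                # each group is itself a list of pairs
    group_sizes.append(len(group))
    for pair in group:         # each pair is a list of two strings
        print(pair)
        gene_names.append(pair[0])

# Looping over a slice visits only part of a list.
sliced_names = []
for pair in x[1][1:]:
    sliced_names.append(pair[0])

print(group_sizes)
print(gene_names)
print(sliced_names)
```

Each extra level of nesting needs either one more loop or one more index in square brackets to reach the strings inside.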
dna = 'AATTACCGCATTCCACGGGACCTACGAATTATAGTACCTAAA'
i = 0
while i < 10:
    print(dna[i])
    i = i + 1  # advance the counter so the loop stops after the first 10 characters