Thursday, May 28, 2009

switch statements in python

I may as well weigh in on the absence of a switch statement in Python, though the topic has already been addressed; see:

Python Zone » Python switch statement, or
http://dinomite.net/2008/python-switch-statements-part-2

Here are some examples of switching structures using dictionaries:

# switch example 1
cases = {
'a':
    lambda: 'one',
'b':
    lambda: 'two',
'default':
    lambda: 'three'
}

switch = lambda c: cases.get(c, cases['default'])()

var = ['b','xxx']
for v in var:
    out = switch(v)
    print("Switch on %s = %s"%(v,out))


# switch example 2
op = raw_input("Enter operation for 2 'op' 3: ")

if op in ('+', '-', '*', '/'):  # a tuple, so '' and '+-' don't slip through as substrings
    print("2 %s 3 = " % op)
else:
    print("'%s' is an unknown operation." % op)

cases = {
     '+': lambda : 2 + 3,
     '-': lambda : 2 - 3,
     '*': lambda : 2 * 3,
     '/': lambda : 2. / 3.
     }

switch = cases.get(op, lambda : 0)()

print(switch)


# switch example 3
name = raw_input("What is your name? ")
op = raw_input("enter 'l' or 'p'")

cases = {
    'l': len,
    'p': lambda txt: txt.upper()
    }

def default(obj):
    # Return the message so that print(out) below shows it instead of None.
    return "Invalid entry, %s" % obj

switch = cases.get(op, default)
out = switch(name)

print(out)
I would say that each of these is as readable as C's switch statement. I can't say anything about performance, since I haven't run any tests, but Python's dictionaries are fast. In fact, if I understand the implementation correctly, this pattern is close in spirit to what a compiled switch often does: for a dense set of cases a C compiler will typically build a jump table (a lookup table keyed on the test value, much like a dictionary) and jump to the code block for the value entered.

The main argument against this pattern would be how cumbersome it gets for larger blocks of code: each block has to move into a function defined with def. My only response is that if you're writing large blocks of code inside a switch statement, you should think about putting them in functions anyway. In my experience there is usually a fair amount of reused code inside a large switch statement, since the cases usually each deal with a separate state of a single variable. Also, the difference in readability between a switch and a chain of if...elif...else blocks diminishes rapidly as the conditional code blocks get larger.
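
Here is a minimal sketch of what that looks like with functions defined by def instead of lambdas. The function names (handle_start, handle_stop, handle_unknown, dispatch) are made up for this illustration:

```python
# Dispatch to named functions when the blocks are too big for a lambda.
# All names here are hypothetical, chosen only for this example.

def handle_start(job):
    # Imagine several lines of real work here.
    return "starting %s" % job

def handle_stop(job):
    return "stopping %s" % job

def handle_unknown(job):
    return "no handler for %s" % job

cases = {
    'start': handle_start,
    'stop': handle_stop,
}

def dispatch(command, job):
    # dict.get supplies the default branch when the key is missing.
    return cases.get(command, handle_unknown)(job)

for command in ['start', 'stop', 'pause']:
    print(dispatch(command, 'backup'))
```

The dictionary itself stays short, so the list of cases can still be read at a glance even when the handlers grow.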

So, should Python have a switch statement? Long-term readability may benefit from having one way to write a switch, as opposed to different, although similar, patterns like those above. But in general, I support keeping control flow built from the simplest building blocks; in the long run I think that keeps bloat down and forces more thought about the nature of the specific programming problem at hand.

But if a switch statement were added to Python, I'd probably use it.





Wednesday, May 13, 2009

Speaking of timing

Unpacking a tuple directly is much faster than indexing each element:

In [35]: ll = [(x,x*x) for x in range(100)]

In [36]: def f1(obj):
   ....:     for row in obj:
   ....:         x = row[0]
   ....:         y = row[1]
   ....:

In [37]: def f2(obj):
   ....:     for row in obj:
   ....:         x,y = row
   ....:

In [38]: def f3(obj):
   ....:     for row in obj:
   ....:         x,y = (row[0],row[1])
   ....:

In [39]: timing(f1,10000,ll)
f1 2.04

In [40]: timing(f2,10000,ll)
f2 0.8

In [41]: timing(f3,10000,ll)
f3 2.41
Function calls always add a bit of overhead:

def f1(obj):
    x = min(obj,0.0)

def f3(obj):
    if obj <= 0.0:
        x = obj
    else:
        x = 0.0

In [56]: timing(f1,10000,5.)
f1 0.06

In [57]: timing(f1,10000,-5.)
f1 0.05

In [67]: timing(f3,10000,5.)
f3 0.04

In [68]: timing(f3,10000,-5.)
f3 0.04


Useful to know.
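
For anyone who wants to repeat the comparison without my timing function, here is a rough sketch using the standard timeit module. The names with_call, inline, and measure are my own for this example, and the numbers will vary from machine to machine:

```python
import timeit

def with_call(obj):
    # One extra function call: the built-in min().
    x = min(obj, 0.0)

def inline(obj):
    # Same result, written out as a plain comparison.
    if obj <= 0.0:
        x = obj
    else:
        x = 0.0

def measure(func, arg, number=100000):
    # timeit.timeit accepts a zero-argument callable and returns total seconds.
    # The lambda wrapper adds the same overhead to both candidates.
    return timeit.timeit(lambda: func(arg), number=number)

for func in (with_call, inline):
    for arg in (5.0, -5.0):
        print("%s(%s): %.3f s" % (func.__name__, arg, measure(func, arg)))
```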






Wednesday, May 6, 2009

Really, they're slow...

Maybe that previous example isn't fair...after all, I'm treating the value like a list.

In [187]: simple = lambda d: d+d

In [188]: timing(simple, 100000,dyyyymm)
<lambda> 34.84

In [189]: simple(dyyyymm)
Out[189]: Decimal("4018.10")

In [190]: simple(fyyyymm)
Out[190]: 4018.0999999999999

In [191]: timing(simple, 100000,fyyyymm)
<lambda> 0.26




Python Decimals are really slow

10.4. decimal — Decimal fixed point and floating point arithmetic — Python v2.6.2 documentation

In [182]: rparts = lambda d: map(int, (floor(d), round(100*(d%1),0)))

In [183]: rparts(dyyyymm)
Out[183]: [2009, 5]

In [184]: rparts(fyyyymm)
Out[184]: [2009, 5]

In [185]: timing(rparts,10000,dyyyymm)
<lambda> 9.03

In [186]: timing(rparts,10000,fyyyymm)
<lambda> 0.28


Note:  the timing function I'm using here is the same one referred to in an earlier post "Timing is everything". 
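
In case that post isn't handy, here is a rough stand-in. This is an assumption about what such a helper looks like, not the original function, and the values of dyyyymm and fyyyymm are inferred from the outputs shown in these posts:

```python
import time
from decimal import Decimal

def timing(func, n, arg):
    # Hypothetical stand-in for the helper from the earlier post:
    # call func(arg) n times, then print the function name and elapsed seconds.
    start = time.time()
    for _ in range(n):
        func(arg)
    elapsed = time.time() - start
    print("%s %.2f" % (func.__name__, elapsed))
    return elapsed

simple = lambda d: d + d

# Assumed values, inferred from the results shown in these posts.
dyyyymm = Decimal("2009.05")
fyyyymm = 2009.05

timing(simple, 100000, dyyyymm)
timing(simple, 100000, fyyyymm)
```

On newer Python versions (3.3 and later) the decimal module is implemented in C, so the gap should be smaller than the numbers above, though I would still expect floats to win.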


Saturday, May 2, 2009

6. Built-in Types — Python v2.6.2 documentation

Python variables contain pointers to the data, not the data itself. This is one of the more confusing aspects of the language for many. The implications, though, may be made clear by the following example:


class mytester(object):
    def __init__(self, thingone={}, thingtwo=None):
        self.thingone = thingone
        if isinstance(thingtwo,dict):
            self.thingtwo = thingtwo
        else:
            self.thingtwo = {}
    def setone(self, **kwargs):
        self.thingone.update(kwargs)
        return self
    def settwo(self, **kwargs):
        self.thingtwo.update(kwargs)
        return self

t1 = mytester()
t2 = mytester()

print(t1.thingone, t2.thingone)
# ({}, {})

print(t1.thingtwo, t2.thingtwo) 
# ({}, {})

t1.settwo(say='hi').settwo(to='thing one and thing two') # set thingtwo for t1
# <__main__.mytester object at 0xecc9d0>

print(t1.thingtwo, t2.thingtwo)  # t1 & t2 remain independent
# ({'to': 'thing one and thing two', 'say': 'hi'}, {})

t1.setone(say='hi').setone(to='thing one and thing two') # set thingone for t1
# <__main__.mytester object at 0xecc9d0>

print(t1.thingone, t2.thingone) # thingone dict points to same object for each
# ({'to': 'thing one and thing two', 'say': 'hi'}, {'to': 'thing one and thing two', 'say': 'hi'})

print(id(t1.thingone), id(t2.thingone)) #they're the same object!
# (15704976, 15704976)

print(id(t1.thingtwo), id(t2.thingtwo)) #they're not!
# (9733440, 9733584)

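In the first case, the default value '={}' for 'thingone' is evaluated only once, when __init__ is defined, not each time it is called. That creates a single dictionary object bound to the parameter 'thingone' (NOT to self.thingone), and every instance built without an explicit 'thingone' ends up pointing at it. So, when you change the values in that object for one instance, you change them for all.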
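
The same thing can be seen with a plain function. This is a small sketch with made-up names (remember, remember_fresh), not part of the class above:

```python
def remember(item, seen=[]):
    # The [] default is created once, when the def statement runs.
    seen.append(item)
    return seen

first = remember('a')
second = remember('b')

print(first is second)        # True: both names refer to the one default list
print(remember.__defaults__)  # (['a', 'b'],): the default lives on the function

def remember_fresh(item, seen=None):
    # Using None as the default and building the list inside the body
    # gives every call its own list.
    if seen is None:
        seen = []
    seen.append(item)
    return seen

print(remember_fresh('a') is remember_fresh('b'))  # False: two separate lists
```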

One could see where the ability to do this would be useful (say, where you would use a singleton pattern in C++, for example). Still, in general the ability to pass a value to the __init__ method of a class implies to the user that the values will be unique for each instance of the class, so the pattern used for 'thingtwo' should be used more often.

Also, this exercise highlights the importance of the id() function in Python--it clarifies a lot of things.