Extracting Names from Email Addresses

Given a CSV file with the following format:


;;firstname.lastname@somehost.com

The task is to extract the names from the email addresses. We assume that the names are separated by periods (.), that each name should be capitalized, and that the result is printed as semicolon-separated strings:


#!/usr/bin/env python

import sys

if len( sys.argv ) < 2:
    print( "Usage: %s filename" % sys.argv[ 0 ] )
    sys.exit( 1 )

textFileName = sys.argv[ 1 ]

# The email address is expected in the third semicolon-separated field.
with open( textFileName, "r" ) as textFile:
    for line in textFile:
        fields = line.strip().split( ';' )
        if len( fields ) < 3:
            continue # skip blank or malformed lines
        email = fields[ 2 ].split( "@" )
        emailName = email[ 0 ].split( '.' )
        capitalizedName = [ x[:1].upper() + x[1:].lower() for x in emailName ]
        # Output: first name; remaining names; original address
        print( '%s;%s;%s' % ( capitalizedName[ 0 ], ' '.join( capitalizedName[ 1: ] ), fields[ 2 ] ) )
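
To illustrate what the script prints, here is a minimal sketch of the same transformation wrapped in a function. The name extract_name and the second sample address are made up for this illustration; the logic follows the script above.

```python
def extract_name(line):
    """Turn ';;first.last@host' into 'First;Last;first.last@host'."""
    fields = line.strip().split(";")
    address = fields[2]
    local_part = address.split("@")[0]
    # Capitalize each period-separated part of the local part.
    names = [part[:1].upper() + part[1:].lower() for part in local_part.split(".")]
    return "%s;%s;%s" % (names[0], " ".join(names[1:]), address)

print(extract_name(";;firstname.lastname@somehost.com"))
# Firstname;Lastname;firstname.lastname@somehost.com
print(extract_name(";;anna.maria.SMITH@somehost.com"))
# Anna;Maria Smith;anna.maria.SMITH@somehost.com
```

Everything after the first period ends up in the second field, so addresses with middle names still produce exactly three fields.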

There are approximately N/ln(N) primes between N and 2N

Just saw this very nice video by @numberphile, and thought I'd whip up a small Python program to demonstrate the prime number theorem:


#!/usr/bin/env python
#
# "Chebyshev said it, and I say it again: There's always a prime between n and 2n."
#

import sys
import math

class PrimeFinder:

    def __init__( self, n ):
        self.n = n

    def isNPrime( self, N ):
        # 0 and 1 are not prime; without this check the empty loop below
        # would report them as prime.
        if N < 2:
            return False
        # Trial division up to the square root of N.
        for x in range( 2, int( math.sqrt( N ) ) + 1 ):
            if N % x == 0:
                return False
        return True

    def computeAllPrimesBetweenNAndTwoN( self ):
        result = []
        for N in range( self.n, 2 * self.n + 1 ):
            if self.isNPrime( N ):
                result.append( N )
        return result

def main():
    if len( sys.argv ) != 2:
        print( "Prints all prime numbers between N and 2N" )
        print( "Usage: %s N" % sys.argv[ 0 ] )
        print( "Where N is some positive, natural number." )
        sys.exit( 0 )

    N = int( sys.argv[ 1 ] )
    primeFinder = PrimeFinder( N )
    allPrimes = primeFinder.computeAllPrimesBetweenNAndTwoN()
    print( "There are %d primes between %d and %d: %s" % (
        len( allPrimes ), N, 2 * N, str( allPrimes )[ 1 : -1 ]
    ) )

if __name__ == "__main__":
    main()

And it seems to work, but check WolframAlpha if you don’t trust me 🙂


$ ./myprimes.py 100000
There are 8392 primes between 100000 and 200000: 100003, 100019, 100043 ...
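
For comparison, 100000/ln(100000) is about 8686, so the estimate from the title is in the right ballpark. The following sketch puts the count and the estimate side by side for a few values of N. It swaps the trial division for a sieve of Eratosthenes to keep things quick; the function name count_primes_between is made up for this illustration, and it assumes N is at least 1.

```python
import math

def count_primes_between(n):
    """Count the primes p with n <= p <= 2n using a sieve of Eratosthenes."""
    limit = 2 * n
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(math.sqrt(limit)) + 1):
        if is_prime[i]:
            # Cross out every multiple of i, starting at i * i.
            for multiple in range(i * i, limit + 1, i):
                is_prime[multiple] = False
    return sum(is_prime[n:])

for n in (1000, 10000, 100000):
    counted = count_primes_between(n)
    estimate = n / math.log(n)
    print("N = %6d: %5d primes, N/ln(N) = %.1f, ratio = %.3f"
          % (n, counted, estimate, counted / estimate))
```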

How to use SciPy Least Squares to minimize multiple functions at once

SciPy comes with a Levenberg-Marquardt least squares implementation, scipy.optimize.leastsq. It minimizes the sum of squares of a vector-valued function. By defining that function as the difference between some measurements and your model function, you can fit a model to those measurements.

Sometimes your model contains multiple functions. You can also minimize for all functions using this approach:

  • Define the functions that you would like to minimize, A(p0), B(p1), …;
    their cumulative parameters will be a tuple (p0, p1, …).
  • Define the function to be minimized as f(x0), where x0 is expanded to the parameter tuple.
  • The function f returns a vector of differences between the discrete measured samples and the individual functions A, B etc.
  • Let SciPy minimize this function, starting with a reasonably selected initial parameter vector.

This is an example implementation:


import math
import scipy.optimize

# Measurements: each entry maps x to the measured samples of [ C, B, A ].
measured = {
    1: [ 0, 0.02735, 0.47265 ],
    6: [ 0.0041, 0.09335, 0.40255 ],
    10: [ 0.0133, 0.14555, 0.34115 ],
    20: [ 0.0361, 0.205, 0.2589 ],
    30: [ 0.06345, 0.23425, 0.20225 ],
    60: [ 0.132, 0.25395, 0.114 ],
    90: [ 0.2046, 0.23445, 0.06095 ],
    120: [ 0.2429, 0.20815, 0.04895 ],
    180: [ 0.31755, 0.1618, 0.02065 ],
    240: [ 0.3648, 0.121, 0.0142 ],
    315: [ 0.3992, 0.0989, 0.00195 ]
}

def A( x, a, k ):
    return a * math.exp( -x * k )

def B( x, a, k, l ):
    return k * a / ( l - k ) * ( math.exp( -k * x ) - math.exp( -l * x ) )

def C( x, a, k, l ):
    return a * ( 1 - l / ( l - k ) * math.exp( -x * k ) + k / ( l - k ) * math.exp( -x * l ) )

def f( x0 ):
    # Residual vector: model minus measurement for all three functions at every x.
    a, k, l = x0
    error = []
    for x in measured:
        error += [ C( x, a, k, l ) - measured[ x ][ 0 ],
                   B( x, a, k, l ) - measured[ x ][ 1 ],
                   A( x, a, k ) - measured[ x ][ 2 ]
                 ]
    return error

def main():
    x0 = ( 0.46, 0.01, 0.001 ) # initial parameters for a, k and l
    x, cov, infodict, mesg, ier = scipy.optimize.leastsq( f, x0, full_output = True, epsfcn = 1.0e-2 )
    print( x )

if __name__ == "__main__":
    main()

SciPy returns a lot more information than just the final parameters; see the leastsq documentation for details. You may also want to tweak epsfcn for a better fit. This depends on your functions' shape and properties.
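
To make the extra return values a bit more tangible, here is a small sketch that fits two curves sharing the parameters a and k and then inspects what leastsq hands back. The data is synthetic and noise-free, and the names residuals, decay_samples and growth_samples are made up for this illustration; it assumes NumPy is installed alongside SciPy.

```python
import numpy as np
import scipy.optimize

# Synthetic, noise-free samples of two curves that share a and k (assumed values).
xs = np.linspace(0.0, 10.0, 20)
TRUE_A, TRUE_K = 2.0, 0.3
decay_samples = TRUE_A * np.exp(-TRUE_K * xs)
growth_samples = TRUE_A * (1.0 - np.exp(-TRUE_K * xs))

def residuals(params):
    """Stack the differences of both model functions into one vector."""
    a, k = params
    return np.concatenate([
        a * np.exp(-k * xs) - decay_samples,
        a * (1.0 - np.exp(-k * xs)) - growth_samples,
    ])

fit, cov, infodict, mesg, ier = scipy.optimize.leastsq(
    residuals, (1.0, 0.1), full_output=True)

# According to the SciPy documentation, ier values 1 to 4 mean a solution was found.
converged = ier in (1, 2, 3, 4)
print("parameters:", fit)
print("converged:", converged, "-", mesg)
print("function evaluations:", infodict["nfev"])
print("remaining sum of squares:", float(np.sum(infodict["fvec"] ** 2)))
```

Checking ier and mesg before trusting the parameters is cheap and catches fits that stopped early.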

How to convert a Python list into a string in a strange way

Given a list in Python, suppose you wanted a string representation of that list. Easy enough:


str( [ 1, 2, 3 ] )

However, suppose you did not want the standard notation, but would rather apply some function to each element or simply do away with the brackets and commas. In that case, you can use a generator expression (a close relative of the list comprehension), the join function and an empty string literal:


''.join( str( i ) for i in [ 1, 2, 3 ] )

I already knew list comprehensions, but using one in this scenario, with a string literal as the object, was new to me. Anyway, using such code is probably a bad idea, since it might be hard to read! At the very least, one should stow it away in a function with a fitting name, such as jointStringRepresentationOfListItems( list ). But really, I am not even sure what I would use that for…

Update: even better is this:


','.join( map( str, [ 1, 2, 3 ] ) )
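
For reference, this short sketch shows the three variants from this post side by side, so the difference in output is easy to see. The variable names are made up for this illustration.

```python
numbers = [1, 2, 3]

standard = str(numbers)                     # the built-in list notation
squashed = ''.join(str(i) for i in numbers) # no brackets, no separators
separated = ','.join(map(str, numbers))     # commas, but no brackets or spaces

print(standard)   # [1, 2, 3]
print(squashed)   # 123
print(separated)  # 1,2,3
```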

Saving matplotlib plots as PDF

I recently started using matplotlib together with PyQt, and I love it. It’s awesome and has many more features than PyQwt. However, I needed to save the plots to a PDF file. Here is how you do that:

import matplotlib.backends.backend_pdf
...
@QtCore.pyqtSlot()
def printPlots(self):
    # An empty file name means that the dialog was cancelled.
    filename,_ = QtGui.QFileDialog.getSaveFileName(self, "Save plots as PDF file", "", "Portable Document File (*.pdf)")
    if filename == "":
        return
    pp = matplotlib.backends.backend_pdf.PdfPages(filename)
    pp.savefig(self.plotFigure)
    pp.close()

This assumes that your figure is called self.plotFigure. You can connect the above slot to a QAction and map it to some nice menu item or shortcut.
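
If you want to try PdfPages without any Qt code around it, a minimal standalone sketch could look like this. It assumes the off-screen Agg backend is acceptable, writes to a temporary directory, and uses the made-up helper name save_figures_as_pdf; each figure ends up on its own page.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen; no GUI toolkit needed for this sketch
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

def save_figures_as_pdf(figures, filename):
    """Write each figure to its own page of a single PDF file."""
    # The with block closes the file even if savefig raises.
    with PdfPages(filename) as pdf:
        for figure in figures:
            pdf.savefig(figure)

first, axes = plt.subplots()
axes.plot([0, 1, 2], [0, 1, 4])
second, axes = plt.subplots()
axes.plot([0, 1, 2], [4, 1, 0])

pdf_path = os.path.join(tempfile.mkdtemp(), "plots.pdf")
save_figures_as_pdf([first, second], pdf_path)
print("Wrote", pdf_path)
```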

Is “not None” maybe “Something”?

I just had a funny thought. In Python you can write:

if not someObject is None:
    someObject.doSomething()
else:
    print("someObject is None!")

This reads a bit strange. So what if you could alias “not … is None” to “… is Something”?

if someObject is Something:
    someObject.doSomething()
else:
    print("someObject is None!")

It seems that this idea was thought of almost eight years ago already. This led to PEP 0326, which got rejected. If Python were a macro or functional language, you could probably hack something up to do the same thing, but it does not work like that:

>>> Something = not None
>>> Something
True
>>> A = [1,2,3]
>>> if A is Something:
...     print("This is something")
... else:
...     print("This is nothing")
...
This is nothing
>>>

The problem here is that “not None” is immediately evaluated to “True”, since “None” is implicitly converted to “False” in a boolean context. “A is Something” then merely checks whether A is the object “True”, which it is not. It was a funny thought, though.

Update: Turns out you can at least write “if someObject is not None:”, which is more readable.
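
A quick sketch to convince yourself that both spellings behave the same, and that empty or zero-like objects still count as “something”. The function name describe is made up for this illustration.

```python
def describe(some_object):
    """Report whether some_object is anything other than None."""
    if some_object is not None:
        return "something"
    return "nothing"

for candidate in ([1, 2, 3], [], 0, None):
    # "not x is None" parses as "not (x is None)", so both spellings agree.
    assert (not candidate is None) == (candidate is not None)
    print(repr(candidate), "->", describe(candidate))
```

Note that this is different from a plain “if someObject:”, which would also treat the empty list and 0 as nothing.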

Python class attributes versus instance attributes

Today I finally found out the difference between class and instance attributes in Python. In C++, a class attribute is declared by putting the static modifier in front of the declaration. Consider the following code:

#!/usr/bin/env python

class B:
    b = 2

class C:
    a = B() # class attribute: one B instance shared by all instances of C

    def __init__(self):
        self.a.b = 1

c = C()
c.a.b = 3
b = C()

print(c.a.b, b.a.b)

Here, a is a class attribute of class C. That is, there exists only one such attribute for all objects of type C. So the output will be “1 1”. This is the same as a static attribute in C++. But often I want an instance attribute. The correct way to do this would have been:

#!/usr/bin/env python

class B:
    b = 2

class C:
    def __init__(self):
        self.a = B() # instance attribute: every C gets its own B
        self.a.b = 1

c = C()
c.a.b = 3
b = C()

print(c.a.b, b.a.b)

Now the output is “3 1”, just as expected. I guess this all somehow makes sense in the Python world, but I tripped over this, and worst of all: Sometimes you don’t even notice. If the class attribute is a simple type, like int, the first solution would have worked. However, I have not yet understood why that is the case. One more Python semantic that eluded me so far.
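
A likely explanation for the int case, sketched below: an assignment such as self.counter = 1 binds a new attribute on the instance, which then shadows the class attribute. In contrast, self.a.b = 1 first looks up the shared object a and then modifies it, and the same goes for calling append on a class-level list. The class name Shared and its attributes are made up for this illustration.

```python
class Shared:
    counter = 0   # immutable int stored on the class
    items = []    # mutable list stored on the class

    def __init__(self):
        # Assignment binds a new attribute on this instance only; it shadows
        # the class attribute instead of changing it.
        self.counter = 1
        # No assignment here: the lookup finds the class-level list and
        # append() mutates that single shared object.
        self.items.append(1)

first = Shared()
first.counter = 3
second = Shared()

print(first.counter, second.counter)   # 3 1
print(Shared.counter)                  # 0
print(first.items is second.items)     # True
print(Shared.items)                    # [1, 1]
```

So the int only appears to work because it is always replaced by assignment and never modified in place.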

MapReduce on CUDA using Python (and a Discontinuous Galerkin solver!)

As volcore pointed out, there is a library called PyCUDA, which allows for CUDA programming in Python. It also comes with a nice ReductionKernel class, which allows one to rapidly develop custom MapReduce kernels.

Update: Even better, the same author has published a Discontinuous Galerkin solver, based on the same stuff. This can be used to solve partial differential equations, e.g. for fluid simulations, but also for EM simulations, using Maxwell’s equations.

Finally: Rotations of real Spherical Harmonics according to Blanco et al.

I finally managed to implement Blanco’s 1997 paper. The formulas were quite tedious to implement correctly, with plenty of opportunities to get the matrix indices wrong. But after lots of debugging, I now have a working implementation in my SH explorer. Have a look:

This is an antenna pattern by Tiiti, which was rotated by 45, 90 and 45 degrees, in ZYZ angles. And it did not explode or deform badly. 🙂