Real Ultimate Programming

The Home for People Who Like to Flip Out and Write Code

Review of Expert Python Programming, Part One

As I mentioned, I got started with the PyAtl Book Club at the May meeting. The first book I received was Expert Python Programming, by Tarek Ziadé. Part of the deal with the book club is posting a review of the book, so I will be posting about the book as I work my way through it.

Chapter 1

This is basic setup info (though the vim section has a nice selection of config settings for those new to it), so I won’t really cover it; I already have multiple versions of Python running on my dev box.

Chapter 2

This chapter is entitled “Syntax Best Practices—Below the Class Level”, and that’s a pretty accurate description. There’s a short section on List Comprehensions followed by a longer section on Iterators and Generators. This is a fast but detailed look at the subject, including the use of send and coroutines (via the mechanisms defined in PEP 342). The treatment is rather brief, so I recommend David Beazley’s A Curious Course on Coroutines and Concurrency for more detail.

Up next are genexps (defined in PEP 289), which were added in Python 2.4. This section is short and to the point: use genexps wherever you would use a list comprehension unless you need some feature of lists that a generator lacks. The next section is a brief glimpse at itertools. It is hardly complete, but it does highlight one of the (IMO) most interesting pieces of the module: tee. This function makes it practical to use genexps even when you need to make multiple passes across the data.

Up next is the section on decorators (defined in PEP 318), which were introduced with Python 2.4. The author addresses decorators in much more detail than the previous topics, with extended examples of using them for various things such as caching, proxies, and context providers. That last provides a nice segue into the next section, where he shows how to replace context-provider decorators with the with statement (defined in PEP 343), introduced in Python 2.5. The author does a pretty thorough job of explaining what with is doing and how you can use it to good effect, including how to use the contextlib module (introduced at the same time as the with statement) to get context managers for code that doesn’t provide them out of the box.
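
To make that last point concrete, here’s a minimal sketch (mine, not the book’s example) of using contextlib.contextmanager to wrap setup and teardown around a block of code, the same job a context-provider decorator would otherwise do:

    from contextlib import contextmanager
    # In Python 2.5 you also need: from __future__ import with_statement

    @contextmanager
    def logged(label):
        # Everything before the yield runs on entry; everything after it runs
        # on exit, even if the body of the with block raises.
        print "entering %s" % label
        try:
            yield
        finally:
            print "leaving %s" % label

    with logged("expensive call"):
        print "doing the actual work"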

Thoughts

Although it was a bit of a whirlwind at times, the author does a good job of covering the modern language constructs that Python has picked up over the last few versions. You may still want a more detailed tutorial on any given feature, but this chapter gets you up to speed with modern Python quickly.

Back to flipping out…

Note to Self: Clearing Oracle’s Buffer Cache

The next time you’re doing some rough benchmarking and you want to prevent Oracle’s cache from skewing your subsequent runs:

    alter system flush buffer_cache;

Be forewarned: this sometimes takes a while to run, and you need the ALTER SYSTEM privilege.

Back to flipping out…

Notes From PyATL 2009-05-14

What follows is essentially a stream-of-consciousness dump from the PyAtl meeting tonight.



Miscellaneous

They’re recruiting people to do A/V work during PyCon here in Atlanta next year. Does this mean I get into the con for free?

Hot Topics
  1. Community - you can get a lot of benefit from small-scale con-style stuff; look for the upcoming article about PyOhio in Python Magazine
  2. Distutils - Interesting things on the distutils mailing list
  3. IronPython - lots of buzz, but it’s not relevant to me right now
Testing with Nose

Alfredo Deza presenting.

Talking about testing Supay (a daemon module, named after an Incan demon god). Why another daemon module?

  • PEP 3143
  • It’s really a service, not just a daemon
  • At least it’s not a new web framework
  • Start, Stop, Status
  • Spawn children

Originally, testing was done with Pythoscope, which was easy to get started with. He then moved on to nose, but had problems because the things under test were backgrounding themselves, etc. People on the Testing in Python mailing list advised him to test smaller chunks. The coverage package does test-coverage reporting for nose, and it gives you really specific (per-package, etc.) statistics.

The cool thing about nose is that it autodetects your tests.

One early gotcha: you have to invoke nose using nosetests. (Ed.: I might as well alias nosetests to nosetests -v --cover-package=$package_name.) Simply add --with-coverage to get the coverage report.
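
For reference, here’s a minimal sketch of the kind of thing nose picks up on its own (the file and function names are hypothetical, not Supay’s actual tests):

    # test_supay_status.py -- nose finds this module and function by the test_ prefix
    def test_status_reports_running():
        # A plain function with a bare assert is enough; no unittest.TestCase needed.
        status = "running"   # stand-in for a real call into the daemon
        assert status == "running"

    # Run with:  nosetests -v --with-coverage --cover-package=supay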

Cool, Pythoscope generates test stubs for you. I need me some of this.

Intermission

There was some A/V stuff going on here, so they took this opportunity to hand out books for review. I scored Expert Python Programming.

Choosing a Testing Framework

Brandon Rhodes presenting.

Intro

At first, he thought testing would help most with getting the code right in the first place. Once he started testing, he discovered that it actually helped most with finding regressions. Much like undo, he now can’t imagine getting by without it. One thing to note: installed Python packages rarely ship with their tests, so if you want to muck around in them, you typically need to download the tarball yourself.

What does Python testing look like without a framework?

simplejson

Uses setuptools, so you can just use the built-in test command. You can use an additional_tests function to build a test suite for setuptools. setuptools will not autodetect doctest tests. All in all, this is a pretty well-behaved module.
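
As a rough sketch (mine, not simplejson’s actual code), an additional_tests function just returns a unittest suite, and you add the doctest-based tests to it by hand since setuptools won’t find those on its own; the module names below are hypothetical:

    import doctest
    import unittest

    def additional_tests():
        # setup.py points at this via test_suite="mypackage.tests.additional_tests".
        suite = unittest.TestSuite()
        suite.addTest(doctest.DocTestSuite("mypackage.core"))
        suite.addTest(doctest.DocFileSuite("README.txt", optionflags=doctest.ELLIPSIS))
        return suite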

Sidenote: virtualenv creates a small, local Python environment that you can use to avoid making system-wide changes. I’ve seen it mentioned before, but this is the first time I’ve seen it in action; it could be seriously cool.

SQLAlchemy

Has a README.unittests (referenced from the README) that tells you exactly how to do things. There is a lot of manual work in keeping track of the tests in each module; obviously, this is begging for someone to forget to add a test to the suite.

Grok

The tests here are pretty idiosyncratic (each test defines a complete web app and puts it through its paces). It uses part of zc.testrunner.

Three testing frameworks

The three Python testing frameworks covered below are zc.testrunner, py.test, and nose.

Questions you should ask of each:

  • How are tests discovered?
  • How can tests be filtered and selected?
  • How fancy are tests when reported?

One benefit of testing frameworks is that they make the testing in your project look like the testing in other projects that use the same framework.

zc.testrunner

Not really covered beyond what we saw with Grok.

py.test

Finds tests based on a naming convention; this is not configurable. Tests are just functions; you don’t need a class with test methods. You can tag tests any way you want. By default it only shows output from print statements when a test fails. It supports distributed and multi-platform test runs, but the documentation is a bit sparse. A separate module turns on autodetection of doctest tests.

nose

Finds tests based on a naming convention; this is configurable. Tests are just functions; you don’t need a class with test methods. You can tag tests any way you want. It has more extensive documentation and seems to have momentum.

It is probably feasible to write your tests so both py.test and nose can find and run them.
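
A minimal sketch of such a framework-agnostic test (names are mine): both tools look for test_*.py modules and test_* functions by default, and both treat a failing assert as a test failure.

    # test_example.py -- discovered by both py.test and nose via the naming convention
    def add(a, b):
        return a + b

    def test_add():
        # A bare assert works as a test in either framework.
        assert add(2, 3) == 5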

Selenium Demo (lightning talk)

Brandon Rhodes presenting.

Not a lot new to me; I’ve already used Selenium at work. The Python client seems to be doing a pretty good job of being Pythonic; the tests use unittest, etc. The basic commands seem pretty similar to those of the Java client I’ve used before.

Google Code and Hg (lightning talk)

Alfredo Deza presenting.

Brief compare/contrast between hg and svn. I’m already familiar with both so I won’t recap this part of the presentation.

Traditionally, svn has been the standard for Google Code. They’re giving Mercurial invites to anyone attending the Google I/O conference (next week?). They’re also accepting applications from existing projects, and they will move those over with their history intact. Not all features are available for Mercurial projects yet, e.g., code browsing.

Coolest feature: hg serve. It sets up a web interface to your entire repo on localhost:8000. It’s pretty sophisticated, and you can pull from it (and push, if you enable that in its configuration).

And that concluded the evening.


Back to flipping out…

Project Euler: Problem 17

I recently finished up Problem 17, and since the problem was fairly straightforward, I used it as an opportunity to explore some of Python’s language features that I’ve been meaning to look into.

    # problem17.py
    """
    Find the solution to `Problem 17`_ at `Project Euler`_.
    
    .. _Problem 17: http://projecteuler.net/index.php?section=problems&id=17
    .. _Project Euler: http://projecteuler.net/
    """
    __docformat__ = "restructuredtext en"

    import re

    _NUMBER_NAMES = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
                    6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
                    11: "eleven", 12: "twelve", 13: "thirteen", 14: "fourteen",
                    15: "fifteen", 16: "sixteen", 17: "seventeen",
                    18: "eighteen", 19: "nineteen", 20: "twenty",
                    21: "twenty-one", 22: "twenty-two", 23: "twenty-three",
                    24: "twenty-four", 25: "twenty-five", 26: "twenty-six",
                    27: "twenty-seven", 28: "twenty-eight", 29: "twenty-nine",
                    30: "thirty", 31: "thirty-one", 32: "thirty-two",
                    33: "thirty-three", 34: "thirty-four", 35: "thirty-five",
                    36: "thirty-six", 37: "thirty-seven", 38: "thirty-eight",
                    39: "thirty-nine", 40: "forty", 41: "forty-one",
                    42: "forty-two", 43: "forty-three", 44: "forty-four",
                    45: "forty-five", 46: "forty-six", 47: "forty-seven",
                    48: "forty-eight", 49: "forty-nine", 50: "fifty",
                    51: "fifty-one", 52: "fifty-two", 53: "fifty-three",
                    54: "fifty-four", 55: "fifty-five", 56: "fifty-six",
                    57: "fifty-seven", 58: "fifty-eight", 59: "fifty-nine",
                    60: "sixty", 61: "sixty-one", 62: "sixty-two",
                    63: "sixty-three", 64: "sixty-four", 65: "sixty-five",
                    66: "sixty-six", 67: "sixty-seven", 68: "sixty-eight",
                    69: "sixty-nine", 70: "seventy", 71: "seventy-one",
                    72: "seventy-two", 73: "seventy-three", 74: "seventy-four",
                    75: "seventy-five", 76: "seventy-six", 77: "seventy-seven",
                    78: "seventy-eight", 79: "seventy-nine", 80: "eighty",
                    81: "eighty-one", 82: "eighty-two", 83: "eighty-three",
                    84: "eighty-four", 85: "eighty-five", 86: "eighty-six",
                    87: "eighty-seven", 88: "eighty-eight", 89: "eighty-nine",
                    90: "ninety", 91: "ninety-one", 92: "ninety-two",
                    93: "ninety-three", 94: "ninety-four", 95: "ninety-five",
                    96: "ninety-six", 97: "ninety-seven", 98: "ninety-eight",
                    99: "ninety-nine"}

    _CHARACTERS_WE_CARE_ABOUT = re.compile(r"\w")

    def _words_from_num(num):
        """
        Convert ``num`` to its (British) English phrase equivalent.
    
        If ``num`` is greater than 9,999 then raise an ``Exception``.
        >>> _words_from_num(115)
        'one hundred and fifteen'
        """
        if num >= 10000:
            raise Exception("This function only supports numbers less than 10000.")

        parts_list = []
        if num >= 1000:
            thousands = num // 1000
            parts_list.append(_NUMBER_NAMES[thousands])
            parts_list.append(" thousand")
            num -= thousands * 1000
        if num >= 100:
            hundreds = num // 100
            parts_list.append(_NUMBER_NAMES[hundreds])
            parts_list.append(" hundred")
            num -= hundreds * 100
        if num:
            if parts_list:
                parts_list.append(" and")
            parts_list.extend([" ", _NUMBER_NAMES[num]])

        return "".join(parts_list)

    def _count_characters_we_care_about(string_to_count):
        """
        Count the characters in ``string_to_count``, excluding things like hyphens and spaces.
    
        >>> _count_characters_we_care_about("one hundred and twenty-three")
        24
        """
        return len(_CHARACTERS_WE_CARE_ABOUT.findall(string_to_count))

    def problem_17(upper_bound=1000):
        """
        Find the solution to `Problem 17`_ at `Project Euler`_.
    
        .. _Problem 17: http://projecteuler.net/index.php?section=problems&id=17
        .. _Project Euler: http://projecteuler.net/
    
        >>> problem_17(2)
        6
        """
        converted_nums = (_words_from_num(num) for num in xrange(1, upper_bound + 1))
        lengths = (_count_characters_we_care_about(phrase) for phrase in converted_nums)
        return sum(lengths)

    if __name__ == '__main__':
        print problem_17()

Generator Expressions

The quick version: They’re like list comprehensions, only lazy. The longer version resides in PEP 289. The verdict: They rock. It didn’t make a big difference in this problem, but in general, it’s nice that you don’t have to allocate enough memory to contain an entire list. One problem with them is that once you’ve consumed an element of the generator expression, you can’t get it back, so they aren’t well-suited for problems where you need to iterate across the data more than once.
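
When you do need more than one pass, the standard library’s itertools.tee can split one generator into several independent iterators; a quick sketch:

    from itertools import tee

    lengths, lengths_again = tee(len(word) for word in ["one", "two", "three"])
    # Each copy can be consumed separately, so two passes are possible without
    # materializing the whole list up front (tee buffers only what it has to).
    print sum(lengths)          # 11
    print max(lengths_again)    # 5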

Regexes

Regexes are hardly unique to Python, but this is the first time I’ve ever used them in Python. One feature I was excited to try, and one that Python pioneered, is named capturing groups, though the regex I used in this example didn’t need them.
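
Just for the record, the named-group syntax looks like this (the pattern is purely illustrative, not part of the solution):

    import re

    # (?P<name>...) captures a group that can be retrieved by name instead of position.
    match = re.match(r"(?P<hundreds>\d)(?P<rest>\d\d)", "115")
    if match:
        print match.group("hundreds")   # 1
        print match.group("rest")       # 15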

Docstrings

I’ve written docstrings before, but this is the first time I’ve tried to generate stand-alone documentation from them. To do the generation, I used epydoc. The original PEP 257 defines docstrings in terms of plaintext, but PEP 287 establishes reStructuredText as a richer alternative. Since I was generating documentation from the docstrings, I decided to go with reStructuredText. To avoid typing --docformat restructuredtext every time I invoked epydoc, I added the following to the module:

    __docformat__ = "restructuredtext en"

Although it took some time to get used to (I’ve gotten pretty set in my Markdown ways), I really like reStructuredText so far. In fact, I liked it enough to write this entire blog post in it and simply post the generated HTML into Blogger.
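
For a standalone reStructuredText document like a blog post (as opposed to docstrings run through epydoc), docutils can produce the HTML directly; a minimal sketch, assuming the post lives in a hypothetical post.rst:

    from docutils.core import publish_parts

    # publish_parts returns a dict of rendered fragments; "body" is the HTML without
    # the surrounding <html>/<head> boilerplate, handy for pasting into Blogger.
    html_body = publish_parts(open("post.rst").read(), writer_name="html")["body"]
    print html_body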

Back to flipping out…