Real Ultimate Programming

The Home for People Who Like to Flip Out and Write Code

Notes From PyATL 2011-09-08

Welcome to Python (Brandon Rhodes)

Sweet. I just learned about fileinput.
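
For anyone else who hadn't run into it: fileinput lets you loop over the lines of several files as if they were one stream, while still telling you which file and line you're on. Here's a minimal sketch (not from the talk; the helper name numbered_lines is made up for illustration).

```python
import fileinput

def numbered_lines(paths):
    """Yield (filename, line number within that file, text) for each line in paths."""
    # Note: with an empty list of paths, FileInput falls back to sys.argv[1:] or stdin.
    with fileinput.FileInput(files=paths) as stream:
        for line in stream:
            # filename() and filelineno() describe the line that was just read.
            yield stream.filename(), stream.filelineno(), line.rstrip("\n")
```

Calling list(numbered_lines(["a.txt", "b.txt"])) would walk both files in order, with the line counter restarting at 1 for each file.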

OpenOPC for Python (John Pilman)

OPC is OLE for Process Control, and it serves as the high-level communications method for lots of control systems.

There is a newer, non-DCOM-based spec called OPC UA, but it isn’t (yet) widely supported.

OpenOPC is an Open Source library for interacting with OPC from Python.

On a side note: drag-and-drop programming is real. This control system he’s working on is absolutely drag-and-drop programming. It’s just not general purpose programming.

I’m having a bit of a hard time following this one; my bad angle to the screen and my lack of knowledge in the domain are a bad combination.

This seems like a huge win for anybody who needs to start munging about with multiple pieces in an OPC environment.

Fast, Lightweight Testing for Python (Shawn Boyette)

Why nanotest-py? Because he didn’t like xUnit, and he likes to write tools that do exactly what he wants.

It’s a port of nanotest.js, which is itself inspired by Test::More.

pis_deeply sort of intrigues me. I will have to go read the code for that and see how it is implemented.

I think I might prefer the stuff in the standard library and/or nose, but I think it made it to my list of things to use on a side project.

This is only available for Python 3.2 right now, because he used the new argparse module in the standard lib.

Streamlining Workflows with Puppet Faces (Kelsey Hightower)

I love me some Puppet. “Infrastructure as Code” is a great concept.

Does he have a Puppet script for Oracle‽ Must. Have.

Puppet Faces

Puppet Faces is essentially a giant set of APIs to manipulate Puppet.

Wow. Their stuff for introspecting what’s already on a system and giving you the equivalent Puppet module/whatever is pretty awesome.

The main point of Puppet Faces is to cut out boilerplate and improve facility with ad hoc configuration.

Hmm… this is the sort of thing that I can imagine would be useful, but it’s so far outside what I do on a daily basis that I don’t have an immediate feel for it.

Pretty cool. The jira subcommand is available as a default part of Puppet once you finish building the Face.

I guess if your JIRA users are technically savvy, this would be super useful, but do you need to protect yourself against the new guy?

You get a REST API, the Ruby API, and a CLI for your Face.

Put on Your Data Goggles (Brandon Rhodes)

I like how I take very few notes in Brandon’s talks, because he just makes it feel so intuitive you feel stupid writing something down.

Book Review: Python 3 Web Development Beginner’s Guide

I recently received a free review copy (eBook version) of Python 3 Web Development Beginner’s Guide from Packt Publishing. I was looking forward to this book, because I haven’t really done much Python 3 work yet, and I wanted to see how it could make my life as a web developer better. However, the book wasn’t what I expected. Instead of covering the basics of web development and how Python 3 applies, it is more of an introduction to the sorts of concerns that come up when you build a web framework on top of CherryPy. The sample code just happens to be in Python 3.

The Good

The two best parts of the book, to me, were the coverage of writing a jQuery plugin, and growing an ORM that uses metaclasses to provide a compact, readable way to define the models.

The Bad

I have a rather long list of things I didn’t like about the book, some of which are a function of the title setting misleading expectations, and some of which I think are just problematic in general.

In general, I didn’t care for the examples. Some of this is personal preference: I find that many people (myself included) learn better when they must type in the examples instead of opening up the code and reading through a completed solution. While the book sometimes indicated that something had been left as an exercise to the reader, opening up the sample code showed that the exercise actually had not been left to the reader. This mismatch between what the text of the book says will be in the sample code and what is actually in the sample code occurs in multiple places throughout the book, and gives a sense that the book was sloppily edited.

I also felt the examples in general were too complicated. It’s fine to build up a complicated example over the course of a book, but instead we got a task list, a wiki, a Customer Relationship Management (CRM) tool, a spreadsheet, and more. That’s an awful lot to distract you from the beginner’s principles that you would expect in a book with this title.

I also didn’t care for many of the shortcuts taken in the book. In most instances, the book did acknowledge that the approach taken was not appropriate in the real world, but then proceeded with little or no justification for why it was done the way it was. The two examples that really leap out in this category are the password hashing scheme and the failure to use a template engine.

When the book first introduces authentication, it explains that you should never store passwords in plaintext. This is absolutely correct, but the book then goes on to demonstrate a completely insecure password hashing scheme: UNSALTED SHA1. The author only provides a cursory link to explain what you should actually be doing. In this day and age, demonstrating anything less than a bcrypt-based solution is wrong. Read Enough With The Rainbow Tables and How To Safely Store A Password for a far better explanation than I can provide. There’s really no excuse for this: the added complexity of using py-bcrypt instead of writing your own (insecure) SHA1-based solution is trivial at worst; there’s a strong case to be made that it would actually be simpler.
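
To make the complaint concrete, here is a rough sketch using only the standard library. Unsalted SHA1 gives every user with the same password the same hash (which is what makes rainbow tables work), while a random per-user salt plus a deliberately slow key derivation function does not. I'm using hashlib's PBKDF2 as a stand-in for bcrypt purely so the example runs without third-party packages; the function names and the iteration count are my own assumptions, not the book's code.

```python
import hashlib
import hmac
import os

def unsalted_sha1(password):
    """The scheme criticized above: fast, and identical for identical passwords."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

def hash_password(password, iterations=200_000):
    """Return (salt, derived key) using a random per-user salt and a slow KDF."""
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, key

def check_password(password, salt, key, iterations=200_000):
    """Recompute the key with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, key)
```

Two users who both pick the same password end up with different stored values from hash_password, which is exactly the property the unsalted version lacks.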

The failure to use a template engine (also a weakness acknowledged by the book) really makes the code harder to follow than it should be. Virtually any serious web development effort is going to take advantage of a template engine, and for good reason. This code gives me flashbacks to my days of writing Java servlets before the advent of JSP, and I saw where one other reviewer invoked the specter of PHP. The fact that this style of coding draws such comparisons should give you an idea of just how unpythonic it is. I would be sympathetic to claims of not wanting to add too many external dependencies if the book did not already rely significantly on the magic of jQuery UI.

My last major complaint is simply one of focus: the book spends substantial amounts of time growing an ORM and teaching Python metaclasses (and doing a good job of it), but spends little more than the bare minimum required on CherryPy (which is at the core of the code), and essentially none on understanding HTTP. In fact, the few times HTTP comes up are usually in relation to GET vs. POST, where the decision is usually made based on inane implementation details, such as whether request arguments are logged by default, instead of HTTP fundamentals such as idempotency, safety, or cacheability (although caching is mentioned elsewhere, in the context of how to prevent it). Also, the book does mention security, but it does not give it the sort of omnipresent emphasis that is necessary to write good web applications, given the hostile nature of the domain. XSS, CSRF, and SQL injection attacks all deserve much more attention than they were given.

The Summary

The book has some good content mixed in with the stuff I didn’t like. Unfortunately, the good content is rarely specific to web development. For example, the chapter that uses metaclasses to clean up the ORM is one of the better resources on metaclasses that I’ve seen, but metaclasses are clearly not specific to web development. Furthermore, the impression of sloppy editing makes it hard to put as much faith in the content as it probably deserves. Given these flaws, I really don’t think I’d recommend this book to a friend who was looking to get started with web development.

Back to flipping out…

Notes From PyATL 2011-08-11

News and Notes

Wow. Brandon is moving to Ohio in December.

It’s like he read my mind, because he’s talking about a PyGeorgia sort of event.

Beginning Python (Brandon Rhodes)

Clever way to associate county names with locations: use Inkscape to label the counties on the map and save it as SVG, then parse it with lxml.
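
A rough sketch of that trick, reconstructed from memory rather than copied from Brandon's code. I'm using the standard library's xml.etree.ElementTree instead of lxml so it runs anywhere (lxml's etree API is similar), and the tiny SVG snippet and county labels are invented stand-ins for what Inkscape would actually save. A real file may also have transforms on groups that this ignores.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Made-up stand-in for an Inkscape-saved map with text labels on it.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <text x="120.5" y="80"><tspan>Fulton</tspan></text>
  <text x="150" y="95.25"><tspan>DeKalb</tspan></text>
</svg>"""

def label_positions(svg_text):
    """Map each text label in the SVG to the (x, y) where it was placed."""
    root = ET.fromstring(svg_text)
    positions = {}
    for node in root.iter(SVG_NS + "text"):
        # Inkscape wraps label text in <tspan> children, so gather all nested text.
        name = "".join(node.itertext()).strip()
        positions[name] = (float(node.get("x")), float(node.get("y")))
    return positions
```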

I wonder if Brandon is really this paranoid, or if he’s exaggerating for effect.

I think he’s using an Emacs version of Solarized.

Clever Hack: when using print debugging, you can drop your value of interest into a list so you can tell the difference between strings and numbers, e.g., print [x] will give you ['123'] when x = '123' and [123] when x = 123.
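
In Python 3 syntax the same trick looks like this. It's a tiny sketch: wrapping the value in a list makes print show the value's repr, quotes and all. The helper name debug_view is just for illustration.

```python
def debug_view(x):
    """Return what print([x]) would show: the value's repr inside brackets."""
    return str([x])

print(debug_view("123"))  # ['123']
print(debug_view(123))    # [123]
```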

Fighting in Unfamiliar Territory (Toby Ho) (sp?)

Why crack open the code to debug a third-party module?

  • Because you can!
  • The code doesn’t lie (even if it is confusing).
  • It can be faster.
  • It’s good practice (it’s harder to read code than to write it).
  • You want to be awesome and fix the bugs and submit them back.

Yet another reason to love pip: it installs the source, so it’s easy to dig into what’s going on.

squinting at Objects in Python (Brandon Rhodes)

squint is Brandon’s library for inspecting Python objects without triggering @property logic, etc. He wrote it after getting frustrated with things like ORM classes changing when he tried to inspect them.
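
I don't have squint's API in my notes, so this is not it. As a standard-library illustration of the same idea, inspect.getattr_static (added in Python 3.2) looks up an attribute without running descriptor code such as @property. The Record class below is hypothetical.

```python
import inspect

class Record:
    """Hypothetical ORM-ish class whose property has a side effect."""

    def __init__(self):
        self.loads = 0

    @property
    def data(self):
        self.loads += 1  # imagine a database query happening here
        return {"id": 1}

record = Record()

# Static lookup returns the property object itself; the getter never runs.
static_value = inspect.getattr_static(record, "data")

# Normal attribute access runs the getter and changes the object.
live_value = record.data
```

After both lookups, record.loads is 1: only the ordinary attribute access touched the object.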

I bet this thing was loads of fun to write. I can definitely see how it would be useful, though.

Back to flipping out…

Notes From PyATL 2011-07-14

Python for Data Mining

Ad-hoc analysis typically requires 3 layers:

  • Data Extraction (SQL or a query builder)
  • Transformation & Analysis (scripting language)
  • Presentation (Excel, PowerPoint, Access (I wonder what this is for?))

Python is a great fit for the middle layer. This is for all the usual reasons: succinct, expressive, std lib is big, PyPI is bigger, readable. There is also another one: easy access to high-speed options: JIT, Cython, Numpy, etc.

PROTIP: Take snapshots from time-to-time, because the data will change on you.

One reason Python is a win is because you can do the analysis on your desktop, and IT is touchy about giving people rights to run PL/SQL on the DB.

You do a lot of templating on the queries, because they’re so verbose and repetitive.

The standard library will get you a lot farther than you think; you don’t always need to jump straight to Numpy.

YES! itertools FTW, baby.

Wow, the first example really demonstrates how dense your code can get with list comprehensions and other standard library stuff. The dataset initialization was doing an awful lot in essentially one line.
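
I didn't copy down the speaker's code, but here is a sketch of the kind of dense, standard-library-only code I mean: itertools.groupby plus a comprehension to total a column per group. The rows and field layout are invented.

```python
from itertools import groupby
from operator import itemgetter

# Invented sample rows: (region, product, units sold).
rows = [
    ("east", "widget", 10),
    ("west", "widget", 7),
    ("east", "gadget", 3),
    ("west", "gadget", 5),
]

def units_by_region(rows):
    """Total units per region; groupby needs its input sorted by the grouping key."""
    by_region = sorted(rows, key=itemgetter(0))
    return {
        region: sum(units for _, _, units in group)
        for region, group in groupby(by_region, key=itemgetter(0))
    }
```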

Numpy

The classic Numpy array requires you to define your datatypes, and you only get one in a given array.

A structured array lets you name columns and have a different datatype for each column.
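
A small sketch of the difference, assuming NumPy is installed; the field names and values are made up.

```python
import numpy as np

# A classic array holds exactly one dtype.
prices = np.array([9.99, 14.50, 3.25], dtype=np.float64)

# A structured array names its columns and gives each its own dtype.
sales = np.array(
    [("widget", 10, 9.99), ("gadget", 3, 14.50)],
    dtype=[("product", "U10"), ("units", "i4"), ("price", "f8")],
)

# Columns are selected by name, and arithmetic works column-wise.
revenue = sales["units"] * sales["price"]
```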

Manipulating Data: Matplotlib

As the heading indicates, you can do more than just plot things with it; you can do some serious manipulation, too.

Rapid, Scalable Web Development with MongoDB, Ming, and Python

FossFor.us (the SourceForge black ops project to be more web 2.0) was built on CouchDB.

It didn’t scale the way they needed for SF.net; MongoDB came into play because of that.

TIL: Documents in MongoDB are limited to 4MB (now 16MB). SF.net had to rethink their initial design because of this fact.

Ming

They eventually decided they needed an “Object-Document Mapper” (hehe, they still call it an ORM): Enter Ming.

Ming allows them to define their schema and enforce it.

They also handle migrations with Ming, and they can be eager or lazy.

They have the concept of a “unit of work”, which basically allows them to log all the updates against an object, then distill down into a single update (or close to it). This can be especially handy because you don’t have multi-statement transactions in MongoDB.
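
To illustrate the idea only (this is not Ming's API; the class and method names are hypothetical): record field changes in memory, let later changes overwrite earlier ones, then flush one MongoDB-style $set update per document.

```python
class UnitOfWork:
    """Toy unit of work: collects field changes and flushes one update per document."""

    def __init__(self):
        self._pending = {}  # document id -> {field: latest value}

    def set(self, doc_id, field, value):
        # Later writes to the same field simply replace earlier ones.
        self._pending.setdefault(doc_id, {})[field] = value

    def flush(self):
        """Return one ({'_id': ...}, {'$set': ...}) pair per touched document, then reset."""
        updates = [
            ({"_id": doc_id}, {"$set": dict(fields)})
            for doc_id, fields in self._pending.items()
        ]
        self._pending.clear()
        return updates
```

A real implementation would hand each (filter, update) pair to the database driver instead of returning it, but the distilling step is the same.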

You can drop out of Ming if you need to, to get better performance.

Allura

SF.net is trying to give back to the Open Source community with Allura. It’s essentially their codebase.

Zarkov

Zarkov is an asynchronous TCP server for event logging with gevent, which they built on top of Ming.

Procedures, Objects, Reusability: httplib and its discontents

This is a deep-dive into how Python’s SimpleHTTPServer handles HTTP requests, which is a build-up to a rant about how the HTTP parsing code is not reusable because it’s attached to a class that is designed to be extended, not used as a utility module.
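
To give a flavor of the complaint, here's the sort of contortion it takes to reuse the standard library's request parsing on its own. This is a commonly seen workaround, written against Python 3's http.server rather than the Python 2 modules from the talk, and it is not Brandon's code; the class name ParsedRequest is mine.

```python
from http.server import BaseHTTPRequestHandler
from io import BytesIO

class ParsedRequest(BaseHTTPRequestHandler):
    """Subclass the handler just to borrow parse_request(), skipping its socket setup."""

    def __init__(self, raw_bytes):
        # Deliberately do NOT call the parent __init__, which expects a live socket.
        self.rfile = BytesIO(raw_bytes)
        self.raw_requestline = self.rfile.readline()
        self.error_code = None
        self.parse_request()

    def send_error(self, code, message=None, explain=None):
        # The parser reports problems by trying to write a response; capture it instead.
        self.error_code = code

request = ParsedRequest(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n")
```

After this, request.command, request.path, and request.headers hold the parsed pieces. Having to fake a handler object to get them is pretty much the point of the rant.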

OK, wow, Brandon is stubborn. He got medieval on the standard lib. Also, he is some kind of riled up about all the hoops he jumped through.

It’s entirely possible Brandon might not ever write another class again.

Back to flipping out…

Notes From PyATL 2011-06-09

The State of Python Packaging (Kelsey Hightower)

  • I just won $20 by naming off pretty much all of the Python packaging tools he mentioned.
  • I wonder how nice bento is. I’ve heard about it before, and since it’s from the guys behind NumPy it probably has an interesting take on things.
  • PyPM sounds like the bee’s knees for Windows.
  • Once again, this guy brings the heat for his presentation.
  • Holy cow! There are a lot of PEPs driving this: PEP-345 (Metadata for Python Software Packages 1.2), PEP-376 (database of installed Python distributions), PEP-386 (changing the version comparison in distutils), PEP-381 (mirroring for PyPI), PEP-390 (static metadata), PEP-396 (module version numbers).
  • pysetup is going to be a more end-user-oriented alternative to pip, and it will be in the std lib.

Talk about names and values in Python (Brandon Rhodes)

  • The material is mostly review, but is well presented (no surprise there, since Brandon is an excellent presenter).
  • The point about function/class definitions creating a name is well-taken.

Exceptions! in Python (Sim Harbert)

Back to flipping out…

Notes From PyATL 2011-04-14

Introducing subprocessing

  • Unsurprisingly, it’s easier to just use a lot of shell utilities rather than the portable alternatives, but it kind of defeats a common purpose of doing it in Python.

Closures in Python

  • This guy is a good presenter.
  • The nonlocal keyword is new to Python 3, and I can see how it would be useful for the FP advocates that always complain about Python (it allows you to mutate function-scoped variables that have been captured by your closure).
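
A minimal sketch of what nonlocal buys you (my own example, not the presenter's): the inner function rebinds a variable that lives in the enclosing function's scope.

```python
def make_counter():
    """Return a function that remembers how many times it has been called."""
    count = 0

    def increment():
        nonlocal count  # rebind the enclosing function's variable, not a new local
        count += 1
        return count

    return increment

counter = make_counter()
```

Without the nonlocal line, count += 1 would raise UnboundLocalError, because the assignment would make count a local variable of increment.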

Public Service Announcement from WRFG

  • This sounds like Jacob’s cup of tea.
  • They’re basically looking for tech volunteers.
  • 89.3 FM ?

Faster Database Access: An End-Run Around the Django ORM

  • A Brief History of Databases
  • Is this going to be about the new (relatively) raw SQL feature?
  • Looks like it might be about select_related. Oops, the example misled me: select_related didn’t fit Brandon’s actual problem.
  • Aha! It is about .raw!

Back to flipping out…

Note to Self: Source Django’s Bash Completion Automatically

If you’re using Doug Hellmann’s awesome virtualenvwrapper to manage your Django projects (and you really should be), try adding the following line to your $VIRTUAL_ENV/bin/postactivate script:

source "$VIRTUAL_ENV/build/Django/extras/django_bash_completion"

If you used the --no-site-packages option to create the virtualenv, that should automatically source the Django bash completion script every time you workon into your project. If you didn’t, you just need to figure out where the Django bash completion script is squirreled away on your system and use that path instead. BTW, --no-site-packages should really be the default.

UPDATE: And now --no-site-packages IS the default. Yay!

Back to flipping out…