Perl, Python, C++


Python has been on my horizon for a while, and I've written code in it for small projects since 2002 or so.  Until recently it didn't seem to me to have sufficiently deep libraries to warrant making the change from Perl.  And as a survivor of SGML's "ignorable whitespace" and other nightmares, the idea of parsing on whitespace was off-putting.  Whitespace is not actually evil--it is just misunderstood.  To have your compiler depend on something so easily misunderstood seemed to me problematic at best.

Sometime in the past two years, though, Python's capabilities, and especially the libraries, have crossed the threshold into serious development territory, and a couple of current and prospective clients have projects where Python is very clearly the language of choice, whitespace or no.  This created an interesting opportunity.

Predictive Patterns makes very heavy use of code generator technology.  Our code generators are aimed at replacing junior developers:  they write code from high-level specifications that is clean, conforms to our coding standard, and can be edited easily by humans.  This is how all our serialization code is written, for example, using an XML-based technology that is loosely based on the idea of treating the application as a document.

Our code generators are written in Perl, and for some time there has been internal pressure to rewrite them in Python.  The use of Python on active projects for clients makes this a necessity.  It would just be wrong to have a Perl script generating Python code.

The first step in this project has been to convert the existing C++ generators from Perl to Python, which has produced some reflections on the relative strengths of the three languages involved:  C++, Perl, and Python.

Like any serious C++ developer, I give Lakos' Large-Scale C++ Software Design an important place on my bookshelf, not too far from Design Patterns and Effective STL.  When I started coding seriously in Python, one of the most striking things was the complete irrelevance of physical design principles, which are the focus of Lakos' book.  How the code is organized into translation units, how interfaces are used to insulate the underlying implementation... all of this just doesn't much matter, which is both refreshing and a little bit disorienting.

Python's notion of "duck typing" is both refreshing and disorienting too, particularly when it doesn't work.  For example, many types support an index operator: [].  But the semantics of this operator is completely different for two very important classes of type:  containers, and strings. 
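To make the difference concrete, here is a minimal sketch, assuming a current Python 3 interpreter; the variable names are invented for the illustration.  Indexing a list gives back an element, but indexing a string gives back another string, which can itself be indexed again and again.

```python
# Indexing a container hands back one of its elements.
columns = ["id", "name", "created"]
first_column = columns[0]

# Indexing a string hands back another string, of length one...
first_char = first_column[0]

# ...and that one-character string is still indexable, indefinitely.
still_a_string = first_char[0][0][0]

print(type(columns).__name__, type(first_column).__name__, type(first_char).__name__)
print(first_char == still_a_string)
```

Nothing in the interface marks the point at which you have stopped looking at a collection and started looking at a single value.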

In C++, standard strings are sometimes referred to as "almost containers", a nice bit of conceptual legerdemain that faces up to this problem:  strings have a container-like interface, but at the end of the day are something we want to draw an edge around and call a unitary entity, not simply a container of more-or-less-unrelated characters that happened to get dumped into the same bucket.  Strings have a conceptual unity that other container classes do not, despite their similar interfaces.  If a C++ "vector" were actually the Cartesian tensor its name suggests we might have a similar problem there, but alas this is not the case.

This difference between containers and strings bit me while developing a Python script that generates SQL out of an XML specification.  Trying to be a good Pythonista I distinguished between container types and everything else using a try/except around the index access.  This worked fine until the code got hold of a string, declared it on the basis of the interface to be a container, and complacently went on to do something entirely inappropriate and quite embarrassing.  At this point I learned that "isinstance" should not always be considered harmful.
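The trap looks roughly like the sketch below.  This is not the actual generator code: the function names and the toy SQL rendering are hypothetical, chosen only to show how a pure interface test misclassifies a string and how an explicit check with the built-in isinstance avoids it.

```python
def is_container_duck(value):
    """Duck-typing test: anything that supports [] counts as a container."""
    try:
        value[0]
    except TypeError:
        return False          # no index operator at all, e.g. an int
    except (IndexError, KeyError):
        return True           # indexable, just nothing stored at 0
    return True


def is_container(value):
    """Same test, but strings are explicitly treated as single values."""
    if isinstance(value, (str, bytes)):
        return False
    return is_container_duck(value)


def to_sql_values(value, container_test):
    """Toy renderer (hypothetical): one literal, or a comma-separated list."""
    if container_test(value):
        return ", ".join(repr(item) for item in value)
    return repr(value)


# The duck-typed test splits the string into characters.
print(to_sql_values("abc", is_container_duck))
# The isinstance guard keeps it whole.
print(to_sql_values("abc", is_container))
```

The guard costs one line, and it states the intent directly: a string is a value here, whatever its interface says.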

The differences between Perl and Python are of course more striking.  As I move through the process of translating the main code generator, which is about 3000 lines of Perl, I notice how much more compact the Python implementation is, and how much cleaner the code.  With one major exception:  regular expressions.

Although at the end of the day Perl regexes are probably too complex, they have an inner elegance, a deep beauty, that the Python re module just lacks.  This is a small quibble when taken in the context of the gains to be made from using Python, but I can't help thinking that Python 3.0 with regexes built in would be the best of all possible worlds...
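For a sense of the difference, here is a small sketch of the kind of match that is a one-liner in Perl, written with the standard re module.  The pattern and the function name are made up for the example, and the Perl line in the comment is only there for comparison.

```python
import re

# Perl builds matching into the syntax:
#     if ($line =~ /^(\w+)\s*=\s*(\d+)$/) { ($key, $value) = ($1, $2); }
# Python goes through the re module and an explicit match object.
ASSIGNMENT = re.compile(r"^(\w+)\s*=\s*(\d+)$")


def parse_assignment(line):
    """Return (key, value) for lines like 'width = 80', else None."""
    match = ASSIGNMENT.match(line)
    if match is None:
        return None
    key, value = match.groups()
    return key, int(value)


print(parse_assignment("width = 80"))
print(parse_assignment("no digits here"))
```

It works, and it is perfectly readable, but the regex is a library object you call methods on rather than part of the language.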


About this Entry

This page contains a single entry by Tom Radcliffe published on July 11, 2008 11:20 PM.
