Thursday March 22, 2012

08:32 PM: A Monkeypatching Decorator for Python

I had to do a bit of Python monkeypatching today — purely in the name of expediency, you understand — and like most things that are discouraged in Python, monkeypatching turns out to be a rather laborious thing. So, I wrote a simple decorator that automates the tedious bits and works more or less along the lines of Phil Hagelberg’s Robert Hooke library in Clojure.

The operating principle is pretty straightforward: you define a function and use the decorator to specify the function or method to patch it over. When called, your patch function will receive the original function as its first positional argument, followed by any actual arguments that were intended for the real function.

from functools import wraps

def patches(target, name, external_decorator=None):

  def decorator(patch_function):
    original_function = getattr(target, name)

    @wraps(patch_function)
    def wrapper(*args, **kw):
      return patch_function(original_function, *args, **kw)

    if external_decorator is not None:
      wrapper = external_decorator(wrapper)

    setattr(target, name, wrapper)
    return wrapper

  return decorator

Typical usage looks like this:

@patches(SomeClass, 'aMethod')
def aMethodFixed(aMethod, self, foo, bar):
  return aMethod(self, foo, bar + 1) * 2

(It doesn’t really matter what you call the decorated patch function, but anonymous functions aren’t the Python Way.)

Now this function will get called whenever someone invokes .aMethod(...) on an instance of SomeClass (subject to some caveats with some versions of Python that won’t flush the method cache in certain situations). You can do anything you want around calling the original SomeClass.aMethod (which is inserted as an extra initial argument), including not calling it at all.

Instead of patching entire classes, you can also patch methods on individual instances. This works exactly the same way, except that you pass an instance rather than a class to the decorator, and (as with bound methods generally), you won’t get an explicit self argument when your patch function is called. Normally, lacking an explicit self argument is not a big inconvenience, since you’ll usually have a variable referring to the patched instance in the enclosing lexical environment of your patch function.
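
As a rough, self-contained sketch of both cases: the Account class, its withdraw method, and the log attribute below are made up for illustration, and the sketch assumes Python 3, where looking a method up on a class yields a plain function.

```python
from functools import wraps

def patches(target, name, external_decorator=None):
  # Same decorator as above, repeated so this sketch runs on its own.
  def decorator(patch_function):
    original_function = getattr(target, name)

    @wraps(patch_function)
    def wrapper(*args, **kw):
      return patch_function(original_function, *args, **kw)

    if external_decorator is not None:
      wrapper = external_decorator(wrapper)

    setattr(target, name, wrapper)
    return wrapper

  return decorator

class Account:  # hypothetical class, for illustration only
  def __init__(self, balance):
    self.balance = balance

  def withdraw(self, amount):
    self.balance -= amount
    return self.balance

# Class-level patch: every instance is affected, and self arrives explicitly.
@patches(Account, 'withdraw')
def guarded_withdraw(withdraw, self, amount):
  if amount > self.balance:
    return self.balance  # refuse, without calling the original at all
  return withdraw(self, amount)

savings = Account(100)
checking = Account(100)

# Instance-level patch: only `checking` is affected, and there is no self
# argument because the captured original is already a bound method.
@patches(checking, 'withdraw')
def logged_withdraw(withdraw, amount):
  checking.log = getattr(checking, 'log', []) + [amount]
  return withdraw(amount)
```

Note that the instance-level patch stacks on top of the class-level one here, since the bound method it captures already goes through guarded_withdraw.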

The main thing that doesn’t work automagically is any method that uses descriptors in some non-default way, such as class methods and static methods. In those cases, the staticmethod or classmethod decorator needs to be applied around the wrapper, but before it’s assigned to the target. In other words, neither of these will work:

@classmethod
@patches(SomeClass, 'someStaticMethod')
def someStaticMethodPatch(someStaticMethod, cls):
  return someStaticMethod(cls)

@patches(SomeClass, 'someStaticMethod')
@classmethod
def someStaticMethodPatch(someStaticMethod, cls):
  return someStaticMethod(cls)

You’d need to do this instead:

@patches(SomeClass, 'someStaticMethod', classmethod)
def someStaticMethodPatch(someStaticMethod, cls):
  return someStaticMethod(cls)

(This use case is basically the only reason that the external_decorator argument exists.)
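
Here is a minimal runnable sketch of the same idea using staticmethod. The Temperature class and its to_fahrenheit method are invented for the example, and the sketch assumes Python 3.

```python
from functools import wraps

def patches(target, name, external_decorator=None):
  # Same decorator as above, repeated so this sketch runs on its own.
  def decorator(patch_function):
    original_function = getattr(target, name)

    @wraps(patch_function)
    def wrapper(*args, **kw):
      return patch_function(original_function, *args, **kw)

    if external_decorator is not None:
      wrapper = external_decorator(wrapper)

    setattr(target, name, wrapper)
    return wrapper

  return decorator

class Temperature:  # hypothetical class, for illustration only
  @staticmethod
  def to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32

# staticmethod is applied around the wrapper before it is assigned to the
# class, so the patched attribute still ignores the instance it is called on.
@patches(Temperature, 'to_fahrenheit', staticmethod)
def rounded_to_fahrenheit(to_fahrenheit, celsius):
  return round(to_fahrenheit(celsius))

via_class = Temperature.to_fahrenheit(100)
via_instance = Temperature().to_fahrenheit(0)
```

Without the third argument, the call through an instance would hand the instance to the wrapper as an unexpected extra positional argument.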

Anyway, there you have it. I’d tell you to use it in good health, but, well—just don’t shoot your eye out. A good heuristic to determine whether it’s appropriate to use this decorator in your code is whether your first thought on reading this article is “Wow, this would be a great way to simplify the code running on my production system.”

(Hint: If that’s your first thought, then please don’t.)

Wednesday December 21, 2011

10:34 PM: www. and Redirects

I mentioned on Twitter that I think most websites ought to provide service with both www. and www.-less hostnames, and also that one of these ought to be canonical, and the other redirect to it.

I’d like to take a little more space to unpack my reasoning, and what I personally do about it.

Read more...

Thursday November 03, 2011

02:35 PM: Asynchronous is not "Fire and Forget": Joiners not Quitters

In all the recent enthusiasm for threaded and asynchronous programming, I’ve noticed people missing something really important: if you fire off an asynchronous task, you will eventually need to wait for it to complete — and you’ll probably need to be able to cancel it as well.

At the basic level, this is why Ruby’s Thread has instance methods such as Thread#join and Thread#kill (even if #kill, specifically, is such a blunt instrument that you should never use it in production code). For the sake of language- and library-neutrality, I’ll call these two sorts of operations join and cancel.
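
To make the two operations concrete outside of Ruby, here is a small sketch using Python's asyncio, where awaiting a task plays the role of join and Task.cancel plays the role of cancel. The worker coroutine and its delays are invented for the example.

```python
import asyncio

async def worker(delay, value):
  # Stand-in for some asynchronous job (hypothetical).
  await asyncio.sleep(delay)
  return value

async def main():
  quick = asyncio.create_task(worker(0.01, "done"))
  slow = asyncio.create_task(worker(60, "never"))

  result = await quick  # join: wait for the task and collect its result

  slow.cancel()  # cancel: ask the task to stop
  try:
    await slow  # still join it, so the cancellation is actually observed
  except asyncio.CancelledError:
    pass

  return result, slow.cancelled()

outcome = asyncio.run(main())
```

Even the cancelled task gets joined here; cancelling without waiting would leave no way to know when it had really stopped.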

These are very important. Let’s begin by looking at join.

Read more...

Saturday June 18, 2011

11:55 AM: Sprockets Versus CommonJS: Require for Client-Side JavaScript

Once you’ve split your client-side JavaScript application into nice, self-contained individual files, you have two new problems:

  1. Your application has to make a bajillion individual HTTP requests to pick up each one of these individual files; the need for so many HTTP transactions kills your load time.
  2. Dependencies among the files require them to be loaded in depth-first order; this may be prohibitively difficult to arrange by hand.

Sprockets is a tool for concatenating multiple JavaScript files into a single one — in the right order. It’s included with Rails 3.1, and in that environment is the recommended way to approach this.

To make sure that a JavaScript file you depend on is included before the file you’re in, just add a special comment of the form //= require "foo" to the top, and the text of the referenced file (in this case, foo.js) will get included at that point if it hasn’t already been included earlier in the concatenation.

To me, though, it’s a little sad that we’ve come full-circle back to the same approach used by the C preprocessor to attain modularity. (Sprockets require even has similar semantics assigned to the use of quotes and angle brackets for included filenames — //= require <foo> searches “system” paths, whereas //= require "foo" looks in your project.) Everybody’s still pooping in the global namespace, and there’s no inter-module isolation except for what you happen to be disciplined enough to impose on yourself.

It’s doubly sad, because in the server-side JavaScript world, we’ve got a perfectly serviceable module system, specified as part of CommonJS. Admittedly, the CommonJS module API isn’t normally appropriate for client-side use, because the require function it specifies is synchronous, whereas loading individual scripts from a browser is an asynchronous activity. However, if you’re going to be concatenating all your JavaScripts together into a single giant file anyway… why not?

Implementing the CommonJS Module API itself isn’t that difficult; a simple implementation of the whole thing seems to require about 70 lines of code:

// module-prelude.js
var require;

(function () {
  var hasOwnProperty = Object.prototype.hasOwnProperty;
  var currentModuleId = "";
  var modules = {};

  require = function (moduleId) {
    moduleId = resolveModuleId(moduleId,
                               currentModuleId);
    if (!hasOwnProperty.call(modules, moduleId)) {
      var message = "No such module " + moduleId;
      throw new ReferenceError(message);
    }
    var module = modules[moduleId];

    var exports = module.exports;
    if (!exports) {
      exports = module.exports = {};
      var body = module.body;
      delete module.body;

      var savedModuleId = currentModuleId;
      try {
        currentModuleId = moduleId;
        body(exports, {id: moduleId});
      } finally {
        currentModuleId = savedModuleId;
      }
    }

    return exports;
  };

  require.defineModule = function (moduleId, body) {
    modules[moduleId] = {body: body};
  };

  require.loadAllModules = function () {
    for (var moduleId in modules) {
      if (hasOwnProperty.call(modules, moduleId)) {
        require(moduleId);
      }
    }
  };

  // resolve relative module ids
  function resolveModuleId(moduleId, baseId) {
    moduleId = moduleId.split("/");
    var absModuleId;
    if (moduleId[0] === "." || moduleId[0] === "..") {
      absModuleId = baseId.split("/").slice(0, -1);
    } else {
      absModuleId = [];
    }

    for (var i = 0; i < moduleId.length; i++) {
      var component = moduleId[i];
      if (component === ".") {
        // ignore
      } else if (component === "..") {
        absModuleId.pop();
      } else {
        absModuleId.push(component);
      }
    }

    return absModuleId.join("/");
  }
})();

Untested. require.defineModule and require.loadAllModules are non-standard extensions, and unlike in some other server-side contexts, all the modules share natives and the same global object.

Anyway, let’s say you’ve got two CommonJS modules, one of which uses an API exported by the other:

// foo.js
var bar = require("bar");
bar.displayMessage("Hello!");

// bar.js
function displayMessage(message) {
  alert("Message: " + message);
}
exports.displayMessage = displayMessage;

A Sprockets-like tool could then concatenate them all together like so:

// application.js
(function () {
  // content of module-prelude.js

  require.defineModule("foo", function (exports, module) {
    // content of foo.js
  });

  require.defineModule("bar", function (exports, module) {
    // content of bar.js
  });

  require.loadAllModules();
})();
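
The concatenation step itself is mostly string assembly. As a hedged sketch of what such a tool might do (this is not how Sprockets works, and the bundle function and its arguments are names invented here), a few lines of Python can produce a file in the shape shown above:

```python
import json
import textwrap

def bundle(prelude_source, modules):
  """Wrap each module body in require.defineModule and concatenate.

  modules maps a module id (e.g. "foo") to that module's JavaScript source.
  """
  lines = [
    "(function () {",
    textwrap.indent(prelude_source.strip("\n"), "  "),
    "",
  ]
  for module_id, source in modules.items():
    # json.dumps yields a double-quoted literal that is valid JavaScript
    # for ordinary module ids.
    header = "  require.defineModule(%s, function (exports, module) {"
    lines.append(header % json.dumps(module_id))
    lines.append(textwrap.indent(source.strip("\n"), "    "))
    lines.append("  });")
    lines.append("")
  lines.append("  require.loadAllModules();")
  lines.append("})();")
  return "\n".join(lines) + "\n"

application_js = bundle(
  "var require; // rest of module-prelude.js goes here",
  {
    "foo": 'var bar = require("bar");\nbar.displayMessage("Hello!");',
    "bar": "exports.displayMessage = function (m) { alert(m); };",
  },
)
```

The order in which modules are emitted doesn't matter, because nothing runs until require.loadAllModules() is called and each require resolves its dependency on demand.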

Ideally, I’d like to see Sprockets support this itself, though the issue of backwards-compatibility needs to be considered. It’d be wonderful to be able to use CommonJS modules to structure client-side JavaScript.

Thursday March 03, 2011

01:06 PM: Queueing Theory: Why the Other Line Moves Faster

In this video, Bill Hammack breezes through an introduction to basic queueing theory, the discipline established by Danish mathematician Agner Krarup Erlang. (Yes, the Erlang programming language is named after him.)

In this era of parallel and distributed systems, I think queueing theory is actually a very important area for programmers to be familiar with, but Bill’s explanation should be accessible even to a non-technical audience.

Why the other line is likely to move faster

Sunday February 27, 2011

11:07 PM: Saving Money with Amazon S3 and Bittorrent

I’m not the biggest fan of Amazon lately, but if you happen to be using S3 for hosting big downloads, or if you want to permanently publish a file using bittorrent without having to maintain your own seed for the rest of time, S3 has a little-used feature that could save you a lot of trouble — and potentially money.

Seeding Torrents from S3

It turns out that S3 will publish and seed a torrent for any publicly-available file stored in S3. This is pretty easy to set up:

  1. Upload a file to S3 and make it public
  2. Visit the file’s URL with ?torrent appended
  3. After a delay, you’ll get a .torrent for that file; save it to your computer
  4. Amazon will seed that torrent for as long as the file remains public

For example, if your uploaded file were available at http://bucketname.s3.amazonaws.com/my.mp3, the URL to get a .torrent for it would be http://bucketname.s3.amazonaws.com/my.mp3?torrent.
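
If you have a lot of files to publish, deriving the torrent URLs is easy to script. A small sketch (torrent_url is a name made up for this example, and it assumes a plain public object URL with no existing query string):

```python
from urllib.parse import urlsplit, urlunsplit

def torrent_url(file_url):
  """Return the URL that asks S3 for a .torrent of a public object."""
  parts = urlsplit(file_url)
  if parts.query:
    raise ValueError("expected a plain object URL without a query string")
  return urlunsplit(parts._replace(query="torrent"))

example = torrent_url("http://bucketname.s3.amazonaws.com/my.mp3")
```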

S3 doesn’t generate the torrent until the first time it’s requested, so you may have to wait a while for the .torrent to be generated if the original file is large.

In my experience, when demand outstrips supply, Amazon will actually temporarily spin up additional seeds in order to keep download speeds up (each individual Amazon seed seems to max out around 72kbps). In terms of billing, you’re charged (at the normal S3 rates) for all data downloaded via the Amazon seeds, but peer-to-peer transfers and downloads from other seeds would obviously be free for you.

Technical details of working with Bittorrent and the S3 REST API can be found in Amazon’s developer documentation.

Saving Money

There are basically two scenarios (that I can think of) in which seeding from S3 has the potential to save money:

  1. You’re serving popular downloads from S3 and start using Bittorrent (which reduces the amount of data served from S3 for a given number of downloads)
  2. You’re seeding torrents from an EC2 instance (or other hosted server), where bandwidth costs are typically higher than from S3

In the first case, potential savings are going to be largely proportional to how busy the torrent is. If you only ever have one person downloading at a time, costs will be pretty much the same as if people were downloading via HTTP directly.

In the second case, any difference is going to depend on the exact pricing structure you’re dealing with — for example, the first gigabyte downloaded from EC2 in a billing cycle is free, so if your EC2 seeds never serve significantly more than that, seeding from S3 is actually the more expensive option.

In both cases, savings aren’t guaranteed; it’s important to keep an eye on costs and run the numbers. If you aren’t measuring, you’re losing.
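
Running the numbers for the first scenario can be as simple as the sketch below. Every figure in it is a placeholder: the price per gigabyte, the download count, and especially the fraction of bytes that ends up being served by Amazon's seeds are assumptions you would replace with your own measurements.

```python
def s3_transfer_cost(downloads, file_size_gb, price_per_gb,
                     amazon_seed_fraction=1.0):
  """Estimate the S3 transfer bill for a number of complete downloads.

  amazon_seed_fraction is the share of all bytes served by Amazon's seeds;
  1.0 models plain HTTP downloads, where S3 serves everything.
  """
  total_gb = downloads * file_size_gb
  return total_gb * amazon_seed_fraction * price_per_gb

# Hypothetical inputs: 1000 downloads of a 0.5 GB file at $0.15 per GB.
direct_http = s3_transfer_cost(1000, 0.5, 0.15)

# Same demand, assuming peers supply 60% of the bytes to one another.
with_torrent = s3_transfer_cost(1000, 0.5, 0.15, amazon_seed_fraction=0.4)

savings = direct_http - with_torrent
```

With a quiet torrent the fraction stays close to 1.0 and the savings shrink toward zero, which matches the one-downloader-at-a-time case described above.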

To sum up:

Advantages

  1. Setting up torrents for files you already have in S3 is extremely simple
  2. You don’t have to maintain a seed or a tracker yourself
  3. Versus direct downloads from S3, you’re only billed for bytes downloaded from Amazon’s seeds

Limitations

  1. S3 won’t generate torrents for:
    • multiple files at once; multi-file torrents aren’t supported
    • files larger than 5GB
  2. You’re stuck using Amazon’s tracker if you want Amazon’s seeds to work for you. (On the other hand, it’s not that difficult to edit a .torrent to add extra trackers.)
  3. In some lower-usage situations it’s possible that — compared to a seed running on EC2 — S3 bandwidth costs would actually be more expensive.

Friday February 11, 2011

11:12 AM: Neil Gaiman on Copyright, Piracy, and the Web

The Open Rights Group interviews Neil Gaiman about his experiences with online piracy:

(Gaiman on Copyright Piracy and the Web)

Edit: Also, here’s a journal entry that Gaiman posted during his American Gods experiment, responding to the concerns of an independent bookseller:

I don’t see this as either they get it for free or they come and buy it from you. I see it as Where do you get the people who come in and buy the books that keep you in business from?

The books you sell have “pass-along” rates. They get bought by one person. Then they get passed along to other people. The other people find an author they like, or they don’t.

When they do, some of them may come in to your book store and buy some paperback backlist titles, or buy the book they read and liked so that they can read it again. You want this to happen.

(Read the rest…)

Thursday February 10, 2011

03:49 AM: Kitten (and Chipmunk) in Slow Motion

What’s the Internet for, if not for posting pictures of cute animals?

TPS Film Studio, a Polish studio specializing in high-speed photography, put up a really charming slow-motion video of a kitten playing outside:

(PhantomHD – Kitten in Slow Motion.mp4)

(The “Phantom HD” in their domain name is a reference to the Phantom HD cameras from Vision Research that TPS rents and uses.)

Also, a bonus video, alas without a cool soundtrack:

(Adorable Chipmunk in Slow Motion)

I suspect if these people could devote their business to making slow-motion videos of cute animals, they’d be sitting on a veritable gold mine.

Tuesday February 08, 2011

06:58 AM: Hibernate Not Working on Your Ubuntu Laptop? Try "hibernate"

The Problem

Since I first got it a few years ago, I haven’t had much luck getting Ubuntu to suspend or hibernate my Toshiba laptop. Suspend does finally work out of the box with Ubuntu Maverick (10.10), but I still haven’t had much luck with hibernating, even with Maverick.

Recently, annoyed by this after a meeting of Seattle.rb where pretty much everyone but me had awesome suspending-and-hibernating Mac laptops, I started looking through the list of packages in Ubuntu to see whether there were any tools available which might help me sort out the problem.

The Solution

hibernate – smartly puts your computer to sleep (suspend to RAM or disk)

The hibernate script helps you in putting your computer to sleep, using one of the various methods available in the kernel.

Hibernate can take care of loading and unloading modules, provides various hacks needed to get some video cards to resume properly under X, can optionally restart networking and system services, and basically do whatever else you ask it. It can be extended by writing new “scriptlets” which run at different points during the suspend process.

Currently the script supports all suspend mechanisms available through the /sys/power/state interface (including ACPI suspend and the in-kernel software suspend), as well as Software Suspend 2 (http://www.suspend2.net)

hibernate isn’t available via the Software Center, since it isn’t a user-facing application, but you can install it (along with software suspend, uswsusp) from the command line easily enough:

sudo aptitude install hibernate uswsusp

Generally speaking, you will also need to have a swap partition (i.e. /dev/something, not a regular file) set up which is at least as large as your physical RAM, and which is configured in /etc/fstab so that it will be available at boot time.
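
A quick way to check the size requirement is to compare the MemTotal and SwapTotal fields in /proc/meminfo. The sketch below does that in Python; the function names are made up for the example, and it only compares sizes, so it can't tell you whether the swap is a partition or a regular file.

```python
import os

def parse_meminfo(text):
  """Parse /proc/meminfo-style text into a dict of integer values."""
  values = {}
  for line in text.splitlines():
    key, _, rest = line.partition(":")
    fields = rest.split()
    if fields and fields[0].isdigit():
      values[key.strip()] = int(fields[0])  # sizes are reported in kB
  return values

def swap_covers_ram(meminfo):
  """True if total swap is at least as large as physical RAM."""
  return meminfo.get("SwapTotal", 0) >= meminfo.get("MemTotal", 0)

if os.path.exists("/proc/meminfo"):
  with open("/proc/meminfo") as f:
    info = parse_meminfo(f.read())
  if swap_covers_ram(info):
    print("swap is at least as large as RAM")
  else:
    print("swap is smaller than RAM; hibernating may fail")
```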

Why isn’t hibernate installed by default? It turns out that a lot of modern laptops don’t actually require it to be able to suspend or hibernate. Alas, mine is not one of these.

Note: if you’re using proprietary video or network drivers, you still may be out of luck. Sorry. With a proprietary driver, it’s up to the vendor who wrote the driver to make this work, and usually they aren’t going to bother.

Saturday February 05, 2011

06:57 AM: Top 10 Mistakes in Behavior Change

Over the past year and a half or so, I’ve been thinking and reading a lot more about personal improvement, the formation of habits, and so on. Along these lines, I recently read a set of slides from a presentation given by BJ Fogg et al. from Stanford University’s Persuasive Technology Lab, which I thought offered a very nice summation of some of the things I’ve learned.

(Hat Tip: Giles Bowkett)

Partly for my own reference, here’s the text of the bulk of the slides, along with my commentary:

Top 10 Mistakes in Behavior Change

  1. Relying on willpower for long-term change

    Imagine willpower doesn't exist. That's step 1 to a brighter future.

    Me: When we see someone who is well-disciplined, punctual, and so on, we tend to think "wow, they must have a lot of willpower," when in fact the main reason they can do these things regularly is because these things have become habits, and not because they normally have to consciously will to do them. Conversely, willpower — such as it is — exists for short-term emergencies, and is quickly exhausted. (Given a choice, it's far better to flee temptation than to attempt to resist it.)

  2. Attempting big leaps instead of baby steps

    Seek tiny successes -- one after another.

    Me: The more modest your goals — particularly when you are trying something new — the more likely they are to be realistic. Additionally, it's the constant feedback of small successes which really serves to build a habit. Frequency is massively more important than quality.

  3. Ignoring how often environment shapes behaviors

    Change your context & you change your life.

    Me: This was one of the major factors in my move to Seattle. Among other things it represented a conscious decision to connect with specific groups of people here who are more like the way I want to be myself, both professionally and spiritually. Peer pressure can be a wonderful tool for self-improvement! But changing your context can even mean simple things like rearranging your furniture, or putting particular items where they will be within easy reach. Ever realized after the fact that you had set your (then-)future self up for failure? You can also do little things to set your future self up for success too.

  4. Trying to stop old behaviors instead of creating new ones

    Focus on action, not avoidance.

    Me: Just like you'll tend to steer a car or throw a ball wherever it is you're looking, you'll tend to do whatever it is you're focused on — even if you're focused on it because you're trying to avoid it. Find something positive to focus on and do that instead. A repeating theme of the best spiritual direction I've received over the years has been that you don't break bad habits so much as replace them.

  5. Blaming failures on lack of motivation

    Solution: Make the behavior easier to do.

    Me: "What's keeping me from doing this thing I want to do?" is a valid question, often with important answers; it's worth asking it seriously. Very often we discover a series of roadblocks — once we are consciously aware of what these are, we can often find ways to deal with them or even to make them irrelevant. These roadblocks can be internal or external; in many cases a goal we think we want can actually be scary or threatening for us in other ways that we haven't consciously admitted to ourselves. These fears aren't silly; they deserve consideration and they'll go unaddressed so long as they go unnamed. (Sometimes simply naming them makes them easier to face, and sometimes it's necessary to look for ways to create a safer environment for change.) Find ways to make the things you don't want to do harder, and the things you want to do easier.

  6. Underestimating the power of triggers

    No behavior happens without a trigger.

    Me: Keep track of what you're doing; look for correlations. Find and ruthlessly exploit triggers for good behaviors, and starve your life of bad ones where you can. It's not cheating.

  7. Believing that information leads to action

    We humans aren't so rational.

    Me: We tend to think of human reason as an active quality, with the force of our will steering us here or there, but most of our activity is actually spent on a sort of auto-pilot, with our rational faculty at best reflecting on, or at worst rationalizing, what we are doing. Living mindfully and deliberately is a good habit to cultivate in itself (it's essential to every other kind of improvement), but when it comes to changing behaviors, the only sustainable approach is to reprogram the auto-pilot rather than continuously hovering tense over the controls. In fact, it's a lot more like riding a donkey that needs to be (re)trained than it is like flying a plane. Factual input and feedback is important for training, but not because you need to reason with the donkey.

  8. Focusing on abstract goals more than concrete behaviors

    Abstract: Get in shape
    Concrete: Walk 15 min. today

    Me: Roughly speaking, a concrete goal is something you can do, now.

  9. Seeking to change a behavior forever, not for a short time.

    A fixed period works better than "forever"

    Me: In the extreme case, one day at a time. But in the meantime see if you can't start addressing incentives, triggers, and the effort involved.

  10. Assuming that behavior change is difficult.

    Behavior change is not so hard when you have the right process.

    Me: It might be overstating the case, but behavior change really is usually the easy part once you have an idea of what to do. It may be that there are external factors which undermine our attempts to change in particular cases, and those can be hard to get rid of, but for the most part the remaining hard parts are honesty, self-awareness, and a willingness to accept outside help.