A Python Optimization Anecdote

Posted by Pavel Panchekha on October 24, 2011

Hi! I’m Pavel and I interned at Dropbox over the past summer. One of my biggest projects during this internship was optimizing Python for dynamic page generation on the website. By the end of the summer, I had optimized many of dropbox.com’s pages to render five times faster. This came with its fair share of challenges, which I’d like to write about today.

The Problem
Dropbox is a large website with lots of dynamically generated pages. The more pages that are dynamically generated from user input, the greater the risk of cross-site scripting (XSS) attacks. To prevent these, we must escape all dynamically generated text before it is sent to the browser. Note that we also need to escape things differently in different string contexts, such as JavaScript code, HTML text, and HTML attributes: a user-generated string that is abusive in JavaScript may be harmless in HTML, and vice versa.
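
To make that concrete, here’s a purely illustrative pair of contexts (not from the original post); the same user-supplied string is inert in one context and executes code in the other:

    # Illustrative only: "evil" stands in for arbitrary attacker code.
    user = '"; evil(); "'
    html_context = "<p>%s</p>" % user                     # harmless as HTML text
    js_context = '<script>var s = "%s";</script>' % user  # breaks out of the JS string and runs evil()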

All user-generated strings on the Dropbox website are passed through a special escaping function we’ve written that takes context into account. The catch is that, having defined our own escaping function, we need it to be as fast as possible, since essentially all user input passes through it. In this article we’ll focus on one particular escaping context: HTML text.

A quick disclaimer before you accuse us of reinventing the wheel and ignoring the xml.sax.saxutils.escape and cgi.escape functions: neither escapes quite enough characters. This is especially true if you consider the possibility of confusing the browser into believing your page is UTF-7, at which point even the equals sign has to be escaped. So we had to write our own.
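
For instance, cgi.escape handles the usual suspects but leaves the equals sign untouched, which is exactly the character a UTF-7 payload needs:

    >>> import cgi
    >>> cgi.escape('<script>alert(1)</script> a=b', quote=True)
    '&lt;script&gt;alert(1)&lt;/script&gt; a=b'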

First steps
The original function looked like this:

    WHITELIST = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

    def escape_engine(s, on_whitelist, on_ascii, on_unicode):
        pieces = []
        for char in s:
            if char in WHITELIST:
                pieces.append(on_whitelist(char))
            elif ord(char) < 256:
                pieces.append(on_ascii(char))
            else:
                pieces.append(on_unicode(char))
        return "".join(pieces)

    def html1(s):
        """
        Escapes a string for inclusion into tag contents in HTML.

        Do *not* use for including a string into an HTML attribute; use
        escape.attr() for that.
        """

        # Ensure we deal with strings
        s = unicode(s)

        escaped = escape_engine(s,
            on_whitelist = lambda c: c,
            on_ascii = lambda c: "&#%02x;" % ord(c),
            on_unicode = lambda c: "&#%02x;" % ord(c),
        )

        return UNSAFE_bless(escaped)  # Now it's escaped, so should be safe

Applying it to some test templates gives (times in seconds; obviously, not all the digits are significant):

: html1 14.1678471565

Not very fast. (I blame the intern who wrote it…) Of course, the code wasn’t optimized for speed, but for readability. But like I said, given that this is actually a non-negligible part of our render time, it could use some optimization.
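
(The post doesn’t include the benchmark harness itself; several commenters below ask for it. A minimal sketch along these lines, with made-up sample strings standing in for the real test templates, is enough to compare the versions.)

    import timeit

    # Hypothetical stand-ins for the real test templates.
    SAMPLES = [u"Getting Started.pdf", u"photos 2011", u"\u0444\u0430\u0439\u043b\u044b"] * 1000

    def bench(name, fn):
        # Run each escaping function over all samples, ten times.
        print name, timeit.timeit(lambda: [fn(s) for s in SAMPLES], number=10)

    bench("html1", html1)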

Inlining
The first fact of Python optimization is that function calls are slow. html1 is awful in this regard: it calls a function per character! Worse yet, the common case is calling a function that does nothing at all! So the first step is to inline. This also lets us join the on_ascii and on_unicode cases (originally separated because we also use the above for escaping JavaScript, where we do make a distinction between Unicode and ASCII literals).

    def html2(s):
        """
        Escapes a string for inclusion into tag contents in HTML.

        Do *not* use for including a string into an HTML attribute; use
        escape.attr() for that.
        """

        # Ensure we deal with strings
        s = unicode(s)

        pieces = []
        for char in s:
            if char in WHITELIST:
                pieces.append(char)
            else:
                pieces.append("&#%02x;" % ord(char))

        return UNSAFE_bless("".join(pieces))  # Now it's escaped, so should be safe

This has a pretty good 15% or so improvement:

: html2 12.8864150047

Implicit Loops
Now, the preceding code sample maybe isn’t too pretty, but the improvement is nothing to sneer at. There’s more we can do, though. The second fact of Python optimization is that loop overhead can also be pretty significant. This leads to our next attempt, which gets rid of the explicit loop in favor of a generator expression. Fortunately, switching to a generator expression makes the resulting function as readable as the original.

    def html3(s):
        """
        Escapes a string for inclusion into tag contents in HTML.

        Do *not* use for including a string into an HTML attribute; use
        escape.attr() for that.
        """

        # Ensure we deal with strings
        s = unicode(s)

        escaped = "".join(
            c if c in WHITELIST
            else "&#%02x;" % ord(c)
            for c in s)

        return UNSAFE_bless(escaped)  # Now it's escaped, so should be safe

What are the timings with this new version?

: html3 13.4748219418

Hmm. The readability improvements are nice, but the speed dropped. Maybe we can get both speed and readability with some tweaks?

More Optimizations
We already picked out a neat 10% improvement and still have a wonderfully readable function. But we can go faster…

Generators?? Really?
The problem with the generator expressions used above is that Python actually constructs a generator. Let’s look at the bytecode:

    >>> import dis
    >>> dis.dis(optimization_story.html3)
     78          12 LOAD_CONST               1 ('')
                 15 LOAD_ATTR                1 (join)

     79          18 LOAD_CONST               2 (<code object <genexpr> at 0x1005dddc8, file "./optimization_story.py", line 79>)
                 21 MAKE_FUNCTION            0

     81          24 LOAD_FAST                0 (s)
                 27 GET_ITER
                 28 CALL_FUNCTION            1
                 31 CALL_FUNCTION            1
                 34 STORE_FAST               1 (escaped)

Luckily, we can avoid making the generator by simply using a list comprehension instead of a generator expression:

    def html4(s):
        # …
        escaped = "".join([
            c if c in WHITELIST
            else "&#%02x;" % ord(c)
            for c in s])
        # …

This puts us back ahead of html2.

: html4 11.2531888485

It looks like the generator expression is actually slower than the list comprehension in this case. A likely explanation is that our sample strings are too short to reap any benefit from lazy generation: join has to materialize the whole sequence anyway, so the generator only adds overhead. As expected, our disassembly now looks a lot more friendly:

     97          12 LOAD_CONST               1 ('')
                 15 LOAD_ATTR                1 (join)

     98          18 BUILD_LIST               0
                 21 DUP_TOP
                 22 STORE_FAST               1 (_[1])

    100          25 LOAD_FAST                0 (s)
                 28 GET_ITER
            >>   29 FOR_ITER                43 (to 75)
                 32 STORE_FAST               2 (c)
                 35 LOAD_FAST                1 (_[1])
                 38 LOAD_FAST                2 (c)
                 41 LOAD_GLOBAL              2 (WHITELIST)
                 44 COMPARE_OP               6 (in)
                 47 JUMP_IF_FALSE            7 (to 57)
                 50 POP_TOP
                 51 LOAD_FAST                2 (c)
                 54 JUMP_FORWARD            14 (to 71)
            >>   57 POP_TOP
                 58 LOAD_CONST               2 ('&#%02x;')
                 61 LOAD_GLOBAL              3 (ord)
                 64 LOAD_FAST                2 (c)
                 67 CALL_FUNCTION            1
                 70 BINARY_MODULO
            >>   71 LIST_APPEND
                 72 JUMP_ABSOLUTE           29
            >>   75 DELETE_FAST              1 (_[1])
                 78 CALL_FUNCTION            1
                 81 STORE_FAST               3 (escaped)
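
The genexpr-versus-listcomp gap for join is also easy to check in isolation (an illustrative measurement, not from the original post):

    import timeit

    setup = "s = 'a' * 1000"
    print "genexpr ", timeit.timeit("''.join(c for c in s)", setup, number=10000)
    print "listcomp", timeit.timeit("''.join([c for c in s])", setup, number=10000)
    # On CPython 2.x the list comprehension wins: join materializes the
    # whole sequence anyway, so the generator only adds overhead.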

Sets?? Really?
But this is still slower than it should be. One low-hanging fruit still stands out: why are we doing a linear search on the whitelist?

    WHITELIST2 = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
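
(The post never shows html5 itself; presumably it is just html4 with the set swapped in, along these lines:)

    def html5(s):
        # Presumed body: identical to html4, but membership tests now hit
        # WHITELIST2, a set, instead of scanning a 62-character string.
        s = unicode(s)
        escaped = "".join([
            c if c in WHITELIST2
            else "&#%02x;" % ord(c)
            for c in s])
        return UNSAFE_bless(escaped)  # Now it's escaped, so should be safe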

The timings bear out the guess that sets are faster than strings:

: html5 8.6205868721

The question remains: can we go faster?

Deeper Python Optimization
If we’re going to optimize this function, we might as well extract all the performance we possibly can. One peculiarity of modern Python is that it has two LOAD instructions: LOAD_GLOBAL and LOAD_FAST. Due to an implementation detail (locals live in a fixed-size array indexed directly by the bytecode, while globals require a dictionary lookup), loading a local variable is faster than loading a global variable. Consequently, you want to avoid global variables in tight loops. The above disassembly pointed to two such globals: ord and WHITELIST.
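
You can see the two instructions side by side with a quick illustration (not from the original post):

    >>> import dis
    >>> def f(s):
    ...     return [ord(c) for c in s]   # ord is a global here
    ...
    >>> def g(s):
    ...     lord = ord                   # hoisted into a local
    ...     return [lord(c) for c in s]
    ...
    >>> dis.dis(f)   # the loop body contains LOAD_GLOBAL (ord)
    >>> dis.dis(g)   # the loop body contains LOAD_FAST (lord)

Hoisting the globals into locals gives html6: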

    def html6(s):
        # …
        lord = ord; lWHITELIST2 = WHITELIST2
        escaped = "".join([
            c if c in lWHITELIST2
            else "&#%02x;" % lord(c)
            for c in s])
        # …

We don’t expect this to buy us much, but why not?

: html6 7.87281298637

In absolute terms a second is not much of an improvement, but as a percentage of the total it’s still significant.

String Interpolation

String interpolation is another thing Python isn’t very fast at. Let’s see if removing that helps.

    def html7(s):
        # …
        lstr = str; lord = ord; lWHITELIST2 = WHITELIST2
        escaped = "".join([
            c if c in lWHITELIST2
            else "&#" + lstr(lord(c)) + ";"
            for c in s])
        # …

: html7 5.58323383331

That’s a huge boost from saving the string interpolation!

Every Problem Can be Solved With a Big Enough Hash Table
Any web programmer knows that performance problems can be universally solved with caching. (Heh heh, if only…) In any case, if the gain from removing string interpolation was so great, maybe we can cache the results of the character-to-escaped-form computation and skip the string concatenation too. We’ll set up a cache of characters to HTML escapes:

    CACHE_HTML_ESCAPES = {}

Then we read and write our cache as necessary:

    def html8(s):
        # …
        escaped = "".join([
            c if c in lWHITELIST2
            else CACHE_HTML_ESCAPES.get(c) or CACHE_HTML_ESCAPES.setdefault(c, "&#" + lstr(lord(c)) + ";")
            for c in s])
        # …

Since our web servers are long-running processes, the cache should eventually capture all of the characters people are using; Python’s dicts are then fast enough to give us a speed boost over string concatenation and str(ord(c)). (The get-or-setdefault dance is safe here because the cached escape strings are never empty, so the or can’t misfire.)

: html8 4.5724029541

Another big boost from avoiding what seems like a trivial computation.

Premature Optimization…
If we’re going down the setdefault branch so rarely, and since that’s the only place we’re using str and ord, maybe it’s not worth making those local variables?

    def html9(s):
        # …
        lWHITELIST2 = WHITELIST2
        escaped = "".join([
            c if c in lWHITELIST2
            else CACHE_HTML_ESCAPES.get(c) or CACHE_HTML_ESCAPES.setdefault(c, "&#" + str(ord(c)) + ";")
            for c in s])

We don’t expect much of a benefit here, but maybe a percent change or so…

: html9 4.565928936

Wait, Why a Python Loop At All?
Wait, why are we using a Python loop at all? Python loops are slow… Instead, maybe we can use the C code in Python’s re module? We can use regexes to find anything that isn’t in our whitelist, and do the replacement. That way, the “do-nothing” operation on whitelisted characters becomes much cheaper.

    import re
    RE_FIND_NON_WHITELISTED = re.compile("([^" + "".join(WHITELIST2) + "])")

Unfortunately, there’s no way to get the matched text out of a Python match object without a method call. So we’re forced to call the group method, and function calls are slow!

    def htmlA(s):
        # …
        escaped = RE_FIND_NON_WHITELISTED.sub(
            lambda m: CACHE_HTML_ESCAPES.get(m.group(1)) \
                   or CACHE_HTML_ESCAPES.setdefault(m.group(1), "&#" + str(ord(m.group(1)))) + ";",
            s)
        # …

What are our results?

: htmlA 4.54020690918

Hmm, this isn’t that great, actually. We did save a percentage point, but this is a lot less readable (at least, in my opinion) than html9, and the gains aren’t worth it.

Until, that is, we try it on whitelisted-only text:

: html9 3.84376811981
: htmlA 0.796116113663

Whoosh! html9 got a bit faster, by virtue of avoiding hitting the cache at all, but the regex-based solution *rocked*, since it could do all of the skipping of characters in C, not Python.

In fact, the regex-based solution is slower for punctuation-heavy text, since it makes several function calls per escaped character instead of just one function call and a dictionary lookup. And for short strings, the overhead of initiating the regex search hurts. Let’s try a test on punctuation-heavy, short text snippets:

: html9 1.59476995468
: htmlA 3.44844794273

Measure Twice, Cut Once
So we have one function that’s faster for English alphanumeric text and one that’s faster for everything else. But we can’t just shaft our non-US users, and we don’t want to settle for a function that’s five times slower for the common case! So we have a few options.

The first is simply to expand our whitelist — most file names have a dot in them, and spaces, dashes, and the like are popular; at signs appear in emails; and so on. Of course, one has to be careful not to open any XSS vector while doing so, but a conservative expansion that adds -|!,. _ to our whitelist should be safe. Then again, this helps both versions, so it’s not really a fix.
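
(A sketch of what such an expansion might look like; note that once punctuation like - and | goes into the character class, re.escape, which the original compile skips, becomes important.)

    # Illustrative expansion: common filename punctuation that is safe in
    # the HTML text context.
    WHITELIST2 |= set("-|!,. _")
    RE_FIND_NON_WHITELISTED = re.compile(
        "([^" + re.escape("".join(WHITELIST2)) + "])")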

Another fix presents itself, though: can we figure out which version of html to use quickly?

  1. def htmlB(s):
  2.     # …
  3.  
  4.     non_whitelisted = RE_FIND_NON_WHITELISTED.findall(s)
  5.     if len(non_whitelisted) < .6 * len(s):  # mostly whitelisted: the regex version wins
  6.         escaped = RE_FIND_NON_WHITELISTED.sub(
  7.             lambda m: CACHE_HTML_ESCAPES.get(m.group(1)) \
  8.                 or CACHE_HTML_ESCAPES.setdefault(m.group(1), "&#" + str(ord(m.group(1)))) + ";",
  9.             s)
  10.     else:
  11.         lWHITELIST2 = WHITELIST2
  12.         escaped = "".join([
  13.                 c if c in lWHITELIST2
  14.                 else CACHE_HTML_ESCAPES.get(c) or CACHE_HTML_ESCAPES.setdefault(c, "&#" + str(ord(c)) + ";")
  15.                 for c in s])
  16.  
  17.     # …

Why .6, you ask? Well, the exact constant isn’t too important (the function *works* whichever way the test goes, it’s just an optimization), but some testing showed it to be the approximate break-even point of the two methods.

With hope and trepidation, we run the tester…

: htmlB 5.16241598129
: html9 3.97228693962
: htmlA 3.95208191872

Awww…

As we hoped, the result is fast on whitelisted-only characters…

: htmlB 1.24477005005
: html9 3.41327309608
: htmlA 0.696345090866

And is passable on punctuation:

: htmlB 5.97420597076
: html9 3.61161899567
: htmlA 8.88924694061

But the overhead is pretty saddening…

Sampling
We can improve things a bit by testing just a few characters for punctuation/unicode.

    def htmlC(s):
        # …
        non_whitelisted = RE_FIND_NON_WHITELISTED.findall(s[:10])
        if len(non_whitelisted) < 6:
        # …

We use 6 instead of .6 * min(10, len(s)) because if the string is shorter than 10 characters, either alternative is going to be fast.
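
(Spelled out, under the assumption that the elided branches match htmlB’s, and with the stray parenthesis around the trailing semicolon fixed as a commenter notes below:)

    def htmlC(s):
        s = unicode(s)
        # Sample only the first ten characters to pick a strategy.
        non_whitelisted = RE_FIND_NON_WHITELISTED.findall(s[:10])
        if len(non_whitelisted) < 6:
            # Mostly whitelisted: the regex version wins.
            escaped = RE_FIND_NON_WHITELISTED.sub(
                lambda m: CACHE_HTML_ESCAPES.get(m.group(1))
                    or CACHE_HTML_ESCAPES.setdefault(m.group(1), "&#" + str(ord(m.group(1))) + ";"),
                s)
        else:
            # Mostly punctuation or international text: the list comprehension wins.
            lWHITELIST2 = WHITELIST2
            escaped = "".join([
                c if c in lWHITELIST2
                else CACHE_HTML_ESCAPES.get(c) or CACHE_HTML_ESCAPES.setdefault(c, "&#" + str(ord(c)) + ";")
                for c in s])
        return UNSAFE_bless(escaped)  # Now it's escaped, so should be safe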

This leads to a marked improvement. For alphanumeric strings:

: htmlC 0.836707115173
: html9 3.34154415131
: htmlA 0.701889276505

For unicode-heavy strings:

: htmlC 3.70150613785
: html9 2.82831597328
: htmlA 6.77485609055

This is now really looking like an option for production use. But the check still adds quite a bit of overhead: in the case where the user has just about the break-even balance of English and international characters, we’re looking at a 20% premium. Can we do better?

The User is Always Right
So checking which version to use has very high overhead. What to do? We’ve got to think about when each case will come up: how might we predict that lots of non-English alphanumerics are on the way? Which users are likely to store files with international names, or use an international name themselves? (Hopefully, people who use lots of punctuation in both are few…)

Luckily, Dropbox added translations a while back, so we already have all of the infrastructure in place to detect a user’s locale. So a quick optimization is to switch to the html9 version (the list-comprehension-based one) for international users, and use the fast htmlA version (regexes) for US-based users and users from countries with mostly-Latin alphabets. The code here is mostly tied to Dropbox internals, so I’m not going to show it, but I’m sure you get the point.
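
(A purely illustrative sketch; get_user_locale and the locale list are hypothetical stand-ins for the Dropbox-internal machinery.)

    # Locales whose text mostly falls inside our Latin-alphabet whitelist.
    MOSTLY_LATIN_LOCALES = frozenset(["en_US", "en_GB", "de_DE", "es_ES", "fr_FR"])

    def html(s):
        if get_user_locale() in MOSTLY_LATIN_LOCALES:  # hypothetical helper
            return htmlA(s)   # regex version: fast on mostly-whitelisted text
        else:
            return html9(s)   # list-comprehension version: fast on everything else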

This final optimization removes the check overhead while giving us most of the benefit. Success!

What We Didn’t Do
Now there are some ways to optimize the above even further. The obvious approach is to rewrite the entire thing in C. This lets us squeeze the last bits of performance out of the above, but it would also make things much harder to maintain.

Similarly, using PyPy or a similar JIT would probably help on the inner loop of the pure-Python version, likely making it as fast as the regex-based approach.

Conclusions
The first, most basic conclusion is that the basic facts of Python optimization (inline functions, use implicit loops, move computation into C if possible) are perfectly valid. Another fact: Python dictionaries are *fast*. The WHITELIST2 set and the CACHE_HTML_ESCAPES dict both rely on the superb hash table implementation for their performance gains.

Other “common wisdom”, like using locals instead of globals, yields relatively little gain.

Optimizing inner loops in a high-level language requires lots of trial and error. It was absolutely amazing that moving from string interpolation to string concatenation gave such a huge boost to performance.

Finally, measurement is key. It was measurement that told us that HTML escaping was slow to begin with; and without measuring performance, we would never have guessed that string interpolation was so slow.

Thanks for reading!

  • Guest

    Great article. Would love to see a table of the percent improvement from each technique.

  • Guest

    awesome stuff. please post more!

  • Anonymous

    Very cool. This feels like something Guido would write. Thanks for sharing!

  • Stephen C.

    WE MUST GO DEEPER!

  • Bob

    Nice work, and well-written writeup too!

    What version of python was this; 2.7, 3.2? Did you get different speeds for each of these?

  • Anonymous

    Nice!

  • Blah

    I’m pretty sure there’s another piece of optimization where you could do something by creating two frozensets of CACHE_HTML_ESCAPES and WHITELIST2

    and then do something like … frozenset(m) - frozenset(CACHE_HTML_ESCAPES) … which would run a comparison on two hash tables in O(1) with pretty simple code, as well as be a bit cleaner. Thoughts?

  • Douglas Bagnall

    why the branch between whitelist and cache in html9?

    First:

        for c in WHITELIST2:
            CACHE_HTML_ESCAPES[c] = c

    then:

        def html9b(s):
            # …
            cache = CACHE_HTML_ESCAPES
            escaped = "".join([cache[c] for c in s])

  • Andrew Dalke

    Try subclassing dict and adding your own __missing__ method, like this:

        class HtmlEscapeCache(dict):
            def __missing__(self, c):
                s = "&#" + str(ord(c)) + ";"
                self[c] = s
                return s

    You get the caching “for free”, and you can write “CACHE_HTML_ESCAPES[c]” or “CACHE_HTML_ESCAPES[m.group(1)]” and not have the ugly .get/.setdefault combination.

  • Anonymous

    You can save some dot-lookups in html9: get = CACHE_HTML_ESCAPES.get

  • Paweł Płazieński

    Could you post your test suite? I think I have a nice function and I’d like to compare my results ;-)

  • http://derekarnold.net Derek Arnold

    What are you using to dump the bytecode in that format?

  • Anonymous

    Why not put CACHE_HTML_ESCAPES.get and CACHE_HTML_ESCAPES.setdefault in local variables? Getting rid of the attribute access could give you a few milliseconds.

  • Anonymous

    The dis module from the standard library.

    import dis; dis.dis(lambda x: x+3)

  • http://gedmin.as/ Marius Gedminas

    The first few functions appear to have a bug — '&#%02x;' % ord(c) will output hexadecimal numbers where &#nnn; is supposed to use decimal. Either '&#x%02x;' or '&#%d;' would be correct.

  • http://en.ig.ma/ Filip Wasilewski

    Nice writeup.

    Could you update the post with your test methodology and some sample input files?

    My take on the problem using memoization in the _escape_char function and improving readability a bit (regex version). Also, when working with regular expressions don’t forget to use re.escape.


    import re

    WHITELIST = frozenset("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
    RE_FIND_NON_WHITELISTED = re.compile("[^%s]" % re.escape("".join(WHITELIST)))

    def _escape_char(match, cache=dict()):
        # match - re.MatchObject
        try:
            return cache[match.group(0)]
        except KeyError:
            char = match.group(0)
            escaped = "&#%s;" % str(ord(char))
            cache[char] = escaped
            return escaped

    def escape_html(s, escape_char_callback=_escape_char):
        return RE_FIND_NON_WHITELISTED.sub(escape_char_callback, s)

    • http://en.ig.ma/ Filip Wasilewski

      And another “non-greedy” one that is efficient in escaping special html and non-ascii chars:

      import cgi

      def escape_html(s, extra_chars="="):
          escaped = cgi.escape(s, quote=True)
          for char in extra_chars:
              escaped = escaped.replace(char, "&#%s;" % str(ord(char)))
          return escaped.encode('ascii', 'xmlcharrefreplace')

  • Fijal

    Hey

    Do you feel like providing benchmark inputs? Otherwise it’s impossible to reproduce

  • http://tech.oyster.com/ Ben Hoyt

    I’m curious why you need to support UTF-7?

    Almost all unicode characters are valid in UTF-8 HTML except & and the ' " quotes in attribute values, whereas your whitelist will escape any non-alphanumeric chars that could be passed through directly as UTF-8 chars, making the output bigger than it needs to be.

    We use web.py’s htmlquote() function for this purpose (see source code). It’s fast, readable, and works for attribute values as well (because it escapes single and double quotes).

    We’ve found that str.replace() is very fast, much much faster than looping through chars in Python, and even several times faster than re.sub(), which is why it’s used here to good effect.

    • Andrew Dalke

      When the charset isn’t set, browsers often sniff for the encoding. One possible encoding is UTF-7. You might think the solution is to always declare the encoding, but suppose that page is contained in an iframe with content='text/html;charset=utf-7'. A malicious user can put a UTF-7 encoded comment on your site, wrap it in an iframe, and end up with an XSS attack. Thus the comment “the possibility for confusing the browser into believing your page is UTF-7”.

      • Xavier Combelle

        I tried to set content='text/html;charset=iso8859-1' on a page which is served with charset=utf-8, and it is rendered as utf-8 in Firefox. So if the encoding is declared, the browser behaves correctly whatever the iframe trick.

  • Dude

    Careful now, you’ll awake the Balrog.

  • http://en.ig.ma/ Filip Wasilewski

    Line 8 of htmlB should be (note the parenthesis position):

        or CACHE_HTML_ESCAPES.setdefault(m.group(1), "&#" + str(ord(m.group(1))) + ";"),

    instead of:

        or CACHE_HTML_ESCAPES.setdefault(m.group(1), "&#" + str(ord(m.group(1)))) + ";",

    Or even better use string interpolation. It won’t have any significant influence on performance and will improve readability.

  • http://twitter.com/gbin gbin

    I gained 20% by not mixing unicode/str concatenations. For example, for the html7 implementation:

        WHITELIST2_UNI = set(u'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')

        def htmlGB(s):
            lstr = unicode; lord = ord; lWHITELIST2 = WHITELIST2_UNI
            return u"".join([
                c if c in lWHITELIST2
                else u'&#' + lstr(lord(c)) + u';'
                for c in s])

    And another 50% compiling it in cython like this:

        def cpython_encode(unicode s):
            return u"".join([
                c if c in WHITELIST
                else u'&#' + unicode(ord(c)) + u';'
                for c in s])

    Note: tried cStringIO, it is 100% slower.

    • http://twitter.com/gbin gbin

      Ha for those interested, those implementations run on cpython, cython and pypy…. cython ruled them all.

      ⚫ gbin@sal escaping % python test.py            

      html7:  2.46649909019

      htmlGB:  2.12551879883

      ⚫ gbin@sal escaping % pypy-c1.5 test.py         
      html7:  2.38735985756
      htmlGB:  1.79328203201

      ⚫ gbin@sal escaping % python test_cython.py
      htmlcython:  1.18395209312

  • patrick mccormick

    Great post, well written. How unique is the input? Would adding a memoizing decorator help anything?

    You optimized the functions for single inputs, not over a range of inputs…which could lead to more optimization opportunities…but great writeup!

  • http://www.codekoala.com/ Josh

    Thanks for sharing this.  I enjoy reading about the experiences others have while optimizing code.

  • Anonymous

    I think the general sentiment that “python is slow” makes people expect that their python programs will be slow and not bother to do anything about it. Hopefully this post will encourage people to take a look at their own code and see how they can make use of more efficient algorithms and data structures.

  • Pingback: Câu chuyện tối ưu hóa đoạn mã Python — Python cho người Việt

  • http://twitter.com/gbin gbin

    I posted the full details about my findings in the comment below with source to reproduce the results.

    http://klaig.blogspot.com/2011/10/python-optimizations-exercise.html

    The specialized cython code is really fast in those kinds of micro-optimizations.
    We can probably do better; if you have an idea, feel free to send me your code and I’ll integrate it into the graph.

  • Ray

    + operator string concatenations are slow. Take a look at http://www.skymind.com/~ocrow/python_string/

    In html9:

        CACHE_HTML_ESCAPES.setdefault(c, "&#" + str(ord(c)) + ";")

    becomes:

        CACHE_HTML_ESCAPES.setdefault(c, "".join(["&#", str(ord(c)), ";"]))

    You may also try using backticks (`) instead of str, since ord returns an integer value. Although read here before doing so: http://stackoverflow.com/questions/1673071/what-do-backticks-mean-to-the-python-interpreter-num

        CACHE_HTML_ESCAPES.setdefault(c, "".join(["&#", str(ord(c)), ";"]))

    becomes:

        CACHE_HTML_ESCAPES.setdefault(c, "".join(["&#", `ord(c)`, ";"]))

  • Oelewapperke

    Have you tried http://www.rfk.id.au/blog/entry/compiling-rpython-programs/ ?

  • http://twitter.com/ScarfaceDeb Andrew Volozhanin

    great article!

  • ke

    Great article. Thanks for your writing.

  • Pingback: Links for October 25th through November 4th | michael-mccracken.net

  • German Kraut

    Build a sophisticated Machine that transforms Coffee into Code and Software


  • Pingback: A Smattering of Selenium #70 « Official Selenium Blog

  • Robintoriadaniels

    I’m trying to upload a Quicktime to a shared dropbox. I’m sharing the box with someone that has a box of 50 GB; the Quicktime’s size is 815.3 MB.

    I continue to get an error message that states ERROR 500.

    Do you know what that means?

    • http://www.absolutefiction.com/ Jed Tylman

      A server-side timeout maybe?



  • http://www.luxundpartner.at/ Dominik Lux

    Great post! Thanks for the conclusion

    Dominik

  • Pingback: Siempre hay sitio para la optimización « Mbpfernand0's Blog

  • http://hosthamster.eu/ Host Hamster

    Great work I have been trying to optimize my site so I know how hard it can be 

  • Pingback: xion.log » Python Optimization 101

  • Jacob Block

    Interesting read. Per your final comments, I’m not sure how an entire C implementation is less maintainable than some of the python functions you wrote, but in general sure ;)

  • steve2014

    Interesting to see the relationship between speed and readability when optimizing. I personally would have stopped at about your 7th version.