Performance from Day 1

This is an old post and doesn't necessarily reflect my current thinking on a topic, and some links or images may not work. The text is preserved here for posterity.

Performance is important. You can meet all of the requirements, and be completely bug free, but if a page takes 20 seconds to render, the customer won't be happy. As Jeff Atwood wrote, speed still matters. Performance is the functional requirement that every customer forgets to mention, and every developer forgets to ask about. Customers generally just assume the performance will be adequate.

Performance is so important that I suggest we change the standard user story template to:

As a <user> I want to <action> so that <goal> within <performance expectation>

When discussing performance, this quote is often tossed around:

premature optimization is the root of all evil

The full quote, however, is (emphasis mine):

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." - Donald Knuth

Small efficiencies in an algorithm are one thing, but often we go to the other extreme: we make architectural choices that make decent performance impossible.

We introduce unnecessary layers, for the sake of architectural purity, which often have to be torn down to get halfway decent performance at the end of the project
We don't look out for stupid bugs that lead to common problems like SELECT N+1 and memory leaks
We don't make provision for the simplest performance improvements, like caching, in our frameworks, so they have to be scattered throughout the code
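To make the SELECT N+1 problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The authors/posts schema and every function name are invented for illustration; the point is only that the same data can cost one query or N+1 queries depending on how the data access is written, and that the difference is easy to count.

```python
import sqlite3


def make_db():
    """Build a small in-memory database (hypothetical schema: 5 authors, 10 posts)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO authors VALUES (?, ?)",
        [(i, f"author{i}") for i in range(1, 6)],
    )
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [(i, (i % 5) + 1, f"post{i}") for i in range(1, 11)],
    )
    conn.commit()
    return conn


def count_selects(conn, work):
    """Run work(conn) and return (result, number of SELECT statements issued)."""
    statements = []
    conn.set_trace_callback(statements.append)
    try:
        result = work(conn)
    finally:
        conn.set_trace_callback(None)
    selects = [s for s in statements if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)


def titles_n_plus_one(conn):
    """The bug: one query for the list, then one more query per row."""
    posts = conn.execute(
        "SELECT id, author_id, title FROM posts ORDER BY id"
    ).fetchall()
    rows = []
    for _post_id, author_id, title in posts:
        name = conn.execute(
            "SELECT name FROM authors WHERE id = ?", (author_id,)
        ).fetchone()[0]
        rows.append((title, name))
    return rows


def titles_joined(conn):
    """The fix: fetch the same data with a single JOIN."""
    return conn.execute(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    for work in (titles_n_plus_one, titles_joined):
        _, selects = count_selects(conn, work)
        print(work.__name__, "issued", selects, "SELECT statements")
```

With 10 posts, the first version issues 11 SELECT statements and the second issues 1. On a real page with a few hundred rows, that gap is usually the whole performance problem.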

Most projects end up having a sprint that is devoted wholly to fixing performance in the application. That's not "tweaking 10% extra". It's "making the home page render without 72 SQL queries". It's a sad fact that most performance problems aren't fixed by rocket science micro-optimization, but by undoing dumb architecture decisions.
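On the caching point above, "making provision in the framework" can be as small as one shared helper that every expensive lookup opts into, rather than ad hoc dictionaries scattered through the code. The following is a rough sketch only: the `cached` decorator, its `ttl_seconds` and `clock` parameters, and `load_home_page_products` are all hypothetical names, and a real application would more likely lean on whatever cache its platform already provides.

```python
import functools
import time


def cached(ttl_seconds, clock=time.monotonic):
    """Cache a function's results per argument tuple for ttl_seconds.

    The clock parameter is an assumption added so expiry can be tested
    without sleeping. Arguments must be hashable.
    """

    def decorator(func):
        store = {}  # maps args -> (time stored, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now, value)
            return value

        wrapper.cache_clear = store.clear
        return wrapper

    return decorator


calls = []  # records how often the "expensive" work really runs


@cached(ttl_seconds=60)
def load_home_page_products(category):
    """Stand-in for an expensive SQL query (hypothetical)."""
    calls.append(category)
    return [f"{category}-product-{i}" for i in range(3)]
```

Calling `load_home_page_products("toys")` twice within a minute runs the underlying work once. Because the policy lives in one place, changing the expiry or swapping in a distributed cache later does not mean hunting through the whole code base.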

A simple performance check-list

During the "sprint 0" backlog building stage, there are a few simple questions we should ask the customer:

How many concurrent users should they expect to serve?
Will there be periods of major increase in demand (e.g., Christmas sales)?
What is their maximum acceptable response time? (Usually this should be no more than a few seconds.)
What costs are they prepared to bear for servers, bandwidth, etc., so we can keep an eye on them?

At the end of each sprint, as a bare minimum, we should:

Measure the number of SQL queries that are issued as we browse the most common pages
Measure the number of network requests (browser->web->app server) necessary to serve a single request
Keep an eye on our memory and CPU usage, and watch how they change as more users are added
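The measurements above are most useful when compared sprint to sprint. One way to do that, sketched here with invented page names and numbers, is to write the counts down after each sprint and flag anything that grew. The `find_regressions` function and the metric names are hypothetical; the counts themselves would come from whatever profiler you already use.

```python
def find_regressions(previous, current, tolerance=0):
    """Return metrics that grew by more than tolerance since the last sprint.

    Both arguments map a page to a dict of metric name -> count.
    Pages or metrics with no earlier baseline are not flagged.
    """
    flagged = {}
    for page, metrics in current.items():
        before = previous.get(page, {})
        for name, value in metrics.items():
            old = before.get(name)
            if old is not None and value > old + tolerance:
                flagged.setdefault(page, {})[name] = (old, value)
    return flagged


# Made-up numbers, recorded by hand at the end of two sprints.
last_sprint = {
    "/": {"sql_queries": 4, "network_requests": 2},
    "/products": {"sql_queries": 6, "network_requests": 2},
}
this_sprint = {
    "/": {"sql_queries": 19, "network_requests": 2},
    "/products": {"sql_queries": 6, "network_requests": 3},
    "/checkout": {"sql_queries": 8, "network_requests": 2},
}

if __name__ == "__main__":
    for page, metrics in find_regressions(last_sprint, this_sprint).items():
        for name, (old, new) in metrics.items():
            print(f"{page}: {name} went from {old} to {new}")
```

Here the home page jumping from 4 to 19 queries is flagged straight away, while the new checkout page simply becomes next sprint's baseline.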

These steps won't uncover every potential performance problem, but they take about 10 minutes at the end of a sprint, and they will uncover the most basic performance problems caused by the architecture at the best time to fix them. Windows Performance and Reliability Monitor, SQL Profiler, NHibernate Profiler and the "network" tab of your favorite browser's debugging tools are all you really need.
