Performance from Day 1
Performance is important. You can meet all of the requirements, and be completely bug free, but if a page takes 20 seconds to render, the customer won't be happy. As Jeff Atwood wrote, speed still matters. Performance is the functional requirement that every customer forgets to mention, and every developer forgets to ask about. Customers generally just assume the performance will be adequate.
Performance is so important that I suggest we change the standard user story template to:
As a <user> I want to <action> so that <goal> within <performance expectation>
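One way to make the "within <performance expectation>" clause testable is to wrap the action in a timing check. This is a minimal sketch, not a prescribed tool: the helper `run_within`, the sample action `search_catalogue` and the 2 second limit are all hypothetical names and values chosen for illustration.

```python
import time


def run_within(limit_seconds, action, *args, **kwargs):
    """Run action and fail if it takes longer than limit_seconds.

    Hypothetical helper: the limit comes from the user story's
    performance expectation.
    """
    start = time.perf_counter()
    result = action(*args, **kwargs)
    elapsed = time.perf_counter() - start
    if elapsed > limit_seconds:
        raise AssertionError(
            f"{action.__name__} took {elapsed:.3f}s, "
            f"limit is {limit_seconds:.3f}s"
        )
    return result


def search_catalogue(term):
    """Stand-in for the real action described by the user story."""
    catalogue = ["red shoes", "blue shoes", "green hat"]
    return [item for item in catalogue if term in item]


# "As a shopper I want to search the catalogue so that I can find
# a product within 2 seconds" (illustrative story and limit).
matches = run_within(2.0, search_catalogue, "shoes")
```

A check like this measures only wall-clock time on the machine that runs it, so it is best read as a tripwire for gross regressions rather than as a benchmark.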
When discussing performance, this quote is often tossed around:
premature optimization is the root of all evil
The full quote, however, is:
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." - Donald Knuth
Small efficiencies in an algorithm are one thing, but often we go to the other extreme: we make architectural choices that make decent performance impossible.
- We introduce unnecessary layers, for the sake of architectural purity, which often have to be torn down to get halfway decent performance at the end of the project
- We don't look out for stupid bugs that lead to common problems like SELECT N+1 and memory leaks
- We don't make provision for the simplest performance improvements, like caching, in our frameworks, so they have to be scattered throughout the code
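To illustrate the last point, caching can be provided in one place and applied declaratively, instead of being hand-written at every call site. The sketch below assumes a simple in-process cache with a time-to-live. The decorator `cached`, the function `load_product` and the 60 second lifetime are hypothetical, and a real project would more likely lean on whatever cache its framework offers.

```python
import functools
import time


def cached(ttl_seconds, clock=time.monotonic):
    """Cache results per argument tuple for ttl_seconds (sketch only)."""
    def decorator(func):
        store = {}  # maps args -> (timestamp, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now, value)
            return value

        wrapper.cache_clear = store.clear
        return wrapper
    return decorator


calls = []  # records every real (uncached) load, for demonstration


@cached(ttl_seconds=60)
def load_product(product_id):
    """Stand-in for an expensive database or service call."""
    calls.append(product_id)
    return {"id": product_id, "name": f"product-{product_id}"}


first = load_product(1)
second = load_product(1)  # served from the cache, no second load
```

Because the policy lives in the decorator, changing the lifetime or swapping the storage later touches one function rather than every caller.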
Most projects end up having a sprint that is devoted wholly to fixing performance in the application. That's not "tweaking 10% extra". It's "making the home page render without 72 SQL queries". It's a sad fact that most performance problems aren't fixed by rocket science micro-optimization, but by undoing dumb architecture decisions.
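The SELECT N+1 problem is easy to reproduce in a few lines. The sketch below uses Python's built-in sqlite3 module with a made-up schema (`customers`, `orders`) and a hypothetical `CountingConnection` wrapper that counts statements. It compares a loop that issues one query per customer with a single joined query that returns the same data.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts every statement passed to execute()."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params)


def build_db():
    """Create an in-memory database with 10 customers, one order each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders "
        "(id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 11)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, i, 10.0 * i) for i in range(1, 11)],
    )
    conn.commit()
    return conn


def totals_n_plus_one(db):
    """One query for the customers, then one more query per customer."""
    result = {}
    for cid, name in db.execute("SELECT id, name FROM customers").fetchall():
        row = db.execute(
            "SELECT SUM(total) FROM orders WHERE customer_id = ?", (cid,)
        ).fetchone()
        result[name] = row[0]
    return result


def totals_single_query(db):
    """The same data fetched with a single joined, grouped query."""
    rows = db.execute(
        "SELECT c.name, SUM(o.total) FROM customers c "
        "JOIN orders o ON o.customer_id = c.id GROUP BY c.id, c.name"
    ).fetchall()
    return dict(rows)


slow_db = CountingConnection(build_db())
fast_db = CountingConnection(build_db())
slow_totals = totals_n_plus_one(slow_db)
fast_totals = totals_single_query(fast_db)
```

With 10 customers the first version issues 11 statements and the second issues 1, for identical results. The gap grows linearly with the number of rows, which is how a home page ends up needing dozens of queries.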
A simple performance check-list
During the "sprint 0" backlog building stage, there are a few simple questions we should ask the customer:
- How many concurrent users should they expect to serve?
- Will there be periods of major increase in demand (e.g., Christmas sales)?
- What is their maximum acceptable response time? (Usually this should be no more than a few seconds.)
- What costs are they prepared to bear for servers, bandwidth, etc., so we can keep an eye on them?
At the end of each sprint, as a bare minimum, we should:
- Measure the number of SQL queries that are issued as we browse the most common pages
- Measure the number of network requests (browser->web->app server) necessary to serve a single request
- Keep an eye on our memory and CPU usage, and watch how they change as more users are added
These steps won't uncover every potential performance problem, but they take about 10 minutes to do at the end of a sprint, and will uncover the most basic performance problems caused by the architecture, at the best time to fix them. Windows Performance and Reliability Monitor, SQL Profiler, NHibernate Profiler and the "network" tab of your favorite browser's debugging tools are all you really need.
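The numbers gathered in the end-of-sprint check can also be compared against an agreed budget, so that a regression stands out at a glance. The following is only a sketch: the `PageMeasurement` and `Budget` types, the page names and every figure in it are hypothetical, and the measurements themselves would still be read off the profiling tools by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageMeasurement:
    """Counts noted down while browsing one page (entered by hand)."""
    page: str
    sql_queries: int
    network_requests: int


@dataclass(frozen=True)
class Budget:
    """Per-page limits agreed with the team (illustrative values)."""
    max_sql_queries: int
    max_network_requests: int


def over_budget(measurements, budget):
    """Return a readable line for every measurement above its limit."""
    problems = []
    for m in measurements:
        if m.sql_queries > budget.max_sql_queries:
            problems.append(
                f"{m.page}: {m.sql_queries} SQL queries "
                f"(budget {budget.max_sql_queries})"
            )
        if m.network_requests > budget.max_network_requests:
            problems.append(
                f"{m.page}: {m.network_requests} network requests "
                f"(budget {budget.max_network_requests})"
            )
    return problems


# Made-up figures for two common pages.
sprint_measurements = [
    PageMeasurement("home", sql_queries=72, network_requests=3),
    PageMeasurement("product", sql_queries=4, network_requests=2),
]
report = over_budget(sprint_measurements, Budget(10, 5))
```

Keeping one such record per sprint makes it easy to see which sprint introduced a jump, which is usually when the cause is cheapest to undo.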