When "safe by default", isn't.

This is an old post and doesn't necessarily reflect my current thinking on a topic, and some links or images may not work. The text is preserved here for posterity.

We use RavenDB in Octopus, and one of the features Raven promotes is the idea of Safe by Default, which actually caused us a big, unsafe problem.

Safe by default

The feature in question is "Unbounded result set protection". In short, it looks like this. Say you have 10,000,000 documents of type Foo, and you do this:

 session.Query<Foo>().ToList()

The most you'll ever get back is 128 records (embedded mode), or 1024 records for an externally hosted server.

In fact, even if you do this:

 session.Query<Foo>().Take(1000000).ToList()

In embedded mode you'll still just get 128 records.

This "feature" has hit us plenty of times in Octopus in the past - we didn't expect people to have a lot of projects, so when someone added their 129th project, you can guess what the bug report was. But overall, not too bad, and a good bug to have if you charge for support (sadly, we don't :-)).

I always considered it an annoyance, an opinionated design that goes a little too far, until today.

In Octopus we have a built-in NuGet server, and users can push NuGet packages to it. To decide whether to delete a package, we need to work out which packages are used by a given release - if the package isn't used, it is safe for deletion.

Our query was:

 var releasedVersions = await session.Query<ReleasedPackageVersionsIndex.Result, ReleasedPackageVersionsIndex>()
   .ProjectFromIndexFieldsInto<ReleasedPackageVersionsIndex.Result>()
   .WaitForNonStaleResults()
   .ToListAsync();

Today I logged in to our demo server to find the packages had all been deleted; you can spot the bug.

We've been working with Raven for many years now and even still, bugs like this creep in - it's just too easy to call ToList() without considering what might happen. Perhaps it's our own fault, or perhaps this 'safe by default' opinion goes just a little too far. In this case, it would be safer to run out of memory or crash than to delete files.