How good is good enough? Defining “done” in real-world software
Software teams love “best practice”. Engineers present immaculate architectures. Testers advocate exhaustive coverage. Leaders talk about world-class quality. Everyone nods, and then reality arrives: a date, a budget, legacy code, unclear requirements, and a stack of competing priorities.
The result is often a gap between what we say we value and what we actually ship. That gap creates tension, rework, and, worst of all, endless debates where “quality” becomes vague, moralistic, and unmeasurable.
This post is about closing that gap. Not by lowering standards, but by agreeing what “good enough” means for your organisation, in a way that is practical, explicit, and enforceable without turning into performative theatre.
“Good enough” is not “low quality”
“Good enough” should never mean careless engineering. It should mean:
- Sufficient quality for the risk and context
- Clear acceptance criteria for what we will and won’t do
- A repeatable, organisational standard that reduces debate
A team with no shared definition of “done” tends to oscillate between two failure modes:
- Gold-plating: trying to do everything (full unit coverage, full automation, extensive load tests, perfect observability, pristine architecture) for every change, until progress slows to a crawl.
- Chaos shipping: cutting corners inconsistently, accumulating risk and technical debt until the next incident forces a reactive clean-up.
The goal is a stable middle: deliberate trade-offs with eyes open.
Why best-practice talk becomes detached from reality
Best practices are often presented as universal truths, but they’re conditional. They assume a certain maturity, product stage, traffic profile, team size, and risk tolerance.
In many organisations, “best practice” becomes:
- A proxy for personal preference (“I like this approach”)
- A shield against accountability (“We should do it properly”)
- A way to win arguments (quality as a moral high ground)
- A leadership comfort blanket (“We’ve said the right things”)
Meanwhile, the delivery system is constrained. The work still has to fit inside time, people, and money. If you pretend constraints don’t exist, you don’t get higher quality; you get hidden compromises and fragile software.
Start with one idea: quality is a risk management function
Instead of asking “What’s the best way to build this?”, ask:
What risks are we trying to reduce, and what is the cheapest reliable way to reduce them?
Most “quality” activities are risk controls:
- Unit tests reduce regression risk in logic
- Integration tests reduce contract risk between systems
- Automation reduces human error and release friction
- Load testing reduces performance/capacity risk
- Observability reduces diagnosis and recovery risk
- Code review reduces correctness and maintainability risk
If you can’t name the risk, you can’t sensibly choose the control.
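One way to make that discipline concrete is to keep the risk-to-control mapping explicit, as data a team can query and argue over. The sketch below is illustrative only: the risk names and the Python structure are assumptions, not a prescribed tool.

```python
# Illustrative sketch: an explicit mapping from named risks to the
# controls that reduce them, mirroring the list above. The risk
# vocabulary is hypothetical; substitute your organisation's own.
RISK_CONTROLS = {
    "regression in logic": ["unit tests"],
    "contract drift between systems": ["integration tests", "contract tests"],
    "human error / release friction": ["automation"],
    "performance / capacity": ["load testing"],
    "slow diagnosis and recovery": ["observability"],
    "correctness / maintainability": ["code review"],
}

def controls_for(risk: str) -> list[str]:
    """Return the agreed controls for a named risk. An unnamed risk is
    an error: if you can't name it, you can't choose a control for it."""
    if risk not in RISK_CONTROLS:
        raise ValueError(f"Unnamed risk: {risk!r} - name it before choosing a control")
    return RISK_CONTROLS[risk]
```

The point of the lookup failing loudly is the same as the sentence above: a control you can’t trace back to a named risk is ritual, not risk management.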
The house standard: a shared definition of “done”
A “house standard” is a minimum, agreed baseline that teams apply consistently. It is not a manifesto. It is a working agreement designed to:
- Reduce argument-by-opinion
- Make trade-offs explicit
- Protect customers and the business
- Protect engineers from impossible expectations
- Prevent quality work being perpetually optional
Think of it like building regulations rather than interior design trends.
What the house standard should look like
A good house standard is:
- Tiered (not one-size-fits-all)
- Measurable (not fluffy)
- Lightweight (easy to follow)
- Enforceable (via tooling and process)
- Owned by the organisation (not an individual)
A simple structure that works well is a “Quality Bar” with three levels.
Level 1: Routine change (low risk)
Examples: small UI change, copy change, non-critical refactor, internal tooling.
Minimum bar:
- Code review by one peer
- Lint/format/static analysis clean
- Targeted unit tests for non-trivial logic (where it matters)
- Basic telemetry: errors captured, key actions logged (if relevant)
- Release notes / change summary (brief)
Level 2: Customer-impacting change (medium risk)
Examples: new feature, billing flow update, permission changes, new integration.
Minimum bar:
- Two-person review or one reviewer + explicit checklist
- Unit tests for core logic paths
- At least one integration test or contract test where integration risk is high
- Feature flag if rollback is non-trivial
- Monitoring for key metrics + alerting for error rates
- Evidence of manual test coverage (scripted or exploratory notes)
Level 3: High-risk change (high impact / high uncertainty)
Examples: auth changes, payments, data migrations, major performance work, public API changes.
Minimum bar:
- Design review (short, written)
- Broader automated coverage (unit + integration + critical e2e where it pays)
- Rollback plan or forward-only migration plan with rehearsed steps
- Load/perf checks where capacity is a real risk
- Observability upgraded: dashboards, alerts, traceability if needed
- On-call/incident readiness: runbook entry updated
The key isn’t the exact content; it’s that the organisation agrees the categories and the minimum controls.
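Once agreed, the tiers can live as data rather than as a wiki page, which makes them checkable by tooling. A minimal sketch, assuming a Python encoding (the control names paraphrase the three levels above; the structure itself is an assumption, not a standard format):

```python
# Illustrative encoding of the three-level Quality Bar. Each level is
# a set of minimum controls; higher levels are supersets of lower ones.
QUALITY_BAR = {
    1: {  # Routine change (low risk)
        "peer review",
        "lint/static analysis clean",
        "targeted unit tests",
    },
    2: {  # Customer-impacting change (medium risk)
        "peer review",
        "lint/static analysis clean",
        "targeted unit tests",
        "integration or contract test",
        "feature flag if rollback is non-trivial",
        "monitoring and alerting",
    },
    3: {  # High-risk change (high impact / high uncertainty)
        "peer review",
        "lint/static analysis clean",
        "targeted unit tests",
        "integration or contract test",
        "feature flag if rollback is non-trivial",
        "monitoring and alerting",
        "written design review",
        "rollback or forward-only migration plan",
        "runbook entry updated",
    },
}

def missing_controls(level: int, done: set[str]) -> set[str]:
    """Controls still outstanding for a change at the given risk level."""
    return QUALITY_BAR[level] - done
```

Making each level a strict superset of the one below keeps the model easy to reason about: raising a change’s risk level can only add obligations, never remove them.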
Avoiding the trap: “Do everything” is not a strategy
When someone says “We should have full unit test coverage and full automation and full observability,” the correct response is:
- For which parts of the system?
- For which risks?
- At what cost?
- What do we stop doing to pay for it?
If you want maturity, you must fund it. Otherwise, “do everything” becomes an expectation without investment, and teams quietly cut corners while pretending not to.
A house standard lets you say: “We will do these things always, those things when risk requires it, and we will explicitly agree exceptions.”
How to get engineers, testers, and leadership aligned
Agreement is less about a perfect framework and more about a shared language.
1) Write down what “quality” means for your business
Not “clean code” or “SOLID”. Business outcomes:
- Prevent customer harm (data loss, security incidents, billing errors)
- Maintain service reliability (uptime, latency, recovery)
- Maintain delivery capability (avoid fragile systems that block change)
- Meet compliance where applicable
2) Define a small set of non-negotiables
These should be few, clear, and defensible. Example non-negotiables:
- No unreviewed changes to production
- No silent failures (errors must be captured)
- Anything touching auth/billing/data migration must meet Level 3 bar
- Every change must have a rollback or mitigation path
3) Make the “why” explicit
If the standard reads like a list of chores, it will be bypassed. Tie each item to a risk it reduces.
4) Build it into your workflow
A house standard that lives in Confluence is a suggestion. Make it real:
- PR templates with checklists by risk level
- CI gates (tests, linting, security scanning where appropriate)
- Release checklist automation
- Feature flag tooling
- Observability defaults (libraries and scaffolding)
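As one sketch of what “build it into your workflow” can mean, here is a toy CI gate that fails a build when a pull request’s risk-level checklist still has unticked items. The label convention (`risk:level-N`) and the markdown checkbox format are assumptions; real setups would wire this to their code host’s API.

```python
# Illustrative CI gate: fail when a PR lacks a risk label or has
# unticked checklist items. Conventions here are hypothetical.
import re

def unchecked_items(pr_body: str) -> list[str]:
    """Return checklist entries of the form '- [ ] item' that are unticked."""
    return re.findall(r"^- \[ \] (.+)$", pr_body, flags=re.MULTILINE)

def gate(pr_body: str, labels: list[str]) -> int:
    """Return 0 if the gate passes, 1 otherwise, printing a short report."""
    if not any(label.startswith("risk:") for label in labels):
        print("No risk level label (risk:level-N); add one before merging.")
        return 1
    missing = unchecked_items(pr_body)
    for item in missing:
        print(f"Unticked checklist item: {item}")
    return 1 if missing else 0
```

A script like this would run as a required CI step, so the house standard blocks the merge button rather than living as a suggestion in Confluence.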
5) Allow exceptions, but require visibility
There will be times you must cut scope. That’s normal. The problem is unspoken exceptions.
Use an explicit mechanism:
- “Exception granted for X, risk accepted by Y, mitigation Z, follow-up ticket created.”
This keeps reality honest without turning into bureaucracy.
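The exception record itself can be a tiny structure rather than prose in a ticket comment. A sketch, with field names taken from the sentence above (the class and the example ticket reference are hypothetical):

```python
# Illustrative exception record: every skipped control leaves a trace
# naming what was skipped, who accepted the risk, and the follow-up.
from dataclasses import dataclass

@dataclass(frozen=True)
class QualityException:
    skipped: str           # the control being skipped ("X")
    accepted_by: str       # who accepts the risk ("Y")
    mitigation: str        # the compensating action ("Z")
    follow_up_ticket: str  # where the deferred work is tracked

    def summary(self) -> str:
        return (f"Exception: skipping {self.skipped}; "
                f"risk accepted by {self.accepted_by}; "
                f"mitigation: {self.mitigation}; "
                f"follow-up: {self.follow_up_ticket}")
```

Because the record is structured, it can be attached to the change itself, counted, and reviewed later, which is what keeps exceptions visible instead of bureaucratic.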
Practical examples of “good enough” decisions
Unit test coverage
“Full coverage” is rarely cost-effective. “Good enough” is:
- High coverage for core domain logic and edge cases
- Lower coverage where tests are brittle (UI, glue code) unless risk demands it
- A focus on preventing the regressions you actually see
End-to-end automation
“Automate everything” tends to produce slow, flaky pipelines. “Good enough” is:
- A small number of stable e2e tests covering critical user journeys
- Heavier automation at unit/integration level where it is cheaper and faster
- Manual exploratory testing used deliberately for uncertain areas
Load testing
Not every service needs sophisticated load tests. “Good enough” is:
- Basic performance checks for key endpoints
- Capacity tests before major launches or large customers onboard
- Load tests targeted at known bottlenecks, not as ritual
Observability
Perfection here is expensive, but zero visibility is worse. “Good enough” is:
- Centralised logs with correlation IDs
- Error tracking with alerting
- A handful of meaningful service-level dashboards
- Tracing where systems are distributed and failures are hard to diagnose
The cultural shift: from virtue signalling to operational discipline
The point of a house standard is to replace vague statements with operational discipline.
Instead of:
- “We should do it properly.”
- “This needs to be best practice.”
- “Quality is important.”
You get:
- “This is a Level 2 change; it needs X, Y, Z.”
- “We’re skipping the integration test; risk accepted by A; mitigation is B.”
- “This is Level 3 because it touches billing; we’ll add a flag and a rollback runbook.”
That eliminates waffle and makes decisions traceable.
Closing thought
“Good enough” isn’t a personal judgement. It’s an organisational agreement.
If you define it clearly, tier it by risk, and embed it into how you build and ship, you get something rare: a team that moves quickly and reliably, without pretending that every change deserves aerospace engineering.
The work isn’t to chase perfection. The work is to build a system where quality is intentional, measurable, and paid for, rather than discussed endlessly and delivered inconsistently.