How good is good enough? Defining “done” in real-world software
Software teams love “best practice”. Engineers present immaculate architectures. Testers advocate exhaustive coverage. Leaders talk about world-class quality. Everyone nods, and then reality arrives: a date, a budget, legacy code, unclear requirements, and a stack of competing priorities.
The result is often a gap between what we say we value and what we actually ship. That gap creates tension, rework, and, worst of all, endless debates where “quality” becomes vague, moralistic, and unmeasurable.
This post is about closing that gap. Not by lowering standards, but by agreeing what “good enough” means for your organisation, in a way that is practical, explicit, and enforceable without turning into performative theatre.
“Good enough” is not “low quality”
“Good enough” should never mean careless engineering. It should mean:
- Sufficient quality for the risk and context
- Clear acceptance criteria for what we will and won’t do
- A repeatable, organisational standard that reduces debate
A team with no shared definition of “done” tends to oscillate between two failure modes:
- Gold-plating: trying to do everything (full unit coverage, full automation, extensive load tests, perfect observability, pristine architecture) for every change, until progress slows to a crawl.
- Chaos shipping: cutting corners inconsistently, accumulating risk and technical debt until the next incident forces a reactive clean-up.
The goal is a stable middle: deliberate trade-offs with eyes open.
Why best-practice talk becomes detached from reality
Best practices are often presented as universal truths, but they’re conditional. They assume a certain maturity, product stage, traffic profile, team size, and risk tolerance.
In many organisations, “best practice” becomes:
- A proxy for personal preference (“I like this approach”)
- A shield against accountability (“We should do it properly”)
- A way to win arguments (quality as a moral high ground)
- A leadership comfort blanket (“We’ve said the right things”)
Meanwhile, the delivery system is constrained. The work still has to fit inside time, people, and money. If you pretend constraints don’t exist, you don’t get higher quality; you get hidden compromises and fragile software.
Start with one idea: quality is a risk management function
Instead of asking “What’s the best way to build this?”, ask:
What risks are we trying to reduce, and what is the cheapest reliable way to reduce them?
Most “quality” activities are risk controls:
- Unit tests reduce regression risk in logic
- Integration tests reduce contract risk between systems
- Automation reduces human error and release friction
- Load testing reduces performance/capacity risk
- Observability reduces diagnosis and recovery risk
- Code review reduces correctness and maintainability risk
If you can’t name the risk, you can’t sensibly choose the control.
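One way to make that discipline concrete is to keep the risk-to-control mapping explicit, as data a team can query and argue over. The sketch below is illustrative only: the risk names and the Python structure are assumptions, not a prescribed tool.

```python
# Illustrative sketch: an explicit mapping from named risks to the
# controls that reduce them, mirroring the list above. The risk
# vocabulary is hypothetical; substitute your organisation's own.
RISK_CONTROLS = {
    "regression in logic": ["unit tests"],
    "contract drift between systems": ["integration tests", "contract tests"],
    "human error / release friction": ["automation"],
    "performance / capacity": ["load testing"],
    "slow diagnosis and recovery": ["observability"],
    "correctness / maintainability": ["code review"],
}

def controls_for(risk: str) -> list[str]:
    """Return the agreed controls for a named risk. An unnamed risk is
    an error: if you can't name it, you can't choose a control for it."""
    if risk not in RISK_CONTROLS:
        raise ValueError(f"Unnamed risk: {risk!r} - name it before choosing a control")
    return RISK_CONTROLS[risk]
```

The point of the lookup failing loudly is the same as the sentence above: a control you can’t trace back to a named risk is ritual, not risk management.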
The house standard: a shared definition of “done”
A “house standard” is a minimum, agreed baseline that teams apply consistently. It is not a manifesto. It is a working agreement designed to:
- Reduce argument-by-opinion
- Make trade-offs explicit
- Protect customers and the business
- Protect engineers from impossible expectations
- Prevent quality work being perpetually optional
Think of it like building regulations rather than interior design trends.
What the house standard should look like
A good house standard is:
- Tiered (not one-size-fits-all)
- Measurable (not fluffy)
- Lightweight (easy to follow)
- Enforceable (via tooling and process)
- Owned by the organisation (not an individual)
A simple structure that works well is a “Quality Bar” with three levels.
Level 1: Routine change (low risk)
Examples: small UI change, copy change, non-critical refactor, internal tooling.
Minimum bar:
- Code review by one peer
- Lint/format/static analysis clean
- Targeted unit tests for non-trivial logic (where it matters)
- Basic telemetry: errors captured, key actions logged (if relevant)
- Release notes / change summary (brief)
Level 2: Customer-impacting change (medium risk)
Examples: new feature, billing flow update, permission changes, new integration.
Minimum bar:
- Two-person review or one reviewer + explicit checklist
- Unit tests for core logic paths
- At least one integration test or contract test where integration risk is high
- Feature flag if rollback is non-trivial
- Monitoring for key metrics + alerting for error rates
- Evidence of manual test coverage (scripted or exploratory notes)
Level 3: High-risk change (high impact / high uncertainty)
Examples: auth changes, payments, data migrations, major performance work, public API changes.
Minimum bar:
- Design review (short, written)
- Broader automated coverage (unit + integration + critical e2e where it pays)
- Rollback plan or forward-only migration plan with rehearsed steps
- Load/perf checks where capacity is a real risk
- Observability upgraded: dashboards, alerts, traceability if needed
- On-call/incident readiness: runbook entry updated
The key isn’t the exact content; it’s that the organisation agrees the categories and the minimum controls.
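Once agreed, the tiers can live as data rather than as a wiki page, which makes them checkable by tooling. A minimal sketch, assuming a Python encoding (the control names paraphrase the three levels above; the structure itself is an assumption, not a standard format):

```python
# Illustrative encoding of the three-level Quality Bar. Each level is
# a set of minimum controls; higher levels are supersets of lower ones.
QUALITY_BAR = {
    1: {  # Routine change (low risk)
        "peer review",
        "lint/static analysis clean",
        "targeted unit tests",
    },
    2: {  # Customer-impacting change (medium risk)
        "peer review",
        "lint/static analysis clean",
        "targeted unit tests",
        "integration or contract test",
        "feature flag if rollback is non-trivial",
        "monitoring and alerting",
    },
    3: {  # High-risk change (high impact / high uncertainty)
        "peer review",
        "lint/static analysis clean",
        "targeted unit tests",
        "integration or contract test",
        "feature flag if rollback is non-trivial",
        "monitoring and alerting",
        "written design review",
        "rollback or forward-only migration plan",
        "runbook entry updated",
    },
}

def missing_controls(level: int, done: set[str]) -> set[str]:
    """Controls still outstanding for a change at the given risk level."""
    return QUALITY_BAR[level] - done
```

Making each level a strict superset of the one below keeps the model easy to reason about: raising a change’s risk level can only add obligations, never remove them.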
Avoiding the trap: “Do everything” is not a strategy
When someone says “We should have full unit test coverage and full automation and full observability,” the correct response is:
- For which parts of the system?
- For which risks?
- At what cost?
- What do we stop doing to pay for it?
If you want maturity, you must fund it. Otherwise, “do everything” becomes an expectation without investment, and teams quietly cut corners while pretending not to.
A house standard lets you say: “We will do these things always, those things when risk requires it, and we will explicitly agree exceptions.”
How to get engineers, testers, and leadership aligned
Agreement is less about a perfect framework and more about a shared language.
1) Write down what “quality” means for your business
Not “clean code” or “SOLID”. Business outcomes:
- Prevent customer harm (data loss, security incidents, billing errors)
- Maintain service reliability (uptime, latency, recovery)
- Maintain delivery capability (avoid fragile systems that block change)
- Meet compliance where applicable
2) Define a small set of non-negotiables
These should be few, clear, and defensible. Example non-negotiables:
- No unreviewed changes to production
- No silent failures (errors must be captured)
- Anything touching auth/billing/data migration must meet Level 3 bar
- Every change must have a rollback or mitigation path
3) Make the “why” explicit
If the standard reads like a list of chores, it will be bypassed. Tie each item to a risk it reduces.
4) Build it into your workflow
A house standard that lives in Confluence is a suggestion. Make it real:
- PR templates with checklists by risk level
- CI gates (tests, linting, security scanning where appropriate)
- Release checklist automation
- Feature flag tooling
- Observability defaults (libraries and scaffolding)
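As one sketch of what “build it into your workflow” can mean, here is a toy CI gate that fails a build when a pull request’s risk-level checklist still has unticked items. The label convention (`risk:level-N`) and the markdown checkbox format are assumptions; real setups would wire this to their code host’s API.

```python
# Illustrative CI gate: fail when a PR lacks a risk label or has
# unticked checklist items. Conventions here are hypothetical.
import re

def unchecked_items(pr_body: str) -> list[str]:
    """Return checklist entries of the form '- [ ] item' that are unticked."""
    return re.findall(r"^- \[ \] (.+)$", pr_body, flags=re.MULTILINE)

def gate(pr_body: str, labels: list[str]) -> int:
    """Return 0 if the gate passes, 1 otherwise, printing a short report."""
    if not any(label.startswith("risk:") for label in labels):
        print("No risk level label (risk:level-N); add one before merging.")
        return 1
    missing = unchecked_items(pr_body)
    for item in missing:
        print(f"Unticked checklist item: {item}")
    return 1 if missing else 0
```

A script like this would run as a required CI step, so the house standard blocks the merge button rather than living as a suggestion in Confluence.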
5) Allow exceptions, but require visibility
There will be times you must cut scope. That’s normal. The problem is unspoken exceptions.
Use an explicit mechanism:
- “Exception granted for X, risk accepted by Y, mitigation Z, follow-up ticket created.”
This keeps reality honest without turning into bureaucracy.
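The exception record itself can be a tiny structure rather than prose in a ticket comment. A sketch, with field names taken from the sentence above (the class and the example ticket reference are hypothetical):

```python
# Illustrative exception record: every skipped control leaves a trace
# naming what was skipped, who accepted the risk, and the follow-up.
from dataclasses import dataclass

@dataclass(frozen=True)
class QualityException:
    skipped: str           # the control being skipped ("X")
    accepted_by: str       # who accepts the risk ("Y")
    mitigation: str        # the compensating action ("Z")
    follow_up_ticket: str  # where the deferred work is tracked

    def summary(self) -> str:
        return (f"Exception: skipping {self.skipped}; "
                f"risk accepted by {self.accepted_by}; "
                f"mitigation: {self.mitigation}; "
                f"follow-up: {self.follow_up_ticket}")
```

Because the record is structured, it can be attached to the change itself, counted, and reviewed later, which is what keeps exceptions visible instead of bureaucratic.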
Practical examples of “good enough” decisions
Unit test coverage
“Full coverage” is rarely cost-effective. “Good enough” is:
- High coverage for core domain logic and edge cases
- Lower coverage where tests are brittle (UI, glue code) unless risk demands it
- A focus on preventing the regressions you actually see
End-to-end automation
“Automate everything” tends to produce slow, flaky pipelines. “Good enough” is:
- A small number of stable e2e tests covering critical user journeys
- Heavier automation at unit/integration level where it is cheaper and faster
- Manual exploratory testing used deliberately for uncertain areas
Load testing
Not every service needs sophisticated load tests. “Good enough” is:
- Basic performance checks for key endpoints
- Capacity tests before major launches or large customers onboard
- Load tests targeted at known bottlenecks, not as ritual
Observability
Perfection here is expensive, but zero visibility is worse. “Good enough” is:
- Centralised logs with correlation IDs
- Error tracking with alerting
- A handful of meaningful service-level dashboards
- Tracing where systems are distributed and failures are hard to diagnose
The cultural shift: from virtue signalling to operational discipline
The point of a house standard is to replace vague statements with operational discipline.
Instead of:
- “We should do it properly.”
- “This needs to be best practice.”
- “Quality is important.”
You get:
- “This is a Level 2 change; it needs X, Y, Z.”
- “We’re skipping the integration test; risk accepted by A; mitigation is B.”
- “This is Level 3 because it touches billing; we’ll add a flag and a rollback runbook.”
That eliminates waffle and makes decisions traceable.
Closing thought
“Good enough” isn’t a personal judgement. It’s an organisational agreement.
If you define it clearly, tier it by risk, and embed it into how you build and ship, you get something rare: a team that moves quickly and reliably, without pretending that every change deserves aerospace engineering.
The work isn’t to chase perfection. The work is to build a system where quality is intentional, measurable, and paid for, rather than discussed endlessly and delivered inconsistently.