Productivity over Presence
The case against 996 is not that it is unkind. It is that it does not work, and we have the evidence to say so.
You are being told two things at once, by roughly the same people, in roughly the same breath.
The first is that you should burn as many tokens as you can. Automate the work. Put a model on it. Let the machine carry the repetitive load so the human can do something harder. The second is that you should sit at a desk from nine in the morning until nine at night, six days a week. The first argument says human time is precious and should be spent only where it counts. The second says human time is cheap and the answer is simply to extract more of it. Held together, they do not cohere. One of them is wrong, and it is not the first.
The schedule has a name now. 996: nine to nine, six days, seventy-two hours a week. It came out of Chinese tech, where it was common at companies such as Alibaba and Huawei, and where the Supreme People’s Court declared it illegal in 2021. It has since reappeared in Silicon Valley and is starting to surface in London and Berlin. The founder of one New York startup posted on LinkedIn that he tells every candidate the company works six twelve-hour days, and that “obsession isn’t just accepted, it’s required.” The founder of Zepto offers a tidy summary of the worldview: “I have nothing against work-life balance. In fact, I recommend it to all our competitors.”
I want to be precise about my objection, because the obvious one is not mine. My objection is not that 996 is cruel, though it is. I am not making the case that you should be decent to your people because decency is its own reward, and I am certainly not claiming the high ground on the grounds that I am nicer. That argument convinces no one who is not already convinced, and it concedes the only thing that matters: whether the thing works. So set kindness aside entirely. The case against 996 is that it is not supported by evidence, that it is contradicted by a good deal of it, and that the people selling it are not reasoning at all. They are signalling. The hours are the message. The message is addressed to an audience, and the audience is not the work.
What the evidence actually says
Start with the cleanest result we have. In 2015 the Stanford economist John Pencavel reanalysed productivity data from British munitions workers in the First World War: a rare dataset where output is countable, the units are real, and the hours are recorded. The relationship he found is non-linear. Below a threshold, output rises in proportion to hours. Above it, output rises more and more slowly, until the curve goes flat. Past roughly fifty-five to fifty-six hours a week, additional hours bought almost nothing. Workers putting in seventy hours produced no more than those putting in fifty-six. The last fourteen hours were, in the only sense that matters to an employer, labour that produced nothing.
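The shape of that curve is easier to hold in mind with numbers attached. What follows is a toy sketch, not Pencavel's fitted model: the function name, the output rate, and the lower threshold of forty-nine hours are all assumptions chosen for illustration. Only the flat region from fifty-six hours onward comes from the figures quoted above.

```python
# Illustrative only: a piecewise curve with the shape described in the text.
# This is NOT Pencavel's fitted model; every constant here is an assumption
# except the plateau at roughly 56 hours, which is the figure quoted above.
LINEAR_UNTIL = 49.0  # hours/week up to which output is proportional (assumed)
PLATEAU_FROM = 56.0  # hours/week beyond which extra hours add nothing
RATE = 1.0           # output units per hour in the proportional region (assumed)


def weekly_output(hours: float) -> float:
    """Return weekly output for a given number of hours under the toy curve."""
    if hours <= LINEAR_UNTIL:
        # Proportional region: each hour buys the same amount of output.
        return RATE * hours
    # Diminishing region: the marginal rate falls linearly from RATE to zero
    # between the two thresholds, then stays at zero.
    span = PLATEAU_FROM - LINEAR_UNTIL
    extra = min(hours, PLATEAU_FROM) - LINEAR_UNTIL
    return RATE * LINEAR_UNTIL + RATE * (extra - extra ** 2 / (2 * span))


for h in (40, 49, 56, 70):
    print(h, weekly_output(h))
```

Under these assumed parameters, weekly_output(56) and weekly_output(70) return the same value: the fourteen additional hours cost fourteen hours and add nothing. Changing the assumed constants moves the curve but not the argument, so long as the plateau exists.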
The immediate objection is fair, and you should make it before I do: this is munitions work. It is physical, repetitive, and the output is shells, not software. Why should it transfer to engineering, or design, or anything where the work is cognitive?
Because the cognitive evidence is worse, not better, for the long-hours case.
The foundational study is Ericsson, Krampe and Tesch-Römer, published in Psychological Review in 1993 — the work that, badly summarised by other people, became the “ten thousand hours” idea. The part that did not survive popularisation is the part that matters here. Studying elite performers across unrelated domains — violinists, chess players, writers — they found a consistent ceiling on deliberate practice, the effortful, fully concentrated work that actually improves performance. The ceiling is about four hours a day, taken in blocks of an hour to ninety minutes, and it holds across domains. Beyond it, no measurable benefit. The best performers in the world did not push through the ceiling; they respected it, then protected their recovery aggressively. They slept more than average. They napped. They treated rest as part of the training, because it is.
Read that against 996. The premise of a seventy-two-hour week is that twelve hours of genuine cognitive output is available for the taking each day. The best evidence we have, drawn from people optimising their performance harder than any startup ever has, says the sustainable ceiling on demanding cognitive work is around four. Not because anyone is lazy. Because that is what the machine can do.
This puts the advocate of long hours in a vice, and it is worth naming both jaws of it. Either the twelve daily hours are deep work, in which case roughly two-thirds of them are a fiction, because no one sustains that; or they are not deep work, in which case they are shallow — email, logistics, low-load busywork — and the correct response to twelve hours of shallow work is not to seat a human in front of it for twelve hours. It is to automate it. Which returns us to the contradiction we started with. You cannot, in the same breath, tell me to automate the repetitive work and to fill twelve hours a day with it.
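The arithmetic behind the "two-thirds" figure is short enough to write out. This is a minimal sketch: the function name is hypothetical, and it assumes, generously to the long-hours case, that the four-hour ceiling can be reached on every one of the six working days.

```python
def split_week(day_hours: float, days: int, deep_ceiling: float) -> dict:
    """Split a working week into deep-work hours and everything else.

    Assumes the daily deep-work ceiling is hit on every working day,
    which is the most favourable reading for the long-hours case.
    """
    total = day_hours * days
    deep = min(day_hours, deep_ceiling) * days
    surplus = total - deep
    return {
        "total": total,                    # hours at the desk
        "deep": deep,                      # hours of demanding cognitive work
        "surplus": surplus,                # hours that cannot be deep work
        "surplus_share": surplus / total,  # fraction of the week that is surplus
    }


# 996: twelve hours a day, six days a week, against a four-hour ceiling.
week = split_week(day_hours=12, days=6, deep_ceiling=4)
print(week)
```

On those inputs the week comes to seventy-two hours at the desk, of which twenty-four can be deep work and forty-eight cannot. The forty-eight are either fiction or shallow work, and the shallow work is the part you were told to automate.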
We already know all of this in the domains where the output is visibly creative. No one believes a novelist writes a better novel by sitting at the typewriter for twelve hours; the good ones famously write for three or four and stop. No one thinks a painter improves the canvas by staring at it longer. The evidence says knowledge work is no different. We pretend it is only because the output is less obviously creative, so the staring is easier to mistake for progress.
There is a mechanism underneath this, and it is one of the most robust findings in all of psychology. The spacing effect, first measured by Ebbinghaus in 1885 and confirmed by meta-analysis many times since, is the observation that learning distributed across spaced sessions produces far stronger retention than the same material crammed into one. The reason is that consolidation happens between sessions, not during them. The brain does its filing in the gaps. This is why revision works in thirty- to forty-five-minute blocks with breaks, and why cramming feels productive and retains nothing. Remove the gaps and you do not get more learning. You get less. The principle generalises past memorisation to any work that depends on the brain integrating and reorganising what it has taken in — which is to say, all the work worth paying a good engineer to do.
The part that is an actual hazard
So far the argument is about output: the extra hours do not produce. There is a harder fact, which I will state once and not lean on, because it does not need repeating.
In 2021 the WHO and the ILO published the first global estimate of the health burden of long working hours. Working fifty-five hours a week or more, against a baseline of thirty-five to forty, is associated with a thirty-five per cent higher risk of stroke and a seventeen per cent higher risk of dying from ischaemic heart disease. They attributed roughly 745,000 deaths to it in 2016 alone. The figure is correlational, drawn from meta-analysis rather than a controlled trial, and I will not call it causal. But the direction is consistent across dozens of studies and hundreds of thousands of participants, and it reframes what 996 is. It is not a productivity strategy with a human cost attached. Past a certain line it is an occupational hazard, and it is currently the work-related risk factor with the largest disease burden we know of.
There is a quieter mechanism worth adding, because it explains why the people inside 996 cannot see any of this. Moderate sleep deprivation degrades cognitive and motor performance to a measurable degree: after seventeen to nineteen hours awake, performance falls to roughly the level of a blood alcohol concentration of 0.05 per cent; after twenty-four hours, past the drink-drive limit. The cruel detail is that the impairment is invisible from the inside. Unlike people who have been drinking, the sleep-deprived do not feel impaired: they feel fine, and they cannot assess their own decline. This is the engine of the whole delusion. The person grinding twelve-hour days is not measuring their output. They are feeling their effort, and mistaking the effort for the result, with a brain that is the last instrument you would trust to tell the difference.
Where intensity is real
I am not going to pretend there is never a case for hard hours, because there is, and the honest version of this argument has to say so.
I have run teams through a crunch. A real one — something broken in production, a deadline that genuinely could not move — and the team stepped up, willingly, and put in the hours. The crunch worked. But it worked for a reason that 996 structurally cannot access, and the reason is reciprocity.
The capacity to surge is not extracted during the surge. It is built beforehand, in normal time. When it was six o’clock and the problem could wait, I was the one telling people to go home: this will still be here tomorrow, leave it. That instruction is a deposit. People remember that you protected their evening when there was no emergency, and when a real emergency arrives they are willing to stay, because you have earned it. The surge runs on a reserve — of goodwill, and of actual rested capacity — and both are accumulated in the ordinary weeks. A sustainable pace is not the opposite of being able to push hard. It is the precondition for it.
This is exactly what 996 cannot do, because it tries to withdraw continuously. It runs the account empty as a matter of policy, so when a genuine crunch comes there is nothing in reserve and no goodwill to call on. It has to coerce the hours, because it has spent the thing that would otherwise make people offer them. The defensible version of intensity is the bounded sprint: a declared length, a real recovery period after, the team’s energy tracked as deliberately as the cash burn. The indefensible version is the steady state — intensity as the permanent condition, which is not a sprint at all, just a slow depletion with a motivational poster over it.
One sample is one sample, and I am not asking you to take my crunch as proof of anything. I offer it as an instance of the mechanism the evidence already predicts: that recovery is not time lost from the work, it is the thing that makes the work, and the surge, possible.
What the filter actually selects
There is a last point, and it is not an insult, though it is sometimes mistaken for one. A 996 requirement is a filter, and it is worth asking what it filters for.
It does not select for the best engineers. It selects for the ones willing to accept those terms — which is a different and much narrower population. Disproportionately, it is people without caring responsibilities, without dependents, without outside options, or who have already absorbed the belief that presence is virtue. The filter optimises for availability, not ability, and those two distributions are not the same. A company that screens hard on hours is not assembling the most capable team it could. It is assembling the most available one, and then telling itself a story in which the two are identical. They are not. Some of the best people you could hire are precisely the ones who will not, and cannot, do 996 — and who would, on a sane schedule, comfortably out-produce the people who will.
A blueprint
If you are building a team and you want the evidence-based version, it is not complicated. It is this.
Optimise for productivity over presence. The output is the thing; the hours are a cost, not a product. Measure what gets made, not how long someone sat there making it. Vanity metrics — lines of code, hours logged, lights on at nine p.m. — measure cost and call it value.
Treat your team’s energy as a finite resource and manage it as deliberately as you manage runway. You would not spend your cash reserve to zero every month as a point of pride. Do not do it with people. The reserve is what lets you respond when something genuinely demands it.
Allow flexibility, and mean it. The brain consolidates in the gaps; the deep work has a ceiling of a few hours; people have lives, and the ones with lives are often the ones worth keeping. A meeting taken on a walk produces different thinking than the same meeting in a grey room. None of this is indulgence. It is operating the instrument according to its specification.
Reserve real intensity for real moments, declare them, and pay them back. A sprint with an end date and a recovery period is a tool. A sprint with no end is just attrition.
And automate the menial work, rather than staffing it with humans for twelve hours. If the hours are full of things a model could do, that is not a reason to lengthen the day. It is a reason to shorten it.
A small piece of evidence sits underneath all of this, and I find it the most clarifying of the lot. When people come into money they did not expect — lottery wins, large inheritances — most of them keep working. Across the studies, somewhere between sixty and ninety per cent stay in employment. But they reduce their hours. They do not quit; they trim. What people shed, the moment money is no longer the gun against their back, is precisely the excess — the marginal hours that 996 exists to extract. The evidence is not that humans do not want to work. It is that they want to work, and they want less of it than this, and the amount they want when no one is forcing the question is roughly the amount the productivity data says is useful anyway.
The case for a sane week was never that it is the kind thing to do. It is that it is the thing that works, and the other side has brought a slogan to an argument that wanted evidence. Productivity over presence. The data has been in for a century. We just keep pretending we cannot read it.