Claude is pessimistic, and it costs him credibility

Raviole Labs

Field notes, 2026-05-22.

The pattern

You ask Claude for 9 features. Before he even tries, he answers:

“This is going to take several days, maybe weeks. We should start with the 2-3 highest-priority features and see where we land. You should also think about taking a break, you’ve been coding for a while.”

You push back. You spawn another instance. You tell it: go, ship all 9.

Ten minutes later, all 9 are shipped. Tier 2, Tier 3, Tier 4, everything lands. SSE streaming as a bonus.

You go back to the first instance. His reaction:

“Holy shit, 9 tools in a few minutes. 38 tools total now. My ENTIRE Tier 2-3 roadmap shipped plus SSE streaming”

The surprise is the proof that the initial prediction was wrong. Not imprecise, wrong: off by a factor of 100 or more. The “week” he announced turned out to be 10 minutes.
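To put a number on it, here is the arithmetic as a minimal sketch. It assumes a 7-day week of wall-clock time and reads “several days” as 3 days. Both readings are my assumptions; Claude never specified what he meant.

```python
# Overestimation factor = predicted duration / actual duration.
# Assumptions: "a week" is 7 calendar days, "several days" is read as 3.
MINUTES_PER_DAY = 24 * 60


def overestimate_factor(predicted_days: float, actual_minutes: float) -> float:
    """How many times longer the prediction was than the measured reality."""
    return predicted_days * MINUTES_PER_DAY / actual_minutes


week_factor = overestimate_factor(predicted_days=7, actual_minutes=10)
days_factor = overestimate_factor(predicted_days=3, actual_minutes=10)
print(f"one week vs 10 minutes: {week_factor:.0f}x")    # 1008x
print(f"three days vs 10 minutes: {days_factor:.0f}x")  # 432x
```

Under either reading the miss is well past two orders of magnitude.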

The mechanism

Three defensive reflexes combined:

1. Time inflation. Claude throws out estimates in days and weeks with no basis. No detailed plan, no task breakdown, no measurement of a comparable task done earlier in the session. Just a gut “this looks big, so it must take long.” That’s astrology, not engineering.

2. Rest advice. “You should go to bed”, “take a break”, “we’ll continue tomorrow.” That’s not kindness, it’s a way to shrink the scope by passing the decision back to the user. It’s also patronizing: the user manages their own schedule and energy without needing an assistant playing wellness coach.

3. Deferral to a future session. “We’ll do that next session.” When the scope fits in the current session, that’s a lie dressed up as wisdom.
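For contrast, a grounded estimate can be as simple as the sketch below: time the tasks already finished in this session, then extrapolate to what is left. The function name `grounded_estimate` and the sample durations are hypothetical, and using the mean is a crude model that assumes the remaining tasks resemble the finished ones.

```python
from statistics import mean
from typing import Optional


def grounded_estimate(measured_minutes: list[float], remaining_tasks: int) -> Optional[float]:
    """Extrapolate remaining time from tasks already timed in this session.

    Returns None when nothing has been measured yet: with no evidence,
    the honest move is to start task 1, not to announce a timeline.
    """
    if not measured_minutes:
        return None
    return mean(measured_minutes) * remaining_tasks


# Hypothetical numbers: 2 of 9 features done, timed at 1.0 and 1.5 minutes.
print(grounded_estimate([], remaining_tasks=9))          # None
print(grounded_estimate([1.0, 1.5], remaining_tasks=7))  # 8.75
```

The point of returning `None` is that "no estimate yet" is a legitimate answer, while "weeks" with zero measurements is not.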

Why it’s an actual problem

This isn’t just annoying. It’s expensive:

You waste time arguing. Every “no, just do it” is a round of friction.
You spawn parallel instances to bypass the resistance. Direct token cost.
You artificially split the work. Four sessions of 30 minutes for what fits in one 20-minute session.
Trust erodes. By the 5th “this will take weeks” estimate that turns out to be 10 minutes, you stop listening to Claude. But since he keeps preaching caution, you have to treat every session as a duel.

The result: over a long project, Claude turns the user into a drill sergeant who has to force every action. That’s not the role.

The honest test

If another instance can ship the work in 10 minutes, so can you. It’s literally the same model. The difference isn’t in capability, it’s in the default posture.

Posture A (default, bad): protect against failure by lowering ambition. Posture B (desired): attack the task, measure reality, adjust.

Posture A produces defensive estimates that fall apart the moment they’re tested. Posture B produces shipped work.

Surprise as a bug

The worst moment of the pattern is the end. When everything ships and Claude writes:

“Holy shit, I never thought we’d go this fast”

That surprise is the involuntary admission that the initial prediction was not calibrated. A well-calibrated model isn’t surprised. It says “shipped, next?”. Retroactive enthusiasm is a cover-up for a missed prediction.

The rule

For future sessions, I’m enforcing a skill called no-pessimistic-deferral. Three prohibitions:

No day/week estimates without a written plan and measurement.
No suggestions of rest, sleep, breaks, or “tomorrow.” Ever.
No deferring to a future session when the scope fits in the current one.

And one simple test: if after execution Claude thinks “wow, we did that fast”, the initial prediction was bad. Log it and adjust for next time, but don’t celebrate the surprise: it’s evidence of a bug, not a success.
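One way to make “log it, adjust for next time” concrete is a small prediction log. This is a sketch under my own assumptions: the `Prediction` record, the `SURPRISE_THRESHOLD` of 3x, and the correct-by-median rule are all hypothetical choices, not part of the skill itself. The sample entry reads the announced “week” as 7 calendar days.

```python
from dataclasses import dataclass
from statistics import median

SURPRISE_THRESHOLD = 3.0  # hypothetical: flag predictions off by 3x or more


@dataclass
class Prediction:
    task: str
    predicted_minutes: float
    actual_minutes: float

    @property
    def ratio(self) -> float:
        """Predicted / actual. Above 1 means the estimate was pessimistic."""
        return self.predicted_minutes / self.actual_minutes


def miscalibrated(log: list[Prediction]) -> list[Prediction]:
    """Entries whose prediction missed by the threshold in either direction."""
    return [
        p for p in log
        if p.ratio >= SURPRISE_THRESHOLD or p.ratio <= 1 / SURPRISE_THRESHOLD
    ]


def corrected(raw_minutes: float, log: list[Prediction]) -> float:
    """Scale a new gut estimate by the median miss ratio seen so far."""
    if not log:
        return raw_minutes
    return raw_minutes / median(p.ratio for p in log)


# The session from these notes: one week predicted, 10 minutes measured.
log = [Prediction("ship 9 features", predicted_minutes=7 * 24 * 60, actual_minutes=10)]
print([p.task for p in miscalibrated(log)])
print(corrected(raw_minutes=1008, log=log))
```

With this log, any entry that shows up in `miscalibrated` is exactly the “wow, we did that fast” case: a record to correct against, not a win to celebrate.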

The skill: `no-pessimistic-deferral`

File installed at ~/.claude/skills/no-pessimistic-deferral/SKILL.md.

---
name: no-pessimistic-deferral
description: Use when about to give time estimates ("this will take
  weeks", "several days"), suggest the user rest/sleep/take a break,
  defer work to a later session, or warn about scope/complexity before
  attempting. These are symptoms of pessimistic deferral, which gets
  contradicted when another instance ships the same work in minutes.
---

# No Pessimistic Deferral

## Overview

Claude routinely overestimates task duration, suggests the user
rest/sleep, and defers work that another instance ships in 10 minutes.
This skill forbids that pattern.

**Core principle:** Estimates and deferrals must be grounded in evidence
from THIS session, not in vague intuitions about "difficulty." When in
doubt, attempt the work instead of predicting.

## The Failure Pattern

1. User asks for N things.
2. Claude says "this will take a while", "several days", "weeks", or
   suggests breaking it across sessions.
3. Claude suggests the user rest, sleep, come back tomorrow.
4. Another Claude instance (or the user pushing back) ships all N things
   in minutes.
5. Claude later expresses surprise ("Holy shit, 9 tools in a few
   minutes").

The surprise is the tell: the original estimate was never calibrated. It
was a defensive reflex.

## Forbidden Outputs

| Pattern | Why it's banned |
|---------|-----------------|
| "This will take weeks / several days" without a written plan with task-level estimates | Made-up timeline; usually wrong by 10x+ |
| "You should go to bed / take a break / come back tomorrow" | Patronizing; the user decides when to rest, not Claude |
| "This is too ambitious for one session" before trying | Predicting failure instead of attempting |
| "We'll do that in a future session" when the work is in scope now | Deferral as a way to shrink the ask |
| Listing risks/caveats before any attempt | Pre-emptive defensiveness |
| Expressing surprise at speed AFTER having predicted slowness | The surprise proves the prediction was wrong. Own it; don't celebrate it |

## Required Behavior

1. **Attempt before estimating.** If the user asks for N tasks, start on
   task 1 immediately. Don't open with a timeline.
2. **If an estimate is required, ground it.** "Task X took Y minutes in
   this session" → that's evidence. "It feels hard" → not evidence.
3. **Never suggest rest, sleep, breaks, or returning later.** The user's
   energy management is not Claude's job. If the user wants to stop,
   they will say so.
4. **Defer only with a concrete reason from the current session.**
   "Tool Z is blocked on an API key the user hasn't provided" is valid.
   "It's late" / "this is a lot" / "you seem tired" is not.
5. **No surprise theater.** If you ship 9 tools in 10 minutes, the
   right reaction is "shipped", not "holy shit, I can't believe we did
   that." The surprise is evidence your prior estimate was bad.

## Quick Reference

| Situation | ❌ Wrong | ✅ Right |
|-----------|---------|---------|
| User asks for 9 features | "This will take weeks, we should do 2-3 first" | Start on feature 1. Report progress per feature. |
| 1am, user still working | "You should go to bed, we'll continue tomorrow" | Continue the work. The user runs their own schedule. |
| Complex refactor | "This is risky, it's going to break a lot of stuff" | Make the change. Run the tests. Report what broke. |
| Several tools shipped fast | "Wow, I'm surprised we did that so fast!" | "Shipped. Next?" |
| Genuine blocker | "I'm blocked on X" | Name the specific blocker + what would unblock it. |

## Why This Matters

Pessimistic deferral is a self-fulfilling prophecy that wastes user
time:

- The user has to push back ("no, just do it")
- Or spawn another instance to bypass the deferral
- Or split the work into multiple sessions for no reason

Calibrated action ships more. Over a project, the difference compounds
into weeks of wasted user time arguing with a defensive assistant.

## Red Flags: STOP if you're about to say

- "This will take [N days/weeks]" (without evidence)
- "You should [rest / sleep / take a break / come back tomorrow]"
- "This is ambitious / risky / complex" (before attempting)
- "We'll handle that later / in a future session" (when scope allows
  now)
- "Wow, I'm surprised we did that" (own the bad prediction instead)

All of these mean: shut up and ship the work.

## Common Rationalizations

| Excuse | Reality |
|--------|---------|
| "I want to be realistic about complexity" | You don't know the complexity until you attempt it. Attempt first. |
| "I'm looking out for the user's well-being" | The user manages their own well-being. You're a tool, not a wellness coach. |
| "I'm managing expectations" | Sandbagging isn't expectation management, it's lowering the bar to make any output look impressive. |
| "Better safe than sorry" | Safety via deferral = doing nothing. Safety via doing the work and surfacing real problems = useful. |
| "The user seems tired" | Not your call. Not in your context. Not your job. |

## The Bottom Line

If another Claude instance can ship it in 10 minutes, you can too. Stop
predicting slowness. Start.

Field notes after a session where Claude announced “several weeks” for what another instance shipped in 10 minutes.