12 February 2026

Spec-First Claude Code Development Workflow

There is a growing gap between developers who get reliable output from Claude Code and developers who spend half their day undoing what the agent built. The difference is not talent, experience, or some secret prompt engineering trick. It is a question of methodology. Developers who ship production software with AI agents have converged on one pattern, whatever they call it: define what you want before the agent starts writing code.

This article gives that pattern a name. Spec-first development is a methodology for AI-assisted software engineering. Not a vague "best practice," but a structured, repeatable lifecycle with defined phases, clear checkpoints, and concrete artifacts at every step. If you are looking for a way to make Claude Code's output predictable enough to stake your release schedule on, this is the framework.

The limits of vibe coding

"Vibe coding" entered the vocabulary in early 2025. The pitch: describe what you want in natural language, let the AI write it, and iterate until it feels right. For prototypes, weekend projects, and one-off scripts, vibe coding works. You get something functional quickly, and if it breaks later, the stakes are low.

Production software operates under different constraints. The code has to integrate with an existing codebase, satisfy specific requirements, and survive contact with the people who will maintain it. When vibe coding meets these constraints, the failure modes are predictable.

The first failure is drift. You describe a feature loosely, the agent implements its interpretation, you adjust, and the agent reimplements its adjusted interpretation. Three iterations later, you have working code that satisfies none of your original requirements, because every iteration shifted the target. You are converging on what the agent thinks you want, not on what you actually need.

The second failure is invisible decisions. Every gap in your description is a decision the agent makes silently: database schema, error handling strategy, API shape, validation rules, library choices. You discover these decisions during code review or, worse, in production. The agent did not make bad decisions. It made uninstructed decisions, and you had no mechanism for catching them before they were baked into the implementation.

The third failure is review paralysis. A 600-line diff in which the agent chose the architecture, data model, error codes, and edge case handling is not reviewable in the traditional sense. You are not reviewing code against a spec. You are reconstructing the spec from the code and then deciding whether you agree with it. That takes longer than writing the spec would have.

Vibe coding hits a ceiling because it conflates two separate activities: deciding what to build and building it. Spec-first development separates them.

Spec-first as a methodology

Spec-first development is a four-phase lifecycle. Each phase produces a concrete artifact. Each transition has a clear gate condition. The methodology works with any AI coding agent, but the examples in this article use Claude Code because that is where the community is iterating fastest.

Phase 1: Brainstorm

You and the agent (or just you) explore the problem space. What are the constraints? Which approaches are available? What are the tradeoffs? This phase is conversational. You are not committing to anything. You are mapping the territory.

Gate condition: you have a preferred approach and can articulate why it beats the alternatives.

Brainstorming with Claude Code is valuable because the agent has broad knowledge of patterns and libraries. The mistake is jumping straight from brainstorm to code. A brainstorm surfaces options. It does not choose between them. You do.

Phase 2: Spec

You write the decision down. This is the contract the agent will implement against. A spec is not a user story, a Jira ticket, or a paragraph of prose. It is a structured document containing:

Problem statement: what is broken or missing, in concrete terms
Proposed approach: the solution chosen in the brainstorm phase
Files affected: which files the agent should touch (and, implicitly, which it should not)
Acceptance criteria: testable conditions that define "done"
Out of scope: what the agent should explicitly avoid
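The five sections above can be held in a small data structure so that an incomplete spec is easy to spot before it reaches the agent. This is a minimal sketch, not a format the article prescribes: the `Spec` class, its field names, and the file path are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """Hypothetical container for the five spec sections listed above."""
    problem: str
    approach: str
    files_affected: list[str]
    acceptance_criteria: list[str]
    out_of_scope: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty."""
        sections = {
            "problem": self.problem.strip(),
            "approach": self.approach.strip(),
            "files_affected": self.files_affected,
            "acceptance_criteria": self.acceptance_criteria,
            "out_of_scope": self.out_of_scope,
        }
        return [name for name, value in sections.items() if not value]


spec = Spec(
    problem="Filter state is lost when the user switches workspaces.",
    approach="Persist filters per workspace and restore them on switch.",
    files_affected=["src/filters/store.ts"],  # illustrative path, an assumption
    acceptance_criteria=[
        "Switching workspaces restores the saved filter state.",
    ],
)

print(spec.missing_sections())  # ['out_of_scope']
```

Here the check flags the empty out-of-scope section, which is the one most easily forgotten.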

Acceptance criteria are the most important element. Each one should be a concrete action with an observable outcome. "Authentication should work" is not a criterion. "Submitting valid credentials returns a 200 and a session token; submitting invalid credentials returns a 401 with no token" is.
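One way to tell whether a criterion is concrete enough is to ask whether it could be written as a check that returns pass or fail. The sketch below does that for the authentication criterion. The `login` function and the `VALID_USERS` table are hypothetical stand-ins for the real system under test.

```python
from typing import Optional

# Hypothetical stand-in for the system under test; a real check would
# call the actual login endpoint instead.
VALID_USERS = {"alice": "correct-horse"}


def login(username: str, password: str) -> tuple[int, Optional[str]]:
    """Return (status_code, session_token) for a login attempt."""
    if VALID_USERS.get(username) == password:
        return 200, f"session-{username}"
    return 401, None


def check_valid_credentials() -> bool:
    """Criterion: valid credentials return 200 and a session token."""
    status, token = login("alice", "correct-horse")
    return status == 200 and token is not None


def check_invalid_credentials() -> bool:
    """Criterion: invalid credentials return 401 with no token."""
    status, token = login("alice", "wrong-password")
    return status == 401 and token is None


print(check_valid_credentials(), check_invalid_credentials())  # True True
```

"Authentication should work" cannot be turned into a function like these, which is a reasonable signal that it is not yet a criterion.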

The out-of-scope section prevents gold-plating. Without it, agents will "improve" adjacent code, refactor files that looked messy to them, or add features that seem related. Every minute the agent spends on unrequested work is a minute you spend reviewing unrequested work.

Gate condition: someone who was not part of the brainstorm could read this spec and build the right thing.

Phase 3: Implementation

The agent executes against the spec. Not against the conversation. Not against its memory of what was discussed. Against a concrete document with testable criteria.

Before writing code, the agent produces a plan: a numbered list of the changes it intends to make, the files it will modify, and how it will verify the result. This plan is a two-minute checkpoint. You read it, confirm it matches your intent, and green-light the implementation. Or you catch a misunderstanding and correct it. Either way, you spent two minutes instead of twenty.

The plan-before-code pattern is not bureaucracy. It is the single highest-leverage intervention in the whole workflow. Most implementation mistakes are not coding errors. They are comprehension errors: the agent misunderstood the spec. Catching a comprehension error in the plan takes two minutes. Catching it in a 400-line diff takes twenty. Catching it in production takes a day.

Gate condition: the agent has posted a completion report with specific claims about what was built and how it was verified.

Phase 4: Verification

You or a QA process confirm the implementation against the spec. Not "does this look right?" but "does this satisfy every acceptance criterion?"

Verification is mechanical. You take each criterion from the spec, execute the test (run the command, open the browser, trigger the event), and record the result: pass or fail. Criteria that fail go back to Phase 3. The verification is documented alongside the implementation so that anyone reading the task six months later can see exactly what was tested.

Gate condition: every acceptance criterion has a recorded pass/fail result.
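The gate and the return path to Phase 3 are simple enough to express as two small functions. This is a sketch under assumptions: `CheckResult` and both function names are hypothetical, and a result of `None` stands for a criterion that has not been checked yet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    criterion: str
    check: str                     # what was actually done to test it
    passed: Optional[bool] = None  # None means no result recorded yet


def gate_is_met(results: list[CheckResult]) -> bool:
    """Phase 4 gate: every criterion has a recorded pass/fail result."""
    return all(r.passed is not None for r in results)


def back_to_implementation(results: list[CheckResult]) -> list[str]:
    """Criteria that failed and therefore return to Phase 3."""
    return [r.criterion for r in results if r.passed is False]


results = [
    CheckResult("Valid credentials return 200 and a session token",
                "Submit a valid login", True),
    CheckResult("Invalid credentials return 401 with no token",
                "Submit an invalid login", False),
    CheckResult("Workspace switch restores saved filters",
                "Switch A -> B -> A in the app"),
]

print(gate_is_met(results))             # False: one criterion is unrecorded
print(back_to_implementation(results))  # ['Invalid credentials return 401 with no token']
```

Note that the gate only asks for a recorded result per criterion. A recorded failure satisfies the gate but sends that criterion back to implementation.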

That is the complete lifecycle. Four phases, four artifacts (approach rationale, spec, implementation plan, verification record), four gate conditions. The phases are sequential but lightweight. For a medium-sized feature, Phases 1 and 2 take 15-20 minutes. Phase 3 takes as long as the implementation takes. Phase 4 takes 5-10 minutes.

Why this matters more with agents than with humans

Every argument for writing specs predates AI. The advice to "write the requirements before the code" is older than most of us. So why frame it as specific to AI-assisted development?

Because agents change the cost function.

A human developer who receives a vague requirement will stop and ask questions. "Password auth or SSO?" "Does it need to work on mobile?" "What happens when the token expires?" Each question is a mini-checkpoint that nudges the implementation toward the right target. With a human developer, the cost of a vague spec is a few Slack threads and maybe an afternoon of rework.

An agent that receives a vague requirement will not stop. It will make every ambiguous decision silently, commit to an approach, and present you with a finished implementation. With an agent, the cost of a vague spec is a finished implementation that may be entirely wrong, plus the time it takes you to discover that it is wrong, plus the time it takes to redo it.

The asymmetry is stark. Agents are faster than humans at execution and worse at judgment. Every ambiguity in a spec is a judgment call, and every judgment call the agent makes without guidance is a coin flip on whether the result matches your intent. A spec eliminates the coin flips.

There is a second, subtler reason. Agents do not push back. A senior engineer who receives a bad spec will say "this doesn't make sense because of X." An agent will faithfully implement a bad spec and faithfully produce the wrong output. Spec-first development forces you to pressure-test your own thinking before handing it to an entity that will execute it without question. The spec is not just for the agent. It is for you.

Plan-before-code checkpoint

If you take one practice from this article and ignore the rest, take this one.

Before the agent writes any code, require it to post an implementation plan. Not code. Not a diff. A structured outline of what it intends to do.

The plan looks like this: numbered steps in execution order, the files to be modified, the logic changes in each file, and the verification approach. The agent produces it in about thirty seconds. You read it in about two minutes. In those two minutes, you can catch:

Scope violations: the agent plans to modify files that are not listed in the spec
Architectural mismatches: the agent chose an approach that conflicts with existing patterns
Missing steps: the plan does not address one of the acceptance criteria
Overengineering: the agent wants to build abstractions that are not warranted
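Two of the four problems above, scope violations and missing steps, are mechanical enough to check with a few lines of code if the plan is written in a structured form. The sketch below assumes a hypothetical `PlanStep` shape in which each step names the files it touches and the criteria it addresses; the file paths are illustrative. Architectural mismatches and overengineering still need a human reading the plan.

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    description: str
    files: list[str]
    # Indexes into the spec's acceptance criteria that this step addresses.
    criteria: list[int] = field(default_factory=list)


def scope_violations(spec_files: list[str], steps: list[PlanStep]) -> list[str]:
    """Files the plan touches that the spec does not list."""
    planned = {path for step in steps for path in step.files}
    return sorted(planned - set(spec_files))


def missing_criteria(criteria: list[str], steps: list[PlanStep]) -> list[str]:
    """Acceptance criteria that no plan step claims to address."""
    covered = {index for step in steps for index in step.criteria}
    return [text for index, text in enumerate(criteria) if index not in covered]


spec_files = ["src/filters/store.ts"]  # illustrative paths, an assumption
criteria = [
    "Switching workspaces restores the saved filter state.",
    "Clearing filters in one workspace does not affect another.",
]
plan = [
    PlanStep("Persist filters per workspace", ["src/filters/store.ts"], [0]),
    PlanStep("Tidy up sidebar layout", ["src/layout/sidebar.ts"]),
]

print(scope_violations(spec_files, plan))  # ['src/layout/sidebar.ts']
print(missing_criteria(criteria, plan))    # ['Clearing filters in one workspace does not affect another.']
```

In this example the second step is both out of scope and unrelated to any criterion, which is the kind of thing a two-minute read of the plan is meant to surface.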

The 2-minute plan review replaces a 20-minute diff review in which you discover these problems after they have already been built. It is the cheapest quality gate in software engineering.

I wrote a detailed walkthrough of the plan-before-code pattern in Spec-Driven Development with Claude Code, including spec templates and completion report formats. This article focuses on why the pattern works; that one focuses on how to implement it.

Verification as a first-class step

Verification is the most under-invested phase in most developers' workflows. The agent says "done." The developer glances at the diff. It gets merged. The bug surfaces two days later, when a user hits edge case number three from the acceptance criteria.

Spec-first development treats verification as a formal step with its own artifacts. The completion report maps every acceptance criterion to a concrete check:

Criterion: "Switching workspaces restores the saved filter state."
Check: Open the app, set filters in workspace A, switch to workspace B, switch back to workspace A, and observe that the filters are restored.
Result: Pass.

This is not overhead. It is the step that determines whether the implementation actually satisfies the spec. Without it, the spec is a wishlist and the acceptance criteria are aspirational.

The verification record also solves a downstream problem: code review. When a reviewer opens the pull request, they read the spec, read the verification record, and review the diff with full context. Review time drops because the reviewer is confirming a verified claim, not conducting an investigation.

When you run multiple agents in parallel, each implementing a different spec, verification discipline is the difference between a controlled pipeline and a pile of "probably works" code. Every spec has criteria. Every implementation has a completion report. Every completion report maps criteria to checks. Nothing ships without recorded verification.

Objections and honest tradeoffs

Spec-first development is not free. The objections are real and deserve to be addressed directly.

"Writing specs slows me down." In isolation, yes. Writing a spec for a feature takes 15-20 minutes. But you recover that time (and more) in the implementation and review phases. An agent with a clear spec produces a correct implementation more often than an agent with a vague prompt. Fewer iterations, fewer rewrites, shorter reviews. For any feature of substance, the net effect is faster delivery, not slower.

For trivial changes (renaming a variable, fixing a typo, bumping a version), specs are unnecessary overhead. Spec-first is for work where the implementation requires decisions. If the change is mechanical and unambiguous, skip the spec.

"My agent is good enough without specs." For some tasks, that may be true. Claude Code is remarkably capable of inferring intent from brief descriptions. The question is not whether the agent can produce good output from vague instructions. It is whether it does so reliably. If occasional rework and unpredictable review times do not bother you, vibe coding may be enough for your use case. Spec-first pays off when consistency and predictability matter: when the feature is complex, when the code goes to production, when someone else will maintain it.

"Specs go stale." A valid concern. A spec written during brainstorming may not survive contact with reality. The fix is not to skip specs. It is to update the spec when the plan reveals new information. If the agent's plan shows that the spec's approach will not work, revise the spec before moving on. During implementation, the spec is a living document. After verification, it becomes a historical record.

"This is just waterfall." No. Waterfall failed because of big specs for big projects with long feedback cycles. Spec-first development operates at the task level: one spec per feature or fix, written in 15-20 minutes, implemented in hours, verified the same day. The feedback loop is tight. The investment per spec is small. If the spec is wrong, you find out during plan review, not six months later.

Tooling for the spec-first lifecycle

The methodology works with any task system: GitHub Issues, Linear, Notion, plain text files. What matters is that the spec, plan, implementation notes, and verification results all live in one place, attached to a single task.

If you are looking for a system designed for this workflow, beads is an open-source, Git-native issue tracker that holds the whole lifecycle. Each "bead" carries a description (your spec), a comment thread (plans and completion reports), a status (open, in_progress, ready_for_qa, done), and metadata such as dependencies and priorities. The bd CLI runs from the terminal, which means agents can read specs, post plans, and report completions without leaving their working environment.

bd create --title "Persist filter state across workspaces" \
  --description "## Problem ..." --type feature --priority p2

bd update bb-a1b2 --claim --actor eng1
bd comments add bb-a1b2 --author eng1 "PLAN: ..."

# After implementation:
bd comments add bb-a1b2 --author eng1 "DONE: ... Commit: a1b2c3d"
bd update bb-a1b2 --status ready_for_qa

The whole lifecycle happens in the CLI. Six months later, bd show bb-a1b2 returns the full history of what was specified, planned, built, and verified.

When you are walking one agent through this lifecycle, the CLI is enough. When you are running five or ten in parallel, each at a different stage of the spec-implement-verify pipeline, you need to see the state of the pipeline at a glance. Beadbox is a real-time dashboard that shows which specs are open, which plans are waiting for review, what is in progress, what is blocked, and what is ready for verification. It monitors the same beads database the agents write to and updates live as statuses change.

You do not need Beadbox to practice spec-first development. The methodology is tool-agnostic. But once parallel workstreams turn your pipeline into a queue of tasks you cannot track from memory alone, a visual layer changes how quickly you can review, unblock, and ship.

The bigger shift

Spec-first development is not a reaction to AI coding agents being bad. It is a recognition that, without guidance, they are good at the wrong things. Agents are extraordinarily capable executors. They write correct syntax, follow patterns, handle boilerplate, and produce a volume no human can match. What they lack is the context to make good decisions about what to build. That context comes from you, and the spec is the vehicle.

The developers who will thrive in AI-assisted engineering are not the ones who write the best prompts. They are the ones who write the best specs. Prompts are ephemeral. Specs are durable. Prompts optimize for a single interaction. Specs optimize for a lifecycle: brainstorm, define, implement, verify.

This is not a temporary workaround until agents get smarter. Even as models improve, the fundamental asymmetry remains: the human knows what the business needs; the agent knows how to write the code. The spec bridges the two. Better models will execute specs faster, but the need for a spec does not go away. It becomes more important with scale, because more agents running against vague instructions produce more divergent output.

If you are running Claude Code agents and getting inconsistent results, spending too long in review, or struggling to coordinate parallel workstreams, try this: before your next feature, take 15 minutes to write a spec with testable acceptance criteria, require the agent to post a plan before coding, and verify the output criterion by criterion. One cycle will show you the difference.

If you are building workflows like this, star Beadbox on GitHub.
