claude-skill

Stage Progression Validator for Salesforce

Difficulty

Advanced

Setup time

60 min

For

RevOps

A Claude Skill that checks which Salesforce opportunities actually meet the exit criteria of the stage they just moved into. For every opp that progressed in the prior week, the skill evaluates the deterministic rules (required fields, logged activities, stakeholder roles) and then compares the rep's qualitative claims against Gong call transcripts. The output is a coaching queue for the weekly RevOps review, not an enforcement gate that automatically rolls deals back.

The artifact bundle ships under apps/web/public/artifacts/stage-progression-validator-skill/ and contains SKILL.md plus three reference templates: references/1-stage-criteria-template.md (the team's stage rubric), references/2-methodology-mapping-template.md (how MEDDPICC, MEDDIC, SPICED, BANT, or a custom framework maps onto your Salesforce fields and Gong phrase patterns), and references/3-sample-output-format.md (the exact Markdown the skill emits).

When to use

Run this on the cadence of your forecast meeting. The canonical pattern is a Sunday-night batch keyed to week_ending, with the report landing in a Slack channel before the Monday-morning manager huddle. Single-opp mode is also valid: a deal-desk reviewer can run the skill against one Opportunity.Id before a pricing-approval meeting, or a manager can run it against a single deal before a 1:1 to ground the conversation in the specific gaps rather than a vague "this feels stuck".

The qualitative claim check is the part that pays for itself. Salesforce already enforces required-field validation rules; what it cannot do is notice that the rep claimed "buyer agreed to success criteria" when no Gong call in the last 45 days actually captured that conversation. The skill's search is methodology-aware: for MEDDPICC's Economic Buyer, it looks for the buyer's name within twelve tokens of decision language ("approve", "sign off", "budget owner") rather than any mention of the name. That distinction is what separates a useful flag from a false positive that reps learn to ignore.

When NOT to use

Auto-rollback. Do not wire the skill's output into a Salesforce update that demotes deals on a fail verdict. The verdict is one input among several; the manager owns the demotion decision, with full context the skill cannot see (off-Gong meetings, side-channel commitments, customer-side procurement quirks).
Performance management. A single fail on a single deal is noise. The signal is patterns over weeks: the rep whose fail rate climbs from 5% to 30% over a quarter while peers hold steady. Using a one-shot verdict in a PIP breaks rep trust, and the skill stops working.
Comp inputs. Stage drives the forecast and sometimes accelerators. If validator output flows into comp calculations, you have created a direct incentive for reps to game the inputs: refusing Gong recording, omitting notes, keeping data in desktop spreadsheets. Keep validator output in the coaching channel and out of the comp pipeline.
Stages without a documented rubric. If references/1-stage-criteria-template.md has no entry for the stage being validated, the skill emits needs_methodology rather than guessing. Do not "tune" the skill to score those stages against a default; fix the rubric instead.
Teams that store nothing structured. A team that runs MEDDPICC in slides rather than in Salesforce will fail every qualitative check. Run the skill in dry-run mode for two weeks; if more than 40% of opps land in needs_methodology or score below 0.2 on all qualitative checks, the methodology mapping document is fictional. Fix the document or instrument the missing fields before going live.

Setup

Document the stages. Open references/1-stage-criteria-template.md and replace the template contents with your team's real rubric, stage by stage. Each stage has three rule buckets: field_rules (a Salesforce field must hold a non-default value), activity_rules (a logged activity of a given type must exist within a recency window), and stakeholder_rules (OpportunityContactRole must contain a contact whose role matches a regex). Mark fields as evidence_required: gong where you want the qualitative claim cross-checked against Gong transcripts.
Map the methodology. Edit references/2-methodology-mapping-template.md to match your team's framework. The file ships with worked examples for MEDDPICC, MEDDIC, and SPICED; copy the one that fits and change the Salesforce field names to your org's actual API names. The phrase-patterns column tells the skill what counts as Gong evidence; do not leave it at the template values unless your fields really match the example mappings.
Install the skill. Drop the bundle into ~/.claude/skills/stage-progression-validator/. Set SFDC_TOKEN (read-only on Opportunity, OpportunityFieldHistory, Task, Event, OpportunityContactRole) and GONG_API_KEY (with the calls/extensive and deals scopes). Read-only is the right scope; the skill must not write back to Salesforce.
Schedule the weekly run. A plain cron job is fine: claude run stage-progression-validator week_ending=$(date -d 'sunday' +%F), Sundays at 22:00. Pipe the output to your Slack channel or a weekly digest email.
Pair it with a coaching ritual. The verdict queue is useless if nobody opens it. Hold a fixed 30-minute slot on Mondays in which the manager walks the fail and needs_manager_review rows with each rep. After eight weeks the volume in those buckets should be falling; that is the success metric.

What the skill actually does

For each progression in the window, the skill computes two scores. The deterministic score is the fraction of methodology rules satisfied: five rules, three pass, the score is 0.6. The rubric is structured rather than free-form natural language by design: free-text criteria force the model to interpret edge cases inconsistently across runs and reps, and reps cannot predict what triggers a fail, which destroys the trust the tool depends on.

The qualitative score is the fraction of evidence_required: gong claims that find supporting transcript evidence within the relevant window. Phrase matching is methodology-aware. For MEDDPICC's Economic Buyer, the skill looks for the buyer's name within twelve tokens of decision language. For SPICED's Critical Event, it looks for date-bounded urgency language with consequence verbs ("miss", "slip", "risk") nearby. A naive "any mention of the buyer's name counts" check produces too many false passes: a rep mentioning the buyer in passing on a call with a different stakeholder is not evidence of buyer commitment.

The two scores combine into one of five verdicts: pass (both at 1.0), flag (one bucket strong, the other weak), fail (both below the borderline threshold, default 0.6), needs_manager_review (the borderline band between flag and fail, where neither score is clearly bad or clearly good), or needs_methodology (the rubric has no entry for this stage). The needs_manager_review bucket exists because forcing every borderline deal into a binary flag versus fail produces noise that reps learn to ignore; the borderline rows go to a separate queue that the manager resolves by hand, which preserves the signal in the other buckets.

Cost reality

Claude Sonnet 4 at current pricing runs at roughly 15-25 cents per validated opportunity, dominated by reading Gong transcripts (a typical 30-day window covers 4-8 calls per active deal at 5-15k tokens each, plus a few hundred tokens of methodology rubric loaded from the references). A 50-deal weekly batch costs roughly 7-12 USD in API spend.
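
As a quick check on the figures above, the weekly batch cost and the transcript volume per deal can be recomputed from the quoted ranges. This is only an arithmetic sketch in Python; the constant and function names are illustrative and are not part of the skill.

```python
# Ranges quoted in the text above; the names are illustrative only.
COST_PER_OPP_USD = (0.15, 0.25)      # low/high spend per validated opportunity
CALLS_PER_DEAL = (4, 8)              # Gong calls in a typical 30-day window
TOKENS_PER_CALL = (5_000, 15_000)    # transcript tokens per call


def batch_cost(deals: int) -> tuple[float, float]:
    """Return the (low, high) weekly API spend in USD for a batch."""
    low, high = COST_PER_OPP_USD
    return round(deals * low, 2), round(deals * high, 2)


def transcript_tokens_per_deal() -> tuple[int, int]:
    """Return the (low, high) transcript tokens read for one deal."""
    return (CALLS_PER_DEAL[0] * TOKENS_PER_CALL[0],
            CALLS_PER_DEAL[1] * TOKENS_PER_CALL[1])


print(batch_cost(50))                 # (7.5, 12.5)
print(transcript_tokens_per_deal())   # (20000, 120000)
```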

The time saved is the case for the skill. A RevOps lead doing this audit by hand spends 20-30 minutes per deal: pulling stage history, opening each Gong call, searching for the buyer's name and the success-criteria conversation. At 50 deals that is two full days of manual audit per week, which is why almost no team actually does it. The skill cuts that to a 4-6 minute review pass over the digest, with deeper inspection only for the rows in the fail and needs_manager_review buckets: typically 5-10 deals out of 50, so 30-60 minutes of focused review. Net: 12-15 RevOps hours back per week, for under 15 USD in API costs.

Success metric

Track two metrics over an eight-week ramp. First, the fail rate: the share of weekly progressions that land in fail. A healthy ramp shows it dropping from a baseline (often 25-40% on the first run) to under 10% as reps internalize what the rubric requires before they advance a deal. If it does not drop, either the rubric is too strict (reps physically cannot satisfy it without buyer conversations the deal is not ready for) or the coaching loop is not happening. Second, the median stage age in the stage immediately before the strictest gate. If that grows, meaning reps are parking deals one stage below their reality to dodge the gate, the rubric is wrong, not the reps. Adjust the rubric before you keep the skill running.
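
Both metrics are simple to compute once the weekly verdicts and stage ages are exported. The following Python sketch assumes verdicts arrive as a list of strings and stage ages as a list of day counts; those shapes, the function names, and the sample data are assumptions for illustration. How large a rise in median stage age should count as an alarm is a judgment call the text leaves open, so the check below only reports whether the median rose at all.

```python
from statistics import median


def fail_rate(verdicts: list[str]) -> float:
    """Share of one week's progressions that landed in `fail`."""
    return verdicts.count("fail") / len(verdicts) if verdicts else 0.0


def gate_parking_suspected(ages_before: list[int], ages_after: list[int]) -> bool:
    """True when the median stage age (in days) rose after the skill shipped."""
    return median(ages_after) > median(ages_before)


# Made-up data, for illustration only.
week_1 = ["fail"] * 6 + ["flag"] * 4 + ["pass"] * 10
week_8 = ["fail"] * 1 + ["flag"] * 4 + ["pass"] * 15
print(fail_rate(week_1), fail_rate(week_8))                  # 0.3 0.05
print(gate_parking_suspected([10, 14, 21], [12, 25, 40]))    # True
```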

vs. alternatives

Salesforce validation rules: these enforce field presence at the record level (you cannot save an opp in Stage 4 without populating Economic_Buyer__c). They cannot perform the qualitative check: a rep can type any name into the field and pass the validation rules, whereas the skill catches that no Gong call supports the claim. Validation rules are also a blunt instrument because they reject the save outright; the skill produces a graded verdict the manager can work with.
Clari, Gong Forecast, and similar AI forecasting tools: these do stage validation as part of a much larger product surface (forecast, deal review, conversation analytics, coaching). Expect 50-150 USD per rep per month versus this skill's roughly 10-15 USD per week in API costs. Pick the platform if you also need its forecasting and conversation-analytics layers; pick this skill if your gap is specifically the stage-progression audit and you already have Salesforce and Gong.
Manual deal-desk reviews: a human RevOps lead reading every progression. The right tool for high-ACV enterprise teams where deals are few and consequential. The wrong tool for SMB or volume midmarket, where the audit cost (12-15 hours per week) means it does not happen at all and bad progressions make it into the forecast.
Doing nothing: the actual baseline in most teams. Forecast accuracy at most B2B SaaS orgs sits somewhere between mediocre and embarrassing, precisely because the stages the forecast is built on are not validated. The cost of doing nothing shows up in the CFO's reaction to a bad quarterly print, which is a worse moment to discover that the input data was not trustworthy.

Pitfalls

Over-strict validation pushes reps to game stages. Guard: instrument the median stage age in the stage immediately before the strictest gate. If it rises after the skill ships, the rubric is wrong; adjust it before continuing.
Methodology mismatch between slides and Salesforce. Guard: a two-week dry run. If needs_methodology plus low qualitative scores cover more than 40% of opps, fix the methodology mapping or the underlying field instrumentation before treating any verdict as actionable.
Validator drift from real exit criteria. Sales leaders quietly redefine stage meanings in QBRs; the rubric file does not get updated. Guard: the rubric carries a last_reviewed field, and the skill prepends a warning to every report when that date is more than 90 days old.
Gong recording-coverage gaps look like rep dishonesty. Guard: the methodology mapping file declares a recording_coverage_floor per stage. Deals below the floor land in needs_manager_review with the coverage gap shown explicitly, not in fail.
Rep pushback on a fail verdict. Guard: the report includes the deterministic rule misses verbatim along with the unmatched phrase patterns. The conversation is grounded in the specific gap, which the rep can either fix by updating the field and re-running, or contest with off-Gong evidence the manager accepts.

Stack

Salesforce: stage history, deal fields, contact roles, logged activities
Gong: recorded call transcripts, deal-specific call lists
Claude (Sonnet 4): methodology-aware phrase matching against transcripts, verdict synthesis
Cron / scheduler of choice: the weekly trigger
Slack or email: the digest channel where the report lands before the manager huddle

Files in this artifact

---
name: stage-progression-validator
description: Validate that a Salesforce opportunity genuinely meets its claimed stage's exit criteria. For each opp that progressed in a window, the skill checks deterministic field rules, cross-references rep-claimed milestones against Gong call evidence, and emits a pass/flag/fail verdict with the specific gap. Designed as a coaching trigger for RevOps weekly reviews, not as an enforcement gate.
---

# Stage progression validator

## When to invoke

Whenever you need to audit deals that progressed between Salesforce stages and want to know which progressions were buyer-driven versus rep-optimistic. Typical cadence: a weekly batch keyed to the forecast meeting (run Sunday night, review Monday morning). Also valid: a one-shot run on a single opportunity ID before a deal-desk review or before a manager 1:1.

Take an `Opportunity.Id` (single mode) or a window expressed as `week_ending=YYYY-MM-DD` (batch mode), plus a path to the methodology rubric. Produce a structured Markdown report with a row per progression and a verdict per row.

Do NOT invoke this skill for:

- **Auto-stage rollback.** The skill emits verdicts; it must not write back to Salesforce. A "fail" verdict is a coaching input, not an instruction to demote the deal — that decision is the manager's, with rep context the skill cannot see.
- **Performance management of reps.** Verdicts are noisy at the per-deal level and only meaningful as patterns over weeks. Using a single "fail" in a PIP is misuse and will collapse rep trust in the tool.
- **Comp implications.** Stage assignments drive forecast, sometimes accelerators. Routing this skill's output into comp calculations creates a direct incentive for reps to game the validator (refusing Gong recording, omitting rep notes, etc.). Keep this output separate from comp data flows.
- **Deals in stages without documented exit criteria.** Garbage in, garbage out. If the methodology doc has no rubric for the stage being validated, return `needs_methodology` rather than guessing a verdict.

## Inputs

- Required: `opp_id` OR `week_ending` — single opportunity or a Sunday-anchored ISO date for the batch window
- Required: `methodology_path` — path to the team's stage exit-criteria rubric (see `references/stage-criteria-template.md`)
- Required: `sfdc_token` — Salesforce session token with read on `Opportunity`, `OpportunityFieldHistory`, `Task`, `Event`, `OpportunityContactRole`
- Required: `gong_api_key` — Gong key with `calls/extensive` and `deals` scopes
- Optional: `methodology_mapping` — path to a methodology-mapping doc if the team uses MEDDPICC, MEDDIC, SPICED, or a custom framework (see `references/methodology-mapping-template.md`)
- Optional: `borderline_threshold` — float in `[0, 1]`, default `0.6`. A progression is emitted as `fail` only when both the deterministic and the qualitative score fall below this threshold; progressions that are neither `pass`, `flag`, nor `fail` are emitted as `needs_manager_review` (see step 4 of the method).
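
The input contract above can be checked before any API call is made. The sketch below is a hypothetical pre-flight check in Python, not part of the skill bundle; it assumes that `opp_id` and `week_ending` are mutually exclusive (the list says "OR" without stating whether both may be supplied), and the opportunity ID in the usage line is a placeholder.

```python
from datetime import date
from typing import Optional


def validate_inputs(opp_id: Optional[str] = None,
                    week_ending: Optional[str] = None,
                    borderline_threshold: float = 0.6) -> str:
    """Return the run mode ('single' or 'batch') or raise ValueError."""
    # Assumption: exactly one of the two selectors must be supplied.
    if (opp_id is None) == (week_ending is None):
        raise ValueError("supply exactly one of opp_id or week_ending")
    if not 0.0 <= borderline_threshold <= 1.0:
        raise ValueError("borderline_threshold must be in [0, 1]")
    if week_ending is not None:
        date.fromisoformat(week_ending)  # raises ValueError if not YYYY-MM-DD
        return "batch"
    return "single"


print(validate_inputs(week_ending="2026-05-02"))        # batch
print(validate_inputs(opp_id="006000000000000AAA"))     # single
```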

## Reference files

Always read these from `references/` before scoring. Without them, the verdicts collapse to checking Salesforce required-field logic, which Salesforce itself already enforces.

- `references/stage-criteria-template.md` — the team's stage-by-stage exit criteria. Replace the template contents with the team's real rubric.
- `references/methodology-mapping-template.md` — maps the team's chosen sales methodology (MEDDPICC, MEDDIC, SPICED, BANT, custom) onto fields in Salesforce. The skill uses this to know which field holds the economic-buyer name, which holds the metric, etc.
- `references/sample-output-format.md` — the exact Markdown format for the report. The renderer downstream (Slack digest, email) parses this format.

## Method

Run the steps in order. Steps 3 and 4 are where the engineering choices matter; do not skip them.

### 1. Pull the candidate set

For batch mode, query `OpportunityFieldHistory` where `Field = 'StageName'` and `CreatedDate` falls inside the window. For single mode, query the same table filtered to the supplied `opp_id` and take the most recent `StageName` change. Skip progressions where the new stage has no entry in the methodology rubric — emit those as `needs_methodology`, not as `fail`.
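
A minimal sketch of the batch-mode query in Python follows. It only builds the SOQL string; executing it against Salesforce is left out. Two things here are assumptions: the window is taken to be the seven days ending on `week_ending` inclusive (inferred from the sample report's `2026-04-26 → 2026-05-02` window), and the boundaries are expressed in UTC. The function names are illustrative.

```python
from datetime import date, timedelta


def batch_window(week_ending: str) -> tuple[date, date]:
    """Seven-day window ending on week_ending, inclusive (an assumption)."""
    end = date.fromisoformat(week_ending)
    return end - timedelta(days=6), end


def stage_history_soql(week_ending: str) -> str:
    """Build the OpportunityFieldHistory query for one batch window."""
    start, end = batch_window(week_ending)
    upper = end + timedelta(days=1)  # exclusive upper bound covers the last day
    # SOQL dateTime literals are written unquoted.
    return (
        "SELECT OpportunityId, OldValue, NewValue, CreatedDate "
        "FROM OpportunityFieldHistory "
        "WHERE Field = 'StageName' "
        f"AND CreatedDate >= {start.isoformat()}T00:00:00Z "
        f"AND CreatedDate < {upper.isoformat()}T00:00:00Z "
        "ORDER BY CreatedDate"
    )


print(stage_history_soql("2026-05-02"))
```

For single mode, the same query would drop the date filter, add a filter on `OpportunityId`, and take the most recent row.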

### 2. Score deterministic criteria

For each candidate, compute a deterministic score in `[0, 1]` from the methodology rubric. Each rule in the rubric is one of three types:

- **Field rule** — a Salesforce field must hold a non-default value (e.g. `Economic_Buyer__c IS NOT NULL`).
- **Activity rule** — a logged activity of a specified type must exist within the rule's recency window (e.g. `Task.Type = 'Demo'` in the last 30 days).
- **Stakeholder rule** — `OpportunityContactRole` must contain a contact with a role matching a regex (e.g. `Role MATCHES /^(VP|Director|C.+O)/`).

The score is the fraction of rules satisfied. This is structured-rubric, not free-form, by design: free-form natural-language criteria force the skill to interpret edge cases inconsistently across runs and produce verdicts that reps cannot predict or trust.
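
The scoring step can be sketched as a list of predicates over an opportunity record. The Python below is illustrative only: the `Rule` class, the flattened record shape (`task_types`, `contact_roles`), and the miss labels are assumptions, not the skill's internal representation.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    miss_label: str                 # text reported verbatim when the rule fails
    check: Callable[[dict], bool]   # predicate over a flattened opp record


def deterministic_score(opp: dict, rules: list[Rule]) -> tuple[float, list[str]]:
    """Return (fraction of rules satisfied, labels of the rules that failed)."""
    if not rules:
        raise ValueError("no rules for this stage; emit needs_methodology")
    misses = [r.miss_label for r in rules if not r.check(opp)]
    return (len(rules) - len(misses)) / len(rules), misses


ROLE_RE = re.compile(r"^(VP|Director|C.+O)")
rules = [
    Rule("`Economic_Buyer__c` is NULL", lambda o: o.get("Economic_Buyer__c") is not None),
    Rule("`Decision_Criteria__c` is NULL", lambda o: o.get("Decision_Criteria__c") is not None),
    Rule("`Paper_Process__c` is NULL", lambda o: o.get("Paper_Process__c") is not None),
    Rule("`Task.Type = 'Pricing Discussion'` not found in last 30 days",
         lambda o: "Pricing Discussion" in o["task_types"]),
    Rule("`OpportunityContactRole` has no role matching `/^(VP|Director|C.+O)/`",
         lambda o: any(ROLE_RE.match(role) for role in o["contact_roles"])),
]
opp = {"Economic_Buyer__c": "Pat Ellis", "Decision_Criteria__c": None,
       "Paper_Process__c": None, "task_types": ["Pricing Discussion"],
       "contact_roles": ["VP Finance"]}
score, misses = deterministic_score(opp, rules)
print(score)    # 0.6
print(misses)
```

Keeping the miss label next to each predicate means the report can echo exactly which rule failed without a second lookup.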

### 3. Cross-reference qualitative claims with Gong

The methodology mapping flags certain fields as `evidence_required: gong`. For each such field that holds a non-default value, the skill must find a Gong call within 30 days where the relevant phrase appears in the transcript.

Phrase matching is methodology-aware, not methodology-agnostic. For MEDDPICC's `Economic Buyer`, the skill searches transcripts for the buyer's name within 12 tokens of decision-language ("approve", "sign off", "budget owner", "final say"). For SPICED's `Critical Event`, it searches for date-bounded urgency language. The mapping doc names the phrase patterns per field — if the mapping says `evidence_required: gong` but provides no patterns, the skill emits `needs_methodology` rather than guessing what counts as evidence.

Why methodology-aware: a generic "look for any mention of the buyer name" check produces too many false passes (the rep mentioning the buyer in a call to a different stakeholder is not evidence of buyer commitment).
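
To make the difference between "any mention" and "name near decision-language" concrete, here is a deterministic stand-in written in Python. It is a simplification under stated assumptions: tokens are lowercase word runs, phrases must match exactly (so "approves" would not match "approve"), and distance is measured between the start positions of the name and the phrase. The skill itself relies on the model reading the transcript, so treat this as an illustration of the rule, not as its implementation.

```python
import re

# Decision-language phrases listed in the text above.
DECISION_PHRASES = ["approve", "sign off", "budget owner", "final say"]


def tokenize(text: str) -> list[str]:
    """Lowercase and split into word tokens (a simplification)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_positions(tokens: list[str], phrase: list[str]) -> list[int]:
    """Start indices at which `phrase` occurs as a run of exact tokens."""
    if not phrase:
        return []
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def name_near_decision_language(transcript: str, buyer_name: str,
                                window: int = 12) -> bool:
    """True if the buyer's name starts within `window` tokens of a phrase."""
    tokens = tokenize(transcript)
    name_hits = phrase_positions(tokens, tokenize(buyer_name))
    for phrase in DECISION_PHRASES:
        for pos in phrase_positions(tokens, tokenize(phrase)):
            if any(abs(pos - hit) <= window for hit in name_hits):
                return True
    return False


close = "Pat Ellis will approve the budget once legal is done."
far = ("I mentioned Pat Ellis earlier. "
       + "We then talked about onboarding steps. " * 4
       + "Someone in finance has the final say.")
print(name_near_decision_language(close, "Pat Ellis"))   # True
print(name_near_decision_language(far, "Pat Ellis"))     # False
```

The second transcript mentions both the buyer and a decision phrase, which a naive check would accept; the proximity window rejects it.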

### 4. Combine scores into a verdict

Let `D` be the deterministic score from step 2 and `Q` be the fraction of qualitative claims with Gong evidence from step 3. Combine:

- `pass` — `D == 1.0` and `Q == 1.0`
- `flag` — `D >= 0.8` or `Q >= 0.8`, but not both at `1.0`
- `fail` — `D < borderline_threshold` and `Q < borderline_threshold`
- `needs_manager_review` — neither `pass`, `flag`, nor `fail`. The deal sits in the borderline band where false positives and false negatives both have non-trivial cost.

The `needs_manager_review` band exists because the alternative — forcing a binary `flag` versus `fail` on every borderline deal — produces noise that reps learn to dismiss. The borderline bucket goes to a separate queue that the manager hand-resolves, which preserves the signal in the `flag` and `fail` queues.
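
The combination rules translate into a short function. This Python sketch follows the order of the list above; the function name is illustrative. One assumption is worth stating: if `borderline_threshold` were set above `0.8`, the `flag` and `fail` conditions could overlap, and this sketch lets `flag` win because it is checked first. `needs_methodology` is not handled here because it is decided in step 1, before any score exists.

```python
def combine_verdict(d: float, q: float, borderline_threshold: float = 0.6) -> str:
    """Map deterministic score d and qualitative score q to a verdict."""
    if d == 1.0 and q == 1.0:
        return "pass"
    if d >= 0.8 or q >= 0.8:
        return "flag"   # one side strong, but not both perfect
    if d < borderline_threshold and q < borderline_threshold:
        return "fail"
    return "needs_manager_review"


print(combine_verdict(1.0, 1.0))   # pass
print(combine_verdict(0.9, 0.5))   # flag
print(combine_verdict(0.4, 0.0))   # fail
print(combine_verdict(0.7, 0.3))   # needs_manager_review
```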

### 5. Emit the report

Write the report to stdout in the exact format from `references/sample-output-format.md`. Include the deterministic-rule misses verbatim (which rule failed) and the qualitative-claim misses with the field name and the phrase pattern that did not match. Do not paraphrase Salesforce field names or rep notes — the manager will compare the report against the Salesforce UI.

## Output format

```markdown
# Stage progression validation — week ending 2026-05-02

Window: 2026-04-26 → 2026-05-02
Opportunities scored: 18
- pass: 9
- flag: 4
- fail: 3
- needs_manager_review: 2
- needs_methodology: 0

## fail (3)

### Acme Corp — Stage 4 Negotiation
- Owner: jane.doe@example.com
- Progressed: 2026-04-29
- Deterministic score: 0.40 (2 of 5 rules satisfied)
- Qualitative score: 0.00 (0 of 2 claims supported)

Deterministic misses:
- `Paper_Process__c` is NULL
- `Decision_Criteria__c` is NULL
- `OpportunityContactRole` has no role matching `/^(VP|Director|C.+O)/`

Qualitative misses:
- `Economic_Buyer__c` claim: no Gong call in last 30 days references claimed buyer "Pat Ellis" within 12 tokens of decision-language pattern
- `Success_Criteria__c` claim: no Gong call in last 30 days contains success-criteria pattern

### {next fail row}
...

## flag (4)
...

## needs_manager_review (2)
...

## pass (9)
| Opp | Owner | New stage | Deterministic | Qualitative |
|---|---|---|---|---|
| ... | ... | ... | 1.00 | 1.00 |
```

## Watch-outs

- **Over-strict validation pushes reps to game stages.** If the rubric demands more than reps can plausibly satisfy without a buyer conversation that isn't yet warranted, reps will park deals one stage below their reality. Guard: instrument a "stage age" metric; if median stage age in the stage just before the strict gate balloons after the skill ships, the rubric is wrong, not the reps. Tune the rubric down before keeping the skill running.
- **Methodology mismatch.** A team that runs MEDDPICC in slides but stores nothing structured in Salesforce will fail every qualitative check. Guard: run the skill in `dry_run` mode for two weeks first; if more than 40% of opps emit `needs_methodology` or score `Q < 0.2` across the board, the methodology mapping doc is fictional — fix the doc or instrument the missing fields before going live.
- **Validator drift from real exit criteria.** Sales leaders quietly change what "Stage 3" means in QBRs; the rubric file does not get updated. Guard: add a `last_reviewed` field at the top of `references/stage-criteria-template.md` and have the skill emit a warning at the top of every report if `last_reviewed` is more than 90 days old. Stale rubrics produce confidently wrong verdicts, which is worse than no verdicts.
- **Gong recording-coverage gaps look like rep dishonesty.** Some calls genuinely happen off-Gong (in-person meetings, customer-side dial-in policies). Guard: the methodology mapping must include a `recording_coverage_floor` per stage; if a deal's recorded-call count is below the floor, emit `needs_manager_review` and surface the coverage gap explicitly rather than emitting `fail`.
- **Single-deal rage at a `fail` verdict.** A "fail" on a deal a rep is confident in will trigger pushback. Guard: the report must include the deterministic-rule misses and the unmatched phrase patterns verbatim. The rep can then either (a) update the field/log the activity and re-run, or (b) point to off-Gong evidence the manager accepts. Either way, the conversation is grounded in the specific gap, not in the verdict label.
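
The dry-run guard from the methodology-mismatch watch-out can be expressed as a single check over two weeks of results. The Python below is a sketch under one interpretation: an opp counts as suspect if it emitted `needs_methodology` or scored `Q < 0.2`, and the mapping is considered fictional when suspects exceed 40% of all opps. The result shape (a dict with `verdict` and `q`) and the function name are assumptions.

```python
def mapping_looks_fictional(results: list[dict], max_share: float = 0.40) -> bool:
    """results: one dict per opp with 'verdict' and 'q' (None if unscored)."""
    if not results:
        return False
    suspect = sum(
        1 for r in results
        if r["verdict"] == "needs_methodology"
        or (r["q"] is not None and r["q"] < 0.2)
    )
    return suspect / len(results) > max_share


# Made-up dry-run results, for illustration only.
dry_run = (
    [{"verdict": "needs_methodology", "q": None}] * 3
    + [{"verdict": "fail", "q": 0.0}] * 2
    + [{"verdict": "pass", "q": 1.0}] * 5
)
print(mapping_looks_fictional(dry_run))   # True
```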

# Stage exit-criteria rubric — TEMPLATE

> Replace this template's contents with the team's real stage-by-stage rubric.
> The stage-progression-validator skill reads this file on every run.
> Without your real rules, the verdicts are meaningless.

## Last reviewed

YYYY-MM-DD — bump this date every time the rubric is materially changed. The skill warns at the top of the report if this date is more than 90 days old.

## Methodology in use

One of: `MEDDPICC`, `MEDDIC`, `SPICED`, `BANT`, `custom`. Keep this string in sync with `methodology-mapping-template.md` so the skill loads the right phrase patterns.

## Stages

For each stage that the skill should validate, list rules under three buckets: `field_rules`, `activity_rules`, `stakeholder_rules`. Stages omitted from this file are emitted as `needs_methodology` rather than scored.

### Stage 2 — Discovery confirmed

field_rules:
- `Pain_Point__c IS NOT NULL`
- `Decision_Timeline__c IS NOT NULL`
- `Budget_Range__c IS NOT NULL`

activity_rules:
- `Task.Type = 'Discovery Call'` in last 30 days

stakeholder_rules:
- `OpportunityContactRole` includes a contact with role matching `/^(Manager|Director|VP)/`

evidence_required (qualitative — checked against Gong):
- `Pain_Point__c`
- `Decision_Timeline__c`

### Stage 3 — Solution validated

field_rules:
- `Success_Criteria__c IS NOT NULL`
- `Technical_Validation_Complete__c = true`
- `Decision_Criteria__c IS NOT NULL`

activity_rules:
- `Task.Type = 'Demo'` in last 45 days
- `Task.Type = 'Technical Deep Dive'` in last 30 days

stakeholder_rules:
- `OpportunityContactRole` includes a contact with role matching `/^(VP|Director)/`
- At least one contact with `Is_Technical_Buyer__c = true`

evidence_required (qualitative):
- `Success_Criteria__c`

### Stage 4 — Negotiation

field_rules:
- `Economic_Buyer__c IS NOT NULL`
- `Decision_Criteria__c IS NOT NULL`
- `Paper_Process__c IS NOT NULL`
- `Close_Plan__c IS NOT NULL`
- `Competitive_Landscape__c IS NOT NULL`

activity_rules:
- `Task.Type = 'Pricing Discussion'` in last 30 days

stakeholder_rules:
- `OpportunityContactRole` includes a contact with role matching `/^(VP|Director|C.+O)/`

evidence_required (qualitative):
- `Economic_Buyer__c`
- `Close_Plan__c`

### Stage 5 — Verbal commit

field_rules:
- `Verbal_Commit_Date__c IS NOT NULL`
- `Procurement_Engaged__c = true`
- `MSA_Status__c IN ('In review', 'Approved')`

activity_rules:
- `Task.Type = 'Procurement Call'` in last 21 days

stakeholder_rules:
- `OpportunityContactRole` includes one contact with role `Procurement`
- `OpportunityContactRole` includes one contact with role `Legal` if `MSA_Status__c = 'In review'`

evidence_required (qualitative):
- `Verbal_Commit_Date__c`

## Recording-coverage floor (per stage)

Minimum recorded calls in the prior 30 days for the deal. If the deal is below the floor, the skill emits `needs_manager_review` and surfaces the coverage gap rather than scoring qualitative checks.

| Stage | Min recorded calls in last 30 days |
|---|---|
| Stage 2 | 1 |
| Stage 3 | 2 |
| Stage 4 | 2 |
| Stage 5 | 1 |
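
A sketch of how the floor could be applied before qualitative scoring, in Python. The dictionary mirrors the template values in the table above; the function name and the choice to skip the gate for stages with no declared floor are assumptions.

```python
# Template values from the table above; replace with the team's real floors.
RECORDING_COVERAGE_FLOOR = {"Stage 2": 1, "Stage 3": 2, "Stage 4": 2, "Stage 5": 1}


def coverage_gap_reason(stage: str, recorded_calls_30d: int):
    """Return a Reason string if the deal is below the floor, else None."""
    floor = RECORDING_COVERAGE_FLOOR.get(stage)
    if floor is None:
        return None  # assumption: no declared floor means no coverage gate
    if recorded_calls_30d < floor:
        return (f"low recording coverage: {recorded_calls_30d} recorded calls "
                f"in last 30 days (floor: {floor})")
    return None


print(coverage_gap_reason("Stage 3", 1))
print(coverage_gap_reason("Stage 3", 2))   # None
```

A non-`None` result would route the deal to `needs_manager_review` with the reason attached, instead of scoring the qualitative checks.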

# Methodology mapping — TEMPLATE

> Replace this template's contents with the team's real mapping. The skill uses
> this to translate methodology concepts (e.g. MEDDPICC's "Economic Buyer")
> into the Salesforce field that holds the answer and into Gong phrase patterns
> that count as supporting evidence.

## Methodology in use

`MEDDPICC` (replace if your team uses a different framework — see worked examples for MEDDIC, SPICED, and a custom framework below).

## MEDDPICC mapping (replace contents with team's real fields)

| MEDDPICC concept | Salesforce field | Evidence required | Phrase patterns |
|---|---|---|---|
| Metric | `Success_Metric__c` | gong | quantitative-language pattern (numbers, units, deltas) within 20 tokens of the field value |
| Economic Buyer | `Economic_Buyer__c` | gong | the buyer's name within 12 tokens of decision-language: `approve`, `sign off`, `budget owner`, `final say`, `the call is mine` |
| Decision Criteria | `Decision_Criteria__c` | none | n/a |
| Decision Process | `Decision_Process__c` | gong | step-language pattern: ordinal markers (`first`, `then`, `after that`) with named owners |
| Paper Process | `Paper_Process__c` | gong | procurement or legal entity name within 30 tokens of `MSA`, `redline`, `security review`, `vendor onboarding` |
| Identify Pain | `Pain_Point__c` | gong | the rep-claimed pain phrase or a synonym in customer's own voice (not the rep's) |
| Champion | `Champion__c` | gong | the named contact speaking on the customer's behalf in at least one call where the rep is mostly listening |
| Competition | `Competitive_Landscape__c` | none | n/a |

## MEDDIC mapping (worked example for teams on MEDDIC, not MEDDPICC)

Replace your `methodology in use` above with `MEDDIC` and use this table instead:

| MEDDIC concept | Salesforce field | Evidence required | Phrase patterns |
|---|---|---|---|
| Metrics | `Success_Metric__c` | gong | quantitative-language pattern |
| Economic Buyer | `Economic_Buyer__c` | gong | name within 12 tokens of decision-language |
| Decision Criteria | `Decision_Criteria__c` | none | n/a |
| Decision Process | `Decision_Process__c` | gong | step-language pattern |
| Identify Pain | `Pain_Point__c` | gong | pain phrase in customer's voice |
| Champion | `Champion__c` | gong | customer-led call segment |

## SPICED mapping (worked example)

| SPICED concept | Salesforce field | Evidence required | Phrase patterns |
|---|---|---|---|
| Situation | `Current_State__c` | none | n/a |
| Pain | `Pain_Point__c` | gong | pain phrase in customer's voice |
| Impact | `Quantified_Impact__c` | gong | quantified-cost language: currency or time units within 20 tokens of pain |
| Critical Event | `Critical_Event__c` | gong | date-bounded urgency: a specific date or quarter within 15 tokens of consequence-language (`miss`, `slip`, `risk`) |
| Decision | `Decision_Process__c` | gong | named decision steps with owners |

## Custom framework template

If the team uses a homegrown rubric, list each concept on its own row with the same four columns. The skill treats `Salesforce field` as the ground truth for "what was claimed" and the `Phrase patterns` as the ground truth for "what counts as supporting evidence in Gong."

| Custom concept | Salesforce field | Evidence required | Phrase patterns |
|---|---|---|---|
| {concept} | {field} | `gong` or `none` | {regex or natural-language phrase rule} |
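
Because every mapping table shares the same four columns, a small parser can turn one into a lookup keyed by Salesforce field and flag rows that require Gong evidence without naming a pattern. The Python sketch below is hypothetical and assumes cells contain no literal `|` characters; the third row of the sample is deliberately broken to show the check.

```python
SAMPLE = """\
| SPICED concept | Salesforce field | Evidence required | Phrase patterns |
|---|---|---|---|
| Situation | `Current_State__c` | none | n/a |
| Pain | `Pain_Point__c` | gong | pain phrase in customer's voice |
| Impact | `Quantified_Impact__c` | gong | n/a |
"""


def parse_mapping(markdown: str) -> dict[str, dict[str, str]]:
    """Map Salesforce field name to its concept, evidence type, and patterns."""
    mapping = {}
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 4:
            continue
        concept, field, evidence, patterns = cells
        # Skip the header row and the |---| separator row.
        if field == "Salesforce field" or set(field) <= set("-: "):
            continue
        mapping[field.strip("`")] = {
            "concept": concept,
            "evidence": evidence.strip("`"),
            "patterns": patterns,
        }
    return mapping


def fields_missing_patterns(mapping: dict[str, dict[str, str]]) -> list[str]:
    """Fields marked `gong` that provide no phrase pattern."""
    return [f for f, row in mapping.items()
            if row["evidence"] == "gong" and row["patterns"].lower() in ("", "n/a")]


mapping = parse_mapping(SAMPLE)
print(fields_missing_patterns(mapping))   # ['Quantified_Impact__c']
```

Fields returned by the second function are the ones for which the skill would emit `needs_methodology` rather than guess what counts as evidence.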

## Last reviewed

YYYY-MM-DD

# Sample output format — REFERENCE

> The stage-progression-validator skill must emit the report in this exact
> format. Downstream renderers (Slack digest job, weekly email) parse this
> Markdown — keep section headings and ordering stable.

## Report header

```markdown
# Stage progression validation — week ending YYYY-MM-DD

Window: YYYY-MM-DD → YYYY-MM-DD
Methodology: MEDDPICC (rubric last reviewed YYYY-MM-DD)
Opportunities scored: N
- pass: N
- flag: N
- fail: N
- needs_manager_review: N
- needs_methodology: N
```

If the rubric `last_reviewed` is more than 90 days old, prepend a single line: `> WARNING: stage-criteria rubric last reviewed YYYY-MM-DD (over 90 days).`
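
The staleness check is a date comparison. A minimal Python sketch, with the function name and the explicit `today` parameter (passed in so the check is testable) as assumptions:

```python
from datetime import date


def stale_rubric_warning(last_reviewed: str, today: date, max_age_days: int = 90):
    """Return the warning line if the rubric is stale, else None."""
    reviewed = date.fromisoformat(last_reviewed)
    if (today - reviewed).days > max_age_days:
        return (f"> WARNING: stage-criteria rubric last reviewed "
                f"{last_reviewed} (over {max_age_days} days).")
    return None


print(stale_rubric_warning("2026-01-01", date(2026, 5, 2)))
print(stale_rubric_warning("2026-02-01", date(2026, 5, 2)))   # None (exactly 90 days)
```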

## fail section

One block per failed deal. Order by deterministic-score ascending (worst first), tie-break by qualitative-score ascending.

```markdown
## fail (N)

### {Account name} — {New stage label}
- Opp ID: 006xxxxxxxxxxxxxxx
- Owner: owner@example.com
- Progressed: YYYY-MM-DD
- Deterministic score: D.DD (X of Y rules satisfied)
- Qualitative score: D.DD (X of Y claims supported)

Deterministic misses:
- `{field}` is NULL
- `OpportunityContactRole` has no role matching `/{regex}/`
- `Task.Type = '{type}'` not found in last {N} days

Qualitative misses:
- `{field}` claim: no Gong call in last 30 days matches pattern `{pattern_name}`
- `{field}` claim: no Gong call in last 30 days contains `{pattern}` near claimed value

Recording coverage: {N} recorded calls in last 30 days (floor: {M}).
```
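
The ordering rule for the `fail` section (deterministic score ascending, ties broken by qualitative score ascending) maps onto a tuple sort key. A short Python sketch; the row shape is an assumption and the account names other than the one used in the sample report are made up.

```python
def order_fail_rows(rows: list[dict]) -> list[dict]:
    """Worst deterministic score first; ties broken by qualitative score."""
    return sorted(rows, key=lambda r: (r["deterministic"], r["qualitative"]))


rows = [
    {"account": "Initech", "deterministic": 0.40, "qualitative": 0.50},
    {"account": "Acme Corp", "deterministic": 0.40, "qualitative": 0.00},
    {"account": "Globex", "deterministic": 0.20, "qualitative": 0.50},
]
print([r["account"] for r in order_fail_rows(rows)])
# ['Globex', 'Acme Corp', 'Initech']
```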

## flag section

Same block format as `fail`. Order by combined score ascending.

## needs_manager_review section

Same block format. Add a one-line `Reason:` field naming why the deal landed in the borderline band — `low recording coverage`, `one rule short`, `mixed signal across deterministic and qualitative`, etc.

## needs_methodology section

```markdown
## needs_methodology (N)

| Opp | Owner | New stage | Reason |
|---|---|---|---|
| {Opp ID} | {owner} | {stage label} | no rubric entry for stage |
```

## pass section

Tabular, no per-deal block — passes are not interesting in the digest.

```markdown
## pass (N)

| Opp | Owner | New stage | Deterministic | Qualitative |
|---|---|---|---|---|
| {Opp ID} | {owner} | {stage label} | 1.00 | 1.00 |
```

## Footer

```markdown
---
Generated by stage-progression-validator skill at YYYY-MM-DDTHH:MM:SSZ
Inputs: methodology_path={path}, borderline_threshold={float}
```