An n8n flow that watches Greenhouse for newly-opened reqs, finds the past candidates who reached a late interview stage on a related req and were rejected for a non-disqualifying reason — the “silver medalists” — re-scores each against the new req’s rubric with Claude, and posts a ranked shortlist to one Slack channel. It never contacts anyone, never adds a candidate to a pipeline, and never moves a candidate in the ATS. The recruiter decides every outreach. It turns “we hired someone else last spring, who was the runner-up again?” from a 40-minute archaeology dig into a Slack message that lands on the first poll after the req opens (within 6 hours on the default schedule).
When to use
You run on Greenhouse (or another ATS with a read API — the intake nodes can be swapped), and you open enough reqs in recurring job families that last year’s finalists are this year’s shortlist.
You actually reject finalists with structured rejection reasons. The flow’s whole safety model rests on telling “hired someone else” apart from “failed the background check.” If your team rejects everyone with a single generic reason, fix that first; the flow has nothing to gate on.
You have feeder reqs to point at. The flow does not guess which past reqs are “related” — you list the past Greenhouse job IDs per job family in a config file. That makes the match auditable instead of a similarity black box.
A recruiter walks the digest and decides outreach. The flow surfaces and ranks; a human re-screens and contacts.
When NOT to use
Auto-outreach in the loop. The flow ranks and posts to Slack; it never emails, never adds to a sequence, never moves a stage. Wiring an outreach send to the digest turns a re-contact suggestion into automated processing of candidate data — and re-contacting a candidate past the retention period you disclosed to them is a GDPR violation, not a growth hack. The digest’s Confirm first: line per candidate exists precisely so a recruiter checks consent and freshness before any message.
No recency window. GDPR requires you not to hold or re-process candidate data beyond the retention period you told the candidate about — commonly 12–24 months for unsuccessful applicants. The flow’s recency_months gate drops anyone past the window. Setting it longer than your stated retention period to widen the pool is the one edit that turns this flow into a liability.
Rejection reasons you can’t trust. If “Position filled” is silently used for “we had concerns,” the deny-list can’t protect you. The flow is only as safe as the rejection-reason discipline behind it.
Tiny or one-off hiring. A team opening three unrelated reqs a year is faster reading its own memory than authoring a rubric and a feeder-req list. The setup earns back when a job family recurs.
Confidential or executive searches. Different consent posture, different audit chain. These do not belong in a shared Slack channel.
Setup
Import the flow. Drop apps/web/public/artifacts/candidate-rediscovery-n8n/candidate-rediscovery-n8n.json into your n8n instance. Every node carries notesInFlow: true, so the in-canvas notes explain each choice.
Wire the credentials. Three: PLACEHOLDER_GREENHOUSE_CRED_ID (Harvest API key, read scope only — Jobs, Applications, Scorecards), PLACEHOLDER_ANTHROPIC_CRED_ID (Claude API key), PLACEHOLDER_SLACK_CRED_ID (Slack bot token with chat:write for #talent-rediscovery). The bundle’s _README.md shows where each value lives.
Author one config file per job family at ${CONFIG_DIR}/<family>.json. It holds the match_job_ids (the feeder reqs), min_stage_reached (the late-stage gate), the rejection-reason allow- and deny-lists, recency_months, fit_threshold, top_n, and the rubric. The full format is in _README.md. No config for a family → the flow halts with missing_config rather than scoring against defaults.
Set the lookback. POLL_LOOKBACK_HOURS must be ≥ the schedule interval (default 6h) or a req opened between polls slips through. The two are tuned together.
Dry-run on a family you just hired for. The runners-up you remember should land near the top of the digest. Tune min_stage_reached and the rubric anchors against your memory before trusting it on a fresh family.
Enable the trigger. Flip active: true only after a digest you would actually act on.
What the flow does
Twelve nodes, in order. The deterministic consent and fairness gates run before the model call, because letting an LLM loose on the full reject archive is how you re-contact someone who asked you never to.
Every 6 Hours — schedule trigger. Greenhouse has no reliable job-created webhook, so the flow polls.
Fetch New Open Reqs — GET /v1/jobs?status=open&created_after=… against Greenhouse Harvest. The JSON array splits into one item per new req.
Load Match Config — resolves the req’s job family, loads its config, hashes it for the audit log. Halts on missing_config.
Config Loaded? — IF gate; reqs without a config stop here.
Fetch Rejected Pool — GET /v1/applications?status=rejected&last_activity_after=…, paginated. One item per rejected application.
Eligibility Filter — the five-gate floor: feeder-req match, late-stage reached, rejection-reason allow/deny (deny wins), recency window, do-not-contact suppression. Everything else is dropped before any model sees it.
Fetch Scorecards — pulls the candidate’s prior interview scorecards, the grounding text for the re-match.
Claude Re-Match — scores the past candidate against the new req’s rubric on Sonnet 4.6, told explicitly not to inherit the old reject decision and not to score on protected-class proxies. Evidence-required: no verbatim scorecard citation → fit 1.
Parse + Keep — enforces the evidence rule, flags keep when fit ≥ the config threshold.
Audit Append — one pseudonymous JSONL line per scored candidate (candidate ID + link, no name, no scorecard text).
Build Digest — groups by req, de-duplicates a candidate who matched via two feeder reqs (higher fit wins), ranks, truncates to top_n.
Slack Digest — posts one ranked shortlist per req to #talent-rediscovery, each candidate with a one-line reason to re-surface and a Confirm first: note.
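The evidence rule that Parse + Keep enforces can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the flow's node code: the reply shape (`fit`, `evidence`) and the function name `parse_and_keep` are hypothetical, and "verbatim" is taken to mean an exact substring of the scorecard text.

```python
import json


def parse_and_keep(raw_reply: str, scorecard_text: str, fit_threshold: int) -> dict:
    """Enforce the evidence rule, then flag keep against the config threshold.

    Assumed reply shape (hypothetical): {"fit": <1-5>, "evidence": "<quote>"}.
    """
    reply = json.loads(raw_reply)
    fit = int(reply.get("fit", 1))
    evidence = (reply.get("evidence") or "").strip()

    # A citation only counts if the quote appears verbatim in the scorecards.
    cited = bool(evidence) and evidence in scorecard_text
    if not cited:
        fit = 1  # no verbatim citation forces fit to 1

    return {
        "fit": fit,
        "evidence": evidence if cited else "",
        "keep": fit >= fit_threshold,
    }
```

Checking the quote against the scorecard text in code, rather than trusting the model to self-report, is what makes a paraphrased or invented citation fall to fit 1.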
Cost reality
Anthropic API tokens — each candidate sends scorecard text + rubric (~4-5k input tokens) and returns ~300 output tokens. On Sonnet 4.6 list pricing that lands around $0.015-0.03 per candidate scored, so a family pulling 200 eligible silver medalists costs roughly $3-6 per req opened (computed from token counts, not measured on your data).
Greenhouse Harvest calls — read-only: one jobs poll, one paginated applications pull, one scorecards fetch per eligible candidate. This stays within Harvest’s documented per-key rate limit for any realistic family size.
n8n cost — self-hosted in a container is free. n8n Cloud’s Starter plan covers the polling volume; only very high req throughput needs Pro.
Recruiter time — the win. Hand-reconstructing a silver-medalist list across past reqs is the better part of an hour per req; the digest lands ranked, with consent flags and re-screen prompts pre-staged, on the first poll after the req opens (within 6 hours on the default schedule).
The economics behind the win. Published recruiting benchmarks put cost-per-hire above $4,500 and a rediscovered hire’s savings at roughly $2,000-3,000, with time-to-fill on rediscovery hires falling 20-30 days. Teams typically start at a 5-15% rediscovery rate and target 35-50% within a year; the silver-medalist hire rate benchmark sits around 8-15%. The flow exists to make hitting those numbers a default, not a quarterly project.
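The token arithmetic behind the per-req figure can be reproduced with a short calculation. The two prices below are assumptions for illustration (USD per million tokens), not figures taken from this document; substitute the current list price for the model you run.

```python
# Assumed list prices in USD per million tokens; check the current price list.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def cost_per_candidate(input_tokens: int, output_tokens: int = 300) -> float:
    """Cost of one re-match call from its token counts."""
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


def cost_per_req(candidates: int, input_tokens: int, output_tokens: int = 300) -> float:
    """Cost of scoring every eligible candidate for one newly-opened req."""
    return candidates * cost_per_candidate(input_tokens, output_tokens)


low = cost_per_req(200, 4_000)   # 200 candidates at ~4k input tokens each
high = cost_per_req(200, 5_000)  # 200 candidates at ~5k input tokens each
print(f"${low:.2f} to ${high:.2f} per req opened")
```

Under these assumed prices the result sits at the low end of the range quoted above; longer scorecards or retried calls push it toward the high end.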
Success metric
Track three numbers per job family per quarter:
Shortlist-to-screen rate — share of digest candidates a recruiter takes to a re-screen. Below ~20% means the rubric or min_stage_reached is too loose; tighten anchors before widening the pool.
Rediscovery hire rate — share of hires in the family sourced from the digest. The 8-15% benchmark is the target; below 5% after two quarters means the feeder-req list or recency window is too narrow.
Time-from-req-open to first qualified slate — the candidate-experience and hiring-manager metric. The digest should move this from days to the same day.
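The first two metrics are simple ratios, and the thresholds above can be checked mechanically each quarter. A small sketch, with hypothetical function and field names, assuming you can count digest candidates, re-screens, and hires per family:

```python
def rate(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`; 0.0 when there is no data."""
    return part / whole if whole else 0.0


def quarter_review(
    digest_candidates: int,
    rescreened: int,
    family_hires: int,
    digest_hires: int,
) -> dict:
    """Compute the two rate metrics and flag the thresholds named in the text."""
    screen_rate = rate(rescreened, digest_candidates)
    hire_rate = rate(digest_hires, family_hires)
    return {
        "shortlist_to_screen": screen_rate,
        "rediscovery_hire_rate": hire_rate,
        # Below ~20%: rubric or min_stage_reached is too loose.
        "rubric_too_loose": screen_rate < 0.20,
        # Below 5%: only meaningful after two quarters of data.
        "pool_too_narrow": hire_rate < 0.05,
    }
```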
vs alternatives
vs Gem or hireEZ rediscovery — these are managed talent-CRM products with their own re-engagement campaigns and a candidate graph; pick them if you want the platform and the budget supports it. Pick the flow if you want the matching rules, the deny-list, and the audit log version-controlled in your own repo, scoped to feeder reqs you choose, with the digest landing in your stack.
vs Greenhouse’s own “prospect pool” search — native search finds candidates by keyword and stage but does not re-score them against a new req’s rubric with cited evidence, and the relevance ranking is a black box. Pick the flow when the per-candidate reason_to_resurface and Confirm first: lines are what make the recruiter act.
vs a recruiter manually mining the ATS — same quality on a good day, but the recruiter forgets the recency window, skips the deny-list under deadline pressure, and only does it for the reqs they remember. The flow does it for every recurring req, every time, with the consent gates non-optional.
Watch-outs
Re-contacting beyond retention. Guard: the recency_months gate drops anyone past the disclosed retention window before scoring, and the audit log records the window used. Set it to your stated retention period or shorter — never longer to grow the pool.
Disqualified candidates resurfacing. Guard: the rejection-reason deny-list runs before the model and deny wins over allow. Failed background/reference checks, conduct concerns, no work authorization, and explicit do-not-contact reasons can never reach the digest. The discipline depends on honest rejection reasons upstream.
Bias carry-forward from old decisions. Guard: the model is instructed not to inherit the prior reject verdict — a candidate passed over because someone else was chosen can be a 5 for a new req — and not to score on name, school as a standalone signal, age, gender, or employment gaps. The config_sha in the audit log makes the matching rules used on any date reproducible under an AI-screening-bias review.
Stale candidate state. Guard: the digest’s Confirm first: line per candidate forces the recruiter to verify the person is still in-region, still interested, and still a fit before outreach; the flow asserts a match, not a current fact. Active-elsewhere candidates are the recruiter’s check in Greenhouse, noted in the bundle’s known limits.
Thin scorecards scoring low. Guard: scorecard text is the only grounding, so a candidate rejected before substantive interviews scores low by design. Raise min_stage_reached rather than feeding the model resumes it cannot see.
Stack
The artifact bundle lives at apps/web/public/artifacts/candidate-rediscovery-n8n/ and contains:
candidate-rediscovery-n8n.json — the n8n flow export (every node configured, no stub parameters)
_README.md — credential setup, config-file format, the consent and fairness gates, the dry-run procedure
# Candidate rediscovery (silver medalists) — n8n flow
This flow polls Greenhouse for newly-opened reqs, finds past candidates who reached a late stage on a related req and were rejected for a non-disqualifying reason ("silver medalists"), re-scores each against the new req's rubric with Claude (Sonnet 4.6 by default), and posts a ranked shortlist to Slack. It never contacts a candidate, never adds anyone to a pipeline, and never moves a candidate in Greenhouse. The recruiter decides every outreach.
This README covers import, credentials, the per-job-family config format, the consent and fairness gates, and the dry-run procedure.
## Import
1. Open n8n → Workflows → Import from file → pick `candidate-rediscovery-n8n.json`.
2. Set the workflow timezone (top of the canvas) to your team's working timezone for sane audit-log timestamps. The default is UTC.
3. Do not enable the workflow yet. Configure credentials and at least one job-family config, complete the dry-run, then flip to enabled.
## Credentials (three required)
### `PLACEHOLDER_GREENHOUSE_CRED_ID` — Greenhouse Harvest API key
- Greenhouse admin → Configure → Dev Center → API Credential Management → Create New API Key → type "Harvest". Grant only the read permissions the flow uses: `GET` on Jobs, Applications, and Scorecards. The flow never writes to Greenhouse.
- In n8n, create an HTTP Basic Auth credential. Username = the API token. Password = empty. (Harvest authenticates as base64 of `token:` with a trailing colon — n8n's Basic Auth credential does this for you.)
- Bind the credential to the three Greenhouse nodes: `Fetch New Open Reqs`, `Fetch Rejected Pool`, `Fetch Scorecards`.
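For anyone calling Harvest outside n8n (for example, to test the key before binding it), the Basic Auth encoding described above looks like this. A minimal sketch; the function name `harvest_auth_header` is hypothetical.

```python
import base64


def harvest_auth_header(api_token: str) -> dict:
    """Build the HTTP Basic Auth header: username = token, password = empty."""
    # The credential string is "token:" with the trailing colon, base64-encoded.
    encoded = base64.b64encode(f"{api_token}:".encode("ascii")).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}
```

Inside n8n the Basic Auth credential produces the same header, so this is only needed for ad hoc checks.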
### `PLACEHOLDER_ANTHROPIC_CRED_ID` — Anthropic API key
- console.anthropic.com → API Keys → Create Key. Restrict by IP if your n8n is behind a fixed egress.
- In n8n, create a credential of type "Anthropic API". Paste the key.
- Bind to the `Claude Re-Match` node. The model is set to `claude-sonnet-4-6` in the request body — change it there if you want to test other models.
### `PLACEHOLDER_SLACK_CRED_ID` — Slack bot token
- Create (or reuse) a Slack app with the `chat:write` scope. Install to the workspace. Invite the bot into `#talent-rediscovery`.
- In n8n, create a Slack credential with the bot token (`xoxb-…`).
- Bind to the `Slack Digest` node.
### Environment variables
- `CONFIG_DIR` — directory holding the per-job-family config files. Default `/data/rediscovery`.
- `AUDIT_DIR` — directory for the JSONL audit log. Default `/data/audit`.
- `POLL_LOOKBACK_HOURS` — how far back, in hours, `Fetch New Open Reqs` looks for newly-opened reqs. Must be **≥** the schedule interval (6 hours by default) or a req opened between polls will be missed. Default 6.
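The relationship between `POLL_LOOKBACK_HOURS` and the schedule interval can be made explicit when computing the `created_after` value. This is a sketch under assumptions: the function name is hypothetical, and the timestamp format is taken to be ISO 8601 in UTC.

```python
from datetime import datetime, timedelta, timezone


def created_after(now: datetime, lookback_hours: float, schedule_hours: float = 6.0) -> str:
    """Return the created_after cutoff, refusing a lookback shorter than the poll interval.

    `now` must be timezone-aware.
    """
    if lookback_hours < schedule_hours:
        raise ValueError(
            "POLL_LOOKBACK_HOURS is shorter than the schedule interval; "
            "reqs opened between polls would be missed"
        )
    cutoff = now.astimezone(timezone.utc) - timedelta(hours=lookback_hours)
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")
```

A lookback longer than the interval is safe for coverage, but consecutive windows then overlap, so the same req can appear in two polls.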
## Config file format (one per job family)
The flow expects one config file per job family at `${CONFIG_DIR}/<family>.json`. The family is resolved from the new req's `job_family` custom field, or — if that is absent — the slugified name of the req's first department. Missing config → the flow halts for that req with `missing_config` and leaves the req for manual sourcing.
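The family resolution described above might look like the following. Treat it as a sketch: the job payload shape (`custom_fields`, `departments[0].name`) and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) are assumptions, not taken from the flow export.

```python
import re
from pathlib import Path
from typing import Optional


def slugify(name: str) -> str:
    """Lowercase and collapse every run of non-alphanumerics to one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def resolve_family(job: dict) -> Optional[str]:
    """Prefer the job_family custom field; fall back to the first department."""
    family = (job.get("custom_fields") or {}).get("job_family")
    if family:
        return family
    departments = job.get("departments") or []
    if departments and departments[0].get("name"):
        return slugify(departments[0]["name"])
    return None  # caller treats this like missing_config


def config_path(config_dir: str, family: str) -> Path:
    """Location of the per-family config file."""
    return Path(config_dir) / f"{family}.json"
```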
The config is the only place the matching rules live. Copy this, replace every value, and save as `<family>.json`:
```json
{
"job_family": "backend-engineer",
"version": "2026-06-15",
"match_job_ids": [4012, 3987, 3654],
"recency_months": 18,
"min_stage_reached": ["Onsite", "Final Interview", "Reference Check", "Offer"],
"rejection_reasons_allow": [
"Position filled — strong candidate",
"Hired another candidate",
"Kept warm for future role",
"Timing — not ready to move"
],
"rejection_reasons_deny": [
"Failed background check",
"Not legally authorized to work",
"Conduct / values concern",
"Failed reference check",
"Withdrew — compensation mismatch",
"Do not contact"
],
"do_not_contact_tags": ["do-not-contact", "gdpr-erased", "opted-out"],
"fit_threshold": 4,
"top_n": 10,
"rubric": {
"role": "Senior Backend Engineer (Distributed Systems)",
"dimensions": {
"fit": {
"must_have": [
"Production Go or Rust (3y+)",
"Owned a distributed-system migration"
],
"anchors": {
"5": "Late-stage scorecards show owned, measurable distributed-system outcomes that map to this req's must-haves",
"4": "Strong scorecards on the core skill; one must-have unconfirmed",
"3": "Adjacent skills; would need a fresh screen on the core must-have",
"2": "Partial overlap; likely a stretch for this req",
"1": "No scorecard evidence the candidate matches this req"
}
}
}
}
}
```
- `match_job_ids` are the **feeder reqs** — the past Greenhouse job IDs whose late-stage rejects count as silver medalists for this family. Find them in the URL of each job in Greenhouse. This is what scopes "related req"; the flow does not guess relatedness.
- `min_stage_reached` is the late-stage gate. A candidate rejected at "Application Review" or "Phone Screen" is not a silver medalist — they never got a real read. Use your own stage names exactly as they appear in Greenhouse.
- `rejection_reasons_deny` is the safety floor and **deny wins over allow**. Any disqualifying reason — failed background/reference check, conduct, no work authorization, an explicit do-not-contact — must be listed here so the candidate is never re-surfaced.
- The config is hashed (SHA-256, first 16 hex chars) per run and the hash is written to the audit log and the Slack footer, so the exact rules used on a given date are reproducible.
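To check a Slack footer or audit entry against a file on disk, the hash can be recomputed locally. This sketch assumes the flow hashes the raw file bytes; if it hashes a re-serialized form of the JSON instead, the values will differ.

```python
import hashlib


def config_sha(config_bytes: bytes) -> str:
    """SHA-256 of the config, truncated to the first 16 hex characters."""
    return hashlib.sha256(config_bytes).hexdigest()[:16]
```

Usage, with a hypothetical path: `config_sha(open("/data/rediscovery/backend-engineer.json", "rb").read())`. Under the raw-bytes assumption, even a whitespace-only edit to the file produces a new hash.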
## Consent and fairness gates (do not weaken to widen the pool)
Two layers protect the candidate. The first is deterministic and runs **before** the LLM call; the second constrains the call itself:
1. **`Eligibility Filter`** drops any application that is not a feeder-req match, did not reach a late stage, carries a disqualifying or non-allow-listed rejection reason, falls outside the recency window, or whose candidate carries a do-not-contact / erasure / opt-out tag.
2. **`Claude Re-Match`** is instructed not to inherit the prior reject decision and not to score on protected-class proxies (name, school as a standalone signal, age, gender, employment gaps), and to cite verbatim scorecard evidence — no citation forces fit to 1.
The recency window exists because GDPR requires you not to hold or re-process candidate data beyond the retention period you told the candidate about — commonly 12–24 months for unsuccessful applicants. Set `recency_months` to your stated retention period or shorter; never longer. Candidates past the window are dropped, not re-contacted.
If you find yourself wanting to delete a deny-list reason or stretch the recency window to grow the shortlist, that is exactly the decision a recruiter — not the flow — should make case by case, in Greenhouse, with the candidate's consent status in view.
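The gate order in `Eligibility Filter` can be sketched as a single function. The application fields used here (`job_id`, `stage_reached`, `rejection_reason`, `rejected_at`, `candidate_tags`) are hypothetical names for illustration, and measuring recency from the rejection date is an assumption; the config keys match the format shown earlier.

```python
import calendar
from datetime import datetime, timezone


def months_before(now: datetime, months: int) -> datetime:
    """Same day-of-month `months` earlier, clamped to the month's length."""
    year, month0 = divmod(now.year * 12 + (now.month - 1) - months, 12)
    month = month0 + 1
    day = min(now.day, calendar.monthrange(year, month)[1])
    return now.replace(year=year, month=month, day=day)


def eligibility(app: dict, config: dict, now: datetime) -> tuple[bool, str]:
    """Run the five gates in order; return (eligible, reason_dropped)."""
    if app["job_id"] not in config["match_job_ids"]:
        return False, "not_feeder_req"
    if app["stage_reached"] not in config["min_stage_reached"]:
        return False, "stage_too_early"
    reason = app["rejection_reason"]
    # Deny is checked first, so it wins even if the reason is also allow-listed.
    if reason in config["rejection_reasons_deny"]:
        return False, "denied_reason"
    if reason not in config["rejection_reasons_allow"]:
        return False, "reason_not_allowed"
    if app["rejected_at"] < months_before(now, config["recency_months"]):
        return False, "outside_recency_window"
    if set(app["candidate_tags"]) & set(config["do_not_contact_tags"]):
        return False, "suppressed"
    return True, ""
```

Returning the drop reason alongside the verdict is a design choice for the sketch: it makes the dry-run question "why is this person missing?" answerable without re-reading the config.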
## Dry-run procedure
1. Author one config file for a family where you recently filled a role and remember the runner-up candidates.
2. Temporarily point `match_job_ids` at the feeder reqs and set the new-req trigger to fire manually: in n8n, click "Execute workflow" with `Fetch New Open Reqs` returning the already-open target req (or pin a sample job item).
3. Read the Slack digest. The runners-up you remember should appear near the top. If a known strong silver medalist is missing, check, in order: were they within the recency window, did they reach a `min_stage_reached` stage, was their rejection reason allow-listed, do they carry a suppression tag.
4. If obvious mis-fits rank high, the rubric anchors are too loose or the scorecards are thin — look at the `evidence` line in the digest. Empty or paraphrased evidence means the model had little to work with (the candidate was rejected before substantive interviews); raise `min_stage_reached`.
5. Only switch the workflow `active: true` after a digest you would actually act on.
## First-run sanity check
After enabling, watch the first real digest:
1. Confirm the `Confirm first:` line on each candidate is specific (e.g. "still in-region; was a 2024 reject so re-screen on the new framework"). Generic lines mean the model is guessing — check it is on Sonnet 4.6.
2. Confirm the `config <sha>` in the Slack footer matches the file you authored. A mismatch means the wrong family file loaded.
3. Confirm `${AUDIT_DIR}/rediscovery-<YYYY-MM>.jsonl` exists and has one line per scored candidate. No file means you are operating without the audit trail that a GDPR / EEOC inquiry about automated re-contact would require.
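For reference when inspecting the audit file, a pseudonymous line could be built and appended as below. The field names are hypothetical; the only properties taken from this README are the file naming pattern, one line per scored candidate, and the absence of names and scorecard text.

```python
import json
from datetime import datetime
from pathlib import Path


def audit_path(audit_dir: str, when: datetime) -> Path:
    """Monthly JSONL file: rediscovery-<YYYY-MM>.jsonl under AUDIT_DIR."""
    return Path(audit_dir) / f"rediscovery-{when:%Y-%m}.jsonl"


def audit_record(when: datetime, candidate_id: int, candidate_url: str,
                 req_id: int, fit: int, config_sha: str) -> dict:
    """One pseudonymous record: IDs and a link only, no name, no scorecard text."""
    return {
        "ts": when.isoformat(),
        "candidate_id": candidate_id,
        "candidate_url": candidate_url,
        "req_id": req_id,
        "fit": fit,
        "config_sha": config_sha,
    }


def append_audit(audit_dir: str, record: dict, when: datetime) -> Path:
    """Append the record as a single JSON line and return the file path."""
    path = audit_path(audit_dir, when)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return path
```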
## Known limits
- **Active-elsewhere check is the recruiter's, not the flow's.** The pool query returns rejected applications only, so it cannot tell whether a candidate is currently active on another open req. The recruiter sees that in Greenhouse before reaching out; the flow does not auto-suppress active candidates.
- **A candidate who matched via two feeder reqs is scored twice**, then de-duplicated in `Build Digest` (the higher fit wins). The duplicate scoring is a small, bounded token cost, not a correctness problem.
- **Scorecard text is the only grounding.** Greenhouse does not return parsed resume text via Harvest, so a candidate rejected before any substantive interview has thin scorecards and will score low even if their resume is a strong match. That is intended: re-surface people you actually evaluated, not your whole archive.
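- **No dedupe table across runs.** A req that stays open across two polls does not re-fire when `POLL_LOOKBACK_HOURS` equals the schedule interval (the `created_after` filter only catches newly-opened reqs), but a lookback longer than the interval makes consecutive windows overlap, so a req created in the overlap is digested twice, and re-opening a req would also re-digest it. The audit log makes repeats visible; add a seen-reqs check in front of `Load Match Config` if your audit posture needs hard idempotency.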
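The de-duplication mentioned in the second limit above, together with ranking and truncation, can be sketched as follows. The row fields (`req_id`, `candidate_id`, `fit`, `keep`) and the function name are hypothetical; this illustrates the "higher fit wins" rule rather than reproducing the `Build Digest` node.

```python
def build_digest(scored: list[dict], top_n: int) -> dict[int, list[dict]]:
    """Group kept rows by req, keep each candidate's best fit, rank, truncate."""
    best: dict[tuple[int, int], dict] = {}
    for row in scored:
        if not row["keep"]:
            continue
        key = (row["req_id"], row["candidate_id"])
        # A candidate matched via two feeder reqs appears twice; higher fit wins.
        if key not in best or row["fit"] > best[key]["fit"]:
            best[key] = row

    by_req: dict[int, list[dict]] = {}
    for row in best.values():
        by_req.setdefault(row["req_id"], []).append(row)

    return {
        req_id: sorted(rows, key=lambda r: r["fit"], reverse=True)[:top_n]
        for req_id, rows in by_req.items()
    }
```

A seen-reqs check for hard idempotency would sit earlier in the flow and is not covered by this sketch.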