n8n-flow

Candidate rediscovery for silver medalists with n8n

Difficulty

intermediate

Setup time

60 min

For

recruiter · sourcer · talent-acquisition

Recruiting and TA

Stack

An n8n flow that watches Greenhouse for newly opened reqs, finds past candidates who reached a late interview stage on a related req and were rejected for a non-disqualifying reason (the "silver medalists"), re-scores each one against the new req's rubric with Claude, and posts a ranked shortlist to a Slack channel. It never contacts anyone, never adds a candidate to a pipeline, and never moves a candidate in the ATS. The recruiter decides every outreach. It turns "we hired someone else last year; who was the runner-up again?" from a 40-minute archaeological dig into a Slack message that arrives within one poll interval (6 hours by default) of the req opening.

When to use it

You run on Greenhouse (or another ATS with a read API; the ingestion nodes are swappable), and you open enough reqs in recurring job families that last year's finalists are this year's shortlist.
You actually reject finalists with structured rejection reasons. The flow's entire safety model rests on distinguishing "we hired someone else" from "failed the background check". If your team rejects everyone with a single generic reason, fix that first; the flow has nothing to filter on.
You have feeder reqs to point at. The flow does not guess which past reqs are "related": you list the past Greenhouse job IDs per job family in a config file. That makes the match auditable rather than a similarity black box.
A recruiter walks the digest and decides the outreach. The flow surfaces and ranks; a human re-screens and makes contact.

When NOT to use it

Automatic outreach in the loop. The flow ranks and posts to Slack; it never sends email, never adds anyone to a sequence, never moves a stage. Wiring an outreach send to the digest turns a re-contact suggestion into automated processing of candidate data, and re-contacting a candidate past the retention period you told them about is a GDPR violation, not a growth hack. The per-candidate Confirm first: line in the digest exists precisely so a recruiter verifies consent and currency before any message.
No recency window. GDPR requires that you not retain or re-process candidate data beyond the retention period you communicated to the candidate, commonly 12–24 months for unsuccessful applicants. The flow's recency_months gate drops anyone outside the window. Setting it longer than your stated retention period to widen the pool is the one edit that turns this flow into a liability.
Rejection reasons you cannot trust. If "Position filled" is quietly used for "we had concerns", the deny-list cannot protect you. The flow is only as safe as the rejection-reason discipline behind it.
Small or one-off hiring. A team that opens three unrelated reqs a year is faster reading its own memory than writing a rubric and a feeder-req list. The setup pays for itself when a job family repeats.
Confidential or executive searches. Different consent posture, different audit chain. These do not belong in a shared Slack channel.

Setup

Import the flow. Drop apps/web/public/artifacts/candidate-rediscovery-n8n/candidate-rediscovery-n8n.json into your n8n instance. Nodes carry on-canvas notes (notesInFlow: true) that explain each decision.
Connect the credentials. Three: PLACEHOLDER_GREENHOUSE_CRED_ID (Harvest API key, read scope only: Jobs, Applications, Scorecards), PLACEHOLDER_ANTHROPIC_CRED_ID (Claude API key), PLACEHOLDER_SLACK_CRED_ID (Slack bot token with chat:write for #talent-rediscovery). The bundle's _README.md shows where each value lives.
Write one config file per job family at ${CONFIG_DIR}/<family>.json. It holds the match_job_ids (the feeder reqs), min_stage_reached (the late-stage gate), the rejection-reason allow-list and deny-list, recency_months, fit_threshold, top_n, and the rubric. The full format is in _README.md. No config for a family → the flow halts with missing_config instead of scoring against defaults.
Set the lookback. POLL_LOOKBACK_HOURS must be ≥ the schedule interval (6h by default) or a req opened between polls slips through. Tune the two together.
Dry-run on a family you just hired for. The runners-up you remember should land near the top of the digest. Tune min_stage_reached and the rubric anchors against your memory before trusting it on a new family.
Activate the trigger. Set active: true only after a digest you would actually act on.

What the flow does

Twelve nodes, in order. The deterministic consent and fairness gates run before the model call, because turning an LLM loose on the entire reject archive is how you end up re-contacting someone who asked you never to.

Every 6 Hours: schedule trigger. Greenhouse has no reliable job-created webhook, so the flow polls.
Fetch New Open Reqs: GET /v1/jobs?status=open&created_after=… against Greenhouse Harvest. The JSON array is split into one item per new req.
Load Match Config: resolves the req's job family, loads its config, and hashes it for the audit log. Halts on missing_config.
Config Loaded?: IF gate; reqs without a config stop here.
Fetch Rejected Pool: GET /v1/applications?status=rejected&last_activity_after=…, paginated. One item per rejected application.
Eligibility Filter: the five-gate floor (feeder-req match, late stage reached, rejection-reason allow/deny where deny wins, recency window, do-not-contact suppression). Everything else is dropped before any model sees it.
Fetch Scorecards: pulls the candidate's prior interview scorecards, the grounding text for the re-match.
Claude Re-Match: scores the past candidate against the new req's rubric on Sonnet 4.6, with the explicit instruction not to inherit the old reject decision and not to score on protected-class proxies. Evidence is required: no verbatim scorecard quote → fit 1.
Parse + Keep: enforces the evidence rule and marks keep when fit ≥ the config threshold.
Audit Append: one pseudonymous JSONL line per scored candidate (candidate ID + link, no name, no scorecard text).
Build Digest: groups by req, de-duplicates a candidate who matched via two feeder reqs (the higher fit wins), sorts, truncates to top_n.
Slack Digest: posts a ranked shortlist per req to #talent-rediscovery, each candidate with a one-line reason to resurface and a Confirm first: note.
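The evidence rule enforced by Parse + Keep can be sketched as follows. This is a minimal illustration, not the node's actual code: the response fields `fit` and `evidence` and the helper names `normalize` and `parseAndKeep` are assumptions, and "verbatim" is approximated here as a case-insensitive, whitespace-normalized substring match against the scorecard text.

```javascript
// Hypothetical sketch: a fit score only stands when the model quoted the
// scorecard; without a matching quote, fit is forced to 1.
function normalize(s) {
  return String(s || '').replace(/\s+/g, ' ').trim().toLowerCase();
}

function parseAndKeep(modelJson, scorecardText, fitThreshold) {
  const evidence = normalize(modelJson.evidence);
  // Grounded only if the quoted evidence actually appears in the scorecard.
  const grounded = evidence.length > 0 && normalize(scorecardText).includes(evidence);
  const rawFit = Number(modelJson.fit);
  const validFit = Number.isInteger(rawFit) && rawFit >= 1 && rawFit <= 5;
  const fit = grounded && validFit ? rawFit : 1;
  return { fit, grounded, keep: fit >= fitThreshold };
}

const scorecard = 'Led the migration of the billing service to Go; strong systems design.';
console.log(parseAndKeep({ fit: 5, evidence: 'Led the migration of the billing service to Go' }, scorecard, 4));
console.log(parseAndKeep({ fit: 5, evidence: 'Great engineer overall' }, scorecard, 4));
```

The second call drops to fit 1 because the quoted text never appears in the scorecard, so the candidate is not kept regardless of the score the model proposed.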

Cost reality

Anthropic API tokens: each candidate sends scorecard text + rubric (~4-5k input tokens) and returns ~300 output tokens. At Sonnet 4.6 list price that comes to roughly $0.015-0.03 per scored candidate, so a family that pulls in 200 eligible silver medalists costs about $3-6 per opened req (calculated from token counts, not measured on your data).
Greenhouse Harvest calls: read-only, with one jobs poll, one paginated applications pull, and one scorecards fetch per eligible candidate. This stays inside the documented per-key Harvest rate limit for any realistic family size.
n8n cost: self-hosted is free in a container. The n8n Cloud Starter plan covers the polling volume; only very high req throughput needs Pro.
Recruiter time: the win. Rebuilding a silver-medalist list by hand across past reqs takes a good part of an hour per req; the digest arrives ranked, with consent flags and re-screen prompts pre-built, shortly after the first poll that sees the new req (within 6 hours at the default schedule).
The economics behind the win. Published recruiting benchmarks put cost-per-hire above $4,500 and the savings from a rediscovered hire at roughly $2,000-3,000, with time-to-fill for rediscovery hires dropping 20-30 days. Teams typically start at a 5-15% rediscovery rate and aim for 35-50% within a year; the silver-medalist hire-rate benchmark sits around 8-15%. The flow exists to make hitting those numbers a default, not a quarterly project.
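The token arithmetic behind the per-req estimate can be reproduced with a few lines. This is a back-of-envelope sketch: the per-million-token prices below are placeholder assumptions, not figures from this page, so substitute the current list price for the model you run. The function names are hypothetical.

```javascript
// ASSUMED prices in USD per million tokens; replace with the current list price.
const ASSUMED_PRICE_PER_MTOK = { input: 3, output: 15 };

// Cost of scoring one candidate from its input and output token counts.
function costPerCandidate(inputTokens, outputTokens, price = ASSUMED_PRICE_PER_MTOK) {
  return (inputTokens * price.input + outputTokens * price.output) / 1e6;
}

// Cost of one opened req: every eligible candidate is scored once.
function costPerReq(candidates, inputTokens, outputTokens, price) {
  return candidates * costPerCandidate(inputTokens, outputTokens, price);
}

const low = costPerReq(200, 4000, 300);
const high = costPerReq(200, 5000, 300);
console.log(`~$${low.toFixed(2)} to ~$${high.toFixed(2)} per opened req`);
```

Under these assumed prices the estimate lands near the low end of the quoted range; longer scorecards, retries, or a candidate scored twice via two feeder reqs push it upward.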

Success metric

Track three numbers per job family per quarter:

Shortlist-to-screen rate: share of digest candidates that a recruiter takes to a re-screen. Below ~20% means the rubric or min_stage_reached is too loose; tighten the anchors before widening the pool.
Rediscovery hire rate: share of hires in the family that came from the digest. The 8-15% benchmark is the target; below 5% after two quarters means the feeder-req list or the recency window is too narrow.
Time from req open to first qualified slate: the candidate-experience and hiring-manager metric. The digest should move this from days to same-day.
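The first two rates and their warning thresholds are simple ratios. A minimal sketch, assuming you tally the four counts yourself each quarter; the function and field names are hypothetical and nothing in the flow computes these for you.

```javascript
// Hypothetical quarterly metric helper for one job family.
function quarterlyMetrics({ digestCandidates, reScreened, familyHires, digestHires }) {
  const rate = (num, den) => (den > 0 ? num / den : null); // null when undefined
  const shortlistToScreen = rate(reScreened, digestCandidates);
  const rediscoveryHireRate = rate(digestHires, familyHires);
  return {
    shortlistToScreen,
    rediscoveryHireRate,
    // Thresholds from the text: ~20% shortlist-to-screen, 5% hire rate.
    // The 5% flag is only meaningful after two quarters of data.
    rubricTooLoose: shortlistToScreen !== null && shortlistToScreen < 0.2,
    poolTooNarrow: rediscoveryHireRate !== null && rediscoveryHireRate < 0.05,
  };
}

const q = quarterlyMetrics({ digestCandidates: 40, reScreened: 6, familyHires: 10, digestHires: 1 });
console.log(q);
```

In this example 6 of 40 digest candidates were re-screened (15%), so the rubric flag is raised, while 1 of 10 hires came from the digest (10%), which sits inside the benchmark range.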

vs alternatives

vs Gem or hireEZ rediscovery: these are managed talent-CRM products with their own re-engagement campaigns and a candidate graph; pick them if you want the platform and the budget supports it. Pick the flow if you want the matching rules, the deny-list, and the audit log versioned in your own repo, scoped to the feeder reqs you choose, with the digest landing in your stack.
vs Greenhouse's own "prospect pool" search: native search finds candidates by keyword and stage but does not re-score them against a new req's rubric with cited evidence, and the relevance ranking is a black box. Pick the flow when the per-candidate reason_to_resurface and Confirm first: lines are what get the recruiter to act.
vs a recruiter mining the ATS by hand: the same quality on a good day, but the recruiter forgets the recency window, skips the deny-list under deadline pressure, and only does it for the reqs they remember. The flow does it for every recurring req, every time, with the consent gates non-optional.

Things to watch

Re-contacting beyond retention. Safeguard: the recency_months gate drops anyone outside the disclosed retention window before scoring, and the audit log records the window used. Set it to your stated retention period or shorter, never longer to grow the pool.
Disqualified candidates resurfacing. Safeguard: the rejection-reason deny-list runs before the model and deny wins over allow. Failed background/reference checks, conduct concerns, no work authorization, and explicit do-not-contact reasons can never reach the digest. The discipline depends on honest rejection reasons upstream.
Bias carryover from old decisions. Safeguard: the model is instructed not to inherit the prior reject verdict (a candidate passed over because someone else was chosen can be a 5 for a new req) and not to score on name, school as a standalone signal, age, gender, or employment gaps. The config_sha in the audit log makes the matching rules used on any date reproducible under an AI screening bias review.
Stale candidate state. Safeguard: the per-candidate Confirm first: line in the digest makes the recruiter verify that the person is still in-region, still interested, and still a good fit before outreach; the flow asserts a match, not a current fact. Whether a candidate is active elsewhere is the recruiter's check in Greenhouse, noted in the bundle's known limits.
Thin scorecards scoring low. Safeguard: scorecard text is the only grounding, so a candidate rejected before substantive interviews scores low by design. Raise min_stage_reached rather than feeding the model resumes it cannot see.

Stack

The artifact bundle lives in apps/web/public/artifacts/candidate-rediscovery-n8n/ and contains:

candidate-rediscovery-n8n.json: the n8n flow export (every node configured, no stub parameters)
_README.md: credential setup, config file format, the consent and fairness gates, the dry-run procedure

Tools the workflow assumes you use: Greenhouse (the ATS; swap in Ashby or Lever by replacing the ingestion nodes), Claude (the re-match scorer), n8n (the orchestration), Slack (the recruiter's decision surface). To triage new inbound against a rubric, see the inbound applicant triage flow; to warm up the candidates this flow surfaces, see the candidate engagement sequence and the candidate sourcing Claude Skill.

Related concepts: recruiting funnel metrics, candidate experience, AI screening bias, structured interviews.

Files in this artifact

# Candidate rediscovery (silver medalists) — n8n flow

This flow polls Greenhouse for newly-opened reqs, finds past candidates who reached a late stage on a related req and were rejected for a non-disqualifying reason ("silver medalists"), re-scores each against the new req's rubric with Claude (Sonnet 4.6 by default), and posts a ranked shortlist to Slack. It never contacts a candidate, never adds anyone to a pipeline, and never moves a candidate in Greenhouse. The recruiter decides every outreach.

This README covers import, credentials, the per-job-family config format, the consent and fairness gates, and the dry-run procedure.

## Import

1. Open n8n → Workflows → Import from file → pick `candidate-rediscovery-n8n.json`.
2. Set the workflow timezone (top of the canvas) to your team's working timezone for sane audit-log timestamps. The default is UTC.
3. Do not enable the workflow yet. Configure credentials and at least one job-family config, complete the dry-run, then flip to enabled.

## Credentials (three required)

### `PLACEHOLDER_GREENHOUSE_CRED_ID` — Greenhouse Harvest API key

- Greenhouse admin → Configure → Dev Center → API Credential Management → Create New API Key → type "Harvest". Grant only the read permissions the flow uses: `GET` on Jobs, Applications, and Scorecards. The flow never writes to Greenhouse.
- In n8n, create an HTTP Basic Auth credential. Username = the API token. Password = empty. (Harvest authenticates as base64 of `token:` with a trailing colon — n8n's Basic Auth credential does this for you.)
- Bind the credential to the three Greenhouse nodes: `Fetch New Open Reqs`, `Fetch Rejected Pool`, `Fetch Scorecards`.

### `PLACEHOLDER_ANTHROPIC_CRED_ID` — Anthropic API key

- console.anthropic.com → API Keys → Create Key. Restrict by IP if your n8n is behind a fixed egress.
- In n8n, create a credential of type "Anthropic API". Paste the key.
- Bind to the `Claude Re-Match` node. The model is set to `claude-sonnet-4-6` in the request body — change it there if you want to test other models.

### `PLACEHOLDER_SLACK_CRED_ID` — Slack bot token

- Create (or reuse) a Slack app with the `chat:write` scope. Install to the workspace. Invite the bot into `#talent-rediscovery`.
- In n8n, create a Slack credential with the bot token (`xoxb-…`).
- Bind to the `Slack Digest` node.

### Environment variables

- `CONFIG_DIR` — directory holding the per-job-family config files. Default `/data/rediscovery`.
- `AUDIT_DIR` — directory for the JSONL audit log. Default `/data/audit`.
- `POLL_LOOKBACK_HOURS` — how far back `Fetch New Open Reqs` looks for newly-opened reqs. Must be **≥** the schedule interval (default 6) or a req opened between polls will be missed. Default 6.
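
The relationship between the lookback and the schedule interval can be checked before a poll runs. This is a sketch under assumptions: the helper name `createdAfterIso` and its explicit `scheduleIntervalHours` argument are hypothetical, and the flow itself computes the timestamp inline in the `Fetch New Open Reqs` query expression without this guard.

```javascript
// Compute the created_after timestamp and refuse a lookback that is shorter
// than the schedule interval, since that would leave a gap between polls.
function createdAfterIso(nowMs, lookbackHours, scheduleIntervalHours) {
  const hours = Number(lookbackHours) || 6; // same default as the flow
  if (hours < scheduleIntervalHours) {
    throw new Error(
      `POLL_LOOKBACK_HOURS (${hours}) is shorter than the ${scheduleIntervalHours}h schedule interval`
    );
  }
  return new Date(nowMs - hours * 3600000).toISOString();
}

const now = Date.parse('2026-06-15T12:00:00.000Z');
console.log(createdAfterIso(now, '6', 6));

try {
  createdAfterIso(now, '3', 6);
} catch (err) {
  console.log(err.message);
}
```

Environment variables arrive as strings, which is why the value is passed through `Number` before the comparison.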

## Config file format (one per job family)

The flow expects one config file per job family at `${CONFIG_DIR}/<family>.json`. The family is resolved from the new req's `job_family` custom field, or — if that is absent — the slugified name of the req's first department. Missing config → the flow halts for that req with `missing_config` and leaves the req for manual sourcing.

The config is the only place the matching rules live. Copy this, replace every value, and save as `<family>.json`:

```json
{
  "job_family": "backend-engineer",
  "version": "2026-06-15",
  "match_job_ids": [4012, 3987, 3654],
  "recency_months": 18,
  "min_stage_reached": ["Onsite", "Final Interview", "Reference Check", "Offer"],
  "rejection_reasons_allow": [
    "Position filled — strong candidate",
    "Hired another candidate",
    "Kept warm for future role",
    "Timing — not ready to move"
  ],
  "rejection_reasons_deny": [
    "Failed background check",
    "Not legally authorized to work",
    "Conduct / values concern",
    "Failed reference check",
    "Withdrew — compensation mismatch",
    "Do not contact"
  ],
  "do_not_contact_tags": ["do-not-contact", "gdpr-erased", "opted-out"],
  "fit_threshold": 4,
  "top_n": 10,
  "rubric": {
    "role": "Senior Backend Engineer (Distributed Systems)",
    "dimensions": {
      "fit": {
        "must_have": [
          "Production Go or Rust (3y+)",
          "Owned a distributed-system migration"
        ],
        "anchors": {
          "5": "Late-stage scorecards show owned, measurable distributed-system outcomes that map to this req's must-haves",
          "4": "Strong scorecards on the core skill; one must-have unconfirmed",
          "3": "Adjacent skills; would need a fresh screen on the core must-have",
          "2": "Partial overlap; likely a stretch for this req",
          "1": "No scorecard evidence the candidate matches this req"
        }
      }
    }
  }
}
```

- `match_job_ids` are the **feeder reqs** — the past Greenhouse job IDs whose late-stage rejects count as silver medalists for this family. Find them in the URL of each job in Greenhouse. This is what scopes "related req"; the flow does not guess relatedness.
- `min_stage_reached` is the late-stage gate. A candidate rejected at "Application Review" or "Phone Screen" is not a silver medalist — they never got a real read. Use your own stage names exactly as they appear in Greenhouse.
- `rejection_reasons_deny` is the safety floor and **deny wins over allow**. Any disqualifying reason — failed background/reference check, conduct, no work authorization, an explicit do-not-contact — must be listed here so the candidate is never re-surfaced.
- The config is hashed (SHA-256, first 16 hex chars) per run and the hash is written to the audit log and the Slack footer, so the exact rules used on a given date are reproducible.
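
The config hash can be reproduced outside n8n to confirm which file produced a given digest. A minimal sketch mirroring the hashing in `Load Match Config` (SHA-256 of the raw file text, first 16 hex characters); the helper name `configSha` and the sample config are illustrative only.

```javascript
const crypto = require('crypto');

// Hash the raw file text, not the parsed object, and keep 16 hex characters.
function configSha(rawConfigText) {
  return crypto.createHash('sha256').update(rawConfigText).digest('hex').slice(0, 16);
}

const raw = JSON.stringify({ job_family: 'backend-engineer', recency_months: 18 }, null, 2);
const sha = configSha(raw);
console.log(sha);

// Any byte-level change, even a trailing newline, produces a different hash.
console.log(configSha(raw + '\n') === sha);
```

Because the hash covers raw bytes, reformatting the file changes the hash even when the rules are unchanged, so compare hashes against the exact file that was on disk.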

## Consent and fairness gates (do not weaken to widen the pool)

Two layers protect the candidate, both **before** the LLM call:

1. **`Eligibility Filter`** drops any application that is not a feeder-req match, did not reach a late stage, carries a disqualifying or non-allow-listed rejection reason, falls outside the recency window, or whose candidate carries a do-not-contact / erasure / opt-out tag.
2. **`Claude Re-Match`** is instructed not to inherit the prior reject decision and not to score on protected-class proxies (name, school as a standalone signal, age, gender, employment gaps), and to cite verbatim scorecard evidence — no citation forces fit to 1.

The recency window exists because GDPR requires you not to hold or re-process candidate data beyond the retention period you told the candidate about — commonly 12–24 months for unsuccessful applicants. Set `recency_months` to your stated retention period or shorter; never longer. Candidates past the window are dropped, not re-contacted.

If you find yourself wanting to delete a deny-list reason or stretch the recency window to grow the shortlist, that is exactly the decision a recruiter — not the flow — should make case by case, in Greenhouse, with the candidate's consent status in view.
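
One way to make the "never longer" rule mechanical is to check the window against the retention period when the cutoff is computed. This is a sketch, not part of the flow: `retentionMonths` is an assumed input (the config format has no such field), and the helper name is hypothetical. The 30-day month matches the approximation used in `Load Match Config`.

```javascript
// Hypothetical guard: reject a recency window that exceeds the retention
// period disclosed to candidates, then compute the cutoff timestamp.
function recencyCutoffIso(nowMs, recencyMonths, retentionMonths) {
  if (!(recencyMonths > 0) || recencyMonths > retentionMonths) {
    throw new Error(
      `recency_months=${recencyMonths} must be positive and at most the ${retentionMonths}-month retention period`
    );
  }
  // Same approximation as the flow: one month counts as 30 days.
  return new Date(nowMs - recencyMonths * 30 * 24 * 3600000).toISOString();
}

const now = Date.parse('2026-06-15T00:00:00.000Z');
console.log(recencyCutoffIso(now, 18, 24)); // 2024-12-22T00:00:00.000Z
```

With 30-day months, 18 months is 540 days, so the cutoff falls about a week later than a calendar-month calculation would. The approximation therefore errs toward a slightly shorter window.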

## Dry-run procedure

1. Author one config file for a family where you recently filled a role and remember the runner-up candidates.
2. Temporarily point `match_job_ids` at the feeder reqs and set the new-req trigger to fire manually: in n8n, click "Execute workflow" with `Fetch New Open Reqs` returning the already-open target req (or pin a sample job item).
3. Read the Slack digest. The runners-up you remember should appear near the top. If a known strong silver medalist is missing, check, in order: were they within the recency window, did they reach a `min_stage_reached` stage, was their rejection reason allow-listed, do they carry a suppression tag.
4. If obvious mis-fits rank high, the rubric anchors are too loose or the scorecards are thin — look at the `evidence` line in the digest. Empty or paraphrased evidence means the model had little to work with (the candidate was rejected before substantive interviews); raise `min_stage_reached`.
5. Only switch the workflow `active: true` after a digest you would actually act on.
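
When a remembered runner-up is missing from the dry-run digest, the ordered checks from step 3 can be run against that candidate's application record. This is a hypothetical diagnostic, not a node in the flow. It reuses the field names the `Eligibility Filter` reads, but unlike the node it reports every failing gate, not only the first, and it treats a missing activity timestamp as a failure so the gap is visible.

```javascript
// Hypothetical dry-run helper: list each gate a known candidate fails.
function explainDrop(app, cfg) {
  const failures = [];
  const last = app.last_activity_at;
  if (!last || new Date(last) < new Date(cfg.recency_cutoff_iso)) {
    failures.push('outside_recency_window');
  }
  const stage = (app.current_stage && app.current_stage.name) || '';
  if (!cfg.min_stage_reached.includes(stage)) failures.push('did_not_reach_late_stage');
  const reason = (app.rejection_reason && app.rejection_reason.name) || '';
  if (cfg.rejection_reasons_deny.includes(reason) || !cfg.rejection_reasons_allow.includes(reason)) {
    failures.push('rejection_reason_blocked');
  }
  const tags = (app.candidate_tags || []).map((t) => String(t).toLowerCase());
  if (cfg.do_not_contact_tags.some((t) => tags.includes(String(t).toLowerCase()))) {
    failures.push('suppressed_do_not_contact');
  }
  return failures;
}

const cfg = {
  recency_cutoff_iso: '2024-12-22T00:00:00.000Z',
  min_stage_reached: ['Onsite', 'Final Interview'],
  rejection_reasons_allow: ['Hired another candidate'],
  rejection_reasons_deny: ['Failed background check'],
  do_not_contact_tags: ['do-not-contact'],
};
const app = {
  last_activity_at: '2025-09-01T10:00:00.000Z',
  current_stage: { name: 'Phone Screen' },
  rejection_reason: { name: 'Hired another candidate' },
  candidate_tags: [],
};
console.log(explainDrop(app, cfg));
```

Here the only failure is the stage gate: the candidate was rejected at "Phone Screen", which is not in `min_stage_reached`.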

## First-run sanity check

After enabling, watch the first real digest:

1. Confirm the `Confirm first:` line on each candidate is specific (e.g. "still in-region; was a 2024 reject so re-screen on the new framework"). Generic lines mean the model is guessing — check it is on Sonnet 4.6.
2. Confirm the `config <sha>` in the Slack footer matches the file you authored. A mismatch means the wrong family file loaded.
3. Confirm `${AUDIT_DIR}/rediscovery-<YYYY-MM>.jsonl` exists and has one line per scored candidate. No file means you are operating without the audit trail that a GDPR / EEOC inquiry about automated re-contact would require.
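
The audit-file check in step 3 can be scripted. A sketch under assumptions: the file name follows the `rediscovery-<YYYY-MM>.jsonl` pattern above with the month taken in UTC (the flow may use the workflow timezone instead), and the sample line fields are illustrative, not the flow's actual audit schema.

```javascript
const path = require('path');

// Build the monthly audit file path from the pattern named above.
function auditPath(auditDir, date) {
  const month = date.toISOString().slice(0, 7); // YYYY-MM, in UTC
  return path.join(auditDir, `rediscovery-${month}.jsonl`);
}

// Count non-empty JSONL lines; JSON.parse throws on a malformed line, so a
// corrupt audit file fails loudly instead of being miscounted.
function countAuditLines(jsonlText) {
  return jsonlText
    .split('\n')
    .filter((line) => line.trim())
    .map((line) => JSON.parse(line)).length;
}

const sample = '{"candidate_id":101,"fit":4}\n{"candidate_id":102,"fit":5}\n';
console.log(auditPath('/data/audit', new Date('2026-06-15T12:00:00Z')));
console.log(countAuditLines(sample));
```

Compare the count with the number of candidates the run scored; in a real check, read the file with `fs.readFileSync(auditPath(...), 'utf8')` and pass the text to `countAuditLines`.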

## Known limits

- **Active-elsewhere check is the recruiter's, not the flow's.** The pool query returns rejected applications only, so it cannot tell whether a candidate is currently active on another open req. The recruiter sees that in Greenhouse before reaching out; the flow does not auto-suppress active candidates.
- **A candidate who matched via two feeder reqs is scored twice**, then de-duplicated in `Build Digest` (the higher fit wins). The duplicate scoring is a small, bounded token cost, not a correctness problem.
- **Scorecard text is the only grounding.** Greenhouse does not return parsed resume text via Harvest, so a candidate rejected before any substantive interview has thin scorecards and will score low even if their resume is a strong match. That is intended: re-surface people you actually evaluated, not your whole archive.
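- **No dedupe table across runs.** A req that stays open across two polls does not re-fire as long as `POLL_LOOKBACK_HOURS` equals the schedule interval, because the `created_after` filter only catches newly-opened reqs. A lookback longer than the interval makes consecutive windows overlap, so a req created in the overlap is digested twice, and re-opening a req would re-digest it. The audit log makes repeats visible; add a seen-reqs check in front of `Load Match Config` if your audit posture needs hard idempotency.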
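
The de-duplication that `Build Digest` performs on a twice-scored candidate can be sketched as follows. This is an illustration of the rule (higher fit wins, then sort and truncate to `top_n`), not the node's code; the row fields `keep`, `fit`, and `via` and the helper name `buildDigest` are assumptions.

```javascript
// Keep one row per candidate per req (highest fit wins), group by req,
// sort by fit descending, and truncate each group to topN.
function buildDigest(rows, topN) {
  const best = new Map();
  for (const row of rows) {
    if (!row.keep) continue; // below the fit threshold
    const key = `${row.req_id}:${row.candidate_id}`;
    const prev = best.get(key);
    if (!prev || row.fit > prev.fit) best.set(key, row);
  }
  const byReq = {};
  for (const row of best.values()) {
    (byReq[row.req_id] = byReq[row.req_id] || []).push(row);
  }
  for (const reqId of Object.keys(byReq)) {
    byReq[reqId] = byReq[reqId].sort((a, b) => b.fit - a.fit).slice(0, topN);
  }
  return byReq;
}

const rows = [
  { req_id: 5001, candidate_id: 7, fit: 4, keep: true, via: 4012 },
  { req_id: 5001, candidate_id: 7, fit: 5, keep: true, via: 3987 },
  { req_id: 5001, candidate_id: 9, fit: 4, keep: true, via: 4012 },
  { req_id: 5001, candidate_id: 3, fit: 2, keep: false, via: 3654 },
];
const digest = buildDigest(rows, 10);
console.log(digest[5001].map((r) => `${r.candidate_id}:${r.fit}`));
```

Candidate 7 appears once with the fit of 5 from feeder req 3987, and candidate 3 is absent because it fell below the threshold.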

{
  "name": "Candidate rediscovery (silver medalists)",
  "nodes": [
    {
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "hours",
              "hoursInterval": 6
            }
          ]
        }
      },
      "id": "3b3b3b3b-0001-0000-0000-000000000001",
      "name": "Every 6 Hours",
      "type": "n8n-nodes-base.scheduleTrigger",
      "typeVersion": 1.2,
      "position": [240, 400],
      "notesInFlow": true,
      "notes": "Polls for newly-opened reqs every 6 hours. Greenhouse has no reliable job.created webhook, so this is a scheduled poll. The lookback window in the next node must be >= this interval so no req is missed. Tune both together."
    },
    {
      "parameters": {
        "method": "GET",
        "url": "https://harvest.greenhouse.io/v1/jobs",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "httpBasicAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            { "name": "Accept", "value": "application/json" }
          ]
        },
        "sendQuery": true,
        "queryParameters": {
          "parameters": [
            { "name": "status", "value": "open" },
            { "name": "created_after", "value": "={{ new Date(Date.now() - (($env.POLL_LOOKBACK_HOURS || 6) * 3600000)).toISOString() }}" },
            { "name": "per_page", "value": "100" }
          ]
        },
        "options": {
          "response": {
            "response": { "responseFormat": "json", "neverError": false }
          }
        }
      },
      "id": "3b3b3b3b-0001-0000-0000-000000000002",
      "name": "Fetch New Open Reqs",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [460, 400],
      "credentials": {
        "httpBasicAuth": {
          "id": "PLACEHOLDER_GREENHOUSE_CRED_ID",
          "name": "Greenhouse Harvest API (read scope)"
        }
      },
      "notesInFlow": true,
      "notes": "Greenhouse Harvest is Basic auth: username = API token, password = blank. `created_after` filters to reqs opened since the last poll. The JSON-array response is split into one item per req by n8n; downstream nodes run once per new req."
    },
    {
      "parameters": {
        "jsCode": "// Map each new req to its job-family config and load it from disk.\n// Config keys the rubric, the feeder-req list, the recency window, the\n// rejection-reason allow/deny lists, and the fit threshold. Halt (do not\n// fall back to defaults) if no config exists for the req's job family.\nconst fs = require('fs');\nconst path = require('path');\nconst crypto = require('crypto');\n\nconst CONFIG_DIR = $env.CONFIG_DIR || '/data/rediscovery';\nconst job = $json;\n\nfunction slugify(s) {\n  return String(s || '').toLowerCase().trim().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');\n}\n\n// Job family resolves from a `job_family` custom field if present, else the\n// first department name. This is the filename the recruiter authored.\nconst customFamily = (job.custom_fields && (job.custom_fields.job_family || job.custom_fields.Job_Family)) || '';\nconst deptName = (job.departments && job.departments[0] && job.departments[0].name) || '';\nconst family = slugify(customFamily) || slugify(deptName);\n\nif (!family) {\n  return [{ json: { status: 'halted', reason: 'no_job_family', job_id: job.id, job_name: job.name } }];\n}\n\nconst configPath = path.join(CONFIG_DIR, `${family}.json`);\nif (!fs.existsSync(configPath)) {\n  return [{ json: { status: 'halted', reason: 'missing_config', job_family: family, expected_path: configPath } }];\n}\n\nconst raw = fs.readFileSync(configPath, 'utf8');\nconst cfg = JSON.parse(raw);\nconst configSha = crypto.createHash('sha256').update(raw).digest('hex').slice(0, 16);\n\nconst recencyMonths = Number(cfg.recency_months) || 18;\nconst recencyCutoffIso = new Date(Date.now() - recencyMonths * 30 * 24 * 3600000).toISOString();\n\nreturn [{\n  json: {\n    status: 'config_loaded',\n    req_id: job.id,\n    req_title: job.name,\n    req_url: `https://app.greenhouse.io/sdash/${job.id}`,\n    job_family: family,\n    config_sha: configSha,\n    recency_months: recencyMonths,\n    recency_cutoff_iso: recencyCutoffIso,\n    
match_job_ids: cfg.match_job_ids || [],\n    min_stage_reached: cfg.min_stage_reached || [],\n    rejection_reasons_allow: cfg.rejection_reasons_allow || [],\n    rejection_reasons_deny: cfg.rejection_reasons_deny || [],\n    do_not_contact_tags: cfg.do_not_contact_tags || ['do-not-contact', 'gdpr-erased', 'opted-out'],\n    fit_threshold: Number(cfg.fit_threshold) || 4,\n    top_n: Number(cfg.top_n) || 10,\n    rubric: cfg.rubric || {},\n  }\n}];"
      },
      "id": "3b3b3b3b-0001-0000-0000-000000000003",
      "name": "Load Match Config",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [680, 400],
      "notesInFlow": true,
      "notes": "One config file per job family at /data/rediscovery/<family>.json. No config -> halt (the req is left for manual sourcing). The config SHA is logged so the exact matching rules used on a given date are reproducible under audit."
    },
    {
      "parameters": {
        "conditions": {
          "options": { "caseSensitive": true, "typeValidation": "strict" },
          "conditions": [
            {
              "leftValue": "={{ $json.status }}",
              "rightValue": "config_loaded",
              "operator": { "type": "string", "operation": "equals" }
            }
          ],
          "combinator": "and"
        },
        "options": {}
      },
      "id": "3b3b3b3b-0001-0000-0000-000000000004",
      "name": "Config Loaded?",
      "type": "n8n-nodes-base.if",
      "typeVersion": 2,
      "position": [900, 400]
    },
    {
      "parameters": {
        "method": "GET",
        "url": "https://harvest.greenhouse.io/v1/applications",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "httpBasicAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            { "name": "Accept", "value": "application/json" }
          ]
        },
        "sendQuery": true,
        "queryParameters": {
          "parameters": [
            { "name": "status", "value": "rejected" },
            { "name": "last_activity_after", "value": "={{ $json.recency_cutoff_iso }}" },
            { "name": "per_page", "value": "500" }
          ]
        },
        "options": {
          "pagination": {
            "pagination": {
              "parameters": {
                "parameters": [
                  { "name": "page", "value": "={{ $pageCount + 1 }}" }
                ]
              },
              "paginationCompleteWhen": "responseEmpty",
              "type": "updateAParameterInEachRequest"
            }
          },
          "response": {
            "response": { "responseFormat": "json", "neverError": false }
          }
        }
      },
      "id": "3b3b3b3b-0001-0000-0000-000000000005",
      "name": "Fetch Rejected Pool",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [1140, 320],
      "credentials": {
        "httpBasicAuth": {
          "id": "PLACEHOLDER_GREENHOUSE_CRED_ID",
          "name": "Greenhouse Harvest API (read scope)"
        }
      },
      "notesInFlow": true,
      "notes": "Pulls rejected applications active within the recency window. Paginated (500/page). Returns one item per application. The deterministic filtering happens in the next node, not in the query, so the filter rules are auditable and version-controlled rather than buried in URL params."
    },
    {
      "parameters": {
        "jsCode": "// Deterministic eligibility filter. Runs once per rejected application.\n// Keeps an application only if it is a genuine silver medalist for THIS req:\n// - applied to one of the configured feeder reqs (match_job_ids)\n// - reached a late stage (min_stage_reached)\n// - rejection reason is in the allow-list AND not in the deny-list\n// - last activity within the recency window\n// - candidate carries no do-not-contact / erasure / opt-out tag\n// Drops everything else silently. No LLM has seen the record yet.\nconst app = $json;\nconst cfg = $('Load Match Config').item.json;\n\n// `reason` is not used at runtime; it documents which gate dropped the item.\nfunction drop(reason) { return []; }\n\n// 1) Feeder-req match: the candidate must have applied to a configured past req.\nconst appJobIds = (app.jobs || []).map((j) => j.id);\nconst isFeeder = appJobIds.some((id) => cfg.match_job_ids.includes(id));\nif (!isFeeder) return drop('not_a_feeder_req');\n\n// 2) Late-stage gate: silver medalists reached a configured late stage.\nconst stage = (app.current_stage && app.current_stage.name) || '';\nif (cfg.min_stage_reached.length && !cfg.min_stage_reached.includes(stage)) {\n  return drop('did_not_reach_late_stage');\n}\n\n// 3) Rejection-reason gates. Deny wins over allow.\nconst reason = (app.rejection_reason && app.rejection_reason.name) || '';\nif (cfg.rejection_reasons_deny.includes(reason)) return drop('disqualifying_rejection_reason');\nif (cfg.rejection_reasons_allow.length && !cfg.rejection_reasons_allow.includes(reason)) {\n  return drop('rejection_reason_not_in_allow_list');\n}\n\n// 4) Recency: last activity must be within the window. A missing timestamp\n// fails closed, because recency cannot be verified without it.\nconst lastActivity = app.last_activity_at || app.last_activity || null;\nif (!lastActivity || new Date(lastActivity) < new Date(cfg.recency_cutoff_iso)) {\n  return drop('outside_recency_window');\n}\n\n// 5) Consent / suppression tags on the candidate.\nconst tags = (app.candidate_tags || (app.candidate && app.candidate.tags) || []).map((t) => String(t).toLowerCase());\nif (cfg.do_not_contact_tags.some((t) => tags.includes(String(t).toLowerCase()))) {\n  return drop('suppressed_do_not_contact');\n}\n\nconst candidateId = app.candidate_id || (app.candidate && app.candidate.id);\n\nreturn [{\n  json: {\n    status: 'eligible',\n    req_id: cfg.req_id,\n    req_title: cfg.req_title,\n    req_url: cfg.req_url,\n    job_family: cfg.job_family,\n    config_sha: cfg.config_sha,\n    fit_threshold: cfg.fit_threshold,\n    top_n: cfg.top_n,\n    rubric: cfg.rubric,\n    candidate_id: candidateId,\n    application_id: app.id,\n    prior_stage_reached: stage,\n    prior_rejection_reason: reason,\n    prior_req_ids: appJobIds,\n    last_activity_at: lastActivity,\n  }\n}];"
      },
      "id": "3b3b3b3b-0001-0000-0000-000000000006",
      "name": "Eligibility Filter",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [1360, 320],
      "notesInFlow": true,
      "notes": "Five deterministic gates run BEFORE any LLM call: feeder-req match, late-stage reached, rejection-reason allow/deny, recency window, do-not-contact suppression. A candidate failing any gate is dropped and never scored. This is the consent + fairness floor; do not move it after the model call."
    },
    {
      "parameters": {
        "method": "GET",
        "url": "=https://harvest.greenhouse.io/v1/applications/{{ $json.application_id }}/scorecards",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpBasicAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            { "name": "Accept", "value": "application/json" }
          ]
        },
        "options": {
          "response": {
            "response": { "responseFormat": "json", "neverError": true }
          }
        }
      },
      "id": "3b3b3b3b-0001-0000-0000-000000000007",
      "name": "Fetch Scorecards",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [1580, 320],
      "credentials": {
        "httpBasicAuth": {
          "id": "PLACEHOLDER_GREENHOUSE_CRED_ID",
          "name": "Greenhouse Harvest API (read scope)"
        }
      },
      "notesInFlow": true,
      "notes": "Pulls the prior interview scorecards for the application. These are the grounding text for the re-match score. `neverError: true` so a failed or empty scorecard lookup does not break the run; with nothing to cite verbatim, the re-match prompt scores that candidate fit 1."
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://api.anthropic.com/v1/messages",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "anthropicApi",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            { "name": "Content-Type", "value": "application/json" },
            { "name": "anthropic-version", "value": "2023-06-01" }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ JSON.stringify({\n  model: 'claude-sonnet-4-6',\n  max_tokens: 900,\n  system: 'You re-match a PAST candidate against a NEW open req. Score fit 1-5 against the rubric using only evidence in the supplied scorecard text. Cite a verbatim string as `evidence`. If you cannot cite verbatim evidence, fit is 1. Do NOT inherit the prior hire/no-hire decision: a candidate rejected because someone else was chosen can be a 5 for a new req. Do NOT score on name, school as a standalone signal, age, gender, or employment gaps. Return ONLY JSON: {\"fit\":{\"score\":N,\"evidence\":\"...\"},\"reason_to_resurface\":\"one sentence\",\"verify_before_outreach\":\"what a recruiter must confirm is still true\"}.',\n  messages: [\n    {\n      role: 'user',\n      content: 'New req: ' + $('Eligibility Filter').item.json.req_title + '\\n\\nRubric:\\n' + JSON.stringify($('Eligibility Filter').item.json.rubric) + '\\n\\nPrior stage reached: ' + $('Eligibility Filter').item.json.prior_stage_reached + '\\nPrior rejection reason: ' + $('Eligibility Filter').item.json.prior_rejection_reason + '\\n\\nPrior scorecards (JSON):\\n' + JSON.stringify($json).slice(0, 12000)\n    }\n  ]\n}) }}",
        "options": {
          "response": {
            "response": { "responseFormat": "json", "neverError": false }
          },
          "timeout": 60000
        }
      },
      "id": "3b3b3b3b-0001-0000-0000-000000000008",
      "name": "Claude Re-Match",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [1800, 240],
      "credentials": {
        "anthropicApi": {
          "id": "PLACEHOLDER_ANTHROPIC_CRED_ID",
          "name": "Anthropic API key"
        }
      },
      "notesInFlow": true,
      "notes": "Scores the past candidate against the NEW req's rubric, grounded in the prior scorecards. System prompt explicitly tells the model NOT to inherit the old reject decision and NOT to score on protected-class proxies. Evidence-required: no citation -> fit 1."
    },
    {
      "parameters": {
        "jsCode": "// Parse Claude's JSON, enforce the evidence rule, decide keep vs drop.\nconst input = $json;\nconst ctx = $('Eligibility Filter').item.json;\n\nlet parsed;\ntry {\n  const text = (input.content && input.content[0] && input.content[0].text) || '';\n  const m = text.match(/\\{[\\s\\S]*\\}/);\n  if (!m) throw new Error('no JSON object in response');\n  parsed = JSON.parse(m[0]);\n} catch (e) {\n  return [{ json: { status: 'scored', keep: false, error: 'unparseable_score', req_id: ctx.req_id, candidate_id: ctx.candidate_id } }];\n}\n\n// Coerce text fields to strings so a non-string value from the model cannot throw.\nconst rawScore = Number(parsed.fit && parsed.fit.score) || 1;\nconst evidence = String((parsed.fit && parsed.fit.evidence) || '');\nconst fit = (rawScore > 1 && evidence.trim().length > 0) ? Math.min(5, Math.max(1, rawScore)) : 1;\nconst keep = fit >= ctx.fit_threshold;\n\nreturn [{\n  json: {\n    status: 'scored',\n    keep,\n    req_id: ctx.req_id,\n    req_title: ctx.req_title,\n    req_url: ctx.req_url,\n    job_family: ctx.job_family,\n    config_sha: ctx.config_sha,\n    top_n: ctx.top_n,\n    candidate_id: ctx.candidate_id,\n    application_id: ctx.application_id,\n    prior_stage_reached: ctx.prior_stage_reached,\n    prior_rejection_reason: ctx.prior_rejection_reason,\n    fit,\n    evidence: evidence.slice(0, 240),\n    reason_to_resurface: String(parsed.reason_to_resurface || '').slice(0, 240),\n    verify_before_outreach: String(parsed.verify_before_outreach || '').slice(0, 240),\n    scored_at: new Date().toISOString(),\n  }\n}];"
      },
      "id": "3b3b3b3b-0001-0000-0000-000000000009",
      "name": "Parse + Keep",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [2020, 240],
      "notesInFlow": true,
      "notes": "Parses the model output, enforces the evidence-required guarantee (empty evidence -> fit 1), and flags keep when fit >= the config threshold. Unparseable responses are kept in the stream as keep:false so they show up in the audit log rather than vanishing."
    },
    {
      "parameters": {
        "jsCode": "// Append one audit line per scored candidate. Pseudonymous: candidate_id and\n// application_id only, no name / no scorecard text. This is the record\n// that a past candidate was machine-scored for re-contact consideration.\n// Requires NODE_FUNCTION_ALLOW_BUILTIN=fs,path on the n8n instance.\nconst fs = require('fs');\nconst path = require('path');\n\nconst AUDIT_DIR = $env.AUDIT_DIR || '/data/audit';\nfs.mkdirSync(AUDIT_DIR, { recursive: true });\n\nconst input = $json;\nconst yyyymm = new Date().toISOString().slice(0, 7);\nconst auditPath = path.join(AUDIT_DIR, `rediscovery-${yyyymm}.jsonl`);\n\nconst entry = {\n  ts: new Date().toISOString(),\n  req_id: input.req_id,\n  job_family: input.job_family,\n  config_sha: input.config_sha,\n  candidate_id: input.candidate_id,\n  application_id: input.application_id,\n  prior_stage_reached: input.prior_stage_reached,\n  prior_rejection_reason: input.prior_rejection_reason,\n  fit: input.fit,\n  kept: !!input.keep,\n  model: 'claude-sonnet-4-6',\n};\n\nfs.appendFileSync(auditPath, JSON.stringify(entry) + '\\n', 'utf8');\nreturn [{ json: input }];"
      },
      "id": "3b3b3b3b-0001-0000-0000-00000000000a",
      "name": "Audit Append",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [2240, 240],
      "notesInFlow": true,
      "notes": "One JSONL line per scored candidate (kept or not). No PII beyond the candidate_id reference. This log is what makes a GDPR / EEOC inquiry about automated re-contact decisions answerable. Retention should match the firm's hiring-records policy."
    },
    {
      "parameters": {
        "mode": "runOnceForAllItems",
        "jsCode": "// Aggregate all scored candidates, group by req, dedupe by candidate (keep\n// the highest fit), keep the top_n above threshold, and build one Slack\n// digest payload per req. Emits one item per req for the Slack node.\nconst all = $input.all().map((i) => i.json).filter((j) => j && j.keep);\n\nconst byReq = {};\nfor (const r of all) {\n  byReq[r.req_id] = byReq[r.req_id] || { req: r, candidates: {} };\n  const existing = byReq[r.req_id].candidates[r.candidate_id];\n  if (!existing || r.fit > existing.fit) {\n    byReq[r.req_id].candidates[r.candidate_id] = r;\n  }\n}\n\nconst out = [];\nfor (const reqId of Object.keys(byReq)) {\n  const group = byReq[reqId];\n  const ranked = Object.values(group.candidates).sort((a, b) => b.fit - a.fit).slice(0, group.req.top_n);\n  if (!ranked.length) continue;\n\n  const lines = ranked.map((c, idx) =>\n    `*${idx + 1}. fit ${c.fit}/5* — <https://app.greenhouse.io/people/${c.candidate_id}|candidate ${c.candidate_id}>\\n   _Reached:_ ${c.prior_stage_reached} · _Rejected:_ ${c.prior_rejection_reason}\\n   _Why:_ ${c.reason_to_resurface}\\n   _Confirm first:_ ${c.verify_before_outreach}`\n  ).join('\\n\\n');\n\n  out.push({\n    json: {\n      req_id: group.req.req_id,\n      req_title: group.req.req_title,\n      req_url: group.req.req_url,\n      config_sha: group.req.config_sha,\n      shortlist_count: ranked.length,\n      slack_text: `*Silver-medalist shortlist — ${group.req.req_title}*\\n${ranked.length} past candidate(s) re-matched. The recruiter decides outreach — nothing has been contacted or moved.\\n\\n${lines}\\n\\n_Matching rules: config \\`${group.req.config_sha}\\`. <${group.req.req_url}|Open the req>_`,\n    }\n  });\n}\n\nreturn out;"
      },
      "id": "3b3b3b3b-0001-0000-0000-00000000000b",
      "name": "Build Digest",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [2460, 240],
      "notesInFlow": true,
      "notes": "Runs once over every scored candidate. Dedupes a candidate who matched via two feeder reqs (keeps the higher fit), ranks, truncates to top_n, and builds one Slack digest per req. A req with zero kept candidates produces no message rather than a noisy empty digest."
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://slack.com/api/chat.postMessage",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "slackApi",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            { "name": "Content-Type", "value": "application/json; charset=utf-8" }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ JSON.stringify({ channel: '#talent-rediscovery', text: $json.slack_text }) }}",
        "options": {}
      },
      "id": "3b3b3b3b-0001-0000-0000-00000000000c",
      "name": "Slack Digest",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [2680, 240],
      "credentials": {
        "slackApi": {
          "id": "PLACEHOLDER_SLACK_CRED_ID",
          "name": "Slack bot token (chat:write)"
        }
      },
      "notesInFlow": true,
      "notes": "Posts one ranked digest per req to #talent-rediscovery. The message is a decision surface for the recruiter, not an action: no candidate is contacted, moved, or added to a pipeline by the flow."
    }
  ],
  "connections": {
    "Every 6 Hours": {
      "main": [[{ "node": "Fetch New Open Reqs", "type": "main", "index": 0 }]]
    },
    "Fetch New Open Reqs": {
      "main": [[{ "node": "Load Match Config", "type": "main", "index": 0 }]]
    },
    "Load Match Config": {
      "main": [[{ "node": "Config Loaded?", "type": "main", "index": 0 }]]
    },
    "Config Loaded?": {
      "main": [
        [{ "node": "Fetch Rejected Pool", "type": "main", "index": 0 }],
        []
      ]
    },
    "Fetch Rejected Pool": {
      "main": [[{ "node": "Eligibility Filter", "type": "main", "index": 0 }]]
    },
    "Eligibility Filter": {
      "main": [[{ "node": "Fetch Scorecards", "type": "main", "index": 0 }]]
    },
    "Fetch Scorecards": {
      "main": [[{ "node": "Claude Re-Match", "type": "main", "index": 0 }]]
    },
    "Claude Re-Match": {
      "main": [[{ "node": "Parse + Keep", "type": "main", "index": 0 }]]
    },
    "Parse + Keep": {
      "main": [[{ "node": "Audit Append", "type": "main", "index": 0 }]]
    },
    "Audit Append": {
      "main": [[{ "node": "Build Digest", "type": "main", "index": 0 }]]
    },
    "Build Digest": {
      "main": [[{ "node": "Slack Digest", "type": "main", "index": 0 }]]
    }
  },
  "settings": {
    "executionOrder": "v1",
    "timezone": "UTC",
    "saveExecutionProgress": true,
    "saveManualExecutions": true,
    "callerPolicy": "workflowsFromSameOwner"
  },
  "active": false,
  "versionId": "1"
}