claude-skill

Personalized rejection feedback with Claude

Difficulty: intermediate
Setup time: 30 min
For: recruiter · talent-acquisition · recruiting-ops
Recruiting and TA

A Claude Skill that takes a rejected candidate's interview scorecards (and, where available, BrightHire or Metaview transcripts), drafts an evidence-grounded rejection email or recruiter-call talking points, and produces the recruiter-side notes for the call. It replaces the generic form rejection that damages candidate experience with personalized feedback the candidate can actually use, and it refuses to draft when the rubric is missing, the loop did not converge, or the case carries a jurisdiction flag.

When to use

The candidate reached at least an onsite or final-stage loop, where, given the cost of the recruiting funnel, the candidate has invested enough time to deserve a real answer.
The team has at least two signed-off scorecards on the candidate (Ashby submitted: true, Greenhouse status: complete, Lever state: completed). One scorecard is one interviewer's view; the skill refuses to synthesize feedback from a single perspective because that exposes the firm to selective-evidence claims.
A role rubric exists at rubrics/<role_id>.yaml with behavioral anchors per dimension (the same source the interview debrief skill reads). The skill scores against rubric anchors, not against free-form scorecard prose.
The candidate explicitly requested feedback (captured in writing in the ATS), OR the candidate's residency jurisdiction is one where unsolicited specifics carry no documented risk under the user's HR-counsel guidance.
A recruiter reviews and edits every draft before sending. The skill writes drafts to disk and stops; it defines no send action.

When NOT to use

Auto-sending without recruiter review. AI-drafted-and-sent rejection feedback is the most reliable way to produce an incident under EEOC, ADA, or state employment law. The recruiter is the gate. If your goal is to remove the human from the loop, this is the wrong workflow.
Candidates who have not requested feedback in deny jurisdictions. France (Code du travail risk on documented rejection reasons), Germany (AGG §22 evidentiary shift), and any jurisdiction the user's HR-counsel has marked unsolicited_feedback: deny in the policy file. The skill refuses specifics in these cases and writes the generic-decline template. Do not edit the policy file to push a deny-jurisdiction case through.
Cases legal has flagged. Active dispute, unaddressed accommodation request, or a complaint on record. The skill returns a generic-decline draft and surfaces the flag to the recruiter. Specifics in a flagged case become evidence in the dispute.
Rejections from earlier stages (resume screen, recruiter screen). Templated decline is the right tool there; the per-candidate model cost and recruiter review time do not pay off at top-of-funnel volume. The skill is for candidates who reached at least an onsite.
Comparative ranking ("you were our second choice", "we had stronger candidates"). The skill will refuse to draft this: the rubric-to-feedback mapping does not contain the language, and the banned-phrase blocklist catches it. Comparative ranking is what turns a constructive rejection into a Glassdoor post.
Process-improvement asks (asking the candidate for feedback on the interview, a referral, or a testimonial). Reverse asks in a rejection email are an EEOC witness-statement risk and a harm to candidate experience. The blocklist catches them.

Setup

Drop in the bundle. Place apps/web/public/artifacts/rejection-feedback-claude-skill/SKILL.md in your Claude Code skills directory (or claude.ai custom skills, with Tier-A authorization for candidate data per your AI policy).
Configure the rubric source. The skill reads role rubrics from rubrics/<role_id>.yaml, the same path the interview debrief skill uses. If the rubric does not exist, the skill refuses to run. Structured interviewing is the prerequisite, not this skill.
Fill in the rubric-to-feedback mapping. Copy references/1-rubric-to-feedback-mapping.md and replace the template language with your team's approved candidate-facing language per rubric dimension. Get HR-counsel sign-off on the approved language once; the audit log captures the mapping's SHA-256 per run, so revisions are visible in retro.
Write the jurisdiction policy file. A YAML file with one block per jurisdiction your company hires in. Each block sets unsolicited_feedback: allow or deny and references the relevant HR-counsel guidance memo. The bundle includes a template; the default denies are France, Germany, and any jurisdiction with active labor-law guidance against documented rejection reasons.
Configure the ATS API. An Ashby, Greenhouse, or Lever API token with read scope on scorecards and candidates. The skill pulls scorecards by candidate_id; it does not accept pasted scorecard text because pasted text cannot be audited back to the source interviewer.
Optional: configure the transcript bundle. BrightHire or Metaview API access. When a transcript_id is provided, the skill cross-references scorecard claims against transcript turns in step 4.
Dry-run on a closed candidate. Run it on a candidate who was already rejected last quarter. Compare the skill's draft with what the recruiter actually sent. Adjust the rubric-to-feedback mapping if calibration drifts; the mapping, not the model, is usually the lever.

What the skill actually does

Six steps, in order. The order matters: the jurisdiction gate and scorecard validation happen before the LLM reads any candidate content, because letting the model loose on scorecard text in a deny-jurisdiction case leaves a model-call log entry with candidate-identifying data that the firm did not need to retain.

1. Validate jurisdiction policy and consent. Look up the candidate's jurisdiction in the policy file. If the policy is unsolicited_feedback: deny and the candidate has not requested feedback in writing, halt specifics and switch to the generic-decline template. Gating on consent before pulling scorecards keeps the data-minimization story clean for GDPR Art. 5(1)(c).
2. Pull scorecards (and optional transcript). Fetch via the ATS API. Drop drafts. If the loop has fewer than two signed-off scorecards, halt: feedback synthesized from one interviewer's view is an opinion, not feedback, and exposes the firm to selective-evidence claims.
3. Identify dimensions and evidence. Compute the cross-interviewer mean and standard deviation per rubric dimension. Surface dimensions where mean ≥ 4 (strength, warm opening) and mean ≤ 2 (candidate gap). Refuse to surface any dimension with cross-interviewer standard deviation ≥ 1.5: the loop did not converge, and feedback on a non-converged dimension would not survive a “but interviewer X scored me 5” challenge. For each surfaced dimension, pull verbatim evidence quotes from the scorecards (or transcript, when available). No verbatim string → the dimension is not surfaced.
4. Draft against the rubric-to-feedback mapping. Translate at most one strength and one gap into candidate-facing language using references/1-rubric-to-feedback-mapping.md. Cap at one of each so the draft does not read as a defensive list. The mapping's substitution slots are filled from structured fields (scorecard, rubric anchor) or from the approved-topics list; the LLM never free-texts a substitution value, which is the guard against false specifics.
5. Bias and false-specifics screening. Grep the draft against references/2-banned-phrase-blocklist.md. Any hit halts the run with the offending string surfaced. Verify that every specific claim maps back to a verbatim evidence string from step 3; unsourced claims halt the run. This is a separate pass from step 4 by design; the screening pass sees only the draft text, with no awareness of the underlying scorecards, so it cannot rationalize a banned phrase as “but the interviewer meant X”.
6. Write to disk and audit log. Write drafts/<candidate-id>.md and (for route: call) drafts/<candidate-id>-call-notes.md per the format in references/3-output-format.md. Append one JSONL line to audit/<YYYY-MM>.jsonl with candidate_id_hash (SHA-256, not raw ID), rubric_sha256, blocklist_sha256, mapping_sha256, dimensions surfaced, blocklist hits, model ID, timestamp. No candidate-identifying free text in the audit line.

The literal email format, the generic-decline fallback, and the call-notes template live in references/3-output-format.md. The format is fixed because downstream consumers (recruiter, candidate, and any future audit reviewer) need predictable language without recruiter-specific drift.

Cost reality

Per rejection draft, on Claude Sonnet 4.5:

LLM tokens: typically 12-25k input tokens (rubric YAML + scorecards + skill instructions + reference files) and 0.5-1.5k output tokens (the draft plus call notes). On Sonnet 4.5 that is roughly 5-10 US cents per draft. A recruiting team running 200 rejection drafts per month spends 10-20 dollars in model cost.
ATS API cost: zero on Ashby (free API), Greenhouse (included in tier), Lever (included). Transcript fetches against BrightHire or Metaview count against the per-seat plan; rejection-feedback fetches are read-only and do not consume new transcript credits.
Recruiter time: this is where the gain is. Manually drafting an evidence-grounded rejection email from scorecards takes 20-30 minutes per candidate when the recruiter does it well, or 3 minutes when they paste a generic form (which is what most teams end up doing at scale). The skill produces the 20-minute draft in under 30 seconds; the recruiter reviews and edits in 4-7 minutes. Net savings are roughly 15-20 minutes per rejection at the careful-draft quality level; call it 50-60 hours per month on a team running 200 rejections.
Setup time: 30 minutes for the rubric-to-feedback mapping and the jurisdiction policy if your team already has approved candidate-facing language somewhere; longer if HR-counsel has not yet weighed in on rejection-feedback language (in which case that conversation is the prerequisite, not this skill).
The compounding return on candidate experience. Candidates rejected with specific, evidence-grounded feedback are more likely to reapply, more likely to refer others, and substantially less likely to leave damaging Glassdoor reviews. Commonly cited claims in the recruiting literature are 30-50% for reapply intent, though we do not have a primary source for those numbers and treat them as directional. The compounding return shows up in pipeline density a year later, not in the month the draft was sent.
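
The per-draft model cost can be sanity-checked with a few lines of arithmetic. The sketch below assumes list prices of 3 USD per million input tokens and 15 USD per million output tokens for Sonnet 4.5; those prices are an assumption and are not stated on this page, so check current pricing before relying on the result. The function name is illustrative.

```python
# Assumed list prices in USD per million tokens (not from this page).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def draft_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Model cost in USD for a single rejection draft."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


low = draft_cost_usd(12_000, 500)      # small rubric, short draft
high = draft_cost_usd(25_000, 1_500)   # large rubric, draft plus call notes

print(f"per draft: ${low:.3f} to ${high:.3f}")
print(f"200 drafts per month: ${200 * low:.2f} to ${200 * high:.2f}")
```

Under those assumed prices the token ranges above work out to roughly 4 to 10 cents per draft, in line with the figure quoted.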

Success metric

Track three numbers per month, in the ATS:

Recruiter edit distance per draft. The number of characters the recruiter changes between the skill's draft and the sent message. If edit distance trends to zero, the recruiter is rubber-stamping; surface that in retro and revisit the rubric-to-feedback mapping. If edit distance is consistently high, the mapping is miscalibrated.
Candidate reply rate to the rejection. Replies to a rejection email are usually thank-you-and-future-application notes (good signal) or escalation notes (bad signal). Track the escalation rate as a percentage of rejections sent. A baseline team using generic forms typically sees under 1% escalation; the goal with this skill is to stay at or below that baseline, not above it. If the escalation rate rises, the rubric-to-feedback mapping is producing language that lands badly; retune it.
12-month reapply rate. Candidates rejected through this skill versus candidates rejected with the legacy generic form, measured over the following 12 months. The compounding benefit shows up here, not in model spend or even in the rejection thread itself.
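
One way to compute the first metric is a character-level Levenshtein distance between the skill's draft and the message the recruiter sent. The page does not say which edit-distance definition is intended, so treating it as Levenshtein is an assumption; the function name is illustrative.

```python
def edit_distance(draft: str, sent: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions that turn `draft` into `sent`."""
    # previous[j] holds the distance between the processed prefix of
    # `draft` and the first j characters of `sent`.
    previous = list(range(len(sent) + 1))
    for i, a in enumerate(draft, start=1):
        current = [i]
        for j, b in enumerate(sent, start=1):
            current.append(min(
                previous[j] + 1,             # delete a from draft
                current[j - 1] + 1,          # insert b into draft
                previous[j - 1] + (a != b),  # substitute, free when equal
            ))
        previous = current
    return previous[-1]
```

Dividing the result by the draft length gives a ratio that is easier to compare across short and long drafts.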

Versus the alternatives

Versus Ashby's built-in rejection templates. Ashby (and Greenhouse, Lever) ship rejection templates with merge fields for candidate name and role. They are templates, not feedback: the merge fields do not pull scorecard evidence and there is no rubric-grounded language layer. Use Ashby's templates for top-of-funnel rejections where templating is honest. Use this skill for final-stage rejections where the template reads as dismissive of the time the candidate invested.
Versus generic decline emails. The generic decline is the right answer in deny-jurisdiction cases, when consent was not given, and when the rubric surfaced nothing specific and defensible. The skill writes the generic-decline template byte for byte in those cases. The difference is that the skill makes the choice deterministically from jurisdiction policy and rubric output, rather than the recruiter defaulting to generic out of fatigue.
Versus manually written recruiter notes. Manual notes are the gold standard for senior candidates or VIP referrals where the recruiter has the relationship context and the time. The skill earns its place on volume: the 80% of final-stage rejections where the recruiter would otherwise paste a generic form because manual drafting at scale does not fit in the day. For the senior tier, the call-notes file gives the recruiter a structured starting point for the call, and the recruiter improvises from there.
Versus an LLM with no rubric file and no blocklist. This is the failure mode the skill is built against. An LLM drafting from scorecards alone, with no rubric grounding, no banned-phrase blocklist, and no audit log, produces fast, confident, plausible rejection text, and roughly one in twenty drafts will contain a hallucinated quote, a comparative ranking, or a protected-class proxy. The bundle's checklist files are what move the failure rate toward zero.

Watch-outs

EEOC-implicating language. Guarded by the banned-phrase blocklist in references/2-banned-phrase-blocklist.md, which runs as a separate pass in step 5 with no awareness of the underlying scorecards. Hits halt the run with the offending string surfaced. Do not edit the blocklist to make a draft pass; fix the rubric or the scorecard language.
False specifics from the LLM. Guarded by the “no synthesis without verbatim citation” rule in step 3. Every claim in the draft must trace to a verbatim string from a signed-off scorecard or transcript. No verbatim string → the dimension is not surfaced. This is the guard against the most common failure mode of LLM-drafted feedback: plausible quotes that no interviewer actually wrote, cited back to the candidate as fact.
Comparative ranking language. Guarded by the rubric-to-feedback mapping in references/1-rubric-to-feedback-mapping.md, which contains no comparative language, and by the blocklist in step 5, which catches it if it slips in. Comparative ranking is what turns a constructive rejection into a Glassdoor post.
Selective-evidence risk. Guarded by step 2 (halt if the loop has fewer than two signed-off scorecards) and step 3 (refuse to surface dimensions with cross-interviewer standard deviation at or above 1.5). Interviewer disagreement does not become candidate feedback.
Auto-send drift. Guarded by the absence of any send action in the skill. Drafts are written to drafts/<candidate-id>.md for the recruiter to review, edit, and send from the ATS outbox. The recruiter is the gate.
Generic boilerplate harm. Guarded by step 3's refusal to surface a dimension without verbatim evidence: when the rubric surfaces nothing safe to share, the skill writes the generic-decline template instead of synthesizing weak specifics. Generic decline is honest; weak specifics are worse than no specifics.
PII in the audit log. Guarded by step 6 writing only candidate_id_hash (SHA-256), never the raw candidate ID, name, or scorecard text. The audit log is for run reproducibility, not candidate data retention. Candidate-facing drafts stay in drafts/ under the recruiter's own retention policy.
Calibration drift across roles and seniority. Guarded by per-role YAML rubrics and by the rubric-to-feedback mapping being versioned per team. Senior-leadership rejections need different framing from entry-level ones; the mapping file is where that lives, not the skill code.
Privacy and data residency. Verify that the skill operates within enterprise AI Tier A per your AI policy. Interview content is sensitive; the candidate has not consented to processing by a third-party model unless your AI policy and your scorecard-collection consent language explicitly cover it.

Stack

The skill bundle lives in apps/web/public/artifacts/rejection-feedback-claude-skill/ and contains:

SKILL.md: the skill definition
references/1-rubric-to-feedback-mapping.md: fill in per team, HR-counsel-approved language per rubric dimension
references/2-banned-phrase-blocklist.md: pre-flight checks on the draft (do not edit to make biased drafts pass)
references/3-output-format.md: the literal email, generic-decline, and call-notes formats

Tools the workflow assumes you already use: Claude (the model), Ashby, Greenhouse, or Lever (the ATS where scorecards live), and optionally BrightHire or Metaview (interview transcripts for richer evidence grounding). Sibling workflow that shares the rubric source: the interview debrief skill.

Files in this artifact

---
name: rejection-feedback
description: Take a rejected candidate's interview scorecards and (where available) transcripts, draft an evidence-grounded rejection email or recruiter-call talking points, and produce the recruiter-side notes for the call. Always stops at a recruiter-review gate; never sends. Refuses to draft when the rubric is missing or the case is jurisdiction-flagged.
---

# Rejection feedback

## When to invoke

Use this skill when a recruiter needs to send personalized post-interview feedback to a candidate who reached at least an onsite or final-stage loop, and the team has structured scorecards plus a role rubric on file. Take the candidate's scorecards (across all interviewers), the role rubric, the recruiter-relationship context (was feedback explicitly offered? requested?), and the candidate's residency jurisdiction as input. Produce a Markdown rejection email draft, optional recruiter-call talking-point notes, and a one-line routing recommendation.

Do NOT invoke this skill for:

- **Auto-sending without recruiter review.** The skill writes drafts to disk and stops. There is no `send` action defined anywhere in this skill. Auto-sent rejection feedback is the single most reliable way to produce an inappropriate-content incident under EEOC, ADA, or state employment law. The recruiter is the gate.
- **Candidates who have not requested feedback in jurisdictions where unsolicited feedback creates risk.** Specifically: France (Code du travail risk on documented rejection reasons), Germany (AGG §22 evidentiary shift), and any jurisdiction where the recruiter's HR-counsel guidance disallows unsolicited specifics. The skill reads the `jurisdiction_policy.yaml` file and refuses to draft specifics for any jurisdiction marked `unsolicited_feedback: deny`.
- **EEOC-implicating language or protected-class proxies.** "Cultural fit", age inferences from graduation year, family-status references, national-origin references, accent commentary, gendered descriptors ("aggressive", "abrasive", "soft"), pregnancy-status references, disability or accommodation references. The banned-phrase blocklist in `references/2-banned-phrase-blocklist.md` runs as the final check before the draft is written. Any hit halts the run with the offending string surfaced.
- **Cases legal has flagged.** If the candidate file has a flag for active dispute, accommodation request unaddressed, or a complaint on record, the skill returns "decline to provide specific feedback — legal flag present" and writes a generic-decline draft instead.
- **Rejections from earlier stages** (resume screen, recruiter screen). Templated decline is the right tool there. This skill is for candidates who invested significant time and earned a real answer, per the [recruiting funnel](/en/learn/recruiting-funnel-metrics/) cost.

## Inputs

- Required: `candidate_id` — the ATS record ID ([Ashby](/en/tools/ashby/), [Greenhouse](/en/tools/greenhouse/), or [Lever](/en/tools/lever/)). The skill pulls scorecards via the ATS API; it does not accept pasted scorecard text, because pasted text cannot be audited back to the source interviewer.
- Required: `role_id` — used to load the role's rubric from `rubrics/<role_id>.yaml` (same source the [interview debrief skill](/en/workflows/interview-debrief-summary-skill/) reads). Without a rubric the skill refuses to run; ungrounded feedback is how false specifics get drafted.
- Required: `jurisdiction` — ISO 3166 country code for the candidate's residency at time of application. Drives which jurisdiction-policy block applies.
- Required: `feedback_requested` — boolean. `true` only if the candidate explicitly asked for feedback (in writing, captured in the ATS). `false` defaults to a generic-decline draft in jurisdictions where the policy file flags unsolicited specifics as risk.
- Optional: `transcript_id` — pointer to a [BrightHire](/en/tools/brighthire/) or [Metaview](/en/tools/metaview/) transcript bundle for the loop. When present, the skill cross-references scorecard claims against transcript evidence; when absent, the skill works from scorecards alone and labels the draft accordingly.
- Optional: `route` — one of `email`, `call`, `auto`. `auto` (default) picks based on stage reached and seniority per the routing rules in `references/3-output-format.md`.

## Reference files

Always read the following from `references/` before drafting. Without them the draft is generic, ungrounded, and risks tripping a banned phrase.

- `references/1-rubric-to-feedback-mapping.md` — the mapping from rubric dimensions to safely-sharable, candidate-facing feedback language. Replace the template placeholders with your team's approved phrasing before first use.
- `references/2-banned-phrase-blocklist.md` — the blocklist the skill greps the draft against in step 5. Patterns include EEOC-implicating terms, protected-class proxies, comparative-ranking language, and unverifiable specifics. Do not edit this file to make a draft pass.
- `references/3-output-format.md` — the literal email and call-notes format, including the routing rules.

## Method

Run these six steps in order. Steps 1-3 are deterministic gating; steps 4-5 use the LLM for synthesis and screening; step 6 is the audit log. The order matters — letting the LLM draft against unchecked scorecards produces fast, confident, EEOC-implicating output.

### 1. Validate jurisdiction policy and consent

Open `references/jurisdiction-policy.yaml` (user-supplied; template shipped in the bundle). Look up the candidate's `jurisdiction`. If `unsolicited_feedback: deny` and `feedback_requested: false`, halt specifics and switch to the generic-decline template at the top of `references/3-output-format.md`. Log the reason in the audit line.

The choice to gate on consent before pulling scorecards is deliberate: specifics drafted and then discarded still leave a model-call log entry with candidate-identifying scorecard text. Gating up front keeps the data-minimization story clean for GDPR Art. 5(1)(c).
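
A minimal sketch of this gate, assuming the policy file has already been parsed into a dictionary keyed by country code. The names `POLICY`, `GateResult`, and `consent_gate`, the placeholder code `ZZ`, and the choice to treat an unlisted jurisdiction as deny are illustrative assumptions; the real file schema is not shown in this document.

```python
from dataclasses import dataclass

# Hypothetical in-memory form of the jurisdiction policy file.
# FR and DE are the default denies named in this skill; "ZZ" is a
# placeholder for any jurisdiction HR-counsel has marked allow.
POLICY = {
    "FR": {"unsolicited_feedback": "deny"},
    "DE": {"unsolicited_feedback": "deny"},
    "ZZ": {"unsolicited_feedback": "allow"},
}


@dataclass
class GateResult:
    draft_specifics: bool  # False means: write the generic-decline template
    reason: str            # recorded in the audit line


def consent_gate(jurisdiction: str, feedback_requested: bool,
                 policy: dict = POLICY) -> GateResult:
    """Decide whether specifics may be drafted, before any scorecard is pulled."""
    block = policy.get(jurisdiction)
    if block is None:
        # Assumption: a jurisdiction missing from the policy is treated as deny.
        return GateResult(False, "jurisdiction_not_in_policy")
    if block["unsolicited_feedback"] == "deny" and not feedback_requested:
        return GateResult(False, "unsolicited_feedback_denied")
    return GateResult(True, "ok")
```

Because the function takes only the jurisdiction code and the consent flag, it can run before any ATS call is made, which is the ordering this step asks for.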

### 2. Pull scorecards and (optional) transcript

Fetch all scorecards for `candidate_id` via the ATS API. Validate that every scorecard is signed-off (Ashby `submitted: true`, Greenhouse `status: complete`, Lever `state: completed`). Drop drafts. If the loop has fewer than two completed scorecards, halt — feedback synthesized from one interviewer's view is not feedback, it is an opinion, and exposes the firm to selective-evidence claims.

When `transcript_id` is provided, fetch the transcript bundle. The skill cross-references scorecard claims against transcript turns in step 4.
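
The sign-off filter and the two-scorecard minimum could be sketched as below. Each scorecard is modelled as a flat dictionary, which is an assumption about the payload shape and not the actual ATS response format; `SIGNED_OFF`, `LoopNotReady`, and `signed_off_scorecards` are illustrative names.

```python
# Sign-off field and expected value per ATS, as listed in this step.
SIGNED_OFF = {
    "ashby": ("submitted", True),
    "greenhouse": ("status", "complete"),
    "lever": ("state", "completed"),
}


class LoopNotReady(Exception):
    """Raised when the loop has fewer than two signed-off scorecards."""


def signed_off_scorecards(ats: str, scorecards: list[dict]) -> list[dict]:
    """Drop draft scorecards and halt if fewer than two remain."""
    field, expected = SIGNED_OFF[ats]
    kept = [card for card in scorecards if card.get(field) == expected]
    if len(kept) < 2:
        raise LoopNotReady(
            f"{len(kept)} signed-off scorecard(s) found; at least 2 required"
        )
    return kept
```

Raising an exception instead of returning an empty list makes the halt explicit, so a caller cannot continue to drafting by accident.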

### 3. Identify dimensions and evidence

For each rubric dimension, compute the cross-interviewer mean score and the standard deviation. Flag dimensions where:

- mean ≥ 4 (candidate strength, surface as the warm opening)
- mean ≤ 2 (candidate gap, eligible for feedback if safe)
- standard deviation ≥ 1.5 (interviewer disagreement — do NOT cite this dimension; the loop did not converge and the feedback would not survive a "but interviewer X scored me 5" challenge)

For each surfaced dimension, pull the verbatim evidence quotes from the scorecards (or transcript, when available). Every claim in the final draft must cite a verbatim string from the evidence pool. No verbatim string → the dimension is not surfaced.

The "no synthesis without verbatim citation" rule is the guard against false specifics. LLMs drafting feedback from scorecards will, without this rule, invent quotes that sound plausible — "the candidate struggled with system-design tradeoffs" — that no interviewer ever wrote. False specifics cited back to the candidate are how rejection-feedback workflows generate complaint emails.
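
The thresholds above can be sketched as follows. The skill does not say whether population or sample standard deviation is intended; this sketch assumes population standard deviation (`statistics.pstdev`), and the function names and label strings are illustrative.

```python
from statistics import mean, pstdev


def classify_dimension(scores: list[float]) -> str:
    """Return 'strength', 'gap', 'not_converged', or 'neutral'."""
    # Assumption: population standard deviation across interviewers.
    # Checked first, because disagreement overrides a high or low mean.
    if pstdev(scores) >= 1.5:
        return "not_converged"
    average = mean(scores)
    if average >= 4:
        return "strength"
    if average <= 2:
        return "gap"
    return "neutral"


def surfaced(dimensions: dict[str, list[float]],
             evidence: dict[str, list[str]]) -> dict[str, str]:
    """Keep only strengths and gaps that have at least one verbatim quote."""
    result = {}
    for name, scores in dimensions.items():
        label = classify_dimension(scores)
        if label in ("strength", "gap") and evidence.get(name):
            result[name] = label
    return result
```

The disagreement check runs before the mean check because a dimension can clear the strength threshold and still be non-converged: scores of 5, 5, 5, and 1 average exactly 4 but have a population standard deviation above 1.5.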

### 4. Draft against the rubric-to-feedback mapping

Translate at most one strength and one gap into candidate-facing language using `references/1-rubric-to-feedback-mapping.md`. Cap at one of each so the draft does not read as a defensive list. Comparative ranking ("we had stronger candidates", "you were our second choice") is forbidden — the mapping file does not contain the language and step 5 greps it out.

For `route: call`, also draft recruiter-side talking points: bullet-point observations, the suggested phrasing for the gap, and two to three pre-prepared responses to likely candidate questions ("Was there anything I could have done differently?", "Will you keep me in mind for future roles?", "Can I get a second look?").

### 5. Bias and false-specifics screening

Grep the draft against `references/2-banned-phrase-blocklist.md`. Any hit halts the run with the offending string surfaced. Then verify that every specific claim in the draft maps back to a verbatim evidence string from step 3 — if a claim has no source, halt.

This is a separate pass from step 4 by design. The screening pass sees only the draft text, with no awareness of the underlying scorecards, so it cannot rationalize a banned phrase as "but the interviewer meant X".
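
A rough sketch of the screening pass follows. The three patterns are examples taken from phrases this skill names as forbidden; the real list lives in `references/2-banned-phrase-blocklist.md`, whose format is not shown here. Passing the claims as a list of evidence strings the draft relied on is an assumption about how step 4 hands off to step 5, and all names are illustrative.

```python
import re

# Example patterns only; the real blocklist file is the source of truth.
BLOCKLIST = [r"cultural fit", r"stronger candidates?", r"second choice"]


class ScreeningHalt(Exception):
    """Raised with the offending strings when the draft fails screening."""


def screen_draft(draft: str, cited_claims: list[str],
                 evidence_pool: list[str]) -> None:
    """Halt on any banned phrase, then on any claim without verbatim evidence."""
    hits = [match.group(0)
            for pattern in BLOCKLIST
            for match in re.finditer(pattern, draft, flags=re.IGNORECASE)]
    if hits:
        raise ScreeningHalt(f"banned phrase(s): {hits}")
    unsourced = [claim for claim in cited_claims if claim not in evidence_pool]
    if unsourced:
        raise ScreeningHalt(f"claim(s) without verbatim evidence: {unsourced}")
```

The function receives only the draft text and the verbatim evidence strings, never the scorecards themselves, which mirrors the separation described above.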

### 6. Write to disk and audit log

Write the draft to `drafts/<candidate-id>.md` per the format in `references/3-output-format.md`. Write the call notes (if applicable) to `drafts/<candidate-id>-call-notes.md`. Append one JSONL line to `audit/<YYYY-MM>.jsonl` containing: `run_id`, `candidate_id_hash` (SHA-256, not raw ID), `role_id`, `jurisdiction`, `feedback_requested`, `route`, `rubric_sha256`, `dimensions_surfaced`, `blocklist_hits` (zero on success), `model_id`, `timestamp`. No candidate-identifying free text in this line.

Surface the path to the recruiter and exit. The recruiter reviews, edits, and sends from the ATS or their own outbox.
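
The audit line could be assembled as sketched below, using only the Python standard library. The field names follow the list in this step; the function names and the UTC ISO 8601 timestamp format are assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def audit_line(*, run_id: str, candidate_id: str, role_id: str,
               jurisdiction: str, feedback_requested: bool, route: str,
               rubric_bytes: bytes, dimensions_surfaced: list[str],
               blocklist_hits: int, model_id: str) -> str:
    """Build one JSONL record; only the hash of the candidate ID is stored."""
    record = {
        "run_id": run_id,
        "candidate_id_hash": sha256_hex(candidate_id.encode("utf-8")),
        "role_id": role_id,
        "jurisdiction": jurisdiction,
        "feedback_requested": feedback_requested,
        "route": route,
        "rubric_sha256": sha256_hex(rubric_bytes),
        "dimensions_surfaced": dimensions_surfaced,
        "blocklist_hits": blocklist_hits,
        "model_id": model_id,
        # Assumption: UTC timestamp in ISO 8601 form.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record)


def append_audit(path: str, line: str) -> None:
    """Append mode keeps earlier runs intact; one record per line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

Hashing the rubric file's bytes alongside the candidate ID lets a later reviewer confirm which rubric version a given draft was produced against.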

## Output format

Literal example of the email draft the skill writes to `drafts/<candidate-id>.md` for a candidate who reached an onsite for a Senior Backend Engineer role and explicitly requested feedback:

```markdown
Subject: Update on your Senior Backend Engineer interview at Acme

Hi Jamie,

Thank you for the time you invested in our interview process — the
take-home, the system-design loop, and the conversations with the
team. We appreciated the care you put into each stage.

After the team's debrief, we have decided not to move forward with
your candidacy for this role.

You asked for feedback, so here is what stood out from the loop:

- **What went well.** Your take-home submission was clear, well-tested,
  and included a thoughtful note on the failure-mode tradeoffs. Two
  interviewers cited the test coverage specifically.

- **Where the team landed differently.** In the system-design round,
  the discussion of consistency-vs-availability tradeoffs at the
  database layer did not surface the read-replica option that the
  role frequently requires reasoning about. This was the dimension
  that drove the team's decision.

This feedback is specific to the loop you ran with us; it is not a
ranking against other candidates and it is not a comment on your
overall engineering ability.

If a future role at Acme matches your background, we would welcome
your application.

Best,
{Recruiter name}
```

Literal example of the recruiter call-notes file written to `drafts/<candidate-id>-call-notes.md`:

```markdown
# Call notes — Jamie L. (Senior Backend Engineer)

## Frame
- Open with thanks for the time invested.
- Lead with the take-home strength (specific: test coverage note).
- Single gap: system-design read-replica reasoning. One sentence,
  no piling on.

## Suggested phrasing for the gap
"In the system-design conversation, the team was looking for the
read-replica option as part of the consistency-availability tradeoff,
and that did not come up. That was the dimension that drove the
decision for this specific role."

## Likely candidate questions

Q: "Was there anything I could have done differently?"
A: Acknowledge the question. Refer back to the single gap. Do NOT
add new feedback dimensions on the call — anything not in the
written draft is off-script and creates inconsistency risk.

Q: "Will you keep me in mind for future roles?"
A: Yes if true; specifics on what kind of role. Do NOT promise a
timeline.

Q: "Can I get a second-look interview?"
A: No. The decision is final. The recruiter reiterates appreciation
and closes.

## Off-script
If the candidate raises a discrimination concern, comparative-ranking
question, or accommodation issue, the recruiter says "let me come
back to you on that" and routes to HR / counsel. The recruiter does
NOT improvise an answer.
```

Literal example of the routing recommendation appended to the draft file:

```markdown
---
Routing: call (stage: onsite, seniority: senior, prior referrer: yes)
Recruiter review required before send.
```

## Watch-outs

- **EEOC-implicating language.** *Guard:* the banned-phrase blocklist in `references/2-banned-phrase-blocklist.md` runs as a separate pass in step 5, with no awareness of the underlying scorecards, so it cannot rationalize a hit. Any hit halts the run with the offending string surfaced. Do not edit the blocklist to make a draft pass — fix the rubric or the scorecard language instead.
- **False specifics from the LLM.** *Guard:* the "no synthesis without verbatim citation" rule in step 3. Every claim in the draft must trace to a verbatim string from a signed-off scorecard or transcript. No verbatim string → the dimension is not surfaced. This is the guard against the most common failure mode of LLM-drafted feedback — plausible-sounding quotes that no interviewer actually wrote.
- **Comparative ranking language.** *Guard:* the rubric-to-feedback mapping in `references/1-rubric-to-feedback-mapping.md` does not contain comparative phrasing ("stronger candidates", "second choice"), and the blocklist in step 5 catches it if it slips in. Comparative ranking is what turns a constructive rejection into a Glassdoor post.
- **Selective-evidence risk.** *Guard:* step 2 halts if the loop has fewer than two signed-off scorecards. Step 3 refuses to surface any dimension whose cross-interviewer standard deviation is at or above 1.5, because interviewer disagreement does not become candidate feedback.
- **Auto-send drift.** *Guard:* the skill defines no `send` action. Drafts are written to `drafts/<candidate-id>.md` for the recruiter to review, edit, and send from the ATS outbox. AI-drafted-and-sent rejection feedback without review damages [candidate experience](/en/learn/candidate-experience/) and produces incidents.
- **PII in the audit log.** *Guard:* step 6 writes only `candidate_id_hash` (SHA-256), never the raw candidate ID, name, or scorecard text. The audit line is for run reproducibility, not candidate data retention.
- **Generic boilerplate harm.** *Guard:* if step 3 cannot surface a rubric dimension that has both mean ≤ 2 and a verbatim evidence string, the skill writes the generic-decline template from `references/3-output-format.md` rather than synthesizing weak specifics. Generic decline is honest; weak specifics are worse than no specifics.
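
The numeric guards above (mean ≤ 2, standard deviation below 1.5, at least one verbatim evidence string) can be sketched as a single filter. This is a minimal illustration, not the skill's actual step 3: the function name, the input shapes, and the choice of population standard deviation (`pstdev`) are assumptions, and the per-dimension two-score check stands in for the loop-level halt in step 2.

```python
from statistics import mean, pstdev

GAP_MEAN_MAX = 2.0        # "mean ≤ 2" from the watch-outs above
DISAGREEMENT_STDEV = 1.5  # standard deviation at or above this blocks surfacing


def surfaceable_gaps(scores, evidence):
    """Return the dimension ids that may be surfaced as a gap.

    scores:   {dimension_id: [one score per signed-off scorecard]}
    evidence: {dimension_id: [verbatim strings from scorecards or transcripts]}
    Both shapes are hypothetical; the real inputs come from the ATS.
    """
    surfaced = []
    for dimension_id, values in scores.items():
        if len(values) < 2:
            continue  # a single perspective is never synthesized
        if pstdev(values) >= DISAGREEMENT_STDEV:
            continue  # interviewers disagree, so this is not feedback
        if mean(values) > GAP_MEAN_MAX:
            continue  # not a gap
        if not evidence.get(dimension_id):
            continue  # no verbatim string, so the dimension is not surfaced
        surfaced.append(dimension_id)
    return surfaced
```

An empty result corresponds to the generic-decline fallback described in the last bullet.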

# Rubric-to-feedback mapping — TEMPLATE

> Replace this template with your team's approved candidate-facing
> phrasing per rubric dimension. The rejection-feedback skill reads
> this file in step 4 to translate scorecard language (which is
> internal, often blunt) into candidate-facing language (which must
> be specific, evidence-grounded, and EEOC-safe). Without this file
> the skill will not draft specifics — it falls back to the generic
> decline template.

## How this file is used

The skill matches each surfaced dimension (from step 3) against the `dimension_id` below, then uses the `candidate_facing_phrasing` template, substituting in the verbatim evidence string from the scorecard or transcript.

If a dimension is surfaced by step 3 but has no entry below, the skill will NOT draft specifics for it — the dimension is dropped. This forces the team to deliberate on candidate-facing phrasing once, in writing, rather than letting the LLM improvise per run.
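
The lookup-then-substitute behavior described above can be sketched as follows. This is an illustration under stated assumptions: the `MAPPING` dict stands in for the parsed contents of this file, and `draft_gap_phrasing` is a hypothetical helper name, not part of the skill.

```python
# Hypothetical in-memory form of this mapping file: dimension_id -> template.
MAPPING = {
    "technical_depth": (
        "In the {round_name} round, the team was looking for {specific_topic} "
        "as part of {specific_decision_context}, and that did not come up. "
        "That was the dimension that drove the decision for this specific role."
    ),
}


def draft_gap_phrasing(dimension_id, slots, mapping=MAPPING):
    """Fill the approved template for a surfaced dimension.

    Returns None when the dimension has no entry, which mirrors the
    "dimension is dropped" rule. A missing slot raises KeyError, so an
    incomplete substitution fails loudly instead of being improvised.
    """
    template = mapping.get(dimension_id)
    if template is None:
        return None
    return template.format(**slots)
```

Because the only free variables are the named slots, the candidate-facing sentence structure stays fixed across runs.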

## Dimension entries

### dimension_id: technical_depth

**internal_label**: Technical depth (1-5)

**rubric_anchors**:

- 5: Reasons fluently across multiple layers of the stack; explores tradeoffs unprompted.
- 4: Reasons clearly within their primary layer; surfaces tradeoffs when asked.
- 3: Recalls correct patterns; tradeoff reasoning needs prompting.
- 2: Recalls patterns inconsistently; tradeoff reasoning absent or shallow.
- 1: Patterns incorrect or contradicted under follow-up.

**candidate_facing_phrasing** (used for mean ≤ 2):

```
In the {round_name} round, the team was looking for {specific_topic}
as part of {specific_decision_context}, and that did not come up.
That was the dimension that drove the decision for this specific
role.
```

Substitution sources:
- `{round_name}` → from scorecard `interview_round` field
- `{specific_topic}` → from `references/2-banned-phrase-blocklist.md` approved-topics list (NEVER free-text from the LLM)
- `{specific_decision_context}` → from rubric anchor text

**candidate_facing_phrasing** (used for mean ≥ 4, opening only):

```
{Strength_observation}. {Interviewer_count_phrase} cited
{specific_evidence} specifically.
```

---

### dimension_id: system_design

**internal_label**: System design (1-5)

**rubric_anchors**:

- 5: Drives the design conversation; surfaces consistency, availability, and operational tradeoffs unprompted.
- 4: Engages with tradeoffs when prompted; covers most major axes.
- 3: Engages with tradeoffs when prompted; covers one or two axes.
- 2: Tradeoff reasoning shallow; misses major axes that the role requires.
- 1: Cannot construct a system that meets the stated requirements.

**candidate_facing_phrasing** (used for mean ≤ 2):

Same template as `technical_depth`.

---

### dimension_id: collaboration

**internal_label**: Collaboration (1-5)

**rubric_anchors**:

- 5: Specific examples of cross-functional work, named tradeoffs, named outcomes.
- 4: Specific examples, less explicit on tradeoff reasoning.
- 3: General examples, no specifics on tradeoffs or outcomes.
- 2: Vague examples or examples that do not show collaboration evidence.
- 1: No relevant examples surfaced.

**candidate_facing_phrasing** (used for mean ≤ 2):

Same template as `technical_depth`. **Constraint:** never use the words "communication", "fit", "soft skills", or "executive presence" in the candidate-facing draft for this dimension. Those terms are on the banned-phrase blocklist because they correlate with bias claims.

---

## Constraints across all dimensions

- One strength and one gap per draft, maximum. The skill caps at one of each in step 4.
- Every substitution slot is filled from a structured field (scorecard, transcript, rubric anchor) or from the approved-topics list. The LLM never free-texts a substitution value.
- Comparative ranking is not in this file and is on the blocklist. If you find yourself adding "vs other candidates" phrasing, stop and revisit the rubric anchors instead.
- Update this file when the team revises rubric anchors. The skill's audit log captures `rubric_sha256` per run, so revisions are visible in retro.

## Last edited

{YYYY-MM-DD}

# Banned-phrase blocklist

> The rejection-feedback skill greps the final draft against every
> pattern below in step 5 (bias and false-specifics screening). Any
> hit halts the run with the offending string surfaced. Do NOT edit
> this file to make a draft pass — fix the rubric, the scorecard
> language, or the rubric-to-feedback mapping instead.

## A. EEOC-implicating language

A1. **Protected-class proxies.** Any of the following terms or patterns in the draft halts the run:

- `culture fit`, `cultural fit`, `culture add` (without an accompanying behavioral-anchor citation)
- `team fit`, `not a fit` (when used as the substantive reason)
- `personality`, `chemistry`, `vibes`
- `executive presence`, `leadership presence`, `gravitas`
- `polish`, `polished`, `lacks polish`
- `aggressive`, `abrasive`, `pushy` (gendered descriptors)
- `soft`, `nice`, `quiet`, `meek` (inverse gendered descriptors)
- `mature`, `seasoned`, `young`, `energetic`, `digital native` (age proxies)
- `accent`, `articulate`, `well-spoken` (national-origin proxies)
- `family`, `kids`, `pregnant`, `maternity`, `paternity`, `parental` (family-status proxies)
- `accommodation`, `disability`, `health` (any reference to accommodation discussions in the rejection text)
- `religion`, `church`, `prayer`
- `marital`, `married`, `single`
- `name origin`, `surname` (any commentary on the candidate's name)
- `school`, `university`, `Ivy`, `tier-1`, `top-N` (when used as the substantive reason — schools may appear in factual context but not as the rejection driver)

A2. **Comparative ranking language.** Halts the run:

- `stronger candidates`, `better candidates`, `more qualified`
- `second choice`, `runner-up`, `not the top choice`
- `closer fit elsewhere`, `closer match`
- `pool was strong`, `competitive pool`
- `we found someone`, `we hired someone`, `the role is filled` (these belong in a separate sentence about the role status, not framed as a candidate ranking)
- Any phrase that implies a relative ordering of the candidate against unnamed others.

A3. **Defamation-risk language.** Halts the run:

- `dishonest`, `misleading`, `lied`, `lying`
- `unprepared`, `did not try`, `did not care`
- `arrogant`, `entitled`, `difficult`
- `concerning`, `red flag`, `worrying`
- Any subjective-character claim that could be cited against the firm in a defamation action.
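
The term lists in section A lend themselves to a mechanical scan. The sketch below assumes case-insensitive, whole-word matching and uses a small subset of the terms; the function names are hypothetical. Conditional entries (for example "when used as the substantive reason") and the open-ended "any phrase that implies" rules are not modeled here and would need a separate check.

```python
import re

# Small illustrative subset of section A; the real list is the file above.
BANNED_TERMS = ["culture fit", "executive presence", "stronger candidates", "red flag"]


def compile_blocklist(terms):
    """Compile each term as a case-insensitive whole-word pattern."""
    return [
        re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    ]


def find_hits(draft, patterns):
    """Return every offending string as it appears in the draft."""
    hits = []
    for pattern in patterns:
        hits.extend(match.group(0) for match in pattern.finditer(draft))
    return hits
```

A non-empty result from `find_hits` is the point at which the run would halt and surface the offending strings. Whole-word matching keeps a term such as `soft` from firing on "software".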

## B. False-specifics patterns

B1. **Quote markers without source.** Halts the run if the draft contains any quoted string (`"…"` or `'…'`) that does not appear verbatim in the scorecard or transcript pool from step 2.

B2. **Numeric claims.** Halts if the draft contains any numeric score claim (`scored X`, `Y out of Z`, `X% of`), sourced or not. Interview scores are internal calibration data, not candidate-facing content.

B3. **Interviewer-identifying claims.** Halts if the draft names an interviewer, references an interviewer's role beyond the generic "the team", or attributes a quote to a specific person. Interviewer identities are protected and naming them creates retaliation risk.

B4. **Round-identifying claims that could not have happened.** Halts if the draft references a round (`take-home`, `system design`, `behavioral`, `pair programming`) that is not present in the scorecard set for this candidate. The skill validates round names against the loop's actual structure.
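
Rule B1 can be approximated with a substring check against the source pool. This is a sketch, assuming the pool is a list of raw scorecard and transcript strings; `unsourced_quotes` is a hypothetical name. Only straight double quotes are handled, because single quotes collide with apostrophes and would need more careful parsing.

```python
import re

# Matches text between straight double quotes only (assumption, see above).
QUOTED = re.compile(r'"([^"]+)"')


def unsourced_quotes(draft, source_pool):
    """Return quoted strings in the draft that appear in no source verbatim."""
    quotes = QUOTED.findall(draft)
    return [
        quote
        for quote in quotes
        if not any(quote in source for source in source_pool)
    ]
```

Any string returned here is a quote that no scorecard or transcript contains, which is the condition B1 halts on.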

## C. Process-risk language

C1. **Promises about the future.** Halts the run:

- `we will reach out`, `we'll be in touch`, `next time`
- `definitely apply again`, `you will get an offer`
- `keep your resume on file` (whether this is permissible varies by jurisdiction; the neutral phrasing is "we welcome a future application")
- Any timeline commitment.

C2. **Process-improvement requests from the candidate.** Halts if the draft asks the candidate for feedback, a referral, or a testimonial. Reverse asks in a rejection email are an EEOC-witness-statement risk and a candidate-experience harm.

C3. **Unsolicited specifics in deny-jurisdiction cases.** The skill's step 1 should have caught this, but as a defense-in-depth check: if the run's `jurisdiction_policy` returned `unsolicited_feedback: deny` and `feedback_requested: false`, the draft must match the generic-decline template byte-for-byte. Any deviation halts.

## D. Approved-topics list (positive list, used by step 4)

The rubric-to-feedback mapping's `{specific_topic}` substitution slot pulls from this list. The LLM never free-texts a topic string.

- `consistency-availability tradeoffs`
- `read-replica reasoning`
- `caching layer reasoning`
- `failure-mode reasoning`
- `test coverage`
- `error-handling specificity`
- `data-modeling tradeoffs`
- `query-pattern reasoning`
- `migration sequencing`
- `deployment sequencing`
- `cross-team coordination examples`
- `tradeoff reasoning under time pressure`

Add to this list only after team review. Topics added here are permitted to appear in candidate-facing drafts.
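
A positive list like this one is straightforward to enforce at substitution time. The sketch below uses a subset of the topics above; `require_approved_topic` is a hypothetical helper, and exact string equality is an assumption about how matching works.

```python
# Illustrative subset of section D.
APPROVED_TOPICS = frozenset({
    "consistency-availability tradeoffs",
    "read-replica reasoning",
    "caching layer reasoning",
    "failure-mode reasoning",
})


def require_approved_topic(topic):
    """Return the topic unchanged, or raise if it is not on the list."""
    if topic not in APPROVED_TOPICS:
        raise ValueError(f"topic not on approved list: {topic!r}")
    return topic
```

Raising instead of silently substituting a near match keeps free-text topics out of the draft.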

## E. Maintenance

This file is version-controlled. The skill captures the SHA-256 of this file in the audit log per run, so the blocklist used on a given date is reproducible. If a candidate raises a claim against a specific draft, the audit log answers "was the blocklist of date X in effect at the time of the draft" — yes or no, no judgment call.

## Last edited

{YYYY-MM-DD}

# Output format

> The rejection-feedback skill writes drafts in exactly the formats
> below. The recruiter reviews and edits in their own outbox or in
> the ATS; the skill never sends.

## Routing rules

The skill picks a route per the matrix below. The recruiter can override.

| Stage reached | Seniority | feedback_requested | Default route |
|---|---|---|---|
| onsite | senior+ | true | call |
| onsite | senior+ | false | email (generic if jurisdiction denies) |
| onsite | mid / junior | true | email (specific) |
| onsite | mid / junior | false | email (generic) |
| final loop | any | any | call (overrides above) |
| referred-by-VIP | any | any | call (recruiter judgment) |
| earlier than onsite | any | any | OUT OF SCOPE — use templated decline |

`senior+` = staff, principal, manager, director. `referred-by-VIP` = candidate has a `referrer_priority: high` flag in the ATS.
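
One way to read the matrix as code is sketched below. The function name, argument names, and return strings are hypothetical. Two points are interpretations rather than facts from the matrix: the out-of-scope check runs before the VIP and final-loop overrides, and the senior+ row without a feedback request returns a plain `"email"` when the jurisdiction does not deny, since the matrix does not say which email variant applies there.

```python
SENIOR_PLUS = {"staff", "principal", "manager", "director"}
IN_SCOPE_STAGES = {"onsite", "final loop"}


def default_route(stage, seniority, feedback_requested,
                  referrer_priority_high=False, jurisdiction_denies=False):
    """Return the default route; the recruiter can still override it."""
    if stage not in IN_SCOPE_STAGES:
        return "out_of_scope"  # earlier than onsite: templated decline
    if stage == "final loop" or referrer_priority_high:
        return "call"  # both rows override the onsite rows
    if seniority in SENIOR_PLUS:
        if feedback_requested:
            return "call"
        return "email_generic" if jurisdiction_denies else "email"
    return "email_specific" if feedback_requested else "email_generic"
```

For example, `default_route("onsite", "mid", True)` yields `"email_specific"`, matching the third row of the matrix.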

## Email format — specific feedback (consent + safe jurisdiction)

```markdown
Subject: Update on your {role_title} interview at {company_name}

Hi {candidate_first_name},

Thank you for the time you invested in our interview process — the
{round_1_label}, {round_2_label}, and the conversations with the
team. We appreciated the care you put into each stage.

After the team's debrief, we have decided not to move forward with
your candidacy for this role.

You asked for feedback, so here is what stood out from the loop:

- **What went well.** {strength_phrasing_from_mapping}.

- **Where the team landed differently.** {gap_phrasing_from_mapping}.
  This was the dimension that drove the team's decision.

This feedback is specific to the loop you ran with us; it is not a
ranking against other candidates and it is not a comment on your
overall engineering ability.

If a future role at {company_name} matches your background, we would
welcome your application.

Best,
{recruiter_first_name}
```

Constraints baked into this template:

- One strength, one gap. No more.
- The phrase "not a ranking against other candidates" is mandatory, because it pre-empts the most common candidate response loop ("how did I compare").
- The phrase "not a comment on your overall engineering ability" is mandatory, because it isolates the feedback to this loop and pre-empts the "you said I am bad at engineering" escalation.
- "We would welcome your application" — neutral future language. Not "we will reach out", not "next time".

## Email format — generic decline (deny jurisdiction OR no consent OR no surfaceable specific)

```markdown
Subject: Update on your {role_title} interview at {company_name}

Hi {candidate_first_name},

Thank you for the time you invested in our interview process. We
appreciated the care you put into each stage.

After the team's debrief, we have decided not to move forward with
your candidacy for this role.

If a future role at {company_name} matches your background, we
would welcome your application.

Best,
{recruiter_first_name}
```

This is the safe default. The skill writes this template byte-for-byte when:

- `jurisdiction_policy` returned `unsolicited_feedback: deny` and `feedback_requested: false`
- step 3 surfaced no rubric dimension with both `mean ≤ 2` AND a verbatim evidence string
- a legal flag on the candidate file is present
- the loop has fewer than two signed-off scorecards

Generic decline is honest. Weak specifics are worse than no specifics.
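
The byte-for-byte requirement can be checked by rendering the template and comparing encoded bytes. This is a sketch, assuming UTF-8 encoding and Python `str.format` placeholders; the helper names are hypothetical.

```python
# The generic-decline template above, with the same placeholder names.
GENERIC_DECLINE = """\
Subject: Update on your {role_title} interview at {company_name}

Hi {candidate_first_name},

Thank you for the time you invested in our interview process. We
appreciated the care you put into each stage.

After the team's debrief, we have decided not to move forward with
your candidacy for this role.

If a future role at {company_name} matches your background, we
would welcome your application.

Best,
{recruiter_first_name}
"""


def render_generic_decline(fields):
    """Fill the template from a dict of the four placeholder values."""
    return GENERIC_DECLINE.format(**fields)


def matches_generic_decline(draft, fields):
    """True only if the draft is byte-identical to the rendered template."""
    expected = render_generic_decline(fields).encode("utf-8")
    return draft.encode("utf-8") == expected
```

Even a trailing space makes `matches_generic_decline` return `False`, which is the strictness the deny-jurisdiction case calls for.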

## Call-notes format

```markdown
# Call notes — {candidate_first_name} {candidate_last_initial}. ({role_title})

## Frame
- Open with thanks for the time invested.
- Lead with the strength: {strength_phrasing_from_mapping}.
- Single gap: {gap_topic_from_approved_list}. One sentence, no piling
  on.

## Suggested phrasing for the gap

"{gap_phrasing_from_mapping}"

## Likely candidate questions

Q: "Was there anything I could have done differently?"
A: Acknowledge the question. Refer back to the single gap. Do NOT
add new feedback dimensions on the call — anything not in the
written draft is off-script and creates inconsistency risk.

Q: "Will you keep me in mind for future roles?"
A: Yes if true; specifics on what kind of role. Do NOT promise a
timeline.

Q: "Can I get a second-look interview?"
A: No. The decision is final. The recruiter reiterates appreciation
and closes.

Q: "Who else interviewed?"
A: Decline. Interviewer identities are protected. "I cannot share
that, but I can tell you the team weighed the input from every
round."

Q: "What did interviewer X think?"
A: Decline. Same reason. "I cannot break out individual scores; the
decision was a team decision."

## Off-script

If the candidate raises a discrimination concern, comparative-ranking
question, or accommodation issue, the recruiter says "let me come
back to you on that" and routes to HR / counsel. The recruiter does
NOT improvise an answer.

## Call duration target

10-15 minutes. Past 20 minutes, the call is no longer feedback —
it is an extended negotiation about the decision, and that is not
a useful place to be.
```

## Audit-log line format

One JSON object per line in `audit/<YYYY-MM>.jsonl`:

```json
{
  "run_id": "uuid-v4",
  "candidate_id_hash": "sha256-of-candidate-id",
  "role_id": "role-slug",
  "jurisdiction": "US-CA",
  "feedback_requested": true,
  "route": "email",
  "rubric_sha256": "abcdef...",
  "blocklist_sha256": "abcdef...",
  "mapping_sha256": "abcdef...",
  "dimensions_surfaced": ["technical_depth"],
  "blocklist_hits": 0,
  "model_id": "claude-sonnet-4-5",
  "timestamp": "2026-05-03T14:00:00Z"
}
```

No raw candidate ID, no candidate name, no scorecard text, no draft text. The audit log is for run reproducibility, not data retention. Candidate-facing drafts live in `drafts/<candidate-id>.md` under the recruiter's own retention policy.
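
A minimal sketch of building one such line follows. It covers only a subset of the fields shown above; `build_audit_line` and its arguments are hypothetical, and hashing the UTF-8 bytes of the candidate ID is an assumption about how `candidate_id_hash` is derived.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def build_audit_line(candidate_id, role_id, route, rubric, blocklist, mapping):
    """Return one JSON line; rubric, blocklist, and mapping are file bytes."""
    record = {
        "run_id": str(uuid.uuid4()),
        # Only the hash is stored, never the raw candidate ID.
        "candidate_id_hash": sha256_hex(candidate_id.encode("utf-8")),
        "role_id": role_id,
        "route": route,
        "rubric_sha256": sha256_hex(rubric),
        "blocklist_sha256": sha256_hex(blocklist),
        "mapping_sha256": sha256_hex(mapping),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return json.dumps(record)
```

Appending the returned string plus a newline to the month's `.jsonl` file keeps each run on its own line.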

## Last edited

{YYYY-MM-DD}