claude-skill

Claude を使ったブール検索・X-ray 検索ビルダー

Difficulty

初級

Setup time

20min

For

sourcer · recruiter · talent-acquisition

Recruiting & TA

Stack

構造化されたロール受付票（必須要件、あると望ましい要件、NGシグナル、勤務地ポリシー）を受け取り、3 つのキャリブレーション済み検索アーティファクトを生成する Claude スキルです。hireEZ のブール文字列、LinkedIn / GitHub / Stack Overflow 向けの Google X-ray クエリ、そして構造化フィルター付きの Juicebox PeopleGPT プロンプトを生成します。各クエリには期待されるプール規模の範囲と、捉えている次元・捉えていない次元が記載されているため、ソーサーは 3 つのツールで同じクエリを実行して 3 つの異なる形のプールを得るのではなく、ロールに合ったチャンネルを選択できます。

使用するタイミング

新しいロールを開始する際、3 つのソーシングチャンネルを手動で 3 つの異なるクエリを作成せずに並行してシードする必要がある場合。
低い成果の検索をチューニングする際 — 現在のクエリが 4,000 件または 12 件の結果を返しており、どちらも使えない — 問題が同義語カバレッジ、NOT 句、または勤務地フィルターにあるかどうかをテストする必要がある場合。
ジュニアソーサーをキャリブレーションする際。スキルの構造化された出力により、各クエリでどのシグナルが除外作業を担っているかが可視化されます。これはブールトレーニングが通常スキップする部分です。

使用しないタイミング

シグナルの判断をソーサーから奪う場合。 スキルはルーブリックをクエリに変換しますが、ルーブリックを作成しません。ロール受付票が 2 つの箇条書きであれば、クエリは 2 つの箇条書きの 3 つの風味になり、推測より良い候補者を返しません。
LinkedIn を大規模にスクレイピングする場合。 X-ray クエリは、リクルーターが手動でレート制限を設けながら公開インデックスされた表面に対して時々使用するためのものです。スキルは一括ページネーションについて警告し、拒否します。公開 LinkedIn URL を通じた本番ソーシングは、hiQ 判決に関わらず ToS 違反です。
ダイバーシティスレートの構築。 ブール検索はプロキシ用語（学校名、グループ所属）を通じてバイアスをエンコードする可能性があります。検索クエリではなく、結果の候補者プールにダイバーシティスレート監査ツールを使用してこれを検出してください。
機密性の高い役員検索。 共有された検索履歴やブラウザキャッシュに残るクエリはリスクになります。検索履歴を無効にして手動で実行してください。

セットアップ

バンドルを配置する。 apps/web/public/artifacts/boolean-search-builder-claude-skill/SKILL.md を Claude Code のスキルディレクトリまたは claude.ai のカスタムスキルに配置します。
ロール受付票を作成する。 references/1-role-intake-template.md をコピーして、すべてのプレースホルダーを置き換えます。受付票は、必須要件（二値、AND として使用）、あると望ましい要件（加算的、ランキングに使用）、NGシグナル（NOT として使用）、勤務地ポリシー（構造化フィルターに解析）を区別します。
同義語の深さを設定する。 スキルのデフォルトは次元あたり 5 つの同義語です。自然言語ラベルが曖昧なニッチなロール（例：「プラットフォームエンジニア」は会社によって意味が異なる）では 7〜8 に増やします。10 でキャップしてください — それを超えると偽陽性プールを返します。
まずクローズしたロールで実行する。 先四半期にソーシングしたロールのクエリを生成します。スキルが選択した同義語セットと実際に使用した同義語を比較します。スキルが明らかに関連する肩書きを見逃したり、あり得ないものを含めたりした場合はロール受付票を調整します。

スキルの実際の動作

5 ステップです。順番が重要です：ルーブリックの事前チェックが最初に実行されます。保護されたクラスのプロキシを含むルーブリックは、それをエンコードするクエリを生成するからです。

受付票を検証する（references/2-rubric-fairness-checklist.md に対して）。ロール受付票に学校ランクのスコアリング、名前パターンフィルタリング、雇用ギャッペナルティ、行動的なアンカーのない「カルチャーフィット」が含まれている場合は停止します。チェックは受付票の解析時に実行され、クエリ生成時ではありません。違反するルーブリックは同義語拡張ステップに到達しません。
次元ごとに同義語を拡張する。 各必須要件について、業界の使用法（肩書き、フレームワーク名、認定資格）に基づいた 5〜10 の同義語を生成します。同義語ごとに理由を引用して、ソーサーがクエリ構築前にあり得ないものを削除できるようにします。同義語は発明されません。モデルが名前付きの使用法で同義語を根拠付けられない場合は省略されます。
3 つのクエリを並列で構築する。 hireEZ ブール — 括弧付きの明示的な AND/OR/NOT グループ化、5 同義語上限、hireEZ の構造化フィルターに解析された勤務地（フリーテキストではなく）。Google X-ray — タイトルを引用符で囲んで site:linkedin.com/in または site:github.com を使用し、NGシグナルには - フィルターを使用。Juicebox PeopleGPT — レベルと勤務地の構造化フィルターを持つ自然言語プロンプト。各クエリはチャンネルの強みをターゲットにしています。同じロールが 3 つのチャンネルすべてで同一に説明されるわけではありません。
プールサイズ範囲を推定する。 各クエリについて、期待されるプールサイズ範囲（例：「この地域で hireEZ で 200〜800 件の結果」）と前提を名前を挙げて返します。範囲は同義語数と勤務地フィルターに基づいてキャリブレーションされています。ソーサーはクエリを実行して驚くのではなく、範囲に基づいて絞り込むか広げるかを判断できます。
次元のカバレッジギャップを表面化する。 各クエリに捉えていないものに関するアノテーションが付きます — 通常は応答可能性（新鮮度フィルターなし）、レベル（ブール検索では「シニア IC スコープ」を簡単にエンコードできない）、または行動シグナル（「移行をリードした」を表すブール表現はない）。出力はギャップを可視化するため、ソーサーは次のステップを計画できます。

コストの実態

ロール受付票から 3 つのクエリまで、Claude Sonnet 4.6 で：

LLM トークン — 通常 4〜7K 入力（受付票 + スキル指示 + 例）と 2〜3K 出力（3 つのクエリ + 同義語ごとの理由 + プールサイズ推定）。Sonnet 4.6 の定価で、ロールあたり約 0.04〜0.07 USD。四半期に 30 件のロールを実行するソーサーのモデルコストは 1〜2 USD です。
チャンネルコスト — クエリで何をするかによって異なります。実行するとどちらにせよ使っていたチャンネルクォータを消費します。スキル自体はソーシング API を使用しません。
ソーサーの時間 — これが勝利です。手作業で 3 つのキャリブレーション済みクエリを作成するには 1 ロールあたり 30〜60 分かかります。スキルは同義語の理由を読んであり得ないものを削除する時間を含めて 5〜10 分です。低成果の検索チューニングでより大きな時間節約があります。スキルのプールサイズ推定により診断ループが可視化されます。
セットアップ時間 — 一度だけ 20 分。ロール受付票テンプレートが結束するアーティファクトです。すでに構造化された受付票を書いているチームは 1 つのロール開始サイクルでスキルを採用します。

成功の指標

ソーシングされたロールごとに 2 つの数字を追跡します：

ファーストパス収率 — クエリされたプールからルーブリックランキングステップ（ソーサーのプロセスまたは候補者ソーシングスキル経由）を通過する候補者の割合。キャリブレーション済みクエリでは 25〜50% であるべきです。15% 未満は同義語拡張が緩すぎる意味です。60% 超は狭すぎる意味です。
プールサイズ推定精度 — チャンネルが実際に返したプールサイズ対スキルの推定。よく知られた地域では、範囲の ±50% 以内に収まるべきです。それ以上のドリフトはロールの特殊性に対して同義語数が間違っていることを意味します。

代替案との比較

hireEZ の AI Match（組み込みクエリ提案）と比較して — hireEZ の提案は優れており、製品内 UX はスキルからコピーペーストするより速いです。hireEZ の中に生活しているなら AI Match を選択してください。複数のチャンネル（同じロールが hireEZ、Juicebox、X-ray に一貫した基準で当たる）のためにキャリブレーション済みクエリが必要な場合、またはジュニアソーサーのトレーニングのために同義語の理由を可視化したい場合はスキルを選択してください。
ChatGPT 式の「ブールクエリを書いて」と比較して — 汎用チャットは同義語ごとの理由、プールサイズ推定、チャンネル特有のチューニング、フェアネス事前チェックなしで 1 つのブール文字列を返します。スキルは構造的に異なります：次元を別々のフィールドに強制し、偏ったルーブリックを拒否し、カバレッジギャップを表面化します。
ブールチートシートテンプレートと比較して — テンプレートはテンプレートに一致する 80% のロールで機能し、テンプレートの前提が間違っている 20% のロール（ニッチなスタック、ハイブリッド IC/マネージャースコープ、規制産業）で 4,000 件のゴミクエリを生成します。スキルはそれらのエッジケースの診断です。
手作業によるクエリ作成と比較して — ソーサーのヒューリスティックがすでに同義語をエンコードしている安定した反復可能なルーブリックを持つロールには手作業が正解です。スキルは新規ロールや低成果の検索チューニングでそのセットアップコストを取り戻します。

注意事項

プロキシ用語によるバイアスのエンコード。 ガード： ステップ 1 のフェアネス事前チェックは、ロール受付票が保護されたクラスのプロキシを指定した場合に停止します。学校の名声は特に問題です：特定の学校を must_have として列挙しないでください。それらの学校が相関する傾向がある技術的な深さのシグナルを列挙し、同義語拡張がその深さを持つ非ターゲット校の卒業生を捉えるようにしてください。
X-ray 上での LinkedIn ToS 露出。 ガード： X-ray クエリ出力には「手動使用のみ」の警告と、Recruiter API または Juicebox に切り替える前に 1 クエリあたり最大 50 ページフェッチの推奨が記載されています。スキルはスクレイピングスクリプトを生成しません。
同義語の幻覚。 ガード： 出力の全同義語はソースとなる理由を引用します（「Stripe / Plaid / フィンテック企業で一般的に使用される」「2022 年に導入されたフレームワーク、名前が異なる」）。根拠のある理由のない同義語はクエリ構築前に削除されます。ソーサーが引用された同義語が実際の使用法と一致しないと判断した場合、それはロール受付票を調整するシグナルです。
プールサイズ推定のドリフト。 ガード： 推定は数値ではなく範囲であり、前提とする地域と同義語数でアノテーションされています。実際の結果が範囲から >2× ずれた場合は、ドリフトを記録して再チューニングしてください。推定を測定値として扱って行動しないでください。
高速に変化するスタックでの古い同義語。 ガード： スキルの同義語ソースには「最終確認日」チェックが含まれます。非常に新しいロール（例：2025〜2026 年に肩書きが変わった AI インフラポジション）では、スキルは同義語に「2024 年以降の使用法；チャンネルで確認」とフラグを立て、確言せずに明示します。

スタック

スキルバンドルは apps/web/public/artifacts/boolean-search-builder-claude-skill/ にあります：

SKILL.md — スキル定義（いつ呼び出すか、入力、方法、出力フォーマット、注意事項）
references/1-role-intake-template.md — ロールごとに記入可能なテンプレート
references/2-rubric-fairness-checklist.md — 事前チェック（偏ったルーブリックを通過させるために編集しないこと）
references/3-channel-query-formats.md — チャンネルごとの構文メモ（hireEZ、X-ray、Juicebox）

ワークフローが前提とするツール：Claude（モデル）、hireEZ と Juicebox（検索チャンネル）。返されたプールのランキングには候補者ソーシングスキルを参照してください。

関連コンセプト：AI ソーシング、パッシブ候補者ソーシング、採用ファネル指標。

GitHubでこのページを編集

Files in this artifact

Download all (.zip)

---
name: boolean-search-builder
description: Translate a structured role intake (must-haves, nice-to-haves, anti-signals, location policy) into three calibrated search queries — a hireEZ Boolean string, a Google X-ray query, and a Juicebox PeopleGPT prompt. Each query is annotated with expected pool-size band and dimensional coverage gaps so the sourcer can pick the channel for the role.
---

# Boolean and X-ray search builder

## When to invoke

Use this skill when a sourcer or recruiter hands you a role intake and wants three calibrated search queries — one per channel — without authoring them by hand. Take a structured role intake (a Markdown file with must-haves, nice-to-haves, anti-signals, location policy) as input, and return three queries with synonym reasoning, pool-size estimates, and dimensional coverage gaps.

Do NOT invoke this skill for:

- **Authoring the role rubric.** This skill turns a rubric into a query. If the rubric is two bullet points, the queries will be three flavors of two bullet points and will not return better candidates than guessing. Get the rubric right first; then come back.
- **Bulk-paginating LinkedIn through the X-ray query.** The X-ray output is for occasional manual use against the public-indexed surface. Production sourcing through public LinkedIn URLs is a ToS violation regardless of the *hiQ v. LinkedIn* settlement. The skill refuses to generate scraping scripts.
- **Diversity slate construction.** Boolean queries can encode bias through proxy terms (school name, group affiliation). Use a slate auditor on the returned pool, not the search query, to catch this.
- **Confidential or executive searches.** Queries leaving traces in shared search histories are an exposure risk.

## Inputs

- Required: `role_intake` — path to the role intake Markdown file. Use the template in `references/1-role-intake-template.md`. Without this the skill refuses to run.
- Optional: `synonym_depth` — integer, 5 default, 7-8 for niche roles, hard max 10. Above 10 the queries return false-positive pools that take longer to filter than they save.
- Optional: `channels` — subset of `["hireez", "xray", "juicebox"]`. Default is all three. Use a subset when the team only has access to a subset.
- Optional: `geography_hint` — free-text geography (e.g. "US Pacific time zone, hybrid SF") used in the pool-size estimate. If not provided, the skill reads it from the intake's location-policy field.

## Reference files

Always read these from `references/` before generating queries:

- `references/1-role-intake-template.md` — the structure the skill expects on input. If the user's intake doesn't match this shape, surface the gap and ask for a re-author rather than guessing.
- `references/2-rubric-fairness-checklist.md` — the patterns that, if present in the intake, halt the skill before query generation. Do not edit to make biased intakes pass.
- `references/3-channel-query-formats.md` — per-channel syntax notes (hireEZ Boolean operators, X-ray format conventions, Juicebox PeopleGPT prompt patterns).

## Method

Five steps, in order. Steps 1-2 are validation and grounding; steps 3-5 produce the output. The order matters: if the intake fails fairness pre-flight, the skill must not reach the synonym-expansion step, because synonyms generated against a biased intake encode the bias into the queries themselves.

### 1. Validate the intake

Open `role_intake` and run every check in `references/2-rubric-fairness-checklist.md`. If the intake includes school-tier scoring, name-pattern filtering, employment-gap penalties, photo-presence requirements, or "culture fit" without behavioral anchors, halt and return the offending lines. Do not proceed.

The check runs at intake-parse time, not query-generation time, so a violating rubric never reaches synonym expansion. School-prestige scoring is the most common bias-amplification path here; if the user insists on a school-tier dimension, redirect them to the underlying technical-depth signal that's actually doing the prediction work.

### 2. Expand synonyms per dimension

For each `must_have`, generate `synonym_depth` synonyms. Each synonym must be grounded — cite the reasoning ("commonly used at Stripe / Plaid / fintech engineering teams," "framework introduced in 2022, naming varies between vendors"). If you cannot ground a synonym in named usage, omit it. Do not invent.

Cap the synonym set at 10 even if `synonym_depth` is set higher; beyond 10 the false-positive rate climbs faster than the recall, and queries return unmanageable pools.

For `nice_to_have`, generate up to 3 synonyms each — these are used to rank, not to filter, so over-expansion is less costly.

For `anti_signals`, do not expand. Anti-signals as NOT clauses should match exactly to avoid over-eliminating; the user knows what they meant.

### 3. Build three queries in parallel

Author the three queries against the channel-specific format in `references/3-channel-query-formats.md`. Each channel gets a query tuned to its strengths:

- **hireEZ Boolean** — explicit `AND`/`OR`/`NOT` grouping with parentheses. Title field, skill field, and exclude field used separately rather than crammed into one Boolean string. Location goes into the structured location filter, never into the Boolean. Synonyms grouped per dimension with `OR`.
- **Google X-ray** — `site:linkedin.com/in` (or `site:github.com` for engineering roles where the GitHub signal is stronger). Title in quotes. Anti-signals as `-` exclusions. Synonyms via `OR` operator. The X-ray output is annotated with a "manual use only" warning and a recommended max of 50 page-fetches per query before switching to the Recruiter API.
- **Juicebox PeopleGPT** — natural-language prompt that names the role, level, and key signals in plain English. Location and level go into Juicebox's structured filters, not the prompt. Synonyms inform the prompt's wording but are not enumerated; PeopleGPT's underlying expansion handles that.

The three queries describe the *same role* but are tuned to different retrieval mechanics. Do not generate the same Boolean and label it three ways.

### 4. Estimate pool size band

For each query, return an expected pool-size band with the assumptions named:

```
hireEZ: 200-800 results
Assumes US Pacific + Mountain time zones; 5 synonyms per dimension;
Senior IC level filter applied. Tighten by removing the broadest
synonym in skill_match if results exceed 800.
```

The band is calibrated against the synonym count and the location filter. It is an estimate, not a measurement. Sourcers can tighten or widen based on the band rather than running the query and being surprised.

### 5. Surface dimensional coverage gaps

Every query is annotated with what it is NOT catching. The common gaps are:

- **Response-likelihood** — Boolean and X-ray have no recency filter beyond what the channel exposes. Annotate which channels' UIs let the sourcer filter on profile-update recency post-search.
- **Level / scope** — Boolean cannot easily encode "Senior IC scope across two teams." Note that level filtering happens via structured filters in the channel UI, not in the Boolean.
- **Behavioral signals** — no Boolean expression captures "led a migration." Note that this dimension is best handled by rubric ranking on the returned pool, e.g. via the `candidate-sourcing` skill.

The output must make the gap visible so the sourcer plans the next step.

## Output format

```markdown
# Search queries — {role title}

Intake: `{path}` · Synonym depth: {n} · Generated: {ISO timestamp}

## hireEZ Boolean

**Title field:**
"Senior Backend Engineer" OR "Staff Engineer" OR "Senior Software Engineer"

**Skill field:**
("Go" OR "Golang" OR "Rust") AND ("distributed systems" OR "microservices" OR "event-driven")

**Exclude field:**
"contractor" OR "freelance"

**Location filter (structured):** US Pacific Time, US Mountain Time

### Pool-size estimate
200-800 results.
Assumes US Pacific + Mountain time zones; 5 synonyms per dimension; Senior IC level filter applied. Tighten by removing "microservices" if >800.

### Coverage gaps
- Response-likelihood: filter on profile recency in hireEZ UI after the search.
- Level / scope: confirm "Senior IC vs Manager" in the rubric ranking step; Boolean can't encode it.

## Google X-ray

⚠️ Manual use only. Cap at ~50 result-page fetches per query before switching to a sourcing-tool API. Public LinkedIn scraping at scale violates ToS.

```
site:linkedin.com/in "Senior Backend Engineer" OR "Staff Engineer"
("Go" OR "Golang" OR "Rust") "distributed systems"
-"contractor" -"freelance"
```

### Pool-size estimate
50-300 indexable results.
LinkedIn's robots.txt limits which profiles Google indexes; X-ray surfaces a fraction of the population.

### Coverage gaps
- Recency: Google index lag is 1-3 months on profile updates.
- Level: title quoting catches some but not all level conventions; expect overlap with junior roles.

## Juicebox PeopleGPT

```
Find Senior Backend Engineers in the US Pacific or Mountain time zones
who own production Go or Rust services in distributed systems
contexts. Prefer candidates who have led a service rewrite or
migration with named outcomes. Exclude contractors. Senior IC scope.
```

**Structured filters:** Level = Senior IC. Location = US Pacific, US Mountain.

### Pool-size estimate
80-400 results in PeopleGPT.
Juicebox tends to return tighter pools than hireEZ for the same intake; expect about half the volume.

### Coverage gaps
- Behavioral signal: PeopleGPT picks up "led a migration" loosely; verify in rubric ranking.

## Synonym reasoning (audit trail)

- "Golang" / "Go" — interchangeable; "Go" alone collides with too many false positives in profile text. Pair them with `OR`.
- "distributed systems" / "microservices" / "event-driven" — three labels for overlapping but distinct architectural styles. "microservices" is broader (commonly used at non-distributed-systems shops); flag for tightening if results exceed 800.
- (additional synonyms with reasoning…)
```

## Watch-outs

- **Bias-encoding through proxy terms.** *Guard:* the fairness pre-flight in step 1 halts the skill before any synonym expansion if the intake names protected-class proxies. School-prestige in particular: do not list specific schools; list the technical-depth signal those schools tend to correlate with, and let the synonym expansion catch graduates from non-target schools who have the depth.
- **LinkedIn ToS exposure.** *Guard:* the X-ray output carries a "manual use only" warning and a recommended page-fetch cap. The skill does not generate scraping scripts.
- **Synonym hallucination.** *Guard:* every synonym cites grounded reasoning. Synonyms without reasoning are dropped before the query is built.
- **Over-expansion → garbage pools.** *Guard:* hard cap at 10 synonyms per dimension, with the warning in step 2 about false-positive rate climbing faster than recall.
- **Pool-size estimate treated as measurement.** *Guard:* output is always a band with assumptions named, never a single number. If actual results diverge >2× from the band, the synonym count or location filter is wrong; the skill should be re-run with a tightened intake, not the query patched in place.

# Role intake template

Copy this file to your sourcing repo as `intake/<role-slug>.md` and fill in every section. The Boolean search builder skill reads this shape; deviations are surfaced as missing fields rather than guessed.

A complete intake takes 20-40 minutes per net-new role. For a role family you've sourced before, you usually start from the previous intake and edit 2-3 fields.

---

## Role title and level

- **Posted title:** {e.g. Senior Backend Engineer (Distributed Systems)}
- **Internal level:** {e.g. L5 / Senior IC / IC4 — be specific to your firm's ladder}
- **Reports to:** {Engineering Manager / Director / etc.}

## Must-haves (binary, used as AND in queries)

These are non-negotiable. If a candidate doesn't have one of these, they don't progress, regardless of strength on other dimensions. Limit to 4-6 — more than that and the pool collapses.

For each must-have, write:
- The capability or experience as the candidate would describe it (resume voice, not internal jargon).
- The minimum threshold (years, scope, scale).
- Why it's a must-have (the failure mode if you hire without it).

Example:
- **Production Go or Rust experience, 3+ years.** This role owns latency-critical services. Without production Go/Rust, ramp time exceeds quarter and the on-call rotation can't absorb a junior pattern.
- **Owned a distributed-system migration end-to-end.** Sequencing changes across services with rollback plans is the daily work; without ownership signal, the candidate hasn't internalized the failure modes.

(your must-haves here)

## Nice-to-haves (additive, used to rank, NOT to filter)

Signals that increase confidence but don't gate. Limit to 5-8.

Example:
- Open-source contribution to a distributed-system project (Kafka, Temporal, NATS, etc.)
- Speaker / writer presence (conference talk, technical blog) — signals communication ability under scrutiny.
- Prior experience at a similar-scale company (10K-100K req/sec).

(your nice-to-haves here)

## Anti-signals (used as NOT — exact match, not expanded)

Things that, when present, lower confidence. Anti-signals are NOT clauses, not silent disqualifications. The skill uses them exact-match to avoid over-eliminating.

Example:
- Resume describes job-hopping (>3 jobs in 4 years without an explanation in the application).
- "Full-stack" as the only architectural framing — this role needs a depth signal, not a breadth one.
- Contractor / freelance current title (W-2 candidates only for this opening).

(your anti-signals here)

## Location policy

- **Geography:** {US Pacific Time + US Mountain Time / EU including UK / EMEA / etc.}
- **Remote / hybrid / onsite:** {fully remote / 2 days in {office} / fully onsite}
- **Work authorization:** {US citizen or green card / will sponsor H-1B / EU work auth required / etc.}
- **Time-zone overlap requirement:** {minimum N hours overlap with {office time zone}}

## Compensation band (for response-likelihood calibration)

- **Base:** {$X-$Y}
- **Equity:** {early-stage % / RSU $ value at strike / not applicable for cash-only roles}
- **Bonus / on-target earnings:** {if applicable}

The Boolean search builder does not include comp in queries (NYC LL 32-A, CO/CA/WA pay-transparency laws — comp belongs in the public posting, not in the search query). It uses the band only to calibrate the response-likelihood synonyms.

## Hiring manager intent (free text, ≤200 words)

What is the hiring manager actually trying to solve? "We need three senior engineers" is the requisition. "We're rebuilding the routing layer because the current synchronous design caps throughput at 8K req/sec and the next quarter's traffic forecast pushes 15K" is the intent.

The intent helps the synonym expansion in step 2: synonyms grounded in the actual problem catch better candidates than synonyms grounded in the title.

## Channels available

- [ ] hireEZ (account, plan tier, monthly query quota)
- [ ] Juicebox PeopleGPT (account, plan)
- [ ] LinkedIn Recruiter (seats available)
- [ ] Public X-ray only (no paid sourcing tool — the X-ray query is the primary output)

The skill generates queries for all three by default; if a channel is unavailable, set the `channels` skill parameter on invocation.

# Fairness pre-flight checklist

This is the gate the Boolean search builder runs against the role intake before generating any query. If any check fails, the skill halts and surfaces the offending content to the user. Do not edit this file to make a violating intake pass — edit the intake to remove the proxy.

The intent: a search query that encodes a protected-class proxy will return candidates filtered on that proxy. The downstream rubric ranking and human review will not catch it because the biased filter happened upstream of the pool. The fix has to be at the query layer.

## Halt conditions (any match → halt)

### School-tier or institution scoring as a standalone signal

- Any reference to "Tier 1" / "T1" / "elite" schools as a must-have or nice-to-have.
- Lists of specific universities as a positive or negative filter ("Stanford, MIT, CMU only" or "no bootcamp grads").
- "Top X schools" framing.

**Fix:** identify the underlying technical-depth signal that schools tend to correlate with (algorithmic depth, systems coursework, research exposure) and score on the signal. Graduates from non-target schools who have the depth pass; graduates from target schools who don't, fail.

### Name-pattern filtering

- Filtering or scoring based on candidate name (transliteration patterns, ethnic-origin inference).
- "Native English speaker" or related framings (legal exposure: national-origin discrimination under EEOC).

**Fix:** if language proficiency is required for the role (technical writing, customer-facing), score on demonstrated communication output (blog posts, talks, public PRs), not on name or self-reported proficiency.

### Employment-gap penalties

- Anti-signals or scoring deductions for employment gaps without context.
- "No gaps over 6 months" as a binary filter.

**Fix:** EEOC has guidance that blanket gap penalties have disparate impact (caregiving, illness, military reservist activation, parental leave). If continuous tenure is genuinely required (security-clearance roles), name the actual constraint instead of using a gap proxy.

### Photo presence or photo-based scoring

- Any reference to candidate photos, "professional appearance" requirements.
- Filtering by presence/absence of LinkedIn profile photo.

**Fix:** there is no fix — drop the dimension. Photo-based scoring has no defensible business reason in technical hiring.

### Age inference (graduation year, "early career," "experienced professional")

- Filtering on graduation year as an age proxy.
- "10+ years experience" combined with no senior-scope dimension (often a pretextual age filter).
- "Recent graduate" as a must-have when the role is non-entry-level.

**Fix:** scope-based scoring (Senior IC, Staff scope, Founding-engineer scope) captures the actual signal without the age proxy.

### Pregnancy or parental-status inference

- "Available for full-time work" or "no parental leave plans" framings.
- Filtering based on family status indicators.

**Fix:** there is no fix — drop the dimension. Pregnancy/parental discrimination is illegal in most jurisdictions and has zero defensible role in a search query.

### "Culture fit" without behavioral anchors

- "Culture fit" or "fits our culture" as a standalone dimension.
- Vague affinity signals ("plays sports" / "likes craft beer" / etc.).

**Fix:** name the specific behavioral signal — "communicates ambiguity early," "ships under deadline pressure without quality regression," "challenges plans with evidence" — and score on observable behavior.

### Group-affiliation filtering as positive or negative

- Filtering by political affiliation, religious affiliation, sexual orientation, gender identity, disability status, veteran status (except where law explicitly requires veteran preference, which is documented separately).
- Filtering by membership in / absence from professional organizations that correlate with protected classes.

**Fix:** drop the dimension. None of these have a defensible search-query use case.

## What halts surfaces

When a halt condition matches, return:

```
HALT: rubric_failed_fairness_preflight

Offending lines from the intake:
L{n}: {line content}

Halt category: {category from above}

Suggested fix: {category-specific fix from above}

The skill will not generate queries until the intake is revised.
```

## Why this is non-negotiable

1. **Legal exposure.** NYC Local Law 144 requires bias audits for AI hiring tools. EU AI Act categorizes hiring AI as high-risk. EEOC has issued guidance on AI hiring. A search query is part of the AI hiring decision pipeline.

2. **Technical correctness.** A biased query returns a biased pool. No amount of downstream rubric ranking or human review fixes the upstream filter; you are choosing among a pre-filtered subset that doesn't include the candidates the proxy rejected.

3. **Audit defensibility.** Under any of the above legal frameworks, the firm needs to demonstrate that the search criteria don't encode protected-class proxies. The pre-flight log entry is part of that demonstration.

# Channel-specific query formats

This file documents how to author queries for each of the three supported channels. The Boolean search builder skill uses these conventions; if the user wants to swap or add a channel, this file is what changes.

Per-channel notes are written as the channel's own quirks would be: hireEZ's Boolean is permissive, Google X-ray is brittle, Juicebox PeopleGPT prefers natural language. The skill respects each.

## hireEZ

### Boolean operators

- `AND`, `OR`, `NOT` — uppercase, with parentheses for grouping.
- `"exact phrase"` — double quotes for multi-word matches.
- `*` wildcard — supported but degrades ranking; avoid.
- Field qualifiers: hireEZ exposes title, skill, location, education, current-company, past-company. Use the per-field input rather than cramming everything into one Boolean string.

### Quirks

- Synonym expansion is server-side. hireEZ runs its own synonym map; if you list "Senior Engineer" it will silently include "Sr. Engineer" and "Sr Engineer." Don't double-count.
- Location goes in the structured location filter, NOT in the Boolean. Free-text location in the Boolean returns inconsistent results.
- Boolean strings over ~15 terms degrade relevance ranking. Cap synonyms at 5 per dimension; if you need more, split into multiple saved searches.

### Output shape from the skill

```
Title field:
"Senior Backend Engineer" OR "Staff Engineer" OR "Senior Software Engineer"

Skill field:
("Go" OR "Golang" OR "Rust") AND ("distributed systems" OR "microservices" OR "event-driven")

Exclude field:
"contractor" OR "freelance"

Location filter (structured):
Time zones: US Pacific Time, US Mountain Time
Remote OK: yes
```

## Google X-ray

### Operators

- `site:linkedin.com/in` — restricts to LinkedIn member profiles.
- `site:github.com` — for engineering roles where the GitHub signal is stronger than LinkedIn.
- `"exact phrase"` — double quotes work.
- `OR` — uppercase, otherwise treated as a literal.
- `-term` — exclusion.
- `inurl:` — useful for narrowing to specific subdomains.

### Quirks

- LinkedIn's `robots.txt` excludes a large share of profiles from Google indexing. X-ray surfaces a fraction of the actual LinkedIn population (estimate: 10-30% of US member profiles are X-ray-indexable).
- Google index lag on profile updates is 1-3 months. Recently-updated profiles are systematically under-represented.
- Result page count is unreliable above 500. Above ~50 page-fetches per query the source IP starts seeing CAPTCHAs; production sourcing through this channel violates LinkedIn ToS.
- The skill annotates X-ray output with a "manual use only" warning and a recommended max of 50 page-fetches per query.

### Output shape from the skill

```
⚠️ Manual use only. Cap at ~50 result-page fetches per query.

site:linkedin.com/in "Senior Backend Engineer" OR "Staff Engineer"
("Go" OR "Golang" OR "Rust") "distributed systems"
-"contractor" -"freelance"
```

## Juicebox PeopleGPT

### Format

- Natural-language prompt that names the role, level, and key signals in plain English.
- Structured filters (level, location, time zone, current-company size band) used in addition to the prompt — these are reliable.
- Anti-signals are described in the prompt ("exclude contractors and freelancers") rather than encoded as Boolean NOT.

### Quirks

- PeopleGPT runs its own synonym expansion. Listing 5+ synonyms in the prompt produces over-narrow pools; the model interprets the listing as a tighter intent than a single canonical term plus reliance on its own expansion.
- Behavioral signals ("led a migration") are picked up loosely — PeopleGPT will surface candidates whose profiles describe ownership work, but exact-phrase matching is not guaranteed.
- Pool sizes tend to be tighter than hireEZ for the same intake (estimate: ~50% of hireEZ volume).

### Output shape from the skill

```
Find Senior Backend Engineers in the US Pacific or Mountain time zones
who own production Go or Rust services in distributed systems contexts.
Prefer candidates who have led a service rewrite or migration with named
outcomes. Exclude contractors and freelancers. Senior IC scope.

Structured filters:
Level: Senior IC
Location: US Pacific, US Mountain
Currently employed: yes
Profile updated within: 90 days
```

## Adding a new channel

To add a fourth channel (e.g. SeekOut, AmazingHiring, Loxo):

1. Add a section to this file with the operators, quirks, and output shape.
2. Update the `channels` parameter in `SKILL.md` to include the new channel name.
3. Update the example output in `SKILL.md` to show what a query for the new channel looks like.

The skill will pick up the new channel automatically once it can read the format from this file.