A Model Context Protocol (MCP) server that exposes the Greenhouse Harvest API as read-mostly tools to Claude Desktop, Claude Code, or any MCP-compatible client. Six read tools cover the daily recruiter questions (“which candidates are stuck in stage X for >Y days?”, “what’s the funnel for this role?”, “show me this candidate’s history”); one cautious write tool (note_stage_stuck) logs stage-stuck candidates as internal notes for the recruiter to act on. Designed for the recruiter who lives in Claude and wants their ATS state without context-switching, and for the recruiting engineer building agentic workflows that need ATS read access.
The scaffold ships as a Python package importable from disk. It is NOT runtime-tested against a live Greenhouse tenant — the disclaimer is repeated in the README and at the top of server.py. Production use requires the recruiting engineer to wire credentials, configure rate limiting, and verify the dispatched calls against a non-production Greenhouse environment first.
When to use
The recruiter or recruiting engineer wants ATS state available in Claude conversations and is willing to install an MCP server (low-friction in Claude Desktop and Claude Code, more setup in custom MCP clients).
The team has Greenhouse Harvest API access (Harvest is the read-write API; Job Board is the public read-only one — this server uses Harvest).
Read-mostly access fits the use case. The server’s writes are limited to one cautious tool (note_stage_stuck) that adds an internal note; no candidate-state mutations are exposed by default.
Recruiting engineering or IT has the security posture to handle an API key with Harvest scope. The server’s stderr audit log is the only audit trail it produces; capturing it is on the team.
When NOT to use
Production-ready, runtime-tested setup needed today. This is a scaffold. The READMEs say so; the docstrings say so. Use it as a starting point, not as a finished deployment.
Multi-tenant SaaS use. The server’s auth model is single-tenant (one API key, one Greenhouse instance). Multi-tenant requires non-trivial reshape.
Write-heavy workflows. The server is intentionally read-mostly. If the use case needs to move candidates between stages, post to job boards, or send candidate communications, those need separate per-tool security review and explicit per-tool justification per the recruiting cursor-rule guidance.
Storing candidate data outside Greenhouse. The server returns candidate data to the calling Claude session; the session’s data-handling posture is the recruiter’s responsibility. Do not log raw candidate names or PII into your own audit table — the audit log captures candidate_id only.
Bypassing the candidate-consent posture. Greenhouse’s data is candidate-consented for hiring purposes. Pulling it into agentic workflows does not extend that consent. Stay within the disclosed processing purposes.
Setup
Install the package. From apps/web/public/artifacts/mcp-server-greenhouse-recruiting/:
pip install -e .
The package is structured as a uv / pip-installable Python project with pyproject.toml.
Set credentials. Two env vars: GREENHOUSE_API_KEY (a Harvest API key from Greenhouse → Configure → Dev Center → API Credential Management; grant read permission only on the endpoints the tools use, plus write on candidate notes, the one write the server needs) and GREENHOUSE_USER_ID_FOR_ON_BEHALF_OF (the user ID Greenhouse will attribute writes to, required for note_stage_stuck).
Register with the MCP client. For Claude Desktop, add a greenhouse-recruiting entry to claude_desktop_config.json (the full JSON appears in the bundled README below).
For Claude Code, the equivalent goes in the project’s .claude/settings.local.json MCP block.
Sanity check against staging. Greenhouse offers a separate staging environment for paying customers. Wire the server against staging first, then run the included python -m greenhouse_recruiting_mcp.smoke command (a bundled check, itself not runtime-tested, that confirms the credentials authenticate and the rate-limit headers parse).
Production move. Only after staging validation, swap the env vars to the production API key. The server runs locally alongside the MCP client; no separate deployment is needed for single-recruiter use. For team use, run it in a shared container with a per-recruiter MCP gateway.
What the server exposes
Seven tools. Six are read; one is the cautious write. Per the recruiting cursor-rule guidance, writes need explicit per-tool justification — note_stage_stuck has it documented in server.py’s docstring.
Read tools
list_candidates_in_stage — given a job ID and a stage name, return the candidates currently in that stage with their last-touched-at timestamp. Useful for “who’s stuck in onsite-debrief?” queries.
get_candidate_history — given a candidate ID, return their stage history (entries, exits, timestamps, who moved them). Useful for context-loading before a recruiter screen.
list_jobs_open — list all open jobs with team, hiring manager, opened_at, target_close_date. Useful for the recruiter-leader’s “what are we working on” overview.
get_funnel_for_job — given a job ID, return the candidate count per stage. Useful for funnel-health checks; a sketch of this tool’s shape follows the list.
list_jobs_stalled — list jobs where no candidate has progressed in N days (default 7). Useful for catching stalled reqs before the hiring manager notices.
search_candidates_by_attribute — given a custom-field name and value, return matching candidates. Useful for ad-hoc filtering that Greenhouse’s UI doesn’t surface.
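A sketch of the read-tool shape, using get_funnel_for_job as the example. It assumes httpx; the function body and names are illustrative rather than the scaffold's actual internals, though the endpoint, Basic-auth pattern, and Link-header pagination follow Harvest's documented conventions.

```python
# Illustrative sketch: assumes httpx; not the scaffold's actual code.
import os
from collections import Counter

import httpx

HARVEST_BASE = "https://harvest.greenhouse.io/v1"

async def get_funnel_for_job(job_id: int) -> dict[str, int]:
    """Candidate count per stage via GET /applications?job_id=<id>."""
    auth = (os.environ["GREENHOUSE_API_KEY"], "")  # key as username, empty password
    funnel: Counter = Counter()
    url, params = f"{HARVEST_BASE}/applications", {"job_id": job_id, "per_page": 100}
    async with httpx.AsyncClient(auth=auth) as client:
        while url:
            resp = await client.get(url, params=params)
            resp.raise_for_status()
            for app in resp.json():
                stage = app.get("current_stage") or {}
                funnel[stage.get("name", "unknown")] += 1
            # Harvest paginates via the Link header's rel="next" URL.
            url = resp.links.get("next", {}).get("url")
            params = None  # the next URL already carries its query string
    return dict(funnel)
```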
Write tool
note_stage_stuck — given a candidate ID and a free-text note, adds an internal note to the candidate’s record. Used to log “Claude flagged this candidate as stage-stuck for >14 days” so the action is visible in the audit trail and not silent. Per recruiting-engineer norms: every write produces an audit-trail entry attributed via the On-Behalf-Of header.
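A sketch of that write path, again assuming httpx. The notes endpoint, On-Behalf-Of header, and Basic-auth pattern are Harvest's documented shape; the function name mirrors the tool, everything else is illustrative.

```python
# Illustrative sketch: endpoint and headers follow Harvest's documented
# notes API; the rest is not the scaffold's actual code.
import os

import httpx

HARVEST_BASE = "https://harvest.greenhouse.io/v1"

async def note_stage_stuck(candidate_id: int, note: str) -> dict:
    """Add an internal note to a candidate, attributed via On-Behalf-Of."""
    user_id = os.environ["GREENHOUSE_USER_ID_FOR_ON_BEHALF_OF"]
    auth = (os.environ["GREENHOUSE_API_KEY"], "")
    async with httpx.AsyncClient(auth=auth) as client:
        resp = await client.post(
            f"{HARVEST_BASE}/candidates/{candidate_id}/activity_feed/notes",
            headers={"On-Behalf-Of": user_id},  # Greenhouse's audit log shows this user
            json={"user_id": int(user_id), "body": note, "visibility": "admin_only"},
        )
        resp.raise_for_status()
        return resp.json()
```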
Cost reality
Greenhouse API quota — Harvest API is rate-limited at 50 req/10s per API key per IP. The server includes a token-bucket rate limiter (configurable, default 40 req/10s) that throttles before the limit. Bursts above this get 429s with no Retry-After header (Greenhouse’s documented behavior); the server’s backoff logic handles this. A token-bucket sketch follows this list.
LLM tokens — depend entirely on what the calling Claude session does with the data. The server itself returns structured JSON; the Claude session’s prompt budget is the cost.
Server hosting cost — runs locally alongside the MCP client. Zero ongoing cost for single-recruiter use. Team-wide deployment in a shared container is at most a small VM ($5-15/month).
Setup time — 60 minutes including the staging sanity check and the MCP client registration. Recruiting-engineer time is the binding cost.
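A minimal sketch of the token-bucket throttle described under the API-quota item above. The 40 req/10s default matches the server's documented default; the class itself is illustrative, not the scaffold's actual implementation.

```python
import asyncio
import time

class TokenBucket:
    """Throttle to `capacity` requests per `window` seconds (default 40/10s,
    under Greenhouse's 50 req/10s ceiling)."""

    def __init__(self, capacity: int = 40, window: float = 10.0):
        self.capacity = capacity
        self.refill_rate = capacity / window  # tokens added per second
        self.tokens = float(capacity)
        self.updated = time.monotonic()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        """Block until a token is available, then consume it."""
        async with self._lock:
            while True:
                now = time.monotonic()
                elapsed = now - self.updated
                self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
                self.updated = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                # Sleep just long enough for one token to refill.
                await asyncio.sleep((1 - self.tokens) / self.refill_rate)
```

Every Harvest call awaits bucket.acquire() first; sustained throughput then stays below the ceiling even when the Claude session fires many tool calls at once.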
Success metric
Hard to measure directly. The honest metrics:
Recruiter Claude-session count per week using the MCP — how many times per week the recruiter or recruiting engineer used a Claude session that called the MCP. If it’s fewer than 5 per week after a month, the use case isn’t there.
Average context-switch time saved per Claude session — qualitative; the recruiter’s own assessment of “how long would this question have taken without the MCP, in Greenhouse UI?” The MCP earns its setup cost when the answer is regularly >2 minutes per question.
vs alternatives
vs Greenhouse’s UI directly. UI is the right call when the recruiter is already in Greenhouse for other reasons. The MCP earns its setup cost when the recruiter is in Claude for other reasons (drafting outreach, summarizing notes, building Boolean queries) and pulling ATS state would otherwise be a context switch.
vs Greenhouse’s native chatbot integrations. Greenhouse offers Slack and other integrations that surface ATS state. Pick those if the team lives in Slack. Pick the MCP if the team lives in Claude.
vs DIY Python script against Harvest. Same data, but the MCP makes the data available to ANY MCP client (Claude Desktop, Claude Code, Cursor, others as MCP adoption spreads), not just to the script.
vs querying the Harvest API directly. Possible for technical users, but every query is a curl-and-parse cycle. The MCP wraps that cycle into tool-call form for Claude.
Watch-outs
Not runtime-tested against a live tenant. Guard: explicitly disclaimed in the README and in server.py’s module docstring. Production deployment requires the recruiting engineer to verify each tool against a staging tenant first. The bundled smoke test is a credentials/rate-limit check, NOT a tool-by-tool validation.
Rate limit exhaustion. Guard: token-bucket rate limiter in the server defaults to 40 req/10s (below Greenhouse’s 50 req/10s ceiling). Configurable; lower it if other systems share the API key.
Candidate PII leakage to chat-model context. Guard: the server returns the data the API returns (including names and emails) to the Claude session. The session’s data-handling posture is the recruiter’s responsibility. The README explicitly says: don’t paste session transcripts into shared Slack channels.
Write-tool drift. Guard: only note_stage_stuck is exposed as a write. The other six tools have no write paths. If a recruiting engineer adds new write tools, the per-tool review template in the README must be filled out and the tool’s purpose documented in the tools/ registry section of server.py.
API-key scope creep. Guard: README documents the minimum Harvest verbs needed (read-only on candidates, applications, jobs, users; write on candidates.notes only). Wider-scope keys silently turn the server into a higher-blast-radius surface.
Multi-tenant configuration drift. Guard: server is single-tenant by design. Multi-tenant deployments require non-trivial reshape; the README disclaims this rather than papering over it.
Stack
The artifact bundle lives at apps/web/public/artifacts/mcp-server-greenhouse-recruiting/; its README is reproduced below.
# Greenhouse recruiting MCP server
A read-mostly MCP server exposing the Greenhouse Harvest API as tools to Claude Desktop, Claude Code, or any MCP-compatible client. Six read tools cover daily recruiter questions; one cautious write tool (`note_stage_stuck`) adds an internal note.
**This is a scaffold, not a runtime-tested production server.** The tool implementations are written against the Harvest API's documented shape, but the recruiting engineer is responsible for verifying each tool against a Greenhouse staging tenant before flipping production credentials. The disclaimer is repeated in `server.py`'s module docstring.
## Install
```bash
cd apps/web/public/artifacts/mcp-server-greenhouse-recruiting/
pip install -e .
# or
uv pip install -e .
```
The package exposes a `greenhouse-recruiting-mcp` CLI entrypoint.
## Environment variables
### `GREENHOUSE_API_KEY` (required)
The Harvest API key from Greenhouse → Configure → Dev Center → API Credential Management.
When generating the key, pick the **minimum scope** needed:
- `Get` permission on `Candidates`, `Applications`, `Jobs`, `Users`, `Departments`.
- `Post` permission on `Candidate Notes` (only — required for `note_stage_stuck`).
Wider scopes silently turn the server into a higher-blast-radius surface. If you find yourself adding a write permission for a new tool later, document the per-tool justification in `server.py`'s `TOOL_REGISTRY` docstring.
### `GREENHOUSE_USER_ID_FOR_ON_BEHALF_OF` (required)
The Greenhouse user ID that writes will be attributed to. Find this in the Greenhouse user URL when you're logged in as the user (e.g. `app.greenhouse.io/users/123456` → user ID is `123456`).
This is required even if you only use the read tools — the server validates it at startup so writes can't slip through unattributed later.
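A sketch of what that startup check can look like. The env var names are the documented ones; the function itself is illustrative, not the scaffold's actual code.

```python
import os
import sys

def validate_env() -> None:
    """Fail fast at startup so a write can't run unattributed later."""
    required = ("GREENHOUSE_API_KEY", "GREENHOUSE_USER_ID_FOR_ON_BEHALF_OF")
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        sys.exit(f"greenhouse-recruiting-mcp: missing env vars: {', '.join(missing)}")
    if not os.environ["GREENHOUSE_USER_ID_FOR_ON_BEHALF_OF"].isdigit():
        sys.exit("GREENHOUSE_USER_ID_FOR_ON_BEHALF_OF must be a numeric user ID")
```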
## MCP client registration
### Claude Desktop
Add to `claude_desktop_config.json` (location varies by OS — `~/Library/Application Support/Claude/` on macOS, `%APPDATA%\Claude\` on Windows):
```json
{
"mcpServers": {
"greenhouse-recruiting": {
"command": "uv",
"args": ["run", "greenhouse-recruiting-mcp"],
"cwd": "/absolute/path/to/mcp-server-greenhouse-recruiting",
"env": {
"GREENHOUSE_API_KEY": "...",
"GREENHOUSE_USER_ID_FOR_ON_BEHALF_OF": "..."
}
}
}
}
```
Restart Claude Desktop. The seven tools should appear in the tools panel.
### Claude Code
In your project's `.claude/settings.local.json`:
```json
{
"mcpServers": {
"greenhouse-recruiting": {
"command": "uv",
"args": ["run", "greenhouse-recruiting-mcp"],
"cwd": "/absolute/path/to/mcp-server-greenhouse-recruiting",
"env": {
"GREENHOUSE_API_KEY": "...",
"GREENHOUSE_USER_ID_FOR_ON_BEHALF_OF": "..."
}
}
}
}
```
### Other MCP clients (Cursor, Continue, etc.)
Most accept the same `command + args + env` shape. Refer to the client's MCP documentation.
## Sanity-check invocation
Before the first real use, run the server against a Greenhouse staging tenant (paid Greenhouse customers can request a staging environment from their CSM).
```bash
GREENHOUSE_API_KEY=staging_key \
GREENHOUSE_USER_ID_FOR_ON_BEHALF_OF=staging_user_id \
greenhouse-recruiting-mcp --help
```
(The CLI is the MCP stdio server; `--help` is not implemented in this scaffold. The intent is to confirm the package installed and the entrypoint resolved.)
For per-tool validation, the recommended flow is:
1. Register the server with Claude Desktop pointed at staging credentials.
2. Ask Claude to call each tool with known staging inputs and verify the responses match what you see in the staging Greenhouse UI.
3. Only after every tool is verified, swap to production credentials.
## Security model
- **Auth.** Greenhouse API key as Basic-auth username, empty password. Greenhouse's documented pattern.
- **Writes.** Only `note_stage_stuck` mutates state. Attributed via `On-Behalf-Of` header so the Greenhouse audit log shows the recruiting engineer's user, not just the API key.
- **Rate limit.** Token-bucket at 40 req/10s by default (Greenhouse ceiling is 50 req/10s). Lower if other systems share the API key.
- **PII in MCP responses.** The server returns the data the API returns — including candidate names and emails. The calling Claude session is downstream; the session's data-handling posture is the recruiter's responsibility. Don't paste session transcripts into shared Slack channels; don't log raw responses to your own audit table.
- **Audit log.** The server logs every tool call to stderr at INFO level with PII-stripped arguments. Recruiting engineer is responsible for capturing stderr into a durable audit log (e.g. via systemd journal, Docker log driver, or a wrapping script that tees to a file).
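A sketch of the PII-stripped audit pattern that last bullet describes; the allow-listed argument keys are illustrative, not the scaffold's actual field names.

```python
import json
import logging
import sys

logging.basicConfig(stream=sys.stderr, level=logging.INFO)
audit = logging.getLogger("greenhouse_recruiting_mcp.audit")

# Allow-list identifiers only; never names or emails from arguments or responses.
SAFE_ARG_KEYS = {"candidate_id", "job_id", "stage_name", "days"}

def log_tool_call(tool: str, arguments: dict) -> None:
    """Emit one PII-stripped audit line per tool call."""
    safe_args = {k: v for k, v in arguments.items() if k in SAFE_ARG_KEYS}
    audit.info(json.dumps({"tool": tool, "args": safe_args}))
```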
## Known limits — numbered TODO before production use
The scaffold is honest about what it doesn't do yet. Treat each as a numbered TODO to close before broad production use.
1. **Not runtime-tested.** Every tool needs validation against a Greenhouse staging tenant. The smoke check in this README is a credentials check, not a per-tool validation.
2. **Pagination max page count.** The async iterator caps at 50 pages per call (5,000 records at 100/page). For tenants with very large candidate volumes, the cap needs raising or replacement with a streaming pattern. (A sketch of the iterator appears after this list.)
3. **No request retry on transient errors.** The 429 handler retries once with a 2-second backoff. Other 5xx errors propagate; the recruiting engineer wraps the calls if more retry resilience is needed.
4. **Stage-name matching is exact-string.** "Phone Screen" and "Phone screen" are different. The tool surfaces this clearly enough but does not normalize.
5. **No multi-tenant support.** One server instance, one Greenhouse account. Multi-tenant requires non-trivial reshape (per-call credential injection, tenant-aware audit logging).
6. **Activity feed parsing is partial.** `get_candidate_history` returns the activity items the Harvest endpoint exposes; some interaction types (system actions, automated emails) may be undertyped. Expand the schema as the team finds gaps.
7. **No tests.** A pytest suite for the rate limiter and the Link-header parser is the obvious first addition; full integration tests against a staging tenant are the second.
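A sketch of the pagination and backoff behavior items 2 and 3 describe, assuming httpx; the real iterator in the scaffold may differ in detail.

```python
import asyncio
from typing import Any, AsyncIterator

import httpx

MAX_PAGES = 50  # 5,000 records at 100/page; raise or stream for large tenants

async def iter_records(
    client: httpx.AsyncClient, url: str, params: dict | None = None
) -> AsyncIterator[dict[str, Any]]:
    """Follow Harvest's Link rel="next" pagination, capped at MAX_PAGES."""
    for _ in range(MAX_PAGES):
        resp = await client.get(url, params=params)
        if resp.status_code == 429:
            # Greenhouse sends no Retry-After; retry once after 2s, then give up.
            await asyncio.sleep(2)
            resp = await client.get(url, params=params)
        resp.raise_for_status()  # other 4xx/5xx propagate to the caller
        for record in resp.json():
            yield record
        next_link = resp.links.get("next")
        if not next_link:
            return
        url, params = next_link["url"], None  # next URL carries the query string
```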
## What this server intentionally does NOT do
- **No `delete_*` tools.** Deletes happen in the Greenhouse UI, with the audit trail that produces.
- **No candidate-state mutations** (move stages, advance to offer, reject). Those are recruiter decisions and need explicit per-tool justification — adding them would compromise the read-mostly posture.
- **No bulk send / outbound email.** Outreach belongs in a sourcing tool with proper unsubscribe handling, not in an MCP read-tool surface.
- **No PII normalization** (no name-redaction, no email-hashing in responses). The server returns what Greenhouse returns; downstream redaction is the recruiter's responsibility.
## Adding a new tool
If you need a new tool:
1. Add a Pydantic input schema and an async implementation in `server.py`.
2. Register it in `TOOL_REGISTRY` with a clear description.
3. If it's a write tool, document the per-tool justification in the function's docstring (see `note_stage_stuck` for the template). Confirm the `On-Behalf-Of` attribution flows through.
4. Validate against staging.
5. Update this README's tool list.
The scaffold's structure makes this a 30-60 minute change per tool. The discipline is in the per-tool justification step — it's the only thing that prevents the read-mostly posture from drifting.
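For orientation, a sketch of what steps 1 and 2 can look like; `TOOL_REGISTRY`'s actual shape in `server.py` may differ.

```python
from pydantic import BaseModel, Field

class ListCandidatesInStageInput(BaseModel):
    """Input schema for list_candidates_in_stage."""
    job_id: int = Field(description="Greenhouse job ID")
    stage_name: str = Field(description="Exact stage name, e.g. 'Phone Screen'")

async def list_candidates_in_stage(args: ListCandidatesInStageInput) -> list[dict]:
    raise NotImplementedError  # Harvest call elided in this sketch

TOOL_REGISTRY = {
    "list_candidates_in_stage": {
        "input_schema": ListCandidatesInStageInput,
        "handler": list_candidates_in_stage,
        "write": False,  # write tools also carry a justification docstring
        "description": "Candidates currently in a stage, with last-touched-at.",
    },
}
```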