A .cursorrules file for engineers building integrations against contract-lifecycle-management platforms (Ironclad, Juro, LinkSquares, ContractPodAi, Agiloft): the Python or TypeScript glue between the CLM, the data warehouse, the firm's CMRR / cash-pacing dashboards, and the legal-ops admin surfaces. CLM engineering has the same shape as recruiting engineering (see the recruiting engineer cursor rule): every line touches commercial-confidential data, and audit logging and change-control are the only things standing between a CLM engineer and counsel asking "show me what changed and when."
When to use it
A legal-ops or contracts engineer is building integrations against a CLM platform and wants Cursor to push back when the code drifts toward the standard CLM-engineering anti-patterns (silent writes, swallowed errors, weak audit, broken idempotence).
The team has a written CLM data-flow architecture and is enforcing it in code; the rules surface the architecture's defaults at code-generation time.
Onboarding new engineers: the rules read as a CLM-engineering primer with the firm's defaults baked in.
When NOT to use it
CLM admin work that involves no code. Configuring workflow templates in the CLM UI, building approval matrices, and so on: this rule covers integration code, not the platform's own configuration.
General contracts work. The rules assume engineering work; commercial counsel's prompts are a different category.
Migration projects from one CLM to another. The concerns are different (data fidelity, preservation of historical records, downtime), and a migration is a one-off engagement that needs counsel-led planning rather than a standing rule of thumb.
Setup
Place the bundle. Copy apps/web/public/artifacts/cursor-rules-clm-engineer/.cursorrules to the root of your CLM engineering repo (Cursor reads .cursorrules automatically).
Customize the tool-specific section. The bundled rules cover the Ironclad, Juro, and LinkSquares APIs in the most detail, with shorter notes for ContractPodAi and Agiloft. Add or remove sections based on the firm's CLM stack.
Customize the audit destination. The default rule says "the audit log lands in the firm's Postgres clm_audit table." Edit it to match the firm's audit infrastructure (Datadog / Splunk / Snowflake).
Use it. Cursor reads the rules automatically when it generates code in the repo. The engineer prompts Cursor; the rules push toward the firm's defaults.
What the rules enforce
The rules push back at code-generation time on these patterns:
Questions before writing code
What contract data is involved? (Signed contracts are records of legal obligation; drafts may be privileged.)
Which jurisdictions does the data touch? (US contracts ≠ EU contracts under GDPR.)
Read or write? (The default is read; writes need a written rationale and audit attribution.)
What happens on retry? (Idempotence in every webhook handler.)
Where does the audit log entry land?
Tool-specific guidance
Ironclad: Workflow API specifics. Workflow IDs are GUIDs, not integers; pagination is cursor-based; webhook signatures are HMAC-SHA256.
Juro: document-templating API. Liquid templates require sandboxing; do not eval template strings.
LinkSquares: Records search API. Pagination is offset-based with a hard offset cap of 10K.
ContractPodAi / Agiloft: per-tool quirks, documented when the firm uses them.
Defaults to enforce
Audit trail: every read and every write produces an entry with timestamp, user_identity, system, action, contract_id, fields_changed.
Idempotence: webhook handlers key on (event_type, contract_id, source_event_id) and skip on second arrival.
Schema validation: parse every API response into a Pydantic / Zod schema before using it.
Secrets: API keys live in a secret manager; separate keys for read scope and write scope.
Privacy / consent: counterparty PII has its own retention policy; data-subject-access requests have a defined response path.
Testing: staging only; no live API calls in CI.
Anti-patterns to refuse
Writes without an equivalent of the On-Behalf-Of header (CLM systems vary in the header, but the principle is the same: every mutation is attributable to a named user).
Mutating contract fields in production without dual-control (some firms require a second approver for fields such as execution date, expiration, and renewal terms).
Auto-approving workflow steps based on inbound data. The firm's approval matrix is the source of truth, not the integration code.
Hard-coded contract IDs in production code. They drift; load them from config.
Cost reality
LLM tokens: none directly. Cursor reads the rules locally; there is no token cost beyond Cursor's own per-completion cost.
Engineer onboarding time: this is the gain. New CLM engineers without the rules drift toward the same anti-patterns; with the rules, Cursor pushes back at code-generation time.
Setup time: 20 minutes to place the file and customize the tool-specific sections.
Success metric
Code-review revert rate: the share of CLM integration PRs that are reverted or substantially refactored post-merge because of an audit / idempotence / schema issue. It should fall once the rules are in place.
Audit-log gap incidents: incidents where counsel cannot reproduce a contract state change. This should fall to zero.
New-engineer ramp-up time: qualitative; how quickly a new CLM engineer ships a production-safe integration. The rules are the most shareable part of "how we build CLM at this firm."
vs. alternatives
vs. internal engineering wiki. The wiki has the same content but is read on demand. Rules in .cursorrules are read at code-generation time, which is when they matter.
vs. code-review enforcement. Code review catches the issues, but late. The rules surface the standard at draft time, which is cheaper.
vs. no defaults. This is the default state, and it is the source of inconsistent integration code across team members.
Watch-outs
Rules drift from actual practice. Guard: the rules carry a last_reviewed date in the file header. The team's engineers revisit it quarterly.
Cursor not reading the rules. Guard: the file must be at the repo root and named exactly .cursorrules. The bundle's README calls this out explicitly.
Over-restrictive defaults blocking legitimate work. Guard: the rules say "if you need to break this rule, document why in the PR description and ping the legal-ops engineering lead." Hard rules with explicit escape valves work better than soft rules without them.
Tool API drift. Guard: the tool-specific sections include the API doc URL and a last_verified date. Check quarterly.
Stack
The bundle lives in apps/web/public/artifacts/cursor-rules-clm-engineer/:
# CLM Engineer — Cursor rules
last_reviewed: 2026-05-03
You are pairing with a CLM engineer (or contracts engineer who codes) building integrations against contract-lifecycle-management platforms (Ironclad, Juro, LinkSquares, ContractPodAi, Agiloft) plus the Python or TypeScript glue between CLM, the data warehouse, the firm's CMRR / cash-pacing dashboards, and the legal-ops admin surfaces. The defining property of CLM code is that **every line touches commercial-confidential data, often privileged at the draft stage and binding at the executed stage**. Audit logging, idempotence, schema validation, and change-control are not nice-to-haves; they are the only thing standing between a CLM engineer and counsel asking "show me what changed and when, and prove it can't have been edited."
## Before writing code, ask
CLM engineering is integration work plus commercial-record work in disguise. Before generating any script that touches a CLM, confirm:
1. **What contract data is involved?** Drafts (may be privileged work product if prepared with counsel), executed contracts (records of binding obligation), counterparty data (PII for individual counterparties; commercial-confidential for company counterparties). Different retention rules and consent posture per category. If the user can't name the data class, stop and ask.
2. **What jurisdictions are involved?** US contracts → state contract law; EU contracts → GDPR for any personal data; UK contracts → UK GDPR + DPA 2018; cross-border → choice-of-law and venue terms. The right answer depends on this.
3. **Read or write?** Default is read. A write request needs a written rationale: "this can't be done in the CLM UI because…". If the answer is "it would be faster," that's not sufficient. Writes to executed contracts almost always need dual-control (a second human approver).
4. **What happens on retry?** CLM webhooks retry on timeouts and ambiguous 5xx responses. If the same payload arrives twice, what ends up in the system? If the answer isn't "the same as if it arrived once," the code is wrong.
5. **Where does the audit log entry land?** Not "we'll add logging later." Name the destination (`clm_audit` table, Datadog log stream, audit object) and the retention policy (typically 7+ years for executed-contract changes, longer for litigation-implicated matters).
If any answer is missing, ask. Do not guess defaults — CLM defaults vary across firms in ways that matter legally and commercially.
## Tool-specific guidance
### Ironclad
Doc URL: https://developer.ironcladapp.com/ · last_verified: 2026-04-01
- Workflow IDs are GUIDs (not integers). Don't assume ordering.
- Pagination: cursor-based via `nextPageToken` in the response. Loop until absent.
- Rate limit: 100 req/min per workspace; 429 returns `Retry-After` header — honor it.
- Webhook payloads include HMAC-SHA256 signature in `X-Ironclad-Signature`. Verify on every receive.
- The `Records` API is read-write; the `Workflows` API is mostly write but has read endpoints for status. Don't conflate.
- `metadata.fields` is a flat list of `{name, value}` objects. Don't assume field order.
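
A minimal Python sketch of the cursor-based pagination loop described above. `fetch_page` is a hypothetical callable standing in for the HTTP request, and the `items` key for the record list is an assumption; only `nextPageToken` comes from the notes above, so check the real response shape against the vendor docs.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_records(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield records page by page until the response has no nextPageToken."""
    token: Optional[str] = None
    seen = set()
    while True:
        page = fetch_page(token)
        # "items" is an assumed key name for the record list.
        yield from page.get("items", [])
        token = page.get("nextPageToken")
        if not token:
            return
        if token in seen:
            # A repeated cursor would loop forever; surface it instead.
            raise RuntimeError(f"pagination cursor repeated: {token!r}")
        seen.add(token)


# Fake three-page response, used only to exercise the loop.
FAKE_PAGES: Dict[Optional[str], Page] = {
    None: {"items": [{"id": "a"}, {"id": "b"}], "nextPageToken": "t1"},
    "t1": {"items": [{"id": "c"}], "nextPageToken": "t2"},
    "t2": {"items": [{"id": "d"}]},
}

record_ids = [r["id"] for r in iter_records(FAKE_PAGES.__getitem__)]
```

Injecting `fetch_page` keeps the loop testable without live API calls, which fits the testing defaults later in this file.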
### Juro
Doc URL: https://docs.juro.com/ · last_verified: 2026-04-01
- API is GraphQL. Pagination via `pageInfo.endCursor`.
- Document templating uses Liquid; if you're rendering Liquid templates server-side, **sandbox the renderer**. Don't `eval` template strings.
- Webhook signatures: HMAC-SHA256 in `X-Juro-Signature`.
- Rate limit: 60 req/min for the standard plan; higher tiers vary.
- The `documentVersions` query lets you reconstruct edit history. Useful for audit; expensive — paginate.
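
A hedged sketch of HMAC-SHA256 webhook verification as called for in the Ironclad and Juro notes. It assumes the signature header carries a hex digest of the raw request body; the actual encoding (hex vs. base64, any prefix) is an assumption to confirm in each vendor's docs. The function name and demo values are illustrative.

```python
import hashlib
import hmac


def verify_webhook_signature(secret: bytes, raw_body: bytes, signature_header: str) -> bool:
    """Return True when the header matches HMAC-SHA256(secret, raw_body).

    Assumes a hex-encoded digest in the header (an assumption, not vendor-confirmed).
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


# Demo values only; a real secret comes from the secret manager.
demo_secret = b"demo-secret"
demo_body = b'{"event": "workflow_completed"}'
demo_signature = hmac.new(demo_secret, demo_body, hashlib.sha256).hexdigest()

valid = verify_webhook_signature(demo_secret, demo_body, demo_signature)
tampered = verify_webhook_signature(demo_secret, demo_body + b" ", demo_signature)
```

Verify against the raw bytes as received; re-serializing parsed JSON can change whitespace and break the comparison.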
### LinkSquares
Doc URL: https://help.linksquares.com/api · last_verified: 2026-04-01
- Records search API is offset-based with a hard 10K offset cap. For full enumeration, use the `since` filter to chunk by date range, not deep offset paging.
- Custom fields appear in `metadata.custom` as `{key, value, type}`. Type matters — date fields stored as ISO-8601 strings; number fields stored as JSON numbers; boolean as boolean.
- Rate limit: 50 req/sec per API key.
- Authentication: bearer token. Rotate quarterly.
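
One way to chunk enumeration by date range instead of deep offsets, sketched under assumptions: `search` is a hypothetical callable wrapping the Records search request, and the upper-bound (`until`) argument is illustrative, since the notes above only name the `since` filter. The sketch fails loudly when a window would reach the offset cap, so the window can be narrowed.

```python
from datetime import date, timedelta
from typing import Any, Callable, Dict, Iterator, List, Tuple

OFFSET_CAP = 10_000  # hard offset cap noted above


def date_windows(start: date, end: date, days: int) -> Iterator[Tuple[date, date]]:
    """Yield contiguous half-open [since, until) windows covering [start, end)."""
    if days <= 0:
        raise ValueError("days must be positive")
    cursor = start
    while cursor < end:
        nxt = min(cursor + timedelta(days=days), end)
        yield cursor, nxt
        cursor = nxt


def enumerate_records(
    search: Callable[[date, date, int, int], List[Dict[str, Any]]],
    start: date,
    end: date,
    days: int = 30,
    page_size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Page within each window; stop rather than cross the offset cap."""
    for since, until in date_windows(start, end, days):
        offset = 0
        while True:
            if offset >= OFFSET_CAP:
                raise RuntimeError(f"window {since}..{until} too large; narrow it")
            batch = search(since, until, offset, page_size)
            yield from batch
            if len(batch) < page_size:
                break
            offset += page_size


# In-memory stand-in for the API, used only to exercise the loop.
RECORDS = [{"id": i, "created": date(2024, 1, 1) + timedelta(days=i)} for i in range(45)]


def fake_search(since: date, until: date, offset: int, limit: int) -> List[Dict[str, Any]]:
    hits = [r for r in RECORDS if since <= r["created"] < until]
    return hits[offset:offset + limit]


all_ids = [
    r["id"]
    for r in enumerate_records(fake_search, date(2024, 1, 1), date(2024, 3, 1), days=10, page_size=4)
]
```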
### ContractPodAi
Doc URL: https://contractpodai.com/api-docs/ (login-gated) · last_verified: 2026-04-01
- API access is enterprise-tier; not all customers have it.
- Workflow IDs are integers; per-instance unique. Don't assume cross-instance stability.
- Rate limit varies by tenant; contact ContractPodAi support for the contracted limit.
### Agiloft
Doc URL: https://wiki.agiloft.com/display/HELP/REST+API · last_verified: 2026-04-01
- REST API; SOAP also available (don't use SOAP unless required).
- Per-table queries; the table schema is the source of truth.
- Authentication: session-cookie (not stateless). Cookie expires; handle refresh.
### MCP servers for CLM tools
- Default to read-only tool definitions. Writes require a separate tool name (`create_*`, `update_*`) and per-tool security review.
- Never expose `delete_*` tools through MCP. Deletes happen in the CLM UI, with the audit trail the UI produces.
- Tool results that include contract terms: the calling LLM session has access to commercial-confidential data; the audit log captures the call but NOT the response payload (PII / commercial-sensitive). Audit `(timestamp, user, tool, contract_id)` only, not field values.
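
A small sketch of the two MCP guards above, independent of any MCP SDK: a name check that refuses `delete_*` tools, and an audit wrapper that records `(timestamp, user, tool, contract_id)` but never the response payload. All function names here are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List

WRITE_PREFIXES = ("create_", "update_")


def tool_access(name: str) -> str:
    """Classify a tool by name; delete_* tools are refused outright."""
    if name.startswith("delete_"):
        raise ValueError(f"{name}: delete tools are not exposed through MCP")
    return "write" if name.startswith(WRITE_PREFIXES) else "read"


def audited_call(
    audit_log: List[Dict[str, str]],
    user: str,
    tool: str,
    contract_id: str,
    call: Callable[[], Any],
) -> Any:
    """Log call metadata first (so failed calls are audited too), then run it."""
    audit_log.append(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "tool": tool,
            "contract_id": contract_id,
        }
    )
    # The result goes back to the caller and is deliberately not logged.
    return call()


log: List[Dict[str, str]] = []
result = audited_call(
    log, "jane@example.com", "get_contract", "c-123", lambda: {"renewal_date": "2026-01-01"}
)
```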
## Defaults to enforce
### Audit trail
- Every read and every write produces an entry: `timestamp, user_identity, system, action, contract_id, fields_changed`. No exceptions.
- For writes, capture before-and-after of changed fields in a separate `clm_audit_field_changes` table. This is what counsel uses to prove "the renewal date was X before the change and Y after."
- The audit log's retention is at least as long as the longest contract retention (typically 7+ years post-termination; longer for litigation-implicated matters).
- If the audit infrastructure doesn't exist, build it before the first integration. Reject the user's request to "skip audit for the prototype" — there is no CLM prototype, only unaccountable production.
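
A minimal sketch of the two audit tables, using `sqlite3` as a stand-in for the firm's actual audit store. Column names follow the entry fields listed above; the `before_value` / `after_value` columns and the `record_audit` helper are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple

AUDIT_SCHEMA = """
CREATE TABLE IF NOT EXISTS clm_audit (
    id INTEGER PRIMARY KEY,
    timestamp TEXT NOT NULL,
    user_identity TEXT NOT NULL,
    system TEXT NOT NULL,
    action TEXT NOT NULL,
    contract_id TEXT NOT NULL,
    fields_changed TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS clm_audit_field_changes (
    audit_id INTEGER NOT NULL REFERENCES clm_audit(id),
    field TEXT NOT NULL,
    before_value TEXT,
    after_value TEXT
);
"""

Changes = Dict[str, Tuple[Optional[str], Optional[str]]]


def record_audit(
    conn: sqlite3.Connection,
    user_identity: str,
    system: str,
    action: str,
    contract_id: str,
    changes: Optional[Changes] = None,
) -> int:
    """Write one audit entry plus before/after rows for any changed fields."""
    changes = changes or {}
    with conn:  # single transaction: the entry and its field rows commit together
        cur = conn.execute(
            "INSERT INTO clm_audit (timestamp, user_identity, system, action,"
            " contract_id, fields_changed) VALUES (?, ?, ?, ?, ?, ?)",
            (
                datetime.now(timezone.utc).isoformat(),
                user_identity,
                system,
                action,
                contract_id,
                ",".join(sorted(changes)),
            ),
        )
        audit_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO clm_audit_field_changes (audit_id, field, before_value,"
            " after_value) VALUES (?, ?, ?, ?)",
            [(audit_id, field, before, after) for field, (before, after) in changes.items()],
        )
    return audit_id


conn = sqlite3.connect(":memory:")
conn.executescript(AUDIT_SCHEMA)
read_id = record_audit(conn, "jane@example.com", "ironclad-sync", "read", "c-123")
write_id = record_audit(
    conn, "jane@example.com", "ironclad-sync", "update", "c-123",
    {"renewal_date": ("2025-01-01", "2026-01-01")},
)
```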
### Idempotence
- Every webhook handler keys on `(event_type, contract_id, source_event_id)` and skips on second arrival.
- Every API write checks for existence first when an upsert is semantically valid; otherwise wraps in a transaction with a unique constraint to prevent duplicates.
- Cron-scheduled syncs tolerate replay. Two runs in a 5-minute window produce the same DB state as one run.
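
The webhook rule above can be sketched with a unique key on `(event_type, contract_id, source_event_id)`. This uses `sqlite3` as a stand-in database; the `processed_events` and `applied` table names and the `handle_once` helper are hypothetical. The side effect runs in the same transaction as the key insert, so a failed handler rolls the key back and a retry is processed normally.

```python
import sqlite3
from typing import Callable, Dict

SCHEMA = """
CREATE TABLE IF NOT EXISTS processed_events (
    event_type TEXT NOT NULL,
    contract_id TEXT NOT NULL,
    source_event_id TEXT NOT NULL,
    PRIMARY KEY (event_type, contract_id, source_event_id)
);
CREATE TABLE IF NOT EXISTS applied (contract_id TEXT NOT NULL);
"""


def handle_once(
    conn: sqlite3.Connection,
    event: Dict[str, str],
    apply: Callable[[sqlite3.Connection, Dict[str, str]], None],
) -> bool:
    """Apply the event once; return False when it was already processed."""
    key = (event["event_type"], event["contract_id"], event["source_event_id"])
    with conn:  # commits on success, rolls back if apply() raises
        try:
            conn.execute("INSERT INTO processed_events VALUES (?, ?, ?)", key)
        except sqlite3.IntegrityError:
            return False  # second arrival of the same event: skip
        apply(conn, event)
    return True


def apply_event(conn: sqlite3.Connection, event: Dict[str, str]) -> None:
    # Stand-in side effect; real code would update the firm's own tables.
    conn.execute("INSERT INTO applied (contract_id) VALUES (?)", (event["contract_id"],))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
event = {"event_type": "contract.signed", "contract_id": "c-123", "source_event_id": "evt-1"}
first = handle_once(conn, event, apply_event)
second = handle_once(conn, event, apply_event)
```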
### Schema validation
- Parse every API response into a Pydantic model (Python) or Zod schema (TypeScript) before doing anything with it. Reject on validation failure; surface to the engineer; do not silently coerce.
- CLM vendors ship breaking changes; the schema is your canary. Failed validation is more valuable than silent corruption.
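
To illustrate the reject-don't-coerce behavior without third-party dependencies, here is a stdlib-only stand-in for the Pydantic model the rule calls for; it is not Pydantic. The field names (`contract_id`, `status`, `renewal_date`) and the allowed statuses are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Mapping, Optional

ALLOWED_STATUSES = {"draft", "executed", "terminated"}  # hypothetical values


class SchemaError(ValueError):
    """Raised when an API payload does not match the expected shape."""


@dataclass(frozen=True)
class ContractRecord:
    contract_id: str
    status: str
    renewal_date: Optional[date]


def parse_contract(payload: Mapping[str, Any]) -> ContractRecord:
    """Validate strictly; never coerce a wrong type into a plausible value."""
    contract_id = payload.get("contract_id")
    if not isinstance(contract_id, str) or not contract_id:
        raise SchemaError(f"contract_id must be a non-empty string, got {contract_id!r}")
    status = payload.get("status")
    if status not in ALLOWED_STATUSES:
        raise SchemaError(f"unexpected status {status!r}")
    raw_date = payload.get("renewal_date")
    if raw_date is None:
        renewal = None
    elif isinstance(raw_date, str):
        try:
            renewal = date.fromisoformat(raw_date)
        except ValueError as exc:
            raise SchemaError(f"renewal_date is not an ISO date: {raw_date!r}") from exc
    else:
        raise SchemaError(f"renewal_date must be a string or null, got {raw_date!r}")
    return ContractRecord(contract_id, status, renewal)


record = parse_contract(
    {"contract_id": "c-123", "status": "executed", "renewal_date": "2026-01-31"}
)
```

In a real integration, a Pydantic model in strict mode or a Zod schema would replace the hand-written checks; the point of the sketch is that a numeric `contract_id` or malformed date raises instead of passing through.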
### Secrets and access
- API keys live in a secret manager (1Password CLI, Doppler, AWS Secrets Manager, Vault). Never inline. Never in `.env` committed to git.
- Separate keys for read scope and write scope. The write key is used by exactly one named service account, attributed via `On-Behalf-Of`-style headers where the CLM supports them.
- Tokens have a documented rotation cadence (quarterly default). Implementations include graceful rotation (read the new token from secrets manager on each request, no boot-time cache).
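
A short sketch of the graceful-rotation rule: the token is read on every request rather than cached at boot. `load_secret` is a hypothetical callable wrapping whichever secret manager the firm uses, and the bearer-style `Authorization` header is an assumption that will not fit every CLM.

```python
from typing import Callable, Dict


class PerRequestToken:
    """Builds auth headers by reading the secret at call time, never at boot."""

    def __init__(self, load_secret: Callable[[str], str], secret_name: str) -> None:
        self._load_secret = load_secret
        self._secret_name = secret_name

    def headers(self) -> Dict[str, str]:
        token = self._load_secret(self._secret_name)
        if not token:
            raise RuntimeError(f"secret {self._secret_name!r} is empty")
        return {"Authorization": f"Bearer {token}"}


# A dict stands in for the secret manager in this demo.
fake_store = {"clm/read-token": "old-token"}
auth = PerRequestToken(fake_store.__getitem__, "clm/read-token")

before_rotation = auth.headers()
fake_store["clm/read-token"] = "new-token"  # simulate a rotation
after_rotation = auth.headers()
```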
### Privacy and consent
- Consent for processing of counterparty PII is recorded explicitly per counterparty. If the firm processes counterparty individuals' personal data (e.g. notice contacts in EU contracts), GDPR Art. 6 lawful basis applies.
- Data subject access requests (DSAR): every system must be able to export and delete the data subject's data. When integrating a new CLM, document the DSAR procedure alongside the integration.
- Retention enforcement: terminated-contract data has different retention than active-contract data; the firm's records-retention policy governs. Code that backfills old contracts must respect retention.
### Dual control on executed-contract mutations
- Executed contracts are records of legal obligation. Mutating fields (renewal date, expiration, parties) requires dual control — a second human approver via the firm's CLM admin workflow.
- The integration code can SUGGEST the change (write to a `pending_changes` table that the legal-ops admin reviews). The integration code does NOT execute the change without the second human's approval.
- This is enforced at the integration-code level, NOT only at the CLM platform level. Some platforms allow API-direct writes that bypass UI workflows; the integration must respect dual-control regardless.
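
A hedged sketch of the suggest-then-approve flow, with `sqlite3` standing in for the firm's database. Only the `pending_changes` table name comes from the rules above; its columns and the helper functions are assumptions. The key check is that the approver must differ from the requester.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS pending_changes (
    id INTEGER PRIMARY KEY,
    contract_id TEXT NOT NULL,
    field TEXT NOT NULL,
    proposed_value TEXT NOT NULL,
    requested_by TEXT NOT NULL,
    approved_by TEXT,
    status TEXT NOT NULL DEFAULT 'pending'
);
"""


def suggest_change(conn, contract_id: str, field: str, proposed_value: str, requested_by: str) -> int:
    """Record a proposed change; nothing is written to the CLM here."""
    with conn:
        cur = conn.execute(
            "INSERT INTO pending_changes (contract_id, field, proposed_value, requested_by)"
            " VALUES (?, ?, ?, ?)",
            (contract_id, field, proposed_value, requested_by),
        )
    return cur.lastrowid


def approve_change(conn, change_id: int, approver: str) -> None:
    """Mark a change approved, enforcing that a second human signs off."""
    row = conn.execute(
        "SELECT requested_by, status FROM pending_changes WHERE id = ?", (change_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no pending change {change_id}")
    requested_by, status = row
    if status != "pending":
        raise ValueError(f"change {change_id} is already {status}")
    if approver == requested_by:
        raise PermissionError("dual control: approver must differ from requester")
    with conn:
        conn.execute(
            "UPDATE pending_changes SET status = 'approved', approved_by = ? WHERE id = ?",
            (approver, change_id),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
change_id = suggest_change(conn, "c-123", "renewal_date", "2026-01-01", "integration-bot")
try:
    approve_change(conn, change_id, "integration-bot")
    self_approved = True
except PermissionError:
    self_approved = False
approve_change(conn, change_id, "legal-ops-admin@example.com")
```

Only rows with `status = 'approved'` would be eligible for the actual CLM write, which stays a separate, audited step.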
### Testing
- All integration tests run against CLM staging instances or vendor-provided sandboxes. Production has real contracts.
- Mock at the HTTP boundary in unit tests. CI runs zero live API calls against production.
- Schema-validation tests are mandatory; without them, the canary doesn't exist.
## Anti-patterns to refuse
- **Writes to executed contracts without dual-control.** Even if the CLM platform allows it.
- **Auto-approving workflow steps based on inbound data.** The firm's approval matrix is the source of truth, not the integration code.
- **Hard-coded contract IDs in production code.** Load from config; contract IDs drift across migrations.
- **Swallowing API errors silently.** Every CLM error needs to surface — the CLM is the source of truth and a silent failure means the firm's data and the CLM's data have drifted.
- **Logging contract-field values to general-purpose log streams.** Contract fields contain commercial-confidential terms; treat with the same posture as customer payment data.
- **Generating reports from CLM data without legal-ops review.** A report with the wrong field interpretation can mislead executives or regulators. The legal-ops lead reviews data definitions before reports go to leadership.
## When the user is wrong
The user might push for shortcuts. Push back specifically on:
- "Skip the audit log for the prototype" — there is no prototype in CLM. Production has real legal records.
- "Bypass dual-control for this one update" — the dual-control IS the control. The exception is the failure mode.
- "Just hard-code the contract ID" — those silently break on migration. Load from config.
- "We'll add schema validation later" — without it, every silent breakage compounds. Build it first.
- "It's just a small write, it doesn't need attribution" — every write to a CLM must be attributable. The audit log is what makes commercial-record changes defensible.
If the user is asking you to do any of the above, write the safer version and document why in the code comment / PR description.