MemoryNode API usage

Canonical HTTP reference for the memorynode-api Cloudflare Worker. The source of truth for behavior is the code in apps/api/src/. docs/external/openapi.yaml (regenerate with pnpm openapi:gen) documents the public product surface; internal, operator, and advanced routes may still exist in the Worker but are omitted from that contract.

Production base: https://api.memorynode.ai
Hosted MCP base: https://mcp.memorynode.ai (same Worker)
Staging base: https://api-staging.memorynode.ai
Local dev: http://127.0.0.1:8787

Current product focus is workflow outcomes for a primary cluster of AI-native products:

SaaS personalization copilots
AI companion products
Creator tools with persistent context

Developer agents are a secondary segment. Advanced surfaces (connectors, audit exports, search history, admin routes, etc.) may still exist on the Worker for operators and power users; they are intentionally omitted from the published OpenAPI contract — see section 9.

1. Authentication

Dispatched in apps/api/src/auth.ts and apps/api/src/workerApp.ts.

Mode	Trigger	Validation	Uses
API key (K)	`Authorization: Bearer <key>` or `x-api-key: <key>`	SHA-256 of key + `API_KEY_SALT` then `authenticate_api_key(p_key_hash)` RPC	All `/v1/*` tenant routes
Dashboard session (S)	`Cookie: mn_dash_session=<opaque>` + `x-csrf-token`	Row in `dashboard_sessions` + CSRF double-submit	`/v1/*` tenant routes from browser
Admin (A)	`x-admin-token`	Equality against `MASTER_ADMIN_TOKEN` or HMAC-SHA256 signed form; optional IP allowlist (`ADMIN_ALLOWED_IPS`); `ADMIN_BREAK_GLASS`	`/admin/`, `/v1/admin/`
PayU webhook (H)	form body	Reverse SHA-512 over PayU fields, or HMAC-SHA256 fallback over raw body keyed by `PAYU_WEBHOOK_SECRET` (`x-payu-webhook-signature`)	`POST /v1/billing/webhook` only
Internal MCP (I)	`x-internal-mcp: 1` + `x-internal-secret: <MCP_INTERNAL_SECRET>`	Constant-time compare	Internal hosted-MCP → REST subrequests
Public (P)	—	—	`/healthz`, `/ready`, `/v1/health`
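
As an illustration of the tenant-facing modes in the table, the sketch below builds request headers for an API key or a dashboard session. The type and function names (`MnCredential`, `buildAuthHeaders`) are hypothetical and not part of the Worker; only the header names come from the table.

```typescript
// Credential shapes follow the table above; names are illustrative only.
type MnCredential =
  | { kind: "bearer"; key: string }
  | { kind: "x-api-key"; key: string }
  | { kind: "dashboard"; sessionCookie: string; csrfToken: string };

function buildAuthHeaders(cred: MnCredential): Record<string, string> {
  switch (cred.kind) {
    case "bearer":
      return { Authorization: `Bearer ${cred.key}` };
    case "x-api-key":
      return { "x-api-key": cred.key };
    case "dashboard":
      // Browsers attach the cookie automatically; it is set explicitly here
      // so the same helper can be used from non-browser test clients.
      return {
        Cookie: `mn_dash_session=${cred.sessionCookie}`,
        "x-csrf-token": cred.csrfToken,
      };
  }
}
```

Either API key form should reach the same validation path, so a client only needs one of the two headers.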

API keys created through POST /v1/api-keys are rate-limited at 15 RPM for the first 48 h after api_keys.created_at; after that the default is 60 RPM per key. See packages/shared/src/plans.ts.
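
The new-key throttle can be sketched as a pure function of key age. This is a minimal illustration, not the code in plans.ts; the constant and function names are hypothetical, and treating exactly 48 h as "no longer new" is an assumption.

```typescript
const NEW_KEY_WINDOW_MS = 48 * 60 * 60 * 1000; // 48 h after api_keys.created_at
const NEW_KEY_RPM = 15;
const DEFAULT_KEY_RPM = 60;

// Returns the per-key requests-per-minute limit that applies at `now`.
function perKeyRpm(createdAt: Date, now: Date): number {
  const ageMs = now.getTime() - createdAt.getTime();
  // Assumption: the boundary is exclusive, so a key aged exactly 48 h gets 60 RPM.
  return ageMs < NEW_KEY_WINDOW_MS ? NEW_KEY_RPM : DEFAULT_KEY_RPM;
}
```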

Webhook forwarding uses internal headers after signature verification, but that internal token path is route-bound to POST /v1/memories and is not accepted as a general-purpose auth bypass on other routes.

2. Middleware order

From handleRequestImpl in apps/api/src/workerApp.ts:

Resolve the request id; assemble CORS and security headers.
Short-circuit health endpoints (/healthz, /ready, /v1/health).
CORS deny if Origin is not in ALLOWED_ORIGINS (except hosted MCP paths).
enforceRuntimeConfigGuards and ensureRateLimitDo.
OPTIONS short-circuit.
assertBodySize (bounded by MAX_BODY_BYTES / MAX_IMPORT_BYTES).
KNOWN_PATH_ALLOWED_METHODS 405 gate with Allow: header.
Production dashboard ALLOWED_ORIGINS gate.
createSupabaseClient(env) + db_access_path_selected log.
Hosted MCP path → IP rate limit → handleHostedMcpRequest.
Dashboard session POST/logout inline.
Enforce route-level auth and handler dispatch via route().
Build handlerDeps, instantiate factories, call route().
404 if route() returns null.
Catch: ApiError-shaped response with error_code; else 500 INTERNAL.
Finally: emitAuditLog, request_completed structured log, persistApiRequestEvent.
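
The 405 gate in the list above might look roughly like the following. This is a sketch under assumptions: the map below is a hypothetical three-route subset, not the real KNOWN_PATH_ALLOWED_METHODS table, and the result shape is invented for illustration.

```typescript
// Hypothetical subset; the real table lives in the Worker source.
const allowedMethodsByPath: Record<string, string[]> = {
  "/v1/search": ["POST"],
  "/v1/usage/today": ["GET"],
  "/v1/memories": ["GET", "POST"],
};

type GateResult =
  | { allowed: true }
  | { allowed: false; httpStatus: 405; headers: { Allow: string } };

function methodGate(path: string, method: string): GateResult {
  const methods = allowedMethodsByPath[path];
  // Unknown paths pass through so a later step can answer 404.
  if (methods === undefined || methods.indexOf(method.toUpperCase()) !== -1) {
    return { allowed: true };
  }
  // Known path, wrong method: advertise what is accepted via Allow.
  return { allowed: false, httpStatus: 405, headers: { Allow: methods.join(", ") } };
}
```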

3. Error envelope

Errors emitted by the outer catch in handleRequestImpl (apps/api/src/workerApp.ts ~1379–1407) use:

{ "error": { "code": "STRING_CODE", "message": "human readable" } }

Typical codes: BAD_REQUEST, UNAUTHORIZED, FORBIDDEN, NOT_FOUND, PAYLOAD_TOO_LARGE, RATE_LIMITED, CAP_EXCEEDED, TRIAL_EXPIRED, COST_BUDGET_EXCEEDED, CONTROL_PLANE_ONLY, INTERNAL. Correlate responses with the x-request-id header.
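
On the client side, the envelope can be read defensively before branching on the code. The helper below is a minimal sketch; `MnApiError` and `parseErrorEnvelope` are hypothetical names, and only the `{ error: { code, message } }` shape comes from this section.

```typescript
interface MnApiError {
  code: string;
  message: string;
}

// Returns the error payload when `body` matches the envelope, else null.
function parseErrorEnvelope(body: unknown): MnApiError | null {
  if (typeof body !== "object" || body === null) return null;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return null;
  const { code, message } = err as { code?: unknown; message?: unknown };
  return typeof code === "string" && typeof message === "string"
    ? { code, message }
    : null;
}
```

Logging the parsed code together with x-request-id gives support enough to find the matching request.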

4. Rate limits, concurrency, quotas

All are defined in apps/api/src/auth.ts, apps/api/src/usage/quotaReservation.ts, and packages/shared/src/plans.ts.

Control	Default	Source
Per-key RPM	60 (15 for new keys, 48 h)	`RATE_LIMIT_MAX`, `RATE_LIMIT_RPM_NEW_KEY`
Per-workspace RPM	120 (300 for `scale`)	`WORKSPACE_RPM_DEFAULT`, `WORKSPACE_RPM_SCALE`
Per-workspace in-flight	8	`WORKSPACE_CONCURRENCY_MAX`, TTL 30000 ms
Cost/minute burst	15 INR	`WORKSPACE_COST_PER_MINUTE_CAP_INR`
Daily and period caps	atomic via Postgres	`reserve_usage_if_within_cap` / `commit_usage_reservation`
Global AI cost budget	fail-closed (prod)	`AI_COST_BUDGET_INR`, 60 s cache

When entitlement verification is unavailable, quota-consuming routes move into temporary read-only degradation (ENTITLEMENT_DEGRADED, HTTP 503) until billing checks recover.
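
A client might separate transient limits from hard caps along these lines. This is a sketch: which codes are worth retrying is an assumption (RATE_LIMITED and ENTITLEMENT_DEGRADED are treated as transient, everything else as final), and the backoff parameters are arbitrary.

```typescript
// Assumption: only these two codes describe conditions that clear on their own.
const RETRYABLE_CODES = ["RATE_LIMITED", "ENTITLEMENT_DEGRADED"];

// Exponential backoff delay for a retryable code, or null when retrying is pointless.
function retryDelayMs(
  code: string,
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
): number | null {
  if (RETRYABLE_CODES.indexOf(code) === -1) return null;
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

Codes such as CAP_EXCEEDED or TRIAL_EXPIRED return null here because waiting does not change the outcome until the plan or period changes.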

Chat completions provider (Worker)

Optional multi-vendor chat routing (default OpenAI): CHAT_PROVIDER = openai | anthropic | gemini in apps/api/src/env.ts. Keys: ANTHROPIC_API_KEY, GEMINI_API_KEY (Google AI Studio); OPENAI_API_KEY still powers embeddings when EMBEDDINGS_MODE=openai and powers chat when CHAT_PROVIDER=openai. Optional CHAT_MODEL overrides the default cheap model for the active chat provider. Implementation: apps/api/src/llm/chatComplete.ts.
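
The provider selection described above could be resolved from the environment roughly as follows. This is a sketch, not the logic in chatComplete.ts: `ChatEnv`, `isChatProvider`, and `resolveChatProvider` are hypothetical names, and failing fast on a missing key is an assumption about desired behavior.

```typescript
type ChatProvider = "openai" | "anthropic" | "gemini";

interface ChatEnv {
  CHAT_PROVIDER?: string;
  CHAT_MODEL?: string;
  OPENAI_API_KEY?: string;
  ANTHROPIC_API_KEY?: string;
  GEMINI_API_KEY?: string;
}

const KEY_VAR_BY_PROVIDER: Record<ChatProvider, keyof ChatEnv> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  gemini: "GEMINI_API_KEY",
};

function isChatProvider(value: string): value is ChatProvider {
  return value === "openai" || value === "anthropic" || value === "gemini";
}

function resolveChatProvider(
  env: ChatEnv,
): { provider: ChatProvider; keyVar: string; model: string | undefined } {
  const raw = env.CHAT_PROVIDER ?? "openai"; // OpenAI is the documented default
  if (!isChatProvider(raw)) throw new Error(`Unsupported CHAT_PROVIDER: ${raw}`);
  const keyVar = KEY_VAR_BY_PROVIDER[raw];
  // Assumption: a missing key for the active provider is a configuration error.
  if (!env[keyVar]) throw new Error(`${keyVar} is required when CHAT_PROVIDER=${raw}`);
  return { provider: raw, keyVar, model: env.CHAT_MODEL };
}
```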

Embeddings are not swapped when changing CHAT_PROVIDER. The memory_chunks table stores embedding_provider, embedding_dimension, and embedding_version per chunk; vector search only compares rows matching the query embedding dimension (default 1536 for legacy OpenAI/stub). No bulk re-embed is required for existing data.

Memory lifecycle intelligence (automatic)

MemoryNode applies deterministic lifecycle rules on ingest, search, and nightly hygiene — no extra routes beyond normal writes and search.

Signal	Behavior
Dedupe	Canonical hash, normalized near-text, or embedding similarity can return an existing `memory_id` with `deduped: true` and optional `dedupe_kind` (`near`, `semantic`)
Confidence / volatility	Persisted on the memory row; used in Worker ranking (not in raw SQL recall scores)
Ephemeral TTL	High-volatility text may get `expires_at`; expired rows move to `archived` via the hygiene job
Supersession	`replaces_memory_id`, conflict resolution, evolution merges, and admin hygiene set `lifecycle_state: replaced`
Retrieval learning	Search/context bump learning signals; `POST /v1/feedback` applies explicit positive/negative using the `x-request-id` from the search call

Implementation: apps/api/src/memories/memoryPolicyEngine.ts, apps/api/src/pipelines/search/ranking.ts, migrations 076–078.

What shows up in HTTP responses today:

Surface	Fields
`POST /v1/memories` (dedupe hit)	`{ memory_id, stored: true, deduped: true, dedupe_kind? }`
`POST /v1/memories` (new row)	`memory_id`, `stored`, `chunks`, `extraction`, optional `intelligence.conflict_state`, optional `trace.memory_evolution_*`, optional `superseded_memory_id`
`POST /v1/search`	`results`, pagination — no `retrieval_trace` in the JSON body
`POST /v1/context`	`context_text`, `citations`, pagination — same search pipeline, assembled for prompts
`POST /v1/feedback`	`{ feedback: "positive" | "negative", request_id }`; `request_id` is the `x-request-id` header from search/context
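
A caller can tell a dedupe hit from a fresh write by checking `deduped`. The sketch below types only the fields named in the table; the interface name, the string type for `memory_id`, and the "exact" label used when `dedupe_kind` is absent are assumptions.

```typescript
interface CreateMemoryResponse {
  memory_id: string; // assumed to be a string identifier
  stored: boolean;
  deduped?: boolean;
  dedupe_kind?: "near" | "semantic";
  chunks?: unknown;
  extraction?: unknown;
  superseded_memory_id?: string;
}

// Summarises what a POST /v1/memories response means for the caller.
function describeWrite(res: CreateMemoryResponse): string {
  if (res.deduped === true) {
    // Assumption: a hit without dedupe_kind came from the canonical hash.
    return `reused ${res.memory_id} (${res.dedupe_kind ?? "exact"} match)`;
  }
  if (res.superseded_memory_id !== undefined) {
    return `created ${res.memory_id}, superseding ${res.superseded_memory_id}`;
  }
  return `created ${res.memory_id}`;
}
```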

Retrieval architecture: Postgres RPCs (match_chunks_vector, match_chunks_text) perform recall only (similarity / ts_rank + filters). Confidence, lifecycle, type, learning, and freshness ranking run in the Worker after RRF fusion (078_search_recall_only_rpcs.sql).
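
For readers unfamiliar with RRF, here is a minimal sketch of reciprocal rank fusion over two recall lists. It is not the Worker's ranking code: the function names are hypothetical, the constant k = 60 is a conventional default rather than a value taken from the source, and the later confidence, lifecycle, and freshness adjustments are omitted.

```typescript
// Each ranking is a list of chunk ids ordered best-first (e.g. vector, then text recall).
function rrfFuse(rankings: string[][], k: number = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based in the RRF formula
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// Ids sorted by fused score, highest first.
function fusedOrder(scores: Map<string, number>): string[] {
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

An id that appears in both lists accumulates two reciprocal terms, which is why items with moderate rank in both lists can outrank an item that is first in only one.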

Debugging without custom tooling:

Log x-request-id on every call (support and POST /v1/feedback correlation).
GET /v1/pruning/metrics — duplicate/chunk counts per workspace.
GET /v1/audit/log — paginated API audit rows.
The search body field explain is accepted by the schema but not honored on POST /v1/search (the handler always disables it today). Do not rely on ranking explain payloads in production integrations.
Chunk rows store embedding_provider, embedding_dimension, embedding_version (default 1536-dim OpenAI/stub behavior).
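
Because feedback is correlated through x-request-id, a client has to carry that header value from the search or context response into the feedback body. A minimal sketch, with a hypothetical helper name and a plain header map standing in for whatever HTTP client is in use:

```typescript
type FeedbackSignal = "positive" | "negative";

// Builds the POST /v1/feedback body from the headers of an earlier search/context response.
function buildFeedbackBody(
  searchResponseHeaders: Record<string, string>,
  feedback: FeedbackSignal,
): { feedback: FeedbackSignal; request_id: string } {
  // Header names are case-insensitive, so compare in lower case.
  let requestId: string | undefined;
  for (const key of Object.keys(searchResponseHeaders)) {
    if (key.toLowerCase() === "x-request-id") requestId = searchResponseHeaders[key];
  }
  if (!requestId) throw new Error("search response carried no x-request-id");
  return { feedback, request_id: requestId };
}
```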

5. Plans

From packages/shared/src/plans.ts. PlanId = "launch" | "build" | "deploy" | "scale".

Plan	INR	Period (d)	Writes	Reads	Embed tok	Gen tok	Storage GB	Retention (d)	Workspace RPM
launch	399	7	250	1 000	100 000	150 000	0.5	30	120
build	999	30	1 200	4 000	600 000	1 000 000	2	90	120
deploy	2 999	30	5 000	15 000	3 000 000	5 000 000	10	180	120
scale	8 999	30	20 000	60 000	12 000 000	20 000 000	50	365	300
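
For client-side checks, the table can be mirrored as a typed constant. The values below are copied from the table; the field names and the helper are illustrative and do not reflect the actual shape exported by plans.ts.

```typescript
type PlanId = "launch" | "build" | "deploy" | "scale";

interface PlanLimits {
  priceInr: number;
  periodDays: number;
  writes: number;
  reads: number;
  embedTokens: number;
  genTokens: number;
  storageGb: number;
  retentionDays: number;
  workspaceRpm: number;
}

const PLAN_LIMITS: Record<PlanId, PlanLimits> = {
  launch: { priceInr: 399, periodDays: 7, writes: 250, reads: 1000, embedTokens: 100000, genTokens: 150000, storageGb: 0.5, retentionDays: 30, workspaceRpm: 120 },
  build: { priceInr: 999, periodDays: 30, writes: 1200, reads: 4000, embedTokens: 600000, genTokens: 1000000, storageGb: 2, retentionDays: 90, workspaceRpm: 120 },
  deploy: { priceInr: 2999, periodDays: 30, writes: 5000, reads: 15000, embedTokens: 3000000, genTokens: 5000000, storageGb: 10, retentionDays: 180, workspaceRpm: 120 },
  scale: { priceInr: 8999, periodDays: 30, writes: 20000, reads: 60000, embedTokens: 12000000, genTokens: 20000000, storageGb: 50, retentionDays: 365, workspaceRpm: 300 },
};

// Writes left against the plan cap, never negative.
function remainingWrites(plan: PlanId, used: number): number {
  return Math.max(0, PLAN_LIMITS[plan].writes - used);
}
```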

Overage rates per 1 k / per 1 M tok / per GB-mo are hard-coded per plan in plans.ts:87-203.

Pricing experiment scaffolding is exposed via PRICING_EXPERIMENTS in plans.ts (usd_anchor, usage_starter, startup_bundle, annual_commit). These are packaging/presentation experiments; checkout remains on the operational INR plan set while experiments run.

6. Routes (tenant-facing)

Dispatch is in apps/api/src/router.ts. K = API key. S = dashboard session.

6.1 Memories

Method	Path	Auth	Purpose
POST	`/v1/memories`	K/S	Create a memory; embed, chunk, optional extraction up to 10 child memories
POST	`/v1/memories/conversation`	K/S	Create from transcript or messages (transforms → `/v1/memories`)
GET	`/v1/memories`	K/S	Paginated list, filters: `namespace`, `user_id`, `owner_id`, `owner_type`, `memory_type`, `start_time`, `end_time`, `metadata`
GET	`/v1/memories/:id`	K/S	Single memory
DELETE	`/v1/memories/:id`	K/S	Cascade delete with chunks and links
POST	`/v1/memories/:id/links`	K/S	Create `memory_links` edge (unique per workspace)
DELETE	`/v1/memories/:id/links`	K/S	Delete edge
POST	`/v1/ingest`	K/S	Discriminated dispatch → memory / conversation / document-as-text
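
The list filters can be assembled into a query string as sketched below. The filter names come from the table; the helper name is hypothetical, and JSON-encoding `metadata` and passing timestamps as opaque strings are assumptions about the wire format.

```typescript
interface ListMemoriesFilters {
  namespace?: string;
  user_id?: string;
  owner_id?: string;
  owner_type?: string;
  memory_type?: string;
  start_time?: string;
  end_time?: string;
  metadata?: Record<string, unknown>;
}

// Builds the GET /v1/memories URL, skipping filters that are not set.
function listMemoriesUrl(baseUrl: string, filters: ListMemoriesFilters): string {
  const parts: string[] = [];
  for (const key of Object.keys(filters) as (keyof ListMemoriesFilters)[]) {
    const value = filters[key];
    if (value === undefined) continue;
    // Assumption: object-valued filters travel as JSON text.
    const text = typeof value === "string" ? value : JSON.stringify(value);
    parts.push(`${encodeURIComponent(key)}=${encodeURIComponent(text)}`);
  }
  const query = parts.length > 0 ? `?${parts.join("&")}` : "";
  return `${baseUrl}/v1/memories${query}`;
}
```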

6.2 Search and context

Method	Path	Auth	Purpose
POST	`/v1/search`	K/S	Embed query, pgvector search, optional rerank; header `x-save-history: 1` inserts `search_query_history`
POST	`/v1/context`	K/S	Search + context assembly with citations and linked memories
POST	`/v1/context/feedback`	K/S	Insert feedback row
POST	`/v1/feedback`	K/S	Apply explicit positive/negative signal to latest record by `request_id`
GET	`/v1/search/history`	K/S	Saved queries (paginated)
POST	`/v1/search/replay`	K/S	Rerun from history by `query_id`
PATCH	`/v1/profile/pins`	K/S	Update pinned memories

6.3 Usage, audit, pruning

Method	Path	Auth	Purpose
GET	`/v1/usage/today`	K/S	Caps vs consumed reads/writes/tokens; includes `entitlement_active` and `entitlement_source` (`billing`)
GET	`/v1/audit/log`	K/S	Paginated `api_audit_log` rows
GET	`/v1/pruning/metrics`	K/S	`workspace_pruning_metrics` RPC

6.4 Connectors

Connector settings are currently an advanced/deprioritized surface, not part of the default activation journey.

Method	Path	Auth
GET	`/v1/connectors/settings`	K/S
PATCH	`/v1/connectors/settings`	K/S

6.5 Workspaces and API keys

Console users create workspaces and API keys through the dashboard session routes in §6.7 (/v1/dashboard/bootstrap, /v1/dashboard/workspaces, /v1/dashboard/api-keys*). That family is what docs/external/openapi.yaml tracks.

Some deployments still accept operator paths such as POST /v1/workspaces (admin-scoped API key) and /v1/api-keys* (x-admin-token). They are intentionally omitted from the generated OpenAPI contract; treat them as internal/legacy unless your operator runbook says otherwise.

6.6 Billing

All PayU checkout flows use POST /v1/billing/checkout from the console. Legacy Stripe portal calls (POST /v1/billing/portal) may still return 410 Gone for old integrations; that path is omitted from the published OpenAPI contract.

Method	Path	Auth	Notes
GET	`/v1/billing/status`	K/S	`select workspace_entitlements`
POST	`/v1/billing/checkout`	K/S	Body: `{plan, firstname?, email?, phone?}`. Inserts `payu_transactions`, computes SHA-512 request hash, returns `{url, method:"POST", fields}` for the PayU form.
POST	`/v1/billing/webhook`	H	PayU callback. Verifies reverse SHA-512 (or HMAC-SHA256 fallback), calls PayU verify API, upserts entitlements.

6.7 Dashboard

Browser console routes use the dashboard session cookie (S) and request-scoped Supabase execution. Mutating POST routes also require a valid x-csrf-token (double-submit with the session bootstrap). JSON bodies use { ok: true, data: … } on success or { ok: false, error: { code, message, details? } } on failure unless noted.

Method	Path	Auth	Notes
POST	`/v1/dashboard/bootstrap`	Supabase JWT in body	`{ access_token, workspace_name? }`. Pre-cookie: resolves or creates the user’s default workspace via `create_workspace` when none exists; then the client calls `/v1/dashboard/session` with the chosen `workspace_id`.
POST	`/v1/dashboard/session`	Supabase JWT in body	`{access_token, workspace_id}`; verifies via `SUPABASE_JWT_SECRET`, inserts `dashboard_sessions`, sets HttpOnly cookie, returns `csrf_token`.
POST	`/v1/dashboard/logout`	S	Deletes session, clears cookie.
GET	`/v1/dashboard/overview-stats`	S	`dashboard_console_overview_stats` RPC.
GET	`/v1/dashboard/workspaces`	S	Lists workspaces the signed-in user belongs to (`id`, `name`, `role`).
POST	`/v1/dashboard/workspaces`	S + CSRF	`{ name }`; `create_workspace` RPC.
GET	`/v1/dashboard/api-keys`	S	Query `workspace_id?` (defaults to active session workspace; must match session). `list_api_keys` RPC.
POST	`/v1/dashboard/api-keys`	S + CSRF	`{ workspace_id, name }`; `create_api_key` RPC (returns plaintext key once).
POST	`/v1/dashboard/api-keys/revoke`	S + CSRF	`{ api_key_id }`; `revoke_api_key` RPC.
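
The dashboard success/failure envelope maps naturally onto a discriminated union. A minimal sketch, with hypothetical type and function names; only the `ok`, `data`, and `error` fields come from the description above.

```typescript
type DashboardEnvelope<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string; details?: unknown } };

// Returns the payload on success and throws a readable error otherwise.
function unwrapDashboard<T>(envelope: DashboardEnvelope<T>): T {
  if (envelope.ok) return envelope.data;
  throw new Error(`${envelope.error.code}: ${envelope.error.message}`);
}
```

Narrowing on `ok` lets the compiler guarantee that `data` is only read on success and `error` only on failure.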

6.8 Health

Method	Path	Auth	Notes
GET	`/healthz`	P	Validates critical env, returns `version` + `embedding_model`.
GET	`/ready`	P	Circuit-breaker-wrapped `get_api_key_salt` RPC.
GET	`/v1/health`	P	Same payload as `/healthz`.

6.9 Admin

All admin routes authenticate with x-admin-token.

Method	Path	Notes
POST	`/admin/webhooks/reprocess`	Rerun `reconcilePayUWebhook` for deferred events
POST	`/admin/usage/reconcile`	`process_usage_reservation_refunds()` RPC
POST	`/admin/sessions/cleanup`	Delete expired `dashboard_sessions`
POST	`/admin/memory-hygiene`	`find_near_duplicate_memories(...)`; query: `dry_run`, `limit`, `workspace_id`
POST	`/admin/memory-retention`	Archive per retention; query: `limit`
GET	`/v1/admin/billing/health`	Billing health view

6.10 MCP

POST /v1/mcp, POST /mcp → Streamable HTTP JSON-RPC (handleHostedMcpRequest in apps/api/src/mcpHosted.ts).
GET /v1/mcp, GET /mcp → browser landing or SSE.
DELETE /v1/mcp, DELETE /mcp → close session.

See docs/MCP_SERVER.md for the tool catalog.

7. Canonical flows

7.1 `POST /v1/memories`

auth → rate limit → concurrency lease → quota reserve → embed (EMBEDDINGS_MODE, versioned chunk metadata) → dedupe / conflict / evolution passes → insert memories + memory_chunks → optional LLM extraction via CHAT_PROVIDER (up to 10 child memories) → commit reservation → audit.

resolveQuotaForWorkspace uses the billing entitlement row source (workspace_entitlements).

Entitlement source changes are audited in workspace_entitlement_audit and write-protected after create.
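
The reserve-then-commit pattern in this flow can be illustrated with an in-memory ledger. This is a teaching sketch only: the real reservation is atomic in Postgres via `reserve_usage_if_within_cap` and `commit_usage_reservation`, and the class below (a hypothetical `QuotaLedger`) ignores concurrency, periods, and persistence.

```typescript
class QuotaLedger {
  private cap: number;
  private committed = 0;
  private held = new Map<number, number>();
  private nextId = 1;

  constructor(cap: number) {
    this.cap = cap;
  }

  // Holds `amount` if committed + held + amount stays within the cap.
  reserve(amount: number): number | null {
    let heldTotal = 0;
    this.held.forEach((value) => { heldTotal += value; });
    if (this.committed + heldTotal + amount > this.cap) return null;
    const id = this.nextId++;
    this.held.set(id, amount);
    return id;
  }

  // Converts a hold into committed usage once the write succeeded.
  commit(id: number): void {
    const amount = this.held.get(id);
    if (amount === undefined) return;
    this.held.delete(id);
    this.committed += amount;
  }

  // Drops a hold when the write failed, returning the capacity.
  release(id: number): void {
    this.held.delete(id);
  }

  used(): number {
    return this.committed;
  }
}
```

Reserving before the expensive embed step means a request that would exceed the cap is rejected before any cost is incurred.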

7.2 `POST /v1/search`

auth → rate limit → read reservation → optional query embed → SQL recall (vector and/or keyword RPCs) → Worker RRF fusion + ranking (+ optional rerank) → implicit retrieval-learning signals → JSON { results, page, page_size, total, has_more } (no retrieval_trace in body). Header x-save-history: true stores history without trace snapshot.
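
The pagination fields in the search response can be consumed as sketched below. The interface mirrors the five fields named above; the helper names are hypothetical, and how the next page is requested (parameter name, index base) is not specified here, so the helpers only compute numbers.

```typescript
interface SearchPage<T> {
  results: T[];
  page: number;
  page_size: number;
  total: number;
  has_more: boolean;
}

// Page number to request next, or null when the server reports no more results.
function nextPageNumber<T>(resp: SearchPage<T>): number | null {
  return resp.has_more ? resp.page + 1 : null;
}

// Number of pages implied by total and page_size.
function pageCount<T>(resp: SearchPage<T>): number {
  return resp.page_size > 0 ? Math.ceil(resp.total / resp.page_size) : 0;
}
```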

7.3 `POST /v1/billing/checkout`

Insert payu_transactions row → build SHA-512 request hash (buildPayURequestHashInput) → return {url, method:"POST", fields}. The dashboard auto-submits the form.
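
For orientation, the string that is hashed for a PayU request follows PayU's publicly documented pipe-delimited sequence. The sketch below builds that input string only; it is an assumption that the Worker's buildPayURequestHashInput matches it field for field, and `payuRequestHashInput` and `PayuHashFields` are hypothetical names. The SHA-512 hex digest of the result is what PayU expects in the `hash` form field.

```typescript
interface PayuHashFields {
  key: string; // merchant key
  txnid: string;
  amount: string;
  productinfo: string;
  firstname: string;
  email: string;
  udf?: string[]; // udf1..udf5, optional
}

// key|txnid|amount|productinfo|firstname|email|udf1|udf2|udf3|udf4|udf5||||||SALT
function payuRequestHashInput(f: PayuHashFields, salt: string): string {
  const udf: string[] = [];
  for (let i = 0; i < 5; i++) udf.push(f.udf?.[i] ?? "");
  return [
    f.key, f.txnid, f.amount, f.productinfo, f.firstname, f.email,
    ...udf,
    "", "", "", "", "", // five reserved empty positions before the salt
    salt,
  ].join("|");
}
```

The webhook side verifies the reverse sequence, which starts with the salt and status and ends with the merchant key.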

7.4 `POST /v1/billing/webhook`

Verify reverse SHA-512 (or HMAC-SHA256 fallback) → verifyPayUTransactionViaApi with retry and timeout → upsertWorkspaceEntitlementFromTransaction → 200. Idempotent via payu_webhook_events.

8. Client headers you may see

Header	Meaning
`x-request-id`	Correlation id on every response
`x-mn-resolved-container-tag`	Resolved tenant container (debug)
`x-mn-routing-mode`	`service-role`, `rpc-first`, or `rls-first` (debug)
`Retry-After`	Seconds (429 responses)
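
A client can turn the Retry-After value into a wait time as sketched below. The helper name and the fallback default are assumptions; the seconds form is the one documented in the table, and any other form (such as an HTTP date) falls back to the default.

```typescript
// Converts a Retry-After header value (seconds) into milliseconds.
function retryAfterMs(
  headerValue: string | null | undefined,
  fallbackMs: number = 1000,
): number {
  if (headerValue == null) return fallbackMs;
  const trimmed = headerValue.trim();
  if (trimmed === "") return fallbackMs;
  const seconds = Number(trimmed);
  // Non-numeric or negative values are ignored in favour of the fallback.
  return isFinite(seconds) && seconds >= 0 ? seconds * 1000 : fallbackMs;
}
```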

9. Changes and drift

docs/external/openapi.yaml is generated from code by apps/api/scripts/generate_openapi.mjs. CI enforces OpenAPI drift with pnpm openapi:check. The generator intentionally documents a reduced product surface (core memory, search, context, usage, billing checkout/status, dashboard session/bootstrap/workspaces/api-keys, health, MCP). Operator routes (/admin/, /v1/admin/, PayU webhooks, connectors, audit, pruning, etc.) remain in the Worker and are described in this file where relevant, but are not duplicated as paths in OpenAPI.

Keep API truth docs aligned in the same PR (see .cursor/rules/documentation-governance.mdc).

Recent backend hardening keeps API behavior unchanged while making optional internals fail-safe in degraded or stubbed environments: learned-adjustment lookup, monthly LLM usage reads, retrieval-attribute loading, and async feedback persistence now no-op safely when dependent storage/query capabilities are unavailable. The dev/CI smoke path (pnpm smoke:ci, which boots wrangler dev with SUPABASE_MODE=stub) also auto-registers the in-memory Supabase stub on first request when stage is non-production; production stages reject SUPABASE_MODE=stub outright in apps/api/src/db/createSupabaseClient.ts so no external API behavior is affected.

Admin and billing webhook routes are served directly by the main Worker.