MemoryNode API usage
Canonical HTTP reference for the memorynode-api Cloudflare Worker. The source of truth for behavior is the code in apps/api/src/. The generated docs/external/openapi.yaml (regenerate with pnpm openapi:gen) documents the public product surface; internal, operator, and advanced routes may still exist in the Worker but are omitted from that contract.
- Production base: https://api.memorynode.ai
- Hosted MCP base: https://mcp.memorynode.ai (same Worker)
- Staging base: https://api-staging.memorynode.ai
- Local dev: http://127.0.0.1:8787
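A client usually selects one of the bases above per environment. The helper below is a minimal sketch of that selection; the environment labels (production, hostedMcp, staging, local) and the function name buildUrl are illustrative assumptions, not names from the codebase.

```typescript
// Base URLs copied from the list above; the keys are hypothetical labels.
const BASE_URLS = {
  production: "https://api.memorynode.ai",
  hostedMcp: "https://mcp.memorynode.ai",
  staging: "https://api-staging.memorynode.ai",
  local: "http://127.0.0.1:8787",
} as const;

type MemoryNodeEnv = keyof typeof BASE_URLS;

// Join a base URL and a route path, adding the leading slash if it is missing.
function buildUrl(env: MemoryNodeEnv, path: string): string {
  const suffix = path.startsWith("/") ? path : `/${path}`;
  return `${BASE_URLS[env]}${suffix}`;
}
```

For example, buildUrl("staging", "/v1/search") yields the staging search endpoint.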
Current product focus is workflow outcomes for a primary cluster of AI-native products:
- SaaS personalization copilots
- AI companion products
- Creator tools with persistent context
Developer agents are a secondary segment. Advanced surfaces (connectors, audit exports, search history, admin routes, etc.) may still exist on the Worker for operators and power users; they are intentionally omitted from the published OpenAPI contract — see section 9.
1. Authentication
Dispatched in apps/api/src/auth.ts and apps/api/src/workerApp.ts.
| Mode | Trigger | Validation | Uses |
|---|---|---|---|
| API key (K) | Authorization: Bearer <key> or x-api-key: <key> | SHA-256 of key + API_KEY_SALT then authenticate_api_key(p_key_hash) RPC | All /v1/* tenant routes |
| Dashboard session (S) | Cookie: mn_dash_session=<opaque> + x-csrf-token | Row in dashboard_sessions + CSRF double-submit | /v1/* tenant routes from browser |
| Admin (A) | x-admin-token | Equality against MASTER_ADMIN_TOKEN or HMAC-SHA256 signed form; optional IP allowlist (ADMIN_ALLOWED_IPS); ADMIN_BREAK_GLASS | /admin/, /v1/admin/ |
| PayU webhook (H) | form body | Reverse SHA-512 over PayU fields, or HMAC-SHA256 fallback over raw body keyed by PAYU_WEBHOOK_SECRET (x-payu-webhook-signature) | POST /v1/billing/webhook only |
| Internal MCP (I) | x-internal-mcp: 1 + x-internal-secret: <MCP_INTERNAL_SECRET> | Constant-time compare | Internal hosted-MCP → REST subrequests |
| Public (P) | — | — | /healthz, /ready, /v1/health |
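From a client's point of view, the two tenant modes in the table (K and S) differ only in which headers are attached. The sketch below builds those headers; the Credentials type, the useBearer flag, and the function name are assumptions made for illustration.

```typescript
// Hypothetical credential shapes for the two tenant auth modes.
type Credentials =
  | { mode: "api-key"; key: string; useBearer?: boolean }
  | { mode: "dashboard-session"; csrfToken: string };

// Return the request headers a client would attach for each mode.
function authHeaders(creds: Credentials): Record<string, string> {
  if (creds.mode === "api-key") {
    // Either header form is accepted per the table; Bearer is the default here.
    return creds.useBearer === false
      ? { "x-api-key": creds.key }
      : { Authorization: `Bearer ${creds.key}` };
  }
  // The mn_dash_session cookie is attached by the browser itself;
  // the client code only adds the CSRF token.
  return { "x-csrf-token": creds.csrfToken };
}
```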
API keys created through POST /v1/api-keys are rate-limited at 15 RPM for the first 48 h after api_keys.created_at; after that the default is 60 RPM per key. See packages/shared/src/plans.ts.
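The new-key window can be expressed as a small pure function. This is a sketch of the rule as described above, not the code in plans.ts; the constant and function names are hypothetical, and treating a key that is exactly 48 h old as past the window is an assumption.

```typescript
const NEW_KEY_RPM = 15;
const DEFAULT_KEY_RPM = 60;
const NEW_KEY_WINDOW_MS = 48 * 60 * 60 * 1000;

// Pick the per-key RPM from the key's age (assumed: the window is exclusive at 48 h).
function effectiveKeyRpm(createdAt: Date, now: Date): number {
  const ageMs = now.getTime() - createdAt.getTime();
  return ageMs < NEW_KEY_WINDOW_MS ? NEW_KEY_RPM : DEFAULT_KEY_RPM;
}
```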
Webhook forwarding uses internal headers after signature verification, but that internal token path is route-bound to POST /v1/memories and is not accepted as a general-purpose auth bypass on other routes.
2. Middleware order
From handleRequestImpl in apps/api/src/workerApp.ts:
- Request-id resolve, CORS + security headers ensemble.
- Short-circuit health endpoints (/healthz, /ready, /v1/health).
- CORS deny if Origin is not in ALLOWED_ORIGINS (except hosted MCP paths).
- enforceRuntimeConfigGuards and ensureRateLimitDo.
- OPTIONS short-circuit.
- assertBodySize (bounded by MAX_BODY_BYTES / MAX_IMPORT_BYTES).
- KNOWN_PATH_ALLOWED_METHODS 405 gate with Allow: header.
- Production dashboard ALLOWED_ORIGINS gate.
- createSupabaseClient(env) + db_access_path_selected log.
- Hosted MCP path → IP rate limit → handleHostedMcpRequest.
- Dashboard session POST/logout inline.
- Enforce route-level auth and handler dispatch via route().
- Build handlerDeps, instantiate factories, call route().
- 404 if route() returns null.
- Catch: ApiError-shaped response with error_code; else 500 INTERNAL.
- Finally: emitAuditLog, request_completed structured log, persistApiRequestEvent.
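The practical consequence of this ordering is that earlier stages win: a health probe is answered before the CORS check ever runs. The sketch below models a few of the stages as functions that either return a response or pass. It is a simplified illustration, not handleRequestImpl; the types, the origin value, and the 403 and 204 status codes are assumptions.

```typescript
interface Req { method: string; path: string; origin?: string }
interface Res { status: number; body: string }
type Stage = (req: Req) => Res | null; // null means "pass to the next stage"

const HEALTH_PATHS = ["/healthz", "/ready", "/v1/health"];
const ALLOWED_ORIGINS = ["https://console.example"]; // hypothetical value
const KNOWN_ROUTES = ["/v1/search"]; // stand-in for route()

const stages: Stage[] = [
  // Health endpoints short-circuit first.
  (req) => (HEALTH_PATHS.indexOf(req.path) !== -1 ? { status: 200, body: "ok" } : null),
  // CORS deny for unknown origins (status code assumed).
  (req) =>
    req.origin !== undefined && ALLOWED_ORIGINS.indexOf(req.origin) === -1
      ? { status: 403, body: "FORBIDDEN" }
      : null,
  // OPTIONS short-circuit (status code assumed).
  (req) => (req.method === "OPTIONS" ? { status: 204, body: "" } : null),
  // Route dispatch.
  (req) => (KNOWN_ROUTES.indexOf(req.path) !== -1 ? { status: 200, body: "routed" } : null),
];

function handle(req: Req): Res {
  for (const stage of stages) {
    const res = stage(req);
    if (res !== null) return res;
  }
  return { status: 404, body: "NOT_FOUND" }; // no stage produced a response
}
```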
3. Error envelope
Errors emitted by the outer catch in handleRequestImpl (apps/api/src/workerApp.ts ~1379–1407) use:
{ "error": { "code": "STRING_CODE", "message": "human readable" } }
Typical codes: BAD_REQUEST, UNAUTHORIZED, FORBIDDEN, NOT_FOUND, PAYLOAD_TOO_LARGE, RATE_LIMITED, CAP_EXCEEDED, TRIAL_EXPIRED, COST_BUDGET_EXCEEDED, CONTROL_PLANE_ONLY, INTERNAL. Correlate responses with the x-request-id header.
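On the client side, the envelope can be checked with a type guard before reading the code. The following is a minimal sketch; the interface and function names are illustrative, and the fallback message for unrecognized bodies is an assumption.

```typescript
interface ApiErrorEnvelope {
  error: { code: string; message: string };
}

// Narrow an unknown JSON body to the documented error envelope.
function isApiErrorEnvelope(body: unknown): body is ApiErrorEnvelope {
  if (typeof body !== "object" || body === null) return false;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return false;
  const { code, message } = err as { code?: unknown; message?: unknown };
  return typeof code === "string" && typeof message === "string";
}

// Produce a log line that always carries the x-request-id for correlation.
function describeFailure(body: unknown, requestId: string | null): string {
  const id = requestId ?? "unknown";
  return isApiErrorEnvelope(body)
    ? `${body.error.code}: ${body.error.message} (x-request-id=${id})`
    : `Unrecognized error body (x-request-id=${id})`;
}
```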
4. Rate limits, concurrency, quotas
These controls are defined in apps/api/src/auth.ts, apps/api/src/usage/quotaReservation.ts, and packages/shared/src/plans.ts.
| Control | Default | Source |
|---|---|---|
| Per-key RPM | 60 (15 for new keys, 48 h) | RATE_LIMIT_MAX, RATE_LIMIT_RPM_NEW_KEY |
| Per-workspace RPM | 120 (300 for scale) | WORKSPACE_RPM_DEFAULT, WORKSPACE_RPM_SCALE |
| Per-workspace in-flight | 8 | WORKSPACE_CONCURRENCY_MAX, TTL 30000 ms |
| Cost/minute burst | 15 INR | WORKSPACE_COST_PER_MINUTE_CAP_INR |
| Daily and period caps | atomic via Postgres | reserve_usage_if_within_cap / commit_usage_reservation |
| Global AI cost budget | fail-closed (prod) | AI_COST_BUDGET_INR, 60 s cache |
When entitlement verification is unavailable, quota-consuming routes move into temporary read-only degradation (ENTITLEMENT_DEGRADED, HTTP 503) until billing checks recover.
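A client can turn these signals into a retry decision. The sketch below honors Retry-After (seconds) on 429 and falls back to capped exponential backoff otherwise. Retrying a 503 with backoff is a client-side design choice assumed here, as are the function names and the 30 s cap.

```typescript
// Capped exponential backoff: 1 s, 2 s, 4 s, ... up to 30 s.
function backoffMs(attempt: number): number {
  return Math.min(30000, 1000 * Math.pow(2, attempt));
}

// Return how long to wait before retrying, or null when a retry is not advised.
function retryDelayMs(status: number, retryAfter: string | null, attempt: number): number | null {
  if (status === 429) {
    if (retryAfter !== null && retryAfter.trim() !== "") {
      const seconds = Number(retryAfter);
      if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
    }
    return backoffMs(attempt);
  }
  if (status === 503) return backoffMs(attempt); // e.g. ENTITLEMENT_DEGRADED
  return null;
}
```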
Chat completions provider (Worker)
Optional multi-vendor chat routing (default OpenAI): CHAT_PROVIDER = openai \| anthropic \| gemini in apps/api/src/env.ts. Keys: ANTHROPIC_API_KEY, GEMINI_API_KEY (Google AI Studio); OPENAI_API_KEY still powers embeddings when EMBEDDINGS_MODE=openai and powers chat when CHAT_PROVIDER=openai. Optional CHAT_MODEL overrides the default cheap model for the active chat provider. Implementation: apps/api/src/llm/chatComplete.ts.
Embeddings are not swapped when changing CHAT_PROVIDER. The memory_chunks table stores embedding_provider, embedding_dimension, and embedding_version per chunk; vector search only compares rows matching the query embedding dimension (default 1536 for legacy OpenAI/stub). No bulk re-embed is required for existing data.
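The dimension rule can be illustrated in memory: rows whose embedding_dimension differs from the query length are skipped rather than compared. This is a sketch of the idea only; the real filtering happens in Postgres, and the ChunkRow shape (including embedding_version as a string) and cosine scoring are assumptions.

```typescript
interface ChunkRow {
  id: string;
  embedding_provider: string;
  embedding_dimension: number;
  embedding_version: string; // type assumed for illustration
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Compare the query only against rows with a matching embedding dimension.
function recall(query: number[], rows: ChunkRow[]): { id: string; score: number }[] {
  return rows
    .filter((row) => row.embedding_dimension === query.length)
    .map((row) => ({ id: row.id, score: cosine(query, row.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```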
Memory lifecycle intelligence (automatic)
MemoryNode applies deterministic lifecycle rules on ingest, search, and nightly hygiene — no extra routes beyond normal writes and search.
| Signal | Behavior |
|---|---|
| Dedupe | Canonical hash, normalized near-text, or embedding similarity can return an existing memory_id with deduped: true and optional dedupe_kind (near, semantic) |
| Confidence / volatility | Persisted on the memory row; used in Worker ranking (not in raw SQL recall scores) |
| Ephemeral TTL | High-volatility text may get expires_at; expired rows move to archived via the hygiene job |
| Supersession | replaces_memory_id, conflict resolution, evolution merges, and admin hygiene set lifecycle_state: replaced |
| Retrieval learning | Search/context bump learning signals; POST /v1/feedback applies explicit positive/negative using the x-request-id from the search call |
Implementation: apps/api/src/memories/memoryPolicyEngine.ts, apps/api/src/pipelines/search/ranking.ts, migrations 076–078.
What shows up in HTTP responses today:
| Surface | Fields |
|---|---|
| POST /v1/memories (dedupe hit) | { memory_id, stored: true, deduped: true, dedupe_kind? } |
| POST /v1/memories (new row) | memory_id, stored, chunks, extraction, optional intelligence.conflict_state, optional trace.memory_evolution_*, optional superseded_memory_id |
| POST /v1/search | results, pagination (no retrieval_trace in the JSON body) |
| POST /v1/context | context_text, citations, pagination (same search pipeline, assembled for prompts) |
| POST /v1/feedback | { feedback: "positive" \| "negative", request_id }; request_id is the x-request-id header from search/context |
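The two POST /v1/memories shapes can be told apart by the deduped flag. The sketch below models them as a TypeScript union; field types that the table does not spell out (chunks, extraction) are left as unknown, and the interface names and summary strings are illustrative.

```typescript
interface DedupeHit {
  memory_id: string;
  stored: true;
  deduped: true;
  dedupe_kind?: "near" | "semantic";
}

interface NewMemory {
  memory_id: string;
  stored: boolean;
  deduped?: false;
  chunks: unknown; // shape not documented here
  extraction: unknown; // shape not documented here
  superseded_memory_id?: string;
}

type CreateMemoryResponse = DedupeHit | NewMemory;

// Summarize a write result for logging.
function summarize(resp: CreateMemoryResponse): string {
  if (resp.deduped === true) {
    return `reused ${resp.memory_id} (${resp.dedupe_kind ?? "kind not reported"})`;
  }
  return resp.superseded_memory_id !== undefined
    ? `created ${resp.memory_id}, superseded ${resp.superseded_memory_id}`
    : `created ${resp.memory_id}`;
}
```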
Retrieval architecture: Postgres RPCs (match_chunks_vector, match_chunks_text) perform recall only (similarity / ts_rank + filters). Confidence, lifecycle, type, learning, and freshness ranking run in the Worker after RRF fusion (078_search_recall_only_rpcs.sql).
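Reciprocal rank fusion combines the vector and keyword recall lists by rank position rather than raw score. The sketch below shows the standard formula, score(id) = Σ 1 / (k + rank); the constant k = 60 is the common default from the RRF literature and is an assumption here, as is the function name. The Worker's later ranking signals are not modeled.

```typescript
// Fuse several ranked id lists; rank is 1-based within each list.
function rrfFuse(rankings: string[][], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

An id that appears in both lists accumulates two terms, so it tends to outrank ids found by only one recall path.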
Debugging without custom tooling:
- Log x-request-id on every call (support and POST /v1/feedback correlation).
- GET /v1/pruning/metrics: duplicate/chunk counts per workspace.
- GET /v1/audit/log: paginated API audit rows.
- Search body field explain is accepted by schema but not honored on POST /v1/search (always off in the handler today). Do not rely on ranking explain payloads in production integrations.
- Chunk rows store embedding_provider, embedding_dimension, embedding_version (default 1536-dim OpenAI/stub behavior).
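Because feedback is correlated through x-request-id, it helps to capture that header at search time. The sketch below builds a POST /v1/feedback body from the headers of an earlier search or context response; the helper name and the plain-object header representation are assumptions.

```typescript
type Feedback = "positive" | "negative";

interface FeedbackBody {
  feedback: Feedback;
  request_id: string;
}

// Look up x-request-id case-insensitively and pair it with the feedback signal.
function buildFeedbackBody(
  searchResponseHeaders: Record<string, string>,
  feedback: Feedback,
): FeedbackBody {
  const name = Object.keys(searchResponseHeaders).find(
    (key) => key.toLowerCase() === "x-request-id",
  );
  const requestId = name === undefined ? undefined : searchResponseHeaders[name];
  if (requestId === undefined) {
    throw new Error("search response carried no x-request-id header");
  }
  return { feedback, request_id: requestId };
}
```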
5. Plans
From packages/shared/src/plans.ts. PlanId = "launch" | "build" | "deploy" | "scale".
| Plan | INR | Period (d) | Writes | Reads | Embed tok | Gen tok | Storage GB | Retention (d) | Workspace RPM |
|---|---|---|---|---|---|---|---|---|---|
| launch | 399 | 7 | 250 | 1 000 | 100 000 | 150 000 | 0.5 | 30 | 120 |
| build | 999 | 30 | 1 200 | 4 000 | 600 000 | 1 000 000 | 2 | 90 | 120 |
| deploy | 2 999 | 30 | 5 000 | 15 000 | 3 000 000 | 5 000 000 | 10 | 180 | 120 |
| scale | 8 999 | 30 | 20 000 | 60 000 | 12 000 000 | 20 000 000 | 50 | 365 | 300 |
Overage rates per 1 k / per 1 M tok / per GB-mo are hard-coded per plan in plans.ts:87-203.
Pricing experiment scaffolding is exposed via PRICING_EXPERIMENTS in plans.ts (usd_anchor, usage_starter, startup_bundle, annual_commit). These are packaging/presentation experiments; checkout remains on the operational INR plan set while experiments run.
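For quick client-side checks, part of the plan table above can be mirrored as a typed lookup. This is a sketch with values copied from the table, not the contents of plans.ts; the PlanCaps shape and helper name are hypothetical, and reading the Writes column as a per-period cap is an assumption.

```typescript
type PlanId = "launch" | "build" | "deploy" | "scale";

interface PlanCaps {
  priceInr: number;
  periodDays: number;
  writes: number;
  reads: number;
  workspaceRpm: number;
}

// Subset of the plan table; token, storage, and retention columns are omitted.
const PLAN_CAPS: Record<PlanId, PlanCaps> = {
  launch: { priceInr: 399, periodDays: 7, writes: 250, reads: 1000, workspaceRpm: 120 },
  build: { priceInr: 999, periodDays: 30, writes: 1200, reads: 4000, workspaceRpm: 120 },
  deploy: { priceInr: 2999, periodDays: 30, writes: 5000, reads: 15000, workspaceRpm: 120 },
  scale: { priceInr: 8999, periodDays: 30, writes: 20000, reads: 60000, workspaceRpm: 300 },
};

// Writes left in the current period, never negative.
function remainingWrites(plan: PlanId, used: number): number {
  return Math.max(0, PLAN_CAPS[plan].writes - used);
}
```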
6. Routes (tenant-facing)
Dispatch is in apps/api/src/router.ts. K = API key. S = dashboard session.
6.1 Memories
| Method | Path | Auth | Purpose |
|---|---|---|---|
| POST | /v1/memories | K/S | Create a memory: embed, chunk, and optionally extract up to 10 child memories |
| POST | /v1/memories/conversation | K/S | Create from transcript or messages (transforms → /v1/memories) |
| GET | /v1/memories | K/S | Paginated list, filters: namespace, user_id, owner_id, owner_type, memory_type, start_time, end_time, metadata |
| GET | /v1/memories/:id | K/S | Single memory |
| DELETE | /v1/memories/:id | K/S | Cascade delete with chunks and links |
| POST | /v1/memories/:id/links | K/S | Create memory_links edge (unique per workspace) |
| DELETE | /v1/memories/:id/links | K/S | Delete edge |
| POST | /v1/ingest | K/S | Discriminated dispatch → memory / conversation / document-as-text |
6.2 Search and context
| Method | Path | Auth | Purpose |
|---|---|---|---|
| POST | /v1/search | K/S | Embed query, pgvector search, optional rerank; header x-save-history: 1 inserts search_query_history |
| POST | /v1/context | K/S | Search + context assembly with citations and linked memories |
| POST | /v1/context/feedback | K/S | Insert feedback row |
| POST | /v1/feedback | K/S | Apply explicit positive/negative signal to latest record by request_id |
| GET | /v1/search/history | K/S | Saved queries (paginated) |
| POST | /v1/search/replay | K/S | Rerun from history by query_id |
| PATCH | /v1/profile/pins | K/S | Update pinned memories |
6.3 Usage, audit, pruning
| Method | Path | Auth | Purpose |
|---|---|---|---|
| GET | /v1/usage/today | K/S | Caps vs consumed reads/writes/tokens; includes entitlement_active and entitlement_source (billing) |
| GET | /v1/audit/log | K/S | Paginated api_audit_log rows |
| GET | /v1/pruning/metrics | K/S | workspace_pruning_metrics RPC |
6.4 Connectors
Connector settings are currently an advanced/deprioritized surface, not part of the default activation journey.
| Method | Path | Auth |
|---|---|---|
| GET | /v1/connectors/settings | K/S |
| PATCH | /v1/connectors/settings | K/S |
6.5 Workspaces and API keys
Console users create workspaces and API keys through the dashboard session routes in §6.7 (/v1/dashboard/bootstrap, /v1/dashboard/workspaces, /v1/dashboard/api-keys*). That family is what docs/external/openapi.yaml tracks.
Some deployments still accept operator paths such as POST /v1/workspaces (admin-scoped API key) and /v1/api-keys* (x-admin-token). They are intentionally omitted from the generated OpenAPI contract; treat them as internal/legacy unless your operator runbook says otherwise.
6.6 Billing
All PayU checkout flows use POST /v1/billing/checkout from the console. Legacy Stripe portal calls (POST /v1/billing/portal) may still return 410 Gone for old integrations; that path is omitted from the published OpenAPI contract.
| Method | Path | Auth | Notes |
|---|---|---|---|
| GET | /v1/billing/status | K/S | select workspace_entitlements |
| POST | /v1/billing/checkout | K/S | Body: {plan, firstname?, email?, phone?}. Inserts payu_transactions, computes SHA-512 request hash, returns {url, method:"POST", fields} for the PayU form. |
| POST | /v1/billing/webhook | H | PayU callback. Verifies reverse SHA-512 (or HMAC-SHA256 fallback), calls PayU verify API, upserts entitlements. |
6.7 Dashboard
Browser console routes use the dashboard session cookie (S) and request-scoped Supabase execution. Mutating POST routes also require a valid x-csrf-token (double-submit with the session bootstrap). JSON bodies use { ok: true, data: … } on success or { ok: false, error: { code, message, details? } } on failure unless noted.
| Method | Path | Auth | Notes |
|---|---|---|---|
| POST | /v1/dashboard/bootstrap | Supabase JWT in body | { access_token, workspace_name? }. Pre-cookie: resolves or creates the user’s default workspace via create_workspace when none exists; then the client calls /v1/dashboard/session with the chosen workspace_id. |
| POST | /v1/dashboard/session | Supabase JWT in body | {access_token, workspace_id}; verifies via SUPABASE_JWT_SECRET, inserts dashboard_sessions, sets HttpOnly cookie, returns csrf_token. |
| POST | /v1/dashboard/logout | S | Deletes session, clears cookie. |
| GET | /v1/dashboard/overview-stats | S | dashboard_console_overview_stats RPC. |
| GET | /v1/dashboard/workspaces | S | Lists workspaces the signed-in user belongs to (id, name, role). |
| POST | /v1/dashboard/workspaces | S + CSRF | { name }; create_workspace RPC. |
| GET | /v1/dashboard/api-keys | S | Query workspace_id? (defaults to active session workspace; must match session). list_api_keys RPC. |
| POST | /v1/dashboard/api-keys | S + CSRF | { workspace_id, name }; create_api_key RPC (returns plaintext key once). |
| POST | /v1/dashboard/api-keys/revoke | S + CSRF | { api_key_id }; revoke_api_key RPC. |
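The dashboard envelope maps naturally onto a discriminated union keyed on ok. The sketch below unwraps it; the type and function names are illustrative, and converting failures into a thrown Error is a client-side choice rather than documented behavior.

```typescript
type DashboardEnvelope<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string; details?: unknown } };

// Return the payload on success, or throw with the server's code and message.
function unwrap<T>(envelope: DashboardEnvelope<T>): T {
  if (envelope.ok) return envelope.data;
  throw new Error(`${envelope.error.code}: ${envelope.error.message}`);
}
```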
6.8 Health
| Method | Path | Auth | Notes |
|---|---|---|---|
| GET | /healthz | P | Validates critical env, returns version + embedding_model. |
| GET | /ready | P | Circuit-breaker-wrapped get_api_key_salt RPC. |
| GET | /v1/health | P | Same payload as /healthz. |
6.9 Admin
All routes in this section authenticate with x-admin-token.
| Method | Path | Notes |
|---|---|---|
| POST | /admin/webhooks/reprocess | Rerun reconcilePayUWebhook for deferred events |
| POST | /admin/usage/reconcile | process_usage_reservation_refunds() RPC |
| POST | /admin/sessions/cleanup | Delete expired dashboard_sessions |
| POST | /admin/memory-hygiene | find_near_duplicate_memories(...); query: dry_run, limit, workspace_id |
| POST | /admin/memory-retention | Archive per retention; query: limit |
| GET | /v1/admin/billing/health | Billing health view |
6.10 MCP
- POST /v1/mcp, POST /mcp → Streamable HTTP JSON-RPC (handleHostedMcpRequest in apps/api/src/mcpHosted.ts).
- GET /v1/mcp, GET /mcp → browser landing or SSE.
- DELETE /v1/mcp, DELETE /mcp → close session.
See docs/MCP_SERVER.md for the tool catalog.
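A POST to the MCP endpoint carries a JSON-RPC 2.0 message. The sketch below builds such a message and the content-negotiation headers a Streamable HTTP client typically sends. The helper names are hypothetical, authentication headers are deliberately left out because this section does not specify them, and a real session normally starts with an initialize exchange before calls such as tools/list.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Build a JSON-RPC 2.0 request; params are attached only when provided.
function buildMcpRequest(
  id: number,
  method: string,
  params?: Record<string, unknown>,
): JsonRpcRequest {
  const request: JsonRpcRequest = { jsonrpc: "2.0", id, method };
  if (params !== undefined) request.params = params;
  return request;
}

// The client accepts either a single JSON response or an SSE stream.
const MCP_POST_HEADERS: Record<string, string> = {
  "Content-Type": "application/json",
  Accept: "application/json, text/event-stream",
};
```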
7. Canonical flows
7.1 POST /v1/memories
auth → rate limit → concurrency lease → quota reserve → embed (EMBEDDINGS_MODE, versioned chunk metadata) → dedupe / conflict / evolution passes → insert memories + memory_chunks → optional LLM extraction via CHAT_PROVIDER (up to 10 child memories) → commit reservation → audit.
resolveQuotaForWorkspace uses the billing entitlement row source (workspace_entitlements).
Entitlement source changes are audited in workspace_entitlement_audit and write-protected after create.
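The reserve-then-commit step in this flow can be pictured with an in-memory ledger. This is only an analogy for reserve_usage_if_within_cap and commit_usage_reservation, which run atomically in Postgres; the class, its method names, and the release path are assumptions.

```typescript
class QuotaLedger {
  private readonly cap: number;
  private committed = 0;
  private reserved = new Map<string, number>();
  private nextId = 1;

  constructor(cap: number) {
    this.cap = cap;
  }

  // Hold units against the cap; returns null when the cap would be exceeded.
  reserve(units: number): string | null {
    let pending = 0;
    this.reserved.forEach((value) => {
      pending += value;
    });
    if (this.committed + pending + units > this.cap) return null;
    const id = `res-${this.nextId++}`;
    this.reserved.set(id, units);
    return id;
  }

  // Convert a held reservation into committed usage after the work succeeds.
  commit(id: string): void {
    const units = this.reserved.get(id);
    if (units === undefined) throw new Error(`unknown reservation ${id}`);
    this.reserved.delete(id);
    this.committed += units;
  }

  // Drop a reservation when the work fails, freeing the held units.
  release(id: string): void {
    this.reserved.delete(id);
  }

  get used(): number {
    return this.committed;
  }
}
```

Counting pending reservations against the cap is what prevents two concurrent writes from both slipping under the limit.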
7.2 POST /v1/search
auth → rate limit → read reservation → optional query embed → SQL recall (vector and/or keyword RPCs) → Worker RRF fusion + ranking (+ optional rerank) → implicit retrieval-learning signals → JSON { results, page, page_size, total, has_more } (no retrieval_trace in body). Header x-save-history: true stores history without trace snapshot.
7.3 POST /v1/billing/checkout
Insert payu_transactions row → build SHA-512 request hash (buildPayURequestHashInput) → return {url, method:"POST", fields}. The dashboard auto-submits the form.
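The returned {url, method, fields} object is meant to be rendered as a hidden form and submitted. The sketch below renders that form as an HTML string; it assumes every field value is a string, and the form id and function names are hypothetical. The actual dashboard implementation may differ.

```typescript
interface CheckoutResponse {
  url: string;
  method: "POST";
  fields: Record<string, string>;
}

// Escape characters that would break out of an HTML attribute.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one hidden input per field inside a form that targets the PayU URL.
function renderCheckoutForm(resp: CheckoutResponse): string {
  const inputs = Object.keys(resp.fields)
    .map((name) => {
      const value = resp.fields[name] ?? "";
      return `<input type="hidden" name="${escapeHtml(name)}" value="${escapeHtml(value)}">`;
    })
    .join("");
  return `<form id="payu-checkout" action="${escapeHtml(resp.url)}" method="${resp.method}">${inputs}</form>`;
}
```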
7.4 POST /v1/billing/webhook
Verify reverse SHA-512 (or HMAC-SHA256 fallback) → verifyPayUTransactionViaApi with retry and timeout → upsertWorkspaceEntitlementFromTransaction → 200. Idempotent via payu_webhook_events.
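Idempotency here means a redelivered callback must not apply the entitlement twice. The sketch below shows the pattern with an in-memory set standing in for payu_webhook_events; the event id key and the class name are assumptions, and the real handler's storage and ordering may differ.

```typescript
type WebhookOutcome = "processed" | "duplicate";

class WebhookDeduper {
  private seen = new Set<string>();

  // Apply the side effect once per event id; replays are acknowledged without re-applying.
  handle(eventId: string, apply: () => void): WebhookOutcome {
    if (this.seen.has(eventId)) return "duplicate";
    apply();
    // Recorded only after apply() succeeds, so a failed attempt can be retried.
    this.seen.add(eventId);
    return "processed";
  }
}
```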
8. Client headers you may see
| Header | Meaning |
|---|---|
| x-request-id | Correlation id on every response |
| x-mn-resolved-container-tag | Resolved tenant container (debug) |
| x-mn-routing-mode | service-role, rpc-first, or rls-first (debug) |
| Retry-After | Seconds (429 responses) |
9. Changes and drift
docs/external/openapi.yaml is generated from code by apps/api/scripts/generate_openapi.mjs. CI enforces OpenAPI drift with pnpm openapi:check. The generator intentionally documents a reduced product surface (core memory, search, context, usage, billing checkout/status, dashboard session/bootstrap/workspaces/api-keys, health, MCP). Operator routes (/admin/, /v1/admin/, PayU webhooks, connectors, audit, pruning, etc.) remain in the Worker and are described in this file where relevant, but are not duplicated as paths in OpenAPI.
Keep API truth docs aligned in the same PR (see .cursor/rules/documentation-governance.mdc).
Recent backend hardening keeps API behavior unchanged while making optional internals fail-safe in degraded or stubbed environments: learned-adjustment lookup, monthly LLM usage reads, retrieval-attribute loading, and async feedback persistence now no-op safely when dependent storage/query capabilities are unavailable. The dev/CI smoke path (pnpm smoke:ci, which boots wrangler dev with SUPABASE_MODE=stub) also auto-registers the in-memory Supabase stub on first request when stage is non-production; production stages reject SUPABASE_MODE=stub outright in apps/api/src/db/createSupabaseClient.ts so no external API behavior is affected.
Admin and billing webhook routes are served directly by the main Worker.