MemoryNode API usage

Canonical HTTP reference for the memorynode-api Cloudflare Worker. Source of truth for behavior is code in apps/api/src/. The regenerated docs/external/openapi.yaml documents the public product surface (run pnpm openapi:gen) — internal, operator, and advanced routes may still exist in the Worker but are omitted from that contract.

  • Production base: https://api.memorynode.ai
  • Hosted MCP base: https://mcp.memorynode.ai (same Worker)
  • Staging base: https://api-staging.memorynode.ai
  • Local dev: http://127.0.0.1:8787

Current product focus is workflow outcomes for a primary cluster of AI-native products:

  • SaaS personalization copilots
  • AI companion products
  • Creator tools with persistent context

Developer agents are a secondary segment. Advanced surfaces (connectors, audit exports, search history, admin routes, etc.) may still exist on the Worker for operators and power users; they are intentionally omitted from the published OpenAPI contract — see section 9.

1. Authentication

Dispatched in apps/api/src/auth.ts and apps/api/src/workerApp.ts.

| Mode | Trigger | Validation | Uses |
| --- | --- | --- | --- |
| API key (K) | Authorization: Bearer <key> or x-api-key: <key> | SHA-256 of key + API_KEY_SALT then authenticate_api_key(p_key_hash) RPC | All /v1/* tenant routes |
| Dashboard session (S) | Cookie: mn_dash_session=<opaque> + x-csrf-token | Row in dashboard_sessions + CSRF double-submit | /v1/* tenant routes from browser |
| Admin (A) | x-admin-token | Equality against MASTER_ADMIN_TOKEN or HMAC-SHA256 signed form; optional IP allowlist (ADMIN_ALLOWED_IPS); ADMIN_BREAK_GLASS | /admin/, /v1/admin/ |
| PayU webhook (H) | form body | Reverse SHA-512 over PayU fields, or HMAC-SHA256 fallback over raw body keyed by PAYU_WEBHOOK_SECRET (x-payu-webhook-signature) | POST /v1/billing/webhook only |
| Internal MCP (I) | x-internal-mcp: 1 + x-internal-secret: <MCP_INTERNAL_SECRET> | Constant-time compare | Internal hosted-MCP → REST subrequests |
| Public (P) | | | /healthz, /ready, /v1/health |
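
As a client-side illustration of the API key mode (K), the sketch below builds request headers for either accepted trigger. The helper name `buildApiKeyHeaders` and the `style` option are hypothetical; they are not part of the MemoryNode codebase or any SDK.

```typescript
// Hypothetical helper: not part of the MemoryNode codebase or any SDK.
type ApiKeyHeaderStyle = "bearer" | "x-api-key";

function buildApiKeyHeaders(
  apiKey: string,
  style: ApiKeyHeaderStyle = "bearer",
): Record<string, string> {
  if (apiKey.trim() === "") {
    throw new Error("API key must not be empty");
  }
  // Mode K accepts either an Authorization bearer token or x-api-key.
  return style === "bearer"
    ? { Authorization: `Bearer ${apiKey}` }
    : { "x-api-key": apiKey };
}
```

Either header map can be merged into a `fetch` call against one of the base URLs listed at the top of this document.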

API keys created through POST /v1/api-keys are rate-limited at 15 RPM for the first 48 h after api_keys.created_at; after that the default is 60 RPM per key. See packages/shared/src/plans.ts.
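
The new-key throttle can be pictured with a small sketch. The constant and function names below are hypothetical, the values are copied from the paragraph above, and the treatment of the exact 48 h boundary is an assumption.

```typescript
// Values copied from the paragraph above; the real constants live in
// packages/shared/src/plans.ts, which this sketch does not import.
const NEW_KEY_RPM = 15;
const DEFAULT_KEY_RPM = 60;
const NEW_KEY_WINDOW_MS = 48 * 60 * 60 * 1000;

// Hypothetical helper: the per-key RPM that applies at `now` for a key
// whose api_keys.created_at is `createdAt`.
function effectiveKeyRpm(createdAt: Date, now: Date = new Date()): number {
  const ageMs = now.getTime() - createdAt.getTime();
  // Assumption: the lower limit applies strictly before the 48 h mark.
  return ageMs < NEW_KEY_WINDOW_MS ? NEW_KEY_RPM : DEFAULT_KEY_RPM;
}
```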

Webhook forwarding uses internal headers after signature verification, but that internal token path is route-bound to POST /v1/memories and is not accepted as a general-purpose auth bypass on other routes.

2. Middleware order

From handleRequestImpl in apps/api/src/workerApp.ts:

  1. Resolve the request id; assemble CORS and security headers.
  2. Short-circuit health endpoints (/healthz, /ready, /v1/health).
  3. CORS deny if Origin is not in ALLOWED_ORIGINS (except hosted MCP paths).
  4. enforceRuntimeConfigGuards and ensureRateLimitDo.
  5. OPTIONS short-circuit.
  6. assertBodySize (bounded by MAX_BODY_BYTES / MAX_IMPORT_BYTES).
  7. KNOWN_PATH_ALLOWED_METHODS 405 gate with Allow: header.
  8. Production dashboard ALLOWED_ORIGINS gate.
  9. createSupabaseClient(env) + db_access_path_selected log.
  10. Hosted MCP path → IP rate limit → handleHostedMcpRequest.
  11. Dashboard session POST/logout inline.
  12. Enforce route-level auth and handler dispatch via route().
  13. Build handlerDeps, instantiate factories, call route().
  14. 404 if route() returns null.
  15. Catch: ApiError-shaped response with error_code; else 500 INTERNAL.
  16. Finally: emitAuditLog, request_completed structured log, persistApiRequestEvent.

3. Error envelope

Errors emitted by the outer catch in handleRequestImpl (apps/api/src/workerApp.ts ~1379–1407) use:

{ "error": { "code": "STRING_CODE", "message": "human readable" } }

Typical codes: BAD_REQUEST, UNAUTHORIZED, FORBIDDEN, NOT_FOUND, PAYLOAD_TOO_LARGE, RATE_LIMITED, CAP_EXCEEDED, TRIAL_EXPIRED, COST_BUDGET_EXCEEDED, CONTROL_PLANE_ONLY, INTERNAL. Correlate responses with the x-request-id header.
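
A client can check for this envelope before trusting the body. The sketch below is one way to do that; the names `isApiErrorEnvelope` and `describeApiError` are hypothetical, and only the `error.code` and `error.message` fields shown above are assumed.

```typescript
interface ApiErrorEnvelope {
  error: { code: string; message: string };
}

// Type guard: true only when `value` has the documented envelope shape.
function isApiErrorEnvelope(value: unknown): value is ApiErrorEnvelope {
  if (typeof value !== "object" || value === null) return false;
  const err = (value as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return false;
  const { code, message } = err as { code?: unknown; message?: unknown };
  return typeof code === "string" && typeof message === "string";
}

// Hypothetical formatter that pairs the error with the x-request-id header.
function describeApiError(body: unknown, requestId: string | null): string {
  const id = requestId ?? "unknown";
  if (!isApiErrorEnvelope(body)) {
    return `unrecognized error body (request ${id})`;
  }
  return `${body.error.code}: ${body.error.message} (request ${id})`;
}
```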

4. Rate limits, concurrency, quotas

All defined in apps/api/src/auth.ts, apps/api/src/usage/quotaReservation.ts, and packages/shared/src/plans.ts.

| Control | Default | Source |
| --- | --- | --- |
| Per-key RPM | 60 (15 for new keys, 48 h) | RATE_LIMIT_MAX, RATE_LIMIT_RPM_NEW_KEY |
| Per-workspace RPM | 120 (300 for scale) | WORKSPACE_RPM_DEFAULT, WORKSPACE_RPM_SCALE |
| Per-workspace in-flight | 8 | WORKSPACE_CONCURRENCY_MAX, TTL 30000 ms |
| Cost/minute burst | 15 INR | WORKSPACE_COST_PER_MINUTE_CAP_INR |
| Daily and period caps | atomic via Postgres | reserve_usage_if_within_cap / commit_usage_reservation |
| Global AI cost budget | fail-closed (prod) | AI_COST_BUDGET_INR, 60 s cache |

When entitlement verification is unavailable, quota-consuming routes move into temporary read-only degradation (ENTITLEMENT_DEGRADED, HTTP 503) until billing checks recover.

Chat completions provider (Worker)

Optional multi-vendor chat routing (default OpenAI): CHAT_PROVIDER = openai | anthropic | gemini in apps/api/src/env.ts. Keys: ANTHROPIC_API_KEY, GEMINI_API_KEY (Google AI Studio); OPENAI_API_KEY still powers embeddings when EMBEDDINGS_MODE=openai and powers chat when CHAT_PROVIDER=openai. Optional CHAT_MODEL overrides the default cheap model for the active chat provider. Implementation: apps/api/src/llm/chatComplete.ts.

Embeddings are not swapped when changing CHAT_PROVIDER. The memory_chunks table stores embedding_provider, embedding_dimension, and embedding_version per chunk; vector search only compares rows matching the query embedding dimension (default 1536 for legacy OpenAI/stub). No bulk re-embed is required for existing data.
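
The dimension match can be pictured as a filter applied before any similarity comparison. This is an in-memory stand-in, not the SQL the RPCs run; the row shape, the `chunk_id` field, the string type of `embedding_version`, and the function name are all assumptions for illustration.

```typescript
// Field names follow the paragraph above; the row shape is a simplification.
interface ChunkEmbeddingMeta {
  chunk_id: string; // hypothetical identifier field
  embedding_provider: string;
  embedding_dimension: number;
  embedding_version: string; // type assumed
}

// Hypothetical stand-in for the dimension filter: only rows whose
// embedding_dimension equals the query's dimension are comparable.
function comparableChunks(
  chunks: ChunkEmbeddingMeta[],
  queryDimension: number = 1536,
): ChunkEmbeddingMeta[] {
  return chunks.filter((c) => c.embedding_dimension === queryDimension);
}
```

Under this model, rows embedded at a different dimension are simply skipped by vector recall rather than rewritten, which is consistent with no bulk re-embed being required.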

Memory lifecycle intelligence (automatic)

MemoryNode applies deterministic lifecycle rules on ingest, search, and nightly hygiene — no extra routes beyond normal writes and search.

| Signal | Behavior |
| --- | --- |
| Dedupe | Canonical hash, normalized near-text, or embedding similarity can return an existing memory_id with deduped: true and optional dedupe_kind (near, semantic) |
| Confidence / volatility | Persisted on the memory row; used in Worker ranking (not in raw SQL recall scores) |
| Ephemeral TTL | High-volatility text may get expires_at; expired rows move to archived via the hygiene job |
| Supersession | replaces_memory_id, conflict resolution, evolution merges, and admin hygiene set lifecycle_state: replaced |
| Retrieval learning | Search/context bump learning signals; POST /v1/feedback applies explicit positive/negative using the x-request-id from the search call |

Implementation: apps/api/src/memories/memoryPolicyEngine.ts, apps/api/src/pipelines/search/ranking.ts, migrations 076–078.

What shows up in HTTP responses today:

| Surface | Fields |
| --- | --- |
| POST /v1/memories (dedupe hit) | { memory_id, stored: true, deduped: true, dedupe_kind? } |
| POST /v1/memories (new row) | memory_id, stored, chunks, extraction, optional intelligence.conflict_state, optional trace.memory_evolution_*, optional superseded_memory_id |
| POST /v1/search | results, pagination; no retrieval_trace in the JSON body |
| POST /v1/context | context_text, citations, pagination; same search pipeline, assembled for prompts |
| POST /v1/feedback | { feedback, request_id }, where feedback is "positive" or "negative" and request_id is the x-request-id header from search/context |

Retrieval architecture: Postgres RPCs (match_chunks_vector, match_chunks_text) perform recall only (similarity / ts_rank + filters). Confidence, lifecycle, type, learning, and freshness ranking run in the Worker after RRF fusion (078_search_recall_only_rpcs.sql).
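
For readers unfamiliar with RRF, the sketch below shows textbook reciprocal rank fusion over two recall lists. It is not the Worker's code: the function name is hypothetical, k = 60 is a commonly used default assumed here, and the confidence, lifecycle, and freshness ranking described above is left out.

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank),
// with rank starting at 1. k = 60 is a common default, assumed here.
function rrfFuse(rankedLists: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first; ties broken by id for determinism.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([id]) => id);
}
```

An id that appears in both the vector and the keyword list accumulates two contributions, so it tends to outrank ids found by only one recall path.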

Debugging without custom tooling:

  • Log x-request-id on every call (support and POST /v1/feedback correlation).
  • GET /v1/pruning/metrics — duplicate/chunk counts per workspace.
  • GET /v1/audit/log — paginated API audit rows.
  • Search body field explain is accepted by schema but not honored on POST /v1/search (always off in the handler today). Do not rely on ranking explain payloads in production integrations.
  • Chunk rows store embedding_provider, embedding_dimension, embedding_version (default 1536-dim OpenAI/stub behavior).
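
Because feedback is correlated through x-request-id, a client has to capture that header from the search or context response. A minimal sketch of building the feedback body follows; the helper name and the plain lower-cased header map are assumptions.

```typescript
type FeedbackSignal = "positive" | "negative";

interface FeedbackBody {
  feedback: FeedbackSignal;
  request_id: string;
}

// Hypothetical helper: `searchResponseHeaders` is a plain lower-cased
// header map captured from an earlier POST /v1/search or /v1/context call.
function buildFeedbackBody(
  searchResponseHeaders: Record<string, string | undefined>,
  feedback: FeedbackSignal,
): FeedbackBody {
  const requestId = searchResponseHeaders["x-request-id"];
  if (!requestId) {
    throw new Error("x-request-id missing; cannot correlate feedback");
  }
  return { feedback, request_id: requestId };
}
```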

5. Plans

From packages/shared/src/plans.ts. PlanId = "launch" | "build" | "deploy" | "scale".

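| Plan | INR | Period (d) | Writes | Reads | Embed tok | Gen tok | Storage GB | Retention (d) | Workspace RPM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| launch | 399 | 7 | 250 | 1 000 | 100 000 | 150 000 | 0.5 | 30 | 120 |
| build | 999 | 30 | 1 200 | 4 000 | 600 000 | 1 000 000 | 2 | 90 | 120 |
| deploy | 2 999 | 30 | 5 000 | 15 000 | 3 000 000 | 5 000 000 | 10 | 180 | 120 |
| scale | 8 999 | 30 | 20 000 | 60 000 | 12 000 000 | 20 000 000 | 50 | 365 | 300 |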
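
For quick client-side checks, the plan rows can be mirrored as a typed lookup. The values below are transcribed from the plan rows above; the field names, `PLAN_LIMITS`, and `readsRemaining` are hypothetical, and plans.ts remains the source of truth.

```typescript
type PlanId = "launch" | "build" | "deploy" | "scale";

interface PlanLimits {
  priceInr: number;
  periodDays: number;
  writes: number;
  reads: number;
  embedTokens: number;
  genTokens: number;
  storageGb: number;
  retentionDays: number;
  workspaceRpm: number;
}

// Transcribed for illustration only; not imported from plans.ts.
const PLAN_LIMITS: Record<PlanId, PlanLimits> = {
  launch: { priceInr: 399, periodDays: 7, writes: 250, reads: 1000, embedTokens: 100000, genTokens: 150000, storageGb: 0.5, retentionDays: 30, workspaceRpm: 120 },
  build: { priceInr: 999, periodDays: 30, writes: 1200, reads: 4000, embedTokens: 600000, genTokens: 1000000, storageGb: 2, retentionDays: 90, workspaceRpm: 120 },
  deploy: { priceInr: 2999, periodDays: 30, writes: 5000, reads: 15000, embedTokens: 3000000, genTokens: 5000000, storageGb: 10, retentionDays: 180, workspaceRpm: 120 },
  scale: { priceInr: 8999, periodDays: 30, writes: 20000, reads: 60000, embedTokens: 12000000, genTokens: 20000000, storageGb: 50, retentionDays: 365, workspaceRpm: 300 },
};

// Reads left in the current period, never below zero.
function readsRemaining(plan: PlanId, readsUsed: number): number {
  return Math.max(0, PLAN_LIMITS[plan].reads - readsUsed);
}
```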

Overage rates per 1 k / per 1 M tok / per GB-mo are hard-coded per plan in plans.ts:87-203.

Pricing experiment scaffolding is exposed via PRICING_EXPERIMENTS in plans.ts (usd_anchor, usage_starter, startup_bundle, annual_commit). These are packaging/presentation experiments; checkout remains on the operational INR plan set while experiments run.

6. Routes (tenant-facing)

Dispatch is in apps/api/src/router.ts. K = API key. S = dashboard session.

6.1 Memories

| Method | Path | Auth | Purpose |
| --- | --- | --- | --- |
| POST | /v1/memories | K/S | Create a memory; embed, chunk, optional extraction up to 10 child memories |
| POST | /v1/memories/conversation | K/S | Create from transcript or messages (transforms → /v1/memories) |
| GET | /v1/memories | K/S | Paginated list, filters: namespace, user_id, owner_id, owner_type, memory_type, start_time, end_time, metadata |
| GET | /v1/memories/:id | K/S | Single memory |
| DELETE | /v1/memories/:id | K/S | Cascade delete with chunks and links |
| POST | /v1/memories/:id/links | K/S | Create memory_links edge (unique per workspace) |
| DELETE | /v1/memories/:id/links | K/S | Delete edge |
| POST | /v1/ingest | K/S | Discriminated dispatch → memory / conversation / document-as-text |
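
The list filters map naturally onto query parameters. The sketch below builds a GET /v1/memories URL under that assumption; the helper name is hypothetical, all values are treated as strings, and `metadata` is left out because its query encoding is not described here.

```typescript
// Filter names come from the GET /v1/memories row above. Passing them as
// plain string query parameters is an assumption, not confirmed behavior.
interface MemoryListFilters {
  namespace?: string;
  user_id?: string;
  owner_id?: string;
  owner_type?: string;
  memory_type?: string;
  start_time?: string;
  end_time?: string;
}

function buildMemoryListUrl(baseUrl: string, filters: MemoryListFilters): string {
  const pairs: string[] = [];
  for (const [key, value] of Object.entries(filters)) {
    if (value === undefined) continue; // skip unset filters
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
  }
  const query = pairs.length > 0 ? `?${pairs.join("&")}` : "";
  return `${baseUrl}/v1/memories${query}`;
}
```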

6.2 Search and context

| Method | Path | Auth | Purpose |
| --- | --- | --- | --- |
| POST | /v1/search | K/S | Embed query, pgvector search, optional rerank; header x-save-history: 1 inserts search_query_history |
| POST | /v1/context | K/S | Search + context assembly with citations and linked memories |
| POST | /v1/context/feedback | K/S | Insert feedback row |
| POST | /v1/feedback | K/S | Apply explicit positive/negative signal to latest record by request_id |
| GET | /v1/search/history | K/S | Saved queries (paginated) |
| POST | /v1/search/replay | K/S | Rerun from history by query_id |
| PATCH | /v1/profile/pins | K/S | Update pinned memories |

6.3 Usage, audit, pruning

| Method | Path | Auth | Purpose |
| --- | --- | --- | --- |
| GET | /v1/usage/today | K/S | Caps vs consumed reads/writes/tokens; includes entitlement_active and entitlement_source (billing) |
| GET | /v1/audit/log | K/S | Paginated api_audit_log rows |
| GET | /v1/pruning/metrics | K/S | workspace_pruning_metrics RPC |

6.4 Connectors

Connector settings are currently an advanced/deprioritized surface, not part of the default activation journey.

| Method | Path | Auth |
| --- | --- | --- |
| GET | /v1/connectors/settings | K/S |
| PATCH | /v1/connectors/settings | K/S |

6.5 Workspaces and API keys

Console users create workspaces and API keys through the dashboard session routes in §6.7 (/v1/dashboard/bootstrap, /v1/dashboard/workspaces, /v1/dashboard/api-keys*). That family is what docs/external/openapi.yaml tracks.

Some deployments still accept operator paths such as POST /v1/workspaces (admin-scoped API key) and /v1/api-keys* (x-admin-token). They are intentionally omitted from the generated OpenAPI contract; treat them as internal/legacy unless your operator runbook says otherwise.

6.6 Billing

All PayU checkout flows use POST /v1/billing/checkout from the console. Legacy Stripe portal calls (POST /v1/billing/portal) may still return 410 Gone for old integrations; that path is omitted from the published OpenAPI contract.

| Method | Path | Auth | Notes |
| --- | --- | --- | --- |
| GET | /v1/billing/status | K/S | select workspace_entitlements |
| POST | /v1/billing/checkout | K/S | Body: {plan, firstname?, email?, phone?}. Inserts payu_transactions, computes SHA-512 request hash, returns {url, method:"POST", fields} for the PayU form. |
| POST | /v1/billing/webhook | H | PayU callback. Verifies reverse SHA-512 (or HMAC-SHA256 fallback), calls PayU verify API, upserts entitlements. |

6.7 Dashboard

Browser console routes use the dashboard session cookie (S) and request-scoped Supabase execution. Mutating POST routes also require a valid x-csrf-token (double-submit with the session bootstrap). JSON bodies use { ok: true, data: … } on success or { ok: false, error: { code, message, details? } } on failure unless noted.

| Method | Path | Auth | Notes |
| --- | --- | --- | --- |
| POST | /v1/dashboard/bootstrap | Supabase JWT in body | { access_token, workspace_name? }. Pre-cookie: resolves or creates the user’s default workspace via create_workspace when none exists; then the client calls /v1/dashboard/session with the chosen workspace_id. |
| POST | /v1/dashboard/session | Supabase JWT in body | {access_token, workspace_id}; verifies via SUPABASE_JWT_SECRET, inserts dashboard_sessions, sets HttpOnly cookie, returns csrf_token. |
| POST | /v1/dashboard/logout | S | Deletes session, clears cookie. |
| GET | /v1/dashboard/overview-stats | S | dashboard_console_overview_stats RPC. |
| GET | /v1/dashboard/workspaces | S | Lists workspaces the signed-in user belongs to (id, name, role). |
| POST | /v1/dashboard/workspaces | S + CSRF | { name }; create_workspace RPC. |
| GET | /v1/dashboard/api-keys | S | Query workspace_id? (defaults to active session workspace; must match session). list_api_keys RPC. |
| POST | /v1/dashboard/api-keys | S + CSRF | { workspace_id, name }; create_api_key RPC (returns plaintext key once). |
| POST | /v1/dashboard/api-keys/revoke | S + CSRF | { api_key_id }; revoke_api_key RPC. |
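
The dashboard success/failure envelope described at the start of this section maps onto a discriminated union. The sketch below is a hypothetical client helper, not code from the console; the type and function names are assumptions.

```typescript
type DashboardEnvelope<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string; details?: unknown } };

// Hypothetical helper: returns `data` on success, throws on failure.
function unwrapDashboard<T>(envelope: DashboardEnvelope<T>): T {
  if (envelope.ok) {
    return envelope.data;
  }
  throw new Error(`${envelope.error.code}: ${envelope.error.message}`);
}
```

Narrowing on `ok` lets the compiler verify that `data` is only read on the success branch.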

6.8 Health

| Method | Path | Auth | Notes |
| --- | --- | --- | --- |
| GET | /healthz | P | Validates critical env, returns version + embedding_model. |
| GET | /ready | P | Circuit-breaker-wrapped get_api_key_salt RPC. |
| GET | /v1/health | P | Same payload as /healthz. |

6.9 Admin

All routes in this section authenticate with x-admin-token.

| Method | Path | Notes |
| --- | --- | --- |
| POST | /admin/webhooks/reprocess | Rerun reconcilePayUWebhook for deferred events |
| POST | /admin/usage/reconcile | process_usage_reservation_refunds() RPC |
| POST | /admin/sessions/cleanup | Delete expired dashboard_sessions |
| POST | /admin/memory-hygiene | find_near_duplicate_memories(...); query: dry_run, limit, workspace_id |
| POST | /admin/memory-retention | Archive per retention; query: limit |
| GET | /v1/admin/billing/health | Billing health view |

6.10 MCP

  • POST /v1/mcp, POST /mcp → Streamable HTTP JSON-RPC (handleHostedMcpRequest in apps/api/src/mcpHosted.ts).
  • GET /v1/mcp, GET /mcp → browser landing or SSE.
  • DELETE /v1/mcp, DELETE /mcp → close session.

See docs/MCP_SERVER.md for the tool catalog.

7. Canonical flows

7.1 POST /v1/memories

auth → rate limit → concurrency lease → quota reserve → embed (EMBEDDINGS_MODE, versioned chunk metadata) → dedupe / conflict / evolution passes → insert memories + memory_chunks → optional LLM extraction via CHAT_PROVIDER (up to 10 child memories) → commit reservation → audit.

resolveQuotaForWorkspace uses the billing entitlement row source (workspace_entitlements).

Entitlement source changes are audited in workspace_entitlement_audit and write-protected after create.

7.2 POST /v1/search

auth → rate limit → read reservation → optional query embed → SQL recall (vector and/or keyword RPCs) → Worker RRF fusion + ranking (+ optional rerank) → implicit retrieval-learning signals → JSON { results, page, page_size, total, has_more } (no retrieval_trace in body). Header x-save-history: true stores history without trace snapshot.
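
The pagination fields in the search response can drive a simple paging loop. The sketch below assumes pages are numbered sequentially and that the next page is requested by number; neither detail is stated above, and the type and helper names are hypothetical.

```typescript
// Pagination fields as listed in the flow above; `results` is left opaque.
interface SearchPage<R = unknown> {
  results: R[];
  page: number;
  page_size: number;
  total: number;
  has_more: boolean;
}

// Hypothetical helper: the page number to request next, or null when done.
// Assumes sequential page numbering.
function nextPageNumber(page: SearchPage): number | null {
  return page.has_more ? page.page + 1 : null;
}
```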

7.3 POST /v1/billing/checkout

Insert payu_transactions row → build SHA-512 request hash (buildPayURequestHashInput) → return {url, method:"POST", fields}. The dashboard auto-submits the form.
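
To make the auto-submit step concrete, the sketch below renders the checkout response as an HTML form with hidden inputs. It is a hypothetical renderer, not the dashboard's code; the function names are illustrative and `fields` is assumed to be a flat map of string values.

```typescript
interface CheckoutResponse {
  url: string;
  method: "POST";
  fields: Record<string, string>; // assumed flat string map
}

// Escape the characters that matter inside double-quoted HTML attributes.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical renderer: the real dashboard may build the form differently.
function renderCheckoutForm(res: CheckoutResponse): string {
  const inputs = Object.entries(res.fields)
    .map(
      ([name, value]) =>
        `<input type="hidden" name="${escapeAttr(name)}" value="${escapeAttr(value)}">`,
    )
    .join("");
  return `<form action="${escapeAttr(res.url)}" method="${res.method}">${inputs}</form>`;
}
```

In a browser, the resulting markup would be attached to the page and submitted programmatically so the user lands on the PayU-hosted payment page.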

7.4 POST /v1/billing/webhook

Verify reverse SHA-512 (or HMAC-SHA256 fallback) → verifyPayUTransactionViaApi with retry and timeout → upsertWorkspaceEntitlementFromTransaction → 200. Idempotent via payu_webhook_events.

8. Client headers you may see

| Header | Meaning |
| --- | --- |
| x-request-id | Correlation id on every response |
| x-mn-resolved-container-tag | Resolved tenant container (debug) |
| x-mn-routing-mode | service-role, rpc-first, or rls-first (debug) |
| Retry-After | Seconds (429 responses) |
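
A client can turn Retry-After into a backoff delay. In the sketch below, the helper name is hypothetical and the 1 s fallback for a missing or unparsable header is an assumption, not documented behavior.

```typescript
// Hypothetical helper: how long to wait before retrying, in milliseconds,
// or null when the response is not a 429.
function retryDelayMs(status: number, retryAfter: string | null): number | null {
  if (status !== 429) return null;
  const raw = retryAfter === null ? "" : retryAfter.trim();
  const seconds = raw === "" ? NaN : Number(raw);
  // Assumed fallback: wait 1 s when the header is absent or not numeric.
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : 1000;
}
```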

9. Changes and drift

docs/external/openapi.yaml is generated from code by apps/api/scripts/generate_openapi.mjs. CI enforces OpenAPI drift with pnpm openapi:check. The generator intentionally documents a reduced product surface (core memory, search, context, usage, billing checkout/status, dashboard session/bootstrap/workspaces/api-keys, health, MCP). Operator routes (/admin/, /v1/admin/, PayU webhooks, connectors, audit, pruning, etc.) remain in the Worker and are described in this file where relevant, but are not duplicated as paths in OpenAPI.

Keep API truth docs aligned in the same PR (see .cursor/rules/documentation-governance.mdc).

Recent backend hardening keeps API behavior unchanged while making optional internals fail-safe in degraded or stubbed environments: learned-adjustment lookup, monthly LLM usage reads, retrieval-attribute loading, and async feedback persistence now no-op safely when dependent storage/query capabilities are unavailable. The dev/CI smoke path (pnpm smoke:ci, which boots wrangler dev with SUPABASE_MODE=stub) also auto-registers the in-memory Supabase stub on first request when stage is non-production; production stages reject SUPABASE_MODE=stub outright in apps/api/src/db/createSupabaseClient.ts so no external API behavior is affected.

Admin and billing webhook routes are served directly by the main Worker.