v3 · Apache 2.0

The compiler that connects AI to your organization.

Auto-learns your schema and gives AI agents one governed query graph across your databases, APIs, source code, filesystems, and workflows. Ask about the data, the services that enrich it, the code that touches it, and the files around it, all from one MCP-ready binary.

One config defines what agents can discover, query, execute, edit, and never touch. Humans can audit it. Models can inspect it. GraphJin enforces it across your entire AI surface.

Databases · APIs · Source code · Files · Workflows · Security posture · MCP + GraphQL
npx graphjin serve
DATABASES

Works with all your databases. And more.

Point GraphJin at as many systems as you need — Postgres for users, MySQL for orders, Snowflake for analytics, MongoDB for events, HTTP APIs for remote services, object storage for files, and CodeSQL for source trees — and query them through a single GraphQL endpoint. Joins, remote joins, subscriptions, search, and mutations compose across systems in one request, so an AI assistant can reason across the data, APIs, files, and code without learning every backend.
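As a sketch of that composition (the schema here is hypothetical: a users table in Postgres joined to an events collection in MongoDB), one request can span both sources:

cross_source.graphql graphql
# hypothetical schema: users lives in Postgres, events in MongoDB
query {
  users(limit: 5) {
    id
    full_name
    # nested selection resolved from the MongoDB source
    events(limit: 10, order_by: { created_at: desc }) {
      kind
      created_at
    }
  }
}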

  • PostgreSQL
  • MySQL
  • MariaDB
  • MongoDB
  • SQLite
  • SQL Server
  • Oracle
  • CockroachDB
  • YugabyteDB
  • Snowflake
  • AWS Aurora
  • Cloud SQL
  • HTTP APIs
  • S3 / GCS / Files
  • Code
HOW IT WORKS

One compiler. Any system. Any client.

Point GraphJin at databases, object storage, source trees, and remote APIs. It learns the shape, compiles one GraphQL surface, enforces RBAC, and gives AI assistants, REST clients, and federated routers the same production-safe engine.
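As a concrete sketch, using the demo-style schema that appears elsewhere on this page, a nested request like the following is planned as a single database operation rather than one resolver call per field:

nested.graphql graphql
# compiles to one optimized SQL statement, not N+1 resolver calls
query {
  customers(limit: 3) {
    id
    full_name
    purchases {
      quantity
      product { price }
    }
  }
}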

AGENTIC GRAPHJIN

Auto-learn, compile, govern, audit.

GraphJin is the operating graph agents use to understand a real organization. It auto-learns the live surface, compiles GraphQL into database and source-backed work, and keeps policy visible enough for both humans and models to inspect.

gj_catalog → evidence → gj_security → validate/preview → governed action → observe/refresh
01 · Discover

gj_catalog

Find schemas, relationships, syntax, workflows, capabilities, examples, and evidence before choosing a path.

02 · Check

gj_security

Read effective policy and high-risk findings before config, workflow, file, code, or mutation actions.

03 · Validate

preview

Validate filters, inspect generated work, run approved workflows, or preview CodeSQL changes before applying.

04 · Act

governed surface

Execute through GraphQL, MCP, saved queries, workflows, and guarded source operations instead of raw credentials.

WHY GRAPHJIN

Built for the AI era, hardened for production.

A compiler — not a query parser, not a resolver framework. It learns the live shape of your systems, plans the work, and emits optimized database operations. The result is calmer code, fewer round-trips, and a governed integration point where agents can explore without guessing.

01

Auto-discovery

Introspects tables, columns, relationships, source metadata, and configured surfaces so agents start from live shape instead of pasted context.

02

Compiler, not resolvers

Nested GraphQL compiles into optimized database work. No N+1, no resolver sprawl, no ORM layer between the agent and the real system.

03

Governed exploration

RBAC, row filters, allow-lists, saved queries, read-only sources, security rows, and audit trails give agents room to move inside visible boundaries.

AI INTEGRATION

A native MCP server that starts with discovery.

GraphJin ships a Model Context Protocol server with the tools an assistant actually needs: catalog-first discovery, saved queries, where-clause validation, query repair, gj_security guidance, query execution, audit logs, and health checks. Same engine, same RBAC, same allow-lists — everything that protects your HTTP API protects the AI.

One install command wires GraphJin into Claude Desktop, Codex, or any MCP host. Tools are discoverable, narrow, and audited: agents search gj_catalog, inspect evidence and examples, validate filters, check gj_security, then run approved queries or workflows through the same governed surface.

For development, graphjin mcp runs over stdio. For team access, run it as a long-lived HTTP+SSE endpoint, gated by the same JWT or OIDC flow as the main API. No shell access, no raw SQL by default, no surprise mutations.

Claude Desktop (stdio) · Codex / Cursor (http+sse) · custom MCP hosts (spec) → GraphJin MCP (query_catalog · validate · repair · saved_query · gj_security; RBAC · allow-list · JWT) → compiler (GraphQL → SQL) → database
terminal
# install for Claude Code (or codex / cursor / custom)
graphjin mcp install --client claude --scope global --yes

# or set up against a hosted GraphJin
graphjin mcp setup https://api.example.com
claude_desktop_config.json json
// claude_desktop_config.json
{
  "mcpServers": {
    "graphjin": {
      "command": "graphjin",
      "args": ["mcp", "--config", "/etc/graphjin"],
      "env": {
        "GRAPHJIN_USER_ID": "system",
        "GRAPHJIN_USER_ROLE": "user"
      }
    }
  }
}
SECURITY MODEL

Safer agents, not smaller agents.

GraphJin makes agents safer by giving them explicit boundaries, not by making them blind. Agents can explore more of the live organization because policy, evidence, and action paths are inspectable and enforced.

One config defines the AI surface.

Humans can review and diff the policy. Models can inspect the same posture through gj_catalog and gj_security before acting. GraphJin enforces that policy across GraphQL, MCP, workflows, code, files, APIs, and databases.

One auditable config

Databases, sources, roles, MCP settings, saved queries, mutations, read-only boundaries, and workflow access live in one policy artifact.

Same auth everywhere

HTTP, WebSocket, SSE, CLI, workflows, and MCP land in the same request context before GraphJin compiles or executes work.

RBAC and row filters

Roles, table permissions, column blocks, automatic filters, and mutation limits are enforced inside the compiler.

Saved queries and allow-lists

Production agents can run named, reviewed query contracts instead of inventing arbitrary operations at runtime.

Read-only source boundaries

Filesystems, CodeSQL, databases, and control-plane tables can expose discovery without granting writes.

Preview before change

CodeSQL change sets require file hashes, exact ranges, old text, optional locks, and a preview/apply loop.
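A saved query is just a named, reviewed GraphQL file that agents invoke by name instead of composing operations at runtime. The sketch below assumes a file-per-query layout; the path and the total_spent column are illustrative, not confirmed schema:

config/queries/top_customers.graphql graphql
# reviewed contract; agents run it by name
# (path and total_spent column are hypothetical)
query top_customers($limit: Int = 5) {
  customers(order_by: { total_spent: desc }, limit: $limit) {
    id
    full_name
    total_spent
  }
}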

AI-POWERED QUERIES

Ask in plain English. Get real data back.

Claude Desktop, Codex, or any MCP client talks to GraphJin — GraphJin compiles the query, hits your database, and the assistant answers with rows it can reason over.

Claude Desktop

who's the top customer?

execute_graphql
{ customers { id full_name email purchases { quantity product { price } } } }
Done

Based on the purchase data, here are the top customers ranked by total spend:

Rank  Customer          Orders  Items  Total Spent
🥇    Antwan Friesen    20      124    $928.45
🥈    Lon Cruickshank   20      94     $586.50
🥉    Susana Schaefer   20      91     $580.72

Antwan Friesen is the top customer with almost $1,000 in purchases — about 60% more than the runner-up.

CODE INTELLIGENCE

CodeSQL: query your code as well

GraphJin turns databases, HTTP APIs, discovered metadata, source code, and filesystems into one governed graph for AI agents. CodeSQL lets agents ask where a column exists, which code references it, which symbol owns that reference, and what guarded change set would update it — with preview/apply checks instead of raw file writes.

An agent asks "Where is users.email used?" and runs:

gj_columns { code_db_refs { file { path } symbol { name } } }

api/invoices.ts source
export async function createInvoiceHandler(req) {
  return workflows.run("sync-invoices")
}

CodeSQL (tree-sitter + metadata) maintains a read-only SQLite graph with tables such as gj_columns, code_db_refs, code_symbols, code_docs, and code_files. Matched rows come back as data: the symbol createInvoiceHandler in api/invoices.ts, a column reference to users.email in api/users.go. The graph refreshes via live watch in dev and restart/sync in prod.
STORAGE

Files as queryable tables. Local, S3, or GCS.

GraphJin streams multipart uploads straight to local disk, S3, Cloudflare R2, or Google Cloud Storage. Each backend exposes a virtual table — list, stat, get, put, delete, presign — and joins seamlessly with the rest of your schema.

Uploads follow the graphql-multipart-request-spec: send a single request, GraphJin parses, validates, signs, and persists. Returned rows include the storage URL and metadata, ready for the next mutation or a presigned download.

Bring your own bucket: GCS uses Application Default Credentials, S3 respects the standard AWS chain, local writes go to a configured volume. Backends are pluggable behind one interface.

CLIENT (multipart/form-data) → GRAPHJIN UPLOAD (stream · validate · sign) → S3 (AWS · Cloudflare R2) / GCS (Google Cloud Storage) / LOCAL (disk volume) → fs table row (queryable · presign)
config/prod.yml yaml
# config/prod.yml
filesystems:
  - name: "media"
    type: s3            # local | s3 | gcs
    bucket: "graphjin-media"
    region: "us-east-1"
    prefix: "uploads/"

uploads:
  enabled: true
  storage: "media"
  storage_key_prefix: "avatars/{date}/"
  allowed_mime: ["image/*", "application/pdf"]
  max_size: 25000000  # 25 MB; digit-group underscores are not portable YAML
upload.graphql graphql
# graphql-multipart-request-spec
mutation ($file: Upload!) {
  avatars(insert: { file: $file, user_id: $auth.user_id }) {
    id
    file_url
    file_size
    content_type
  }
}
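Reading back is a query against the same virtual table. The field names below are a sketch, assuming the list/stat/presign operations surface as columns; they are not confirmed API:

media_list.graphql graphql
# sketch: field names are illustrative
query {
  media(limit: 10) {
    path
    size
    content_type
    presigned_url   # short-lived download link
  }
}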
REMOTE APIS

OpenAPI specs become first-class fields in your graph.

Drop a Stripe, GitHub, or internal-service OpenAPI 3 spec into the config directory. GraphJin parses it, classifies the operations, and exposes them alongside your tables — joinable on any column → parameter mapping. One GraphQL query, one response, even when half the data lives behind REST.

Auth is configured once per spec — bearer, basic, API key, OAuth2 client-credentials, or token-exchange — and tokens are cached transparently. Concurrency caps per-spec keep upstream rate limits respected.

Joins are declarative: tell GraphJin which column feeds which parameter and the result is a nested field, RBAC-aware, with the same compiler that generates your SQL planning the calls.

stripe.yaml (OpenAPI 3 spec) + Stripe (live REST endpoint) + customers (Postgres table) → GraphJin (load · classify · join; column → param map; bearer · oauth2 · keys; remote_api · db · workflows) → unified response: customers + invoices in one GraphQL request
config/openapi/stripe.yml yaml
# config/openapi/stripe.yml
base_url: "https://api.stripe.com"

auth:
  scheme: bearer
  token_url: "https://api.stripe.com/v1/oauth/token"
  cache_ttl: "55m"

# Map a DB column onto a REST path/query param,
# so a join is just GraphQL.
joins:
  - table: customers
    operation: listInvoices
    params:
      - column: stripe_customer_id
        param: customer
customer_with_invoices.graphql graphql
query ($id: ID!) {
  customers(id: $id) {
    full_name
    email

    # joined live from Stripe via OpenAPI spec
    invoices {
      id
      total
      status
      created
    }
  }
}
TOOLING

A CLI that fits the developer loop.

One binary covers everything: a dev server with auto schema discovery, a database toolchain (setup, diff, migrate, seed), a remote client that authenticates over OIDC device-code, and an MCP server. No tokens to copy, no frameworks to learn.

graphjin serve --demo starts a working example in seconds. graphjin cli setup opens the device-code login URL in your browser and persists a refreshable JWT for every subsequent command. Workflows can be invoked by name from CLI, MCP, REST, or another workflow.

Every subcommand respects the same config, the same RBAC, and the same allow-list. What runs in CI matches what runs in production.

$ graphjin: serve (--demo · prod · test · subscribe) · db (setup · diff · migrate · seed) · cli (setup · query · workflow · audit) · mcp (install · setup · info · plugin)
dev
# spin up against a demo database
graphjin serve --demo

# scaffold and migrate a real schema
graphjin db setup
graphjin db migrate
graphjin db seed
prod
# authenticate via OIDC device-code flow
graphjin cli setup https://api.example.com

# run a saved query against prod
graphjin cli query top_customers --limit 5

# exec a workflow (chained queries + JS)
graphjin cli workflow customer_report

# tail audit logs
graphjin cli audit --since 1h
SECURITY

OAuth, JWT, OIDC — and row-level rules.

JWT from Auth0, Firebase, Okta, or any JWKS endpoint. Header- or cookie-based sessions for legacy stacks. OIDC device-code login for the CLI and MCP. Whatever the source, every request lands in the same context, so one AI surface can be governed with RBAC, row-level filters, and audited policy.

Configure once. Every transport — HTTP, WebSocket, SSE, MCP — runs the same auth pipeline. Roles + row-level filters are authored in YAML and enforced inside the compiler, so even an agent-run workflow cannot read or write outside its lane.

The CLI and MCP authenticate via OIDC device-code: open a URL, approve, done. Tokens refresh automatically, and the resulting permissions are the same ones reflected back through catalog and security posture rows.

JWT (Auth0 · Firebase · Okta · JWKS) / header · API key (X-User-Id · X-Api-Key) / OIDC device-code (graphjin cli setup) → auth middleware (verify · introspect · cache: user · role · claims) → request context (RBAC · row-level rules)
config/prod.yml yaml
# config/prod.yml
auth:
  type: jwt
  jwt:
    provider: "auth0"          # or firebase, okta, custom
    audience: "https://api.example.com"
    jwks_url: "https://example.auth0.com/.well-known/jwks.json"
  cookie: "gj_session"

auth_login:
  enabled: true                # OIDC device-code login for CLI / MCP
  provider: "https://login.example.com"
  client_id: "graphjin-cli"
  scopes: ["openid", "email", "offline_access"]
config/roles.yml yaml
# config/roles.yml
roles:
  - name: anon
    tables:
      products: { query: { columns: [id, name, price] } }

  - name: user
    tables:
      orders:
        query:  { filters: ["{ user_id: { eq: $user_id } }"] }
        insert: { columns: [product_id, quantity] }
        update: { filters: ['{ status: { eq: "draft" } }'] }
REALTIME

Live queries with cursors that survive reconnects.

Subscribe with the same GraphQL you'd use for a query. GraphJin streams deltas over SSE or WebSockets, batches database polls into one statement, and emits cursors so clients can resume after a network hiccup without missing rows.

The subscription API is just queries with a cursor — no new schema, no resolver tree, no pub/sub bus to operate. Cursor-based pagination keeps feeds and chat-style UIs deterministic; the adaptive poll sizer keeps load predictable as subscriber count grows.

Multiple subscriptions can share a single WebSocket. Per-message timeouts and JWT expiry are enforced at the transport layer, not by hand-rolled middleware.

CLIENT → GRAPHJIN: subscribe { … } → DATABASE: listen / poll @ N ms → delta + cursor over SSE · WebSocket frame → repeat · resume by cursor
live_orders.graphql graphql
subscription LiveOrders($since: Cursor) {
  orders(
    where: { status: { eq: "open" } }
    after: $since
    first: 50
    order_by: { id: asc }
  ) {
    id
    total
    customer { id full_name }
    cursor
  }
}
config/prod.yml yaml
# config/prod.yml
subs_poll_duration: "2s"   # adaptive batched polling
subs_max_clients: 10000

# both transports active at once
http:
  sse: true
  websocket: true
FEATURES

Everything a governed AI surface needs.

One binary, one config file — compiler, catalog, MCP, auth, workflows, CodeSQL, subscriptions, and a CLI. The agent sees a map; the organization keeps the controls.

1

Auditable config for agent access across the AI surface.

8+

Databases supported through the same GraphQL surface.

0

Lines of resolver code. The compiler does the work.

Catalog discovery spine

Agents discover tables, columns, relationships, syntax, workflows, and safety notes through gj_catalog.

Compiler engine

GraphQL compiles into optimized database work, with cross-database composition when sources allow it.

Security posture graph

gj_security exposes policy rows and findings so agents can check risk before write-capable actions.

Live subscriptions

SSE and WebSocket transports with cursor-based resume.

Governed workflows

Discover approved workflows, inspect variable contracts, and execute through GraphQL, REST, MCP, or CLI.

Read-only replicas

Lock a database to query-only with a single config flag.

Remote API joins

Stitch in REST and GraphQL endpoints alongside your tables.

CodeSQL preview/apply

Source edits use hashes, exact ranges, old text, optional locks, and preview diffs before apply.

Auditable config

One YAML surface defines roles, sources, saved queries, MCP permissions, and read-only boundaries.

QUICKSTART

Run it in under a minute.

Pick your platform, copy the command, and you're querying. The demo flag ships a real schema and example queries so there's something to point an AI client at on the very first run.

$ npx graphjin serve --demo

Wire it into your AI client (one command)

Claude Code
graphjin mcp install --client claude --scope global --yes

OpenAI Codex
graphjin mcp install --client codex --scope global --yes

Prefer interactive setup? graphjin mcp install

GET STARTED

Two paths. Both end with queries running.

  1. Point to your database

     Configure the connection — PostgreSQL, MySQL, SQLite, MongoDB, Oracle, MSSQL.

  2. Auto-discover schema

     GraphJin introspects tables, columns, and relationships on boot.

  3. Start querying

     Joins, mutations, subscriptions, federation, MCP — all out of the box.

ADVANCED · SUPERGRAPH

Drop GraphJin into a federated supergraph.

Already running Apollo Router, Cosmo, or Hive? Flip one config flag and every primary-keyed table becomes a federation v2 subgraph — SDL with @key, @shareable, and @inaccessible directives, plus a working _service entry point. No resolver code.

config/prod.yml yaml
# config/prod.yml
federation:
  enabled: true
  version: v2.5
  keys:
    users: "id"
    products: "sku"
  • Generated SDL refreshes on schema change
  • Per-table key overrides; field-level @shareable / @inaccessible
  • Multiple GraphJin processes compose into one supergraph
  • Same RBAC + allow-lists apply to entity references