Files
ai-gateway/SECURITY.md
T

17 KiB
Raw Blame History

Security Review — AI Gateway

Date: 2026-04-02
Reviewer: Claude Code (automated static analysis)
Scope: Full codebase (d:\Antigravity Apps\ai_gateway)


Summary

Severity Count
Critical 4
High 13
Medium 10
Low 9
Informational 3
Total 39

Immediate Action Items

Do these before any other work.

  1. Separate API_KEY and SECRET_KEY — they are currently set to the same value; a leaked API key allows forging admin JWTs.
  2. Remove hardcoded fallback credentials in app/core/config.py ("admin123", "storyline-secret-key-123").
  3. Add rate limiting to login (/auth/login) — currently bruteforceable.
  4. Consider runtime secret injection (Docker secrets, AWS Secrets Manager) so credentials are never written to disk at all.

Critical

C-1 — Plaintext Secrets in .env File

File: .env lines 110

Multiple live credentials stored in plaintext. The .env file is correctly listed in .gitignore and is not committed to the repository, so there is no git history exposure. However, the secrets still warrant attention:

Variable Issue
API_KEY Same value as SECRET_KEY — two independent secrets should not share a value
GOOGLE_API_KEY Live Google API key — should be scoped to minimum required APIs/quotas
DATABASE_URL Full Supabase connection string with embedded password
ADMIN_PASSWORD Plaintext in a file on disk; no rotation mechanism
SECRET_KEY Identical to API_KEY — compromising one compromises both

Remaining concerns:

  • Anyone with filesystem access to the server (compromised container, misconfigured volume mount, log leak of DATABASE_URL) gets all credentials at once.
  • SECRET_KEY and API_KEY being the same value means a leaked API key also forges JWTs.

Fix: Use a secrets manager or environment injection at runtime (Docker secrets, AWS Secrets Manager, Doppler) so credentials are never written to disk. At minimum, use separate values for API_KEY and SECRET_KEY.


C-2 — Weak Default Credentials in Config

File: app/core/config.py lines 8, 13, 16

API_KEY: str = os.getenv("API_KEY", "storyline-secret-key-123")
SECRET_KEY: str = os.getenv("SECRET_KEY", "your-super-secret-key-change-me")
ADMIN_PASSWORD: str = os.getenv("ADMIN_PASSWORD", "admin123")

If environment variables are absent, the app falls back to trivially guessable values.

Fix: Remove all default values. Raise a RuntimeError on startup if these are not set.


C-3 — Fully Permissive CORS

File: app/main.py lines 4956

CORSMiddleware(
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

allow_origins=["*"] combined with allow_credentials=True is both a browser-rejected combination and a clear signal of over-permissive intent. Any domain can make cross-origin requests.

Fix:

allow_origins = os.getenv("CORS_ORIGINS", "https://your-frontend.com").split(",")

C-4 — Hardcoded Admin Credentials with No Password Hashing

File: app/api/endpoints/auth.py lines 1426

if form_data.username == "admin" and form_data.password == settings.ADMIN_PASSWORD:
  • Password is plaintext comparison — passlib is a dependency but never used.
  • No rate limiting on the login endpoint → bruteforceable.
  • Single static admin account with no database backing.

Fix: Use a proper user table with bcrypt-hashed passwords. Add @limiter.limit("5/minute") to the login route.


High

H-1 — Debug Print Statements Leak Auth Info in Production

File: app/api/deps.py lines 2328

print(f"DEBUG: API Key received (prefix): {api_key[:5]}...")

First 5 chars of each API key are printed to stdout on every authenticated request. ENVIRONMENT=production is set in .env.

Fix: Remove all print() debug statements. Use structured logging gated on DEBUG level.


H-2 — Request Logging Leaks URLs and Origins

File: app/main.py lines 4047

Every incoming request, including URL (which may contain query-param secrets) and Origin header, is logged at INFO level in production.

Fix: Gate debug logging behind settings.ENVIRONMENT == "development". Sanitize URLs before logging.


H-3 — No Rate Limiting on Admin Endpoints

File: app/api/endpoints/admin.py lines 24120

POST /modules, GET /modules, POST /modules/{id}/rotate, DELETE /modules/{id} — none have @limiter.limit(). Regular API endpoints use settings.RATE_LIMIT but admin does not.

Fix: Add @limiter.limit("10/minute") or stricter to all admin routes.


H-4 — Duplicate Admin Endpoints with Inconsistent Auth

File: app/api/admin_backend.py and app/api/endpoints/admin.py

Two sets of module management endpoints are mounted to the same prefix. admin_backend.py uses API key auth (any valid module key can call it); admin.py uses JWT. This creates an auth bypass path.

Fix: Remove admin_backend.py endpoints. Consolidate to one set requiring JWT with an admin role check.


H-5 — Exception Messages Returned to Clients

File: app/api/endpoints/gemini.py lines 7179, 142143; similar in app/api/endpoints/openai.py

except Exception as e:
    return {"status": "error", "detail": f"Failed to parse text/plain as JSON: {str(e)}"}

Raw Python exception messages reveal framework internals, library versions, and data structures to callers.

Fix: Log the full exception server-side; return a generic message: "Service error — please try again later".


H-6 — No HTTPS Enforcement at Application Level

File: docker-compose.yml, Caddyfile

TLS is terminated by Caddy, but the application itself has no HTTPS redirect or HSTS header. If Caddy is bypassed (direct port 8000 access), credentials are transmitted in plaintext.

Fix: Add an HTTPS enforcement middleware and set Strict-Transport-Security: max-age=31536000; includeSubDomains in all responses.


H-7 — Unvalidated Query Parameter Casting

File: app/api/endpoints/gemini.py lines 4661

temperature=float(params.get("temperature", 0.7)),
top_k=int(params.get("top_k", 40)),

Invalid input raises ValueError → 500. No bounds checking: temperature could be -999, top_k could be 2147483647.

Fix: Use Pydantic Query() with Field(ge=0.0, le=2.0) constraints and handle ValueError with a 422 response.


H-8 — Prompt Injection via User-Controlled System Prompt

File: app/api/endpoints/gemini.py lines 110112

system_instruction = chat_data.system_prompt or GEMINI_SYSTEM_PROMPT
system_instruction += f"\n\nKnowledge Base:\n{chat_data.knowledge_base}"

Callers can fully replace the system prompt and append arbitrary content to it. This enables jailbreaking the model, data exfiltration via the LLM, and policy bypass.

Fix: Restrict system_prompt to an allowlist of predefined values. Never concatenate raw user input into a system instruction.


H-9 — Stored XSS in Admin Panel (Module Names)

File: app/static/admin.html line 608

<div style="font-weight: 600;">${m.name}</div>

Module names from the API are interpolated directly into innerHTML via template literals. An attacker with a valid API key can create a module named <img src=x onerror="..."> to exfiltrate the admin JWT from localStorage.

Fix: Use element.textContent = m.name or DOMPurify. Add a Content-Security-Policy header.


H-10 — XSS via Inline Event Handler with API Key

File: app/static/admin.html lines 617, 620

<button onclick="viewKey('${m.secret_key}')">

The raw secret key is embedded in an onclick attribute. A key containing ');alert(1);// would execute arbitrary JavaScript.

Fix: Use data-* attributes and event delegation. Fetch the key from the API via ID, never embed it in HTML.


H-11 — Admin Panel Accessible Without Authentication

File: app/main.py lines 8486

@app.get("/admin")
async def admin_panel():
    return FileResponse("app/static/admin.html")

The admin UI is served to anyone without any auth check. Though the API calls require a JWT, the HTML surface itself reveals the full admin structure.

Fix: Add Depends(get_current_user) to this route, or serve the admin panel only from an authenticated session.


H-12 — JWT Secret Identical to API Key

File: .env lines 1, 9

SECRET_KEY and API_KEY are set to the same value. A leaked API key immediately compromises JWT signing, allowing anyone to forge admin tokens.

Fix: Use independent, separately rotated secrets for JWT signing and API authentication.


H-13 — No Token Revocation

File: app/core/security.py

Issued JWTs cannot be revoked. A compromised admin token is valid until it expires. No logout endpoint exists on the server.

Fix: Implement a token blacklist (Redis or DB table keyed on jti claim). Add a /auth/logout endpoint.


Medium

M-1 — JWT Token Stored in localStorage

File: app/static/admin.html lines 503, 565

localStorage is accessible to any JavaScript on the page — including via XSS. Combined with findings H-9/H-10, this leads to full account takeover.

Fix: Use httpOnly session cookies. If localStorage must be kept, ensure CSP prevents inline scripts.


M-2 — No CSRF Protection

State-changing admin operations (POST, DELETE) accept requests from any origin without a CSRF token. An attacker on a different page can trigger module creation or deletion if the admin is logged in.

Fix: Use SameSite=Strict cookies or implement double-submit CSRF tokens.


M-3 — Timing Attack on API Key Comparison

File: app/api/deps.py lines 3642

if api_key == settings.API_KEY:

Python's == short-circuits on first differing byte, allowing statistical timing attacks to brute-force valid key prefixes.

Fix:

import hmac
if hmac.compare_digest(api_key, settings.API_KEY):

M-4 — Missing Security Headers

File: app/main.py

None of the following headers are set: X-Content-Type-Options, X-Frame-Options, X-XSS-Protection, Content-Security-Policy, Strict-Transport-Security.

Fix: Add a response middleware that sets these headers on every response.


M-5 — Rate Limit Too Permissive and IP-Spoofable

File: app/core/config.py line 9

RATE_LIMIT = "10/minute" = 14,400 requests/day per IP. slowapi uses X-Forwarded-For which can be spoofed by clients.

Fix: Apply stricter per-user limits via JWT subject. Validate X-Forwarded-For against a trusted proxy list.


M-6 — Database Credentials Exposed in Connection String

File: .env line 5

Full Supabase URL including username and password stored in .env as plaintext. A single log line that includes the DATABASE_URL exposes credentials.

Fix: Use IAM/token-based database auth where possible. At minimum, ensure DATABASE_URL is never logged.


M-7 — No Input Length Limits on Admin Models

File: app/api/endpoints/admin.py lines 1227

ModuleCreate fields (name, program, lob, job_code) have no max_length constraint. Extremely long values can exhaust memory or trigger DB-level errors.

Fix: Add Field(..., max_length=100) to all string fields.


M-8 — datetime.utcnow() Deprecated

File: app/core/security.py lines 21, 23

datetime.utcnow() is deprecated since Python 3.12.

Fix: Replace with datetime.now(timezone.utc).


M-9 — No Request Body Size Limit

Large prompt or knowledge_base payloads are not bounded, enabling memory exhaustion.

Fix: Add a LimitUploadSize Starlette middleware or configure max_upload_size in the ASGI server.


M-10 — Swagger/ReDoc Disabled Only by Convention

File: app/main.py lines 2728

docs_url is set to None when ENVIRONMENT != "development". If ENVIRONMENT is accidentally unset, full API documentation is publicly exposed.

Fix: Assert ENVIRONMENT is explicitly set to "production" on startup; fail hard otherwise.


Low

L-1 — Mock Responses Silently Returned in Production

File: app/api/endpoints/gemini.py, app/api/endpoints/openai.py

When GOOGLE_API_KEY or OPENAI_API_KEY is missing, a fake "MOCK: …" response is returned instead of an error. This hides misconfiguration in production.

Fix: Raise a RuntimeError on startup if required API keys are absent when ENVIRONMENT=production.


L-2 — Inconsistent Error Response Format

Some endpoints return {"status": "error", "detail": "…"}, others raise HTTPException. Inconsistency makes error handling harder for clients.

Fix: Define a single error schema and use a global exception handler.


L-3 — Unused Dependencies

File: requirements.txt

passlib and alembic are declared but not used. Unused dependencies increase the attack surface.

Fix: Remove unused packages.


L-4 — LLM Model Names Hardcoded

File: app/api/endpoints/gemini.py line 117; app/api/endpoints/openai.py line 58

Model names (gemini-2.5-flash, gpt-3.5-turbo) are hardcoded. Switching models requires a code change.

Fix: Read from GEMINI_MODEL / OPENAI_MODEL env vars with sensible defaults.


L-5 — Health Endpoint Has No Rate Limit

File: app/api/endpoints/storyline.py lines 3133

/health is unauthenticated and unlimited. Not critical for public load-balancer health checks, but worth a basic rate limit to prevent enumeration/scanning.


L-6 — Duplicate Module Endpoints (Maintenance Hazard)

File: app/api/admin_backend.py, app/api/endpoints/admin.py

Two sets of module CRUD endpoints are registered. See H-4 for the security impact; consolidation also reduces maintenance burden.


L-7 — Cache Not Invalidated After Key Rotation

File: app/api/deps.py

get_current_module() uses a TTL cache. After rotating a module's secret key, the old key remains valid for up to 5 minutes.

Fix: Call auth_cache.clear() (or invalidate the specific key) inside the rotate endpoint.


L-8 — No Mechanism to Rotate Global Secrets

There is no admin endpoint or tooling to rotate API_KEY, SECRET_KEY, or ADMIN_PASSWORD without redeploying.

Fix: Add a /internal/admin/rotate-global-key endpoint (JWT-protected, admin-only) or document the rotation procedure.


L-9 — No Audit Log for Admin Actions

Module creation, deletion, and key rotation are not logged with actor identity and timestamp, making incident response and forensics difficult.

Fix: Log actor, action, resource_id, and timestamp on every state-changing admin operation.


Informational

I-1 — No Startup Configuration Validation

File: app/core/config.py

Required configuration values are not validated at startup. Misconfigurations surface as runtime errors under load rather than at boot time.

Fix: Add a @app.on_event("startup") hook that asserts all required settings are set and sane.


I-2 — API Documentation Incomplete on Security

File: api_documentation.md

The API docs do not document authentication requirements per endpoint, rate limits, or expected error codes.


I-3 — admin.html Token Not Checked for Expiry on Page Load

File: app/static/admin.html

localStorage.getItem('admin_token') is used on load but the token's exp claim is not validated client-side. An expired token is sent to the API before the user sees a 401.

Fix: Decode the JWT payload (base64) client-side and redirect to login if exp < Date.now()/1000.


Priority Findings
Do now C-1 (rotate credentials), C-4 (add login rate limit)
This sprint C-2, C-3, H-1 through H-13
Next sprint M-1 through M-10
Backlog L-1 through L-9, I-1 through I-3