Files
ai-gateway/SECURITY.md
T

463 lines
17 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Security Review — AI Gateway
**Date:** 2026-04-02
**Reviewer:** Claude Code (automated static analysis)
**Scope:** Full codebase (`d:\Antigravity Apps\ai_gateway`)
---
## Summary
| Severity | Count |
|---|---|
| Critical | 4 |
| High | 13 |
| Medium | 10 |
| Low | 9 |
| Informational | 3 |
| **Total** | **39** |
---
## Immediate Action Items
> Do these before any other work.
1. **Separate `API_KEY` and `SECRET_KEY`** — they are currently set to the same value; a leaked API key allows forging admin JWTs.
2. **Remove hardcoded fallback credentials** in `app/core/config.py` (`"admin123"`, `"storyline-secret-key-123"`).
3. **Add rate limiting to login** (`/auth/login`) — currently bruteforceable.
4. **Consider runtime secret injection** (Docker secrets, AWS Secrets Manager) so credentials are never written to disk at all.
---
## Critical
### C-1 — Plaintext Secrets in `.env` File
**File:** `.env` lines 110
Multiple live credentials stored in plaintext. The `.env` file is correctly listed in `.gitignore` and is **not committed to the repository**, so there is no git history exposure. However, the secrets still warrant attention:
| Variable | Issue |
|---|---|
| `API_KEY` | Same value as `SECRET_KEY` — two independent secrets should not share a value |
| `GOOGLE_API_KEY` | Live Google API key — should be scoped to minimum required APIs/quotas |
| `DATABASE_URL` | Full Supabase connection string with embedded password |
| `ADMIN_PASSWORD` | Plaintext in a file on disk; no rotation mechanism |
| `SECRET_KEY` | Identical to `API_KEY` — compromising one compromises both |
**Remaining concerns:**
- Anyone with filesystem access to the server (compromised container, misconfigured volume mount, log leak of `DATABASE_URL`) gets all credentials at once.
- `SECRET_KEY` and `API_KEY` being the same value means a leaked API key also forges JWTs.
**Fix:** Use a secrets manager or environment injection at runtime (Docker secrets, AWS Secrets Manager, Doppler) so credentials are never written to disk. At minimum, use separate values for `API_KEY` and `SECRET_KEY`.
---
### C-2 — Weak Default Credentials in Config
**File:** [app/core/config.py](app/core/config.py) lines 8, 13, 16
```python
API_KEY: str = os.getenv("API_KEY", "storyline-secret-key-123")
SECRET_KEY: str = os.getenv("SECRET_KEY", "your-super-secret-key-change-me")
ADMIN_PASSWORD: str = os.getenv("ADMIN_PASSWORD", "admin123")
```
If environment variables are absent, the app falls back to trivially guessable values.
**Fix:** Remove all default values. Raise a `RuntimeError` on startup if these are not set.
---
### C-3 — Fully Permissive CORS
**File:** [app/main.py](app/main.py) lines 4956
```python
CORSMiddleware(
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
```
`allow_origins=["*"]` combined with `allow_credentials=True` is both a browser-rejected combination and a clear signal of over-permissive intent. Any domain can make cross-origin requests.
**Fix:**
```python
allow_origins = os.getenv("CORS_ORIGINS", "https://your-frontend.com").split(",")
```
---
### C-4 — Hardcoded Admin Credentials with No Password Hashing
**File:** [app/api/endpoints/auth.py](app/api/endpoints/auth.py) lines 1426
```python
if form_data.username == "admin" and form_data.password == settings.ADMIN_PASSWORD:
```
- Password is plaintext comparison — `passlib` is a dependency but never used.
- No rate limiting on the login endpoint → bruteforceable.
- Single static admin account with no database backing.
**Fix:** Use a proper user table with bcrypt-hashed passwords. Add `@limiter.limit("5/minute")` to the login route.
---
## High
### H-1 — Debug Print Statements Leak Auth Info in Production
**File:** [app/api/deps.py](app/api/deps.py) lines 2328
```python
print(f"DEBUG: API Key received (prefix): {api_key[:5]}...")
```
First 5 chars of each API key are printed to stdout on every authenticated request. `ENVIRONMENT=production` is set in `.env`.
**Fix:** Remove all `print()` debug statements. Use structured logging gated on `DEBUG` level.
---
### H-2 — Request Logging Leaks URLs and Origins
**File:** [app/main.py](app/main.py) lines 4047
Every incoming request, including URL (which may contain query-param secrets) and `Origin` header, is logged at INFO level in production.
**Fix:** Gate debug logging behind `settings.ENVIRONMENT == "development"`. Sanitize URLs before logging.
---
### H-3 — No Rate Limiting on Admin Endpoints
**File:** [app/api/endpoints/admin.py](app/api/endpoints/admin.py) lines 24120
`POST /modules`, `GET /modules`, `POST /modules/{id}/rotate`, `DELETE /modules/{id}` — none have `@limiter.limit()`. Regular API endpoints use `settings.RATE_LIMIT` but admin does not.
**Fix:** Add `@limiter.limit("10/minute")` or stricter to all admin routes.
---
### H-4 — Duplicate Admin Endpoints with Inconsistent Auth
**File:** [app/api/admin_backend.py](app/api/admin_backend.py) and [app/api/endpoints/admin.py](app/api/endpoints/admin.py)
Two sets of module management endpoints are mounted to the same prefix. `admin_backend.py` uses API key auth (any valid module key can call it); `admin.py` uses JWT. This creates an auth bypass path.
**Fix:** Remove `admin_backend.py` endpoints. Consolidate to one set requiring JWT with an admin role check.
---
### H-5 — Exception Messages Returned to Clients
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py) lines 7179, 142143; similar in [app/api/endpoints/openai.py](app/api/endpoints/openai.py)
```python
except Exception as e:
return {"status": "error", "detail": f"Failed to parse text/plain as JSON: {str(e)}"}
```
Raw Python exception messages reveal framework internals, library versions, and data structures to callers.
**Fix:** Log the full exception server-side; return a generic message: `"Service error — please try again later"`.
---
### H-6 — No HTTPS Enforcement at Application Level
**File:** [docker-compose.yml](docker-compose.yml), [Caddyfile](Caddyfile)
TLS is terminated by Caddy, but the application itself has no HTTPS redirect or HSTS header. If Caddy is bypassed (direct port 8000 access), credentials are transmitted in plaintext.
**Fix:** Add an HTTPS enforcement middleware and set `Strict-Transport-Security: max-age=31536000; includeSubDomains` in all responses.
---
### H-7 — Unvalidated Query Parameter Casting
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py) lines 4661
```python
temperature=float(params.get("temperature", 0.7)),
top_k=int(params.get("top_k", 40)),
```
Invalid input raises `ValueError` → 500. No bounds checking: `temperature` could be `-999`, `top_k` could be `2147483647`.
**Fix:** Use Pydantic `Query()` with `Field(ge=0.0, le=2.0)` constraints and handle `ValueError` with a 422 response.
---
### H-8 — Prompt Injection via User-Controlled System Prompt
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py) lines 110112
```python
system_instruction = chat_data.system_prompt or GEMINI_SYSTEM_PROMPT
system_instruction += f"\n\nKnowledge Base:\n{chat_data.knowledge_base}"
```
Callers can fully replace the system prompt and append arbitrary content to it. This enables jailbreaking the model, data exfiltration via the LLM, and policy bypass.
**Fix:** Restrict `system_prompt` to an allowlist of predefined values. Never concatenate raw user input into a system instruction.
---
### H-9 — Stored XSS in Admin Panel (Module Names)
**File:** [app/static/admin.html](app/static/admin.html) line 608
```javascript
<div style="font-weight: 600;">${m.name}</div>
```
Module names from the API are interpolated directly into `innerHTML` via template literals. An attacker with a valid API key can create a module named `<img src=x onerror="...">` to exfiltrate the admin JWT from `localStorage`.
**Fix:** Use `element.textContent = m.name` or DOMPurify. Add a `Content-Security-Policy` header.
---
### H-10 — XSS via Inline Event Handler with API Key
**File:** [app/static/admin.html](app/static/admin.html) lines 617, 620
```javascript
<button onclick="viewKey('${m.secret_key}')">
```
The raw secret key is embedded in an `onclick` attribute. A key containing `');alert(1);//` would execute arbitrary JavaScript.
**Fix:** Use `data-*` attributes and event delegation. Fetch the key from the API via ID, never embed it in HTML.
---
### H-11 — Admin Panel Accessible Without Authentication
**File:** [app/main.py](app/main.py) lines 8486
```python
@app.get("/admin")
async def admin_panel():
return FileResponse("app/static/admin.html")
```
The admin UI is served to anyone without any auth check. Though the API calls require a JWT, the HTML surface itself reveals the full admin structure.
**Fix:** Add `Depends(get_current_user)` to this route, or serve the admin panel only from an authenticated session.
---
### H-12 — JWT Secret Identical to API Key
**File:** `.env` lines 1, 9
`SECRET_KEY` and `API_KEY` are set to the same value. A leaked API key immediately compromises JWT signing, allowing anyone to forge admin tokens.
**Fix:** Use independent, separately rotated secrets for JWT signing and API authentication.
---
### H-13 — No Token Revocation
**File:** [app/core/security.py](app/core/security.py)
Issued JWTs cannot be revoked. A compromised admin token is valid until it expires. No logout endpoint exists on the server.
**Fix:** Implement a token blacklist (Redis or DB table keyed on `jti` claim). Add a `/auth/logout` endpoint.
---
## Medium
### M-1 — JWT Token Stored in `localStorage`
**File:** [app/static/admin.html](app/static/admin.html) lines 503, 565
`localStorage` is accessible to any JavaScript on the page — including via XSS. Combined with findings H-9/H-10, this leads to full account takeover.
**Fix:** Use `httpOnly` session cookies. If `localStorage` must be kept, ensure CSP prevents inline scripts.
---
### M-2 — No CSRF Protection
State-changing admin operations (`POST`, `DELETE`) accept requests from any origin without a CSRF token. An attacker on a different page can trigger module creation or deletion if the admin is logged in.
**Fix:** Use `SameSite=Strict` cookies or implement double-submit CSRF tokens.
---
### M-3 — Timing Attack on API Key Comparison
**File:** [app/api/deps.py](app/api/deps.py) lines 3642
```python
if api_key == settings.API_KEY:
```
Python's `==` short-circuits on first differing byte, allowing statistical timing attacks to brute-force valid key prefixes.
**Fix:**
```python
import hmac
if hmac.compare_digest(api_key, settings.API_KEY):
```
---
### M-4 — Missing Security Headers
**File:** [app/main.py](app/main.py)
None of the following headers are set: `X-Content-Type-Options`, `X-Frame-Options`, `X-XSS-Protection`, `Content-Security-Policy`, `Strict-Transport-Security`.
**Fix:** Add a response middleware that sets these headers on every response.
---
### M-5 — Rate Limit Too Permissive and IP-Spoofable
**File:** [app/core/config.py](app/core/config.py) line 9
`RATE_LIMIT = "10/minute"` = 14,400 requests/day per IP. `slowapi` uses `X-Forwarded-For` which can be spoofed by clients.
**Fix:** Apply stricter per-user limits via JWT subject. Validate `X-Forwarded-For` against a trusted proxy list.
---
### M-6 — Database Credentials Exposed in Connection String
**File:** `.env` line 5
Full Supabase URL including username and password stored in `.env` as plaintext. A single log line that includes the `DATABASE_URL` exposes credentials.
**Fix:** Use IAM/token-based database auth where possible. At minimum, ensure `DATABASE_URL` is never logged.
---
### M-7 — No Input Length Limits on Admin Models
**File:** [app/api/endpoints/admin.py](app/api/endpoints/admin.py) lines 1227
`ModuleCreate` fields (`name`, `program`, `lob`, `job_code`) have no `max_length` constraint. Extremely long values can exhaust memory or trigger DB-level errors.
**Fix:** Add `Field(..., max_length=100)` to all string fields.
---
### M-8 — `datetime.utcnow()` Deprecated
**File:** [app/core/security.py](app/core/security.py) lines 21, 23
`datetime.utcnow()` is deprecated since Python 3.12.
**Fix:** Replace with `datetime.now(timezone.utc)`.
---
### M-9 — No Request Body Size Limit
Large prompt or `knowledge_base` payloads are not bounded, enabling memory exhaustion.
**Fix:** Add a `LimitUploadSize` Starlette middleware or configure `max_upload_size` in the ASGI server.
---
### M-10 — Swagger/ReDoc Disabled Only by Convention
**File:** [app/main.py](app/main.py) lines 2728
`docs_url` is set to `None` when `ENVIRONMENT != "development"`. If `ENVIRONMENT` is accidentally unset, full API documentation is publicly exposed.
**Fix:** Assert `ENVIRONMENT` is explicitly set to `"production"` on startup; fail hard otherwise.
---
## Low
### L-1 — Mock Responses Silently Returned in Production
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py), [app/api/endpoints/openai.py](app/api/endpoints/openai.py)
When `GOOGLE_API_KEY` or `OPENAI_API_KEY` is missing, a fake `"MOCK: …"` response is returned instead of an error. This hides misconfiguration in production.
**Fix:** Raise a `RuntimeError` on startup if required API keys are absent when `ENVIRONMENT=production`.
---
### L-2 — Inconsistent Error Response Format
Some endpoints return `{"status": "error", "detail": "…"}`, others raise `HTTPException`. Inconsistency makes error handling harder for clients.
**Fix:** Define a single error schema and use a global exception handler.
---
### L-3 — Unused Dependencies
**File:** [requirements.txt](requirements.txt)
`passlib` and `alembic` are declared but not used. Unused dependencies increase the attack surface.
**Fix:** Remove unused packages.
---
### L-4 — LLM Model Names Hardcoded
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py) line 117; [app/api/endpoints/openai.py](app/api/endpoints/openai.py) line 58
Model names (`gemini-2.5-flash`, `gpt-3.5-turbo`) are hardcoded. Switching models requires a code change.
**Fix:** Read from `GEMINI_MODEL` / `OPENAI_MODEL` env vars with sensible defaults.
---
### L-5 — Health Endpoint Has No Rate Limit
**File:** [app/api/endpoints/storyline.py](app/api/endpoints/storyline.py) lines 3133
`/health` is unauthenticated and unlimited. Not critical for public load-balancer health checks, but worth a basic rate limit to prevent enumeration/scanning.
---
### L-6 — Duplicate Module Endpoints (Maintenance Hazard)
**File:** [app/api/admin_backend.py](app/api/admin_backend.py), [app/api/endpoints/admin.py](app/api/endpoints/admin.py)
Two sets of module CRUD endpoints are registered. See H-4 for the security impact; consolidation also reduces maintenance burden.
---
### L-7 — Cache Not Invalidated After Key Rotation
**File:** [app/api/deps.py](app/api/deps.py)
`get_current_module()` uses a TTL cache. After rotating a module's secret key, the old key remains valid for up to 5 minutes.
**Fix:** Call `auth_cache.clear()` (or invalidate the specific key) inside the rotate endpoint.
---
### L-8 — No Mechanism to Rotate Global Secrets
There is no admin endpoint or tooling to rotate `API_KEY`, `SECRET_KEY`, or `ADMIN_PASSWORD` without redeploying.
**Fix:** Add a `/internal/admin/rotate-global-key` endpoint (JWT-protected, admin-only) or document the rotation procedure.
---
### L-9 — No Audit Log for Admin Actions
Module creation, deletion, and key rotation are not logged with actor identity and timestamp, making incident response and forensics difficult.
**Fix:** Log `actor`, `action`, `resource_id`, and `timestamp` on every state-changing admin operation.
---
## Informational
### I-1 — No Startup Configuration Validation
**File:** [app/core/config.py](app/core/config.py)
Required configuration values are not validated at startup. Misconfigurations surface as runtime errors under load rather than at boot time.
**Fix:** Add a `@app.on_event("startup")` hook that asserts all required settings are set and sane.
---
### I-2 — API Documentation Incomplete on Security
**File:** [api_documentation.md](api_documentation.md)
The API docs do not document authentication requirements per endpoint, rate limits, or expected error codes.
---
### I-3 — `admin.html` Token Not Checked for Expiry on Page Load
**File:** [app/static/admin.html](app/static/admin.html)
`localStorage.getItem('admin_token')` is used on load but the token's `exp` claim is not validated client-side. An expired token is sent to the API before the user sees a 401.
**Fix:** Decode the JWT payload (base64) client-side and redirect to login if `exp < Date.now()/1000`.
---
## Recommended Fix Priority
| Priority | Findings |
|---|---|
| Do now | C-1 (rotate credentials), C-4 (add login rate limit) |
| This sprint | C-2, C-3, H-1 through H-13 |
| Next sprint | M-1 through M-10 |
| Backlog | L-1 through L-9, I-1 through I-3 |