463 lines
17 KiB
Markdown
463 lines
17 KiB
Markdown
# Security Review — AI Gateway
|
||
|
||
**Date:** 2026-04-02
|
||
**Reviewer:** Claude Code (automated static analysis)
|
||
**Scope:** Full codebase (`d:\Antigravity Apps\ai_gateway`)
|
||
|
||
---
|
||
|
||
## Summary
|
||
|
||
| Severity | Count |
|
||
|---|---|
|
||
| Critical | 4 |
|
||
| High | 13 |
|
||
| Medium | 10 |
|
||
| Low | 9 |
|
||
| Informational | 3 |
|
||
| **Total** | **39** |
|
||
|
||
---
|
||
|
||
## Immediate Action Items
|
||
|
||
> Do these before any other work.
|
||
|
||
1. **Separate `API_KEY` and `SECRET_KEY`** — they are currently set to the same value; a leaked API key allows forging admin JWTs.
|
||
2. **Remove hardcoded fallback credentials** in `app/core/config.py` (`"admin123"`, `"storyline-secret-key-123"`).
|
||
3. **Add rate limiting to login** (`/auth/login`) — currently bruteforceable.
|
||
4. **Consider runtime secret injection** (Docker secrets, AWS Secrets Manager) so credentials are never written to disk at all.
|
||
|
||
---
|
||
|
||
## Critical
|
||
|
||
### C-1 — Plaintext Secrets in `.env` File
|
||
**File:** `.env` lines 1–10
|
||
|
||
Multiple live credentials stored in plaintext. The `.env` file is correctly listed in `.gitignore` and is **not committed to the repository**, so there is no git history exposure. However, the secrets still warrant attention:
|
||
|
||
| Variable | Issue |
|
||
|---|---|
|
||
| `API_KEY` | Same value as `SECRET_KEY` — two independent secrets should not share a value |
|
||
| `GOOGLE_API_KEY` | Live Google API key — should be scoped to minimum required APIs/quotas |
|
||
| `DATABASE_URL` | Full Supabase connection string with embedded password |
|
||
| `ADMIN_PASSWORD` | Plaintext in a file on disk; no rotation mechanism |
|
||
| `SECRET_KEY` | Identical to `API_KEY` — compromising one compromises both |
|
||
|
||
**Remaining concerns:**
|
||
- Anyone with filesystem access to the server (compromised container, misconfigured volume mount, log leak of `DATABASE_URL`) gets all credentials at once.
|
||
- `SECRET_KEY` and `API_KEY` being the same value means a leaked API key also forges JWTs.
|
||
|
||
**Fix:** Use a secrets manager or environment injection at runtime (Docker secrets, AWS Secrets Manager, Doppler) so credentials are never written to disk. At minimum, use separate values for `API_KEY` and `SECRET_KEY`.
|
||
|
||
---
|
||
|
||
### C-2 — Weak Default Credentials in Config
|
||
**File:** [app/core/config.py](app/core/config.py) lines 8, 13, 16
|
||
|
||
```python
|
||
API_KEY: str = os.getenv("API_KEY", "storyline-secret-key-123")
|
||
SECRET_KEY: str = os.getenv("SECRET_KEY", "your-super-secret-key-change-me")
|
||
ADMIN_PASSWORD: str = os.getenv("ADMIN_PASSWORD", "admin123")
|
||
```
|
||
|
||
If environment variables are absent, the app falls back to trivially guessable values.
|
||
|
||
**Fix:** Remove all default values. Raise a `RuntimeError` on startup if these are not set.
|
||
|
||
---
|
||
|
||
### C-3 — Fully Permissive CORS
|
||
**File:** [app/main.py](app/main.py) lines 49–56
|
||
|
||
```python
|
||
CORSMiddleware(
|
||
allow_origins=["*"],
|
||
allow_credentials=True,
|
||
allow_methods=["*"],
|
||
allow_headers=["*"],
|
||
)
|
||
```
|
||
|
||
`allow_origins=["*"]` combined with `allow_credentials=True` is both a browser-rejected combination and a clear signal of over-permissive intent. Any domain can make cross-origin requests.
|
||
|
||
**Fix:**
|
||
```python
|
||
allow_origins = os.getenv("CORS_ORIGINS", "https://your-frontend.com").split(",")
|
||
```
|
||
|
||
---
|
||
|
||
### C-4 — Hardcoded Admin Credentials with No Password Hashing
|
||
**File:** [app/api/endpoints/auth.py](app/api/endpoints/auth.py) lines 14–26
|
||
|
||
```python
|
||
if form_data.username == "admin" and form_data.password == settings.ADMIN_PASSWORD:
|
||
```
|
||
|
||
- Password is plaintext comparison — `passlib` is a dependency but never used.
|
||
- No rate limiting on the login endpoint → bruteforceable.
|
||
- Single static admin account with no database backing.
|
||
|
||
**Fix:** Use a proper user table with bcrypt-hashed passwords. Add `@limiter.limit("5/minute")` to the login route.
|
||
|
||
---
|
||
|
||
## High
|
||
|
||
### H-1 — Debug Print Statements Leak Auth Info in Production
|
||
**File:** [app/api/deps.py](app/api/deps.py) lines 23–28
|
||
|
||
```python
|
||
print(f"DEBUG: API Key received (prefix): {api_key[:5]}...")
|
||
```
|
||
|
||
First 5 chars of each API key are printed to stdout on every authenticated request. `ENVIRONMENT=production` is set in `.env`.
|
||
|
||
**Fix:** Remove all `print()` debug statements. Use structured logging gated on `DEBUG` level.
|
||
|
||
---
|
||
|
||
### H-2 — Request Logging Leaks URLs and Origins
|
||
**File:** [app/main.py](app/main.py) lines 40–47
|
||
|
||
Every incoming request, including URL (which may contain query-param secrets) and `Origin` header, is logged at INFO level in production.
|
||
|
||
**Fix:** Gate debug logging behind `settings.ENVIRONMENT == "development"`. Sanitize URLs before logging.
|
||
|
||
---
|
||
|
||
### H-3 — No Rate Limiting on Admin Endpoints
|
||
**File:** [app/api/endpoints/admin.py](app/api/endpoints/admin.py) lines 24–120
|
||
|
||
`POST /modules`, `GET /modules`, `POST /modules/{id}/rotate`, `DELETE /modules/{id}` — none have `@limiter.limit()`. Regular API endpoints use `settings.RATE_LIMIT` but admin does not.
|
||
|
||
**Fix:** Add `@limiter.limit("10/minute")` or stricter to all admin routes.
|
||
|
||
---
|
||
|
||
### H-4 — Duplicate Admin Endpoints with Inconsistent Auth
|
||
**File:** [app/api/admin_backend.py](app/api/admin_backend.py) and [app/api/endpoints/admin.py](app/api/endpoints/admin.py)
|
||
|
||
Two sets of module management endpoints are mounted to the same prefix. `admin_backend.py` uses API key auth (any valid module key can call it); `admin.py` uses JWT. This creates an auth bypass path.
|
||
|
||
**Fix:** Remove `admin_backend.py` endpoints. Consolidate to one set requiring JWT with an admin role check.
|
||
|
||
---
|
||
|
||
### H-5 — Exception Messages Returned to Clients
|
||
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py) lines 71–79, 142–143; similar in [app/api/endpoints/openai.py](app/api/endpoints/openai.py)
|
||
|
||
```python
|
||
except Exception as e:
|
||
return {"status": "error", "detail": f"Failed to parse text/plain as JSON: {str(e)}"}
|
||
```
|
||
|
||
Raw Python exception messages reveal framework internals, library versions, and data structures to callers.
|
||
|
||
**Fix:** Log the full exception server-side; return a generic message: `"Service error — please try again later"`.
|
||
|
||
---
|
||
|
||
### H-6 — No HTTPS Enforcement at Application Level
|
||
**File:** [docker-compose.yml](docker-compose.yml), [Caddyfile](Caddyfile)
|
||
|
||
TLS is terminated by Caddy, but the application itself has no HTTPS redirect or HSTS header. If Caddy is bypassed (direct port 8000 access), credentials are transmitted in plaintext.
|
||
|
||
**Fix:** Add an HTTPS enforcement middleware and set `Strict-Transport-Security: max-age=31536000; includeSubDomains` in all responses.
|
||
|
||
---
|
||
|
||
### H-7 — Unvalidated Query Parameter Casting
|
||
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py) lines 46–61
|
||
|
||
```python
|
||
temperature=float(params.get("temperature", 0.7)),
|
||
top_k=int(params.get("top_k", 40)),
|
||
```
|
||
|
||
Invalid input raises `ValueError` → 500. No bounds checking: `temperature` could be `-999`, `top_k` could be `2147483647`.
|
||
|
||
**Fix:** Use Pydantic `Query()` with `Field(ge=0.0, le=2.0)` constraints and handle `ValueError` with a 422 response.
|
||
|
||
---
|
||
|
||
### H-8 — Prompt Injection via User-Controlled System Prompt
|
||
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py) lines 110–112
|
||
|
||
```python
|
||
system_instruction = chat_data.system_prompt or GEMINI_SYSTEM_PROMPT
|
||
system_instruction += f"\n\nKnowledge Base:\n{chat_data.knowledge_base}"
|
||
```
|
||
|
||
Callers can fully replace the system prompt and append arbitrary content to it. This enables jailbreaking the model, data exfiltration via the LLM, and policy bypass.
|
||
|
||
**Fix:** Restrict `system_prompt` to an allowlist of predefined values. Never concatenate raw user input into a system instruction.
|
||
|
||
---
|
||
|
||
### H-9 — Stored XSS in Admin Panel (Module Names)
|
||
**File:** [app/static/admin.html](app/static/admin.html) line 608
|
||
|
||
```javascript
|
||
<div style="font-weight: 600;">${m.name}</div>
|
||
```
|
||
|
||
Module names from the API are interpolated directly into `innerHTML` via template literals. An attacker with a valid API key can create a module named `<img src=x onerror="...">` to exfiltrate the admin JWT from `localStorage`.
|
||
|
||
**Fix:** Use `element.textContent = m.name` or DOMPurify. Add a `Content-Security-Policy` header.
|
||
|
||
---
|
||
|
||
### H-10 — XSS via Inline Event Handler with API Key
|
||
**File:** [app/static/admin.html](app/static/admin.html) lines 617, 620
|
||
|
||
```javascript
|
||
<button onclick="viewKey('${m.secret_key}')">
|
||
```
|
||
|
||
The raw secret key is embedded in an `onclick` attribute. A key containing `');alert(1);//` would execute arbitrary JavaScript.
|
||
|
||
**Fix:** Use `data-*` attributes and event delegation. Fetch the key from the API via ID, never embed it in HTML.
|
||
|
||
---
|
||
|
||
### H-11 — Admin Panel Accessible Without Authentication
|
||
**File:** [app/main.py](app/main.py) lines 84–86
|
||
|
||
```python
|
||
@app.get("/admin")
|
||
async def admin_panel():
|
||
return FileResponse("app/static/admin.html")
|
||
```
|
||
|
||
The admin UI is served to anyone without any auth check. Though the API calls require a JWT, the HTML surface itself reveals the full admin structure.
|
||
|
||
**Fix:** Add `Depends(get_current_user)` to this route, or serve the admin panel only from an authenticated session.
|
||
|
||
---
|
||
|
||
### H-12 — JWT Secret Identical to API Key
|
||
**File:** `.env` lines 1, 9
|
||
|
||
`SECRET_KEY` and `API_KEY` are set to the same value. A leaked API key immediately compromises JWT signing, allowing anyone to forge admin tokens.
|
||
|
||
**Fix:** Use independent, separately rotated secrets for JWT signing and API authentication.
|
||
|
||
---
|
||
|
||
### H-13 — No Token Revocation
|
||
**File:** [app/core/security.py](app/core/security.py)
|
||
|
||
Issued JWTs cannot be revoked. A compromised admin token is valid until it expires. No logout endpoint exists on the server.
|
||
|
||
**Fix:** Implement a token blacklist (Redis or DB table keyed on `jti` claim). Add a `/auth/logout` endpoint.
|
||
|
||
---
|
||
|
||
## Medium
|
||
|
||
### M-1 — JWT Token Stored in `localStorage`
|
||
**File:** [app/static/admin.html](app/static/admin.html) lines 503, 565
|
||
|
||
`localStorage` is accessible to any JavaScript on the page — including via XSS. Combined with findings H-9/H-10, this leads to full account takeover.
|
||
|
||
**Fix:** Use `httpOnly` session cookies. If `localStorage` must be kept, ensure CSP prevents inline scripts.
|
||
|
||
---
|
||
|
||
### M-2 — No CSRF Protection
|
||
State-changing admin operations (`POST`, `DELETE`) accept requests from any origin without a CSRF token. An attacker on a different page can trigger module creation or deletion if the admin is logged in.
|
||
|
||
**Fix:** Use `SameSite=Strict` cookies or implement double-submit CSRF tokens.
|
||
|
||
---
|
||
|
||
### M-3 — Timing Attack on API Key Comparison
|
||
**File:** [app/api/deps.py](app/api/deps.py) lines 36–42
|
||
|
||
```python
|
||
if api_key == settings.API_KEY:
|
||
```
|
||
|
||
Python's `==` short-circuits on first differing byte, allowing statistical timing attacks to brute-force valid key prefixes.
|
||
|
||
**Fix:**
|
||
```python
|
||
import hmac
|
||
if hmac.compare_digest(api_key, settings.API_KEY):
|
||
```
|
||
|
||
---
|
||
|
||
### M-4 — Missing Security Headers
|
||
**File:** [app/main.py](app/main.py)
|
||
|
||
None of the following headers are set: `X-Content-Type-Options`, `X-Frame-Options`, `X-XSS-Protection`, `Content-Security-Policy`, `Strict-Transport-Security`.
|
||
|
||
**Fix:** Add a response middleware that sets these headers on every response.
|
||
|
||
---
|
||
|
||
### M-5 — Rate Limit Too Permissive and IP-Spoofable
|
||
**File:** [app/core/config.py](app/core/config.py) line 9
|
||
|
||
`RATE_LIMIT = "10/minute"` = 14,400 requests/day per IP. `slowapi` uses `X-Forwarded-For` which can be spoofed by clients.
|
||
|
||
**Fix:** Apply stricter per-user limits via JWT subject. Validate `X-Forwarded-For` against a trusted proxy list.
|
||
|
||
---
|
||
|
||
### M-6 — Database Credentials Exposed in Connection String
|
||
**File:** `.env` line 5
|
||
|
||
Full Supabase URL including username and password stored in `.env` as plaintext. A single log line that includes the `DATABASE_URL` exposes credentials.
|
||
|
||
**Fix:** Use IAM/token-based database auth where possible. At minimum, ensure `DATABASE_URL` is never logged.
|
||
|
||
---
|
||
|
||
### M-7 — No Input Length Limits on Admin Models
|
||
**File:** [app/api/endpoints/admin.py](app/api/endpoints/admin.py) lines 12–27
|
||
|
||
`ModuleCreate` fields (`name`, `program`, `lob`, `job_code`) have no `max_length` constraint. Extremely long values can exhaust memory or trigger DB-level errors.
|
||
|
||
**Fix:** Add `Field(..., max_length=100)` to all string fields.
|
||
|
||
---
|
||
|
||
### M-8 — `datetime.utcnow()` Deprecated
|
||
**File:** [app/core/security.py](app/core/security.py) lines 21, 23
|
||
|
||
`datetime.utcnow()` is deprecated since Python 3.12.
|
||
|
||
**Fix:** Replace with `datetime.now(timezone.utc)`.
|
||
|
||
---
|
||
|
||
### M-9 — No Request Body Size Limit
|
||
Large prompt or `knowledge_base` payloads are not bounded, enabling memory exhaustion.
|
||
|
||
**Fix:** Add a `LimitUploadSize` Starlette middleware or configure `max_upload_size` in the ASGI server.
|
||
|
||
---
|
||
|
||
### M-10 — Swagger/ReDoc Disabled Only by Convention
|
||
**File:** [app/main.py](app/main.py) lines 27–28
|
||
|
||
`docs_url` is set to `None` when `ENVIRONMENT != "development"`. If `ENVIRONMENT` is accidentally unset, full API documentation is publicly exposed.
|
||
|
||
**Fix:** Assert `ENVIRONMENT` is explicitly set to `"production"` on startup; fail hard otherwise.
|
||
|
||
---
|
||
|
||
## Low
|
||
|
||
### L-1 — Mock Responses Silently Returned in Production
|
||
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py), [app/api/endpoints/openai.py](app/api/endpoints/openai.py)
|
||
|
||
When `GOOGLE_API_KEY` or `OPENAI_API_KEY` is missing, a fake `"MOCK: …"` response is returned instead of an error. This hides misconfiguration in production.
|
||
|
||
**Fix:** Raise a `RuntimeError` on startup if required API keys are absent when `ENVIRONMENT=production`.
|
||
|
||
---
|
||
|
||
### L-2 — Inconsistent Error Response Format
|
||
Some endpoints return `{"status": "error", "detail": "…"}`, others raise `HTTPException`. Inconsistency makes error handling harder for clients.
|
||
|
||
**Fix:** Define a single error schema and use a global exception handler.
|
||
|
||
---
|
||
|
||
### L-3 — Unused Dependencies
|
||
**File:** [requirements.txt](requirements.txt)
|
||
|
||
`passlib` and `alembic` are declared but not used. Unused dependencies increase the attack surface.
|
||
|
||
**Fix:** Remove unused packages.
|
||
|
||
---
|
||
|
||
### L-4 — LLM Model Names Hardcoded
|
||
**File:** [app/api/endpoints/gemini.py](app/api/endpoints/gemini.py) line 117; [app/api/endpoints/openai.py](app/api/endpoints/openai.py) line 58
|
||
|
||
Model names (`gemini-2.5-flash`, `gpt-3.5-turbo`) are hardcoded. Switching models requires a code change.
|
||
|
||
**Fix:** Read from `GEMINI_MODEL` / `OPENAI_MODEL` env vars with sensible defaults.
|
||
|
||
---
|
||
|
||
### L-5 — Health Endpoint Has No Rate Limit
|
||
**File:** [app/api/endpoints/storyline.py](app/api/endpoints/storyline.py) lines 31–33
|
||
|
||
`/health` is unauthenticated and unlimited. Not critical for public load-balancer health checks, but worth a basic rate limit to prevent enumeration/scanning.
|
||
|
||
---
|
||
|
||
### L-6 — Duplicate Module Endpoints (Maintenance Hazard)
|
||
**File:** [app/api/admin_backend.py](app/api/admin_backend.py), [app/api/endpoints/admin.py](app/api/endpoints/admin.py)
|
||
|
||
Two sets of module CRUD endpoints are registered. See H-4 for the security impact; consolidation also reduces maintenance burden.
|
||
|
||
---
|
||
|
||
### L-7 — Cache Not Invalidated After Key Rotation
|
||
**File:** [app/api/deps.py](app/api/deps.py)
|
||
|
||
`get_current_module()` uses a TTL cache. After rotating a module's secret key, the old key remains valid for up to 5 minutes.
|
||
|
||
**Fix:** Call `auth_cache.clear()` (or invalidate the specific key) inside the rotate endpoint.
|
||
|
||
---
|
||
|
||
### L-8 — No Mechanism to Rotate Global Secrets
|
||
There is no admin endpoint or tooling to rotate `API_KEY`, `SECRET_KEY`, or `ADMIN_PASSWORD` without redeploying.
|
||
|
||
**Fix:** Add a `/internal/admin/rotate-global-key` endpoint (JWT-protected, admin-only) or document the rotation procedure.
|
||
|
||
---
|
||
|
||
### L-9 — No Audit Log for Admin Actions
|
||
Module creation, deletion, and key rotation are not logged with actor identity and timestamp, making incident response and forensics difficult.
|
||
|
||
**Fix:** Log `actor`, `action`, `resource_id`, and `timestamp` on every state-changing admin operation.
|
||
|
||
---
|
||
|
||
## Informational
|
||
|
||
### I-1 — No Startup Configuration Validation
|
||
**File:** [app/core/config.py](app/core/config.py)
|
||
|
||
Required configuration values are not validated at startup. Misconfigurations surface as runtime errors under load rather than at boot time.
|
||
|
||
**Fix:** Add a `@app.on_event("startup")` hook that asserts all required settings are set and sane.
|
||
|
||
---
|
||
|
||
### I-2 — API Documentation Incomplete on Security
|
||
**File:** [api_documentation.md](api_documentation.md)
|
||
|
||
The API docs do not document authentication requirements per endpoint, rate limits, or expected error codes.
|
||
|
||
---
|
||
|
||
### I-3 — `admin.html` Token Not Checked for Expiry on Page Load
|
||
**File:** [app/static/admin.html](app/static/admin.html)
|
||
|
||
`localStorage.getItem('admin_token')` is used on load but the token's `exp` claim is not validated client-side. An expired token is sent to the API before the user sees a 401.
|
||
|
||
**Fix:** Decode the JWT payload (base64) client-side and redirect to login if `exp < Date.now()/1000`.
|
||
|
||
---
|
||
|
||
## Recommended Fix Priority
|
||
|
||
| Priority | Findings |
|
||
|---|---|
|
||
| Do now | C-1 (rotate credentials), C-4 (add login rate limit) |
|
||
| This sprint | C-2, C-3, H-1 through H-13 |
|
||
| Next sprint | M-1 through M-10 |
|
||
| Backlog | L-1 through L-9, I-1 through I-3 |
|