# Storyline AI Gateway - Implementation Plan
## Project Overview
A FastAPI-based gateway for e-learning modules (Articulate Storyline) to access LLM services (Gemini, OpenAI) with centralized authentication and rate limiting.
## Tech Stack
- **Framework**: FastAPI
- **Server**: Uvicorn
- **Rate Limiting**: SlowAPI (`slowapi` limiter)
- **Auth**: Header-based API Key (`X-API-Key`)
- **LLMs**: Google Gemini, OpenAI
## Directory Structure
- `app/main.py`: Application entry point and middleware configuration.
- `app/core/`: Configuration and utilities (limiter, settings).
- `app/api/deps.py`: Shared dependencies (authentication).
- `app/api/router.py`: API versioning and route aggregation.
- `app/api/endpoints/`: Route handlers.
  - `storyline.py`: Generic endpoint for Storyline modules.
  - `gemini.py`: Dedicated Gemini endpoint.
  - `openai.py`: Dedicated OpenAI endpoint.
## Configuration
Managed via `.env` file:
- `API_KEY`: Secret key for Storyline modules.
- `GOOGLE_API_KEY`: API key for Google Generative AI.
- `OPENAI_API_KEY`: API key for OpenAI.
- `PORT`: Server port (default 8000).
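As an illustration of how these variables map into application settings, here is a stdlib-only sketch (the real app may use a settings library instead; field names mirror the variables above):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Gateway configuration, read from environment variables (as populated from .env)."""
    api_key: str = os.getenv("API_KEY", "")
    google_api_key: str = os.getenv("GOOGLE_API_KEY", "")
    openai_api_key: str = os.getenv("OPENAI_API_KEY", "")
    port: int = int(os.getenv("PORT", "8000"))  # default 8000, per the plan


settings = Settings()
```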
## API Endpoints
All endpoints are versioned under `/api/v1`.
### 1. Gemini Chat
- **URL**: `/api/v1/gemini/chat`
- **Method**: POST
- **Headers**: `X-API-Key: <your_key>`
- **Body**: `{"prompt": "string", "context": "string"}`
### 2. OpenAI Chat
- **URL**: `/api/v1/openai/chat`
- **Method**: POST
- **Headers**: `X-API-Key: <your_key>`
- **Body**: `{"prompt": "string", "context": "string"}`
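Since both chat endpoints share the same contract, a test script could assemble a request like this; `BASE_URL` and `API_KEY` are placeholder values, not part of the plan:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # placeholder gateway host
API_KEY = "change-me"               # placeholder key

def build_chat_request(provider: str, prompt: str, context: str) -> urllib.request.Request:
    """Build a POST request for /api/v1/<provider>/chat ("gemini" or "openai")."""
    body = json.dumps({"prompt": prompt, "context": context}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/{provider}/chat",
        data=body,
        headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("gemini", "Summarize this slide.", "Module 3, slide 2")
```

The request is only constructed here, not sent; sending it would require the gateway to be running.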
## Rate Limiting
- Applied globally and per endpoint: **20 calls per minute**.
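The gateway itself uses the `slowapi` package for this, but the "20 calls per minute" semantics can be illustrated with a self-contained sliding-window sketch (names are illustrative):

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """At most `limit` calls per `window` seconds; illustrative only."""

    def __init__(self, limit: int = 20, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.calls = deque()  # timestamps of accepted calls

    def allow(self, now=None) -> bool:
        """Return True and record the call if it fits in the window, else False."""
        now = time.monotonic() if now is None else now
        # Evict timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.limit:
            return False
        self.calls.append(now)
        return True
```

Once a client exceeds 20 calls inside a 60-second window, further calls are rejected until older calls age out.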
## Future Steps
- Add logging (WandB or file-based).
- Implement response caching.
- Add more LLM providers (Anthropic, etc.).