# Storyline AI Gateway - Implementation Plan

## Project Overview

A FastAPI-based gateway that lets e-learning modules (Articulate Storyline) access LLM services (Gemini, OpenAI) through centralized authentication and rate limiting.

## Tech Stack

- **Framework**: FastAPI
- **Server**: Uvicorn
- **Rate Limiting**: SlowAPI
- **Auth**: header-based API key (`X-API-Key`)
- **LLMs**: Google Gemini, OpenAI

## Directory Structure

- `app/main.py`: application entry point and middleware configuration.
- `app/core/`: configuration and utilities (limiter, settings).
- `app/api/deps.py`: shared dependencies (authentication).
- `app/api/router.py`: API versioning and route aggregation.
- `app/api/endpoints/`:
  - `storyline.py`: generic endpoint.
  - `gemini.py`: dedicated Gemini endpoint.
  - `openai.py`: dedicated OpenAI endpoint.

## Configuration

Managed via a `.env` file:

- `API_KEY`: secret key shared with Storyline modules.
- `GOOGLE_API_KEY`: API key for Google Generative AI.
- `OPENAI_API_KEY`: API key for OpenAI.
- `PORT`: server port (default: 8000).

## API Endpoints

All endpoints are versioned under `/api/v1`.

### 1. Gemini Chat

- **URL**: `/api/v1/gemini/chat`
- **Method**: POST
- **Headers**: `X-API-Key: <API_KEY>`
- **Body**: `{"prompt": "string", "context": "string"}`

### 2. OpenAI Chat

- **URL**: `/api/v1/openai/chat`
- **Method**: POST
- **Headers**: `X-API-Key: <API_KEY>`
- **Body**: `{"prompt": "string", "context": "string"}`

## Rate Limiting

Applied globally and per endpoint via SlowAPI: **20 calls per minute**.

## Future Steps

- Add logging (WandB or file-based).
- Implement response caching.
- Add more LLM providers (Anthropic, etc.).
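## Appendix: Sketches

The `.env` file described above might look like this; all values are placeholders, not real keys:

```shell
# .env — placeholder values only
API_KEY=change-me-storyline-shared-secret
GOOGLE_API_KEY=your-google-generative-ai-key
OPENAI_API_KEY=your-openai-key
PORT=8000
```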