
AI Provider Abstraction Layer

Date: 2025-11-10
Version: 1.0
Author: Software Architect
Status: Approved for Development

Overview

The AI Provider Abstraction Layer enables flexible AI provider switching, cost optimization through fallback strategies, and a seamless development experience with local models. This design supports an OpenAI GPT-5 → GPT-4.1 fallback in production and Ollama for local development, without changing application code.

Key Benefits:
  1. Provider flexibility: Swap AI providers without code changes
  2. Cost optimization: Automatic fallback to cheaper models
  3. Development experience: Local Ollama testing (no API costs)
  4. Future-proof: Easy to add Anthropic, Gemini, or custom models
  5. Resilience: Graceful degradation when providers fail

Architecture

Interface Design

// ai-provider.interface.ts
export interface AIProvider {
  /**
   * Generate journal prompts based on photo content
   * @param photoUrl - Public URL to photo (or base64)
   * @param context - Optional context (previous entries, user preferences)
   * @returns Array of 3 journal prompts (8-15 words each)
   */
  generatePrompts(photoUrl: string, context?: AIContext): Promise<AIPrompt[]>;

  /**
   * Suggest emotions based on photo and/or journal text
   * @param photoUrl - Public URL to photo
   * @param journalText - Optional journal text for sentiment analysis
   * @returns Array of 2-3 emotion IDs from 12-emotion taxonomy
   */
  suggestEmotions(photoUrl: string, journalText?: string): Promise<string[]>;

  /**
   * Provider name (for logging and debugging)
   */
  readonly providerName: string;

  /**
   * Whether this provider supports vision/image analysis
   */
  readonly supportsVision: boolean;
}

export interface AIContext {
  userId?: string;
  previousEntries?: string[];  // Recent journal entries for context
  timeOfDay?: 'morning' | 'afternoon' | 'evening' | 'night';
  location?: string;
}

export interface AIPrompt {
  text: string;           // "What made this moment special?"
  style: 'reflective' | 'gratitude' | 'storytelling' | 'question';
  confidence: number;      // 0-1 confidence score
}
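Because every implementation satisfies the same AIProvider contract, unit tests and local demos do not need a real backend. The sketch below is a hypothetical in-memory MockAIProvider (with the interface types inlined so the snippet stands alone) that returns canned data instantly:

```typescript
// mock-provider.ts — hypothetical test double; types inlined for self-containment
interface AIContext {
  userId?: string;
  previousEntries?: string[];
  timeOfDay?: 'morning' | 'afternoon' | 'evening' | 'night';
  location?: string;
}

interface AIPrompt {
  text: string;
  style: 'reflective' | 'gratitude' | 'storytelling' | 'question';
  confidence: number;
}

interface AIProvider {
  generatePrompts(photoUrl: string, context?: AIContext): Promise<AIPrompt[]>;
  suggestEmotions(photoUrl: string, journalText?: string): Promise<string[]>;
  readonly providerName: string;
  readonly supportsVision: boolean;
}

// Returns canned responses: no network calls, no API key, fully deterministic
class MockAIProvider implements AIProvider {
  readonly providerName = 'Mock';
  readonly supportsVision = false;

  async generatePrompts(_photoUrl: string, _context?: AIContext): Promise<AIPrompt[]> {
    return [
      { text: 'What made this moment special?', style: 'reflective', confidence: 1 },
      { text: 'What are you grateful for here?', style: 'gratitude', confidence: 1 },
      { text: 'Tell the story behind this photo.', style: 'storytelling', confidence: 1 },
    ];
  }

  async suggestEmotions(_photoUrl: string, _journalText?: string): Promise<string[]> {
    return ['happy'];
  }
}
```

Injecting such a double wherever an AIProvider is expected keeps tests deterministic and free of API costs.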

Implementation: OpenAI Provider

GPT-5 with GPT-4.1 Fallback

// openai-provider.ts
import OpenAI from 'openai';
import type { AIProvider, AIContext, AIPrompt } from './ai-provider.interface';

export class OpenAIProvider implements AIProvider {
  private client: OpenAI;
  readonly providerName = 'OpenAI';
  readonly supportsVision = true;

  constructor() {
    this.client = new OpenAI({
      apiKey: process.env.OPENAI_API_KEY,
    });
  }

  async generatePrompts(photoUrl: string, context?: AIContext): Promise<AIPrompt[]> {
    const systemPrompt = this.buildSystemPrompt();
    const userPrompt = this.buildUserPrompt(context);

    // Try GPT-5 first
    try {
      return await this.callOpenAI('gpt-5-turbo', systemPrompt, userPrompt, photoUrl);
    } catch (error) {
      // Fallback to GPT-4.1 on specific errors
      if (this.shouldFallback(error)) {
        console.warn('GPT-5 failed, falling back to GPT-4.1', error);
        return await this.callOpenAI('gpt-4-turbo', systemPrompt, userPrompt, photoUrl);
      }
      throw error;
    }
  }

  private async callOpenAI(
    model: string,
    systemPrompt: string,
    userPrompt: string,
    photoUrl: string
  ): Promise<AIPrompt[]> {
    const response = await this.client.chat.completions.create({
      model,
      messages: [
        { role: 'system', content: systemPrompt },
        {
          role: 'user',
          content: [
            { type: 'text', text: userPrompt },
            { type: 'image_url', image_url: { url: photoUrl } }
          ]
        }
      ],
      max_tokens: 200,
      temperature: 0.7,
      response_format: { type: 'json_object' }
    });

    const content = response.choices[0]?.message?.content;
    if (!content) throw new Error('No response from OpenAI');

    const parsed = JSON.parse(content);
    return this.parsePromptsResponse(parsed);
  }

  private buildSystemPrompt(): string {
    return `You are an empathetic journaling assistant that helps people reflect on their moments.

Your task:
1. Analyze the photo (people, setting, mood, time of day, activities)
2. Generate exactly 3 thoughtful journal prompts
3. Each prompt must be 8-15 words
4. Vary the style:
   - One reflective question (encourages deep thinking)
   - One gratitude prompt (focuses on appreciation)
   - One storytelling starter (encourages narrative)

Guidelines:
- Be specific to the photo content (avoid generic prompts)
- Use warm, encouraging language
- Focus on emotions and meaning, not just what's visible
- Avoid clichés or overly poetic language

Output format (JSON):
{
  "prompts": [
    { "text": "What made this moment special?", "style": "reflective", "confidence": 0.85 },
    { "text": "What are you grateful for in this scene?", "style": "gratitude", "confidence": 0.90 },
    { "text": "Tell the story of what happened before this photo.", "style": "storytelling", "confidence": 0.80 }
  ]
}`;
  }

  private buildUserPrompt(context?: AIContext): string {
    let prompt = 'Generate 3 journal prompts for this photo.';

    if (context?.timeOfDay) {
      prompt += ` It's ${context.timeOfDay}.`;
    }

    if (context?.previousEntries && context.previousEntries.length > 0) {
      prompt += ` Recent themes: ${context.previousEntries.slice(0, 3).join(', ')}.`;
    }

    return prompt;
  }

  private parsePromptsResponse(response: any): AIPrompt[] {
    if (!response.prompts || !Array.isArray(response.prompts)) {
      throw new Error('Invalid response format from OpenAI');
    }

    return response.prompts.map((p: any) => ({
      text: p.text,
      style: p.style || 'question',
      confidence: p.confidence || 0.5
    }));
  }

  private shouldFallback(error: any): boolean {
    // Fallback conditions
    return (
      error.code === 'model_not_found' ||       // GPT-5 not available yet
      error.code === 'rate_limit_exceeded' ||   // Rate limited on GPT-5
      error.status === 503 ||                   // Service unavailable
      error.status === 429                      // Too many requests
    );
  }

  async suggestEmotions(photoUrl: string, journalText?: string): Promise<string[]> {
    const systemPrompt = `You are an emotion detection assistant.
Analyze the photo and optional text to suggest 2-3 emotions from this list:
- happy, grateful, excited, peaceful (Positive)
- thoughtful, nostalgic, proud, loving (Reflective)
- sad, anxious, frustrated, hopeful (Challenging)

Return JSON: { "emotions": ["happy", "grateful"] }`;

    const userPrompt = journalText
      ? `Photo + Text: "${journalText.slice(0, 200)}"`
      : 'Analyze the photo to detect emotions.';

    try {
      // Note: callOpenAI parses the prompts schema, so emotions need their own call
      const response = await this.client.chat.completions.create({
        model: 'gpt-4-turbo',
        messages: [
          { role: 'system', content: systemPrompt },
          {
            role: 'user',
            content: [
              { type: 'text', text: userPrompt },
              { type: 'image_url', image_url: { url: photoUrl } }
            ]
          }
        ],
        max_tokens: 100,
        temperature: 0.3,
        response_format: { type: 'json_object' }
      });

      const content = response.choices[0]?.message?.content;
      if (!content) return [];

      const parsed = JSON.parse(content);
      return Array.isArray(parsed.emotions) ? parsed.emotions : [];
    } catch (error) {
      console.error('Emotion detection failed', error);
      return []; // Return empty array on failure
    }
  }
}
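The try/catch shape in generatePrompts generalizes beyond one provider. The helper below is a sketch under assumed names (withFallback is not part of the codebase above); it makes the fallback contract explicit: run the primary, consult a predicate on failure, and only then run the fallback:

```typescript
// Hypothetical generic fallback helper; the in-class version above inlines this logic
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  shouldFallback: (err: unknown) => boolean
): Promise<T> {
  try {
    return await primary();
  } catch (error) {
    if (shouldFallback(error)) {
      return fallback();  // Recoverable (e.g. GPT-5 unavailable → GPT-4.1)
    }
    throw error;          // Unrecoverable: surface it to the caller
  }
}
```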

Implementation: Ollama Provider

Local Development with Llama 3.3

// ollama-provider.ts
import type { AIProvider, AIContext, AIPrompt } from './ai-provider.interface';

export class OllamaProvider implements AIProvider {
  private baseUrl: string;
  readonly providerName = 'Ollama';
  readonly supportsVision = true; // LLaVA for vision

  constructor() {
    this.baseUrl = process.env.OLLAMA_BASE_URL || 'http://localhost:11434';
  }

  async generatePrompts(photoUrl: string, context?: AIContext): Promise<AIPrompt[]> {
    const systemPrompt = this.buildSystemPrompt();
    const userPrompt = this.buildUserPrompt(context);

    // Convert photo URL to base64 (Ollama expects base64)
    const photoBase64 = await this.fetchImageAsBase64(photoUrl);

    const response = await fetch(`${this.baseUrl}/api/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'llama3.3:70b',  // Or llava for vision
        messages: [
          { role: 'system', content: systemPrompt },
          {
            role: 'user',
            content: userPrompt,
            images: [photoBase64]
          }
        ],
        format: 'json',
        stream: false,
        options: {
          temperature: 0.7,
          num_predict: 200
        }
      })
    });

    if (!response.ok) {
      throw new Error(`Ollama error: ${response.statusText}`);
    }

    const data = await response.json();
    const content = data.message?.content;

    if (!content) throw new Error('No response from Ollama');

    const parsed = JSON.parse(content);
    return this.parsePromptsResponse(parsed);
  }

  private async fetchImageAsBase64(url: string): Promise<string> {
    const response = await fetch(url);
    const buffer = await response.arrayBuffer();
    const base64 = Buffer.from(buffer).toString('base64');
    return base64;
  }

  private buildSystemPrompt(): string {
    // Same as OpenAI (reusable)
    return `You are an empathetic journaling assistant...`;
  }

  private buildUserPrompt(context?: AIContext): string {
    // Same as OpenAI
    return 'Generate 3 journal prompts for this photo.';
  }

  private parsePromptsResponse(response: any): AIPrompt[] {
    // Same parsing logic as OpenAI
    if (!response.prompts || !Array.isArray(response.prompts)) {
      throw new Error('Invalid response format from Ollama');
    }

    return response.prompts.map((p: any) => ({
      text: p.text,
      style: p.style || 'question',
      confidence: p.confidence || 0.5
    }));
  }

  async suggestEmotions(photoUrl: string, journalText?: string): Promise<string[]> {
    // Similar to OpenAI implementation
    return ['happy']; // Placeholder
  }
}

Factory Pattern

Provider Selection

// ai-provider.factory.ts
import type { AIProvider } from './ai-provider.interface';
import { OpenAIProvider } from './openai-provider';
import { OllamaProvider } from './ollama-provider';

export function getAIProvider(): AIProvider {
  const provider = process.env.AI_PROVIDER || 'openai';

  switch (provider) {
    case 'ollama':
      return new OllamaProvider();

    case 'openai':
    default:
      return new OpenAIProvider();
  }
}

// Usage in API routes
export async function generatePromptsHandler(req: Request) {
  const provider = getAIProvider();
  const { photoUrl, context } = await req.json();

  const prompts = await provider.generatePrompts(photoUrl, context);
  return Response.json({ prompts });
}

Environment Configuration

Development (.env.development)

# Use Ollama for local development (no API costs)
AI_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434

# Supabase local instance
SUPABASE_URL=http://localhost:54321
SUPABASE_ANON_KEY=eyJ...

Staging (.env.staging)

# Use cheaper GPT-4.1 for staging
AI_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4-turbo  # Override to force GPT-4

SUPABASE_URL=https://staging-xxx.supabase.co
SUPABASE_ANON_KEY=eyJ...

Production (.env.production)

# Use GPT-5 with automatic GPT-4.1 fallback
AI_PROVIDER=openai
OPENAI_API_KEY=sk-...
# No model override - uses GPT-5 → GPT-4.1 fallback logic

SUPABASE_URL=https://xxx.supabase.co
SUPABASE_ANON_KEY=eyJ...

Usage Examples

API Route: Generate Prompts

// /api/ai/generate-prompts.ts
import { getAIProvider } from '@/lib/ai/ai-provider.factory';
import type { NextRequest } from 'next/server';

export const runtime = 'edge';

export async function POST(req: NextRequest) {
  try {
    const { photoUrl, userId, entryId } = await req.json();

    // Validate auth (middleware)
    const user = await authenticateRequest(req);
    if (!user) {
      return Response.json({ error: 'Unauthorized' }, { status: 401 });
    }

    // Get AI provider (factory handles environment)
    const provider = getAIProvider();

    // Optional context
    const context = {
      userId: user.id,
      timeOfDay: getTimeOfDay(),
      // Could fetch recent entries for better context
    };

    // Generate prompts
    const prompts = await provider.generatePrompts(photoUrl, context);

    // Cache prompts in database (no regeneration)
    await savePromptsToDatabase(entryId, prompts);

    return Response.json({
      prompts,
      provider: provider.providerName
    });
  } catch (error) {
    console.error('AI prompt generation failed', error);
    return Response.json(
      { error: 'Failed to generate prompts', fallback: getFallbackPrompts() },
      { status: 500 }
    );
  }
}

function getFallbackPrompts(): AIPrompt[] {
  // Generic fallback prompts if AI fails
  return [
    { text: 'What happened in this moment?', style: 'reflective', confidence: 0.5 },
    { text: 'What are you grateful for today?', style: 'gratitude', confidence: 0.5 },
    { text: 'Tell the story behind this photo.', style: 'storytelling', confidence: 0.5 }
  ];
}

function getTimeOfDay(): 'morning' | 'afternoon' | 'evening' | 'night' {
  const hour = new Date().getHours();
  if (hour < 12) return 'morning';
  if (hour < 17) return 'afternoon';
  if (hour < 21) return 'evening';
  return 'night';
}

iOS Client Usage

// AIService.swift
struct AIService {
    private let baseURL = "https://api.overengineered.app"

    func generatePrompts(for photo: UIImage, entryId: String) async throws -> [AIPrompt] {
        // 1. Upload photo to Supabase Storage
        let photoUrl = try await uploadPhoto(photo, entryId: entryId)

        // 2. Trigger async AI job
        let jobId = try await createAIJob(photoUrl: photoUrl, entryId: entryId)

        // 3. Poll job status with exponential backoff
        return try await pollJobStatus(jobId: jobId)
    }

    private func createAIJob(photoUrl: String, entryId: String) async throws -> String {
        var request = URLRequest(url: URL(string: "\(baseURL)/api/ai/generate-prompts")!)
        request.httpMethod = "POST"
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        request.httpBody = try JSONEncoder().encode(["photoUrl": photoUrl, "entryId": entryId])

        let (data, _) = try await URLSession.shared.data(for: request)
        guard let json = try JSONSerialization.jsonObject(with: data) as? [String: Any],
              let jobId = json["jobId"] as? String else {
            throw AIError.jobFailed
        }
        return jobId
    }

    private func pollJobStatus(jobId: String) async throws -> [AIPrompt] {
        var delay: TimeInterval = 2.0  // Start with 2 seconds
        let maxDelay: TimeInterval = 16.0
        let timeout: TimeInterval = 30.0
        let startTime = Date()

        while Date().timeIntervalSince(startTime) < timeout {
            let job = try await fetchJob(jobId)

            if job.status == .completed {
                return job.prompts
            } else if job.status == .failed {
                throw AIError.jobFailed
            }

            // Exponential backoff: 2s, 4s, 8s, 16s
            try await Task.sleep(nanoseconds: UInt64(delay * 1_000_000_000))
            delay = min(delay * 2, maxDelay)
        }

        throw AIError.timeout
    }
}

struct AIPrompt: Codable {
    let text: String
    let style: PromptStyle
    let confidence: Double

    enum PromptStyle: String, Codable {
        case reflective, gratitude, storytelling, question
    }
}

Prompt Engineering

Journal Prompt Generation

Prompt Template:
const JOURNAL_PROMPT_SYSTEM = `You are an empathetic journaling assistant.

Task: Generate 3 journal prompts based on the photo.

Requirements:
1. Length: 8-15 words per prompt
2. Specificity: Reference photo content (people, setting, mood)
3. Variety: One reflective, one gratitude, one storytelling
4. Tone: Warm, encouraging, non-judgmental
5. Depth: Encourage meaningful reflection, not surface-level

Examples of GOOD prompts:
- "What made you smile in this moment?"
- "Who are you grateful to have shared this with?"
- "What happened right before you took this photo?"

Examples of BAD prompts:
- "What do you see?" (too generic)
- "Describe the weather and lighting conditions." (too objective)
- "What does this photo make you think about the meaning of life?" (too abstract)

Output JSON:
{
  "prompts": [
    { "text": "...", "style": "reflective", "confidence": 0.85 },
    { "text": "...", "style": "gratitude", "confidence": 0.90 },
    { "text": "...", "style": "storytelling", "confidence": 0.80 }
  ]
}`;

Emotion Detection

Prompt Template:
const EMOTION_DETECTION_SYSTEM = `You are an emotion detection AI.

Task: Analyze photo and/or text to suggest 2-3 emotions.

Emotion Taxonomy (12 emotions):
Positive: happy, grateful, excited, peaceful
Reflective: thoughtful, nostalgic, proud, loving
Challenging: sad, anxious, frustrated, hopeful

Guidelines:
- Suggest 2-3 emotions maximum (don't overwhelm)
- Base suggestions on visual cues (faces, setting, colors)
- If text provided, use sentiment analysis
- Consider context (time of day, location)
- Avoid extremes unless clearly present

Output JSON:
{
  "emotions": ["happy", "grateful"],
  "confidence": { "happy": 0.85, "grateful": 0.75 }
}`;
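Model output should not be trusted to stay inside the taxonomy. A small post-validation step drops unknown IDs and enforces the 2-3 cap; the helper below is a sketch (sanitizeEmotions is an assumed name, and the expected input is the { "emotions": [...] } shape from the template):

```typescript
// The 12-emotion taxonomy from the prompt template above
const EMOTION_TAXONOMY = new Set([
  'happy', 'grateful', 'excited', 'peaceful',
  'thoughtful', 'nostalgic', 'proud', 'loving',
  'sad', 'anxious', 'frustrated', 'hopeful',
]);

// Hypothetical validator: keep only known emotion IDs, capped at 3
function sanitizeEmotions(raw: unknown): string[] {
  if (typeof raw !== 'object' || raw === null) return [];
  const emotions = (raw as { emotions?: unknown }).emotions;
  if (!Array.isArray(emotions)) return [];
  return emotions
    .filter((e): e is string => typeof e === 'string' && EMOTION_TAXONOMY.has(e))
    .slice(0, 3);
}
```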

Error Handling

Graceful Degradation

// ai-service.ts
export async function generatePromptsWithFallback(
  photoUrl: string,
  context?: AIContext
): Promise<AIPrompt[]> {
  const provider = getAIProvider();

  try {
    // Try AI provider
    return await provider.generatePrompts(photoUrl, context);
  } catch (error) {
    console.error('AI generation failed, using fallback', error);

    // Track failure for monitoring
    trackAIFailure(provider.providerName, error);

    // Return generic fallback prompts
    return getFallbackPrompts();
  }
}

function getFallbackPrompts(): AIPrompt[] {
  return [
    {
      text: 'What happened in this moment?',
      style: 'reflective',
      confidence: 0.5
    },
    {
      text: 'What are you grateful for in this photo?',
      style: 'gratitude',
      confidence: 0.5
    },
    {
      text: 'Tell the story behind this moment.',
      style: 'storytelling',
      confidence: 0.5
    }
  ];
}

Retry Logic

export async function generatePromptsWithRetry(
  photoUrl: string,
  context?: AIContext,
  maxRetries = 2
): Promise<AIPrompt[]> {
  const provider = getAIProvider();
  let lastError: Error | null = null;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await provider.generatePrompts(photoUrl, context);
    } catch (error) {
      lastError = error as Error;
      console.warn(`AI attempt ${attempt + 1} failed`, error);

      // Exponential backoff
      if (attempt < maxRetries - 1) {
        await sleep(Math.pow(2, attempt) * 1000);
      }
    }
  }

  // All retries failed
  throw lastError || new Error('AI generation failed after retries');
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

Testing

Unit Tests (Vitest)

// ai-provider.test.ts
import { describe, it, expect, vi } from 'vitest';
import { OpenAIProvider } from './openai-provider';
import { OllamaProvider } from './ollama-provider';

describe('OpenAIProvider', () => {
  it('should generate 3 prompts', async () => {
    const provider = new OpenAIProvider();
    const prompts = await provider.generatePrompts('https://example.com/photo.jpg');

    expect(prompts).toHaveLength(3);
    expect(prompts[0].text).toMatch(/\w+/);
    expect(['reflective', 'gratitude', 'storytelling', 'question']).toContain(prompts[0].style);
  });

  it('should fallback to GPT-4.1 on model_not_found', async () => {
    const provider = new OpenAIProvider();

    // Mock GPT-5 failure
    vi.spyOn(provider as any, 'callOpenAI')
      .mockRejectedValueOnce({ code: 'model_not_found' })
      .mockResolvedValueOnce([{ text: 'Fallback prompt', style: 'reflective', confidence: 0.8 }]);

    const prompts = await provider.generatePrompts('https://example.com/photo.jpg');

    expect(prompts).toHaveLength(1);
    expect((provider as any).callOpenAI).toHaveBeenCalledTimes(2);
  });
});

describe('OllamaProvider', () => {
  it('should work with local Ollama instance', async () => {
    const provider = new OllamaProvider();
    const prompts = await provider.generatePrompts('https://example.com/photo.jpg');

    expect(prompts).toHaveLength(3);
  });
});

Integration Tests

// ai-integration.test.ts
describe('AI Provider Integration', () => {
  it('should switch providers based on environment', () => {
    process.env.AI_PROVIDER = 'ollama';
    const provider1 = getAIProvider();
    expect(provider1.providerName).toBe('Ollama');

    process.env.AI_PROVIDER = 'openai';
    const provider2 = getAIProvider();
    expect(provider2.providerName).toBe('OpenAI');
  });

  it('should handle provider failures gracefully', async () => {
    const prompts = await generatePromptsWithFallback('https://example.com/photo.jpg');

    // Should return fallback if provider fails
    expect(prompts).toHaveLength(3);
    expect(prompts[0].text).toContain('What happened');
  });
});

Performance Optimization

Caching Strategy

// Cache AI prompts in database (no regeneration)
export async function getCachedOrGeneratePrompts(
  entryId: string,
  photoUrl: string
): Promise<AIPrompt[]> {
  // Check database cache first
  const { data: cached } = await db
    .from('ai_prompts')
    .select('prompts')
    .eq('entry_id', entryId)
    .single();

  if (cached?.prompts) {
    return cached.prompts;
  }

  // Generate new prompts
  const prompts = await generatePromptsWithRetry(photoUrl);

  // Save to database
  await db
    .from('ai_prompts')
    .insert({
      entry_id: entryId,
      prompts,
      created_at: new Date().toISOString()
    });

  return prompts;
}

Request Deduplication

// Prevent multiple simultaneous requests for same entry
const activeRequests = new Map<string, Promise<AIPrompt[]>>();

export async function generatePromptsDeduped(
  entryId: string,
  photoUrl: string
): Promise<AIPrompt[]> {
  // Return existing promise if request in flight
  if (activeRequests.has(entryId)) {
    return activeRequests.get(entryId)!;
  }

  // Create new request
  const promise = generatePromptsWithRetry(photoUrl)
    .finally(() => activeRequests.delete(entryId));

  activeRequests.set(entryId, promise);
  return promise;
}

Cost Optimization

Token Usage Monitoring

export async function generatePromptsWithCostTracking(
  photoUrl: string,
  context?: AIContext
): Promise<{ prompts: AIPrompt[]; cost: number }> {
  const provider = getAIProvider();
  const startTime = Date.now();

  const prompts = await provider.generatePrompts(photoUrl, context);
  const duration = Date.now() - startTime;

  // Estimate cost (GPT-5: ~$0.01, GPT-4.1: ~$0.005)
  const estimatedCost = provider.providerName === 'OpenAI'
    ? 0.01
    : 0; // Ollama is free

  // Log for cost tracking
  await logAIUsage({
    provider: provider.providerName,
    operation: 'generatePrompts',
    duration,
    estimatedCost,
    timestamp: new Date()
  });

  return { prompts, cost: estimatedCost };
}
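The flat $0.01 estimate above ignores actual token usage. A finer-grained figure can be derived from the token counts the completion API reports; the sketch below uses illustrative per-1K-token rates (placeholders, not published pricing) and assumed model names:

```typescript
// Illustrative placeholder rates per 1K tokens — NOT published pricing
const RATES_PER_1K: Record<string, { input: number; output: number }> = {
  'gpt-5-turbo': { input: 0.01, output: 0.03 },
  'gpt-4-turbo': { input: 0.005, output: 0.015 },
};

// Estimate cost in dollars from prompt/completion token counts
function estimateCost(model: string, promptTokens: number, completionTokens: number): number {
  const rate = RATES_PER_1K[model];
  if (!rate) return 0;  // Unknown or local model (e.g. Ollama): treat as free
  return (promptTokens / 1000) * rate.input + (completionTokens / 1000) * rate.output;
}
```

The token counts would come from the `usage` field on the completion response, which the current implementation discards.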

Budget Alerts

// Check daily AI spend
export async function checkAIBudget(): Promise<boolean> {
  const today = new Date().toISOString().split('T')[0];

  const { data } = await db
    .from('ai_usage_logs')
    .select('estimated_cost')
    .gte('timestamp', today);

  const dailySpend = (data ?? []).reduce(
    (sum: number, row: { estimated_cost: number }) => sum + row.estimated_cost,
    0
  );
  const dailyBudget = 100; // $100/day limit

  if (dailySpend > dailyBudget) {
    // Alert via email/Slack
    await sendBudgetAlert(dailySpend, dailyBudget);
    return false;
  }

  return true;
}

Future Enhancements

Support for Additional Providers

// anthropic-provider.ts (future)
export class AnthropicProvider implements AIProvider {
  async generatePrompts(photoUrl: string): Promise<AIPrompt[]> {
    // Claude implementation
  }
}

// gemini-provider.ts (future)
export class GeminiProvider implements AIProvider {
  async generatePrompts(photoUrl: string): Promise<AIPrompt[]> {
    // Google Gemini implementation
  }
}

// Update factory
export function getAIProvider(): AIProvider {
  switch (process.env.AI_PROVIDER) {
    case 'anthropic': return new AnthropicProvider();
    case 'gemini': return new GeminiProvider();
    case 'ollama': return new OllamaProvider();
    default: return new OpenAIProvider();
  }
}

A/B Testing Different Prompts

export async function generatePromptsABTest(
  photoUrl: string,
  userId: string
): Promise<AIPrompt[]> {
  // Assign users to test groups
  const group = getUserABTestGroup(userId);

  if (group === 'variant_a') {
    // More reflective prompts
    return generatePromptsWithStyle('reflective');
  } else {
    // Standard mix
    return generatePromptsWithFallback(photoUrl);
  }
}
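The getUserABTestGroup helper referenced above is not defined in this document. A deterministic sketch (assumed names) hashes the userId so each user always lands in the same bucket:

```typescript
// Hypothetical deterministic bucketing: same userId always maps to the same group
function getUserABTestGroup(userId: string): 'variant_a' | 'control' {
  let hash = 0;
  for (const ch of userId) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;  // Simple 32-bit rolling hash
  }
  return hash % 2 === 0 ? 'variant_a' : 'control';
}
```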

Related Documents

  • System Design: /docs/architecture/system-design.md
  • Tech Stack: /docs/architecture/tech-stack.md
  • Async Processing: /docs/architecture/async-processing.md
  • API Contracts: /docs/architecture/api-contracts/ai.yaml
  • Development Setup: /docs/architecture/development-setup.md

END OF AI ABSTRACTION LAYER DOCUMENTATION