Serverless Functions for AI Workflows
Build serverless functions for AI workflows — edge functions, API routes, cron jobs, and event-driven processing without managing servers.
Serverless architecture for AI workloads
Serverless functions run your code without managing servers. You deploy a function, and the platform handles scaling, infrastructure, and availability. For AI workflows this model is ideal: AI processing is bursty (occasional heavy usage with quiet periods), event-driven (triggered by user actions or schedules), and stateless (each request is independent). You pay only for execution time, with no idle server costs.

The main serverless platforms are Vercel Functions (built into Next.js), AWS Lambda, and Cloudflare Workers. This guide uses Vercel because it integrates seamlessly with Next.js, but the patterns apply everywhere.

Ask Claude Code: Create a Next.js project with TypeScript for serverless AI workflows. Set up the project with API routes at src/app/api/. Create a simple test function at src/app/api/health/route.ts that returns a JSON response with the current timestamp, the runtime environment (edge or nodejs), and the region (from Vercel's headers). Deploy to Vercel to verify the serverless infrastructure works.

There are two runtime options. Node.js runtime: a full Node.js environment, a 10-second default execution limit (configurable up to 60 seconds on Pro plans), a 4.5MB request body limit, and access to all npm packages. Edge runtime: runs on a globally distributed edge network, has near-instant cold starts, is limited to Web APIs (no Node.js-specific modules), has a 1MB code size limit on the Hobby plan, and executes in the region closest to the user for minimal latency.

Ask Claude Code: Create two versions of a text processing function — one running on the Node.js runtime and one on the Edge runtime. Compare the cold start times and response times. Document which runtime is appropriate for which use cases: Edge for latency-sensitive reads, Node.js for heavier processing.

Common error: serverless functions have execution time limits. A function that calls an AI API and waits 30 seconds for a response will time out on most plans. Use streaming responses or background processing for long-running AI tasks.
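The health route described above can be sketched as a standard App Router handler. This is a minimal sketch, not the generated code: the runtime export and the VERCEL_REGION environment variable are real Vercel conventions, but reading the region from an environment variable rather than a request header is a simplification.

```typescript
// Hypothetical sketch of src/app/api/health/route.ts.
export const runtime = "nodejs"; // switch to "edge" to compare the two runtimes

export async function GET(_request: Request): Promise<Response> {
  const body = {
    timestamp: new Date().toISOString(),
    runtime,
    // VERCEL_REGION is set by Vercel in production; locally it is undefined.
    region: process.env.VERCEL_REGION ?? "dev",
  };
  return new Response(JSON.stringify(body), {
    headers: { "Content-Type": "application/json" },
  });
}
```

App Router handlers accept a standard Request and return a standard Response, so this file has no framework imports and can be exercised locally before deploying.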
API routes for AI processing
Most AI workflows start with an API route that receives a request, processes it with an AI service, and returns the result.

Ask Claude Code: Create an AI text analysis API at src/app/api/analyse/route.ts. The endpoint accepts a POST request with a text body (up to 10,000 characters) and returns: a sentiment analysis (positive, negative, neutral with confidence scores), key topics extracted from the text (up to 5 topics), a readability score, and a one-paragraph summary. For the MVP, implement the analysis using heuristic algorithms (keyword matching for sentiment, TF-IDF-like scoring for topics, Flesch-Kincaid for readability, and extractive summarisation picking the most representative sentences). This avoids external API costs during development.

Add a route that calls an external AI API. Ask Claude Code: Create src/app/api/ai/summarise/route.ts that accepts text and returns an AI-generated summary. Use the Anthropic API (npm install @anthropic-ai/sdk). Set the API key as an environment variable. Send the text to Claude with a summarisation prompt. Stream the response back to the client using the Web Streams API — this means the user sees text appearing word by word instead of waiting for the entire response.

Implement streaming. Ask Claude Code: Create a streaming AI response handler. The route uses export const runtime = 'edge' for streaming support. Create a ReadableStream that receives chunks from the Anthropic API and forwards them to the client. On the client side, use fetch with response.body.getReader() to read the stream and update the UI progressively. This pattern is essential for AI responses — users tolerate waiting 5 seconds when they see text streaming, but abandon the page after 3 seconds of a blank loading spinner.

Add rate limiting. Ask Claude Code: Implement rate limiting for the AI API routes. Track requests per IP using Vercel's KV store (or an in-memory Map for development). Allow 10 requests per minute per IP. Return 429 Too Many Requests with a Retry-After header when the limit is exceeded.

Common error: streaming responses cannot set headers after the stream starts. Set all headers (Content-Type, Cache-Control, rate limit headers) when you construct the streaming Response, before the first chunk is written.
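The streaming pattern can be sketched with Web Streams alone. In this sketch, upstreamChunks is a hypothetical stand-in for the Anthropic SDK's streaming iterator; everything else uses only standard Web APIs, which is why it can run on the Edge runtime. Note that all headers are set when the Response is constructed, before any chunk is written.

```typescript
// Hypothetical sketch of a streaming route handler.
export const runtime = "edge";

// Stand-in for the AI SDK's streaming iterator: yields text chunks one at a time.
// In the real route this would wrap the Anthropic API's streamed response.
async function* upstreamChunks(text: string): AsyncGenerator<string> {
  for (const word of text.split(" ")) {
    yield word + " ";
  }
}

export async function POST(request: Request): Promise<Response> {
  const { text } = await request.json();
  const encoder = new TextEncoder();

  // Forward upstream chunks to the client as they arrive.
  const stream = new ReadableStream({
    async start(controller) {
      for await (const chunk of upstreamChunks(text)) {
        controller.enqueue(encoder.encode(chunk));
      }
      controller.close();
    },
  });

  // Headers must be declared here; they cannot change once streaming begins.
  return new Response(stream, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "Cache-Control": "no-store",
    },
  });
}
```

On the client, fetch this route and read response.body.getReader() in a loop, appending each decoded chunk to the UI.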
Cron jobs and scheduled functions
Cron jobs run functions on a schedule — daily reports, hourly data syncs, periodic cleanup tasks. Vercel Cron Jobs trigger your API routes on a schedule defined in vercel.json.

Ask Claude Code: Create a scheduled data aggregation function at src/app/api/cron/daily-report/route.ts. The function runs daily at 8 AM UTC and generates a daily business summary. It queries your data sources, calculates key metrics (new users, revenue, active sessions), compares to yesterday and last week, and sends the summary via email using Resend. Configure the cron schedule in vercel.json: add a crons array with the path /api/cron/daily-report and the schedule 0 8 * * * (cron syntax for 8 AM daily).

Secure the cron endpoint. Ask Claude Code: Add authentication to the cron route. Vercel sets a CRON_SECRET environment variable and sends it as an Authorization header. Check this header and return 401 if it does not match. This prevents anyone from triggering your cron job by visiting the URL directly.

Build a content refresh cron. Ask Claude Code: Create src/app/api/cron/refresh-content/route.ts that runs every 6 hours. The function fetches data from external APIs (news feeds, social media metrics, market data), processes and stores the results in your database, and updates any cached content. This keeps your application's data fresh without user-triggered API calls.

Add a cleanup cron. Ask Claude Code: Create src/app/api/cron/cleanup/route.ts that runs daily at midnight. The function deletes expired sessions, removes old temporary files, archives completed tasks older than 90 days, and generates a cleanup report (rows deleted, space freed, errors encountered). Send the report to an admin email.

Add a monitoring cron. Ask Claude Code: Create src/app/api/cron/health-check/route.ts that runs every 15 minutes. The function checks that all critical services are responding: database connection, external API availability, email service, and payment processor. If any service is down, send an alert notification. Store the results in a health check log for historical uptime tracking.

Common error: Vercel Cron Jobs have the same execution time limits as regular functions. If your daily report takes more than 60 seconds to generate, split it into smaller functions: one to collect data, another to process it, and a third to send the email. Chain them with webhook calls.
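The auth guard for a cron route can be sketched in a few lines. Vercel does invoke cron routes with an Authorization: Bearer <CRON_SECRET> header; the report generation itself is elided here as a placeholder comment.

```typescript
// Hypothetical sketch of the guard in src/app/api/cron/daily-report/route.ts.
// Companion vercel.json entry:
//   { "crons": [{ "path": "/api/cron/daily-report", "schedule": "0 8 * * *" }] }

export async function GET(request: Request): Promise<Response> {
  // Vercel sends the secret as "Authorization: Bearer <CRON_SECRET>".
  const expected = `Bearer ${process.env.CRON_SECRET}`;
  if (request.headers.get("authorization") !== expected) {
    return new Response("Unauthorized", { status: 401 });
  }

  // ...query data sources, build the summary, and send it via email here...
  return new Response(JSON.stringify({ ok: true, ranAt: new Date().toISOString() }), {
    headers: { "Content-Type": "application/json" },
  });
}
```

Anyone hitting the URL without the secret gets a 401, so the route can stay publicly reachable for Vercel's scheduler.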
Event-driven processing and webhooks
Event-driven serverless functions react to external events — a user signs up, a payment is processed, a file is uploaded, or a third-party service sends a notification.

Ask Claude Code: Create a webhook handler at src/app/api/webhooks/stripe/route.ts for Stripe payment events. The handler: reads the raw request body, verifies the webhook signature using the Stripe webhook secret (this prevents attackers from sending fake events), parses the event type, and dispatches to the appropriate handler function. Handle these events: checkout.session.completed (create the user's subscription, send welcome email), invoice.payment_failed (send a payment failure notification, grant a 7-day grace period), and customer.subscription.deleted (revoke access, send a win-back email).

Ask Claude Code: Create a generic webhook receiver at src/app/api/webhooks/[provider]/route.ts. The dynamic route handles webhooks from multiple providers: Stripe, GitHub, Resend, and your own internal events. Each provider has its own signature verification method. The receiver validates, parses, and routes events to handler functions.

Build an event processing pipeline. Ask Claude Code: Create an event queue system using Vercel's background functions (or a simple database queue for development). When a webhook arrives, the handler validates it and enqueues the event for processing. A separate function processes events from the queue. This decoupling means the webhook handler responds instantly (important — many providers time out after 5 to 10 seconds) while the actual processing happens asynchronously.

Add retry logic for failed event processing. Ask Claude Code: If an event processor fails (network error, database timeout), retry with exponential backoff: first retry after 1 minute, second after 5 minutes, third after 30 minutes. After 3 failures, move the event to a dead letter queue and alert an operator. Log every attempt with the result.

Build an internal event system. Ask Claude Code: Create a publish-subscribe system at src/lib/events.ts. Components in your application publish events: events.publish('user.signed_up', { userId, email }). Serverless functions subscribe to events and process them. This decouples your application logic — the signup handler does not need to know about the welcome email, the analytics tracking, and the CRM update. Each concern is a separate subscriber.

Common error: webhook handlers must be idempotent. The same event can be delivered multiple times (network issues cause retries). Your handler must produce the same result whether it processes an event once or five times. Check for duplicate event IDs before processing.
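The idempotency check and the retry schedule can be sketched as two small helpers. The in-memory Set is a development stand-in: in production the seen-event IDs would live in a database table so they survive across function invocations, and the helper names here (claimEvent, nextRetryDelay) are illustrative, not a library API.

```typescript
// Hypothetical sketch: idempotency guard and retry schedule for event processing.

// Development stand-in for a durable store of processed event IDs.
const processedEventIds = new Set<string>();

// Returns true the first time an event ID is seen, false on redelivery.
// The webhook handler calls this before doing any work.
export function claimEvent(eventId: string): boolean {
  if (processedEventIds.has(eventId)) return false;
  processedEventIds.add(eventId);
  return true;
}

// Backoff schedule from the text: 1 minute, 5 minutes, 30 minutes.
const RETRY_DELAYS_MS = [60_000, 300_000, 1_800_000];

// attempt is 0-based; null means the event goes to the dead letter queue.
export function nextRetryDelay(attempt: number): number | null {
  return RETRY_DELAYS_MS[attempt] ?? null;
}
```

A processor would call claimEvent first and return early on false, then on failure schedule a retry with nextRetryDelay, dead-lettering and alerting when it returns null.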
Edge middleware and request processing
Edge middleware runs before your application handles a request. It executes on the edge network, close to the user, with sub-millisecond latency. Use it for authentication, redirects, A/B testing, and request modification.

Ask Claude Code: Create a middleware.ts in the project root that runs on every request. Add these processing layers.

Authentication check: read the session cookie, validate it against your auth service, and redirect to /login if the session is invalid. Only apply to protected routes (define a matcher pattern like /dashboard/:path*).

Geolocation-based routing: read the x-vercel-ip-country header and redirect users to their regional site or show region-specific content. For example, redirect UK users to /en-gb/ and US users to /en-us/.

A/B test assignment: hash the user's IP or cookie to deterministically assign them to a variant. Set a cookie with the assignment so it persists. Pass the variant as a header that your pages can read for conditional rendering.

Ask Claude Code: Add a bot detection layer to the middleware. Check the user agent against known bot signatures. For verified search engine bots (Googlebot, Bingbot), allow access and set a header that your pages use to serve pre-rendered content. For suspicious bots (no JavaScript support, abnormal request patterns), serve a challenge page. For known malicious bots, return 403.

Add feature flags via middleware. Ask Claude Code: Create a feature flag system that reads flag configuration from a KV store. Based on the user's identity and the flag configuration, set cookies that enable or disable features. Your React components check these cookies to conditionally render new features. This lets you ship code to production and enable it for specific users, percentages, or regions without redeploying.

Common error: middleware runs on the edge runtime, which does not support all Node.js APIs. You cannot use fs, path, or most npm packages that depend on Node.js internals. Use only Web APIs (fetch, crypto, TextEncoder, URL). Check the Vercel Edge Runtime documentation for the supported APIs before adding dependencies.
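The deterministic A/B assignment can be sketched with a tiny string hash. FNV-1a is one reasonable choice here because it uses no Node.js APIs, so it satisfies the edge runtime constraint just described; the middleware would call assignVariant with the visitor's cookie or IP, persist the result in a cookie, and forward it as a header.

```typescript
// Hypothetical sketch of deterministic A/B assignment for edge middleware.

// FNV-1a 32-bit: a small, stable string hash built from plain arithmetic,
// so it runs on the edge runtime without Node.js-specific modules.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Same visitor ID always maps to the same variant, with no stored state.
export function assignVariant(visitorId: string): "a" | "b" {
  return fnv1a(visitorId) % 2 === 0 ? "a" : "b";
}
```

Because the assignment is a pure function of the visitor ID, it stays stable across requests and across edge regions even before the persistence cookie is set.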
Monitoring, debugging, and deployment
Serverless functions are harder to debug than traditional servers because there is no persistent process to attach a debugger to. Observability through logging and monitoring is essential.

Ask Claude Code: Add structured logging to all serverless functions. Create a logger utility at src/lib/logger.ts that outputs JSON-formatted logs with: timestamp, function name, request ID (from Vercel headers), execution duration, success or error status, and any relevant metadata. Use console.log for info messages and console.error for errors — Vercel captures both and makes them searchable in the Logs dashboard.

Add performance monitoring. Ask Claude Code: Create a timing utility that wraps async operations. Before calling the AI API, start a timer. After the response, log the duration. Track: cold start time (first request after deployment takes longer), function execution time, external API call duration (how long the AI service took to respond), and total response time (what the user experiences). Log these as structured metrics that can be graphed in your monitoring dashboard.

Add error tracking. Ask Claude Code: Create an error handler that catches all unhandled errors in serverless functions. For each error, log: the error message and stack trace, the request that triggered it (URL, method, headers — redact sensitive data), the function name and runtime, and whether it was a timeout, a runtime error, or an external service failure. Send critical errors (5xx responses) to an alerting service (email, Slack, or PagerDuty).

Deploy with proper configuration. Ask Claude Code: Create a vercel.json with: function configuration (maximum duration for different routes — 10 seconds for simple routes, 60 seconds for AI processing), cron schedules for all scheduled functions, region selection (deploy functions in the region closest to your data sources), and environment variable configuration. Deploy with: vercel --prod.

Test every function in production. Ask Claude Code: Create a post-deployment smoke test script that: calls every API route and verifies the expected response, triggers each cron job manually (Vercel allows this from the dashboard) and verifies execution, sends test webhook payloads and verifies processing, and checks that middleware rules are applied correctly (authentication redirect, geo-routing, A/B assignment). Run this after every deployment to catch production-specific issues.

The serverless AI workflow system is complete — scalable, cost-effective, and maintainable.
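The logger and timing utilities can be sketched together. This is a minimal sketch of the src/lib/logger.ts idea above; the field names (fn, durationMs, status) are illustrative choices, not a Vercel convention, and real code would also pull the request ID from Vercel's headers.

```typescript
// Hypothetical sketch of src/lib/logger.ts: structured JSON logs plus a timing wrapper.

type LogLevel = "info" | "error";

// Emits one JSON log line; returns it so callers and tests can inspect it.
export function logEvent(
  level: LogLevel,
  fn: string,
  meta: Record<string, unknown> = {},
): string {
  const line = JSON.stringify({
    timestamp: new Date().toISOString(),
    level,
    fn,
    ...meta,
  });
  // console.log and console.error both surface in Vercel's Logs dashboard.
  (level === "error" ? console.error : console.log)(line);
  return line;
}

// Wraps an async operation, logging its duration and outcome either way.
export async function timed<T>(name: string, op: () => Promise<T>): Promise<T> {
  const start = Date.now();
  try {
    const result = await op();
    logEvent("info", name, { durationMs: Date.now() - start, status: "success" });
    return result;
  } catch (err) {
    logEvent("error", name, { durationMs: Date.now() - start, status: "error" });
    throw err;
  }
}
```

A route would wrap its external AI call as `await timed("anthropic.summarise", () => callApi())`, giving per-call durations that can be graphed from the log stream.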