AI Agent System
The MeetAI agent is a LiveKit Voice Pipeline Agent — a Dockerized Node.js process that joins LiveKit rooms, processes audio via Gemini, and executes tool calls with human-in-the-loop approval.
Agent Architecture
Agent Lifecycle
- Dispatch: LiveKit's agent framework dispatches the agent when a room is created with agent-compatible metadata
- Connect: The agent connects to the room via `ctx.connect()`
- Metadata Parse: Reads room metadata to extract meeting info and custom agent instructions
- Session Start: Creates an `AgentSession` with Gemini Multimodal Live (voice-to-voice)
- Conversation Loop: Listens for `ConversationItemAdded` events and processes each turn
- Transcript Storage: Buffers and batches transcript lines to the backend
- Tool Execution: Handles tool calls (calendar events) via LiveKit RPC
- Shutdown: Flushes the remaining transcript buffer on room disconnect
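The Metadata Parse step can be sketched as a small defensive parser. The metadata shape here (`meeting.name`, `agent.name`, `agent.instructions`) is an assumption for illustration; the actual room metadata format is not shown in this document, and `parseRoomMetadata` is a hypothetical helper name.

```typescript
// Hypothetical shape of the parsed room metadata (assumed, not confirmed).
interface RoomContext {
  meeting: { name: string };
  agent: { name: string; instructions: string };
}

// Returns the parsed context, or null when the metadata is missing,
// is not valid JSON, or lacks the required name fields.
function parseRoomMetadata(raw: string | undefined): RoomContext | null {
  if (!raw) return null;

  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const { meeting, agent } = data as {
    meeting?: { name?: unknown };
    agent?: { name?: unknown; instructions?: unknown };
  };
  const meetingName = meeting?.name;
  const agentName = agent?.name;
  if (typeof meetingName !== "string" || typeof agentName !== "string") {
    return null;
  }

  return {
    meeting: { name: meetingName },
    agent: {
      name: agentName,
      // Missing instructions fall back to an empty string.
      instructions:
        typeof agent?.instructions === "string" ? agent.instructions : "",
    },
  };
}
```

Returning `null` instead of throwing lets the agent decide whether to fall back to default instructions or leave the room when the metadata is unusable.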
Custom Instructions
Each agent has user-defined instructions stored in the agents table. These are injected into the system prompt at runtime:
const finalInstructions = `
You are a helpful voice AI assistant named ${currentAgent.name}.
The current meeting name is "${currentMeeting.name}".
Your Core Instructions: ${currentAgent.instructions}
IMPORTANT: Default language is English.
`;
This allows users to create specialized agents:
- Standup Bot — “Focus on blockers and action items”
- Interview Coach — “Ask behavioral questions and give feedback”
- Brainstorm Facilitator — “Encourage divergent thinking, track all ideas”
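As a sketch of how the prompt assembly above could be wrapped in a reusable function: `AgentRecord`, `MeetingRecord`, and `buildInstructions` are hypothetical names, and the fallback text for empty instructions is an assumption, not something the agent is known to do.

```typescript
// Hypothetical record shapes; only `name` and `instructions` appear in the source.
interface AgentRecord {
  name: string;
  instructions: string;
}
interface MeetingRecord {
  name: string;
}

// Builds the system prompt from the stored agent and meeting records.
function buildInstructions(agent: AgentRecord, meeting: MeetingRecord): string {
  // Assumed fallback so the prompt never ends with an empty instruction line.
  const core = agent.instructions.trim() || "No custom instructions provided.";
  return [
    `You are a helpful voice AI assistant named ${agent.name}.`,
    `The current meeting name is "${meeting.name}".`,
    `Your Core Instructions: ${core}`,
    "IMPORTANT: Default language is English.",
  ].join("\n");
}

// Example: the Standup Bot from the list above.
const standupPrompt = buildInstructions(
  { name: "Standup Bot", instructions: "Focus on blockers and action items" },
  { name: "Daily Standup" },
);
```

Keeping the prompt assembly in a pure function makes it easy to unit test without joining a LiveKit room.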
Transcript Storage Service
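The TranscriptStorageService class handles reliable transcript persistence by buffering transcript lines, sending them to the backend in batches, and retrying failed requests.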
Configuration
| Parameter | Value | Rationale |
|---|---|---|
| `BATCH_SIZE` | 10 lines | Amortize HTTP overhead without excessive latency |
| `FLUSH_INTERVAL_MS` | 3000 ms | Balance between real-time persistence and request volume |
| `MAX_RETRIES` | 3 | Exponential backoff (1s, 2s, 4s) covers transient network issues |
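The following is a minimal sketch of how these three parameters might interact, not the actual `TranscriptStorageService` implementation. `TranscriptBuffer`, `shouldFlush`, `backoffDelayMs`, and `sendWithRetry` are hypothetical names, and the reading of `MAX_RETRIES` as "one initial attempt plus three retries" is an assumption. The HTTP call and the sleep function are injected so the logic stays testable.

```typescript
const BATCH_SIZE = 10;
const FLUSH_INTERVAL_MS = 3000;
const MAX_RETRIES = 3;

// Delay before retry number `attempt` (0-based): 1000, 2000, 4000 ms.
function backoffDelayMs(attempt: number): number {
  return 1000 * 2 ** attempt;
}

// True when the flush interval has elapsed since the last flush.
function shouldFlush(lastFlushAt: number, now: number): boolean {
  return now - lastFlushAt >= FLUSH_INTERVAL_MS;
}

class TranscriptBuffer {
  private lines: string[] = [];

  // Adds a line; returns a full batch once BATCH_SIZE lines are buffered,
  // otherwise null.
  push(line: string): string[] | null {
    this.lines.push(line);
    return this.lines.length >= BATCH_SIZE ? this.drain() : null;
  }

  // Removes and returns everything buffered (used by the interval flush
  // and by the final flush on shutdown).
  drain(): string[] {
    const batch = this.lines;
    this.lines = [];
    return batch;
  }
}

type SendBatch = (batch: string[]) => Promise<void>;
type Sleep = (ms: number) => Promise<void>;

// Sends a batch, retrying with exponential backoff on failure.
// Resolves to false if the initial attempt and all retries fail.
async function sendWithRetry(
  batch: string[],
  send: SendBatch,
  sleep: Sleep,
): Promise<boolean> {
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    try {
      await send(batch);
      return true;
    } catch {
      if (attempt === MAX_RETRIES) break;
      await sleep(backoffDelayMs(attempt));
    }
  }
  return false;
}
```

In this sketch a caller would send whatever `push` returns, and a timer would call `drain` whenever `shouldFlush` reports that 3000 ms have passed, so a quiet meeting still persists its last few lines.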
Tool Calling — Calendar Events
The calendar tool uses an RPC-based approval pattern instead of traditional HTTP webhooks.
Why RPC over HTTP?
| Aspect | HTTP Webhook | LiveKit RPC |
|---|---|---|
| Latency | Agent → Backend → WebSocket → Client | Agent → SFU → Client (direct) |
| Targeting | Broadcast to all | Specific participant |
| Bidirectional | Needs separate response channel | Request-response built-in |
| Auth | Separate token validation | LiveKit session inherent |
Approval Flow
// Agent side: send approval request to specific participant
const response = await ctx.room.localParticipant.performRpc({
  destinationIdentity: targetParticipant,
  method: "approval-request",
  payload: JSON.stringify({ type: "calendar_event", ...eventDetails }),
});
// Client side: registered RPC handler renders approval dialog
// User clicks Approve/Reject → response sent back via RPC

Concurrency guard: A `pendingApprovals` Map deduplicates identical in-flight RPC calls keyed by `participantIdentity + eventHash`. This prevents duplicate approval dialogs caused by Gemini speech interruptions re-triggering the same tool call.
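The concurrency guard could look roughly like the sketch below. It assumes the Map stores the in-flight promise and removes it once the RPC settles; `requestApproval`, `eventHash`, and the `ApprovalResult` type are hypothetical, and the key-sorted JSON string is only a stand-in for whatever hash the agent actually computes. The RPC call itself is injected as `sendRpc`.

```typescript
type ApprovalResult = "approved" | "rejected";

// In-flight approval requests, keyed by participant identity plus event hash.
const pendingApprovals = new Map<string, Promise<ApprovalResult>>();

// Stand-in hash (assumption): JSON of the top-level fields in sorted key
// order, so the same event yields the same key regardless of field order.
function eventHash(event: Record<string, unknown>): string {
  const sortedEntries = Object.keys(event)
    .sort()
    .map((key) => [key, event[key]]);
  return JSON.stringify(sortedEntries);
}

// Returns the existing in-flight promise for an identical request,
// otherwise starts a new RPC and tracks it until it settles.
function requestApproval(
  participantIdentity: string,
  event: Record<string, unknown>,
  sendRpc: () => Promise<ApprovalResult>,
): Promise<ApprovalResult> {
  const key = `${participantIdentity}:${eventHash(event)}`;
  const inFlight = pendingApprovals.get(key);
  if (inFlight) return inFlight;

  const promise = sendRpc().finally(() => {
    // Clear the entry so a later, genuinely new request can proceed.
    pendingApprovals.delete(key);
  });
  pendingApprovals.set(key, promise);
  return promise;
}
```

With this shape, a tool call that Gemini re-triggers mid-approval awaits the same promise as the first call, so the participant sees a single dialog and both calls receive the same answer.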
Deployment
The agent runs as a standalone Docker container, separate from the Next.js app:
FROM node:20-slim
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN corepack enable && pnpm install --frozen-lockfile
COPY . .
RUN pnpm build
CMD ["node", "dist/agent.js", "start"]

Environment variables required:
- `LIVEKIT_URL`: LiveKit server WebSocket URL
- `LIVEKIT_API_KEY` / `LIVEKIT_API_SECRET`: Agent authentication
- `GOOGLE_API_KEY`: Gemini API access
- `NEXT_PUBLIC_APP_URL`: Backend URL for transcript storage
- `AGENT_SECRET`: Shared secret for agent → backend auth
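Because a missing variable would otherwise only surface once the agent tries to connect or store a transcript, a startup check can fail fast. This is a sketch under the assumption that all six variables are mandatory; `missingEnvVars` is a hypothetical helper, not part of the agent's codebase.

```typescript
const REQUIRED_ENV_VARS = [
  "LIVEKIT_URL",
  "LIVEKIT_API_KEY",
  "LIVEKIT_API_SECRET",
  "GOOGLE_API_KEY",
  "NEXT_PUBLIC_APP_URL",
  "AGENT_SECRET",
] as const;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]?.trim());
}
```

At startup the agent could call `missingEnvVars(process.env)` and exit with an error listing the returned names if the array is not empty.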