Built an AI-powered customer support agent that handles 80% of inbound queries automatically, reducing response time from hours to seconds.
NovaTech Solutions, a fast-growing B2B SaaS company providing project management and collaboration tools, was facing a customer support crisis. As their user base grew from 5,000 to 25,000 accounts in under a year, their support team of 12 agents was struggling to keep up. Peak periods saw response times stretch beyond 4 hours, and their customer satisfaction score had fallen from 92% to 78%. NovaTech had tried chatbots before — basic rule-based bots that frustrated users with rigid, keyword-matched responses. They needed something fundamentally different: an intelligent agent that could understand context, reason across their knowledge base, and hold natural conversations. KumoDevs designed and deployed a production-grade AI support agent powered by large language models with retrieval-augmented generation, integrated directly into NovaTech's existing support infrastructure.
NovaTech's customer support team was overwhelmed with 3,000+ weekly queries, leading to average response times of 4+ hours and a declining CSAT score that had dropped below 80%.
Developed a conversational AI agent powered by large language models, integrated with NovaTech's knowledge base and CRM, that autonomously resolves common issues and escalates complex cases to human agents with full conversation context.
KumoDevs began with a two-week discovery phase, analysing 6 months of historical support tickets to categorise query types, identify resolution patterns, and map the knowledge base structure.
We then built a custom RAG pipeline: every support article was chunked, embedded, and stored in a vector database. When a user message arrives, the system runs a hybrid search (vector similarity plus BM25 keyword matching) against the knowledge base, retrieves the most relevant chunks, and constructs a context-aware prompt for the LLM. The agent returns structured responses with citations. We also implemented confidence scoring: if the agent's confidence in its answer falls below a configurable threshold (we settled on 0.85), the conversation is automatically escalated to a human agent with the full transcript and a suggested resolution.
The system was deployed incrementally: first as a human-assisted co-pilot (agents saw AI-suggested responses), then as an autonomous tier-1 agent with human oversight, and finally as the primary first-response system.
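The chunking and hybrid-retrieval steps described above can be illustrated with a minimal sketch. It is not the production pipeline: the case study does not say how articles were split or how vector and BM25 results were combined, so the word-based chunking with overlap and the reciprocal rank fusion (RRF) used here are assumptions, as are the function names `chunk_words` and `reciprocal_rank_fusion` and all default parameter values.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into overlapping word windows (sizes are illustrative assumptions)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs (e.g. vector and BM25 results).

    Each list contributes 1 / (k + rank) per chunk; k=60 is a commonly used
    default, not a value taken from the case study.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]   # hypothetical chunk IDs from vector search
    keyword_hits = ["b", "d", "a"]  # hypothetical chunk IDs from BM25
    print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
```

Rank-based fusion is convenient in a sketch like this because it needs no score normalisation between the two retrievers; a weighted sum of normalised scores or a learned re-ranker would be equally plausible choices.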
Analysed 6 months of support tickets, categorised 40+ query types, mapped resolution patterns, and conducted stakeholder interviews with support team leads.
Built the document ingestion pipeline: chunking strategies, embedding model selection, vector database setup, and hybrid search implementation with re-ranking.
Iteratively designed system prompts, few-shot examples, and response formatting templates through 200+ test scenarios covering edge cases and known failure modes.
Built FastAPI middleware connecting the AI agent to Zendesk's API for ticket creation, user context retrieval, and conversation history access.
Deployed as a co-pilot to 3 support agents for 2 weeks, collected 500+ feedback ratings, and tuned confidence thresholds and retrieval parameters based on real-world performance.
Gradually increased autonomous handling from 30% to 80% of tier-1 queries over 4 weeks, with daily monitoring and immediate rollback capability.
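One way a gradual rollout with immediate rollback, like the one in the final phase above, might be gated is sketched below. The case study does not describe the actual mechanism, so the hash-based bucketing, the `kill_switch` flag, and the names `rollout_bucket` and `route` are all assumptions made for illustration.

```python
import hashlib


def rollout_bucket(conversation_id: str) -> int:
    """Map a conversation ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(conversation_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def route(conversation_id: str, autonomous_percent: int, kill_switch: bool = False) -> str:
    """Return "ai" or "human" for a tier-1 conversation.

    autonomous_percent is the share of conversations the agent handles alone
    (e.g. 30 at the start of the rollout, 80 at the end). kill_switch is a
    hypothetical flag that sends everything back to humans at once.
    """
    if not 0 <= autonomous_percent <= 100:
        raise ValueError("autonomous_percent must be between 0 and 100")
    if kill_switch:
        return "human"
    return "ai" if rollout_bucket(conversation_id) < autonomous_percent else "human"


if __name__ == "__main__":
    ids = [f"conv-{i}" for i in range(1000)]  # hypothetical conversation IDs
    for percent in (30, 50, 80):
        share = sum(route(cid, percent) == "ai" for cid in ids) / len(ids)
        print(f"target {percent}% -> observed {share:.0%}")
```

Because the bucket depends only on the conversation ID, raising the percentage keeps every conversation that was already routed to the agent there, which makes day-to-day monitoring comparisons easier.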
“Our support team was drowning before KumoDevs built this agent. The difference is night and day — our agents now focus on complex, high-value conversations while the AI handles the rest. Our customers get instant answers, and our team has never been happier. The ROI in the first quarter alone justified the entire project.”
Built on a RAG architecture using LangChain for orchestration, OpenAI embeddings for vector search against NovaTech's 2,000+ article knowledge base, FastAPI for the middleware layer, and Redis for conversation state management and rate limiting. Human handoff is triggered by confidence thresholds and sentiment analysis, and a full conversation summary is passed to the resulting Zendesk ticket.
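The handoff rule described above might look roughly like the following sketch. The 0.85 confidence threshold comes from the case study; everything else is an assumption for illustration, including the sentiment scale (-1 to 1), the `SENTIMENT_THRESHOLD` value, the `Turn` type, and the payload field names, which are not Zendesk API fields.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.85  # threshold quoted in the case study
SENTIMENT_THRESHOLD = -0.5   # hypothetical cut-off on an assumed -1..1 scale


@dataclass
class Turn:
    role: str  # "user" or "agent"
    text: str


def handoff_reasons(confidence: float, sentiment: float) -> list:
    """Return the reasons (possibly none) for escalating to a human."""
    reasons = []
    if confidence < CONFIDENCE_THRESHOLD:
        reasons.append("low_confidence")
    if sentiment < SENTIMENT_THRESHOLD:
        reasons.append("negative_sentiment")
    return reasons


def build_handoff(turns, confidence, sentiment, summary):
    """Build an escalation payload, or None if the agent should keep going.

    The dict layout is illustrative; a real integration would map it onto
    whatever the ticketing system expects.
    """
    reasons = handoff_reasons(confidence, sentiment)
    if not reasons:
        return None
    return {
        "reasons": reasons,
        "summary": summary,
        "transcript": [f"{t.role}: {t.text}" for t in turns],
    }


if __name__ == "__main__":
    turns = [Turn("user", "I was charged twice."), Turn("agent", "Let me check that.")]
    print(build_handoff(turns, confidence=0.62, sentiment=-0.7, summary="Possible duplicate charge."))
```

Keeping the two triggers independent means a confident answer can still be handed off when the customer is clearly frustrated, which matches the combination of signals described in the text.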
Expand the agent to support multilingual queries (5 languages), add voice channel integration through Twilio, and implement proactive outreach for account health monitoring.