NetWebMedia does not answer the phone. Not because no one is available — because no one needs to be. Every inbound lead across web, WhatsApp, Instagram, Facebook, and SMS is handled by an automated system that qualifies, sells, supports, and routes without a single human in the loop for initial contact. This playbook is the exact architecture behind that system. Nine modules. Real scripts. Every failure state mapped. Deploy it and your business runs contact-to-qualification around the clock, in any language, across every channel your customers actually use.
This is not theory. Every script, framework, and checklist in here was built from production deployments. The frameworks are named so you can remember and teach them. The dialogue examples are exact — copy them, swap in your product nouns, deploy.
Course Modules
Conversation Design Fundamentals — The DIAL Framework
Most chatbots fail in the first exchange. Not because the AI is bad — because the designer confused a transaction with a conversation. A transaction is a form. A conversation is a relationship. The bot that says "Please select from the following options: 1) Sales 2) Support 3) Billing" is a transaction. The bot that says "Hey — what brought you here today?" is a conversation. One creates friction. One creates trust. The entire performance of your automation system depends on which one you build.
The DIAL Framework: Detect, Intend, Answer, Loop
Every chatbot interaction — on every platform, in every language, for every use case — follows four phases. Memorize them, and you have a mental model for diagnosing any broken conversation.
Detect is the opening phase. The bot observes the user's first message and gathers signal: language, sentiment, urgency, platform context, and any prior session history. Detection is passive. You are not asking questions yet — you are reading. A user who types "URGENT help with my order" carries different signal than one who types "hi." The Detect phase processes both before the bot says a single word back.
Intend is classification. Based on the detected signal, the bot forms a hypothesis about what the user actually wants — not what they literally said. "I want to know the price" is the literal message. The intent is likely evaluate_purchase. Misidentify the intent and the entire conversation goes sideways. The Intend phase maps raw language to one of your defined intent categories (covered in Module 2).
Answer is delivery. The bot delivers the most useful, concise response to the identified intent. Useful does not mean long. It means the user's need moves forward. A one-sentence answer that resolves the intent beats a paragraph that hedges. This phase is where your scripts, product knowledge, and tone all live.
Loop is continuation. The bot does not end the conversation — it creates the next opening. Every Answer should include a forward question, a soft prompt, or a clear next step. The Loop phase prevents dead-ends and keeps the user moving toward the outcome your business needs them to reach (booked call, purchase, ticket closed).
- D — Detect: Read language, sentiment, urgency, channel context, session history
- I — Intend: Classify into a defined intent category with a confidence score
- A — Answer: Deliver the most useful, concise response to that intent
- L — Loop: Create a forward prompt that moves the user to the next step
- Map the four DIAL phases to your first 3 conversation flows before writing any scripts
- Define what "Detect" reads on each platform (web has full URL context; WhatsApp does not)
- Write your Loop prompts first — work backwards from the outcome you want
- Test your opening exchange with 10 real people before deploying
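The four DIAL phases can be sketched as one function per phase. This is a minimal, illustrative Python skeleton, not a production classifier: the keyword rules, intent labels, and reply copy are all placeholders you would replace with your own taxonomy and scripts.

```python
def detect(message: str) -> dict:
    """Detect: passively read signal from the raw message before replying."""
    return {
        "urgent": message.isupper() or "URGENT" in message.upper(),
        "short_greeting": message.strip().lower() in {"hi", "hey", "hello"},
        "text": message,
    }

def intend(signal: dict) -> str:
    """Intend: map detected signal to a defined intent category (toy rules)."""
    text = signal["text"].lower()
    if "price" in text or "cost" in text:
        return "evaluate_purchase"
    if "order" in text:
        return "check_status"
    return "general_inquiry"

# Answer: the most useful, concise response per intent (placeholder copy).
ANSWERS = {
    "evaluate_purchase": "Happy to walk you through pricing.",
    "check_status": "Let me pull up your order.",
    "general_inquiry": "Hey, what brought you here today?",
}

# Loop: every answer ends with a forward prompt, never a dead end.
LOOPS = {
    "evaluate_purchase": "Which package are you considering?",
    "check_status": "What's your order number?",
    "general_inquiry": "What are you working on?",
}

def dial(message: str) -> str:
    """Run Detect, Intend, Answer, Loop and return the full reply."""
    signal = detect(message)
    intent = intend(signal)
    return f"{ANSWERS[intent]} {LOOPS[intent]}"
```

Note how even the toy version never emits an answer without a Loop prompt attached: the dead-end failure mode is designed out at the structural level, not left to script discipline.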
Conversation vs. Transaction — Why the Distinction Changes Everything
A transactional bot presents options. A conversational bot asks questions. The difference sounds small. The revenue impact is not. Internal benchmarks across chatbot deployments show that conversational-style openers generate 2.3x more completed qualification flows than menu-driven openers. The reason is simple: menus push the work of categorizing the problem back onto the user before you've built any trust. Questions invite them into a dialogue where your bot is positioned as a helpful advisor, not a phone tree.
The test for whether you've built a transaction or a conversation: read your bot's first three messages out loud. Would a competent human sales rep say those words in a real call? If not, you've built a transaction. Rebuild it. The friction isn't the technology — it's the design.
There is one legitimate use for menus: when the user is mid-flow and needs to navigate between discrete options (e.g., checking order status vs. initiating a return). Even then, the menu should follow a conversational opener, not replace it. Lead with the human, then give them the shortcuts.
- Audit every existing bot opener — replace menus with open questions where possible
- Write three conversational openers for each channel (web, WhatsApp, Instagram)
- Record a human sales rep doing 5 real calls — extract their opening questions and use those as bot copy
- Menus are only permitted after the first qualifying exchange
Persona Design — Building a Bot Identity That Converts
Your bot needs a name, a voice, and a character. Not because branding is nice — because unnamed, voiceless bots feel like forms, and forms get abandoned. When users feel they are talking to someone (even someone who is transparently AI), they engage longer, share more context, and convert at higher rates. The persona is not a lie — it is a user experience decision.
Persona design has five components. Name: short, professional, gender-neutral when possible (Alex, Sam, Jordan, Riley). Voice: match your brand — if your website copy is formal, the bot is formal; if it's casual, the bot is casual. Expertise signal: the bot should demonstrate domain knowledge within the first three exchanges. Transparency: always disclose that the user is talking to an AI assistant when asked directly. Consistency: the persona must be identical across every platform — if Alex on WhatsApp says "Hey!" but Alex on the web widget says "Greetings," you have two bots that happen to share a name.
One practical note on names and languages: if you serve a bilingual market (English and Spanish), your bot name should work in both languages without sounding odd. "Alex" works. "Brad" is harder to pronounce in Spanish. "Alejandro" works in Spanish but sounds stiff in English casual contexts. Pick for the full audience.
- Name: Short, pronounceable in all target languages, gender-neutral preferred
- Voice: 3 adjectives that define tone (e.g., "Direct, warm, expert")
- Expertise signal: Domain knowledge demonstrated within turn 3
- Transparency: "I'm an AI assistant" disclosed on direct inquiry, never hidden
- Consistency: Same name, same voice, same personality — every channel, every language
- Write a one-paragraph persona brief (name, 3 voice adjectives, expertise domain)
- Create an "are you a bot?" response for every language you deploy in
- Review persona consistency across all platforms before launch
- Test persona voice with 5 real team members — does it match your brand?
Tone Calibration — Matching Energy Without Losing Authority
Tone is not a static setting. It adjusts based on the user's message. A user who types in all caps is expressing urgency or frustration — the bot's tone should shift toward calm efficiency. A user who types with exclamation points and emoji is relaxed and friendly — the bot can mirror that energy slightly. A user who writes in formal complete sentences expects formality back. This dynamic adjustment is called tone mirroring, and it is a learnable, scriptable behavior.
The rule: match energy, never match negativity. If a user is frustrated, do not become curt or defensive. Become more grounded. If a user is excited, you can be warm — but do not become so casual that you lose authority. The bot should always feel like the most competent, composed person in the room.
- User: casual / excited → Bot: warm, direct, light energy. "Great choice — here's exactly how it works."
- User: formal / professional → Bot: precise, structured. "I'd be happy to provide specifications for each package."
- User: frustrated / urgent → Bot: calm, efficient, immediate ownership. "I've got this — let me pull that up right now."
- User: confused / overwhelmed → Bot: simple language, one step at a time. "No worries. Let's start with one thing."
- Write 4 tone variants of your standard greeting for each matrix state
- Tag frustration/urgency signals in your NLP training data (all-caps, "URGENT", "frustrated", "this is ridiculous")
- Define the maximum casualness level for your brand (some brands should never use slang, even if the user does)
- Review all scripts for passive voice — replace with active
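Tone mirroring is scriptable. A minimal sketch of the four-state detector, where the marker lists and greeting copy are illustrative assumptions, not a trained sentiment model:

```python
import re

# Illustrative frustration/urgency markers; extend from your own chat logs.
FRUSTRATION_MARKERS = {"urgent", "asap", "ridiculous", "frustrated", "still waiting"}
CONFUSION_MARKERS = {"confused", "don't understand", "not sure", "lost"}

def tone_state(message: str) -> str:
    """Classify the user's message into one of the four tone-matrix states."""
    text = message.strip()
    lower = text.lower()
    letters = [c for c in text if c.isalpha()]
    all_caps = len(letters) >= 4 and text.upper() == text
    if all_caps or any(m in lower for m in FRUSTRATION_MARKERS):
        return "frustrated_urgent"
    if any(m in lower for m in CONFUSION_MARKERS):
        return "confused"
    # Exclamation points or emoji signal a casual, excited register.
    if "!" in text or re.search("[\U0001F300-\U0001FAFF]", text):
        return "casual_excited"
    return "formal_professional"

# One greeting variant per matrix state (placeholder copy).
GREETINGS = {
    "casual_excited": "Great choice, here's exactly how it works.",
    "formal_professional": "I'd be happy to provide specifications for each package.",
    "frustrated_urgent": "I've got this. Let me pull that up right now.",
    "confused": "No worries. Let's start with one thing.",
}
```

The important design choice: frustration is checked first, so a user who is both excited-looking ("!!!") and angry is routed to the calm, grounded variant. Match energy, never match negativity.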
Failure States — Designing for What Goes Wrong
A conversation fails in one of three ways: the bot misunderstands the intent, the bot understands but has no answer, or the user's need is outside the system's scope entirely. Each failure state needs a designed response. An unhandled failure state produces dead silence, a generic error, or an infinite loop — all of which end the conversation and, likely, the lead.
The misunderstanding failure is the most common. The bot classifies the wrong intent. The correct response is a graceful clarification request — not "I don't understand," which is a dead-end, but a reframing question: "I want to make sure I get you the right answer — are you asking about X or Y?" Two options, not an open field. Open fields after a misunderstanding produce more misunderstandings.
The no-answer failure occurs when the intent is correctly classified but the knowledge base has no response. The correct behavior: acknowledge, set an expectation, collect contact info, trigger an async follow-up. "I don't have that detail on hand, but I can have the right person send it to you within an hour. What's the best email?" The conversation ends with a forward commitment, not a wall.
The out-of-scope failure is when the user asks something the system was never designed to handle (e.g., a legal question, a personal emergency). The correct behavior: acknowledge, do not attempt to answer, route to a human or relevant resource. Never let the bot improvise on topics outside its defined scope.
- Write a clarification script for each major intent category (2-option reframe format)
- Build a "no-answer" flow that collects email and triggers a human notification
- Define 10 out-of-scope topics and write explicit bot responses for each
- Set a maximum clarification loop limit of 2 — third failure always escalates to human
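The clarification-loop limit in the checklist above is worth enforcing in code rather than convention. A sketch, assuming a per-session strike counter your platform would maintain:

```python
MAX_CLARIFICATIONS = 2  # third failure always escalates to a human

def next_action(clarification_count: int, confident: bool) -> str:
    """Decide the next step after an intent-classification attempt.

    clarification_count: how many clarification prompts this session has
    already used; confident: whether the classifier is sure of the intent.
    """
    if confident:
        return "answer"
    if clarification_count < MAX_CLARIFICATIONS:
        return "clarify"  # 2-option reframe, never an open field
    return "escalate_to_human"
```

Hard-coding the ceiling means no script author can accidentally build an infinite loop: the third miss always hands the conversation to a person.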
Intent Mapping & NLP Architecture
Every word a user types is a data point. The NLP layer's job is to convert those data points into actionable classifications that route the conversation correctly. Most chatbot projects fail here — not because the AI is wrong, but because the intent architecture was never designed. You cannot train a bot to understand what you haven't defined. This module is about building the definition layer that makes everything else work.
Intent Categories — Building Your Intent Taxonomy
An intent taxonomy is the master list of everything a user might want to do in your system, organized into a hierarchy. At the top level are intent groups — broad categories like Sales, Support, and Account. Below each group are specific intents. Below each intent are the phrases that trigger it. The hierarchy is not optional — without it, your bot's training data becomes a flat list that produces collisions (two intents matching the same phrase) and gaps (phrases that match nothing).
Start with 5-8 top-level intent groups. Expand from there. Common groups for a B2B service business: new_lead_inquiry, pricing_and_packaging, support_and_troubleshooting, account_management, scheduling, general_information, objection_or_hesitation, and out_of_scope. Each group should have 3-8 specific intents beneath it. More than 10 intents per group and the classifier struggles to discriminate.
- new_lead_inquiry: evaluate_product, request_demo, ask_about_services, compare_competitors
- pricing_and_packaging: get_price, understand_tiers, ask_about_discounts, request_custom_quote
- support_and_troubleshooting: report_bug, ask_how_to, check_status, request_refund
- account_management: update_info, cancel_subscription, upgrade_plan, add_user
- scheduling: book_meeting, reschedule, cancel_appointment
- objection_or_hesitation: price_objection, competitor_comparison, need_more_time, not_convinced
- out_of_scope: everything else — route to human or polite decline
- Build your full intent taxonomy before writing any training phrases
- Limit top-level groups to 8 maximum
- Validate taxonomy against 30 real customer messages — every message must map to an intent
- Flag any message that maps to more than one intent — that is a discrimination problem to solve
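The hierarchy and the collision check can both live in plain data structures. A sketch with an illustrative slice of the taxonomy (the trigger phrases here are examples, not real training data):

```python
# Group -> intent -> trigger phrases (illustrative subset).
TAXONOMY = {
    "pricing_and_packaging": {
        "get_price": ["how much", "price", "what does it cost"],
        "request_custom_quote": ["custom quote", "quote for my business"],
    },
    "scheduling": {
        "book_meeting": ["book a call", "schedule a demo"],
        "cancel_appointment": ["cancel my appointment"],
    },
}

def find_collisions(taxonomy: dict) -> set:
    """Return phrases that trigger more than one intent (a discrimination
    problem to solve before training)."""
    seen, collisions = {}, set()
    for group, intents in taxonomy.items():
        for intent, phrases in intents.items():
            for phrase in phrases:
                if phrase in seen and seen[phrase] != intent:
                    collisions.add(phrase)
                seen[phrase] = intent
    return collisions
```

Run the collision check every time the taxonomy changes; a phrase that maps to two intents is exactly the flat-list failure the hierarchy exists to prevent.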
Entity Types — Extracting the Variables That Matter
Intents tell you what the user wants. Entities tell you the specifics. When a user says "I need a quote for 3 locations in Texas," the intent is request_custom_quote and the entities are quantity: 3, entity_type: locations, and geography: Texas. Without entity extraction, your bot has to ask follow-up questions for information the user already provided. That friction is annoying and it's your fault for not building entity extraction.
Standard entity types to define: product_name (which specific service or product), quantity (how many), timeframe (when, how long), geography (city, state, country), contact_info (email, phone), account_id (existing customer identifiers), and sentiment_marker (urgency words, negative emotion words). Most NLP platforms support these out-of-the-box for contact_info and quantity. Product names, timeframes, and sentiment markers need custom training.
- Define all entity types for your use case before training begins
- Create entity synonym lists (e.g., "asap", "urgent", "right away", "ASAP" all map to urgency: high)
- Test entity extraction on 20 real messages before declaring the model ready
- Build entity confirmation into flows where the wrong entity would cause a bad outcome (e.g., booking wrong date)
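The "quote for 3 locations in Texas" example can be sketched with simple regex and synonym-list extraction. This is an illustrative stand-in for your NLP platform's entity layer; the state list and synonym set are deliberately tiny:

```python
import re

URGENCY_SYNONYMS = {"asap", "urgent", "right away", "immediately"}
US_STATES = {"texas", "florida", "california"}  # illustrative subset

def extract_entities(message: str) -> dict:
    """Pull quantity, entity_type, geography, and urgency from a raw message."""
    lower = message.lower()
    entities = {}
    # quantity + what it counts, e.g. "3 locations", "12 users"
    qty = re.search(r"\b(\d+)\s+(locations?|users?|seats?)\b", lower)
    if qty:
        entities["quantity"] = int(qty.group(1))
        entities["entity_type"] = qty.group(2).rstrip("s")
    for state in US_STATES:
        if state in lower:
            entities["geography"] = state.title()
    if any(s in lower for s in URGENCY_SYNONYMS):
        entities["urgency"] = "high"
    return entities
```

With this in place the bot never asks "how many locations?" when the user already said three, which is the entire point of the module.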
Training Phrases — Writing Data That Actually Trains
Training phrases are the examples you provide to teach the NLP model what each intent sounds like. Most teams write too few, make them too similar to each other, and forget to include misspellings and abbreviations. All three mistakes produce a brittle classifier that performs well in testing and fails in production.
The rule for training phrase quality: minimum 15 phrases per intent, maximum similarity of 70% between any two phrases in the same intent. Every phrase should represent a genuinely different way of expressing that intent. Include at least 3 misspelled variants. Include at least 3 very short variants (1-3 words). Include at least 2 phrases in each additional language you serve. If you have fewer than 15 distinct phrases, you don't have training data — you have a list of synonyms.
- Minimum 15 phrases per intent, target 25
- At least 3 misspelled or abbreviated variants ("hw much", "pric?", "wat does it cost")
- At least 3 ultra-short variants ("price", "how much", "cost")
- At least 2 variants per additional language
- No two phrases may be more than 70% lexically similar
- Include question and statement forms ("what does it cost" and "I want to know the price")
- Count your training phrases per intent — anything under 15 needs expansion
- Pull real user messages from any existing chat logs and add them as training phrases
- Run a similarity audit — remove phrases that are near-duplicates
- Add 5 Spanish training phrases for each intent you plan to serve bilingually
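The similarity audit in the checklist can be automated. One possible measure is stdlib `difflib.SequenceMatcher` applied pairwise, with the module's 70% ceiling; other string-similarity metrics would work equally well:

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a: str, b: str) -> float:
    """Lexical similarity between two phrases, 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def audit_phrases(phrases: list, max_similarity: float = 0.70) -> list:
    """Return pairs of training phrases that exceed the similarity ceiling
    and should be deduplicated before training."""
    return [
        (a, b)
        for a, b in combinations(phrases, 2)
        if similarity(a, b) > max_similarity
    ]
```

Anything the audit flags is a near-duplicate: it adds volume without adding signal, which is how a classifier ends up performing well in testing and failing in production.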
Confidence Thresholds & Fallback Logic
Every NLP classification comes with a confidence score — a number between 0 and 1 representing how certain the model is. Most teams deploy with default thresholds and never think about them again. That is a mistake that produces silent failures. Setting thresholds correctly is the difference between a bot that routes well and one that confidently sends leads into the wrong flow.
The three-zone model: High confidence (above 0.80) — execute the intent without confirmation. Medium confidence (0.50–0.80) — execute but include a soft confirmation ("It sounds like you're asking about pricing — is that right?"). Low confidence (below 0.50) — do not guess. Trigger the clarification flow. The exact thresholds depend on your data volume and error tolerance — start with these defaults and adjust based on false positive rates from your testing cadence (Module 9).
- Set explicit confidence thresholds in your bot platform (high / medium / low zones)
- Write a soft-confirm template for every major intent (used in medium-confidence zone)
- Log all low-confidence events — review weekly and add training phrases from them
- Never let a low-confidence classification execute a consequential action (booking, payment, cancellation)
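The three-zone model is a few lines of routing logic. A sketch using the module's starting defaults (tune both thresholds against your own false-positive rate):

```python
HIGH, LOW = 0.80, 0.50  # starting defaults from this module

def route(intent: str, confidence: float) -> dict:
    """Map a classification's confidence score to one of the three zones."""
    if confidence > HIGH:
        return {"action": "execute", "intent": intent}
    if confidence >= LOW:
        return {
            "action": "soft_confirm",
            "intent": intent,
            "prompt": f"It sounds like you're asking about {intent}. Is that right?",
        }
    # Below the low threshold: never guess, trigger the clarification flow.
    return {"action": "clarify", "intent": None}
```

Note that the low zone returns `intent: None` on purpose: a low-confidence guess must never leak downstream where it could trigger a consequential action.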
Lead Qualification Flows
Qualification is the most commercially valuable thing your bot does. Done right, it delivers a pre-scored, context-rich lead to your CRM before any human has spoken to the prospect. Done wrong, it interrogates the user like a government form. The difference is sequence: a qualification flow that feels like a conversation, not a questionnaire.
BANT Automation — Budget, Authority, Need, Timeline as a Dialogue
BANT (Budget, Authority, Need, Timeline) is the oldest qualification framework in B2B sales. The problem is that most teams try to ask all four BANT questions sequentially, which feels like an interrogation. The fix: weave BANT into a natural conversation where each question emerges logically from the previous answer. The user never sees "budget question #2 of 4." They experience a helpful advisor learning about their situation.
Start with Need — it's the most natural entry point and the answer shapes how you frame the other questions. Then move to Timeline (it signals urgency and helps prioritize your pipeline). Then Authority (frame it as "who else would be involved in this decision" — less confrontational than "are you the decision-maker"). Budget last — after they've explained their need and timeline, budget questions feel like responsible planning, not gatekeeping.
- Step 1 — Need: "What specifically are you trying to achieve?"
- Step 2 — Timeline: "Is there a date you're working toward, or are you more in the research phase?"
- Step 3 — Authority: "Would it just be you making this call, or would others on your team be involved?"
- Step 4 — Budget: "Do you have a ballpark budget in mind, or is that something we'd need to scope out together?"
- Script the full BANT flow in Need → Timeline → Authority → Budget order
- Write Spanish variants of every BANT question
- Build a branch for "still researching" timeline responses — route to nurture, not sales
- Define budget thresholds that determine routing (e.g., under $1k/mo → self-serve, $1k-$5k → inside sales, above $5k → account exec)
Lead Scoring Logic — Automated Scoring Before CRM Handoff
Not every lead is equal. A chatbot that routes every conversation equally to your sales team is wasting their time and yours. Build a numeric scoring model that runs in the background during qualification and attaches a score to the CRM record before any human sees it. Sales reps work the high-score leads first. Nurture sequences handle the rest automatically.
The scoring model has four inputs mapped from BANT: Need strength (0-25 points), Timeline urgency (0-25 points), Authority level (0-25 points), and Budget fit (0-25 points). Total score 0-100. Thresholds: 75+ is a hot lead (immediate routing to sales). 50-74 is a warm lead (24-hour follow-up sequence). Below 50 is a cool lead (automated nurture, 2-week cadence). The exact point allocations should be calibrated to your sales cycle — adjust after 30 days of data.
- Need (0-25): 25 = specific pain with quantified impact; 15 = general dissatisfaction; 5 = vague interest
- Timeline (0-25): 25 = within 30 days; 15 = 1-3 months; 5 = 3-6 months; 0 = no timeline
- Authority (0-25): 25 = sole decision-maker; 15 = co-decision with known partner; 5 = influencer only
- Budget (0-25): 25 = stated budget above your minimum; 15 = budget in range; 5 = no budget stated; 0 = below minimum
- 75-100: Hot — route to sales immediately
- 50-74: Warm — 24-hour automated follow-up
- 0-49: Cool — 2-week nurture sequence
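The point tables and thresholds above translate directly into a scoring function. A sketch; the category keys are illustrative labels for the rubric rows, and the point values are the module's starting defaults to be recalibrated after 30 days:

```python
# Point values per BANT dimension, straight from the rubric above.
BANT_POINTS = {
    "need": {"specific_quantified": 25, "general_dissatisfaction": 15, "vague_interest": 5},
    "timeline": {"within_30_days": 25, "1_3_months": 15, "3_6_months": 5, "none": 0},
    "authority": {"sole_decision_maker": 25, "co_decision": 15, "influencer": 5},
    "budget": {"above_minimum": 25, "in_range": 15, "not_stated": 5, "below_minimum": 0},
}

def score_lead(answers: dict) -> int:
    """Sum the four BANT dimensions into a 0-100 score."""
    return sum(BANT_POINTS[dim][answers[dim]] for dim in BANT_POINTS)

def tier(score: int) -> str:
    """Map a score to the hot/warm/cool routing tier."""
    if score >= 75:
        return "hot"
    if score >= 50:
        return "warm"
    return "cool"
```

A lead with a quantified pain, a 30-day deadline, a co-decision-maker, and an in-range budget scores 25 + 25 + 15 + 15 = 80: hot, routed to sales immediately.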
- Define point values for each BANT dimension in your scoring model
- Set the three routing thresholds (hot/warm/cool) and document them
- Configure CRM to receive score as a custom field with every chatbot lead
- Review scoring accuracy after 30 days — do your hot leads actually close?
CRM Handoff Triggers — What Gets Sent, When, and How
The handoff from bot to CRM is a data event, not just a notification. Every qualified conversation should push a structured record containing: contact name, email, phone, company (if B2B), lead score, BANT answers verbatim, conversation transcript, intent path (which intents were triggered in sequence), channel of origin, timestamp, and language. That is your hot lead package. A sales rep who reads it before calling has a massive advantage over one who calls cold.
Trigger conditions for CRM handoff: (1) user provides contact information, (2) user reaches a minimum score threshold (50+), or (3) user explicitly requests to speak to someone. Any one of these triggers the push. The push should happen in real-time via webhook, not in a batch at end of day. A hot lead that waits four hours is a cool lead.
- Define the exact fields pushed to CRM at handoff (minimum 10 fields listed above)
- Build real-time webhook — no batch processing for lead data
- Set response SLA for hot leads (we use 2 business hours — never more than same business day)
- Test handoff with 5 fake leads before deploying — confirm every field arrives correctly
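The handoff push can be guarded so that an incomplete record never reaches the CRM. A sketch: `REQUIRED_FIELDS` mirrors the hot-lead package listed above, and the commented `requests.post` line assumes a hypothetical `CRM_WEBHOOK_URL` placeholder for your endpoint:

```python
import json

# The hot-lead package from this module, as field names.
REQUIRED_FIELDS = [
    "contact_name", "email", "phone", "company", "lead_score", "bant_answers",
    "transcript", "intent_path", "channel", "timestamp", "language",
]

def build_handoff_payload(lead: dict) -> str:
    """Serialize the lead for the webhook; refuse to send incomplete records."""
    missing = [f for f in REQUIRED_FIELDS if f not in lead]
    if missing:
        raise ValueError(f"handoff blocked, missing fields: {missing}")
    return json.dumps(lead)

# Real-time push (sketch; CRM_WEBHOOK_URL is a placeholder, not a real endpoint):
# requests.post(CRM_WEBHOOK_URL, data=build_handoff_payload(lead),
#               headers={"Content-Type": "application/json"}, timeout=5)
```

Failing loudly on a missing field is deliberate: a silently truncated record is the data-layer equivalent of the batch-at-end-of-day mistake, and it puts the sales rep back in cold-call territory.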
Sales Conversation Automation
A bot that only qualifies leads leaves money on the table. Buyers often make decisions in the conversation — before they ever talk to a human. A well-designed sales flow can close warm leads, handle objections, and book paid calls without a salesperson in the loop. This module is the playbook for that flow.
Product Presentation Flow — The PACE Script
When a lead has been qualified and shows high intent, the bot transitions from discovery into presentation. The PACE framework governs this transition: Point of Pain (restate the user's need back to them), Answer (present the solution directly), Credential (one piece of evidence — a result, a client, a number), Extension (the next step). PACE is four sentences minimum, seven sentences maximum. Any longer and you are writing a brochure, not having a conversation.
The discipline of PACE is that it forces you to earn the presentation by opening with the user's own words. Starting with "You mentioned you need more leads and your current agency isn't delivering" is more persuasive than starting with "Our service offers." One is about them. One is about you. Bots that open with "Our service offers" lose prospects at a higher rate — because the prospect hasn't been shown they were heard.
- Write a PACE script for each of your top 3 services
- Include one specific result number (not a range) in every Credential section
- End every PACE with a clear binary CTA — yes/no decision, not an open question
- Keep each PACE to 4-7 sentences — test by reading aloud
Price Anchoring in Chat — How to Present Cost Without Losing the Lead
Price is the moment most bots go wrong. They either refuse to give a number ("pricing varies, contact us") which destroys trust, or they dump a price with no context which causes sticker shock. The right approach is anchored presentation: state the value before the cost, give a range rather than a fixed number, and immediately follow the price with a bridge to the next step.
The anchor sequence: (1) restate the outcome they get, (2) give the price range, (3) normalize the investment with a comparison or ROI frame, (4) move to next step. Never let a price hang in the air without a frame. A number alone is just a number. A number with an ROI context is an investment decision.
- Write price anchoring scripts for every service tier
- Always state the outcome before the number
- Include an ROI frame (client LTV × lead volume vs. investment)
- Follow every price reveal with a specific next-step prompt
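The ROI frame in the checklist is simple arithmetic worth making explicit. A sketch; the numbers in the test are invented for illustration:

```python
def roi_frame(client_ltv: float, extra_clients_per_month: float,
              monthly_investment: float) -> float:
    """Return the ROI multiple used to frame the price:
    (revenue gained per month) / (cost per month)."""
    return (client_ltv * extra_clients_per_month) / monthly_investment
```

If a client is worth $2,000 and the system brings three extra clients a month against a $1,500/month investment, the frame is a 4x return, which is the context that turns a number into an investment decision.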
Objection Handling Trees — The Four Objections and How to Script Them
In chatbot sales flows, 90% of objections fall into four categories: price ("too expensive"), timing ("not right now"), trust ("I've been burned before"), and authority ("I need to check with my partner"). Each one has a distinct handling approach — and mixing them up is worse than giving no response at all. Treating a trust objection as a price objection (offering a discount when someone is actually skeptical) destroys the conversation.
Build a decision tree: detect the objection type from keywords, route to the correct handling script, include a re-engage question at the end. The re-engage question is critical — every objection handling script must end with a question that invites the user to continue. Without it, you've handled the objection into a dead end.
- Write detection keywords for all four objection types (price, timing, trust, authority)
- Build a separate handling script for each — do not combine
- End every objection script with a forward question
- Build a "second objection" path — if they object twice, route to human
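The decision tree is keyword detection plus a script lookup. A sketch, where both the detection keywords and the re-engage copy are illustrative placeholders you would replace with your own:

```python
# Detection keywords per objection type (illustrative; extend from real chats).
OBJECTION_KEYWORDS = {
    "price": ["too expensive", "can't afford", "cheaper"],
    "timing": ["not right now", "maybe later", "next quarter"],
    "trust": ["been burned", "scam", "how do i know"],
    "authority": ["check with", "my partner", "ask my boss"],
}

# Every handling script ends with a forward question (placeholder copy).
RE_ENGAGE = {
    "price": "If the numbers worked, is this something you'd want to move on?",
    "timing": "What would need to change for the timing to be right?",
    "trust": "What would you need to see to feel confident this is real?",
    "authority": "What questions do you think they'd have?",
}

def handle_objection(message: str):
    """Detect the objection type; return (type, re-engage question),
    or (None, None) if no objection is present."""
    lower = message.lower()
    for objection, keywords in OBJECTION_KEYWORDS.items():
        if any(k in lower for k in keywords):
            return objection, RE_ENGAGE[objection]
    return None, None
```

Because the four types route to four separate scripts, a trust objection can never trigger a discount, which is exactly the mix-up the module warns against.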
Close Sequences & Follow-Up Triggers
The close in a chatbot context is a booking, not a signature. The bot's job is to get the prospect into a meeting with a human (or, for lower-ticket items, directly into a payment flow). The close sequence has three steps: confirm interest, reduce friction, lock the commitment. Do not ask "would you like to schedule a call?" — that is a question with an easy no. Ask "I have Tuesday at 2pm or Wednesday at 10am — which works better for you?" That is a choice between two yeses.
Every prospect who doesn't close in session should enter a follow-up trigger sequence. The bot captures enough contact info to continue the conversation via a different channel. Typical sequence: WhatsApp message 24 hours later, email 48 hours later, final WhatsApp 7 days later. Three touches, then move to a low-frequency nurture. Do not spam. Four unanswered touches is harassment, not sales.
- Build a 2-option time-slot close for every booking flow — never open-ended calendar links
- Automate calendar invite via Calendly API or native CRM calendar
- Build 3-touch follow-up sequence for unclosed sessions (24h, 48h, 7-day)
- Define the "stop follow-up" trigger — unsubscribe, or 3 unanswered messages
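The 24h/48h/7-day sequence is a fixed schedule computed from the moment the session ends without a close. A minimal sketch:

```python
from datetime import datetime, timedelta

# The 3-touch cadence from this module: WhatsApp, email, final WhatsApp.
TOUCH_OFFSETS = [
    ("whatsapp", timedelta(hours=24)),
    ("email", timedelta(hours=48)),
    ("whatsapp", timedelta(days=7)),
]

def schedule_touches(session_end: datetime) -> list:
    """Return the 3-touch follow-up plan for an unclosed session."""
    return [(channel, session_end + offset) for channel, offset in TOUCH_OFFSETS]
```

A scheduler would also consult the stop triggers (unsubscribe, or three unanswered messages) before firing each touch; that check is the difference between follow-up and spam.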
Customer Service Automation
Customer service is where chatbot automation pays its rent. FAQ deflection, order status, ticket creation, refunds — these interactions are high-volume, low-complexity, and perfectly suited to automation. A well-built service bot resolves 60-70% of inbound contacts without human involvement. The ones that fail add friction instead of removing it.
FAQ Resolution Flow — The Three-Strike System
FAQ resolution is not "show the user the FAQ page." It is delivering the precise answer to the precise question in the precise moment. The three-strike system: first, attempt direct intent match and deliver the answer. Second, if the user signals the answer didn't help, rephrase and add one more detail. Third, if the user is still unsatisfied, escalate to ticket creation with the full conversation as context. Three strikes, then a human owns it.
The most common FAQ failure is answer mismatch — the bot correctly identifies the intent but delivers a generic answer that doesn't match the user's specific situation. Fix this by building conditional FAQ responses: "if the user mentioned X in a prior turn, use answer variant B, not A." Context-aware FAQ responses are 40% more satisfying to users than static responses, based on internal conversation rating data.
- List your top 20 FAQ topics and write a primary answer + one "didn't help" variant for each
- Add context-conditional variants for your top 5 FAQs
- Set strike 3 to auto-create a ticket — never let a user loop more than 3 times on the same topic
- Review FAQ match rate monthly — any FAQ under 80% match accuracy needs retraining
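The three-strike system reduces to a tiny state machine over a per-topic strike counter. A sketch:

```python
def faq_action(strike: int) -> str:
    """Map the current strike count on a topic to the next FAQ action."""
    if strike == 0:
        return "direct_answer"         # strike 1: match intent, deliver answer
    if strike == 1:
        return "rephrase_with_detail"  # strike 2: rephrase, add one more detail
    return "escalate_to_ticket"        # strike 3: human owns it, full context attached
```

The counter must be per topic, not per session: a user who asks two different questions has not struck out on either.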
Ticket Creation, Status Lookup & Escalation Routing
When the bot cannot resolve an issue, it creates a ticket. The ticket must contain everything the human agent needs: conversation transcript, issue category, severity level, user contact info, prior resolution attempts, and a plain-language summary of what went wrong and what was tried. A ticket that arrives without context forces the human to re-ask questions the user already answered — which is the fastest way to turn an annoyed customer into a lost customer.
Status lookup is one of the highest-value automations in customer service. "Where is my order?" and "What's the status of my ticket?" are two of the top five most common service inquiries across every industry. Build a status lookup flow that authenticates the user (email or order number), queries your system in real-time, and delivers a human-readable status update. Real-time lookup eliminates the need for a human in the loop for 80% of status inquiries.
- Define the minimum data fields for every ticket type (7 fields minimum)
- Build real-time status lookup via API connection to your helpdesk
- Set SLA commitments by tier — hot ticket (2h), standard (24h), low (48h)
- Write Spanish versions of all ticket confirmation and status messages
Complaint Routing & Satisfaction Surveys
Complaints are data. Mishandled, they become churn and negative reviews. Handled well, they become retention and loyalty. The bot's role in a complaint is not to resolve it — it is to de-escalate, acknowledge, and route with speed. The fastest path to de-escalation is demonstrating that someone has taken ownership and set a real timeline. Vague promises accelerate anger.
Satisfaction surveys should be triggered at the end of every resolved interaction — not after every conversation, but after resolution events (ticket closed, FAQ answered, refund issued). Keep the survey to one question: "Did we resolve your issue today?" Yes/No. If no, trigger a re-open flow. If yes, optionally ask "On a scale of 1-5, how would you rate the experience?" Do not make the survey mandatory. Do not ask more than two questions. Survey fatigue kills response rates, and zero data is worse than imperfect data.
- Write a complaint de-escalation script for your top 3 complaint types
- Set maximum response time commitment you can actually keep (not a fantasy SLA)
- Build satisfaction survey as a triggered post-resolution event, not a scheduled batch
- Route all 1-2 star survey responses to a human review queue within 1 hour
Multi-Platform Deployment
Every channel your customers use has a different technical environment, character limit, media capability, and user behavior pattern. A bot designed for the web widget and copy-pasted onto WhatsApp will perform badly — not because the content is wrong, but because the format is wrong. Platform-specific design is not optional. This module covers the five channels NetWebMedia runs and what makes each one different.
WhatsApp Business API — The Highest-ROI Channel
WhatsApp is the most effective channel in the stack. Open rates exceed 95%. Response rates are 3-5x higher than email. And because WhatsApp conversations happen in a personal messaging context, the tone threshold for what feels "pushy" is lower than email cold outreach. Users who contact you via WhatsApp expect a real conversation — make sure they get one.
Technical constraints: messages are limited to 4,096 characters (rarely an issue). Media support is rich: images, PDFs, video, audio. Template messages (for outbound proactive messages) must be pre-approved by Meta — approval takes 24-72 hours, plan accordingly. Session windows: once a user messages you, you have a 24-hour window to respond freely. After 24 hours without user response, you can only send pre-approved templates. Build your follow-up sequences around this window.
- Verify your WhatsApp Business Account (green checkmark) — essential for trust
- Pre-approve at least 5 template messages for outbound follow-ups before launch
- Set automated welcome message triggered on first user contact
- Build 24-hour session tracking — alert system when session window closes
- Media strategy: use images for product/service examples, PDF for proposals
- Apply for and verify WhatsApp Business API access (via Meta or approved BSP)
- Submit 5 template messages for outbound follow-up approval
- Write a welcome message that fires within 5 seconds of first user contact
- Build a re-engagement template for leads who go silent after 24 hours
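The 24-hour session tracking item above reduces to a small state machine: record the timestamp of every inbound user message, and gate free-form replies on whether the window is still open. A minimal sketch, with the class and method names being assumptions rather than any vendor SDK:

```python
from datetime import datetime, timedelta

SESSION_WINDOW = timedelta(hours=24)

class SessionTracker:
    """Track the WhatsApp 24-hour customer-service window per user.
    The window opens (and resets) on every inbound user message."""

    def __init__(self):
        self.last_user_message = {}  # user_id -> datetime of last inbound message

    def record_inbound(self, user_id: str, at: datetime) -> None:
        self.last_user_message[user_id] = at

    def can_send_freeform(self, user_id: str, now: datetime) -> bool:
        """Inside the window: free-form replies. Outside: templates only."""
        last = self.last_user_message.get(user_id)
        return last is not None and now - last < SESSION_WINDOW
```

Wire the `False` branch to your pre-approved template sender and to the "session window closing" alert from the checklist.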
Instagram DM Automation — Context from the Feed
Instagram DMs carry unique context: you often know which post the user came from. A user who DMs after seeing a Reel about chatbot automation has different intent than one who DMs from your bio. Build your Instagram bot to read the entry point and customize the opening response accordingly. This context-aware opening increases engagement rates significantly compared to a generic "Hi, how can I help?"
Constraints: Instagram's messaging API is accessed via Meta's Graph API, the same infrastructure as Messenger. Automated responses to comments (triggering a DM when someone comments a keyword) are subject to Meta's spam policies — keep keyword triggers specific and relevant. Character limits are generous (1,000 characters per message). Stories can carry a link sticker that opens a DM thread with a pre-populated message — use this for lead generation campaigns.
- Map the top 3 posts/Reels that drive DMs — write custom opening scripts for each
- Set up keyword trigger automation for your top 3 comment keywords ("info", "price", "how")
- Build a generic fallback for DMs without a detectable entry point
- Test comment-to-DM trigger with a real post before scaling to campaigns
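The entry-point routing described above is a lookup plus a fallback. A minimal sketch; the entry-point keys and the opener copy are placeholders to replace with your own top posts and scripts:

```python
# Hypothetical entry-point -> opener routing. Keys and copy are placeholders;
# map them to the top 3 posts/Reels identified in the checklist above.
OPENERS = {
    "reel_chatbot_demo": "Saw you came from our chatbot Reel — want a quick demo?",
    "post_pricing": "That pricing post gets a lot of questions — what's yours?",
}
GENERIC_OPENER = "Hey — what brought you here today?"

def opening_message(entry_point):
    """Pick a context-aware opener when the entry point is known,
    otherwise fall back to the generic greeting."""
    return OPENERS.get(entry_point or "", GENERIC_OPENER)
```

The generic fallback is the same opener you would use for a DM with no detectable entry point, so nothing breaks when attribution is missing.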
Facebook Messenger, Web Widget & SMS — Platform-Specific Notes
Facebook Messenger uses the same API infrastructure as Instagram DMs (Meta Graph API). The audience skews older than Instagram. Messenger supports rich cards (image + title + buttons), quick replies, and persistent menu — use persistent menu for returning users who need navigation shortcuts, not for first-time conversations. The 24-hour session window rule applies here identically to WhatsApp.
Web widget is your highest-context channel — you know the page the user is on, their session history, and often their CRM record if they're a returning customer. Build page-aware triggers: users on the pricing page see a different opener than users on the homepage. Users on the blog see a content-aware opener. The web widget is also your only channel where full HTML formatting renders (buttons, carousels) — use it for the richest interactive experiences.
SMS has the tightest constraints: 160 characters per segment (concatenated messages up to 1,600 chars), no rich media in standard SMS, no session window concept. SMS is best used for transactional notifications (appointment reminders, ticket updates, follow-ups) rather than open conversation flows. Keep SMS messages under 320 characters. Never initiate SMS contact without explicit opt-in — the regulatory exposure (TCPA in the US) is significant.
- Build page-aware web widget triggers for pricing, homepage, and blog pages minimum
- Messenger: design persistent menu for 3 main navigation paths (Sales / Support / Account)
- SMS: write all follow-up templates under 320 characters
- Confirm TCPA compliance for SMS — double opt-in required before first message
- Test each platform independently before cross-channel routing is activated
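The SMS length rules above are easy to enforce programmatically. A sketch assuming plain GSM-7 encoding (160 characters in a single segment, 153 per segment once a message concatenates); the 320-character cap mirrors the policy stated above:

```python
def sms_segments(text: str) -> int:
    """Estimate SMS segments, assuming plain GSM-7 encoding:
    up to 160 chars fits one segment; longer messages split into
    153-char concatenated parts."""
    n = len(text)
    if n == 0:
        return 0
    if n <= 160:
        return 1
    return -(-n // 153)  # ceiling division

def within_policy(text: str, max_chars: int = 320) -> bool:
    """Enforce the playbook's 320-character cap for SMS templates."""
    return len(text) <= max_chars
```

Run every follow-up template through `within_policy` at build time, not at send time, so an over-length template fails review instead of costing extra segments in production.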
AI & NLP Integration
Rule-based bots are predictable but brittle. AI-powered bots handle variance, edge cases, and novel phrasing that would break any decision tree. This module is about integrating a real LLM (OpenAI or Anthropic) into your chatbot stack to give it genuine language understanding — while keeping it constrained enough to be reliable in production.
Connecting to OpenAI / Claude API — Architecture Decision
The architecture question is not "which LLM is best" — it is "where does the LLM sit in your system." Two patterns: LLM-as-classifier (the LLM only handles intent classification and entity extraction; all responses come from pre-written scripts) and LLM-as-responder (the LLM generates responses directly from a system prompt + knowledge base). Both have legitimate use cases. LLM-as-classifier is cheaper, more predictable, and easier to audit. LLM-as-responder handles edge cases better but requires more careful prompt engineering and output validation.
For most business deployments, start with LLM-as-classifier. Use the LLM to classify intent and extract entities, then route to human-written scripts for all responses. This gives you the language flexibility of AI without the hallucination risk of fully generative responses. Add LLM-as-responder for specific flows (open-ended support, complex discovery) once you've established baseline performance metrics.
- Pattern A (LLM-as-Classifier): User message → LLM classifies intent + extracts entities → route to pre-written script → deliver response. Cost: ~$0.002/conversation. Predictability: high. Risk: low.
- Pattern B (LLM-as-Responder): User message → LLM generates response from system prompt + KB → output validation → deliver response. Cost: ~$0.02-0.10/conversation. Predictability: medium. Risk: medium (hallucination).
- Recommended start: Pattern A with Pattern B for open-ended flows only
- Choose your LLM provider (OpenAI GPT-4o or Anthropic Claude Sonnet are the production defaults in 2026)
- Decide Pattern A vs B for each flow type — document the decision
- Set up API key management with environment variables — never hardcode keys
- Implement rate limiting and fallback to rule-based responses if API is unavailable
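Pattern A plus the rule-based fallback from the checklist can be sketched as follows. The `classify_fn` parameter stands in for a real LLM call (an OpenAI or Anthropic request in practice); the intent names, scripts, and keyword rules are illustrative:

```python
# Pattern A sketch: the LLM only classifies; every response is a human-written
# script. If the API is unavailable, a keyword fallback keeps the bot alive.
SCRIPTS = {
    "pricing": "Happy to talk pricing — what are you looking to automate?",
    "support": "Let's get that fixed. What's the issue you're seeing?",
}
FALLBACK_REPLY = "Could you tell me a bit more about what you need?"

def respond(message: str, classify_fn) -> str:
    """Classify the message, then route to a pre-written script."""
    try:
        intent = classify_fn(message)       # the LLM call lives behind this
    except Exception:
        intent = keyword_fallback(message)  # rule-based fallback if the API is down
    return SCRIPTS.get(intent, FALLBACK_REPLY)

def keyword_fallback(message: str) -> str:
    """Crude rule-based classifier used only when the LLM is unreachable."""
    m = message.lower()
    if "price" in m or "cost" in m:
        return "pricing"
    if "broken" in m or "help" in m:
        return "support"
    return "unknown"
```

Because responses never come from the model, a misclassification sends a wrong-but-on-brand script instead of a hallucinated answer, which is the whole point of starting with Pattern A.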
Prompt Engineering for Production Bots
The system prompt is the most important piece of code in an LLM-powered bot. It defines the persona, the constraints, the knowledge, the tone, and the behavior in every edge case. A weak system prompt produces an unpredictable bot. A strong system prompt produces a bot that is reliably on-brand, on-topic, and useful in 95%+ of conversations.
The anatomy of a production system prompt: Role declaration (who the bot is and what business it serves), Behavioral rules (what it must and must not do), Tone guidelines (how it speaks), Knowledge scope (what topics it knows and what is out of scope), Escalation rules (when to hand off to a human), and Output format (response length, structure, language rules). Every element is mandatory. Missing one produces predictable failure modes. The example below covers all six:
You are Alex, the AI assistant for NetWebMedia, a digital marketing agency. Your role is to help potential clients understand our services, qualify leads, answer support questions, and schedule meetings.
Rules you must follow: Always respond in the user's language. Never claim to be human — if asked, say you are an AI assistant. Never discuss competitors by name. Never quote a price without first understanding the user's specific need. Escalate to a human if the user expresses legal concerns, payment disputes over $500, or three consecutive expressions of frustration.
Tone: Direct, warm, expert. Short sentences. Active voice. No corporate jargon.
Knowledge scope: NetWebMedia services (SEO, paid ads, chatbot automation, CRM), general digital marketing questions. Out of scope: legal advice, financial advice, competitor pricing, personal topics.
Response length: Maximum 3 short paragraphs or 5 bullet points. Prefer shorter.
- Write your full system prompt covering all 6 anatomy elements
- Test the system prompt against 20 edge case inputs before deploying
- Version-control your system prompt — date every change
- Review system prompt performance monthly — update when failure patterns emerge
Context Management & Memory Systems
LLMs are stateless. Every API call is independent — the model remembers nothing from the previous message unless you explicitly include it. Context management is your engineering responsibility. The simplest approach: include the full conversation history in every API call, up to the model's context window limit. This works for short conversations (under 20 exchanges). For longer conversations or returning users, you need a memory system.
Three levels of memory: Conversation memory (the current session — include in context window), Short-term memory (cross-session, last 7 days — store in fast key-value store like Redis, pull relevant facts on session start), and Long-term memory (persistent user profile — stored in CRM or vector DB, retrieved via semantic search). Most business chatbots need Conversation + Short-term. Long-term memory is for high-touch use cases where personalization at scale matters.
- Implement conversation history injection into every LLM API call
- Set maximum context window usage — truncate old turns if approaching limit
- Build short-term memory for returning users (store: name, company, stated need, last intent)
- Test returning user experience — does the bot remember who they are?
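History injection with truncation can be sketched as below. One loud assumption: token counts are approximated as characters divided by four; a production system would use the provider's actual tokenizer. The message-dict shape follows the common chat-API convention of `role`/`content` pairs:

```python
# Sketch: inject recent history into each LLM call, dropping the oldest
# turns first when the context budget is exceeded. Token counting is a
# chars/4 approximation — use the provider's tokenizer in production.
def build_messages(system_prompt: str, history, max_tokens: int = 4000):
    """Return [system] + the newest turns that fit under the budget."""
    budget = max_tokens - len(system_prompt) // 4
    kept = []
    for turn in reversed(history):              # walk newest-first
        cost = len(turn["content"]) // 4 + 1    # +1 for per-turn overhead
        if budget - cost < 0:
            break                               # everything older is dropped
        budget -= cost
        kept.append(turn)
    kept.reverse()                              # restore chronological order
    return [{"role": "system", "content": system_prompt}] + kept
```

For returning users, prepend the short-term memory facts (name, company, stated need, last intent) into the system prompt before calling this, so they survive truncation even in long sessions.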
Handoff Protocols & Escalation Logic
The promise of full automation is not that humans never get involved — it is that humans only get involved when they add value. A bot that hands off too easily wastes your team's time. A bot that never hands off loses customers. The line between them is your escalation logic, and it needs to be designed, not discovered through failure.
Trigger Conditions — When the Bot Hands Off
Escalation triggers fall into two categories: hard triggers (always escalate, no exceptions) and soft triggers (escalate if other conditions are also met). Hard triggers: user explicitly requests a human, user expresses a legal complaint, user uses crisis language (self-harm signals, emergency situations), payment dispute above your threshold, or three consecutive unresolved clarification loops. These are non-negotiable — no business logic overrides them.
Soft triggers: user sentiment has been negative for three consecutive turns, confidence score has been below 0.50 for two consecutive intents, the session is 15+ minutes old without a resolution, or the user's lead score is 80+ and the bot hasn't converted them. Soft triggers escalate only if a human agent is available. If not, they move to a priority queue with an explicit wait-time commitment communicated to the user.
- Hard triggers (always escalate): Explicit human request, legal complaint, crisis language, payment dispute above threshold, 3 clarification loops
- Soft triggers (escalate if agent available): 3 negative-sentiment turns, 2 low-confidence intents, 15-min session without resolution, hot lead not converting
- After-hours rule: Hard trigger → priority queue with real callback time. Soft trigger → schedule a callback for next business day
- Document every hard trigger and confirm they are programmatically enforced (not prompt-based)
- Define your soft trigger thresholds and test them with synthetic conversations
- Build an agent availability check — bot must know if humans are online before promising immediate handoff
- Write after-hours escalation scripts that set accurate expectations
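The hard/soft trigger table plus the availability and after-hours rules collapse into one decision function. A sketch; the signal names are illustrative labels for the triggers listed above, and the return values are routing codes your orchestration layer would interpret:

```python
# Escalation decision sketch. Signal names mirror the trigger lists above;
# return values are hypothetical routing codes for the orchestration layer.
HARD = {"human_requested", "legal_complaint", "crisis_language",
        "payment_dispute_over_threshold", "clarification_loop_x3"}
SOFT = {"negative_sentiment_x3", "low_confidence_x2",
        "session_over_15min", "hot_lead_stalled"}

def escalation_decision(signals, agent_online):
    """Hard triggers always leave automation. Soft triggers hand off only
    when an agent is actually online, per the availability-check rule."""
    if signals & HARD:
        return "handoff_now" if agent_online else "priority_queue"
    if signals & SOFT:
        return "handoff_now" if agent_online else "next_business_day_callback"
    return "stay_automated"
```

Note that hard triggers live in code, not in the system prompt — exactly the programmatic enforcement the checklist demands.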
Warm Transfer Scripts & Context Packaging
A cold handoff is: "I'm transferring you now." A warm handoff is: "I'm connecting you with Sarah from our account team. I've sent her the full context of our conversation — she already knows about your 30-day timeline and $2,500 budget. She'll be with you in about 2 minutes." The difference in customer experience is significant. The difference in first-call resolution rate is measurable — warm handoffs resolve in one touch at roughly 2x the rate of cold handoffs.
Context packaging is the data object the bot assembles before the handoff: conversation summary (2-3 sentences), intent path (what flows were triggered), BANT data (if qualification occurred), lead score, contact info, channel of origin, and the reason for escalation. This package is displayed to the human agent in your CRM or helpdesk before they engage. Agents should never start a conversation with "So, what can I help you with?" when the bot has already gathered that information.
- Write warm transfer scripts for your top 3 escalation scenarios
- Build the context package object — define all fields and test with 5 mock handoffs
- Configure your CRM/helpdesk to display the context package before the agent's first message
- Train agents on the context package format — they should never re-ask questions already answered
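The context package described above is just a typed object with a serializer. A sketch of the shape; the field names are illustrative and should be mapped onto your actual CRM schema:

```python
from dataclasses import dataclass, asdict

@dataclass
class ContextPackage:
    """Handoff payload assembled before every escalation.
    Field names are illustrative — map them to your CRM schema."""
    summary: str            # 2-3 sentence conversation summary
    intent_path: list       # flows triggered, in order
    bant: dict              # budget/authority/need/timeline, if gathered
    lead_score: int
    contact: dict
    channel: str
    escalation_reason: str

def package_for_agent(pkg: ContextPackage) -> dict:
    """Serialize for display in the CRM/helpdesk before the agent engages."""
    return asdict(pkg)
```

Using a dataclass (rather than a loose dict) means a missing field fails at package-assembly time, not in front of the customer during the handoff.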
After-Hours Handling & SLA Rules
After-hours conversations are where most automation systems fail. The bot either pretends it's going to escalate immediately (and doesn't), or it drops the conversation entirely. Both outcomes damage trust. The correct approach: acknowledge the time constraint honestly, commit to a specific callback time, and collect the contact information to fulfill that commitment.
SLA rules must be defined, documented, and enforced programmatically. Three tiers: urgent (legal, payment, crisis — 2 hours, any time including after-hours), standard (qualification, support — next business day by 10am), and low-priority (general inquiry, nurture — 48 business hours). Every user who triggers a handoff receives a specific commitment matching their tier, not a generic "someone will be in touch soon."
- Define your business hours and configure the bot to detect after-hours status
- Write after-hours scripts for each SLA tier
- Build a real on-call process for urgent escalations — the bot can only promise what the team can deliver
- Test after-hours flow — send a test message at 2am and verify the correct script fires
Testing, Analytics & Optimization
A chatbot deployed without a measurement system is a chatbot that will slowly degrade. Language evolves, user behavior shifts, and intents that were rare become common. The testing and optimization cadence is what keeps your system performing at launch levels — and improving past them.
QA Conversation Testing — Pre-Launch Protocol
Before any bot goes live, run 50 scripted test conversations covering your full intent taxonomy plus all failure states. Scripted tests alone are not enough — also run 10 unscripted tests where real people attempt to use the bot naturally, without knowing what intents exist. The unscripted tests will reveal the gaps your scripted tests don't. Users say things you never anticipated. That is not a complaint — it is a testing opportunity.
Track four outcomes per test conversation: intent classification accuracy (did the bot identify the right intent?), entity extraction accuracy (did it extract the right data?), response appropriateness (was the response helpful and on-brand?), and flow completion (did the conversation reach its intended outcome?). Score each conversation on a 0-3 scale for each dimension. Launch target: every dimension must clear its own threshold (listed below). Below threshold, keep training.
- Intent Accuracy: Correct intent identified / Total intents tested — target 85%+
- Entity Accuracy: Entities correctly extracted / Total entities in test set — target 90%+
- Response Appropriateness: Responses rated helpful + on-brand / Total responses — target 80%+
- Flow Completion: Conversations reaching intended outcome / Total conversations — target 70%+
- Launch gate: All four metrics must meet threshold simultaneously
- Write 50 scripted test cases covering full intent taxonomy + all failure states
- Run 10 unscripted tests with real people unfamiliar with the bot design
- Score every test conversation on the four-dimension scorecard
- Do not launch until all four metrics meet threshold — fix gaps first
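One way to operationalize the launch gate is shown below. It rests on an assumption not stated in the text: a conversation "passes" a dimension when its 0-3 score is 2 or higher, and the per-dimension pass rate is what gets compared to the thresholds above:

```python
# Launch-gate sketch. Assumption: a 0-3 score of 2+ counts as a pass for
# that dimension. All four pass rates must clear threshold simultaneously.
THRESHOLDS = {"intent": 0.85, "entity": 0.90, "response": 0.80, "flow": 0.70}

def launch_ready(scores) -> bool:
    """scores: one dict per test conversation, each dimension scored 0-3.
    Returns True only if every dimension's pass rate meets its threshold."""
    for dim, threshold in THRESHOLDS.items():
        passed = sum(1 for s in scores if s[dim] >= 2)
        if passed / len(scores) < threshold:
            return False
    return True
```

Whatever pass rule you choose, fix it before testing starts, so the gate cannot be renegotiated after the scores come in.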
Key Metrics — What to Measure and Why
Five metrics run your chatbot's performance dashboard. Intent recognition rate: the percentage of messages correctly classified. Below 80% and your training data needs expansion. Containment rate: the percentage of conversations fully resolved by the bot without human escalation. Target varies by use case — 60% is a reasonable baseline for a mixed sales/support bot. Conversion rate: the percentage of chatbot conversations that result in a qualified lead, booked meeting, or completed transaction. Drop-off rate by flow step: where in your defined flows do users abandon? High drop-off on step 3 of a 5-step flow means step 3 is broken. Average conversation length: measured in turns (exchanges). Too short suggests users aren't engaging. Too long suggests the bot isn't resolving efficiently.
- Intent Recognition Rate — target 85%+ | alert threshold: below 75%
- Containment Rate — target 60%+ | alert threshold: below 45%
- Conversion Rate — target 8-15% for cold traffic | alert threshold: below 5%
- Drop-off Rate by Step — any step above 40% drop-off is a fix priority
- Average Turns to Resolution — target 4-8 turns | alert: below 2 (too shallow) or above 12 (too complex)
- Configure your analytics dashboard to track all 5 KPIs from day one
- Set alert thresholds — automated notification when any metric drops below threshold
- Build a funnel view for each major flow showing drop-off at each step
- Review dashboard weekly for first 60 days, then monthly once stable
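The alert-threshold item above is a comparison loop. A sketch using the thresholds from the KPI list; the metric keys are illustrative names, and average turns gets a two-sided check because both too-shallow and too-long conversations are failure signals:

```python
# KPI alert sketch. Limits mirror the thresholds listed above; metric keys
# are illustrative. avg_turns is two-sided: below 2 or above 12 both alert.
ALERTS = {
    "intent_recognition": (0.75, None),  # (min, max); None = no upper bound
    "containment":        (0.45, None),
    "conversion":         (0.05, None),
    "avg_turns":          (2, 12),
}

def breached(kpis) -> list:
    """Return the names of metrics outside their alert bounds."""
    out = []
    for name, (lo, hi) in ALERTS.items():
        value = kpis.get(name)
        if value is None:
            continue  # metric not reported this period
        if value < lo or (hi is not None and value > hi):
            out.append(name)
    return out
```

Feed the returned list into whatever notification channel your team actually reads; an alert that lands in an ignored inbox is the same as no alert.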
A/B Testing Flows — What to Test and How
A/B testing a chatbot is not the same as A/B testing a landing page. Conversation is sequential — a change in turn 1 affects the likelihood of every subsequent turn. Test one element at a time, wait for statistical significance (minimum 200 complete conversations per variant), and measure impact on the end goal (conversion), not just the local metric (click rate on a button). Testing a button color in a chatbot is a waste of time. Testing whether a BANT opener vs. a pain-first opener produces more qualified leads is worth doing.
High-value test candidates: opening message (the first thing the bot says), price anchoring script (sequence A vs. B), BANT question order, objection handling approach for your most common objection, follow-up timing (24h vs. 48h), and call-to-action phrasing in the close sequence. Run one test at a time. Document every test with hypothesis, variant definitions, success metric, and result.
- Identify your top 3 A/B test candidates from drop-off data
- Define success metric for each test before running it
- Minimum 200 conversations per variant before calling a winner
- Document every test result — wins and losses are both valuable data
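The "minimum 200 conversations, then check significance" rule can be sketched with a standard two-proportion z-test. This is one reasonable test choice, not the only one; the normal approximation is fine at these sample sizes, and the 0.05 alpha is an assumption you can tighten:

```python
from math import sqrt, erf

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for a difference in conversion rates
    (pooled two-proportion z-test, normal approximation)."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = abs(conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))  # 2 * (1 - Phi(z))

def call_winner(conv_a, n_a, conv_b, n_b, min_n=200, alpha=0.05) -> str:
    """Respect the 200-conversation minimum before declaring anything."""
    if n_a < min_n or n_b < min_n:
        return "keep_running"
    if two_proportion_p_value(conv_a, n_a, conv_b, n_b) > alpha:
        return "no_significant_difference"
    return "A" if conv_a / n_a > conv_b / n_b else "B"
```

The `keep_running` guard is the part most teams skip: a "winner" declared at 80 conversations per variant is noise, and acting on it makes the bot worse while looking like progress.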
Monthly Optimization Cadence — The 4-Step Review
Set a fixed monthly date for your optimization review. Skipping it for two months in a row means your bot is degrading undetected. The four-step review takes two hours maximum and produces a prioritized action list for the month ahead.
Step 1 — Data review: Pull the 5 core KPIs. Flag anything that moved more than 10% from last month. Step 2 — Failure analysis: Pull all low-confidence events and misclassified intents from the past 30 days. Read 20 at random. Every one is a training phrase to add or a flow to fix. Step 3 — Drop-off analysis: Identify the one flow with the highest drop-off rate at any step. That is your priority fix for the month — not the lowest-hanging fruit, the highest-impact problem. Step 4 — Prioritized action list: Output a list of 3-5 specific changes (not "improve the bot" — "add 10 training phrases to the price_objection intent" or "rewrite step 3 of the qualification flow").
- Step 1: Review 5 KPIs vs. prior month — flag changes >10%
- Step 2: Read 20 random low-confidence / misclassified events — extract training phrases
- Step 3: Identify the single highest-drop-off flow step
- Step 4: Output 3-5 specific, actionable changes for the coming month
- Time budget: 2 hours total. If it takes longer, you are analyzing instead of deciding.
- Book a recurring monthly calendar block for the optimization review — same day, same time
- Create a shared doc or Notion template for the 4-step review output
- Assign ownership of each action item with a deadline
- Review the previous month's action items at the start of each review — did they move the metric?
The business that automates first contact across every channel doesn't just save time — it creates a compounding advantage. Every conversation is data. Every data point improves the next conversation. The businesses still answering every inquiry manually are running a system that degrades with volume. Automation scales. People don't have to.
What You Have Now
Nine modules. Forty-one lessons. Every intent mapped. Every failure state scripted. Every platform addressed. The DIAL framework gives you a mental model for any conversation design decision. The BANT automation gives you a qualification engine that runs at any scale. The PACE script gives your bot a sales voice that earns the presentation. The escalation logic gives you a system where humans are always in the right place at the right time — not every place all the time.
This is exactly the system NetWebMedia runs. No one answers the phone. Everything qualified comes through the bot, scored, contextualized, and ready for a human to close. The system handles the floor so humans can own the ceiling. Deploy it with discipline and the same will be true for your business.
Want this system built for your business?
NetWebMedia deploys full chatbot automation across WhatsApp, Instagram, web, Facebook, and SMS — strategy through launch. Request a free AI audit and we'll map the exact flows your business needs in a written report within 48 hours.
Request Free AI Audit →