How I Built a SaaS That Deploys AI Chatbots to Telegram in 2 Minutes


I recently launched ClawBotCloud, a managed platform for deploying AI chatbots to Telegram. Users configure a bot through a web dashboard, connect their Telegram token, and deploy. Each bot runs in its own isolated container on Fly.io, powered by Claude.

In this post, I want to walk through the architecture decisions, trade-offs, and lessons learned building it as a solo developer.

The Problem

Setting up an AI chatbot on Telegram requires a surprising amount of infrastructure:

  • A server (VPS, cloud instance, or a machine at home)
  • Docker or a process manager to keep the bot running
  • API key management and secure storage
  • Telegram Bot API configuration
  • Monitoring and crash recovery
  • SSL certificates if using webhooks

For a developer, this is a weekend project. For a small business owner who just wants an AI assistant in their Telegram group, it's a brick wall.

I wanted to reduce that to: configure → connect → deploy.

Architecture Overview

Here's the high-level architecture:

┌─────────────┐     ┌──────────────┐     ┌─────────────────┐
│  Next.js    │────▶│  PostgreSQL  │     │  Fly.io         │
│  Dashboard  │     │  (Neon)      │     │  Machines       │
│             │────▶│  Redis       │     │                 │
│  (Vercel)   │     │  (Upstash)   │     │  ┌───────────┐  │
│             │─────┼──────────────┼────▶│  │ Bot #1    │  │
│             │     │  Stripe      │     │  │ (OpenClaw)│  │
└─────────────┘     └──────────────┘     │  └───────────┘  │
                                         │  ┌───────────┐  │
                                         │  │ Bot #2    │  │
                                         │  │ (OpenClaw)│  │
                                         │  └───────────┘  │
                                         │  ┌───────────┐  │
                                         │  │ Bot #N    │  │
                                         │  │ (OpenClaw)│  │
                                         │  └───────────┘  │
                                         └─────────────────┘

The key insight: each bot is its own Fly.io machine. Not a shared process, not a worker in a queue — a fully isolated container.

Tech Stack Decisions

Next.js 15 (App Router) — The Dashboard

The dashboard handles user auth, bot configuration, billing, and bot management. I went with Next.js 15 and the App Router because:

  • Server components reduce client-side JS (the dashboard is mostly forms and tables)
  • Server actions simplify the API layer — no separate REST endpoints for CRUD operations
  • Vercel deployment is zero-config
// Example: Server action for creating a bot
'use server'

import { auth } from '@/lib/auth'
import { db } from '@/lib/db'
import { createFlyMachine } from '@/lib/fly'
import { encrypt } from '@/lib/crypto' // encrypt() from the Encryption section below

export async function createBot(formData: FormData) {
  const session = await auth()
  if (!session?.user) throw new Error('Unauthorized')

  const bot = await db.bot.create({
    data: {
      name: formData.get('name') as string,
      systemPrompt: formData.get('systemPrompt') as string,
      // telegramToken is required by the schema; store it encrypted
      telegramToken: encrypt(formData.get('telegramToken') as string),
      userId: session.user.id,
      status: 'PROVISIONING', // matches the BotStatus enum
    },
  })

  // Provision the Fly.io machine; mark the bot as errored if it fails
  try {
    await createFlyMachine(bot.id, {
      telegramToken: formData.get('telegramToken') as string,
      anthropicKey: formData.get('anthropicKey') as string,
      systemPrompt: formData.get('systemPrompt') as string,
    })
  } catch {
    await db.bot.update({ where: { id: bot.id }, data: { status: 'ERROR' } })
    throw new Error('Provisioning failed')
  }

  return bot
}

PostgreSQL (Neon) — The Database

I chose Neon's serverless PostgreSQL for a few reasons:

  • Scale-to-zero means I'm not paying for a database that sits idle 90% of the time at launch
  • Branching is useful for testing schema changes
  • Prisma works great with it
model Bot {  
  id            String    @id @default(cuid())  
  name          String  
  systemPrompt  String    @db.Text  
  status        BotStatus @default(STOPPED)  
  telegramToken String    @db.Text  // encrypted at application level  
  userId        String  
  user          User      @relation(fields: [userId], references: [id])  
  flyMachineId  String?  
  createdAt     DateTime  @default(now())  
  updatedAt     DateTime  @updatedAt  
  
  @@index([userId])  
}  
  
enum BotStatus {  
  PROVISIONING  
  RUNNING  
  STOPPED  
  ERROR  
}  
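The `BotStatus` enum implies a small lifecycle, and it helps to enforce transitions in one place rather than scattering status writes across the codebase. Here is a sketch of such a guard; the allowed-transition map is my assumption about a sensible lifecycle, not ClawBotCloud's actual rules:

```typescript
// Hypothetical transition guard over the BotStatus lifecycle. The ALLOWED
// map below is an assumption, not the app's real rules.
type BotStatus = 'PROVISIONING' | 'RUNNING' | 'STOPPED' | 'ERROR'

const ALLOWED: Record<BotStatus, BotStatus[]> = {
  PROVISIONING: ['RUNNING', 'ERROR'],
  RUNNING: ['STOPPED', 'ERROR'],
  STOPPED: ['PROVISIONING', 'RUNNING'],
  ERROR: ['PROVISIONING', 'STOPPED'],
}

function canTransition(from: BotStatus, to: BotStatus): boolean {
  return ALLOWED[from].includes(to)
}
```

Centralizing this means a bad write (say, ERROR directly to RUNNING without re-provisioning) fails loudly instead of silently corrupting state.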

Fly.io Machines API — Bot Infrastructure

This is the most interesting part. The Fly.io Machines API lets you create, start, stop, and destroy individual containers programmatically.

// Simplified bot provisioning
async function createFlyMachine(
  botId: string,
  config: BotConfig
): Promise<string> {
  const response = await fetch(
    `https://api.machines.dev/v1/apps/${FLY_APP}/machines`,
    {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${FLY_API_TOKEN}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        name: `bot-${botId}`,
        config: {
          image: 'registry.fly.io/clawbotcloud-bot:latest',
          env: {
            BOT_ID: botId,
            // Plaintext here: the container needs usable credentials.
            // Only the database copies are stored encrypted.
            TELEGRAM_TOKEN: config.telegramToken,
            ANTHROPIC_API_KEY: config.anthropicKey,
            SYSTEM_PROMPT: config.systemPrompt,
          },
          guest: {
            cpu_kind: 'shared',
            cpus: 1,
            memory_mb: 256,
          },
          services: [],  // No public ports — bot uses outbound only
          checks: {
            alive: {
              type: 'http',
              port: 3000,
              path: '/health',
              interval: '30s',
              timeout: '5s',
            },
          },
        },
      }),
    }
  )

  if (!response.ok) {
    throw new Error(`Machine creation failed: ${response.status}`)
  }

  const machine = await response.json()
  return machine.id
}

Why Fly.io over alternatives?

| Option          | Pros                                                      | Cons                                              |
|-----------------|-----------------------------------------------------------|---------------------------------------------------|
| Fly.io Machines | Per-machine isolation, instant start/stop, pay-per-second | API can be quirky, documentation gaps             |
| AWS ECS/Fargate | Battle-tested, scalable                                   | Complex, expensive at low scale, slow cold starts |
| Railway         | Great DX                                                  | Less control over individual containers           |
| Kubernetes      | Ultimate flexibility                                      | Massive overkill for this use case                |
| Shared process  | Cheapest                                                  | No isolation, one bot crash kills all             |

Fly machines can be stopped and started in under 2 seconds. This is critical for economics — when a user cancels, I stop their machine instantly and stop paying for it. No zombie containers.
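The stop call itself is a single Machines API request (`POST .../machines/{id}/stop`, the counterpart of the create call above). Here is a sketch; `stopFlyMachine` is a hypothetical helper, and the injectable `fetchFn` exists only so it can be exercised without hitting the real API:

```typescript
// Hypothetical helper: stop a bot's machine the moment a user cancels.
// fetchFn is injectable for testing; it defaults to the global fetch.
async function stopFlyMachine(
  app: string,
  machineId: string,
  apiToken: string,
  fetchFn: typeof fetch = fetch
): Promise<boolean> {
  const res = await fetchFn(
    `https://api.machines.dev/v1/apps/${app}/machines/${machineId}/stop`,
    {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiToken}` },
    }
  )
  return res.ok
}
```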

Encryption and Security

API keys are sensitive. Here's the approach:

import { createCipheriv, createDecipheriv, randomBytes } from 'crypto'  
  
const ALGORITHM = 'aes-256-gcm'  
const KEY = Buffer.from(process.env.ENCRYPTION_KEY!, 'hex')  
  
export function encrypt(text: string): string {  
  const iv = randomBytes(16)  
  const cipher = createCipheriv(ALGORITHM, KEY, iv)  
  
  let encrypted = cipher.update(text, 'utf8', 'hex')  
  encrypted += cipher.final('hex')  
  
  const authTag = cipher.getAuthTag()  
  
  // Store IV + auth tag + ciphertext together  
  return `${iv.toString('hex')}:${authTag.toString('hex')}:${encrypted}`  
}  
  
export function decrypt(data: string): string {  
  const [ivHex, authTagHex, encrypted] = data.split(':')  
  
  const decipher = createDecipheriv(  
    ALGORITHM,  
    KEY,  
    Buffer.from(ivHex, 'hex')  
  )  
  decipher.setAuthTag(Buffer.from(authTagHex, 'hex'))  
  
  let decrypted = decipher.update(encrypted, 'hex', 'utf8')  
  decrypted += decipher.final('utf8')  
  
  return decrypted  
}  

Keys are encrypted in PostgreSQL and only decrypted when injecting into a Fly machine's environment. They're never logged, never sent to the frontend, and never stored in plain text.
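To make the scheme concrete, here is a self-contained roundtrip of the same `iv:authTag:ciphertext` format with a throwaway key. `seal`/`open` are illustrative names for this sketch; in production the key comes from `ENCRYPTION_KEY` as shown above:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto'

// Throwaway key for demonstration only; never generate the key at runtime
// in production, or previously stored ciphertexts become unreadable.
const key = randomBytes(32)

function seal(text: string): string {
  const iv = randomBytes(16)
  const c = createCipheriv('aes-256-gcm', key, iv)
  const enc = c.update(text, 'utf8', 'hex') + c.final('hex')
  return `${iv.toString('hex')}:${c.getAuthTag().toString('hex')}:${enc}`
}

function open(data: string): string {
  const [ivHex, tagHex, enc] = data.split(':')
  const d = createDecipheriv('aes-256-gcm', key, Buffer.from(ivHex, 'hex'))
  d.setAuthTag(Buffer.from(tagHex, 'hex'))
  // final() throws if the ciphertext or tag was tampered with
  return d.update(enc, 'hex', 'utf8') + d.final('utf8')
}

console.log(open(seal('hypothetical-token')) === 'hypothetical-token') // true
```

A nice property of GCM over plain CBC: the auth tag means a flipped bit anywhere in the stored blob makes decryption throw instead of silently returning garbage.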

Telegram Integration: Polling vs. Webhooks

There are two ways to receive Telegram messages:

Long polling: Your bot asks Telegram "any new messages?" in a loop.
Webhooks: Telegram sends messages to a URL you provide.

I went with long polling for a counterintuitive reason: it's simpler and more reliable for this use case.

With webhooks, each bot would need:

  • A public URL
  • An SSL certificate
  • A port exposed on the container
  • Fly.io proxy configuration

With long polling, the bot just makes outbound HTTP requests. No public ports, no SSL certs, no proxy config. The container doesn't even need to be publicly accessible, which is actually a security win.

// Inside the bot container — simplified polling loop
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms))

async function startPolling(token: string) {
  let offset = 0

  while (true) {
    try {
      const updates = await fetch(
        `https://api.telegram.org/bot${token}/getUpdates`,
        {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({
            offset,
            timeout: 30,  // long-poll for up to 30 seconds
          }),
        }
      ).then(r => r.json())

      for (const update of updates.result || []) {
        offset = update.update_id + 1  // confirm everything up to this update
        await handleMessage(update)
      }
    } catch (error) {
      console.error('Polling error:', error)
      await sleep(5000)  // back off on errors
    }
  }
}

The trade-off: long polling means each bot maintains a persistent connection. At scale (thousands of bots), this could be a problem. But at launch scale (tens of bots), it's the right choice. I can migrate to webhooks later if needed.
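The offset bookkeeping is the only subtle part of the loop: Telegram treats `offset` as confirmation of every update below it, so the next request should use the highest `update_id` seen plus one. Extracted as a pure helper (my naming, not the actual runtime's):

```typescript
// Pure helper for the polling loop's offset bookkeeping. Types simplified
// to just the field we need from a Telegram update.
interface TelegramUpdate {
  update_id: number
}

function nextOffset(current: number, updates: TelegramUpdate[]): number {
  // Telegram interprets `offset` as "everything below this id is confirmed",
  // so advance past the highest update_id seen; keep `current` if none.
  return updates.reduce((acc, u) => Math.max(acc, u.update_id + 1), current)
}

console.log(nextOffset(0, [{ update_id: 5 }, { update_id: 7 }])) // 8
```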

Billing with Stripe

Stripe handles subscriptions. Each bot maps to a Stripe subscription item:

// When user deploys a new bot
async function handleBotDeployment(userId: string, botId: string) {
  const user = await db.user.findUnique({
    where: { id: userId },
    include: { subscription: true, bots: true },  // bots needed for quantity below
  })
  if (!user) throw new Error('User not found')

  if (!user.subscription?.stripeSubscriptionId) {
    // Create new subscription
    const subscription = await stripe.subscriptions.create({
      customer: user.stripeCustomerId,
      items: [{ price: BOT_PRICE_ID, quantity: 1 }],
    })
    // Store subscription...
  } else {
    // Update quantity on existing subscription
    await stripe.subscriptions.update(
      user.subscription.stripeSubscriptionId,
      {
        items: [{
          id: user.subscription.stripeItemId,
          quantity: user.bots.filter(b => b.status === 'RUNNING').length + 1,
        }],
        proration_behavior: 'create_prorations',
      }
    )
  }
}

Users pay per bot. Add a bot → subscription quantity increases. Remove a bot → it decreases. Stripe handles prorations automatically.
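For intuition on what `create_prorations` does when a bot is added mid-cycle: Stripe charges roughly the unused fraction of the billing period. A rough sketch of the arithmetic (Stripe's real calculation is second-precise and applies its own rounding):

```typescript
// Rough illustration of proration arithmetic: charge only the fraction of
// the billing cycle that remains. Not Stripe's exact algorithm.
function proratedCharge(
  monthlyPriceCents: number,
  daysLeft: number,
  daysInCycle: number
): number {
  return Math.round(monthlyPriceCents * (daysLeft / daysInCycle))
}

// Adding a 2000-cent (€20) bot with 15 of 30 days left costs ~1000 cents now.
console.log(proratedCharge(2000, 15, 30)) // 1000
```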

Health Monitoring

Each bot container exposes a /health endpoint. Fly.io checks it every 30 seconds. If a bot goes unhealthy:

  1. Fly restarts the container automatically (first line of defense)
  2. If it fails 3 times, the dashboard shows the bot as "Error"
  3. I get notified (webhook to my monitoring)

On top of that, the dashboard polls bot status from Fly's API periodically so users see real-time status.
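Here is what a minimal /health endpoint along those lines could look like. The healthy-if-polled-recently rule and the JSON shape are assumptions for this sketch, not the actual bot runtime:

```typescript
import { createServer } from 'node:http'

// Sketch: the bot records the last time its polling loop succeeded, and
// the health check fails if that was too long ago. Thresholds are arbitrary.
let lastPollOk = Date.now()

function healthStatus(now: number, maxAgeMs: number = 120_000): number {
  return now - lastPollOk < maxAgeMs ? 200 : 503
}

const server = createServer((req, res) => {
  if (req.url === '/health') {
    const status = healthStatus(Date.now())
    res.writeHead(status, { 'Content-Type': 'application/json' })
    res.end(JSON.stringify({ ok: status === 200 }))
  } else {
    res.writeHead(404)
    res.end()
  }
})

server.listen(3000) // the port Fly's `alive` check targets
```

Tying the check to actual polling progress (rather than just "the process is up") means a bot stuck in a broken loop gets restarted too.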

Lessons Learned

1. Container-per-bot is expensive but worth it

At ~€5-7/month infrastructure cost per bot with €20/month pricing, margins are okay but not amazing. A shared-process architecture would be 10x cheaper. But the isolation guarantee is worth it — one bot with a bad system prompt can't crash another customer's bot.

2. Fly.io Machines API has quirks

The API is powerful but the documentation doesn't cover every edge case. Machine state transitions can be surprising — a machine can be in "starting" state for longer than expected, and you need to poll for the actual running state.
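One way to cope is a bounded wait loop that polls until the machine reports "started" (the Machines API state name for a running machine). A sketch with an injectable getState so it stays testable; the attempt and delay defaults are arbitrary:

```typescript
// Bounded wait for a Fly machine to reach the desired state. getState is
// injected (e.g. a wrapper around GET /machines/{id}) to keep this testable.
async function waitForState(
  getState: () => Promise<string>,
  want: string = 'started',
  attempts: number = 30,
  delayMs: number = 1000
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if ((await getState()) === want) return true
    await new Promise((r) => setTimeout(r, delayMs))
  }
  return false
}
```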

3. Start with fewer features

I almost built a full analytics dashboard, conversation logs, A/B testing for prompts, and multi-model support before launch. Glad I didn't. The MVP is: create bot, deploy bot, bot works. Everything else can come later.

4. Encryption is not optional

Even for an MVP. Users are trusting you with their API keys. If you store them in plain text "just for now," you'll forget to encrypt them later. Do it from day one.

What's Next

  • WhatsApp support — high demand, but Meta's Business API is a different beast
  • Free trial — figuring out limits that prevent abuse
  • Multiple AI models — GPT-4o, Gemini, local models
  • Conversation analytics — token usage, message volume, popular topics
  • Custom knowledge bases — upload documents for RAG

Try It

If you want to deploy a Claude-powered Telegram bot without touching a server:

https://clawbotcloud.com

The bot runtime is OpenClaw (open source) — you can always self-host if you prefer full control.

I'm a solo developer and would genuinely appreciate feedback. What would you build with this? What's missing?

If you enjoyed this post, I write about building SaaS products and AI infrastructure. Follow for more.