How I Built a SaaS That Deploys AI Chatbots to Telegram in 2 Minutes
I recently launched ClawBotCloud, a managed platform for deploying AI chatbots to Telegram. Users configure a bot through a web dashboard, connect their Telegram token, and deploy. Each bot runs in its own isolated container on Fly.io, powered by Claude.
In this post, I want to walk through the architecture decisions, trade-offs, and lessons learned building it as a solo developer.
The Problem
Setting up an AI chatbot on Telegram requires a surprising amount of infrastructure:
- A server (VPS, cloud instance, or a machine at home)
- Docker or a process manager to keep the bot running
- API key management and secure storage
- Telegram Bot API configuration
- Monitoring and crash recovery
- SSL certificates if using webhooks
For a developer, this is a weekend project. For a small business owner who just wants an AI assistant in their Telegram group, it's a brick wall.
I wanted to reduce that to: configure → connect → deploy.
Architecture Overview
Here's the high-level architecture:
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐
│ Next.js │────▶│ PostgreSQL │ │ Fly.io │
│ Dashboard │ │ (Neon) │ │ Machines │
│ │────▶│ Redis │ │ │
│ (Vercel) │ │ (Upstash) │ │ ┌───────────┐ │
│ │─────┼───────────────┼────▶│ │ Bot #1 │ │
│ │ │ Stripe │ │ │ (OpenClaw)│ │
└─────────────┘ └──────────────┘ │ └───────────┘ │
│ ┌───────────┐ │
│ │ Bot #2 │ │
│ │ (OpenClaw)│ │
│ └───────────┘ │
│ ┌───────────┐ │
│ │ Bot #N │ │
│ │ (OpenClaw)│ │
│ └───────────┘ │
└─────────────────┘
The key insight: each bot is its own Fly.io machine. Not a shared process, not a worker in a queue — a fully isolated container.
Tech Stack Decisions
Next.js 15 (App Router) — The Dashboard
The dashboard handles user auth, bot configuration, billing, and bot management. I went with Next.js 15 and the App Router because:
- Server components reduce client-side JS (the dashboard is mostly forms and tables)
- Server actions simplify the API layer — no separate REST endpoints for CRUD operations
- Vercel deployment is zero-config
// Example: Server action for creating a bot
'use server'

import { auth } from '@/lib/auth'
import { db } from '@/lib/db'
import { createFlyMachine } from '@/lib/fly'

export async function createBot(formData: FormData) {
  const session = await auth()
  if (!session?.user) throw new Error('Unauthorized')

  const bot = await db.bot.create({
    data: {
      name: formData.get('name') as string,
      systemPrompt: formData.get('systemPrompt') as string,
      userId: session.user.id,
      status: 'PROVISIONING', // must match the BotStatus enum casing
    },
  })

  // Provision the Fly.io machine
  await createFlyMachine(bot.id, {
    telegramToken: formData.get('telegramToken') as string,
    anthropicKey: formData.get('anthropicKey') as string,
    systemPrompt: formData.get('systemPrompt') as string,
  })

  return bot
}
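A server action like this still trusts raw FormData, so in practice the fields get validated before anything touches the database. A minimal sketch of that step (the field names match the action above; the validator itself and the token regex are my own, hypothetical additions):

```typescript
// Hypothetical input validator for the createBot action.
// Returns the parsed fields or throws with a readable message.
interface BotInput {
  name: string
  systemPrompt: string
  telegramToken: string
}

export function parseBotInput(formData: FormData): BotInput {
  const name = formData.get('name')
  const systemPrompt = formData.get('systemPrompt')
  const telegramToken = formData.get('telegramToken')

  if (typeof name !== 'string' || name.trim().length === 0) {
    throw new Error('Bot name is required')
  }
  // Telegram tokens look like "<numeric id>:<long alphanumeric secret>"
  if (typeof telegramToken !== 'string' || !/^\d+:[\w-]{30,}$/.test(telegramToken)) {
    throw new Error('Invalid Telegram bot token')
  }
  if (typeof systemPrompt !== 'string') {
    throw new Error('System prompt must be a string')
  }
  return { name: name.trim(), systemPrompt, telegramToken }
}
```

Failing fast here means a typo'd token never gets as far as provisioning a machine.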
PostgreSQL (Neon) — The Database
I chose Neon's serverless PostgreSQL for a few reasons:
- Scale-to-zero means I'm not paying for a database that sits idle 90% of the time at launch
- Branching is useful for testing schema changes
- Prisma works great with it
model Bot {
  id            String    @id @default(cuid())
  name          String
  systemPrompt  String    @db.Text
  status        BotStatus @default(STOPPED)
  telegramToken String    @db.Text // encrypted at application level
  userId        String
  user          User      @relation(fields: [userId], references: [id])
  flyMachineId  String?
  createdAt     DateTime  @default(now())
  updatedAt     DateTime  @updatedAt

  @@index([userId])
}

enum BotStatus {
  PROVISIONING
  RUNNING
  STOPPED
  ERROR
}
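The status enum doubles as a tiny state machine. To keep the dashboard from showing impossible jumps (say, STOPPED straight to ERROR without ever running), status updates can be gated with a helper along these lines. This is a sketch: the allowed-transition map is my reading of the lifecycle, not something generated from the schema:

```typescript
type BotStatus = 'PROVISIONING' | 'RUNNING' | 'STOPPED' | 'ERROR'

// Allowed lifecycle transitions, matching the enum in the Prisma schema.
const TRANSITIONS: Record<BotStatus, BotStatus[]> = {
  PROVISIONING: ['RUNNING', 'ERROR'],
  RUNNING: ['STOPPED', 'ERROR'],
  STOPPED: ['PROVISIONING', 'RUNNING'],
  ERROR: ['PROVISIONING', 'STOPPED'],
}

export function canTransition(from: BotStatus, to: BotStatus): boolean {
  return TRANSITIONS[from].includes(to)
}
```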
Fly.io Machines API — Bot Infrastructure
This is the most interesting part. The Fly.io Machines API lets you create, start, stop, and destroy individual containers programmatically.
// Simplified bot provisioning
async function createFlyMachine(
  botId: string,
  config: BotConfig
): Promise<string> {
  const response = await fetch(
    `https://api.machines.dev/v1/apps/${FLY_APP}/machines`,
    {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${FLY_API_TOKEN}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        name: `bot-${botId}`,
        config: {
          image: 'registry.fly.io/clawbotcloud-bot:latest',
          env: {
            BOT_ID: botId,
            // Injected in plain text so the bot can use them —
            // they're only stored encrypted at rest in PostgreSQL
            TELEGRAM_TOKEN: config.telegramToken,
            ANTHROPIC_API_KEY: config.anthropicKey,
            SYSTEM_PROMPT: config.systemPrompt,
          },
          guest: {
            cpu_kind: 'shared',
            cpus: 1,
            memory_mb: 256,
          },
          services: [], // No public ports — bot uses outbound only
          checks: {
            alive: {
              type: 'http',
              port: 3000,
              path: '/health',
              interval: '30s',
              timeout: '5s',
            },
          },
        },
      }),
    }
  )
  if (!response.ok) {
    throw new Error(`Machine creation failed: ${response.status}`)
  }
  const machine = await response.json()
  return machine.id
}
Why Fly.io over alternatives?
| Option | Pros | Cons |
|---|---|---|
| Fly.io Machines | Per-machine isolation, instant start/stop, pay-per-second | API can be quirky, documentation gaps |
| AWS ECS/Fargate | Battle-tested, scalable | Complex, expensive at low scale, slow cold starts |
| Railway | Great DX | Less control over individual containers |
| Kubernetes | Ultimate flexibility | Massive overkill for this use case |
| Shared process | Cheapest | No isolation, one bot crash kills all |
Fly machines can be stopped and started in under 2 seconds. This is critical for economics — when a user cancels, I stop their machine instantly and stop paying for it. No zombie containers.
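Stopping a machine on cancellation is a single API call against the same Machines API. A sketch, reading the same FLY_APP / FLY_API_TOKEN settings from the environment (the error handling here is mine):

```typescript
// Stop a bot's Fly machine so per-second billing for it ends immediately.
export async function stopFlyMachine(machineId: string): Promise<void> {
  const response = await fetch(
    `https://api.machines.dev/v1/apps/${process.env.FLY_APP}/machines/${machineId}/stop`,
    {
      method: 'POST',
      headers: { Authorization: `Bearer ${process.env.FLY_API_TOKEN}` },
    }
  )
  if (!response.ok) {
    throw new Error(`Failed to stop machine ${machineId}: ${response.status}`)
  }
}
```

Destroying the machine entirely (on account deletion, say) is the same shape with a DELETE against the machine's URL.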
Encryption and Security
API keys are sensitive. Here's the approach:
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto'

const ALGORITHM = 'aes-256-gcm'
const KEY = Buffer.from(process.env.ENCRYPTION_KEY!, 'hex')

export function encrypt(text: string): string {
  const iv = randomBytes(16)
  const cipher = createCipheriv(ALGORITHM, KEY, iv)
  let encrypted = cipher.update(text, 'utf8', 'hex')
  encrypted += cipher.final('hex')
  const authTag = cipher.getAuthTag()
  // Store IV + auth tag + ciphertext together
  return `${iv.toString('hex')}:${authTag.toString('hex')}:${encrypted}`
}

export function decrypt(data: string): string {
  const [ivHex, authTagHex, encrypted] = data.split(':')
  const decipher = createDecipheriv(ALGORITHM, KEY, Buffer.from(ivHex, 'hex'))
  decipher.setAuthTag(Buffer.from(authTagHex, 'hex'))
  let decrypted = decipher.update(encrypted, 'hex', 'utf8')
  decrypted += decipher.final('utf8')
  return decrypted
}
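A quick round trip shows the stored format: IV, auth tag, and ciphertext, hex-encoded and colon-joined. To keep the example self-contained the key is passed in explicitly and generated on the fly; in the app it comes from ENCRYPTION_KEY:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto'

// Same scheme as above, with the key as a parameter for demonstration.
const ALGORITHM = 'aes-256-gcm'

function encryptWith(key: Buffer, text: string): string {
  const iv = randomBytes(16)
  const cipher = createCipheriv(ALGORITHM, key, iv)
  const encrypted = cipher.update(text, 'utf8', 'hex') + cipher.final('hex')
  return `${iv.toString('hex')}:${cipher.getAuthTag().toString('hex')}:${encrypted}`
}

function decryptWith(key: Buffer, data: string): string {
  const [ivHex, authTagHex, encrypted] = data.split(':')
  const decipher = createDecipheriv(ALGORITHM, key, Buffer.from(ivHex, 'hex'))
  decipher.setAuthTag(Buffer.from(authTagHex, 'hex'))
  return decipher.update(encrypted, 'hex', 'utf8') + decipher.final('utf8')
}

const key = randomBytes(32) // 256-bit key, as ENCRYPTION_KEY would be
const token = '123456:AAHdqTcvCH1vGWJxfSeofSAs0K5PALDsaw' // example token, not real
const stored = encryptWith(key, token)
// stored looks like "a1b2…:c3d4…:e5f6…" — safe to persist in PostgreSQL
```

Because GCM authenticates as well as encrypts, tampering with any of the three segments makes decryption throw instead of silently returning garbage.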
Keys are encrypted in PostgreSQL and only decrypted when injecting into a Fly machine's environment. They're never logged, never sent to the frontend, and never stored in plain text.
Telegram Integration: Polling vs. Webhooks
There are two ways to receive Telegram messages:
Long polling: Your bot asks Telegram "any new messages?" in a loop.
Webhooks: Telegram sends messages to a URL you provide.
I went with long polling for a counterintuitive reason: it's simpler and more reliable for this use case.
With webhooks, each bot would need:
- A public URL
- An SSL certificate
- A port exposed on the container
- Fly.io proxy configuration
With long polling, the bot just makes outbound HTTP requests. No public ports, no SSL certs, no proxy config. The container doesn't even need to be publicly accessible, which is actually a security win.
// Inside the bot container — simplified polling loop
const sleep = (ms: number) => new Promise<void>(res => setTimeout(res, ms))

async function startPolling(token: string) {
  let offset = 0
  while (true) {
    try {
      const updates = await fetch(
        `https://api.telegram.org/bot${token}/getUpdates`,
        {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({
            offset,
            timeout: 30, // long-poll for 30 seconds
          }),
        }
      ).then(r => r.json())

      for (const update of updates.result || []) {
        offset = update.update_id + 1
        await handleMessage(update)
      }
    } catch (error) {
      console.error('Polling error:', error)
      await sleep(5000) // back off on errors
    }
  }
}
The trade-off: long polling means each bot maintains a persistent connection. At scale (thousands of bots), this could be a problem. But at launch scale (tens of bots), it's the right choice. I can migrate to webhooks later if needed.
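One cheap upgrade before that migration: the fixed 5-second back-off in the loop above is fine at launch scale, but if Telegram starts rate limiting, exponential back-off with a ceiling is a few lines. A sketch, not what currently ships:

```typescript
// Exponential back-off with a ceiling: 1s, 2s, 4s, … capped at 60s.
// `attempt` is the number of consecutive failures so far.
export function backoffDelay(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs)
}
```

The loop would reset `attempt` to zero on the first successful getUpdates call.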
Billing with Stripe
Stripe handles subscriptions. Each bot maps to a Stripe subscription item:
// When user deploys a new bot
async function handleBotDeployment(userId: string, botId: string) {
  const user = await db.user.findUnique({
    where: { id: userId },
    include: { subscription: true, bots: true }, // bots are needed for the quantity below
  })
  if (!user) throw new Error('User not found')

  if (!user.subscription?.stripeSubscriptionId) {
    // Create new subscription
    const subscription = await stripe.subscriptions.create({
      customer: user.stripeCustomerId,
      items: [{ price: BOT_PRICE_ID, quantity: 1 }],
    })
    // Store subscription...
  } else {
    // Update quantity on existing subscription
    await stripe.subscriptions.update(user.subscription.stripeSubscriptionId, {
      items: [{
        id: user.subscription.stripeItemId,
        quantity: user.bots.filter(b => b.status === 'RUNNING').length + 1,
      }],
      proration_behavior: 'create_prorations',
    })
  }
}
Users pay per bot. Add a bot → subscription quantity increases. Remove a bot → it decreases. Stripe handles prorations automatically.
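The quantity logic is worth pulling into one place: the new quantity is just the count of bots that will be running after the change. A hypothetical helper matching the update call above, with a floor at zero so a stray double-removal can't go negative:

```typescript
interface BotLike {
  status: 'PROVISIONING' | 'RUNNING' | 'STOPPED' | 'ERROR'
}

// Subscription quantity after adding `delta` bots
// (delta = +1 on deploy, -1 on removal).
export function nextQuantity(bots: BotLike[], delta: number): number {
  const running = bots.filter(b => b.status === 'RUNNING').length
  return Math.max(0, running + delta)
}
```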
Health Monitoring
Each bot container exposes a /health endpoint. Fly.io checks it every 30 seconds. If a bot goes unhealthy:
- Fly restarts the container automatically (first line of defense)
- If it fails 3 times, the dashboard shows the bot as "Error"
- I get notified (webhook to my monitoring)
On top of that, the dashboard polls bot status from Fly's API periodically so users see real-time status.
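That status poll boils down to mapping Fly's machine states onto the BotStatus enum. Roughly like this — treat the exact set of state strings as an assumption on my part rather than an exhaustive list from Fly's docs:

```typescript
type BotStatus = 'PROVISIONING' | 'RUNNING' | 'STOPPED' | 'ERROR'

// Map a Fly machine state string onto the dashboard's BotStatus.
// Unknown states surface as ERROR so problems are visible, not hidden.
export function flyStateToBotStatus(state: string): BotStatus {
  switch (state) {
    case 'created':
    case 'starting':
      return 'PROVISIONING'
    case 'started':
      return 'RUNNING'
    case 'stopping':
    case 'stopped':
      return 'STOPPED'
    default:
      return 'ERROR'
  }
}
```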
Lessons Learned
1. Container-per-bot is expensive but worth it
At ~€5-7/month infrastructure cost per bot with €20/month pricing, margins are okay but not amazing. A shared-process architecture would be 10x cheaper. But the isolation guarantee is worth it — one bot with a bad system prompt can't crash another customer's bot.
2. Fly.io Machines API has quirks
The API is powerful but the documentation doesn't cover every edge case. Machine state transitions can be surprising — a machine can be in "starting" state for longer than expected, and you need to poll for the actual running state.
3. Start with fewer features
I almost built a full analytics dashboard, conversation logs, A/B testing for prompts, and multi-model support before launch. Glad I didn't. The MVP is: create bot, deploy bot, bot works. Everything else can come later.
4. Encryption is not optional
Even for an MVP. Users are trusting you with their API keys. If you store them in plain text "just for now," you'll forget to encrypt them later. Do it from day one.
What's Next
- WhatsApp support — high demand, but Meta's Business API is a different beast
- Free trial — figuring out limits that prevent abuse
- Multiple AI models — GPT-4o, Gemini, local models
- Conversation analytics — token usage, message volume, popular topics
- Custom knowledge bases — upload documents for RAG
Try It
If you want to deploy a Claude-powered Telegram bot without touching a server, that's exactly what ClawBotCloud does.
The bot runtime is OpenClaw (open source) — you can always self-host if you prefer full control.
I'm a solo developer and would genuinely appreciate feedback. What would you build with this? What's missing?
If you enjoyed this post, I write about building SaaS products and AI infrastructure. Follow for more.