Discord Bot Operations & Scaling
This guide covers ongoing operations for Discord bot hosting: monitoring, scaling, backups, and best practices for your customers.
Monitoring
Portal Metrics
Customers can monitor their bots through the portal:
- CPU Usage: Real-time and historical graphs
- RAM Usage: Current allocation and trends
- Network: Inbound/outbound traffic
- Disk: Storage utilization
- Uptime: Time since last restart
Warning Thresholds
Recommend customers watch for:
| Metric | Warning | Action Needed |
|---|---|---|
| RAM | > 75% sustained | Optimize or upgrade plan |
| CPU | > 80% sustained | Review code efficiency or upgrade |
| Restarts | > 5/day | Investigate crash causes |
| Disk | > 90% | Clean logs or upgrade storage |
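The thresholds in the table can be written down as a small check that a customer might run against metrics they collect themselves. This is a minimal sketch: the `checkThresholds` name and the field names in the sample object are hypothetical, and nothing here assumes the portal exposes such an API. "Sustained" usage would need averaging over a time window, which this single-sample check does not do.

```javascript
// Hypothetical sketch: warning thresholds from the table above.
// The field names in `sample` are assumptions, not a portal API.
const THRESHOLDS = [
  { metric: 'ramPercent', limit: 75, action: 'Optimize or upgrade plan' },
  { metric: 'cpuPercent', limit: 80, action: 'Review code efficiency or upgrade' },
  { metric: 'restartsPerDay', limit: 5, action: 'Investigate crash causes' },
  { metric: 'diskPercent', limit: 90, action: 'Clean logs or upgrade storage' },
];

// Returns one warning per metric that is present and exceeds its limit.
function checkThresholds(sample) {
  return THRESHOLDS
    .filter(({ metric, limit }) => typeof sample[metric] === 'number' && sample[metric] > limit)
    .map(({ metric, limit, action }) => ({ metric, value: sample[metric], limit, action }));
}

const warnings = checkThresholds({ ramPercent: 82, cpuPercent: 40, restartsPerDay: 1, diskPercent: 91 });
for (const w of warnings) {
  console.log(`${w.metric} = ${w.value} (limit ${w.limit}): ${w.action}`);
}
```

Keeping the limits in one data structure makes it easy to adjust them per customer without touching the check itself.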
Log Management
Customers can:
- View live console output
- Download log files
- Clear old logs to free space
Recommend log rotation for bots that generate significant output.
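For bots that write their own log file, size-based rotation can be sketched with Node's built-in `fs` module. This is an illustration under assumptions: the `rotateLog` helper and the numbered-suffix naming scheme are hypothetical, and it presumes the bot appends to the file on each write rather than holding a long-lived file handle.

```javascript
const fs = require('node:fs');

// Rotate `logPath` once it exceeds `maxBytes`, keeping at most `keep` old files
// named <logPath>.1 (newest) through <logPath>.<keep> (oldest).
function rotateLog(logPath, maxBytes, keep) {
  if (!fs.existsSync(logPath) || fs.statSync(logPath).size <= maxBytes) {
    return false; // nothing to rotate
  }
  // Drop the oldest file, then shift the remaining ones up by one.
  fs.rmSync(`${logPath}.${keep}`, { force: true });
  for (let i = keep - 1; i >= 1; i--) {
    if (fs.existsSync(`${logPath}.${i}`)) {
      fs.renameSync(`${logPath}.${i}`, `${logPath}.${i + 1}`);
    }
  }
  fs.renameSync(logPath, `${logPath}.1`);
  fs.writeFileSync(logPath, ''); // start a fresh, empty log
  return true;
}
```

A call such as `rotateLog('./logs/bot.log', 5 * 1024 * 1024, 3)` could run from a scheduled task; the path and limits shown are placeholders.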
Scaling
Vertical Scaling (Upgrade Plans)
When to upgrade:
- RAM consistently above 75%
- CPU spikes causing slow responses
- Need for MySQL database
- Running multiple bot processes
Upgrades apply immediately with prorated billing.
Horizontal Scaling (Multiple Containers)
For large bots:
- Sharding: Split bot across shards using Discord’s sharding system
- Microservices: Separate components (commands, music, moderation) into containers
- Shared state: Use Redis or MySQL for cross-container data
Sharding Setup
Discord.js:
    const { ShardingManager } = require('discord.js');

    const manager = new ShardingManager('./bot.js', {
      token: process.env.DISCORD_TOKEN,
      totalShards: 'auto', // let Discord recommend the shard count
    });

    manager.spawn();
discord.py:
    import discord
    from discord.ext import commands

    intents = discord.Intents.default()  # discord.py 2.x requires explicit intents
    bot = commands.AutoShardedBot(command_prefix='!', intents=intents)
Sharding is recommended as a bot approaches 2,500 guilds, the point at which Discord requires it.
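To make the routing concrete, the sketch below applies the formula Discord documents for assigning a guild to a shard, `shard_id = (guild_id >> 22) % num_shards`. The helper names are hypothetical, and the 1,000-guilds-per-shard figure in `estimateShardCount` is an assumed planning target rather than a Discord rule.

```javascript
// Discord's documented routing formula: shard_id = (guild_id >> 22) % num_shards.
// Guild IDs are 64-bit snowflakes, so BigInt avoids precision loss.
function shardIdFor(guildId, totalShards) {
  return Number((BigInt(guildId) >> 22n) % BigInt(totalShards));
}

// Rough shard-count estimate: at most `guildsPerShard` guilds on each shard.
// The default of 1000 is an assumption for planning; with totalShards: 'auto',
// discord.js asks Discord for its recommended count instead.
function estimateShardCount(guildCount, guildsPerShard = 1000) {
  return Math.max(1, Math.ceil(guildCount / guildsPerShard));
}

console.log(shardIdFor('20971520', 4)); // 20971520 is 5 << 22, so 5 % 4 = 1
console.log(estimateShardCount(2500));  // 3
```

Knowing which shard serves a guild is mainly useful when tracing a customer's report back to one shard's console output.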
Backups
Automated Backups
Configure in the portal:
- Navigate to Schedules → Create.
- Select Backup task type.
- Set frequency (daily recommended).
- Configure retention (7-30 days typical).
What Gets Backed Up
- Bot source code and configuration files
- SQLite databases
- Environment variables
- Uploaded assets
Manual Backups
Before major changes:
- Go to Backups in the portal.
- Click Create Backup.
- Download locally for offsite storage.
Restore Process
- Navigate to Backups.
- Select the backup to restore.
- Choose restore options (files, database, or both).
- Confirm and restart the bot.
Scheduled Tasks
Common Automations
| Task | Frequency | Purpose |
|---|---|---|
| Restart | Daily | Clear memory, apply updates |
| Backup | Daily | Data protection |
| Dependency update | Weekly | Security patches |
| Log cleanup | Weekly | Free disk space |
Creating Schedules
- Go to Schedules in the portal.
- Click Create Schedule.
- Configure:
- Name and description
- Cron expression or simple interval
- Command to run
- Enable/disable toggle
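Customers unfamiliar with cron syntax may find it helpful to see how an expression breaks down. The sketch below assumes the portal accepts standard five-field cron expressions (minute, hour, day of month, month, day of week); `describeCron` is a hypothetical helper that only labels the fields and does not validate ranges or compute run times.

```javascript
// Split a standard five-field cron expression into named fields.
// This only labels the fields; it does not validate values.
function describeCron(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`Expected 5 fields, got ${parts.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts;
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

console.log(describeCron('0 4 * * *')); // minute 0, hour 4, every day: 04:00 daily
console.log(describeCron('0 3 * * 0')); // minute 0, hour 3, day of week 0: 03:00 on Sundays
```

Which time zone the schedule runs in depends on the server configuration, so it is worth confirming before relying on a low-traffic window.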
Example Commands
Daily restart:
    Command: (use restart action)
    Cron: 0 4 * * *
Weekly npm update:
    Command: npm update && npm audit fix
    Cron: 0 3 * * 0
Performance Optimization
Memory Management
Advise customers to:
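- Limit Discord.js cache sizes:
    const { Client, Options } = require('discord.js');

    const client = new Client({
      intents: [...],
      makeCache: Options.cacheWithLimits({
        MessageManager: 50,       // cached messages per channel
        GuildMemberManager: 100,  // cached members per guild
      }),
    });
- Clear unused data periodically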
- Use external caching (Redis) for large datasets
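The same idea of capping cache size applies to data the bot keeps in its own memory. As an illustration only, here is a small least-recently-used cache built on `Map`; the `BoundedCache` class is hypothetical and not part of discord.js.

```javascript
// A small least-recently-used (LRU) cache built on Map's insertion order.
class BoundedCache {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.entries = new Map();
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      this.entries.delete(this.entries.keys().next().value);
    }
  }
}
```

A bounded in-process cache like this keeps memory use predictable; once the dataset outgrows it, an external store such as Redis is the better fit.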
Response Times
Discord requires interaction responses within 3 seconds:
- Use deferReply() for slow operations
- Cache frequently accessed data
- Optimize database queries
- Consider worker processes for heavy tasks
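The defer-then-edit pattern can be sketched as follows. `deferReply()` and `editReply()` are the discord.js interaction methods; `handleSlowCommand` and the `runReport` callback are hypothetical names standing in for a customer's own handler and slow task.

```javascript
// Acknowledge first, then do the slow work and edit the deferred reply.
// `runReport` is a hypothetical async function representing the slow operation.
async function handleSlowCommand(interaction, runReport) {
  await interaction.deferReply(); // acknowledge inside the 3-second window
  try {
    const result = await runReport();
    await interaction.editReply(`Report ready: ${result}`);
  } catch (error) {
    console.error(error);
    await interaction.editReply('Something went wrong while building the report.');
  }
}
```

Editing the reply in the error path as well avoids leaving the user with a response that never resolves.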
Code Efficiency
- Avoid synchronous blocking operations
- Use connection pooling for databases
- Implement rate limit handling
- Profile with debugging tools
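For rate limit handling, discord.js already queues and retries its own REST calls, so explicit handling mostly matters for HTTP requests a bot makes itself. The sketch below shows one way to honor a server-provided delay; `withRateLimitRetry` and the `{ status, retryAfterMs }` response shape are assumptions for illustration and would need adapting to the HTTP client in use.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry `request` while it reports HTTP 429. `request` is a hypothetical async
// function resolving to { status, retryAfterMs, ... }.
async function withRateLimitRetry(request, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await request();
    if (response.status !== 429) return response;
    if (attempt === maxAttempts) break;
    // Wait for the server-provided delay before trying again.
    await sleep(response.retryAfterMs);
  }
  throw new Error(`Still rate limited after ${maxAttempts} attempts`);
}
```

Capping the number of attempts keeps a persistent rate limit from turning into an endless retry loop.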
Security Operations
Token Rotation
Rotate Discord tokens when:
- Staff members leave
- Suspicious activity detected
- Regular quarterly rotation
Process:
- Generate new token in Discord Developer Portal
- Update environment variable in portal
- Restart the bot
- Verify connectivity
Access Control
- Use environment variables for all secrets
- Create separate panel accounts for team members
- Review access logs regularly
- Remove inactive user permissions
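Keeping secrets in environment variables works best when the bot fails fast if one is missing, for example after a token rotation. A minimal sketch, with `requireEnv` as a hypothetical helper:

```javascript
// Read required secrets from the environment and fail fast if any are missing.
// DISCORD_TOKEN matches the sharding example; other names would be your own.
function requireEnv(names, env = process.env) {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report variable names only; never log secret values.
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return Object.fromEntries(names.map((name) => [name, env[name]]));
}
```

At startup a bot might call `requireEnv(['DISCORD_TOKEN'])` before logging in, so a misconfigured variable shows up as a clear console error rather than a failed connection.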
Dependency Security
Schedule regular audits:
Node.js:
    npm audit
    npm audit fix
Python:
    pip-audit
    pip install --upgrade -r requirements.txt
Cost Management
Right-Sizing
Review resource usage monthly:
- Downgrade if utilization stays below 30%
- Upgrade proactively before performance issues
Consolidation
For customers with multiple small bots:
- Consider Premium plan with PM2 for multiple processes
- Share database connections
- Centralize common functionality
Usage Patterns
Track peak usage times:
- Schedule restarts during low activity
- Plan upgrades before anticipated growth
- Monitor after feature launches
Incident Response
When Things Break
- Check console: Look for error messages and stack traces
- Review recent changes: What was deployed or configured recently?
- Check resources: Is the container out of memory or CPU?
- Rollback if needed: Restore from backup
- Contact support: If infrastructure issues suspected
Communication Template
For customers to report issues:
    Bot Name: [name]
    Server ID: [from portal]
    Issue: [description]
    When Started: [timestamp]
    Recent Changes: [deployments, config changes]
    Console Output: [relevant logs]
    Steps Taken: [what was tried]
Checklist for Customers
- Auto-restart enabled
- Daily backups configured
- Environment variables secured (hidden)
- Monitoring thresholds understood
- Emergency contacts documented
- Rollback procedure tested
- Dependencies regularly updated