Merged
30 changes: 24 additions & 6 deletions AGENTS.md
@@ -58,7 +58,7 @@ PythonID/
│ └── database/
│ ├── models.py # SQLModel schemas (4 tables)
│ └── service.py # DatabaseService singleton (645 lines)
├── tests/ # pytest-asyncio (18 files, 99% coverage)
├── tests/ # pytest-asyncio (19 files, 99.9% coverage)
└── data/bot.db # SQLite (auto-created, WAL mode)
```

@@ -91,16 +91,32 @@ PythonID/
### Handler Priority Groups
```python
# main.py - Order matters!
group=-1 # topic_guard: Runs FIRST, deletes unauthorized warning topic msgs
group=0 # Commands, DM, anti_spam: Default priority
group=1 # message_handler: Runs LAST, profile compliance check
group=-1 # topic_guard: Runs FIRST, deletes unauthorized warning topic msgs, raises ApplicationHandlerStop
group=0 # Commands, DM, captcha: Default priority
group=1 # inline_keyboard_spam: Catches inline keyboard URL spam
group=2 # new_user_spam: Probation enforcement (links/forwards)
group=3 # duplicate_spam: Repeated message detection
group=4 # message_handler: Runs LAST, profile compliance check
```
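The ordering above can be pictured with a small, library-free model. This is only a sketch: `StopDispatch`, `dispatch`, `topic_guard`, and `noop` are hypothetical stand-ins rather than python-telegram-bot APIs, and the update is a plain dict. It assumes the documented behavior that lower-numbered groups run first and that raising `ApplicationHandlerStop` keeps later groups from seeing the update.

```python
from typing import Callable


class StopDispatch(Exception):
    """Hypothetical stand-in for telegram.ext.ApplicationHandlerStop."""


Handler = Callable[[dict], None]


def dispatch(groups: dict[int, Handler], update: dict) -> list[int]:
    """Run one handler per group in ascending order; return the groups that ran."""
    ran: list[int] = []
    for group in sorted(groups):
        ran.append(group)
        try:
            groups[group](update)
        except StopDispatch:
            break  # later groups never see this update
    return ran


def topic_guard(update: dict) -> None:
    # Stop propagation for any warning-topic message, whether allowed or deleted.
    if update.get("thread_id") == update.get("warning_topic_id"):
        raise StopDispatch


def noop(update: dict) -> None:
    """Placeholder for the command, spam, and profile handlers."""


GROUPS = {-1: topic_guard, 0: noop, 1: noop, 2: noop, 3: noop, 4: noop}

# Warning-topic traffic stops at the guard; other traffic reaches every group.
print(dispatch(GROUPS, {"thread_id": 7, "warning_topic_id": 7}))
print(dispatch(GROUPS, {"thread_id": 3, "warning_topic_id": 7}))
```

In this model the first call only reaches group -1, which is why spam and profile handlers never need their own warning-topic checks.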

### Topic Guard Design
- Handles both `message` and `edited_message` updates (combined filter)
- Raises `ApplicationHandlerStop` after handling ANY warning-topic message (allows or deletes)
- This prevents downstream spam/profile handlers from processing warning-topic traffic
- **Fail-closed**: On `get_chat_member` API error, deletes the message (scoped to confirmed warning-topic only)
- Early returns (no message, wrong group, wrong topic) happen OUTSIDE the try/except block

### Singletons
- `get_settings()` — Pydantic settings, `@lru_cache`
- `get_database()` — DatabaseService, lazy init
- `BotInfoCache` — Class-level cache for bot username/ID

### Admin Cache
- Fetched at startup in `post_init()` and stored in `bot_data["group_admin_ids"]` (per-group) and `bot_data["admin_ids"]` (union)
- Refreshed every 10 minutes via `refresh_admin_ids` JobQueue job
- On refresh failure for a group, falls back to existing cached data (not empty list)
- Spam handlers use cached admin IDs; topic_guard uses live `get_chat_member` API call
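The fallback rule in the list above can be sketched as a pure function. `merge_admin_cache` is a hypothetical helper written for illustration (the project does this inline in `refresh_admin_ids`), and a failed fetch is represented here as `None`. Sorting the union is an assumption made only to keep the output deterministic.

```python
from typing import Optional


def merge_admin_cache(
    previous: dict[int, list[int]],
    fetched: dict[int, Optional[list[int]]],
) -> tuple[dict[int, list[int]], list[int]]:
    """Build the per-group map and the union, reusing old data where a fetch failed."""
    per_group: dict[int, list[int]] = {}
    union: set[int] = set()
    for group_id, ids in fetched.items():
        if ids is None:
            # Refresh failed: keep whatever was cached instead of an empty list.
            ids = previous.get(group_id, [])
        per_group[group_id] = ids
        union.update(ids)
    return per_group, sorted(union)


previous = {-100: [1, 2], -200: [3]}
fetched = {-100: [1, 4], -200: None}  # second group failed to refresh
print(merge_admin_cache(previous, fetched))
```

A group that has never been fetched successfully still ends up with an empty list, so the fallback only protects rosters that were cached at least once.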

### Multi-Group Support
- `GroupConfig` — Pydantic model for per-group settings (warning thresholds, captcha, probation)
- `GroupRegistry` — O(1) lookup by group_id, manages all monitored groups
@@ -140,7 +156,7 @@ Time threshold → Auto-restrict via scheduler (parallel path)
- **Fixtures**: `mock_update`, `mock_context`, `mock_settings` — copy from existing tests
- **Database tests**: Use `temp_db` fixture with `tempfile.TemporaryDirectory`
- **Mocking**: `AsyncMock` for Telegram API; no real network calls
- **Coverage**: 100% maintained — check before committing
- **Coverage**: 99.9% maintained (519 tests) — check before committing

## Anti-Patterns (THIS PROJECT)

@@ -186,8 +202,10 @@ if user.id not in admin_ids:
## Notes

- Topic guard runs at `group=-1` to intercept unauthorized messages BEFORE other handlers
- Topic guard handles both messages and edited messages, raises `ApplicationHandlerStop` to block downstream handlers
- JobQueue auto-restriction job runs every 5 minutes (first run after 5 min delay)
- Bot uses `allowed_updates=["message", "callback_query", "chat_member"]`
- JobQueue admin refresh job runs every 10 minutes (first run after 10 min delay)
- Bot uses `allowed_updates=["message", "edited_message", "callback_query", "chat_member"]`
- Captcha uses both `ChatMemberHandler` (for "Hide Join" groups) and `MessageHandler` fallback
- Multi-group: handlers use `get_group_config_for_update()` instead of `settings.group_id`
- Captcha callback data encodes group_id: `captcha_verify_{group_id}_{user_id}` to avoid ambiguity
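As a rough illustration of the `captcha_verify_{group_id}_{user_id}` format, the sketch below builds and parses such a string. `build_callback_data` and `parse_callback_data` are hypothetical names, not the project's functions. The parser relies on the fact that a negative group ID contains a `-` but no `_`, so the first underscore after the prefix separates the two numbers.

```python
from typing import Optional

PREFIX = "captcha_verify_"


def build_callback_data(group_id: int, user_id: int) -> str:
    """Encode both IDs so a button press can be tied to one group."""
    return f"{PREFIX}{group_id}_{user_id}"


def parse_callback_data(data: str) -> Optional[tuple[int, int]]:
    """Return (group_id, user_id), or None if the string is not a captcha payload."""
    if not data.startswith(PREFIX):
        return None
    group_part, sep, user_part = data[len(PREFIX):].partition("_")
    if not sep:
        return None
    try:
        return int(group_part), int(user_part)
    except ValueError:
        return None


data = build_callback_data(-1001234567890, 42)
print(data)
print(parse_callback_data(data))
```

Returning `None` for malformed input lets a callback handler ignore payloads that belong to other features, such as the `warn:` callbacks used by `/check`.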
42 changes: 25 additions & 17 deletions README.md
@@ -10,7 +10,7 @@ A comprehensive Telegram bot for managing group members with profile verificatio
- Checks if users have a public profile picture
- Checks if users have a username set
- Sends warnings to a dedicated topic (thread) for non-compliant users
- **Warning topic protection**: Only admins and the bot can post in the warning topic
- **Warning topic protection**: Only admins and the bot can post in the warning topic (messages + edited messages)

### Restriction & Unrestriction
- **Progressive restriction**: Optional mode to restrict users after multiple warnings (message-based)
@@ -194,15 +194,14 @@ uv run pytest -v
### Test Coverage

The project maintains comprehensive test coverage:
- **Coverage**: 99% (1,396 statements)
- **Tests**: 442 total
- **Pass Rate**: 100% (442/442 passed)
- **All modules**: 100% coverage including JobQueue scheduler integration, captcha verification, and anti-spam enforcement
- Services: `bot_info.py` (100%), `group_config.py` (97%), `scheduler.py` (100%), `user_checker.py` (100%), `telegram_utils.py` (100%), `captcha_recovery.py` (100%)
- Handlers: `anti_spam.py` (100%), `captcha.py` (100%), `check.py` (98%), `dm.py` (100%), `message.py` (100%), `topic_guard.py` (100%), `verify.py` (100%)
- Database: `service.py` (100%), `models.py` (100%)
- Config: `config.py` (100%)
- Constants: `constants.py` (100%)
- **Coverage**: 99.9% (1,570 statements, 1 unreachable line)
- **Tests**: 519 total
- **Pass Rate**: 100% (519/519 passed)
- **All modules at 100%** except one unreachable line in `anti_spam.py`
- Services: `bot_info.py`, `scheduler.py`, `user_checker.py`, `telegram_utils.py`, `captcha_recovery.py` — all 100%
- Handlers: `anti_spam.py` (99%), `captcha.py`, `check.py`, `dm.py`, `message.py`, `topic_guard.py`, `verify.py`, `duplicate_spam.py` — all 100%
- Database: `service.py`, `models.py` — all 100%
- Config: `config.py`, `group_config.py`, `constants.py` — all 100%

All modules are fully unit tested with:
- Mocked async dependencies (telegram bot API calls)
@@ -227,16 +226,18 @@ PythonID/
│ ├── test_bot_info.py
│ ├── test_captcha.py
│ ├── test_captcha_recovery.py
│ ├── test_check.py
│ ├── test_config.py
│ ├── test_constants.py
│ ├── test_database.py
│ ├── test_dm_handler.py
│ ├── test_duplicate_spam.py
│ ├── test_group_config.py
│ ├── test_message_handler.py
│ ├── test_photo_verification.py
│ ├── test_scheduler.py # JobQueue scheduler tests
│ ├── test_telegram_utils.py
│ ├── test_topic_guard.py
│ ├── test_group_config.py
│ ├── test_user_checker.py
│ ├── test_verify_handler.py
│ └── test_whitelist.py
@@ -441,14 +442,16 @@ flowchart TD

The bot is organized into clear modules for maintainability:

- **main.py**: Entry point with python-telegram-bot's JobQueue integration and graceful shutdown
- **handlers/**: Message processing logic
- `message.py`: Monitors group messages and sends warnings/restrictions
- **main.py**: Entry point with python-telegram-bot's JobQueue integration, admin cache refresh, and graceful shutdown
- **handlers/**: Message processing logic (priority groups -1 through 4)
- `topic_guard.py`: Protects warning topic (group=-1, messages + edited messages, fail-closed)
- `message.py`: Monitors group messages and sends warnings/restrictions (group=4)
- `dm.py`: Handles DM unrestriction flow
- `topic_guard.py`: Protects warning topic from unauthorized messages
- `captcha.py`: Captcha verification for new members
- `anti_spam.py`: Anti-spam enforcement for users on probation
- `anti_spam.py`: Inline keyboard spam (group=1) + new user probation enforcement (group=2)
- `duplicate_spam.py`: Repeated message detection (group=3)
- `verify.py`: /verify and /unverify command handlers
- `check.py`: /check command + forwarded message handling
- **services/**: Business logic and utilities
- `scheduler.py`: JobQueue background job that runs every 5 minutes for time-based auto-restrictions
- `user_checker.py`: Profile validation (photo + username check)
@@ -492,6 +495,9 @@ The bot runs a JobQueue background job every 5 minutes that:

This ensures users cannot evade restrictions by simply not sending messages.

### Admin Cache Refresh
Admin IDs are fetched at startup and refreshed every 10 minutes via a JobQueue job. If the refresh fails for a group, the bot falls back to the previously cached data (never an empty list). Spam handlers use the cached admin IDs for fast lookups, while the topic guard uses live `get_chat_member` API calls for maximum accuracy.

### Message Templates and Constants
All warning and restriction messages are centralized in `constants.py` for consistency:
- `WARNING_MESSAGE_NO_RESTRICTION`: Used in warning-only mode
@@ -504,7 +510,9 @@ All messages are formatted with proper Indonesian language patterns and include

### Warning Topic Protection
- Only group administrators and the bot itself can post in the warning topic
- Messages from regular users are automatically deleted
- Messages and edited messages from regular users are automatically deleted
- Uses `ApplicationHandlerStop` to prevent downstream handlers from processing warning-topic traffic
- **Fail-closed**: On API errors, messages in the warning topic are deleted (erring on the side of protection)
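The fail-closed behavior described above can be sketched without the Telegram library. Everything here is a simplified stand-in: `StopDispatch` plays the role of `ApplicationHandlerStop`, `FakeMessage` replaces a real message, and `is_admin` is a hypothetical coroutine standing in for the live `get_chat_member` lookup.

```python
import asyncio
from typing import Awaitable, Callable


class StopDispatch(Exception):
    """Hypothetical stand-in for telegram.ext.ApplicationHandlerStop."""


class FakeMessage:
    """Minimal message double that records whether it was deleted."""

    def __init__(self) -> None:
        self.deleted = False

    async def delete(self) -> None:
        self.deleted = True


async def guard(message: FakeMessage, is_admin: Callable[[], Awaitable[bool]]) -> None:
    try:
        if await is_admin():
            raise StopDispatch  # allowed, but still hidden from later handlers
        await message.delete()
        raise StopDispatch
    except StopDispatch:
        raise  # the control-flow signal must not be treated as an error
    except Exception:
        # Fail-closed: when the admin check breaks, remove the message anyway.
        try:
            await message.delete()
        except Exception:
            pass  # nothing more can be done if deletion also fails
        raise StopDispatch


def run_guard(message: FakeMessage, is_admin: Callable[[], Awaitable[bool]]) -> bool:
    """Run the guard and report whether propagation was stopped."""
    try:
        asyncio.run(guard(message, is_admin))
    except StopDispatch:
        return True
    return False


async def admin() -> bool:
    return True


async def broken() -> bool:
    raise RuntimeError("API unavailable")


msg = FakeMessage()
print(run_guard(msg, broken), msg.deleted)
```

The separate `except StopDispatch: raise` clause matters: without it, the broad `except Exception` would catch the stop signal raised for admins and delete their messages.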

### DM Unrestriction Flow
When a restricted user DMs the bot (or sends `/start`):
69 changes: 42 additions & 27 deletions src/bot/handlers/topic_guard.py
@@ -9,7 +9,7 @@
import logging

from telegram import Update
from telegram.ext import ContextTypes
from telegram.ext import ApplicationHandlerStop, ContextTypes

from bot.group_config import get_group_config_for_update

@@ -31,41 +31,44 @@ async def guard_warning_topic(update: Update, context: ContextTypes.DEFAULT_TYPE
update: Telegram update containing the message.
context: Bot context with helper methods.
"""
try:
# Skip if no message or sender
if not update.message or not update.message.from_user:
logger.info("No message or no sender, skipping")
return
message = update.message or update.edited_message

# Skip if no message or sender
if not message or not message.from_user:
logger.info("No message or no sender, skipping")
return

group_config = get_group_config_for_update(update)
user = message.from_user
chat_id = update.effective_chat.id if update.effective_chat else None
thread_id = message.message_thread_id

group_config = get_group_config_for_update(update)
user = update.message.from_user
chat_id = update.effective_chat.id if update.effective_chat else None
thread_id = update.message.message_thread_id
logger.info(
f"Topic guard called: user_id={user.id}, chat_id={chat_id}, thread_id={thread_id}"
)

# Only process messages from monitored groups
if group_config is None:
logger.info(
f"Topic guard called: user_id={user.id}, chat_id={chat_id}, thread_id={thread_id}"
f"Chat not monitored (chat_id={chat_id}), skipping"
)
return

# Only process messages from monitored groups
if group_config is None:
logger.info(
f"Chat not monitored (chat_id={chat_id}), skipping"
)
return

# Only guard the warning topic, not other topics
if thread_id != group_config.warning_topic_id:
logger.info(
f"Wrong topic (thread_id={thread_id}, expected {group_config.warning_topic_id}), skipping"
)
return
# Only guard the warning topic, not other topics
if thread_id != group_config.warning_topic_id:
logger.info(
f"Wrong topic (thread_id={thread_id}, expected {group_config.warning_topic_id}), skipping"
)
return

# From here on, we're in the warning topic - use try/except with fail-closed
try:
bot_id = context.bot.id

# Allow bot's own messages
if user.id == bot_id:
logger.info(f"Allowing bot's own message (bot_id={bot_id})")
return
raise ApplicationHandlerStop

# Check if user is an admin or creator
logger.info(f"Checking admin status for user {user.id} ({user.full_name})")
@@ -79,17 +82,29 @@ async def guard_warning_topic(update: Update, context: ContextTypes.DEFAULT_TYPE
logger.info(
f"Allowing message from {chat_member.status} {user.id} ({user.full_name})"
)
return
raise ApplicationHandlerStop

# Delete message from non-admin user
logger.info(
f"Deleting message from non-admin user {user.id} ({user.full_name}) "
f"in warning topic (group_id={group_config.group_id}, thread_id={thread_id})"
)
await update.message.delete()
await message.delete()
raise ApplicationHandlerStop

except ApplicationHandlerStop:
raise
except Exception as e:
logger.error(
f"Error in topic guard handler: {e}",
exc_info=True,
)
# Fail-closed: delete message in warning topic on error
try:
await message.delete()
except Exception as delete_error:
logger.error(
f"Failed to delete message during error recovery: {delete_error}",
exc_info=True,
)
raise ApplicationHandlerStop
41 changes: 38 additions & 3 deletions src/bot/main.py
@@ -110,6 +110,33 @@ def configure_logging() -> None:
logger = logging.getLogger(__name__)


async def refresh_admin_ids(context: ContextTypes.DEFAULT_TYPE) -> None:
"""
Periodically refresh cached admin IDs for all monitored groups.

Called by JobQueue every 10 minutes to keep admin rosters up to date
when promotions/demotions happen after startup.
"""
registry = get_group_registry()
group_admin_ids: dict[int, list[int]] = {}
all_admin_ids: set[int] = set()

for gc in registry.all_groups():
try:
ids = await fetch_group_admin_ids(context.bot, gc.group_id)
group_admin_ids[gc.group_id] = ids
all_admin_ids.update(ids)
except Exception as e:
logger.error(f"Failed to refresh admin IDs for group {gc.group_id}: {e}")
existing = context.bot_data.get("group_admin_ids", {}).get(gc.group_id, [])
group_admin_ids[gc.group_id] = existing
all_admin_ids.update(existing)

context.bot_data["group_admin_ids"] = group_admin_ids
context.bot_data["admin_ids"] = list(all_admin_ids)
logger.info(f"Refreshed admin IDs: {len(all_admin_ids)} unique admin(s) across {len(group_admin_ids)} group(s)")


async def error_handler(update: object, context: ContextTypes.DEFAULT_TYPE) -> None:
"""
Handle errors in the bot.
@@ -212,12 +239,12 @@ def main() -> None:
# messages in the warning topic before other handlers process them
application.add_handler(
MessageHandler(
filters.ALL,
filters.UpdateType.MESSAGE | filters.UpdateType.EDITED_MESSAGE,
guard_warning_topic,
),
group=-1,
)
logger.info("Registered handler: topic_guard (group=-1)")
logger.info("Registered handler: topic_guard (group=-1, message + edited_message)")

# Handler 2: /verify command - allows admins to whitelist users in DM
application.add_handler(
@@ -331,10 +358,18 @@ def main() -> None:
)
logger.info("JobQueue registered: auto_restrict_job (every 5 minutes, first run in 5 minutes)")

application.job_queue.run_repeating(
refresh_admin_ids,
interval=600,
first=600,
name="refresh_admin_ids_job"
)
logger.info("JobQueue registered: refresh_admin_ids_job (every 10 minutes)")

logger.info(f"Starting bot polling for {group_count} group(s)")
logger.info("All handlers registered successfully")

application.run_polling(allowed_updates=["message", "callback_query", "chat_member"])
application.run_polling(allowed_updates=["message", "edited_message", "callback_query", "chat_member"])


if __name__ == "__main__":
9 changes: 9 additions & 0 deletions tests/test_anti_spam.py
@@ -848,6 +848,15 @@ def test_mixed_urls_returns_true(self):

assert has_non_whitelisted_inline_keyboard_urls(msg) is True

def test_none_button_skipped(self):
"""Test that None buttons in a row don't crash."""
reply_markup = MagicMock()
reply_markup.inline_keyboard = [[None]]
msg = MagicMock(spec=Message)
msg.reply_markup = reply_markup

assert has_non_whitelisted_inline_keyboard_urls(msg) is False

def test_callback_data_buttons_returns_false(self):
"""Test that buttons without URLs (callback_data buttons) return False."""
button = MagicMock()
32 changes: 32 additions & 0 deletions tests/test_check.py
@@ -604,3 +604,35 @@ async def test_warn_callback_get_chat_timeout(
query.edit_message_text.assert_called_once()
call_args = query.edit_message_text.call_args
assert "timeout" in call_args.args[0].lower()

async def test_warn_callback_per_group_send_failure_all_groups(
self, mock_context, mock_settings, mock_registry
):
"""When send_message fails for all groups, shows 'all groups failed' message."""
update = MagicMock()
query = MagicMock()
query.from_user = MagicMock()
query.from_user.id = 12345
query.from_user.full_name = "Admin User"
query.data = "warn:555666:pu"
query.answer = AsyncMock()
query.edit_message_text = AsyncMock()
update.callback_query = query

mock_chat = MagicMock()
mock_chat.full_name = "Test User"
mock_chat.username = "testuser"
mock_context.bot.get_chat.return_value = mock_chat
mock_context.bot.send_message.side_effect = RuntimeError("Connection lost")

with (
patch(
"bot.handlers.check.get_group_registry",
return_value=mock_registry,
),
):
await handle_warn_callback(update, mock_context)

query.edit_message_text.assert_called_once()
call_args = query.edit_message_text.call_args
assert "Gagal mengirim peringatan ke semua grup" in call_args.args[0]