X/Twitter accounts ─┐
                    ├─► twitterapi.io ─► Claude API (summary) ─┬─► Telegram (daily push)
RSS (feeds.txt) ────┘                                          └─► feed.json ─► site (archive)
⏰ triggered every morning by GitHub Actions
feed.json is the data contract between the two repos.
1. The spark / the initial need
I wanted to follow AI news (labs, models, tools, research) without spending time on it or drowning in hype. With my own angle: covering global AI and especially AI that's useful to merchants and business owners — the concrete stuff for a real business.
A real case had already pointed me in this direction: an LLM drafts a daily hi-fi press review for me from about thirty sources every morning. The AI watch bot is the generalization of that mechanic.
Two goals from the start:
- a Telegram channel (t.me/VeilleIA_HL) — daily push, ephemeral;
- a public archive on the site (the "AI Watch" page) — permanent, SEO/GEO-optimized, citable by AIs.
2. The idea → the scope (decisions, rejected alternatives)
- Two separate repos (veille-ia = engine, knowledge-hub = site): to isolate secrets (API keys, bot token) from the public site. Rejected: putting everything in the site repo, secrets exposed.
- Bot = data engine / Site = display, connected by a single feed.json: separation of concerns. Rejected: the site collecting and summarizing itself.
- Reading feed.json at runtime (browser fetch), not at build time: a feed.json push is reflected without redeploying. Rejected: regenerating the site on every update.
- Archive page ≠ Telegram: the archive is permanent, Telegram is ephemeral. Rejected: relying solely on Telegram history.
- Renaming "Flux IA" → "Veille IA" + URL /veille-ia/ (redirect from /flux-ia/): more accurate industry term, better SEO/GEO anchor. Rejected: "Flux IA", "Radar IA".
- Tweet media: official X embed + automatic thumbnail + link fallback, because X embeds are fragile and the user must never be blocked. Rejected: embed only (often breaks).
- Transformative EN summaries, never copy-pasting the tweet: hard copyright rule. Rejected: copying the tweet (not allowed).
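The "single feed.json as data contract" decision above can be pictured as a small cumulative list that the engine merges into on every run. The following is a minimal sketch, not the project's actual code: the field names (url, date, author, summary) and the helpers merge_feed and write_feed are assumptions made for illustration, and the real schema in veille-ia may differ.

```python
import json
from pathlib import Path

# Hypothetical required fields; the real feed.json schema may differ.
REQUIRED_FIELDS = ("url", "date", "author", "summary")


def merge_feed(existing: list[dict], new_entries: list[dict]) -> list[dict]:
    """Merge new entries into the cumulative feed, keyed by URL, newest first."""
    by_url = {entry["url"]: entry for entry in existing}
    for entry in new_entries:
        missing = [f for f in REQUIRED_FIELDS if f not in entry]
        if missing:
            raise ValueError(f"entry {entry.get('url')!r} is missing {missing}")
        # Keep the version already published if the URL is known.
        by_url.setdefault(entry["url"], entry)
    # ISO 8601 dates sort correctly as plain strings.
    return sorted(by_url.values(), key=lambda e: e["date"], reverse=True)


def write_feed(path: Path, entries: list[dict]) -> None:
    """Write the feed as UTF-8 JSON so the site can fetch it at runtime."""
    path.write_text(
        json.dumps(entries, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

Rejecting incomplete entries at write time is one way to keep the contract strict on the engine side, so the site only ever has to render.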
3. The "vibe coding": how it came together, step by step
The project happened in two phases: a chat-based scoping session (the decisions above, formalized in handoff MDs), then the actual code build with Claude Code in the veille-ia repo.
The pipeline built (veille/):
- twitter.py: fetches recent tweets by account via twitterapi.io (+ advanced_search for backfill).
- Filtering: removes noise (retweets, replies, out-of-window).
- feed.py: Claude (role "neutral editor-in-chief, anti-hype") selects noteworthy topics and produces structured entries via an enforced JSON Schema. Factual fields (URL, date, media, author) are not delegated to Claude: they're read back from the tweet source by its index [n].
- digest.py: writes the Telegram digest.
- telegram.py: publishes (with splitting for messages > 4096 characters).
- main.py: orchestrates both outputs from the same collection run.
- backfill.py: generates the initial feed.json (May 2026), budget-capped.
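The "factual fields are read back from the source by index" rule in feed.py can be sketched as follows. This is an illustration under assumptions, not the repo's code: it supposes the model's JSON output has already been parsed into a list of dicts with hypothetical keys index, title and summary, that indices are 0-based, and that each collected tweet is a dict with url, created_at, author and an optional media list.

```python
def build_entries(tweets: list[dict], selections: list[dict]) -> list[dict]:
    """Combine the model's editorial text with facts copied from the source tweets."""
    entries = []
    for sel in selections:
        n = sel.get("index")
        # Ignore anything that does not point at a tweet we actually collected.
        if not isinstance(n, int) or isinstance(n, bool) or not 0 <= n < len(tweets):
            continue
        tweet = tweets[n]
        entries.append({
            # Editorial fields: written by the model.
            "title": sel["title"],
            "summary": sel["summary"],
            # Factual fields: copied from the tweet, never from the model.
            "url": tweet["url"],
            "date": tweet["created_at"],
            "author": tweet["author"],
            "media": tweet.get("media", []),
        })
    return entries
```

With this shape, a URL or date that the model gets wrong cannot reach feed.json, because the model's own values for those fields are never read.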
The whole thing runs without a server: two GitHub Actions workflows (veille.yml daily at 05:00 UTC ≈ 7am Paris, and backfill.yml manual).
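One detail from the pipeline list, the split that telegram.py applies to digests longer than 4096 characters, can be sketched like this. The function name split_message is hypothetical, and the sketch counts Python characters, which may not match exactly how Telegram counts some emoji.

```python
TELEGRAM_LIMIT = 4096  # maximum length of one Telegram text message


def split_message(text: str, limit: int = TELEGRAM_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring line breaks."""
    chunks = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit has to be cut hard.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk when this line would overflow the current one.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Cutting on line breaks keeps each digest item whole whenever possible, and joining the chunks back together gives the original text.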
4. The stack + tools
- twitterapi.io: tweet collection (Twitter having closed free access). Pay-per-use, a few cents/day. Key <TWITTERAPI_IO_KEY>.
- RSS (feeds.txt): complementary sources beyond X.
- Claude API (Anthropic): editorial selection + neutral summaries. Model claude-opus-4-8 (switchable to claude-sonnet-4-6 to reduce cost). Key <ANTHROPIC_API_KEY>.
- Telegram Bot API: digest push. Token <TELEGRAM_BOT_TOKEN>, channel t.me/VeilleIA_HL.
- GitHub Actions: the scheduler (1 run/day) that orchestrates collect → summarize → 2 outputs.
- GitHub (veille-ia): engine + secrets. GitHub Pages (knowledge-hub): the site.
- feed.json: the data contract between the two repos (cumulative).
- seen.json: the deduplication memory.
- Cross-repo bridge: at the end of the job, veille-ia commits and pushes feed.json to knowledge-hub via a dedicated PAT <KH_REPO_TOKEN>.
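The role of seen.json as "deduplication memory" can be illustrated with a few lines. This is a sketch under assumptions: it supposes seen.json is a flat JSON list of item IDs and that each collected item carries a string id key; the helper names are hypothetical.

```python
import json
from pathlib import Path


def load_seen(path: Path) -> set[str]:
    """Return the IDs already processed, or an empty set on the first run."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def filter_new(items: list[dict], seen: set[str]) -> list[dict]:
    """Keep only items whose ID has not been seen, and record them as seen."""
    fresh = []
    for item in items:
        if item["id"] not in seen:
            seen.add(item["id"])
            fresh.append(item)
    return fresh


def save_seen(path: Path, seen: set[str]) -> None:
    """Persist the IDs as a sorted list so Git diffs stay small and stable."""
    path.write_text(json.dumps(sorted(seen), indent=2), encoding="utf-8")
```

A daily run would call load_seen, pass the collected items through filter_new, and call save_seen at the end, so an item is never summarized twice across runs.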
5. The headaches (what broke and how we fixed it)
Faithfully reconstructed from the code + the repo's Git history.
- The cross-repo bridge fighting itself. First instinct: rebase before pushing feed.json. Result → add/add conflicts. → Our feed.json being the source of truth, we overwrite the site version: git reset --hard origin/main then copy + push, with 3 resync attempts. (commits 7cf2157 → ec80ca4)
- Telegram blocking everything. Bot not an admin of the channel (403 "bot is not a member") or a network hiccup → the whole delivery would fail, including feed.json. → Telegram sending made non-blocking (try/except) and optional: no Telegram configured → skip, but still produce the feed. (commits 500a8ce, e9717d1)
- Secret name mismatch. The code expected TELEGRAM_CHAT_ID, the secret was named TELEGRAM_CHANNEL_ID. → Both names are now accepted as aliases. (commit e9717d1)
- Invalid GitHub Actions workflow. A secrets.* reference inside a step's if: condition, which GitHub forbids. → The "no token → skip" guard moved into the script ([ -z "$KH_REPO_TOKEN" ]). (commit ba581b3)
- Unbalanced backfill. Without a quota, a few talkative accounts ate the entire budget. → Per-account quota (fair collection across ~45 accounts) + strict global cap on the total. (commit 00afa3f)
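Two of the fixes above, the non-blocking and optional Telegram push and the TELEGRAM_CHAT_ID / TELEGRAM_CHANNEL_ID alias, fit in one short sketch. It is illustrative only: first_env and publish_digest are hypothetical names, and send stands for whatever callable actually performs the HTTP request.

```python
import logging
import os
from typing import Callable, Optional

log = logging.getLogger("veille")


def first_env(*names: str) -> Optional[str]:
    """Return the first non-empty environment variable among `names`."""
    for name in names:
        value = os.environ.get(name)
        if value:
            return value
    return None


def publish_digest(digest: str, send: Callable[[str, str, str], object]) -> bool:
    """Try to send the digest; never raise, so feed.json is still produced.

    `send(token, chat_id, text)` is a hypothetical callable doing the HTTP call.
    """
    token = first_env("TELEGRAM_BOT_TOKEN")
    # Accept either secret name, since both were in use.
    chat_id = first_env("TELEGRAM_CHAT_ID", "TELEGRAM_CHANNEL_ID")
    if not token or not chat_id:
        log.info("Telegram not configured, skipping the push")
        return False
    try:
        send(token, chat_id, digest)
        return True
    except Exception as exc:  # e.g. an HTTP 403 or a network error
        log.warning("Telegram push failed, continuing: %s", exc)
        return False
```

The caller can ignore the return value and go on to write feed.json either way, which is the point of the fix: a delivery problem no longer blocks the archive.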
6. Time spent (real) and tokens / cost
Time. Scoping (2-repo architecture, feed.json contract, stack, safeguards) was spread over several short conversations and formalized in two handoff MDs. Actual setup happened on May 31, 2026 (accounts + secrets). Based on my screenshot timestamps: ~3:14–3:19 PM (twitterapi.io account + 1st secret), ~6:44–6:50 PM (Telegram bot via BotFather + 3 secrets), ~11:42–11:43 PM (4th secret + "no workflow yet" screen). Total span: ~8h30 that day, but these are clock gaps between screenshots, not actual working time. The real work end-to-end (creating the accounts, pasting the 4 secrets, hooking up the workflow) takes tens of minutes, not hours.
Cost (unlike the site, here you pay per use):
- twitterapi.io: €100 prepaid "to see" — even though Claude advised me to start at zero. It's a comfort ceiling, not an expense: the real consumption for ~45 accounts is in the order of a few cents/day, so those €100 last a very long time.
- Claude API: one daily digest = a handful of cents. claude-opus-4-8 by default; claude-sonnet-4-6 to reduce it further.
- GitHub Actions: free within public repo limits.
- Budget safeguards: 1 run/day, RSS/tweet caps, low prepaid balance, capped backfill (no loop on advanced_search, strict ceiling ~1500 tweets).
See also "How much does AI cost".
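The capped backfill (per-account quota plus a strict global ceiling) can be sketched as a round-robin selection. This is a simplified illustration on lists already in memory: collect_with_quota and its parameters are hypothetical, and the real backfill.py presumably applies its limits while calling the API, not after.

```python
def collect_with_quota(
    tweets_by_account: dict[str, list[dict]],
    per_account: int,
    global_cap: int,
) -> list[dict]:
    """Take at most `per_account` tweets per account, stopping at `global_cap` overall."""
    selected = []
    # Round-robin across accounts so the cap is not consumed by the first ones.
    for rank in range(per_account):
        for account, tweets in tweets_by_account.items():
            if len(selected) >= global_cap:
                return selected
            if rank < len(tweets):
                selected.append(tweets[rank])
    return selected
```

Taking one tweet per account per pass means a talkative account can never crowd out a quiet one, and the global cap bounds the total however many accounts are followed.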
7. What this illustrates
- A useful agent fits in ~300 lines and 0 servers. Collect → summarize via the Claude API → dual publish, orchestrated by a free GitHub Actions cron. No infrastructure to manage. (What's an agent? → the guide.)
- Good architecture prevents headaches. Separating engine from display, isolating secrets, locking a data contract: most of the remaining bugs were integration details, not design flaws.
- Keep humans (and the LLM) honest. Factual fields read back from the source; copyright respected (summary, never copy-paste); neutral, anti-hype editorial line, even when the news isn't flattering for Anthropic.
What's next
- Document the Telegram / AI watch project (step-by-step guide "My AI watch agent").
- Pinned welcome post on the Telegram channel.
- Tag-filterable Guides page on the site.
- LinkedIn post: announce the project and track it over time — two solid concrete use cases, with real time and cost numbers.