Last updated: 2026-05-10
Single canonical reference for ALL tracking, monitoring, alerting, and observability across our platforms. Read this before launching anything new.
1. Visitor tracking (every user-facing platform)
Mandatory maximalist visitor tracking on every platform: oranges.live, apples.live, askcraig.org, quiddler.org, spidergpt, collage, marketing, peoplesearch, and any new platform. Reference schema: oranges-platform.global_visitors. Skill: /root/.claude/skills/visitor-tracking/SKILL.md.
If the browser exposes it, capture it. Default posture is maximalist — these are our domains; we decide what to log.
Per-IP visitor row (global_visitors)
- Identity — ip (PK), first_seen, last_seen, visit_count, label, session_id, user_id
- Network — isp, org, asn, is_proxy, is_vpn, is_tor, is_hosting, webrtc_local_ip
- Geo (server-side from IP) — city, region, country, country_code, postal, lat, lng, timezone, accuracy_radius
- Device — ua, device_type, brand, model, os, browser, is_mobile, is_tablet, is_bot, cpu_cores, device_memory_gb, touch_points, hardware_concurrency
- Display — screen_w/h, viewport_w/h, pixel_ratio, color_scheme, color_gamut, contrast_pref, reduced_motion_pref, forced_colors, orientation
- Connection — type, effective_type, downlink, downlink_max, rtt, save_data
- Fingerprint — canvas_fp, webgl_vendor/renderer, webgl_extensions_hash, webgl_unmasked_*, audio_fp, audio_sample_rate, fonts_fp, client_hints_*, plugins_hash, mime_types_hash, speech_voices_hash, math_fp
- Privacy/locale — do_not_track, gpc, cookie_enabled, lang, languages, tz_js, tz_offset_minutes, intl_calendar, intl_numbering
- Battery — level, charging, charging_time, discharging_time
- Capabilities — storage_quota_mb, storage_usage_mb, indexeddb_avail, service_worker_supported, push_supported, webauthn_supported, payment_request_supported, web_share_supported, bluetooth, usb, gamepad_count, midi, vr, ar
- Permissions (snapshot, no prompt) — geolocation_state, notifications_state, camera_state, microphone_state, clipboard_state, persistent_storage_state
- Media devices (count only) — camera_count, mic_count, speaker_count
- PWA — display_mode, is_installed, referrer_app
- Keyboard — keyboard_layout (when getLayoutMap available)
Per-request log (global_visit_log)
ip + session_id + user_id + path + query_string + referrer + utm_* + gclid + fbclid + msclkid + ttclid + status_code + response_time_ms + bytes_sent + ts. Every non-asset request.
Per-event engagement (global_engagement)
ip + session_id + event_name + object_id + dwell_ms + scroll_depth_pct + viewport_x/y + element_text + element_path + ts.
Events captured: impression, expand, open, skip, click, rage_click (3+ same area in <1s), dead_click (non-interactive target), copy, paste, share, outbound_link, form_focus, form_abandon, form_submit.
Performance (global_performance)
ip + session_id + path + ttfb + fcp + lcp + cls + inp + tti + dom_content_loaded + load_event + js_errors + slow_resources_json + ts. Captured via PerformanceObserver + Web Vitals.
Session timing (global_session)
session_id + ip + start_ts + end_ts + page_count + total_active_ms + total_idle_ms + max_scroll_pct + tab_switches + window_blur_count + visibility_changes + bounced (1 page <10s no events).
Conversion (global_conversion)
ip + session_id + user_id + event (signup / first_action / paid_trial_start / paid_active / churned) + value_cents + funnel_stage + ab_bucket + ts.
Email tracking
1×1 pixel + per-link /r/<token> redirector for opens + clicks attributed to visitor row.
Implementation rules
- IP-keyed for anon. Session-cookie for cross-IP linkage. user_id once known. All three coexist on rows that have them.
- Server-side geo on first sighting (IPinfo / ipapi / MaxMind). Cache.
- Client fingerprint via JS payload POSTed to
/api/visitor/enrichafter page load (deferred viarequestIdleCallback). Single batched payload, not 30 separate requests. - Performance sent via
navigator.sendBeacon()onpagehideso it survives nav-away. - Logging path NEVER 500s page render. Wrap in try/except, swallow, log to journal.
- All tables IP-indexed for cheap lookups. WAL + busy_timeout=15000.
- Console UI surfaces sortable + filtered (country, device, label, conversion stage, ab_bucket) + click-through to per-visitor timeline.
When building a new platform, copy global_visitors schema + _log_global_visit_inner() + /api/visitor/enrich from oranges-platform. Don't reinvent.
2. Per-platform tracking surfaces
apples.live (vps2)
- agentaudit.db —
pageviews,track_events,track_dwell,ai_activity_log(existing) - sessions.db (apples-agent) —
turnstable: every user/agent/tool message in customer chat, indexed by slug + ts - consultant_chat_messages — every message in consultant chat (token-keyed), full transcript
- client_folders — folder creates, mutations, parent moves
- outreach_packs / pack_activity — consultant pack views, prospect status changes
oranges.live (vps1)
- global_visitors + global_visit_log + global_engagement + global_performance + global_session + global_conversion — full reference implementation
askcraig.org (vps2)
- alan.db — users (45), jobs (46), messages (279), bids
- askcraig-reddit data.db (vps1) — candidates (1170 queued), drafted, posted, skipped
quiddler.org (vps1)
- quiddler/data.db — players, games, sessions, scoring events, magic_tokens
spidergpt (vps1)
- spidergpt/data.db — audit_log, tg_chats, telegram interactions
collage.live (vps1)
- collage/data.db — cards, items
Aggregate: 4030 admin console at /apples/conversations
- Cross-platform conversation viewer
- Reads sessions.db (vps2) via
/api/admin/dashboard/conversations+/conversation?slug=<X> - Bearer auth via WORKSPACE_DASHBOARD_TOKEN
3. Service health monitoring
claude-ping (every 5 min, both VPSes)
/opt/claude-ping/ping.sh— verifiesclaude -p "ping"returns "pong" on each VPS- Detects: rate limits, account suspension, auth/cred failures, network issues
- Notifies via
/opt/notify-nik.pyon first auth-class failure or 3 consecutive generic failures - Throttle: 1 notify per host per hour
- Log:
/opt/claude-ping/ping.log
apples-smoke (every 5 min, vps2)
19 checks exercising every user flow that has broken before. Read the full list at /opt/apples-smoke/run.sh. Currently:
- apex marketing, login page, OAuth start
- privacy + ToS content present
- workspace L1 with snap_key bypass
- admin clients API + conversations API
- snap-image route + tab-snap render direct
- apples-agent blue/green health
- agentaudit blue/green health
- WS handshake (catches missing python websockets lib)
- claude binary works as apples-agent user
- note path normalization (catches dup-slug bug)
- tab-snap listening + warm-apples cron recent
- consultant-chat-rules (CLAUDE.md identity rules present)
Output: /opt/apples-smoke/results.tsv (autoresearch-style append-only TSV).
Notifications: first failure + once per hour after; resets on pass.
cred-sync (every 10 min, vps2)
/opt/cred-sync/run.sh— compares root vs apples-agent~/.claude/.credentials.jsonexpiresAt- Copies fresh when SRC > DST
- Hard alert if DST is expired (would cause 401 in composer)
Cron warmer for snap cache (every 5 min, vps2)
/opt/tab-snap/warm-apples.sh— pre-warms tab-snap cache for every customer workspace- Hits all 5 default tabs + every server-side folder + L1 grid for each slug
force=1bypasses 15-min disk cache- Keeps thumbnails always-fresh
Service-health helper
/root/.claude/skills/service-check/— used to debug systemd services (status, logs, port, memory)
4. Snap freshness (multi-layer)
Six layers keep PNG thumbnails fresh. Skill: snap-warm.
| Layer | When | What |
|---|---|---|
| 1 | Tile click | warmSnap() fire-and-forget warm of the destination tab |
| 2 | Page mount | Folder/tab mounts → fires force=1 warm of own URL |
| 3 | Mutation | Folder PATCH, note save → re-warms the thumbnail |
| 4 | Cron (every 5 min) | warm-apples.sh warms every customer workspace's full tab set |
| 5 | Visibility-change | When user comes back to L1 grid, force=1 fetch + cache-bust |
| 6 | Discovery | Polling sees new tab/folder → fires immediate warm |
Everything routes through /api/snap-image?force=1 → tab-snap :4137. Snap cache TTL: 15 min on disk.
5. Bug ledger + prevention loop
Every bug that takes more than 5 minutes to fix gets:
- Entry in
/opt/apples-bugs/ledger.md— date / symptom / root cause / fix / prevention - New smoke check appended to
/opt/apples-smoke/run.sh(orWONTFIX/ACCEPTjustification)
Read the ledger before designing new features — don't repeat solved problems.
Recent entries (as of 2026-05-10):
- Note duplication on creation (slug-prefix path bug → server normalize + CLAUDE.md rule)
- Composer 401 (creds, BindPaths, websockets lib)
- Folder PNG snap not loading (skip flag + auth bypass)
- Custom tabs invisible on L1 (no fetch endpoint)
- Agent identifying as Claude (CLAUDE.md template)
- Privacy + ToS placeholder (force-dynamic + real content)
- WebSocket 404 (missing python deps)
- OAuth 404 (missing keys333 on vps2)
- nginx → callflow connection refused (UFW rule order)
- Wrong header pattern on Inbox+Calendar (image-bg vs white-card confusion)
- Consultant chat identifying as Claude (missing identity rules in APPLES_CONTEXT.md)
6. Notification channels
/opt/notify-nik.py "msg" — sends to:
- Slack — Nik's primary workspace
- Telegram — bot DM via TELEGRAM_BOT_TOKEN
Used by claude-ping, smoke runner, cred-sync, anything mission-critical. Throttled per-source so we don't spam.
7. Console / Admin observability surfaces
4030 workspace
/— L1 grid of every platform tile/apples/clients— client list with last activity/apples/conversations— every chat across customer workspaces (turns count + last_ts)/apples/conversations/<slug>— full transcript per workspace/apples/pageviews— visit aggregates + recent rows/apples/behavior— events count + dwell rows (last 7 days)/askcraig/customers,jobs,messages,stats— alan.db aggregates/oranges/_system— oranges system overview/quiddler/*,/spidergpt/*,/collage/*,/marketing/*— per-platform admin
Per-platform consoles (legacy, internal-facing)
oranges.live/console— visitor tracking deep-dive (sortable, filterable)apples.live/console— DEPRECATED, moving to 4030 only- Each new platform gets its admin in 4030, not on the public domain
8. What to build for a new platform
Every new user-facing platform gets, day one:
- global_visitors schema copied from oranges-platform
/api/visitor/enrichendpoint- Blue/green deploy pair on adjacent ports + nginx upstream
- Health endpoint (
/healthzreturning JSON{ok:true,...}) - Smoke test added to
/opt/apples-smoke/run.sh(or platform-specific equivalent) /api/admin/dashboard/<section>JSON endpoint with WORKSPACE_DASHBOARD_TOKEN bearer auth- 4030 dashboard tile wired to the JSON endpoint
- claude-ping cron if any LLM calls are involved
- Snap cron warmer if it's part of the 4030 grid (new platform = new entries in
warm-apples.sh) - Header card matching the white-card pattern (apples-style, FolderPanel reference)
Don't ship without these. Retrofit cost is always > build-it-right cost.
References
/opt/apples-bugs/ledger.md— bug history + preventions/opt/apples-smoke/run.sh— current smoke test suite/opt/cred-sync/run.sh— cred sync cron/opt/claude-ping/ping.sh— claude API health probe/opt/tab-snap/warm-apples.sh— snap cache warmer/opt/notify-nik.py— Slack + Telegram alerts/root/.claude/skills/visitor-tracking/SKILL.md— visitor tracking schema (skill)/root/.claude/skills/blue-green-deploys/SKILL.md— zero-downtime deploy pattern/root/.claude/skills/scroll-snap-fix/SKILL.md— auto-refresh dashboards scroll preservation/root/.claude/skills/console-header/SKILL.md— header card patterns (A image-bg-pill, B white-card)/root/.claude/skills/ios-zoom-fix/SKILL.md— 16px input rule for mobile/opt/apples-templates/CLAUDE.md— apples customer agent identity + privacy rules/opt/apples-consultants/APPLES_CONTEXT.md— consultant chat agent identity + privacy rules