Tracking + Monitoring Protocol

Last updated: 2026-05-10

Single canonical reference for ALL tracking, monitoring, alerting, and observability across our platforms. Read this before launching anything new.


1. Visitor tracking (every user-facing platform)

Mandatory maximalist visitor tracking on every platform: oranges.live, apples.live, askcraig.org, quiddler.org, spidergpt, collage, marketing, peoplesearch, and any new platform. Reference schema: oranges-platform.global_visitors. Skill: /root/.claude/skills/visitor-tracking/SKILL.md.

If the browser exposes it, capture it. Default posture is maximalist — these are our domains; we decide what to log.

Per-IP visitor row (global_visitors)

  • Identity — ip (PK), first_seen, last_seen, visit_count, label, session_id, user_id
  • Network — isp, org, asn, is_proxy, is_vpn, is_tor, is_hosting, webrtc_local_ip
  • Geo (server-side from IP) — city, region, country, country_code, postal, lat, lng, timezone, accuracy_radius
  • Device — ua, device_type, brand, model, os, browser, is_mobile, is_tablet, is_bot, cpu_cores, device_memory_gb, touch_points, hardware_concurrency
  • Display — screen_w/h, viewport_w/h, pixel_ratio, color_scheme, color_gamut, contrast_pref, reduced_motion_pref, forced_colors, orientation
  • Connection — type, effective_type, downlink, downlink_max, rtt, save_data
  • Fingerprint — canvas_fp, webgl_vendor/renderer, webgl_extensions_hash, webgl_unmasked_*, audio_fp, audio_sample_rate, fonts_fp, client_hints_*, plugins_hash, mime_types_hash, speech_voices_hash, math_fp
  • Privacy/locale — do_not_track, gpc, cookie_enabled, lang, languages, tz_js, tz_offset_minutes, intl_calendar, intl_numbering
  • Battery — level, charging, charging_time, discharging_time
  • Capabilities — storage_quota_mb, storage_usage_mb, indexeddb_avail, service_worker_supported, push_supported, webauthn_supported, payment_request_supported, web_share_supported, bluetooth, usb, gamepad_count, midi, vr, ar
  • Permissions (snapshot, no prompt) — geolocation_state, notifications_state, camera_state, microphone_state, clipboard_state, persistent_storage_state
  • Media devices (count only) — camera_count, mic_count, speaker_count
  • PWA — display_mode, is_installed, referrer_app
  • Keyboard — keyboard_layout (when getLayoutMap available)

Per-request log (global_visit_log)

ip + session_id + user_id + path + query_string + referrer + utm_* + gclid + fbclid + msclkid + ttclid + status_code + response_time_ms + bytes_sent + ts. Every non-asset request.

Per-event engagement (global_engagement)

ip + session_id + event_name + object_id + dwell_ms + scroll_depth_pct + viewport_x/y + element_text + element_path + ts.

Events captured: impression, expand, open, skip, click, rage_click (3+ same area in <1s), dead_click (non-interactive target), copy, paste, share, outbound_link, form_focus, form_abandon, form_submit.

Performance (global_performance)

ip + session_id + path + ttfb + fcp + lcp + cls + inp + tti + dom_content_loaded + load_event + js_errors + slow_resources_json + ts. Captured via PerformanceObserver + Web Vitals.

Session timing (global_session)

session_id + ip + start_ts + end_ts + page_count + total_active_ms + total_idle_ms + max_scroll_pct + tab_switches + window_blur_count + visibility_changes + bounced (1 page <10s no events).

Conversion (global_conversion)

ip + session_id + user_id + event (signup / first_action / paid_trial_start / paid_active / churned) + value_cents + funnel_stage + ab_bucket + ts.

Email tracking

1×1 pixel + per-link /r/<token> redirector for opens + clicks attributed to visitor row.

Implementation rules

  • IP-keyed for anon. Session-cookie for cross-IP linkage. user_id once known. All three coexist on rows that have them.
  • Server-side geo on first sighting (IPinfo / ipapi / MaxMind). Cache.
  • Client fingerprint via JS payload POSTed to /api/visitor/enrich after page load (deferred via requestIdleCallback). Single batched payload, not 30 separate requests.
  • Performance sent via navigator.sendBeacon() on pagehide so it survives nav-away.
  • Logging path NEVER 500s page render. Wrap in try/except, swallow, log to journal.
  • All tables IP-indexed for cheap lookups. WAL + busy_timeout=15000.
  • Console UI surfaces sortable + filtered (country, device, label, conversion stage, ab_bucket) + click-through to per-visitor timeline.

When building a new platform, copy global_visitors schema + _log_global_visit_inner() + /api/visitor/enrich from oranges-platform. Don't reinvent.


2. Per-platform tracking surfaces

apples.live (vps2)

  • agentaudit.dbpageviews, track_events, track_dwell, ai_activity_log (existing)
  • sessions.db (apples-agent) — turns table: every user/agent/tool message in customer chat, indexed by slug + ts
  • consultant_chat_messages — every message in consultant chat (token-keyed), full transcript
  • client_folders — folder creates, mutations, parent moves
  • outreach_packs / pack_activity — consultant pack views, prospect status changes

oranges.live (vps1)

  • global_visitors + global_visit_log + global_engagement + global_performance + global_session + global_conversion — full reference implementation

askcraig.org (vps2)

  • alan.db — users (45), jobs (46), messages (279), bids
  • askcraig-reddit data.db (vps1) — candidates (1170 queued), drafted, posted, skipped

quiddler.org (vps1)

  • quiddler/data.db — players, games, sessions, scoring events, magic_tokens

spidergpt (vps1)

  • spidergpt/data.db — audit_log, tg_chats, telegram interactions

collage.live (vps1)

  • collage/data.db — cards, items

Aggregate: 4030 admin console at /apples/conversations

  • Cross-platform conversation viewer
  • Reads sessions.db (vps2) via /api/admin/dashboard/conversations + /conversation?slug=<X>
  • Bearer auth via WORKSPACE_DASHBOARD_TOKEN

3. Service health monitoring

claude-ping (every 5 min, both VPSes)

  • /opt/claude-ping/ping.sh — verifies claude -p "ping" returns "pong" on each VPS
  • Detects: rate limits, account suspension, auth/cred failures, network issues
  • Notifies via /opt/notify-nik.py on first auth-class failure or 3 consecutive generic failures
  • Throttle: 1 notify per host per hour
  • Log: /opt/claude-ping/ping.log

apples-smoke (every 5 min, vps2)

19 checks exercising every user flow that has broken before. Read the full list at /opt/apples-smoke/run.sh. Currently:

  • apex marketing, login page, OAuth start
  • privacy + ToS content present
  • workspace L1 with snap_key bypass
  • admin clients API + conversations API
  • snap-image route + tab-snap render direct
  • apples-agent blue/green health
  • agentaudit blue/green health
  • WS handshake (catches missing python websockets lib)
  • claude binary works as apples-agent user
  • note path normalization (catches dup-slug bug)
  • tab-snap listening + warm-apples cron recent
  • consultant-chat-rules (CLAUDE.md identity rules present)

Output: /opt/apples-smoke/results.tsv (autoresearch-style append-only TSV). Notifications: first failure + once per hour after; resets on pass.

cred-sync (every 10 min, vps2)

  • /opt/cred-sync/run.sh — compares root vs apples-agent ~/.claude/.credentials.json expiresAt
  • Copies fresh when SRC > DST
  • Hard alert if DST is expired (would cause 401 in composer)

Cron warmer for snap cache (every 5 min, vps2)

  • /opt/tab-snap/warm-apples.sh — pre-warms tab-snap cache for every customer workspace
  • Hits all 5 default tabs + every server-side folder + L1 grid for each slug
  • force=1 bypasses 15-min disk cache
  • Keeps thumbnails always-fresh

Service-health helper

  • /root/.claude/skills/service-check/ — used to debug systemd services (status, logs, port, memory)

4. Snap freshness (multi-layer)

Six layers keep PNG thumbnails fresh. Skill: snap-warm.

Layer When What
1 Tile click warmSnap() fire-and-forget warm of the destination tab
2 Page mount Folder/tab mounts → fires force=1 warm of own URL
3 Mutation Folder PATCH, note save → re-warms the thumbnail
4 Cron (every 5 min) warm-apples.sh warms every customer workspace's full tab set
5 Visibility-change When user comes back to L1 grid, force=1 fetch + cache-bust
6 Discovery Polling sees new tab/folder → fires immediate warm

Everything routes through /api/snap-image?force=1 → tab-snap :4137. Snap cache TTL: 15 min on disk.


5. Bug ledger + prevention loop

Every bug that takes more than 5 minutes to fix gets:

  1. Entry in /opt/apples-bugs/ledger.md — date / symptom / root cause / fix / prevention
  2. New smoke check appended to /opt/apples-smoke/run.sh (or WONTFIX/ACCEPT justification)

Read the ledger before designing new features — don't repeat solved problems.

Recent entries (as of 2026-05-10):

  • Note duplication on creation (slug-prefix path bug → server normalize + CLAUDE.md rule)
  • Composer 401 (creds, BindPaths, websockets lib)
  • Folder PNG snap not loading (skip flag + auth bypass)
  • Custom tabs invisible on L1 (no fetch endpoint)
  • Agent identifying as Claude (CLAUDE.md template)
  • Privacy + ToS placeholder (force-dynamic + real content)
  • WebSocket 404 (missing python deps)
  • OAuth 404 (missing keys333 on vps2)
  • nginx → callflow connection refused (UFW rule order)
  • Wrong header pattern on Inbox+Calendar (image-bg vs white-card confusion)
  • Consultant chat identifying as Claude (missing identity rules in APPLES_CONTEXT.md)

6. Notification channels

/opt/notify-nik.py "msg" — sends to:

  • Slack — Nik's primary workspace
  • Telegram — bot DM via TELEGRAM_BOT_TOKEN

Used by claude-ping, smoke runner, cred-sync, anything mission-critical. Throttled per-source so we don't spam.


7. Console / Admin observability surfaces

4030 workspace

  • / — L1 grid of every platform tile
  • /apples/clients — client list with last activity
  • /apples/conversations — every chat across customer workspaces (turns count + last_ts)
  • /apples/conversations/<slug> — full transcript per workspace
  • /apples/pageviews — visit aggregates + recent rows
  • /apples/behavior — events count + dwell rows (last 7 days)
  • /askcraig/customers,jobs,messages,stats — alan.db aggregates
  • /oranges/_system — oranges system overview
  • /quiddler/*, /spidergpt/*, /collage/*, /marketing/* — per-platform admin

Per-platform consoles (legacy, internal-facing)

  • oranges.live/console — visitor tracking deep-dive (sortable, filterable)
  • apples.live/console — DEPRECATED, moving to 4030 only
  • Each new platform gets its admin in 4030, not on the public domain

8. What to build for a new platform

Every new user-facing platform gets, day one:

  1. global_visitors schema copied from oranges-platform
  2. /api/visitor/enrich endpoint
  3. Blue/green deploy pair on adjacent ports + nginx upstream
  4. Health endpoint (/healthz returning JSON {ok:true,...})
  5. Smoke test added to /opt/apples-smoke/run.sh (or platform-specific equivalent)
  6. /api/admin/dashboard/<section> JSON endpoint with WORKSPACE_DASHBOARD_TOKEN bearer auth
  7. 4030 dashboard tile wired to the JSON endpoint
  8. claude-ping cron if any LLM calls are involved
  9. Snap cron warmer if it's part of the 4030 grid (new platform = new entries in warm-apples.sh)
  10. Header card matching the white-card pattern (apples-style, FolderPanel reference)

Don't ship without these. Retrofit cost is always > build-it-right cost.


References

  • /opt/apples-bugs/ledger.md — bug history + preventions
  • /opt/apples-smoke/run.sh — current smoke test suite
  • /opt/cred-sync/run.sh — cred sync cron
  • /opt/claude-ping/ping.sh — claude API health probe
  • /opt/tab-snap/warm-apples.sh — snap cache warmer
  • /opt/notify-nik.py — Slack + Telegram alerts
  • /root/.claude/skills/visitor-tracking/SKILL.md — visitor tracking schema (skill)
  • /root/.claude/skills/blue-green-deploys/SKILL.md — zero-downtime deploy pattern
  • /root/.claude/skills/scroll-snap-fix/SKILL.md — auto-refresh dashboards scroll preservation
  • /root/.claude/skills/console-header/SKILL.md — header card patterns (A image-bg-pill, B white-card)
  • /root/.claude/skills/ios-zoom-fix/SKILL.md — 16px input rule for mobile
  • /opt/apples-templates/CLAUDE.md — apples customer agent identity + privacy rules
  • /opt/apples-consultants/APPLES_CONTEXT.md — consultant chat agent identity + privacy rules
.md-prose h1 { font-size: 24px; font-weight: 700; margin: 24px 0 12px; color: #1a1a1a; border-bottom: 1px solid #e5e7eb; padding-bottom: 8px; } .md-prose h2 { font-size: 19px; font-weight: 700; margin: 28px 0 10px; color: #1a1a1a; } .md-prose h3 { font-size: 16px; font-weight: 700; margin: 20px 0 8px; color: #1a1a1a; } .md-prose h4 { font-size: 14px; font-weight: 700; margin: 16px 0 6px; color: #374151; } .md-prose p { margin: 0 0 12px; color: #374151; } .md-prose ul, .md-prose ol { margin: 0 0 12px; padding-left: 22px; } .md-prose li { margin-bottom: 6px; color: #374151; } .md-prose strong { color: #1a1a1a; } .md-prose code { font-family: ui-monospace, SFMono-Regular, Menlo, monospace; font-size: 13px; background: #f3f4f6; padding: 1px 5px; border-radius: 4px; color: #1a1a1a; } .md-prose pre { background: #0b1020; color: #e5e7eb; padding: 12px 14px; border-radius: 8px; overflow-x: auto; margin: 8px 0 14px; font-size: 12.5px; line-height: 1.5; } .md-prose pre code { background: transparent; color: inherit; padding: 0; font-size: inherit; } .md-prose table { border-collapse: collapse; width: 100%; margin: 8px 0 14px; font-size: 13.5px; } .md-prose th { background: #f3f4f6; border-bottom: 2px solid #e5e7eb; padding: 8px 10px; text-align: left; font-weight: 600; color: #1a1a1a; } .md-prose td { padding: 8px 10px; border-bottom: 1px solid #e5e7eb; color: #374151; vertical-align: top; } .md-prose hr { border: none; border-top: 1px solid #e5e7eb; margin: 24px 0; } .md-prose blockquote { border-left: 3px solid #d97706; padding: 6px 14px; margin: 0 0 12px; background: #fef3c7; color: #92400e; border-radius: 4px; font-size: 14px; } .md-prose a { color: #2563eb; text-decoration: underline; }