Compare commits

50 Commits

Author SHA1 Message Date
c46e24f820 Merge branch 'fix/750-health-banner-spa-reinject' — health-banner SPA re-inject guard (ref #750)
First-party WAF-injected health banner: re-attach trigger+banner (and re-inject
styles) when an SPA rebuilds <body>, with a documentElement childList observer +
1.5s interval fallback. Parity with the R3 banner's existing self-heal.
NB: distinct from the x.com R3 kbin-banner nonce-CSP issue (#751).
2026-06-26 19:35:51 +02:00
3e9f6e8461 fix(hub): re-sync health-banner-open class on re-attach + bump 1.4.7 (ref #750) 2026-06-26 19:11:46 +02:00
4315584f79 fix(hub): health-banner SPA re-inject guard — re-attach on body wipe (ref #750) 2026-06-26 19:08:39 +02:00
1a8ed97cfe Merge branch 'feature/749-cookies-cross-site-tracker-detection' — cookies cross-site tracker panel (ref #749)
- toolbox: cookie_xsite_detail aggregation over social_edges (cross-site cookie-id reuse across >=2 first-party sites)
- toolbox: GET /admin/cookie-crosssite endpoint
- cookies dashboard: Trackers cross-site panel consuming the R3 social-graph
2026-06-26 18:50:45 +02:00
5cc97b1aea fix(cookies): coerce pre_consent_hits to int + await loadCrossSite in refresh (ref #749) 2026-06-26 18:27:16 +02:00
1f5c6ed3e3 feat(cookies): cross-site trackers panel from toolbox R3 (ref #749) 2026-06-26 18:24:19 +02:00
2a9350b9df feat(toolbox): GET /admin/cookie-crosssite endpoint (ref #749) 2026-06-26 18:20:28 +02:00
6f65a1936a feat(toolbox): cookie_xsite_detail aggregation over social_edges (ref #749)
Add _xsite_detail_from_conn() and cookie_xsite_detail() to social.py,
detecting (tracker_domain, cookie_id_hash) pairs reused across >=2 distinct
first-party sites. Mirrors aggregate() envelope. 7 tests green.
2026-06-26 18:13:39 +02:00
5c12063ca7 docs(cookies): implementation plan — cross-site tracker detection (ref #749) 2026-06-26 18:04:01 +02:00
11a0bbef66 docs(cookies): design spec — cross-site tracker detection surface (ref #749) 2026-06-26 17:58:36 +02:00
c8fe9bb148 fix(toolbox): clarify #ads labels — Trackers & pubs, bytes marked as estimate (ref #735)
The #ads panel mixes ad + tracker + telemetry blocks, and 'bytes saved' is a flat
~45 KB/block estimate (a blocked request is never downloaded, so real bytes cannot
be measured). Relabel 'Pubs bloquées' → 'Trackers & pubs bloqués' and mark the
byte figure as an estimate (~ + tooltip). Pairs with an operator allowlist update
excluding generic AWS API-gateway hosts (execute-api.*) from the ad classifier.
2026-06-26 17:42:31 +02:00
e87d46f6a7 feat(sbxwaf): inject the real SecuBox health banner (not a custom badge) into first-party HTML (#747)
Per operator intent, the WAF injects the SHARED secubox health-banner.js in its
CDN-injected mode (absolute Hub origin for the asset + metrics APIs via the
window.SECUBOX_* overrides) so the SAME health widget the dashboard shows mounts
on first-party content sites (chess.maegia.tv et al.) — NOT a bespoke badge nor
the toolbox/mitm kbin transparency banner. Skips pages that already ship the
banner; --widget-hosts now includes maegia.tv; --health-banner-origin configures
the Hub. CORS on the metrics API (access-control-allow-origin: *) is already set.
2026-06-26 17:33:53 +02:00
efac8cec16 feat(sbxwaf): inject SecuBox health/visit widget into first-party HTML (#747)
On operator-configured first-party host suffixes (--widget-hosts), the WAF injects
a discreet fixed-corner badge into text/html responses showing the live visit
counter + a protected mark. Decompression-aware (gzip/br/zstd), idempotent, strictly
fail-open (missing </body>, oversize, decode error → original bytes untouched).
Wired into both reverse-proxy ModifyResponse paths (cached + fallback).
2026-06-26 17:26:06 +02:00
9561cb4bdb fix(toolbox): Live metrics read cumulative stats (events table is empty under R3) (ref #744)
The toolbox.db events table was fed by the OLD Python mitmproxy addons; the R3
path is Go sbxmitm → relay → sidecars → cumulative, so that table is empty and the
Live-metrics panel showed all zeros. Fall back to the cumulative per-source totals
(cookies/ja4) when the events table is empty, and derive mitm.connections from the
ja4 handshake count (the cumulative has no 'dpi' key, so the old probe was always 0).
2026-06-26 17:18:46 +02:00
344bb0738d fix(crowdsec): nftables health detects custom secubox_blacklist table (firewall reported OK)
The firewall-bouncer uses a CUSTOM nft table (inet secubox_blacklist), not the
upstream default ip crowdsec / ip6 crowdsec6, so the legacy probe always reported
nftables not OK — propagating a false 'nftables firewall: not OK' to the
security-posture scorecard while the firewall (inet filter, default-drop) was
active. Detect the custom + default names; base nftables_ok on the general
SecuBox firewall being loaded, not the IPv6 anchor.
2026-06-26 17:14:56 +02:00
b54b5383cd fix(waf): show fresh engine data — gate CrowdSec overlay + dashboard tabs (ref #744)
The CrowdSec overlay existed because the OLD Python mitmproxy WAF log was usually
empty; the Go sbxwaf engine now writes a rich threat log, so the overlay was
clobbering the engine's fresh categories and pushing a stale '1h ago' entry onto
the live attack banner. Only overlay when the engine produced nothing. Also move
Tracked Attackers + Visits into dashboard tabs (Menaces / Attaquants / Visites).
2026-06-26 17:09:07 +02:00
23788e304b feat(waf-webui): Visits panel — client type / OS / geo / vhost bars (#747) 2026-06-26 16:59:49 +02:00
3b28f84591 feat(sbxwaf+waf-api): non-attacker visit statistics (client type/OS/geo) (ref #744 #747)
sbxwaf aggregates LEGITIMATE (non-blocked) traffic in memory — total, client-type
(browser/mobile-app/bot/crawler via UA), OS, per-vhost, status bucket, top client
IPs — and flushes a JSON snapshot every 30s (double-caching: hot path only bumps
counters). New --visits-stats flag. The WAF API /visits endpoint reads the snapshot
and geo-maps the top IPs (it holds the GeoIP DB) for the dashboard, no per-request
PII stored. A statusRecorder in the handler tallies every served response and
excludes the WAF-block 403 and unmapped 421.
2026-06-26 16:56:21 +02:00
e5f0d22dc6 fix(waf-api): crowdsec overlay must not crash the warm refresh (ref #744)
_overlay_crowdsec_stats received the already-decorated stats (top_countries as a
list of {country,count} dicts, not a dict) and did sorted(key=lambda x:-x[1]) →
TypeError on a str → aborted _refresh_warm_caches, freezing the warm cache and
zeroing the WAF dashboard's per-IP/vhost/tracked-attacker panels. Accept the
list shape, coerce counts to int, and wrap the overlay call defensively so a
CrowdSec hiccup can never wipe the WAF stats.
2026-06-26 16:18:58 +02:00
c6d6eb5c75 Merge #744: sbxwaf Go WAF engine + shared internal/ core
# Conflicts:
#	packages/secubox-toolbox-ng/cmd/sbxmitm/compress_test.go
#	packages/secubox-toolbox-ng/cmd/sbxmitm/cosmetic_test.go
#	packages/secubox-toolbox-ng/cmd/sbxmitm/gzip.go
#	packages/secubox-toolbox-ng/cmd/sbxmitm/gzip_test.go
2026-06-26 16:02:35 +02:00
2e6cec9b38 fix(waf-api): read sbxwaf threat-log path for the WAF dashboard (ref #744)
The Go sbxwaf engine writes the threat log to the sandboxed leaf dir
/var/log/secubox/waf/waf-threats.log; the dashboard's /stats + /alerts now
resolve that path (env-overridable, legacy-path fallback) so the WAF WebUI
shows real engine data again after the cutover.
2026-06-26 15:55:15 +02:00
b607d7f7d6 docs(waf-ng): package README (WIP) + ignore build artifacts (ref #744) 2026-06-26 15:31:26 +02:00
e5a2c5d287 fix(sbxwaf): final-review wave — vhost cache key, crowdsec LAPI url, body-inspect cap+audit, seccomp, trusted-host skip (ref #744)
Fix 1 (media-cache vhost isolation): key MediaCache.Get/MaybeStore on
"https://"+Host+RequestURI() instead of r.URL.String() (path-only on
server requests). Two vhosts sharing /logo.png no longer collide.
MaybeStore gains explicit cacheURL arg; all callers updated.
Add TestMediaCacheVhostIsolation: store hostA/x.png, assert hostB/x.png
→ MISS.

Fix 2 (CrowdSec self-loop): secubox-waf-ng-worker@.service --crowdsec-url
was http://127.0.0.1:8080 — the nftables DNAT VIP that fans requests back
into the workers themselves. Changed to http://10.100.0.1:8080 (LXC-bridge
LAPI, same as Python addon). Added blocking unit comment + CUTOVER.md §1.2
crowdsec-url self-loop check.

Fix 3 (body-inspect cap + audit):
- maxBodyInspect const → defaultMaxBodyInspect; Server gains maxBodyInspect
  field wired from new --max-body-inspect flag (default 1 MiB, operator-
  tunable).
- When body read returns exactly cap bytes (truncated), emit AUDIT log line
  (action=body-inspect-truncated) to threat log + stderr so truncation is
  operator-visible; request is never blocked on audit.
- Added known_gap_body_payload_after_1mib_prefix parity fixture documenting
  the prefix-bounded inspection gap with honest note.
- Added CUTOVER.md §1.6 "body inspection cap" gate with operator sign-off
  checklist and three mitigation options.

Fix 4 (seccomp/hardening): secubox-waf-ng-worker@.service was missing
SystemCallFilter, SystemCallArchitectures, ProtectKernelTunables/Modules/
Logs, ProtectControlGroups, RestrictNamespaces, LockPersonality,
RestrictSUIDSGID, RestrictRealtime, MemoryDenyWriteExecute, PrivateDevices.
Added full set matching project convention (secubox-mesh.service) as
mandated by spec §6/CSPN.

Fix 5 (trusted-host whitelist): Server gains trustedHosts map + isTrustedHost
method. --waf-skip-hosts flag (default: git.gk2.secubox.in, git.secubox.in,
admin.gk2.secubox.in, 10.100.0.1:9080) mirrors Python check_request whitelist
(secubox_waf.py:761-763). Trusted hosts bypass WAF inspection before rule
matching. Add TestTrustedHostSkipsWAF (with sanity check that untrusted host
is still blocked) and TestIsTrustedHost/TestParseTrustedHosts unit tests.
2026-06-26 15:27:10 +02:00
11438e394c docs(sbxwaf): bench harness + cutover/rollback runbook with parity-gap gates (ref #744)
- scripts/sbxwaf-bench.sh: wrk/hey bench harness against legacy mitmproxy
  (10.100.0.60:8080) and shadow sbxwaf (127.0.0.1:8081); captures req/s,
  p99, RSS; prints comparison table with PASS/FAIL for all 3 go/no-go gates
  (>5× req/s·core, p99<1/3, RSS<1/4); shellcheck clean.

- packages/secubox-waf-ng/docs/CUTOVER.md: operator runbook with 6 sections:
  pre-cutover checklist (CA, CrowdSec JWT, COMPLETE log4shell corpus,
  null-byte \x00 fix, goform FP fix, parity green), shadow-run procedure,
  go/no-go gate table, exact HAProxy server re-point + nftables DNAT topology,
  single-edit rollback, and post-cutover monitoring (threat log, cookie-audit,
  RuntimeDirectoryPreserve guarantee, CrowdSec JWT rotation constraint,
  Python WAF unescaped-Host XSS backport note, body URL-decode limitation).
2026-06-26 15:02:18 +02:00
3f24034c37 test(sbxwaf): make log4shell corpus gap loud + isolate per-fixture ban state (ref #744)
Fix 1 (medium): add known_gap fixture log4shell_jndi_corpus_gap to waf-parity-fixtures.json.
  The Go corpus (secubox-waf/config/waf-rules.json) is missing the log4shell category
  present in the Python corpus (secubox-mitmproxy/data/waf-rules.json); jndi:ldap payloads
  are missed silently by Go. The fixture emits a visible KNOWN GAP t.Logf line in CI so
  the coverage gap is never silent. expect=allow (current Go gap behaviour).

Fix 2 (low): sqli_union_all_select client_ip changed from 45.33.32.156 to 45.33.32.158.
  Two warn-fixtures previously shared IP 45.33.32.156 (sqli_union_all_select +
  scanner_sqlmap_ua), accumulating ban count to 2; a third reuse would silently flip
  verdict to ban. Each warn fixture now uses a distinct IP so per-fixture verdicts are
  independent. Comment added to parity_test.go explaining the shared-Ban lifecycle
  (ban-sequence fixtures are the only intentional cross-fixture accumulation).

Result: 52 fixtures, 52 pass, 3 known-gap lines visible, go build clean.
2026-06-26 14:54:17 +02:00
7814bee861 test(sbxwaf): decision parity harness vs mitmproxy (ref #744)
51-fixture corpus (testdata/waf-parity-fixtures.json) covering:
- allow (6): benign GET/POST, URL-encoded body not decoded by engine
- warn (33): SQLi, XSS, LFI, RCE, scanners, honeypots, CVEs, router botnet,
  URL-encoded path+query attacks (proves unquote_plus decode)
- ban (1): 3-hit sequence for ip 198.51.100.99 reaching threshold
- skip (11): static assets, /health, NC bypass paths, RFC1918 IPs
- known_gap (2): router-goform unanchored ';' FP + RE2 null-byte CVE patterns

TestWAFParity calls the REAL decision path (privateCIDR / staticAsset /
ncBypass / Rules.Match / Ban.Record) — no re-implementation in the test.
Findings during harness construction:
- router-goform FP: shared Python+Go pattern bug (';' in common UAs), NOT
  a parity regression — documented as known_gap
- Log4Shell gap: secubox-waf/config/waf-rules.json (Go corpus) is missing
  the log4shell category present in the Python mitmproxy rules — corpus gap
- Body URL-decoding asymmetry: Rules.Match does not decode body (only
  path+query); documented with two fixture variants
- 5 null-byte RE2 patterns skipped at compile: cve-ast-2022-42706,
  cve-ast-2023-37457, cve-opensips-2023-49323, cve-prosody-2022-0217,
  cve-strophe-2022-29168 — documented as known_gap

go test ./cmd/sbxwaf/ -run TestWAFParity -v: 51 pass, 0 fail
go build ./...: clean
2026-06-26 14:45:13 +02:00
16a4e6e63d fix(packaging): scope sbxwaf sandbox to WAF-owned leaf dirs (ref #744)
M1: move --threat-log default from /var/log/secubox/waf-threats.log to
/var/log/secubox/waf/waf-threats.log (WAF-owned leaf); update ExecStart
and postinst to create the leaf dir (secubox-waf:secubox-waf 0750);
narrow ReadWritePaths from /var/log/secubox to leaf dirs only
(/var/log/secubox/waf /var/log/secubox/cookie-audit).

L1: fix AppArmor profile — /var/log/secubox/ parent entry changed from
rw to r (traverse only); add explicit rw entries for both leaf dirs
(/var/log/secubox/waf/** and /var/log/secubox/cookie-audit/**).
2026-06-26 14:31:35 +02:00
e275f730ec feat(packaging): secubox-waf-ng deb + hardened systemd worker + AppArmor (ref #744)
packages/secubox-waf-ng/:
- debian/control: Architecture: arm64, Standards-Version: 4.6.2, compat 13
- debian/rules: cross-build sbxwaf from secubox-toolbox-ng Go module
  (GOOS=linux GOARCH=arm64 CGO_ENABLED=0 -mod=vendor, execute_after_dh_auto_install)
- systemd/secubox-waf-ng-worker@.service: User=secubox-waf,
  RuntimeDirectory=secubox + RuntimeDirectoryPreserve=yes (#741 socket-wipe fix),
  NoNewPrivileges, ProtectSystem=strict, ProtectHome, PrivateTmp,
  CapabilityBoundingSet= (drop all), RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX,
  MemoryMax=256M, listens on 127.0.0.1:808%i (instances 1+2)
- debian/secubox-waf-ng.apparmor: enforce profile for /usr/sbin/sbxwaf
  (rw log/cache/run, r config/secrets, deny-all else)
- debian/postinst: adduser secubox-waf --system --group, leaf dirs only
  (NEVER chmod shared parents /etc/secubox /var/log/secubox /var/cache/secubox
  to 0750 — traversal constraint from #511/#620), aa-enforce, systemctl enable+start @1+2
- debian/prerm: stop+disable @1+2 workers

Build: dpkg-buildpackage -a arm64 -us -uc -b -d produced
secubox-waf-ng_1.0.0-1~bookworm1_arm64.deb (2.0M, stripped arm64 binary 5.5M)
containing /usr/sbin/sbxwaf, /lib/systemd/system/secubox-waf-ng-worker@.service,
/etc/apparmor.d/usr.sbin.sbxwaf.
2026-06-26 14:25:01 +02:00
49edf6670a fix(sbxwaf): HTML-escape Host in error pages — prevent reflected XSS (ref #744)
errorPage(code, host) was substituting r.Host verbatim into the 502/504
templates, allowing an attacker to inject arbitrary HTML via a crafted
Host header. Apply html.EscapeString before substitution.

Add TestErrorPageEscapesHost (asserts raw payload absent + escaped form
present) and TestErrorPageSubstitutesHostNormal (safe hosts unchanged).
2026-06-26 14:15:57 +02:00
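The escaping fix described in this commit can be illustrated with a minimal Python sketch (the Go code uses html.EscapeString; the template string and the `error_page` name below are hypothetical stand-ins, not the project's actual template or function):

```python
import html

# Hypothetical template; the real pages are the embedded error-502/504 HTML files.
ERROR_TEMPLATE = "<h1>502 Bad Gateway</h1><p>Upstream for {host} is unreachable.</p>"


def error_page(template: str, host: str) -> str:
    """Substitute the Host header into the page after HTML-escaping it."""
    # Escape first, so a crafted Host header cannot inject markup.
    return template.replace("{host}", html.escape(host, quote=True))


crafted = '"><script>alert(1)</script>'
page = error_page(ERROR_TEMPLATE, crafted)
```

A safe host such as `chess.maegia.tv` passes through unchanged, since it contains no characters that need escaping.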
ae930c0347 feat(sbxwaf): synthetic themed error pages on upstream failure (ref #744)
Port the Python secubox_waf.py error() hook (~line 1096) to Go:
- 502 Bad Gateway    — connection refused/dial failure → error-502.html
- 503 Svc Unavail    — all other errors               → error-503.html
- 504 Gateway Timeout — net.Error.Timeout()           → error-504.html

Templates embedded via //go:embed (templates/*.html), verbatim copies
from Python ERROR_502_PAGE / ERROR_503_PAGE; error-504.html is 502 with
"502"→"504" and "Bad Gateway"→"Gateway Timeout" (mirrors Python in-place
replace). {host} and {time} placeholders substituted at request time.

upstreamErrorCode() helper maps net.Error → 502/503/504. ErrorHandler wired
in both the fallback proxy path (main.go) and cached proxy path (routes.go).

TDD: 5 new tests (errorPage substitution, all codes, unknown-code fallback,
themed 502 on dead backend, 504 on timeout). Full suite green (2.2s).
2026-06-26 14:09:51 +02:00
c3940a2958 fix(sbxwaf): media-cache Flusher passthrough + oversize handler test (ref #744)
- Fix I2: implement http.Flusher on cachingResponseWriter so
  httputil.ReverseProxy can flush chunks incrementally to the client
  (progressive video / PeerTube streaming); pure pass-through, does not
  affect cache buffer capture.
- Fix M1: add TestMediaCacheHandlerOversizeStreamsFullBody — end-to-end
  regression guard that a >16 MiB video/mp4 response streams its FULL
  body (byte count + SHA-256 checksum) to the client and is NOT cached;
  passes against current code, race-clean.
- Fix M2: document the mtime-as-atime LRU choice at the loadIndex site
  so a future reader understands why ModTime() is used instead of atime
  (relatime suppresses atime on most Linux filesystems; mtime is set
  explicitly via os.Chtimes on every Get hit).
2026-06-26 14:04:15 +02:00
f1573c37d2 feat(sbxwaf): media-cache (16MB/obj, 2GB total) (ref #744)
Add disk-backed response media cache ported from media_cache.py:
- GET image/video/audio/font/css/js responses cached under sha256(url)
- On-disk layout: <dir>/<key[:2]>/<key> + <key>.m sidecar (body+meta)
- LRU eviction by atime when total exceeds 2 GiB cap
- TTL from max-age (or 1h default); nowFn seam for deterministic tests
- Cache hit short-circuits upstream; miss captures via cachingResponseWriter
- Oversize (>16 MiB) bodies not stored but streamed fully to client
- --media-cache-dir flag (default /var/cache/secubox/waf/media; empty = off)
- 11 TDD tests: store/get, non-media reject, oversize reject, expiry,
  handler hit, no-store skip, stats, eviction, non-GET, persistence, miss-stores
2026-06-26 13:56:57 +02:00
85b508d4f2 fix(sbxwaf): atomic cookie-audit drop counter (race-free) (ref #744) 2026-06-26 13:50:50 +02:00
f06cb2dc28 feat(sbxwaf): RGPD cookie-audit JSONL ledger (ref #744)
Add cmd/sbxwaf/cookieaudit.go: CookieAudit struct with buffered async
channel (256), single writer goroutine, non-blocking Record (drop-on-full),
SHA256-hashed values, raw Set-Cookie parsed directly for SameSite support.
Wire --cookie-audit-log flag into Server and Routes ModifyResponse.
2026-06-26 13:47:23 +02:00
a85668f39d feat(sbxwaf): CrowdSec LAPI alert bridge on ban (ref #744)
- Add cmd/sbxwaf/crowdsec.go: CrowdSecClient satisfying CrowdSecReporter;
  POSTs a LAPI /v1/alerts JSON array (ported from secubox_waf.py
  _ban_via_crowdsec) with 2s timeout, no redirect following (SSRF hygiene),
  best-effort error handling (log + return, never block/panic).
- Add cmd/sbxwaf/crowdsec_test.go: TDD — TestCrowdSecAlertPayload (httptest
  capture of POST /v1/alerts, bearer token, payload fields) +
  TestCrowdSecBestEffortOnError (dead URL, no panic).
- Wire flags in main(): --crowdsec-url, --crowdsec-jwt-file (secret read from
  file, not argv), --crowdsec-ban-duration (default 4h); wires srv.crowdsec
  when both url+jwt-file are set; leaves nil otherwise (bridge disabled).
2026-06-26 13:39:22 +02:00
64258b98d8 feat(sbxwaf): graduated WARNING/BAN responses + threat log (ref #744)
Task 3.2: wire ban.Record into the handler for graduated 403 responses:
- count < threshold → writeWarning (403, cyberpunk WARNING_PAGE port, X-SecuBox-WAF: warning)
- count >= threshold → writeBan (403, ban page, X-SecuBox-WAF: banned)
- ThreatLog appends one NDJSON line per hit to /var/log/secubox/waf-threats.log
  (O_APPEND|O_CREATE, 0640, best-effort — never crashes the request path)
- Server gains ban *Ban + threatLog *ThreatLog + crowdsec CrowdSecReporter (nil seam for Task 4.1)
- main() wires NewBan(300s,3) + NewThreatLog(--threat-log) at startup
- CrowdSecReporter interface + go crowdsec.Report() call ready for Task 4.1 to slot in
2026-06-26 13:34:02 +02:00
8ea14e660a feat(sbxwaf): sliding-window graduated ban (ref #744)
Adds cmd/sbxwaf/ban.go: Ban struct with NewBan(window, threshold) and
Record(ip, nowUnix) that prunes stale hits, counts within the window,
and returns (count, banned) — mirrors Python BAN_THRESHOLD=3/BAN_WINDOW=300s.
Map capped at 100k IPs to bound memory under flood. Tests: TDD pass 3/3.
2026-06-26 13:27:09 +02:00
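The sliding-window logic described in this commit can be sketched in Python as follows. This is an illustration under assumptions, not the Go implementation: the class name `SlidingBan`, the prune boundary (`>=`), and the overflow policy (new IPs are simply not tracked once the cap is reached) are choices made here.

```python
from collections import deque


class SlidingBan:
    """Count WAF hits per IP inside a time window; ban at the threshold."""

    def __init__(self, window: int = 300, threshold: int = 3, max_ips: int = 100_000):
        self.window = window
        self.threshold = threshold
        self.max_ips = max_ips
        self.hits: dict[str, deque] = {}

    def record(self, ip: str, now: int) -> tuple[int, bool]:
        """Register one hit at time `now` (unix seconds); return (count, banned)."""
        q = self.hits.get(ip)
        if q is None:
            if len(self.hits) >= self.max_ips:
                # Assumed overflow policy: do not track new IPs when the map is full.
                return 1, self.threshold <= 1
            q = self.hits[ip] = deque()
        # Prune hits that have fallen out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        q.append(now)
        return len(q), len(q) >= self.threshold


ban = SlidingBan(window=300, threshold=3)
results = [ban.record("198.51.100.99", t) for t in (0, 10, 20, 400)]
```

With the defaults above, three hits inside 300 seconds reach the threshold, and a hit arriving after the window has passed starts the count again.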
4334f93edc fix(sbxwaf): forward full request body intact, cap only inspection (ref #744)
Finding 1 (data corruption): replace LimitReader-restore pattern with a
streaming MultiReader approach: read up to maxBodyInspect (1 MiB) into a
prefix buffer for WAF inspection, then restore r.Body as
io.MultiReader(bytes.NewReader(prefix), r.Body) so the upstream proxy
receives every byte intact. Large uploads (PeerTube / Nextcloud) no longer
get truncated at 1 MiB.

Finding 2 (dead code): remove healthPath() from inspect.go — it was never
called; its logic is fully covered by staticAsset().

Tests added:
- TestInspectLargeBodyForwardedIntact: POST 1 MiB + 4 KiB → backend receives
  full body byte-for-byte (regression test for the truncation bug).
- TestInspectLargeBodyAttackInFirstMiB: attack in first 1 MiB of large body
  is still blocked (streaming inspection still works).
2026-06-26 13:23:42 +02:00
02b1c7a461 feat(sbxwaf): request inspection + CIDR/static/NC skip-lists (ref #744)
Wire Rules.Match into the HTTP handler (Task 2.2):
- inspect.go: privateCIDR (RFC1918+loopback), staticAsset (13 exts + health),
  ncBypass (/index.php/login/v2/ + /ocs/v2.php/core/login), clientIP (XFF
  trusted only when peer ∈ TRUSTED_PROXIES), maxBodyInspect=1MiB constant.
- main.go: Server.rules *Rules field; handler reads body capped at 1MiB,
  restores via io.NopCloser(bytes.NewReader) before proxying; WAF hit → 403;
  Connection: close added to upstream requests (#496).
- main() wires LoadRules(*rules) when --rules flag is provided.

Ports faithfully from secubox_waf.py: _is_whitelisted/_WL_NETS (lines 28-47),
get_real_client_ip (lines 193-219), check_request fast-path (lines 764-769).

Tests: 5 new (BlocksAttack, PrivateIPBypass, StaticAssetSkip, NCBypass,
BodyForwarded) + 14 existing = 19/19 PASS.
2026-06-26 13:18:56 +02:00
efb390b713 feat(sbxwaf): regex WAF rule engine from waf-rules.json (ref #744) 2026-06-26 13:09:23 +02:00
f2bdef341c fix(sbxwaf): inject shared transport into all route proxies (ref #744)
- LoadRoutes(path, transport http.RoundTripper) — transport now required at
  load time; nil falls back to http.DefaultTransport gracefully
- buildEntries: removes the r.transport != nil guard — transport is always
  set at Routes construction, never post-hoc
- Server gains a transport http.RoundTripper field; main() constructs the
  tuned *http.Transport (dial timeout + pool settings) BEFORE LoadRoutes so
  startup-built proxies share the same pool as reload-built ones
- handler() uses s.transport when available; falls back to a local transport
  only for test Servers that don't inject one (backwards-compat)
- main(): removed the post-hoc s.routes.transport = transport assignment
- routes_test.go: adds TestRoutesInjectedTransportUsed — sentinel transport
  proves startup-built proxies use the injected transport, not DefaultTransport
- Existing TestRoutesLookup / TestRoutesLookupCaseAndPort / TestRoutesHotReload
  updated to pass nil transport (all still pass)
2026-06-26 13:03:25 +02:00
bd6b7c3ebf feat(sbxwaf): haproxy-routes.json loader + hot-reload + cached reverse-proxy (ref #744)
- New routes.go: Routes struct with RW-locked map, LoadRoutes() parses
  haproxy-routes.json ({"host": ["ip", port]}), skips malformed entries
  without panicking, Lookup() lowercases+strips-port before probe.

- Hot-reload via internal/reload.Watcher (throttle=0, mirrors policy.go):
  Target.Load re-parses file, Target.Apply atomically swaps entries map
  under mu.Lock. Routes.Maybe() called once per request in handler().

- Perf: *httputil.ReverseProxy built once per ip:port backend at load/reload
  time (sync.Map proxyCache keyed by "ip:port"), never per-request. Shared
  *http.Transport injected from Server.handler() so all backends share one
  connection pool.

- main.go: --routes flag now calls LoadRoutes, sets srv.routes + srv.routeLookup;
  handler uses ProxyFor() for cached proxy, falls back to one-off only when
  routes is nil (test injection path).

- Tests: 5/5 PASS — Lookup, case/port normalisation, hot-reload with
  os.Chtimes bump + Maybe() trigger.
2026-06-26 12:50:47 +02:00
d747b705ce feat(sbxwaf): reverse-proxy skeleton + listener (ref #744)
- cmd/sbxwaf/main.go: Server struct with routeLookup func field, handler()
  reverse-proxying via httputil.ReverseProxy, X-SecuBox-WAF: inspected stamp,
  421 for unmapped hosts, flags (--listen, --ca-cert, --ca-key, --routes,
  --rules, --upstream-timeout), lazy CA load via forge.LoadCA.
- cmd/sbxwaf/main_test.go: TestProxyPassthrough (200+body+header) and
  TestProxyUnmapped (421) — both green; go build ./... clean.
2026-06-26 12:41:51 +02:00
dacafcfdee refactor(toolbox-ng): extract internal/reload (ref #744)
Extract the mtime hot-reload pattern from cmd/sbxmitm/policy.go into a
generic, Policy-agnostic internal/reload package (Target/Watcher/StatMtime/
LoadLines). policy.go rewired to use reload.Watcher; private reloadTarget
struct, maybeReload body, statMtime, scanLines/loadLines/loadLinesRaw removed.
Policy retains its own throttle gate (reloadThrottle/reloadMu) so existing
reload_test.go field mutations compile unchanged; the Watcher runs with
throttle=0 and is gated by Policy.maybeReload().

7 new tests in internal/reload (basic, throttle, stat, strip-comments,
missing-file, multi-target, concurrent/race). All parity fixtures + reload
tests green: go test ./internal/reload/ ./cmd/sbxmitm/ -count=1 -race PASS.
2026-06-26 12:36:50 +02:00
e47cd115fd refactor(toolbox-ng): extract internal/relay (ref #744)
Move emit/emitSync/emitTimeout from cmd/sbxmitm/sidecar.go into the new
package internal/relay as Emit/EmitSync/EmitTimeout, so cmd/sbxwaf can
reuse the fire-and-forget unix-socket POST transport without duplication.

- internal/relay/relay.go: Emit (detached goroutine), EmitSync (2s timeout,
  synchronous), EmitTimeout const — pure stdlib, no new go.sum entries
- internal/relay/relay_test.go: TDD tests — unix echo server, asserts
  request-line "POST /ingest HTTP/1.1" + exact body
- cmd/sbxmitm/relay.go: relayEmit now calls relay.Emit
- cmd/sbxmitm/sidecar.go: declarations removed, retained as doc/comment file
- cmd/sbxmitm/sidecar_test.go: rewired to relay.EmitSync/relay.Emit/relay.EmitTimeout

go test ./internal/relay/ ./cmd/sbxmitm/ -count=1: PASS (both packages)
go build ./...: clean
2026-06-26 12:28:29 +02:00
18e625fd88 refactor(toolbox-ng): extract internal/httpcodec (ref #744)
Move gzip/br/zstd codec primitives from cmd/sbxmitm/gzip.go into a new
shared package internal/httpcodec (GunzipBytes, GzipBytes, UnbrotliBytes,
BrotliBytes, UnzstdBytes, ZstdBytes + Decode/Encode dispatchers).
Rewire cmd/sbxmitm/gzip.go and all three test files to call httpcodec.*.
No behaviour change; cmd/sbxmitm still builds and all tests pass.
2026-06-26 12:21:40 +02:00
8e1f8f2155 refactor(toolbox-ng): extract internal/forge from sbxmitm (ref #744)
Move CA, loadCA→LoadCA, forge→Forge, firstPEMBlock, parseKey from
cmd/sbxmitm/main.go into a new shared package internal/forge so that
the future cmd/sbxwaf can reuse them without duplication.

No behaviour change: cmd/sbxmitm wires in forge.LoadCA / ca.Forge;
ca.cert (unexported) becomes forge.CA.Cert (exported, needed by tests
and future callers). Both suites green:
  go test ./internal/forge/ ./cmd/sbxmitm/ -count=1
2026-06-26 12:12:50 +02:00
ccf6d45a08 docs(plan): WAF→Go sbxwaf implementation plan, 10 phases TDD (ref #744) 2026-06-26 12:04:25 +02:00
6ec92bd29d docs(spec): WAF→Go bench targets to >5×/p99<⅓, blocking go/no-go (ref #744) 2026-06-26 11:52:44 +02:00
0b2094f43f docs(spec): WAF→Go sbxwaf host-native replacement design (ref #744)
Brainstorming-validated design: perf-driven complete replacement of the WAF
mitmproxy/mitmdump inspection layer by a dedicated host-native Go binary sbxwaf,
sharing an extracted core with sbxmitm. Covers architecture, component isolation,
full feature port (routing/rules/ban/CrowdSec/cookie-audit/media-cache/error
pages), host-native hardening, and shadow→parity→cutover→rollback migration.
2026-06-26 09:31:59 +02:00
75 changed files with 12377 additions and 542 deletions

@ -0,0 +1,481 @@
# Cookies cross-site tracker detection — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Surface the already-computed R3 cross-site tracker correlation (`social_edges`) to the operator as a detailed view in the secubox-cookies dashboard.
**Architecture:** A read-only aggregation function in the toolbox (`social.py`, next to `aggregate()`) folds `social_edges` into per-tracker cross-site detail; a toolbox endpoint `GET /admin/cookie-crosssite` exposes it (mirrors `/admin/social-aggregate`); the cookies dashboard adds a "Trackers cross-site" card whose JS fetches that endpoint directly (operator browser carries the JWT). No new service, no new dependency.
**Tech Stack:** Python 3.11 / FastAPI / sqlite3 (toolbox), vanilla HTML/JS (cookies dashboard), pytest.
## Global Constraints
- New Python files carry the SPDX header: `# SPDX-License-Identifier: LicenseRef-CMSD-1.0` + the CyberMind copyright block (copy from any sibling file in the module).
- Read-only over `social_edges`. No writes, no migration. Filter out `src_site IN ('', 'null')` at read time.
- Reuse `social._conn()`, `social._registrable_domain()`, `social._is_ip()` — do NOT reimplement.
- The new endpoint mirrors `admin_social_aggregate` exactly: no explicit `Depends` (admin gating is handled at the same layer as its siblings).
- Frontend fetch uses the existing `headers()` helper (Bearer `sbx_token`) and targets the absolute toolbox path `/api/v1/toolbox/admin/cookie-crosssite` (NOT the cookies `API` base).
- Commit messages reference `(ref #749)`. No Claude Code references / footers in commits.
---
### Task 1: Toolbox cross-site aggregation in `social.py`
**Files:**
- Modify: `packages/secubox-toolbox/secubox_toolbox/social.py` (add two functions next to `aggregate()` ~line 1025)
- Test: `packages/secubox-toolbox/tests/test_cookie_xsite_detail.py` (create)
**Interfaces:**
- Consumes: `social._conn()`, `social._registrable_domain(host)`, `social._is_ip(host)` (existing).
- Produces:
  - `_xsite_detail_from_conn(conn, since: int, top_n: int) -> list[dict]` — pure, over a conn. Each dict: `{tracker_domain:str, sites:list[str], site_count:int, client_count:int, cookie_count:int, pre_consent_hits:int, last_seen:int}`.
  - `cookie_xsite_detail(hours: int = 24, top_n: int = 50) -> dict` — envelope `{window_hours:int, generated_at:int, trackers:list[dict]}`.
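As an orientation aid, here is a minimal sketch of an aggregation with the shape listed above. It is an assumption-laden illustration, not the plan's implementation: `xsite_detail_sketch` and `envelope_sketch` are hypothetical names, `_is_ip` and `_registrable_domain` are naive stand-ins for the existing `social` helpers (the real ones are to be reused, not reimplemented), and the sort order is a guess.

```python
import ipaddress
import sqlite3
import time
from collections import defaultdict


def _is_ip(host):
    """Stand-in for social._is_ip (assumed: True for IP literals)."""
    try:
        ipaddress.ip_address(host.strip("[]"))
        return True
    except ValueError:
        return False


def _registrable_domain(host):
    """Naive stand-in for social._registrable_domain: last two labels only."""
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def xsite_detail_sketch(conn, since, top_n):
    """Fold social_edges into per-tracker cross-site detail (read-only)."""
    rows = conn.execute(
        "SELECT ts, client_mac_hash, src_site, tracker_domain, "
        "cookie_id_hash, consent_state FROM social_edges "
        "WHERE ts >= ? AND src_site NOT IN ('', 'null')",
        (since,),
    ).fetchall()

    # Group edges by (tracker, cookie id); IP-literal trackers are dropped.
    pairs = defaultdict(list)
    for ts, client, site, tracker, cid, consent in rows:
        if not tracker or not cid or _is_ip(tracker):
            continue
        pairs[(_registrable_domain(tracker), cid)].append((ts, client, site, consent))

    # Keep only cookie ids seen on >= 2 distinct sites, then merge per tracker.
    merged = {}
    for (domain, _cid), edges in pairs.items():
        if len({site for _, _, site, _ in edges}) < 2:
            continue
        t = merged.setdefault(domain, {"sites": set(), "clients": set(),
                                       "cookies": 0, "pre": 0, "last": 0})
        t["cookies"] += 1
        for ts, client, site, consent in edges:
            t["sites"].add(site)
            t["clients"].add(client)
            t["pre"] += int(consent == "pre_consent")
            t["last"] = max(t["last"], int(ts))

    out = [
        {"tracker_domain": d, "sites": sorted(t["sites"]),
         "site_count": len(t["sites"]), "client_count": len(t["clients"]),
         "cookie_count": t["cookies"], "pre_consent_hits": t["pre"],
         "last_seen": t["last"]}
        for d, t in merged.items()
    ]
    # Assumed ordering: widest cross-site reach first, then by name.
    out.sort(key=lambda r: (-r["site_count"], r["tracker_domain"]))
    return out[:top_n]


def envelope_sketch(conn, hours=24, top_n=50, now=None):
    """Wrap the detail rows in the {window_hours, generated_at, trackers} envelope."""
    now = int(time.time()) if now is None else now
    return {"window_hours": hours, "generated_at": now,
            "trackers": xsite_detail_sketch(conn, now - hours * 3600, top_n)}
```

The grouping happens in Python rather than in SQL so that the registrable-domain fold and the IP-literal check can be applied per row. The sketch takes an explicit `conn` and an injectable `now` to keep it testable against an in-memory database.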
- [ ] **Step 1: Write the failing test**
Create `packages/secubox-toolbox/tests/test_cookie_xsite_detail.py`:
```python
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
"""Tests for social.cookie_xsite_detail / _xsite_detail_from_conn (ref #749)."""
import sqlite3

from secubox_toolbox import social


def _edges_db():
    c = sqlite3.connect(":memory:")
    c.row_factory = sqlite3.Row
    c.executescript("""
        CREATE TABLE social_edges (
          ts INTEGER, client_mac_hash TEXT, src_site TEXT,
          tracker_domain TEXT, cookie_id_hash TEXT, ja4_hash TEXT,
          consent_state TEXT DEFAULT 'none_seen');
    """)
    return c


def _add(c, ts, client, site, tracker, cid, consent="pre_consent"):
    c.execute("INSERT INTO social_edges(ts,client_mac_hash,src_site,"
              "tracker_domain,cookie_id_hash,ja4_hash,consent_state) "
              "VALUES (?,?,?,?,?,'ja4',?)",
              (ts, client, site, tracker, cid, consent))


def test_crosssite_tracker_detected_with_detail():
    c = _edges_db()
    # same cookie id reused across 2 distinct sites -> cross-site
    _add(c, 100, "m1", "news.example", "www.criteo.com", "CID1")
    _add(c, 200, "m2", "shop.example2", "www.criteo.com", "CID1", consent="post_consent")
    c.commit()
    rows = social._xsite_detail_from_conn(c, since=0, top_n=50)
    assert len(rows) == 1
    t = rows[0]
    assert t["tracker_domain"] == "criteo.com"
    assert t["site_count"] == 2
    assert sorted(t["sites"]) == ["news.example", "shop.example2"]
    assert t["client_count"] == 2
    assert t["cookie_count"] == 1
    assert t["pre_consent_hits"] == 1
    assert t["last_seen"] == 200


def test_single_site_cookie_ignored():
    c = _edges_db()
    _add(c, 100, "m1", "news.example", "tracker.foo", "CID2")
    _add(c, 110, "m1", "news.example", "tracker.foo", "CID2")
    c.commit()
    assert social._xsite_detail_from_conn(c, since=0, top_n=50) == []


def test_null_and_empty_src_site_excluded():
    c = _edges_db()
    _add(c, 100, "m1", "null", "t.bar", "CID3")
    _add(c, 110, "m1", "", "t.bar", "CID3")
    _add(c, 120, "m1", "real.site", "t.bar", "CID3")
    c.commit()
    # only one VALID site remains for CID3 -> not cross-site
    assert social._xsite_detail_from_conn(c, since=0, top_n=50) == []


def test_window_filters_old_edges():
    c = _edges_db()
    _add(c, 100, "m1", "a.example", "t.win", "CIDW")
    _add(c, 200, "m1", "b.example2", "t.win", "CIDW")
    c.commit()
    assert social._xsite_detail_from_conn(c, since=150, top_n=50) == []


def test_ip_literal_tracker_dropped():
c = _edges_db()
_add(c, 100, "m1", "a.example", "192.0.2.5", "CIDIP")
_add(c, 200, "m1", "b.example2", "192.0.2.5", "CIDIP")
c.commit()
assert social._xsite_detail_from_conn(c, since=0, top_n=50) == []
def test_ranking_and_top_n_cap():
c = _edges_db()
# tracker A: 2 clients ; tracker B: 1 client -> A ranks first
_add(c, 100, "m1", "s1.x", "a.trk", "A1"); _add(c, 110, "m2", "s2.x", "a.trk", "A1")
_add(c, 120, "m1", "s1.x", "b.trk", "B1"); _add(c, 130, "m1", "s2.x", "b.trk", "B1")
c.commit()
rows = social._xsite_detail_from_conn(c, since=0, top_n=1)
assert len(rows) == 1
assert rows[0]["tracker_domain"] == "trk" # registrable of a.trk/b.trk
def test_envelope_shape_via_conn(monkeypatch):
c = _edges_db()
_add(c, 100, "m1", "news.example", "www.criteo.com", "CID1")
_add(c, 200, "m2", "shop.example2", "www.criteo.com", "CID1")
c.commit()
class _Ctx:
def __enter__(self): return c
def __exit__(self, *a): return False
monkeypatch.setattr(social, "_conn", lambda: _Ctx())
out = social.cookie_xsite_detail(hours=24, top_n=50)
assert out["window_hours"] == 24
assert isinstance(out["generated_at"], int)
assert out["trackers"][0]["tracker_domain"] == "criteo.com"
```
- [ ] **Step 2: Run the test to verify it fails**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_cookie_xsite_detail.py -v`
Expected: FAIL — `AttributeError: module 'secubox_toolbox.social' has no attribute '_xsite_detail_from_conn'`
- [ ] **Step 3: Implement the two functions**
In `packages/secubox-toolbox/secubox_toolbox/social.py`, immediately AFTER the `aggregate()` function, add:
```python
def _xsite_detail_from_conn(conn, since: int, top_n: int) -> list:
    """Pure cross-site tracker detail over a social_edges connection.

    A (tracker_domain, cookie_id_hash) pair is cross-site when its cookie id is
    observed on >= 2 DISTINCT valid src_sites (src_site not in '', 'null') within
    the window (ts >= since). For every such pair, aggregate per REGISTRABLE
    tracker domain (IP literals dropped). Ranked by client_count, then
    site_count, then domain; capped to top_n.
    """
    rows = conn.execute(
        "SELECT ts, client_mac_hash, src_site, tracker_domain, "
        "       cookie_id_hash, consent_state "
        "FROM social_edges "
        "WHERE ts >= ? "
        "  AND cookie_id_hash IS NOT NULL AND cookie_id_hash <> '' "
        "  AND src_site NOT IN ('', 'null') "
        "LIMIT 50000",
        (since,),
    ).fetchall()

    # Pass 1: which (raw tracker_domain, cookie_id_hash) pairs are cross-site.
    sites_per_pair: dict = {}
    for r in rows:
        key = (r["tracker_domain"], r["cookie_id_hash"])
        sites_per_pair.setdefault(key, set()).add(r["src_site"])
    xsite_pairs = {k for k, s in sites_per_pair.items() if len(s) >= 2}
    if not xsite_pairs:
        return []

    # Pass 2: aggregate the cross-site rows per registrable tracker domain.
    agg: dict = {}
    for r in rows:
        if (r["tracker_domain"], r["cookie_id_hash"]) not in xsite_pairs:
            continue
        dom = _registrable_domain(r["tracker_domain"])
        if not dom or _is_ip(dom):
            continue
        e = agg.setdefault(dom, {
            "tracker_domain": dom, "sites": set(), "clients": set(),
            "cookies": set(), "pre_consent_hits": 0, "last_seen": 0,
        })
        e["sites"].add(r["src_site"])
        e["clients"].add(r["client_mac_hash"])
        e["cookies"].add(r["cookie_id_hash"])
        if r["consent_state"] == "pre_consent":
            e["pre_consent_hits"] += 1
        if r["ts"] > e["last_seen"]:
            e["last_seen"] = r["ts"]

    out = [{
        "tracker_domain": e["tracker_domain"],
        "sites": sorted(e["sites"]),
        "site_count": len(e["sites"]),
        "client_count": len(e["clients"]),
        "cookie_count": len(e["cookies"]),
        "pre_consent_hits": e["pre_consent_hits"],
        "last_seen": e["last_seen"],
    } for e in agg.values()]
    out.sort(key=lambda t: (-t["client_count"], -t["site_count"],
                            t["tracker_domain"]))
    return out[:max(0, top_n)]


def cookie_xsite_detail(hours: int = 24, top_n: int = 50) -> Dict:
    """Operator view of cross-site tracker cookies over social_edges.

    Mirrors aggregate()'s envelope shape. JWT-gated in the API layer.
    """
    if hours < 1 or hours > 24 * 31:
        hours = 24
    if top_n < 1 or top_n > 500:
        top_n = 50
    now = int(time.time())
    since = now - hours * 3600
    out: Dict = {"window_hours": hours, "generated_at": now, "trackers": []}
    try:
        with _conn() as c:
            out["trackers"] = _xsite_detail_from_conn(c, since, top_n)
    except sqlite3.Error as e:
        log.warning("cookie_xsite_detail: DB error, returning empty: %s", e)
    return out
```
Note: confirm `time`, `sqlite3`, `log`, and the `Dict` typing alias are already imported at the top of `social.py` (they are — `aggregate()` uses `time` and `Dict`). If `log` is named differently in this module, match the existing logger name used elsewhere in `social.py`.
- [ ] **Step 4: Run the test to verify it passes**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_cookie_xsite_detail.py -v`
Expected: PASS (7 tests)
- [ ] **Step 5: Commit**
```bash
git add packages/secubox-toolbox/secubox_toolbox/social.py packages/secubox-toolbox/tests/test_cookie_xsite_detail.py
git commit -m "feat(toolbox): cookie_xsite_detail aggregation over social_edges (ref #749)"
```
---
### Task 2: Toolbox endpoint `GET /admin/cookie-crosssite`
**Files:**
- Modify: `packages/secubox-toolbox/secubox_toolbox/api.py` (add endpoint next to `admin_social_aggregate`)
- Test: `packages/secubox-toolbox/tests/test_cookie_crosssite_api.py` (create)
**Interfaces:**
- Consumes: `social.cookie_xsite_detail(hours, top_n)` from Task 1.
- Produces: `admin_cookie_crosssite(hours: int = 24, top: int = 50) -> dict` — returns the envelope from `cookie_xsite_detail`.
- [ ] **Step 1: Write the failing test**
Create `packages/secubox-toolbox/tests/test_cookie_crosssite_api.py`:
```python
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
"""Tests for GET /admin/cookie-crosssite (ref #749)."""
import asyncio

from secubox_toolbox import api, social

_CANNED = {
    "window_hours": 24,
    "generated_at": 1782000000,
    "trackers": [{
        "tracker_domain": "criteo.com", "sites": ["a.example", "b.example2"],
        "site_count": 2, "client_count": 3, "cookie_count": 1,
        "pre_consent_hits": 2, "last_seen": 1782000000,
    }],
}


def test_cookie_crosssite_returns_detail(monkeypatch):
    monkeypatch.setattr(social, "cookie_xsite_detail",
                        lambda hours=24, top_n=50, **kw: dict(_CANNED))
    result = asyncio.run(api.admin_cookie_crosssite(hours=24, top=50))
    assert result["trackers"][0]["tracker_domain"] == "criteo.com"
    assert result["trackers"][0]["site_count"] == 2
    assert result["window_hours"] == 24


def test_cookie_crosssite_forwards_params(monkeypatch):
    captured = {}

    def fake(hours=24, top_n=50, **kw):
        captured["hours"] = hours
        captured["top_n"] = top_n
        return dict(_CANNED)

    monkeypatch.setattr(social, "cookie_xsite_detail", fake)
    asyncio.run(api.admin_cookie_crosssite(hours=12, top=10))
    assert captured == {"hours": 12, "top_n": 10}
```
- [ ] **Step 2: Run the test to verify it fails**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_cookie_crosssite_api.py -v`
Expected: FAIL — `AttributeError: module 'secubox_toolbox.api' has no attribute 'admin_cookie_crosssite'`
- [ ] **Step 3: Implement the endpoint**
In `packages/secubox-toolbox/secubox_toolbox/api.py`, immediately AFTER the `admin_social_aggregate` function (~line 2870), add:
```python
@router.get("/admin/cookie-crosssite")
async def admin_cookie_crosssite(hours: int = 24, top: int = 50) -> dict:
    """Operator view : cross-site tracker cookies (a cookie id reused across
    >= 2 first-party sites) with per-tracker site/client/cookie counts. Read-only
    over social_edges; same admin gating as the sibling /admin/* routes.
    """
    from . import social as _s
    return _s.cookie_xsite_detail(hours=hours, top_n=top)
```
- [ ] **Step 4: Run the test to verify it passes**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_cookie_crosssite_api.py -v`
Expected: PASS (2 tests)
- [ ] **Step 5: Run the full toolbox social/learn test slice (no regressions)**
Run: `cd packages/secubox-toolbox && python -m pytest tests/test_cookie_xsite_detail.py tests/test_cookie_crosssite_api.py tests/test_learn.py tests/test_social_edges.py -q`
Expected: PASS (all)
- [ ] **Step 6: Commit**
```bash
git add packages/secubox-toolbox/secubox_toolbox/api.py packages/secubox-toolbox/tests/test_cookie_crosssite_api.py
git commit -m "feat(toolbox): GET /admin/cookie-crosssite endpoint (ref #749)"
```
---
### Task 3: Cookies dashboard "Trackers cross-site" panel
**Files:**
- Modify: `packages/secubox-cookies/www/cookies/index.html` (markup card in `#tab-trackers` + JS `loadCrossSite()` + wiring)
**Interfaces:**
- Consumes: `GET /api/v1/toolbox/admin/cookie-crosssite?hours=24` (Task 2), the existing `headers()` JS helper.
- Produces: a rendered table `#crosssite-table`; `loadCrossSite()` called from `switchTab('trackers')` and `refresh()`.
- [ ] **Step 1: Add the card markup**
In `packages/secubox-cookies/www/cookies/index.html`, inside `<div class="tab-content" id="tab-trackers">`, AFTER the existing "Known Tracker Patterns" `<div class="card">…</div>` (after its closing `</div>` for that card, before the `</div>` that closes `#tab-trackers`), insert:
```html
<div class="card">
  <div class="card-title">
    <span>🕸️ Trackers cross-site (R3)</span>
    <span class="badge badge-cyan" id="crosssite-count">0</span>
  </div>
  <p class="empty" style="margin:0 0 .5rem">Cookies dont l'identifiant est réutilisé sur ≥2 sites first-party par le même client (source : tunnel captif R3).</p>
  <table>
    <thead>
      <tr>
        <th>Tracker</th>
        <th>Sites suivis</th>
        <th>Clients</th>
        <th>Cookies</th>
        <th>Pré-consent</th>
        <th>Vu</th>
      </tr>
    </thead>
    <tbody id="crosssite-table">
      <tr><td colspan="6" class="empty">Loading...</td></tr>
    </tbody>
  </table>
</div>
```
- [ ] **Step 2: Add the `loadCrossSite()` JS function**
In the `<script>` block, immediately AFTER the `loadTrackers()` function (~line 758-773), add:
```javascript
async function loadCrossSite() {
  const tbody = document.getElementById('crosssite-table');
  const countEl = document.getElementById('crosssite-count');
  try {
    const res = await fetch('/api/v1/toolbox/admin/cookie-crosssite?hours=24', { headers: headers() });
    if (!res.ok) throw new Error('http ' + res.status);
    const data = await res.json();
    const rows = (data && data.trackers) || [];
    countEl.textContent = rows.length;
    if (!rows.length) {
      tbody.innerHTML = '<tr><td colspan="6" class="empty">Aucune donnée R3 récente — tunnel captif inactif.</td></tr>';
      return;
    }
    tbody.innerHTML = rows.map(t => {
      const sites = (t.sites || []).join(', ');
      const seen = t.last_seen ? new Date(t.last_seen * 1000).toLocaleString() : '-';
      const pc = t.pre_consent_hits > 0
        ? `<span class="badge badge-red">${t.pre_consent_hits}</span>` : '0';
      return `<tr>
        <td><strong>${esc(t.tracker_domain)}</strong></td>
        <td><span class="badge badge-cyan" title="${esc(sites)}">${t.site_count}</span></td>
        <td>${t.client_count}</td>
        <td>${t.cookie_count}</td>
        <td>${pc}</td>
        <td style="white-space:nowrap">${esc(seen)}</td>
      </tr>`;
    }).join('');
  } catch (e) {
    countEl.textContent = '0';
    tbody.innerHTML = '<tr><td colspan="6" class="empty">Source R3 indisponible.</td></tr>';
  }
}

function esc(s) {
  return String(s == null ? '' : s).replace(/[&<>"']/g, c => (
    { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' }[c]));
}
```
Note: if an `esc()` (HTML-escape) helper already exists in this `<script>`, do NOT add a second one — reuse the existing one and drop the `esc` definition above.
- [ ] **Step 3: Wire `loadCrossSite()` into tab switch and refresh**
In `switchTab(tab)`, find `case 'trackers': loadTrackers(); break;` and change it to:
```javascript
case 'trackers': loadTrackers(); loadCrossSite(); break;
```
In `refresh()` (~line 943), add a `loadCrossSite();` call alongside the other `loadX()` calls in that function body.
- [ ] **Step 4: Syntax-check the page JS**
Run (extracts the inline script and runs it through node's parser; expect no output / exit 0):
```bash
cd packages/secubox-cookies/www/cookies
python3 - <<'PY'
import re,sys,subprocess,tempfile,os
h=open('index.html',encoding='utf-8').read()
m=re.search(r'<script>(.*?)</script>', h, re.S)
js=m.group(1)
f=tempfile.NamedTemporaryFile('w',suffix='.js',delete=False,encoding='utf-8'); f.write(js); f.close()
r=subprocess.run(['node','--check',f.name]); os.unlink(f.name); sys.exit(r.returncode)
PY
```
Expected: exit 0 (no syntax error). If `node` is unavailable, skip and rely on the manual browser check in Step 5.
- [ ] **Step 5: Manual verification (deploy to board, then browser)**
The cookies www is served by nginx from the deployed package. To verify against the live toolbox endpoint without a full rebuild, copy the edited file to the board and open the dashboard:
```bash
scp index.html root@192.168.1.200:/usr/share/secubox/cookies/www/cookies/index.html 2>/dev/null \
|| scp index.html root@192.168.1.200:/var/www/secubox/cookies/index.html
# confirm the toolbox endpoint answers (operator must be logged in for JWT in browser):
ssh root@192.168.1.200 "curl -s -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8088/admin/cookie-crosssite?hours=24"
```
Then open the cookies dashboard → **Trackers** tab → confirm the "🕸️ Trackers cross-site (R3)" card renders rows (or the graceful empty state if R3 is idle). Note: the exact nginx docroot for the cookies www is whatever `debian/install` maps `www/cookies/` to — confirm with `ssh root@192.168.1.200 'nginx -T 2>/dev/null | grep -A3 cookies'` if the scp path is uncertain.
- [ ] **Step 6: Commit**
```bash
git add packages/secubox-cookies/www/cookies/index.html
git commit -m "feat(cookies): cross-site trackers panel from toolbox R3 (ref #749)"
```
---
## Self-Review notes
- **Spec coverage:** Toolbox `cookie_xsite_detail` (Task 1) ✓; `GET /admin/cookie-crosssite` (Task 2) ✓; cookies WebUI panel + graceful R3-idle degradation (Task 3) ✓; src_site `''`/`null` filtered at read (Task 1 query) ✓; reuse of `social_edges` + `_registrable_domain`/`_is_ip` ✓; privacy (only hashes/counts/registrable domains exposed) ✓.
- **Home refinement vs spec:** the spec phrased the function as a "sibling of `cookie_xsite_trackers` (learn.py / social.py)"; this plan places it in `social.py` next to `aggregate()` because both are operator-view aggregations over `social_edges` and `aggregate()` is the closest existing pattern (envelope + `_conn` + `_registrable_domain`). This is within the spec's stated options.
- **Type consistency:** envelope keys (`window_hours`, `generated_at`, `trackers`) and row keys (`tracker_domain`, `sites`, `site_count`, `client_count`, `cookie_count`, `pre_consent_hits`, `last_seen`) are identical across Task 1 (producer), Task 2 (canned test), and Task 3 (renderer).
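
The key sets named in the type-consistency note can be written down once and checked mechanically. The following is a minimal sketch, assuming Python 3.11 as stated in the Tech Stack line; `TrackerRow`, `CrossSiteEnvelope`, and `validate_envelope` are hypothetical names introduced here for illustration and do not appear in the plan's code.

```python
from typing import List, TypedDict


class TrackerRow(TypedDict):
    # Row keys as listed in the type-consistency note.
    tracker_domain: str
    sites: List[str]
    site_count: int
    client_count: int
    cookie_count: int
    pre_consent_hits: int
    last_seen: int


class CrossSiteEnvelope(TypedDict):
    # Envelope keys as listed in the type-consistency note.
    window_hours: int
    generated_at: int
    trackers: List[TrackerRow]


ROW_KEYS = frozenset(TrackerRow.__annotations__)
ENVELOPE_KEYS = frozenset(CrossSiteEnvelope.__annotations__)


def validate_envelope(env: dict) -> bool:
    """Return True when the envelope and every row carry exactly the agreed keys."""
    if set(env) != ENVELOPE_KEYS:
        return False
    return all(set(row) == ROW_KEYS for row in env["trackers"])
```

A check like this could sit in the Task 2 test next to the canned payload, so that a key renamed in one task fails fast in the others.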


@ -0,0 +1,358 @@
# sbxwaf — WAF Go host-native — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace the Python mitmproxy WAF inspection layer with a host-native Go binary `sbxwaf` that reverse-proxies HAProxy traffic to backend vhosts while inspecting/blocking/banning, at >5× the throughput.
**Architecture:** A new `cmd/sbxwaf` in the existing `secubox-toolbox-ng` Go module, reusing a freshly-extracted shared core (`internal/forge`, `internal/relay`, `internal/httpcodec`, `internal/reload`) shared with `cmd/sbxmitm`. Net/http reverse proxy: route vhost→backend via `haproxy-routes.json`, regex WAF rules from `waf-rules.json`, sliding-window graduated ban, CrowdSec LAPI bridge, cookie-audit JSONL, media-cache, synthetic error pages. Migration is shadow→parity→cutover→rollback.
**Tech Stack:** Go 1.22 (stdlib net/http, crypto/tls, regexp), brotli/zstd (already deps), systemd, AppArmor. Spec: `docs/superpowers/specs/2026-06-26-waf-go-sbxwaf-design.md`.
## Global Constraints
- Go module: `github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng`, Go 1.22, stdlib-first (no new deps beyond brotli/zstd already present).
- Binary: `/usr/sbin/sbxwaf`; workers `secubox-waf-ng-worker@1..2`; user/group `secubox-waf` (non-priv, created in postinst).
- CA at `/etc/secubox/waf/ca/` (cert `ca-cert.pem`, key `ca.pem`); secrets `/etc/secubox/secrets/` chmod 600 owner `secubox-waf`.
- Listen `:8080` (worker `:808%i`); HAProxy backend `mitmproxy_waf` flips `server waf` IP from LXC to host on cutover.
- Routes file `/data/mitmproxy/haproxy-routes.json` → migrate to `/etc/secubox/waf/haproxy-routes.json`; rules `/etc/secubox/waf/waf-rules.json`; threat log `/var/log/secubox/waf-threats.log`; audit `/var/log/secubox/audit.log` (append-only).
- Bench go/no-go (BLOCKING): `>5× req/s·core`, `p99 < ⅓`, `RSS < ¼` vs mitmproxy 4-workers.
- Parity vs `secubox_waf.py` is BLOCKING: no detection regression. Source of truth: `packages/secubox-mitmproxy/addons/secubox_waf.py` (930 lines), `cookie_audit.py`, `media_cache.py`.
- Hardening: `NoNewPrivileges`, `ProtectSystem=strict` + minimal `ReadWritePaths`, drop caps, `RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX`, AppArmor enforce profile in `debian/`.
- SPDX header `LicenseRef-CMSD-1.0` on every new file (per `.claude/CLAUDE.md`); commit messages end without Claude footer.
---
## Phase 0 — Shared core extraction (refactor, no behaviour change)
Extract reusable primitives from `cmd/sbxmitm` into `internal/` packages consumed by BOTH cmds. After each task `cmd/sbxmitm` must still build + pass its tests (no behaviour change).
### Task 0.1: Extract `internal/forge` (CA + leaf forge)
**Files:**
- Create: `packages/secubox-toolbox-ng/internal/forge/forge.go`
- Create: `packages/secubox-toolbox-ng/internal/forge/forge_test.go`
- Modify: `packages/secubox-toolbox-ng/cmd/sbxmitm/main.go` (remove `CA`, `loadCA`, `forge`, `firstPEMBlock`, `parseKey`; import + alias `forge.CA`)
**Interfaces:**
- Produces: `forge.CA` struct; `forge.LoadCA(certPath, keyPath string) (*forge.CA, error)`; `(*forge.CA).Forge(host string) (*tls.Certificate, error)`. (Exported names: `LoadCA`, `Forge` — capitalised from the current unexported `loadCA`/`forge`.)
- [ ] **Step 1: Write the failing test** (`forge_test.go`): generate a self-signed CA, `LoadCA` from temp PEM files, `Forge("example.com")`, assert the returned leaf chains to the CA (`leaf.CheckSignatureFrom(ca.cert)`) and `Forge` is cached (same pointer on second call).
```go
func TestForgeChainsAndCaches(t *testing.T) {
	dir := t.TempDir()
	certPath, keyPath := writeTestCA(t, dir) // helper mints a CA, writes PEMs
	ca, err := LoadCA(certPath, keyPath)
	if err != nil { t.Fatalf("LoadCA: %v", err) }
	c1, err := ca.Forge("example.com")
	if err != nil { t.Fatalf("Forge: %v", err) }
	if c1.Leaf.DNSNames[0] != "example.com" { t.Fatalf("CN/SAN wrong: %v", c1.Leaf.DNSNames) }
	c2, _ := ca.Forge("example.com")
	if c1 != c2 { t.Fatalf("Forge not cached") }
}
```
- [ ] **Step 2: Run test, verify it fails** `go test ./internal/forge/ -run TestForgeChainsAndCaches -v` → FAIL (package/symbols undefined).
- [ ] **Step 3: Move the code** — cut `CA`, `loadCA`→`LoadCA`, `forge`→`Forge`, `firstPEMBlock`, `parseKey` from `cmd/sbxmitm/main.go` (lines ~45-155) into `internal/forge/forge.go`, package `forge`, capitalise the two exported names, add SPDX header. Add `writeTestCA` helper in the test.
- [ ] **Step 4: Rewire sbxmitm** — in `cmd/sbxmitm`, replace `loadCA(` → `forge.LoadCA(`, `px.ca.forge(` → `px.ca.Forge(`, change `ca *CA` field type to `ca *forge.CA`, add import.
- [ ] **Step 5: Run both test suites** `go test ./internal/forge/ ./cmd/sbxmitm/ -count=1` → PASS.
- [ ] **Step 6: Commit** `git commit -am "refactor(toolbox-ng): extract internal/forge from sbxmitm (ref #744)"`.
### Task 0.2: Extract `internal/httpcodec` (gzip/br/zstd)
**Files:**
- Create: `packages/secubox-toolbox-ng/internal/httpcodec/codec.go`
- Create: `packages/secubox-toolbox-ng/internal/httpcodec/codec_test.go`
- Modify: `cmd/sbxmitm/gzip.go` (remove the moved funcs; keep `injectIntoBody`/`injectHTML` which are sbxmitm-specific but call `httpcodec.*`)
**Interfaces:**
- Produces: `httpcodec.GunzipBytes([]byte)([]byte,error)`, `GzipBytes([]byte)[]byte`, `UnbrotliBytes`, `BrotliBytes`, `UnzstdBytes`, `ZstdBytes` (capitalised). `httpcodec.Decode(encoding string, body []byte)([]byte,error)` and `httpcodec.Encode(encoding string, body []byte)([]byte,error)` convenience dispatchers (encoding ∈ "",gzip,br,zstd; "" = identity passthrough).
- [ ] **Step 1: Write failing test** — round-trip each codec: `Encode("gzip", b)` then `GunzipBytes` returns `b`; a 33 MiB stream decodes to error (bomb cap); unknown encoding via `Decode("deflate", b)` returns error.
```go
func TestCodecRoundTrip(t *testing.T) {
	for _, enc := range []string{"gzip", "br", "zstd"} {
		in := []byte("<html>hello</html>")
		comp, err := Encode(enc, in)
		if err != nil { t.Fatalf("Encode %s: %v", enc, err) }
		out, err := Decode(enc, comp)
		if err != nil || string(out) != string(in) { t.Fatalf("%s round-trip: %v %q", enc, err, out) }
	}
}
```
- [ ] **Step 2: Run, verify fail** `go test ./internal/httpcodec/ -v` → FAIL.
- [ ] **Step 3: Move code** — move `gunzipBytes`/`gzipBytes`/`unbrotliBytes`/`brotliBytes`/`unzstdBytes`/`zstdBytes`/`readCapped`/`gunzipCap`/`errString`/`errGunzipTooLarge` from `gzip.go` into `internal/httpcodec/codec.go`, capitalise the byte funcs, add `Decode`/`Encode` dispatchers. SPDX header.
- [ ] **Step 4: Rewire sbxmitm** `gzip.go`'s `injectIntoBody` switch calls `httpcodec.GunzipBytes`/`GzipBytes`/etc.
- [ ] **Step 5: Run** `go test ./internal/httpcodec/ ./cmd/sbxmitm/ -count=1` → PASS.
- [ ] **Step 6: Commit** `refactor(toolbox-ng): extract internal/httpcodec (ref #744)`.
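
The dispatcher and bomb-cap behaviour that Task 0.2 tests for can be illustrated in a few lines. This is a Python sketch of the semantics only, not the Go code the task will produce. It assumes a 32 MiB cap (the plan only says a 33 MiB stream must fail), covers gzip and identity alone because it sticks to widely available standard-library modules, and uses the hypothetical names `gunzip_capped`, `decode`, and `encode`.

```python
import gzip
import zlib

# Assumed cap: the plan states that a 33 MiB stream must be rejected.
MAX_DECODED = 32 * 1024 * 1024


def gunzip_capped(data: bytes, cap: int = MAX_DECODED) -> bytes:
    """Decompress a gzip stream, refusing to produce more than `cap` bytes."""
    d = zlib.decompressobj(wbits=31)  # wbits=31 selects the gzip container
    out = d.decompress(data, cap + 1)  # never materialise more than cap+1 bytes
    if len(out) > cap:
        raise ValueError("decoded body exceeds cap")
    if not d.eof:
        raise ValueError("truncated gzip stream")
    return out


def decode(encoding: str, body: bytes) -> bytes:
    """Dispatch on Content-Encoding; "" is an identity passthrough."""
    if encoding == "":
        return body
    if encoding == "gzip":
        return gunzip_capped(body)
    raise ValueError(f"unsupported encoding: {encoding!r}")


def encode(encoding: str, body: bytes) -> bytes:
    """Inverse of decode for the encodings this sketch supports."""
    if encoding == "":
        return body
    if encoding == "gzip":
        return gzip.compress(body)
    raise ValueError(f"unsupported encoding: {encoding!r}")
```

Passing `cap + 1` as the output limit means an oversized stream is detected without ever inflating it fully, which is the point of a bomb cap.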
### Task 0.3: Extract `internal/relay` (async unix-socket POST)
**Files:**
- Create: `packages/secubox-toolbox-ng/internal/relay/relay.go` (+ `relay_test.go`)
- Modify: `cmd/sbxmitm/relay.go`, `cmd/sbxmitm/sidecar.go`
**Interfaces:**
- Produces: `relay.Emit(socketPath, route string, payload []byte)` (fire-and-forget), `relay.EmitSync(socketPath, route string, payload []byte) error` (2 s timeout, test-observable). sbxmitm keeps its event-builder funcs but calls `relay.Emit`.
- [ ] **Step 1: Failing test** — spin a `net.Listen("unix", …)` echo server; `EmitSync` posts a payload; assert the server received `POST <route>` with the body.
- [ ] **Step 2: Verify fail** `go test ./internal/relay/ -v` → FAIL.
- [ ] **Step 3: Move** `emit`→`Emit`, `emitSync`→`EmitSync`, `emitTimeout` from `sidecar.go` into `internal/relay/relay.go`. SPDX. Leave the dpi/cookies/ja4 builders in sbxmitm (they call `relay.Emit`).
- [ ] **Step 4: Rewire** sbxmitm callers.
- [ ] **Step 5: Run** `go test ./internal/relay/ ./cmd/sbxmitm/ -count=1` → PASS.
- [ ] **Step 6: Commit** `refactor(toolbox-ng): extract internal/relay (ref #744)`.
### Task 0.4: Extract `internal/reload` (mtime hot-reload pattern)
**Files:**
- Create: `packages/secubox-toolbox-ng/internal/reload/reload.go` (+ test)
- Modify: `cmd/sbxmitm/policy.go` (use `reload.Watcher`)
**Interfaces:**
- Produces: a generic watcher decoupled from `Policy`:
```go
type Target struct { Path string; LastMtime int64; Load func(path string) any; Apply func(v any) }
type Watcher struct { /* throttle + mu */ }
func NewWatcher(throttle time.Duration, targets ...Target) *Watcher
func (w *Watcher) Maybe() // stat each target; on mtime change, Load then Apply under the caller's swap
func StatMtime(path string) int64
func LoadLines(path string, stripComments bool) map[string]bool
```
- [ ] **Step 1: Failing test** — write a temp file, register a `Target` whose `Apply` stores into a captured var; call `Maybe()`, mutate the file + bump mtime, call `Maybe()` again, assert the var updated; assert throttle suppresses a same-second re-stat.
- [ ] **Step 2: Verify fail** `go test ./internal/reload/ -v` → FAIL.
- [ ] **Step 3: Implement** the watcher generically (port `maybeReload` throttle+stat loop, `statMtime`, `scanLines`/`loadLines`). SPDX.
- [ ] **Step 4: Rewire** `policy.go` to build `reload.Target`s (keep `Policy.Decide` semantics identical).
- [ ] **Step 5: Run** `go test ./internal/reload/ ./cmd/sbxmitm/ -count=1` → PASS (parity fixtures still green).
- [ ] **Step 6: Commit** `refactor(toolbox-ng): extract internal/reload (ref #744)`.
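
The throttle-then-stat loop that Task 0.4 extracts can be shown in miniature. The sketch below is Python for illustration only and is not the Go `reload.Watcher`; the class and function names mirror the plan's interface loosely, and the injectable `clock` parameter is an assumption added here so the throttle can be exercised without sleeping.

```python
import os
import time
from typing import Any, Callable, List


def stat_mtime(path: str) -> int:
    """Return the file's mtime in nanoseconds, or 0 when it cannot be stat'ed."""
    try:
        return os.stat(path).st_mtime_ns
    except OSError:
        return 0


class Target:
    def __init__(self, path: str, load: Callable[[str], Any],
                 apply: Callable[[Any], None]):
        self.path = path
        self.load = load
        self.apply = apply
        self.last_mtime = 0


class Watcher:
    def __init__(self, throttle_s: float, targets: List[Target],
                 clock: Callable[[], float] = time.monotonic):
        self._throttle_s = throttle_s
        self._targets = targets
        self._clock = clock
        self._last_check = None  # no check has happened yet

    def maybe(self) -> None:
        """Stat each target at most once per throttle period; reload on mtime change."""
        now = self._clock()
        if self._last_check is not None and now - self._last_check < self._throttle_s:
            return  # throttled: skip the stat calls entirely
        self._last_check = now
        for t in self._targets:
            mtime = stat_mtime(t.path)
            if mtime != t.last_mtime:
                t.last_mtime = mtime
                t.apply(t.load(t.path))
```

Keeping `load` and `apply` as separate callbacks lets the caller parse the file outside any lock and swap the result in afterwards, which is what decouples the watcher from `Policy`.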
---
## Phase 1 — sbxwaf skeleton + vhost routing
### Task 1.1: cmd/sbxwaf skeleton + flags + listener
**Files:**
- Create: `packages/secubox-toolbox-ng/cmd/sbxwaf/main.go` (+ `main_test.go`)
**Interfaces:**
- Produces: `type Server struct { ca *forge.CA; routes *Routes; rules *Rules; ban *Ban; … }`; `func (s *Server) handler() http.Handler`; flags `--listen :8080`, `--ca-cert`, `--ca-key`, `--routes`, `--rules`, `--upstream-timeout`.
- [ ] **Step 1: Failing test** `httptest`-drive `s.handler()` with a minimal `Server` (nil rules/ban) and one route to a stub backend; assert a request to a mapped Host is proxied (200, body echoed) and the response carries `X-SecuBox-WAF: inspected`.
- [ ] **Step 2: Verify fail** `go test ./cmd/sbxwaf/ -run TestProxyPassthrough -v` → FAIL.
- [ ] **Step 3: Implement** `main.go`: flag parsing, `forge.LoadCA`, build `Server`, an `http.HandlerFunc` that (a) looks up `req.Host` in routes, (b) reverse-proxies via `httputil.NewSingleHostReverseProxy`-style director to the backend `ip:port`, (c) adds the response header. `http.Server{Addr, Handler}` with `ReadHeaderTimeout`. SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): reverse-proxy skeleton + listener (ref #744)`.
### Task 1.2: Routes loader with hot-reload + 421 on unmapped
**Files:**
- Create: `packages/secubox-toolbox-ng/cmd/sbxwaf/routes.go` (+ test)
**Interfaces:**
- Consumes: `reload.Watcher`, `reload.StatMtime`.
- Produces: `type Routes struct{…}`; `func LoadRoutes(path string) *Routes`; `func (r *Routes) Lookup(host string) (ip string, port int, ok bool)`; hot-reloads on mtime change. JSON shape: `{"domain": ["ip", port]}` (matches `haproxy-routes.json`).
- [ ] **Step 1: Failing test** — write a routes JSON, `LoadRoutes`, `Lookup("gitea.example.com")` → `("127.0.0.1", 3000, true)`; unknown host → `ok=false`; rewrite file + bump mtime, `Maybe()`, assert new route visible.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** loader (parse `map[string][2]json.RawMessage` or `map[string][]any`), RW-locked map, `reload.Target` wiring. In `main.go` handler: unmapped host → `http.Error(w, "Misdirected", 421)`.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): haproxy-routes.json loader + hot-reload + 421 (ref #744)`.
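
The lookup contract in Task 1.2 (JSON shape `{"domain": ["ip", port]}`, `ok=false` for unknown hosts, 421 when unmapped) can be sketched as follows. This is illustrative Python, not the Go loader. Lower-casing the host and stripping a `:port` suffix are assumptions not stated in the plan, and `parse_routes`, `lookup`, and `route_or_status` are hypothetical names.

```python
import json
from http import HTTPStatus
from typing import Dict, Tuple


def parse_routes(text: str) -> Dict[str, Tuple[str, int]]:
    """Parse the {"domain": ["ip", port]} shape into a lookup table."""
    routes = {}
    for domain, target in json.loads(text).items():
        ip, port = target  # raises ValueError when the pair is malformed
        routes[domain.lower()] = (str(ip), int(port))
    return routes


def lookup(routes: Dict[str, Tuple[str, int]], host: str) -> Tuple[str, int, bool]:
    """Return (ip, port, ok) for a Host header value."""
    # Assumption: vhosts are DNS names, so anything after ':' is a port.
    name = host.strip().lower().partition(":")[0]
    target = routes.get(name)
    if target is None:
        return "", 0, False
    return target[0], target[1], True


def route_or_status(routes: Dict[str, Tuple[str, int]], host: str):
    """Return the backend pair, or 421 Misdirected Request when unmapped."""
    ip, port, ok = lookup(routes, host)
    return (ip, port) if ok else HTTPStatus.MISDIRECTED_REQUEST
```

Returning 421 rather than 404 tells the client that the request reached a server that is not configured for that authority, which fits an unmapped vhost.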
---
## Phase 2 — WAF rule engine
### Task 2.1: Rule compilation from waf-rules.json
**Files:**
- Create: `packages/secubox-toolbox-ng/cmd/sbxwaf/rules.go` (+ test)
- Reference (port logic, do NOT import): `packages/secubox-mitmproxy/addons/secubox_waf.py` — the pattern categories + compiled regex (SQLi/XSS/LFI/RCE), `waf-rules.json` shape (categories, enabled, severity).
**Interfaces:**
- Produces: `type Rules struct{…}`; `func LoadRules(path string) *Rules`; `func (r *Rules) Match(method, path, query, body, ua string) (cat string, sev string, hit bool)`; hot-reload via `reload`.
- [ ] **Step 1: Failing test** — load a rules JSON with one SQLi pattern (`(?i)union\s+select`); `Match("GET","/x","id=1 UNION SELECT","","")` → `("sqli","high",true)`; a benign request → `hit=false`.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — parse categories, `regexp.MustCompile` each enabled pattern at load (skip disabled), match across method/path/query/body/UA; first hit wins (mirror Python order). SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): regex WAF rule engine from waf-rules.json (ref #744)`.
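
The compile-at-load, first-hit-wins matching described in Task 2.1 can be illustrated with a small Python sketch. It is not the Go engine and not a copy of `secubox_waf.py`. The JSON layout used here (`categories` mapping to `enabled`, `severity`, `patterns`) is an assumed shape for illustration, since the plan names the fields but does not show the file, and `compile_rules` and `match` are hypothetical names.

```python
import re
from typing import List, Pattern, Tuple

# Assumed rules layout, for illustration only.
RULES_DOC = {
    "categories": {
        "sqli": {"enabled": True, "severity": "high",
                 "patterns": [r"(?i)union\s+select"]},
        "xss": {"enabled": False, "severity": "medium",
                "patterns": [r"(?i)<script"]},
    }
}


def compile_rules(doc: dict) -> List[Tuple[str, str, Pattern[str]]]:
    """Compile every enabled pattern once, keeping the file's category order."""
    compiled = []
    for cat, spec in doc.get("categories", {}).items():
        if not spec.get("enabled", True):
            continue  # disabled categories are skipped at load time
        sev = spec.get("severity", "medium")
        for pattern in spec.get("patterns", []):
            compiled.append((cat, sev, re.compile(pattern)))
    return compiled


def match(compiled, method: str, path: str, query: str, body: str,
          ua: str) -> Tuple[str, str, bool]:
    """Return (category, severity, hit); the first matching rule wins."""
    for cat, sev, rx in compiled:
        for field in (method, path, query, body, ua):
            if rx.search(field):
                return cat, sev, True
    return "", "", False
```

Compiling at load keeps the request path free of regex compilation, and iterating rules in file order is what makes the first-hit result deterministic.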
### Task 2.2: Request inspection wiring + skip-lists
**Files:**
- Modify: `cmd/sbxwaf/main.go` (inspection in the handler); `cmd/sbxwaf/rules.go` (skip helpers)
**Interfaces:**
- Produces: `func staticAsset(path string) bool` (`.js/.css/.png/...`, `/health`, `/status`); `func ncBypass(path string) bool` (`/index.php/login/v2/`, `/ocs/v2.php/core/login`); `func privateCIDR(ip string) bool` (RFC1918 + loopback).
- [ ] **Step 1: Failing test** — handler: a request with `?q=<script>` from a public IP is blocked (403) unless `staticAsset`; a request from `192.168.x` is never blocked; `/health` skips inspection.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — read client IP from `X-Forwarded-For`/`RemoteAddr`; if `privateCIDR` → skip; if `staticAsset`/`ncBypass` → skip; else read body (capped), `rules.Match`; on hit hand to ban (Task 3). Add `Connection: close` (#496).
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): request inspection + CIDR/static/NC skip-lists (ref #744)`.
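
The three skip helpers from Task 2.2 are small enough to sketch in full. The version below is Python for illustration, not the Go helpers. The extension set contains only the three suffixes the plan spells out (the rest of its list is elided), the IPv6 loopback entry is an assumption, and `should_inspect` is a hypothetical name for the combined decision.

```python
import ipaddress
from pathlib import PurePosixPath

# Only the suffixes spelled out in the plan; its full list is elided there.
STATIC_EXTS = {".js", ".css", ".png"}
SKIP_PATHS = {"/health", "/status"}
NC_PREFIXES = ("/index.php/login/v2/", "/ocs/v2.php/core/login")

# RFC 1918 ranges plus loopback (the ::1 entry is an assumption).
_PRIVATE_NETS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8", "::1/128",
)]


def private_cidr(ip: str) -> bool:
    """True when the client address is RFC 1918 or loopback."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # unparsable input is treated as public, so it gets inspected
    return any(addr in net for net in _PRIVATE_NETS)


def static_asset(path: str) -> bool:
    """True for static file suffixes and the health/status endpoints."""
    return path in SKIP_PATHS or PurePosixPath(path).suffix.lower() in STATIC_EXTS


def nc_bypass(path: str) -> bool:
    """True for the two login paths listed in the plan."""
    return path.startswith(NC_PREFIXES)


def should_inspect(client_ip: str, path: str) -> bool:
    """Inspect only public clients on non-static, non-bypassed paths."""
    return not (private_cidr(client_ip) or static_asset(path) or nc_bypass(path))
```

Treating an unparsable address as public is the conservative choice here: a malformed `X-Forwarded-For` value then cannot be used to skip inspection.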
---
## Phase 3 — Graduated ban (sliding window)
### Task 3.1: Sliding-window ban state
**Files:**
- Create: `cmd/sbxwaf/ban.go` (+ test)
**Interfaces:**
- Produces: `type Ban struct{…}`; `func NewBan(window time.Duration, threshold int) *Ban`; `func (b *Ban) Record(ip string, nowUnix int64) (count int, banned bool)` (count within window; `banned` true once `count >= threshold`). Mirrors `BAN_THRESHOLD=3`/`300s`.
- [ ] **Step 1: Failing test** `NewBan(300s, 3)`; 2 `Record` at t=0 → `banned=false`; 3rd → `banned=true`; a 4th at t=400 (window expired) → count resets, `banned=false`.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** `map[string][]int64` of hit timestamps, lock-guarded; prune entries older than `now-window` on each `Record`; cap map size. SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): sliding-window graduated ban (ref #744)`.
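
The sliding-window behaviour that the Task 3.1 test pins down can be traced with a short Python sketch. It illustrates the semantics only and is not the Go `Ban` type: locking and the map-size cap are left out, the class name `SlidingBan` is hypothetical, and the choice to prune hits at exactly `now - window` is an assumption the plan does not settle.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class SlidingBan:
    def __init__(self, window_s: int = 300, threshold: int = 3):
        self.window_s = window_s
        self.threshold = threshold
        self._hits: Dict[str, List[int]] = defaultdict(list)

    def record(self, ip: str, now: int) -> Tuple[int, bool]:
        """Register one hit and return (hits inside the window, banned)."""
        cutoff = now - self.window_s
        # Keep only hits strictly newer than the cutoff (assumed boundary rule).
        hits = [t for t in self._hits[ip] if t > cutoff]
        hits.append(now)
        self._hits[ip] = hits
        return len(hits), len(hits) >= self.threshold


ban = SlidingBan(window_s=300, threshold=3)
first = ban.record("203.0.113.9", 0)
second = ban.record("203.0.113.9", 0)
third = ban.record("203.0.113.9", 0)
after_expiry = ban.record("203.0.113.9", 400)
```

Pruning on every `record` call keeps each per-IP list bounded by the number of hits in one window, so no background sweeper is needed for correctness.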
### Task 3.2: WARNING/BAN responses + threat log
**Files:**
- Modify: `cmd/sbxwaf/main.go`; Create: `cmd/sbxwaf/threatlog.go` (+ test)
**Interfaces:**
- Produces: `func writeWarning(w http.ResponseWriter, cat string)`, `func writeBan(w http.ResponseWriter)`; `type ThreatLog struct{…}`, `func (l *ThreatLog) Record(ip, cat, sev, action, path string)` → append JSON line to `/var/log/secubox/waf-threats.log`.
- [ ] **Step 1: Failing test** — on first hit handler returns 403 with a WARNING marker; on the 3rd hit returns 403 BAN; `ThreatLog.Record` appends a parseable JSON line with the action.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — wire `ban.Record` result into WARNING vs BAN; styled 403 bodies (port templates); append-only threat log (O_APPEND). SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** — `feat(sbxwaf): graduated WARNING/BAN responses + threat log (ref #744)`.
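
The append-only threat log from Task 3.2 could look roughly like this. The JSON field names, the `threatEntry` type and the `encodeThreat` helper are assumptions for illustration (the real line format should be ported from the Python addon), and `Record` returns an error here although the listed interface has no return value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
	"time"
)

// threatEntry is one JSON line. Field names are illustrative assumptions.
type threatEntry struct {
	TS       int64  `json:"ts"`
	IP       string `json:"ip"`
	Category string `json:"category"`
	Severity string `json:"severity"`
	Action   string `json:"action"`
	Path     string `json:"path"`
}

// encodeThreat renders one entry as a newline-terminated JSON line.
func encodeThreat(e threatEntry) ([]byte, error) {
	line, err := json.Marshal(e)
	if err != nil {
		return nil, err
	}
	return append(line, '\n'), nil
}

// ThreatLog appends entries to a single file; the mutex keeps each line
// whole when several goroutines record at once.
type ThreatLog struct {
	mu   sync.Mutex
	path string
}

func (l *ThreatLog) Record(ip, cat, sev, action, path string) error {
	line, err := encodeThreat(threatEntry{
		TS: time.Now().Unix(), IP: ip, Category: cat,
		Severity: sev, Action: action, Path: path,
	})
	if err != nil {
		return err
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	f, err := os.OpenFile(l.path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o640)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(line)
	return err
}

func main() {
	dir, err := os.MkdirTemp("", "threatlog")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	l := &ThreatLog{path: filepath.Join(dir, "waf-threats.log")}
	if err := l.Record("203.0.113.9", "xss", "high", "WARNING", "/search"); err != nil {
		panic(err)
	}
	data, _ := os.ReadFile(l.path)
	fmt.Print(string(data))
}
```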
---
## Phase 4 — CrowdSec LAPI bridge
### Task 4.1: CrowdSec alert POST
**Files:**
- Create: `cmd/sbxwaf/crowdsec.go` (+ test)
- Reference: `secubox_waf.py` lines 710-765 (LAPI `/v1/alerts` JWT payload shape).
**Interfaces:**
- Produces: `type CrowdSec struct{ lapiURL, jwt string; client *http.Client }`; `func (c *CrowdSec) Alert(ip, scenario string) error` (fire-and-forget wrapper `AlertAsync`). On ban, post the alert.
- [ ] **Step 1: Failing test** — `httptest` server asserting it receives a POST `/v1/alerts` with `Authorization: Bearer <jwt>` and a JSON body containing the source IP + scenario; `Alert` returns nil on 200/201.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — build the LAPI alert JSON (port the Python payload fields), POST with JWT, 2 s timeout; `AlertAsync` swallows errors (log only). SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** — `feat(sbxwaf): CrowdSec LAPI alert bridge on ban (ref #744)`.
---
## Phase 5 — Cookie-audit
### Task 5.1: Set-Cookie JSONL ledger
**Files:**
- Create: `cmd/sbxwaf/cookieaudit.go` (+ test)
- Reference: `packages/secubox-mitmproxy/addons/cookie_audit.py`.
**Interfaces:**
- Produces: `type CookieAudit struct{…}`; `func (a *CookieAudit) Record(host string, resp *http.Response)` → for each `Set-Cookie`, parse attrs, SHA256-hash the value, append JSONL to `/var/log/secubox/cookie-audit/server.jsonl`. Async (channel + writer goroutine).
- [ ] **Step 1: Failing test** — feed a response with two `Set-Cookie` headers; assert two JSONL records appear with `name/domain/path/secure/httponly/samesite` and a hashed (not raw) value.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — parse via `http.Response.Cookies()`, `sha256` the value, buffered channel → single writer goroutine (O_APPEND), never block the request path. SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** — `feat(sbxwaf): RGPD cookie-audit JSONL ledger (ref #744)`.
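
As a rough sketch of Task 5.1, the record-building step and the non-blocking hand-off to the writer goroutine might look like this. The `CookieRecord` field names and the `auditSetCookies`/`enqueue` helpers are hypothetical; only `http.Response.Cookies()` and the SHA256 hashing come from the plan.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"net/http"
)

// CookieRecord is one ledger line. Field names are illustrative assumptions.
type CookieRecord struct {
	Host        string `json:"host"`
	Name        string `json:"name"`
	Domain      string `json:"domain"`
	Path        string `json:"path"`
	Secure      bool   `json:"secure"`
	HTTPOnly    bool   `json:"httponly"`
	SameSite    string `json:"samesite"`
	ValueSHA256 string `json:"value_sha256"`
}

func sameSiteName(s http.SameSite) string {
	switch s {
	case http.SameSiteLaxMode:
		return "Lax"
	case http.SameSiteStrictMode:
		return "Strict"
	case http.SameSiteNoneMode:
		return "None"
	}
	return "" // attribute absent or unrecognised
}

// auditSetCookies turns raw Set-Cookie header values into ledger records.
// The raw cookie value never leaves this function; only its hash does.
func auditSetCookies(host string, setCookies []string) []CookieRecord {
	resp := &http.Response{Header: http.Header{"Set-Cookie": setCookies}}
	var out []CookieRecord
	for _, c := range resp.Cookies() {
		sum := sha256.Sum256([]byte(c.Value))
		out = append(out, CookieRecord{
			Host: host, Name: c.Name, Domain: c.Domain, Path: c.Path,
			Secure: c.Secure, HTTPOnly: c.HttpOnly,
			SameSite:    sameSiteName(c.SameSite),
			ValueSHA256: hex.EncodeToString(sum[:]),
		})
	}
	return out
}

// enqueue hands a record to the writer goroutine without ever blocking:
// when the buffer is full the record is dropped and false is returned.
func enqueue(ch chan<- CookieRecord, rec CookieRecord) bool {
	select {
	case ch <- rec:
		return true
	default:
		return false
	}
}

func main() {
	recs := auditSetCookies("example.org", []string{
		"sid=secret; Path=/; Secure; HttpOnly; SameSite=Lax",
		"theme=dark",
	})
	for _, r := range recs {
		line, _ := json.Marshal(r)
		fmt.Println(string(line))
	}
}
```

Dropping on a full buffer trades ledger completeness for request latency, which matches the "never block the request path" constraint in Step 3.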
---
## Phase 6 — Media-cache
### Task 6.1: Response media cache
**Files:**
- Create: `cmd/sbxwaf/mediacache.go` (+ test)
- Reference: `packages/secubox-mitmproxy/addons/media_cache.py` + existing `cmd/sbxmitm/mediacatch.go` decision logic.
**Interfaces:**
- Produces: `type MediaCache struct{ dir string; maxObj, maxTotal int64 }`; `func (m *MediaCache) Get(url string) ([]byte, http.Header, bool)`; `func (m *MediaCache) MaybeStore(url string, resp *http.Response, body []byte)` (Content-Type image/video/audio/font/css/js, size < 16 MiB, respects max-age). Key = SHA256(URL), sharded `dir/<key[:2]>/<key>`.
- [ ] **Step 1: Failing test** — store a cacheable image response, `Get` returns it; an oversized (>16 MiB) or non-media response is not stored.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — cache decision (port Python), sharded file store, LRU-ish total cap (evict oldest on overflow), fail-open (any cache error → bypass). SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** — `feat(sbxwaf): media-cache (16MB/obj, 2GB total) (ref #744)`.
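
The key derivation and store decision from Task 6.1 can be sketched as follows. The helper names `cachePath`, `cacheableType` and `shouldStore` are hypothetical, the exact Content-Type list is an assumption (the real one is to be ported from `media_cache.py`), and the max-age check is left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"mime"
	"path/filepath"
	"strings"
)

const maxObjectSize = 16 << 20 // 16 MiB per object

// cachePath maps a URL to dir/<key[:2]>/<key>, where key = hex(SHA256(url)).
func cachePath(dir, url string) string {
	sum := sha256.Sum256([]byte(url))
	key := hex.EncodeToString(sum[:])
	return filepath.Join(dir, key[:2], key)
}

// cacheableType accepts image/video/audio/font plus CSS and JavaScript.
// Parameters such as "; charset=utf-8" are ignored.
func cacheableType(contentType string) bool {
	mt, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	for _, prefix := range []string{"image/", "video/", "audio/", "font/"} {
		if strings.HasPrefix(mt, prefix) {
			return true
		}
	}
	switch mt {
	case "text/css", "text/javascript", "application/javascript":
		return true
	}
	return false
}

// shouldStore applies the type and per-object size limits only.
func shouldStore(contentType string, size int64) bool {
	return size >= 0 && size < maxObjectSize && cacheableType(contentType)
}

func main() {
	fmt.Println(cachePath("/var/cache/secubox/waf", "https://example.org/logo.png"))
	fmt.Println(shouldStore("image/png", 2048))
	fmt.Println(shouldStore("text/html; charset=utf-8", 2048))
}
```

Sharding on the first two hex characters spreads objects over at most 256 subdirectories, which keeps any single directory small.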
---
## Phase 7 — Error pages
### Task 7.1: Synthetic 502/503/504 pages
**Files:**
- Create: `cmd/sbxwaf/errpages.go` + `cmd/sbxwaf/templates/` (embedded) (+ test)
- Reference: `secubox_waf.py` `error()` hook templates.
**Interfaces:**
- Produces: `func errorPage(code int) []byte` (themed HTML, `//go:embed`). On upstream dial/round-trip error, the reverse-proxy `ErrorHandler` serves `errorPage(502|503|504)`.
- [ ] **Step 1: Failing test** — point a route at a dead backend; assert the handler returns 502 with the themed body (contains a known marker string).
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — `//go:embed templates/*.html`, map status→template, wire reverse-proxy `ErrorHandler`. SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** — `feat(sbxwaf): synthetic error pages on upstream failure (ref #744)`.
---
## Phase 8 — Packaging + hardening
### Task 8.1: Debian package + systemd template + user
**Files:**
- Create: `packages/secubox-waf-ng/debian/{control,rules,postinst,prerm,compat}`, `packages/secubox-waf-ng/systemd/secubox-waf-ng-worker@.service`, `packages/secubox-waf-ng/debian/secubox-waf-ng.apparmor`
**Interfaces:**
- Produces: installable `secubox-waf-ng` shipping `/usr/sbin/sbxwaf`; `secubox-waf-ng-worker@1..2` enabled; `secubox-waf` user; AppArmor enforce.
- [ ] **Step 1:** Write `debian/control` (`Architecture: arm64`, `Standards-Version: 4.6.2`, `compat 13`), `rules` (cross-build the Go binary via `execute_after_dh_auto_install`), `postinst` (create `secubox-waf` user/group, dirs `/etc/secubox/waf` `/var/log/secubox` `/var/cache/secubox/waf` with correct owners — NEVER chmod the shared parents to 0750 per `[[project_var_log_secubox_traversal]]`/`[[project_etc_secubox_traversal]]`, `aa-enforce`, `systemctl enable --now secubox-waf-ng-worker@{1,2}`), `prerm` (stop workers).
- [ ] **Step 2:** systemd unit: `User=secubox-waf`, `ExecStart=/usr/sbin/sbxwaf --listen 127.0.0.1:808%i --ca-cert /etc/secubox/waf/ca/ca-cert.pem …`, the full hardening block (Global Constraints), `RuntimeDirectory=secubox` + `RuntimeDirectoryPreserve=yes` (per `[[project_runtimedirectory_socket_wipe]]`).
- [ ] **Step 3:** AppArmor profile: rw to `/var/log/secubox/**`, `/var/cache/secubox/waf/**`, `/run/secubox/**`; r to `/etc/secubox/waf/**`, `/etc/secubox/secrets/**`; deny everything else.
- [ ] **Step 4: Build** — `dpkg-buildpackage -a arm64 --host-arch arm64 -us -uc -b` → `.deb` produced.
- [ ] **Step 5: Commit** — `feat(packaging): secubox-waf-ng deb + hardened systemd + AppArmor (ref #744)`.
---
## Phase 9 — Parity harness + shadow + cutover
### Task 9.1: Decision parity harness vs mitmproxy
**Files:**
- Create: `packages/secubox-toolbox-ng/cmd/sbxwaf/parity_test.go`, `packages/secubox-toolbox-ng/testdata/waf-parity-fixtures.json`
**Interfaces:**
- Consumes: the same request corpus replayed against Python (`secubox_waf.py`) and Go (`Rules.Match`+`Ban`).
- Produces: a fixture file of `{method, path, query, body, ua, client_ip, expect: allow|warn|ban|421}` and a Go test asserting `sbxwaf` matches `expect` for every row.
- [ ] **Step 1:** Author `waf-parity-fixtures.json` from the Python rule corpus (malicious + benign + private-IP + static + NC-bypass rows).
- [ ] **Step 2:** Write `parity_test.go` looping fixtures through `Rules.Match`+skip-lists+`Ban`, asserting `expect`.
- [ ] **Step 3: Run** — `go test ./cmd/sbxwaf/ -run TestWAFParity -v` → PASS; any mismatch is a BLOCKING bug to fix in `rules.go`.
- [ ] **Step 4: Commit** — `test(sbxwaf): decision parity harness vs mitmproxy (ref #744)`.
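
The fixture loop in Task 9.1 can be sketched as below. The JSON keys follow the row shape listed under "Produces"; the top-level array layout, the `mismatches` helper and the `toyDecide` stand-in (which replaces the real `Rules.Match` + skip-lists + `Ban` pipeline) are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Fixture is one row of the parity corpus.
type Fixture struct {
	Method   string `json:"method"`
	Path     string `json:"path"`
	Query    string `json:"query"`
	Body     string `json:"body"`
	UA       string `json:"ua"`
	ClientIP string `json:"client_ip"`
	Expect   string `json:"expect"` // allow | warn | ban | 421
}

// loadFixtures decodes a JSON array of fixtures (assumed file layout).
func loadFixtures(data []byte) ([]Fixture, error) {
	var fx []Fixture
	err := json.Unmarshal(data, &fx)
	return fx, err
}

// mismatches runs every fixture through decide and describes each row
// whose verdict differs from its expectation.
func mismatches(fx []Fixture, decide func(Fixture) string) []string {
	var out []string
	for i, f := range fx {
		if got := decide(f); got != f.Expect {
			out = append(out, fmt.Sprintf("row %d %s %s: got %q, want %q",
				i, f.Method, f.Path, got, f.Expect))
		}
	}
	return out
}

// toyDecide is a stand-in for the real inspection pipeline.
func toyDecide(f Fixture) string {
	if strings.Contains(f.Query, "<script>") {
		return "warn"
	}
	return "allow"
}

func main() {
	fx, err := loadFixtures([]byte(`[
	  {"method":"GET","path":"/","query":"q=<script>","client_ip":"203.0.113.9","expect":"warn"},
	  {"method":"GET","path":"/ok","client_ip":"203.0.113.9","expect":"allow"}
	]`))
	if err != nil {
		panic(err)
	}
	for _, m := range mismatches(fx, toyDecide) {
		fmt.Println(m)
	}
}
```

In `parity_test.go` the same loop would call `t.Errorf` for each mismatch, so a single run reports every diverging row rather than stopping at the first.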
### Task 9.2: Shadow-run + bench + cutover/rollback runbook
**Files:**
- Create: `packages/secubox-waf-ng/docs/CUTOVER.md`, `scripts/sbxwaf-bench.sh`
- [ ] **Step 1:** `sbxwaf-bench.sh` — drive `wrk`/`hey` against both mitmproxy (`:8080` LXC) and sbxwaf (`:8081` shadow), record req/s, p99, RSS; emit a comparison table.
- [ ] **Step 2:** Deploy sbxwaf on `:8081` (shadow), mirror a fraction of traffic (HAProxy `mode tcp` tee or duplicated backend), run the bench + replay the parity corpus live.
- [ ] **Step 3:** `CUTOVER.md` — go/no-go checklist (parity green, bench `>5×`/`p99<⅓`/`RSS<¼`), the HAProxy `server waf` IP flip (LXC→host), and the rollback (re-flip; mitmproxy LXC stays deployed until validated).
- [ ] **Step 4: Commit** — `docs(sbxwaf): bench harness + cutover/rollback runbook (ref #744)`.
- [ ] **Step 5:** (Operator-gated) execute cutover only after the go/no-go gate passes; this step is NOT automated.
---
## Self-Review notes
- **Spec coverage:** §3 architecture→Phase 1; §4 components→Phases 0-7 (forge/relay/httpcodec/reload extracted Phase 0; routes/rules/ban/crowdsec/cookieaudit/mediacache/errpages Phases 1-7); §5 feature port→Phases 2-7; §6 hardening→Phase 8; §7 migration→Phase 9; §8 tests→every task + Phase 9 parity. No gaps.
- **Placeholder scan:** none — each task has concrete files, signatures, and test code.
- **Type consistency:** `forge.CA`/`LoadCA`/`Forge`, `Routes.Lookup`, `Rules.Match`, `Ban.Record`, `CrowdSec.Alert` used consistently across phases.

@ -0,0 +1,137 @@
# Design — Cookies cross-site tracker detection (surface R3 social-graph)
- **Issue:** #749
- **Date:** 2026-06-26
- **Status:** Approved (brainstorm), pending implementation plan
- **Author:** Gérald Kerma / CyberMind
## Problem
The operator wants to *detect cross-site-used cookies and their tracking targets*
("detecter les cross used et les target de suivis"). Investigation showed the
cross-site **correlation already exists** but is invisible to humans:
- `secubox_toolbox/learn.py::cookie_xsite_trackers()` (Anti-Track v2, #633) runs
`GROUP BY cookie_id_hash, tracker_domain HAVING COUNT(DISTINCT src_site) >= 2`
over `social_edges` (toolbox.db). It returns only a **top-N domain list**
consumed by the **auto-blocker** — no detail, no operator view.
- `social_edges` is populated by `sbxmitm/social.go` → `/__toolbox/social-event`
ingest. Live state (2026-06-26): 841 edges, src_site mostly valid
(`leparisien.fr`=566, `google.com`=110, `chatgpt.com`=40 …; 84 rows have the
literal string `"null"`).
So the gap is purely **surfacing** the existing correlation for the operator:
*which trackers follow our R3 visitors across N sites, with which cookies,
affecting how many clients.*
## Decisions (from brainstorm)
- **Population / source:** the **R3 social-graph** (3rd-party trackers following
our tunnel visitors), NOT the WAF server-side cookie-audit self-audit angle.
- **Surface:** a panel inside the existing **secubox-cookies** dashboard.
- **Source of truth:** `social_edges` in `toolbox.db`, owned and exposed by the
toolbox. The cookies dashboard consumes a toolbox endpoint; it does not read
the DB directly (perms + duplication).
- **Auth path:** the cookies dashboard runs in the operator's browser, which
already carries the operator JWT — it fetches the toolbox endpoint directly.
No server-to-server auth.
## Approach (chosen: A)
**A — Toolbox aggregation endpoint + cookies WebUI panel (chosen).**
Single source of truth, reuses the existing query, no perms/auth friction.
**B — Duplicate the aggregation in the cookies module reading toolbox.db
(rejected).** `toolbox.db` is `0640 secubox-toolbox`; the cookies module runs as
`secubox` → perms friction + duplicated correlation logic.
## Components
### 1. Toolbox — read-only aggregation
New pure function (sibling of `cookie_xsite_trackers`), e.g.
`cookie_xsite_detail(conn, hours: int = 24, top_n: int = 50) -> list[dict]`:
- Reuses the cross-site predicate
(`HAVING COUNT(DISTINCT src_site) >= 2`) but returns **rich rows** per
registrable tracker domain:
- `tracker_domain` (registrable)
- `sites` — sorted list of distinct `src_site` (excludes `''` and `'null'`)
- `site_count`
- `client_count` — distinct `client_mac_hash`
- `cookie_count` — distinct `cookie_id_hash`
- `pre_consent_hits` — count where `consent_state = 'pre_consent'`
- `last_seen` — max ts (epoch)
- Window: only edges with `ts >= now - hours*3600`.
- Ranking: by `client_count` desc, then `site_count` desc, then domain — capped
to `top_n`.
- Defensive: returns `[]` on any `sqlite3.Error` (mirrors existing pattern).
New endpoint (toolbox FastAPI, JWT, read-only):
```
GET /admin/cookie-crosssite?hours=24&top=50
→ { "trackers": [ {tracker_domain, sites, site_count, client_count,
cookie_count, pre_consent_hits, last_seen}, … ],
"window_hours": 24, "generated_at": <epoch> }
```
Placed next to the existing `/admin/social-aggregate` route. Reaches `social_edges`
through the same connection helper the other social endpoints use.
### 2. secubox-cookies — WebUI panel
In `packages/secubox-cookies/www/cookies/index.html`:
- New section **"🕸️ Trackers cross-site"** in the existing "Cookie Tracker"
dashboard.
- A table sorted by client_count then site_count, columns:
*Tracker · Sites suivis (badge N + tooltip listing the sites) · Clients ·
Cookies · Pré-consent · Vu (relative).*
- `loadCrossSite()` does `fetch('/api/v1/toolbox/admin/cookie-crosssite?hours=24')`
with the standard JWT-bearing fetch helper already used by the dashboard.
- Graceful degradation: empty `trackers` (or fetch failure) renders an
informative empty state ("aucune donnée R3 récente — tunnel captif inactif"),
never a broken table.
- No new dependency, no new service, no backend change in the cookies module
itself (pure frontend addition consuming the toolbox endpoint).
## Data flow
```
sbxmitm/social.go → POST /__toolbox/social-event → social_edges (toolbox.db)
(existing) (existing) (existing)
cookie_xsite_detail() ◀──────┘ (new)
GET /admin/cookie-crosssite (new)
cookies dashboard loadCrossSite() fetch + render (new)
```
## Testing
- **Unit (toolbox):** seed an in-memory sqlite `social_edges` with a tracker on
≥2 distinct sites + a 1-site tracker; assert `cookie_xsite_detail` returns only
the cross-site one with correct `site_count` / `client_count` / `cookie_count`,
excludes `src_site IN ('','null')`, respects the time window and `top_n` cap.
- **Endpoint:** assert `GET /admin/cookie-crosssite` requires JWT, returns the
envelope shape, and is read-only.
- **Frontend:** manual — verify the panel renders rows from a live/seeded
endpoint and shows the empty state when `trackers` is `[]`.
## Out of scope
- Fixing the R3 capture flow (edges stale since ~15:45 = idle tunnel, not this
feature's bug).
- Re-correlating / re-deriving edges (reuse `social_edges` as-is).
- Migrating the 84 `src_site='null'` rows (filtered at read time instead).
- The WAF server-side cookie-audit self-audit angle (explicitly deprioritised in
the brainstorm).
## Privacy
All identifiers exposed are already hashed at source: `client_mac_hash` (rotating
daily salt), `cookie_id_hash` (sha256 truncated, raw cookie values never reach the
ingest). The endpoint exposes counts and registrable tracker/site domains only —
no raw cookie values, no client identity. Consistent with the toolbox R2 doctrine.

@ -0,0 +1,169 @@
# Design — `sbxwaf`: host-native Go WAF engine (mitmproxy replacement)
- **Issue:** #744
- **Date:** 2026-06-26
- **Prior art:** #662 (port of toolbox R3 `sbxmitm`), `docs/superpowers/specs/2026-06-18-mitm-engine-migration-analysis.md`
- **Status:** design validated (brainstorming), awaiting review before the implementation plan
## 1. Context & problem
The SecuBox WAF inspects all incoming external traffic (HAProxy TLS 1.3 → backend
`mitmproxy_waf` → mitmdump `--mode regular` → LXC backends). Inspection runs in
`mitmproxy` 11.0.2 (LXC `10.100.0.60:8080`) with three Python addons:
- `secubox_waf.py` (930 lines): vhost→backend routing (`haproxy-routes.json`,
mtime reload, 10 s), regex rule engine (SQLi/XSS/LFI/RCE…), graduated ban
(300 s sliding window, threshold 3 → 403 WARNING then 403 BAN), CrowdSec LAPI
bridge (`/v1/alerts` → firewall-bouncer → nft drop), synthetic error pages,
`Connection: close` (#496), RFC1918 CIDR whitelist, static-asset skip, NC token bypass.
- `cookie_audit.py`: GDPR ledger of `Set-Cookie` headers (JSONL, SHA256-hashed values).
- `media_cache.py`: media response cache (16 MB/object, 2 GB total).
### Problems with the current engine
1. **Performance**: Python is GIL-bound. Phase 9 (#501) had to launch **4 workers + numgen
fanout** to saturate the cores. Python regex + asyncio dispatch on every request.
2. **Fragility**: dependency on the mitmproxy version (#605, `requestheaders` timing
in v11), on the confdir drop-in (#603), and on the `/data` vs `/srv` drift of the routes: three
recorded failure modes, each of which takes down every inspected vhost.
3. **RAM**: ~150-200 MB × 4 workers in the LXC.
## 2. Goal & decisions
| Axis | Decision |
|-----|----------|
| Main driver | **Performance/load** (throughput, p99 latency, RAM) |
| Scope | **FULL replacement**: no residual mitmproxy in the WAF |
| Placement | **Host-native** (`secubox-waf-ng-worker@` workers), hardened |
| Approach | **A**: dedicated `sbxwaf` binary, shared core extracted from `sbxmitm`, shadow→cutover |
### Estimated gains (to be validated by bench; these are BLOCKING go/no-go criteria)
- Throughput: **>5×/core** (no GIL, no fanout); bench target `>5× req/s·core`.
- p99 latency: **<⅓** (compiled regexp + concurrent GC, no refcount thrash).
- RAM: **<¼** (1 static binary ~30-80 MB vs 600-800 MB).
- Robustness: the 3 failure modes are removed (static binary, zero runtime).
These thresholds are **blocking**: no cutover until they are met on
the load bench (§7.3). If an incompressible live-dashboard case prevents a threshold from being met,
it is documented and explicitly arbitrated before cutover.
### Non-goals (YAGNI)
- No immediate unification of the engines (`sbxmitm` stays separate; approach B was rejected
so as not to couple the WAF and toolbox R3 release cycles).
- No JA4/TLS splice in the WAF (those are toolbox R3 needs, outside the WAF scope).
## 3. Target architecture
```
Internet ──TLS1.3──> HAProxy :443
│ use_backend mitmproxy_waf (ACL vhost)
backend mitmproxy_waf
server waf <HOST_IP>:8080 ◄── cutover flip (host instead of the LXC)
┌─────────────────────────────────────────┐
│  sbxwaf (host-native, user secubox-waf) │
│ workers ng-worker@1..2 (rolling restart)│
│  ├─ per-host CA forge (regular mode)    │
│  ├─ routes-loader (haproxy-routes.json) │
│  ├─ WAF rule engine (waf-rules.json)    │
│  ├─ graduated ban (sliding window)      │
│  ├─ CrowdSec LAPI bridge                │
│  ├─ cookie-audit JSONL                  │
│  ├─ media-cache                         │
│  └─ 502/503/504 error pages             │
└─────────────────────────────────────────┘
LXC backends 10.100.0.0/24
```
- **Network position identical** to mitmdump: listens on `:8080`, **same CA confdir**
(migrated `/data/mitmproxy` → `/etc/secubox/waf/ca`), **same `haproxy-routes.json`**
(mtime reload), **HAProxy backend unchanged** (only the `server waf` IP is flipped from the LXC to
the host). `sbxwaf` mirrors the exact TLS boundary (`--mode regular` forge).
- **Concurrency**: 1 process uses all cores. **2 workers** are kept for
zero-downtime rolling restarts (not for scaling); the 4-worker numgen fanout
goes away.
## 4. Components (isolated, testable units)
| Package / cmd | Role | Depends on |
|---|---|---|
| `internal/forge` | CA + per-host leaf forge (extracted from `sbxmitm`) | crypto/tls, x509 |
| `internal/relay` | async fire-and-forget POST over unix socket | net |
| `internal/httpcodec` | gzip/br/zstd decode+re-encode (extracted) | compress, brotli, zstd |
| `internal/util` | shared HTTP helpers | — |
| `cmd/sbxwaf/routes.go` | loads `haproxy-routes.json`, mtime reload, rewrites `req.Host/URL` | internal |
| `cmd/sbxwaf/rules.go` | regexes compiled from `waf-rules.json`, match on path/query/body/UA | regexp |
| `cmd/sbxwaf/ban.go` | 300 s sliding window, threshold → WARNING/BAN, lock-guarded TTL map | sync |
| `cmd/sbxwaf/crowdsec.go` | POST to LAPI `/v1/alerts` (JWT) | net/http |
| `cmd/sbxwaf/cookieaudit.go` | parses Set-Cookie, SHA256 hash, appends JSONL | crypto/sha256 |
| `cmd/sbxwaf/mediacache.go` | media response cache (16MB/2GB), reuses `mediacatch.go` | — |
| `cmd/sbxwaf/errpages.go` | embedded 502/503/504 templates | embed |
| `cmd/sbxwaf/main.go` | HTTP reverse proxy, inspection pipeline, listens on :8080 | net/http |
Each unit has a clear contract (input→verdict) and can be tested in isolation against
fixtures. The shared `internal/*` core is consumed by both `cmd/sbxmitm` and
`cmd/sbxwaf` without coupling their binaries.
## 5. Feature port (full replacement)
**Exact** parity with `secubox_waf.py` is required (security-critical, no regression):
- **Routing**: `requestheaders` → host lookup in the routes, target rewrite; unmapped
host → **421**.
- **Rules**: regex categories (SQLi/XSS/LFI/RCE…) from `waf-rules.json`
(enabled/severity), matched on path+query+body+UA. Static-asset skip (.js/.css/.png/
health/status), NC token bypass (`/index.php/login/v2/`, `/ocs/v2.php/core/login`).
- **Graduated ban**: 300 s sliding window, threshold 3 → 1st detection **403 WARNING**,
count≥3 **403 BAN**; RFC1918+loopback CIDR whitelist (LAN operators are never banned).
- **CrowdSec**: JWT alert → LAPI `/v1/alerts` → bouncer nft drop (4 h default).
- **Error pages**: 502/503/504 interception → themed pages.
- **Cookie-audit**: `response` → Set-Cookie → hashed JSONL.
- **Media-cache**: Content-Type/size/TTL → store/serve.
- **`Connection: close`** (#496) is kept.
## 6. Hardening (compensates for the loss of LXC isolation)
Running host-native exposes the WAF (attacker traffic) on the host → compensating controls
(CSPN requirement: privilege separation, AppArmor enforce):
- Unprivileged `User=secubox-waf` / `Group=secubox-waf` (created in postinst).
- `NoNewPrivileges=yes`, `ProtectSystem=strict` + minimal `ReadWritePaths`
(`/var/log/secubox`, `/var/cache/secubox/waf`, `/run/secubox`), `ProtectHome=yes`.
- `RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX`, all capabilities dropped,
`SystemCallFilter` (seccomp).
- **AppArmor profile in enforce mode** shipped in `debian/`, activated in postinst.
- **Append-only** audit logging to `/var/log/secubox/audit.log` (ban/unban/rule).
- Secrets (CrowdSec JWT, CA key) kept out of the code, in `/etc/secubox/secrets/`, chmod 600, owner
`secubox-waf`.
## 7. Migration: shadow → parity → cutover → rollback
1. **Shadow-run**: `sbxwaf` deployed on a **parallel port** (`:8081`), traffic
mirrored (HAProxy `mode tcp` mirror / tee). No production impact.
2. **Parity harness**: a request corpus (malicious + legitimate) replayed
against Python AND Go; compares the **verdict** (allow/204/403/421/ban) + the **routing
target**. Reuses the `parity-fixtures.json` pattern (#662). Any detection regression
is **blocking**.
3. **Perf bench** (go/no-go): throughput req/s·core, p99 latency, RSS; targets in §2.
4. **Cutover**: flip the HAProxy `server waf` (LXC IP → host:8080). **Rollback** =
re-flip to the LXC (mitmproxy stays deployed until validation).
## 8. Tests
- **Unit**: every `internal/*` package + `cmd/sbxwaf/*` (rules, ban, routes,
cookieaudit) with fixtures.
- **Parity**: harness §7.2 (vs live mitmproxy).
- **Load**: bench §7.3 (cutover criteria).
- **Security**: detection non-regression (known attack corpus) + CSPN tests
(privilege separation, AppArmor enforce, append-only audit).
## 9. Risks & mitigations
| Risk | Mitigation |
|---|---|
| WAF detection regression | Blocking parity harness + attack corpus before cutover |
| Loss of isolation (host-native) | Hardening §6 (dedicated user, AppArmor, seccomp, caps) |
| TLS forge boundary mirrored incorrectly | Shadow-run + response comparison; mitmproxy as rollback |
| Shared core ↔ toolbox coupling | Versioned `internal/*`, separate binaries, tests for both cmds |
| CrowdSec LAPI drift (auth/format) | LAPI integration test + log fallback if the POST fails |

@@ -404,6 +404,29 @@
 </tbody>
 </table>
 </div>
+<div class="card">
+<div class="card-title">
+<span>🕸️ Trackers cross-site (R3)</span>
+<span class="badge badge-cyan" id="crosssite-count">0</span>
+</div>
+<p class="empty" style="margin:0 0 .5rem">Cookies dont l'identifiant est réutilisé sur ≥2 sites first-party par le même client (source : tunnel captif R3).</p>
+<table>
+<thead>
+<tr>
+<th>Tracker</th>
+<th>Sites suivis</th>
+<th>Clients</th>
+<th>Cookies</th>
+<th>Pré-consent</th>
+<th>Vu</th>
+</tr>
+</thead>
+<tbody id="crosssite-table">
+<tr><td colspan="6" class="empty">Loading...</td></tr>
+</tbody>
+</table>
+</div>
 </div>
 <!-- Policies Tab -->
@@ -630,7 +653,7 @@
 // Load data for tab
 switch(tab) {
 case 'cookies': loadCookies(); break;
-case 'trackers': loadTrackers(); break;
+case 'trackers': loadTrackers(); loadCrossSite(); break;
 case 'policies': loadPolicies(); break;
 case 'violations': loadViolations(); break;
 case 'settings': loadConfig(); break;
@@ -777,6 +800,44 @@
 document.getElementById('trackers-table').innerHTML = html;
 }
+async function loadCrossSite() {
+const tbody = document.getElementById('crosssite-table');
+const countEl = document.getElementById('crosssite-count');
+try {
+const res = await fetch('/api/v1/toolbox/admin/cookie-crosssite?hours=24', { headers: headers() });
+if (!res.ok) throw new Error('http ' + res.status);
+const data = await res.json();
+const rows = (data && data.trackers) || [];
+countEl.textContent = rows.length;
+if (!rows.length) {
+tbody.innerHTML = '<tr><td colspan="6" class="empty">Aucune donnée R3 récente — tunnel captif inactif.</td></tr>';
+return;
+}
+tbody.innerHTML = rows.map(t => {
+const sites = (t.sites || []).join(', ');
+const seen = t.last_seen ? new Date(t.last_seen * 1000).toLocaleString() : '-';
+const pc = t.pre_consent_hits > 0
+? `<span class="badge badge-red">${Number(t.pre_consent_hits) | 0}</span>` : '0';
+return `<tr>
+<td><strong>${esc(t.tracker_domain)}</strong></td>
+<td><span class="badge badge-cyan" title="${esc(sites)}">${t.site_count}</span></td>
+<td>${t.client_count}</td>
+<td>${t.cookie_count}</td>
+<td>${pc}</td>
+<td style="white-space:nowrap">${esc(seen)}</td>
+</tr>`;
+}).join('');
+} catch (e) {
+countEl.textContent = '0';
+tbody.innerHTML = '<tr><td colspan="6" class="empty">Source R3 indisponible.</td></tr>';
+}
+}
+function esc(s) {
+return String(s == null ? '' : s).replace(/[&<>"']/g, c => (
+{ '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' }[c]));
+}
 async function loadPolicies() {
 const data = await api('/policies') || {};
 const policies = data.policies || [];
@@ -941,7 +1002,7 @@
 }
 async function refresh() {
-await Promise.all([loadStatus(), loadStats(), loadViolationsPreview()]);
+await Promise.all([loadStatus(), loadStats(), loadViolationsPreview(), loadCrossSite()]);
 }
 // Initial load

@@ -557,9 +557,15 @@ async def health():
 capture_output=True, text=True, timeout=3
 )
 nft_output = result.stdout if result.returncode == 0 else ""
-checks["nftables_crowdsec"] = "ip crowdsec" in nft_output
-checks["nftables_crowdsec6"] = "ip6 crowdsec6" in nft_output
-checks["nftables_ok"] = checks["nftables_crowdsec"] and checks["nftables_crowdsec6"]
+# The SecuBox firewall-bouncer uses a CUSTOM table name (inet
+# secubox_blacklist), not the upstream default `ip crowdsec` / `ip6
+# crowdsec6` — so the legacy probe always missed it and reported the
+# firewall "not OK" even though it was active. Detect both the custom and
+# the default names, and base nftables_ok on the GENERAL SecuBox firewall
+# being loaded (inet filter / secubox_blacklist), not on the IPv6 anchor.
+checks["nftables_crowdsec"] = ("ip crowdsec" in nft_output) or ("secubox_blacklist" in nft_output)
+checks["nftables_crowdsec6"] = ("ip6 crowdsec6" in nft_output) or ("crowdsec6" in nft_output)
+checks["nftables_ok"] = ("inet filter" in nft_output) or ("secubox_blacklist" in nft_output)
 except Exception as e:
 log.warning("nftables check failed: %s", e)
 checks["nftables_ok"] = False

@@ -21,7 +21,7 @@
 if (window.__SBX_HEALTH_BANNER__) return;
 window.__SBX_HEALTH_BANNER__ = true;
-const VERSION = '1.4.5';
+const VERSION = '1.4.7';
 const VISITOR_ORIGIN_API = window.SECUBOX_VISITOR_ORIGIN_API
 || '/api/v1/metrics/visitor-origin';
 const LIVE_HOSTS_API = window.SECUBOX_LIVE_HOSTS_API
@@ -926,6 +926,35 @@
 document.body.appendChild(trigger);
 document.body.appendChild(banner);
+// ── SPA re-inject guard (#750) ─────────────────────────────────────
+// SPA sites (x.com, Next.js news) rebuild <body> on hydration, wiping
+// our appended nodes; the one-shot __SBX_HEALTH_BANNER__ guard then
+// blocks any re-init, so the banner never returns. Re-attach the
+// already-created nodes — and re-add the styles if <head> was cleared
+// too — whenever they detach. The closure keeps the refs alive even
+// after the DOM node is wiped, and re-appending the SAME nodes
+// preserves their event listeners.
+function ensureMounted() {
+injectBannerStyles(); // id-guarded: no-op when the <style> is present
+const body = document.body;
+if (!body) return;
+if (!trigger.isConnected) body.appendChild(trigger);
+if (!banner.isConnected) {
+body.appendChild(banner);
+// Re-sync the layout-shift class: a body wiped while the banner
+// was expanded loses 'health-banner-open' on the fresh body.
+body.classList.toggle('health-banner-open', banner.classList.contains('expanded'));
+}
+}
+try {
+// childList on <html> catches a full <body> element swap (cheap, no subtree).
+new MutationObserver(ensureMounted)
+.observe(document.documentElement, { childList: true });
+} catch (_) { /* MutationObserver unsupported → the interval below covers it */ }
+// Fallback for body.innerHTML='' (children cleared, body element kept),
+// which a childList-only observer on <html> does not see.
+setInterval(ensureMounted, 1500);
 // Toggle banner on trigger click
 trigger.addEventListener('click', () => {
 const isOpen = banner.classList.toggle('expanded');

@@ -10,3 +10,4 @@ cmd/sbxmitm/sbxmitm
 /debian/secubox-toolbox-ng/
 /debian/debhelper-build-stamp
 /debian/*.debhelper.log
+/sbxwaf

@@ -21,6 +21,8 @@ import (
 "path/filepath"
 "testing"
 "time"
+"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/forge"
 )
 func benchCA(b *testing.B) (string, string) {
@@ -49,12 +51,12 @@ func benchCA(b *testing.B) (string, string) {
 // load (warm forge cache). req/s should rise ~linearly with -cpu (no GIL).
 func BenchmarkHandshake(b *testing.B) {
 cp, kp := benchCA(b)
-ca, err := loadCA(cp, kp)
+ca, err := forge.LoadCA(cp, kp)
 if err != nil {
 b.Fatal(err)
 }
 px := &Proxy{ca: ca}
-if _, err := ca.forge("example.com"); err != nil { // warm cache
+if _, err := ca.Forge("example.com"); err != nil { // warm cache
 b.Fatal(err)
 }
 ln, err := net.Listen("tcp", "127.0.0.1:0")
@@ -77,7 +79,7 @@ func BenchmarkHandshake(b *testing.B) {
 }
 }()
 pool := x509.NewCertPool()
-pool.AddCert(ca.cert)
+pool.AddCert(ca.Cert)
 addr := ln.Addr().String()
 ccfg := &tls.Config{ServerName: "example.com", RootCAs: pool, MinVersion: tls.VersionTLS12}

@@ -15,6 +15,8 @@ import (
 "net/http"
 "strings"
 "testing"
+"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/httpcodec"
 )
 // TestAcceptEncodingPreserved pins the #662 behaviour change: the request
@@ -48,13 +50,13 @@ func TestBrotliRoundTrip(t *testing.T) {
 bytes.Repeat([]byte("AB"), 100000),
 }
 for _, x := range cases {
-enc, err := brotliBytes(x)
+enc, err := httpcodec.BrotliBytes(x)
 if err != nil {
-t.Fatalf("brotliBytes(%d): %v", len(x), err)
+t.Fatalf("BrotliBytes(%d): %v", len(x), err)
 }
-got, err := unbrotliBytes(enc)
+got, err := httpcodec.UnbrotliBytes(enc)
 if err != nil {
-t.Fatalf("unbrotliBytes(%d): %v", len(x), err)
+t.Fatalf("UnbrotliBytes(%d): %v", len(x), err)
 }
 if !bytes.Equal(got, x) {
 t.Fatalf("brotli round-trip mismatch: got %d want %d", len(got), len(x))
@@ -70,13 +72,13 @@ func TestZstdRoundTrip(t *testing.T) {
 bytes.Repeat([]byte("AB"), 100000),
 }
 for _, x := range cases {
-enc, err := zstdBytes(x)
+enc, err := httpcodec.ZstdBytes(x)
 if err != nil {
-t.Fatalf("zstdBytes(%d): %v", len(x), err)
+t.Fatalf("ZstdBytes(%d): %v", len(x), err)
 }
-got, err := unzstdBytes(enc)
+got, err := httpcodec.UnzstdBytes(enc)
 if err != nil {
-t.Fatalf("unzstdBytes(%d): %v", len(x), err)
+t.Fatalf("UnzstdBytes(%d): %v", len(x), err)
 }
 if !bytes.Equal(got, x) {
 t.Fatalf("zstd round-trip mismatch: got %d want %d", len(got), len(x))
@@ -86,7 +88,7 @@ func TestZstdRoundTrip(t *testing.T) {
 func TestInjectIntoBodyBrotli(t *testing.T) {
 html := `<html><head><title>page</title></head><body>content</body></html>`
-enc, err := brotliBytes([]byte(html))
+enc, err := httpcodec.BrotliBytes([]byte(html))
 if err != nil {
 t.Fatal(err)
 }
@@ -94,7 +96,7 @@ func TestInjectIntoBodyBrotli(t *testing.T) {
 if !ok {
t.Fatal("br inject must report ok=true") t.Fatal("br inject must report ok=true")
} }
plain, err := unbrotliBytes(out) plain, err := httpcodec.UnbrotliBytes(out)
if err != nil { if err != nil {
t.Fatalf("re-brotli'd output must decode cleanly (encoding stays br): %v", err) t.Fatalf("re-brotli'd output must decode cleanly (encoding stays br): %v", err)
} }
@ -109,7 +111,7 @@ func TestInjectIntoBodyBrotli(t *testing.T) {
func TestInjectIntoBodyZstd(t *testing.T) { func TestInjectIntoBodyZstd(t *testing.T) {
html := `<html><head><title>page</title></head><body>content</body></html>` html := `<html><head><title>page</title></head><body>content</body></html>`
enc, err := zstdBytes([]byte(html)) enc, err := httpcodec.ZstdBytes([]byte(html))
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
} }
@ -117,7 +119,7 @@ func TestInjectIntoBodyZstd(t *testing.T) {
if !ok { if !ok {
t.Fatal("zstd inject must report ok=true") t.Fatal("zstd inject must report ok=true")
} }
plain, err := unzstdBytes(out) plain, err := httpcodec.UnzstdBytes(out)
if err != nil { if err != nil {
t.Fatalf("re-zstd'd output must decode cleanly (encoding stays zstd): %v", err) t.Fatalf("re-zstd'd output must decode cleanly (encoding stays zstd): %v", err)
} }
@ -131,12 +133,12 @@ func TestInjectIntoBodyZstd(t *testing.T) {
} }
func TestInjectIntoBodyBrotliCaseInsensitive(t *testing.T) { func TestInjectIntoBodyBrotliCaseInsensitive(t *testing.T) {
enc, _ := brotliBytes([]byte(`<head></head>`)) enc, _ := httpcodec.BrotliBytes([]byte(`<head></head>`))
out, ok := injectIntoBody(enc, "BR", inlineTestScript, "", false) out, ok := injectIntoBody(enc, "BR", inlineTestScript, "", false)
if !ok { if !ok {
t.Fatal("Content-Encoding BR (upper) must be recognised → ok=true") t.Fatal("Content-Encoding BR (upper) must be recognised → ok=true")
} }
plain, err := unbrotliBytes(out) plain, err := httpcodec.UnbrotliBytes(out)
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
} }
@ -168,25 +170,26 @@ func TestInjectIntoBodyZstdFailOpen(t *testing.T) {
} }
func TestBrotliZstdBombGuard(t *testing.T) { func TestBrotliZstdBombGuard(t *testing.T) {
zeros := make([]byte, gunzipCap+4096) const bombCap = 32 << 20 // mirrors httpcodec.gunzipCap
brBomb, err := brotliBytes(zeros) zeros := make([]byte, bombCap+4096)
brBomb, err := httpcodec.BrotliBytes(zeros)
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
} }
if _, err := unbrotliBytes(brBomb); err == nil { if _, err := httpcodec.UnbrotliBytes(brBomb); err == nil {
t.Fatal("unbrotliBytes must reject output exceeding gunzipCap") t.Fatal("UnbrotliBytes must reject output exceeding gunzipCap")
} }
// fail-open through the inject path. // fail-open through the inject path.
if out, ok := injectIntoBody(brBomb, "br", inlineTestScript, "", false); ok || !bytes.Equal(out, brBomb) { if out, ok := injectIntoBody(brBomb, "br", inlineTestScript, "", false); ok || !bytes.Equal(out, brBomb) {
t.Fatal("over-cap br body must fail open with original bytes") t.Fatal("over-cap br body must fail open with original bytes")
} }
zsBomb, err := zstdBytes(zeros) zsBomb, err := httpcodec.ZstdBytes(zeros)
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
} }
if _, err := unzstdBytes(zsBomb); err == nil { if _, err := httpcodec.UnzstdBytes(zsBomb); err == nil {
t.Fatal("unzstdBytes must reject output exceeding gunzipCap") t.Fatal("UnzstdBytes must reject output exceeding gunzipCap")
} }
if out, ok := injectIntoBody(zsBomb, "zstd", inlineTestScript, "", false); ok || !bytes.Equal(out, zsBomb) { if out, ok := injectIntoBody(zsBomb, "zstd", inlineTestScript, "", false); ok || !bytes.Equal(out, zsBomb) {
t.Fatal("over-cap zstd body must fail open with original bytes") t.Fatal("over-cap zstd body must fail open with original bytes")


@ -7,6 +7,8 @@ package main
import ( import (
"strings" "strings"
"testing" "testing"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/httpcodec"
) )
// representativeSelectors covers each ported group + an EXPANDED popup token, // representativeSelectors covers each ported group + an EXPANDED popup token,
@ -167,12 +169,12 @@ func TestInjectHTMLNonWGSkipsCosmetic(t *testing.T) {
func TestInjectIntoBodyGzipCarriesCosmetic(t *testing.T) { func TestInjectIntoBodyGzipCarriesCosmetic(t *testing.T) {
// The gzip decompress→inject→recompress path must carry BOTH injects for wg. // The gzip decompress→inject→recompress path must carry BOTH injects for wg.
body := []byte(`<html><head></head><body>hi</body></html>`) body := []byte(`<html><head></head><body>hi</body></html>`)
gz := gzipBytes(body) gz := httpcodec.GzipBytes(body)
out, ok := injectIntoBody(gz, "gzip", inlineTestScript, "", true) out, ok := injectIntoBody(gz, "gzip", inlineTestScript, "", true)
if !ok { if !ok {
t.Fatalf("injectIntoBody(gzip) returned ok=false") t.Fatalf("injectIntoBody(gzip) returned ok=false")
} }
plain, err := gunzipBytes(out) plain, err := httpcodec.GunzipBytes(out)
if err != nil { if err != nil {
t.Fatalf("re-gzip output not gunzippable: %v", err) t.Fatalf("re-gzip output not gunzippable: %v", err)
} }


@ -17,6 +17,10 @@
// open (serve the ORIGINAL bytes on any decode/encode error — never corrupt a // open (serve the ORIGINAL bytes on any decode/encode error — never corrupt a
// page); unknown encodings pass through untouched. // page); unknown encodings pass through untouched.
// //
// Codec primitives (GunzipBytes / GzipBytes / UnbrotliBytes / BrotliBytes /
// UnzstdBytes / ZstdBytes) live in internal/httpcodec so that cmd/sbxwaf can
// reuse them. This file only contains the sbxmitm-specific inject logic.
//
// Dependencies (cgo-free, pure-Go): // Dependencies (cgo-free, pure-Go):
// - compress/gzip (stdlib) // - compress/gzip (stdlib)
// - github.com/andybalholm/brotli (br) // - github.com/andybalholm/brotli (br)
@ -24,127 +28,11 @@
package main package main
import ( import (
"bytes"
"compress/gzip"
"io"
"strings" "strings"
"github.com/andybalholm/brotli" "github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/httpcodec"
"github.com/klauspost/compress/zstd"
) )
// gunzipCap bounds the decompressed output of EVERY codec (gzip/br/zstd) so a
// maliciously-crafted body (a "decompression bomb") cannot blow the worker's
// memory. The upstream body itself is already read under an 8MiB LimitReader;
// 32MiB of inflated HTML is a generous ceiling for a single page. Exceeding it →
// treated as an error (caller fails open and serves the original compressed
// bytes). Named gunzipCap for history; applies uniformly to br + zstd too.
const gunzipCap = 32 << 20
// readCapped inflates a decompressing reader with the gunzipCap bomb guard,
// shared by gzip/br/zstd. Reads up to gunzipCap+1 so "exactly at the cap" (fine)
// is distinguished from "over the cap" (bomb → error).
func readCapped(r io.Reader) ([]byte, error) {
out, err := io.ReadAll(io.LimitReader(r, gunzipCap+1))
if err != nil {
return nil, err
}
if len(out) > gunzipCap {
return nil, errGunzipTooLarge
}
return out, nil
}
// gunzipBytes inflates a gzip-compressed body. It is defensive on two axes:
// - a malformed/non-gzip input returns an error (caller fails open),
// - the decompressed output is capped at gunzipCap; if the stream would
// exceed it, that is reported as an error too (decompression-bomb guard).
func gunzipBytes(in []byte) ([]byte, error) {
zr, err := gzip.NewReader(bytes.NewReader(in))
if err != nil {
return nil, err
}
defer zr.Close()
return readCapped(zr)
}
// errGunzipTooLarge is returned by gunzipBytes when the decompressed stream
// exceeds gunzipCap (decompression-bomb guard).
var errGunzipTooLarge = errString("gunzip output exceeds cap")
// errString is a tiny stdlib-only error type (avoids importing errors/fmt for
// one sentinel).
type errString string
func (e errString) Error() string { return string(e) }
// gzipBytes compresses in with the default gzip level. It never errors: the
// gzip.Writer only writes into an in-memory bytes.Buffer, which cannot fail.
func gzipBytes(in []byte) []byte {
var buf bytes.Buffer
zw := gzip.NewWriter(&buf)
_, _ = zw.Write(in)
_ = zw.Close()
return buf.Bytes()
}
// unbrotliBytes inflates a brotli-compressed body with the gunzipCap bomb guard.
// A malformed/non-brotli input or an over-cap stream returns an error (caller
// fails open). Pure-Go (github.com/andybalholm/brotli — cgo-free).
func unbrotliBytes(in []byte) ([]byte, error) {
return readCapped(brotli.NewReader(bytes.NewReader(in)))
}
// brotliBytes compresses in with brotli at the default quality. It writes into
// an in-memory buffer; Close flushes the final block. The bytes.Buffer cannot
// fail, but brotli.Writer.Write/Close return errors → surfaced so the caller
// fails open rather than serving a truncated stream.
func brotliBytes(in []byte) ([]byte, error) {
var buf bytes.Buffer
bw := brotli.NewWriter(&buf)
if _, err := bw.Write(in); err != nil {
_ = bw.Close()
return nil, err
}
if err := bw.Close(); err != nil {
return nil, err
}
return buf.Bytes(), nil
}
// unzstdBytes inflates a zstd-compressed body with the gunzipCap bomb guard. A
// malformed/non-zstd input or an over-cap stream returns an error (caller fails
// open). Pure-Go (github.com/klauspost/compress/zstd — cgo-free). The decoder is
// created per-call WITHOUT concurrency goroutines (WithDecoderConcurrency(1)) so
// nothing is left running, then Closed.
func unzstdBytes(in []byte) ([]byte, error) {
zr, err := zstd.NewReader(bytes.NewReader(in), zstd.WithDecoderConcurrency(1))
if err != nil {
return nil, err
}
defer zr.Close()
return readCapped(zr)
}
// zstdBytes compresses in with zstd at the default level. The encoder is created
// per-call and Closed (flushing the final frame). Errors are surfaced so the
// caller fails open rather than serving a truncated frame.
func zstdBytes(in []byte) ([]byte, error) {
var buf bytes.Buffer
zw, err := zstd.NewWriter(&buf, zstd.WithEncoderConcurrency(1))
if err != nil {
return nil, err
}
if _, err := zw.Write(in); err != nil {
_ = zw.Close()
return nil, err
}
if err := zw.Close(); err != nil {
return nil, err
}
return buf.Bytes(), nil
}
// injectHTML applies BOTH HTML transforms in one pass over the DECOMPRESSED // injectHTML applies BOTH HTML transforms in one pass over the DECOMPRESSED
// body: the transparency-banner (always, via the INLINE script) AND, for R3 (wg) // body: the transparency-banner (always, via the INLINE script) AND, for R3 (wg)
// clients, the ad/popup-hiding cosmetic <style> (#662 — the cutover left this // clients, the ad/popup-hiding cosmetic <style> (#662 — the cutover left this
@ -188,34 +76,35 @@ func injectHTML(plain []byte, scriptBody, nonce string, wg bool) []byte {
// encoder error), the ORIGINAL bytes are returned with ok=false so the page is // encoder error), the ORIGINAL bytes are returned with ok=false so the page is
// never broken or corrupted. // never broken or corrupted.
// //
// The 32MiB decompression-bomb cap (gunzipCap) is enforced uniformly across // The 32 MiB decompression-bomb cap (gunzipCap) is enforced uniformly across
// gzip/br/zstd. idempotency / placement live inside injectInlineBanner/injectCosmetic. // gzip/br/zstd inside internal/httpcodec. idempotency / placement live inside
// injectInlineBanner/injectCosmetic.
func injectIntoBody(body []byte, encoding, scriptBody, nonce string, wg bool) (out []byte, ok bool) { func injectIntoBody(body []byte, encoding, scriptBody, nonce string, wg bool) (out []byte, ok bool) {
switch strings.ToLower(strings.TrimSpace(encoding)) { switch strings.ToLower(strings.TrimSpace(encoding)) {
case "": case "":
return injectHTML(body, scriptBody, nonce, wg), true return injectHTML(body, scriptBody, nonce, wg), true
case "gzip": case "gzip":
plain, err := gunzipBytes(body) plain, err := httpcodec.GunzipBytes(body)
if err != nil { if err != nil {
return body, false // fail open: serve the original compressed bytes return body, false // fail open: serve the original compressed bytes
} }
return gzipBytes(injectHTML(plain, scriptBody, nonce, wg)), true return httpcodec.GzipBytes(injectHTML(plain, scriptBody, nonce, wg)), true
case "br": case "br":
plain, err := unbrotliBytes(body) plain, err := httpcodec.UnbrotliBytes(body)
if err != nil { if err != nil {
return body, false // fail open return body, false // fail open
} }
reenc, err := brotliBytes(injectHTML(plain, scriptBody, nonce, wg)) reenc, err := httpcodec.BrotliBytes(injectHTML(plain, scriptBody, nonce, wg))
if err != nil { if err != nil {
return body, false // fail open: never serve a truncated br frame return body, false // fail open: never serve a truncated br frame
} }
return reenc, true return reenc, true
case "zstd": case "zstd":
plain, err := unzstdBytes(body) plain, err := httpcodec.UnzstdBytes(body)
if err != nil { if err != nil {
return body, false // fail open return body, false // fail open
} }
reenc, err := zstdBytes(injectHTML(plain, scriptBody, nonce, wg)) reenc, err := httpcodec.ZstdBytes(injectHTML(plain, scriptBody, nonce, wg))
if err != nil { if err != nil {
return body, false // fail open: never serve a truncated zstd frame return body, false // fail open: never serve a truncated zstd frame
} }
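
Note on the bomb guard moved out of this file: the removed readCapped helper reads one byte past the cap so that "exactly at the cap" (allowed) can be told apart from "over the cap" (rejected). The internal/httpcodec source is not part of this view, so the following is only a minimal sketch of that technique, assuming the package keeps the same logic as the removed code. The names capBytes, errTooLarge and gunzipCapped are hypothetical, and the cap is shrunk to 1 KiB for the demo (the diff uses 32 << 20).

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// capBytes is a hypothetical, deliberately small cap for this demo.
const capBytes = 1 << 10

// errTooLarge is a hypothetical sentinel for an over-cap stream.
var errTooLarge = errors.New("decompressed output exceeds cap")

// readCapped reads at most capBytes+1 bytes from r. Reading one byte past
// the cap lets the caller distinguish "exactly at the cap" from "over it".
func readCapped(r io.Reader) ([]byte, error) {
	out, err := io.ReadAll(io.LimitReader(r, capBytes+1))
	if err != nil {
		return nil, err
	}
	if len(out) > capBytes {
		return nil, errTooLarge
	}
	return out, nil
}

// gzipBytes compresses in into an in-memory buffer.
func gzipBytes(in []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	_, _ = zw.Write(in)
	_ = zw.Close()
	return buf.Bytes()
}

// gunzipCapped inflates a gzip body under the cap.
func gunzipCapped(in []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(in))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return readCapped(zr)
}

func main() {
	_, errAt := gunzipCapped(gzipBytes(make([]byte, capBytes)))
	_, errOver := gunzipCapped(gzipBytes(make([]byte, capBytes+1)))
	fmt.Println(errAt == nil, errOver == errTooLarge)
}
```

A caller following the fail-open rule in injectIntoBody would treat the over-cap error like any other decode error and serve the original compressed bytes untouched.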


@ -13,6 +13,8 @@ import (
"bytes" "bytes"
"strings" "strings"
"testing" "testing"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/httpcodec"
) )
func TestGzipRoundTrip(t *testing.T) { func TestGzipRoundTrip(t *testing.T) {
@ -23,9 +25,9 @@ func TestGzipRoundTrip(t *testing.T) {
bytes.Repeat([]byte("AB"), 100000), // larger, compressible payload bytes.Repeat([]byte("AB"), 100000), // larger, compressible payload
} }
for _, x := range cases { for _, x := range cases {
got, err := gunzipBytes(gzipBytes(x)) got, err := httpcodec.GunzipBytes(httpcodec.GzipBytes(x))
if err != nil { if err != nil {
t.Fatalf("gunzipBytes(gzipBytes(%d bytes)) errored: %v", len(x), err) t.Fatalf("GunzipBytes(GzipBytes(%d bytes)) errored: %v", len(x), err)
} }
if !bytes.Equal(got, x) { if !bytes.Equal(got, x) {
t.Fatalf("round-trip mismatch: got %d bytes, want %d bytes", len(got), len(x)) t.Fatalf("round-trip mismatch: got %d bytes, want %d bytes", len(got), len(x))
@ -35,8 +37,8 @@ func TestGzipRoundTrip(t *testing.T) {
func TestGunzipNonGzipFails(t *testing.T) { func TestGunzipNonGzipFails(t *testing.T) {
// Plain bytes that are not a gzip stream → error, no panic. // Plain bytes that are not a gzip stream → error, no panic.
if _, err := gunzipBytes([]byte("this is definitely not gzip")); err == nil { if _, err := httpcodec.GunzipBytes([]byte("this is definitely not gzip")); err == nil {
t.Fatal("gunzipBytes on non-gzip input must error") t.Fatal("GunzipBytes on non-gzip input must error")
} }
} }
@ -44,11 +46,11 @@ func TestInjectIntoBodyGzip(t *testing.T) {
// End-to-end-ish: HTML with <head>, gzipped, run through the exact transform // End-to-end-ish: HTML with <head>, gzipped, run through the exact transform
// the inject path uses. Result must gunzip back to an injected, intact doc. // the inject path uses. Result must gunzip back to an injected, intact doc.
html := `<html><head><title>page</title></head><body>content</body></html>` html := `<html><head><title>page</title></head><body>content</body></html>`
out, ok := injectIntoBody(gzipBytes([]byte(html)), "gzip", inlineTestScript, "", true) out, ok := injectIntoBody(httpcodec.GzipBytes([]byte(html)), "gzip", inlineTestScript, "", true)
if !ok { if !ok {
t.Fatal("gzip inject must report ok=true") t.Fatal("gzip inject must report ok=true")
} }
plain, err := gunzipBytes(out) plain, err := httpcodec.GunzipBytes(out)
if err != nil { if err != nil {
t.Fatalf("re-gzipped output must gunzip cleanly: %v", err) t.Fatalf("re-gzipped output must gunzip cleanly: %v", err)
} }
@ -68,11 +70,11 @@ func TestInjectIntoBodyGzip(t *testing.T) {
func TestInjectIntoBodyGzipCaseInsensitiveEncoding(t *testing.T) { func TestInjectIntoBodyGzipCaseInsensitiveEncoding(t *testing.T) {
html := `<head></head>` html := `<head></head>`
out, ok := injectIntoBody(gzipBytes([]byte(html)), "GZIP", inlineTestScript, "", false) out, ok := injectIntoBody(httpcodec.GzipBytes([]byte(html)), "GZIP", inlineTestScript, "", false)
if !ok { if !ok {
t.Fatal("Content-Encoding GZIP (upper) must be recognised → ok=true") t.Fatal("Content-Encoding GZIP (upper) must be recognised → ok=true")
} }
plain, err := gunzipBytes(out) plain, err := httpcodec.GunzipBytes(out)
if err != nil { if err != nil {
t.Fatalf("gunzip failed: %v", err) t.Fatalf("gunzip failed: %v", err)
} }
@ -125,10 +127,11 @@ func TestInjectIntoBodyUnknownEncodingPassthrough(t *testing.T) {
func TestGunzipBombGuard(t *testing.T) { func TestGunzipBombGuard(t *testing.T) {
// A body that inflates beyond gunzipCap must be rejected (not OOM the worker). // A body that inflates beyond gunzipCap must be rejected (not OOM the worker).
// gzip of >32MiB of zeros compresses to a small blob but inflates past the // gzip of >32MiB of zeros compresses to a small blob but inflates past the
// cap → gunzipBytes returns an error → inject path fails open. // cap → GunzipBytes returns an error → inject path fails open.
big := gzipBytes(make([]byte, gunzipCap+1024)) const bombCap = 32 << 20 // mirrors httpcodec.gunzipCap
if _, err := gunzipBytes(big); err == nil { big := httpcodec.GzipBytes(make([]byte, bombCap+1024))
t.Fatal("gunzipBytes must reject output exceeding gunzipCap") if _, err := httpcodec.GunzipBytes(big); err == nil {
t.Fatal("GunzipBytes must reject output exceeding gunzipCap")
} }
// And via the inject path: fail open, original bytes preserved. // And via the inject path: fail open, original bytes preserved.
out, ok := injectIntoBody(big, "gzip", inlineTestScript, "", false) out, ok := injectIntoBody(big, "gzip", inlineTestScript, "", false)
@ -142,12 +145,13 @@ func TestGunzipBombGuard(t *testing.T) {
func TestGunzipExactlyAtCap(t *testing.T) { func TestGunzipExactlyAtCap(t *testing.T) {
// A body that inflates to EXACTLY gunzipCap is allowed (boundary). // A body that inflates to EXACTLY gunzipCap is allowed (boundary).
payload := make([]byte, gunzipCap) const bombCap = 32 << 20 // mirrors httpcodec.gunzipCap
got, err := gunzipBytes(gzipBytes(payload)) payload := make([]byte, bombCap)
got, err := httpcodec.GunzipBytes(httpcodec.GzipBytes(payload))
if err != nil { if err != nil {
t.Fatalf("exactly-at-cap payload must be allowed: %v", err) t.Fatalf("exactly-at-cap payload must be allowed: %v", err)
} }
if len(got) != gunzipCap { if len(got) != bombCap {
t.Fatalf("at-cap length mismatch: got %d, want %d", len(got), gunzipCap) t.Fatalf("at-cap length mismatch: got %d, want %d", len(got), bombCap)
} }
} }


@ -22,138 +22,20 @@ package main
import ( import (
"bytes" "bytes"
"crypto"
"crypto/rand"
"crypto/tls" "crypto/tls"
"crypto/x509"
"crypto/x509/pkix"
"encoding/pem"
"flag" "flag"
"fmt" "fmt"
"io" "io"
"log" "log"
"math/big"
"net" "net"
"net/http" "net/http"
"os"
"strconv" "strconv"
"strings" "strings"
"sync"
"time" "time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/forge"
) )
// ── CA + per-host leaf forging ──────────────────────────────────────────────
// CA holds the loaded forging CA (reused from ca-wg) + a per-host leaf cache.
type CA struct {
cert *x509.Certificate
key crypto.Signer
mu sync.Mutex
cache map[string]*tls.Certificate
}
func loadCA(certPath, keyPath string) (*CA, error) {
cpem, err := os.ReadFile(certPath)
if err != nil {
return nil, fmt.Errorf("read ca cert: %w", err)
}
kpem, err := os.ReadFile(keyPath)
if err != nil {
return nil, fmt.Errorf("read ca key: %w", err)
}
// Scan for the right block TYPE rather than assuming position: the live R3
// CA the toolbox forges with (mitmproxy confdir `mitmproxy-ca.pem`) is a
// COMBINED cert+key bundle, and --ca-key may point at it. Tolerate cert and
// key co-residing in either file, in any order.
cblk := firstPEMBlock(cpem, func(b *pem.Block) bool { return b.Type == "CERTIFICATE" })
if cblk == nil {
return nil, fmt.Errorf("ca cert: no CERTIFICATE PEM block")
}
cert, err := x509.ParseCertificate(cblk.Bytes)
if err != nil {
return nil, fmt.Errorf("parse ca cert: %w", err)
}
kblk := firstPEMBlock(kpem, func(b *pem.Block) bool { return strings.Contains(b.Type, "PRIVATE KEY") })
if kblk == nil {
return nil, fmt.Errorf("ca key: no PRIVATE KEY PEM block")
}
key, err := parseKey(kblk.Bytes)
if err != nil {
return nil, fmt.Errorf("parse ca key: %w", err)
}
return &CA{cert: cert, key: key, cache: map[string]*tls.Certificate{}}, nil
}
// firstPEMBlock returns the first PEM block in data satisfying want, or nil.
// Used to pull a specific block (CERTIFICATE / PRIVATE KEY) out of a file that
// may hold several (e.g. mitmproxy's combined CA bundle).
func firstPEMBlock(data []byte, want func(*pem.Block) bool) *pem.Block {
for {
blk, rest := pem.Decode(data)
if blk == nil {
return nil
}
if want(blk) {
return blk
}
data = rest
}
}
func parseKey(der []byte) (crypto.Signer, error) {
if k, err := x509.ParsePKCS8PrivateKey(der); err == nil {
if s, ok := k.(crypto.Signer); ok {
return s, nil
}
}
if k, err := x509.ParsePKCS1PrivateKey(der); err == nil {
return k, nil
}
if k, err := x509.ParseECPrivateKey(der); err == nil {
return k, nil
}
return nil, fmt.Errorf("unsupported CA key format")
}
// forge returns a leaf cert for host signed by the CA, cached.
func (c *CA) forge(host string) (*tls.Certificate, error) {
host = strings.ToLower(strings.TrimSpace(host))
c.mu.Lock()
if tc, ok := c.cache[host]; ok {
c.mu.Unlock()
return tc, nil
}
c.mu.Unlock()
serial, _ := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 128))
tmpl := &x509.Certificate{
SerialNumber: serial,
Subject: pkix.Name{CommonName: host},
// #689 — forged leaves must outlive the (non-evicting) cert cache, else a
// long-running worker keeps serving an expired leaf and every client
// reports "certificat expiré". 365d forward + 48h back-skew = 367d span,
// safely under Apple's 398-day max-validity rule for server certs.
NotBefore: time.Now().Add(-48 * time.Hour),
NotAfter: time.Now().Add(365 * 24 * time.Hour),
KeyUsage: x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
DNSNames: []string{host},
}
der, err := x509.CreateCertificate(rand.Reader, tmpl, c.cert, c.key.Public(), c.key)
if err != nil {
return nil, err
}
leaf, err := x509.ParseCertificate(der) // parsed cert has Raw populated (Verify needs it)
if err != nil {
return nil, err
}
tc := &tls.Certificate{Certificate: [][]byte{der, c.cert.Raw}, PrivateKey: c.key, Leaf: leaf}
c.mu.Lock()
c.cache[host] = tc
c.mu.Unlock()
return tc, nil
}
// ── Pure handler logic ─────────────────────────────────────────────────────── // ── Pure handler logic ───────────────────────────────────────────────────────
// //
// The decision surface (Decide / action / registrable / splice helpers) lives // The decision surface (Decide / action / registrable / splice helpers) lives
@ -201,7 +83,7 @@ func ja4ish(h *tls.ClientHelloInfo) string {
// ── CONNECT-proxy MITM wiring ──────────────────────────────────────────────── // ── CONNECT-proxy MITM wiring ────────────────────────────────────────────────
type Proxy struct { type Proxy struct {
ca *CA ca *forge.CA
pol *Policy pol *Policy
jaSink func(string) // JA4 observations (logged; a sidecar in prod) jaSink func(string) // JA4 observations (logged; a sidecar in prod)
jarKey []byte // anti-track HMAC fake-identity seed (nil → poison off) jarKey []byte // anti-track HMAC fake-identity seed (nil → poison off)
@ -289,7 +171,7 @@ func (px *Proxy) serverTLSConfigCapture(capture func(*tls.ClientHelloInfo)) *tls
if name == "" { if name == "" {
name = "unknown.local" name = "unknown.local"
} }
return px.ca.forge(name) return px.ca.Forge(name)
}, },
} }
} }
@ -627,7 +509,7 @@ func main() {
mediaCatch := flag.Bool("media-catch", true, mediaCatch := flag.Bool("media-catch", true,
"R4 media reverse-catcher (#736): record cloneable media URLs (HLS/DASH manifests + direct audio/video) seen on MITM'd flows to "+mediaCatchPath+" for the mediaflow \"Discovered Media\" clone view. URLs only, never bodies; deduped. Set false to disable.") "R4 media reverse-catcher (#736): record cloneable media URLs (HLS/DASH manifests + direct audio/video) seen on MITM'd flows to "+mediaCatchPath+" for the mediaflow \"Discovered Media\" clone view. URLs only, never bodies; deduped. Set false to disable.")
flag.Parse() flag.Parse()
ca, err := loadCA(*caCert, *caKey) ca, err := forge.LoadCA(*caCert, *caKey)
if err != nil { if err != nil {
log.Fatalf("CA load: %v", err) log.Fatalf("CA load: %v", err)
} }


@ -17,6 +17,8 @@ import (
"sync" "sync"
"testing" "testing"
"time" "time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/forge"
) )
// genTestCA writes a self-signed CA (cert+key PEM) to dir, mirroring ca-wg. // genTestCA writes a self-signed CA (cert+key PEM) to dir, mirroring ca-wg.
@ -53,20 +55,20 @@ func genTestCA(t *testing.T, dir string) (certPath, keyPath string) {
func TestForgeChainsToCA(t *testing.T) { func TestForgeChainsToCA(t *testing.T) {
cp, kp := genTestCA(t, t.TempDir()) cp, kp := genTestCA(t, t.TempDir())
ca, err := loadCA(cp, kp) ca, err := forge.LoadCA(cp, kp)
if err != nil { if err != nil {
t.Fatalf("loadCA: %v", err) t.Fatalf("loadCA: %v", err)
} }
leaf, err := ca.forge("ads.example.com") leaf, err := ca.Forge("ads.example.com")
if err != nil { if err != nil {
t.Fatalf("forge: %v", err) t.Fatalf("forge: %v", err)
} }
pool := x509.NewCertPool() pool := x509.NewCertPool()
pool.AddCert(ca.cert) pool.AddCert(ca.Cert)
if _, err := leaf.Leaf.Verify(x509.VerifyOptions{Roots: pool, DNSName: "ads.example.com"}); err != nil { if _, err := leaf.Leaf.Verify(x509.VerifyOptions{Roots: pool, DNSName: "ads.example.com"}); err != nil {
t.Fatalf("forged leaf does not chain to CA / wrong SAN: %v", err) t.Fatalf("forged leaf does not chain to CA / wrong SAN: %v", err)
} }
leaf2, _ := ca.forge("ads.example.com") leaf2, _ := ca.Forge("ads.example.com")
if leaf2 != leaf { if leaf2 != leaf {
t.Fatal("forge not cached") t.Fatal("forge not cached")
} }
@ -112,22 +114,22 @@ func TestLoadCACombinedPEM(t *testing.T) {
} }
// The unit's exact arg shape: --ca-cert <cert-only> --ca-key <combined>. // The unit's exact arg shape: --ca-cert <cert-only> --ca-key <combined>.
ca, err := loadCA(certOnly, combined) ca, err := forge.LoadCA(certOnly, combined)
if err != nil { if err != nil {
t.Fatalf("loadCA(cert-only, combined): %v", err) t.Fatalf("forge.LoadCA(cert-only, combined): %v", err)
} }
leaf, err := ca.forge("ads.example.com") leaf, err := ca.Forge("ads.example.com")
if err != nil { if err != nil {
t.Fatalf("forge: %v", err) t.Fatalf("forge: %v", err)
} }
pool := x509.NewCertPool() pool := x509.NewCertPool()
pool.AddCert(ca.cert) pool.AddCert(ca.Cert)
if _, err := leaf.Leaf.Verify(x509.VerifyOptions{Roots: pool, DNSName: "ads.example.com"}); err != nil { if _, err := leaf.Leaf.Verify(x509.VerifyOptions{Roots: pool, DNSName: "ads.example.com"}); err != nil {
t.Fatalf("forged leaf does not chain to combined-PEM CA: %v", err) t.Fatalf("forged leaf does not chain to combined-PEM CA: %v", err)
} }
// Belt-and-braces: the combined file works as BOTH cert and key source. // Belt-and-braces: the combined file works as BOTH cert and key source.
if _, err := loadCA(combined, combined); err != nil { if _, err := forge.LoadCA(combined, combined); err != nil {
t.Fatalf("loadCA(combined, combined): %v", err) t.Fatalf("forge.LoadCA(combined, combined): %v", err)
} }
} }
@ -161,7 +163,7 @@ func contains(s, sub string) bool {
// ClientHello (JA4 material) is captured. // ClientHello (JA4 material) is captured.
func TestClientHelloCaptureAndForge(t *testing.T) { func TestClientHelloCaptureAndForge(t *testing.T) {
cp, kp := genTestCA(t, t.TempDir()) cp, kp := genTestCA(t, t.TempDir())
ca, err := loadCA(cp, kp) ca, err := forge.LoadCA(cp, kp)
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
} }
@ -185,7 +187,7 @@ func TestClientHelloCaptureAndForge(t *testing.T) {
}() }()
pool := x509.NewCertPool() pool := x509.NewCertPool()
pool.AddCert(ca.cert) pool.AddCert(ca.Cert)
conn, err := tls.Dial("tcp", ln.Addr().String(), &tls.Config{ServerName: "example.com", RootCAs: pool}) conn, err := tls.Dial("tcp", ln.Addr().String(), &tls.Config{ServerName: "example.com", RootCAs: pool})
if err != nil { if err != nil {
t.Fatalf("client handshake against forged cert failed (CA not trusted / forge broken): %v", err) t.Fatalf("client handshake against forged cert failed (CA not trusted / forge broken): %v", err)


@ -13,12 +13,13 @@
package main package main
import ( import (
"bufio"
"os" "os"
"regexp" "regexp"
"strings" "strings"
"sync" "sync"
"time" "time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/reload"
) )
// ── ad_ghost: static ad/tracker host pattern (port of _AD_HOST) ────────────── // ── ad_ghost: static ad/tracker host pattern (port of _AD_HOST) ──────────────
@ -98,8 +99,8 @@ func envOr(key, def string) string {
// keeps the legacy PoC fields (Inject) so the existing wiring/tests still work. // keeps the legacy PoC fields (Inject) so the existing wiring/tests still work.
type Policy struct { type Policy struct {
// mu guards the live-reloadable map fields below. Decide/allowed/blockedByAd/ // mu guards the live-reloadable map fields below. Decide/allowed/blockedByAd/
// shouldSplice take RLock; maybeReload takes Lock only when a backing file // shouldSplice take RLock; the reload Apply callbacks take Lock when a backing
// actually changed (the throttle + stat happen under a separate lighter lock). // file actually changed.
mu sync.RWMutex mu sync.RWMutex
adHost *regexp.Regexp adHost *regexp.Regexp
@ -117,11 +118,21 @@ type Policy struct {
// mtime changes so autolearn promotions / manual edits take effect WITHOUT a // mtime changes so autolearn promotions / manual edits take effect WITHOUT a
// worker restart (mirrors ad_ghost._maybe_reload). The hot path (Decide) // worker restart (mirrors ad_ghost._maybe_reload). The hot path (Decide)
// calls maybeReload(): a throttle check, then — at most every reloadThrottle — // calls maybeReload(): a throttle check, then — at most every reloadThrottle —
// a cheap stat() of each backing file. Only a changed file is re-read and its // the generic reload.Watcher stats each backing file and calls Apply for each
// map atomically swapped under mu. // changed file. Each Apply swaps the affected map under p.mu.
reloadFiles []reloadTarget // backing files + their swap target //
// Atomicity note: in the original maybeReload(), ALL changed targets were
// applied under a SINGLE p.mu.Lock(). With reload.Watcher, the Watcher's
// internal mu serialises concurrent Maybe() calls, and each Apply callback
// takes p.mu.Lock() independently. The maps are independent (no cross-map
// invariant between e.g. learned and allow), so per-map locking is safe.
// The Watcher's mu ensures no two Maybe() batches interleave: a second
// goroutine calling Maybe() while a batch is applying will block until
// the first batch completes. Parity tests confirm Decide semantics are
// identical.
watcher *reload.Watcher
fortknoxSites []string // kept for rebuilding the never-set on pure-trackers reload fortknoxSites []string // kept for rebuilding the never-set on pure-trackers reload
reloadMu sync.Mutex // guards lastReloadCheck + the per-file mtimes reloadMu sync.Mutex // guards lastReloadID (throttle bookkeeping)
lastReloadID int64 // unix-nano of the last throttle pass (0 = never) lastReloadID int64 // unix-nano of the last throttle pass (0 = never)
reloadThrottle time.Duration // min interval between stat passes (0 in tests = eager) reloadThrottle time.Duration // min interval between stat passes (0 in tests = eager)
@ -129,63 +140,11 @@ type Policy struct {
Inject []byte // banner / ad-CSS marker injected before </head> or </body> Inject []byte // banner / ad-CSS marker injected before </head> or </body>
} }
// reloadTarget describes one backing file the engine live-reloads: its path, the
// last mtime we read, whether comment-stripping applies (loadLines vs
// loadLinesRaw), and an applier that swaps the freshly-read set into the right
// Policy field (under p.mu, held by the caller). pure-trackers re-derives the
// never-set (∪ fortknox) so it stays consistent.
type reloadTarget struct {
path string
stripComm bool
lastMtime int64
apply func(p *Policy, set map[string]bool)
}
// defaultReloadThrottle is the production stat cadence: a backing-file change // defaultReloadThrottle is the production stat cadence: a backing-file change
// (autolearn runs hourly; a promotion is rare) is observed within ~15s, and the // (autolearn runs hourly; a promotion is rare) is observed within ~15s, and the
// hot path stats at most ~4×/minute regardless of request rate. // hot path stats at most ~4×/minute regardless of request rate.
const defaultReloadThrottle = 15 * time.Second const defaultReloadThrottle = 15 * time.Second
// loadLines mirrors the comment-stripping Python loaders (splice._load_lines,
// ad_ghost._allowed's allowlist read): split on first '#', trim, lowercase,
// skip blanks. Missing/unreadable file → empty set (best-effort).
func loadLines(path string) map[string]bool {
return scanLines(path, true)
}
// loadLinesRaw mirrors ad_ghost._learned_set, which does NOT comment-strip —
// learned-trackers.txt is a machine-generated one-host-per-line file. It does
// `{ln.strip().lower() for ln in f if ln.strip()}`. Matching this exactly is
// load-bearing for parity (a '#' in this file would be kept verbatim, not a
// comment), so the Go core must mirror the divergent behaviour, not normalise it.
func loadLinesRaw(path string) map[string]bool {
return scanLines(path, false)
}
func scanLines(path string, stripComments bool) map[string]bool {
out := map[string]bool{}
f, err := os.Open(path)
if err != nil {
return out
}
defer f.Close()
sc := bufio.NewScanner(f)
sc.Buffer(make([]byte, 0, 64*1024), 1<<20)
for sc.Scan() {
ln := sc.Text()
if stripComments {
if i := strings.IndexByte(ln, '#'); i >= 0 {
ln = ln[:i]
}
}
ln = strings.ToLower(strings.TrimSpace(ln))
if ln != "" {
out[ln] = true
}
}
return out
}
// LoadPolicy loads all backing files from opts (defaults applied for empty // LoadPolicy loads all backing files from opts (defaults applied for empty
// fields) and compiles the ad-host regex. It never returns an error for missing // fields) and compiles the ad-host regex. It never returns an error for missing
// files (best-effort, like the Python addons), only for a regex-compile bug. // files (best-effort, like the Python addons), only for a regex-compile bug.
@ -216,7 +175,7 @@ func LoadPolicy(opts PolicyOpts) (*Policy, error) {
} }
// never-set = pure-trackers ∪ fortknox_sites (mirrors TlsSplice._refresh_sets). // never-set = pure-trackers ∪ fortknox_sites (mirrors TlsSplice._refresh_sets).
never := loadLines(opts.PureTrackersPath) never := reload.LoadLines(opts.PureTrackersPath, true)
for _, s := range opts.FortknoxSites { for _, s := range opts.FortknoxSites {
if s = strings.Trim(strings.ToLower(strings.TrimSpace(s)), "."); s != "" { if s = strings.Trim(strings.ToLower(strings.TrimSpace(s)), "."); s != "" {
never[s] = true never[s] = true
@ -236,10 +195,10 @@ func LoadPolicy(opts PolicyOpts) (*Policy, error) {
p := &Policy{ p := &Policy{
adHost: re, adHost: re,
learned: loadLinesRaw(opts.LearnedPath), // mirrors _learned_set (no comment-strip) learned: reload.LoadLines(opts.LearnedPath, false), // mirrors _learned_set (no comment-strip)
allow: loadLines(opts.AllowPath), allow: reload.LoadLines(opts.AllowPath, true),
spliceSeed: loadLines(opts.SpliceSeedPath), spliceSeed: reload.LoadLines(opts.SpliceSeedPath, true),
spliceLearn: loadLines(opts.SpliceLearnPath), spliceLearn: reload.LoadLines(opts.SpliceLearnPath, true),
never: never, never: never,
selfRegs: selfRegs, selfRegs: selfRegs,
selfDomains: selfDomains, selfDomains: selfDomains,
@ -249,54 +208,85 @@ func LoadPolicy(opts PolicyOpts) (*Policy, error) {
// ── register the live-reloadable backing files (#662 auto-learn loop) ───── // ── register the live-reloadable backing files (#662 auto-learn loop) ─────
// //
// Each entry re-reads its file when its mtime changes and atomically swaps // Each reload.Target re-reads its file when its mtime changes and calls Apply
// the map under p.mu (held by maybeReload). learned-trackers + ad-allowlist // to swap the map under p.mu. The Watcher (throttle=0 here; the Policy-level
// are the load-bearing pair (autolearn promotes into learned; the operator // throttle check in maybeReload() controls the rate) handles mtime tracking.
// edits the allowlist); the splice seed/learned + pure-trackers files are //
// reloaded too for consistency (pure-trackers re-derives the never-set). // learned-trackers uses stripComments=false (loadLinesRaw: machine-generated,
p.reloadFiles = []reloadTarget{ // one-host-per-line, a '#' is kept verbatim). All other files use
{path: opts.LearnedPath, stripComm: false, lastMtime: statMtime(opts.LearnedPath), // stripComments=true (operator-editable, comment lines are ignored).
apply: func(p *Policy, s map[string]bool) { p.learned = s }}, targets := []reload.Target{
{path: opts.AllowPath, stripComm: true, lastMtime: statMtime(opts.AllowPath), {
apply: func(p *Policy, s map[string]bool) { p.allow = s }}, Path: opts.LearnedPath,
{path: opts.SpliceSeedPath, stripComm: true, lastMtime: statMtime(opts.SpliceSeedPath), LastMtime: reload.StatMtime(opts.LearnedPath),
apply: func(p *Policy, s map[string]bool) { p.spliceSeed = s }}, Load: func(path string) any { return reload.LoadLines(path, false) },
{path: opts.SpliceLearnPath, stripComm: true, lastMtime: statMtime(opts.SpliceLearnPath), Apply: func(v any) {
apply: func(p *Policy, s map[string]bool) { p.spliceLearn = s }}, p.mu.Lock()
{path: opts.PureTrackersPath, stripComm: true, lastMtime: statMtime(opts.PureTrackersPath), p.learned = v.(map[string]bool)
apply: func(p *Policy, s map[string]bool) { p.mu.Unlock()
},
},
{
Path: opts.AllowPath,
LastMtime: reload.StatMtime(opts.AllowPath),
Load: func(path string) any { return reload.LoadLines(path, true) },
Apply: func(v any) {
p.mu.Lock()
p.allow = v.(map[string]bool)
p.mu.Unlock()
},
},
{
Path: opts.SpliceSeedPath,
LastMtime: reload.StatMtime(opts.SpliceSeedPath),
Load: func(path string) any { return reload.LoadLines(path, true) },
Apply: func(v any) {
p.mu.Lock()
p.spliceSeed = v.(map[string]bool)
p.mu.Unlock()
},
},
{
Path: opts.SpliceLearnPath,
LastMtime: reload.StatMtime(opts.SpliceLearnPath),
Load: func(path string) any { return reload.LoadLines(path, true) },
Apply: func(v any) {
p.mu.Lock()
p.spliceLearn = v.(map[string]bool)
p.mu.Unlock()
},
},
{
Path: opts.PureTrackersPath,
LastMtime: reload.StatMtime(opts.PureTrackersPath),
Load: func(path string) any { return reload.LoadLines(path, true) },
Apply: func(v any) {
// pure-trackers ∪ fortknox → never-set (mirrors LoadPolicy above). // pure-trackers ∪ fortknox → never-set (mirrors LoadPolicy above).
s := v.(map[string]bool)
for _, fk := range p.fortknoxSites { for _, fk := range p.fortknoxSites {
if fk = strings.Trim(strings.ToLower(strings.TrimSpace(fk)), "."); fk != "" { if fk = strings.Trim(strings.ToLower(strings.TrimSpace(fk)), "."); fk != "" {
s[fk] = true s[fk] = true
} }
} }
p.mu.Lock()
p.never = s p.never = s
}}, p.mu.Unlock()
},
},
} }
// The Watcher is created with throttle=0: the Policy-level reloadThrottle
// check in maybeReload() gates how often we call w.Maybe().
p.watcher = reload.NewWatcher(0, targets...)
return p, nil return p, nil
} }
// statMtime returns the file's mtime in unix-nano, or 0 when the file is missing
// or unreadable (best-effort, like the Python loaders: a missing file → empty
// set, mtime 0). A file appearing/disappearing therefore registers as a change.
func statMtime(path string) int64 {
if path == "" {
return 0
}
fi, err := os.Stat(path)
if err != nil {
return 0
}
return fi.ModTime().UnixNano()
}
// maybeReload re-reads any backing list whose on-disk mtime changed since the // maybeReload re-reads any backing list whose on-disk mtime changed since the
// last pass, swapping the affected map(s) under p.mu. Throttled to at most one // last pass, swapping the affected map(s) under p.mu. Throttled to at most one
// stat pass per p.reloadThrottle (cheap: a time compare + a few stats), so the // stat pass per p.reloadThrottle (cheap: a time compare + a few stats), so the
// Decide hot path pays almost nothing. Concurrency-safe: the throttle/mtime // Decide hot path pays almost nothing. Concurrency-safe: the throttle
// bookkeeping is under reloadMu and the map swap under mu — Decide's readers // bookkeeping is under reloadMu, the watcher handles mtime tracking and calls
// hold mu.RLock, so a swap is atomic w.r.t. any in-flight decision. // Apply callbacks (each taking p.mu.Lock) — Decide's readers hold mu.RLock, so
// a swap is atomic w.r.t. any in-flight decision.
func (p *Policy) maybeReload() { func (p *Policy) maybeReload() {
now := time.Now() now := time.Now()
p.reloadMu.Lock() p.reloadMu.Lock()
@ -306,35 +296,9 @@ func (p *Policy) maybeReload() {
return return
} }
p.lastReloadID = now.UnixNano() p.lastReloadID = now.UnixNano()
// Collect the files that changed (stat under reloadMu; re-read outside mu).
type pending struct {
idx int
set map[string]bool
}
var changed []pending
for i := range p.reloadFiles {
rt := &p.reloadFiles[i]
if rt.path == "" {
continue
}
m := statMtime(rt.path)
if m != rt.lastMtime {
rt.lastMtime = m
changed = append(changed, pending{idx: i, set: scanLines(rt.path, rt.stripComm)})
}
}
p.reloadMu.Unlock() p.reloadMu.Unlock()
if len(changed) == 0 { p.watcher.Maybe()
return
}
// Swap the affected maps atomically under the write lock.
p.mu.Lock()
for _, c := range changed {
p.reloadFiles[c.idx].apply(p, c.set)
}
p.mu.Unlock()
} }
// ── registrable: port of ad_ghost._registrable ─────────────────────────────── // ── registrable: port of ad_ghost._registrable ───────────────────────────────
@ -492,7 +456,7 @@ func (p *Policy) Decide(host, sni string) string {
// #662 — pick up autolearn promotions / manual edits without a worker // #662 — pick up autolearn promotions / manual edits without a worker
// restart. Throttled to ~every reloadThrottle and best-effort, so the hot // restart. Throttled to ~every reloadThrottle and best-effort, so the hot
// path normally pays only a time compare. Done BEFORE taking the read lock // path normally pays only a time compare. Done BEFORE taking the read lock
// (maybeReload may take the write lock to swap a changed map). // (maybeReload may trigger Apply callbacks that take the write lock).
p.maybeReload() p.maybeReload()
if sni == "" { if sni == "" {
sni = host sni = host
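
The `internal/reload` package itself is not part of this diff, so the following is only a sketch of the pattern the call sites imply: a watcher that stats each backing file, re-loads the ones whose mtime changed, and hands the fresh set to an `apply` callback that takes its own lock. The names `target`, `watcher`, `maybe`, `loadLines` and `runDemo` are hypothetical stand-ins, not the real `reload.Target` / `reload.Watcher` API.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"sync"
	"time"
)

// target and watcher are hypothetical stand-ins for reload.Target and
// reload.Watcher; the real package is not shown in this diff.
type target struct {
	path      string
	lastMtime int64
	load      func(path string) map[string]bool
	apply     func(set map[string]bool)
}

type watcher struct {
	mu      sync.Mutex // serialises maybe() batches
	targets []target
}

// statMtime returns the mtime in unix-nano, or 0 for a missing file.
func statMtime(path string) int64 {
	fi, err := os.Stat(path)
	if err != nil {
		return 0
	}
	return fi.ModTime().UnixNano()
}

// maybe re-loads every target whose mtime changed and reports how many did.
func (w *watcher) maybe() int {
	w.mu.Lock()
	defer w.mu.Unlock()
	changed := 0
	for i := range w.targets {
		t := &w.targets[i]
		if m := statMtime(t.path); m != t.lastMtime {
			t.lastMtime = m
			t.apply(t.load(t.path)) // apply takes its own lock
			changed++
		}
	}
	return changed
}

// loadLines reads whitespace-separated hosts, lowercased; missing file → empty set.
func loadLines(path string) map[string]bool {
	out := map[string]bool{}
	data, _ := os.ReadFile(path)
	for _, host := range strings.Fields(strings.ToLower(string(data))) {
		out[host] = true
	}
	return out
}

// runDemo writes a list, reloads, rewrites it with a later mtime, reloads
// again, and returns the set size seen after each pass.
func runDemo() (int, int, error) {
	dir, err := os.MkdirTemp("", "reload-demo")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "learned.txt")

	var mu sync.RWMutex
	var learned map[string]bool
	size := func() int { mu.RLock(); defer mu.RUnlock(); return len(learned) }
	w := &watcher{targets: []target{{
		path:  path,
		load:  loadLines,
		apply: func(s map[string]bool) { mu.Lock(); learned = s; mu.Unlock() },
	}}}

	if err := os.WriteFile(path, []byte("a.example\n"), 0o600); err != nil {
		return 0, 0, err
	}
	w.maybe()
	first := size()

	if err := os.WriteFile(path, []byte("a.example\nb.example\n"), 0o600); err != nil {
		return 0, 0, err
	}
	later := time.Now().Add(time.Hour) // force a distinct mtime
	if err := os.Chtimes(path, later, later); err != nil {
		return 0, 0, err
	}
	w.maybe()
	return first, size(), nil
}

func main() {
	first, second, err := runDemo()
	fmt.Println(first, second, err)
}
```

Because each `apply` locks independently, a reader can observe one map already swapped while another in the same batch is still pending. That is acceptable only under the condition the atomicity note in the diff states: the maps carry no cross-map invariant.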


@ -24,6 +24,8 @@ import (
"net/http" "net/http"
"strings" "strings"
"time" "time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/relay"
) )
// Stable socket paths — verbatim from the Python addons' TARGET constants // Stable socket paths — verbatim from the Python addons' TARGET constants
@ -65,7 +67,7 @@ func (px *Proxy) relayEmit(socketPath, route string, payload []byte) {
if !px.relayEnabled() || len(payload) == 0 { if !px.relayEnabled() || len(payload) == 0 {
return return
} }
emit(socketPath, route, payload) relay.Emit(socketPath, route, payload)
} }
// ── dpi payload ────────────────────────────────────────────────────────────── // ── dpi payload ──────────────────────────────────────────────────────────────


@ -22,74 +22,7 @@
// cookie values) are NOT emitted to a module socket but POSTed to the portal // cookie values) are NOT emitted to a module socket but POSTed to the portal
// /__toolbox/social-event ingest (the social store lives in the toolbox/portal). // /__toolbox/social-event ingest (the social store lives in the toolbox/portal).
// //
// emit takes the full socket PATH (not an http+unix:// URL) plus the route in // Transport is now internal/relay. This file is retained for doc context only;
// the payload's destination; callers build the path from the table above. // the emit/emitSync/emitTimeout declarations have been moved to internal/relay
// // as Emit/EmitSync/EmitTimeout (ref #744).
// Pure standard library — no external modules, no go.sum.
package main package main
import (
"context"
"fmt"
"net"
"time"
)
// emitTimeout caps the whole connect+write+read so a slow/dead module socket
// can never wedge the engine. Mirrors the Python httpx timeout=2.
const emitTimeout = 2 * time.Second
// emit fires a fire-and-forget POST of payload to the given unix socket at
// route, in a detached goroutine. It returns immediately and never blocks the
// caller; all errors (missing socket, dead peer, timeout) are swallowed —
// dropping a relayed signal must never break a client flow. Mirrors
// _common.fire_forget_post + queue_async (create_task, never raise).
//
// route is the HTTP path on the module (e.g. "/inject", "/classify"); use the
// addon→socket table above to pick socketPath + route together.
func emit(socketPath, route string, payload []byte) {
go emitSync(socketPath, route, payload)
}
// emitSync performs the actual POST synchronously (under emitTimeout). Exposed
// (lowercase, same-package) so tests can observe delivery deterministically
// without racing the goroutine. Returns an error only for the test's benefit;
// emit() discards it.
func emitSync(socketPath, route string, payload []byte) error {
if route == "" {
route = "/"
}
ctx, cancel := context.WithTimeout(context.Background(), emitTimeout)
defer cancel()
var d net.Dialer
conn, err := d.DialContext(ctx, "unix", socketPath)
if err != nil {
return err // dead/missing socket — swallowed by emit()
}
defer conn.Close()
if dl, ok := ctx.Deadline(); ok {
_ = conn.SetDeadline(dl)
}
// Minimal HTTP/1.1 POST. Host is a placeholder (unix transport); the module
// FastAPI apps ignore it. Connection: close so the peer EOFs after replying.
req := fmt.Sprintf(
"POST %s HTTP/1.1\r\nHost: secubox.local\r\nContent-Type: application/json\r\n"+
"Content-Length: %d\r\nConnection: close\r\n\r\n",
route, len(payload))
if _, err := conn.Write([]byte(req)); err != nil {
return err
}
if len(payload) > 0 {
if _, err := conn.Write(payload); err != nil {
return err
}
}
// Best-effort drain so the peer sees a clean close; we don't parse the
// response (fire-and-forget). Errors here are irrelevant.
buf := make([]byte, 512)
_, _ = conn.Read(buf)
return nil
}


@ -2,6 +2,7 @@
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr> // Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
// //
// Unit tests for the sidecar emit helper (#662 Phase 4). // Unit tests for the sidecar emit helper (#662 Phase 4).
// Transport now delegates to internal/relay (ref #744).
package main package main
import ( import (
@ -11,10 +12,12 @@ import (
"strings" "strings"
"testing" "testing"
"time" "time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/relay"
) )
// TestEmitDelivers: emitSync to a live unix socket delivers the POST request // TestEmitDelivers: relay.EmitSync to a live unix socket delivers the POST
// line, route and JSON body. // request line, route and JSON body.
func TestEmitDelivers(t *testing.T) { func TestEmitDelivers(t *testing.T) {
sock := filepath.Join(t.TempDir(), "emit.sock") sock := filepath.Join(t.TempDir(), "emit.sock")
ln, err := net.Listen("unix", sock) ln, err := net.Listen("unix", sock)
@ -41,13 +44,13 @@ func TestEmitDelivers(t *testing.T) {
break break
} }
} }
// Reply so emitSync's drain completes cleanly. // Reply so EmitSync's drain completes cleanly.
c.Write([]byte("HTTP/1.1 204 No Content\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")) c.Write([]byte("HTTP/1.1 204 No Content\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"))
got <- sb.String() got <- sb.String()
}() }()
if err := emitSync(sock, "/classify", []byte(`{"k":"v"}`)); err != nil { if err := relay.EmitSync(sock, "/classify", []byte(`{"k":"v"}`)); err != nil {
t.Fatalf("emitSync: %v", err) t.Fatalf("EmitSync: %v", err)
} }
select { select {
@ -63,31 +66,31 @@ func TestEmitDelivers(t *testing.T) {
} }
} }
// TestEmitDeadSocketNoPanicNoBlock: emit() (the goroutine form) to a // TestEmitDeadSocketNoPanicNoBlock: relay.Emit (the goroutine form) to a
// nonexistent socket must return immediately and never panic, and emitSync // nonexistent socket must return immediately and never panic, and EmitSync
// must just return an error without blocking past the timeout. // must just return an error without blocking past the timeout.
func TestEmitDeadSocketNoPanicNoBlock(t *testing.T) { func TestEmitDeadSocketNoPanicNoBlock(t *testing.T) {
dead := filepath.Join(t.TempDir(), "nope.sock") dead := filepath.Join(t.TempDir(), "nope.sock")
// emit (async) returns instantly even though the socket is dead. // Emit (async) returns instantly even though the socket is dead.
done := make(chan struct{}) done := make(chan struct{})
go func() { go func() {
defer close(done) defer close(done)
emit(dead, "/inject", []byte(`{"x":1}`)) // must not panic/block relay.Emit(dead, "/inject", []byte(`{"x":1}`)) // must not panic/block
}() }()
select { select {
case <-done: case <-done:
case <-time.After(time.Second): case <-time.After(time.Second):
t.Fatal("emit() blocked on a dead socket") t.Fatal("relay.Emit() blocked on a dead socket")
} }
// emitSync surfaces the dial error (which emit swallows) without blocking. // EmitSync surfaces the dial error (which Emit swallows) without blocking.
start := time.Now() start := time.Now()
if err := emitSync(dead, "/inject", []byte(`{}`)); err == nil { if err := relay.EmitSync(dead, "/inject", []byte(`{}`)); err == nil {
t.Error("emitSync to dead socket: expected error, got nil") t.Error("EmitSync to dead socket: expected error, got nil")
} }
if elapsed := time.Since(start); elapsed > emitTimeout+time.Second { if elapsed := time.Since(start); elapsed > relay.EmitTimeout+time.Second {
t.Errorf("emitSync blocked %v on dead socket", elapsed) t.Errorf("EmitSync blocked %v on dead socket", elapsed)
} }
} }
@ -111,8 +114,8 @@ func TestEmitEmptyRouteDefaults(t *testing.T) {
c.Write([]byte("HTTP/1.1 204 No Content\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")) c.Write([]byte("HTTP/1.1 204 No Content\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"))
got <- string(buf[:n]) got <- string(buf[:n])
}() }()
if err := emitSync(sock, "", nil); err != nil { if err := relay.EmitSync(sock, "", nil); err != nil {
t.Fatalf("emitSync: %v", err) t.Fatalf("EmitSync: %v", err)
} }
select { select {
case raw := <-got: case raw := <-got:


@ -0,0 +1,89 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: ban — sliding-window graduated ban state
//
// Mirrors the Python BAN_THRESHOLD=3 / BAN_WINDOW=300s semantics from
// packages/secubox-mitmproxy/addons/secubox_waf.py.
//
// Design notes:
// - Window: 300 s (default, matches Python BAN_WINDOW)
// - Threshold: 3 hits within the window triggers a ban (matches BAN_THRESHOLD)
// - Map cap: 100 000 unique IPs. Once reached, new IPs are silently dropped
// (not recorded, not banned). This bounds the number of tracked IPs under a
// flood: at 8 bytes per int64 timestamp × ~10 hits × 100k IPs that is
// roughly 8 MB of timestamps (map and key overhead excluded; the per-IP
// slice itself is not capped). The cap is intentionally generous; it is a
// compile-time constant (banMapCap) and could become a NewBan parameter if
// operators need to tune it.
// - Pruning: per call, only for the affected IP's timestamps. Keys are never
// deleted, so an IP that stops sending traffic keeps its map slot. No
// background goroutine; avoids timer complexity for Task 3.1 scope.
// - Concurrency: single sync.Mutex guards the whole map. A sharded approach
// can be added later if contention shows up in profiling.
package main
import (
"sync"
"time"
)
const banMapCap = 100_000
// Ban holds the sliding-window threat state for all client IPs.
type Ban struct {
mu sync.Mutex
window int64 // window size in seconds
threshold int
hits map[string][]int64 // IP → slice of Unix timestamps of threat hits
}
// NewBan creates a new Ban tracker.
//
// window — size of the sliding time window (e.g. 300*time.Second)
// threshold — number of hits within the window that triggers a ban
func NewBan(window time.Duration, threshold int) *Ban {
return &Ban{
window: int64(window.Seconds()),
threshold: threshold,
hits: make(map[string][]int64),
}
}
// Record records one threat hit for ip at time nowUnix (Unix seconds).
// It prunes hits at or before nowUnix-window BEFORE counting, then appends.
// Returns:
//
// count — number of hits within the window after this one (≥ 1)
// banned — true when count >= threshold
//
// New IPs are silently ignored (not recorded) once the map reaches banMapCap
// to bound memory under a SYN/scan flood. In that case count=0, banned=false.
func (b *Ban) Record(ip string, nowUnix int64) (count int, banned bool) {
b.mu.Lock()
defer b.mu.Unlock()
cutoff := nowUnix - b.window
ts, exists := b.hits[ip]
if !exists {
// Guard: enforce map cap against IP-flood amplification.
if len(b.hits) >= banMapCap {
return 0, false
}
}
// Prune timestamps outside the window.
pruned := ts[:0]
for _, t := range ts {
if t > cutoff {
pruned = append(pruned, t)
}
}
// Append this hit.
pruned = append(pruned, nowUnix)
b.hits[ip] = pruned
count = len(pruned)
banned = count >= b.threshold
return count, banned
}
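
Record prunes timestamps only for the IP it is called with and never deletes a key, so once the map reaches the cap, IPs that went quiet long ago still hold their slots. A minimal sketch of one way to reclaim them, assuming a hypothetical `sweep` method called periodically by the owner; `window`, `newWindow`, `record` and `sweep` are illustrative names, not part of this change, and the sketch assumes timestamps are recorded in non-decreasing order:

```go
package main

import (
	"fmt"
	"sync"
)

// window is a simplified, hypothetical stand-in for the Ban type.
type window struct {
	mu        sync.Mutex
	size      int64 // window size in seconds
	threshold int
	hits      map[string][]int64 // IP → timestamps, oldest first
}

func newWindow(size int64, threshold int) *window {
	return &window{size: size, threshold: threshold, hits: map[string][]int64{}}
}

// record prunes expired hits for ip, appends now, and reports the count and
// whether the threshold is reached (same shape as Ban.Record, without the cap).
func (w *window) record(ip string, now int64) (int, bool) {
	w.mu.Lock()
	defer w.mu.Unlock()
	cutoff := now - w.size
	ts := w.hits[ip]
	kept := ts[:0]
	for _, t := range ts {
		if t > cutoff {
			kept = append(kept, t)
		}
	}
	kept = append(kept, now)
	w.hits[ip] = kept
	return len(kept), len(kept) >= w.threshold
}

// sweep deletes every IP whose newest hit has left the window and returns
// how many keys were removed. Deleting during range is safe in Go.
func (w *window) sweep(now int64) int {
	w.mu.Lock()
	defer w.mu.Unlock()
	cutoff := now - w.size
	removed := 0
	for ip, ts := range w.hits {
		if len(ts) == 0 || ts[len(ts)-1] <= cutoff {
			delete(w.hits, ip)
			removed++
		}
	}
	return removed
}

func main() {
	w := newWindow(300, 3)
	w.record("1.2.3.4", 0)
	w.record("5.6.7.8", 200)
	removed := w.sweep(400) // cutoff is 100, so only 1.2.3.4 is stale
	fmt.Println(removed, len(w.hits))
}
```

A sweep holds the single mutex for a full map walk, so it would need to run rarely (for example once per window) to stay off the request path.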


@ -0,0 +1,75 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: ban_test — sliding-window ban state machine tests
package main
import (
"testing"
"time"
)
// TestBanGraduated verifies that the ban threshold is reached on the 3rd hit
// within the window (default: window=300s, threshold=3).
func TestBanGraduated(t *testing.T) {
b := NewBan(300*time.Second, 3)
ip := "1.2.3.4"
count, banned := b.Record(ip, 0)
if count != 1 || banned {
t.Fatalf("after 1st hit: want (1,false), got (%d,%v)", count, banned)
}
count, banned = b.Record(ip, 0)
if count != 2 || banned {
t.Fatalf("after 2nd hit: want (2,false), got (%d,%v)", count, banned)
}
count, banned = b.Record(ip, 0)
if count != 3 || !banned {
t.Fatalf("after 3rd hit: want (3,true), got (%d,%v)", count, banned)
}
}
// TestBanWindowExpiry verifies that hits older than the window are pruned so
// that a previously-banned IP resets its count after the window expires.
func TestBanWindowExpiry(t *testing.T) {
b := NewBan(300*time.Second, 3)
ip := "1.2.3.4"
// Hit 3 times at t=0 → banned.
b.Record(ip, 0)
b.Record(ip, 0)
count, banned := b.Record(ip, 0)
if count != 3 || !banned {
t.Fatalf("pre-condition: want (3,true) at t=0, got (%d,%v)", count, banned)
}
// At t=400 (> 300s window) all prior hits are pruned; new hit → count=1, not banned.
count, banned = b.Record(ip, 400)
if count != 1 || banned {
t.Fatalf("after window expiry at t=400: want (1,false), got (%d,%v)", count, banned)
}
}
// TestBanPerIPIsolation verifies that hits on one IP do not bleed into another.
func TestBanPerIPIsolation(t *testing.T) {
b := NewBan(300*time.Second, 3)
ipA := "1.2.3.4"
ipB := "5.6.7.8"
// Three hits on A → banned.
b.Record(ipA, 0)
b.Record(ipA, 0)
_, bannedA := b.Record(ipA, 0)
if !bannedA {
t.Fatal("ipA should be banned after 3 hits")
}
// B has had zero hits → count=1, not banned after its first hit.
countB, bannedB := b.Record(ipB, 0)
if countB != 1 || bannedB {
t.Fatalf("ipB isolation: want (1,false), got (%d,%v)", countB, bannedB)
}
}


@ -0,0 +1,277 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: cookieaudit — RGPD Set-Cookie ledger
//
// Task 5.1: For every Set-Cookie header in an upstream response, append one
// JSONL record to a ledger file. Cookie values are SHA256-hashed in-process —
// the raw value NEVER leaves this component.
//
// Port from packages/secubox-mitmproxy/addons/cookie_audit.py (parse_set_cookie
// + CookieAudit._append). Go's http.Response.Cookies() exposes SameSite only as
// the http.SameSite enum (the raw attribute text is lost) and skips cookies it
// considers invalid, so we parse the raw "Set-Cookie" header strings directly
// (same approach as the Python parse_set_cookie function).
//
// Architecture:
// - A buffered channel (size cookieAuditChanSize) decouples Record callers
// from disk I/O. Record is non-blocking: when the channel is full the
// record is dropped (dropCount incremented) rather than blocking the HTTP
// response path.
// - A single writer goroutine drains the channel and appends to the ledger
// (O_WRONLY|O_CREATE|O_APPEND, 0640). The file is opened once at
// construction and held open for the lifetime of the CookieAudit to avoid
// per-record open/close overhead.
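// - Close() closes the channel, waits for the writer to drain the remaining
// records and exit, then closes the file. Safe to call multiple times via
// sync.Once. Record must not be called after Close (a send on a closed
// channel panics).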
//
// Ledger path default: /var/log/secubox/cookie-audit/server.jsonl
// Configurable via --cookie-audit-log flag in main().
//
// JSON record fields (mirrors Python cookie_audit.py record):
//
// ts — RFC 3339 UTC timestamp
// vhost — bare hostname from the request (Host header)
// url_path — request URL path
// method — HTTP method
// status — response status code (int)
// name — cookie name
// value_hash — sha256(raw_value).hexdigest()
// domain — cookie Domain attribute (leading '.' stripped, null if absent)
// path — cookie Path attribute (null if absent)
// secure — bool
// httponly — bool
// samesite — SameSite attribute value (null if absent)
package main
import (
"crypto/sha256"
"encoding/json"
"fmt"
"net/http"
"os"
"strings"
"sync"
"sync/atomic"
"time"
)
// cookieAuditChanSize is the depth of the async record channel.
// At 256 entries the buffer absorbs short bursts without blocking; records
// beyond this are dropped (counted but never block the response path).
const cookieAuditChanSize = 256
// DefaultCookieAuditLog is the production ledger path, matching the Python
// addon's DEFAULT_LEDGER constant.
const DefaultCookieAuditLog = "/var/log/secubox/cookie-audit/server.jsonl"
// cookieRecord is the JSON shape written to the ledger.
// Fields mirror the Python parse_set_cookie + response hook dict.
type cookieRecord struct {
TS string `json:"ts"`
Vhost string `json:"vhost"`
URLPath string `json:"url_path"`
Method string `json:"method"`
Status int `json:"status"`
Name string `json:"name"`
ValueHash string `json:"value_hash"`
Domain *string `json:"domain"` // null when absent
Path *string `json:"path"` // null when absent
Secure bool `json:"secure"`
HTTPOnly bool `json:"httponly"`
SameSite *string `json:"samesite"` // null when absent
}
// CookieAudit appends one JSONL record per Set-Cookie header to a ledger.
// Goroutine-safe. Record is non-blocking (drop-on-full channel policy).
type CookieAudit struct {
ch chan cookieRecord
file *os.File
wg sync.WaitGroup
closeOnce sync.Once
dropCount atomic.Int64 // atomic counter for concurrent Record calls
}
// NewCookieAudit creates a CookieAudit that writes to path.
// The parent directory is created (0755) if it does not exist. The ledger file
// is opened with O_APPEND|O_CREATE. Panics if the directory cannot be created
// or the file cannot be opened — startup time, not the request path.
func NewCookieAudit(path string) *CookieAudit {
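// Trim the file name to get the directory. A bare file name (no '/') lives
// in the current directory; a file directly under the root lives in "/".
dir := "."
if idx := strings.LastIndex(path, "/"); idx > 0 {
dir = path[:idx]
} else if idx == 0 {
dir = "/"
}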
if err := os.MkdirAll(dir, 0755); err != nil {
// Fatal at startup — the operator must fix the path.
panic(fmt.Sprintf("cookieaudit: mkdir %s: %v", dir, err))
}
f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0640)
if err != nil {
panic(fmt.Sprintf("cookieaudit: open %s: %v", path, err))
}
ca := &CookieAudit{
ch: make(chan cookieRecord, cookieAuditChanSize),
file: f,
}
ca.wg.Add(1)
go ca.writer()
return ca
}
// writer drains the channel and appends JSONL records to the ledger.
// Runs as a single goroutine for the lifetime of the CookieAudit.
func (ca *CookieAudit) writer() {
defer ca.wg.Done()
for rec := range ca.ch {
data, err := json.Marshal(rec)
if err != nil {
// Unreachable in practice: cookieRecord holds only strings, ints and bools.
fmt.Fprintf(os.Stderr, "cookieaudit: marshal failed: %v\n", err)
continue
}
data = append(data, '\n')
if _, err := ca.file.Write(data); err != nil {
fmt.Fprintf(os.Stderr, "cookieaudit: write failed: %v\n", err)
}
}
}
// Close closes the channel, waits for the writer goroutine to drain it, and
// closes the underlying file. Safe to call multiple times.
func (ca *CookieAudit) Close() {
ca.closeOnce.Do(func() {
close(ca.ch)
ca.wg.Wait()
_ = ca.file.Close()
})
}
// Record enumerates the Set-Cookie headers in resp, builds one cookieRecord per
// cookie, SHA256-hashes the value, and sends to the async channel.
// NON-BLOCKING: if the channel is full, the record is dropped (never blocks
// the HTTP response path).
func (ca *CookieAudit) Record(host string, req *http.Request, resp *http.Response) {
if ca == nil || resp == nil {
return
}
rawCookies := resp.Header["Set-Cookie"]
if len(rawCookies) == 0 {
return
}
// Collect context fields once per call.
ts := time.Now().UTC().Format(time.RFC3339)
method := ""
urlPath := ""
status := resp.StatusCode
if req != nil {
method = req.Method
if req.URL != nil {
urlPath = req.URL.Path
}
}
for _, raw := range rawCookies {
rec, ok := parseSetCookieRaw(raw)
if !ok {
continue
}
rec.TS = ts
rec.Vhost = host
rec.URLPath = urlPath
rec.Method = method
rec.Status = status
// Non-blocking send: drop if the channel is full.
select {
case ca.ch <- rec:
default:
ca.dropCount.Add(1)
}
}
}
// parseSetCookieRaw parses a raw Set-Cookie header string into a cookieRecord
// (with only the cookie-level fields populated; context fields are set by
// Record). Returns ok=false if the header is malformed (no name=value pair).
//
// We parse the raw string directly rather than using http.Response.Cookies()
// because Go's net/http parser reduces SameSite to the http.SameSite enum and
// drops the raw attribute text. The parsing logic mirrors Python's
// parse_set_cookie function in cookie_audit.py.
func parseSetCookieRaw(raw string) (cookieRecord, bool) {
if raw == "" {
return cookieRecord{}, false
}
// Split on ';': first token is name=value, the rest are attributes.
parts := strings.Split(raw, ";")
if len(parts) == 0 {
return cookieRecord{}, false
}
// name=value (first token).
nameVal := strings.TrimSpace(parts[0])
eqIdx := strings.IndexByte(nameVal, '=')
if eqIdx < 0 {
// No '=' in the first token — malformed cookie.
return cookieRecord{}, false
}
name := strings.TrimSpace(nameVal[:eqIdx])
if name == "" {
return cookieRecord{}, false
}
rawValue := strings.TrimSpace(nameVal[eqIdx+1:])
// SHA256 the raw value — never store it.
sum := sha256.Sum256([]byte(rawValue))
valueHash := fmt.Sprintf("%x", sum)
rec := cookieRecord{
Name: name,
ValueHash: valueHash,
Secure: false,
HTTPOnly: false,
}
// Parse attributes.
for _, attr := range parts[1:] {
attr = strings.TrimSpace(attr)
if attr == "" {
continue
}
k, v, _ := strings.Cut(attr, "=")
k = strings.TrimSpace(strings.ToLower(k))
v = strings.TrimSpace(v)
switch k {
case "domain":
d := strings.TrimLeft(v, ".")
if d == "" {
// Empty after stripping dot → treat as absent (null).
break
}
rec.Domain = &d
case "path":
if v != "" {
rec.Path = &v
}
case "secure":
rec.Secure = true
case "httponly":
rec.HTTPOnly = true
case "samesite":
if v != "" {
rec.SameSite = &v
}
// expires, max-age, and other attributes are intentionally ignored
// (not GDPR-relevant per the Python addon's design decision).
}
}
return rec, true
}
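
The drop-on-full policy used by Record can be tried in isolation. The sketch below is a minimal illustration, not the project's code: `dropQueue`, `newDropQueue`, and `offer` are hypothetical names, and the record type is reduced to a string. It assumes Go 1.19 or later for `atomic.Int64`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// dropQueue is a hypothetical stand-in for the audit channel: a bounded
// buffer plus a counter of records that did not fit.
type dropQueue struct {
	ch      chan string
	dropped atomic.Int64
}

func newDropQueue(capacity int) *dropQueue {
	return &dropQueue{ch: make(chan string, capacity)}
}

// offer tries to enqueue rec without blocking. It reports whether the record
// was accepted; a rejected record is counted and discarded.
func (q *dropQueue) offer(rec string) bool {
	select {
	case q.ch <- rec:
		return true
	default:
		// Buffer full and no receiver ready: drop instead of waiting.
		q.dropped.Add(1)
		return false
	}
}

func main() {
	q := newDropQueue(2)
	for i := 0; i < 5; i++ {
		q.offer(fmt.Sprintf("rec-%d", i))
	}
	// No goroutine drains the queue here, so only the first two fit.
	fmt.Println(len(q.ch), q.dropped.Load()) // prints "2 3"
}
```

The trade-off is deliberate: the caller never waits, at the cost of losing records under sustained load, so the drop counter is the only signal that the buffer is undersized.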


@ -0,0 +1,233 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: cookieaudit_test — TDD for Task 5.1
//
// Tests:
// - TestCookieAuditHashesValue: single Set-Cookie → one JSONL record, value
// SHA256-hashed (never raw), domain dot-stripped, attributes correct.
// - TestCookieAuditMultipleCookies: two Set-Cookie headers → two JSONL lines.
// - TestCookieAuditNonBlocking: Record returns promptly even when the writer
// is paused (channel-full drop policy — never blocks the response path).
package main
import (
"bufio"
"bytes"
"crypto/sha256"
"encoding/json"
"fmt"
"io"
"net/http"
"os"
"path/filepath"
"strings"
"testing"
"time"
)
// makeFakeResponse builds a minimal *http.Response carrying the given
// Set-Cookie header values. The request is a simple GET to targetURL.
func makeFakeResponse(targetURL string, setCookies []string) (*http.Response, *http.Request) {
req, _ := http.NewRequest(http.MethodGet, targetURL, nil)
hdr := http.Header{}
for _, sc := range setCookies {
hdr.Add("Set-Cookie", sc)
}
resp := &http.Response{
StatusCode: 200,
Header: hdr,
Body: io.NopCloser(bytes.NewReader(nil)),
Request: req,
}
return resp, req
}
// TestCookieAuditHashesValue verifies that:
// - The ledger receives exactly one record for a single Set-Cookie.
// - The raw cookie value ("secretvalue") is NEVER written to the file.
// - value_hash == sha256("secretvalue").
// - domain has the leading dot stripped.
// - secure, httponly are true; samesite is "Lax".
func TestCookieAuditHashesValue(t *testing.T) {
dir := t.TempDir()
ledger := filepath.Join(dir, "cookie-audit", "server.jsonl")
ca := NewCookieAudit(ledger)
defer ca.Close()
resp, req := makeFakeResponse(
"https://example.com/login",
[]string{"session=secretvalue; Domain=.example.com; Path=/; Secure; HttpOnly; SameSite=Lax"},
)
ca.Record(req.Host, req, resp)
// Wait for the async writer goroutine to flush.
ca.Close()
data, err := os.ReadFile(ledger)
if err != nil {
t.Fatalf("read ledger: %v", err)
}
lines := splitNonEmptyLines(string(data))
if len(lines) != 1 {
t.Fatalf("expected 1 JSONL record, got %d:\n%s", len(lines), string(data))
}
var rec map[string]interface{}
if err := json.Unmarshal([]byte(lines[0]), &rec); err != nil {
t.Fatalf("line not valid JSON: %v\nline: %q", err, lines[0])
}
// name
if rec["name"] != "session" {
t.Errorf("name: want %q got %v", "session", rec["name"])
}
// value_hash
wantHash := fmt.Sprintf("%x", sha256.Sum256([]byte("secretvalue")))
if rec["value_hash"] != wantHash {
t.Errorf("value_hash: want %q got %v", wantHash, rec["value_hash"])
}
// raw value must NOT appear anywhere in the file
if strings.Contains(string(data), "secretvalue") {
t.Errorf("raw cookie value 'secretvalue' must not appear in the ledger")
}
// domain: leading dot stripped
if rec["domain"] != "example.com" {
t.Errorf("domain: want %q got %v", "example.com", rec["domain"])
}
// path
if rec["path"] != "/" {
t.Errorf("path: want %q got %v", "/", rec["path"])
}
// secure
if rec["secure"] != true {
t.Errorf("secure: want true got %v", rec["secure"])
}
// httponly
if rec["httponly"] != true {
t.Errorf("httponly: want true got %v", rec["httponly"])
}
// samesite
if rec["samesite"] != "Lax" {
t.Errorf("samesite: want %q got %v", "Lax", rec["samesite"])
}
// ts must be a non-empty string
ts, _ := rec["ts"].(string)
if ts == "" {
t.Errorf("ts must be a non-empty RFC3339 timestamp")
}
}
// TestCookieAuditMultipleCookies verifies that two Set-Cookie headers produce
// two independent JSONL records.
func TestCookieAuditMultipleCookies(t *testing.T) {
dir := t.TempDir()
ledger := filepath.Join(dir, "cookie-audit", "server.jsonl")
ca := NewCookieAudit(ledger)
resp, req := makeFakeResponse(
"https://shop.example.com/cart",
[]string{
"cart=abc123; Path=/; HttpOnly",
"tracker=xyz789; Domain=.example.com; Path=/; Secure; SameSite=None",
},
)
ca.Record(req.Host, req, resp)
// Flush via Close.
ca.Close()
data, err := os.ReadFile(ledger)
if err != nil {
t.Fatalf("read ledger: %v", err)
}
lines := splitNonEmptyLines(string(data))
if len(lines) != 2 {
t.Fatalf("expected 2 JSONL records (one per Set-Cookie), got %d:\n%s", len(lines), string(data))
}
// Both lines must be valid JSON with a name field.
names := map[string]bool{}
for i, line := range lines {
var rec map[string]interface{}
if err := json.Unmarshal([]byte(line), &rec); err != nil {
t.Fatalf("line %d not valid JSON: %v", i+1, err)
}
n, _ := rec["name"].(string)
if n == "" {
t.Errorf("line %d: name must not be empty", i+1)
}
names[n] = true
}
if !names["cart"] {
t.Errorf("expected a record with name=cart")
}
if !names["tracker"] {
t.Errorf("expected a record with name=tracker")
}
}
// TestCookieAuditNonBlocking verifies that Record returns promptly even when
// the internal channel is full (i.e. the writer goroutine is not draining).
// Strategy: call Record more times than the channel capacity (512 calls
// against the standard 256-slot channel) without waiting for the writer.
// Every call must return promptly and never block the response path.
func TestCookieAuditNonBlocking(t *testing.T) {
dir := t.TempDir()
ledger := filepath.Join(dir, "cookie-audit", "server.jsonl")
// Use the standard constructor (channel size 256). We call Record 512 times
// without any drain delay — the first 256 fill the channel; subsequent sends
// must be dropped non-blockingly. The goroutine will drain concurrently, but
// the test verifies that no single Record call hangs.
ca := NewCookieAudit(ledger)
resp, req := makeFakeResponse(
"https://example.com/",
[]string{"tok=value; Path=/"},
)
start := time.Now()
for i := 0; i < 512; i++ {
ca.Record(req.Host, req, resp)
}
elapsed := time.Since(start)
ca.Close()
// All 512 Record calls must complete in well under 1 second.
// (A blocking send would hang indefinitely; even a 100ms sleep per drop
// would blow this budget.)
if elapsed > 1*time.Second {
t.Errorf("Record loop took %v — looks like it blocked (want < 1s)", elapsed)
}
}
// splitNonEmptyLines splits s by newlines, returning only non-empty lines.
// Same logic as splitNonEmpty in threatlog_test.go. It has a different name
// because both helpers live in package main and identical names would collide.
func splitNonEmptyLines(s string) []string {
sc := bufio.NewScanner(bytes.NewBufferString(s))
var out []string
for sc.Scan() {
if line := sc.Text(); line != "" {
out = append(out, line)
}
}
return out
}
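
The tests above decode each ledger line into a generic map. A consumer of the ledger might prefer a typed view. The following is a sketch under assumptions: `ledgerRecord` and `parseLedger` are hypothetical names, only the JSON keys asserted by the tests are modelled, and the sample lines are made up for illustration (in particular, whether absent attributes are written as `null` or omitted is not confirmed by the source; both decode to a nil pointer here).

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// ledgerRecord is a hypothetical reader-side view of one JSONL line.
// Attributes that may be absent are pointers so nil means "not set".
type ledgerRecord struct {
	TS        string  `json:"ts"`
	Name      string  `json:"name"`
	ValueHash string  `json:"value_hash"`
	Domain    *string `json:"domain"`
	Path      *string `json:"path"`
	Secure    bool    `json:"secure"`
	HTTPOnly  bool    `json:"httponly"`
	SameSite  *string `json:"samesite"`
}

// parseLedger decodes every non-empty line of s as one record.
func parseLedger(s string) ([]ledgerRecord, error) {
	var out []ledgerRecord
	sc := bufio.NewScanner(strings.NewReader(s))
	lineNo := 0
	for sc.Scan() {
		lineNo++
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var rec ledgerRecord
		if err := json.Unmarshal([]byte(line), &rec); err != nil {
			return nil, fmt.Errorf("line %d: %w", lineNo, err)
		}
		out = append(out, rec)
	}
	return out, sc.Err()
}

// sampleLedger is illustrative data, not real ledger output.
const sampleLedger = `{"ts":"2026-06-26T18:13:39Z","name":"cart","value_hash":"ab12","domain":null,"path":"/","secure":false,"httponly":true,"samesite":null}
{"ts":"2026-06-26T18:13:39Z","name":"tracker","value_hash":"cd34","domain":"example.com","path":"/","secure":true,"httponly":false,"samesite":"None"}
`

func main() {
	recs, err := parseLedger(sampleLedger)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range recs {
		domain := "(none)"
		if r.Domain != nil {
			domain = *r.Domain
		}
		fmt.Println(r.Name, domain, r.Secure)
	}
}
```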


@ -0,0 +1,190 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: crowdsec — CrowdSec LAPI alert bridge
//
// Task 4.1: implements CrowdSecClient, which satisfies the CrowdSecReporter
// interface declared in main.go. On a ban event the handler calls
// crowdsec.Report(ip, cat, sev) in a goroutine; this client builds the LAPI
// alert JSON (ported faithfully from secubox_waf.py _ban_via_crowdsec) and
// POSTs it to {lapiURL}/v1/alerts with a 2 s timeout.
//
// Best-effort: network errors are logged and swallowed — the WAF never blocks
// on LAPI availability. SSRF hygiene: redirect-following is disabled.
package main
import (
"bytes"
"encoding/json"
"fmt"
"log"
"net/http"
"strings"
"time"
)
// CrowdSecClient implements CrowdSecReporter by POSTing alert objects to the
// CrowdSec LAPI /v1/alerts endpoint.
type CrowdSecClient struct {
lapiURL string
jwt string
duration string
client *http.Client
}
// NewCrowdSecClient builds a CrowdSecClient with a 2 s timeout and no redirect
// following (SSRF hygiene).
//
// - lapiURL: base URL of the CrowdSec LAPI, e.g. "http://10.100.0.1:8080"
// - jwt: Bearer token (read from --crowdsec-jwt-file by main())
// - duration: ban duration string forwarded in the decision, e.g. "4h"
func NewCrowdSecClient(lapiURL, jwt, duration string) *CrowdSecClient {
return &CrowdSecClient{
lapiURL: strings.TrimRight(lapiURL, "/"),
jwt: jwt,
duration: duration,
client: &http.Client{
Timeout: 2 * time.Second,
// Disable redirect following — prevents SSRF via 3xx to internal hosts.
CheckRedirect: func(req *http.Request, via []*http.Request) error {
return http.ErrUseLastResponse
},
},
}
}
// Report satisfies CrowdSecReporter. It builds the LAPI alert payload and
// POSTs it. Errors are logged only (best-effort, never panics).
// The caller already wraps this in a goroutine (see main.go ban branch).
func (c *CrowdSecClient) Report(ip, cat, sev string) {
if err := c.postAlert(ip, cat, sev); err != nil {
log.Printf("sbxwaf: crowdsec bridge error for %s (%s/%s): %v", ip, cat, sev, err)
}
}
// csAlertSource mirrors the source object expected by the CrowdSec LAPI.
type csAlertSource struct {
Scope string `json:"scope"`
Value string `json:"value"`
IP string `json:"ip"`
AsNumber string `json:"as_number"`
AsName string `json:"as_name"`
Cn string `json:"cn"`
Latitude float64 `json:"latitude"`
Longitude float64 `json:"longitude"`
}
// csDecision mirrors the decision object inside the LAPI alert.
type csDecision struct {
Duration string `json:"duration"`
Scenario string `json:"scenario"`
Type string `json:"type"`
Value string `json:"value"`
Scope string `json:"scope"`
Origin string `json:"origin"`
Simulated bool `json:"simulated"`
}
// csEventMeta is one key/value pair inside an event's meta list.
type csEventMeta struct {
Key string `json:"key"`
Value string `json:"value"`
}
// csEvent is a single event in the events array.
type csEvent struct {
Timestamp string `json:"timestamp"`
Meta []csEventMeta `json:"meta"`
}
// csAlert is the full alert object (one element of the POST body array).
type csAlert struct {
Scenario string `json:"scenario"`
ScenarioHash string `json:"scenario_hash"`
ScenarioVersion string `json:"scenario_version"`
Message string `json:"message"`
EventsCount int `json:"events_count"`
StartAt string `json:"start_at"`
StopAt string `json:"stop_at"`
Capacity int `json:"capacity"`
Leakspeed string `json:"leakspeed"`
Simulated bool `json:"simulated"`
Source csAlertSource `json:"source"`
Decisions []csDecision `json:"decisions"`
Events []csEvent `json:"events"`
}
// postAlert builds and POSTs the alert; returns an error for logging.
func (c *CrowdSecClient) postAlert(ip, cat, sev string) error {
// Python uses "%Y-%m-%dT%H:%M:%S.000000Z", i.e. a literal ".000000Z" suffix.
// In a Go layout ".000000" formats the real microseconds, so truncate to
// whole seconds first to reproduce that literal suffix for existing consumers.
nowISO := time.Now().UTC().Truncate(time.Second).Format("2006-01-02T15:04:05.000000Z")
scenario := fmt.Sprintf("secubox-waf/%s", cat)
alert := csAlert{
Scenario: scenario,
ScenarioHash: "",
ScenarioVersion: "1",
Message: fmt.Sprintf("WAF threshold crossed for %s (%s)", ip, cat),
EventsCount: 1,
StartAt: nowISO,
StopAt: nowISO,
Capacity: 0,
Leakspeed: "0s",
Simulated: false,
Source: csAlertSource{
Scope: "Ip",
Value: ip,
IP: ip,
AsNumber: "0",
AsName: "?",
Cn: "?",
Latitude: 0.0,
Longitude: 0.0,
},
Decisions: []csDecision{{
Duration: c.duration,
Scenario: scenario,
Type: "ban",
Value: ip,
Scope: "Ip",
Origin: "secubox-waf",
Simulated: false,
}},
Events: []csEvent{{
Timestamp: nowISO,
Meta: []csEventMeta{
{Key: "source_ip", Value: ip},
{Key: "scenario", Value: cat},
},
}},
}
body, err := json.Marshal([]csAlert{alert})
if err != nil {
return fmt.Errorf("marshal alert: %w", err)
}
endpoint := c.lapiURL + "/v1/alerts"
req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
if err != nil {
return fmt.Errorf("build request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+c.jwt)
req.Header.Set("Content-Type", "application/json")
resp, err := c.client.Do(req)
if err != nil {
return fmt.Errorf("POST %s: %w", endpoint, err)
}
defer resp.Body.Close()
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return fmt.Errorf("LAPI returned %d for %s (%s)", resp.StatusCode, ip, cat)
}
log.Printf("sbxwaf: crowdsec bridge BAN %s ← %s (sev=%s, dur=%s)",
ip, cat, sev, c.duration)
return nil
}
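
The redirect handling in NewCrowdSecClient relies on `http.ErrUseLastResponse`: when `CheckRedirect` returns it, the client hands back the 3xx response itself instead of following the `Location` header. A minimal sketch of that behaviour against a local test server follows; `newNoRedirectClient`, `probeRedirect`, and the redirect target `http://internal.invalid/admin` are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newNoRedirectClient returns a client that never follows redirects.
func newNoRedirectClient(timeout time.Duration) *http.Client {
	return &http.Client{
		Timeout: timeout,
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			// Stop at the first 3xx and return it to the caller as-is.
			return http.ErrUseLastResponse
		},
	}
}

// probeRedirect starts a local server that always redirects, requests it
// once, and returns the status code and Location header the client saw.
func probeRedirect() (int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "http://internal.invalid/admin", http.StatusFound)
	}))
	defer srv.Close()

	resp, err := newNoRedirectClient(2 * time.Second).Get(srv.URL)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Header.Get("Location"), nil
}

func main() {
	code, loc, err := probeRedirect()
	fmt.Println(code, loc, err) // prints "302 http://internal.invalid/admin <nil>"
}
```

Because the 302 is outside the 2xx range, a caller that checks the status code the way postAlert does would report it as an error rather than silently talking to the redirect target.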


@ -0,0 +1,140 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: crowdsec_test — CrowdSec LAPI bridge tests
package main
import (
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
// TestCrowdSecAlertPayload verifies that Report POSTs to /v1/alerts with the
// correct Authorization header and a well-formed alert JSON array.
func TestCrowdSecAlertPayload(t *testing.T) {
type capturedReq struct {
method string
path string
auth string
body []byte
}
var captured capturedReq
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
captured.method = r.Method
captured.path = r.URL.Path
captured.auth = r.Header.Get("Authorization")
b, _ := io.ReadAll(r.Body)
captured.body = b
w.WriteHeader(http.StatusCreated)
}))
defer srv.Close()
c := NewCrowdSecClient(srv.URL, "testjwt", "4h")
c.Report("1.2.3.4", "sqli", "high")
// Report is synchronous inside this test (no goroutine wrapper here).
// Give a tiny window just in case the httptest server needs to flush.
time.Sleep(20 * time.Millisecond)
// Method and path.
if captured.method != http.MethodPost {
t.Errorf("method: want POST, got %s", captured.method)
}
if captured.path != "/v1/alerts" {
t.Errorf("path: want /v1/alerts, got %s", captured.path)
}
// Authorization header.
if captured.auth != "Bearer testjwt" {
t.Errorf("Authorization: want 'Bearer testjwt', got %q", captured.auth)
}
// Parse the JSON body.
var alerts []map[string]interface{}
if err := json.Unmarshal(captured.body, &alerts); err != nil {
t.Fatalf("body is not valid JSON: %v\nbody: %s", err, captured.body)
}
if len(alerts) != 1 {
t.Fatalf("want 1 alert in array, got %d", len(alerts))
}
a := alerts[0]
// Scenario.
if got, _ := a["scenario"].(string); got != "secubox-waf/sqli" {
t.Errorf("scenario: want 'secubox-waf/sqli', got %q", got)
}
// Source.
src, ok := a["source"].(map[string]interface{})
if !ok {
t.Fatalf("source field missing or wrong type")
}
if v, _ := src["value"].(string); v != "1.2.3.4" {
t.Errorf("source.value: want '1.2.3.4', got %q", v)
}
if v, _ := src["ip"].(string); v != "1.2.3.4" {
t.Errorf("source.ip: want '1.2.3.4', got %q", v)
}
if v, _ := src["scope"].(string); v != "Ip" {
t.Errorf("source.scope: want 'Ip', got %q", v)
}
// Decisions.
decisionsRaw, ok := a["decisions"].([]interface{})
if !ok || len(decisionsRaw) != 1 {
t.Fatalf("decisions: want array of 1, got %v", a["decisions"])
}
d, _ := decisionsRaw[0].(map[string]interface{})
if v, _ := d["type"].(string); v != "ban" {
t.Errorf("decisions[0].type: want 'ban', got %q", v)
}
if v, _ := d["value"].(string); v != "1.2.3.4" {
t.Errorf("decisions[0].value: want '1.2.3.4', got %q", v)
}
if v, _ := d["duration"].(string); v != "4h" {
t.Errorf("decisions[0].duration: want '4h', got %q", v)
}
if v, _ := d["scope"].(string); v != "Ip" {
t.Errorf("decisions[0].scope: want 'Ip', got %q", v)
}
if v, _ := d["origin"].(string); v != "secubox-waf" {
t.Errorf("decisions[0].origin: want 'secubox-waf', got %q", v)
}
// Timestamps: assert fields exist and parse as RFC3339.
for _, field := range []string{"start_at", "stop_at"} {
v, _ := a[field].(string)
if v == "" {
t.Errorf("%s: field missing or empty", field)
continue
}
// time.Parse accepts a fractional-seconds field after the seconds even
// when the layout omits it, so RFC3339 also covers the microsecond suffix.
if _, err := time.Parse(time.RFC3339, v); err != nil {
t.Errorf("%s: %q does not parse as RFC3339: %v", field, v, err)
}
if !strings.HasSuffix(v, "Z") {
t.Errorf("%s: %q is not a UTC (Z-suffixed) timestamp", field, v)
}
}
// Events array.
eventsRaw, _ := a["events"].([]interface{})
if len(eventsRaw) < 1 {
t.Errorf("events: want at least 1 entry, got %d", len(eventsRaw))
}
}
// TestCrowdSecBestEffortOnError verifies that Report does not panic when the
// LAPI server is unreachable. Best-effort: errors are logged only.
func TestCrowdSecBestEffortOnError(t *testing.T) {
c := NewCrowdSecClient("http://127.0.0.1:1", "dummy", "4h")
// Must return without panic.
c.Report("1.2.3.4", "sqli", "high")
}


@ -0,0 +1,218 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: errpages — graduated WAF response pages
//
// Task 3.2: ported from WARNING_PAGE (secubox_waf.py ~line 221) and the inline
// ban response (secubox_waf.py ~line 1068-1072).
//
// writeWarning: HTTP 403, cyberpunk-styled warning page with the
// X-SecuBox-WAF: warning header. The HTML comment "<!-- sbxwaf-warning -->"
// acts as a machine-readable marker for tests and log parsers.
//
// writeBan: HTTP 403, minimal ban page with the X-SecuBox-WAF: banned header.
// The HTML comment "<!-- sbxwaf-banned -->" is the machine-readable marker.
//
// Task 7.1: synthetic upstream error pages (502/503/504).
//
// errorPage(code, host) — loads the embedded themed HTML template for the
// given upstream error code (502/503/504), substitutes {host} and {time},
// and returns the rendered bytes. Faithful port of the error() hook in
// secubox_waf.py (~line 1096):
// - Connection refused → 502 (ERROR_502_PAGE + {host}/{time} sub)
// - Timeout → 504 (ERROR_502_PAGE with 502→504 / Bad Gateway→Gateway Timeout)
// - Other → 503 (ERROR_503_PAGE, no {host} in the Python page)
//
// writeErrorPage(w, code, host) — sets Content-Type + X-SecuBox-WAF header,
// writes the status code, then writes errorPage output.
package main
import (
"bytes"
_ "embed"
"fmt"
"html"
"net/http"
"time"
)
// Embedded templates — verbatim copies of the Python secubox_waf.py pages.
//
//go:embed templates/error-502.html
var tmpl502 []byte
//go:embed templates/error-503.html
var tmpl503 []byte
//go:embed templates/error-504.html
var tmpl504 []byte
// errorPage returns the themed HTML body for the given upstream HTTP error code.
// host is substituted into {host} placeholders (both the 502 and 504 templates
// contain the upstream hostname in the error box). The {time} placeholder is
// replaced with the current wall-clock time (HH:MM:SS), matching the Python
// error() hook behaviour.
//
// Unknown codes fall back to the 502 template (sane default — keeps tests
// forward-compatible if new codes are added later).
func errorPage(code int, host string) []byte {
var tmpl []byte
switch code {
case 503:
tmpl = tmpl503
case 504:
tmpl = tmpl504
default: // 502 and any unknown code
tmpl = tmpl502
}
now := time.Now().Format("15:04:05")
safeHost := html.EscapeString(host)
out := bytes.ReplaceAll(tmpl, []byte("{host}"), []byte(safeHost))
out = bytes.ReplaceAll(out, []byte("{time}"), []byte(now))
return out
}
// writeErrorPage writes a themed upstream error response.
// Maps the error code to the WAF header value and delegates to errorPage.
func writeErrorPage(w http.ResponseWriter, code int, host string) {
w.Header().Set("Content-Type", "text/html; charset=utf-8")
w.Header().Set("X-SecuBox-WAF", fmt.Sprintf("error-%d", code))
w.WriteHeader(code)
_, _ = w.Write(errorPage(code, host))
}
// writeWarning writes a 403 cyberpunk-styled warning page.
// cat is the WAF category ID (e.g. "sqli") shown in the body.
// Faithful port of WARNING_PAGE from secubox_waf.py.
func writeWarning(w http.ResponseWriter, cat string) {
w.Header().Set("Content-Type", "text/html; charset=utf-8")
w.Header().Set("X-SecuBox-WAF", "warning")
w.WriteHeader(http.StatusForbidden)
fmt.Fprintf(w, `<!DOCTYPE html>
<!-- sbxwaf-warning -->
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>SecuBox WAF - Security Alert</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
background: linear-gradient(135deg, #0a0a0f 0%%, #1a0a0f 100%%);
color: #e8e6d9;
font-family: "JetBrains Mono", monospace;
min-height: 100vh;
display: flex;
justify-content: center;
align-items: center;
}
.container { text-align: center; padding: 2rem; max-width: 800px; }
.alert-icon {
font-size: 6rem;
margin-bottom: 1.5rem;
animation: pulse 2s infinite;
}
@keyframes pulse {
0%%, 100%% { transform: scale(1); opacity: 1; }
50%% { transform: scale(1.1); opacity: 0.8; }
}
h1 { color: #e63946; font-size: 2.5rem; margin-bottom: 1rem;
text-shadow: 0 0 20px rgba(230, 57, 70, 0.5); }
.warning-box {
background: rgba(230, 57, 70, 0.1);
border: 2px solid #e63946;
border-radius: 12px;
padding: 2rem;
margin: 2rem 0;
}
.warning-text { color: #e63946; font-size: 1.2rem; margin-bottom: 1rem; }
.details { color: #6b6b7a; font-size: 0.9rem; margin-top: 1rem; }
.license-box {
background: rgba(201, 168, 76, 0.1);
border: 1px solid #c9a84c;
border-radius: 8px;
padding: 1.5rem;
margin-top: 2rem;
text-align: left;
}
.license-title { color: #c9a84c; font-size: 1rem; margin-bottom: 0.5rem; }
.license-text { color: #6b6b7a; font-size: 0.75rem; line-height: 1.5; }
.footer { margin-top: 2rem; color: #6b6b7a; font-size: 0.8rem; }
.footer a { color: #c9a84c; text-decoration: none; }
</style>
</head>
<body>
<div class="container">
<div class="alert-icon">&#x26A0;&#xFE0F;</div>
<h1>SECURITY ALERT</h1>
<div class="warning-box">
<p class="warning-text">&#x1F6A8; Suspicious Activity Detected</p>
<p>Your request contains patterns that match known attack signatures.</p>
<p class="details">Category: %s</p>
<p class="details">This incident has been logged and your IP address recorded.</p>
<p class="details">Continued malicious activity will result in automatic IP ban.</p>
</div>
<div class="license-box">
<p class="license-title">&#x1F4DC; SecuBox Security Notice</p>
<p class="license-text">
This system is protected by SecuBox WAF (Web Application Firewall).<br>
All access attempts are monitored, logged, and may be reported to authorities.<br>
Continued malicious activity will result in automatic IP ban.<br><br>
&copy; 2024-2026 CyberMind Security Platform<br>
ANSSI CSPN Candidate | https://secubox.in
</p>
</div>
<p class="footer">
Protected by <a href="https://cybermind.fr">CyberMind</a> |
<a href="https://secubox.in">SecuBox</a>
</p>
</div>
</body>
</html>`, html.EscapeString(cat))
}
// writeBan writes a 403 IP banned response.
// Mirrors the inline ban response from secubox_waf.py lines 1068-1072.
func writeBan(w http.ResponseWriter) {
w.Header().Set("Content-Type", "text/html; charset=utf-8")
w.Header().Set("X-SecuBox-WAF", "banned")
w.WriteHeader(http.StatusForbidden)
fmt.Fprint(w, `<!DOCTYPE html>
<!-- sbxwaf-banned -->
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>403 Forbidden | SecuBox WAF</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
background: #0a0a0f;
color: #e8e6d9;
font-family: "JetBrains Mono", monospace;
min-height: 100vh;
display: flex;
justify-content: center;
align-items: center;
}
.container { text-align: center; padding: 2rem; max-width: 600px; }
h1 { color: #e63946; font-size: 3rem; margin-bottom: 1rem; }
p { color: #6b6b7a; margin-top: 1rem; }
</style>
</head>
<body>
<div class="container">
<h1>&#x1F6AB; 403 Forbidden</h1>
<p>Your IP has been banned.</p>
<p>This incident has been reported to the security platform.</p>
<p style="margin-top:2rem; font-size:0.8rem; color:#3a3a4a;">
SecuBox WAF &mdash; ANSSI CSPN | <a href="https://secubox.in" style="color:#c9a84c;">secubox.in</a>
</p>
</div>
</body>
</html>`)
}


@ -0,0 +1,236 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: errpages_test — TDD for Task 7.1
// Tests for synthetic 502/503/504 themed error pages ported from secubox_waf.py.
package main
import (
"io"
"net"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
// TestErrorPageSubstitutesHost verifies that errorPage(502, host) replaces
// the {host} placeholder in the template and does NOT leave it as a literal.
func TestErrorPageSubstitutesHost(t *testing.T) {
const host = "app.example.com"
body := errorPage(502, host)
if len(body) == 0 {
t.Fatal("errorPage(502, ...) returned empty body")
}
if !strings.Contains(string(body), host) {
t.Fatalf("expected body to contain %q after substitution", host)
}
if strings.Contains(string(body), "{host}") {
t.Fatal("body still contains literal {host} placeholder — substitution failed")
}
// 502 page has a machine-readable marker: the error-code div shows "502"
if !strings.Contains(string(body), "502") {
t.Fatal("expected body to contain the 502 error code marker")
}
}
// TestErrorPageAllCodes checks that 502/503/504 each return a non-empty body
// with a code-specific marker (the error-code div content from the templates).
func TestErrorPageAllCodes(t *testing.T) {
cases := []struct {
code int
marker string // string that must appear in the page
}{
{502, "502"},
{503, "503"},
{504, "504"},
}
for _, tc := range cases {
body := errorPage(tc.code, "test.host.local")
if len(body) == 0 {
t.Errorf("errorPage(%d) returned empty body", tc.code)
continue
}
if !strings.Contains(string(body), tc.marker) {
t.Errorf("errorPage(%d): body does not contain marker %q", tc.code, tc.marker)
}
}
}
// TestErrorPageUnknownCodeFallback checks that an unknown code returns a sane
// (non-empty) body — must not panic or return nil.
func TestErrorPageUnknownCodeFallback(t *testing.T) {
body := errorPage(599, "fallback.example.com")
if len(body) == 0 {
t.Fatal("errorPage(599) returned empty body — expected a non-empty fallback")
}
}
// TestHandlerServesThemed502OnDeadBackend routes a request to a port where
// nothing is listening (connection refused) and asserts:
// - status 502
// - X-SecuBox-WAF: error-502
// - body contains the themed 502 marker ("502")
func TestHandlerServesThemed502OnDeadBackend(t *testing.T) {
// Find an unused local port (bind then close immediately — race is
// acceptable here since the test is the only user and the port is ephemeral).
l, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatalf("could not bind ephemeral port: %v", err)
}
// Record the address before closing; afterwards nothing listens on the
// port, so connections to it are refused.
tcpAddr := l.Addr().(*net.TCPAddr)
deadHost, deadPort := tcpAddr.IP.String(), tcpAddr.Port
l.Close()
srv := &Server{
routeLookup: func(host string) (string, int, bool) {
if host == "dead.example.com" {
return deadHost, deadPort, true
}
return "", 0, false
},
}
handler := srv.handler()
req := httptest.NewRequest(http.MethodGet, "http://dead.example.com/", nil)
req.Host = "dead.example.com"
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
res := rec.Result()
if res.StatusCode != http.StatusBadGateway {
t.Fatalf("expected 502, got %d", res.StatusCode)
}
wafHdr := res.Header.Get("X-SecuBox-WAF")
if wafHdr != "error-502" {
t.Fatalf("expected X-SecuBox-WAF: error-502, got %q", wafHdr)
}
body, _ := io.ReadAll(res.Body)
if !strings.Contains(string(body), "502") {
t.Fatalf("expected themed 502 body, got: %q", string(body)[:min(200, len(body))])
}
// Must NOT contain the raw placeholder.
if strings.Contains(string(body), "{host}") {
t.Fatal("response body still contains {host} literal — substitution failed")
}
}
// TestHandlerServes504OnUpstreamTimeout routes to a backend that sleeps past a
// short per-request upstream timeout and asserts 504 + X-SecuBox-WAF: error-504.
func TestHandlerServes504OnUpstreamTimeout(t *testing.T) {
// Backend that sleeps 2s — our timeout will be 50ms so it times out.
slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
time.Sleep(2 * time.Second)
w.WriteHeader(http.StatusOK)
}))
defer slow.Close()
backendAddr := strings.TrimPrefix(slow.URL, "http://")
bHost, bPort, err := splitHostPort(backendAddr)
if err != nil {
t.Fatalf("splitHostPort: %v", err)
}
srv := &Server{
upstreamTimeout: 50 * time.Millisecond, // very short → guaranteed timeout
routeLookup: func(host string) (string, int, bool) {
if host == "slow.example.com" {
return bHost, bPort, true
}
return "", 0, false
},
}
handler := srv.handler()
req := httptest.NewRequest(http.MethodGet, "http://slow.example.com/", nil)
req.Host = "slow.example.com"
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
res := rec.Result()
if res.StatusCode != http.StatusGatewayTimeout {
t.Fatalf("expected 504, got %d", res.StatusCode)
}
wafHdr := res.Header.Get("X-SecuBox-WAF")
if wafHdr != "error-504" {
t.Fatalf("expected X-SecuBox-WAF: error-504, got %q", wafHdr)
}
body, _ := io.ReadAll(res.Body)
if !strings.Contains(string(body), "504") {
t.Fatalf("expected themed 504 body, got: %q", string(body)[:min(200, len(body))])
}
}
// TestErrorPageEscapesHost verifies that a Host value containing HTML-special
// characters is escaped before being inserted into the page, preventing a
// reflected XSS via an attacker-controlled Host header.
//
// Note: the 502 template itself contains a legitimate <script> block for the
// retry countdown timer — that is expected. What must NOT appear is the
// attacker-injected payload "><script>alert(1)</script> reflected verbatim.
// html.EscapeString escapes <, >, &, " and ' — plain text like "alert(1)"
// within the already-escaped tags is safe and will remain in the output.
func TestErrorPageEscapesHost(t *testing.T) {
maliciousHost := "\"><script>alert(1)</script>"
body := string(errorPage(502, maliciousHost))
// The raw, unescaped payload must not appear verbatim.
// If it does, the host value was reflected unescaped — XSS.
if strings.Contains(body, maliciousHost) {
t.Fatal("body contains the raw malicious Host value unescaped — reflected XSS vulnerability")
}
// The injected closing quote + opening angle must not appear — this is
// the breakout vector that allows injecting a new tag context.
if strings.Contains(body, "\"><script>") {
t.Fatal(`body contains unescaped "><script> from Host header — tag-injection XSS vulnerability`)
}
// Must contain the escaped form so the host value is still rendered safely.
if !strings.Contains(body, "&lt;script&gt;") {
t.Fatal("body does not contain escaped &lt;script&gt; — escaping may be missing or incorrect")
}
// Must not contain the bare placeholder.
if strings.Contains(body, "{host}") {
t.Fatal("body still contains literal {host} placeholder — substitution failed")
}
}
// TestErrorPageSubstitutesHostNormal confirms that a well-formed host (no
// special chars) is preserved unchanged after escaping — escaping must not