Compare commits

31 Commits

Author SHA1 Message Date
c6d6eb5c75 Merge #744: sbxwaf Go WAF engine + shared internal/ core
# Conflicts:
#	packages/secubox-toolbox-ng/cmd/sbxmitm/compress_test.go
#	packages/secubox-toolbox-ng/cmd/sbxmitm/cosmetic_test.go
#	packages/secubox-toolbox-ng/cmd/sbxmitm/gzip.go
#	packages/secubox-toolbox-ng/cmd/sbxmitm/gzip_test.go
2026-06-26 16:02:35 +02:00
2e6cec9b38 fix(waf-api): read sbxwaf threat-log path for the WAF dashboard (ref #744)
The Go sbxwaf engine writes the threat log to the sandboxed leaf dir
/var/log/secubox/waf/waf-threats.log; the dashboard's /stats + /alerts now
resolve that path (env-overridable, legacy-path fallback) so the WAF WebUI
shows real engine data again after the cutover.
2026-06-26 15:55:15 +02:00
b607d7f7d6 docs(waf-ng): package README (WIP) + ignore build artifacts (ref #744) 2026-06-26 15:31:26 +02:00
e5a2c5d287 fix(sbxwaf): final-review wave — vhost cache key, crowdsec LAPI url, body-inspect cap+audit, seccomp, trusted-host skip (ref #744)
Fix 1 (media-cache vhost isolation): key MediaCache.Get/MaybeStore on
"https://"+Host+RequestURI() instead of r.URL.String() (path-only on
server requests). Two vhosts sharing /logo.png no longer collide.
MaybeStore gains explicit cacheURL arg; all callers updated.
Add TestMediaCacheVhostIsolation: store hostA/x.png, assert hostB/x.png
→ MISS.

Fix 2 (CrowdSec self-loop): secubox-waf-ng-worker@.service --crowdsec-url
was http://127.0.0.1:8080 — the nftables DNAT VIP that fans requests back
into the workers themselves. Changed to http://10.100.0.1:8080 (LXC-bridge
LAPI, same as Python addon). Added blocking unit comment + CUTOVER.md §1.2
crowdsec-url self-loop check.

Fix 3 (body-inspect cap + audit):
- maxBodyInspect const → defaultMaxBodyInspect; Server gains maxBodyInspect
  field wired from new --max-body-inspect flag (default 1 MiB, operator-
  tunable).
- When body read returns exactly cap bytes (truncated), emit AUDIT log line
  (action=body-inspect-truncated) to threat log + stderr so truncation is
  operator-visible; request is never blocked on audit.
- Added known_gap_body_payload_after_1mib_prefix parity fixture documenting
  the prefix-bounded inspection gap with honest note.
- Added CUTOVER.md §1.6 "body inspection cap" gate with operator sign-off
  checklist and three mitigation options.

Fix 4 (seccomp/hardening): secubox-waf-ng-worker@.service was missing
SystemCallFilter, SystemCallArchitectures, ProtectKernelTunables/Modules/
Logs, ProtectControlGroups, RestrictNamespaces, LockPersonality,
RestrictSUIDSGID, RestrictRealtime, MemoryDenyWriteExecute, PrivateDevices.
Added full set matching project convention (secubox-mesh.service) as
mandated by spec §6/CSPN.

Fix 5 (trusted-host whitelist): Server gains trustedHosts map + isTrustedHost
method. --waf-skip-hosts flag (default: git.gk2.secubox.in, git.secubox.in,
admin.gk2.secubox.in, 10.100.0.1:9080) mirrors Python check_request whitelist
(secubox_waf.py:761-763). Trusted hosts bypass WAF inspection before rule
matching. Add TestTrustedHostSkipsWAF (with sanity check that untrusted host
is still blocked) and TestIsTrustedHost/TestParseTrustedHosts unit tests.
2026-06-26 15:27:10 +02:00
11438e394c docs(sbxwaf): bench harness + cutover/rollback runbook with parity-gap gates (ref #744)
- scripts/sbxwaf-bench.sh: wrk/hey bench harness against legacy mitmproxy
  (10.100.0.60:8080) and shadow sbxwaf (127.0.0.1:8081); captures req/s,
  p99, RSS; prints comparison table with PASS/FAIL for all 3 go/no-go gates
  (>5× req/s·core, p99<1/3, RSS<1/4); shellcheck clean.

- packages/secubox-waf-ng/docs/CUTOVER.md: operator runbook with 6 sections:
  pre-cutover checklist (CA, CrowdSec JWT, COMPLETE log4shell corpus,
  null-byte \x00 fix, goform FP fix, parity green), shadow-run procedure,
  go/no-go gate table, exact HAProxy server re-point + nftables DNAT topology,
  single-edit rollback, and post-cutover monitoring (threat log, cookie-audit,
  RuntimeDirectoryPreserve guarantee, CrowdSec JWT rotation constraint,
  Python WAF unescaped-Host XSS backport note, body URL-decode limitation).
2026-06-26 15:02:18 +02:00
3f24034c37 test(sbxwaf): make log4shell corpus gap loud + isolate per-fixture ban state (ref #744)
Fix 1 (medium): add known_gap fixture log4shell_jndi_corpus_gap to waf-parity-fixtures.json.
  The Go corpus (secubox-waf/config/waf-rules.json) is missing the log4shell category
  present in the Python corpus (secubox-mitmproxy/data/waf-rules.json); jndi:ldap payloads
  are missed silently by Go. The fixture emits a visible KNOWN GAP t.Logf line in CI so
  the coverage gap is never silent. expect=allow (current Go gap behaviour).

Fix 2 (low): sqli_union_all_select client_ip changed from 45.33.32.156 to 45.33.32.158.
  Two warn-fixtures previously shared IP 45.33.32.156 (sqli_union_all_select +
  scanner_sqlmap_ua), accumulating ban count to 2; a third reuse would silently flip
  verdict to ban. Each warn fixture now uses a distinct IP so per-fixture verdicts are
  independent. Comment added to parity_test.go explaining the shared-Ban lifecycle
  (ban-sequence fixtures are the only intentional cross-fixture accumulation).

Result: 52 fixtures, 52 pass, 3 known-gap lines visible, go build clean.
2026-06-26 14:54:17 +02:00
7814bee861 test(sbxwaf): decision parity harness vs mitmproxy (ref #744)
51-fixture corpus (testdata/waf-parity-fixtures.json) covering:
- allow (6): benign GET/POST, URL-encoded body not decoded by engine
- warn (33): SQLi, XSS, LFI, RCE, scanners, honeypots, CVEs, router botnet,
  URL-encoded path+query attacks (proves unquote_plus decode)
- ban (1): 3-hit sequence for ip 198.51.100.99 reaching threshold
- skip (11): static assets, /health, NC bypass paths, RFC1918 IPs
- known_gap (2): router-goform unanchored ';' FP + RE2 null-byte CVE patterns

TestWAFParity calls the REAL decision path (privateCIDR / staticAsset /
ncBypass / Rules.Match / Ban.Record) — no re-implementation in the test.
Findings during harness construction:
- router-goform FP: shared Python+Go pattern bug (';' in common UAs), NOT
  a parity regression — documented as known_gap
- Log4Shell gap: secubox-waf/config/waf-rules.json (Go corpus) is missing
  the log4shell category present in the Python mitmproxy rules — corpus gap
- Body URL-decoding asymmetry: Rules.Match does not decode body (only
  path+query); documented with two fixture variants
- 5 null-byte RE2 patterns skipped at compile: cve-ast-2022-42706,
  cve-ast-2023-37457, cve-opensips-2023-49323, cve-prosody-2022-0217,
  cve-strophe-2022-29168 — documented as known_gap

go test ./cmd/sbxwaf/ -run TestWAFParity -v: 51 pass, 0 fail
go build ./...: clean
2026-06-26 14:45:13 +02:00
16a4e6e63d fix(packaging): scope sbxwaf sandbox to WAF-owned leaf dirs (ref #744)
M1: move --threat-log default from /var/log/secubox/waf-threats.log to
/var/log/secubox/waf/waf-threats.log (WAF-owned leaf); update ExecStart
and postinst to create the leaf dir (secubox-waf:secubox-waf 0750);
narrow ReadWritePaths from /var/log/secubox to leaf dirs only
(/var/log/secubox/waf /var/log/secubox/cookie-audit).

L1: fix AppArmor profile — /var/log/secubox/ parent entry changed from
rw to r (traverse only); add explicit rw entries for both leaf dirs
(/var/log/secubox/waf/** and /var/log/secubox/cookie-audit/**).
2026-06-26 14:31:35 +02:00
e275f730ec feat(packaging): secubox-waf-ng deb + hardened systemd worker + AppArmor (ref #744)
packages/secubox-waf-ng/:
- debian/control: Architecture: arm64, Standards-Version: 4.6.2, compat 13
- debian/rules: cross-build sbxwaf from secubox-toolbox-ng Go module
  (GOOS=linux GOARCH=arm64 CGO_ENABLED=0 -mod=vendor, execute_after_dh_auto_install)
- systemd/secubox-waf-ng-worker@.service: User=secubox-waf,
  RuntimeDirectory=secubox + RuntimeDirectoryPreserve=yes (#741 socket-wipe fix),
  NoNewPrivileges, ProtectSystem=strict, ProtectHome, PrivateTmp,
  CapabilityBoundingSet= (drop all), RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX,
  MemoryMax=256M, listens on 127.0.0.1:808%i (instances 1+2)
- debian/secubox-waf-ng.apparmor: enforce profile for /usr/sbin/sbxwaf
  (rw log/cache/run, r config/secrets, deny-all else)
- debian/postinst: adduser secubox-waf --system --group, leaf dirs only
  (NEVER chmod shared parents /etc/secubox /var/log/secubox /var/cache/secubox
  to 0750 — traversal constraint from #511/#620), aa-enforce, systemctl enable+start @1+2
- debian/prerm: stop+disable @1+2 workers

Build: dpkg-buildpackage -a arm64 -us -uc -b -d produced
secubox-waf-ng_1.0.0-1~bookworm1_arm64.deb (2.0M, stripped arm64 binary 5.5M)
containing /usr/sbin/sbxwaf, /lib/systemd/system/secubox-waf-ng-worker@.service,
/etc/apparmor.d/usr.sbin.sbxwaf.
2026-06-26 14:25:01 +02:00
49edf6670a fix(sbxwaf): HTML-escape Host in error pages — prevent reflected XSS (ref #744)
errorPage(code, host) was substituting r.Host verbatim into the 502/504
templates, allowing an attacker to inject arbitrary HTML via a crafted
Host header. Apply html.EscapeString before substitution.

Add TestErrorPageEscapesHost (asserts raw payload absent + escaped form
present) and TestErrorPageSubstitutesHostNormal (safe hosts unchanged).
2026-06-26 14:15:57 +02:00
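The escaping fix in commit 49edf6670a above can be illustrated with a minimal sketch. The template string and the `renderErrorPage` helper are hypothetical stand-ins (the real `errorPage(code, host)` and its embedded templates are not shown in this diff); only `html.EscapeString` and the `{host}` placeholder come from the commit message.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// errorTemplate is a hypothetical stand-in for the embedded 502 template.
const errorTemplate = "<h1>502 Bad Gateway</h1><p>Upstream for {host} is unreachable.</p>"

// renderErrorPage substitutes the Host value into the template after
// HTML-escaping it, so a crafted Host header cannot inject markup.
func renderErrorPage(host string) string {
	return strings.ReplaceAll(errorTemplate, "{host}", html.EscapeString(host))
}

func main() {
	// A hostile Host header is rendered as inert text.
	fmt.Println(renderErrorPage(`"><script>alert(1)</script>`))
	// A normal host passes through unchanged.
	fmt.Println(renderErrorPage("git.secubox.in"))
}
```

Escaping at substitution time keeps the templates themselves verbatim, which matches the commit's stated goal of leaving safe hosts unchanged.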
ae930c0347 feat(sbxwaf): synthetic themed error pages on upstream failure (ref #744)
Port the Python secubox_waf.py error() hook (~line 1096) to Go:
- 502 Bad Gateway    — connection refused/dial failure → error-502.html
- 503 Svc Unavail    — all other errors               → error-503.html
- 504 Gateway Timeout — net.Error.Timeout()           → error-504.html

Templates embedded via //go:embed (templates/*.html), verbatim copies
from Python ERROR_502_PAGE / ERROR_503_PAGE; error-504.html is 502 with
"502"→"504" and "Bad Gateway"→"Gateway Timeout" (mirrors Python in-place
replace). {host} and {time} placeholders substituted at request time.

upstreamErrorCode() helper maps net.Error → 502/503/504. ErrorHandler wired
in both the fallback proxy path (main.go) and cached proxy path (routes.go).

TDD: 5 new tests (errorPage substitution, all codes, unknown-code fallback,
themed 502 on dead backend, 504 on timeout). Full suite green (2.2s).
2026-06-26 14:09:51 +02:00
c3940a2958 fix(sbxwaf): media-cache Flusher passthrough + oversize handler test (ref #744)
- Fix I2: implement http.Flusher on cachingResponseWriter so
  httputil.ReverseProxy can flush chunks incrementally to the client
  (progressive video / PeerTube streaming); pure pass-through, does not
  affect cache buffer capture.
- Fix M1: add TestMediaCacheHandlerOversizeStreamsFullBody — end-to-end
  regression guard that a >16 MiB video/mp4 response streams its FULL
  body (byte count + SHA-256 checksum) to the client and is NOT cached;
  passes against current code, race-clean.
- Fix M2: document the mtime-as-atime LRU choice at the loadIndex site
  so a future reader understands why ModTime() is used instead of atime
  (relatime suppresses atime on most Linux filesystems; mtime is set
  explicitly via os.Chtimes on every Get hit).
2026-06-26 14:04:15 +02:00
f1573c37d2 feat(sbxwaf): media-cache (16MB/obj, 2GB total) (ref #744)
Add disk-backed response media cache ported from media_cache.py:
- GET image/video/audio/font/css/js responses cached under sha256(url)
- On-disk layout: <dir>/<key[:2]>/<key> + <key>.m sidecar (body+meta)
- LRU eviction by atime when total exceeds 2 GiB cap
- TTL from max-age (or 1h default); nowFn seam for deterministic tests
- Cache hit short-circuits upstream; miss captures via cachingResponseWriter
- Oversize (>16 MiB) bodies not stored but streamed fully to client
- --media-cache-dir flag (default /var/cache/secubox/waf/media; empty = off)
- 11 TDD tests: store/get, non-media reject, oversize reject, expiry,
  handler hit, no-store skip, stats, eviction, non-GET, persistence, miss-stores
2026-06-26 13:56:57 +02:00
85b508d4f2 fix(sbxwaf): atomic cookie-audit drop counter (race-free) (ref #744) 2026-06-26 13:50:50 +02:00
f06cb2dc28 feat(sbxwaf): RGPD cookie-audit JSONL ledger (ref #744)
Add cmd/sbxwaf/cookieaudit.go: CookieAudit struct with buffered async
channel (256), single writer goroutine, non-blocking Record (drop-on-full),
SHA256-hashed values, raw Set-Cookie parsed directly for SameSite support.
Wire --cookie-audit-log flag into Server and Routes ModifyResponse.
2026-06-26 13:47:23 +02:00
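The non-blocking, drop-on-full `Record` described in commit f06cb2dc28 above (with the atomic drop counter from 85b508d4f2) could look roughly like the following sketch. All type and field names here are assumptions, the boolean return value is added for illustration only, and the single writer goroutine that drains the channel to the JSONL file is omitted.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync/atomic"
)

// auditEntry is a hypothetical ledger record; the cookie value is stored
// only as a SHA-256 hash, never in clear text.
type auditEntry struct {
	Name      string
	ValueHash string
}

// cookieAudit buffers entries for a writer goroutine (not shown here).
type cookieAudit struct {
	ch      chan auditEntry
	dropped atomic.Uint64
}

func newCookieAudit(size int) *cookieAudit {
	return &cookieAudit{ch: make(chan auditEntry, size)}
}

// Record enqueues an entry without ever blocking the request path.
// When the buffer is full the entry is dropped and counted.
func (a *cookieAudit) Record(name, value string) bool {
	sum := sha256.Sum256([]byte(value))
	entry := auditEntry{Name: name, ValueHash: hex.EncodeToString(sum[:])}
	select {
	case a.ch <- entry:
		return true
	default:
		a.dropped.Add(1)
		return false
	}
}

// Dropped reports how many entries were discarded so far.
func (a *cookieAudit) Dropped() uint64 { return a.dropped.Load() }

func main() {
	a := newCookieAudit(2) // the commit uses a 256-slot buffer
	for i := 0; i < 3; i++ {
		fmt.Println("queued:", a.Record("session", fmt.Sprint(i)))
	}
	fmt.Println("dropped:", a.Dropped())
}
```

The `select` with a `default` branch is what makes the send non-blocking; an atomic counter keeps the drop tally race-free without taking a lock on the hot path.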
a85668f39d feat(sbxwaf): CrowdSec LAPI alert bridge on ban (ref #744)
- Add cmd/sbxwaf/crowdsec.go: CrowdSecClient satisfying CrowdSecReporter;
  POSTs a LAPI /v1/alerts JSON array (ported from secubox_waf.py
  _ban_via_crowdsec) with 2s timeout, no redirect following (SSRF hygiene),
  best-effort error handling (log + return, never block/panic).
- Add cmd/sbxwaf/crowdsec_test.go: TDD — TestCrowdSecAlertPayload (httptest
  capture of POST /v1/alerts, bearer token, payload fields) +
  TestCrowdSecBestEffortOnError (dead URL, no panic).
- Wire flags in main(): --crowdsec-url, --crowdsec-jwt-file (secret read from
  file, not argv), --crowdsec-ban-duration (default 4h); wires srv.crowdsec
  when both url+jwt-file are set; leaves nil otherwise (bridge disabled).
2026-06-26 13:39:22 +02:00
64258b98d8 feat(sbxwaf): graduated WARNING/BAN responses + threat log (ref #744)
Task 3.2: wire ban.Record into the handler for graduated 403 responses:
- count < threshold → writeWarning (403, cyberpunk WARNING_PAGE port, X-SecuBox-WAF: warning)
- count >= threshold → writeBan (403, ban page, X-SecuBox-WAF: banned)
- ThreatLog appends one NDJSON line per hit to /var/log/secubox/waf-threats.log
  (O_APPEND|O_CREATE, 0640, best-effort — never crashes the request path)
- Server gains ban *Ban + threatLog *ThreatLog + crowdsec CrowdSecReporter (nil seam for Task 4.1)
- main() wires NewBan(300s,3) + NewThreatLog(--threat-log) at startup
- CrowdSecReporter interface + go crowdsec.Report() call ready for Task 4.1 to slot in
2026-06-26 13:34:02 +02:00
8ea14e660a feat(sbxwaf): sliding-window graduated ban (ref #744)
Adds cmd/sbxwaf/ban.go: Ban struct with NewBan(window, threshold) and
Record(ip, nowUnix) that prunes stale hits, counts within the window,
and returns (count, banned) — mirrors Python BAN_THRESHOLD=3/BAN_WINDOW=300s.
Map capped at 100k IPs to bound memory under flood. Tests: TDD pass 3/3.
2026-06-26 13:27:09 +02:00
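The sliding-window counter described in commit 8ea14e660a above can be sketched as follows. This is a simplified illustration, not the contents of `ban.go`: the type name `slidingBan` is hypothetical, the window is taken in seconds, the 100k-IP cap is omitted, and whether a hit exactly at the window edge still counts is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// slidingBan counts hits per IP inside a trailing time window.
type slidingBan struct {
	mu        sync.Mutex
	window    int64 // seconds
	threshold int
	hits      map[string][]int64
}

func newSlidingBan(windowSeconds int64, threshold int) *slidingBan {
	return &slidingBan{
		window:    windowSeconds,
		threshold: threshold,
		hits:      make(map[string][]int64),
	}
}

// Record prunes hits older than the window, appends the new hit, and
// returns the hit count within the window plus whether the threshold
// has been reached.
func (b *slidingBan) Record(ip string, nowUnix int64) (count int, banned bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	cutoff := nowUnix - b.window
	kept := b.hits[ip][:0] // filter in place, reusing the backing array
	for _, ts := range b.hits[ip] {
		if ts > cutoff {
			kept = append(kept, ts)
		}
	}
	kept = append(kept, nowUnix)
	b.hits[ip] = kept
	return len(kept), len(kept) >= b.threshold
}

func main() {
	b := newSlidingBan(300, 3) // mirrors BAN_WINDOW=300s, BAN_THRESHOLD=3
	for _, now := range []int64{1000, 1010, 1020} {
		count, banned := b.Record("198.51.100.99", now)
		fmt.Println(count, banned)
	}
}
```

Passing `nowUnix` in rather than calling the clock inside `Record` keeps the logic deterministic under test, which is consistent with the signature given in the commit message.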
4334f93edc fix(sbxwaf): forward full request body intact, cap only inspection (ref #744)
Finding 1 (data corruption): replace LimitReader-restore pattern with a
streaming MultiReader approach: read up to maxBodyInspect (1 MiB) into a
prefix buffer for WAF inspection, then restore r.Body as
io.MultiReader(bytes.NewReader(prefix), r.Body) so the upstream proxy
receives every byte intact. Large uploads (PeerTube / Nextcloud) no longer
get truncated at 1 MiB.

Finding 2 (dead code): remove healthPath() from inspect.go — it was never
called; its logic is fully covered by staticAsset().

Tests added:
- TestInspectLargeBodyForwardedIntact: POST 1 MiB + 4 KiB → backend receives
  full body byte-for-byte (regression test for the truncation bug).
- TestInspectLargeBodyAttackInFirstMiB: attack in first 1 MiB of large body
  is still blocked (streaming inspection still works).
2026-06-26 13:23:42 +02:00
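The prefix-then-replay pattern from commit 4334f93edc above can be sketched with the standard library alone. The helper names `peekBody` and `demoPeek` are hypothetical; the sketch works on a plain `io.Reader`, so a real handler would additionally need to keep the original body's `Close` method reachable.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// peekBody reads at most limit bytes for inspection and returns both the
// inspected prefix and a reader that replays the prefix followed by the
// untouched remainder, so the upstream still receives every byte.
func peekBody(body io.Reader, limit int64) ([]byte, io.Reader, error) {
	prefix, err := io.ReadAll(io.LimitReader(body, limit))
	if err != nil {
		return nil, nil, err
	}
	return prefix, io.MultiReader(bytes.NewReader(prefix), body), nil
}

// demoPeek shows what the inspector sees versus what is forwarded.
func demoPeek(payload string, limit int64) (inspected, forwarded string) {
	prefix, rest, err := peekBody(strings.NewReader(payload), limit)
	if err != nil {
		return "", ""
	}
	all, _ := io.ReadAll(rest)
	return string(prefix), string(all)
}

func main() {
	inspected, forwarded := demoPeek("0123456789", 4)
	fmt.Println("inspected:", inspected)
	fmt.Println("forwarded:", forwarded)
}
```

Only the prefix is buffered in memory; the remainder streams straight through, which is why large uploads are no longer truncated at the inspection cap.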
02b1c7a461 feat(sbxwaf): request inspection + CIDR/static/NC skip-lists (ref #744)
Wire Rules.Match into the HTTP handler (Task 2.2):
- inspect.go: privateCIDR (RFC1918+loopback), staticAsset (13 exts + health),
  ncBypass (/index.php/login/v2/ + /ocs/v2.php/core/login), clientIP (XFF
  trusted only when peer ∈ TRUSTED_PROXIES), maxBodyInspect=1MiB constant.
- main.go: Server.rules *Rules field; handler reads body capped at 1MiB,
  restores via io.NopCloser(bytes.NewReader) before proxying; WAF hit → 403;
  Connection: close added to upstream requests (#496).
- main() wires LoadRules(*rules) when --rules flag is provided.

Ports faithfully from secubox_waf.py: _is_whitelisted/_WL_NETS (lines 28-47),
get_real_client_ip (lines 193-219), check_request fast-path (lines 764-769).

Tests: 5 new (BlocksAttack, PrivateIPBypass, StaticAssetSkip, NCBypass,
BodyForwarded) + 14 existing = 19/19 PASS.
2026-06-26 13:18:56 +02:00
efb390b713 feat(sbxwaf): regex WAF rule engine from waf-rules.json (ref #744) 2026-06-26 13:09:23 +02:00
f2bdef341c fix(sbxwaf): inject shared transport into all route proxies (ref #744)
- LoadRoutes(path, transport http.RoundTripper) — transport now required at
  load time; nil falls back to http.DefaultTransport gracefully
- buildEntries: removes the r.transport != nil guard — transport is always
  set at Routes construction, never post-hoc
- Server gains a transport http.RoundTripper field; main() constructs the
  tuned *http.Transport (dial timeout + pool settings) BEFORE LoadRoutes so
  startup-built proxies share the same pool as reload-built ones
- handler() uses s.transport when available; falls back to a local transport
  only for test Servers that don't inject one (backwards-compat)
- main(): removed the post-hoc s.routes.transport = transport assignment
- routes_test.go: adds TestRoutesInjectedTransportUsed — sentinel transport
  proves startup-built proxies use the injected transport, not DefaultTransport
- Existing TestRoutesLookup / TestRoutesLookupCaseAndPort / TestRoutesHotReload
  updated to pass nil transport (all still pass)
2026-06-26 13:03:25 +02:00
bd6b7c3ebf feat(sbxwaf): haproxy-routes.json loader + hot-reload + cached reverse-proxy (ref #744)
- New routes.go: Routes struct with RW-locked map, LoadRoutes() parses
  haproxy-routes.json ({"host": ["ip", port]}), skips malformed entries
  without panicking, Lookup() lowercases+strips-port before probe.

- Hot-reload via internal/reload.Watcher (throttle=0, mirrors policy.go):
  Target.Load re-parses file, Target.Apply atomically swaps entries map
  under mu.Lock. Routes.Maybe() called once per request in handler().

- Perf: *httputil.ReverseProxy built once per ip:port backend at load/reload
  time (sync.Map proxyCache keyed by "ip:port"), never per-request. Shared
  *http.Transport injected from Server.handler() so all backends share one
  connection pool.

- main.go: --routes flag now calls LoadRoutes, sets srv.routes + srv.routeLookup;
  handler uses ProxyFor() for cached proxy, falls back to one-off only when
  routes is nil (test injection path).

- Tests: 5/5 PASS — Lookup, case/port normalisation, hot-reload with
  os.Chtimes bump + Maybe() trigger.
2026-06-26 12:50:47 +02:00
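The route parsing and lookup normalisation described in commit bd6b7c3ebf above might be sketched like this. The names `backend`, `parseRoutes` and `lookupRoute` are hypothetical, and the locking, hot-reload and proxy cache are left out; only the JSON shape `{"host": ["ip", port]}`, the skip-malformed behaviour and the lowercase plus strip-port lookup come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"strings"
)

// backend is a hypothetical holder for one route target.
type backend struct {
	IP   string
	Port int
}

// parseRoutes decodes {"host": ["ip", port]} and silently skips entries
// that do not have exactly that shape instead of failing the whole file.
func parseRoutes(data []byte) (map[string]backend, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	routes := make(map[string]backend, len(raw))
	for host, value := range raw {
		var pair []json.RawMessage
		if json.Unmarshal(value, &pair) != nil || len(pair) != 2 {
			continue
		}
		var b backend
		if json.Unmarshal(pair[0], &b.IP) != nil || json.Unmarshal(pair[1], &b.Port) != nil {
			continue
		}
		routes[strings.ToLower(host)] = b
	}
	return routes, nil
}

// lookupRoute lowercases the Host value and strips an optional port
// before probing the map.
func lookupRoute(routes map[string]backend, host string) (string, int, bool) {
	h := strings.ToLower(host)
	if name, _, err := net.SplitHostPort(h); err == nil {
		h = name
	}
	b, ok := routes[h]
	return b.IP, b.Port, ok
}

func main() {
	data := []byte(`{"gitea.example.com": ["127.0.0.1", 3000], "broken": ["only-one"]}`)
	routes, err := parseRoutes(data)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(lookupRoute(routes, "Gitea.Example.com:443"))
	fmt.Println(lookupRoute(routes, "broken"))
}
```

Decoding each value separately through `json.RawMessage` is the design choice that lets one bad entry be skipped without rejecting the rest of the file.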
d747b705ce feat(sbxwaf): reverse-proxy skeleton + listener (ref #744)
- cmd/sbxwaf/main.go: Server struct with routeLookup func field, handler()
  reverse-proxying via httputil.ReverseProxy, X-SecuBox-WAF: inspected stamp,
  421 for unmapped hosts, flags (--listen, --ca-cert, --ca-key, --routes,
  --rules, --upstream-timeout), lazy CA load via forge.LoadCA.
- cmd/sbxwaf/main_test.go: TestProxyPassthrough (200+body+header) and
  TestProxyUnmapped (421) — both green; go build ./... clean.
2026-06-26 12:41:51 +02:00
dacafcfdee refactor(toolbox-ng): extract internal/reload (ref #744)
Extract the mtime hot-reload pattern from cmd/sbxmitm/policy.go into a
generic, Policy-agnostic internal/reload package (Target/Watcher/StatMtime/
LoadLines). policy.go rewired to use reload.Watcher; private reloadTarget
struct, maybeReload body, statMtime, scanLines/loadLines/loadLinesRaw removed.
Policy retains its own throttle gate (reloadThrottle/reloadMu) so existing
reload_test.go field mutations compile unchanged; the Watcher runs with
throttle=0 and is gated by Policy.maybeReload().

7 new tests in internal/reload (basic, throttle, stat, strip-comments,
missing-file, multi-target, concurrent/race). All parity fixtures + reload
tests green: go test ./internal/reload/ ./cmd/sbxmitm/ -count=1 -race PASS.
2026-06-26 12:36:50 +02:00
e47cd115fd refactor(toolbox-ng): extract internal/relay (ref #744)
Move emit/emitSync/emitTimeout from cmd/sbxmitm/sidecar.go into the new
package internal/relay as Emit/EmitSync/EmitTimeout, so cmd/sbxwaf can
reuse the fire-and-forget unix-socket POST transport without duplication.

- internal/relay/relay.go: Emit (detached goroutine), EmitSync (2s timeout,
  synchronous), EmitTimeout const — pure stdlib, no new go.sum entries
- internal/relay/relay_test.go: TDD tests — unix echo server, asserts
  request-line "POST /ingest HTTP/1.1" + exact body
- cmd/sbxmitm/relay.go: relayEmit now calls relay.Emit
- cmd/sbxmitm/sidecar.go: declarations removed, retained as doc/comment file
- cmd/sbxmitm/sidecar_test.go: rewired to relay.EmitSync/relay.Emit/relay.EmitTimeout

go test ./internal/relay/ ./cmd/sbxmitm/ -count=1: PASS (both packages)
go build ./...: clean
2026-06-26 12:28:29 +02:00
18e625fd88 refactor(toolbox-ng): extract internal/httpcodec (ref #744)
Move gzip/br/zstd codec primitives from cmd/sbxmitm/gzip.go into a new
shared package internal/httpcodec (GunzipBytes, GzipBytes, UnbrotliBytes,
BrotliBytes, UnzstdBytes, ZstdBytes + Decode/Encode dispatchers).
Rewire cmd/sbxmitm/gzip.go and all three test files to call httpcodec.*.
No behaviour change; cmd/sbxmitm still builds and all tests pass.
2026-06-26 12:21:40 +02:00
8e1f8f2155 refactor(toolbox-ng): extract internal/forge from sbxmitm (ref #744)
Move CA, loadCA→LoadCA, forge→Forge, firstPEMBlock, parseKey from
cmd/sbxmitm/main.go into a new shared package internal/forge so that
the future cmd/sbxwaf can reuse them without duplication.

No behaviour change: cmd/sbxmitm wires in forge.LoadCA / ca.Forge;
ca.cert (unexported) becomes forge.CA.Cert (exported, needed by tests
and future callers). Both suites green:
  go test ./internal/forge/ ./cmd/sbxmitm/ -count=1
2026-06-26 12:12:50 +02:00
ccf6d45a08 docs(plan): WAF→Go sbxwaf implementation plan, 10 phases TDD (ref #744) 2026-06-26 12:04:25 +02:00
6ec92bd29d docs(spec): WAF→Go bench targets to >5×/p99<⅓, blocking go/no-go (ref #744) 2026-06-26 11:52:44 +02:00
0b2094f43f docs(spec): WAF→Go sbxwaf host-native replacement design (ref #744)
Brainstorming-validated design: perf-driven complete replacement of the WAF
mitmproxy/mitmdump inspection layer by a dedicated host-native Go binary sbxwaf,
sharing an extracted core with sbxmitm. Covers architecture, component isolation,
full feature port (routing/rules/ban/CrowdSec/cookie-audit/media-cache/error
pages), host-native hardening, and shadow→parity→cutover→rollback migration.
2026-06-26 09:31:59 +02:00
61 changed files with 10604 additions and 517 deletions


@@ -0,0 +1,358 @@
# sbxwaf — WAF Go host-native — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace the Python mitmproxy WAF inspection layer with a host-native Go binary `sbxwaf` that reverse-proxies HAProxy traffic to backend vhosts while inspecting/blocking/banning, at >5× the throughput.
**Architecture:** A new `cmd/sbxwaf` in the existing `secubox-toolbox-ng` Go module, reusing a freshly-extracted shared core (`internal/forge`, `internal/relay`, `internal/httpcodec`, `internal/reload`) shared with `cmd/sbxmitm`. Net/http reverse proxy: route vhost→backend via `haproxy-routes.json`, regex WAF rules from `waf-rules.json`, sliding-window graduated ban, CrowdSec LAPI bridge, cookie-audit JSONL, media-cache, synthetic error pages. Migration is shadow→parity→cutover→rollback.
**Tech Stack:** Go 1.22 (stdlib net/http, crypto/tls, regexp), brotli/zstd (already deps), systemd, AppArmor. Spec: `docs/superpowers/specs/2026-06-26-waf-go-sbxwaf-design.md`.
## Global Constraints
- Go module: `github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng`, Go 1.22, stdlib-first (no new deps beyond brotli/zstd already present).
- Binary: `/usr/sbin/sbxwaf`; workers `secubox-waf-ng-worker@1..2`; user/group `secubox-waf` (non-priv, created in postinst).
- CA at `/etc/secubox/waf/ca/` (cert `ca-cert.pem`, key `ca.pem`); secrets `/etc/secubox/secrets/` chmod 600 owner `secubox-waf`.
- Listen `:8080` (worker `:808%i`); HAProxy backend `mitmproxy_waf` flips `server waf` IP from LXC to host on cutover.
- Routes file `/data/mitmproxy/haproxy-routes.json` → migrate to `/etc/secubox/waf/haproxy-routes.json`; rules `/etc/secubox/waf/waf-rules.json`; threat log `/var/log/secubox/waf-threats.log`; audit `/var/log/secubox/audit.log` (append-only).
- Bench go/no-go (BLOCKING): `>5× req/s·core`, `p99 < ⅓`, `RSS < ¼` vs mitmproxy 4-workers.
- Parity vs `secubox_waf.py` is BLOCKING: no detection regression. Source of truth: `packages/secubox-mitmproxy/addons/secubox_waf.py` (930 lines), `cookie_audit.py`, `media_cache.py`.
- Hardening: `NoNewPrivileges`, `ProtectSystem=strict` + minimal `ReadWritePaths`, drop caps, `RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX`, AppArmor enforce profile in `debian/`.
- SPDX header `LicenseRef-CMSD-1.0` on every new file (per `.claude/CLAUDE.md`); commit messages end without Claude footer.
---
## Phase 0 — Shared core extraction (refactor, no behaviour change)
Extract reusable primitives from `cmd/sbxmitm` into `internal/` packages consumed by BOTH cmds. After each task `cmd/sbxmitm` must still build + pass its tests (no behaviour change).
### Task 0.1: Extract `internal/forge` (CA + leaf forge)
**Files:**
- Create: `packages/secubox-toolbox-ng/internal/forge/forge.go`
- Create: `packages/secubox-toolbox-ng/internal/forge/forge_test.go`
- Modify: `packages/secubox-toolbox-ng/cmd/sbxmitm/main.go` (remove `CA`, `loadCA`, `forge`, `firstPEMBlock`, `parseKey`; import + alias `forge.CA`)
**Interfaces:**
- Produces: `forge.CA` struct; `forge.LoadCA(certPath, keyPath string) (*forge.CA, error)`; `(*forge.CA).Forge(host string) (*tls.Certificate, error)`. (Exported names: `LoadCA`, `Forge` — capitalised from the current unexported `loadCA`/`forge`.)
- [ ] **Step 1: Write the failing test** (`forge_test.go`): generate a self-signed CA, `LoadCA` from temp PEM files, `Forge("example.com")`, assert the returned leaf chains to the CA (`leaf.CheckSignatureFrom(ca.cert)`) and `Forge` is cached (same pointer on second call).
```go
func TestForgeChainsAndCaches(t *testing.T) {
	dir := t.TempDir()
	certPath, keyPath := writeTestCA(t, dir) // helper mints a CA, writes PEMs
	ca, err := LoadCA(certPath, keyPath)
	if err != nil { t.Fatalf("LoadCA: %v", err) }
	c1, err := ca.Forge("example.com")
	if err != nil { t.Fatalf("Forge: %v", err) }
	if c1.Leaf.DNSNames[0] != "example.com" { t.Fatalf("CN/SAN wrong: %v", c1.Leaf.DNSNames) }
	c2, _ := ca.Forge("example.com")
	if c1 != c2 { t.Fatalf("Forge not cached") }
}
```
- [ ] **Step 2: Run test, verify it fails**: `go test ./internal/forge/ -run TestForgeChainsAndCaches -v` → FAIL (package/symbols undefined).
- [ ] **Step 3: Move the code** — cut `CA`, `loadCA`→`LoadCA`, `forge`→`Forge`, `firstPEMBlock`, `parseKey` from `cmd/sbxmitm/main.go` (lines ~45-155) into `internal/forge/forge.go`, package `forge`, capitalise the two exported names, add SPDX header. Add `writeTestCA` helper in the test.
- [ ] **Step 4: Rewire sbxmitm** — in `cmd/sbxmitm`, replace `loadCA(` → `forge.LoadCA(`, `px.ca.forge(` → `px.ca.Forge(`, change `ca *CA` field type to `ca *forge.CA`, add import.
- [ ] **Step 5: Run both test suites**: `go test ./internal/forge/ ./cmd/sbxmitm/ -count=1` → PASS.
- [ ] **Step 6: Commit**: `git commit -am "refactor(toolbox-ng): extract internal/forge from sbxmitm (ref #744)"`.
### Task 0.2: Extract `internal/httpcodec` (gzip/br/zstd)
**Files:**
- Create: `packages/secubox-toolbox-ng/internal/httpcodec/codec.go`
- Create: `packages/secubox-toolbox-ng/internal/httpcodec/codec_test.go`
- Modify: `cmd/sbxmitm/gzip.go` (remove the moved funcs; keep `injectIntoBody`/`injectHTML` which are sbxmitm-specific but call `httpcodec.*`)
**Interfaces:**
- Produces: `httpcodec.GunzipBytes([]byte)([]byte,error)`, `GzipBytes([]byte)[]byte`, `UnbrotliBytes`, `BrotliBytes`, `UnzstdBytes`, `ZstdBytes` (capitalised). `httpcodec.Decode(encoding string, body []byte)([]byte,error)` and `httpcodec.Encode(encoding string, body []byte)([]byte,error)` convenience dispatchers (encoding ∈ "",gzip,br,zstd; "" = identity passthrough).
- [ ] **Step 1: Write failing test** — round-trip each codec: `Encode("gzip", b)` then `GunzipBytes` returns `b`; a 33 MiB stream decodes to error (bomb cap); unknown encoding via `Decode("deflate", b)` returns error.
```go
func TestCodecRoundTrip(t *testing.T) {
	for _, enc := range []string{"gzip", "br", "zstd"} {
		in := []byte("<html>hello</html>")
		comp, err := Encode(enc, in)
		if err != nil { t.Fatalf("Encode %s: %v", enc, err) }
		out, err := Decode(enc, comp)
		if err != nil || string(out) != string(in) { t.Fatalf("%s round-trip: %v %q", enc, err, out) }
	}
}
```
- [ ] **Step 2: Run, verify fail**: `go test ./internal/httpcodec/ -v` → FAIL.
- [ ] **Step 3: Move code** — move `gunzipBytes`/`gzipBytes`/`unbrotliBytes`/`brotliBytes`/`unzstdBytes`/`zstdBytes`/`readCapped`/`gunzipCap`/`errString`/`errGunzipTooLarge` from `gzip.go` into `internal/httpcodec/codec.go`, capitalise the byte funcs, add `Decode`/`Encode` dispatchers. SPDX header.
- [ ] **Step 4: Rewire sbxmitm**: `gzip.go`'s `injectIntoBody` switch calls `httpcodec.GunzipBytes`/`GzipBytes`/etc.
- [ ] **Step 5: Run**: `go test ./internal/httpcodec/ ./cmd/sbxmitm/ -count=1` → PASS.
- [ ] **Step 6: Commit**: `refactor(toolbox-ng): extract internal/httpcodec (ref #744)`.
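As a rough illustration of the `Decode`/`Encode` dispatchers in Task 0.2, here is a stdlib-only sketch that handles the identity and gzip encodings and applies a decompression cap. It is not the planned `codec.go`: the lowercase names are hypothetical, `br` and `zstd` are left out because they need the module's third-party dependencies, and the cap is passed as a parameter purely for the demo (the plan's 33 MiB test case suggests a fixed cap near 32 MiB, which is an inference).

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// errTooLarge signals that the decoded body exceeded the bomb cap.
var errTooLarge = errors.New("decoded body exceeds cap")

// encode dispatches on the Content-Encoding token; "" is identity.
func encode(encoding string, body []byte) ([]byte, error) {
	switch encoding {
	case "":
		return body, nil
	case "gzip":
		var buf bytes.Buffer
		zw := gzip.NewWriter(&buf)
		if _, err := zw.Write(body); err != nil {
			return nil, err
		}
		if err := zw.Close(); err != nil {
			return nil, err
		}
		return buf.Bytes(), nil
	}
	return nil, fmt.Errorf("unsupported encoding %q", encoding)
}

// decode reverses encode and refuses output larger than limit bytes.
func decode(encoding string, body []byte, limit int64) ([]byte, error) {
	switch encoding {
	case "":
		return body, nil
	case "gzip":
		zr, err := gzip.NewReader(bytes.NewReader(body))
		if err != nil {
			return nil, err
		}
		defer zr.Close()
		// Read one byte past the cap so an oversize stream is detectable.
		out, err := io.ReadAll(io.LimitReader(zr, limit+1))
		if err != nil {
			return nil, err
		}
		if int64(len(out)) > limit {
			return nil, errTooLarge
		}
		return out, nil
	}
	return nil, fmt.Errorf("unsupported encoding %q", encoding)
}

func main() {
	comp, _ := encode("gzip", []byte("<html>hello</html>"))
	out, err := decode("gzip", comp, 32<<20)
	fmt.Printf("%q %v\n", out, err)
	_, err = decode("deflate", comp, 32<<20)
	fmt.Println(err)
}
```

Reading `limit+1` bytes through `io.LimitReader` bounds memory use while still distinguishing a body that is exactly at the cap from one that exceeds it.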
### Task 0.3: Extract `internal/relay` (async unix-socket POST)
**Files:**
- Create: `packages/secubox-toolbox-ng/internal/relay/relay.go` (+ `relay_test.go`)
- Modify: `cmd/sbxmitm/relay.go`, `cmd/sbxmitm/sidecar.go`
**Interfaces:**
- Produces: `relay.Emit(socketPath, route string, payload []byte)` (fire-and-forget), `relay.EmitSync(socketPath, route string, payload []byte) error` (2 s timeout, test-observable). sbxmitm keeps its event-builder funcs but calls `relay.Emit`.
- [ ] **Step 1: Failing test** — spin a `net.Listen("unix", …)` echo server; `EmitSync` posts a payload; assert the server received `POST <route>` with the body.
- [ ] **Step 2: Verify fail**: `go test ./internal/relay/ -v` → FAIL.
- [ ] **Step 3: Move** `emit`→`Emit`, `emitSync`→`EmitSync`, `emitTimeout` from `sidecar.go` into `internal/relay/relay.go`. SPDX. Leave the dpi/cookies/ja4 builders in sbxmitm (they call `relay.Emit`).
- [ ] **Step 4: Rewire** sbxmitm callers.
- [ ] **Step 5: Run**: `go test ./internal/relay/ ./cmd/sbxmitm/ -count=1` → PASS.
- [ ] **Step 6: Commit**: `refactor(toolbox-ng): extract internal/relay (ref #744)`.
### Task 0.4: Extract `internal/reload` (mtime hot-reload pattern)
**Files:**
- Create: `packages/secubox-toolbox-ng/internal/reload/reload.go` (+ test)
- Modify: `cmd/sbxmitm/policy.go` (use `reload.Watcher`)
**Interfaces:**
- Produces: a generic watcher decoupled from `Policy`:
```go
type Target struct { Path string; LastMtime int64; Load func(path string) any; Apply func(v any) }
type Watcher struct { /* throttle + mu */ }
func NewWatcher(throttle time.Duration, targets ...Target) *Watcher
func (w *Watcher) Maybe() // stat each target; on mtime change, Load then Apply under the caller's swap
func StatMtime(path string) int64
func LoadLines(path string, stripComments bool) map[string]bool
```
- [ ] **Step 1: Failing test** — write a temp file, register a `Target` whose `Apply` stores into a captured var; call `Maybe()`, mutate the file + bump mtime, call `Maybe()` again, assert the var updated; assert throttle suppresses a same-second re-stat.
- [ ] **Step 2: Verify fail**: `go test ./internal/reload/ -v` → FAIL.
- [ ] **Step 3: Implement** the watcher generically (port `maybeReload` throttle+stat loop, `statMtime`, `scanLines`/`loadLines`). SPDX.
- [ ] **Step 4: Rewire** `policy.go` to build `reload.Target`s (keep `Policy.Decide` semantics identical).
- [ ] **Step 5: Run**: `go test ./internal/reload/ ./cmd/sbxmitm/ -count=1` → PASS (parity fixtures still green).
- [ ] **Step 6: Commit**: `refactor(toolbox-ng): extract internal/reload (ref #744)`.
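One possible shape for the Task 0.4 watcher is sketched below. The exported names follow the interface listing in this task, but the bodies are assumptions: in particular, `StatMtime` returning nanoseconds (and 0 on error) and `Maybe` holding a single mutex for the whole pass are choices of this sketch. `LoadLines` is omitted, and `demoReload` is a hypothetical helper that exercises the write, touch, reload cycle.

```go
package main

import (
	"fmt"
	"os"
	"sync"
	"time"
)

// Target mirrors the field list given in the interface sketch above.
type Target struct {
	Path      string
	LastMtime int64
	Load      func(path string) any
	Apply     func(v any)
}

// Watcher re-stats its targets at most once per throttle interval.
type Watcher struct {
	mu       sync.Mutex
	throttle time.Duration
	lastStat time.Time
	targets  []*Target
}

func NewWatcher(throttle time.Duration, targets ...Target) *Watcher {
	w := &Watcher{throttle: throttle}
	for i := range targets {
		w.targets = append(w.targets, &targets[i])
	}
	return w
}

// StatMtime returns the file mtime in nanoseconds, or 0 when stat fails.
func StatMtime(path string) int64 {
	fi, err := os.Stat(path)
	if err != nil {
		return 0
	}
	return fi.ModTime().UnixNano()
}

// Maybe reloads every target whose mtime changed since the last check.
func (w *Watcher) Maybe() {
	w.mu.Lock()
	defer w.mu.Unlock()
	now := time.Now()
	if w.throttle > 0 && now.Sub(w.lastStat) < w.throttle {
		return
	}
	w.lastStat = now
	for _, t := range w.targets {
		m := StatMtime(t.Path)
		if m == 0 || m == t.LastMtime {
			continue
		}
		t.LastMtime = m
		t.Apply(t.Load(t.Path))
	}
}

// demoReload writes a file, loads it, rewrites it with a bumped mtime,
// and returns the value seen after each Maybe call.
func demoReload() (first, second string, err error) {
	f, err := os.CreateTemp("", "reload-*.txt")
	if err != nil {
		return "", "", err
	}
	path := f.Name()
	f.Close()
	defer os.Remove(path)
	if err = os.WriteFile(path, []byte("v1"), 0o600); err != nil {
		return "", "", err
	}
	var current string
	w := NewWatcher(0, Target{
		Path:  path,
		Load:  func(p string) any { b, _ := os.ReadFile(p); return string(b) },
		Apply: func(v any) { current = v.(string) },
	})
	w.Maybe()
	first = current
	if err = os.WriteFile(path, []byte("v2"), 0o600); err != nil {
		return "", "", err
	}
	future := time.Now().Add(2 * time.Second)
	if err = os.Chtimes(path, future, future); err != nil {
		return "", "", err
	}
	w.Maybe()
	return first, current, nil
}

func main() {
	fmt.Println(demoReload())
}
```

Bumping the mtime explicitly with `os.Chtimes` is the same trick the plan's test step relies on, since two writes within one timestamp granule would otherwise look unchanged.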
---
## Phase 1 — sbxwaf skeleton + vhost routing
### Task 1.1: cmd/sbxwaf skeleton + flags + listener
**Files:**
- Create: `packages/secubox-toolbox-ng/cmd/sbxwaf/main.go` (+ `main_test.go`)
**Interfaces:**
- Produces: `type Server struct { ca *forge.CA; routes *Routes; rules *Rules; ban *Ban; … }`; `func (s *Server) handler() http.Handler`; flags `--listen :8080`, `--ca-cert`, `--ca-key`, `--routes`, `--rules`, `--upstream-timeout`.
- [ ] **Step 1: Failing test** `httptest`-drive `s.handler()` with a minimal `Server` (nil rules/ban) and one route to a stub backend; assert a request to a mapped Host is proxied (200, body echoed) and the response carries `X-SecuBox-WAF: inspected`.
- [ ] **Step 2: Verify fail** `go test ./cmd/sbxwaf/ -run TestProxyPassthrough -v` → FAIL.
- [ ] **Step 3: Implement** `main.go`: flag parsing, `forge.LoadCA`, build `Server`, an `http.HandlerFunc` that (a) looks up `req.Host` in routes, (b) reverse-proxies via `httputil.NewSingleHostReverseProxy`-style director to the backend `ip:port`, (c) adds the response header. `http.Server{Addr, Handler}` with `ReadHeaderTimeout`. SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): reverse-proxy skeleton + listener (ref #744)`.
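The passthrough behaviour that Step 1 tests could look roughly like this. It is a sketch, not the planned `Server` type: `newHandler` and `probe` are hypothetical names, routes are a plain `map[string]string`, and a proxy is built per request for brevity (a real handler would reuse one). The 421 branch anticipates what Task 1.2 specifies for unmapped hosts.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newHandler proxies requests whose Host is in routes (host -> "ip:port")
// and tags every proxied response with X-SecuBox-WAF: inspected.
func newHandler(routes map[string]string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		backend, ok := routes[r.Host]
		if !ok {
			http.Error(w, "Misdirected", http.StatusMisdirectedRequest) // 421
			return
		}
		rp := httputil.NewSingleHostReverseProxy(&url.URL{Scheme: "http", Host: backend})
		rp.ModifyResponse = func(resp *http.Response) error {
			resp.Header.Set("X-SecuBox-WAF", "inspected")
			return nil
		}
		rp.ServeHTTP(w, r)
	})
}

// probe starts a stub backend, sends one request for host through the
// handler, and returns the status code, the marker header and the body.
func probe(host string) (int, string, string) {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "hello from backend")
	}))
	defer backend.Close()
	h := newHandler(map[string]string{"app.example.com": backend.Listener.Addr().String()})

	req := httptest.NewRequest("GET", "http://"+host+"/", nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("X-SecuBox-WAF"), rec.Body.String()
}

func main() {
	fmt.Println(probe("app.example.com"))
	fmt.Println(probe("unknown.example.com"))
}
```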
### Task 1.2: Routes loader with hot-reload + 421 on unmapped
**Files:**
- Create: `packages/secubox-toolbox-ng/cmd/sbxwaf/routes.go` (+ test)
**Interfaces:**
- Consumes: `reload.Watcher`, `reload.StatMtime`.
- Produces: `type Routes struct{…}`; `func LoadRoutes(path string) *Routes`; `func (r *Routes) Lookup(host string) (ip string, port int, ok bool)`; hot-reloads on mtime change. JSON shape: `{"domain": ["ip", port]}` (matches `haproxy-routes.json`).
- [ ] **Step 1: Failing test** — write a routes JSON, `LoadRoutes`, `Lookup("gitea.example.com")` → `("127.0.0.1", 3000, true)`; unknown host → `ok=false`; rewrite file + bump mtime, `Maybe()`, assert new route visible.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** loader (parse `map[string][2]json.RawMessage` or `map[string][]any`), RW-locked map, `reload.Target` wiring. In `main.go` handler: unmapped host → `http.Error(w, "Misdirected", 421)`.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): haproxy-routes.json loader + hot-reload + 421 (ref #744)`.
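To make the `{"domain": ["ip", port]}` shape concrete, here is a small parsing sketch using the `map[string][]any` option from Step 3. `ParseRoutes` (bytes in, no file I/O) is a hypothetical stand-in for `LoadRoutes`; skipping malformed entries and lower-casing hosts are assumptions, and hot-reload plus stripping a `:port` suffix from the Host header are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"sync"
)

type backend struct {
	ip   string
	port int
}

// Routes is a read-mostly host -> backend table guarded by an RWMutex.
type Routes struct {
	mu sync.RWMutex
	m  map[string]backend
}

// ParseRoutes decodes {"domain": ["ip", port]}; malformed entries are skipped.
func ParseRoutes(data []byte) (*Routes, error) {
	var raw map[string][]any
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	m := make(map[string]backend, len(raw))
	for host, pair := range raw {
		if len(pair) != 2 {
			continue
		}
		ip, okIP := pair[0].(string)
		port, okPort := pair[1].(float64) // encoding/json decodes numbers as float64
		if !okIP || !okPort {
			continue
		}
		m[strings.ToLower(host)] = backend{ip: ip, port: int(port)}
	}
	return &Routes{m: m}, nil
}

// Lookup resolves a host name (case-insensitively) to its backend.
func (r *Routes) Lookup(host string) (ip string, port int, ok bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	b, ok := r.m[strings.ToLower(host)]
	return b.ip, b.port, ok
}

func main() {
	r, err := ParseRoutes([]byte(`{"gitea.example.com": ["127.0.0.1", 3000]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Lookup("gitea.example.com"))
	fmt.Println(r.Lookup("unknown.example.com"))
}
```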
---
## Phase 2 — WAF rule engine
### Task 2.1: Rule compilation from waf-rules.json
**Files:**
- Create: `packages/secubox-toolbox-ng/cmd/sbxwaf/rules.go` (+ test)
- Reference (port logic, do NOT import): `packages/secubox-mitmproxy/addons/secubox_waf.py` — the pattern categories + compiled regex (SQLi/XSS/LFI/RCE), `waf-rules.json` shape (categories, enabled, severity).
**Interfaces:**
- Produces: `type Rules struct{…}`; `func LoadRules(path string) *Rules`; `func (r *Rules) Match(method, path, query, body, ua string) (cat string, sev string, hit bool)`; hot-reload via `reload`.
- [ ] **Step 1: Failing test** — load a rules JSON with one SQLi pattern (`(?i)union\s+select`); `Match("GET","/x","id=1 UNION SELECT","","")` → `("sqli","high",true)`; a benign request → `hit=false`.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — parse categories, `regexp.MustCompile` each enabled pattern at load (skip disabled), match across method/path/query/body/UA; first hit wins (mirror Python order). SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): regex WAF rule engine from waf-rules.json (ref #744)`.
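A compact sketch of the compile-then-match flow described in Task 2.1. The JSON layout below (`categories` array with `name`/`enabled`/`severity`/`patterns`) is a hypothetical stand-in, since the real `waf-rules.json` schema lives in `secubox_waf.py`; the sketch also uses `regexp.Compile` and returns an error rather than `regexp.MustCompile`, so a bad pattern in a hot-reloaded file cannot panic the worker.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// ruleFile is a HYPOTHETICAL schema; an array keeps category order stable.
type ruleFile struct {
	Categories []struct {
		Name     string   `json:"name"`
		Enabled  bool     `json:"enabled"`
		Severity string   `json:"severity"`
		Patterns []string `json:"patterns"`
	} `json:"categories"`
}

type compiled struct {
	cat, sev string
	re       *regexp.Regexp
}

// Rules holds the enabled patterns in file order.
type Rules struct{ rules []compiled }

// ParseRules compiles every enabled pattern once, at load time.
func ParseRules(data []byte) (*Rules, error) {
	var f ruleFile
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, err
	}
	r := &Rules{}
	for _, c := range f.Categories {
		if !c.Enabled {
			continue // disabled categories are never compiled
		}
		for _, p := range c.Patterns {
			re, err := regexp.Compile(p)
			if err != nil {
				return nil, fmt.Errorf("category %s: %w", c.Name, err)
			}
			r.rules = append(r.rules, compiled{cat: c.Name, sev: c.Severity, re: re})
		}
	}
	return r, nil
}

// Match checks each rule against every request field; the first hit wins.
func (r *Rules) Match(method, path, query, body, ua string) (cat, sev string, hit bool) {
	fields := []string{method, path, query, body, ua}
	for _, c := range r.rules {
		for _, f := range fields {
			if c.re.MatchString(f) {
				return c.cat, c.sev, true
			}
		}
	}
	return "", "", false
}

func main() {
	r, err := ParseRules([]byte(`{"categories":[{"name":"sqli","enabled":true,"severity":"high","patterns":["(?i)union\\s+select"]}]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Match("GET", "/x", "id=1 UNION SELECT", "", ""))
	fmt.Println(r.Match("GET", "/x", "id=1", "", ""))
}
```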
### Task 2.2: Request inspection wiring + skip-lists
**Files:**
- Modify: `cmd/sbxwaf/main.go` (inspection in the handler); `cmd/sbxwaf/rules.go` (skip helpers)
**Interfaces:**
- Produces: `func staticAsset(path string) bool` (`.js/.css/.png/...`, `/health`, `/status`); `func ncBypass(path string) bool` (`/index.php/login/v2/`, `/ocs/v2.php/core/login`); `func privateCIDR(ip string) bool` (RFC1918 + loopback).
- [ ] **Step 1: Failing test** — handler: a request with `?q=<script>` from a public IP is blocked (403) unless `staticAsset`; a request from `192.168.x` is never blocked; `/health` skips inspection.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — read client IP from `X-Forwarded-For`/`RemoteAddr`; if `privateCIDR` → skip; if `staticAsset`/`ncBypass` → skip; else read body (capped), `rules.Match`; on hit hand to ban (Task 3). Add `Connection: close` (#496).
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): request inspection + CIDR/static/NC skip-lists (ref #744)`.
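Two of the skip helpers from Task 2.2 are easy to sketch with the standard library. The extension set below contains only the three extensions the plan names explicitly (the elided rest is not guessed), and restricting the private check to IPv4 RFC1918 ranges plus loopback is an interpretation of "RFC1918 + loopback", not confirmed behaviour of the Python addon.

```go
package main

import (
	"fmt"
	"net/netip"
	"path"
	"strings"
)

// privateCIDR reports whether ip is loopback or an RFC1918 IPv4 address.
// Unparseable input is treated as public, so it is still inspected.
func privateCIDR(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	addr = addr.Unmap() // treat ::ffff:192.168.1.1 like 192.168.1.1
	return addr.IsLoopback() || (addr.Is4() && addr.IsPrivate())
}

// staticExt lists only the extensions named in the plan (assumption: the
// real list is longer).
var staticExt = map[string]bool{".js": true, ".css": true, ".png": true}

// staticAsset reports whether the request path skips inspection.
func staticAsset(p string) bool {
	if p == "/health" || p == "/status" {
		return true
	}
	return staticExt[strings.ToLower(path.Ext(p))]
}

func main() {
	for _, ip := range []string{"192.168.1.10", "127.0.0.1", "8.8.8.8"} {
		fmt.Println(ip, privateCIDR(ip))
	}
	for _, p := range []string{"/assets/app.js", "/health", "/index.php"} {
		fmt.Println(p, staticAsset(p))
	}
}
```

Note that if the client IP is taken from `X-Forwarded-For`, the whitelist is only as trustworthy as the proxy that sets that header.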
---
## Phase 3 — Graduated ban (sliding window)
### Task 3.1: Sliding-window ban state
**Files:**
- Create: `cmd/sbxwaf/ban.go` (+ test)
**Interfaces:**
- Produces: `type Ban struct{…}`; `func NewBan(window time.Duration, threshold int) *Ban`; `func (b *Ban) Record(ip string, nowUnix int64) (count int, banned bool)` (count within window; `banned` true once `count >= threshold`). Mirrors `BAN_THRESHOLD=3`/`300s`.
- [ ] **Step 1: Failing test** `NewBan(300s, 3)`; 2 `Record` at t=0 → `banned=false`; 3rd → `banned=true`; a 4th at t=400 (window expired) → count resets, `banned=false`.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** `map[string][]int64` of hit timestamps, lock-guarded; prune entries older than `now-window` on each `Record`; cap map size. SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): sliding-window graduated ban (ref #744)`.
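The sliding-window state from Task 3.1 fits in a few lines. This sketch follows the signatures given above; the map-size cap from Step 3 is omitted, `demoWindow` is a hypothetical constant, and treating a hit exactly `window` seconds old as expired is an assumption.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// demoWindow mirrors the 300 s window mentioned in the plan.
const demoWindow = 300 * time.Second

// Ban keeps, per IP, the Unix timestamps of hits still inside the window.
type Ban struct {
	mu        sync.Mutex
	window    int64 // seconds
	threshold int
	hits      map[string][]int64
}

func NewBan(window time.Duration, threshold int) *Ban {
	return &Ban{
		window:    int64(window / time.Second),
		threshold: threshold,
		hits:      make(map[string][]int64),
	}
}

// Record adds a hit for ip at nowUnix, prunes hits that fell out of the
// window, and reports the in-window count and whether it reached threshold.
func (b *Ban) Record(ip string, nowUnix int64) (count int, banned bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	cutoff := nowUnix - b.window
	kept := b.hits[ip][:0] // filter in place, reusing the backing array
	for _, t := range b.hits[ip] {
		if t > cutoff {
			kept = append(kept, t)
		}
	}
	kept = append(kept, nowUnix)
	b.hits[ip] = kept
	return len(kept), len(kept) >= b.threshold
}

func main() {
	b := NewBan(demoWindow, 3)
	for _, now := range []int64{0, 0, 0, 400} {
		count, banned := b.Record("203.0.113.7", now)
		fmt.Println(now, count, banned)
	}
}
```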
### Task 3.2: WARNING/BAN responses + threat log
**Files:**
- Modify: `cmd/sbxwaf/main.go`; Create: `cmd/sbxwaf/threatlog.go` (+ test)
**Interfaces:**
- Produces: `func writeWarning(w http.ResponseWriter, cat string)`, `func writeBan(w http.ResponseWriter)`; `type ThreatLog struct{…}`, `func (l *ThreatLog) Record(ip, cat, sev, action, path string)` → append JSON line to `/var/log/secubox/waf-threats.log`.
- [ ] **Step 1: Failing test** — on first hit handler returns 403 with a WARNING marker; on the 3rd hit returns 403 BAN; `ThreatLog.Record` appends a parseable JSON line with the action.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — wire `ban.Record` result into WARNING vs BAN; styled 403 bodies (port templates); append-only threat log (O_APPEND). SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): graduated WARNING/BAN responses + threat log (ref #744)`.
---
## Phase 4 — CrowdSec LAPI bridge
### Task 4.1: CrowdSec alert POST
**Files:**
- Create: `cmd/sbxwaf/crowdsec.go` (+ test)
- Reference: `secubox_waf.py` lines 710-765 (LAPI `/v1/alerts` JWT payload shape).
**Interfaces:**
- Produces: `type CrowdSec struct{ lapiURL, jwt string; client *http.Client }`; `func (c *CrowdSec) Alert(ip, scenario string) error` (fire-and-forget wrapper `AlertAsync`). On ban, post the alert.
- [ ] **Step 1: Failing test** `httptest` server asserting it receives a POST `/v1/alerts` with `Authorization: Bearer <jwt>` and a JSON body containing the source IP + scenario; `Alert` returns nil on 200/201.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — build the LAPI alert JSON (port the Python payload fields), POST with JWT, 2 s timeout; `AlertAsync` swallows errors (log only). SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): CrowdSec LAPI alert bridge on ban (ref #744)`.
---
## Phase 5 — Cookie-audit
### Task 5.1: Set-Cookie JSONL ledger
**Files:**
- Create: `cmd/sbxwaf/cookieaudit.go` (+ test)
- Reference: `packages/secubox-mitmproxy/addons/cookie_audit.py`.
**Interfaces:**
- Produces: `type CookieAudit struct{…}`; `func (a *CookieAudit) Record(host string, resp *http.Response)` → for each `Set-Cookie`, parse attrs, SHA256-hash the value, append JSONL to `/var/log/secubox/cookie-audit/server.jsonl`. Async (channel + writer goroutine).
- [ ] **Step 1: Failing test** — feed a response with two `Set-Cookie` headers; assert two JSONL records appear with `name/domain/path/secure/httponly/samesite` and a hashed (not raw) value.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — parse via `http.Response.Cookies()`, `sha256` the value, buffered channel → single writer goroutine (O_APPEND), never block the request path. SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): RGPD cookie-audit JSONL ledger (ref #744)`.
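The per-cookie record from Task 5.1 might be built like this. The JSON field names (including `value_sha256`) and the helpers `cookieLines`/`leaksRaw` are hypothetical; the async channel and the file writer are left out so the sketch stays focused on parsing and hashing.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// cookieRecord is one JSONL row; the raw cookie value is never stored.
type cookieRecord struct {
	Host     string `json:"host"`
	Name     string `json:"name"`
	Domain   string `json:"domain"`
	Path     string `json:"path"`
	Secure   bool   `json:"secure"`
	HTTPOnly bool   `json:"httponly"`
	SameSite string `json:"samesite"`
	ValueSHA string `json:"value_sha256"`
}

func sameSiteName(s http.SameSite) string {
	switch s {
	case http.SameSiteLaxMode:
		return "lax"
	case http.SameSiteStrictMode:
		return "strict"
	case http.SameSiteNoneMode:
		return "none"
	}
	return ""
}

// cookieLines parses the given Set-Cookie header values with net/http and
// returns one JSON line per cookie, carrying only the SHA-256 of the value.
func cookieLines(host string, setCookie ...string) []string {
	resp := &http.Response{Header: http.Header{}}
	for _, v := range setCookie {
		resp.Header.Add("Set-Cookie", v)
	}
	var out []string
	for _, c := range resp.Cookies() {
		sum := sha256.Sum256([]byte(c.Value))
		line, err := json.Marshal(cookieRecord{
			Host: host, Name: c.Name, Domain: c.Domain, Path: c.Path,
			Secure: c.Secure, HTTPOnly: c.HttpOnly,
			SameSite: sameSiteName(c.SameSite),
			ValueSHA: hex.EncodeToString(sum[:]),
		})
		if err != nil {
			continue
		}
		out = append(out, string(line))
	}
	return out
}

// leaksRaw reports whether any line still contains the raw cookie value.
func leaksRaw(lines []string, raw string) bool {
	for _, l := range lines {
		if strings.Contains(l, raw) {
			return true
		}
	}
	return false
}

func main() {
	for _, l := range cookieLines("app.example.com",
		"sid=s3cr3t; Path=/; Secure; HttpOnly; SameSite=Lax",
		"theme=dark; Domain=example.com") {
		fmt.Println(l)
	}
}
```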
---
## Phase 6 — Media-cache
### Task 6.1: Response media cache
**Files:**
- Create: `cmd/sbxwaf/mediacache.go` (+ test)
- Reference: `packages/secubox-mitmproxy/addons/media_cache.py` + existing `cmd/sbxmitm/mediacatch.go` decision logic.
**Interfaces:**
- Produces: `type MediaCache struct{ dir string; maxObj, maxTotal int64 }`; `func (m *MediaCache) Get(url string) ([]byte, http.Header, bool)`; `func (m *MediaCache) MaybeStore(url string, resp *http.Response, body []byte)` (Content-Type image/video/audio/font/css/js, size < 16 MiB, respects max-age). Key = SHA256(URL), sharded `dir/<key[:2]>/<key>`.
- [ ] **Step 1: Failing test** — store a cacheable image response, `Get` returns it; an oversized (>16 MiB) or non-media response is not stored.
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** — cache decision (port Python), sharded file store, LRU-ish total cap (evict oldest on overflow), fail-open (any cache error → bypass). SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): media-cache (16MB/obj, 2GB total) (ref #744)`.
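The key derivation and the store decision from Task 6.1 can be illustrated as below. `cacheKey`, `cachePath` and `cacheable` are hypothetical helper names, the exact list of accepted CSS/JS media types is an assumption, and `max-age` handling and eviction are not shown. Keying on the full URL including scheme and host keeps two vhosts that share a path from colliding.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"mime"
	"path/filepath"
	"strings"
)

const maxObject = 16 << 20 // 16 MiB per-object cap from the plan

// cacheKey is the hex SHA-256 of the full URL (scheme + host + path + query).
func cacheKey(url string) string {
	sum := sha256.Sum256([]byte(url))
	return hex.EncodeToString(sum[:])
}

// cachePath shards objects as dir/<key[:2]>/<key>.
func cachePath(dir, url string) string {
	k := cacheKey(url)
	return filepath.Join(dir, k[:2], k)
}

// cacheable applies the Content-Type and size part of the store decision.
func cacheable(contentType string, size int64) bool {
	if size <= 0 || size >= maxObject {
		return false
	}
	mt, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	switch mt {
	case "text/css", "text/javascript", "application/javascript":
		return true
	}
	for _, prefix := range []string{"image/", "video/", "audio/", "font/"} {
		if strings.HasPrefix(mt, prefix) {
			return true
		}
	}
	return false
}

func main() {
	u := "https://a.example.com/logo.png"
	fmt.Println(cachePath("/var/cache/secubox/waf", u))
	fmt.Println(cacheable("image/png", 1024), cacheable("text/html", 1024))
}
```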
---
## Phase 7 — Error pages
### Task 7.1: Synthetic 502/503/504 pages
**Files:**
- Create: `cmd/sbxwaf/errpages.go` + `cmd/sbxwaf/templates/` (embedded) (+ test)
- Reference: `secubox_waf.py` `error()` hook templates.
**Interfaces:**
- Produces: `func errorPage(code int) []byte` (themed HTML, `//go:embed`). On upstream dial/round-trip error, the reverse-proxy `ErrorHandler` serves `errorPage(502|503|504)`.
- [ ] **Step 1: Failing test** — point a route at a dead backend; assert the handler returns 502 with the themed body (contains a known marker string).
- [ ] **Step 2: Verify fail.**
- [ ] **Step 3: Implement** `//go:embed templates/*.html`, map status→template, wire reverse-proxy `ErrorHandler`. SPDX.
- [ ] **Step 4: Run** — PASS.
- [ ] **Step 5: Commit** `feat(sbxwaf): synthetic error pages on upstream failure (ref #744)`.
---
## Phase 8 — Packaging + hardening
### Task 8.1: Debian package + systemd template + user
**Files:**
- Create: `packages/secubox-waf-ng/debian/{control,rules,postinst,prerm,compat}`, `packages/secubox-waf-ng/systemd/secubox-waf-ng-worker@.service`, `packages/secubox-waf-ng/debian/secubox-waf-ng.apparmor`
**Interfaces:**
- Produces: installable `secubox-waf-ng` shipping `/usr/sbin/sbxwaf`; `secubox-waf-ng-worker@1..2` enabled; `secubox-waf` user; AppArmor enforce.
- [ ] **Step 1:** Write `debian/control` (`Architecture: arm64`, `Standards-Version: 4.6.2`, `compat 13`), `rules` (cross-build the Go binary via `execute_after_dh_auto_install`), `postinst` (create `secubox-waf` user/group, dirs `/etc/secubox/waf` `/var/log/secubox` `/var/cache/secubox/waf` with correct owners — NEVER chmod the shared parents to 0750 per `[[project_var_log_secubox_traversal]]`/`[[project_etc_secubox_traversal]]`, `aa-enforce`, `systemctl enable --now secubox-waf-ng-worker@{1,2}`), `prerm` (stop workers).
- [ ] **Step 2:** systemd unit: `User=secubox-waf`, `ExecStart=/usr/sbin/sbxwaf --listen 127.0.0.1:808%i --ca-cert /etc/secubox/waf/ca/ca-cert.pem …`, the full hardening block (Global Constraints), `RuntimeDirectory=secubox` + `RuntimeDirectoryPreserve=yes` (per `[[project_runtimedirectory_socket_wipe]]`).
- [ ] **Step 3:** AppArmor profile: rw to `/var/log/secubox/**`, `/var/cache/secubox/waf/**`, `/run/secubox/**`; r to `/etc/secubox/waf/**`, `/etc/secubox/secrets/**`; deny everything else.
- [ ] **Step 4: Build** `dpkg-buildpackage -a arm64 --host-arch arm64 -us -uc -b` → `.deb` produced.
- [ ] **Step 5: Commit** `feat(packaging): secubox-waf-ng deb + hardened systemd + AppArmor (ref #744)`.
---
## Phase 9 — Parity harness + shadow + cutover
### Task 9.1: Decision parity harness vs mitmproxy
**Files:**
- Create: `packages/secubox-toolbox-ng/cmd/sbxwaf/parity_test.go`, `packages/secubox-toolbox-ng/testdata/waf-parity-fixtures.json`
**Interfaces:**
- Consumes: the same request corpus replayed against Python (`secubox_waf.py`) and Go (`Rules.Match`+`Ban`).
- Produces: a fixture file of `{method, path, query, body, ua, client_ip, expect: allow|warn|ban|421}` and a Go test asserting `sbxwaf` matches `expect` for every row.
- [ ] **Step 1:** Author `waf-parity-fixtures.json` from the Python rule corpus (malicious + benign + private-IP + static + NC-bypass rows).
- [ ] **Step 2:** Write `parity_test.go` looping fixtures through `Rules.Match`+skip-lists+`Ban`, asserting `expect`.
- [ ] **Step 3: Run** `go test ./cmd/sbxwaf/ -run TestWAFParity -v` → PASS; any mismatch is a BLOCKING bug to fix in `rules.go`.
- [ ] **Step 4: Commit** `test(sbxwaf): decision parity harness vs mitmproxy (ref #744)`.
### Task 9.2: Shadow-run + bench + cutover/rollback runbook
**Files:**
- Create: `packages/secubox-waf-ng/docs/CUTOVER.md`, `scripts/sbxwaf-bench.sh`
- [ ] **Step 1:** `sbxwaf-bench.sh` — drive `wrk`/`hey` against both mitmproxy (`:8080` LXC) and sbxwaf (`:8081` shadow), record req/s, p99, RSS; emit a comparison table.
- [ ] **Step 2:** Deploy sbxwaf on `:8081` (shadow), mirror a fraction of traffic (HAProxy `mode tcp` tee or duplicated backend), run the bench + replay the parity corpus live.
- [ ] **Step 3:** `CUTOVER.md` — go/no-go checklist (parity green, bench `>5×`/`p99<⅓`/`RSS<¼`), the HAProxy `server waf` IP flip (LXC→host), and the rollback (re-flip; mitmproxy LXC stays deployed until validated).
- [ ] **Step 4: Commit** `docs(sbxwaf): bench harness + cutover/rollback runbook (ref #744)`.
- [ ] **Step 5:** (Operator-gated) execute cutover only after the go/no-go gate passes; this step is NOT automated.
---
## Self-Review notes
- **Spec coverage:** §3 architecture→Phase 1; §4 components→Phases 0-7 (forge/relay/httpcodec/reload extracted Phase 0; routes/rules/ban/crowdsec/cookieaudit/mediacache/errpages Phases 1-7); §5 feature port→Phases 2-7; §6 hardening→Phase 8; §7 migration→Phase 9; §8 tests→every task + Phase 9 parity. No gaps.
- **Placeholder scan:** none — each task has concrete files, signatures, and test code.
- **Type consistency:** `forge.CA`/`LoadCA`/`Forge`, `Routes.Lookup`, `Rules.Match`, `Ban.Record`, `CrowdSec.Alert` used consistently across phases.


@ -0,0 +1,169 @@
# Design: `sbxwaf`, a host-native Go WAF engine (mitmproxy replacement)
- **Issue**: #744
- **Date**: 2026-06-26
- **Prior art**: #662 (toolbox R3 `sbxmitm` port), `docs/superpowers/specs/2026-06-18-mitm-engine-migration-analysis.md`
- **Status**: design validated (brainstorming); awaiting review before the implementation plan
## 1. Context & problem
The SecuBox WAF inspects all inbound external traffic (HAProxy TLS 1.3 → backend
`mitmproxy_waf` → mitmdump `--mode regular` → LXC backends). Inspection runs in
`mitmproxy` 11.0.2 (LXC `10.100.0.60:8080`) with three Python addons:
- `secubox_waf.py` (930 lines): vhost→backend routing (`haproxy-routes.json`,
10 s mtime reload), regex rule engine (SQLi/XSS/LFI/RCE…), graduated ban
(300 s sliding window, threshold 3 → 403 WARNING then 403 BAN), CrowdSec LAPI
bridge (`/v1/alerts` → firewall-bouncer → nft drop), synthetic error pages,
`Connection: close` (#496), RFC1918 CIDR whitelist, static-asset skip, NC token bypass.
- `cookie_audit.py`: GDPR (RGPD) ledger of `Set-Cookie` headers (JSONL, SHA256-hashed values).
- `media_cache.py`: media response cache (16 MB/object, 2 GB total).
### Problems with the current engine
1. **Performance**: Python is GIL-bound. Phase 9 (#501) had to launch **4 workers + numgen
fanout** to saturate the cores. Python regex plus per-request asyncio dispatch.
2. **Fragility**: dependence on the mitmproxy version (#605, `requestheaders` timing
in v11), on the confdir drop-in (#603), and on the `/data` vs `/srv` routes drift: three
recorded failure modes, each of which takes down every inspected vhost.
3. **RAM**: ~150-200 MB × 4 workers in the LXC.
## 2. Goal & decisions
| Axis | Decision |
|-----|----------|
| Main driver | **Performance/load** (throughput, p99 latency, RAM) |
| Scope | **FULL replacement**: no residual mitmproxy in the WAF |
| Placement | **Host-native** (`secubox-waf-ng-worker@` workers), hardened |
| Approach | **A**: dedicated `sbxwaf` binary, shared core extracted from `sbxmitm`, shadow→cutover |
### Estimated gains (to be validated by benchmark; these are the BLOCKING go/no-go criteria)
- Throughput: **>5×/core** (no GIL, no fanout); bench target `>5× req/s·core`.
- p99 latency: **<⅓** (compiled regexp + concurrent GC, no refcount thrash).
- RAM: **<¼** (one static binary, ~30-80 MB vs 600-800 MB).
- Robustness: the 3 failure modes are eliminated (static binary, zero runtime).
These thresholds are **blocking**: no cutover until they are met on
the load benchmark (§7.3). If an irreducible live-dashboard case prevents a threshold from being met,
it is documented and explicitly arbitrated before cutover.
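The three blocking thresholds translate directly into a ratio check between the baseline (mitmproxy) and the candidate (`sbxwaf`). The sketch below is illustrative only: the `Bench` type, the `GoNoGo` function and the sample numbers in `main` are hypothetical, not measured results.

```go
package main

import "fmt"

// Bench holds one engine's load-test summary (hypothetical shape).
type Bench struct {
	ReqPerSecPerCore float64
	P99Millis        float64
	RSSMiB           float64
}

// GoNoGo returns the names of the thresholds the candidate fails to meet:
// throughput must exceed 5x, p99 must be under 1/3, RSS must be under 1/4.
// An empty result means "go".
func GoNoGo(baseline, candidate Bench) []string {
	var failed []string
	if candidate.ReqPerSecPerCore <= 5*baseline.ReqPerSecPerCore {
		failed = append(failed, "throughput")
	}
	if candidate.P99Millis >= baseline.P99Millis/3 {
		failed = append(failed, "p99")
	}
	if candidate.RSSMiB >= baseline.RSSMiB/4 {
		failed = append(failed, "rss")
	}
	return failed
}

func main() {
	baseline := Bench{ReqPerSecPerCore: 1000, P99Millis: 30, RSSMiB: 700} // made-up numbers
	candidate := Bench{ReqPerSecPerCore: 6000, P99Millis: 8, RSSMiB: 60}  // made-up numbers
	fmt.Println("failed thresholds:", GoNoGo(baseline, candidate))
}
```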
### Non-goals (YAGNI)
- No immediate unification of the engines (`sbxmitm` stays separate; approach B was rejected
so as not to couple the WAF and toolbox R3 release cycles).
- No JA4/TLS splice in the WAF (these are toolbox R3 needs, out of WAF scope).
## 3. Target architecture
```
Internet ──TLS1.3──> HAProxy :443
│ use_backend mitmproxy_waf (ACL vhost)
backend mitmproxy_waf
server waf <HOST_IP>:8080 ◄── cutover flip (host instead of the LXC)
┌─────────────────────────────────────────┐
│ sbxwaf (host-native, user secubox-waf)  │
│ workers ng-worker@1..2 (rolling restart)│
│ ├─ forge CA per-host (mode regular)     │
│ ├─ routes-loader (haproxy-routes.json)  │
│ ├─ WAF rule engine (waf-rules.json)     │
│ ├─ graduated ban (sliding window)       │
│ ├─ CrowdSec LAPI bridge                 │
│ ├─ cookie-audit JSONL                   │
│ ├─ media-cache                          │
│ └─ 502/503/504 error pages              │
└─────────────────────────────────────────┘
LXC backends 10.100.0.0/24
```
- **Same network position** as mitmdump: listens on `:8080`, **same CA confdir**
(migrated `/data/mitmproxy` → `/etc/secubox/waf/ca`), **same `haproxy-routes.json`**
(mtime reload), **HAProxy backend unchanged** (the `server waf` IP is flipped from the LXC to
the host). `sbxwaf` mirrors the exact TLS boundary (forge `--mode regular`).
- **Concurrency**: 1 process uses all cores. **2 workers** are kept for
zero-downtime rolling restarts (not for scaling); the 4-worker numgen fanout
goes away.
## 4. Components (isolated, testable units)
| Package / cmd | Role | Depends on |
|---|---|---|
| `internal/forge` | CA + per-host leaf forge (extracted from `sbxmitm`) | crypto/tls, x509 |
| `internal/relay` | async fire-and-forget unix-socket POST | net |
| `internal/httpcodec` | gzip/br/zstd decode + re-encode (extracted) | compress, brotli, zstd |
| `internal/util` | shared HTTP helpers | none |
| `cmd/sbxwaf/routes.go` | loads `haproxy-routes.json`, mtime reload, rewrites `req.Host/URL` | internal |
| `cmd/sbxwaf/rules.go` | regexes compiled from `waf-rules.json`, matches path/query/body/UA | regexp |
| `cmd/sbxwaf/ban.go` | 300 s sliding window, threshold → WARNING/BAN, lock-guarded map with TTL | sync |
| `cmd/sbxwaf/crowdsec.go` | POST to LAPI `/v1/alerts` (JWT) | net/http |
| `cmd/sbxwaf/cookieaudit.go` | parses Set-Cookie, SHA256 hash, JSONL append | crypto/sha256 |
| `cmd/sbxwaf/mediacache.go` | media response cache (16MB/2GB), reuses `mediacatch.go` | none |
| `cmd/sbxwaf/errpages.go` | embedded 502/503/504 templates | embed |
| `cmd/sbxwaf/main.go` | HTTP reverse proxy, inspection pipeline, listens on :8080 | net/http |
Each unit has a clear contract (input→verdict) and can be tested in isolation against
fixtures. The shared `internal/*` core is consumed by both `cmd/sbxmitm` and
`cmd/sbxwaf` without coupling their binaries.
## 5. Feature port (full replacement)
**Exact** parity with `secubox_waf.py` is required (security-critical, no regressions):
- **Routing**: `requestheaders` → look up the host in the routes, rewrite the target; unmapped
host → **421**.
- **Rules**: regex categories (SQLi/XSS/LFI/RCE…) from `waf-rules.json`
(enabled/severity), matched on path+query+body+UA. Static-asset skip (.js/.css/.png/health/status),
NC token bypass (`/index.php/login/v2/`, `/ocs/v2.php/core/login`).
- **Graduated ban**: 300 s sliding window, threshold 3 → first detection **403 WARNING**,
count≥3 **403 BAN**; RFC1918+loopback CIDR whitelist (LAN operators are never banned).
- **CrowdSec**: JWT alert → LAPI `/v1/alerts` → bouncer nft drop (4 h default).
- **Error pages**: intercept 502/503/504 → themed pages.
- **Cookie-audit**: `response` → Set-Cookie → hashed JSONL.
- **Media-cache**: Content-Type/size/TTL → store/serve.
- **`Connection: close`** (#496) retained.
## 6. Hardening (compensates for the loss of LXC isolation)
Running host-native exposes the WAF (attacker traffic) on the host → compensating controls
(CSPN requirement: privilege separation, AppArmor enforce):
- Unprivileged `User=secubox-waf` / `Group=secubox-waf` (created in postinst).
- `NoNewPrivileges=yes`, `ProtectSystem=strict` + minimal `ReadWritePaths`
(`/var/log/secubox`, `/var/cache/secubox/waf`, `/run/secubox`), `ProtectHome=yes`.
- `RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX`, all capabilities dropped,
`SystemCallFilter` (seccomp).
- **AppArmor enforce profile** shipped in `debian/`, enabled in postinst.
- **Append-only** audit log `/var/log/secubox/audit.log` (ban/unban/rule).
- Secrets (CrowdSec JWT, CA key) kept out of the code, in `/etc/secubox/secrets/`, chmod 600, owner
`secubox-waf`.
## 7. Migration: shadow → parity → cutover → rollback
1. **Shadow run**: `sbxwaf` deployed on a **parallel port** (`:8081`), traffic
mirrored (HAProxy `mode tcp` mirror / tee). No production impact.
2. **Parity harness**: a request corpus (malicious + legitimate) replayed
against both Python and Go; compares the **verdict** (allow/204/403/421/ban) and the **routing
target**. Reuses the `parity-fixtures.json` pattern (#662). Any detection regression
is **blocking**.
3. **Performance bench** (go/no-go): throughput in req/s·core, p99 latency, RSS; targets in §2.
4. **Cutover**: flip the HAProxy `server waf` (LXC IP → host:8080). **Rollback** =
flip back to the LXC (mitmproxy stays deployed until validation).
## 8. Tests
- **Unit**: every `internal/*` package + `cmd/sbxwaf/*` (rules, ban, routes,
cookieaudit) with fixtures.
- **Parity**: harness §7.2 (vs live mitmproxy).
- **Load**: bench §7.3 (cutover criteria).
- **Security**: detection non-regression (known attack corpus) + CSPN tests
(privilege separation, AppArmor enforce, append-only audit).
## 9. Risks & mitigations
| Risk | Mitigation |
|---|---|
| WAF detection regression | Blocking parity harness + attack corpus before cutover |
| Loss of isolation (host-native) | Hardening §6 (dedicated user, AppArmor, seccomp, caps) |
| TLS forge boundary mirrored incorrectly | Shadow run + response comparison; mitmproxy as rollback |
| Shared core ↔ toolbox coupling | `internal/*` versioned, separate binaries, tests for both cmds |
| CrowdSec LAPI drift (auth/format) | LAPI integration test + log fallback if the POST fails |


@ -10,3 +10,4 @@ cmd/sbxmitm/sbxmitm
/debian/secubox-toolbox-ng/
/debian/debhelper-build-stamp
/debian/*.debhelper.log
+/sbxwaf


@ -21,6 +21,8 @@ import (
"path/filepath"
"testing"
"time"
+"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/forge"
)
func benchCA(b *testing.B) (string, string) {
@ -49,12 +51,12 @@ func benchCA(b *testing.B) (string, string) {
// load (warm forge cache). req/s should rise ~linearly with -cpu (no GIL).
func BenchmarkHandshake(b *testing.B) {
cp, kp := benchCA(b)
-ca, err := loadCA(cp, kp)
+ca, err := forge.LoadCA(cp, kp)
if err != nil {
b.Fatal(err)
}
px := &Proxy{ca: ca}
-if _, err := ca.forge("example.com"); err != nil { // warm cache
+if _, err := ca.Forge("example.com"); err != nil { // warm cache
b.Fatal(err)
}
ln, err := net.Listen("tcp", "127.0.0.1:0")
@ -77,7 +79,7 @@ func BenchmarkHandshake(b *testing.B) {
}
}()
pool := x509.NewCertPool()
-pool.AddCert(ca.cert)
+pool.AddCert(ca.Cert)
addr := ln.Addr().String()
ccfg := &tls.Config{ServerName: "example.com", RootCAs: pool, MinVersion: tls.VersionTLS12}


@ -15,6 +15,8 @@ import (
"net/http"
"strings"
"testing"
+"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/httpcodec"
)
// TestAcceptEncodingPreserved pins the #662 behaviour change: the request
@ -48,13 +50,13 @@ func TestBrotliRoundTrip(t *testing.T) {
bytes.Repeat([]byte("AB"), 100000),
}
for _, x := range cases {
-enc, err := brotliBytes(x)
+enc, err := httpcodec.BrotliBytes(x)
if err != nil {
-t.Fatalf("brotliBytes(%d): %v", len(x), err)
+t.Fatalf("BrotliBytes(%d): %v", len(x), err)
}
-got, err := unbrotliBytes(enc)
+got, err := httpcodec.UnbrotliBytes(enc)
if err != nil {
-t.Fatalf("unbrotliBytes(%d): %v", len(x), err)
+t.Fatalf("UnbrotliBytes(%d): %v", len(x), err)
}
if !bytes.Equal(got, x) {
t.Fatalf("brotli round-trip mismatch: got %d want %d", len(got), len(x))
@ -70,13 +72,13 @@ func TestZstdRoundTrip(t *testing.T) {
bytes.Repeat([]byte("AB"), 100000),
}
for _, x := range cases {
-enc, err := zstdBytes(x)
+enc, err := httpcodec.ZstdBytes(x)
if err != nil {
-t.Fatalf("zstdBytes(%d): %v", len(x), err)
+t.Fatalf("ZstdBytes(%d): %v", len(x), err)
}
-got, err := unzstdBytes(enc)
+got, err := httpcodec.UnzstdBytes(enc)
if err != nil {
-t.Fatalf("unzstdBytes(%d): %v", len(x), err)
+t.Fatalf("UnzstdBytes(%d): %v", len(x), err)
}
if !bytes.Equal(got, x) {
t.Fatalf("zstd round-trip mismatch: got %d want %d", len(got), len(x))
@ -86,7 +88,7 @@ func TestZstdRoundTrip(t *testing.T) {
func TestInjectIntoBodyBrotli(t *testing.T) {
html := `<html><head><title>page</title></head><body>content</body></html>`
-enc, err := brotliBytes([]byte(html))
+enc, err := httpcodec.BrotliBytes([]byte(html))
if err != nil {
t.Fatal(err)
}
@ -94,7 +96,7 @@ func TestInjectIntoBodyBrotli(t *testing.T) {
if !ok {
t.Fatal("br inject must report ok=true")
}
-plain, err := unbrotliBytes(out)
+plain, err := httpcodec.UnbrotliBytes(out)
if err != nil {
t.Fatalf("re-brotli'd output must decode cleanly (encoding stays br): %v", err)
}
@ -109,7 +111,7 @@ func TestInjectIntoBodyBrotli(t *testing.T) {
func TestInjectIntoBodyZstd(t *testing.T) {
html := `<html><head><title>page</title></head><body>content</body></html>`
-enc, err := zstdBytes([]byte(html))
+enc, err := httpcodec.ZstdBytes([]byte(html))
if err != nil {
t.Fatal(err)
}
@ -117,7 +119,7 @@ func TestInjectIntoBodyZstd(t *testing.T) {
if !ok {
t.Fatal("zstd inject must report ok=true")
}
-plain, err := unzstdBytes(out)
+plain, err := httpcodec.UnzstdBytes(out)
if err != nil {
t.Fatalf("re-zstd'd output must decode cleanly (encoding stays zstd): %v", err)
}
@ -131,12 +133,12 @@ func TestInjectIntoBodyZstd(t *testing.T) {
}
func TestInjectIntoBodyBrotliCaseInsensitive(t *testing.T) {
-enc, _ := brotliBytes([]byte(`<head></head>`))
+enc, _ := httpcodec.BrotliBytes([]byte(`<head></head>`))
out, ok := injectIntoBody(enc, "BR", inlineTestScript, "", false)
if !ok {
t.Fatal("Content-Encoding BR (upper) must be recognised → ok=true")
}
-plain, err := unbrotliBytes(out)
+plain, err := httpcodec.UnbrotliBytes(out)
if err != nil {
t.Fatal(err)
}
@ -168,25 +170,26 @@ func TestInjectIntoBodyZstdFailOpen(t *testing.T) {
}
func TestBrotliZstdBombGuard(t *testing.T) {
-zeros := make([]byte, gunzipCap+4096)
-brBomb, err := brotliBytes(zeros)
+const bombCap = 32 << 20 // mirrors httpcodec.gunzipCap
+zeros := make([]byte, bombCap+4096)
+brBomb, err := httpcodec.BrotliBytes(zeros)
if err != nil {
t.Fatal(err)
}
-if _, err := unbrotliBytes(brBomb); err == nil {
-t.Fatal("unbrotliBytes must reject output exceeding gunzipCap")
+if _, err := httpcodec.UnbrotliBytes(brBomb); err == nil {
+t.Fatal("UnbrotliBytes must reject output exceeding gunzipCap")
}
// fail-open through the inject path.
if out, ok := injectIntoBody(brBomb, "br", inlineTestScript, "", false); ok || !bytes.Equal(out, brBomb) {
t.Fatal("over-cap br body must fail open with original bytes")
}
-zsBomb, err := zstdBytes(zeros)
+zsBomb, err := httpcodec.ZstdBytes(zeros)
if err != nil {
t.Fatal(err)
}
-if _, err := unzstdBytes(zsBomb); err == nil {
-t.Fatal("unzstdBytes must reject output exceeding gunzipCap")
+if _, err := httpcodec.UnzstdBytes(zsBomb); err == nil {
+t.Fatal("UnzstdBytes must reject output exceeding gunzipCap")
}
if out, ok := injectIntoBody(zsBomb, "zstd", inlineTestScript, "", false); ok || !bytes.Equal(out, zsBomb) {
t.Fatal("over-cap zstd body must fail open with original bytes")


@ -7,6 +7,8 @@ package main
import (
"strings"
"testing"
+"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/httpcodec"
)
// representativeSelectors covers each ported group + an EXPANDED popup token,
@ -167,12 +169,12 @@ func TestInjectHTMLNonWGSkipsCosmetic(t *testing.T) {
func TestInjectIntoBodyGzipCarriesCosmetic(t *testing.T) {
// The gzip decompress→inject→recompress path must carry BOTH injects for wg.
body := []byte(`<html><head></head><body>hi</body></html>`)
-gz := gzipBytes(body)
+gz := httpcodec.GzipBytes(body)
out, ok := injectIntoBody(gz, "gzip", inlineTestScript, "", true)
if !ok {
t.Fatalf("injectIntoBody(gzip) returned ok=false")
}
-plain, err := gunzipBytes(out)
+plain, err := httpcodec.GunzipBytes(out)
if err != nil {
t.Fatalf("re-gzip output not gunzippable: %v", err)
}


@ -17,6 +17,10 @@
// open (serve the ORIGINAL bytes on any decode/encode error — never corrupt a
// page); unknown encodings pass through untouched.
//
+// Codec primitives (GunzipBytes / GzipBytes / UnbrotliBytes / BrotliBytes /
+// UnzstdBytes / ZstdBytes) live in internal/httpcodec so that cmd/sbxwaf can
+// reuse them. This file only contains the sbxmitm-specific inject logic.
+//
// Dependencies (cgo-free, pure-Go):
// - compress/gzip (stdlib)
// - github.com/andybalholm/brotli (br)
@ -24,127 +28,11 @@
package main
import (
-"bytes"
-"compress/gzip"
-"io"
"strings"
-"github.com/andybalholm/brotli"
-"github.com/klauspost/compress/zstd"
+"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/httpcodec"
)
-// gunzipCap bounds the decompressed output of EVERY codec (gzip/br/zstd) so a
-// maliciously-crafted body (a "decompression bomb") cannot blow the worker's
-// memory. The upstream body itself is already read under an 8MiB LimitReader;
-// 32MiB of inflated HTML is a generous ceiling for a single page. Exceeding it →
-// treated as an error (caller fails open and serves the original compressed
-// bytes). Named gunzipCap for history; applies uniformly to br + zstd too.
-const gunzipCap = 32 << 20
-// readCapped inflates a decompressing reader with the gunzipCap bomb guard,
-// shared by gzip/br/zstd. Reads up to gunzipCap+1 so "exactly at the cap" (fine)
-// is distinguished from "over the cap" (bomb → error).
-func readCapped(r io.Reader) ([]byte, error) {
-out, err := io.ReadAll(io.LimitReader(r, gunzipCap+1))
-if err != nil {
-return nil, err
-}
-if len(out) > gunzipCap {
-return nil, errGunzipTooLarge
-}
-return out, nil
-}
-// gunzipBytes inflates a gzip-compressed body. It is defensive on two axes:
-// - a malformed/non-gzip input returns an error (caller fails open),
-// - the decompressed output is capped at gunzipCap; if the stream would
-// exceed it, that is reported as an error too (decompression-bomb guard).
-func gunzipBytes(in []byte) ([]byte, error) {
-zr, err := gzip.NewReader(bytes.NewReader(in))
-if err != nil {
-return nil, err
-}
-defer zr.Close()
-return readCapped(zr)
-}
-// errGunzipTooLarge is returned by gunzipBytes when the decompressed stream
-// exceeds gunzipCap (decompression-bomb guard).
-var errGunzipTooLarge = errString("gunzip output exceeds cap")
-// errString is a tiny stdlib-only error type (avoids importing errors/fmt for
-// one sentinel).
-type errString string
-func (e errString) Error() string { return string(e) }
-// gzipBytes compresses in with the default gzip level. It never errors: the
-// gzip.Writer only writes into an in-memory bytes.Buffer, which cannot fail.
-func gzipBytes(in []byte) []byte {
-var buf bytes.Buffer
-zw := gzip.NewWriter(&buf)
-_, _ = zw.Write(in)
-_ = zw.Close()
-return buf.Bytes()
-}
-// unbrotliBytes inflates a brotli-compressed body with the gunzipCap bomb guard.
-// A malformed/non-brotli input or an over-cap stream returns an error (caller
-// fails open). Pure-Go (github.com/andybalholm/brotli — cgo-free).
-func unbrotliBytes(in []byte) ([]byte, error) {
-return readCapped(brotli.NewReader(bytes.NewReader(in)))
-}
-// brotliBytes compresses in with brotli at the default quality. It writes into
-// an in-memory buffer; Close flushes the final block. The bytes.Buffer cannot
-// fail, but brotli.Writer.Write/Close return errors → surfaced so the caller
-// fails open rather than serving a truncated stream.
-func brotliBytes(in []byte) ([]byte, error) {
-var buf bytes.Buffer
-bw := brotli.NewWriter(&buf)
-if _, err := bw.Write(in); err != nil {
-_ = bw.Close()
-return nil, err
-}
-if err := bw.Close(); err != nil {
-return nil, err
-}
-return buf.Bytes(), nil
-}
-// unzstdBytes inflates a zstd-compressed body with the gunzipCap bomb guard. A
-// malformed/non-zstd input or an over-cap stream returns an error (caller fails
-// open). Pure-Go (github.com/klauspost/compress/zstd — cgo-free). The decoder is
-// created per-call WITHOUT concurrency goroutines (WithDecoderConcurrency(1)) so
-// nothing is left running, then Closed.
-func unzstdBytes(in []byte) ([]byte, error) {
-zr, err := zstd.NewReader(bytes.NewReader(in), zstd.WithDecoderConcurrency(1))
-if err != nil {
-return nil, err
-}
-defer zr.Close()
-return readCapped(zr)
-}
-// zstdBytes compresses in with zstd at the default level. The encoder is created
-// per-call and Closed (flushing the final frame). Errors are surfaced so the
-// caller fails open rather than serving a truncated frame.
-func zstdBytes(in []byte) ([]byte, error) {
-var buf bytes.Buffer
-zw, err := zstd.NewWriter(&buf, zstd.WithEncoderConcurrency(1))
-if err != nil {
-return nil, err
-}
-if _, err := zw.Write(in); err != nil {
-_ = zw.Close()
-return nil, err
-}
-if err := zw.Close(); err != nil {
-return nil, err
-}
-return buf.Bytes(), nil
}
// injectHTML applies BOTH HTML transforms in one pass over the DECOMPRESSED
// body: the transparency-banner (always, via the INLINE script) AND, for R3 (wg)
// clients, the ad/popup-hiding cosmetic <style> (#662 — the cutover left this
@ -188,34 +76,35 @@ func injectHTML(plain []byte, scriptBody, nonce string, wg bool) []byte {
// encoder error), the ORIGINAL bytes are returned with ok=false so the page is
// never broken or corrupted.
//
// The 32MiB decompression-bomb cap (gunzipCap) is enforced uniformly across
// gzip/br/zstd. idempotency / placement live inside injectInlineBanner/injectCosmetic.
// The 32 MiB decompression-bomb cap (gunzipCap) is enforced uniformly across
// gzip/br/zstd inside internal/httpcodec. idempotency / placement live inside
// injectInlineBanner/injectCosmetic.
func injectIntoBody(body []byte, encoding, scriptBody, nonce string, wg bool) (out []byte, ok bool) {
switch strings.ToLower(strings.TrimSpace(encoding)) {
case "":
return injectHTML(body, scriptBody, nonce, wg), true
case "gzip":
plain, err := gunzipBytes(body)
plain, err := httpcodec.GunzipBytes(body)
if err != nil {
return body, false // fail open: serve the original compressed bytes
}
return gzipBytes(injectHTML(plain, scriptBody, nonce, wg)), true
return httpcodec.GzipBytes(injectHTML(plain, scriptBody, nonce, wg)), true
case "br":
plain, err := unbrotliBytes(body)
plain, err := httpcodec.UnbrotliBytes(body)
if err != nil {
return body, false // fail open
}
reenc, err := brotliBytes(injectHTML(plain, scriptBody, nonce, wg))
reenc, err := httpcodec.BrotliBytes(injectHTML(plain, scriptBody, nonce, wg))
if err != nil {
return body, false // fail open: never serve a truncated br frame
}
return reenc, true
case "zstd":
plain, err := unzstdBytes(body)
plain, err := httpcodec.UnzstdBytes(body)
if err != nil {
return body, false // fail open
}
reenc, err := zstdBytes(injectHTML(plain, scriptBody, nonce, wg))
reenc, err := httpcodec.ZstdBytes(injectHTML(plain, scriptBody, nonce, wg))
if err != nil {
return body, false // fail open: never serve a truncated zstd frame
}


@ -13,6 +13,8 @@ import (
"bytes"
"strings"
"testing"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/httpcodec"
)
func TestGzipRoundTrip(t *testing.T) {
@ -23,9 +25,9 @@ func TestGzipRoundTrip(t *testing.T) {
bytes.Repeat([]byte("AB"), 100000), // larger, compressible payload
}
for _, x := range cases {
got, err := gunzipBytes(gzipBytes(x))
got, err := httpcodec.GunzipBytes(httpcodec.GzipBytes(x))
if err != nil {
t.Fatalf("gunzipBytes(gzipBytes(%d bytes)) errored: %v", len(x), err)
t.Fatalf("GunzipBytes(GzipBytes(%d bytes)) errored: %v", len(x), err)
}
if !bytes.Equal(got, x) {
t.Fatalf("round-trip mismatch: got %d bytes, want %d bytes", len(got), len(x))
@ -35,8 +37,8 @@ func TestGzipRoundTrip(t *testing.T) {
func TestGunzipNonGzipFails(t *testing.T) {
// Plain bytes that are not a gzip stream → error, no panic.
if _, err := gunzipBytes([]byte("this is definitely not gzip")); err == nil {
t.Fatal("gunzipBytes on non-gzip input must error")
if _, err := httpcodec.GunzipBytes([]byte("this is definitely not gzip")); err == nil {
t.Fatal("GunzipBytes on non-gzip input must error")
}
}
@ -44,11 +46,11 @@ func TestInjectIntoBodyGzip(t *testing.T) {
// End-to-end-ish: HTML with <head>, gzipped, run through the exact transform
// the inject path uses. Result must gunzip back to an injected, intact doc.
html := `<html><head><title>page</title></head><body>content</body></html>`
out, ok := injectIntoBody(gzipBytes([]byte(html)), "gzip", inlineTestScript, "", true)
out, ok := injectIntoBody(httpcodec.GzipBytes([]byte(html)), "gzip", inlineTestScript, "", true)
if !ok {
t.Fatal("gzip inject must report ok=true")
}
plain, err := gunzipBytes(out)
plain, err := httpcodec.GunzipBytes(out)
if err != nil {
t.Fatalf("re-gzipped output must gunzip cleanly: %v", err)
}
@ -68,11 +70,11 @@ func TestInjectIntoBodyGzip(t *testing.T) {
func TestInjectIntoBodyGzipCaseInsensitiveEncoding(t *testing.T) {
html := `<head></head>`
out, ok := injectIntoBody(gzipBytes([]byte(html)), "GZIP", inlineTestScript, "", false)
out, ok := injectIntoBody(httpcodec.GzipBytes([]byte(html)), "GZIP", inlineTestScript, "", false)
if !ok {
t.Fatal("Content-Encoding GZIP (upper) must be recognised → ok=true")
}
plain, err := gunzipBytes(out)
plain, err := httpcodec.GunzipBytes(out)
if err != nil {
t.Fatalf("gunzip failed: %v", err)
}
@ -125,10 +127,11 @@ func TestInjectIntoBodyUnknownEncodingPassthrough(t *testing.T) {
func TestGunzipBombGuard(t *testing.T) {
// A body that inflates beyond gunzipCap must be rejected (not OOM the worker).
// gzip of >32MiB of zeros compresses to a small blob but inflates past the
// cap → gunzipBytes returns an error → inject path fails open.
big := gzipBytes(make([]byte, gunzipCap+1024))
if _, err := gunzipBytes(big); err == nil {
t.Fatal("gunzipBytes must reject output exceeding gunzipCap")
// cap → GunzipBytes returns an error → inject path fails open.
const bombCap = 32 << 20 // mirrors httpcodec.gunzipCap
big := httpcodec.GzipBytes(make([]byte, bombCap+1024))
if _, err := httpcodec.GunzipBytes(big); err == nil {
t.Fatal("GunzipBytes must reject output exceeding gunzipCap")
}
// And via the inject path: fail open, original bytes preserved.
out, ok := injectIntoBody(big, "gzip", inlineTestScript, "", false)
@ -142,12 +145,13 @@ func TestGunzipBombGuard(t *testing.T) {
func TestGunzipExactlyAtCap(t *testing.T) {
// A body that inflates to EXACTLY gunzipCap is allowed (boundary).
payload := make([]byte, gunzipCap)
got, err := gunzipBytes(gzipBytes(payload))
const bombCap = 32 << 20 // mirrors httpcodec.gunzipCap
payload := make([]byte, bombCap)
got, err := httpcodec.GunzipBytes(httpcodec.GzipBytes(payload))
if err != nil {
t.Fatalf("exactly-at-cap payload must be allowed: %v", err)
}
if len(got) != gunzipCap {
t.Fatalf("at-cap length mismatch: got %d, want %d", len(got), gunzipCap)
if len(got) != bombCap {
t.Fatalf("at-cap length mismatch: got %d, want %d", len(got), bombCap)
}
}


@ -22,138 +22,20 @@ package main
import (
"bytes"
"crypto"
"crypto/rand"
"crypto/tls"
"crypto/x509"
"crypto/x509/pkix"
"encoding/pem"
"flag"
"fmt"
"io"
"log"
"math/big"
"net"
"net/http"
"os"
"strconv"
"strings"
"sync"
"time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/forge"
)
// ── CA + per-host leaf forging ──────────────────────────────────────────────
// CA holds the loaded forging CA (reused from ca-wg) + a per-host leaf cache.
type CA struct {
cert *x509.Certificate
key crypto.Signer
mu sync.Mutex
cache map[string]*tls.Certificate
}
func loadCA(certPath, keyPath string) (*CA, error) {
cpem, err := os.ReadFile(certPath)
if err != nil {
return nil, fmt.Errorf("read ca cert: %w", err)
}
kpem, err := os.ReadFile(keyPath)
if err != nil {
return nil, fmt.Errorf("read ca key: %w", err)
}
// Scan for the right block TYPE rather than assuming position: the live R3
// CA the toolbox forges with (mitmproxy confdir `mitmproxy-ca.pem`) is a
// COMBINED cert+key bundle, and --ca-key may point at it. Tolerate cert and
// key co-residing in either file, in any order.
cblk := firstPEMBlock(cpem, func(b *pem.Block) bool { return b.Type == "CERTIFICATE" })
if cblk == nil {
return nil, fmt.Errorf("ca cert: no CERTIFICATE PEM block")
}
cert, err := x509.ParseCertificate(cblk.Bytes)
if err != nil {
return nil, fmt.Errorf("parse ca cert: %w", err)
}
kblk := firstPEMBlock(kpem, func(b *pem.Block) bool { return strings.Contains(b.Type, "PRIVATE KEY") })
if kblk == nil {
return nil, fmt.Errorf("ca key: no PRIVATE KEY PEM block")
}
key, err := parseKey(kblk.Bytes)
if err != nil {
return nil, fmt.Errorf("parse ca key: %w", err)
}
return &CA{cert: cert, key: key, cache: map[string]*tls.Certificate{}}, nil
}
// firstPEMBlock returns the first PEM block in data satisfying want, or nil.
// Used to pull a specific block (CERTIFICATE / PRIVATE KEY) out of a file that
// may hold several (e.g. mitmproxy's combined CA bundle).
func firstPEMBlock(data []byte, want func(*pem.Block) bool) *pem.Block {
for {
blk, rest := pem.Decode(data)
if blk == nil {
return nil
}
if want(blk) {
return blk
}
data = rest
}
}
func parseKey(der []byte) (crypto.Signer, error) {
if k, err := x509.ParsePKCS8PrivateKey(der); err == nil {
if s, ok := k.(crypto.Signer); ok {
return s, nil
}
}
if k, err := x509.ParsePKCS1PrivateKey(der); err == nil {
return k, nil
}
if k, err := x509.ParseECPrivateKey(der); err == nil {
return k, nil
}
return nil, fmt.Errorf("unsupported CA key format")
}
// forge returns a leaf cert for host signed by the CA, cached.
func (c *CA) forge(host string) (*tls.Certificate, error) {
host = strings.ToLower(strings.TrimSpace(host))
c.mu.Lock()
if tc, ok := c.cache[host]; ok {
c.mu.Unlock()
return tc, nil
}
c.mu.Unlock()
serial, _ := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 128))
tmpl := &x509.Certificate{
SerialNumber: serial,
Subject: pkix.Name{CommonName: host},
// #689 — forged leaves must outlive the (non-evicting) cert cache, else a
// long-running worker keeps serving an expired leaf and every client
// reports "certificat expiré". 365d forward + 48h back-skew = 367d span,
// safely under Apple's 398-day max-validity rule for server certs.
NotBefore: time.Now().Add(-48 * time.Hour),
NotAfter: time.Now().Add(365 * 24 * time.Hour),
KeyUsage: x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
DNSNames: []string{host},
}
der, err := x509.CreateCertificate(rand.Reader, tmpl, c.cert, c.key.Public(), c.key)
if err != nil {
return nil, err
}
leaf, err := x509.ParseCertificate(der) // parsed cert has Raw populated (Verify needs it)
if err != nil {
return nil, err
}
tc := &tls.Certificate{Certificate: [][]byte{der, c.cert.Raw}, PrivateKey: c.key, Leaf: leaf}
c.mu.Lock()
c.cache[host] = tc
c.mu.Unlock()
return tc, nil
}
// ── Pure handler logic ───────────────────────────────────────────────────────
//
// The decision surface (Decide / action / registrable / splice helpers) lives
@ -201,7 +83,7 @@ func ja4ish(h *tls.ClientHelloInfo) string {
// ── CONNECT-proxy MITM wiring ────────────────────────────────────────────────
type Proxy struct {
ca *CA
ca *forge.CA
pol *Policy
jaSink func(string) // JA4 observations (logged; a sidecar in prod)
jarKey []byte // anti-track HMAC fake-identity seed (nil → poison off)
@ -289,7 +171,7 @@ func (px *Proxy) serverTLSConfigCapture(capture func(*tls.ClientHelloInfo)) *tls
if name == "" {
name = "unknown.local"
}
return px.ca.forge(name)
return px.ca.Forge(name)
},
}
}
@ -627,7 +509,7 @@ func main() {
mediaCatch := flag.Bool("media-catch", true,
"R4 media reverse-catcher (#736): record cloneable media URLs (HLS/DASH manifests + direct audio/video) seen on MITM'd flows to "+mediaCatchPath+" for the mediaflow \"Discovered Media\" clone view. URLs only, never bodies; deduped. Set false to disable.")
flag.Parse()
ca, err := loadCA(*caCert, *caKey)
ca, err := forge.LoadCA(*caCert, *caKey)
if err != nil {
log.Fatalf("CA load: %v", err)
}


@ -17,6 +17,8 @@ import (
"sync"
"testing"
"time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/forge"
)
// genTestCA writes a self-signed CA (cert+key PEM) to dir, mirroring ca-wg.
@ -53,20 +55,20 @@ func genTestCA(t *testing.T, dir string) (certPath, keyPath string) {
func TestForgeChainsToCA(t *testing.T) {
cp, kp := genTestCA(t, t.TempDir())
ca, err := loadCA(cp, kp)
ca, err := forge.LoadCA(cp, kp)
if err != nil {
t.Fatalf("loadCA: %v", err)
}
leaf, err := ca.forge("ads.example.com")
leaf, err := ca.Forge("ads.example.com")
if err != nil {
t.Fatalf("forge: %v", err)
}
pool := x509.NewCertPool()
pool.AddCert(ca.cert)
pool.AddCert(ca.Cert)
if _, err := leaf.Leaf.Verify(x509.VerifyOptions{Roots: pool, DNSName: "ads.example.com"}); err != nil {
t.Fatalf("forged leaf does not chain to CA / wrong SAN: %v", err)
}
leaf2, _ := ca.forge("ads.example.com")
leaf2, _ := ca.Forge("ads.example.com")
if leaf2 != leaf {
t.Fatal("forge not cached")
}
@ -112,22 +114,22 @@ func TestLoadCACombinedPEM(t *testing.T) {
}
// The unit's exact arg shape: --ca-cert <cert-only> --ca-key <combined>.
ca, err := loadCA(certOnly, combined)
ca, err := forge.LoadCA(certOnly, combined)
if err != nil {
t.Fatalf("loadCA(cert-only, combined): %v", err)
t.Fatalf("forge.LoadCA(cert-only, combined): %v", err)
}
leaf, err := ca.forge("ads.example.com")
leaf, err := ca.Forge("ads.example.com")
if err != nil {
t.Fatalf("forge: %v", err)
}
pool := x509.NewCertPool()
pool.AddCert(ca.cert)
pool.AddCert(ca.Cert)
if _, err := leaf.Leaf.Verify(x509.VerifyOptions{Roots: pool, DNSName: "ads.example.com"}); err != nil {
t.Fatalf("forged leaf does not chain to combined-PEM CA: %v", err)
}
// Belt-and-braces: the combined file works as BOTH cert and key source.
if _, err := loadCA(combined, combined); err != nil {
t.Fatalf("loadCA(combined, combined): %v", err)
if _, err := forge.LoadCA(combined, combined); err != nil {
t.Fatalf("forge.LoadCA(combined, combined): %v", err)
}
}
@ -161,7 +163,7 @@ func contains(s, sub string) bool {
// ClientHello (JA4 material) is captured.
func TestClientHelloCaptureAndForge(t *testing.T) {
cp, kp := genTestCA(t, t.TempDir())
ca, err := loadCA(cp, kp)
ca, err := forge.LoadCA(cp, kp)
if err != nil {
t.Fatal(err)
}
@ -185,7 +187,7 @@ func TestClientHelloCaptureAndForge(t *testing.T) {
}()
pool := x509.NewCertPool()
pool.AddCert(ca.cert)
pool.AddCert(ca.Cert)
conn, err := tls.Dial("tcp", ln.Addr().String(), &tls.Config{ServerName: "example.com", RootCAs: pool})
if err != nil {
t.Fatalf("client handshake against forged cert failed (CA not trusted / forge broken): %v", err)


@ -13,12 +13,13 @@
package main
import (
"bufio"
"os"
"regexp"
"strings"
"sync"
"time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/reload"
)
// ── ad_ghost: static ad/tracker host pattern (port of _AD_HOST) ──────────────
@ -98,8 +99,8 @@ func envOr(key, def string) string {
// keeps the legacy PoC fields (Inject) so the existing wiring/tests still work.
type Policy struct {
// mu guards the live-reloadable map fields below. Decide/allowed/blockedByAd/
// shouldSplice take RLock; maybeReload takes Lock only when a backing file
// actually changed (the throttle + stat happen under a separate lighter lock).
// shouldSplice take RLock; the reload Apply callbacks take Lock when a backing
// file actually changed.
mu sync.RWMutex
adHost *regexp.Regexp
@ -117,11 +118,21 @@ type Policy struct {
// mtime changes so autolearn promotions / manual edits take effect WITHOUT a
// worker restart (mirrors ad_ghost._maybe_reload). The hot path (Decide)
// calls maybeReload(): a throttle check, then — at most every reloadThrottle —
// a cheap stat() of each backing file. Only a changed file is re-read and its
// map atomically swapped under mu.
reloadFiles []reloadTarget // backing files + their swap target
// the generic reload.Watcher stats each backing file and calls Apply for each
// changed file. Each Apply swaps the affected map under p.mu.
//
// Atomicity note: in the original maybeReload(), ALL changed targets were
// applied under a SINGLE p.mu.Lock(). With reload.Watcher, the Watcher's
// internal mu serialises concurrent Maybe() calls, and each Apply callback
// takes p.mu.Lock() independently. The maps are independent (no cross-map
// invariant between e.g. learned and allow), so per-map locking is safe.
// The Watcher's mu ensures no two Maybe() batches interleave: a second
// goroutine calling Maybe() while a batch is applying will block until
// the first batch completes. Parity tests confirm Decide semantics are
// identical.
watcher *reload.Watcher
fortknoxSites []string // kept for rebuilding the never-set on pure-trackers reload
reloadMu sync.Mutex // guards lastReloadCheck + the per-file mtimes
reloadMu sync.Mutex // guards lastReloadID (throttle bookkeeping)
lastReloadID int64 // unix-nano of the last throttle pass (0 = never)
reloadThrottle time.Duration // min interval between stat passes (0 in tests = eager)
@ -129,63 +140,11 @@ type Policy struct {
Inject []byte // banner / ad-CSS marker injected before </head> or </body>
}
// reloadTarget describes one backing file the engine live-reloads: its path, the
// last mtime we read, whether comment-stripping applies (loadLines vs
// loadLinesRaw), and an applier that swaps the freshly-read set into the right
// Policy field (under p.mu, held by the caller). pure-trackers re-derives the
// never-set (∪ fortknox) so it stays consistent.
type reloadTarget struct {
path string
stripComm bool
lastMtime int64
apply func(p *Policy, set map[string]bool)
}
// defaultReloadThrottle is the production stat cadence: a backing-file change
// (autolearn runs hourly; a promotion is rare) is observed within ~15s, and the
// hot path stats at most ~4×/minute regardless of request rate.
const defaultReloadThrottle = 15 * time.Second
// loadLines mirrors the comment-stripping Python loaders (splice._load_lines,
// ad_ghost._allowed's allowlist read): split on first '#', trim, lowercase,
// skip blanks. Missing/unreadable file → empty set (best-effort).
func loadLines(path string) map[string]bool {
return scanLines(path, true)
}
// loadLinesRaw mirrors ad_ghost._learned_set, which does NOT comment-strip —
// learned-trackers.txt is a machine-generated one-host-per-line file. It does
// `{ln.strip().lower() for ln in f if ln.strip()}`. Matching this exactly is
// load-bearing for parity (a '#' in this file would be kept verbatim, not a
// comment), so the Go core must mirror the divergent behaviour, not normalise it.
func loadLinesRaw(path string) map[string]bool {
return scanLines(path, false)
}
func scanLines(path string, stripComments bool) map[string]bool {
out := map[string]bool{}
f, err := os.Open(path)
if err != nil {
return out
}
defer f.Close()
sc := bufio.NewScanner(f)
sc.Buffer(make([]byte, 0, 64*1024), 1<<20)
for sc.Scan() {
ln := sc.Text()
if stripComments {
if i := strings.IndexByte(ln, '#'); i >= 0 {
ln = ln[:i]
}
}
ln = strings.ToLower(strings.TrimSpace(ln))
if ln != "" {
out[ln] = true
}
}
return out
}
// LoadPolicy loads all backing files from opts (defaults applied for empty
// fields) and compiles the ad-host regex. It never returns an error for missing
// files (best-effort, like the Python addons), only for a regex-compile bug.
@ -216,7 +175,7 @@ func LoadPolicy(opts PolicyOpts) (*Policy, error) {
}
// never-set = pure-trackers ∪ fortknox_sites (mirrors TlsSplice._refresh_sets).
never := loadLines(opts.PureTrackersPath)
never := reload.LoadLines(opts.PureTrackersPath, true)
for _, s := range opts.FortknoxSites {
if s = strings.Trim(strings.ToLower(strings.TrimSpace(s)), "."); s != "" {
never[s] = true
@ -236,10 +195,10 @@ func LoadPolicy(opts PolicyOpts) (*Policy, error) {
p := &Policy{
adHost: re,
learned: loadLinesRaw(opts.LearnedPath), // mirrors _learned_set (no comment-strip)
allow: loadLines(opts.AllowPath),
spliceSeed: loadLines(opts.SpliceSeedPath),
spliceLearn: loadLines(opts.SpliceLearnPath),
learned: reload.LoadLines(opts.LearnedPath, false), // mirrors _learned_set (no comment-strip)
allow: reload.LoadLines(opts.AllowPath, true),
spliceSeed: reload.LoadLines(opts.SpliceSeedPath, true),
spliceLearn: reload.LoadLines(opts.SpliceLearnPath, true),
never: never,
selfRegs: selfRegs,
selfDomains: selfDomains,
@ -249,54 +208,85 @@ func LoadPolicy(opts PolicyOpts) (*Policy, error) {
// ── register the live-reloadable backing files (#662 auto-learn loop) ─────
//
// Each entry re-reads its file when its mtime changes and atomically swaps
// the map under p.mu (held by maybeReload). learned-trackers + ad-allowlist
// are the load-bearing pair (autolearn promotes into learned; the operator
// edits the allowlist); the splice seed/learned + pure-trackers files are
// reloaded too for consistency (pure-trackers re-derives the never-set).
p.reloadFiles = []reloadTarget{
{path: opts.LearnedPath, stripComm: false, lastMtime: statMtime(opts.LearnedPath),
apply: func(p *Policy, s map[string]bool) { p.learned = s }},
{path: opts.AllowPath, stripComm: true, lastMtime: statMtime(opts.AllowPath),
apply: func(p *Policy, s map[string]bool) { p.allow = s }},
{path: opts.SpliceSeedPath, stripComm: true, lastMtime: statMtime(opts.SpliceSeedPath),
apply: func(p *Policy, s map[string]bool) { p.spliceSeed = s }},
{path: opts.SpliceLearnPath, stripComm: true, lastMtime: statMtime(opts.SpliceLearnPath),
apply: func(p *Policy, s map[string]bool) { p.spliceLearn = s }},
{path: opts.PureTrackersPath, stripComm: true, lastMtime: statMtime(opts.PureTrackersPath),
apply: func(p *Policy, s map[string]bool) {
// Each reload.Target re-reads its file when its mtime changes and calls Apply
// to swap the map under p.mu. The Watcher (throttle=0 here; the Policy-level
// throttle check in maybeReload() controls the rate) handles mtime tracking.
//
// learned-trackers uses stripComments=false (loadLinesRaw: machine-generated,
// one-host-per-line, a '#' is kept verbatim). All other files use
// stripComments=true (operator-editable, comment lines are ignored).
targets := []reload.Target{
{
Path: opts.LearnedPath,
LastMtime: reload.StatMtime(opts.LearnedPath),
Load: func(path string) any { return reload.LoadLines(path, false) },
Apply: func(v any) {
p.mu.Lock()
p.learned = v.(map[string]bool)
p.mu.Unlock()
},
},
{
Path: opts.AllowPath,
LastMtime: reload.StatMtime(opts.AllowPath),
Load: func(path string) any { return reload.LoadLines(path, true) },
Apply: func(v any) {
p.mu.Lock()
p.allow = v.(map[string]bool)
p.mu.Unlock()
},
},
{
Path: opts.SpliceSeedPath,
LastMtime: reload.StatMtime(opts.SpliceSeedPath),
Load: func(path string) any { return reload.LoadLines(path, true) },
Apply: func(v any) {
p.mu.Lock()
p.spliceSeed = v.(map[string]bool)
p.mu.Unlock()
},
},
{
Path: opts.SpliceLearnPath,
LastMtime: reload.StatMtime(opts.SpliceLearnPath),
Load: func(path string) any { return reload.LoadLines(path, true) },
Apply: func(v any) {
p.mu.Lock()
p.spliceLearn = v.(map[string]bool)
p.mu.Unlock()
},
},
{
Path: opts.PureTrackersPath,
LastMtime: reload.StatMtime(opts.PureTrackersPath),
Load: func(path string) any { return reload.LoadLines(path, true) },
Apply: func(v any) {
// pure-trackers ∪ fortknox → never-set (mirrors LoadPolicy above).
s := v.(map[string]bool)
for _, fk := range p.fortknoxSites {
if fk = strings.Trim(strings.ToLower(strings.TrimSpace(fk)), "."); fk != "" {
s[fk] = true
}
}
p.mu.Lock()
p.never = s
}},
p.mu.Unlock()
},
},
}
// The Watcher is created with throttle=0: the Policy-level reloadThrottle
// check in maybeReload() gates how often we call w.Maybe().
p.watcher = reload.NewWatcher(0, targets...)
return p, nil
}
// statMtime returns the file's mtime in unix-nano, or 0 when the file is missing
// or unreadable (best-effort, like the Python loaders: a missing file → empty
// set, mtime 0). A file appearing/disappearing therefore registers as a change.
func statMtime(path string) int64 {
if path == "" {
return 0
}
fi, err := os.Stat(path)
if err != nil {
return 0
}
return fi.ModTime().UnixNano()
}
// maybeReload re-reads any backing list whose on-disk mtime changed since the
// last pass, swapping the affected map(s) under p.mu. Throttled to at most one
// stat pass per p.reloadThrottle (cheap: a time compare + a few stats), so the
// Decide hot path pays almost nothing. Concurrency-safe: the throttle/mtime
// bookkeeping is under reloadMu and the map swap under mu — Decide's readers
// hold mu.RLock, so a swap is atomic w.r.t. any in-flight decision.
// Decide hot path pays almost nothing. Concurrency-safe: the throttle
// bookkeeping is under reloadMu, the watcher handles mtime tracking and calls
// Apply callbacks (each taking p.mu.Lock) — Decide's readers hold mu.RLock, so
// a swap is atomic w.r.t. any in-flight decision.
func (p *Policy) maybeReload() {
now := time.Now()
p.reloadMu.Lock()
@ -306,35 +296,9 @@ func (p *Policy) maybeReload() {
return
}
p.lastReloadID = now.UnixNano()
// Collect the files that changed (stat under reloadMu; re-read outside mu).
type pending struct {
idx int
set map[string]bool
}
var changed []pending
for i := range p.reloadFiles {
rt := &p.reloadFiles[i]
if rt.path == "" {
continue
}
m := statMtime(rt.path)
if m != rt.lastMtime {
rt.lastMtime = m
changed = append(changed, pending{idx: i, set: scanLines(rt.path, rt.stripComm)})
}
}
p.reloadMu.Unlock()
if len(changed) == 0 {
return
}
// Swap the affected maps atomically under the write lock.
p.mu.Lock()
for _, c := range changed {
p.reloadFiles[c.idx].apply(p, c.set)
}
p.mu.Unlock()
p.watcher.Maybe()
}
// ── registrable: port of ad_ghost._registrable ───────────────────────────────
@ -492,7 +456,7 @@ func (p *Policy) Decide(host, sni string) string {
// #662 — pick up autolearn promotions / manual edits without a worker
// restart. Throttled to ~every reloadThrottle and best-effort, so the hot
// path normally pays only a time compare. Done BEFORE taking the read lock
// (maybeReload may take the write lock to swap a changed map).
// (maybeReload may trigger Apply callbacks that take the write lock).
p.maybeReload()
if sni == "" {
sni = host


@ -24,6 +24,8 @@ import (
"net/http"
"strings"
"time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/relay"
)
// Stable socket paths — verbatim from the Python addons' TARGET constants
@ -65,7 +67,7 @@ func (px *Proxy) relayEmit(socketPath, route string, payload []byte) {
if !px.relayEnabled() || len(payload) == 0 {
return
}
emit(socketPath, route, payload)
relay.Emit(socketPath, route, payload)
}
// ── dpi payload ──────────────────────────────────────────────────────────────


@ -22,74 +22,7 @@
// cookie values) are NOT emitted to a module socket but POSTed to the portal
// /__toolbox/social-event ingest (the social store lives in the toolbox/portal).
//
// emit takes the full socket PATH (not an http+unix:// URL) plus the route in
// the payload's destination; callers build the path from the table above.
//
// Pure standard library — no external modules, no go.sum.
// Transport is now internal/relay. This file is retained for doc context only;
// the emit/emitSync/emitTimeout declarations have been moved to internal/relay
// as Emit/EmitSync/EmitTimeout (ref #744).
package main
import (
"context"
"fmt"
"net"
"time"
)
// emitTimeout caps the whole connect+write+read so a slow/dead module socket
// can never wedge the engine. Mirrors the Python httpx timeout=2.
const emitTimeout = 2 * time.Second
// emit fires a fire-and-forget POST of payload to the given unix socket at
// route, in a detached goroutine. It returns immediately and never blocks the
// caller; all errors (missing socket, dead peer, timeout) are swallowed —
// dropping a relayed signal must never break a client flow. Mirrors
// _common.fire_forget_post + queue_async (create_task, never raise).
//
// route is the HTTP path on the module (e.g. "/inject", "/classify"); use the
// addon→socket table above to pick socketPath + route together.
func emit(socketPath, route string, payload []byte) {
go emitSync(socketPath, route, payload)
}
// emitSync performs the actual POST synchronously (under emitTimeout). Exposed
// (lowercase, same-package) so tests can observe delivery deterministically
// without racing the goroutine. Returns an error only for the test's benefit;
// emit() discards it.
func emitSync(socketPath, route string, payload []byte) error {
if route == "" {
route = "/"
}
ctx, cancel := context.WithTimeout(context.Background(), emitTimeout)
defer cancel()
var d net.Dialer
conn, err := d.DialContext(ctx, "unix", socketPath)
if err != nil {
return err // dead/missing socket — swallowed by emit()
}
defer conn.Close()
if dl, ok := ctx.Deadline(); ok {
_ = conn.SetDeadline(dl)
}
// Minimal HTTP/1.1 POST. Host is a placeholder (unix transport); the module
// FastAPI apps ignore it. Connection: close so the peer EOFs after replying.
req := fmt.Sprintf(
"POST %s HTTP/1.1\r\nHost: secubox.local\r\nContent-Type: application/json\r\n"+
"Content-Length: %d\r\nConnection: close\r\n\r\n",
route, len(payload))
if _, err := conn.Write([]byte(req)); err != nil {
return err
}
if len(payload) > 0 {
if _, err := conn.Write(payload); err != nil {
return err
}
}
// Best-effort drain so the peer sees a clean close; we don't parse the
// response (fire-and-forget). Errors here are irrelevant.
buf := make([]byte, 512)
_, _ = conn.Read(buf)
return nil
}


@ -2,6 +2,7 @@
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// Unit tests for the sidecar emit helper (#662 Phase 4).
// Transport now delegates to internal/relay (ref #744).
package main
import (
@ -11,10 +12,12 @@ import (
"strings"
"testing"
"time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/relay"
)
// TestEmitDelivers: emitSync to a live unix socket delivers the POST request
// line, route and JSON body.
// TestEmitDelivers: relay.EmitSync to a live unix socket delivers the POST
// request line, route and JSON body.
func TestEmitDelivers(t *testing.T) {
sock := filepath.Join(t.TempDir(), "emit.sock")
ln, err := net.Listen("unix", sock)
@ -41,13 +44,13 @@ func TestEmitDelivers(t *testing.T) {
break
}
}
// Reply so emitSync's drain completes cleanly.
// Reply so EmitSync's drain completes cleanly.
c.Write([]byte("HTTP/1.1 204 No Content\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"))
got <- sb.String()
}()
if err := emitSync(sock, "/classify", []byte(`{"k":"v"}`)); err != nil {
t.Fatalf("emitSync: %v", err)
if err := relay.EmitSync(sock, "/classify", []byte(`{"k":"v"}`)); err != nil {
t.Fatalf("EmitSync: %v", err)
}
select {
@ -63,31 +66,31 @@ func TestEmitDelivers(t *testing.T) {
}
}
// TestEmitDeadSocketNoPanicNoBlock: emit() (the goroutine form) to a
// nonexistent socket must return immediately and never panic, and emitSync
// TestEmitDeadSocketNoPanicNoBlock: relay.Emit (the goroutine form) to a
// nonexistent socket must return immediately and never panic, and EmitSync
// must just return an error without blocking past the timeout.
func TestEmitDeadSocketNoPanicNoBlock(t *testing.T) {
dead := filepath.Join(t.TempDir(), "nope.sock")
// emit (async) returns instantly even though the socket is dead.
// Emit (async) returns instantly even though the socket is dead.
done := make(chan struct{})
go func() {
defer close(done)
emit(dead, "/inject", []byte(`{"x":1}`)) // must not panic/block
relay.Emit(dead, "/inject", []byte(`{"x":1}`)) // must not panic/block
}()
select {
case <-done:
case <-time.After(time.Second):
t.Fatal("emit() blocked on a dead socket")
t.Fatal("relay.Emit() blocked on a dead socket")
}
// emitSync surfaces the dial error (which emit swallows) without blocking.
// EmitSync surfaces the dial error (which Emit swallows) without blocking.
start := time.Now()
if err := emitSync(dead, "/inject", []byte(`{}`)); err == nil {
t.Error("emitSync to dead socket: expected error, got nil")
if err := relay.EmitSync(dead, "/inject", []byte(`{}`)); err == nil {
t.Error("EmitSync to dead socket: expected error, got nil")
}
if elapsed := time.Since(start); elapsed > emitTimeout+time.Second {
t.Errorf("emitSync blocked %v on dead socket", elapsed)
if elapsed := time.Since(start); elapsed > relay.EmitTimeout+time.Second {
t.Errorf("EmitSync blocked %v on dead socket", elapsed)
}
}
@ -111,8 +114,8 @@ func TestEmitEmptyRouteDefaults(t *testing.T) {
c.Write([]byte("HTTP/1.1 204 No Content\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"))
got <- string(buf[:n])
}()
if err := emitSync(sock, "", nil); err != nil {
t.Fatalf("emitSync: %v", err)
if err := relay.EmitSync(sock, "", nil); err != nil {
t.Fatalf("EmitSync: %v", err)
}
select {
case raw := <-got:


@ -0,0 +1,89 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: ban — sliding-window graduated ban state
//
// Mirrors the Python BAN_THRESHOLD=3 / BAN_WINDOW=300s semantics from
// packages/secubox-mitmproxy/addons/secubox_waf.py.
//
// Design notes:
// - Window: 300 s (default, matches Python BAN_WINDOW)
// - Threshold: 3 hits within the window triggers a ban (matches BAN_THRESHOLD)
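// - Map cap: 100 000 unique IPs. Once reached, new IPs are silently dropped
// (not recorded, not banned). This bounds the number of tracked IPs under a
// flood: at 8 bytes per int64 timestamp × ~10 hits × 100k IPs ≈ 8 MB of
// timestamps (map and key overhead excluded), well below any realistic RAM
// budget. The per-IP slice itself is not capped and entries are never
// deleted, so this is an estimate rather than a hard memory bound. The cap
// is a compile-time constant (banMapCap); NewBan could accept it as a
// parameter if tuning is needed in the future.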
// - Pruning: Per-call, only for the affected IP. No background goroutine;
// avoids timer complexity for Task 3.1 scope.
// - Concurrency: single sync.Mutex guards the whole map. A sharded approach
// can be added later if contention shows up in profiling.
package main
import (
"sync"
"time"
)
const banMapCap = 100_000
// Ban holds the sliding-window threat state for all client IPs.
type Ban struct {
mu sync.Mutex
window int64 // window size in seconds
threshold int
hits map[string][]int64 // IP → slice of Unix timestamps of threat hits
}
// NewBan creates a new Ban tracker.
//
// window — size of the sliding time window (e.g. 300*time.Second)
// threshold — number of hits within the window that triggers a ban
func NewBan(window time.Duration, threshold int) *Ban {
return &Ban{
window: int64(window.Seconds()),
threshold: threshold,
hits: make(map[string][]int64),
}
}
// Record records one threat hit for ip at time nowUnix (Unix seconds).
// It prunes hits older than nowUnix-window BEFORE counting, then appends.
// Returns:
//
// count — number of hits within the window after this one (≥ 1)
// banned — true when count >= threshold
//
// New IPs are silently ignored (not recorded) once the map reaches banMapCap
// to bound memory under a scan flood. In that case count=0, banned=false.
func (b *Ban) Record(ip string, nowUnix int64) (count int, banned bool) {
b.mu.Lock()
defer b.mu.Unlock()
cutoff := nowUnix - b.window
ts, exists := b.hits[ip]
if !exists {
// Guard: enforce map cap against IP-flood amplification.
if len(b.hits) >= banMapCap {
return 0, false
}
}
// Prune timestamps outside the window.
pruned := ts[:0]
for _, t := range ts {
if t > cutoff {
pruned = append(pruned, t)
}
}
// Append this hit.
pruned = append(pruned, nowUnix)
b.hits[ip] = pruned
count = len(pruned)
banned = count >= b.threshold
return count, banned
}
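
The design notes above say pruning happens only for the IP being recorded, so entries for IPs that go quiet stay in the map and keep counting against the cap. Below is a minimal sketch of how a periodic sweep could reclaim them. It assumes a simplified, single-goroutine `window` type with no mutex; `window`, `record`, and `sweep` are hypothetical names for illustration and are not part of sbxwaf.

```go
package main

import "fmt"

// window is a hypothetical, single-goroutine stand-in for the Ban type.
type window struct {
	size      int64              // window length in seconds
	threshold int                // hits within the window that trigger a ban
	hits      map[string][]int64 // IP -> Unix timestamps of recent hits
}

func newWindow(size int64, threshold int) *window {
	return &window{size: size, threshold: threshold, hits: make(map[string][]int64)}
}

// record prunes expired hits for ip, appends now, and reports the new count
// and whether the threshold has been reached.
func (w *window) record(ip string, now int64) (int, bool) {
	cutoff := now - w.size
	kept := w.hits[ip][:0] // reuse the backing array; a nil slice is fine here
	for _, t := range w.hits[ip] {
		if t > cutoff {
			kept = append(kept, t)
		}
	}
	kept = append(kept, now)
	w.hits[ip] = kept
	return len(kept), len(kept) >= w.threshold
}

// sweep deletes every IP whose newest hit has left the window and returns the
// number of entries removed. Assumes timestamps were appended in
// non-decreasing order, so the last element is the newest.
func (w *window) sweep(now int64) int {
	cutoff := now - w.size
	removed := 0
	for ip, ts := range w.hits {
		if len(ts) == 0 || ts[len(ts)-1] <= cutoff {
			delete(w.hits, ip) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}

func main() {
	w := newWindow(300, 3)
	w.record("1.2.3.4", 0)
	w.record("1.2.3.4", 10)
	count, banned := w.record("1.2.3.4", 20)
	fmt.Println(count, banned) // 3 true

	w.record("5.6.7.8", 350)
	fmt.Println(w.sweep(400), len(w.hits)) // 1 1
}
```

In a concurrent setting the sweep would have to run under the same lock as the record path, which is one reason the original design may have deferred it.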


@ -0,0 +1,75 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: ban_test — sliding-window ban state machine tests
package main
import (
"testing"
"time"
)
// TestBanGraduated verifies that the ban threshold is reached on the 3rd hit
// within the window (default: window=300s, threshold=3).
func TestBanGraduated(t *testing.T) {
b := NewBan(300*time.Second, 3)
ip := "1.2.3.4"
count, banned := b.Record(ip, 0)
if count != 1 || banned {
t.Fatalf("after 1st hit: want (1,false), got (%d,%v)", count, banned)
}
count, banned = b.Record(ip, 0)
if count != 2 || banned {
t.Fatalf("after 2nd hit: want (2,false), got (%d,%v)", count, banned)
}
count, banned = b.Record(ip, 0)
if count != 3 || !banned {
t.Fatalf("after 3rd hit: want (3,true), got (%d,%v)", count, banned)
}
}
// TestBanWindowExpiry verifies that hits older than the window are pruned so
// that a previously-banned IP resets its count after the window expires.
func TestBanWindowExpiry(t *testing.T) {
b := NewBan(300*time.Second, 3)
ip := "1.2.3.4"
// Hit 3 times at t=0 → banned.
b.Record(ip, 0)
b.Record(ip, 0)
count, banned := b.Record(ip, 0)
if count != 3 || !banned {
t.Fatalf("pre-condition: want (3,true) at t=0, got (%d,%v)", count, banned)
}
// At t=400 (> 300s window) all prior hits are pruned; new hit → count=1, not banned.
count, banned = b.Record(ip, 400)
if count != 1 || banned {
t.Fatalf("after window expiry at t=400: want (1,false), got (%d,%v)", count, banned)
}
}
// TestBanPerIPIsolation verifies that hits on one IP do not bleed into another.
func TestBanPerIPIsolation(t *testing.T) {
b := NewBan(300*time.Second, 3)
ipA := "1.2.3.4"
ipB := "5.6.7.8"
// Three hits on A → banned.
b.Record(ipA, 0)
b.Record(ipA, 0)
_, bannedA := b.Record(ipA, 0)
if !bannedA {
t.Fatal("ipA should be banned after 3 hits")
}
// B has had zero hits → count=1, not banned after its first hit.
countB, bannedB := b.Record(ipB, 0)
if countB != 1 || bannedB {
t.Fatalf("ipB isolation: want (1,false), got (%d,%v)", countB, bannedB)
}
}


@ -0,0 +1,277 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: cookieaudit — RGPD Set-Cookie ledger
//
// Task 5.1: For every Set-Cookie header in an upstream response, append one
// JSONL record to a ledger file. Cookie values are SHA256-hashed in-process —
// the raw value NEVER leaves this component.
//
// Port from packages/secubox-mitmproxy/addons/cookie_audit.py (parse_set_cookie
// + CookieAudit._append). Go's http.Response.Cookies() exposes SameSite only as
// the http.SameSite enum, not as the raw attribute value, so we parse the raw
// "Set-Cookie" header strings directly (same approach as the Python
// parse_set_cookie function).
//
// Architecture:
// - A buffered channel (size cookieAuditChanSize) decouples Record callers
// from disk I/O. Record is non-blocking: when the channel is full the
// record is dropped (dropCount incremented) rather than blocking the HTTP
// response path.
// - A single writer goroutine drains the channel and appends to the ledger
// (O_WRONLY|O_CREATE|O_APPEND, 0640). The file is opened once at
// construction and held open for the lifetime of the CookieAudit to avoid
// per-record open/close overhead.
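// - Close() closes the channel, waits for the writer to drain the remaining
// records and exit, then closes the file. Safe to call multiple times via
// sync.Once. Record must not be called after Close (a send on a closed
// channel panics).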
//
// Ledger path default: /var/log/secubox/cookie-audit/server.jsonl
// Configurable via --cookie-audit-log flag in main().
//
// JSON record fields (mirrors Python cookie_audit.py record):
//
// ts — RFC 3339 UTC timestamp
// vhost — bare hostname from the request (Host header)
// url_path — request URL path
// method — HTTP method
// status — response status code (int)
// name — cookie name
// value_hash — lowercase hex SHA-256 digest of the raw cookie value
// domain — cookie Domain attribute (leading '.' stripped, null if absent)
// path — cookie Path attribute (null if absent)
// secure — bool
// httponly — bool
// samesite — SameSite attribute value (null if absent)
package main
import (
"crypto/sha256"
"encoding/json"
"fmt"
"net/http"
"os"
"strings"
"sync"
"sync/atomic"
"time"
)
// cookieAuditChanSize is the depth of the async record channel.
// At 256 entries the buffer absorbs short bursts without blocking; records
// beyond this are dropped (counted but never block the response path).
const cookieAuditChanSize = 256
// DefaultCookieAuditLog is the production ledger path, matching the Python
// addon's DEFAULT_LEDGER constant.
const DefaultCookieAuditLog = "/var/log/secubox/cookie-audit/server.jsonl"
// cookieRecord is the JSON shape written to the ledger.
// Fields mirror the Python parse_set_cookie + response hook dict.
type cookieRecord struct {
TS string `json:"ts"`
Vhost string `json:"vhost"`
URLPath string `json:"url_path"`
Method string `json:"method"`
Status int `json:"status"`
Name string `json:"name"`
ValueHash string `json:"value_hash"`
Domain *string `json:"domain"` // null when absent
Path *string `json:"path"` // null when absent
Secure bool `json:"secure"`
HTTPOnly bool `json:"httponly"`
SameSite *string `json:"samesite"` // null when absent
}
// CookieAudit appends one JSONL record per Set-Cookie header to a ledger.
// Goroutine-safe. Record is non-blocking (drop-on-full channel policy).
type CookieAudit struct {
ch chan cookieRecord
file *os.File
wg sync.WaitGroup
closeOnce sync.Once
dropCount atomic.Int64 // records dropped on a full channel; atomic because Record runs concurrently
}
// NewCookieAudit creates a CookieAudit that writes to path.
// The parent directory is created (0755) if it does not exist. The ledger file
// is opened with O_APPEND|O_CREATE. Panics if the directory cannot be created
// or the file cannot be opened — startup time, not the request path.
func NewCookieAudit(path string) *CookieAudit {
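// Derive the parent directory: "." for a bare file name, "/" for a file
// directly under the root. (Falling back to path itself would create a
// directory with the ledger's name and make the OpenFile below fail.)
dir := "."
if idx := strings.LastIndex(path, "/"); idx > 0 {
dir = path[:idx]
} else if idx == 0 {
dir = "/"
}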
if err := os.MkdirAll(dir, 0755); err != nil {
// Fatal at startup — the operator must fix the path.
panic(fmt.Sprintf("cookieaudit: mkdir %s: %v", dir, err))
}
f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0640)
if err != nil {
panic(fmt.Sprintf("cookieaudit: open %s: %v", path, err))
}
ca := &CookieAudit{
ch: make(chan cookieRecord, cookieAuditChanSize),
file: f,
}
ca.wg.Add(1)
go ca.writer()
return ca
}
// writer drains the channel and appends JSONL records to the ledger.
// Runs as a single goroutine for the lifetime of the CookieAudit.
func (ca *CookieAudit) writer() {
defer ca.wg.Done()
for rec := range ca.ch {
data, err := json.Marshal(rec)
if err != nil {
// Unreachable in practice: cookieRecord holds only strings, ints and bools.
fmt.Fprintf(os.Stderr, "cookieaudit: marshal failed: %v\n", err)
continue
}
data = append(data, '\n')
if _, err := ca.file.Write(data); err != nil {
fmt.Fprintf(os.Stderr, "cookieaudit: write failed: %v\n", err)
}
}
}
// Close closes the channel, waits for the writer goroutine to flush whatever
// is still queued, and closes the underlying file. Safe to call multiple
// times. Calling Record after Close panics (send on a closed channel).
func (ca *CookieAudit) Close() {
ca.closeOnce.Do(func() {
close(ca.ch)
ca.wg.Wait()
_ = ca.file.Close()
})
}
// Record enumerates the Set-Cookie headers in resp, builds one cookieRecord per
// cookie, SHA256-hashes the value, and sends to the async channel.
// NON-BLOCKING: if the channel is full, the record is dropped (never blocks
// the HTTP response path).
func (ca *CookieAudit) Record(host string, req *http.Request, resp *http.Response) {
if ca == nil || resp == nil {
return
}
rawCookies := resp.Header["Set-Cookie"]
if len(rawCookies) == 0 {
return
}
// Collect context fields once per call.
ts := time.Now().UTC().Format(time.RFC3339)
method := ""
urlPath := ""
status := resp.StatusCode
if req != nil {
method = req.Method
if req.URL != nil {
urlPath = req.URL.Path
}
}
for _, raw := range rawCookies {
rec, ok := parseSetCookieRaw(raw)
if !ok {
continue
}
rec.TS = ts
rec.Vhost = host
rec.URLPath = urlPath
rec.Method = method
rec.Status = status
// Non-blocking send: drop if the channel is full.
select {
case ca.ch <- rec:
default:
ca.dropCount.Add(1)
}
}
}
// parseSetCookieRaw parses a raw Set-Cookie header string into a cookieRecord
// (with only the cookie-level fields populated; context fields are set by
// Record). Returns ok=false if the header is malformed (no name=value pair).
//
// We parse the raw string directly rather than using http.Response.Cookies()
// because net/http reports SameSite as the http.SameSite enum rather than the
// raw attribute value. The parsing logic mirrors Python's parse_set_cookie
// function in cookie_audit.py.
func parseSetCookieRaw(raw string) (cookieRecord, bool) {
if raw == "" {
return cookieRecord{}, false
}
// Split on ';': first token is name=value, the rest are attributes.
parts := strings.Split(raw, ";")
if len(parts) == 0 {
return cookieRecord{}, false
}
// name=value (first token).
nameVal := strings.TrimSpace(parts[0])
eqIdx := strings.IndexByte(nameVal, '=')
if eqIdx < 0 {
// No '=' in the first token — malformed cookie.
return cookieRecord{}, false
}
name := strings.TrimSpace(nameVal[:eqIdx])
if name == "" {
return cookieRecord{}, false
}
rawValue := strings.TrimSpace(nameVal[eqIdx+1:])
// SHA256 the raw value — never store it.
sum := sha256.Sum256([]byte(rawValue))
valueHash := fmt.Sprintf("%x", sum)
rec := cookieRecord{
Name: name,
ValueHash: valueHash,
Secure: false,
HTTPOnly: false,
}
// Parse attributes.
for _, attr := range parts[1:] {
attr = strings.TrimSpace(attr)
if attr == "" {
continue
}
k, v, _ := strings.Cut(attr, "=")
k = strings.TrimSpace(strings.ToLower(k))
v = strings.TrimSpace(v)
switch k {
case "domain":
d := strings.TrimLeft(v, ".")
if d == "" {
// Empty after stripping dot → treat as absent (null).
break
}
rec.Domain = &d
case "path":
if v != "" {
rec.Path = &v
}
case "secure":
rec.Secure = true
case "httponly":
rec.HTTPOnly = true
case "samesite":
if v != "" {
rec.SameSite = &v
}
// expires, max-age, and other attributes are intentionally ignored
// (not RGPD-relevant per the Python addon's design decision).
}
}
return rec, true
}
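
For comparison with the raw-header parsing above, the sketch below shows what net/http itself reports for SameSite: the attribute is parsed, but into the `http.SameSite` enum rather than the original string. `sameSiteName` is a hypothetical helper written for this illustration and is not part of sbxwaf; it assumes a reasonably recent Go release.

```go
package main

import (
	"fmt"
	"net/http"
)

// sameSiteName parses one Set-Cookie header value with net/http and maps the
// resulting http.SameSite enum to a label. Hypothetical helper for illustration.
func sameSiteName(setCookie string) string {
	resp := &http.Response{Header: http.Header{"Set-Cookie": {setCookie}}}
	cookies := resp.Cookies()
	if len(cookies) != 1 {
		return "unparsed"
	}
	switch cookies[0].SameSite {
	case http.SameSiteLaxMode:
		return "lax"
	case http.SameSiteStrictMode:
		return "strict"
	case http.SameSiteNoneMode:
		return "none"
	default:
		// Covers both the zero value (attribute absent) and SameSiteDefaultMode.
		return "unset-or-default"
	}
}

func main() {
	fmt.Println(sameSiteName("session=abc; Path=/; SameSite=Lax")) // lax
	fmt.Println(sameSiteName("session=abc; Path=/"))               // unset-or-default
}
```

Because the enum collapses the attribute to a fixed set of modes, a ledger that wants the header's literal value still needs the raw parse used in parseSetCookieRaw.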


@ -0,0 +1,233 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: cookieaudit_test — TDD for Task 5.1
//
// Tests:
// - TestCookieAuditHashesValue: single Set-Cookie → one JSONL record, value
// SHA256-hashed (never raw), domain dot-stripped, attributes correct.
// - TestCookieAuditMultipleCookies: two Set-Cookie headers → two JSONL lines.
// - TestCookieAuditNonBlocking: Record returns promptly even when the writer
// is paused (channel-full drop policy — never blocks the response path).
package main
import (
"bufio"
"bytes"
"crypto/sha256"
"encoding/json"
"fmt"
"io"
"net/http"
"os"
"path/filepath"
"strings"
"testing"
"time"
)
// makeFakeResponse builds a minimal *http.Response carrying the given
// Set-Cookie header values. The request is a simple GET to targetURL.
func makeFakeResponse(targetURL string, setCookies []string) (*http.Response, *http.Request) {
req, _ := http.NewRequest(http.MethodGet, targetURL, nil)
hdr := http.Header{}
for _, sc := range setCookies {
hdr.Add("Set-Cookie", sc)
}
resp := &http.Response{
StatusCode: 200,
Header: hdr,
Body: io.NopCloser(bytes.NewReader(nil)),
Request: req,
}
return resp, req
}
// TestCookieAuditHashesValue verifies that:
// - The ledger receives exactly one record for a single Set-Cookie.
// - The raw cookie value ("secretvalue") is NEVER written to the file.
// - value_hash == sha256("secretvalue").
// - domain has the leading dot stripped.
// - secure, httponly are true; samesite is "Lax".
func TestCookieAuditHashesValue(t *testing.T) {
dir := t.TempDir()
ledger := filepath.Join(dir, "cookie-audit", "server.jsonl")
ca := NewCookieAudit(ledger)
defer ca.Close()
resp, req := makeFakeResponse(
"https://example.com/login",
[]string{"session=secretvalue; Domain=.example.com; Path=/; Secure; HttpOnly; SameSite=Lax"},
)
ca.Record(req.Host, req, resp)
// Wait for the async writer goroutine to flush.
ca.Close()
data, err := os.ReadFile(ledger)
if err != nil {
t.Fatalf("read ledger: %v", err)
}
lines := splitNonEmptyLines(string(data))
if len(lines) != 1 {
t.Fatalf("expected 1 JSONL record, got %d:\n%s", len(lines), string(data))
}
var rec map[string]interface{}
if err := json.Unmarshal([]byte(lines[0]), &rec); err != nil {
t.Fatalf("line not valid JSON: %v\nline: %q", err, lines[0])
}
// name
if rec["name"] != "session" {
t.Errorf("name: want %q got %v", "session", rec["name"])
}
// value_hash
wantHash := fmt.Sprintf("%x", sha256.Sum256([]byte("secretvalue")))
if rec["value_hash"] != wantHash {
t.Errorf("value_hash: want %q got %v", wantHash, rec["value_hash"])
}
// raw value must NOT appear anywhere in the file
if strings.Contains(string(data), "secretvalue") {
t.Errorf("raw cookie value 'secretvalue' must not appear in the ledger")
}
// domain: leading dot stripped
if rec["domain"] != "example.com" {
t.Errorf("domain: want %q got %v", "example.com", rec["domain"])
}
// path
if rec["path"] != "/" {
t.Errorf("path: want %q got %v", "/", rec["path"])
}
// secure
if rec["secure"] != true {
t.Errorf("secure: want true got %v", rec["secure"])
}
// httponly
if rec["httponly"] != true {
t.Errorf("httponly: want true got %v", rec["httponly"])
}
// samesite
if rec["samesite"] != "Lax" {
t.Errorf("samesite: want %q got %v", "Lax", rec["samesite"])
}
// ts must be a non-empty string
ts, _ := rec["ts"].(string)
if ts == "" {
t.Errorf("ts must be a non-empty RFC3339 timestamp")
}
}
// TestCookieAuditMultipleCookies verifies that two Set-Cookie headers produce
// two independent JSONL records.
func TestCookieAuditMultipleCookies(t *testing.T) {
dir := t.TempDir()
ledger := filepath.Join(dir, "cookie-audit", "server.jsonl")
ca := NewCookieAudit(ledger)
resp, req := makeFakeResponse(
"https://shop.example.com/cart",
[]string{
"cart=abc123; Path=/; HttpOnly",
"tracker=xyz789; Domain=.example.com; Path=/; Secure; SameSite=None",
},
)
ca.Record(req.Host, req, resp)
// Flush via Close.
ca.Close()
data, err := os.ReadFile(ledger)
if err != nil {
t.Fatalf("read ledger: %v", err)
}
lines := splitNonEmptyLines(string(data))
if len(lines) != 2 {
t.Fatalf("expected 2 JSONL records (one per Set-Cookie), got %d:\n%s", len(lines), string(data))
}
// Both lines must be valid JSON with a name field.
names := map[string]bool{}
for i, line := range lines {
var rec map[string]interface{}
if err := json.Unmarshal([]byte(line), &rec); err != nil {
t.Fatalf("line %d not valid JSON: %v", i+1, err)
}
n, _ := rec["name"].(string)
if n == "" {
t.Errorf("line %d: name must not be empty", i+1)
}
names[n] = true
}
if !names["cart"] {
t.Errorf("expected a record with name=cart")
}
if !names["tracker"] {
t.Errorf("expected a record with name=tracker")
}
}
// TestCookieAuditNonBlocking verifies that Record returns promptly even when
// the internal channel is full (i.e. the writer goroutine is not draining).
// Strategy: use the standard constructor (channel size 256) and call Record
// more times than the channel capacity without closing it. The loop must
// finish within a short deadline, never blocking the response path.
func TestCookieAuditNonBlocking(t *testing.T) {
dir := t.TempDir()
ledger := filepath.Join(dir, "cookie-audit", "server.jsonl")
// Use the standard constructor (channel size 256). We call Record 512 times
// without any drain delay — the first 256 fill the channel; subsequent sends
// must be dropped non-blockingly. The goroutine will drain concurrently, but
// the test verifies that no single Record call hangs.
ca := NewCookieAudit(ledger)
resp, req := makeFakeResponse(
"https://example.com/",
[]string{"tok=value; Path=/"},
)
start := time.Now()
for i := 0; i < 512; i++ {
ca.Record(req.Host, req, resp)
}
elapsed := time.Since(start)
ca.Close()
// All 512 Record calls must complete in well under 1 second.
// (A blocking send would hang indefinitely; even a 100ms sleep per drop
// would blow this budget.)
if elapsed > 1*time.Second {
t.Errorf("Record loop took %v — looks like it blocked (want < 1s)", elapsed)
}
}
// splitNonEmptyLines splits s by newlines, returning only non-empty lines.
// Same logic as splitNonEmpty in threatlog_test.go; it is named differently
// to avoid a name collision with that helper inside the same package.
func splitNonEmptyLines(s string) []string {
sc := bufio.NewScanner(bytes.NewBufferString(s))
var out []string
for sc.Scan() {
if line := sc.Text(); line != "" {
out = append(out, line)
}
}
return out
}
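
The drop-on-full policy that TestCookieAuditNonBlocking exercises can be shown in isolation. This is a minimal sketch with no consumer goroutine, so the outcome is deterministic; `dropQueue`, `newDropQueue`, and `offer` are hypothetical names and not part of sbxwaf.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// dropQueue is a hypothetical bounded queue that never blocks the producer.
type dropQueue struct {
	ch      chan string
	dropped atomic.Int64 // number of records rejected because the buffer was full
}

func newDropQueue(size int) *dropQueue {
	return &dropQueue{ch: make(chan string, size)}
}

// offer enqueues rec if the buffer has room and reports whether it was
// accepted. When the buffer is full the record is counted and discarded.
func (q *dropQueue) offer(rec string) bool {
	select {
	case q.ch <- rec:
		return true
	default:
		q.dropped.Add(1)
		return false
	}
}

func main() {
	q := newDropQueue(2)
	for i := 0; i < 5; i++ {
		q.offer(fmt.Sprintf("rec-%d", i))
	}
	// Nothing drains the channel here, so two records fit and three are dropped.
	fmt.Println(len(q.ch), q.dropped.Load()) // 2 3
}
```

The trade-off is that the ledger can lose records under sustained load, which is why exposing the drop counter somewhere an operator can see it is worth considering.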


@ -0,0 +1,190 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: crowdsec — CrowdSec LAPI alert bridge
//
// Task 4.1: implements CrowdSecClient, which satisfies the CrowdSecReporter
// interface declared in main.go. On a ban event the handler calls
// crowdsec.Report(ip, cat, sev) in a goroutine; this client builds the LAPI
// alert JSON (ported faithfully from secubox_waf.py _ban_via_crowdsec) and
// POSTs it to {lapiURL}/v1/alerts with a 2 s timeout.
//
// Best-effort: network errors are logged and swallowed — the WAF never blocks
// on LAPI availability. SSRF hygiene: redirect-following is disabled.
package main
import (
"bytes"
"encoding/json"
"fmt"
"log"
"net/http"
"strings"
"time"
)
// CrowdSecClient implements CrowdSecReporter by POSTing alert objects to the
// CrowdSec LAPI /v1/alerts endpoint.
type CrowdSecClient struct {
lapiURL string
jwt string
duration string
client *http.Client
}
// NewCrowdSecClient builds a CrowdSecClient with a 2 s timeout and no redirect
// following (SSRF hygiene).
//
// - lapiURL: base URL of the CrowdSec LAPI, e.g. "http://10.100.0.1:8080"
// - jwt: Bearer token (read from --crowdsec-jwt-file by main())
// - duration: ban duration string forwarded in the decision, e.g. "4h"
func NewCrowdSecClient(lapiURL, jwt, duration string) *CrowdSecClient {
return &CrowdSecClient{
lapiURL: strings.TrimRight(lapiURL, "/"),
jwt: jwt,
duration: duration,
client: &http.Client{
Timeout: 2 * time.Second,
// Disable redirect following — prevents SSRF via 3xx to internal hosts.
CheckRedirect: func(req *http.Request, via []*http.Request) error {
return http.ErrUseLastResponse
},
},
}
}
// Report satisfies CrowdSecReporter. It builds the LAPI alert payload and
// POSTs it. Errors are logged only (best-effort, never panics).
// The caller already wraps this in a goroutine (see main.go ban branch).
func (c *CrowdSecClient) Report(ip, cat, sev string) {
if err := c.postAlert(ip, cat, sev); err != nil {
log.Printf("sbxwaf: crowdsec bridge error for %s (%s/%s): %v", ip, cat, sev, err)
}
}
// csAlertSource mirrors the source object expected by the CrowdSec LAPI.
type csAlertSource struct {
Scope string `json:"scope"`
Value string `json:"value"`
IP string `json:"ip"`
AsNumber string `json:"as_number"`
AsName string `json:"as_name"`
Cn string `json:"cn"`
Latitude float64 `json:"latitude"`
Longitude float64 `json:"longitude"`
}
// csDecision mirrors the decision object inside the LAPI alert.
type csDecision struct {
Duration string `json:"duration"`
Scenario string `json:"scenario"`
Type string `json:"type"`
Value string `json:"value"`
Scope string `json:"scope"`
Origin string `json:"origin"`
Simulated bool `json:"simulated"`
}
// csEventMeta is one key/value pair inside an event's meta list.
type csEventMeta struct {
Key string `json:"key"`
Value string `json:"value"`
}
// csEvent is a single event in the events array.
type csEvent struct {
Timestamp string `json:"timestamp"`
Meta []csEventMeta `json:"meta"`
}
// csAlert is the full alert object (one element of the POST body array).
type csAlert struct {
Scenario string `json:"scenario"`
ScenarioHash string `json:"scenario_hash"`
ScenarioVersion string `json:"scenario_version"`
Message string `json:"message"`
EventsCount int `json:"events_count"`
StartAt string `json:"start_at"`
StopAt string `json:"stop_at"`
Capacity int `json:"capacity"`
Leakspeed string `json:"leakspeed"`
Simulated bool `json:"simulated"`
Source csAlertSource `json:"source"`
Decisions []csDecision `json:"decisions"`
Events []csEvent `json:"events"`
}
// postAlert builds and POSTs the alert; returns an error for logging.
func (c *CrowdSecClient) postAlert(ip, cat, sev string) error {
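// Python uses "%Y-%m-%dT%H:%M:%S.000000Z", i.e. a literal ".000000Z" suffix.
// In a Go layout ".000000" is a fractional-seconds verb that would emit real
// microseconds, so format to whole seconds and append the literal suffix to
// stay compatible with consumers that expect it.
nowISO := time.Now().UTC().Format("2006-01-02T15:04:05") + ".000000Z"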
scenario := fmt.Sprintf("secubox-waf/%s", cat)
alert := csAlert{
Scenario: scenario,
ScenarioHash: "",
ScenarioVersion: "1",
Message: fmt.Sprintf("WAF threshold crossed for %s (%s)", ip, cat),
EventsCount: 1,
StartAt: nowISO,
StopAt: nowISO,
Capacity: 0,
Leakspeed: "0s",
Simulated: false,
Source: csAlertSource{
Scope: "Ip",
Value: ip,
IP: ip,
AsNumber: "0",
AsName: "?",
Cn: "?",
Latitude: 0.0,
Longitude: 0.0,
},
Decisions: []csDecision{{
Duration: c.duration,
Scenario: scenario,
Type: "ban",
Value: ip,
Scope: "Ip",
Origin: "secubox-waf",
Simulated: false,
}},
Events: []csEvent{{
Timestamp: nowISO,
Meta: []csEventMeta{
{Key: "source_ip", Value: ip},
{Key: "scenario", Value: cat},
},
}},
}
body, err := json.Marshal([]csAlert{alert})
if err != nil {
return fmt.Errorf("marshal alert: %w", err)
}
endpoint := c.lapiURL + "/v1/alerts"
req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
if err != nil {
return fmt.Errorf("build request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+c.jwt)
req.Header.Set("Content-Type", "application/json")
resp, err := c.client.Do(req)
if err != nil {
return fmt.Errorf("POST %s: %w", endpoint, err)
}
defer resp.Body.Close()
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return fmt.Errorf("LAPI returned %d for %s (%s)", resp.StatusCode, ip, cat)
}
log.Printf("sbxwaf: crowdsec bridge BAN %s ← %s (sev=%s, dur=%s)",
ip, cat, sev, c.duration)
return nil
}
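
NewCrowdSecClient disables redirect following through CheckRedirect. Here is a minimal sketch of that behaviour against a local httptest server; `newNoRedirectClient`, `newRedirectServer`, and `statusOf` are hypothetical helpers for this illustration, and the `/start` and `/internal` paths are made up.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newNoRedirectClient returns a client that hands back the 3xx response
// itself instead of following the Location header.
func newNoRedirectClient() *http.Client {
	return &http.Client{
		Timeout: 2 * time.Second,
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// newRedirectServer starts a test server whose /start path redirects to
// /internal; every other path answers 200.
func newRedirectServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/start" {
			http.Redirect(w, r, "/internal", http.StatusFound)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

// statusOf issues a GET and returns the status code the client hands back.
func statusOf(c *http.Client, url string) (int, error) {
	resp, err := c.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	srv := newRedirectServer()
	defer srv.Close()

	code, err := statusOf(newNoRedirectClient(), srv.URL+"/start")
	fmt.Println(code, err) // 302 <nil>

	// The default client follows the redirect and ends up at /internal.
	code, err = statusOf(http.DefaultClient, srv.URL+"/start")
	fmt.Println(code, err) // 200 <nil>
}
```

With this setting a 3xx from the LAPI surfaces as a non-2xx status, which postAlert already reports as an error.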


@ -0,0 +1,140 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: crowdsec_test — CrowdSec LAPI bridge tests
package main
import (
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
// TestCrowdSecAlertPayload verifies that Report POSTs to /v1/alerts with the
// correct Authorization header and a well-formed alert JSON array.
func TestCrowdSecAlertPayload(t *testing.T) {
type capturedReq struct {
method string
path string
auth string
body []byte
}
var captured capturedReq
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
captured.method = r.Method
captured.path = r.URL.Path
captured.auth = r.Header.Get("Authorization")
b, _ := io.ReadAll(r.Body)
captured.body = b
w.WriteHeader(http.StatusCreated)
}))
defer srv.Close()
c := NewCrowdSecClient(srv.URL, "testjwt", "4h")
c.Report("1.2.3.4", "sqli", "high")
// Report is synchronous inside this test (no goroutine wrapper here).
// Give a tiny window just in case the httptest server needs to flush.
time.Sleep(20 * time.Millisecond)
// Method and path.
if captured.method != http.MethodPost {
t.Errorf("method: want POST, got %s", captured.method)
}
if captured.path != "/v1/alerts" {
t.Errorf("path: want /v1/alerts, got %s", captured.path)
}
// Authorization header.
if captured.auth != "Bearer testjwt" {
t.Errorf("Authorization: want 'Bearer testjwt', got %q", captured.auth)
}
// Parse the JSON body.
var alerts []map[string]interface{}
if err := json.Unmarshal(captured.body, &alerts); err != nil {
t.Fatalf("body is not valid JSON: %v\nbody: %s", err, captured.body)
}
if len(alerts) != 1 {
t.Fatalf("want 1 alert in array, got %d", len(alerts))
}
a := alerts[0]
// Scenario.
if got, _ := a["scenario"].(string); got != "secubox-waf/sqli" {
t.Errorf("scenario: want 'secubox-waf/sqli', got %q", got)
}
// Source.
src, ok := a["source"].(map[string]interface{})
if !ok {
t.Fatalf("source field missing or wrong type")
}
if v, _ := src["value"].(string); v != "1.2.3.4" {
t.Errorf("source.value: want '1.2.3.4', got %q", v)
}
if v, _ := src["ip"].(string); v != "1.2.3.4" {
t.Errorf("source.ip: want '1.2.3.4', got %q", v)
}
if v, _ := src["scope"].(string); v != "Ip" {
t.Errorf("source.scope: want 'Ip', got %q", v)
}
// Decisions.
decisionsRaw, ok := a["decisions"].([]interface{})
if !ok || len(decisionsRaw) != 1 {
t.Fatalf("decisions: want array of 1, got %v", a["decisions"])
}
d, _ := decisionsRaw[0].(map[string]interface{})
if v, _ := d["type"].(string); v != "ban" {
t.Errorf("decisions[0].type: want 'ban', got %q", v)
}
if v, _ := d["value"].(string); v != "1.2.3.4" {
t.Errorf("decisions[0].value: want '1.2.3.4', got %q", v)
}
if v, _ := d["duration"].(string); v != "4h" {
t.Errorf("decisions[0].duration: want '4h', got %q", v)
}
if v, _ := d["scope"].(string); v != "Ip" {
t.Errorf("decisions[0].scope: want 'Ip', got %q", v)
}
if v, _ := d["origin"].(string); v != "secubox-waf" {
t.Errorf("decisions[0].origin: want 'secubox-waf', got %q", v)
}
// Timestamps: assert fields exist and parse as RFC3339.
for _, field := range []string{"start_at", "stop_at"} {
v, _ := a[field].(string)
if v == "" {
t.Errorf("%s: field missing or empty", field)
continue
}
if _, err := time.Parse(time.RFC3339, strings.TrimSuffix(v, ".000000Z")); err != nil {
// The Python uses ".000000Z" suffix; try parsing with that pattern too.
if _, err2 := time.Parse("2006-01-02T15:04:05.000000Z", v); err2 != nil {
t.Errorf("%s: %q does not parse as RFC3339 or Python variant: %v / %v", field, v, err, err2)
}
}
}
// Events array.
eventsRaw, _ := a["events"].([]interface{})
if len(eventsRaw) < 1 {
t.Errorf("events: want at least 1 entry, got %d", len(eventsRaw))
}
}
// TestCrowdSecBestEffortOnError verifies that Report does not panic when the
// LAPI server is unreachable. Best-effort: errors are logged only.
func TestCrowdSecBestEffortOnError(t *testing.T) {
c := NewCrowdSecClient("http://127.0.0.1:1", "dummy", "4h")
// Must return without panic.
c.Report("1.2.3.4", "sqli", "high")
}
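
The timestamp check in TestCrowdSecAlertPayload accepts two shapes, and the difference comes from how Go layouts treat a run of zeros after the seconds. The sketch below contrasts the two using a fixed instant; `sampleTime`, `fractionalStamp`, and `literalStamp` are hypothetical helpers for illustration only.

```go
package main

import (
	"fmt"
	"time"
)

// sampleTime returns a fixed instant with a non-zero microsecond part.
func sampleTime() time.Time {
	return time.Date(2026, time.June, 26, 14, 2, 35, 123456000, time.UTC)
}

// fractionalStamp uses ".000000" inside the layout. Go treats it as a
// fractional-seconds field, so the real microseconds are printed.
func fractionalStamp(t time.Time) string {
	return t.UTC().Format("2006-01-02T15:04:05.000000Z")
}

// literalStamp formats to whole seconds and appends the literal suffix that
// Python's "%Y-%m-%dT%H:%M:%S.000000Z" strftime pattern produces.
func literalStamp(t time.Time) string {
	return t.UTC().Format("2006-01-02T15:04:05") + ".000000Z"
}

func main() {
	t := sampleTime()
	fmt.Println(fractionalStamp(t)) // 2026-06-26T14:02:35.123456Z
	fmt.Println(literalStamp(t))    // 2026-06-26T14:02:35.000000Z
}
```

Both strings are valid timestamps; only the second is byte-for-byte what the Python pattern would emit for the same second.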


@ -0,0 +1,218 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: errpages — graduated WAF response pages
//
// Task 3.2: ported from WARNING_PAGE (secubox_waf.py ~line 221) and the inline
// ban response (secubox_waf.py ~line 1068-1072).
//
// writeWarning — HTTP 403, cyberpunk-styled warning page with the
// X-SecuBox-WAF: warning header. The HTML comment "<!-- sbxwaf-warning -->"
// acts as a machine-readable marker for tests and log parsers.
//
// writeBan — HTTP 403, minimal ban page with the X-SecuBox-WAF: banned header.
// The HTML comment "<!-- sbxwaf-banned -->" is the machine-readable marker.
//
// Task 7.1: synthetic upstream error pages (502/503/504).
//
// errorPage(code, host) — loads the embedded themed HTML template for the
// given upstream error code (502/503/504), substitutes {host} and {time},
// and returns the rendered bytes. Faithful port of the error() hook in
// secubox_waf.py (~line 1096):
// - Connection refused → 502 (ERROR_502_PAGE + {host}/{time} sub)
// - Timeout → 504 (ERROR_502_PAGE with 502→504 / Bad Gateway→Gateway Timeout)
// - Other → 503 (ERROR_503_PAGE, no {host} in the Python page)
//
// writeErrorPage(w, code, host) — sets Content-Type + X-SecuBox-WAF header,
// writes the status code, then writes errorPage output.
package main
import (
"bytes"
_ "embed"
"fmt"
"html"
"net/http"
"time"
)
// Embedded templates — verbatim copies of the Python secubox_waf.py pages.
//
//go:embed templates/error-502.html
var tmpl502 []byte
//go:embed templates/error-503.html
var tmpl503 []byte
//go:embed templates/error-504.html
var tmpl504 []byte
// errorPage returns the themed HTML body for the given upstream HTTP error code.
// host is HTML-escaped and then substituted into the {host} placeholders (both
// the 502 and 504 templates show the upstream hostname in the error box), so an
// attacker-controlled Host header cannot inject markup. The {time} placeholder
// is replaced with the current wall-clock time (HH:MM:SS), matching the Python
// error() hook behaviour.
//
// Unknown codes fall back to the 502 template (sane default — keeps tests
// forward-compatible if new codes are added later).
func errorPage(code int, host string) []byte {
var tmpl []byte
switch code {
case 503:
tmpl = tmpl503
case 504:
tmpl = tmpl504
default: // 502 and any unknown code
tmpl = tmpl502
}
now := time.Now().Format("15:04:05")
safeHost := html.EscapeString(host)
out := bytes.ReplaceAll(tmpl, []byte("{host}"), []byte(safeHost))
out = bytes.ReplaceAll(out, []byte("{time}"), []byte(now))
return out
}
// writeErrorPage writes a themed upstream error response.
// Maps the error code to the WAF header value and delegates to errorPage.
func writeErrorPage(w http.ResponseWriter, code int, host string) {
w.Header().Set("Content-Type", "text/html; charset=utf-8")
w.Header().Set("X-SecuBox-WAF", fmt.Sprintf("error-%d", code))
w.WriteHeader(code)
_, _ = w.Write(errorPage(code, host))
}
// writeWarning writes a 403 cyberpunk-styled warning page.
// cat is the WAF category ID (e.g. "sqli") shown in the body.
// Faithful port of WARNING_PAGE from secubox_waf.py.
func writeWarning(w http.ResponseWriter, cat string) {
w.Header().Set("Content-Type", "text/html; charset=utf-8")
w.Header().Set("X-SecuBox-WAF", "warning")
w.WriteHeader(http.StatusForbidden)
fmt.Fprintf(w, `<!DOCTYPE html>
<!-- sbxwaf-warning -->
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>SecuBox WAF - Security Alert</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
background: linear-gradient(135deg, #0a0a0f 0%%, #1a0a0f 100%%);
color: #e8e6d9;
font-family: "JetBrains Mono", monospace;
min-height: 100vh;
display: flex;
justify-content: center;
align-items: center;
}
.container { text-align: center; padding: 2rem; max-width: 800px; }
.alert-icon {
font-size: 6rem;
margin-bottom: 1.5rem;
animation: pulse 2s infinite;
}
@keyframes pulse {
0%%, 100%% { transform: scale(1); opacity: 1; }
50%% { transform: scale(1.1); opacity: 0.8; }
}
h1 { color: #e63946; font-size: 2.5rem; margin-bottom: 1rem;
text-shadow: 0 0 20px rgba(230, 57, 70, 0.5); }
.warning-box {
background: rgba(230, 57, 70, 0.1);
border: 2px solid #e63946;
border-radius: 12px;
padding: 2rem;
margin: 2rem 0;
}
.warning-text { color: #e63946; font-size: 1.2rem; margin-bottom: 1rem; }
.details { color: #6b6b7a; font-size: 0.9rem; margin-top: 1rem; }
.license-box {
background: rgba(201, 168, 76, 0.1);
border: 1px solid #c9a84c;
border-radius: 8px;
padding: 1.5rem;
margin-top: 2rem;
text-align: left;
}
.license-title { color: #c9a84c; font-size: 1rem; margin-bottom: 0.5rem; }
.license-text { color: #6b6b7a; font-size: 0.75rem; line-height: 1.5; }
.footer { margin-top: 2rem; color: #6b6b7a; font-size: 0.8rem; }
.footer a { color: #c9a84c; text-decoration: none; }
</style>
</head>
<body>
<div class="container">
<div class="alert-icon">&#x26A0;&#xFE0F;</div>
<h1>SECURITY ALERT</h1>
<div class="warning-box">
<p class="warning-text">&#x1F6A8; Suspicious Activity Detected</p>
<p>Your request contains patterns that match known attack signatures.</p>
<p class="details">Category: %s</p>
<p class="details">This incident has been logged and your IP address recorded.</p>
<p class="details">Continued malicious activity will result in automatic IP ban.</p>
</div>
<div class="license-box">
<p class="license-title">&#x1F4DC; SecuBox Security Notice</p>
<p class="license-text">
This system is protected by SecuBox WAF (Web Application Firewall).<br>
All access attempts are monitored, logged, and may be reported to authorities.<br>
Continued malicious activity will result in automatic IP ban.<br><br>
&copy; 2024-2026 CyberMind Security Platform<br>
ANSSI CSPN Candidate | https://secubox.in
</p>
</div>
<p class="footer">
Protected by <a href="https://cybermind.fr">CyberMind</a> |
<a href="https://secubox.in">SecuBox</a>
</p>
</div>
</body>
</html>`, html.EscapeString(cat))
}
// writeBan writes a 403 IP banned response.
// Mirrors the inline ban response from secubox_waf.py lines 1068-1072.
func writeBan(w http.ResponseWriter) {
w.Header().Set("Content-Type", "text/html; charset=utf-8")
w.Header().Set("X-SecuBox-WAF", "banned")
w.WriteHeader(http.StatusForbidden)
fmt.Fprint(w, `<!DOCTYPE html>
<!-- sbxwaf-banned -->
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>403 Forbidden | SecuBox WAF</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
background: #0a0a0f;
color: #e8e6d9;
font-family: "JetBrains Mono", monospace;
min-height: 100vh;
display: flex;
justify-content: center;
align-items: center;
}
.container { text-align: center; padding: 2rem; max-width: 600px; }
h1 { color: #e63946; font-size: 3rem; margin-bottom: 1rem; }
p { color: #6b6b7a; margin-top: 1rem; }
</style>
</head>
<body>
<div class="container">
<h1>&#x1F6AB; 403 Forbidden</h1>
<p>Your IP has been banned.</p>
<p>This incident has been reported to the security platform.</p>
<p style="margin-top:2rem; font-size:0.8rem; color:#3a3a4a;">
SecuBox WAF &mdash; ANSSI CSPN | <a href="https://secubox.in" style="color:#c9a84c;">secubox.in</a>
</p>
</div>
</body>
</html>`)
}

@ -0,0 +1,236 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf :: errpages_test — TDD for Task 7.1
// Tests for synthetic 502/503/504 themed error pages ported from secubox_waf.py.
package main
import (
"io"
"net"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
// TestErrorPageSubstitutesHost verifies that errorPage(502, host) replaces
// the {host} placeholder in the template and does NOT leave it as a literal.
func TestErrorPageSubstitutesHost(t *testing.T) {
const host = "app.example.com"
body := errorPage(502, host)
if len(body) == 0 {
t.Fatal("errorPage(502, ...) returned empty body")
}
if !strings.Contains(string(body), host) {
t.Fatalf("expected body to contain %q after substitution", host)
}
if strings.Contains(string(body), "{host}") {
t.Fatal("body still contains literal {host} placeholder — substitution failed")
}
// 502 page has a machine-readable marker: the error-code div shows "502"
if !strings.Contains(string(body), "502") {
t.Fatal("expected body to contain the 502 error code marker")
}
}
// TestErrorPageAllCodes checks that 502/503/504 each return a non-empty body
// with a code-specific marker (the error-code div content from the templates).
func TestErrorPageAllCodes(t *testing.T) {
cases := []struct {
code int
marker string // string that must appear in the page
}{
{502, "502"},
{503, "503"},
{504, "504"},
}
for _, tc := range cases {
body := errorPage(tc.code, "test.host.local")
if len(body) == 0 {
t.Errorf("errorPage(%d) returned empty body", tc.code)
continue
}
if !strings.Contains(string(body), tc.marker) {
t.Errorf("errorPage(%d): body does not contain marker %q", tc.code, tc.marker)
}
}
}
// TestErrorPageUnknownCodeFallback checks that an unknown code returns a sane
// (non-empty) body — must not panic or return nil.
func TestErrorPageUnknownCodeFallback(t *testing.T) {
body := errorPage(599, "fallback.example.com")
if len(body) == 0 {
t.Fatal("errorPage(599) returned empty body — expected a non-empty fallback")
}
}
// TestHandlerServesThemed502OnDeadBackend routes a request to a port where
// nothing is listening (connection refused) and asserts:
// - status 502
// - X-SecuBox-WAF: error-502
// - body contains the themed 502 marker ("502")
func TestHandlerServesThemed502OnDeadBackend(t *testing.T) {
// Find an unused local port (bind then close immediately — race is
// acceptable here since the test is the only user and the port is ephemeral).
l, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatalf("could not bind ephemeral port: %v", err)
}
deadAddr := l.Addr().String()
l.Close() // immediately close — the port is now "dead" (refused)
deadHost, deadPortStr, err := net.SplitHostPort(deadAddr)
if err != nil {
t.Fatalf("SplitHostPort(%q): %v", deadAddr, err)
}
// The port string is numeric, so net.LookupPort parses it directly.
deadPort, err := net.LookupPort("tcp", deadPortStr)
if err != nil {
t.Fatalf("LookupPort(%q): %v", deadPortStr, err)
}
srv := &Server{
routeLookup: func(host string) (string, int, bool) {
if host == "dead.example.com" {
return deadHost, deadPort, true
}
return "", 0, false
},
}
handler := srv.handler()
req := httptest.NewRequest(http.MethodGet, "http://dead.example.com/", nil)
req.Host = "dead.example.com"
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
res := rec.Result()
if res.StatusCode != http.StatusBadGateway {
t.Fatalf("expected 502, got %d", res.StatusCode)
}
wafHdr := res.Header.Get("X-SecuBox-WAF")
if wafHdr != "error-502" {
t.Fatalf("expected X-SecuBox-WAF: error-502, got %q", wafHdr)
}
body, _ := io.ReadAll(res.Body)
if !strings.Contains(string(body), "502") {
t.Fatalf("expected themed 502 body, got: %q", string(body)[:min(200, len(body))])
}
// Must NOT contain the raw placeholder.
if strings.Contains(string(body), "{host}") {
t.Fatal("response body still contains {host} literal — substitution failed")
}
}
// TestHandlerServes504OnUpstreamTimeout routes to a backend that sleeps past a
// short per-request upstream timeout and asserts 504 + X-SecuBox-WAF: error-504.
func TestHandlerServes504OnUpstreamTimeout(t *testing.T) {
// Backend that sleeps 2s — our timeout will be 50ms so it times out.
slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
time.Sleep(2 * time.Second)
w.WriteHeader(http.StatusOK)
}))
defer slow.Close()
backendAddr := strings.TrimPrefix(slow.URL, "http://")
bHost, bPort, err := splitHostPort(backendAddr)
if err != nil {
t.Fatalf("splitHostPort: %v", err)
}
srv := &Server{
upstreamTimeout: 50 * time.Millisecond, // very short → guaranteed timeout
routeLookup: func(host string) (string, int, bool) {
if host == "slow.example.com" {
return bHost, bPort, true
}
return "", 0, false
},
}
handler := srv.handler()
req := httptest.NewRequest(http.MethodGet, "http://slow.example.com/", nil)
req.Host = "slow.example.com"
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
res := rec.Result()
if res.StatusCode != http.StatusGatewayTimeout {
t.Fatalf("expected 504, got %d", res.StatusCode)
}
wafHdr := res.Header.Get("X-SecuBox-WAF")
if wafHdr != "error-504" {
t.Fatalf("expected X-SecuBox-WAF: error-504, got %q", wafHdr)
}
body, _ := io.ReadAll(res.Body)
if !strings.Contains(string(body), "504") {
t.Fatalf("expected themed 504 body, got: %q", string(body)[:min(200, len(body))])
}
}
// TestErrorPageEscapesHost verifies that a Host value containing HTML-special
// characters is escaped before being inserted into the page, preventing a
// reflected XSS via an attacker-controlled Host header.
//
// Note: the 502 template itself contains a legitimate <script> block for the
// retry countdown timer — that is expected. What must NOT appear is the
// attacker-injected payload "><script>alert(1)</script> reflected verbatim.
// html.EscapeString escapes <, >, &, " and ' — plain text like "alert(1)"
// within the already-escaped tags is safe and will remain in the output.
func TestErrorPageEscapesHost(t *testing.T) {
maliciousHost := "\"><script>alert(1)</script>"
body := string(errorPage(502, maliciousHost))
// The raw, unescaped payload must not appear verbatim.
// If it does, the host value was reflected unescaped — XSS.
if strings.Contains(body, maliciousHost) {
t.Fatal("body contains the raw malicious Host value unescaped — reflected XSS vulnerability")
}
// The injected closing quote + opening angle must not appear — this is
// the breakout vector that allows injecting a new tag context.
if strings.Contains(body, "\"><script>") {
t.Fatal(`body contains unescaped "><script> from Host header — tag-injection XSS vulnerability`)
}
// Must contain the escaped form so the host value is still rendered safely.
if !strings.Contains(body, "&lt;script&gt;") {
t.Fatal("body does not contain escaped &lt;script&gt; — escaping may be missing or incorrect")
}
// Must not contain the bare placeholder.
if strings.Contains(body, "{host}") {
t.Fatal("body still contains literal {host} placeholder — substitution failed")
}
}
// TestErrorPageSubstitutesHostNormal confirms that a well-formed host (no
// special chars) is preserved unchanged after escaping — escaping must not
// mangle safe values.
func TestErrorPageSubstitutesHostNormal(t *testing.T) {
const host = "app.example.com"
body := string(errorPage(502, host))
if !strings.Contains(body, host) {
t.Fatalf("expected body to contain %q after substitution, but it was absent", host)
}
if strings.Contains(body, "{host}") {
t.Fatal("body still contains literal {host} placeholder — substitution failed")
}
}
func min(a, b int) int {
if a < b {
return a
}
return b
}
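Reviewer sketch: the tests above need a host string and an integer port from a "host:port" address, and the timeout test calls a `splitHostPort` helper that is not shown in this part of the diff. The block below shows one way such a helper could look. `splitHostPortInt` is a hypothetical name and this is not necessarily how the package's own helper is written.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// splitHostPortInt splits "host:port" and converts the port to an int.
// Hypothetical helper for illustration; it rejects non-numeric ports.
func splitHostPortInt(addr string) (string, int, error) {
	host, portStr, err := net.SplitHostPort(addr)
	if err != nil {
		return "", 0, err
	}
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return "", 0, fmt.Errorf("invalid port %q: %w", portStr, err)
	}
	return host, port, nil
}

func main() {
	host, port, err := splitHostPortInt("127.0.0.1:43210")
	fmt.Println(host, port, err)
}
```

Using the standard library for both steps also handles bracketed IPv6 literals such as "[::1]:8080", which a hand-rolled digit loop would not validate.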

@ -0,0 +1,163 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf — request inspection + skip-lists
//
// Task 2.2: wires the Rules engine into the HTTP handler with:
// - CIDR-based trusted-network bypass (RFC1918 + loopback)
// - Static-asset skip (.js/.css/.png/... and /health, /status, system_health)
// - NC mobile-token bypass (/index.php/login/v2/, /ocs/v2.php/core/login)
// - Body read capped at 1 MiB for inspection; full body forwarded via
// io.MultiReader (prefix + remaining stream) — no truncation on large uploads
// - clientIP extraction: prefer leftmost XFF only when peer is a trusted proxy
//
// Ported faithfully from:
// packages/secubox-mitmproxy/addons/secubox_waf.py
// - _is_whitelisted / _WL_NETS (lines 28-47)
// - get_real_client_ip (lines 193-219)
// - check_request static/health fast-path (lines 764-769)
//
// Connection: close is added to upstream requests per issue #496 (Python parity).
package main
import (
"net"
"net/http"
"strings"
)
// trustedProxies mirrors Python's TRUSTED_PROXIES set (secubox_waf.py line 176).
// Used to decide whether to trust an X-Forwarded-For header: we only use XFF
// when the immediate peer (r.RemoteAddr) is one of these known proxy IPs.
var trustedProxies = map[string]struct{}{
"10.100.0.1": {},
"127.0.0.1": {},
"172.17.0.1": {},
"192.168.255.1": {},
}
// privateCIDRs mirrors Python's _WL_NETS (secubox_waf.py lines 33-38):
// loopback + RFC1918 + IPv6 loopback + ULA.
// Parsed once at package init; clientIP addresses in these ranges bypass
// the WAF entirely (LAN operators must never be banned).
var privateCIDRs []*net.IPNet
func init() {
for _, cidr := range []string{
"127.0.0.0/8",
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
"::1/128",
"fc00::/7",
} {
_, ipNet, err := net.ParseCIDR(cidr)
if err == nil {
privateCIDRs = append(privateCIDRs, ipNet)
}
}
}
// privateCIDR reports whether ip (plain IP string, no port) falls within any
// of the trusted private networks defined above.
// Mirrors Python's _is_whitelisted (secubox_waf.py lines 40-47).
func privateCIDR(ip string) bool {
parsed := net.ParseIP(ip)
if parsed == nil {
return false
}
for _, cidr := range privateCIDRs {
if cidr.Contains(parsed) {
return true
}
}
return false
}
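Reviewer sketch: the same trusted-network check can be written with net/netip, which gives value-typed prefixes and fails loudly on a bad CIDR at init. This is an alternative formulation under the assumption that parity with the list above is the goal. `trustedNets` and `inTrustedNet` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/netip"
)

// trustedNets repeats the CIDR list from privateCIDRs. MustParsePrefix
// panics on a malformed entry instead of silently skipping it.
var trustedNets = []netip.Prefix{
	netip.MustParsePrefix("127.0.0.0/8"),
	netip.MustParsePrefix("10.0.0.0/8"),
	netip.MustParsePrefix("172.16.0.0/12"),
	netip.MustParsePrefix("192.168.0.0/16"),
	netip.MustParsePrefix("::1/128"),
	netip.MustParsePrefix("fc00::/7"),
}

// inTrustedNet reports whether ip (no port) lies in one of trustedNets.
func inTrustedNet(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	// Unmap turns IPv4-mapped IPv6 (::ffff:a.b.c.d) into plain IPv4 so the
	// IPv4 prefixes can match it.
	addr = addr.Unmap()
	for _, p := range trustedNets {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	for _, ip := range []string{"192.168.1.50", "1.2.3.4", "::ffff:10.0.0.5", "172.32.0.1"} {
		fmt.Println(ip, inTrustedNet(ip))
	}
}
```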
// staticExtensions is the set of lowercase file extensions that skip inspection.
// Mirrors Python's check_request fast-path (secubox_waf.py line 766).
var staticExtensions = []string{
".js", ".css", ".png", ".jpg", ".jpeg", ".gif",
".ico", ".svg", ".woff", ".woff2", ".ttf", ".eot", ".map",
}
// staticAsset reports whether the request path looks like a static asset or a
// health/status endpoint that should skip WAF inspection.
// Mirrors Python check_request (secubox_waf.py lines 764-769):
// - extension match (path.endswith(ext) for ext in static_exts)
// - /health, /status, system_health substrings
func staticAsset(path string) bool {
lower := strings.ToLower(path)
for _, ext := range staticExtensions {
if strings.HasSuffix(lower, ext) {
return true
}
}
return strings.Contains(lower, "/health") ||
strings.Contains(lower, "/status") ||
strings.Contains(lower, "system_health")
}
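Reviewer sketch: a quick probe of the skip rule, to make its reach concrete. The function below restates the logic of staticAsset with a shortened extension list (`skipExts` and `skipsInspection` are hypothetical names). Because the health/status checks are substring matches, a path such as /api/statuses/42 is also skipped.

```go
package main

import (
	"fmt"
	"strings"
)

// skipExts is a shortened copy of staticExtensions, for illustration only.
var skipExts = []string{".js", ".css", ".png", ".svg", ".map"}

// skipsInspection restates staticAsset: case-insensitive suffix match on
// the extension list, plus substring matches for health/status endpoints.
func skipsInspection(path string) bool {
	lower := strings.ToLower(path)
	for _, ext := range skipExts {
		if strings.HasSuffix(lower, ext) {
			return true
		}
	}
	return strings.Contains(lower, "/health") ||
		strings.Contains(lower, "/status") ||
		strings.Contains(lower, "system_health")
}

func main() {
	for _, p := range []string{"/assets/APP.JS", "/api/statuses/42", "/upload", "/data.json"} {
		fmt.Printf("%-18s skip=%v\n", p, skipsInspection(p))
	}
}
```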
// ncBypassPaths are Nextcloud mobile-token endpoints that must never be blocked.
// These paths carry opaque login tokens that can look like attack payloads; blocking
// them would break the NC mobile clients permanently.
var ncBypassPaths = []string{
"/index.php/login/v2/",
"/ocs/v2.php/core/login",
}
// ncBypass reports whether the path is a Nextcloud mobile authentication
// endpoint that should be exempt from WAF inspection.
func ncBypass(path string) bool {
lower := strings.ToLower(path)
for _, p := range ncBypassPaths {
if strings.Contains(lower, p) {
return true
}
}
return false
}
// clientIP extracts the real client IP from the request.
//
// Strategy (mirrors Python get_real_client_ip, secubox_waf.py lines 193-219):
// 1. Parse the immediate peer from r.RemoteAddr.
// 2. If the peer is a trusted proxy (trustedProxies), take the LEFTMOST
// non-empty entry from X-Forwarded-For as the real client IP.
// 3. Otherwise, the peer itself is the client (no proxy trust).
//
// Note: the Python version iterates XFF looking for the first non-trusted-proxy
// IP. We simplify to the leftmost XFF entry when the peer is trusted, which fits
// the common HAProxy → mitmproxy topology in which HAProxy sets XFF to the
// original client address. The leftmost entry is only reliable if the trusted
// proxy overwrites (rather than appends to) a client-supplied XFF header.
func clientIP(r *http.Request) string {
// Parse peer IP (strip port from RemoteAddr).
peerHost, _, err := net.SplitHostPort(r.RemoteAddr)
if err != nil {
// RemoteAddr without port (unusual but handle gracefully).
peerHost = r.RemoteAddr
}
// Only trust XFF when the immediate peer is a known proxy.
if _, trusted := trustedProxies[peerHost]; trusted {
xff := r.Header.Get("X-Forwarded-For")
if xff != "" {
// Take the leftmost entry (original client in a well-behaved chain).
parts := strings.SplitN(xff, ",", 2)
ip := strings.TrimSpace(parts[0])
if ip != "" {
return ip
}
}
}
return peerHost
}
// defaultMaxBodyInspect is the default cap for body inspection (1 MiB).
// The production flag --max-body-inspect overrides this value.
// NOTE: inspection is bounded to this prefix only; payloads injected beyond
// this offset are NOT detected. This is a documented parity gap vs the Python
// WAF (which buffered the entire body). See docs/CUTOVER.md §pre-cutover for
// the arbitrated detection gap and how to raise or scope this limit.
const defaultMaxBodyInspect = 1 << 20 // 1 MiB

@ -0,0 +1,298 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf — request inspection + skip-list tests
//
// TDD for Task 2.2: wiring Rules.Match into the handler with CIDR/static/NC
// skip-lists and body preservation.
package main
import (
"bytes"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"os"
"testing"
)
// buildSQLiRulesFile writes a minimal waf-rules.json with a UNION SELECT pattern
// and returns the path. Cleanup is handled by t.TempDir().
func buildSQLiRulesFile(t *testing.T) string {
t.Helper()
doc := map[string]any{
"categories": map[string]any{
"sqli": map[string]any{
"name": "SQL Injection",
"severity": "high",
"enabled": true,
"patterns": []any{
map[string]any{"id": "sqli1", "pattern": `union\s+select`, "desc": "UNION SELECT"},
},
},
},
}
f, err := os.CreateTemp(t.TempDir(), "waf-rules*.json")
if err != nil {
t.Fatalf("create temp rules: %v", err)
}
if err := json.NewEncoder(f).Encode(doc); err != nil {
t.Fatalf("encode rules: %v", err)
}
f.Close()
return f.Name()
}
// newInspectServer builds a Server wired with rules and a stub backend.
// backendAddr is the host:port of the httptest.Server that routeLookup targets.
func newInspectServer(t *testing.T, rulesPath string, backendAddr string) *Server {
t.Helper()
srv := &Server{
routeLookup: func(host string) (ip string, port int, ok bool) {
h, p, err := splitHostPort(backendAddr)
if err != nil {
return "", 0, false
}
return h, p, true
},
rules: LoadRules(rulesPath),
}
return srv
}
// TestInspectBlocksAttack: public IP + UNION SELECT in query → 403.
func TestInspectBlocksAttack(t *testing.T) {
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, "ok")
}))
defer backend.Close()
rulesPath := buildSQLiRulesFile(t)
backendAddr := backend.URL[len("http://"):]
srv := newInspectServer(t, rulesPath, backendAddr)
handler := srv.handler()
// Public IP, attack query: union+select (URL-encoded, '+' = space after decode)
req := httptest.NewRequest(http.MethodGet, "http://app.example.com/?q=1+union+select+1,2,3", nil)
req.Host = "app.example.com"
req.RemoteAddr = "1.2.3.4:12345" // public IP — no trusted bypass
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code != http.StatusForbidden {
t.Fatalf("expected 403 for WAF hit from public IP, got %d", rec.Code)
}
}
// TestInspectPrivateIPBypass: same attack from private IP → proxied (not 403).
func TestInspectPrivateIPBypass(t *testing.T) {
const wantBody = "backend ok"
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, wantBody)
}))
defer backend.Close()
rulesPath := buildSQLiRulesFile(t)
backendAddr := backend.URL[len("http://"):]
srv := newInspectServer(t, rulesPath, backendAddr)
handler := srv.handler()
req := httptest.NewRequest(http.MethodGet, "http://app.example.com/?q=1+union+select+1,2,3", nil)
req.Host = "app.example.com"
req.RemoteAddr = "192.168.1.50:12345" // private RFC1918 — bypass WAF
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code == http.StatusForbidden {
t.Fatalf("private IP should bypass WAF inspection, got 403")
}
body, _ := io.ReadAll(rec.Result().Body)
if string(body) != wantBody {
t.Fatalf("expected backend body %q, got %q", wantBody, string(body))
}
}
// TestInspectStaticAssetSkip: static asset path with attack query → not blocked.
func TestInspectStaticAssetSkip(t *testing.T) {
const wantBody = "js ok"
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, wantBody)
}))
defer backend.Close()
rulesPath := buildSQLiRulesFile(t)
backendAddr := backend.URL[len("http://"):]
srv := newInspectServer(t, rulesPath, backendAddr)
handler := srv.handler()
req := httptest.NewRequest(http.MethodGet, "http://app.example.com/app.js?q=1+union+select+1,2", nil)
req.Host = "app.example.com"
req.RemoteAddr = "1.2.3.4:12345" // public IP — but .js asset skips inspection
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code == http.StatusForbidden {
t.Fatalf("static asset should skip WAF inspection, got 403")
}
body, _ := io.ReadAll(rec.Result().Body)
if string(body) != wantBody {
t.Fatalf("expected backend body %q, got %q", wantBody, string(body))
}
}
// TestInspectNCBypass: NC mobile auth path with payload → not blocked.
func TestInspectNCBypass(t *testing.T) {
const wantBody = "nc login ok"
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, wantBody)
}))
defer backend.Close()
rulesPath := buildSQLiRulesFile(t)
backendAddr := backend.URL[len("http://"):]
srv := newInspectServer(t, rulesPath, backendAddr)
handler := srv.handler()
// NC mobile token path — even with an attack-looking body should not be blocked
req := httptest.NewRequest(http.MethodPost, "http://app.example.com/index.php/login/v2/", bytes.NewBufferString("data=union+select+1,2"))
req.Host = "app.example.com"
req.RemoteAddr = "1.2.3.4:12345" // public IP
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code == http.StatusForbidden {
t.Fatalf("NC bypass path should not be blocked, got 403")
}
body, _ := io.ReadAll(rec.Result().Body)
if string(body) != wantBody {
t.Fatalf("expected backend body %q, got %q", wantBody, string(body))
}
}
// TestInspectLargeBodyForwardedIntact: a POST body larger than defaultMaxBodyInspect
// (1 MiB + 4 KiB) must arrive at the backend byte-for-byte intact.
// This is the regression test for the LimitReader truncation bug: the old code
// restored only the capped prefix to r.Body, silently dropping the tail.
func TestInspectLargeBodyForwardedIntact(t *testing.T) {
// Build a benign body of exactly defaultMaxBodyInspect + 4 KiB (no attack pattern).
// The WAF will inspect the first 1 MiB and pass it; the tail must survive too.
const extraBytes = 4 * 1024
bodySize := defaultMaxBodyInspect + extraBytes
fullBody := make([]byte, bodySize)
for i := range fullBody {
fullBody[i] = byte('A' + (i % 26)) // deterministic fill, no attack pattern
}
var receivedBody []byte
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
receivedBody, _ = io.ReadAll(r.Body)
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, "received")
}))
defer backend.Close()
rulesPath := buildSQLiRulesFile(t)
backendAddr := backend.URL[len("http://"):]
srv := newInspectServer(t, rulesPath, backendAddr)
handler := srv.handler()
req := httptest.NewRequest(http.MethodPost, "http://app.example.com/upload", bytes.NewReader(fullBody))
req.Host = "app.example.com"
req.RemoteAddr = "1.2.3.4:12345" // public IP — inspection runs
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code == http.StatusForbidden {
t.Fatalf("benign large POST should not be blocked, got 403")
}
if len(receivedBody) != bodySize {
t.Fatalf("backend received %d bytes, want %d (body was truncated)", len(receivedBody), bodySize)
}
if !bytes.Equal(receivedBody, fullBody) {
// Find first differing byte for a useful diagnostic.
for i := range fullBody {
if i >= len(receivedBody) || receivedBody[i] != fullBody[i] {
t.Fatalf("body mismatch at byte %d: got 0x%02x, want 0x%02x", i,
func() byte {
if i < len(receivedBody) {
return receivedBody[i]
}
return 0
}(), fullBody[i])
}
}
}
}
// TestInspectLargeBodyAttackInFirstMiB: attack payload within the first 1 MiB
// of a large body must still be caught (inspection still works with streaming).
func TestInspectLargeBodyAttackInFirstMiB(t *testing.T) {
// Attack snippet at offset 0, followed by harmless padding up to 1 MiB + 4 KiB.
const extraBytes = 4 * 1024
bodySize := defaultMaxBodyInspect + extraBytes
body := make([]byte, bodySize)
attackSnippet := []byte("union select 1,2,3")
copy(body[:len(attackSnippet)], attackSnippet)
// Fill the rest with harmless bytes.
for i := len(attackSnippet); i < bodySize; i++ {
body[i] = 'B'
}
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, "ok")
}))
defer backend.Close()
rulesPath := buildSQLiRulesFile(t)
backendAddr := backend.URL[len("http://"):]
srv := newInspectServer(t, rulesPath, backendAddr)
handler := srv.handler()
req := httptest.NewRequest(http.MethodPost, "http://app.example.com/upload", bytes.NewReader(body))
req.Host = "app.example.com"
req.RemoteAddr = "1.2.3.4:12345" // public IP
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code != http.StatusForbidden {
t.Fatalf("expected 403 for attack in first 1 MiB of large body, got %d", rec.Code)
}
}
// TestInspectBodyForwarded: a POST whose body was read for inspection is still
// received intact by the backend.
func TestInspectBodyForwarded(t *testing.T) {
const postBody = "name=alice&value=harmless"
var receivedBody []byte
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
receivedBody, _ = io.ReadAll(r.Body)
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, "received")
}))
defer backend.Close()
rulesPath := buildSQLiRulesFile(t)
backendAddr := backend.URL[len("http://"):]
srv := newInspectServer(t, rulesPath, backendAddr)
handler := srv.handler()
req := httptest.NewRequest(http.MethodPost, "http://app.example.com/submit", bytes.NewBufferString(postBody))
req.Host = "app.example.com"
req.RemoteAddr = "1.2.3.4:12345" // public IP — inspection runs but no hit
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code == http.StatusForbidden {
t.Fatalf("benign POST should not be blocked, got 403")
}
if string(receivedBody) != postBody {
t.Fatalf("backend received body %q, want %q (body not restored after read)", receivedBody, postBody)
}
}

@ -0,0 +1,667 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf — host-native reverse-proxy skeleton
//
// Phase 0 Task 1.1: skeleton binary with flags, CA load, route-lookup stub,
// and an HTTP handler that reverse-proxies mapped hosts and stamps
// X-SecuBox-WAF: inspected on every response.
//
// Task 1.2: wired the real Routes loader (LoadRoutes / *Routes) so --routes
// parses haproxy-routes.json and the handler uses cached per-backend
// *httputil.ReverseProxy instances (no per-request allocation).
//
// Task 3.2: graduated WARNING/BAN responses + threat log.
// - Server gains ban *Ban and threatLog *ThreatLog fields.
// - On a WAF hit: ban.Record(clientIP, now) → if banned → writeBan + log
// "banned"; else → writeWarning + log "warning".
// - threatLog is set by main() via NewThreatLog(--threat-log path).
// - crowdsec seam: Server.crowdsec (nil-able interface, see below) is the
// hook point for Task 4.1 — call crowdsec.Report(ip, cat, sev) when
// banned, guarded by nil-check so the field is entirely optional.
//
// Design decision — Server struct:
// - ca *forge.CA wired from --ca-cert/--ca-key (lazy: nil when
// flags are empty, so tests don't need PEM files)
// - routes *Routes hot-reload map; nil when --routes is empty
// - routeLookup func(host)(ip,port,ok) — set to routes.Lookup in main(), or
// injected directly by tests
// - upstreamTimeout time.Duration
// - ban *Ban sliding-window ban state; NewBan(300s,3) in main()
// - threatLog *ThreatLog append-only JSON threat log; NewThreatLog in main()
// - crowdsec CrowdSecReporter Task 4.1 seam — nil until wired; see interface below
package main
import (
"bytes"
"flag"
"fmt"
"io"
"log"
"net"
"net/http"
"net/http/httputil"
"net/url"
"os"
"strconv"
"strings"
"time"
"github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng/internal/forge"
)
// upstreamErrorCode maps a round-trip error to the appropriate HTTP error code,
// mirroring the Python error() hook logic (~line 1106):
// - net.Error with Timeout() → 504 Gateway Timeout
// - connection refused / dial failure → 502 Bad Gateway
// - all other errors → 503 Service Unavailable
func upstreamErrorCode(err error) int {
if ne, ok := err.(net.Error); ok && ne.Timeout() {
return http.StatusGatewayTimeout // 504
}
msg := err.Error()
if strings.Contains(msg, "connection refused") || strings.Contains(msg, "dial") {
return http.StatusBadGateway // 502
}
return http.StatusServiceUnavailable // 503
}
// CrowdSecReporter is the seam for Task 4.1 (CrowdSec LAPI bridge).
// When a client IP is banned, the handler calls crowdsec.Report if the field
// is non-nil. The concrete implementation is built by NewCrowdSecClient and
// wired into Server.crowdsec in main() when both --crowdsec-url and
// --crowdsec-jwt-file are set; --crowdsec-ban-duration sets the ban length.
type CrowdSecReporter interface {
// Report submits a ban alert for ip to the CrowdSec LAPI.
// cat and sev are the WAF category and severity strings.
// Must be non-blocking (should run in a goroutine if the LAPI call can block).
Report(ip, cat, sev string)
}
// Server is the sbxwaf reverse-proxy core.
type Server struct {
// ca holds the loaded forging CA. May be nil when --ca-cert/--ca-key are not
// provided (tests, non-TLS deployments).
ca *forge.CA
// routes is the hot-reloadable route map loaded from --routes.
// Nil when --routes is empty (dev mode / no routes file).
routes *Routes
// routeLookup resolves a bare hostname (no port) to a backend ip:port.
// Returns ok=false for unmapped hosts (→ 421).
// In main(), set to routes.Lookup when routes != nil; tests can inject
// a custom closure directly.
routeLookup func(host string) (ip string, port int, ok bool)
// upstreamTimeout is the per-request dial+response timeout for the
// reverse-proxy transport.
upstreamTimeout time.Duration
// transport is the shared *http.Transport used by all reverse-proxy
// instances. Constructed in main() BEFORE LoadRoutes so that startup-built
// proxies use the same tuned pool. When nil, handler() creates a local
// transport from upstreamTimeout (backwards-compat for test-only Servers
// that don't inject a transport).
transport http.RoundTripper
// rules is the hot-reloadable WAF rule set loaded from --rules.
// Nil when --rules is empty (pass-through mode, no inspection).
// Wired in main() via LoadRules; tests can inject directly.
rules *Rules
// ban tracks per-IP threat hit counts in a sliding window.
// Wired in main() via NewBan(300s, 3); tests can inject directly.
// Nil means no ban tracking (legacy: plain 403 on WAF hit).
ban *Ban
// threatLog appends one JSON line per WAF hit to the threats log file.
// Wired in main() via NewThreatLog(--threat-log); tests can inject.
// Nil means no threat logging.
threatLog *ThreatLog
// crowdsec is the Task 4.1 CrowdSec LAPI bridge seam.
// Nil unless main() wires it (both --crowdsec-url and --crowdsec-jwt-file set).
// When non-nil: called with (ip, cat, sev) whenever an IP reaches BAN.
crowdsec CrowdSecReporter
// maxBodyInspect is the per-request body inspection cap in bytes.
// Only the first maxBodyInspect bytes of the request body are passed to
// Rules.Match; the remainder is streamed to the upstream uninspected.
// Payloads injected beyond this offset will NOT be detected — this is a
// documented parity gap vs the Python WAF (full-body scan).
// Set from --max-body-inspect; defaults to defaultMaxBodyInspect (1 MiB).
// When a body reaches or exceeds the cap an AUDIT log line is emitted
// (action: "body-inspect-truncated") so truncation events are operator-visible.
maxBodyInspect int64
// trustedHosts is the set of hostnames that bypass WAF inspection entirely.
// Mirrors Python check_request whitelist (secubox_waf.py:761-763):
// git.gk2.secubox.in, git.secubox.in, admin.gk2.secubox.in, 10.100.0.1:9080.
// Gitea push payloads and admin panel forms routinely contain content that
// would trip WAF rules — this skip prevents false-positive bans on internal
// services. Configurable via --waf-skip-hosts.
trustedHosts map[string]struct{}
// cookieAudit is the Task 5.1 RGPD Set-Cookie ledger.
// When non-nil, ModifyResponse calls Record for every upstream response.
// Nil means auditing is disabled (--cookie-audit-log="").
cookieAudit *CookieAudit
// mediaCache is the Task 6.1 response media cache.
// When non-nil, GET requests are served from cache on a hit (bypassing
// the upstream); cacheable responses on a miss are stored after proxying.
// Nil means caching is disabled (--media-cache-dir="").
mediaCache *MediaCache
}
// handler returns an http.Handler that:
// 1. Calls routes.Maybe() (hot-reload check) if routes is set.
// 2. Strips the port from req.Host and calls routeLookup.
// 3. Returns 421 Misdirected Request for unmapped hosts.
// 4. Uses the cached *httputil.ReverseProxy from Routes (no per-request
// allocation) when routes is set; falls back to a freshly-built proxy for
// test-injected routeLookup closures that bypass Routes.
// 5. Adds X-SecuBox-WAF: inspected to every proxied response.
// 6. (Task 2.2) When rules != nil, inspects the request before proxying:
// - Computes clientIP (XFF when peer is a trusted proxy, else peer).
// - Skips inspection for private/RFC1918 CIDRs (privateCIDR).
// - Skips inspection for static assets and health/status paths (staticAsset).
// - Skips inspection for NC mobile-auth paths (ncBypass).
// - Reads up to maxBodyInspect bytes for inspection; restores the FULL
// body (prefix + remaining stream via io.MultiReader) so the upstream
// proxy always receives every byte intact — no truncation.
// - On WAF hit: returns 403 Forbidden (Task 3.2 refines to WARNING/BAN).
// - Adds Connection: close to upstream requests (#496).
func (s *Server) handler() http.Handler {
// Use the shared transport injected at construction time (main() builds it
// before LoadRoutes so startup proxies already reference it). Fall back to
// a fresh local transport for test Servers that don't inject one.
transport := s.transport
if transport == nil {
timeout := s.upstreamTimeout
if timeout == 0 {
timeout = 10 * time.Second
}
transport = &http.Transport{
DialContext: (&net.Dialer{
Timeout: timeout,
}).DialContext,
ResponseHeaderTimeout: timeout,
}
}
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Hot-reload check: stat the routes file and swap the map if mtime changed.
// Cheap when nothing changed (throttle=0 means one stat per call, but stat
// is O(1) and not on the inner response path).
if s.routes != nil {
s.routes.Maybe()
}
// Strip port from Host header to get the bare hostname for lookup.
host, _, err := net.SplitHostPort(r.Host)
if err != nil {
// No port present — use the Host value directly.
host = r.Host
}
host = strings.ToLower(strings.TrimSpace(host))
ip, port, ok := s.routeLookup(host)
if !ok {
http.Error(w, "421 Misdirected Request: no route for host "+host,
http.StatusMisdirectedRequest)
return
}
// Use the cached proxy from Routes when available (Task 1.2 perf goal:
// no per-request *httputil.ReverseProxy allocation).
var proxy *httputil.ReverseProxy
if s.routes != nil {
proxy = s.routes.ProxyFor(host)
}
if proxy == nil {
// Fallback: tests that inject routeLookup without a *Routes, or a
// race between Maybe() reload and ProxyFor (new entry not yet cached).
target := &url.URL{
Scheme: "http",
Host: net.JoinHostPort(ip, strconv.Itoa(port)),
}
proxy = httputil.NewSingleHostReverseProxy(target)
proxy.Transport = transport
proxy.ModifyResponse = func(resp *http.Response) error {
resp.Header.Set("X-SecuBox-WAF", "inspected")
// Task 5.1: record Set-Cookie to RGPD ledger when enabled.
// host is bound per-request (outer HandlerFunc scope).
if ca := s.cookieAudit; ca != nil {
ca.Record(host, resp.Request, resp)
}
return nil
}
proxy.ErrorHandler = func(w http.ResponseWriter, r *http.Request, err error) {
// Task 7.1: themed error pages — mirror the Python error() hook mapping.
// Timeout → 504, connection refused → 502, other → 503.
code := upstreamErrorCode(err)
reqHost := r.Host
if bare, _, e := net.SplitHostPort(reqHost); e == nil {
reqHost = bare
}
writeErrorPage(w, code, reqHost)
}
}
// Task 2.2 — Request inspection.
// Only when rules are loaded; otherwise pass through unconditionally.
if s.rules != nil {
// Add Connection: close to upstream requests (#496, mirrors Python).
r.Header.Set("Connection", "close")
ip := clientIP(r)
// Determine the path for skip-list checks. Use RawPath when available
// (Go only sets it when the path contains percent-encoded chars that
// differ from the decoded form), falling back to Path. This ensures
// we pass the still-encoded path to staticAsset/ncBypass (which do
// lowercasing but do not need decoded content for suffix/contains checks).
rawPath := r.URL.RawPath
if rawPath == "" {
rawPath = r.URL.Path
}
skip := privateCIDR(ip) || staticAsset(rawPath) || ncBypass(rawPath)
// Trusted-host skip: bypass WAF inspection for known internal hosts
// (matches Python check_request whitelist in secubox_waf.py:761-763).
// Checked AFTER privateCIDR/static/NC so that the cheap skips run first.
if !skip && s.isTrustedHost(r.Host) {
skip = true
}
if !skip {
// Read up to s.maxBodyInspect bytes for WAF inspection, then
// restore the FULL body (prefix + remaining stream) so the
// upstream proxy receives every byte intact.
//
// Streaming approach: we buffer at most maxBodyInspect bytes (the
// inspection window), then forward a MultiReader of that buffer +
// the unconsumed tail of r.Body. This keeps memory bounded even
// for multi-GB uploads (PeerTube / Nextcloud file uploads).
//
// PARITY GAP: only the first maxBodyInspect bytes are inspected.
// A payload appended after that offset is NOT detected. When a body
// exceeds the cap, an AUDIT log line is emitted so truncation is
// operator-visible (action="body-inspect-truncated"). See
// docs/CUTOVER.md for the documented detection gap.
cap := s.maxBodyInspect
if cap <= 0 {
cap = defaultMaxBodyInspect
}
var bodyBytes []byte
if r.Body != nil {
prefix, _ := io.ReadAll(io.LimitReader(r.Body, cap))
bodyBytes = prefix
// Restore: prefix already read + remaining stream not yet consumed.
r.Body = io.NopCloser(io.MultiReader(bytes.NewReader(prefix), r.Body))
// Emit an audit record when the read filled the inspection window
// (exactly cap bytes). Content-Length is not consulted, so a body of
// exactly cap bytes is also reported even though nothing was skipped.
if int64(len(prefix)) == cap {
if s.threatLog != nil {
s.threatLog.Record(ThreatRecord{
ClientIP: ip,
Host: r.Host,
Method: r.Method,
Path: rawPath,
Category: "body-inspect-truncated",
Severity: "audit",
Action: "body-inspect-truncated",
UA: r.Header.Get("User-Agent"),
})
}
log.Printf("sbxwaf: AUDIT body-inspect-truncated host=%s path=%s ip=%s cap=%d",
r.Host, rawPath, ip, cap)
}
}
cat, sev, hit := s.rules.Match(
r.Method,
rawPath,
r.URL.RawQuery,
string(bodyBytes),
r.Header.Get("User-Agent"),
)
if hit {
// Task 3.2 — graduated WARNING/BAN response.
//
// When ban is wired (always in production), record the hit and
// return a graduated response:
// count < threshold → WARNING (403, warning page)
// count >= threshold → BAN (403, ban page)
//
// When ban is nil (legacy / no ban tracking), fall back to a
// plain 403 so tests that don't inject ban still pass.
if s.ban == nil {
http.Error(w, "403 Forbidden: WAF blocked this request", http.StatusForbidden)
return
}
count, banned := s.ban.Record(ip, time.Now().Unix())
action := "warning"
if banned {
action = "banned"
}
// Log threat (best-effort: nil threatLog is a no-op).
if s.threatLog != nil {
s.threatLog.Record(ThreatRecord{
ClientIP: ip,
Host: r.Host,
Method: r.Method,
Path: rawPath,
Category: cat,
Severity: sev,
// rules.Match does not return a rule ID in its current
// signature (returns cat, sev, hit). RuleID is left empty
// here; Task 2.x can extend Match to return it if needed.
RuleID: "",
Action: action,
UA: r.Header.Get("User-Agent"),
})
}
log.Printf("sbxwaf: THREAT [%s] %s (%d/%d): %s",
sev, ip, count, 3, cat)
if banned {
// Task 4.1 seam — notify CrowdSec LAPI when non-nil.
if s.crowdsec != nil {
go s.crowdsec.Report(ip, cat, sev)
}
writeBan(w)
} else {
writeWarning(w, cat)
}
return
}
}
}
// Task 6.1 — media cache hit: serve from disk, bypass upstream.
// Only for GET requests; cache is nil-safe.
//
// Cache key is composed from vhost + path+query so that two different
// vhosts serving the same asset path (/logo.png) get distinct keys and
// never cross-contaminate each other's cached content (vhost isolation;
// mirrors Python media_cache.py r.pretty_url which includes the host).
if s.mediaCache != nil && r.Method == http.MethodGet {
vhostCacheURL := "https://" + r.Host + r.URL.RequestURI()
if cachedBody, cachedHdr, hit := s.mediaCache.Get(vhostCacheURL); hit {
for k, vs := range cachedHdr {
for _, v := range vs {
// Add, not Set: Set would keep only the last value of a multi-valued header.
w.Header().Add(k, v)
}
}
w.Header().Set("X-SecuBox-Cache", "hit")
w.Header().Set("X-SecuBox-WAF", "inspected")
w.WriteHeader(http.StatusOK)
_, _ = w.Write(cachedBody)
return
}
// Cache miss: capture the response while it is proxied so it can be
// stored afterwards. ModifyResponse on the shared cached proxy is left
// untouched, because it cannot be overridden safely on a shared instance.
// Instead, a capturing ResponseWriter tees the response headers and up
// to maxObj bytes of the body for MaybeStore. The client always
// receives the full body; bytes past maxObj are simply not buffered.
cw := &cachingResponseWriter{
ResponseWriter: w,
maxCapture: s.mediaCache.maxObj,
}
// The real proxy handles the response as usual (including its own
// ModifyResponse for the WAF header); whatever cw captured is stored
// after ServeHTTP returns.
proxy.ServeHTTP(cw, r)
// After proxying: if the response was cacheable and we captured enough
// of the body, store it. The full body was already written to the
// client by the real proxy — we only stored a copy.
// Synchronous: MaybeStore is fast (disk write) and must complete before
// the next request can get a cache hit.
sc := cw.statusCode
if sc == 0 {
sc = http.StatusOK // implicit 200 when WriteHeader was never called
}
if sc == http.StatusOK && cw.captured {
s.mediaCache.MaybeStore(r, &http.Response{
StatusCode: sc,
Header: cw.respHeader,
}, cw.body, vhostCacheURL)
}
return
}
proxy.ServeHTTP(w, r)
})
}
// parseTrustedHosts parses a comma-separated list of hostnames into a set.
// Empty entries are silently skipped.
func parseTrustedHosts(csv string) map[string]struct{} {
m := make(map[string]struct{})
for _, h := range strings.Split(csv, ",") {
h = strings.TrimSpace(h)
if h != "" {
m[strings.ToLower(h)] = struct{}{}
}
}
return m
}
// isTrustedHost reports whether the given Host header value (with optional port)
// belongs to the trusted-host whitelist. Matches the Python check_request
// trusted-host skip (secubox_waf.py:761-763). Checked before WAF inspection so
// internal services (gitea, admin panel) are never WAF-inspected or banned.
func (s *Server) isTrustedHost(hostHeader string) bool {
if len(s.trustedHosts) == 0 {
return false
}
lh := strings.ToLower(strings.TrimSpace(hostHeader))
if _, ok := s.trustedHosts[lh]; ok {
return true
}
// Also check bare hostname (without port) in case hostHeader includes a port.
bare, _, err := net.SplitHostPort(lh)
if err == nil {
if _, ok := s.trustedHosts[bare]; ok {
return true
}
}
return false
}
// splitHostPort splits "host:port" into its components, parsing port as int.
// Defined at package scope (unexported) so tests in this package can call it directly.
func splitHostPort(addr string) (host string, port int, err error) {
h, ps, e := net.SplitHostPort(addr)
if e != nil {
return "", 0, e
}
p, e := strconv.Atoi(ps)
if e != nil {
return "", 0, fmt.Errorf("invalid port %q: %w", ps, e)
}
return h, p, nil
}
func main() {
listen := flag.String("listen", ":8080", "address to listen on (e.g. :8080 or 0.0.0.0:8080)")
caCert := flag.String("ca-cert", "", "path to CA certificate PEM file (required for TLS forging)")
caKey := flag.String("ca-key", "", "path to CA private key PEM file (or combined cert+key bundle)")
routesFile := flag.String("routes", "", "path to haproxy-routes.json (hot-reloaded on mtime change)")
rules := flag.String("rules", "", "path to rules file (loaded by Task 2.1)")
upstreamTimeout := flag.Duration("upstream-timeout", 10*time.Second, "per-request upstream timeout")
threatLog := flag.String("threat-log", "/var/log/secubox/waf/waf-threats.log",
"path for append-only WAF threat log (NDJSON, one record per hit)")
// Task 4.1: CrowdSec LAPI bridge flags.
crowdsecURL := flag.String("crowdsec-url", "",
"CrowdSec LAPI base URL (e.g. http://10.100.0.1:8080); empty disables the bridge")
crowdsecJWTFile := flag.String("crowdsec-jwt-file", "",
"path to file containing the CrowdSec LAPI JWT/API key (read once at startup)")
crowdsecBanDuration := flag.String("crowdsec-ban-duration", "4h",
"ban duration forwarded to CrowdSec decisions (e.g. 4h, 24h)")
// Task 5.1: RGPD Set-Cookie ledger.
cookieAuditLog := flag.String("cookie-audit-log", DefaultCookieAuditLog,
"path for RGPD cookie audit JSONL ledger (one record per Set-Cookie); empty disables")
// Task 6.1: response media cache.
mediaCacheDir := flag.String("media-cache-dir", "/var/cache/secubox/waf/media",
"directory for the response media cache (16 MiB/obj, 2 GiB total); empty disables")
// Body inspection cap: only the first N bytes of the request body are scanned.
// Payloads beyond this offset are NOT inspected (documented parity gap vs Python full-body scan).
// Raise for stricter coverage; truncation events are always audit-logged regardless of this cap.
maxBodyInspectFlag := flag.Int64("max-body-inspect", defaultMaxBodyInspect,
"max bytes of request body to inspect for WAF rules (default 1 MiB); truncation is audit-logged")
// Trusted-host skip: WAF inspection is bypassed for these hostnames (comma-separated).
// Mirrors Python check_request whitelist (secubox_waf.py:761-763).
// Default list matches the Python source: git.gk2.secubox.in, git.secubox.in,
// admin.gk2.secubox.in, 10.100.0.1:9080.
wafSkipHosts := flag.String("waf-skip-hosts",
"git.gk2.secubox.in,git.secubox.in,admin.gk2.secubox.in,10.100.0.1:9080",
"comma-separated hostnames to bypass WAF inspection entirely (mirrors Python trusted-host list)")
flag.Parse()
// rules is consumed below when --rules is provided.
// Build the shared transport FIRST so it can be passed to LoadRoutes.
// Every proxy — startup-built and reload-built — will share this pool and
// dial timeout. The same pointer is stored in srv.transport for the
// handler's fallback path.
sharedTransport := &http.Transport{
DialContext: (&net.Dialer{
Timeout: *upstreamTimeout,
}).DialContext,
ResponseHeaderTimeout: *upstreamTimeout,
MaxIdleConns: 256,
MaxIdleConnsPerHost: 32,
IdleConnTimeout: 90 * time.Second,
}
// Task 5.1: RGPD cookie-audit ledger. Disabled when --cookie-audit-log is empty.
var cookieAudit *CookieAudit
if *cookieAuditLog != "" {
cookieAudit = NewCookieAudit(*cookieAuditLog)
log.Printf("sbxwaf: cookie-audit ledger enabled → %s", *cookieAuditLog)
}
// Task 6.1: response media cache. Disabled when --media-cache-dir is empty.
var mediaCache *MediaCache
if *mediaCacheDir != "" {
mediaCache = NewMediaCache(*mediaCacheDir)
log.Printf("sbxwaf: media-cache enabled → %s (maxObj=16MiB, maxTotal=2GiB)", *mediaCacheDir)
}
srv := &Server{
upstreamTimeout: *upstreamTimeout,
transport: sharedTransport,
// Task 3.2: graduated ban (window=300s, threshold=3, matches Python
// BAN_WINDOW=300 / BAN_THRESHOLD=3 from secubox_waf.py lines 82-83).
ban: NewBan(300*time.Second, 3),
// Task 3.2: append-only threat log.
threatLog: NewThreatLog(*threatLog),
// crowdsec: wired below when --crowdsec-url and --crowdsec-jwt-file are set.
// Task 5.1: RGPD cookie-audit ledger.
cookieAudit: cookieAudit,
// Task 6.1: response media cache.
mediaCache: mediaCache,
// Body inspection cap (--max-body-inspect).
maxBodyInspect: *maxBodyInspectFlag,
// Trusted-host skip (--waf-skip-hosts): mirrors Python whitelist.
trustedHosts: parseTrustedHosts(*wafSkipHosts),
}
log.Printf("sbxwaf: ban window=300s threshold=3; threat-log=%s", *threatLog)
log.Printf("sbxwaf: body-inspect cap=%d bytes; trusted-skip hosts=%d", *maxBodyInspectFlag, len(srv.trustedHosts))
// Task 4.1: wire CrowdSec LAPI bridge when both --crowdsec-url and
// --crowdsec-jwt-file are provided. The JWT is read from a file so the
// secret never appears in the process command line or environment.
if *crowdsecURL != "" && *crowdsecJWTFile != "" {
jwtBytes, err := os.ReadFile(*crowdsecJWTFile)
if err != nil {
log.Fatalf("sbxwaf: crowdsec: read jwt-file %q: %v", *crowdsecJWTFile, err)
}
jwt := strings.TrimSpace(string(jwtBytes))
srv.crowdsec = NewCrowdSecClient(*crowdsecURL, jwt, *crowdsecBanDuration)
log.Printf("sbxwaf: CrowdSec LAPI bridge enabled → %s (ban-duration=%s)",
*crowdsecURL, *crowdsecBanDuration)
} else if *crowdsecURL != "" || *crowdsecJWTFile != "" {
log.Printf("sbxwaf: crowdsec bridge disabled — both --crowdsec-url and --crowdsec-jwt-file required")
}
// Wire in the WAF rules engine when --rules is provided.
if *rules != "" {
srv.rules = LoadRules(*rules)
log.Printf("sbxwaf: WAF rules loaded from %s", *rules)
}
// Wire in the real Routes loader when --routes is provided.
if *routesFile != "" {
r := LoadRoutes(*routesFile, sharedTransport)
// Task 5.1: inject cookie audit so Routes-built proxies also record cookies.
r.cookieAudit = cookieAudit
srv.routes = r
srv.routeLookup = r.Lookup
log.Printf("sbxwaf: routes loaded from %s (%d entries)", *routesFile, func() int {
r.mu.RLock()
n := len(r.entries)
r.mu.RUnlock()
return n
}())
} else {
// No routes file: answer 421 to every request (smoke-test / dev mode).
srv.routeLookup = func(host string) (string, int, bool) {
return "", 0, false
}
}
// CA load is lazy: skip if flags are empty (dev mode / no TLS forging needed).
if *caCert != "" || *caKey != "" {
if *caCert == "" || *caKey == "" {
log.Fatal("sbxwaf: --ca-cert and --ca-key must both be provided together")
}
ca, err := forge.LoadCA(*caCert, *caKey)
if err != nil {
log.Fatalf("sbxwaf: load CA: %v", err)
}
srv.ca = ca
log.Printf("sbxwaf: CA loaded from %s", *caCert)
}
httpSrv := &http.Server{
Addr: *listen,
Handler: srv.handler(),
ReadHeaderTimeout: 10 * time.Second,
}
log.Printf("sbxwaf: listening on %s", *listen)
if err := httpSrv.ListenAndServe(); err != nil {
log.Fatalf("sbxwaf: %v", err)
}
}

@ -0,0 +1,190 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf — reverse-proxy skeleton tests
package main
import (
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
)
// TestProxyPassthrough verifies that a request whose Host is in the route map
// is forwarded to the backend and the response carries X-SecuBox-WAF: inspected.
func TestProxyPassthrough(t *testing.T) {
// Stand up a stub backend that echoes a known body.
const wantBody = "hello from backend"
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, wantBody)
}))
defer backend.Close()
// Parse the backend host:port from its URL (strip "http://").
backendAddr := strings.TrimPrefix(backend.URL, "http://")
// Build a Server with one route: app.example.com → backend.
srv := &Server{
routeLookup: func(host string) (ip string, port int, ok bool) {
if host == "app.example.com" {
// Parse host:port from backendAddr.
h, p, err := splitHostPort(backendAddr)
if err != nil {
return "", 0, false
}
return h, p, true
}
return "", 0, false
},
}
// Build the handler and drive it with httptest.
handler := srv.handler()
req := httptest.NewRequest(http.MethodGet, "http://app.example.com/path", nil)
req.Host = "app.example.com"
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
res := rec.Result()
if res.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", res.StatusCode)
}
body, _ := io.ReadAll(res.Body)
if string(body) != wantBody {
t.Fatalf("expected body %q, got %q", wantBody, string(body))
}
wafHeader := res.Header.Get("X-SecuBox-WAF")
if wafHeader != "inspected" {
t.Fatalf("expected X-SecuBox-WAF: inspected, got %q", wafHeader)
}
}
// TestProxyUnmapped verifies that a request to an unmapped Host gets 421.
func TestProxyUnmapped(t *testing.T) {
srv := &Server{
routeLookup: func(host string) (ip string, port int, ok bool) {
return "", 0, false
},
}
handler := srv.handler()
req := httptest.NewRequest(http.MethodGet, "http://unknown.example.com/", nil)
req.Host = "unknown.example.com"
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code != http.StatusMisdirectedRequest {
t.Fatalf("expected 421, got %d", rec.Code)
}
}
// TestTrustedHostSkipsWAF verifies that a request to a trusted host is NOT
// blocked even when the payload would normally trigger the WAF.
// Mirrors Python check_request whitelist (secubox_waf.py:761-763).
func TestTrustedHostSkipsWAF(t *testing.T) {
// Load real WAF rules so the attack payload would be caught on an untrusted host.
rules := LoadRules(testdataPath("waf-rules.json"))
const trustedHost = "git.gk2.secubox.in"
// Backend that always returns 200 — should be reached for trusted hosts.
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, "ok")
}))
defer backend.Close()
backendAddr := strings.TrimPrefix(backend.URL, "http://")
h, p, err := splitHostPort(backendAddr)
if err != nil {
t.Fatalf("splitHostPort: %v", err)
}
banState := NewBan(300*1e9, 3)
srv := &Server{
rules: rules,
ban: banState,
trustedHosts: parseTrustedHosts(trustedHost),
routeLookup: func(host string) (string, int, bool) {
return h, p, true
},
}
handler := srv.handler()
// Attack payload in query string (percent-encoded so httptest.NewRequest accepts it).
// "union+select+1,2,3" → would be caught as SQLi on an untrusted host.
req := httptest.NewRequest(http.MethodGet, "http://"+trustedHost+"/search?q=union+select+1%2C2%2C3", nil)
req.Host = trustedHost
// Simulate a non-private remote addr so privateCIDR doesn't skip first.
req.RemoteAddr = "203.0.113.99:12345"
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("trusted host with attack payload: expected 200 (bypass), got %d (WAF blocked — false positive on trusted host)",
rec.Code)
}
// Sanity check: same payload on an UNTRUSTED host must be blocked (warns on first hit).
srvUntrusted := &Server{
rules: rules,
ban: NewBan(300*1e9, 3),
trustedHosts: parseTrustedHosts(""), // empty — no trusted hosts
routeLookup: func(host string) (string, int, bool) {
return h, p, true
},
}
handlerUntrusted := srvUntrusted.handler()
req2 := httptest.NewRequest(http.MethodGet, "http://untrusted.example.com/search?q=union+select+1%2C2%2C3", nil)
req2.Host = "untrusted.example.com"
req2.RemoteAddr = "203.0.113.99:12345"
rec2 := httptest.NewRecorder()
handlerUntrusted.ServeHTTP(rec2, req2)
if rec2.Code == http.StatusOK {
t.Fatal("untrusted host with SQLi payload must be blocked — test sanity check failed")
}
}
// TestIsTrustedHost verifies isTrustedHost matching logic (with/without port).
func TestIsTrustedHost(t *testing.T) {
srv := &Server{
trustedHosts: parseTrustedHosts("git.gk2.secubox.in,10.100.0.1:9080"),
}
cases := []struct {
host string
want bool
}{
{"git.gk2.secubox.in", true},
{"GIT.GK2.SECUBOX.IN", true}, // case-insensitive
{"10.100.0.1:9080", true}, // host:port exact match
{"git.gk2.secubox.in:443", true}, // port stripped, bare hostname matches
{"untrusted.example.com", false},
{"", false},
}
for _, tc := range cases {
got := srv.isTrustedHost(tc.host)
if got != tc.want {
t.Errorf("isTrustedHost(%q) = %v, want %v", tc.host, got, tc.want)
}
}
}
// TestParseTrustedHosts verifies parseTrustedHosts parses comma-separated input.
func TestParseTrustedHosts(t *testing.T) {
m := parseTrustedHosts("a.example.com, b.example.com,c.example.com")
for _, h := range []string{"a.example.com", "b.example.com", "c.example.com"} {
if _, ok := m[h]; !ok {
t.Errorf("expected %q in trusted set", h)
}
}
if len(m) != 3 {
t.Errorf("expected 3 entries, got %d", len(m))
}
}

@ -0,0 +1,477 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf — response media cache (Task 6.1)
//
// On-disk layout (mirrors media_cache.py):
//
// <dir>/<key[:2]>/<key> — raw response body (binary)
// <dir>/<key[:2]>/<key>.m — JSON sidecar: {"ct":"…","exp":unix,"url":"…"}
//
// where key = hex(sha256(url)).
//
// Eviction: when the in-memory total-bytes counter would exceed maxTotal after
// a new store, we evict by ascending atime (least-recently-used) until the
// total drops below the cap. This mirrors the Python _evict_if_needed logic.
//
// TTL seam: nowFn is a replaceable clock function (default time.Now) injected
// by tests to make expiry deterministic without real-time sleeps.
//
// Fail-open: every cache error (disk I/O, JSON parse, …) is silently swallowed
// — the caller always receives the real upstream response.
package main
import (
"crypto/sha256"
"encoding/json"
"fmt"
"net/http"
"os"
"path/filepath"
"regexp"
"sort"
"strconv"
"strings"
"sync"
"sync/atomic"
"time"
)
// Cache constants — mirror media_cache.py.
const (
mediaCacheMaxObj int64 = 16 * 1024 * 1024 // 16 MiB per object
mediaCacheMaxTotal int64 = 2 * 1024 * 1024 * 1024 // 2 GiB total
mediaCacheTTL int64 = 3600 // default 1 h
)
// Cacheable content-type prefixes/substrings — exact port from _CACHEABLE tuple.
var mediaCacheableTypes = []string{
"image/",
"video/",
"audio/",
"font/",
"text/css",
"javascript",
"ecmascript",
"application/font",
"application/vnd.ms-fontobject",
}
var mediaCacheMaxAgeRe = regexp.MustCompile(`(?i)max-age\s*=\s*(\d+)`)
// cacheEntry is the in-memory index record for one cached object.
type cacheEntry struct {
size int64
exp int64 // unix timestamp; 0 = never expire
atime int64 // unix timestamp of last access (for LRU eviction)
ct string
}
// CacheStats is a snapshot of MediaCache counters.
type CacheStats struct {
Hits int64
Misses int64
Stored int64
Evicted int64
BytesCached int64
Objects int64
}
// MediaCache is a disk-backed, LRU-evicting, TTL-aware response media cache.
// It is safe for concurrent use.
type MediaCache struct {
dir string
maxObj int64
maxTotal int64
mu sync.Mutex
index map[string]*cacheEntry // key → entry
total int64 // current total bytes on disk
// Atomic stats counters (read without lock for Stats()).
hits atomic.Int64
misses atomic.Int64
stored atomic.Int64
evicted atomic.Int64
// nowFn is the clock seam — replaced by tests for deterministic TTL tests.
nowFn func() time.Time
}
// NewMediaCache creates a MediaCache rooted at dir.
// maxObj and maxTotal default to mediaCacheMaxObj / mediaCacheMaxTotal.
// The on-disk index is rebuilt at construction time (mirrors _load_index).
func NewMediaCache(dir string) *MediaCache {
mc := &MediaCache{
dir: dir,
maxObj: mediaCacheMaxObj,
maxTotal: mediaCacheMaxTotal,
index: make(map[string]*cacheEntry),
nowFn: time.Now,
}
// Fail-open: ignore mkdir/scan errors.
_ = os.MkdirAll(dir, 0o755)
mc.loadIndex()
return mc
}
// cacheKey returns hex(sha256(url)) — identical to Python _key().
func cacheKey(url string) string {
h := sha256.Sum256([]byte(url))
return fmt.Sprintf("%x", h)
}
// paths returns (bodyPath, metaPath) for a given cache key.
func (m *MediaCache) paths(key string) (string, string) {
shard := key[:2]
d := filepath.Join(m.dir, shard)
return filepath.Join(d, key), filepath.Join(d, key+".m")
}
// loadIndex scans the cache directory and rebuilds the in-memory index.
// Mirrors Python _load_index; called once at construction.
func (m *MediaCache) loadIndex() {
entries, err := os.ReadDir(m.dir)
if err != nil {
return
}
for _, sub := range entries {
if !sub.IsDir() {
continue
}
subDir := filepath.Join(m.dir, sub.Name())
files, err := os.ReadDir(subDir)
if err != nil {
continue
}
for _, f := range files {
if strings.HasSuffix(f.Name(), ".m") || strings.HasSuffix(f.Name(), ".tmp") {
continue // skip meta sidecars and temp files left by interrupted writes
}
key := f.Name()
bodyPath := filepath.Join(subDir, key)
info, err := os.Stat(bodyPath)
if err != nil {
continue
}
metaPath := bodyPath + ".m"
var meta struct {
CT string `json:"ct"`
Exp int64 `json:"exp"`
}
if raw, err := os.ReadFile(metaPath); err == nil {
_ = json.Unmarshal(raw, &meta)
}
e := &cacheEntry{
size: info.Size(),
exp: meta.Exp,
// mtime is used deliberately as the LRU recency proxy.
// atime is unreliable on most Linux filesystems (relatime
// mount option suppresses most atime updates), so we use
// mtime which is set explicitly via os.Chtimes on every
// cache Get() hit — a reliable in-band atime surrogate.
atime: info.ModTime().Unix(),
ct: meta.CT,
}
m.index[key] = e
m.total += info.Size()
}
}
}
// isCacheable reports whether ct is a cacheable content-type.
// Mirrors Python _cacheable_ct.
func isCacheable(ct string) bool {
ct = strings.ToLower(strings.SplitN(ct, ";", 2)[0])
ct = strings.TrimSpace(ct)
if ct == "" {
return false
}
for _, prefix := range mediaCacheableTypes {
if strings.Contains(ct, prefix) {
return true
}
}
return false
}
// Get returns the cached body + headers for url if a valid (non-expired) entry
// exists. ok=false means cache miss (or expired). Fail-open: I/O errors → miss.
func (m *MediaCache) Get(url string) (body []byte, hdr http.Header, ok bool) {
key := cacheKey(url)
now := m.nowFn().Unix()
m.mu.Lock()
e, found := m.index[key]
m.mu.Unlock()
if !found {
m.misses.Add(1)
return nil, nil, false
}
if e.exp != 0 && e.exp < now {
// Expired: treat as miss. The entry stays in the index and on disk until
// a later store overwrites it or LRU eviction removes it.
m.misses.Add(1)
return nil, nil, false
}
bodyPath, _ := m.paths(key)
data, err := os.ReadFile(bodyPath)
if err != nil {
// File gone (evicted externally, disk error) — remove from index.
m.mu.Lock()
if ex, ok := m.index[key]; ok {
m.total -= ex.size
delete(m.index, key)
}
m.mu.Unlock()
m.misses.Add(1)
return nil, nil, false
}
// Update atime in index and on-disk (mirrors Python e["atime"] = time.time()).
m.mu.Lock()
if ex, ok := m.index[key]; ok {
ex.atime = now
}
m.mu.Unlock()
_ = os.Chtimes(bodyPath, time.Unix(now, 0), time.Unix(now, 0))
h := http.Header{}
if e.ct != "" {
h.Set("Content-Type", e.ct)
}
m.hits.Add(1)
return data, h, true
}
// MaybeStore conditionally stores the response body to disk.
// The cacheURL parameter must be the full per-vhost URL composed by the caller
// as "https://" + r.Host + r.URL.RequestURI() so that assets with the same
// path on different vhosts get distinct cache keys (vhost isolation).
// Checks: method==GET, status==200, no Range/Authorization request header,
// no no-store/private/Set-Cookie on the response, cacheable content-type,
// 0 < size ≤ maxObj, ttl > 0.
// Evicts oldest-by-atime entries when total would exceed maxTotal.
// Fail-open: any I/O error is silently ignored.
func (m *MediaCache) MaybeStore(req *http.Request, resp *http.Response, body []byte, cacheURL string) {
if req == nil || resp == nil {
return
}
if req.Method != http.MethodGet {
return
}
if resp.StatusCode != http.StatusOK {
return
}
// Skip Range requests and authenticated responses (mirrors Python).
if req.Header.Get("Range") != "" || req.Header.Get("Authorization") != "" {
return
}
cc := strings.ToLower(resp.Header.Get("Cache-Control"))
if strings.Contains(cc, "no-store") || strings.Contains(cc, "private") {
return
}
if resp.Header.Get("Set-Cookie") != "" {
return
}
ct := resp.Header.Get("Content-Type")
if !isCacheable(ct) {
return
}
// Size gate: reject oversized objects.
if int64(len(body)) > m.maxObj {
return
}
if len(body) == 0 {
return
}
// Parse max-age; fall back to DEFAULT_TTL.
var ttl int64 = mediaCacheTTL
if match := mediaCacheMaxAgeRe.FindStringSubmatch(cc); match != nil {
if v, err := strconv.ParseInt(match[1], 10, 64); err == nil {
ttl = v
}
}
if ttl <= 0 {
return
}
rawURL := cacheURL
key := cacheKey(rawURL)
bodyPath, metaPath := m.paths(key)
// Strip params from ct for storage.
ctClean := strings.TrimSpace(strings.SplitN(ct, ";", 2)[0])
now := m.nowFn().Unix()
exp := now + ttl
// Write body atomically via tmp → rename (mirrors Python tmp + os.replace).
if err := os.MkdirAll(filepath.Dir(bodyPath), 0o755); err != nil {
return
}
tmp := bodyPath + ".tmp"
if err := os.WriteFile(tmp, body, 0o644); err != nil {
_ = os.Remove(tmp)
return
}
if err := os.Rename(tmp, bodyPath); err != nil {
_ = os.Remove(tmp)
return
}
meta := struct {
CT string `json:"ct"`
Exp int64 `json:"exp"`
URL string `json:"url"`
}{
CT: ctClean,
Exp: exp,
URL: func() string {
if len(rawURL) > 300 {
return rawURL[:300]
}
return rawURL
}(),
}
metaBytes, err := json.Marshal(meta)
if err == nil {
_ = os.WriteFile(metaPath, metaBytes, 0o644)
}
// Update index + total under lock.
newSize := int64(len(body))
m.mu.Lock()
old := int64(0)
if ex, ok := m.index[key]; ok {
old = ex.size
}
m.total += newSize - old
m.index[key] = &cacheEntry{
size: newSize,
exp: exp,
atime: now,
ct: ctClean,
}
m.evictIfNeeded()
m.mu.Unlock()
m.stored.Add(1)
}
// evictIfNeeded removes least-recently-used entries until total ≤ maxTotal.
// Must be called with m.mu held.
func (m *MediaCache) evictIfNeeded() {
if m.total <= m.maxTotal {
return
}
// Build a sorted slice of (key, atime) pairs.
type kv struct {
key string
atime int64
}
pairs := make([]kv, 0, len(m.index))
for k, e := range m.index {
pairs = append(pairs, kv{k, e.atime})
}
sort.Slice(pairs, func(i, j int) bool {
return pairs[i].atime < pairs[j].atime
})
for _, p := range pairs {
if m.total <= m.maxTotal {
break
}
e, ok := m.index[p.key]
if !ok {
continue
}
bodyPath, metaPath := m.paths(p.key)
_ = os.Remove(bodyPath)
_ = os.Remove(metaPath)
m.total -= e.size
delete(m.index, p.key)
m.evicted.Add(1)
}
}
// cachingResponseWriter wraps an http.ResponseWriter to tee the response body
// to an in-memory buffer (up to maxCapture bytes) so the handler can store the
// response in the media cache after proxying.
//
// The client always receives the FULL response — we only buffer up to maxCapture
// bytes for the cache decision. If the response body exceeds maxCapture, we
// stop buffering and set captured=false; the client stream is not truncated.
type cachingResponseWriter struct {
http.ResponseWriter
statusCode int
respHeader http.Header
body []byte
captured bool // true when body was fully buffered (len ≤ maxCapture)
maxCapture int64
written int64
overflow bool
}
func (c *cachingResponseWriter) WriteHeader(code int) {
c.statusCode = code
// Snapshot the response headers at the point WriteHeader is called.
// This captures Content-Type, Cache-Control etc. set by the upstream proxy.
c.respHeader = c.ResponseWriter.Header().Clone()
c.ResponseWriter.WriteHeader(code)
}
func (c *cachingResponseWriter) Write(b []byte) (int, error) {
n, err := c.ResponseWriter.Write(b)
if c.overflow {
return n, err
}
if c.statusCode == 0 {
// WriteHeader was not called explicitly — Go sets 200 implicitly.
c.statusCode = http.StatusOK
c.respHeader = c.ResponseWriter.Header().Clone()
}
c.written += int64(n)
if c.written > c.maxCapture {
// Body too large to cache — discard buffer, mark overflow.
c.body = nil
c.overflow = true
c.captured = false
return n, err
}
c.body = append(c.body, b[:n]...)
c.captured = true
return n, err
}
// Flush implements http.Flusher so that httputil.ReverseProxy can flush
// chunks incrementally to the client (important for progressive video /
// PeerTube streaming). It is a pure pass-through to the underlying
// ResponseWriter's Flush method; it does not affect what bytes are
// captured for the cache buffer.
func (c *cachingResponseWriter) Flush() {
if f, ok := c.ResponseWriter.(http.Flusher); ok {
f.Flush()
}
}
// Stats returns a point-in-time snapshot of cache counters.
func (m *MediaCache) Stats() CacheStats {
m.mu.Lock()
objects := int64(len(m.index))
bytes := m.total
m.mu.Unlock()
return CacheStats{
Hits: m.hits.Load(),
Misses: m.misses.Load(),
Stored: m.stored.Load(),
Evicted: m.evicted.Load(),
BytesCached: bytes,
Objects: objects,
}
}
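
The Cache-Control gating in MaybeStore can be looked at in isolation. The sketch below is a standalone approximation, not the package's code: `storeTTL` and `defaultTTL` are hypothetical names, and the 3600-second default is an assumption because the real value of `mediaCacheTTL` is not shown in this diff. It reuses the same `max-age` regular expression and the same no-store/private checks.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// defaultTTL is a placeholder; the real mediaCacheTTL value is not in the diff.
const defaultTTL int64 = 3600

// Same pattern as mediaCacheMaxAgeRe in the diff.
var maxAgeRe = regexp.MustCompile(`(?i)max-age\s*=\s*(\d+)`)

// storeTTL (hypothetical helper) returns the TTL in seconds and whether a
// response with this Cache-Control header would pass the TTL and
// no-store/private gates.
func storeTTL(cacheControl string) (int64, bool) {
	cc := strings.ToLower(cacheControl)
	if strings.Contains(cc, "no-store") || strings.Contains(cc, "private") {
		return 0, false
	}
	ttl := defaultTTL
	if match := maxAgeRe.FindStringSubmatch(cc); match != nil {
		if v, err := strconv.ParseInt(match[1], 10, 64); err == nil {
			ttl = v
		}
	}
	// A TTL of zero (max-age=0) means "do not store".
	return ttl, ttl > 0
}

func main() {
	for _, cc := range []string{"max-age=3600", "", "no-store", "max-age=0", "public, Max-Age = 60"} {
		ttl, ok := storeTTL(cc)
		fmt.Printf("%-24q ttl=%d storable=%v\n", cc, ttl, ok)
	}
}
```

Note that an absent header falls back to the default TTL, so responses without any Cache-Control are still stored as long as the other gates pass.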

@ -0,0 +1,558 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf — media-cache tests (Task 6.1 TDD)
package main
import (
"bytes"
"crypto/sha256"
"fmt"
"io"
"net/http"
"net/http/httptest"
"os"
"strings"
"testing"
"time"
)
// makeResp builds a minimal *http.Response suitable for MaybeStore.
// ct is Content-Type; maxAge is the max-age directive (0 = omit = use DEFAULT_TTL);
// negative means "no-store" in Cache-Control.
func makeResp(statusCode int, ct string, maxAge int, body []byte) *http.Response {
hdr := http.Header{}
if ct != "" {
hdr.Set("Content-Type", ct)
}
switch {
case maxAge < 0:
hdr.Set("Cache-Control", "no-store")
case maxAge > 0:
hdr.Set("Cache-Control", fmt.Sprintf("max-age=%d", maxAge))
}
return &http.Response{
StatusCode: statusCode,
Header: hdr,
Body: io.NopCloser(bytes.NewReader(body)),
}
}
// makeGET builds a minimal GET *http.Request for the given url.
func makeGET(rawURL string) *http.Request {
req, _ := http.NewRequest(http.MethodGet, rawURL, nil)
return req
}
// --- TestMediaCacheStoreAndGet ---------------------------------------------------
func TestMediaCacheStoreAndGet(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
const testURL = "http://example.com/image.png"
body := []byte("PNG_BYTES")
req := makeGET(testURL)
resp := makeResp(200, "image/png", 3600, body)
mc.MaybeStore(req, resp, body, testURL)
got, hdr, ok := mc.Get(testURL)
if !ok {
t.Fatal("expected cache hit, got miss")
}
if !bytes.Equal(got, body) {
t.Fatalf("body mismatch: got %q want %q", got, body)
}
if ct := hdr.Get("Content-Type"); ct != "image/png" {
t.Fatalf("Content-Type mismatch: got %q", ct)
}
}
// --- TestMediaCacheRejectsNonMedia -----------------------------------------------
func TestMediaCacheRejectsNonMedia(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
const testURL = "http://example.com/page.html"
body := []byte("<html>hello</html>")
req := makeGET(testURL)
resp := makeResp(200, "text/html", 3600, body)
mc.MaybeStore(req, resp, body, testURL)
_, _, ok := mc.Get(testURL)
if ok {
t.Fatal("text/html must not be cached")
}
}
// --- TestMediaCacheRejectsOversize -----------------------------------------------
func TestMediaCacheRejectsOversize(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
const testURL = "http://example.com/bigvideo.mp4"
// 16 MiB + 1 byte — just over the limit
bigBody := make([]byte, 16*1024*1024+1)
req := makeGET(testURL)
resp := makeResp(200, "video/mp4", 3600, bigBody)
mc.MaybeStore(req, resp, bigBody, testURL)
_, _, ok := mc.Get(testURL)
if ok {
t.Fatal("oversized object must not be cached")
}
}
// --- TestMediaCacheExpiry --------------------------------------------------------
func TestMediaCacheExpiry(t *testing.T) {
dir := t.TempDir()
// Use a time seam: set nowFn to control the clock.
mc := NewMediaCache(dir)
// Fix "now" at epoch so we can advance it.
epoch := time.Unix(1_000_000, 0)
mc.nowFn = func() time.Time { return epoch }
const testURL = "http://example.com/icon.png"
body := []byte("ICO")
req := makeGET(testURL)
// TTL = 1 second
resp := makeResp(200, "image/png", 1, body)
mc.MaybeStore(req, resp, body, testURL)
// Before expiry: should hit.
if _, _, ok := mc.Get(testURL); !ok {
t.Fatal("expected hit before TTL expires")
}
// Advance clock past TTL.
mc.nowFn = func() time.Time { return epoch.Add(2 * time.Second) }
if _, _, ok := mc.Get(testURL); ok {
t.Fatal("expected miss after TTL expires")
}
}
// --- TestMediaCacheHandlerServesHit ---------------------------------------------
// TestMediaCacheHandlerServesHit verifies that the handler serves a cached
// response without hitting the upstream backend.
func TestMediaCacheHandlerServesHit(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
const testURL = "http://media.example.com/logo.png"
// The handler computes vhostCacheURL = "https://" + host + path, so
// pre-populate with the same key format to ensure a cache hit.
const cacheKey = "https://media.example.com/logo.png"
body := []byte("PNG_DATA")
// Pre-populate cache directly using the same key the handler will use.
req := makeGET(testURL)
resp := makeResp(200, "image/png", 3600, body)
mc.MaybeStore(req, resp, body, cacheKey)
// Backend that must NOT be called.
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
t.Error("backend was called on a cache hit — should have been short-circuited")
http.Error(w, "backend called", http.StatusInternalServerError)
}))
defer backend.Close()
backendAddr := strings.TrimPrefix(backend.URL, "http://")
h, p, err := splitHostPort(backendAddr)
if err != nil {
t.Fatalf("splitHostPort: %v", err)
}
srv := &Server{
mediaCache: mc,
routeLookup: func(host string) (string, int, bool) {
if host == "media.example.com" {
return h, p, true
}
return "", 0, false
},
}
handler := srv.handler()
httpReq := httptest.NewRequest(http.MethodGet, testURL, nil)
httpReq.Host = "media.example.com"
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httpReq)
res := rec.Result()
if res.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", res.StatusCode)
}
got, _ := io.ReadAll(res.Body)
if !bytes.Equal(got, body) {
t.Fatalf("body mismatch: got %q want %q", got, body)
}
if v := res.Header.Get("X-SecuBox-Cache"); v != "hit" {
t.Fatalf("expected X-SecuBox-Cache: hit, got %q", v)
}
}
// --- TestMediaCacheNoStoreSkipped -----------------------------------------------
func TestMediaCacheNoStoreSkipped(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
const testURL = "http://example.com/priv.png"
body := []byte("PNG")
req := makeGET(testURL)
resp := makeResp(200, "image/png", -1, body) // -1 → no-store
mc.MaybeStore(req, resp, body, testURL)
if _, _, ok := mc.Get(testURL); ok {
t.Fatal("no-store response must not be cached")
}
}
// --- TestMediaCacheStatsIncrement -----------------------------------------------
func TestMediaCacheStatsIncrement(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
const testURL = "http://example.com/audio.mp3"
body := []byte("MP3")
req := makeGET(testURL)
resp := makeResp(200, "audio/mpeg", 3600, body)
mc.MaybeStore(req, resp, body, testURL)
s1 := mc.Stats()
if s1.Stored != 1 {
t.Fatalf("expected Stored=1, got %d", s1.Stored)
}
// Hit
mc.Get(testURL)
s2 := mc.Stats()
if s2.Hits != 1 {
t.Fatalf("expected Hits=1, got %d", s2.Hits)
}
// Miss
mc.Get("http://example.com/notfound.mp3")
s3 := mc.Stats()
if s3.Misses != 1 {
t.Fatalf("expected Misses=1, got %d", s3.Misses)
}
}
// --- TestMediaCacheEviction -----------------------------------------------------
func TestMediaCacheEviction(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
// Set a tiny cap so we can trigger eviction easily:
// 100 bytes total, each object exactly 30 bytes.
mc.maxTotal = 100
for i := 0; i < 5; i++ {
u := fmt.Sprintf("http://example.com/img%d.png", i)
body := make([]byte, 30)
req := makeGET(u)
resp := makeResp(200, "image/png", 3600, body)
mc.MaybeStore(req, resp, body, u)
}
s := mc.Stats()
if s.Evicted == 0 {
t.Fatalf("expected evictions with a 100-byte cap and 5×30-byte objects")
}
// Confirm total is at or below cap.
if s.BytesCached > mc.maxTotal {
t.Fatalf("total %d exceeds cap %d after eviction", s.BytesCached, mc.maxTotal)
}
}
// --- TestMediaCacheNonGETNotCached ----------------------------------------------
func TestMediaCacheNonGETNotCached(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
const testURL = "http://example.com/upload.png"
body := []byte("PNG")
// POST, not GET
req, _ := http.NewRequest(http.MethodPost, testURL, nil)
resp := makeResp(200, "image/png", 3600, body)
mc.MaybeStore(req, resp, body, testURL)
if _, _, ok := mc.Get(testURL); ok {
t.Fatal("POST response must not be cached")
}
}
// --- TestMediaCachePersistenceAcrossRestart ------------------------------------
func TestMediaCachePersistenceAcrossRestart(t *testing.T) {
dir := t.TempDir()
mc1 := NewMediaCache(dir)
const testURL = "http://example.com/persist.png"
body := []byte("PERSIST")
req := makeGET(testURL)
resp := makeResp(200, "image/png", 3600, body)
mc1.MaybeStore(req, resp, body, testURL)
// "Restart": new cache instance pointing at same dir.
mc2 := NewMediaCache(dir)
got, _, ok := mc2.Get(testURL)
if !ok {
t.Fatal("expected cache hit after restart (on-disk persistence)")
}
if !bytes.Equal(got, body) {
t.Fatalf("body mismatch after restart: got %q want %q", got, body)
}
}
// --- TestMediaCacheHandlerMissStores -------------------------------------------
// TestMediaCacheHandlerMissStores verifies that on a cache miss the handler
// proxies to the backend, stores the response, and subsequent requests are
// served from cache.
func TestMediaCacheHandlerMissStores(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
body := []byte("IMAGE_FROM_BACKEND")
calls := 0
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
calls++
w.Header().Set("Content-Type", "image/png")
w.Header().Set("Cache-Control", "max-age=3600")
w.WriteHeader(http.StatusOK)
_, _ = w.Write(body)
}))
defer backend.Close()
backendAddr := strings.TrimPrefix(backend.URL, "http://")
h, p, err := splitHostPort(backendAddr)
if err != nil {
t.Fatalf("splitHostPort: %v", err)
}
srv := &Server{
mediaCache: mc,
routeLookup: func(host string) (string, int, bool) {
if host == "media.example.com" {
return h, p, true
}
return "", 0, false
},
}
handler := srv.handler()
// First request: miss → backend called → stored.
req1 := httptest.NewRequest(http.MethodGet, "http://media.example.com/img.png", nil)
req1.Host = "media.example.com"
rec1 := httptest.NewRecorder()
handler.ServeHTTP(rec1, req1)
if rec1.Code != http.StatusOK {
t.Fatalf("first request: expected 200, got %d", rec1.Code)
}
got1, _ := io.ReadAll(rec1.Result().Body)
if !bytes.Equal(got1, body) {
t.Fatalf("first request body mismatch: %q", got1)
}
if calls != 1 {
t.Fatalf("expected 1 backend call, got %d", calls)
}
// Second request: should hit cache.
req2 := httptest.NewRequest(http.MethodGet, "http://media.example.com/img.png", nil)
req2.Host = "media.example.com"
rec2 := httptest.NewRecorder()
handler.ServeHTTP(rec2, req2)
if rec2.Code != http.StatusOK {
t.Fatalf("second request: expected 200, got %d", rec2.Code)
}
if v := rec2.Header().Get("X-SecuBox-Cache"); v != "hit" {
t.Fatalf("expected X-SecuBox-Cache: hit on second request, got %q", v)
}
if calls != 1 {
t.Fatalf("expected still 1 backend call (cache hit), got %d", calls)
}
}
// --- TestMediaCacheHandlerOversizeStreamsFullBody -------------------------------
// TestMediaCacheHandlerOversizeStreamsFullBody is a regression guard for the
// overflow branch of cachingResponseWriter.Write. It verifies that when an
// upstream returns a media response whose body exceeds 16 MiB (the cache object
// cap), the FULL body is still forwarded to the client — not truncated — and that
// the object is NOT stored in the cache (Get → ok=false).
//
// This test guards against any future refactor that might accidentally return
// early or drop bytes when the overflow flag is set, silently truncating large
// progressive-video downloads.
func TestMediaCacheHandlerOversizeStreamsFullBody(t *testing.T) {
const oversizeLen = 16*1024*1024 + 64*1024 // 16 MiB + 64 KiB
// Build a deterministic body so we can checksum it end-to-end.
bigBody := make([]byte, oversizeLen)
for i := range bigBody {
bigBody[i] = byte(i & 0xff)
}
wantSum := sha256.Sum256(bigBody)
dir := t.TempDir()
mc := NewMediaCache(dir)
const testURL = "http://video.example.com/big.mp4"
// Backend returns a video/mp4 response larger than maxObj.
backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "video/mp4")
w.Header().Set("Cache-Control", "max-age=3600")
w.WriteHeader(http.StatusOK)
if _, err := w.Write(bigBody); err != nil {
t.Errorf("backend Write error: %v", err)
}
}))
defer backend.Close()
backendAddr := strings.TrimPrefix(backend.URL, "http://")
h, p, err := splitHostPort(backendAddr)
if err != nil {
t.Fatalf("splitHostPort: %v", err)
}
srv := &Server{
mediaCache: mc,
routeLookup: func(host string) (string, int, bool) {
if host == "video.example.com" {
return h, p, true
}
return "", 0, false
},
}
handler := srv.handler()
req := httptest.NewRequest(http.MethodGet, testURL, nil)
req.Host = "video.example.com"
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
res := rec.Result()
if res.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", res.StatusCode)
}
// Read the FULL client body and verify nothing was truncated.
gotBody, err := io.ReadAll(res.Body)
if err != nil {
t.Fatalf("ReadAll response body: %v", err)
}
if len(gotBody) != oversizeLen {
t.Fatalf("client received %d bytes, want %d (full body truncated!)", len(gotBody), oversizeLen)
}
gotSum := sha256.Sum256(gotBody)
if gotSum != wantSum {
t.Fatal("client body checksum mismatch — content corrupted in overflow path")
}
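// Verify the oversize object was NOT cached. The handler keys entries as
// "https://" + host + path, so look up that key rather than testURL;
// a lookup under the http:// URL would miss regardless of what was stored.
_, _, cached := mc.Get("https://video.example.com/big.mp4")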
if cached {
t.Fatal("oversize object must NOT be cached (exceeds 16 MiB per-object cap)")
}
}
// --- TestMediaCacheVhostIsolation -----------------------------------------------
// TestMediaCacheVhostIsolation verifies that two different vhosts serving the
// same asset path receive independent cache entries. Without vhost-aware keys
// a /logo.png stored for siteA would collide with the lookup for siteB — cross-
// tenant content bleed.
//
// This mirrors the Python media_cache.py behaviour where r.pretty_url (full URL
// including host) is used as the cache key instead of path-only.
func TestMediaCacheVhostIsolation(t *testing.T) {
dir := t.TempDir()
mc := NewMediaCache(dir)
const path = "/x.png"
const hostA = "siteA.example.com"
const hostB = "siteB.example.com"
keyA := "https://" + hostA + path
keyB := "https://" + hostB + path
bodyA := []byte("LOGO_FOR_SITE_A")
bodyB := []byte("LOGO_FOR_SITE_B")
// Store an object for host A only.
reqA := makeGET("http://" + hostA + path)
respA := makeResp(200, "image/png", 3600, bodyA)
mc.MaybeStore(reqA, respA, bodyA, keyA)
// Host A should hit with its own content.
got, _, ok := mc.Get(keyA)
if !ok {
t.Fatal("expected cache HIT for host A")
}
if !bytes.Equal(got, bodyA) {
t.Fatalf("host A body mismatch: got %q want %q", got, bodyA)
}
// Host B — same path, different vhost — must be a MISS (vhost isolation).
_, _, ok = mc.Get(keyB)
if ok {
t.Fatal("expected cache MISS for host B (vhost isolation violated — cross-tenant bleed)")
}
// Store a different object for host B.
reqB := makeGET("http://" + hostB + path)
respB := makeResp(200, "image/png", 3600, bodyB)
mc.MaybeStore(reqB, respB, bodyB, keyB)
// Now host B hits with its own content, not host A's.
got, _, ok = mc.Get(keyB)
if !ok {
t.Fatal("expected cache HIT for host B after store")
}
if !bytes.Equal(got, bodyB) {
t.Fatalf("host B body mismatch: got %q want %q", got, bodyB)
}
// Host A must still serve its own content unchanged.
got, _, ok = mc.Get(keyA)
if !ok {
t.Fatal("expected cache HIT for host A to persist after host B was stored")
}
if !bytes.Equal(got, bodyA) {
t.Fatalf("host A body mismatch after host B store: got %q want %q", got, bodyA)
}
}
// Keep the os import referenced; the tests above use t.TempDir and never call os directly.
var _ = os.DevNull

@ -0,0 +1,317 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: sbxwaf — WAF decision parity harness
//
// TestWAFParity loads the production waf-rules.json (copied to testdata/) and
// the fixture corpus (testdata/waf-parity-fixtures.json), then replays each
// fixture through the exact decision path the production handler uses:
//
// 1. privateCIDR(ip) → verdict "skip"
// 2. staticAsset(path) → verdict "skip"
// 3. ncBypass(path) → verdict "skip"
// 4. Rules.Match(method, rawPath, rawQuery, body, ua)
// → no hit → verdict "allow"
// → hit → ban.Record(ip, now) → count<3 → "warn" / count>=3 → "ban"
//
// Each fixture's "expect" field must equal the computed verdict. Mismatches
// FAIL the test immediately with fixture name, expected, and got.
//
// Fixtures flagged "known_gap": true are EXPECTED to return "allow" (the Go
// engine skips the null-byte RE2 patterns that Python would catch). These rows
// are asserted against their documented gap behaviour AND log a visible
// "KNOWN GAP" line so coverage loss is never silent.
//
// Ban sequencing: fixtures sharing the same client_ip are processed in JSON
// order; the ban counter accumulates across all fixture rows for that IP.
// Fixtures with "_ban_sequence" > 1 rely on prior fixture rows having already
// been processed for the same client_ip.
package main
import (
"encoding/json"
"fmt"
"os"
"path/filepath"
"runtime"
"testing"
"time"
)
// parityFixture is one row in waf-parity-fixtures.json.
type parityFixture struct {
// Fixture identity
Name string `json:"name"`
// HTTP request parameters (raw, not decoded)
Method string `json:"method"`
Path string `json:"path"`
Query string `json:"query"`
Body