mirror of
https://github.com/CyberMind-FR/secubox-deb.git
synced 2026-06-29 19:43:10 +00:00
Compare commits
42 Commits
69f19e72da...381eb3b8f5
| Author | SHA1 | Date | |
|---|---|---|---|
| | 381eb3b8f5 | | |
| | f9affe1e8b | | |
| | eea4632642 | | |
| | c7d354a153 | | |
| | 8e009e0aa6 | | |
| | e0cd433485 | | |
| | 8ffe54ee0d | | |
| | 449b28f8a1 | | |
| | 4ef6d3aa76 | | |
| | af76e33b45 | | |
| | 8df8f4d181 | | |
| | 70d35eb7f2 | | |
| | 73795bb3c3 | | |
| | 03fdc8fe14 | | |
| | 223f81ac63 | | |
| | bf022f618f | | |
| | 9df984c73f | | |
| | 5acfdb17c6 | | |
| | 364b8c4a30 | | |
| | ba933a6ec3 | | |
| | 67e85ba4dd | | |
| | 5fb67f5b88 | | |
| | c870b6362b | | |
| | de15a18c30 | | |
| | f65af3355c | | |
| | 7355e606ca | | |
| | e594f681a4 | | |
| | 0db96a8beb | | |
| | 667d8a09e0 | | |
| | 170619053f | | |
| | 25f6c19586 | | |
| | 6dcf978e66 | | |
| | df052796d9 | | |
| | 5fc8785d68 | | |
| | 25a3afaff1 | | |
| | 84f0a37fdf | | |
| | ca9b38b175 | | |
| | 8a4996d14c | | |
| | da71515d79 | | |
| | 73e79b85b4 | | |
| | 56d1bee9fb | | |
| | 6daacb1987 | | |
@ -3,6 +3,70 @@

---

## 2026-06-18 — #662 Phase 7: Python R3 engine DECOMMISSIONED + nft persistence

- **nft persistence** (master `eea46326`): the boot re-apply source is the drop-in
  `/etc/nftables.d/zz-secubox-toolbox-wg-fanout.nft` (loaded by nftables.service). Edited
  it `808x→809x` (live already 809x → zero disruption), `nft -c -f` validated reboot-safe;
  patched the repo source `packages/secubox-toolbox/nftables.d/secubox-toolbox-wg-fanout.nft`.
- **Python decommissioned**: `disable --now secubox-toolbox-mitm-wg-worker@{1..4}` +
  `-mitm-wg-dynreload.path` → 8081-8084 free, **~240M RAM freed**. Units kept (disabled)
  for emergency rollback. **Kept** `secubox-toolbox-mitm.service` (R2 captive-AP mitm on
  10.99.0.1:8080 — a different path; the cutover was R3-only). Also pointed the board's
  `/usr/share/.../secubox-toolbox-wg-fanout.nft` → 809x so a postinst re-run can't revert
  to dead ports.
- **Verified self-sufficient with Python gone**: banner injects on gzip HTML, ads 204,
  redirects relayed 301.
- Deliberately did NOT rebuild+reinstall the secubox-toolbox .deb (portal-restart blip +
  board-wide nft reload, gratuitous) — repo source is 809x, the next natural build closes
  the installed-payload drift. **#662 epic complete: Go engine sole R3 MITM, fast, ~64MB
  vs ~280-470MB, persistent, ad-block + banner + redirects all correct.**

## 2026-06-18 — #662 R3 CUTOVER to the Go MITM engine (PR #670) — LIVE + banner ported

- **Cutover executed and live.** The Go engine now serves **100% of R3 traffic**,
  replacing the Python mitmproxy workers. Found + fixed 4 blockers that made the dark
  package unable to serve the live path: (1) it forged with the wrong CA (ca-wg "WG CA"
  vs the "R3 CA" clients trust) → now uses the mitmproxy confdir bundle; (2) root-only
  key vs non-root user → R3 CA bundle is group-readable; (3) bound 127.0.0.1 vs the
  10.99.1.1 DNAT target → now binds 10.99.1.1; (4) ran CONNECT vs transparent → now
  `--transparent`. `loadCA` scans PEM blocks by type (combined cert+key bundle).
- **Validated on real arm64 hardware** then rolled out gated: localhost forge against
  the real R3 CA → scoped-DNAT transparent capture → **canary slot 3 (~25%, dead-man
  armed)** → **widen to 100%**. At 100%: 0 restarts, 0 errors, ~64MB total
  (vs Python ~280-470MB), even round-robin, 142 distinct SNIs/75s.
- **Banner ported** (the one regression the user caught — "no more banner but fast").
  Go now injects the real loader `<script src="/__toolbox/loader.js" data-mh=.. data-wg=..>`
  (guard-idempotent, R3 wg flag, mac_hash identity) and reverse-proxies
  `/__toolbox/loader.js`+`/__toolbox/bundle` to the portal (127.0.0.1:8088, fail-open),
  keeping bundle/level logic in Python. Verified live: loader injected + assets 200.
- **Rollback** = one `nft replace` (Python workers kept warm). **Persistence gap**: the
  nft flip is a live edit, not yet in the drift-managed generator → reboot safely falls
  back to Python (workers enabled, banner intact). Phase 7 (decommission Python +
  persist nft) deferred to a soak'd follow-up.
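
The gated rollout above (canary slot at ~25%, then 100%, rollback by flipping the map back) can be modelled in a few lines. This is only a sketch of the round-robin arithmetic, not the nft ruleset: the port lists are an illustrative reading of `808x→809x`, and treating "slot 3" as 0-based index 3 is an assumption.

```python
from collections import Counter

# Illustrative port sets (assumption): Python workers on 808x, Go workers on 809x.
PYTHON_PORTS = [8081, 8082, 8083, 8084]
GO_PORTS = [8091, 8092, 8093, 8094]


def fanout(slot_map, n_connections):
    """Model a round-robin fanout: connection i lands on slot_map[i mod N]."""
    hits = Counter()
    for i in range(n_connections):
        hits[slot_map[i % len(slot_map)]] += 1
    return hits


def go_share(slot_map, n_connections=4000):
    """Fraction of connections that reach a Go port under the given slot map."""
    hits = fanout(slot_map, n_connections)
    return sum(c for port, c in hits.items() if port in GO_PORTS) / n_connections


# Canary: only one of the four slots points at the Go engine.
canary = list(PYTHON_PORTS)
canary[3] = GO_PORTS[3]

if __name__ == "__main__":
    for name, slot_map in [("python", PYTHON_PORTS), ("canary", canary), ("go", GO_PORTS)]:
        print(name, f"{go_share(slot_map):.0%}", dict(fanout(slot_map, 4000)))
```

Because the map is the only thing that changes between stages, rollback in this model is just swapping `slot_map` back, which mirrors why the workers were kept warm.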

## 2026-06-18 — #662 MITM engine migration: P5-prep + P6-prep (PRs #668, #669, all DARK)

- **P5-prep (PR #668).** Wired the ported `Decide`+jar into the Go engine's request/
  response handlers: `handleConnect` runs allow/splice/block/mitm; `anonymizeRequest`
  (strip operator/re-id headers + DNT/GPC) on every MITM'd flow; cookie-poison gated
  to mitm+tracker only (never allow/own-infra; fail-closed-to-clean; benign cookies +
  Set-Cookie attrs preserved). New `secubox-toolbox-ng` debian pkg builds an arm64
  `.deb` shipping `/usr/sbin/sbxmitm` + a **DISABLED** `worker@.service` on `:809%i`
  (no enable/start, no nft). 22 Go tests, reviewed APPROVED.
- **P6-prep (PR #669).** No-traffic build-out of the live transparent path, still DARK.
  `machash.go` ports `mac_hash_of`/`_wg_hash_of` (WG peers → `sha256(pubkey)[:16]`,
  mtime-cached, fail-open) wired into `clientHashFromConn`, cross-engine parity vs
  Python (anti-rig verified). Transparent `SO_ORIGINAL_DST` accept (`--transparent`,
  default off): peeks ClientHello SNI WITHOUT decrypting → Decide → **splice = true raw
  passthrough** (never `tls.Server`) / else forge via replayable `prefixConn`; upstream
  TLS verifies by SNI, pins captured ip:port. Two-stage review caught + fixed a
  splice-decrypt defect. Builds linux/arm64+amd64+darwin, vet clean, race green, Python
  parity 10 passed. CONNECT path + poison gate byte-unchanged.
- **Engine now functionally complete + packaged, entirely DARK.** Remaining work =
  the production DEPLOYMENT phases (shadow → cutover → decommission), which touch live
  R3 traffic and are deferred to a deliberate watched session — NOT chained off "go".
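
The "peek ClientHello SNI without decrypting" step in P6-prep relies on the SNI being sent in cleartext in the first TLS record. A minimal sketch of that parse follows, in Python for brevity (the engine itself is Go); `peek_sni` and `build_client_hello` are hypothetical helper names, and the parser assumes the whole ClientHello fits in one record.

```python
import struct


def peek_sni(record: bytes):
    """Return the SNI hostname from a raw TLS ClientHello record, or None.

    Nothing is decrypted: the ClientHello is plaintext on the wire.
    """
    try:
        if record[0] != 0x16:            # not a TLS handshake record
            return None
        pos = 5                          # skip record header (type, version, length)
        if record[pos] != 0x01:          # not a ClientHello
            return None
        pos += 4                         # handshake type + 3-byte length
        pos += 2 + 32                    # client_version + random
        pos += 1 + record[pos]           # session_id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len            # cipher suites
        pos += 1 + record[pos]           # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:            # server_name extension
                # layout: list_len(2) name_type(1) name_len(2) name
                if record[pos + 2] != 0:
                    return None
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
        return None
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                      # malformed input: caller decides what to do


def build_client_hello(host: str) -> bytes:
    """Build a minimal synthetic ClientHello carrying only an SNI extension."""
    name = host.encode("ascii")
    sni = struct.pack("!HBH", len(name) + 3, 0, len(name)) + name
    ext = struct.pack("!HH", 0, len(sni)) + sni
    body = (b"\x03\x03" + bytes(32) + b"\x00"        # version, random, empty session id
            + struct.pack("!H", 2) + b"\x13\x01"     # one cipher suite
            + b"\x01\x00"                            # null compression
            + struct.pack("!H", len(ext)) + ext)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake
```

The peeked bytes must then be replayed to whichever side handles the connection (raw splice or forging TLS server), which is the role the entry assigns to `prefixConn`.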

## 2026-06-18 — #656 Ad Intelligence (PR #657, toolbox 2.6.56) + splice reverted

- **Ad Intelligence — learn/act/measure.** `ad_ghost` now records every
55
docs/superpowers/plans/2026-06-18-mitm-engine-migration.md
Normal file
@ -0,0 +1,55 @@

# Toolbox MITM engine migration — phased plan (#662)

> Engine: **Go hot-path core + retained Python analysis sidecars** (see analysis doc).
> Discipline: shadow-run before cutover; nft-DNAT flip = instant rollback at every step; NEVER big-bang. This is a multi-PR epic — each phase is its own PR with a gate.

## Invariants (must hold every phase)

- Reuse the existing CA `/etc/secubox/toolbox/ca-wg/{ca.pem,key.pem}` (what R3 clients already trust) — no new CA, no client re-enroll.
- Live R3 keeps running on the Python mitmproxy workers (8081-8084) until the final cutover. The Go core runs on **separate ports (8090-8093)**, no DNAT, until Phase 6.
- Ad-blocking + anti-track must never regress (the whole point of the appliance).
- arm64; one static Go binary; systemd `secubox-toolbox-ng-worker@N`.

## Phase 1 — PoC (THIS PR) — GATE: compiles + smoke test passes

**packages/secubox-toolbox-ng/** (Go module). NOT wired to live R3.

- `go.mod`, `cmd/sbxmitm/main.go`: a forging MITM that loads `ca-wg/{ca.pem,key.pem}`, listens on a port, and demonstrates the discriminating capabilities:
  - request short-circuit **204** for a sample ad host (proves ad_ghost block),
  - response **body inject** of a marker (proves banner/ad CSS),
  - **SNI splice** passthrough for a sample host (proves tls_splice),
  - **JA4 ClientHello capture** via a `crypto/tls` shim logging cipher suites/exts (proves the Go JA4 gap is closable).
- Smoke test (`make test` / a shell script): build for host, run, `curl -x`/transparent a request through it, assert the 204 + the injected marker + a JA4 line.
- `README.md`: build (`GOOS=linux GOARCH=arm64 go build`), the capability map, and the phase roadmap.
- **No deb packaging, no board deploy, no DNAT.** Pure de-risking spike.

## Phase 2 — arm64 build + board bench (no traffic) — GATE: forge+throughput ≥ mitmproxy

- CI/build: cross-compile arm64 static binary; debian packaging stub `secubox-toolbox-ng` (binary + systemd unit, unit DISABLED).
- Deploy the binary to gk2, run on :8090 (no DNAT). Bench: cert-forge latency (cold/warm), req/s, multi-core CPU under synthetic load vs a mitmproxy worker. Confirm it reuses ca-wg certs (client trusts forged leaf).

## Phase 3 — hot-path feature parity — GATE: parity tests green

Port the cheap per-request rewrites into the Go core, reading the SAME data files:

- block 204 from `_AD_HOST`-equivalent + learned-trackers.txt + pure-trackers.txt, with `ad-allowlist.txt` + own-infra guard (#658) honored.
- header/cookie strip (utiq/protective/anonymize), XFF.
- serve `/__toolbox/loader.js` + `/__toolbox/bundle`; banner inject (buffer + streaming).
- SNI splice from the media seed + learned-splice (the safe, no-auto-promote version).
- Parity harness: feed recorded request/response fixtures to both engines, diff the block/inject/strip decisions.
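
The parity harness in the last bullet can be sketched as a plain differ over decision records. Everything below is hypothetical scaffolding: the fixture shape, the `block`/`inject`/`strip` decision keys, the `x-operator-id` header, and both toy engines stand in for the real Python addons and Go core, whose interfaces are not shown here.

```python
AD_HOSTS = {"doubleclick.net"}            # sample ad host, as in the Phase 1 smoke test
STRIP_HEADERS = {"x-operator-id"}         # hypothetical operator header name


def reference_decide(fx):
    """Toy stand-in for the reference engine's decision on one fixture."""
    host = fx["host"].lower()
    blocked = host in AD_HOSTS
    return {
        "block": blocked,
        "inject": (not blocked) and fx.get("content_type", "").startswith("text/html"),
        "strip": sorted(h.lower() for h in fx.get("headers", []) if h.lower() in STRIP_HEADERS),
    }


def candidate_decide(fx):
    """Toy stand-in for the new engine, with a deliberate case-sensitivity bug."""
    blocked = fx["host"] in AD_HOSTS      # bug: no lower()
    return {
        "block": blocked,
        "inject": (not blocked) and fx.get("content_type", "").startswith("text/html"),
        "strip": sorted(h.lower() for h in fx.get("headers", []) if h.lower() in STRIP_HEADERS),
    }


def parity_diff(fixtures, engine_a, engine_b):
    """Return one record per fixture on which the two engines disagree."""
    diffs = []
    for fx in fixtures:
        a, b = engine_a(fx), engine_b(fx)
        if a != b:
            changed = sorted(k for k in a if a[k] != b.get(k))
            diffs.append({"fixture": fx, "fields": changed, "a": a, "b": b})
    return diffs


FIXTURES = [
    {"host": "example.com", "content_type": "text/html", "headers": ["X-Operator-Id"]},
    {"host": "doubleclick.net", "content_type": "text/html"},
    {"host": "DoubleClick.net", "content_type": "text/html"},
]
```

A gate of "parity tests green" then reduces to `parity_diff(...) == []` over the recorded fixture set.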

## Phase 4 — analysis sidecars + anti-track poison — GATE: sidecar contract tests

- Go core fires unix-socket events (fire-and-forget) to the EXISTING Python services for social-graph / dpi / cookies / avatar / soc / ja4-scoring — reuse their socket contracts; they stay Python, off the hot path.
- Port the deterministic anti-track **HMAC jar + Set-Cookie forge** to Go (small, security-critical → exhaustive tests vs the Python `privacy.py` jar output for identical inputs).
- Contextual ad metrics (ad_block_stats / per-visitor) written by a sidecar or the Go core's bg writer.

## Phase 5 — SHADOW run — GATE: N-day output parity, zero client breakage

- Run the Go core on :8090-8093. Mirror a SMALL fraction of R3 (e.g. one fanout slot, or a passive tee) to it; compare its would-block/would-inject/recorded against the live mitmproxy for the same flows. Do NOT serve clients from it yet.
- Soak; review divergences; fix; repeat until parity.

## Phase 6 — CUTOVER — GATE: soak, instant rollback ready

- Flip the nft `numgen inc mod 4` fanout from 8081-8084 (mitmproxy) → 8090-8093 (Go core). Keep the mitmproxy workers RUNNING (stopped from receiving DNAT, but up) so rollback = flip the map back (seconds).
- Soak under real load; watch ad-blocking, banner, anti-track, JA4, latency, CPU.

## Phase 7 — decommission — GATE: stable post-cutover window

- Stop/disable the mitmproxy workers; keep the package installed (rollback) for one release, then remove.

## Rollback

At every phase the live path is the mitmproxy workers until Phase 6's DNAT flip; Phase 6 rollback is an nft map edit (seconds). No phase removes the fallback until Phase 7.

## Effort/risk (honest)

Weeks across 7 PRs. Highest-risk areas: JA4-in-Go (de-risked in Phase 1), the anti-track poison port (Phase 4, exhaustively tested), and the cutover (Phase 6, shadow-gated + instant rollback). Recommend pausing after each gate for review.

@ -0,0 +1,120 @@

# Toolbox MITM engine migration — analysis (gomitmproxy / martian·goproxy / hudsucker / Squid+ICAP)

- **Date:** 2026-06-18 · **Issue:** #662 · **Status:** analysis + recommendation

## Why

The R3 path runs Python **mitmproxy**: GIL-bound, ~1 core total across 4 workers,
the tunnel's CPU/latency ceiling (#646). Goal: a multi-core engine **without
losing the 18-addon feature set**. TLS termination was never the bottleneck —
the single-thread L7 work is — so a bare TLS proxy is a non-starter (loses every
feature). The only worthwhile target is a faster **L7 engine** that re-implements
the inline logic.

## The real requirement: our 18 addons' capabilities

| # | Addon | Capability it needs |
|---|-------|---------------------|
| 1 | inject_xff | requestheaders: set XFF from real peer IP |
| 2 | utiq_defense | requestheaders: detect/strip operator (Utiq) headers; short-circuit |
| 3 | protective_mode | requestheaders: strip tracker headers/cookies, spoof |
| 4 | privacy_guard (anti-track v2) | **request 204 / forge Set-Cookie (HMAC jar) / strip headers**; classify; file+key reads |
| 5 | ad_ghost | request **204** + candidate/per-visitor capture; response **CSS body inject**; allowlist; bg SQLite |
| 6 | media_cache | response synthesis from disk cache (range) |
| 7 | local_store | **tls_clienthello** read + async SQLite |
| 8 | social_graph | response cookie-id correlation + **body peek** + SQLite |
| 9 | inject_banner | request short-circuit **serve** /__toolbox/*; **streaming** body inject + buffered inject; CSP detect |
| 10 | dpi | async fire-and-forget POST (unix socket) |
| 11 | cookies | response Set-Cookie read → async POST |
| 12 | avatar | UA → async POST |
| 13 | ja4 | **raw TLS ClientHello** (cipher suites, extensions, ALPN) |
| 14 | soc_relay | events → async POST |
| 15 | cert_pin_detect | **TLS handshake-error** hook → learn ignore_hosts |
| 16 | media_stats | response headers → stats |
| 17 | tls_splice | **tls_clienthello SNI → connection passthrough** (ignore_connection) |
| 18 | (dpi dup/util) | — |

Capability buckets that discriminate the engines:

- **(C)** request short-circuit (return 204/synth without upstream) — ad_ghost, privacy_guard, inject_banner, media_cache.
- **(E)** **streaming** response body rewrite (inject into first chunk, no buffering) — inject_banner TTFB path.
- **(G)** **raw ClientHello introspection** for JA4 — ja4, local_store.
- **(H)** **TLS-layer SNI passthrough/splice** — tls_splice, cert_pin_detect, bypass list.
- **(I)** TLS handshake-error hook — cert_pin_detect.
- **(J)** async side-effects (socket POST / bg SQLite) — 7 addons.

## Engine assessment

### gomitmproxy (Go, AdguardTeam) — DROP

Purpose-built for ad-blocking MITM, but **last release v0.2.1 (2021), effectively
unmaintained**. Reusing an abandoned TLS-handling core for a security appliance
is the wrong bet. Cross off.

### martian (Google) / goproxy (elazarl) — Go, maintained

- Strong on **B/C/D/F/J** (modifier/handler APIs return custom responses, modify
  headers/cookies/body; goroutines for async). Easy **arm64 cross-compile**
  (`GOOS=linux GOARCH=arm64`), single static binary — great fit for the appliance.
- **Gaps:** **(G) JA4** — both abstract TLS at the HTTP layer; raw ClientHello
  isn't exposed by the modifier API. *Workaround:* wrap the listener with our own
  `crypto/tls` `Config.GetConfigForClient`/`GetCertificate` to capture the
  ClientHello before handing to the proxy — feasible, extra code. **(E) streaming
  inject** is manual (wrap the response body reader). **(H/I)** host-level
  splice/cert-error handling is doable at the CONNECT layer.
- Verdict: pragmatic, lowest-friction toolchain, but JA4 + streaming need custom
  glue.
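
The "(E) streaming inject" glue mentioned above amounts to wrapping the body stream and rewriting only the first chunk that contains the anchor, passing everything else through untouched. A language-neutral sketch, written in Python rather than Go for brevity; `inject_streaming` and the `<head>` anchor are assumptions for illustration, and the marker is the PoC's `<!-- sbx-ng banner -->`.

```python
def inject_streaming(chunks, marker: bytes, anchor: bytes = b"<head>"):
    """Yield response chunks, inserting `marker` right after the first `anchor`.

    Chunks are forwarded as they arrive (no full-body buffering), so time to
    first byte is preserved. Limitation of this sketch: an anchor split across
    two chunks is not detected.
    """
    injected = False
    for chunk in chunks:
        if not injected and anchor in chunk:
            chunk = chunk.replace(anchor, anchor + marker, 1)
            injected = True
        yield chunk


if __name__ == "__main__":
    body = [b"<html><head>", b"<title>x</title></head>", b"<body>hi</body></html>"]
    for out in inject_streaming(body, b"<!-- sbx-ng banner -->"):
        print(out)
```

A real wrapper would also have to drop or recompute `Content-Length` and handle compressed bodies, which is where the buffered fallback path comes in.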

### hudsucker (Rust, omjadas + ideamans fork) — maintained

- **Best technical coverage:** tokio/hyper async (**multi-core**), `HttpHandler`
  (C/D/F), **streaming bodies (E)** native, WebSocket. Critically, **rustls
  exposes the ClientHello** (Acceptor/`ClientHello` peek pre-handshake) → **JA4
  (G) is clean**, and SNI-based **splice (H)** is natural.
- **Costs:** Rust **arm64 cross-compile friction** (no toolchain here; needs
  `cross`/musl setup), and porting 18 addons + the anti-track HMAC-jar/classify
  brain to Rust is the **highest re-implementation + re-validation effort**.
- Verdict: technically the strongest (only one covering JA4 + streaming cleanly),
  but the heaviest port + ops.

### Squid + ssl-bump + ICAP — mature C, multi-process

- **Native wins:** ssl-bump forges from one root key (A), **peek-and-splice (H)
  is literally tls_splice + the bypass list**, native cert-error handling (I),
  multi-process scaling. ICAP REQMOD/RESPMOD covers **C/D/F** (204, body rewrite,
  header/cookie mod) — ad_ghost/banner-buffer/poison can live in an ICAP service.
- **Gaps:** **(E) streaming** inject — ICAP buffers, no first-chunk inject.
  **(G) JA4** — ICAP is post-decrypt HTTP; ClientHello isn't exposed to ICAP
  (Squid logs its own TLS details, not via ICAP). Heavy **ops/config**; each ICAP
  call is a round-trip; the anti-track HMAC-jar/poison + social-graph logic in an
  ICAP service is awkward (still Python, still off-core for analysis).
- Verdict: least *custom proxy* code + native splice/cert handling, but loses
  JA4 + streaming-banner and trades Python addons for Squid-config + an ICAP
  service. Good if we drop JA4/streaming; otherwise a poor fit.

## Recommendation — **Go hot-path core + retained Python analysis sidecars** (hybrid)

Single-engine "rewrite everything in Rust" is the highest risk; Squid loses JA4 +
streaming. The lowest-risk path to multi-core that **preserves the
security-validated Python brain**:

1. **Go core** (goproxy/martian or a thin `net/http`+`crypto/tls` forging proxy)
   owns the **hot path**: TLS forge (reusing `ca-wg`), SNI splice (H), the cheap
   per-request rewrites — block 204 (ad_ghost/privacy_guard), header/cookie strip
   (utiq/protective/anonymize), banner inject (E via body-reader wrap), serve
   /__toolbox/*. Multi-core, one static arm64 binary.
2. **JA4 (G)** in Go via a `crypto/tls` ClientHello-capture shim (no Python).
3. **Heavy/off-path analysis stays Python sidecars** the Go core feeds
   fire-and-forget over unix sockets (J): social-graph correlation, classify,
   DB/report writers, SOC/DPI relays. These are already async + off the hot path,
   so they don't need to be fast — and we DON'T re-validate the anti-track
   HMAC-jar/poison + cookie-graph security logic in a new language.
4. The anti-track **poison** (forge Set-Cookie from the HMAC jar) is hot-path +
   security-critical → port the *deterministic* jar/forge to Go (small, testable),
   keep classify (which list a host is on) as data the Go core reads from the
   learned/pure files (already file-based).

This gets multi-core on the hot path, keeps the risky brain in validated Python,
and only ports the small, mechanical, hot pieces. If JA4-in-Go proves painful, the
fallback is **hudsucker** (Rust) for the core (clean JA4) at higher port cost.

## Honest effort/risk

- **Weeks, multi-PR.** 18 addons; security-critical; production board.
- Must **shadow-run** the new core alongside mitmproxy (mirror a fraction of R3
  traffic) and compare before any cutover. **Never** big-bang.
- Rollback = the nft fanout still points at the mitmproxy workers until the final
  cutover flips the DNAT to the Go core's ports.

See the phased plan: `docs/superpowers/plans/2026-06-18-mitm-engine-migration.md`.

@ -0,0 +1,62 @@

# MITM engine migration — Phase 2 bench results (#662)

- **Date:** 2026-06-18 · ran the Phase-1 Go PoC on gk2 (arm64), `127.0.0.1:8090`,
  **no DNAT** (zero impact on live R3, which stayed on the mitmproxy workers).

## Proven on the real arm64 board (with the live `ca-wg` CA)

| Check | Result |
|-------|--------|
| Static arm64 binary | 5.4 MB, `ELF aarch64`, CGO_ENABLED=0 — runs natively on gk2 |
| **CA-compat forging** | `curl -x :8090 --cacert ca.pem https://example.com/` → **200**; the forged leaf (signed by the existing `ca-wg` CA) is trusted — R3 clients would trust it, no re-enroll |
| **MITM + body inject** | injected `<!-- sbx-ng banner -->` marker present in the HTML |
| **204 block** | `https://doubleclick.net/` → **204** (ad_ghost path) |
| **JA4 capture** | live: `t0304_c31_ah2_sni=example.com` (TLS1.3 / 31 ciphers / ALPN h2 / SNI) — the `ja4` addon's material is reachable in Go on arm64 |
| **Footprint** | **~12 MB RSS** vs Python mitmproxy ~70–117 MB per worker |

So every **discriminating capability** the analysis flagged (CA-compat, request-204,
body-inject, SNI-splice, JA4) works on the actual hardware, at roughly 1/6 to 1/10
of a mitmproxy worker's memory (~12 MB vs ~70–117 MB).

## Gate: "forge + throughput ≥ mitmproxy" — PARTIAL

- **Forge:** ✅ proven (CA-compat, cached per host, fast).
- **Footprint:** ✅ ~12 MB (far below mitmproxy).
- **Throughput / multi-core:** ⚠️ **not cleanly measured.** The instantaneous-CPU
  sample was cut short by (a) a transient `wg-admin` ssh blip and (b) a
  `pkill -f sbxmitm` self-match bug (the kill matched its own ssh shell). Multi-core
  is **structurally guaranteed** — Go runs `GOMAXPROCS=4` with no GIL, vs Python
  mitmproxy capped ~1 core/worker — but a rigorous throughput-vs-mitmproxy
  comparison must be done in a **controlled load environment**, NOT by hammering
  the production board.

## Phase 2b — controlled multi-core throughput bench (SETTLES the gate)

`BenchmarkHandshake` (cmd/sbxmitm/bench_test.go) drives full client↔proxy forged
TLS handshakes in parallel at `-cpu=1,2,4,8` (dev box, warm forge cache):

| Cores | ns/handshake | handshakes/s | scaling |
|-------|--------------|--------------|---------|
| 1 | 398,895 | ~2,510 | 1.00× |
| 2 | 204,116 | ~4,900 | **1.95×** |
| 4 | 117,307 | ~8,520 | **3.40×** |
| 8 | 86,999 | ~11,490 | 4.58× |

Near-linear to 2 cores, **3.40× at 4 cores** (gk2's core count) — the Go core's
throughput **scales with cores**, whereas a GIL-bound Python mitmproxy worker
stays ~1 core regardless. So on a 4-core board like gk2 the Go core should reach
~3.4× its own single-core handshake rate (the bench ran on the dev box and did not
measure a Python worker directly); even the single-core ~2,510 handshakes/s dwarfs
the toolbox's real load (a few clients).
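
The derived columns in the table follow directly from the measured ns/handshake figures. A small sketch of that arithmetic, using only the four measurements above; the parallel-efficiency column is an extra derived figure, not something the bench reported.

```python
# Measured ns per handshake at each -cpu setting (from the table above).
NS_PER_HANDSHAKE = {1: 398_895, 2: 204_116, 4: 117_307, 8: 86_999}


def derive(ns_by_cores):
    """Compute handshakes/s, speed-up vs 1 core, and parallel efficiency."""
    base = ns_by_cores[1]
    rows = {}
    for cores, ns in sorted(ns_by_cores.items()):
        scaling = base / ns
        rows[cores] = {
            "per_s": 1e9 / ns,               # one second is 1e9 ns
            "scaling": scaling,              # speed-up relative to the 1-core run
            "efficiency": scaling / cores,   # 1.0 would be perfectly linear
        }
    return rows


if __name__ == "__main__":
    for cores, r in derive(NS_PER_HANDSHAKE).items():
        print(f"{cores} cores: {r['per_s']:,.0f} handshakes/s, "
              f"{r['scaling']:.2f}x, {r['efficiency']:.0%} efficiency")
```

The table's handshakes/s values are these results rounded to the nearest ten, which is why they carry a `~`.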

## Conclusion (Phase 2 + 2b)

Migration premise **validated on real hardware**: CA-compat + all L7/TLS
discriminators + ~12 MB footprint (arm64) + **multi-core throughput scaling**
(3.4× at 4 cores). The big unknowns are answered; what remains is
mechanical-but-large porting (Phase 3+) + a gated cutover.

## Ops note

The PoC was localhost-only (`127.0.0.1:8090`), no DNAT, cleaned up (`fuser -k
8090/tcp` + binary removed). LESSON: never `pkill -f <name>` over ssh when `<name>`
appears in the remote command line — it kills its own shell; use `fuser -k
<port>/tcp` or `pgrep | grep -v $$` + kill-by-PID.

## Next

Phase 2 + 2b gates PASSED. → **Phase 3** (hot-path feature parity: port block/
inject/strip/splice reading the real data files, parity harness vs the Python
addons). Pause for review before committing to the port — see the phased plan.

@ -0,0 +1,173 @@

# WAF engine migration — feasibility analysis (#662 follow-on)

> Status: ANALYSIS ONLY. No code, no plan, nothing touched on the live WAF.
> Question asked: *"can the #662 Go-engine technique be adapted to the WAF?"*
> Date: 2026-06-18. Sibling of `2026-06-18-mitm-engine-migration-analysis.md`.

## TL;DR

Technically yes — and the hardest part of #662 (cert forging / transport / CA
trust) **does not exist** for the WAF, because HAProxy already terminates TLS and
hands mitmproxy cleartext. But the right move is **NOT** to hand-roll a Go WAF the
way we hand-rolled the R3 engine. The WAF's decision logic is security-critical and
synchronous (block-before-forward), which is exactly where bespoke code is most
dangerous. The recommendation is to **ADOPT** a vetted engine (OWASP Coraza + CRS v4)
rather than port our bespoke regex rules, and — if the non-WAF addons can be
relocated — to **retire the in-path mitmproxy entirely** via HAProxy's SPOA, which
also eliminates the WAF's worst failure mode (the single-backend SPOF that "downs all
inspected vhosts").

Crucially, **the perf premise is weaker than #662's.** #662 had a measured CPU/latency
ceiling on the R3 tunnel. The WAF is *not* currently throughput-bound. So the
justification here is **resilience + security coverage + fewer band-aids**, not raw
speed. Be honest about that when deciding whether it's worth the risk.

---

## 1. What the WAF actually is (grounded, repo + live board)

- **Reverse-proxy inspector**, not a transparent/forward MITM like R3. Path:
  external client → **HAProxy `*:443 ssl` (TLS 1.3 termination)** → cleartext HTTP →
  **mitmproxy `--mode regular` in the `mitmproxy` LXC (`10.100.0.60:8080`)** →
  backend vhosts. HAProxy rewrites to absolute-form (`set-uri http://Host/path`) so
  the forward-proxy accepts it.
- **No TLS / no cert machinery on the WAF side.** mitmproxy never decrypts, never
  forges, holds no CA. (This removes the entire hard half of the #662 port.)
- **Hot path (every request), deterministic:** host→backend dict lookup
  (live-reloaded from `/srv/mitmproxy/haproxy-routes.json`, 255 entries, 187 routed
  through inspection), then a single linear **regex scan** over
  `path+query+body+UA` against `waf-rules.json` (~90+ patterns: sqli/xss/cmdi/
  traversal/ssrf/xxe/log4shell/scanners/cve…), first-match-wins. Block = set
  `flow.response` to short-circuit → **synchronous, decide-before-forward**.
- **Enforcement is graduated and mostly soft:** 1st/2nd hit → 403 *warning page*;
  3rd hit in 300 s (`BAN_THRESHOLD=3`) → ban via **CrowdSec LAPI** (`POST /v1/alerts`,
  JWT watcher) → `crowdsec-firewall-bouncer` drops at nft. The CrowdSec POST is a
  **synchronous `urllib` call (up to ~4 s) inside the request hook** — the clearest
  GIL/latency smell, trivially a goroutine in Go.
- **Stateful bits are small:** per-IP sliding-window dict (in-memory, lost on
  restart; hit 1500+ entries under attack). Everything else is stateless.
- **Three NON-WAF addons ride the same proxy:** `media_cache.py` (#607 disk cache for
  owned-vhost media), `cookie_audit.py` (RGPD Set-Cookie ledger, observational),
  and CDN **banner injection** (`response` hook, injects `<script>` before `</body>`
  on owned vhosts). These do **traffic transformation / caching** — a verdict-only
  WAF (SPOA) would not cover them; their fate must be decided (relocate, drop, or
  keep a thin in-path component).
- **Two synced package copies:** `packages/secubox-mitmproxy/` (canonical, 1193-line
  addon, CrowdSec bridge + watchdog + FastAPI control) and the legacy
  `packages/secubox-waf/` (968-line, ships `wafctl` + the LXC unit). Sync-lag is a
  known liability (`.claude/TODO.md`).
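
The hot path and the graduated enforcement described above can be sketched as follows. This is a toy reconstruction from the description, not the addon's code: the three patterns are placeholders for the ~90+ entries in `waf-rules.json`, and `scan` / `StrikeCounter` are hypothetical names. Only `BAN_THRESHOLD=3` and the 300 s window come from the text.

```python
import re
from collections import defaultdict, deque

BAN_THRESHOLD = 3     # from the description: 3rd hit triggers the ban
WINDOW_S = 300        # sliding window, seconds

# Placeholder rules (assumption); order matters because the first match wins.
RULES = [
    ("sqli", re.compile(r"(?i)union\s+select")),
    ("traversal", re.compile(r"\.\./")),
    ("xss", re.compile(r"(?i)<script")),
]


def scan(path, query="", body="", ua=""):
    """Linear first-match-wins scan; returns the rule name or None."""
    haystack = " ".join((path, query, body, ua))
    for name, pattern in RULES:
        if pattern.search(haystack):
            return name
    return None


class StrikeCounter:
    """Per-IP sliding window of hit timestamps (in-memory, lost on restart)."""

    def __init__(self):
        self.hits = defaultdict(deque)

    def record(self, ip, now):
        """Register a hit at time `now` (seconds); return 'warn' or 'ban'."""
        window = self.hits[ip]
        window.append(now)
        while now - window[0] >= WINDOW_S:   # drop hits older than the window
            window.popleft()
        return "ban" if len(window) >= BAN_THRESHOLD else "warn"
```

In this shape the ban side effect (the CrowdSec POST) is the only slow step, which is why the text singles it out as the part that should leave the request hook.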

## 2. Live performance — the decisive datum

| Metric (gk2, read-only) | Value |
|---|---|
| mitmproxy | 11.0.2 / Py 3.11 / **single process, single asyncio loop** (no multi-core) |
| Request volume | **~3.6 req/s** sustained (mostly internet scanner probing) |
| WAF CPU | **~17–53% of ONE core** (clean Δ ≈ 17%); ~5050 CPU-s over 12 d, niced |
| Board load avg | ~3.5 on 4 cores — board near-saturated overall, WAF a minority |
| Inspected vhosts | 187 of 255 routes, **one `mitmproxy_inspector` backend** |
| Hardening band-aids | `MemoryMax=512M`, `RuntimeMaxSec=21600` (6 h forced restart), `http2=false`, loop-guard, `Connection: close` (FD-leak fix), nft pre-rate-limit, watchdog (lxc-restart on 3 probe fails) |

**Conclusion:** at today's load a rewrite is **not justified by throughput** — the
WAF isn't pegging its core. The real motivations are: (1) the **single-threaded
ceiling under attack/burst** (saturates ~7–10 req/s on the inspected path; a scan
flood serializes through one loop), (2) the **single-backend SPOF** — with
`waf_enabled`, *all* vhosts + the default route funnel through one inspector, so its
death = board-wide 503 (the watchdog only turns a multi-hour outage into a ~3-min
one), (3) the **resource pathologies** (FD/conn-pool leak, HTTP/2 memory drift)
papered over by restarts. The project's own `.claude/PHASE-7-WAF-ROADMAP.md` already
says it: *"mitmproxy is NOT a WAF tool… ModSec ~5× throughput of Python mitm."*

## 3. Why the #662 playbook only half-applies

| #662 (R3 anti-track) | WAF |
|---|---|
| Forward/transparent MITM, forges certs, CA trust, SO_ORIGINAL_DST — **hard** | Reverse proxy, **HAProxy already terminates TLS**, cleartext in — **easy** |
| Decisions can be **async** (poison cookies fire-and-forget) | Decisions are **synchronous** (block before forward) — can't sidecar the verdict |
| Feature-set was **bespoke** → hand-port justified | Detection is **generic WAF rules** → a vetted CRS exists → **adopt, don't port** |
| Bug = degraded browsing (annoying) | Bug = **outage of all vhosts OR a security bypass** — far higher bar |
| Clear measured perf ceiling drove it | **Not throughput-bound today** — weaker perf case |

So: transport is easier, but the part #662 deliberately kept in Python (the "risky
brain") **is** the WAF's core and is on the synchronous critical path. The lesson is
inverted: for R3 we built; for the WAF we should **adopt the engine** and only write
thin glue.

## 4. Options (build-vs-adopt)

**Option A — HAProxy + `coraza-spoa` + CRS v4 (RECOMMENDED, if addons relocatable).**
Keep HAProxy as-is; attach OWASP **Coraza** (CRS v4) as a **SPOA/SPOE agent**.
HAProxy sends each request to the agent, **blocks for the verdict**, applies
`http-request deny deny_status 403 if { var(txn.coraza.action) -m str deny }`. Pure-Go, clean
arm64 (`CGO_ENABLED=0`). **Retires the in-path mitmproxy → eliminates the SPOF**
(traffic no longer flows *through* the inspector; the agent is out-of-band, in-line
only for the verdict). Adopts a community-vetted ruleset instead of our bespoke
regex. *Gaps:* SPOA returns a **verdict only — no traffic transformation**, so
banner-injection / media-cache / cookie-audit must move elsewhere or be dropped.
*Risks:* `coraza-spoa` is **0.x (v0.7.2, 2026-05)**, no named prod adopters → pin +
benchmark on arm64; **HAProxy 3.1+ requires `mode spop`** for the SPOA backend →
check the board's HAProxy version before wiring.

**Option B — Go reverse-proxy embedding Coraza (`coraza/v3` `http.WrapHandler`).**
A single Go binary replaces mitmproxy *in-path* (`net/http/httputil.ReverseProxy` +
Coraza). Keeps the in-path model → can still do banner/cache/transformation, and
gets multi-core + bounded memory + no FD leak. Still **adopts** the engine + CRS;
only the proxy glue is bespoke. *Cost:* ReverseProxy footguns (bounded body
buffering, Content-Length resync, error/upgrade handling) need a real PoC test
suite; still an in-path component (SPOF remains, but a robust Go one).

**Option C — CrowdSec AppSec component (Coraza inline).** CrowdSec's AppSec
component *is* Coraza inline; since we already integrate CrowdSec (LAPI bridge), this
could deliver the inline WAF as a CrowdSec component and unify the stack. Worth
scoping against A.

**Option D — REJECT: hand-roll a Go WAF engine / port the bespoke regex rules.** The
"don't roll your own crypto" rule applies to WAF rulesets. Bespoke signatures miss
generic/0-day-class detection that CRS anomaly-scoring is built for, and carry a
permanent FP-tuning + CVE-tracking burden. Also reject the dead `spoa-modsecurity`
(ModSecurity v2, EOL 2024).
|
||||
|
||||
## 5. CSPN angle
|
||||
|
||||
The project targets ANSSI CSPN. Adopting **OWASP CRS v4** (a flagship, test-suite-
|
||||
covered ruleset) is far more defensible for certification than bespoke regex, and a
|
||||
formal SPOA verdict + an explicit **fail-open vs fail-close** SPOE policy is a clean,
|
||||
auditable security-decision boundary. (Current bespoke WAF = warn-pages + 3-strike
|
||||
CrowdSec ban; CRS gives graduated anomaly scoring with documented paranoia levels.)
|
||||
|
||||
## 6. Recommendation + gated next steps (NOT started)
|
||||
|
||||
**Recommendation:** ADOPT Coraza + CRS v4. Prefer **Option A (SPOA, retire mitmproxy,
|
||||
kill the SPOF)** if banner/cache/cookie-audit can be relocated; fall back to
|
||||
**Option B (in-path Go + embedded Coraza)** if traffic transformation must stay
|
||||
in-path. Do **not** hand-roll the engine or port the regex rules.
|
||||
|
||||
Proposed gated plan, more conservative than #662 (security-critical + SPOF):
|
||||
1. **Decide the addon fate** (banner / media-cache / cookie-audit): relocate, drop,
|
||||
or keep a thin in-path component → this picks A vs B.
|
||||
2. **Check the board's HAProxy version** (SPOE 2.x vs 3.1 `mode spop`).
|
||||
3. **PoC, detect-only, SHADOW:** run coraza-spoa (or the Go+Coraza proxy) in
|
||||
**detection-only** mode against a mirror/copy of real traffic; **compare its
|
||||
verdicts to the current regex WAF** on the same requests (false-pos / false-neg
|
||||
delta). Serve no clients.
|
||||
4. **arm64 benchmark** (latency added per request, body-size cost, burst behaviour).
|
||||
5. **CRS tuning pass** on real traffic in detect-only (FP elimination, paranoia
|
||||
level) before any blocking.
|
||||
6. **Canary ONE low-risk vhost** through the new path with the old WAF as instant
|
||||
fallback; watch; widen; then retire the mitmproxy inspector.
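
The verdict comparison in step 3 could be tallied along these lines. This is a sketch under assumptions: the `Verdict` and `Delta` types and the idea of joining both engines' logs on a request ID are hypothetical, and labelling a disagreement as a false positive or false negative still needs a human-reviewed ground truth.

```go
package main

import "fmt"

// Verdict is one engine's decision for one mirrored request.
type Verdict struct {
	RequestID string
	Deny      bool
}

// Delta counts how the candidate engine differs from the current one.
type Delta struct {
	Agree   int // both engines reached the same decision
	OnlyOld int // current WAF denied, candidate allowed
	OnlyNew int // candidate denied, current WAF allowed
	Missing int // candidate produced no verdict for the request
}

// compare joins the two verdict sets on RequestID and tallies the differences.
func compare(old, cand []Verdict) Delta {
	candByID := make(map[string]bool, len(cand))
	for _, v := range cand {
		candByID[v.RequestID] = v.Deny
	}
	var d Delta
	for _, o := range old {
		n, ok := candByID[o.RequestID]
		switch {
		case !ok:
			d.Missing++
		case o.Deny == n:
			d.Agree++
		case o.Deny:
			d.OnlyOld++
		default:
			d.OnlyNew++
		}
	}
	return d
}

func main() {
	old := []Verdict{{"a", true}, {"b", false}, {"c", true}, {"d", false}, {"e", false}}
	cand := []Verdict{{"a", true}, {"b", false}, {"c", false}, {"d", true}}
	fmt.Printf("%+v\n", compare(old, cand)) // {Agree:2 OnlyOld:1 OnlyNew:1 Missing:1}
}
```

The `OnlyOld` and `OnlyNew` buckets are the requests worth reviewing by hand during the tuning pass in step 5.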
|
||||
|
||||
**Honest framing for the go/no-go:** if the goal is "the WAF is slow," the data says
|
||||
it isn't (yet) — don't take the risk. If the goal is **resilience (kill the SPOF,
|
||||
end the FD-leak/memory restarts, multi-core burst headroom) + better/auditable
|
||||
detection coverage (CRS) for CSPN**, then Coraza+CRS via SPOA is a strong, mostly-
|
||||
*adopt* move with a contained bespoke surface — a very different risk profile from
|
||||
the #662 hand-roll.
|
||||
|
||||
## Sources
|
||||
Repo: `packages/secubox-mitmproxy/addons/secubox_waf.py`, `data/waf-rules.json`,
|
||||
`packages/secubox-haproxy/sbin/haproxyctl`,
`packages/secubox-waf/systemd/mitmproxy.service`, `.claude/PHASE-7-WAF-ROADMAP.md`. Live: gk2 read-only
|
||||
(mitmproxy 11.0.2, 3.6 req/s, ~17–53% one core, 255 routes/187 inspected, HAProxy
|
||||
TLS-term → cleartext). External (2025-26): OWASP Coraza v3.7 / coraza-spoa v0.7.2 /
|
||||
coraza-coreruleset (CRS v4.25 LTS), HAProxy SPOE + 3.1 `mode spop`, CrowdSec AppSec
|
||||
in-band/out-of-band, ngrok in-process Coraza.
|
||||
12
packages/secubox-toolbox-ng/.gitignore
vendored
Normal file
|
|
@ -0,0 +1,12 @@
|
|||
/sbxmitm
|
||||
*.test
|
||||
cmd/sbxmitm/sbxmitm
|
||||
# Debian build artifacts (rules builds the binary + go caches in-tree)
|
||||
/_gocache/
|
||||
/_gopath/
|
||||
/debian/.debhelper/
|
||||
/debian/files
|
||||
/debian/*.substvars
|
||||
/debian/secubox-toolbox-ng/
|
||||
/debian/debhelper-build-stamp
|
||||
/debian/*.debhelper.log
|
||||
60
packages/secubox-toolbox-ng/README.md
Normal file
|
|
@ -0,0 +1,60 @@
|
|||
# secubox-toolbox-ng — Go MITM engine (migration spike, #662 Phase 1)
|
||||
|
||||
De-risking PoC for migrating the R3 toolbox MITM engine off Python **mitmproxy**
|
||||
(GIL-bound, ~1 core) onto a multi-core **Go** core, **without losing the
|
||||
18-addon feature set**. See:
|
||||
- Analysis: `docs/superpowers/specs/2026-06-18-mitm-engine-migration-analysis.md`
|
||||
- Phased plan: `docs/superpowers/plans/2026-06-18-mitm-engine-migration.md`
|
||||
|
||||
> **Status: Phase 1 — PoC only. NOT wired into the live R3 path.** The live
|
||||
> tunnel still runs on the Python mitmproxy workers (8081-8084). This binary is
|
||||
> a standalone CONNECT-proxy spike that proves the risky capabilities.
|
||||
|
||||
## What the PoC proves (the discriminating risks from the analysis)
|
||||
- **CA-compat forging** — loads the *existing* `ca-wg/{ca.pem,key.pem}` and forges
|
||||
per-host leaf certs the R3 clients already trust (no re-enroll). Cached per host.
|
||||
- **request 204** — short-circuit block (ad_ghost / privacy_guard).
|
||||
- **response body inject** — marker before `</head>`/`</body>` (banner / ad-CSS).
|
||||
- **SNI splice** — raw passthrough, no MITM, by SNI suffix (tls_splice).
|
||||
- **JA4 material capture** — `crypto/tls` `GetCertificate` receives the
|
||||
`ClientHelloInfo` (SNI, cipher suites, ALPN, TLS versions) → proves the `ja4`
|
||||
addon's handshake fingerprint is reachable in Go (full JA4 extension-hash needs
|
||||
a raw-ClientHello peek — Phase 4).
|
||||
|
||||
All stdlib (no external modules → builds offline). Tests are network-free
|
||||
(localhost handshake + temp self-signed CA).
|
||||
|
||||
## Build & test
|
||||
```sh
|
||||
cd packages/secubox-toolbox-ng
|
||||
go test ./... # network-free PoC tests
|
||||
GOOS=linux GOARCH=arm64 go build -o sbxmitm ./cmd/sbxmitm # appliance target
|
||||
```
|
||||
|
||||
## Try it (CONNECT proxy, against the board CA)
|
||||
```sh
|
||||
./sbxmitm --ca-cert /etc/secubox/toolbox/ca-wg/ca.pem \
|
||||
--ca-key /etc/secubox/toolbox/ca-wg/key.pem --listen :8090
|
||||
curl -x localhost:8090 --cacert /etc/secubox/toolbox/ca-wg/ca.pem https://doubleclick.net/ # → 204
|
||||
curl -x localhost:8090 --cacert /etc/secubox/toolbox/ca-wg/ca.pem https://example.com/ # → body has the sbx-ng marker
|
||||
# logs print `ja4 t0304_cNN_a... sni=...` per handshake
|
||||
```
|
||||
|
||||
## Capability → engine map (recap)
|
||||
Go covers request-204 / body-rewrite / header-cookie-mod / splice / async-sidecars
|
||||
cleanly; JA4 needs the ClientHello shim (proven here); streaming inject + the
|
||||
anti-track HMAC-jar/poison port land in Phase 3/4. Heavy analysis (social-graph,
|
||||
classify, DB/report writers) stays in the existing Python sidecars, fed
|
||||
fire-and-forget over unix sockets.
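
One way to keep the sidecar feed fire-and-forget is a bounded queue with a non-blocking send, so a slow or absent sidecar can never stall the proxy hot path. The sketch below is illustrative only: the `sidecar` type, the queue depth, the 200 ms deadlines and the newline-delimited framing are all assumptions, not the project's wire format.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// sidecar buffers events for an out-of-process analysis sidecar.
type sidecar struct {
	q chan []byte
}

func newSidecar(depth int) *sidecar {
	return &sidecar{q: make(chan []byte, depth)}
}

// emit never blocks: if the queue is full the event is dropped and emit
// returns false, trading completeness for hot-path latency.
func (s *sidecar) emit(ev []byte) bool {
	select {
	case s.q <- ev:
		return true
	default:
		return false
	}
}

// run drains the queue to the unix socket at path. Dial or write failures
// drop the event silently; the sidecar being down must not affect traffic.
func (s *sidecar) run(path string) {
	for ev := range s.q {
		c, err := net.DialTimeout("unix", path, 200*time.Millisecond)
		if err != nil {
			continue
		}
		_ = c.SetWriteDeadline(time.Now().Add(200 * time.Millisecond))
		_, _ = c.Write(append(ev, '\n'))
		_ = c.Close()
	}
}

func main() {
	s := newSidecar(2) // no consumer started, so the third emit is dropped
	fmt.Println(s.emit([]byte("a")), s.emit([]byte("b")), s.emit([]byte("c"))) // true true false
}
```

In use, `go s.run(path)` would be started once at boot with the sidecar's socket path, and request handlers would call only `emit`.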
|
||||
|
||||
## Roadmap (do NOT cut over without the gates)
|
||||
1. ✅ PoC (this) — forge + 204 + inject + splice + ClientHello capture, compiled + tested.
|
||||
2. arm64 packaging + board bench on :8090 (no DNAT) — forge/throughput vs mitmproxy.
|
||||
3. Hot-path feature parity (block lists + allowlist + own-infra guard, header/cookie strip, banner, splice) — parity harness vs the Python addons.
|
||||
4. Analysis sidecars (unix-socket fire-and-forget) + anti-track HMAC-jar/forge port (exhaustively tested vs `privacy.py`).
|
||||
5. **Shadow run** — mirror a fraction of R3, compare outputs. No client served yet.
|
||||
6. **Cutover** — flip nft `numgen` fanout 8081-8084 → 8090-8093; mitmproxy stays up for instant rollback.
|
||||
7. Decommission mitmproxy after a stable soak.
|
||||
|
||||
Rollback is an nft DNAT-map edit at every step; the Python engine is the live
|
||||
path until Phase 6.
|
||||
167
packages/secubox-toolbox-ng/cmd/sbxmitm/banner.go
Normal file
|
|
@ -0,0 +1,167 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// SecuBox-Deb :: toolbox-ng :: transparency-banner loader inject (#662)
|
||||
//
|
||||
// Ports the LIVE transparency-banner injection from the authoritative Python
|
||||
// addon (../secubox-toolbox/mitmproxy_addons/inject_banner.py) into the Go
|
||||
// engine. With stream_inject ON the Python addon injects a tiny LOADER
|
||||
// <script src="/__toolbox/loader.js" data-mh=.. data-wg=.. async></script> and
|
||||
// SERVES /__toolbox/loader.js + /__toolbox/bundle itself for ANY origin (the
|
||||
// injected same-origin URL resolves to whatever MITM'd host the client is on).
|
||||
//
|
||||
// To avoid re-porting the bundle/level business logic to Go, this engine
|
||||
// REVERSE-PROXIES /__toolbox/* to the portal (default http://127.0.0.1:8088),
|
||||
// which already serves both endpoints. The injection (injectLoader) mirrors the
|
||||
// Python _loader_script + _LoaderInjector byte-for-byte on the tag shape and
|
||||
// placement; the guard makes it idempotent (matches Python _GUARD).
|
||||
//
|
||||
// Pure standard library — no external modules.
|
||||
package main
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"io"
|
||||
"log"
|
||||
"net/http"
|
||||
"strings"
|
||||
"time"
|
||||
)
|
||||
|
||||
// bannerGuard matches the Python _GUARD ("__GONDWANA_MITM_BANNER__"): an HTML
|
||||
// comment marker that makes injection idempotent across stream chunks / repeat
|
||||
// passes. If the body already contains it, we never inject again.
|
||||
const bannerGuard = "__GONDWANA_MITM_BANNER__"
|
||||
|
||||
// asciiOnly drops every non-ASCII byte from s, mirroring the Python
|
||||
// `s.encode("ascii", "ignore")` used on the client hash before it lands in the
|
||||
// data-mh attribute. The clientHash is normally a hex mac_hash (already ASCII),
|
||||
// but a non-WG fallback could carry odd bytes — strip defensively.
|
||||
func asciiOnly(s string) string {
|
||||
var b strings.Builder
|
||||
b.Grow(len(s))
|
||||
for i := 0; i < len(s); i++ {
|
||||
if s[i] < 0x80 {
|
||||
b.WriteByte(s[i])
|
||||
}
|
||||
}
|
||||
return b.String()
|
||||
}
|
||||
|
||||
// loaderScript builds the loader <script> tag EXACTLY like the Python
|
||||
// _loader_script: a guard comment followed by the same-origin loader.js tag
|
||||
// carrying the client identity (data-mh) + WG flag (data-wg). wg → "1" else "0";
|
||||
// clientHash is ascii-sanitised. The src is same-origin so it resolves to the
|
||||
// MITM'd host and is intercepted by the /__toolbox/* short-circuit.
|
||||
func loaderScript(clientHash string, wg bool) []byte {
|
||||
wgVal := "0"
|
||||
if wg {
|
||||
wgVal = "1"
|
||||
}
|
||||
mh := asciiOnly(clientHash)
|
||||
tag := `<script src="/__toolbox/loader.js" data-mh="` + mh +
|
||||
`" data-wg="` + wgVal + `" async></script>`
|
||||
return []byte("<!-- " + bannerGuard + " -->" + tag)
|
||||
}
|
||||
|
||||
// injectLoader inserts the loader <script> into an HTML body once. Placement
|
||||
// mirrors the Python _LoaderInjector.__call__:
|
||||
// - guard idempotency: if the body already contains bannerGuard → unchanged.
|
||||
// - find the first (case-insensitive) "<head"; if present, find the next ">"
|
||||
// after it and insert the tag right after that ">".
|
||||
// - else find the first "<body" and insert the tag right BEFORE it.
|
||||
// - if neither is present → return the body unchanged (no inject).
|
||||
func injectLoader(body []byte, clientHash string, wg bool) []byte {
|
||||
if bytes.Contains(body, []byte(bannerGuard)) {
|
||||
return body
|
||||
}
|
||||
script := loaderScript(clientHash, wg)
|
||||
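	// ASCII-only lowercase copy. bytes.ToLower can change the byte length on
	// non-ASCII or invalid UTF-8 input, which would make indices found in low
	// invalid offsets into body; this loop keeps len(low) == len(body).
	low := make([]byte, len(body))
	for i, c := range body {
		if 'A' <= c && c <= 'Z' {
			c += 'a' - 'A'
		}
		low[i] = c
	}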
|
||||
|
||||
if h := bytes.Index(low, []byte("<head")); h >= 0 {
|
||||
if j := bytes.IndexByte(body[h:], '>'); j >= 0 {
|
||||
at := h + j + 1
|
||||
out := make([]byte, 0, len(body)+len(script))
|
||||
out = append(out, body[:at]...)
|
||||
out = append(out, script...)
|
||||
out = append(out, body[at:]...)
|
||||
return out
|
||||
}
|
||||
}
|
||||
if b := bytes.Index(low, []byte("<body")); b >= 0 {
|
||||
out := make([]byte, 0, len(body)+len(script))
|
||||
out = append(out, body[:b]...)
|
||||
out = append(out, script...)
|
||||
out = append(out, body[b:]...)
|
||||
return out
|
||||
}
|
||||
return body
|
||||
}
|
||||
|
||||
// ── /__toolbox/* reverse-proxy to the portal ─────────────────────────────────
|
||||
|
||||
// isToolboxAssetPath reports whether a request path is one of the banner assets
|
||||
// the engine must serve itself (by reverse-proxying to the portal) for ANY
|
||||
// origin. STARTSWITH (not exact) is REQUIRED: the path includes the query
|
||||
// string and the bundle is fetched as /__toolbox/bundle?mh=..&wg=.. — an exact
|
||||
// match would never fire. Mirrors the Python request() p.startswith(...) checks.
|
||||
func isToolboxAssetPath(path string) bool {
|
||||
return strings.HasPrefix(path, "/__toolbox/loader.js") ||
|
||||
strings.HasPrefix(path, "/__toolbox/bundle")
|
||||
}
|
||||
|
||||
// portalTargetURL builds the absolute portal URL for an intercepted asset
|
||||
// request: <portal-base> + the original request path (which already includes
|
||||
// the query string). The portal base's trailing slash is trimmed so the result
|
||||
// never doubles the leading "/" of the path.
|
||||
func portalTargetURL(portal, pathWithQuery string) string {
|
||||
return strings.TrimRight(portal, "/") + pathWithQuery
|
||||
}
|
||||
|
||||
// portalClient is the short-timeout HTTP client used to fetch banner assets from
|
||||
// the portal. Shared (stdlib http.Client is goroutine-safe) so we don't churn
|
||||
// connections per request.
|
||||
var portalClient = &http.Client{
|
||||
Timeout: 5 * time.Second,
|
||||
// Never follow redirects: the portal is a fixed loopback base, so not
|
||||
// following 3xx means a misbehaving/compromised portal can't steer the
|
||||
// worker into fetching an arbitrary outbound host (SSRF hygiene). The 3xx
|
||||
// is relayed to the client as-is.
|
||||
CheckRedirect: func(*http.Request, []*http.Request) error { return http.ErrUseLastResponse },
|
||||
}
|
||||
|
||||
// servePortalAsset reverse-proxies a /__toolbox/* request to the portal and
|
||||
// writes the portal's response (status + Content-Type + Cache-Control + body)
|
||||
// back to the client over the already-established (TLS) conn. It returns true
|
||||
// once it has written a response — the caller MUST NOT then forward upstream.
|
||||
//
|
||||
// Fail-open: if the portal request errors (portal down, timeout, non-2xx read
|
||||
// failure) we serve a minimal 204 No Content so the navigation is never broken,
|
||||
// and log at most a warning. We never 502 the whole page over a banner asset.
|
||||
func servePortalAsset(w io.Writer, portal, pathWithQuery string) bool {
|
||||
target := portalTargetURL(portal, pathWithQuery)
|
||||
resp, err := portalClient.Get(target)
|
||||
if err != nil {
|
||||
log.Printf("portal asset fetch failed for %s: %v", target, err)
|
||||
writeRaw(w, 204, "No Content", nil, nil)
|
||||
return true
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
body, rerr := io.ReadAll(io.LimitReader(resp.Body, 8<<20))
|
||||
if rerr != nil {
|
||||
log.Printf("portal asset read failed for %s: %v", target, rerr)
|
||||
writeRaw(w, 204, "No Content", nil, nil)
|
||||
return true
|
||||
}
|
||||
headers := map[string]string{}
|
||||
if ct := resp.Header.Get("Content-Type"); ct != "" {
|
||||
headers["Content-Type"] = ct
|
||||
}
|
||||
if cc := resp.Header.Get("Cache-Control"); cc != "" {
|
||||
headers["Cache-Control"] = cc
|
||||
}
|
||||
// writeRaw formats "HTTP/1.1 <code> <status>"; pass only the reason phrase
|
||||
// (not resp.Status, which already embeds the code → would double it).
|
||||
writeRaw(w, resp.StatusCode, http.StatusText(resp.StatusCode), headers, body)
|
||||
return true
|
||||
}
|
||||
132
packages/secubox-toolbox-ng/cmd/sbxmitm/banner_test.go
Normal file
|
|
@ -0,0 +1,132 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// SecuBox-Deb :: toolbox-ng :: transparency-banner loader inject tests (#662)
|
||||
//
|
||||
// Mirrors the authoritative Python tests of inject_banner._loader_script /
|
||||
// _LoaderInjector / the /__toolbox/* request() short-circuit. The portal
|
||||
// reverse-proxy integration (a live portal) is validated on-board, NOT here;
|
||||
// these unit tests cover the pure injection logic + the path/url helpers.
|
||||
package main
|
||||
|
||||
import (
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestInjectLoaderGuardIdempotent(t *testing.T) {
|
||||
// Body already carrying the guard → returned byte-for-byte unchanged.
|
||||
body := []byte("<html><head><!-- " + bannerGuard + " --><script></script></head><body>hi</body></html>")
|
||||
out := injectLoader(body, "abc123", false)
|
||||
if string(out) != string(body) {
|
||||
t.Fatalf("guarded body must be unchanged.\n got: %s", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectLoaderHeadInsertion(t *testing.T) {
|
||||
body := []byte(`<html><head lang="en"><title>x</title></head><body>hi</body></html>`)
|
||||
out := string(injectLoader(body, "deadbeef", true))
|
||||
// The tag must land right AFTER the first <head ...>'s closing '>'.
|
||||
headOpen := `<head lang="en">`
|
||||
idx := strings.Index(out, headOpen)
|
||||
if idx < 0 {
|
||||
t.Fatalf("head open lost: %s", out)
|
||||
}
|
||||
after := out[idx+len(headOpen):]
|
||||
wantTag := `<!-- ` + bannerGuard + ` --><script src="/__toolbox/loader.js" data-mh="deadbeef" data-wg="1" async></script>`
|
||||
if !strings.HasPrefix(after, wantTag) {
|
||||
t.Fatalf("tag not inserted right after <head>'s '>'.\n got: %s", after)
|
||||
}
|
||||
// <title> must still follow the injected tag (we inserted, not replaced).
|
||||
if !strings.Contains(out, wantTag+`<title>x</title>`) {
|
||||
t.Fatalf("original head content displaced: %s", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectLoaderBodyFallback(t *testing.T) {
|
||||
// No <head> → insert right BEFORE the first <body>.
|
||||
body := []byte(`<html><body class="x">hi</body></html>`)
|
||||
out := string(injectLoader(body, "cafe", false))
|
||||
wantTag := `<!-- ` + bannerGuard + ` --><script src="/__toolbox/loader.js" data-mh="cafe" data-wg="0" async></script>`
|
||||
if !strings.Contains(out, wantTag+`<body class="x">`) {
|
||||
t.Fatalf("tag not inserted right before <body>.\n got: %s", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectLoaderNeitherHeadNorBody(t *testing.T) {
|
||||
body := []byte(`<p>just a fragment</p>`)
|
||||
out := injectLoader(body, "x", true)
|
||||
if string(out) != string(body) {
|
||||
t.Fatalf("no head/body → must be unchanged.\n got: %s", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectLoaderWGAttr(t *testing.T) {
|
||||
cases := []struct {
|
||||
wg bool
|
||||
want string
|
||||
}{
|
||||
{true, `data-wg="1"`},
|
||||
{false, `data-wg="0"`},
|
||||
}
|
||||
for _, c := range cases {
|
||||
out := string(injectLoader([]byte(`<head></head>`), "mh1", c.wg))
|
||||
if !strings.Contains(out, c.want) {
|
||||
t.Fatalf("wg=%v: want %q in %s", c.wg, c.want, out)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectLoaderNonASCIIHashStripped(t *testing.T) {
|
||||
// Non-ascii bytes in the client hash are dropped (Python .encode("ascii","ignore")).
|
||||
out := string(injectLoader([]byte(`<head></head>`), "abécÿ12", false))
|
||||
if !strings.Contains(out, `data-mh="abc12"`) {
|
||||
t.Fatalf("non-ascii bytes not stripped: %s", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectLoaderHeadCaseInsensitive(t *testing.T) {
|
||||
body := []byte(`<HTML><HEAD></HEAD><BODY>hi</BODY></HTML>`)
|
||||
out := string(injectLoader(body, "z", false))
|
||||
if !strings.Contains(out, `<HEAD><!-- `+bannerGuard) {
|
||||
t.Fatalf("case-insensitive <HEAD> match failed: %s", out)
|
||||
}
|
||||
}
|
||||
|
||||
func TestIsToolboxAssetPath(t *testing.T) {
|
||||
cases := []struct {
|
||||
path string
|
||||
want bool
|
||||
}{
|
||||
{"/__toolbox/loader.js", true},
|
||||
{"/__toolbox/loader.js?v=2", true},
|
||||
{"/__toolbox/bundle", true},
|
||||
{"/__toolbox/bundle?mh=abc&wg=1", true},
|
||||
{"/__toolbox/other", false},
|
||||
{"/index.html", false},
|
||||
{"/", false},
|
||||
{"", false},
|
||||
{"/__toolboxbundle", false},
|
||||
}
|
||||
for _, c := range cases {
|
||||
if got := isToolboxAssetPath(c.path); got != c.want {
|
||||
t.Errorf("isToolboxAssetPath(%q) = %v, want %v", c.path, got, c.want)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestPortalTargetURL(t *testing.T) {
|
||||
cases := []struct {
|
||||
portal, path, want string
|
||||
}{
|
||||
{"http://127.0.0.1:8088", "/__toolbox/loader.js", "http://127.0.0.1:8088/__toolbox/loader.js"},
|
||||
{"http://127.0.0.1:8088", "/__toolbox/bundle?mh=abc&wg=1", "http://127.0.0.1:8088/__toolbox/bundle?mh=abc&wg=1"},
|
||||
// Trailing slash on the portal base must not double up.
|
||||
{"http://127.0.0.1:8088/", "/__toolbox/loader.js", "http://127.0.0.1:8088/__toolbox/loader.js"},
|
||||
}
|
||||
for _, c := range cases {
|
||||
if got := portalTargetURL(c.portal, c.path); got != c.want {
|
||||
t.Errorf("portalTargetURL(%q,%q) = %q, want %q", c.portal, c.path, got, c.want)
|
||||
}
|
||||
}
|
||||
}
|
||||
95
packages/secubox-toolbox-ng/cmd/sbxmitm/bench_test.go
Normal file
|
|
@ -0,0 +1,95 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// #662 Phase 2b — controlled multi-core throughput bench. Drives full client↔
|
||||
// proxy TLS handshakes (forge + ClientHello capture) in parallel. Run with
|
||||
// `-cpu=1,2,4,8` to SHOW the scaling Python mitmproxy's GIL cannot do:
|
||||
// go test -run '^$' -bench BenchmarkHandshake -benchmem -cpu=1,2,4,8 ./cmd/sbxmitm
|
||||
package main
|
||||
|
||||
import (
|
||||
"crypto/ecdsa"
|
||||
"crypto/elliptic"
|
||||
"crypto/rand"
|
||||
"crypto/tls"
|
||||
"crypto/x509"
|
||||
"crypto/x509/pkix"
|
||||
"encoding/pem"
|
||||
"math/big"
|
||||
"net"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
func benchCA(b *testing.B) (string, string) {
|
||||
b.Helper()
|
||||
dir := b.TempDir()
|
||||
key, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
|
||||
tmpl := &x509.Certificate{
|
||||
SerialNumber: big.NewInt(1), Subject: pkix.Name{CommonName: "Bench CA"},
|
||||
NotBefore: time.Now().Add(-time.Hour), NotAfter: time.Now().Add(24 * time.Hour),
|
||||
IsCA: true, KeyUsage: x509.KeyUsageCertSign, BasicConstraintsValid: true,
|
||||
}
|
||||
der, _ := x509.CreateCertificate(rand.Reader, tmpl, tmpl, key.Public(), key)
|
||||
cp := filepath.Join(dir, "ca.pem")
|
||||
kp := filepath.Join(dir, "key.pem")
|
||||
cf, _ := os.Create(cp)
|
||||
pem.Encode(cf, &pem.Block{Type: "CERTIFICATE", Bytes: der})
|
||||
cf.Close()
|
||||
kder, _ := x509.MarshalPKCS8PrivateKey(key)
|
||||
kf, _ := os.Create(kp)
|
||||
pem.Encode(kf, &pem.Block{Type: "PRIVATE KEY", Bytes: kder})
|
||||
kf.Close()
|
||||
return cp, kp
|
||||
}
|
||||
|
||||
// BenchmarkHandshake: steady-state forged-cert TLS handshakes/sec under parallel
|
||||
// load (warm forge cache). req/s should rise ~linearly with -cpu (no GIL).
|
||||
func BenchmarkHandshake(b *testing.B) {
|
||||
cp, kp := benchCA(b)
|
||||
ca, err := loadCA(cp, kp)
|
||||
if err != nil {
|
||||
b.Fatal(err)
|
||||
}
|
||||
px := &Proxy{ca: ca}
|
||||
if _, err := ca.forge("example.com"); err != nil { // warm cache
|
||||
b.Fatal(err)
|
||||
}
|
||||
ln, err := net.Listen("tcp", "127.0.0.1:0")
|
||||
if err != nil {
|
||||
b.Fatal(err)
|
||||
}
|
||||
defer ln.Close()
|
||||
cfg := px.serverTLSConfig()
|
||||
go func() {
|
||||
for {
|
||||
c, err := ln.Accept()
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
go func() {
|
||||
s := tls.Server(c, cfg)
|
||||
s.Handshake()
|
||||
s.Close()
|
||||
}()
|
||||
}
|
||||
}()
|
||||
pool := x509.NewCertPool()
|
||||
pool.AddCert(ca.cert)
|
||||
addr := ln.Addr().String()
|
||||
ccfg := &tls.Config{ServerName: "example.com", RootCAs: pool, MinVersion: tls.VersionTLS12}
|
||||
|
||||
b.ResetTimer()
|
||||
b.RunParallel(func(pb *testing.PB) {
|
||||
for pb.Next() {
|
||||
conn, err := tls.Dial("tcp", addr, ccfg)
|
||||
if err != nil {
|
||||
b.Error(err)
|
||||
return
|
||||
}
|
||||
conn.Close()
|
||||
}
|
||||
})
|
||||
}
|
||||
109
packages/secubox-toolbox-ng/cmd/sbxmitm/gzip.go
Normal file
|
|
@ -0,0 +1,109 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// SecuBox-Deb :: toolbox-ng :: gzip-aware banner injection (#662)
|
||||
//
|
||||
// The transparency-banner inject (injectLoader) scans the HTML body for
|
||||
// <head>/<body>. Browsers send `Accept-Encoding: gzip, br`, so most upstream
|
||||
// responses come back COMPRESSED — and a compressed body has no plaintext
|
||||
// <head>/<body> for injectLoader to find, so it silently no-ops (the banner
|
||||
// vanished on every gzip page). mitmPipeline now pins the upstream request to
|
||||
// `Accept-Encoding: gzip` (dropping br/zstd/deflate we cannot decode with the
|
||||
// stdlib), so every response is either gzip or identity.
|
||||
//
|
||||
// This file holds the gzip helpers + the single inject-path transform that
|
||||
// decompresses (if gzip) → injectLoader → recompresses, fail-open on any error
|
||||
// so a banner asset never breaks the page.
|
||||
//
|
||||
// Pure standard library — compress/gzip only; no external modules (brotli/zstd
|
||||
// are NOT in the stdlib, which is exactly why we constrain the wire to gzip).
|
||||
package main
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"compress/gzip"
|
||||
"io"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// gunzipCap bounds the decompressed output so a maliciously-crafted gzip body
|
||||
// (a "decompression bomb") cannot blow the worker's memory. The upstream body
|
||||
// itself is already read under an 8MiB LimitReader; 32MiB of inflated HTML is a
|
||||
// generous ceiling for a single page. Exceeding it → treated as an error
|
||||
// (caller fails open and serves the original compressed bytes).
|
||||
const gunzipCap = 32 << 20
|
||||
|
||||
// gunzipBytes inflates a gzip-compressed body. It is defensive on two axes:
|
||||
// - a malformed/non-gzip input returns an error (caller fails open),
|
||||
// - the decompressed output is capped at gunzipCap; if the stream would
|
||||
// exceed it, that is reported as an error too (decompression-bomb guard).
|
||||
func gunzipBytes(in []byte) ([]byte, error) {
|
||||
zr, err := gzip.NewReader(bytes.NewReader(in))
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
defer zr.Close()
|
||||
// Read up to gunzipCap+1 so we can tell "exactly at the cap" (fine) from
|
||||
// "the stream is bigger than the cap" (bomb → error).
|
||||
out, err := io.ReadAll(io.LimitReader(zr, gunzipCap+1))
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if len(out) > gunzipCap {
|
||||
return nil, errGunzipTooLarge
|
||||
}
|
||||
return out, nil
|
||||
}
|
||||
|
||||
// errGunzipTooLarge is returned by gunzipBytes when the decompressed stream
|
||||
// exceeds gunzipCap (decompression-bomb guard).
|
||||
var errGunzipTooLarge = errString("gunzip output exceeds cap")
|
||||
|
||||
// errString is a tiny stdlib-only error type (avoids importing errors/fmt for
|
||||
// one sentinel).
|
||||
type errString string
|
||||
|
||||
func (e errString) Error() string { return string(e) }
|
||||
|
||||
// gzipBytes compresses in with the default gzip level. It never errors: the
|
||||
// gzip.Writer only writes into an in-memory bytes.Buffer, which cannot fail.
|
||||
func gzipBytes(in []byte) []byte {
|
||||
var buf bytes.Buffer
|
||||
zw := gzip.NewWriter(&buf)
|
||||
_, _ = zw.Write(in)
|
||||
_ = zw.Close()
|
||||
return buf.Bytes()
|
||||
}
|
||||
|
||||
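// injectIntoBody runs the transparency-banner injection over a (possibly
// gzip-compressed) HTML body, returning the body bytes to serve and whether
// the inject transform ran. ok=true does not guarantee the loader was added:
// injectLoader leaves the content unchanged when the guard is already present
// or when no <head>/<body> is found.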
|
||||
//
|
||||
// - encoding == "" (identity): injectLoader runs directly on body; the result
|
||||
// is returned (ok=true). The caller MUST update Content-Length to len(out).
|
||||
// - encoding == "gzip" (case-insensitive): the body is gunzipped, injected,
|
||||
// then RE-gzipped so the client transfer stays compressed (the tunnel is
|
||||
// perf-sensitive). The caller keeps Content-Encoding: gzip and sets
|
||||
// Content-Length to len(out).
|
||||
// - any other encoding (br/zstd/deflate — should not occur after the upstream
|
||||
// Accept-Encoding pin, but be safe): pass through untouched, ok=false.
|
||||
//
|
||||
// Fail-open: if gunzip fails (corrupt / not-actually-gzip / bomb), the ORIGINAL
|
||||
// bytes are returned with ok=false so the page is never broken.
|
||||
//
|
||||
// idempotency / placement live entirely inside injectLoader (unchanged).
|
||||
func injectIntoBody(body []byte, encoding, clientHash string, wg bool) (out []byte, ok bool) {
|
||||
switch strings.ToLower(strings.TrimSpace(encoding)) {
|
||||
case "":
|
||||
return injectLoader(body, clientHash, wg), true
|
||||
case "gzip":
|
||||
plain, err := gunzipBytes(body)
|
||||
if err != nil {
|
||||
return body, false // fail open: serve the original compressed bytes
|
||||
}
|
||||
injected := injectLoader(plain, clientHash, wg)
|
||||
return gzipBytes(injected), true
|
||||
default:
|
||||
return body, false // unknown encoding we cannot decode → pass through
|
||||
}
|
||||
}
|
||||
152
packages/secubox-toolbox-ng/cmd/sbxmitm/gzip_test.go
Normal file
|
|
@ -0,0 +1,152 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// SecuBox-Deb :: toolbox-ng :: gzip-aware banner injection tests (#662)
|
||||
//
|
||||
// Covers the LIVE bug: the banner only injected into UNCOMPRESSED HTML, so
|
||||
// gzip pages (the common case — browsers send Accept-Encoding: gzip,br) lost
|
||||
// the banner. These tests pin the decompress→inject→recompress transform and
|
||||
// its fail-open behaviour.
|
||||
package main
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestGzipRoundTrip(t *testing.T) {
|
||||
cases := [][]byte{
|
||||
[]byte(""),
|
||||
[]byte("hello world"),
|
||||
[]byte(`<html><head><title>x</title></head><body>hi</body></html>`),
|
||||
bytes.Repeat([]byte("AB"), 100000), // larger, compressible payload
|
||||
}
|
||||
for _, x := range cases {
|
||||
got, err := gunzipBytes(gzipBytes(x))
|
||||
if err != nil {
|
||||
t.Fatalf("gunzipBytes(gzipBytes(%d bytes)) errored: %v", len(x), err)
|
||||
}
|
||||
if !bytes.Equal(got, x) {
|
||||
t.Fatalf("round-trip mismatch: got %d bytes, want %d bytes", len(got), len(x))
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestGunzipNonGzipFails(t *testing.T) {
|
||||
// Plain bytes that are not a gzip stream → error, no panic.
|
||||
if _, err := gunzipBytes([]byte("this is definitely not gzip")); err == nil {
|
||||
t.Fatal("gunzipBytes on non-gzip input must error")
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectIntoBodyGzip(t *testing.T) {
|
||||
// End-to-end-ish: HTML with <head>, gzipped, run through the exact transform
|
||||
// the inject path uses. Result must gunzip back to an injected, intact doc.
|
||||
html := `<html><head><title>page</title></head><body>content</body></html>`
|
||||
out, ok := injectIntoBody(gzipBytes([]byte(html)), "gzip", "abc123", true)
|
||||
if !ok {
|
||||
t.Fatal("gzip inject must report ok=true")
|
||||
}
|
||||
plain, err := gunzipBytes(out)
|
||||
if err != nil {
|
||||
t.Fatalf("re-gzipped output must gunzip cleanly: %v", err)
|
||||
}
|
||||
s := string(plain)
|
||||
if !strings.Contains(s, bannerGuard) {
|
||||
t.Fatalf("banner guard %q absent after gzip inject:\n%s", bannerGuard, s)
|
||||
}
|
||||
// Document otherwise intact: original head/body content preserved.
|
||||
if !strings.Contains(s, "<title>page</title>") || !strings.Contains(s, "<body>content</body>") {
|
||||
t.Fatalf("original document content displaced:\n%s", s)
|
||||
}
|
||||
// The loader tag landed inside <head>.
|
||||
if !strings.Contains(s, `<head><!-- `+bannerGuard) {
|
||||
t.Fatalf("loader tag not inserted right after <head>:\n%s", s)
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectIntoBodyGzipCaseInsensitiveEncoding(t *testing.T) {
|
||||
html := `<head></head>`
|
||||
out, ok := injectIntoBody(gzipBytes([]byte(html)), "GZIP", "z", false)
|
||||
if !ok {
|
||||
t.Fatal("Content-Encoding GZIP (upper) must be recognised → ok=true")
|
||||
}
|
||||
plain, err := gunzipBytes(out)
|
||||
if err != nil {
|
||||
t.Fatalf("gunzip failed: %v", err)
|
||||
}
|
||||
if !strings.Contains(string(plain), bannerGuard) {
|
||||
t.Fatalf("banner absent for upper-case GZIP encoding: %s", plain)
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectIntoBodyGzipFailOpen(t *testing.T) {
|
||||
// Bytes labelled gzip but NOT gzip → fail open: original bytes, ok=false,
|
||||
// no panic.
|
||||
bad := []byte("not gzip at all <head></head>")
|
||||
out, ok := injectIntoBody(bad, "gzip", "x", false)
|
||||
if ok {
|
||||
t.Fatal("corrupt gzip body must fail open (ok=false)")
|
||||
}
|
||||
if !bytes.Equal(out, bad) {
|
||||
t.Fatalf("fail-open must return the ORIGINAL bytes untouched")
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectIntoBodyIdentity(t *testing.T) {
|
||||
// Identity (empty Content-Encoding): inject directly, grown body returned.
|
||||
html := []byte(`<html><head></head><body>hi</body></html>`)
|
||||
out, ok := injectIntoBody(html, "", "deadbeef", false)
|
||||
if !ok {
|
||||
t.Fatal("identity inject must report ok=true")
|
||||
}
|
||||
if !bytes.Contains(out, []byte(bannerGuard)) {
|
||||
t.Fatalf("banner absent on identity inject: %s", out)
|
||||
}
|
||||
if len(out) <= len(html) {
|
||||
t.Fatalf("identity inject must GROW the body: got %d, was %d", len(out), len(html))
|
||||
}
|
||||
}
|
||||
|
||||
func TestInjectIntoBodyUnknownEncodingPassthrough(t *testing.T) {
|
||||
// br/zstd/deflate (shouldn't occur after the Accept-Encoding pin) → untouched.
|
||||
body := []byte("\x1f\x8b some br-ish bytes")
|
||||
out, ok := injectIntoBody(body, "br", "x", false)
|
||||
if ok {
|
||||
t.Fatal("unknown encoding must pass through (ok=false)")
|
||||
}
|
||||
if !bytes.Equal(out, body) {
|
||||
t.Fatalf("unknown-encoding passthrough must be byte-for-byte")
|
||||
}
|
||||
}
|
||||
|
||||
func TestGunzipBombGuard(t *testing.T) {
|
||||
// A body that inflates beyond gunzipCap must be rejected (not OOM the worker).
|
||||
// gzip of >32MiB of zeros compresses to a small blob but inflates past the
|
||||
// cap → gunzipBytes returns an error → inject path fails open.
|
||||
big := gzipBytes(make([]byte, gunzipCap+1024))
|
||||
if _, err := gunzipBytes(big); err == nil {
|
||||
t.Fatal("gunzipBytes must reject output exceeding gunzipCap")
|
||||
}
|
||||
// And via the inject path: fail open, original bytes preserved.
|
||||
out, ok := injectIntoBody(big, "gzip", "x", false)
|
||||
if ok {
|
||||
t.Fatal("over-cap gzip body must fail open through injectIntoBody")
|
||||
}
|
||||
if !bytes.Equal(out, big) {
|
||||
t.Fatal("over-cap fail-open must return the original compressed bytes")
|
||||
}
|
||||
}
|
||||
|
||||
func TestGunzipExactlyAtCap(t *testing.T) {
|
||||
// A body that inflates to EXACTLY gunzipCap is allowed (boundary).
|
||||
payload := make([]byte, gunzipCap)
|
||||
got, err := gunzipBytes(gzipBytes(payload))
|
||||
if err != nil {
|
||||
t.Fatalf("exactly-at-cap payload must be allowed: %v", err)
|
||||
}
|
||||
if len(got) != gunzipCap {
|
||||
t.Fatalf("at-cap length mismatch: got %d, want %d", len(got), gunzipCap)
|
||||
}
|
||||
}
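The bomb-guard and at-cap tests above pin a boundary: a body that inflates to exactly the cap is accepted, one byte more is rejected. The real `gunzipBytes`, `gzipBytes` and `gunzipCap` are not shown in this part of the diff, so the following is only a minimal sketch of one way to get that behaviour. The names `gunzipCapped`, `gzipAll` and `errOverCap` are hypothetical.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// errOverCap is a hypothetical sentinel; the real helper's error is not shown.
var errOverCap = errors.New("inflated body exceeds cap")

// gzipAll compresses p into a single gzip stream.
func gzipAll(p []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	_, _ = zw.Write(p) // writes into a bytes.Buffer do not fail
	_ = zw.Close()
	return buf.Bytes()
}

// gunzipCapped inflates p but never returns more than limit bytes.
// Reading at most limit+1 bytes is what separates "exactly at cap"
// (allowed) from "over cap" (rejected) without inflating the whole body.
func gunzipCapped(p []byte, limit int) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(p))
	if err != nil {
		return nil, err // input is not a gzip stream
	}
	defer zr.Close()
	out, err := io.ReadAll(io.LimitReader(zr, int64(limit)+1))
	if err != nil {
		return nil, err
	}
	if len(out) > limit {
		return nil, errOverCap
	}
	return out, nil
}

func main() {
	const limit = 1 << 10 // small hypothetical cap, for the demo only
	_, errAt := gunzipCapped(gzipAll(make([]byte, limit)), limit)
	_, errOver := gunzipCapped(gzipAll(make([]byte, limit+1)), limit)
	fmt.Println("at cap ok:", errAt == nil, "over cap rejected:", errOver == errOverCap)
}
```

A caller that fails open would treat any non-nil error from such a helper as "serve the original compressed bytes", which is what the `injectIntoBody` tests assert.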
|
||||
153
packages/secubox-toolbox-ng/cmd/sbxmitm/jar.go
Normal file
|
|
@ -0,0 +1,153 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// SecuBox-Deb :: toolbox-ng :: anti-track fake-identity jar (#662 Phase 4)
|
||||
//
|
||||
// Byte-exact port of the Python anti-track HMAC fake-identity jar
|
||||
// (packages/secubox-toolbox/secubox_toolbox/privacy.py: _jar_key / _shape /
|
||||
// fake_id). Python is the source of truth; this mirrors it exactly, proven by
|
||||
// the cross-engine parity harness (testdata/jar-fixtures.json + jar_test.go ↔
|
||||
// tests/test_jar_parity.py).
|
||||
//
|
||||
// The jar mints a STABLE fabricated cookie value per (client, tracker,
|
||||
// cookie_name): a deterministic HMAC-SHA256 of stable inputs, never derived
|
||||
// from real client data, identical across workers and restarts ('rémanent', i.e. persistent).
|
||||
//
|
||||
// Pure standard library — no external modules, no go.sum.
|
||||
package main
|
||||
|
||||
import (
|
||||
"crypto/hmac"
|
||||
"crypto/sha256"
|
||||
"encoding/binary"
|
||||
"encoding/hex"
|
||||
"fmt"
|
||||
"os"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// _privacyMultiTLD mirrors privacy._MULTI_TLD EXACTLY (NOT ad_ghost._2L — they
|
||||
// differ: privacy has ac.uk/com.cn/com.tr/gov.uk/org.uk, lacks gouv.fr; and
|
||||
// privacy returns IP literals as-is where ad_ghost returns None). The jar MUST
|
||||
// use the privacy-flavored registrable so fakeID is byte-identical to
|
||||
// privacy.fake_id across engines (else the fake persona mismatches at cutover).
|
||||
var _privacyMultiTLD = map[string]bool{
|
||||
"ac.uk": true, "co.jp": true, "co.nz": true, "co.uk": true, "co.za": true,
|
||||
"com.au": true, "com.br": true, "com.cn": true, "com.tr": true,
|
||||
"gov.uk": true, "org.uk": true,
|
||||
}
|
||||
|
||||
// registrableJar mirrors privacy.registrable (NOT policy.go's ad_ghost-flavored
|
||||
// registrable). eTLD+1 with the privacy multi-TLD table; IP literals returned
|
||||
// as-is.
|
||||
func registrableJar(host string) string {
|
||||
host = strings.TrimRight(strings.ToLower(strings.TrimSpace(host)), ".")
|
||||
if host == "" {
|
||||
return host
|
||||
}
|
||||
allDigit := true
|
||||
for _, c := range strings.ReplaceAll(host, ".", "") {
|
||||
if c < '0' || c > '9' {
|
||||
allDigit = false
|
||||
break
|
||||
}
|
||||
}
|
||||
if allDigit {
|
||||
return host // IP literal → as-is (matches privacy.registrable)
|
||||
}
|
||||
parts := strings.Split(host, ".")
|
||||
if len(parts) <= 2 {
|
||||
return host
|
||||
}
|
||||
last2 := strings.Join(parts[len(parts)-2:], ".")
|
||||
if _privacyMultiTLD[last2] {
|
||||
return strings.Join(parts[len(parts)-3:], ".")
|
||||
}
|
||||
return last2
|
||||
}
|
||||
|
||||
// loadJarKey reads the seed key file, trimming surrounding whitespace exactly
|
||||
// like Python's `Path(JAR_KEY_PATH).read_bytes().strip()`.
|
||||
//
|
||||
// Returns nil when the file is missing/unreadable OR strips to empty — both of
|
||||
// which mirror Python's `_jar_key()` returning None (which makes fake_id return
|
||||
// None / fakeID return ("", false)). Note: strings.TrimSpace and Python's
// bytes.strip() both trim the ASCII whitespace bytes (space, \t, \n, \r,
// \v=0x0b, \f=0x0c). TrimSpace additionally trims Unicode space runes such as
// U+0085 and U+00A0, so parity holds only when the key's edges carry neither;
// the test fixture guarantees non-whitespace first/last bytes.
|
||||
func loadJarKey(path string) []byte {
|
||||
raw, err := os.ReadFile(path)
|
||||
if err != nil {
|
||||
return nil
|
||||
}
|
||||
// strings.TrimSpace over the byte string trims the same ASCII whitespace
|
||||
// bytes Python's bytes.strip() does (it also strips Unicode space runes,
|
||||
// but a key file is raw bytes with ASCII-whitespace padding, so the two
|
||||
// agree on the edge bytes the fixture uses).
|
||||
key := []byte(strings.TrimSpace(string(raw)))
|
||||
if len(key) == 0 {
|
||||
return nil
|
||||
}
|
||||
return key
|
||||
}
|
||||
|
||||
// shape renders the HMAC digest into the cookie's observed format so the
|
||||
// target accepts it. Mirrors privacy._shape EXACTLY:
|
||||
//
|
||||
// n = (name or "").lower()
|
||||
// i = int.from_bytes(digest[:8], "big"); j = int.from_bytes(digest[8:16], "big")
|
||||
// if n.startswith("_ga"): return "GA1.2.%d.%d" % (i % 10**10, j % 10**10)
// if n in ("_fbp",): return "fb.1.%d.%d" % (i % 10**13, j % 10**10)
|
||||
// if n in ("uuid","uid","_pk_id") or len(name) >= 32:
|
||||
// h = digest.hex(); return "%s-%s-%s-%s-%s" % (h[:8],h[8:12],h[12:16],h[16:20],h[20:32])
|
||||
// return digest.hex()[:32]
|
||||
//
|
||||
// Note: Python `len(name)` is the RUNE (character) length, not byte length;
|
||||
// we use len([]rune(name)) to match. The GA1/fb int math is on a uint64 read
|
||||
// big-endian from the first/second 8 bytes; every modulus is < 2^64 so the
|
||||
// Go uint64 computation matches Python's non-negative int, and fmt "%d" of a
|
||||
// uint64 matches Python's "%d".
|
||||
func shape(name string, digest []byte) string {
|
||||
n := strings.ToLower(name)
|
||||
i := binary.BigEndian.Uint64(digest[:8])
|
||||
j := binary.BigEndian.Uint64(digest[8:16])
|
||||
switch {
|
||||
case strings.HasPrefix(n, "_ga"):
|
||||
return fmt.Sprintf("GA1.2.%d.%d", i%10_000_000_000, j%10_000_000_000)
|
||||
case n == "_fbp":
|
||||
return fmt.Sprintf("fb.1.%d.%d", i%10_000_000_000_000, j%10_000_000_000)
|
||||
case n == "uuid" || n == "uid" || n == "_pk_id" || len([]rune(name)) >= 32:
|
||||
h := hex.EncodeToString(digest)
|
||||
return fmt.Sprintf("%s-%s-%s-%s-%s", h[:8], h[8:12], h[12:16], h[16:20], h[20:32])
|
||||
default:
|
||||
return hex.EncodeToString(digest)[:32]
|
||||
}
|
||||
}
|
||||
|
||||
// fakeID returns a stable fabricated cookie value for (clientHash, tracker,
|
||||
// cookieName). Mirrors privacy.fake_id EXACTLY:
|
||||
//
|
||||
// if not key or not client_hash or not tracker: return None
|
||||
// msg = ("%s|%s|%s" % (client_hash, registrable(tracker), cookie_name)).encode()
|
||||
// digest = hmac.new(key, msg, sha256).digest()
|
||||
// return _shape(cookie_name, digest)
|
||||
//
|
||||
// Returns ("", false) for every case where Python returns None: empty key,
|
||||
// empty clientHash, or empty tracker.
|
||||
//
|
||||
// IMPORTANT: this uses registrableJar (privacy.registrable flavor), NOT the
|
||||
// ad_ghost-flavored registrable() in policy.go. They DIVERGE (gov.uk vs gouv.fr,
|
||||
// IP literals) — `privacy.fake_id` folds the tracker via privacy.registrable, so
|
||||
// the jar MUST too or the fake persona mismatches across engines at cutover.
|
||||
// Do NOT "consolidate" to policy.registrable; the divergence-guard fixtures
|
||||
// (ad.example.gov.uk, 9.9.9.9) will fail if you do.
|
||||
func fakeID(clientHash, tracker, cookieName string, key []byte) (string, bool) {
|
||||
if len(key) == 0 || clientHash == "" || tracker == "" {
|
||||
return "", false
|
||||
}
|
||||
msg := fmt.Sprintf("%s|%s|%s", clientHash, registrableJar(tracker), cookieName)
|
||||
mac := hmac.New(sha256.New, key)
|
||||
mac.Write([]byte(msg))
|
||||
digest := mac.Sum(nil)
|
||||
return shape(cookieName, digest), true
|
||||
}
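The `shape` comment argues that reading 8 digest bytes as a big-endian `uint64` and reducing it by a modulus below 2^64 gives the same decimal string as Python's arbitrary-precision `int.from_bytes(..., "big") % m`. A small sketch of that check follows, using `math/big` as a stand-in for Python's integers. The helper names `modU64` and `modBig` are hypothetical and not part of the package.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/big"
)

// modU64 mirrors the arithmetic used by shape: fixed-width uint64 modulus.
func modU64(b []byte, m uint64) string {
	return fmt.Sprintf("%d", binary.BigEndian.Uint64(b)%m)
}

// modBig does the same reduction with arbitrary precision, standing in for
// Python's int.from_bytes(b, "big") % m.
func modBig(b []byte, m uint64) string {
	i := new(big.Int).SetBytes(b) // SetBytes reads big-endian, non-negative
	return i.Mod(i, new(big.Int).SetUint64(m)).String()
}

func main() {
	// Worst case for overflow concerns: all bits set (2^64 - 1).
	worst := []byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}
	for _, m := range []uint64{10_000_000_000, 10_000_000_000_000} {
		fmt.Println(m, modU64(worst, m), modBig(worst, m), modU64(worst, m) == modBig(worst, m))
	}
}
```

This only illustrates the integer argument; byte-exact parity with the Python engine is still established by the fixture-driven tests, not by this sketch.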
|
||||
141
packages/secubox-toolbox-ng/cmd/sbxmitm/jar_test.go
Normal file
|
|
@ -0,0 +1,141 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// Cross-engine JAR parity harness — Go side (#662 Phase 4).
|
||||
//
|
||||
// Loads testdata/jar-fixtures.json + the fixed test key (testdata/jar-test.key,
|
||||
// NOT the real /etc key), computes fakeID per fixture, and asserts == the
|
||||
// fixture's expect. The Python side (../secubox-toolbox/tests/test_jar_parity.py)
|
||||
// loads the SAME files and drives privacy.fake_id; both must agree → the HMAC
|
||||
// fake-identity jar is byte-exact across engines. Python is the source of truth.
|
||||
package main
|
||||
|
||||
import (
|
||||
"encoding/hex"
|
||||
"encoding/json"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"testing"
|
||||
)
|
||||
|
||||
type jarFixture struct {
|
||||
Client string `json:"client"`
|
||||
Tracker string `json:"tracker"`
|
||||
CookieName string `json:"cookie_name"`
|
||||
Expect string `json:"expect"`
|
||||
Why string `json:"why"`
|
||||
}
|
||||
|
||||
type jarFile struct {
|
||||
KeyFile string `json:"key_file"`
|
||||
KeyHex string `json:"key_hex"`
|
||||
Fixtures []jarFixture `json:"fixtures"`
|
||||
}
|
||||
|
||||
func loadJarFile(t *testing.T) (jarFile, string) {
|
||||
t.Helper()
|
||||
dir := testdataDir(t) // shared with policy_test.go (cmd/sbxmitm → ../../testdata)
|
||||
raw, err := os.ReadFile(filepath.Join(dir, "jar-fixtures.json"))
|
||||
if err != nil {
|
||||
t.Fatalf("read jar fixtures: %v", err)
|
||||
}
|
||||
var jf jarFile
|
||||
if err := json.Unmarshal(raw, &jf); err != nil {
|
||||
t.Fatalf("parse jar fixtures: %v", err)
|
||||
}
|
||||
if len(jf.Fixtures) == 0 {
|
||||
t.Fatal("no jar fixtures")
|
||||
}
|
||||
return jf, dir
|
||||
}
|
||||
|
||||
// TestJarKeyLoad: loadJarKey strips the file's surrounding whitespace back to
|
||||
// the canonical key declared in key_hex (proves .strip()/TrimSpace parity).
|
||||
func TestJarKeyLoad(t *testing.T) {
|
||||
jf, dir := loadJarFile(t)
|
||||
key := loadJarKey(filepath.Join(dir, jf.KeyFile))
|
||||
if key == nil {
|
||||
t.Fatal("loadJarKey returned nil")
|
||||
}
|
||||
want, err := hex.DecodeString(jf.KeyHex)
|
||||
if err != nil {
|
||||
t.Fatalf("bad key_hex: %v", err)
|
||||
}
|
||||
if hex.EncodeToString(key) != hex.EncodeToString(want) {
|
||||
t.Fatalf("loaded key %x != canonical %x", key, want)
|
||||
}
|
||||
}
|
||||
|
||||
// TestJarParity: fakeID == Python-generated expect for every fixture.
|
||||
func TestJarParity(t *testing.T) {
|
||||
jf, dir := loadJarFile(t)
|
||||
key := loadJarKey(filepath.Join(dir, jf.KeyFile))
|
||||
if key == nil {
|
||||
t.Fatal("loadJarKey returned nil — cannot run parity")
|
||||
}
|
||||
for _, fx := range jf.Fixtures {
|
||||
got, ok := fakeID(fx.Client, fx.Tracker, fx.CookieName, key)
|
||||
if !ok {
|
||||
t.Errorf("fakeID(%q,%q,%q) returned !ok (%s)", fx.Client, fx.Tracker, fx.CookieName, fx.Why)
|
||||
continue
|
||||
}
|
||||
if got != fx.Expect {
|
||||
t.Errorf("fakeID(%q,%q,%q)=%q want %q (%s)",
|
||||
fx.Client, fx.Tracker, fx.CookieName, got, fx.Expect, fx.Why)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// TestJarShapeCoverage: the fixtures must exercise every _shape branch, else
|
||||
// "parity" is vacuous for an untested branch.
|
||||
func TestJarShapeCoverage(t *testing.T) {
|
||||
jf, _ := loadJarFile(t)
|
||||
var sawGA, sawFB, sawUUID, sawHex bool
|
||||
for _, fx := range jf.Fixtures {
|
||||
switch {
|
||||
case len(fx.Expect) >= 4 && fx.Expect[:4] == "GA1.":
|
||||
sawGA = true
|
||||
case len(fx.Expect) >= 3 && fx.Expect[:3] == "fb.":
|
||||
sawFB = true
|
||||
case len(fx.Expect) == 36 && fx.Expect[8] == '-':
|
||||
sawUUID = true
|
||||
case len(fx.Expect) == 32:
|
||||
sawHex = true
|
||||
}
|
||||
}
|
||||
if !sawGA || !sawFB || !sawUUID || !sawHex {
|
||||
t.Fatalf("shape coverage incomplete: GA=%v FB=%v UUID=%v HEX=%v", sawGA, sawFB, sawUUID, sawHex)
|
||||
}
|
||||
}
|
||||
|
||||
// TestJarFolding: two subdomains of the same registrable tracker, same client &
|
||||
// cookie name, mint the IDENTICAL fake id (registrable() folding).
|
||||
func TestJarFolding(t *testing.T) {
|
||||
jf, dir := loadJarFile(t)
|
||||
key := loadJarKey(filepath.Join(dir, jf.KeyFile))
|
||||
a, _ := fakeID("foldclient", "px.doubleclick.net", "uid", key)
|
||||
b, _ := fakeID("foldclient", "ads.doubleclick.net", "uid", key)
|
||||
if a == "" || a != b {
|
||||
t.Fatalf("folding broken: px=%q ads=%q", a, b)
|
||||
}
|
||||
}
|
||||
|
||||
// TestJarNilCases: fakeID returns ("",false) exactly where Python returns None.
|
||||
func TestJarNilCases(t *testing.T) {
|
||||
jf, dir := loadJarFile(t)
|
||||
key := loadJarKey(filepath.Join(dir, jf.KeyFile))
|
||||
cases := []struct {
|
||||
name string
|
||||
client, tracker, cookie string
|
||||
k []byte
|
||||
}{
|
||||
{"empty key", "c", "t.example", "uid", nil},
|
||||
{"empty client", "", "t.example", "uid", key},
|
||||
{"empty tracker", "c", "", "uid", key},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
if v, ok := fakeID(tc.client, tc.tracker, tc.cookie, tc.k); ok || v != "" {
|
||||
t.Errorf("%s: fakeID=%q,%v want \"\",false", tc.name, v, ok)
|
||||
}
|
||||
}
|
||||
}
|
||||
119
packages/secubox-toolbox-ng/cmd/sbxmitm/machash.go
Normal file
|
|
@ -0,0 +1,119 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// SecuBox-Deb :: toolbox-ng :: WG persona identity (mac_hash) (#662 Phase 6 prep)
|
||||
//
|
||||
// Byte-exact port of the Python WG-peer identity resolver
|
||||
// (packages/secubox-toolbox/mitmproxy_addons/_common.py: _wg_hash_of /
|
||||
// mac_hash_of). Python is the source of truth; this mirrors it exactly, proven
|
||||
// by the cross-engine parity harness (testdata/wg-peers-fixture.json +
|
||||
// testdata/machash-fixtures.json + machash_test.go ↔ tests/test_machash_parity.py).
|
||||
//
|
||||
// R3 clients reach this transparent engine over WireGuard on 10.99.1.0/24 and
|
||||
// have NO ARP entry on the captive subnet, so they are identified by their WG
|
||||
// public key (one peer → one IP, deterministic): ip → sha256(pubkey)[:16].
|
||||
//
|
||||
// Pure standard library — no external modules, no go.sum.
|
||||
package main
|
||||
|
||||
import (
|
||||
"crypto/sha256"
|
||||
"encoding/hex"
|
||||
"encoding/json"
|
||||
"os"
|
||||
"strings"
|
||||
"sync"
|
||||
)
|
||||
|
||||
// wgPeersPath is the on-disk WG peer DB, mirroring _common._WG_PEERS_DB. It is a
|
||||
// package-level var (not a const) so tests can repoint it at a fixture.
|
||||
var wgPeersPath = "/var/lib/secubox/toolbox/wg-peers.json"
|
||||
|
||||
// wgPeer mirrors the per-pubkey metadata object in wg-peers.json. Only "ip" is
|
||||
// consumed here (other fields are ignored, like the Python meta.get("ip")).
|
||||
type wgPeer struct {
|
||||
IP string `json:"ip"`
|
||||
}
|
||||
|
||||
// wgPeersDB mirrors the file shape: {"peers": {"<pubkey>": {"ip": "..."}}}.
|
||||
type wgPeersDB struct {
|
||||
Peers map[string]wgPeer `json:"peers"`
|
||||
}
|
||||
|
||||
// WG peer cache, mtime-keyed and reloaded only on mtime change — exactly like
|
||||
// the Python _WG_PEERS_CACHE / _WG_PEERS_MTIME globals. Guarded by a mutex: the
|
||||
// Go proxy is genuinely concurrent (Python relied on the GIL), so the cache map
|
||||
// and mtime MUST NOT be read/written without holding wgMu.
|
||||
var (
|
||||
wgMu sync.Mutex
|
||||
wgCache map[string]string // ip → sha256(pubkey)[:16]
|
||||
wgMtime int64 // last loaded file mtime (UnixNano), 0 = unloaded
|
||||
)
|
||||
|
||||
// resetWGCache clears the in-process WG cache so the next wgHashOf reload reads
|
||||
// wgPeersPath afresh. Used by tests after repointing wgPeersPath; mirrors the
|
||||
// Python parity test resetting _WG_PEERS_CACHE/_WG_PEERS_MTIME.
|
||||
func resetWGCache() {
|
||||
wgMu.Lock()
|
||||
wgCache = nil
|
||||
wgMtime = 0
|
||||
wgMu.Unlock()
|
||||
}
|
||||
|
||||
// wgHashOf maps a WG peer IP (10.99.1.X) to sha256(peer_pubkey)[:16]. Mirrors
|
||||
// _common._wg_hash_of EXACTLY: mtime-cached, reloaded only when the file mtime
|
||||
// changes (or the cache is empty); ANY error (missing file, bad JSON, stat
|
||||
// failure) → "" (best-effort, fail-open to empty, never panics). Returns "" for
|
||||
// an IP not present in the DB. The cache is mutex-guarded for concurrency.
|
||||
func wgHashOf(ip string) string {
|
||||
wgMu.Lock()
|
||||
defer wgMu.Unlock()
|
||||
|
||||
fi, err := os.Stat(wgPeersPath)
|
||||
if err != nil {
|
||||
return "" // missing file / unreadable → fail-open (Python: not exists → None)
|
||||
}
|
||||
mtime := fi.ModTime().UnixNano()
|
||||
if mtime != wgMtime || wgCache == nil {
|
||||
raw, err := os.ReadFile(wgPeersPath)
|
||||
if err != nil {
|
||||
return ""
|
||||
}
|
||||
var db wgPeersDB
|
||||
if err := json.Unmarshal(raw, &db); err != nil {
|
||||
return "" // bad JSON → fail-open (Python: except → None)
|
||||
}
|
||||
fresh := make(map[string]string, len(db.Peers))
|
||||
for pubkey, meta := range db.Peers {
|
||||
if meta.IP != "" {
|
||||
sum := sha256.Sum256([]byte(pubkey))
|
||||
fresh[meta.IP] = hex.EncodeToString(sum[:])[:16]
|
||||
}
|
||||
}
|
||||
wgCache = fresh
|
||||
wgMtime = mtime
|
||||
}
|
||||
return wgCache[ip] // missing key → "" (Python: cache.get(ip) → None)
|
||||
}
|
||||
|
||||
// macHashOf resolves an IP to a stable per-client persona identity hash.
|
||||
// Mirrors _common.mac_hash_of, but scoped to the R3 transparent engine:
|
||||
//
|
||||
// - empty ip → ""
|
||||
// - 10.99.1.0/24 (WG peer) → wgHashOf(ip) = sha256(peer_pubkey)[:16]
|
||||
// - else → ""
|
||||
//
|
||||
// The Python mac_hash_of has a third branch for the captive subnet
|
||||
// (R0/R1/R2): hash_mac(mac_of(ip)) = HMAC(salt, ARP MAC). That ARP/HMAC path is
|
||||
// INTENTIONALLY out of scope here — R3 clients arrive over WireGuard and have no
|
||||
// ARP entry on the captive subnet, so this engine is WG-only. Off-subnet IPs
|
||||
// therefore resolve to "" (the caller falls back to the raw peer IP).
|
||||
func macHashOf(ip string) string {
|
||||
if ip == "" {
|
||||
return ""
|
||||
}
|
||||
if strings.HasPrefix(ip, "10.99.1.") {
|
||||
return wgHashOf(ip)
|
||||
}
|
||||
return "" // R0-R2 ARP/HMAC path out of scope for the R3 transparent engine
|
||||
}
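The identity rule in this file is `ip → sha256(pubkey)[:16]`, built from a `{"peers": {"<pubkey>": {"ip": "..."}}}` document. The sketch below shows that mapping in isolation, without the mtime cache or the mutex. `personaHash` and `indexPeers` are hypothetical names, and the sample JSON is invented for illustration (the key `"abc"` is not a real WireGuard public key).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// personaHash returns the first 16 hex characters of sha256(pubkey),
// hashing the key's text form as stored in the JSON document.
func personaHash(pubkey string) string {
	sum := sha256.Sum256([]byte(pubkey))
	return hex.EncodeToString(sum[:])[:16]
}

// indexPeers parses a wg-peers style document and inverts it into
// ip -> persona hash. Peers without an "ip" field are skipped.
func indexPeers(raw []byte) (map[string]string, error) {
	var db struct {
		Peers map[string]struct {
			IP string `json:"ip"`
		} `json:"peers"`
	}
	if err := json.Unmarshal(raw, &db); err != nil {
		return nil, err
	}
	byIP := make(map[string]string, len(db.Peers))
	for pubkey, meta := range db.Peers {
		if meta.IP != "" {
			byIP[meta.IP] = personaHash(pubkey)
		}
	}
	return byIP, nil
}

func main() {
	// Invented sample data; "noip" shows a peer that is ignored.
	raw := []byte(`{"peers": {"abc": {"ip": "10.99.1.10"}, "noip": {}}}`)
	byIP, err := indexPeers(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(byIP["10.99.1.10"], len(byIP))
}
```

A lookup for an IP absent from the map yields the empty string, which matches the "missing key" behaviour described for `wgHashOf`.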
|
||||
118
packages/secubox-toolbox-ng/cmd/sbxmitm/machash_test.go
Normal file
|
|
@ -0,0 +1,118 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// Cross-engine mac_hash (WG persona identity) parity harness — Go side
|
||||
// (#662 Phase 6 prep).
|
||||
//
|
||||
// Loads testdata/machash-fixtures.json + the SAME testdata/wg-peers-fixture.json
|
||||
// the Python side reads, points wgPeersPath at the fixture, and asserts
|
||||
// macHashOf(ip) == each fixture's expected. The Python side
|
||||
// (../secubox-toolbox/tests/test_machash_parity.py) monkeypatches
|
||||
// _common._WG_PEERS_DB to the SAME fixture and drives _common.mac_hash_of; both
|
||||
// must agree → the WG persona hash is byte-exact across engines. Python is the
|
||||
// source of truth: the expected values were GENERATED by sha256(pubkey)[:16] in
|
||||
// Python, never hand-computed in Go (non-circular parity).
|
||||
package main
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"testing"
|
||||
)
|
||||
|
||||
type machashFixture struct {
|
||||
IP string `json:"ip"`
|
||||
Expected string `json:"expected"`
|
||||
Why string `json:"why"`
|
||||
}
|
||||
|
||||
type machashFile struct {
|
||||
WGPeersFile string `json:"wg_peers_file"`
|
||||
Fixtures []machashFixture `json:"fixtures"`
|
||||
}
|
||||
|
||||
func loadMachashFile(t *testing.T) (machashFile, string) {
|
||||
t.Helper()
|
||||
dir := testdataDir(t) // shared with policy_test.go (cmd/sbxmitm → ../../testdata)
|
||||
raw, err := os.ReadFile(filepath.Join(dir, "machash-fixtures.json"))
|
||||
if err != nil {
|
||||
t.Fatalf("read machash fixtures: %v", err)
|
||||
}
|
||||
var mf machashFile
|
||||
if err := json.Unmarshal(raw, &mf); err != nil {
|
||||
t.Fatalf("parse machash fixtures: %v", err)
|
||||
}
|
||||
if len(mf.Fixtures) == 0 {
|
||||
t.Fatal("no machash fixtures")
|
||||
}
|
||||
return mf, dir
|
||||
}
|
||||
|
||||
// withWGFixture points wgPeersPath at the fixture and resets the cache so the
|
||||
// override is (re)read, restoring the original path afterwards. Mirrors exactly
|
||||
// the (path, cache) surface the Python _wg_hash_of reads.
|
||||
func withWGFixture(t *testing.T, mf machashFile, dir string) {
|
||||
t.Helper()
|
||||
orig := wgPeersPath
|
||||
wgPeersPath = filepath.Join(dir, mf.WGPeersFile)
|
||||
resetWGCache()
|
||||
t.Cleanup(func() {
|
||||
wgPeersPath = orig
|
||||
resetWGCache()
|
||||
})
|
||||
}
|
||||
|
||||
// TestMacHashParity: macHashOf == Python-generated expected for every fixture.
|
||||
func TestMacHashParity(t *testing.T) {
|
||||
mf, dir := loadMachashFile(t)
|
||||
withWGFixture(t, mf, dir)
|
||||
for _, fx := range mf.Fixtures {
|
||||
got := macHashOf(fx.IP)
|
||||
if got != fx.Expected {
|
||||
t.Errorf("macHashOf(%q)=%q want %q (%s)", fx.IP, got, fx.Expected, fx.Why)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// TestMacHashCoverage: the fixtures must exercise the discriminating cases, else
|
||||
// "parity" is vacuous. We need at least one resolved WG peer (non-empty), one
|
||||
// in-subnet miss (empty), one off-subnet IP (empty), and the empty ip (empty).
|
||||
func TestMacHashCoverage(t *testing.T) {
|
||||
mf, dir := loadMachashFile(t)
|
||||
withWGFixture(t, mf, dir)
|
||||
var sawResolved, sawSubnetMiss, sawOffSubnet, sawEmpty bool
|
||||
for _, fx := range mf.Fixtures {
|
||||
switch {
|
||||
case fx.IP == "":
|
||||
sawEmpty = true
|
||||
case fx.Expected != "":
|
||||
sawResolved = true
|
||||
case len(fx.IP) >= 8 && fx.IP[:8] == "10.99.1.":
|
||||
sawSubnetMiss = true
|
||||
default:
|
||||
sawOffSubnet = true
|
||||
}
|
||||
}
|
||||
if !sawResolved || !sawSubnetMiss || !sawOffSubnet || !sawEmpty {
|
||||
t.Fatalf("machash coverage incomplete: resolved=%v subnetMiss=%v offSubnet=%v empty=%v",
|
||||
sawResolved, sawSubnetMiss, sawOffSubnet, sawEmpty)
|
||||
}
|
||||
}
|
||||
|
||||
// TestWGCacheReload: wgHashOf reflects the file's content; after pointing at a
|
||||
// missing path it fails open to "" (best-effort, never panics).
|
||||
func TestWGCacheReload(t *testing.T) {
|
||||
mf, dir := loadMachashFile(t)
|
||||
withWGFixture(t, mf, dir)
|
||||
// A resolved peer from the fixture returns non-empty.
|
||||
if got := wgHashOf("10.99.1.10"); got == "" {
|
||||
t.Fatal("wgHashOf(10.99.1.10) empty — fixture not loaded")
|
||||
}
|
||||
// Repoint at a missing file → reload → fail-open to "".
|
||||
wgPeersPath = filepath.Join(dir, "does-not-exist.json")
|
||||
resetWGCache()
|
||||
if got := wgHashOf("10.99.1.10"); got != "" {
|
||||
t.Fatalf("wgHashOf with missing file = %q want \"\"", got)
|
||||
}
|
||||
}
|
||||
479
packages/secubox-toolbox-ng/cmd/sbxmitm/main.go
Normal file
|
|
@ -0,0 +1,479 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// SecuBox-Deb :: toolbox-ng :: forging MITM PoC (#662 Phase 1)
|
||||
//
|
||||
// De-risking spike for migrating the R3 MITM engine off Python mitmproxy onto a
|
||||
// multi-core Go core. Pure standard library (no external modules) so it builds
|
||||
// offline and cross-compiles to arm64 with `GOOS=linux GOARCH=arm64 go build`.
|
||||
//
|
||||
// It is NOT wired into the live R3 path. It proves the discriminating
|
||||
// capabilities the engine analysis flagged as risky:
|
||||
// - forge per-host leaf certs from the EXISTING ca-wg CA (client trust intact),
|
||||
// - request short-circuit 204 (ad_ghost block),
|
||||
// - response body inject (banner / ad-CSS),
|
||||
// - SNI splice passthrough (tls_splice),
|
||||
// - TLS ClientHello capture for JA4 (ja4 addon) via the crypto/tls Config.GetCertificate callback.
|
||||
//
|
||||
// Runs as an HTTP CONNECT proxy for easy smoke-testing (`curl -x`). The live
|
||||
// engine will run transparent (SO_ORIGINAL_DST) — same handlers, different
|
||||
// accept path (Phase 2+).
|
||||
package main
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"crypto"
|
||||
"crypto/rand"
|
||||
"crypto/tls"
|
||||
"crypto/x509"
|
||||
"crypto/x509/pkix"
|
||||
"encoding/pem"
|
||||
"flag"
|
||||
"fmt"
|
||||
"io"
|
||||
"log"
|
||||
"math/big"
|
||||
"net"
|
||||
"net/http"
|
||||
"os"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
// ── CA + per-host leaf forging ──────────────────────────────────────────────
|
||||
|
||||
// CA holds the loaded forging CA (reused from ca-wg) + a per-host leaf cache.
|
||||
type CA struct {
|
||||
cert *x509.Certificate
|
||||
key crypto.Signer
|
||||
mu sync.Mutex
|
||||
cache map[string]*tls.Certificate
|
||||
}
|
||||
|
||||
func loadCA(certPath, keyPath string) (*CA, error) {
|
||||
cpem, err := os.ReadFile(certPath)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("read ca cert: %w", err)
|
||||
}
|
||||
kpem, err := os.ReadFile(keyPath)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("read ca key: %w", err)
|
||||
}
|
||||
// Scan for the right block TYPE rather than assuming position: the live R3
|
||||
// CA the toolbox forges with (mitmproxy confdir `mitmproxy-ca.pem`) is a
|
||||
// COMBINED cert+key bundle, and --ca-key may point at it. Tolerate cert and
|
||||
// key co-residing in either file, in any order.
|
||||
cblk := firstPEMBlock(cpem, func(b *pem.Block) bool { return b.Type == "CERTIFICATE" })
|
||||
if cblk == nil {
|
||||
return nil, fmt.Errorf("ca cert: no CERTIFICATE PEM block")
|
||||
}
|
||||
cert, err := x509.ParseCertificate(cblk.Bytes)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("parse ca cert: %w", err)
|
||||
}
|
||||
kblk := firstPEMBlock(kpem, func(b *pem.Block) bool { return strings.Contains(b.Type, "PRIVATE KEY") })
|
||||
if kblk == nil {
|
||||
return nil, fmt.Errorf("ca key: no PRIVATE KEY PEM block")
|
||||
}
|
||||
key, err := parseKey(kblk.Bytes)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("parse ca key: %w", err)
|
||||
}
|
||||
return &CA{cert: cert, key: key, cache: map[string]*tls.Certificate{}}, nil
|
||||
}
|
||||
|
||||
// firstPEMBlock returns the first PEM block in data satisfying want, or nil.
|
||||
// Used to pull a specific block (CERTIFICATE / PRIVATE KEY) out of a file that
|
||||
// may hold several (e.g. mitmproxy's combined CA bundle).
|
||||
func firstPEMBlock(data []byte, want func(*pem.Block) bool) *pem.Block {
|
||||
for {
|
||||
blk, rest := pem.Decode(data)
|
||||
if blk == nil {
|
||||
return nil
|
||||
}
|
||||
if want(blk) {
|
||||
return blk
|
||||
}
|
||||
data = rest
|
||||
}
|
||||
}
|
||||
|
||||
func parseKey(der []byte) (crypto.Signer, error) {
|
||||
if k, err := x509.ParsePKCS8PrivateKey(der); err == nil {
|
||||
if s, ok := k.(crypto.Signer); ok {
|
||||
return s, nil
|
||||
}
|
||||
}
|
||||
if k, err := x509.ParsePKCS1PrivateKey(der); err == nil {
|
||||
return k, nil
|
||||
}
|
||||
if k, err := x509.ParseECPrivateKey(der); err == nil {
|
||||
return k, nil
|
||||
}
|
||||
return nil, fmt.Errorf("unsupported CA key format")
|
||||
}
|
||||
|
||||
// forge returns a leaf cert for host signed by the CA, cached.
|
||||
func (c *CA) forge(host string) (*tls.Certificate, error) {
|
||||
host = strings.ToLower(strings.TrimSpace(host))
|
||||
c.mu.Lock()
|
||||
if tc, ok := c.cache[host]; ok {
|
||||
c.mu.Unlock()
|
||||
return tc, nil
|
||||
}
|
||||
c.mu.Unlock()
|
||||
|
||||
serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 128))
if err != nil {
return nil, err
}
|
||||
tmpl := &x509.Certificate{
|
||||
SerialNumber: serial,
|
||||
Subject: pkix.Name{CommonName: host},
|
||||
NotBefore: time.Now().Add(-1 * time.Hour),
|
||||
NotAfter: time.Now().Add(24 * time.Hour),
|
||||
KeyUsage: x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
|
||||
ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
|
||||
DNSNames: []string{host},
|
||||
}
|
||||
der, err := x509.CreateCertificate(rand.Reader, tmpl, c.cert, c.key.Public(), c.key)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
leaf, err := x509.ParseCertificate(der) // parsed cert has Raw populated (Verify needs it)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
tc := &tls.Certificate{Certificate: [][]byte{der, c.cert.Raw}, PrivateKey: c.key, Leaf: leaf}
|
||||
c.mu.Lock()
|
||||
c.cache[host] = tc
|
||||
c.mu.Unlock()
|
||||
return tc, nil
|
||||
}
|
||||
|
||||
// ── Pure handler logic ───────────────────────────────────────────────────────
|
||||
//
|
||||
// The decision surface (Decide / action / registrable / splice helpers) lives
|
||||
// in policy.go, ported from the Python addons and proven at parity by the
|
||||
// cross-engine harness. The body-inject helper is kept here next to the wiring.
|
||||
|
||||
// injectMarker inserts p.Inject before </head> (else </body>, else prepends).
|
||||
func (p *Policy) injectMarker(body []byte) []byte {
|
||||
if len(p.Inject) == 0 || bytes.Contains(body, p.Inject) {
|
||||
return body
|
||||
}
|
||||
for _, tag := range [][]byte{[]byte("</head>"), []byte("</body>")} {
|
||||
if i := bytes.Index(bytes.ToLower(body), bytes.ToLower(tag)); i >= 0 {
|
||||
out := make([]byte, 0, len(body)+len(p.Inject))
|
||||
out = append(out, body[:i]...)
|
||||
out = append(out, p.Inject...)
|
||||
out = append(out, body[i:]...)
|
||||
return out
|
||||
}
|
||||
}
|
||||
return append(append([]byte{}, p.Inject...), body...)
|
||||
}
|
||||
|
||||
// ── JA4 ClientHello capture (the Go-feasibility proof for the ja4 addon) ─────
|
||||
|
||||
// ja4ish builds a compact handshake fingerprint from the fields crypto/tls
|
||||
// exposes in ClientHelloInfo (SNI, TLS versions, cipher count, ALPN). A FULL
|
||||
// JA4 also needs the extension list, which requires a raw-ClientHello-bytes
|
||||
// peek before stdlib parsing — feasible (Phase 4); this proves the material is
|
||||
// reachable in Go without Python.
|
||||
func ja4ish(h *tls.ClientHelloInfo) string {
|
||||
maxVer := uint16(0)
|
||||
for _, v := range h.SupportedVersions {
|
||||
if v > maxVer {
|
||||
maxVer = v
|
||||
}
|
||||
}
|
||||
alpn := "none"
|
||||
if len(h.SupportedProtos) > 0 {
|
||||
alpn = h.SupportedProtos[0]
|
||||
}
|
||||
return fmt.Sprintf("t%04x_c%02d_a%s_sni=%s", maxVer, len(h.CipherSuites), alpn, h.ServerName)
|
||||
}
|
||||
|
||||
// ── CONNECT-proxy MITM wiring ────────────────────────────────────────────────
|
||||
|
||||
type Proxy struct {
|
||||
ca *CA
|
||||
pol *Policy
|
||||
jaSink func(string) // JA4 observations (logged; a sidecar in prod)
|
||||
jarKey []byte // anti-track HMAC fake-identity seed (nil → poison off)
|
||||
poison bool // master gate: poison tracker Set-Cookies (default on when jarKey present)
|
||||
portal string // portal base URL for /__toolbox/* reverse-proxy (banner assets)
|
||||
}
|
||||
|
||||
func (px *Proxy) serverTLSConfig() *tls.Config {
|
||||
return &tls.Config{
|
||||
GetCertificate: func(h *tls.ClientHelloInfo) (*tls.Certificate, error) {
|
||||
if px.jaSink != nil {
|
||||
px.jaSink(ja4ish(h)) // capture handshake fingerprint
|
||||
}
|
||||
name := h.ServerName
|
||||
if name == "" {
|
||||
name = "unknown.local"
|
||||
}
|
||||
return px.ca.forge(name)
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
func (px *Proxy) handleConnect(w http.ResponseWriter, r *http.Request) {
|
||||
host := r.URL.Hostname()
|
||||
hj, ok := w.(http.Hijacker)
|
||||
if !ok {
|
||||
http.Error(w, "no hijack", http.StatusInternalServerError)
|
||||
return
|
||||
}
|
||||
client, _, err := hj.Hijack()
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
defer client.Close()
|
||||
io.WriteString(client, "HTTP/1.1 200 Connection Established\r\n\r\n")
|
||||
|
||||
// Decide once on (host, sni). For the CONNECT PoC the SNI is the CONNECT
|
||||
// host; the transparent engine will splice on the real ClientHello SNI.
|
||||
verdict := px.pol.Decide(host, host)
|
||||
|
||||
if verdict == "splice" {
|
||||
// passthrough: raw TCP to upstream, no TLS interception (tls_splice).
|
||||
up, err := net.DialTimeout("tcp", r.URL.Host, 10*time.Second)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
defer up.Close()
|
||||
go io.Copy(up, client)
|
||||
io.Copy(client, up)
|
||||
return
|
||||
}
|
||||
|
||||
// MITM: TLS-terminate the client with a forged cert (+ ClientHello capture).
|
||||
tconn := tls.Server(client, px.serverTLSConfig())
|
||||
if err := tconn.Handshake(); err != nil {
|
||||
return
|
||||
}
|
||||
defer tconn.Close()
|
||||
|
||||
// Shared post-TLS pipeline. CONNECT dials upstream by the request URL host
|
||||
// (req.URL.Host set inside), so dialHost is "" → mitmPipeline derives it.
|
||||
// CONNECT PoC is never an R3 WG client → wg=false.
|
||||
px.mitmPipeline(tconn, client, host, verdict, "", false)
|
||||
}
|
||||
|
||||
// mitmPipeline runs the shared post-TLS-handshake MITM logic used by BOTH the
|
||||
// CONNECT path (handleConnect) and the transparent path (handleTransparent):
|
||||
// read the decrypted request, apply the verdict, anonymize, proxy upstream,
|
||||
// poison tracker Set-Cookies, inject into HTML, and write the response back over
|
||||
// tconn. Factored out so the two accept paths never drift.
|
||||
//
|
||||
// - tconn : the TLS-terminated client connection (forged leaf).
|
||||
// - rawClient : the underlying client net.Conn (for the per-client identity).
|
||||
// - host : the decision host (CONNECT host / transparent SNI). Also the
|
||||
// Host/SNI used for the upstream request and TLS verification.
|
||||
// - verdict : the already-Decided action ∈ {allow, mitm, block}.
|
||||
// - dialHost : upstream "ip:port" to FORCE-dial at the TCP layer. "" →
|
||||
// CONNECT semantics: dial by req.URL.Host (the request URL / host). Non-""
|
||||
// → transparent: TCP-connect the captured original-dst while doing TLS with
|
||||
// ServerName=host and verifying the cert against host (not the bare IP).
|
||||
// - wg : the client is an R3 WireGuard peer (10.99.1.0/24); threaded
|
||||
// into the injected loader's data-wg attribute. CONNECT path passes false.
|
||||
func (px *Proxy) mitmPipeline(tconn *tls.Conn, rawClient net.Conn, host, verdict, dialHost string, wg bool) {
|
||||
br := newReader(tconn)
|
||||
req, err := http.ReadRequest(br)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
req.URL.Scheme = "https"
|
||||
if req.URL.Host == "" {
|
||||
req.URL.Host = host
|
||||
}
|
||||
|
||||
// #636/#662 — serve the banner loader + bundle for ANY origin so the injected
|
||||
// <script src="/__toolbox/loader.js"> resolves (R3 clients hit arbitrary
|
||||
// hosts whose origin can't serve /__toolbox/*). Short-circuit BEFORE dialing
|
||||
// the real upstream by reverse-proxying to the portal. Mirrors the Python
|
||||
// InjectBanner.request() startswith checks (path includes the query string).
|
||||
if isToolboxAssetPath(req.URL.RequestURI()) {
|
||||
servePortalAsset(tconn, px.portal, req.URL.RequestURI())
|
||||
return
|
||||
}
|
||||
// Transparent: the upstream request must carry the SNI host (for Host header,
|
||||
// SNI, and cert verification); the actual TCP dial is pinned to the captured
|
||||
// original-dst by transparentTransport. We do NOT put the bare ip:port in
|
||||
// req.URL.Host (that would make http.Client verify the cert against the IP).
|
||||
if dialHost != "" && host != "" {
|
||||
req.URL.Host = host
|
||||
}
|
||||
|
||||
if verdict == "block" {
|
||||
writeRaw(tconn, 204, "No Content", map[string]string{"X-SecuBox-Ng": "blocked"}, nil)
|
||||
return
|
||||
}
|
||||
|
||||
// ── verdict ∈ {"allow","mitm"} → intercept normally ──────────────────────
|
||||
//
|
||||
// allow → own-infra / allowlist: clean MITM, apply NO block/poison.
|
||||
// mitm → intercept + apply the response handlers (poison if a tracker).
|
||||
//
|
||||
// Always-on hygiene: anonymize the request on EVERY MITM'd flow (incl.
|
||||
// allow — stripping operator headers + asserting opt-out is universally
|
||||
// safe and never touches own-infra correctness).
|
||||
clientHash := clientHashFromConn(rawClient) // mac_hash-aware (WG persona)
|
||||
anonymizeRequest(req.Header)
|
||||
|
||||
// #662 — pin the upstream Accept-Encoding to gzip (overwrite, dropping
|
||||
// br/zstd/deflate we cannot decode with the stdlib). This guarantees every
|
||||
// response is either gzip or identity, so the inject path can reliably
|
||||
// gunzip→inject→re-gzip the HTML. We Set (not Del): Del would make Go's
|
||||
// Transport auto-decompress and re-serve identity, losing wire compression
|
||||
// to the client for ALL resources (incl. non-injected ones). Set keeps the
|
||||
// Transport in pass-through mode so non-HTML bodies stay compressed
|
||||
// end-to-end. Browsers always accept gzip, so relaying gzip back is safe.
|
||||
req.Header.Set("Accept-Encoding", "gzip")
|
||||
|
||||
// proxy upstream, inject into HTML bodies.
|
||||
//
|
||||
// CheckRedirect: a MITM proxy must NOT follow 3xx itself — it relays the
|
||||
// redirect to the client so the BROWSER follows it (correct URL bar, origin,
|
||||
// cookie scope, method semantics). Go's http.Client follows by default, which
|
||||
// would collapse a 301/302 into the final 200 under the original URL (wrong).
|
||||
// Mirror mitmproxy's pass-through behaviour.
|
||||
up := &http.Client{
|
||||
Timeout: 30 * time.Second,
|
||||
CheckRedirect: func(*http.Request, []*http.Request) error { return http.ErrUseLastResponse },
|
||||
}
|
||||
if dialHost != "" {
|
||||
// Transparent: pin the TCP dial to the captured original-dst, do TLS with
|
||||
// ServerName=host, verify the cert against host (verification stays ON).
|
||||
up.Transport = transparentTransport(dialHost, host)
|
||||
}
|
||||
req.RequestURI = ""
|
||||
resp, err := up.Do(req)
|
||||
if err != nil {
|
||||
writeRaw(tconn, 502, "Bad Gateway", nil, nil)
|
||||
return
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
|
||||
// Poison: only on MITM'd tracker flows (never on allow/own-infra), and only
|
||||
// when the jar key is loaded. Replaces tracking-id Set-Cookie values with a
|
||||
// stable fabricated persona; benign cookies pass through untouched.
|
||||
if verdict == "mitm" && px.poison && len(px.jarKey) > 0 && px.pol.shouldPoison(host) {
|
||||
if sc := resp.Header.Values("Set-Cookie"); len(sc) > 0 {
|
||||
poisoned := poisonSetCookies(sc, clientHash, host, px.jarKey)
|
||||
resp.Header.Del("Set-Cookie")
|
||||
for _, c := range poisoned {
|
||||
resp.Header.Add("Set-Cookie", c)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
body, _ := io.ReadAll(io.LimitReader(resp.Body, 8<<20))
|
||||
// Inject the transparency-banner loader only on 2xx text/html responses
|
||||
// (mirrors the Python addon, which skips non-200). The loader's same-origin
|
||||
// <script src="/__toolbox/loader.js"> is served by the short-circuit above.
|
||||
//
|
||||
// #662 — the body may be gzip-compressed (we pinned Accept-Encoding: gzip
|
||||
// upstream). injectIntoBody gunzips→injects→re-gzips when Content-Encoding
|
||||
// is gzip, injects directly when identity, and fails open (untouched) on a
|
||||
// corrupt/unknown encoding. Only on a successful rewrite do we update the
|
||||
// framing: writeResponse emits Content-Length from len(body), but a stale
|
||||
// resp.ContentLength / Content-Encoding could mislead downstream — so we
|
||||
// keep them consistent with the bytes we actually serve.
|
||||
if resp.StatusCode >= 200 && resp.StatusCode < 300 &&
|
||||
strings.Contains(resp.Header.Get("Content-Type"), "text/html") {
|
||||
if out, ok := injectIntoBody(body, resp.Header.Get("Content-Encoding"), clientHash, wg); ok {
|
||||
body = out
|
||||
// Keep the response framing consistent with the served bytes. The
|
||||
// encoding is unchanged (gzip stays gzip, identity stays identity);
|
||||
// only the length changed because injection grew the body. A stale
|
||||
// Content-Length would truncate/corrupt the response.
|
||||
resp.Header.Set("Content-Length", strconv.Itoa(len(body)))
|
||||
resp.ContentLength = int64(len(body))
|
||||
}
|
||||
}
|
||||
writeResponse(tconn, resp, body)
|
||||
}
|
||||
|
||||
// transparentTransport builds a per-request http.Transport for the transparent
|
||||
// path: it TCP-dials the captured original-dst (ip:port) for EVERY connection
|
||||
// regardless of req.URL.Host, while performing TLS with ServerName=sni and
|
||||
// verifying the cert against that name — so a transparently-redirected upstream
|
||||
// is reached at the real captured IP yet validated by hostname, NOT the bare IP
|
||||
// (which would always mismatch the cert). Cert verification stays ON
|
||||
// (no InsecureSkipVerify). Pure stdlib so it builds on all GOOS.
|
||||
func transparentTransport(dialAddr, sni string) *http.Transport {
|
||||
d := &net.Dialer{Timeout: 10 * time.Second}
|
||||
return &http.Transport{
|
||||
DialContext: func(ctx context.Context, network, _ string) (net.Conn, error) {
|
||||
return d.DialContext(ctx, network, dialAddr)
|
||||
},
|
||||
TLSClientConfig: &tls.Config{ServerName: sni},
|
||||
TLSHandshakeTimeout: 10 * time.Second,
|
||||
ResponseHeaderTimeout: 30 * time.Second,
|
||||
ForceAttemptHTTP2: false,
|
||||
}
|
||||
}
|
||||
|
||||
func main() {
|
||||
caCert := flag.String("ca-cert", "/etc/secubox/toolbox/ca-wg/ca.pem", "CA cert PEM")
|
||||
caKey := flag.String("ca-key", "/etc/secubox/toolbox/ca-wg/key.pem", "CA key PEM")
|
||||
addr := flag.String("listen", ":8090", "CONNECT proxy listen addr")
|
||||
jarKeyPath := flag.String("jar-key", "/etc/secubox/secrets/privacy-jar.key",
|
||||
"anti-track HMAC fake-identity seed (poison disabled if absent)")
|
||||
poison := flag.Bool("poison", true,
|
||||
"poison tracking Set-Cookies on MITM'd tracker flows (needs --jar-key; never touches allow/own-infra)")
|
||||
transparent := flag.Bool("transparent", false,
|
||||
"transparent mode: accept nft-DNAT'd conns + recover SO_ORIGINAL_DST (live R3); default is the CONNECT proxy PoC")
|
||||
portal := flag.String("portal", "http://127.0.0.1:8088",
|
||||
"portal base URL; /__toolbox/loader.js + /__toolbox/bundle are reverse-proxied here (banner assets, served for any MITM'd origin)")
|
||||
flag.Parse()
|
||||
ca, err := loadCA(*caCert, *caKey)
|
||||
if err != nil {
|
||||
log.Fatalf("CA load: %v", err)
|
||||
}
|
||||
// Load the BLOCK/SPLICE policy from the SAME on-disk config the Python
|
||||
// addons read (defaults + env overrides). Missing files are tolerated
|
||||
// (best-effort, like the addons): the engine then simply MITMs everything.
|
||||
pol, err := LoadPolicy(PolicyOpts{})
|
||||
if err != nil {
|
||||
log.Fatalf("policy load: %v", err)
|
||||
}
|
||||
pol.Inject = []byte("<!-- sbx-ng banner -->")
|
||||
// Anti-track jar seed: best-effort (like the Python _jar_key). Absent/empty
|
||||
// → loadJarKey returns nil → poison stays off even if --poison is set.
|
||||
jarKey := loadJarKey(*jarKeyPath)
|
||||
if *poison && len(jarKey) == 0 {
|
||||
log.Printf("poison requested but jar key %s absent/empty → poison OFF", *jarKeyPath)
|
||||
}
|
||||
px := &Proxy{
|
||||
ca: ca,
|
||||
pol: pol,
|
||||
jaSink: func(s string) { log.Printf("ja4 %s", s) },
|
||||
jarKey: jarKey,
|
||||
poison: *poison,
|
||||
portal: *portal,
|
||||
}
|
||||
if *transparent {
|
||||
// Transparent R3 mode: raw accept loop, each conn carries its pre-DNAT
|
||||
// destination via SO_ORIGINAL_DST (recovered in handleTransparent). The
|
||||
// accept loop lives in runTransparent — linux-tagged, with a non-linux
|
||||
// stub so the package still builds off-target (e.g. `GOOS=darwin go build`).
|
||||
runTransparent(px, *addr)
|
||||
return
|
||||
}
|
||||
|
||||
srv := &http.Server{Addr: *addr, Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
if r.Method == http.MethodConnect {
|
||||
px.handleConnect(w, r)
|
||||
return
|
||||
}
|
||||
http.Error(w, "CONNECT only (PoC)", http.StatusMethodNotAllowed)
|
||||
})}
|
||||
log.Printf("sbxmitm CONNECT PoC listening on %s (CA %s)", *addr, *caCert)
|
||||
log.Fatal(srv.ListenAndServe())
|
||||
}
|
||||
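The comments in `mitmPipeline` describe a gzip-aware injection step (gunzip, inject, re-gzip, fail open on anything else), but `injectIntoBody` itself is not shown in this part of the diff. The following is a minimal sketch of that idea under stated assumptions: `injectEncoded`, `insertBeforeHead`, `gzipBytes` and `gunzipBytes` are hypothetical names, the marker is a plain byte slice, and the `</head>` match is exact-case for brevity. It is not the project's implementation.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// insertBeforeHead places marker before the first literal "</head>", or
// prepends it when the tag is absent. Simplified: exact-case match only.
func insertBeforeHead(body, marker []byte) []byte {
	out := make([]byte, 0, len(body)+len(marker))
	i := bytes.Index(body, []byte("</head>"))
	if i < 0 {
		out = append(out, marker...)
		return append(out, body...)
	}
	out = append(out, body[:i]...)
	out = append(out, marker...)
	return append(out, body[i:]...)
}

// gzipBytes compresses plain into a complete gzip stream.
func gzipBytes(plain []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(plain); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil { // Close flushes the gzip trailer
		return nil, err
	}
	return buf.Bytes(), nil
}

// gunzipBytes decompresses a gzip stream, returning an error on corrupt input.
func gunzipBytes(packed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(packed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// injectEncoded (hypothetical) rewrites body according to its Content-Encoding.
// It returns ok=false and the untouched body for unknown or corrupt encodings,
// so the caller can relay the original bytes unchanged (fail open).
func injectEncoded(body []byte, encoding string, marker []byte) ([]byte, bool) {
	switch strings.ToLower(strings.TrimSpace(encoding)) {
	case "", "identity":
		return insertBeforeHead(body, marker), true
	case "gzip":
		plain, err := gunzipBytes(body)
		if err != nil {
			return body, false
		}
		packed, err := gzipBytes(insertBeforeHead(plain, marker))
		if err != nil {
			return body, false
		}
		return packed, true
	default:
		return body, false
	}
}

func main() {
	page := []byte("<html><head></head><body>hi</body></html>")
	packed, _ := gzipBytes(page)
	out, ok := injectEncoded(packed, "gzip", []byte("<!-- demo -->"))
	plain, _ := gunzipBytes(out)
	fmt.Println(ok, string(plain))
	// The caller would set Content-Length from the bytes actually served.
	fmt.Println("Content-Length to serve:", len(out))
}
```

The `ok` flag is what lets the caller update `Content-Length` only after a successful rewrite, which matches the framing concern raised in the comments above.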
204
packages/secubox-toolbox-ng/cmd/sbxmitm/main_test.go
Normal file
|
|
@ -0,0 +1,204 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
package main
|
||||
|
||||
import (
|
||||
"crypto/ecdsa"
|
||||
"crypto/elliptic"
|
||||
"crypto/rand"
|
||||
"crypto/tls"
|
||||
"crypto/x509"
|
||||
"crypto/x509/pkix"
|
||||
"encoding/pem"
|
||||
"math/big"
|
||||
"net"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"sync"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
// genTestCA writes a self-signed CA (cert+key PEM) to dir, mirroring ca-wg.
|
||||
func genTestCA(t *testing.T, dir string) (certPath, keyPath string) {
|
||||
t.Helper()
|
||||
key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
tmpl := &x509.Certificate{
|
||||
SerialNumber: big.NewInt(1),
|
||||
Subject: pkix.Name{CommonName: "SecuBox Test CA"},
|
||||
NotBefore: time.Now().Add(-time.Hour),
|
||||
NotAfter: time.Now().Add(24 * time.Hour),
|
||||
IsCA: true,
|
||||
KeyUsage: x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
|
||||
BasicConstraintsValid: true,
|
||||
}
|
||||
der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, key.Public(), key)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
certPath = filepath.Join(dir, "ca.pem")
|
||||
keyPath = filepath.Join(dir, "key.pem")
|
||||
certPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
if err := os.WriteFile(certPath, certPEM, 0o644); err != nil {
t.Fatal(err)
}
kder, err := x509.MarshalPKCS8PrivateKey(key)
if err != nil {
t.Fatal(err)
}
keyPEM := pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: kder})
if err := os.WriteFile(keyPath, keyPEM, 0o600); err != nil {
t.Fatal(err)
}
|
||||
return certPath, keyPath
|
||||
}
|
||||
|
||||
func TestForgeChainsToCA(t *testing.T) {
|
||||
cp, kp := genTestCA(t, t.TempDir())
|
||||
ca, err := loadCA(cp, kp)
|
||||
if err != nil {
|
||||
t.Fatalf("loadCA: %v", err)
|
||||
}
|
||||
leaf, err := ca.forge("ads.example.com")
|
||||
if err != nil {
|
||||
t.Fatalf("forge: %v", err)
|
||||
}
|
||||
pool := x509.NewCertPool()
|
||||
pool.AddCert(ca.cert)
|
||||
if _, err := leaf.Leaf.Verify(x509.VerifyOptions{Roots: pool, DNSName: "ads.example.com"}); err != nil {
|
||||
t.Fatalf("forged leaf does not chain to CA / wrong SAN: %v", err)
|
||||
}
|
||||
leaf2, _ := ca.forge("ads.example.com")
|
||||
if leaf2 != leaf {
|
||||
t.Fatal("forge not cached")
|
||||
}
|
||||
}
|
||||
|
||||
// TestLoadCACombinedPEM proves loadCA pulls the right blocks out of a COMBINED
|
||||
// cert+key bundle — the real shape of mitmproxy's confdir `mitmproxy-ca.pem`,
|
||||
// which the live R3 CA uses and the worker unit points --ca-key at. mitmproxy
|
||||
// writes the PRIVATE KEY block first, then the CERTIFICATE; loadCA must scan by
|
||||
// type, not position.
|
||||
func TestLoadCACombinedPEM(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
tmpl := &x509.Certificate{
|
||||
SerialNumber: big.NewInt(7),
|
||||
Subject: pkix.Name{CommonName: "Gondwana ToolBoX R3 CA (test)"},
|
||||
NotBefore: time.Now().Add(-time.Hour),
|
||||
NotAfter: time.Now().Add(24 * time.Hour),
|
||||
IsCA: true,
|
||||
KeyUsage: x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
|
||||
BasicConstraintsValid: true,
|
||||
}
|
||||
der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, key.Public(), key)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
kder, _ := x509.MarshalPKCS8PrivateKey(key)
|
||||
keyPEM := pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: kder})
|
||||
certPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
|
||||
|
||||
// mitmproxy-ca.pem layout: key THEN cert in one file.
|
||||
combined := filepath.Join(dir, "mitmproxy-ca.pem")
|
||||
if err := os.WriteFile(combined, append(append([]byte{}, keyPEM...), certPEM...), 0o600); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
// mitmproxy-ca-cert.pem: cert only.
|
||||
certOnly := filepath.Join(dir, "mitmproxy-ca-cert.pem")
|
||||
if err := os.WriteFile(certOnly, certPEM, 0o644); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
|
||||
// The unit's exact arg shape: --ca-cert <cert-only> --ca-key <combined>.
|
||||
ca, err := loadCA(certOnly, combined)
|
||||
if err != nil {
|
||||
t.Fatalf("loadCA(cert-only, combined): %v", err)
|
||||
}
|
||||
leaf, err := ca.forge("ads.example.com")
|
||||
if err != nil {
|
||||
t.Fatalf("forge: %v", err)
|
||||
}
|
||||
pool := x509.NewCertPool()
|
||||
pool.AddCert(ca.cert)
|
||||
if _, err := leaf.Leaf.Verify(x509.VerifyOptions{Roots: pool, DNSName: "ads.example.com"}); err != nil {
|
||||
t.Fatalf("forged leaf does not chain to combined-PEM CA: %v", err)
|
||||
}
|
||||
// Belt-and-braces: the combined file works as BOTH cert and key source.
|
||||
if _, err := loadCA(combined, combined); err != nil {
|
||||
t.Fatalf("loadCA(combined, combined): %v", err)
|
||||
}
|
||||
}
|
||||
|
||||
// NOTE (#662 Phase 3): the old TestActionDecision drove the removed hardcoded
|
||||
// Policy{AdHosts, SpliceHosts} fields. The decision surface now loads from
|
||||
// disk (LoadPolicy) and mirrors the Python addons; coverage moved to
|
||||
// TestParityDecide / TestPolicyActionVerbs in policy_test.go.
|
||||
|
||||
func TestInjectMarker(t *testing.T) {
|
||||
p := &Policy{Inject: []byte("<!--SBX-->")}
|
||||
out := string(p.injectMarker([]byte("<html><head></head><body>hi</body></html>")))
|
||||
if !contains(out, "<!--SBX--></head>") {
|
||||
t.Fatalf("marker not injected before </head>: %s", out)
|
||||
}
|
||||
if string(p.injectMarker([]byte(out))) != out {
|
||||
t.Fatal("inject not idempotent")
|
||||
}
|
||||
}
|
||||
|
||||
func contains(s, sub string) bool {
|
||||
for i := 0; i+len(sub) <= len(s); i++ {
|
||||
if s[i:i+len(sub)] == sub {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// TestClientHelloCaptureAndForge: a real localhost TLS handshake proves the Go
|
||||
// core forges a per-SNI cert from the CA that the client trusts AND that the
|
||||
// ClientHello (JA4 material) is captured.
|
||||
func TestClientHelloCaptureAndForge(t *testing.T) {
|
||||
cp, kp := genTestCA(t, t.TempDir())
|
||||
ca, err := loadCA(cp, kp)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
var mu sync.Mutex
|
||||
var captured string
|
||||
px := &Proxy{ca: ca, jaSink: func(s string) { mu.Lock(); captured = s; mu.Unlock() }}
|
||||
|
||||
ln, err := net.Listen("tcp", "127.0.0.1:0")
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
defer ln.Close()
|
||||
go func() {
|
||||
c, err := ln.Accept()
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
s := tls.Server(c, px.serverTLSConfig())
|
||||
s.Handshake()
|
||||
s.Close()
|
||||
}()
|
||||
|
||||
pool := x509.NewCertPool()
|
||||
pool.AddCert(ca.cert)
|
||||
conn, err := tls.Dial("tcp", ln.Addr().String(), &tls.Config{ServerName: "example.com", RootCAs: pool})
|
||||
if err != nil {
|
||||
t.Fatalf("client handshake against forged cert failed (CA not trusted / forge broken): %v", err)
|
||||
}
|
||||
conn.Close()
|
||||
|
||||
mu.Lock()
|
||||
defer mu.Unlock()
|
||||
if captured == "" {
|
||||
t.Fatal("ClientHello not captured")
|
||||
}
|
||||
if !contains(captured, "sni=example.com") {
|
||||
t.Fatalf("JA4 capture missing SNI: %q", captured)
|
||||
}
|
||||
t.Logf("captured JA4-ish: %s", captured)
|
||||
}
|
||||
47
packages/secubox-toolbox-ng/cmd/sbxmitm/poison_gate_test.go
Normal file
|
|
@ -0,0 +1,47 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// Gate tests for the poison emission (#662 Phase 5-prep, Part A): poison only
|
||||
// fires on MITM'd TRACKER flows, never on allow/own-infra flows. This is the
|
||||
// same safety envelope as anti-track — own-infra/allowlist flows stay clean.
|
||||
package main
|
||||
|
||||
import (
|
||||
"path/filepath"
|
||||
"testing"
|
||||
)
|
||||
|
||||
// TestShouldPoisonGate: a tracker host MITM'd → poison; an own-infra/allowlisted
|
||||
// host → never poison (even though both are intercepted = "mitm" verb).
|
||||
func TestShouldPoisonGate(t *testing.T) {
|
||||
pf, dir := loadParityFile(t)
|
||||
cfgPath := func(rel string) string { return filepath.Join(dir, filepath.FromSlash(rel)) }
|
||||
pol, err := LoadPolicy(PolicyOpts{
|
||||
AllowPath: cfgPath(pf.Config.AdAllowlist),
|
||||
LearnedPath: cfgPath(pf.Config.LearnedTrackers),
|
||||
SpliceSeedPath: cfgPath(pf.Config.SpliceSeed),
|
||||
SpliceLearnPath: cfgPath(pf.Config.SpliceLearned),
|
||||
PureTrackersPath: cfgPath(pf.Config.PureTrackers),
|
||||
FortknoxSites: pf.Config.FortknoxSites,
|
||||
SelfDomains: pf.Config.SelfDomains,
|
||||
})
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
|
||||
cases := map[string]bool{
|
||||
// tracker hosts → poison eligible (a tracker we'd otherwise block, but
|
||||
// once MITM'd we poison rather than blunt-block).
|
||||
"ads.doubleclick.net": true,
|
||||
"adnxs.com": true,
|
||||
// own-infra + allowlisted + benign → NEVER poison.
|
||||
"hub.secubox.in": false,
|
||||
"analytics.example-allowed.com": false,
|
||||
"news.example.com": false,
|
||||
}
|
||||
for host, want := range cases {
|
||||
if got := pol.shouldPoison(host); got != want {
|
||||
t.Errorf("shouldPoison(%q)=%v want %v", host, got, want)
|
||||
}
|
||||
}
|
||||
}
|
||||
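The gate tested above decides whether a host is poison-eligible; the cookie rewriting itself (`poisonSetCookies`) is not shown in this part of the diff. As a rough illustration of what a "stable fabricated persona" derived from an HMAC seed might look like, here is a sketch under assumptions: `fakeID` and `poisonEligible` are hypothetical names, and the input layout (client hash, host, cookie name separated by NUL bytes) and the 16-hex-character truncation are choices made for this example only.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fakeID (hypothetical) derives a deterministic replacement value for a
// tracking cookie. The same (key, client, host, cookie) always yields the
// same value, so the tracker sees a consistent but fabricated identity.
func fakeID(key []byte, clientHash, host, cookie string) string {
	m := hmac.New(sha256.New, key)
	// NUL separators keep ("ab","c") distinct from ("a","bc").
	m.Write([]byte(clientHash))
	m.Write([]byte{0})
	m.Write([]byte(host))
	m.Write([]byte{0})
	m.Write([]byte(cookie))
	return hex.EncodeToString(m.Sum(nil))[:16]
}

// poisonEligible mirrors the four-part condition used in mitmPipeline:
// the flow is MITM'd, poison is enabled, a key is loaded, and the policy
// classifies the host as a tracker.
func poisonEligible(verdict string, enabled bool, key []byte, tracker bool) bool {
	return verdict == "mitm" && enabled && len(key) > 0 && tracker
}

func main() {
	key := []byte("demo-seed-not-a-real-key")
	if poisonEligible("mitm", true, key, true) {
		fmt.Println(fakeID(key, "client-a", "ads.doubleclick.net", "IDE"))
	}
	// An empty key disables poisoning regardless of the other inputs.
	fmt.Println(poisonEligible("mitm", true, nil, true))
}
```

Keying the derivation on the client hash and host keeps personas separate per client and per tracker, while the secret seed prevents a tracker from recomputing the mapping.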
369
packages/secubox-toolbox-ng/cmd/sbxmitm/policy.go
Normal file
|
|
@ -0,0 +1,369 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// SecuBox-Deb :: toolbox-ng :: policy layer (#662 Phase 3)
|
||||
//
|
||||
// Ports the toolbox BLOCK (ad_ghost) and SPLICE (tls_splice) decision logic
|
||||
// into the Go core, reading the SAME on-disk config files the Python addons
|
||||
// use. Python is the source of truth; this mirrors it byte-for-byte on the
|
||||
// decision surface, proven by the cross-engine parity harness
|
||||
// (testdata/parity-fixtures.json + policy_test.go ↔ tests/test_engine_parity.py).
|
||||
//
|
||||
// Pure standard library — no external modules, no go.sum.
|
||||
package main
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"os"
|
||||
"regexp"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// ── ad_ghost: static ad/tracker host pattern (port of _AD_HOST) ──────────────
|
||||
//
|
||||
// Python (mitmproxy_addons/ad_ghost.py):
|
||||
//
|
||||
// _AD_HOST = re.compile(
|
||||
// r"(?:^|\.)(?:doubleclick|googlesyndication|googleadservices|"
|
||||
// r"googletagservices|adservice\.google|amazon-adsystem|adnxs|adsrvr|"
|
||||
// r"adform|criteo|rubiconproject|taboola|outbrain|smartadserver|moatads|"
|
||||
// r"scorecardresearch|2mdn|adroll|pubmatic|openx|casalemedia|"
|
||||
// r"yieldlove|sharethrough|teads|3lift|adsystem|adserver)",
|
||||
// re.IGNORECASE)
|
||||
//
|
||||
// Every construct here — non-capturing groups, `^`, `\.`, alternation, the
|
||||
// case-insensitive flag — is RE2-safe, so it translates 1:1 to Go regexp via
|
||||
// the `(?i)` inline flag. No fallback substring split was needed.
|
||||
const adHostPattern = `(?i)(?:^|\.)(?:doubleclick|googlesyndication|googleadservices|` +
|
||||
`googletagservices|adservice\.google|amazon-adsystem|adnxs|adsrvr|` +
|
||||
`adform|criteo|rubiconproject|taboola|outbrain|smartadserver|moatads|` +
|
||||
`scorecardresearch|2mdn|adroll|pubmatic|openx|casalemedia|` +
|
||||
`yieldlove|sharethrough|teads|3lift|adsystem|adserver)`
|
||||
|
||||
// _2L_TLD: two-level public suffixes (port of ad_ghost._2L_TLD).
|
||||
var twoLevelTLD = map[string]bool{
|
||||
"co.uk": true, "com.au": true, "co.jp": true, "co.nz": true,
|
||||
"com.br": true, "co.za": true, "gouv.fr": true,
|
||||
}
|
||||
|
||||
// ── PolicyOpts: configurable file paths (env-overridable, like Python) ───────
|
||||
|
||||
// PolicyOpts holds the on-disk paths the loaders read. Empty fields fall back
|
||||
// to the real production defaults (or the env override) in LoadPolicy.
|
||||
type PolicyOpts struct {
|
||||
AllowPath string // ad-allowlist.txt (_ALLOW_PATH)
|
||||
LearnedPath string // learned-trackers.txt (_LEARNED_PATH)
|
||||
SpliceSeedPath string // conf/tls-splice-seed.conf (SEED_PATH)
|
||||
SpliceLearnPath string // splice-learned.txt (LEARNED_PATH)
|
||||
PureTrackersPath string // pure-trackers.txt (PURE_PATH)
|
||||
FortknoxSites []string // filters.json fortknox_sites
|
||||
SelfDomains []string // _SELF_REGS (default {secubox.in}, env SECUBOX_SELF_DOMAINS)
|
||||
}
|
||||
|
||||
// defaultPolicyOpts returns the production defaults, honoring the same env vars
|
||||
// the Python addons read.
|
||||
func defaultPolicyOpts() PolicyOpts {
|
||||
o := PolicyOpts{
|
||||
AllowPath: "/var/lib/secubox/toolbox/ad-allowlist.txt",
|
||||
LearnedPath: "/var/lib/secubox/toolbox/learned-trackers.txt",
|
||||
SpliceSeedPath: envOr("SECUBOX_SPLICE_SEED", "/usr/lib/secubox/toolbox/conf/tls-splice-seed.conf"),
|
||||
SpliceLearnPath: envOr("SECUBOX_SPLICE_LEARNED", "/var/lib/secubox/toolbox/splice-learned.txt"),
|
||||
PureTrackersPath: envOr("SECUBOX_PURE_TRACKERS", "/var/lib/secubox/toolbox/pure-trackers.txt"),
|
||||
}
|
||||
// _SELF_REGS: env SECUBOX_SELF_DOMAINS (comma-split), default {secubox.in}.
|
||||
self := os.Getenv("SECUBOX_SELF_DOMAINS")
|
||||
if strings.TrimSpace(self) == "" {
|
||||
self = "secubox.in"
|
||||
}
|
||||
for _, d := range strings.Split(self, ",") {
|
||||
if d = strings.TrimSpace(strings.ToLower(d)); d != "" {
|
||||
o.SelfDomains = append(o.SelfDomains, d)
|
||||
}
|
||||
}
|
||||
return o
|
||||
}
|
||||
|
||||
func envOr(key, def string) string {
|
||||
if v := os.Getenv(key); v != "" {
|
||||
return v
|
||||
}
|
||||
return def
|
||||
}
|
||||
|
||||
// ── Policy: the loaded decision state ────────────────────────────────────────

// Policy carries the loaded sets/regex and decides per-host actions. It also
// keeps the legacy PoC fields (Inject) so the existing wiring/tests still work.
type Policy struct {
	adHost      *regexp.Regexp
	learned     map[string]bool // learned-trackers (host or registrable, lowercased)
	allow       map[string]bool // ad-allowlist (host or registrable, lowercased)
	spliceSeed  map[string]bool // splice seed patterns
	spliceLearn map[string]bool // splice learned patterns
	never       map[string]bool // pure-trackers ∪ fortknox (splice never-set)
	selfRegs    map[string]bool // own-infra registrable domains
	selfDomains []string        // own-infra (for the host==d || host endswith .d guard)

	// Legacy PoC fields kept so non-policy behaviour is unchanged.
	Inject []byte // banner / ad-CSS marker injected before </head> or </body>
}

// loadLines mirrors the comment-stripping Python loaders (splice._load_lines,
// ad_ghost._allowed's allowlist read): split on first '#', trim, lowercase,
// skip blanks. Missing/unreadable file → empty set (best-effort).
func loadLines(path string) map[string]bool {
	return scanLines(path, true)
}

// loadLinesRaw mirrors ad_ghost._learned_set, which does NOT comment-strip —
// learned-trackers.txt is a machine-generated one-host-per-line file. It does
// `{ln.strip().lower() for ln in f if ln.strip()}`. Matching this exactly is
// load-bearing for parity (a '#' in this file would be kept verbatim, not a
// comment), so the Go core must mirror the divergent behaviour, not normalise it.
func loadLinesRaw(path string) map[string]bool {
	return scanLines(path, false)
}

func scanLines(path string, stripComments bool) map[string]bool {
	out := map[string]bool{}
	f, err := os.Open(path)
	if err != nil {
		return out
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	sc.Buffer(make([]byte, 0, 64*1024), 1<<20)
	for sc.Scan() {
		ln := sc.Text()
		if stripComments {
			if i := strings.IndexByte(ln, '#'); i >= 0 {
				ln = ln[:i]
			}
		}
		ln = strings.ToLower(strings.TrimSpace(ln))
		if ln != "" {
			out[ln] = true
		}
	}
	return out
}

// LoadPolicy loads all backing files from opts (defaults applied for empty
// fields) and compiles the ad-host regex. It never returns an error for missing
// files (best-effort, like the Python addons), only for a regex-compile bug.
func LoadPolicy(opts PolicyOpts) (*Policy, error) {
	def := defaultPolicyOpts()
	if opts.AllowPath == "" {
		opts.AllowPath = def.AllowPath
	}
	if opts.LearnedPath == "" {
		opts.LearnedPath = def.LearnedPath
	}
	if opts.SpliceSeedPath == "" {
		opts.SpliceSeedPath = def.SpliceSeedPath
	}
	if opts.SpliceLearnPath == "" {
		opts.SpliceLearnPath = def.SpliceLearnPath
	}
	if opts.PureTrackersPath == "" {
		opts.PureTrackersPath = def.PureTrackersPath
	}
	if len(opts.SelfDomains) == 0 {
		opts.SelfDomains = def.SelfDomains
	}

	re, err := regexp.Compile(adHostPattern)
	if err != nil {
		return nil, err
	}

	// never-set = pure-trackers ∪ fortknox_sites (mirrors TlsSplice._refresh_sets).
	never := loadLines(opts.PureTrackersPath)
	for _, s := range opts.FortknoxSites {
		if s = strings.Trim(strings.ToLower(strings.TrimSpace(s)), "."); s != "" {
			never[s] = true
		}
	}

	selfRegs := map[string]bool{}
	selfDomains := make([]string, 0, len(opts.SelfDomains))
	for _, d := range opts.SelfDomains {
		d = strings.ToLower(strings.TrimSpace(d))
		if d == "" {
			continue
		}
		selfRegs[d] = true
		selfDomains = append(selfDomains, d)
	}

	return &Policy{
		adHost:      re,
		learned:     loadLinesRaw(opts.LearnedPath), // mirrors _learned_set (no comment-strip)
		allow:       loadLines(opts.AllowPath),
		spliceSeed:  loadLines(opts.SpliceSeedPath),
		spliceLearn: loadLines(opts.SpliceLearnPath),
		never:       never,
		selfRegs:    selfRegs,
		selfDomains: selfDomains,
	}, nil
}

// ── registrable: port of ad_ghost._registrable ───────────────────────────────
//
//	host = host.split(":")[0].lower().strip(".")
//	if not host or host.replace(".","").isdigit() or ":" in host: return None
//	p = host.split(".")
//	if len(p) <= 2: return host
//	last2 = ".".join(p[-2:])
//	return ".".join(p[-3:]) if (last2 in _2L_TLD and len(p) >= 3) else last2
func registrable(host string) string {
	host = strings.ToLower(host)
	if i := strings.IndexByte(host, ':'); i >= 0 {
		host = host[:i]
	}
	host = strings.Trim(host, ".")
	if host == "" {
		return ""
	}
	// host.replace(".","").isdigit() → all-digit IPv4-ish → no registrable.
	if isAllDigits(strings.ReplaceAll(host, ".", "")) {
		return ""
	}
	// The Python checks ":" in host AFTER stripping the port. Because both
	// sides cut at the FIRST colon, no colon can remain at this point, so this
	// branch cannot fire; it is kept only to mirror the Python line for line.
	if strings.IndexByte(host, ':') >= 0 {
		return ""
	}
	p := strings.Split(host, ".")
	if len(p) <= 2 {
		return host
	}
	last2 := strings.Join(p[len(p)-2:], ".")
	if twoLevelTLD[last2] && len(p) >= 3 {
		return strings.Join(p[len(p)-3:], ".")
	}
	return last2
}

func isAllDigits(s string) bool {
	if s == "" {
		return false // Python "".isdigit() is False
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// ── splice helpers: port of splice.host_matches / should_splice ──────────────

// hostMatches: True if host == pattern OR host is a dotted-suffix subdomain.
func hostMatches(host string, patterns map[string]bool) bool {
	h := strings.Trim(strings.ToLower(host), ".")
	if h == "" || len(patterns) == 0 {
		return false
	}
	if patterns[h] {
		return true
	}
	for p := range patterns {
		if strings.HasSuffix(h, "."+p) {
			return true
		}
	}
	return false
}

// allowed: port of ad_ghost._allowed. Own-infra ALWAYS wins (reflash-safe),
// then the operator allowlist (host or registrable).
func (p *Policy) allowed(host string) bool {
	h := strings.ToLower(host)
	reg := registrable(h)
	if reg == "" {
		reg = h
	}
	// own infra: registrable in selfRegs, OR host == d || host endswith "."+d.
	if p.selfRegs[reg] {
		return true
	}
	for _, d := range p.selfDomains {
		if h == d || strings.HasSuffix(h, "."+d) {
			return true
		}
	}
	return p.allow[h] || p.allow[reg]
}

// shouldSplice: port of splice.should_splice (never wins; then seed ∪ learned).
func (p *Policy) shouldSplice(sni string) bool {
	s := strings.Trim(strings.ToLower(sni), ".")
	if s == "" {
		return false
	}
	if hostMatches(s, p.never) {
		return false
	}
	return hostMatches(s, p.spliceSeed) || hostMatches(s, p.spliceLearn)
}

// blockedByAd: port of the ad_ghost requestheaders block decision (sans the
// allowlist guard, which Decide applies first): _AD_HOST match OR
// registrable/host in learned-trackers.
func (p *Policy) blockedByAd(host string) bool {
	if p.adHost.MatchString(host) {
		return true
	}
	reg := registrable(host)
	if reg != "" && p.learned[reg] {
		return true
	}
	return p.learned[strings.ToLower(host)]
}

// ── Decide: the unified cross-engine decision ────────────────────────────────
//
// action ∈ {"allow","block","splice","mitm"}. Precedence (mirrors the Python
// across the two addons, documented in the harness):
//
//  1. own-infra / allowlist → "allow" (ad_ghost._allowed; never block/splice)
//  2. splice never-set check, then seed/learned → "splice"
//     (tls_splice runs FIRST at the TLS layer; should_splice already excludes
//     the never-set = pure-trackers ∪ fortknox, so a tracker that is also a
//     splice candidate fails should_splice here and falls through to block)
//  3. _AD_HOST / learned → "block" (ad_ghost requestheaders, request layer)
//  4. otherwise → "mitm"
//
// sni defaults to host when empty (the live engine splices on SNI == the TLS
// host; for the parity harness host and sni are the same value).
func (p *Policy) Decide(host, sni string) string {
	if sni == "" {
		sni = host
	}
	if p.allowed(host) {
		return "allow"
	}
	if p.shouldSplice(sni) {
		return "splice"
	}
	if p.blockedByAd(host) {
		return "block"
	}
	return "mitm"
}

// action keeps the legacy 3-verb surface (block/splice/mitm) for the PoC
// CONNECT wiring, derived from Decide: "allow" collapses to "mitm" (an
// allowlisted host is intercepted normally, just never short-circuited).
func (p *Policy) action(host string) string {
	switch p.Decide(host, host) {
	case "splice":
		return "splice"
	case "block":
		return "block"
	default: // "allow" and "mitm" both → normal interception
		return "mitm"
	}
}
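The precedence documented on `Decide` (allow, then splice unless the host is in the never-set, then block, then mitm) can be traced with a much smaller model. The following is only a sketch under stated assumptions: `miniPolicy`, `matches`, `demoPolicy` and every domain in it are hypothetical, the sets are plain maps, and the regex and registrable-domain logic of the real `Policy` are deliberately left out (the real allowlist and learned-tracker lookups are exact host/registrable matches, not suffix matches).

```go
package main

import (
	"fmt"
	"strings"
)

// miniPolicy is a hypothetical, trimmed-down stand-in for Policy: plain sets
// only, no ad-host regex and no registrable-domain handling.
type miniPolicy struct {
	allow, never, splice, trackers map[string]bool
}

// matches reports whether host equals a pattern or is a dotted-suffix
// subdomain of one (the same rule hostMatches applies).
func matches(host string, set map[string]bool) bool {
	h := strings.Trim(strings.ToLower(host), ".")
	if h == "" {
		return false
	}
	if set[h] {
		return true
	}
	for p := range set {
		if strings.HasSuffix(h, "."+p) {
			return true
		}
	}
	return false
}

// decide applies the documented order: allow first, then splice (only if the
// host is not in the never-set), then block, and mitm as the fallback.
func (p miniPolicy) decide(host string) string {
	switch {
	case matches(host, p.allow):
		return "allow"
	case !matches(host, p.never) && matches(host, p.splice):
		return "splice"
	case matches(host, p.trackers):
		return "block"
	default:
		return "mitm"
	}
}

// demoPolicy builds illustrative sets; none of these domains come from the
// real configuration files.
func demoPolicy() miniPolicy {
	return miniPolicy{
		allow:    map[string]bool{"secubox.in": true},
		never:    map[string]bool{"tracker.example": true},
		splice:   map[string]bool{"video.example": true, "tracker.example": true},
		trackers: map[string]bool{"tracker.example": true},
	}
}

func main() {
	p := demoPolicy()
	for _, h := range []string{"hub.secubox.in", "cdn.video.example", "px.tracker.example", "news.example"} {
		fmt.Printf("%-20s %s\n", h, p.decide(h))
	}
}
```

Note how `tracker.example` sits in both the splice set and the never-set: the never-set wins, so the host falls through to the block step instead of being spliced, which is the case the `Decide` comment calls out.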
142
packages/secubox-toolbox-ng/cmd/sbxmitm/policy_test.go
Normal file
@ -0,0 +1,142 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// Cross-engine parity harness — Go side (#662 Phase 3).
//
// Loads testdata/parity-fixtures.json + the testdata/config snapshot, runs
// Policy.Decide on each host, and asserts == the fixture's expect. The Python
// side (../secubox-toolbox/tests/test_engine_parity.py) loads the SAME files
// and drives the SAME decision; both must agree → parity proven.
package main

import (
	"encoding/json"
	"os"
	"path/filepath"
	"testing"
)

type parityConfig struct {
	AdAllowlist     string   `json:"ad_allowlist"`
	LearnedTrackers string   `json:"learned_trackers"`
	SpliceSeed      string   `json:"splice_seed"`
	SpliceLearned   string   `json:"splice_learned"`
	PureTrackers    string   `json:"pure_trackers"`
	SelfDomains     []string `json:"self_domains"`
	FortknoxSites   []string `json:"fortknox_sites"`
}

type parityFixture struct {
	Host   string `json:"host"`
	Expect string `json:"expect"`
	Why    string `json:"why"`
}

type parityFile struct {
	Config   parityConfig    `json:"config"`
	Fixtures []parityFixture `json:"fixtures"`
}

// testdataDir resolves the testdata/ dir relative to this package
// (cmd/sbxmitm → ../../testdata).
func testdataDir(t *testing.T) string {
	t.Helper()
	d, err := filepath.Abs(filepath.Join("..", "..", "testdata"))
	if err != nil {
		t.Fatal(err)
	}
	return d
}

func loadParityFile(t *testing.T) (parityFile, string) {
	t.Helper()
	dir := testdataDir(t)
	raw, err := os.ReadFile(filepath.Join(dir, "parity-fixtures.json"))
	if err != nil {
		t.Fatalf("read fixtures: %v", err)
	}
	var pf parityFile
	if err := json.Unmarshal(raw, &pf); err != nil {
		t.Fatalf("parse fixtures: %v", err)
	}
	if len(pf.Fixtures) == 0 {
		t.Fatal("no fixtures")
	}
	return pf, dir
}

func TestParityDecide(t *testing.T) {
	pf, dir := loadParityFile(t)
	cfgPath := func(rel string) string { return filepath.Join(dir, filepath.FromSlash(rel)) }

	pol, err := LoadPolicy(PolicyOpts{
		AllowPath:        cfgPath(pf.Config.AdAllowlist),
		LearnedPath:      cfgPath(pf.Config.LearnedTrackers),
		SpliceSeedPath:   cfgPath(pf.Config.SpliceSeed),
		SpliceLearnPath:  cfgPath(pf.Config.SpliceLearned),
		PureTrackersPath: cfgPath(pf.Config.PureTrackers),
		FortknoxSites:    pf.Config.FortknoxSites,
		SelfDomains:      pf.Config.SelfDomains,
	})
	if err != nil {
		t.Fatalf("LoadPolicy: %v", err)
	}

	for _, fx := range pf.Fixtures {
		got := pol.Decide(fx.Host, fx.Host)
		if got != fx.Expect {
			t.Errorf("Decide(%q)=%q want %q (%s)", fx.Host, got, fx.Expect, fx.Why)
		}
	}
}

// TestPolicyActionVerbs checks the legacy 3-verb action() surface still wired
// into the PoC CONNECT path: allow collapses to mitm; block/splice preserved.
func TestPolicyActionVerbs(t *testing.T) {
	pf, dir := loadParityFile(t)
	cfgPath := func(rel string) string { return filepath.Join(dir, filepath.FromSlash(rel)) }
	pol, err := LoadPolicy(PolicyOpts{
		AllowPath:        cfgPath(pf.Config.AdAllowlist),
		LearnedPath:      cfgPath(pf.Config.LearnedTrackers),
		SpliceSeedPath:   cfgPath(pf.Config.SpliceSeed),
		SpliceLearnPath:  cfgPath(pf.Config.SpliceLearned),
		PureTrackersPath: cfgPath(pf.Config.PureTrackers),
		FortknoxSites:    pf.Config.FortknoxSites,
		SelfDomains:      pf.Config.SelfDomains,
	})
	if err != nil {
		t.Fatal(err)
	}
	cases := map[string]string{
		"ads.doubleclick.net":           "block",
		"r1.googlevideo.com":            "splice",
		"news.example.com":              "mitm",
		"notdoubleclick.net":            "mitm",
		"analytics.example-allowed.com": "mitm", // allow → normal interception (mitm verb)
		"hub.secubox.in":                "mitm", // own-infra → normal interception
	}
	for host, want := range cases {
		if got := pol.action(host); got != want {
			t.Errorf("action(%q)=%q want %q", host, got, want)
		}
	}
}

// TestRegistrable exercises the _registrable port incl. the 2-level TLD list.
func TestRegistrable(t *testing.T) {
	cases := map[string]string{
		"a.b.example.com":   "example.com",
		"example.com":       "example.com",
		"com":               "com",
		"a.b.example.co.uk": "example.co.uk",
		"example.co.uk":     "example.co.uk", // 3 labels; co.uk is a 2-level TLD → last 3 labels kept
		"x.y.z.example.com": "example.com",
		"1.2.3.4":           "",
		"":                  "",
	}
	for in, want := range cases {
		if got := registrable(in); got != want {
			t.Errorf("registrable(%q)=%q want %q", in, got, want)
		}
	}
}
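The cases in `TestRegistrable` are easier to follow next to a compact version of the rule they exercise. This is a sketch, not the shipped function: `registrableSketch` and the two-entry `twoLevel` table are hypothetical stand-ins (the real `twoLevelTLD` table is defined elsewhere in policy.go and is not shown here), and the digit check is written with `strings.Trim` rather than the `isAllDigits` helper.

```go
package main

import (
	"fmt"
	"strings"
)

// twoLevel is a hypothetical, deliberately tiny stand-in for the real
// twoLevelTLD table.
var twoLevel = map[string]bool{"co.uk": true, "com.au": true}

// registrableSketch keeps the last two labels of a host, or the last three
// when the last two form a known two-level TLD. It returns "" for empty
// input and for dotted all-digit hosts (IPv4 literals).
func registrableSketch(host string) string {
	host = strings.ToLower(host)
	if i := strings.IndexByte(host, ':'); i >= 0 {
		host = host[:i] // drop a port suffix
	}
	host = strings.Trim(host, ".")
	// After trimming dots, a host made only of digits and dots is IPv4-ish.
	if host == "" || strings.Trim(host, "0123456789.") == "" {
		return ""
	}
	p := strings.Split(host, ".")
	if len(p) <= 2 {
		return host
	}
	last2 := strings.Join(p[len(p)-2:], ".")
	if twoLevel[last2] {
		return strings.Join(p[len(p)-3:], ".")
	}
	return last2
}

func main() {
	for _, h := range []string{"a.b.example.com", "example.co.uk", "a.b.example.co.uk", "1.2.3.4", "Example.COM:443"} {
		fmt.Printf("%-20q -> %q\n", h, registrableSketch(h))
	}
}
```

The `example.co.uk` case is the instructive one: it has three labels, so it does not take the "two labels or fewer" shortcut; it survives intact only because `co.uk` is in the two-level table.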
194
packages/secubox-toolbox-ng/cmd/sbxmitm/privacy.go
Normal file
@ -0,0 +1,194 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// SecuBox-Deb :: toolbox-ng :: always-on anonymize + Set-Cookie poison wiring
// (#662 Phase 5-prep, Part A)
//
// These helpers wire the ported policy (policy.go) + HMAC fake-identity jar
// (jar.go) into the MITM response path. They mirror the INTENT of the Python
// privacy_guard._anonymize and privacy.fake_id poison (mitmproxy_addons/
// privacy_guard.py, secubox_toolbox/privacy.py) — best-effort privacy hygiene,
// NOT byte-identical to the Python request-Cookie path. The jar values
// themselves ARE byte-exact (proven in jar_test.go).
//
// Safety envelope (DARK, like anti-track): poison only acts on MITM'd TRACKER
// flows. allow/own-infra flows are left CLEAN — never poisoned, never blocked.
//
// Pure standard library — no external modules.
package main

import (
	"net"
	"net/http"
	"strings"
)

// ── anonymize: always-on hygiene ─────────────────────────────────────────────

// anonymizeStrip mirrors privacy_guard._STRIP / protective_mode._STRIP: the
// operator/carrier + re-identification REQUEST headers we drop on every MITM'd
// flow. Lower-cased for case-insensitive matching against canonicalised keys.
var anonymizeStrip = []string{
	"msisdn", "x-msisdn", "x-up-calling-line-id", "x-up-subno",
	"x-nokia-msisdn", "x-acr", "x-vf-acr", "x-amobee-1", "x-amobee-2",
	"tm-user-id", "x-wap-profile", "x-wap-msisdn", "x-network-info",
	"x-forwarded-for", "forwarded", "x-real-ip", "via",
}

// anonymizeRequest applies always-on privacy hygiene to a MITM'd request:
// drop the operator/tracking headers above, then pin DNT:1 + Sec-GPC:1 (the
// opt-out signals). Mirrors privacy_guard._anonymize. Minimal + best-effort:
// it never errors and is safe to call on every intercepted request.
//
// NOTE: unlike the Python spoof path we do NOT drop Cookie/Referer here —
// anonymize is the universally-safe hygiene layer; cookie neutralisation is the
// poison layer (poisonSetCookies), gated behind the tracker classification.
func anonymizeRequest(h http.Header) {
	for _, name := range anonymizeStrip {
		// http.Header.Del canonicalises the key; our list is lower-case but Del
		// matches case-insensitively via CanonicalMIMEHeaderKey.
		h.Del(name)
	}
	h.Set("DNT", "1")
	h.Set("Sec-GPC", "1")
}

// ── poison: response Set-Cookie value replacement ────────────────────────────

// trackingCookieNames is the set of exact cookie names we treat as tracking
// identifiers worth poisoning (lower-cased). These map onto the shapes the jar
// (_shape in jar.go) knows how to forge plausibly.
var trackingCookieNames = map[string]bool{
	"_fbp": true, "_fbc": true, "_gid": true, "_gcl_au": true,
	"uid": true, "uuid": true, "_pk_id": true, "_pk_ses": true,
	"__qca": true, "muid": true, "ide": true, "fr": true,
	"_uetvid": true, "_uetsid": true, "anid": true, "nid": true,
}

// isTrackingCookieName reports whether a Set-Cookie name looks like a tracking
// identifier we should poison. Prefix rule: any "_ga*" cookie (GA + GA4
// per-property _ga_<id>) is a tracking id; otherwise an exact-match against
// trackingCookieNames. Benign session/CSRF cookies (sessionid, csrftoken, …)
// are NOT matched, so they pass through untouched.
func isTrackingCookieName(name string) bool {
	n := strings.ToLower(strings.TrimSpace(name))
	if n == "" {
		return false
	}
	if strings.HasPrefix(n, "_ga") {
		return true
	}
	return trackingCookieNames[n]
}

// poisonSetCookies rewrites the response Set-Cookie header lines for a MITM'd
// tracker flow: for each cookie whose NAME is a tracking id, the value is
// replaced with the jar fakeID(clientHash, host, name, key) while ALL cookie
// attributes (Path, Domain, Max-Age, Secure, HttpOnly, SameSite, …) are
// preserved verbatim. Non-tracking cookies are returned byte-identical.
//
// Gating (caller's responsibility too, but defensive here): if the jar key is
// absent OR fakeID returns !ok (empty clientHash / tracker), the cookie is left
// UNCHANGED — we never emit a malformed cookie, and we never invent a fake
// where we lack the seed. This keeps the poison fail-closed-to-clean.
//
// This is the emission half of the jar; the classification half (is this a
// tracker flow at all) is Policy.shouldPoison, applied by the wiring before
// this is ever called — poison NEVER touches allow/own-infra flows.
func poisonSetCookies(setCookies []string, clientHash, host string, key []byte) []string {
	if len(setCookies) == 0 {
		return setCookies
	}
	out := make([]string, len(setCookies))
	for i, sc := range setCookies {
		out[i] = poisonOneSetCookie(sc, clientHash, host, key)
	}
	return out
}

// poisonOneSetCookie rewrites a single Set-Cookie line. The line shape is
// `name=value; Attr1; Attr2=...`; we split off the first `;` to isolate the
// name=value pair, replace value if name is a tracking id and a fake mints,
// then re-attach the (unchanged) attribute tail.
func poisonOneSetCookie(sc, clientHash, host string, key []byte) string {
	semi := strings.IndexByte(sc, ';')
	pair := sc
	tail := ""
	if semi >= 0 {
		pair = sc[:semi]
		tail = sc[semi:] // includes the leading ';'
	}
	eq := strings.IndexByte(pair, '=')
	if eq < 0 {
		return sc // attribute-only / malformed → leave untouched
	}
	name := strings.TrimSpace(pair[:eq])
	if !isTrackingCookieName(name) {
		return sc
	}
	fake, ok := fakeID(clientHash, host, name, key)
	if !ok {
		return sc // no jar key / no clientHash → leave clean (fail-closed)
	}
	return name + "=" + fake + tail
}

// ── tracker classification + poison gate ─────────────────────────────────────

// isTracker mirrors the tracker classification used by the block decision
// (privacy.is_tracker / ad_ghost): _AD_HOST regex OR host/registrable in the
// learned-trackers set. Reused here so poison fires on exactly the hosts the
// engine already considers trackers.
func (p *Policy) isTracker(host string) bool {
	return p.blockedByAd(host)
}

// shouldPoison reports whether a MITM'd flow to host should have its tracking
// Set-Cookies poisoned. TRUE only for tracker hosts that are NOT own-infra /
// allowlisted — own-infra flows are left clean (same dark safety as the block
// path). The caller additionally requires a loaded jar key.
func (p *Policy) shouldPoison(host string) bool {
	if p.allowed(host) {
		return false // own-infra / allowlist → never poison
	}
	return p.isTracker(host)
}

// ── client identity ──────────────────────────────────────────────────────────

// clientHashFromConn returns the per-client identity used to mint the stable
// fake persona (jar fakeID first arg).
//
// It mirrors the Python privacy_guard._client_hash → _common.mac_hash_of(peer_ip)
// for the WireGuard R3 path: the peer IP is resolved to the WG persona hash
// (sha256(peer_pubkey)[:16]) by macHashOf. For 10.99.1.0/24 WG peers that hash
// is byte-identical to the Python engine (proven in machash_test.go ↔
// test_machash_parity.py), so a flow's fake persona is stable across the Go and
// Python engines and across restarts.
//
// macHashOf returns "" for any IP it cannot resolve (non-WG peers, the captive
// R0-R2 ARP path which is out of scope for this R3 engine, missing WG DB). In
// that case we fall back to the raw peer IP so non-WG / test conns still get a
// deterministic seed and poison remains functional — the fallback value is just
// not cross-engine-stable, which is acceptable for non-R3 traffic.
//
// DONE(#662): mac_hash wiring for the WG path. Remaining gaps, intentionally NOT
// addressed here:
//   - the transparent original-dst plumbing that feeds the *real* peer IP into
//     this function lives in transparent.go (handleTransparent); the CONNECT PoC
//     still sees the proxy-hop peer IP.
//   - the R0-R2 captive-subnet ARP/HMAC branch of _common.mac_hash_of is out of
//     scope (this engine is WG-only — see machash.go macHashOf).
func clientHashFromConn(conn net.Conn) string {
	if conn == nil {
		return ""
	}
	host, _, err := net.SplitHostPort(conn.RemoteAddr().String())
	if err != nil {
		host = conn.RemoteAddr().String()
	}
	if mh := macHashOf(host); mh != "" {
		return mh
	}
	return host
}
152
packages/secubox-toolbox-ng/cmd/sbxmitm/privacy_test.go
Normal file
@ -0,0 +1,152 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
//
// Unit tests for the always-on anonymize hygiene + the Set-Cookie poison
// emission wired into the MITM response path (#662 Phase 5-prep, Part A).
//
// These exercise the PURE helpers (anonymizeRequest / poisonSetCookies /
// isTrackingCookieName) so the wiring is testable without standing up a full
// proxy. The behaviour mirrors the Python privacy_guard._anonymize and the
// privacy.fake_id poison intent (see comments in privacy.go) — best-effort
// hygiene, not byte-identical to the request-Cookie path.
package main

import (
	"net/http"
	"testing"
)

// TestAnonymizeRequestStripsOperatorHeaders: the operator/carrier + re-id
// headers are dropped, and DNT:1 + Sec-GPC:1 are pinned (mirrors
// privacy_guard._anonymize / protective_mode spoof header hygiene).
func TestAnonymizeRequestStripsOperatorHeaders(t *testing.T) {
	h := http.Header{}
	h.Set("X-MSISDN", "33612345678")
	h.Set("X-ACR", "carrier-acr-token")
	h.Set("X-Up-Calling-Line-Id", "33612345678")
	h.Set("X-Wap-Profile", "http://wap.example/ua.xml")
	h.Set("X-Forwarded-For", "10.0.0.7")
	h.Set("Via", "1.1 carrier-proxy")
	h.Set("User-Agent", "Mozilla/5.0") // must survive

	anonymizeRequest(h)

	for _, k := range []string{
		"X-Msisdn", "X-Acr", "X-Up-Calling-Line-Id", "X-Wap-Profile",
		"X-Forwarded-For", "Via",
	} {
		if v := h.Get(k); v != "" {
			t.Errorf("anonymizeRequest left %s=%q (should be stripped)", k, v)
		}
	}
	if h.Get("User-Agent") != "Mozilla/5.0" {
		t.Errorf("anonymizeRequest clobbered a benign header: User-Agent=%q", h.Get("User-Agent"))
	}
	if h.Get("DNT") != "1" {
		t.Errorf("DNT not pinned: %q", h.Get("DNT"))
	}
	if h.Get("Sec-GPC") != "1" {
		t.Errorf("Sec-GPC not pinned: %q", h.Get("Sec-GPC"))
	}
}

// TestAnonymizeRequestPinsSignalsWhenAbsent: DNT/Sec-GPC are asserted even when
// no operator headers were present (always-on hygiene).
func TestAnonymizeRequestPinsSignalsWhenAbsent(t *testing.T) {
	h := http.Header{}
	anonymizeRequest(h)
	if h.Get("DNT") != "1" || h.Get("Sec-GPC") != "1" {
		t.Fatalf("opt-out signals not pinned on a clean request: DNT=%q GPC=%q",
			h.Get("DNT"), h.Get("Sec-GPC"))
	}
}

// TestIsTrackingCookieName: known tracking-id cookie names are recognised;
// benign session/CSRF cookies are not.
func TestIsTrackingCookieName(t *testing.T) {
	track := []string{"_ga", "_GA_ABC123", "_fbp", "_gid", "uid", "uuid", "_pk_id", "__qca", "_gcl_au"}
	for _, n := range track {
		if !isTrackingCookieName(n) {
			t.Errorf("isTrackingCookieName(%q)=false, want true", n)
		}
	}
	benign := []string{"sessionid", "csrftoken", "XSRF-TOKEN", "PHPSESSID", "cart", "lang"}
	for _, n := range benign {
		if isTrackingCookieName(n) {
			t.Errorf("isTrackingCookieName(%q)=true, want false", n)
		}
	}
}

// TestPoisonSetCookiesReplacesTrackingValue: a tracking Set-Cookie has its value
// replaced by the jar fakeID (attributes preserved), while a non-tracking cookie
// is left byte-identical.
func TestPoisonSetCookiesReplacesTrackingValue(t *testing.T) {
	key := []byte("test-jar-seed-key-0123456789abcdef")
	const ch = "203.0.113.9"
	const host = "ads.doubleclick.net"

	in := []string{
		"_ga=GA1.2.111.222; Path=/; Domain=.doubleclick.net; Max-Age=63072000",
		"sessionid=abc123; Path=/; HttpOnly",
	}
	out := poisonSetCookies(in, ch, host, key)
	if len(out) != 2 {
		t.Fatalf("poisonSetCookies returned %d cookies, want 2", len(out))
	}

	// The _ga value must be the jar fakeID and the attributes preserved.
	want, ok := fakeID(ch, host, "_ga", key)
	if !ok {
		t.Fatal("fakeID returned !ok for _ga")
	}
	wantCookie := "_ga=" + want + "; Path=/; Domain=.doubleclick.net; Max-Age=63072000"
	if out[0] != wantCookie {
		t.Errorf("poisoned _ga = %q\n want %q", out[0], wantCookie)
	}
	if out[0] == in[0] {
		t.Error("tracking cookie value was NOT changed")
	}

	// The benign cookie must be untouched.
	if out[1] != in[1] {
		t.Errorf("non-tracking cookie altered: %q != %q", out[1], in[1])
	}
}

// TestPoisonSetCookiesNoKeyLeavesUnchanged: with no jar key (the key-present
// gate), nothing is poisoned (fail-closed-to-clean: we never emit a broken cookie).
func TestPoisonSetCookiesNoKeyLeavesUnchanged(t *testing.T) {
	in := []string{"_ga=GA1.2.1.2; Path=/"}
	out := poisonSetCookies(in, "1.2.3.4", "ads.doubleclick.net", nil)
	if len(out) != 1 || out[0] != in[0] {
		t.Fatalf("poisonSetCookies with nil key altered output: %v", out)
	}
}

// TestPoisonSetCookiesNoClientHashLeavesUnchanged: empty clientHash → fakeID !ok
// → cookie left as-is.
func TestPoisonSetCookiesNoClientHashLeavesUnchanged(t *testing.T) {
	key := []byte("test-jar-seed-key-0123456789abcdef")
	in := []string{"_ga=GA1.2.1.2; Path=/"}
	out := poisonSetCookies(in, "", "ads.doubleclick.net", key)
	if len(out) != 1 || out[0] != in[0] {
		t.Fatalf("poisonSetCookies with empty clientHash altered output: %v", out)
	}
}

// TestPoisonSetCookiesDeterministic: same (client,host,name) → same fake value
// across calls (the persistent, 'rémanent', jar is proven byte-exact in
// jar_test.go; here we just assert the wiring keeps it stable).
func TestPoisonSetCookiesDeterministic(t *testing.T) {
	key := []byte("test-jar-seed-key-0123456789abcdef")
	in := []string{"uid=real-user-7; Path=/"}
	a := poisonSetCookies(in, "9.9.9.9", "adnxs.com", key)
	b := poisonSetCookies(in, "9.9.9.9", "adnxs.com", key)
	if a[0] != b[0] {
		t.Fatalf("poison not deterministic: %q != %q", a[0], b[0])
	}
	if a[0] == in[0] {
		t.Fatal("uid (tracking) cookie not poisoned")
	}
}
93
packages/secubox-toolbox-ng/cmd/sbxmitm/sidecar.go
Normal file
@ -0,0 +1,93 @@
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// SecuBox-Deb :: toolbox-ng :: sidecar emit helper (#662 Phase 4)
|
||||
//
|
||||
// Fire-and-forget POST to a unix-socket'd SecuBox module, mirroring the Python
|
||||
// addons' _common.fire_forget_post: it NEVER blocks the proxy flow and NEVER
|
||||
// raises into the caller. The live engine will relay extracted signals to the
|
||||
// existing module sockets; this is the transport only — NOT yet wired into the
|
||||
// live request/response path (Phase 5+ wiring).
|
||||
//
|
||||
// Addon → socket mapping the live engine will use (verbatim from the Python
|
||||
// addons' TARGET constants, packages/secubox-toolbox/mitmproxy_addons/*.py):
|
||||
//
|
||||
// addon socket path route
|
||||
// cookies → /run/secubox/cookies.sock POST /inject
|
||||
// dpi → /run/secubox/dpi.sock POST /classify
|
||||
// avatar → /run/secubox/avatar.sock POST /fingerprint
|
||||
// ja4 → /run/secubox/threat-analyst.sock POST /ja4
|
||||
// soc_relay → /run/secubox/soc.sock POST /event
|
||||
// social_graph: in-process (no socket) — correlated inside the engine, not emitted.
|
||||
//
|
||||
// emit takes the full socket PATH (not an http+unix:// URL) and the route as a
// separate argument; callers pick both from the table above.
|
||||
//
|
||||
// Pure standard library — no external modules, no go.sum.
|
||||
package main
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"net"
|
||||
"time"
|
||||
)
|
||||
|
||||
// emitTimeout caps the whole connect+write+read so a slow/dead module socket
|
||||
// can never wedge the engine. Mirrors the Python httpx timeout=2.
|
||||
const emitTimeout = 2 * time.Second
|
||||
|
||||
// emit fires a fire-and-forget POST of payload to the given unix socket at
|
||||
// route, in a detached goroutine. It returns immediately and never blocks the
|
||||
// caller; all errors (missing socket, dead peer, timeout) are swallowed —
|
||||
// dropping a relayed signal must never break a client flow. Mirrors
|
||||
// _common.fire_forget_post + queue_async (create_task, never raise).
|
||||
//
|
||||
// route is the HTTP path on the module (e.g. "/inject", "/classify"); use the
|
||||
// addon→socket table above to pick socketPath + route together.
|
||||
func emit(socketPath, route string, payload []byte) {
|
||||
go emitSync(socketPath, route, payload)
|
||||
}
|
||||
|
||||
// emitSync performs the actual POST synchronously (under emitTimeout). It is
// split out from emit (and stays package-private) so tests can observe delivery
// deterministically without racing the goroutine. The returned error exists
// only for the tests' benefit; emit() discards it.
|
||||
func emitSync(socketPath, route string, payload []byte) error {
|
||||
if route == "" {
|
||||
route = "/"
|
||||
}
|
||||
ctx, cancel := context.WithTimeout(context.Background(), emitTimeout)
|
||||
defer cancel()
|
||||
|
||||
var d net.Dialer
|
||||
conn, err := d.DialContext(ctx, "unix", socketPath)
|
||||
if err != nil {
|
||||
return err // dead/missing socket — swallowed by emit()
|
||||
}
|
||||
defer conn.Close()
|
||||
|
||||
if dl, ok := ctx.Deadline(); ok {
|
||||
_ = conn.SetDeadline(dl)
|
||||
}
|
||||
|
||||
// Minimal HTTP/1.1 POST. Host is a placeholder (unix transport); the module
|
||||
// FastAPI apps ignore it. Connection: close so the peer EOFs after replying.
|
||||
req := fmt.Sprintf(
|
||||
"POST %s HTTP/1.1\r\nHost: secubox.local\r\nContent-Type: application/json\r\n"+
|
||||
"Content-Length: %d\r\nConnection: close\r\n\r\n",
|
||||
route, len(payload))
|
||||
if _, err := conn.Write([]byte(req)); err != nil {
|
||||
return err
|
||||
}
|
||||
if len(payload) > 0 {
|
||||
if _, err := conn.Write(payload); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
// Best-effort drain so the peer sees a clean close; we don't parse the
|
||||
// response (fire-and-forget). Errors here are irrelevant.
|
||||
buf := make([]byte, 512)
|
||||
_, _ = conn.Read(buf)
|
||||
return nil
|
||||
}
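The addon-to-socket table in the header comment of sidecar.go can be read as a lookup from addon name to a (socket path, route) pair. The following is a minimal sketch of that idea, not code from this diff: the `target` type, the `sidecarTargets` map, and the `targetFor` and `buildRequestHead` helpers are hypothetical names introduced for illustration. Only the socket paths, routes, and the request-head layout are taken from the file above.

```go
package main

import "fmt"

// target pairs a module's unix socket path with the HTTP route to POST to.
// Hypothetical type, introduced only for this sketch.
type target struct {
	Socket string
	Route  string
}

// sidecarTargets is a hypothetical lookup table. The paths and routes are
// copied from the header comment of sidecar.go; the map itself is an assumption.
var sidecarTargets = map[string]target{
	"cookies":   {"/run/secubox/cookies.sock", "/inject"},
	"dpi":       {"/run/secubox/dpi.sock", "/classify"},
	"avatar":    {"/run/secubox/avatar.sock", "/fingerprint"},
	"ja4":       {"/run/secubox/threat-analyst.sock", "/ja4"},
	"soc_relay": {"/run/secubox/soc.sock", "/event"},
}

// targetFor returns the socket and route for an addon. ok is false for addons
// without a socket, such as the in-process social_graph.
func targetFor(addon string) (t target, ok bool) {
	t, ok = sidecarTargets[addon]
	return t, ok
}

// buildRequestHead renders the same minimal HTTP/1.1 POST head that emitSync
// writes before the payload, defaulting an empty route to "/".
func buildRequestHead(route string, bodyLen int) string {
	if route == "" {
		route = "/"
	}
	return fmt.Sprintf(
		"POST %s HTTP/1.1\r\nHost: secubox.local\r\nContent-Type: application/json\r\n"+
			"Content-Length: %d\r\nConnection: close\r\n\r\n",
		route, bodyLen)
}

func main() {
	payload := []byte(`{"k":"v"}`)
	if t, ok := targetFor("dpi"); ok {
		// In the real engine this is where emit(t.Socket, t.Route, payload) would go.
		fmt.Printf("%s <- %q\n", t.Socket, buildRequestHead(t.Route, len(payload)))
	}
}
```

Keeping the socket and the route in one record avoids pairing a socket with the wrong route at a call site, which a fire-and-forget transport would otherwise drop silently.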
|
||||
125
packages/secubox-toolbox-ng/cmd/sbxmitm/sidecar_test.go
Normal file
|
|
@ -0,0 +1,125 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
// Unit tests for the sidecar emit helper (#662 Phase 4).
|
||||
package main
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"net"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
// TestEmitDelivers: emitSync to a live unix socket delivers the POST request
|
||||
// line, route and JSON body.
|
||||
func TestEmitDelivers(t *testing.T) {
|
||||
sock := filepath.Join(t.TempDir(), "emit.sock")
|
||||
ln, err := net.Listen("unix", sock)
|
||||
if err != nil {
|
||||
t.Fatalf("listen: %v", err)
|
||||
}
|
||||
defer ln.Close()
|
||||
|
||||
got := make(chan string, 1)
|
||||
go func() {
|
||||
c, err := ln.Accept()
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
defer c.Close()
|
||||
c.SetReadDeadline(time.Now().Add(2 * time.Second))
|
||||
var sb strings.Builder
|
||||
r := bufio.NewReader(c)
|
||||
buf := make([]byte, 4096)
|
||||
for {
|
||||
n, err := r.Read(buf)
|
||||
sb.Write(buf[:n])
|
||||
if err != nil || strings.Contains(sb.String(), `"k":"v"`) {
|
||||
break
|
||||
}
|
||||
}
|
||||
// Reply so emitSync's drain completes cleanly.
|
||||
c.Write([]byte("HTTP/1.1 204 No Content\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"))
|
||||
got <- sb.String()
|
||||
}()
|
||||
|
||||
if err := emitSync(sock, "/classify", []byte(`{"k":"v"}`)); err != nil {
|
||||
t.Fatalf("emitSync: %v", err)
|
||||
}
|
||||
|
||||
select {
|
||||
case raw := <-got:
|
||||
if !strings.HasPrefix(raw, "POST /classify HTTP/1.1") {
|
||||
t.Errorf("missing/wrong request line in:\n%s", raw)
|
||||
}
|
||||
if !strings.Contains(raw, `{"k":"v"}`) {
|
||||
t.Errorf("body not delivered in:\n%s", raw)
|
||||
}
|
||||
case <-time.After(3 * time.Second):
|
||||
t.Fatal("server never received the emit")
|
||||
}
|
||||
}
|
||||
|
||||
// TestEmitDeadSocketNoPanicNoBlock: emit() (the goroutine form) to a
|
||||
// nonexistent socket must return immediately and never panic, and emitSync
|
||||
// must just return an error without blocking past the timeout.
|
||||
func TestEmitDeadSocketNoPanicNoBlock(t *testing.T) {
|
||||
dead := filepath.Join(t.TempDir(), "nope.sock")
|
||||
|
||||
// emit (async) returns instantly even though the socket is dead.
|
||||
done := make(chan struct{})
|
||||
go func() {
|
||||
defer close(done)
|
||||
emit(dead, "/inject", []byte(`{"x":1}`)) // must not panic/block
|
||||
}()
|
||||
select {
|
||||
case <-done:
|
||||
case <-time.After(time.Second):
|
||||
t.Fatal("emit() blocked on a dead socket")
|
||||
}
|
||||
|
||||
// emitSync surfaces the dial error (which emit swallows) without blocking.
|
||||
start := time.Now()
|
||||
if err := emitSync(dead, "/inject", []byte(`{}`)); err == nil {
|
||||
t.Error("emitSync to dead socket: expected error, got nil")
|
||||
}
|
||||
if elapsed := time.Since(start); elapsed > emitTimeout+time.Second {
|
||||
t.Errorf("emitSync blocked %v on dead socket", elapsed)
|
||||
}
|
||||
}
|
||||
|
||||
// TestEmitEmptyRouteDefaults: an empty route becomes "/".
|
||||
func TestEmitEmptyRouteDefaults(t *testing.T) {
|
||||
sock := filepath.Join(t.TempDir(), "root.sock")
|
||||
ln, err := net.Listen("unix", sock)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
defer ln.Close()
|
||||
got := make(chan string, 1)
|
||||
go func() {
|
||||
c, err := ln.Accept()
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
defer c.Close()
|
||||
buf := make([]byte, 256)
|
||||
n, _ := c.Read(buf)
|
||||
c.Write([]byte("HTTP/1.1 204 No Content\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"))
|
||||
got <- string(buf[:n])
|
||||
}()
|
||||
if err := emitSync(sock, "", nil); err != nil {
|
||||
t.Fatalf("emitSync: %v", err)
|
||||
}
|
||||
select {
|
||||
case raw := <-got:
|
||||
if !strings.HasPrefix(raw, "POST / HTTP/1.1") {
|
||||
t.Errorf("empty route not defaulted to /, got:\n%s", raw)
|
||||
}
|
||||
case <-time.After(2 * time.Second):
|
||||
t.Fatal("no request received")
|
||||
}
|
||||
}
|
||||
398
packages/secubox-toolbox-ng/cmd/sbxmitm/transparent.go
Normal file
|
|
@ -0,0 +1,398 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
//go:build linux
|
||||
|
||||
// SecuBox-Deb :: toolbox-ng :: transparent SO_ORIGINAL_DST accept path
|
||||
// (#662 Phase 6 prep)
|
||||
//
|
||||
// The live R3 engine runs transparent: nft DNAT redirects the client's TCP SYN
|
||||
// to this worker, which recovers the ORIGINAL destination via
|
||||
// getsockopt(SOL_IP, SO_ORIGINAL_DST) (IPv4) or
|
||||
// getsockopt(SOL_IPV6, IP6T_SO_ORIGINAL_DST=80) (IPv6). This is a SECOND listen
|
||||
// mode behind --transparent; the CONNECT PoC (main.go handleConnect) is left
|
||||
// EXACTLY as-is.
|
||||
//
|
||||
// This is DARK — never wired to live traffic yet. The pure parser (parseOrigDst)
|
||||
// is unit-tested; the syscall glue (origDst) and end-to-end transparent capture
|
||||
// can only be exercised behind a real nft DNAT redirect, validated at Phase 5
|
||||
// shadow on the board, NOT in unit tests.
|
||||
//
|
||||
// Pure standard library — syscall + net + crypto/tls; no external modules.
|
||||
package main
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"crypto/tls"
|
||||
"encoding/binary"
|
||||
"fmt"
|
||||
"io"
|
||||
"log"
|
||||
"net"
|
||||
"strings"
|
||||
"syscall"
|
||||
"unsafe"
|
||||
)
|
||||
|
||||
// SO_ORIGINAL_DST is the Netfilter getsockopt that returns the pre-DNAT
|
||||
// destination sockaddr. Same value (80) for IPv4 (SOL_IP) and IPv6
|
||||
// (SOL_IPV6, where it is named IP6T_SO_ORIGINAL_DST).
|
||||
const soOriginalDst = 80
|
||||
|
||||
// parseOrigDst decodes a raw sockaddr blob (as returned by getsockopt
|
||||
// SO_ORIGINAL_DST) into host + port. It is PURE — no syscall — so it is fully
|
||||
// unit-testable offline.
|
||||
//
|
||||
// IPv4 sockaddr_in (16 bytes): [0:2]=family (AF_INET=2, host byte order),
|
||||
// [2:4]=port (BIG-endian / network order), [4:8]=4-byte address.
|
||||
// IPv6 sockaddr_in6 (≥24 bytes): [0:2]=family (AF_INET6=10), [2:4]=port (BE),
|
||||
// [4:8]=flowinfo, [8:24]=16-byte address.
|
||||
//
|
||||
// The family field is host byte order in the kernel; on x86/arm64 (little-endian)
|
||||
// AF_INET=2 lands in the low byte. We accept the family if EITHER the LE or BE
|
||||
// 16-bit read matches the expected constant, so the parser is endianness-robust
|
||||
// across architectures.
|
||||
func parseOrigDst(raw []byte) (host string, port int, err error) {
|
||||
if len(raw) < 4 {
|
||||
return "", 0, fmt.Errorf("sockaddr too short: %d bytes", len(raw))
|
||||
}
|
||||
famLE := binary.LittleEndian.Uint16(raw[0:2])
|
||||
famBE := binary.BigEndian.Uint16(raw[0:2])
|
||||
p := int(binary.BigEndian.Uint16(raw[2:4])) // port is network order
|
||||
|
||||
switch {
|
||||
case famLE == syscall.AF_INET || famBE == syscall.AF_INET:
|
||||
if len(raw) < 8 {
|
||||
return "", 0, fmt.Errorf("sockaddr_in too short: %d bytes", len(raw))
|
||||
}
|
||||
ip := net.IPv4(raw[4], raw[5], raw[6], raw[7])
|
||||
return ip.String(), p, nil
|
||||
case famLE == syscall.AF_INET6 || famBE == syscall.AF_INET6:
|
||||
if len(raw) < 24 {
|
||||
return "", 0, fmt.Errorf("sockaddr_in6 too short: %d bytes", len(raw))
|
||||
}
|
||||
ip := make(net.IP, 16)
|
||||
copy(ip, raw[8:24])
|
||||
return ip.String(), p, nil
|
||||
default:
|
||||
return "", 0, fmt.Errorf("unknown sockaddr family: LE=%d BE=%d", famLE, famBE)
|
||||
}
|
||||
}
|
||||
|
||||
// origDst recovers the pre-DNAT original destination of a transparently
|
||||
// redirected TCP connection via getsockopt(SO_ORIGINAL_DST). v4 vs v6 is chosen
|
||||
// by the local address family. stdlib-only (syscall.Syscall6 on the raw fd via
|
||||
// SyscallConn). Linux-only by build tag.
|
||||
func origDst(conn *net.TCPConn) (host string, port int, err error) {
|
||||
level := syscall.SOL_IP
|
||||
if la, ok := conn.LocalAddr().(*net.TCPAddr); ok && la.IP.To4() == nil && la.IP != nil {
|
||||
level = syscall.SOL_IPV6
|
||||
}
|
||||
rc, err := conn.SyscallConn()
|
||||
if err != nil {
|
||||
return "", 0, err
|
||||
}
|
||||
// A sockaddr_in6 is 28 bytes; size the buffer for the larger of the two.
|
||||
buf := make([]byte, 28)
|
||||
size := uint32(len(buf))
|
||||
var goErr error
|
||||
ctrlErr := rc.Control(func(fd uintptr) {
|
||||
_, _, errno := syscall.Syscall6(
|
||||
syscall.SYS_GETSOCKOPT,
|
||||
fd,
|
||||
uintptr(level),
|
||||
uintptr(soOriginalDst),
|
||||
uintptr(unsafe.Pointer(&buf[0])),
|
||||
uintptr(unsafe.Pointer(&size)),
|
||||
0,
|
||||
)
|
||||
if errno != 0 {
|
||||
goErr = errno
|
||||
}
|
||||
})
|
||||
if ctrlErr != nil {
|
||||
return "", 0, ctrlErr
|
||||
}
|
||||
if goErr != nil {
|
||||
return "", 0, goErr
|
||||
}
|
||||
return parseOrigDst(buf[:size])
|
||||
}
|
||||
|
||||
// ── ClientHello SNI peek (no decryption) ─────────────────────────────────────
|
||||
|
||||
// recordingReader tees every byte it reads off the underlying reader into an
|
||||
// in-memory buffer, so the exact bytes consumed during the ClientHello peek can
|
||||
// be re-fed to either the upstream (splice) or a tls.Server (mitm/allow/block).
|
||||
type recordingReader struct {
|
||||
r io.Reader
|
||||
buf bytes.Buffer
|
||||
}
|
||||
|
||||
func (rr *recordingReader) Read(p []byte) (int, error) {
|
||||
n, err := rr.r.Read(p)
|
||||
if n > 0 {
|
||||
rr.buf.Write(p[:n])
|
||||
}
|
||||
return n, err
|
||||
}
|
||||
|
||||
// prefixConn is a net.Conn whose Read drains an internal prefix buffer (the
|
||||
// bytes already peeked off the wire) before delegating to the underlying conn;
|
||||
// every other net.Conn method delegates straight through. This re-presents the
|
||||
// recorded ClientHello bytes to a tls.Server / upstream that must see the
|
||||
// original handshake.
|
||||
type prefixConn struct {
|
||||
prefix []byte
|
||||
off int
|
||||
net.Conn
|
||||
}
|
||||
|
||||
func (pc *prefixConn) Read(p []byte) (int, error) {
|
||||
if pc.off < len(pc.prefix) {
|
||||
n := copy(p, pc.prefix[pc.off:])
|
||||
pc.off += n
|
||||
return n, nil
|
||||
}
|
||||
return pc.Conn.Read(p)
|
||||
}
|
||||
|
||||
// peekClientHello reads exactly the first TLS record (expected to carry the
// ClientHello) off conn and records every byte it consumes, so the caller can
// replay them to the next consumer. The returned slice is the full set of bytes
// read off the wire: the first record, or a partial record on error.
|
||||
func peekClientHello(conn net.Conn) (record []byte, err error) {
|
||||
rr := &recordingReader{r: conn}
|
||||
// TLS record header: type(1) + version(2) + length(2).
|
||||
hdr := make([]byte, 5)
|
||||
if _, err := io.ReadFull(rr, hdr); err != nil {
|
||||
return rr.buf.Bytes(), err
|
||||
}
|
||||
recLen := int(binary.BigEndian.Uint16(hdr[3:5]))
|
||||
// Sanity cap: a TLS record payload is at most 16 KiB (2^14 bytes); this peek
// assumes the whole ClientHello arrives in that single first record.
|
||||
if recLen < 0 || recLen > (1<<14) {
|
||||
return rr.buf.Bytes(), fmt.Errorf("clienthello record length out of range: %d", recLen)
|
||||
}
|
||||
if _, err := io.ReadFull(rr, make([]byte, recLen)); err != nil {
|
||||
return rr.buf.Bytes(), err
|
||||
}
|
||||
return rr.buf.Bytes(), nil
|
||||
}
|
||||
|
||||
// sniFromClientHello extracts the SNI host_name from a raw TLS ClientHello
|
||||
// record. It is PURE (no I/O) and defensive: every slice is bounds-checked and
|
||||
// any malformed/short input or absent SNI returns ("", false) — it never panics.
|
||||
//
|
||||
// Record framing parsed here:
|
||||
//
|
||||
// record header : type=0x16 (handshake) | version(2) | length(2)
|
||||
// handshake hdr : type=0x01 (ClientHello) | length(3)
|
||||
// body : client_version(2) | random(32) |
|
||||
// session_id_len(1) + session_id |
|
||||
// cipher_suites_len(2) + cipher_suites |
|
||||
// compression_len(1) + compression_methods |
|
||||
// extensions_len(2) + extensions
|
||||
// extension : ext_type(2) | ext_len(2) + ext_data
|
||||
// server_name : list_len(2) | name_type(1)=0 | name_len(2) + host
|
||||
func sniFromClientHello(record []byte) (string, bool) {
|
||||
// record header (5) — type 0x16 handshake.
|
||||
if len(record) < 5 || record[0] != 0x16 {
|
||||
return "", false
|
||||
}
|
||||
recLen := int(binary.BigEndian.Uint16(record[3:5]))
|
||||
body := record[5:]
|
||||
if len(body) < recLen {
|
||||
return "", false
|
||||
}
|
||||
body = body[:recLen]
|
||||
|
||||
// handshake header (4) — type 0x01 ClientHello + 3-byte length.
|
||||
if len(body) < 4 || body[0] != 0x01 {
|
||||
return "", false
|
||||
}
|
||||
hsLen := int(body[1])<<16 | int(body[2])<<8 | int(body[3])
|
||||
hs := body[4:]
|
||||
if len(hs) < hsLen {
|
||||
return "", false
|
||||
}
|
||||
hs = hs[:hsLen]
|
||||
|
||||
// client_version(2) + random(32).
|
||||
if len(hs) < 34 {
|
||||
return "", false
|
||||
}
|
||||
p := hs[34:]
|
||||
|
||||
// session_id: len(1) + data.
|
||||
if len(p) < 1 {
|
||||
return "", false
|
||||
}
|
||||
sidLen := int(p[0])
|
||||
p = p[1:]
|
||||
if len(p) < sidLen {
|
||||
return "", false
|
||||
}
|
||||
p = p[sidLen:]
|
||||
|
||||
// cipher_suites: len(2) + data.
|
||||
if len(p) < 2 {
|
||||
return "", false
|
||||
}
|
||||
csLen := int(binary.BigEndian.Uint16(p[0:2]))
|
||||
p = p[2:]
|
||||
if len(p) < csLen {
|
||||
return "", false
|
||||
}
|
||||
p = p[csLen:]
|
||||
|
||||
// compression_methods: len(1) + data.
|
||||
if len(p) < 1 {
|
||||
return "", false
|
||||
}
|
||||
cmLen := int(p[0])
|
||||
p = p[1:]
|
||||
if len(p) < cmLen {
|
||||
return "", false
|
||||
}
|
||||
p = p[cmLen:]
|
||||
|
||||
// extensions: len(2) + entries.
|
||||
if len(p) < 2 {
|
||||
return "", false
|
||||
}
|
||||
extLen := int(binary.BigEndian.Uint16(p[0:2]))
|
||||
p = p[2:]
|
||||
if len(p) < extLen {
|
||||
return "", false
|
||||
}
|
||||
ext := p[:extLen]
|
||||
|
||||
for len(ext) >= 4 {
|
||||
etype := binary.BigEndian.Uint16(ext[0:2])
|
||||
elen := int(binary.BigEndian.Uint16(ext[2:4]))
|
||||
ext = ext[4:]
|
||||
if len(ext) < elen {
|
||||
return "", false
|
||||
}
|
||||
data := ext[:elen]
|
||||
ext = ext[elen:]
|
||||
if etype != 0x0000 { // server_name
|
||||
continue
|
||||
}
|
||||
// server_name_list: list_len(2) + entries.
|
||||
if len(data) < 2 {
|
||||
return "", false
|
||||
}
|
||||
listLen := int(binary.BigEndian.Uint16(data[0:2]))
|
||||
list := data[2:]
|
||||
if len(list) < listLen {
|
||||
return "", false
|
||||
}
|
||||
list = list[:listLen]
|
||||
// First entry: name_type(1) + name_len(2) + host.
|
||||
if len(list) < 3 {
|
||||
return "", false
|
||||
}
|
||||
nameType := list[0]
|
||||
nameLen := int(binary.BigEndian.Uint16(list[1:3]))
|
||||
list = list[3:]
|
||||
if nameType != 0x00 || len(list) < nameLen { // 0 = host_name
|
||||
return "", false
|
||||
}
|
||||
return string(list[:nameLen]), true
|
||||
}
|
||||
return "", false
|
||||
}
|
||||
|
||||
// ── transparent accept path ──────────────────────────────────────────────────
|
||||
|
||||
// runTransparent runs the transparent (SO_ORIGINAL_DST) accept loop: listen on
|
||||
// addr, and for each nft-DNAT'd connection recover its pre-DNAT destination and
|
||||
// dispatch to handleTransparent. Linux-only (build-tagged).
|
||||
func runTransparent(px *Proxy, addr string) {
|
||||
ln, err := net.Listen("tcp", addr)
|
||||
if err != nil {
|
||||
log.Fatalf("transparent listen: %v", err)
|
||||
}
|
||||
log.Printf("sbxmitm TRANSPARENT listening on %s", addr)
|
||||
for {
|
||||
conn, err := ln.Accept()
|
||||
if err != nil {
|
||||
log.Printf("accept: %v", err)
|
||||
continue
|
||||
}
|
||||
go px.handleTransparent(conn)
|
||||
}
|
||||
}
|
||||
|
||||
// handleTransparent serves one transparently-redirected client connection:
|
||||
// 1. recover the pre-DNAT original destination via SO_ORIGINAL_DST,
|
||||
// 2. PEEK the ClientHello off the raw conn without consuming it,
|
||||
// 3. parse the SNI and Decide WITHOUT decrypting,
|
||||
// 4. splice → raw TCP passthrough to the ORIGINAL dst, replaying the peeked
|
||||
// ClientHello first; NEVER terminate TLS (cert-pinned/own-infra safe),
|
||||
// 5. allow/mitm/block → NOW tls.Server over the replayable conn (so the TLS
|
||||
// server still sees the original ClientHello) and run the shared pipeline.
|
||||
func (px *Proxy) handleTransparent(client net.Conn) {
|
||||
defer client.Close()
|
||||
|
||||
tcp, ok := client.(*net.TCPConn)
|
||||
if !ok {
|
||||
return // transparent mode only accepts raw TCP conns
|
||||
}
|
||||
// R3 WG client? The data-wg attribute of the injected loader mirrors the
|
||||
// Python _loader_script (ip.startswith("10.99.1.")) — derived from the same
|
||||
// client conn peer IP that feeds clientHashFromConn.
|
||||
wg := false
|
||||
if peer, _, perr := net.SplitHostPort(client.RemoteAddr().String()); perr == nil {
|
||||
wg = strings.HasPrefix(peer, "10.99.1.")
|
||||
}
|
||||
dstHost, dstPort, err := origDst(tcp)
|
||||
if err != nil {
|
||||
return // no original-dst (not DNAT'd) → drop; nothing safe to do
|
||||
}
|
||||
dialAddr := net.JoinHostPort(dstHost, fmt.Sprintf("%d", dstPort))
|
||||
|
||||
// Peek the ClientHello WITHOUT decrypting. The recorded bytes are replayed
|
||||
// to whatever we hand the conn to next (upstream for splice, tls.Server
|
||||
// otherwise) so the original handshake is preserved byte-for-byte.
|
||||
hello, perr := peekClientHello(client)
|
||||
if perr != nil {
|
||||
return // could not read a ClientHello → nothing safe to do
|
||||
}
|
||||
sni, _ := sniFromClientHello(hello)
|
||||
decisionHost := sni
|
||||
if decisionHost == "" {
|
||||
decisionHost = dstHost // no SNI → fall back to the captured dst IP
|
||||
}
|
||||
|
||||
verdict := px.pol.Decide(decisionHost, sni)
|
||||
|
||||
if verdict == "splice" {
|
||||
// Passthrough: raw TCP to the REAL captured destination, never the SNI,
|
||||
// NEVER terminating TLS. Replay the peeked ClientHello to the upstream
|
||||
// first, then pipe raw bytes both directions over the raw client conn.
|
||||
up, derr := net.Dial("tcp", dialAddr)
|
||||
if derr != nil {
|
||||
return
|
||||
}
|
||||
defer up.Close()
|
||||
if _, werr := up.Write(hello); werr != nil {
|
||||
return
|
||||
}
|
||||
go func() { _, _ = io.Copy(up, client) }()
|
||||
_, _ = io.Copy(client, up)
|
||||
return
|
||||
}
|
||||
|
||||
// allow / mitm / block → re-present the peeked ClientHello to a tls.Server
|
||||
// over a replayable conn, then run the shared pipeline dialling the captured
|
||||
// original-dst (NOT the SNI).
|
||||
replay := &prefixConn{prefix: hello, Conn: client}
|
||||
tconn := tls.Server(replay, px.serverTLSConfig())
|
||||
if err := tconn.Handshake(); err != nil {
|
||||
return
|
||||
}
|
||||
defer tconn.Close()
|
||||
px.mitmPipeline(tconn, client, decisionHost, verdict, dialAddr, wg)
|
||||
}
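The peek-then-replay step in handleTransparent can be tried in isolation with an in-memory pipe. The sketch below is an illustration under stated assumptions, not the diff's implementation: `replayConn`, `peekRecord`, and `demoReplay` are hypothetical names, and `io.TeeReader` stands in for the diff's recordingReader. It shows that after peeking one record, a downstream reader still sees the original byte stream from the first byte.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"net"
)

// replayConn is a hypothetical stand-in for the diff's prefixConn: Read drains
// the recorded prefix first, then falls through to the wrapped conn.
type replayConn struct {
	net.Conn
	prefix []byte
}

func (rc *replayConn) Read(p []byte) (int, error) {
	if len(rc.prefix) > 0 {
		n := copy(p, rc.prefix)
		rc.prefix = rc.prefix[n:]
		return n, nil
	}
	return rc.Conn.Read(p)
}

// peekRecord reads one TLS-style record (5-byte header + payload) and returns
// every byte consumed so it can be replayed later.
func peekRecord(c net.Conn) ([]byte, error) {
	var rec bytes.Buffer
	tee := io.TeeReader(c, &rec) // copies each byte read into rec
	hdr := make([]byte, 5)
	if _, err := io.ReadFull(tee, hdr); err != nil {
		return rec.Bytes(), err
	}
	n := int64(binary.BigEndian.Uint16(hdr[3:5]))
	if _, err := io.CopyN(io.Discard, tee, n); err != nil {
		return rec.Bytes(), err
	}
	return rec.Bytes(), nil
}

// demoReplay pushes one record plus two trailing bytes through net.Pipe, peeks
// the record, and returns what a downstream reader sees afterwards.
func demoReplay() (peeked, replayed []byte, err error) {
	client, server := net.Pipe()
	wire := []byte{0x16, 0x03, 0x03, 0x00, 0x03, 'a', 'b', 'c', 'x', 'y'}
	go func() {
		client.Write(wire)
		client.Close()
	}()
	defer server.Close()

	peeked, err = peekRecord(server)
	if err != nil {
		return nil, nil, err
	}
	// Copy the peeked bytes so the replay prefix does not alias the buffer.
	rc := &replayConn{Conn: server, prefix: append([]byte(nil), peeked...)}
	replayed, err = io.ReadAll(rc)
	return peeked, replayed, err
}

func main() {
	peeked, replayed, err := demoReplay()
	fmt.Println(len(peeked), len(replayed), err)
}
```

The same wrapper serves both branches described above: for splice the recorded bytes are written to the upstream first, and for the TLS-terminating path the wrapped conn is handed to the TLS server so it sees the unmodified handshake.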
|
||||
33
packages/secubox-toolbox-ng/cmd/sbxmitm/transparent_stub.go
Normal file
|
|
@ -0,0 +1,33 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
//go:build !linux
|
||||
|
||||
// SecuBox-Deb :: toolbox-ng :: transparent mode non-linux stub (#662).
|
||||
//
|
||||
// SO_ORIGINAL_DST recovery is Netfilter-specific (Linux-only). The real
|
||||
// transparent accept path lives in transparent.go behind //go:build linux. This
// stub keeps the package compiling on non-linux targets (e.g.
// `GOOS=darwin go build ./...`); invoking transparent mode there is a hard
// error, never a silent degradation. handleTransparent is stubbed too in case
// it is referenced.
|
||||
package main
|
||||
|
||||
import (
|
||||
"log"
|
||||
"net"
|
||||
)
|
||||
|
||||
// runTransparent is the non-linux counterpart of the linux accept loop: it
|
||||
// refuses to start, because transparent SO_ORIGINAL_DST capture requires Linux.
|
||||
func runTransparent(px *Proxy, addr string) {
|
||||
_ = px
|
||||
_ = addr
|
||||
log.Fatal("transparent mode requires linux (SO_ORIGINAL_DST)")
|
||||
}
|
||||
|
||||
// handleTransparent is a non-linux stub; it can never be reached because
|
||||
// runTransparent log.Fatals first. Present so any reference still links.
|
||||
func (px *Proxy) handleTransparent(client net.Conn) {
|
||||
_ = client
|
||||
log.Fatal("transparent mode requires linux (SO_ORIGINAL_DST)")
|
||||
}
|
||||
304
packages/secubox-toolbox-ng/cmd/sbxmitm/transparent_test.go
Normal file
|
|
@ -0,0 +1,304 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
//
|
||||
//go:build linux
|
||||
|
||||
// Tests for the transparent-mode helpers (#662 Phase 6 prep).
//
// Only syscall-free code is unit-tested here, e.g. parseOrigDst (raw sockaddr
// blob) and sniFromClientHello (raw ClientHello record), so it is fully covered
// offline. The real getsockopt(SO_ORIGINAL_DST) glue (origDst) cannot be
// exercised without an nft DNAT redirect in the kernel; end-to-end transparent
// capture is validated at Phase 5 shadow on the board, NOT in unit tests
// (documented in transparent.go).
|
||||
package main
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/binary"
|
||||
"io"
|
||||
"net"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
// mkSockaddrIn4 builds a 16-byte sockaddr_in: family(2 host-order) + port(BE) +
|
||||
// 4-byte addr + 8 pad. familyLE controls whether the 2 family bytes are written
|
||||
// little-endian (low byte first, the x86/arm64 host order) or big-endian, so we
|
||||
// can prove parseOrigDst tolerates both.
|
||||
func mkSockaddrIn4(family uint16, port uint16, a, b, c, d byte, familyLE bool) []byte {
|
||||
buf := make([]byte, 16)
|
||||
if familyLE {
|
||||
binary.LittleEndian.PutUint16(buf[0:2], family)
|
||||
} else {
|
||||
binary.BigEndian.PutUint16(buf[0:2], family)
|
||||
}
|
||||
binary.BigEndian.PutUint16(buf[2:4], port) // port is always network order
|
||||
buf[4], buf[5], buf[6], buf[7] = a, b, c, d
|
||||
return buf
|
||||
}
|
||||
|
||||
// mkSockaddrIn6 builds a 28-byte sockaddr_in6: family(2) + port(BE) +
|
||||
// flowinfo(4) + 16-byte addr + scope_id(4).
|
||||
func mkSockaddrIn6(family uint16, port uint16, addr [16]byte, familyLE bool) []byte {
|
||||
buf := make([]byte, 28)
|
||||
if familyLE {
|
||||
binary.LittleEndian.PutUint16(buf[0:2], family)
|
||||
} else {
|
||||
binary.BigEndian.PutUint16(buf[0:2], family)
|
||||
}
|
||||
binary.BigEndian.PutUint16(buf[2:4], port)
|
||||
copy(buf[8:24], addr[:])
|
||||
return buf
|
||||
}
|
||||
|
||||
func TestParseOrigDstIPv4(t *testing.T) {
|
||||
cases := []struct {
|
||||
name string
|
||||
raw []byte
|
||||
wantHost string
|
||||
wantPort int
|
||||
}{
|
||||
{"le-family", mkSockaddrIn4(2, 443, 93, 184, 216, 34, true), "93.184.216.34", 443},
|
||||
{"be-family", mkSockaddrIn4(2, 8080, 10, 99, 1, 10, false), "10.99.1.10", 8080},
|
||||
{"high-port", mkSockaddrIn4(2, 65535, 1, 2, 3, 4, true), "1.2.3.4", 65535},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
host, port, err := parseOrigDst(tc.raw)
|
||||
if err != nil {
|
||||
t.Fatalf("parseOrigDst: %v", err)
|
||||
}
|
||||
if host != tc.wantHost || port != tc.wantPort {
|
||||
t.Fatalf("parseOrigDst = %q:%d want %q:%d", host, port, tc.wantHost, tc.wantPort)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseOrigDstIPv6(t *testing.T) {
|
||||
// 2606:2800:220:1:248:1893:25c8:1946 (example.com-ish), port 443.
|
||||
addr := [16]byte{0x26, 0x06, 0x28, 0x00, 0x02, 0x20, 0x00, 0x01,
|
||||
0x02, 0x48, 0x18, 0x93, 0x25, 0xc8, 0x19, 0x46}
|
||||
for _, le := range []bool{true, false} {
|
||||
raw := mkSockaddrIn6(10, 443, addr, le)
|
||||
host, port, err := parseOrigDst(raw)
|
||||
if err != nil {
|
||||
t.Fatalf("parseOrigDst(le=%v): %v", le, err)
|
||||
}
|
||||
want := "2606:2800:220:1:248:1893:25c8:1946"
|
||||
if host != want || port != 443 {
|
||||
t.Fatalf("parseOrigDst(le=%v) = %q:%d want %q:443", le, host, port, want)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseOrigDstPortBigEndian(t *testing.T) {
|
||||
// Port 0x01BB = 443; assert it is read big-endian (network order), not the
|
||||
// host-order 0xBB01 = 47873.
|
||||
raw := mkSockaddrIn4(2, 0x01BB, 8, 8, 8, 8, true)
|
||||
_, port, err := parseOrigDst(raw)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
if port != 443 {
|
||||
t.Fatalf("port = %d want 443 (big-endian decode)", port)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseOrigDstErrors(t *testing.T) {
|
||||
cases := []struct {
|
||||
name string
|
||||
raw []byte
|
||||
}{
|
||||
{"empty", nil},
|
||||
{"unknown-family-4", make([]byte, 4)}, // all-zero family=0 → unknown-family branch
|
||||
{"too-short-v4", mkV4Short()}, // valid AF_INET family but 4≤len<8 → sockaddr_in <8 guard
|
||||
{"too-short-v6", mkV6Short()}, // AF_INET6 but < 24 bytes
|
||||
{"unknown-family", mkSockaddrIn4(7, 443, 1, 2, 3, 4, true)},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
if _, _, err := parseOrigDst(tc.raw); err == nil {
|
||||
t.Fatalf("parseOrigDst(%s) = nil err, want error", tc.name)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// mkV6Short returns an AF_INET6 blob truncated before the 16-byte address.
|
||||
func mkV6Short() []byte {
|
||||
buf := make([]byte, 10) // family + port + flowinfo + 2 bytes of addr
|
||||
binary.LittleEndian.PutUint16(buf[0:2], 10)
|
||||
binary.BigEndian.PutUint16(buf[2:4], 443)
|
||||
return buf
|
||||
}
|
||||
|
||||
// mkV4Short returns a blob with a valid AF_INET family byte but a total length
|
||||
// in [4,8): it passes the >=4 length check and matches the AF_INET case, so it
|
||||
// exercises parseOrigDst's sockaddr_in `<8` guard (not the unknown-family path).
|
||||
func mkV4Short() []byte {
|
||||
buf := make([]byte, 6) // family(2) + port(2) but no full 4-byte address
|
||||
binary.LittleEndian.PutUint16(buf[0:2], 2) // AF_INET
|
||||
binary.BigEndian.PutUint16(buf[2:4], 443)
|
||||
return buf
|
||||
}
|
||||
|
||||
// ── sniFromClientHello ───────────────────────────────────────────────────────
|
||||
|
||||
// mkClientHello hand-assembles a minimal but structurally-valid TLS
|
||||
// ClientHello record. If withSNI is true a server_name extension carrying
|
||||
// `sni` (a single host_name entry) is appended; otherwise NO extensions are
|
||||
// emitted (extensions length 0).
|
||||
//
|
||||
// Record layout assembled here (see sniFromClientHello for the parser):
|
||||
//
|
||||
// record header : type=0x16 (handshake) | version 0x0303 | record_len(2)
|
||||
// handshake : type=0x01 (ClientHello) | hs_len(3)
|
||||
// body : client_version 0x0303 | random(32) |
|
||||
// session_id_len=0 |
|
||||
// cipher_suites_len(2)=2 | cipher 0x002f |
|
||||
// compression_len=1 | method 0x00 |
|
||||
// extensions_len(2) | [ server_name ext ]
|
||||
// server_name : ext_type 0x0000 | ext_len(2) |
|
||||
// list_len(2) | name_type 0x00 | name_len(2) | host bytes
|
||||
func mkClientHello(sni string, withSNI bool) []byte {
|
||||
body := []byte{0x03, 0x03} // client_version TLS1.2
|
||||
body = append(body, make([]byte, 32)...) // random (zeros)
|
||||
body = append(body, 0x00) // session_id_len = 0
|
||||
// cipher_suites: length 2, one suite TLS_RSA_WITH_AES_128_CBC_SHA (0x002f)
|
||||
body = append(body, 0x00, 0x02, 0x00, 0x2f)
|
||||
// compression_methods: length 1, method null (0x00)
|
||||
body = append(body, 0x01, 0x00)
|
||||
|
||||
var exts []byte
|
||||
if withSNI {
|
||||
host := []byte(sni)
|
||||
var sn []byte
|
||||
sn = append(sn, 0x00) // name_type = host_name
|
||||
sn = append(sn, byte(len(host)>>8), byte(len(host))) // name_len(2)
|
||||
sn = append(sn, host...)
|
||||
var list []byte
|
||||
list = append(list, byte(len(sn)>>8), byte(len(sn))) // server_name_list len(2)
|
||||
list = append(list, sn...)
|
||||
exts = append(exts, 0x00, 0x00) // ext_type = server_name
|
||||
exts = append(exts, byte(len(list)>>8), byte(len(list))) // ext_len(2)
|
||||
exts = append(exts, list...)
|
||||
}
|
||||
body = append(body, byte(len(exts)>>8), byte(len(exts))) // extensions_len(2)
|
||||
body = append(body, exts...)
|
||||
|
||||
// handshake header: type 0x01 + 3-byte length
|
||||
hs := []byte{0x01, byte(len(body) >> 16), byte(len(body) >> 8), byte(len(body))}
|
||||
hs = append(hs, body...)
|
||||
|
||||
// record header: type 0x16 + version 0x0303 + 2-byte length
|
||||
rec := []byte{0x16, 0x03, 0x03, byte(len(hs) >> 8), byte(len(hs))}
|
||||
rec = append(rec, hs...)
|
||||
return rec
|
||||
}
|
||||
|
||||
func TestSNIFromClientHello(t *testing.T) {
|
||||
// Sanity: the hand-assembled blob parses with our own parser.
|
||||
good := mkClientHello("example.com", true)
|
||||
if sni, ok := sniFromClientHello(good); !ok || sni != "example.com" {
|
||||
t.Fatalf("sniFromClientHello(valid) = %q,%v want example.com,true", sni, ok)
|
||||
}
|
||||
|
||||
cases := []struct {
|
||||
name string
|
||||
rec []byte
|
||||
wantSNI string
|
||||
wantOK bool
|
||||
}{
|
||||
{"with-sni", mkClientHello("secubox.in", true), "secubox.in", true},
|
||||
{"no-sni-ext", mkClientHello("", false), "", false},
|
||||
{"nil", nil, "", false},
|
||||
{"empty", []byte{}, "", false},
|
||||
{"non-handshake-record", []byte{0x17, 0x03, 0x03, 0x00, 0x05, 1, 2, 3, 4, 5}, "", false},
|
||||
{"truncated-header", []byte{0x16, 0x03}, "", false},
|
||||
// valid record header claiming length 100 but body truncated.
|
||||
{"truncated-body", []byte{0x16, 0x03, 0x03, 0x00, 0x64, 0x01, 0x00, 0x00}, "", false},
|
||||
// truncate a known-good blob mid-extensions.
|
||||
{"truncated-good", good[:len(good)-3], "", false},
|
||||
{"not-clienthello-hs", func() []byte {
|
||||
b := mkClientHello("x.example", true)
|
||||
b[5] = 0x02 // handshake type ServerHello, not ClientHello
|
||||
return b
|
||||
}(), "", false},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
sni, ok := sniFromClientHello(tc.rec)
|
||||
if ok != tc.wantOK || sni != tc.wantSNI {
|
||||
t.Fatalf("sniFromClientHello = %q,%v want %q,%v", sni, ok, tc.wantSNI, tc.wantOK)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestSNIFromClientHelloNoPanic(t *testing.T) {
|
||||
// Fuzz-ish: every truncation of a valid blob must return cleanly, never panic.
|
||||
good := mkClientHello("example.com", true)
|
||||
for i := 0; i <= len(good); i++ {
|
||||
func() {
|
||||
defer func() {
|
||||
if r := recover(); r != nil {
|
||||
t.Fatalf("panic on good[:%d]: %v", i, r)
|
||||
}
|
||||
}()
|
||||
_, _ = sniFromClientHello(good[:i])
|
||||
}()
|
||||
}
|
||||
}
|
||||
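The record layout documented above `mkClientHello` can be read back with a small bounds-checked walker. The following is a minimal sketch, not the project's `sniFromClientHello`: the names `parseSNI`, `take` and `helloWithSNI` are hypothetical, and it assumes a single TLS record carrying the whole ClientHello, as in the test fixtures.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// take reads an n-byte big-endian length prefix and returns the field it
// covers plus the remaining bytes. ok is false on any truncation.
func take(p []byte, n int) (field, rest []byte, ok bool) {
	if len(p) < n {
		return nil, nil, false
	}
	l := 0
	for _, b := range p[:n] {
		l = l<<8 | int(b)
	}
	p = p[n:]
	if len(p) < l {
		return nil, nil, false
	}
	return p[:l], p[l:], true
}

// parseSNI (hypothetical) returns the first host_name in a ClientHello record.
func parseSNI(rec []byte) (string, bool) {
	if len(rec) < 5 || rec[0] != 0x16 { // record header: type, version, length
		return "", false
	}
	p, _, ok := take(rec[3:], 2) // record payload
	if !ok || len(p) < 4 || p[0] != 0x01 {
		return "", false
	}
	if p, _, ok = take(p[1:], 3); !ok || len(p) < 34 { // handshake body
		return "", false
	}
	p = p[34:] // skip client_version(2) + random(32)
	if _, p, ok = take(p, 1); !ok { // session_id
		return "", false
	}
	if _, p, ok = take(p, 2); !ok { // cipher_suites
		return "", false
	}
	if _, p, ok = take(p, 1); !ok { // compression_methods
		return "", false
	}
	exts, _, ok := take(p, 2)
	if !ok {
		return "", false
	}
	for len(exts) >= 4 {
		typ := binary.BigEndian.Uint16(exts[:2])
		var data []byte
		if data, exts, ok = take(exts[2:], 2); !ok {
			return "", false
		}
		if typ != 0x0000 { // not server_name
			continue
		}
		list, _, ok := take(data, 2)
		for ok && len(list) >= 3 {
			nameType := list[0]
			var name []byte
			if name, list, ok = take(list[1:], 2); ok && nameType == 0 && len(name) > 0 {
				return string(name), true
			}
		}
		return "", false
	}
	return "", false
}

// helloWithSNI builds the same minimal ClientHello shape as the test helper.
func helloWithSNI(host string) []byte {
	n := len(host)
	body := append([]byte{0x03, 0x03}, make([]byte, 32)...)
	body = append(body, 0x00, 0x00, 0x02, 0x00, 0x2f, 0x01, 0x00)
	ext := []byte{0x00, 0x00, byte((n + 5) >> 8), byte(n + 5),
		byte((n + 3) >> 8), byte(n + 3), 0x00, byte(n >> 8), byte(n)}
	ext = append(ext, host...)
	body = append(body, byte(len(ext)>>8), byte(len(ext)))
	body = append(body, ext...)
	hs := append([]byte{0x01, byte(len(body) >> 16), byte(len(body) >> 8), byte(len(body))}, body...)
	return append([]byte{0x16, 0x03, 0x03, byte(len(hs) >> 8), byte(len(hs))}, hs...)
}

func main() {
	fmt.Println(parseSNI(helloWithSNI("example.com")))
}
```

Routing every length-prefixed field through one helper keeps all slicing behind a single bounds check, which is the property the truncation loop in `TestSNIFromClientHelloNoPanic` exercises.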
|
||||
// ── prefixConn (replayable client conn) ──────────────────────────────────────
|
||||
|
||||
// fakeConn adapts an io.ReadWriteCloser to net.Conn for prefixConn tests.
|
||||
type fakeConn struct{ io.ReadWriteCloser }
|
||||
|
||||
func (fakeConn) LocalAddr() net.Addr { return &net.TCPAddr{} }
|
||||
func (fakeConn) RemoteAddr() net.Addr { return &net.TCPAddr{} }
|
||||
func (fakeConn) SetDeadline(time.Time) error { return nil }
|
||||
func (fakeConn) SetReadDeadline(time.Time) error { return nil }
|
||||
func (fakeConn) SetWriteDeadline(time.Time) error { return nil }
|
||||
|
||||
type rwc struct {
|
||||
*bytes.Reader
|
||||
w *bytes.Buffer
|
||||
}
|
||||
|
||||
func (r rwc) Write(p []byte) (int, error) { return r.w.Write(p) }
|
||||
func (rwc) Close() error { return nil }
|
||||
|
||||
func TestPrefixConnReplaysBufferedThenLive(t *testing.T) {
|
||||
live := bytes.NewReader([]byte("LIVE-DATA"))
|
||||
wbuf := &bytes.Buffer{}
|
||||
underlying := fakeConn{rwc{Reader: live, w: wbuf}}
|
||||
|
||||
pc := &prefixConn{prefix: []byte("PEEKED"), Conn: underlying}
|
||||
|
||||
got, err := io.ReadAll(pc)
|
||||
if err != nil {
|
||||
t.Fatalf("read: %v", err)
|
||||
}
|
||||
if string(got) != "PEEKEDLIVE-DATA" {
|
||||
t.Fatalf("prefixConn read = %q want PEEKEDLIVE-DATA", got)
|
||||
}
|
||||
// Writes delegate straight through to the underlying conn.
|
||||
if _, err := pc.Write([]byte("OUT")); err != nil {
|
||||
t.Fatalf("write: %v", err)
|
||||
}
|
||||
if wbuf.String() != "OUT" {
|
||||
t.Fatalf("underlying write = %q want OUT", wbuf.String())
|
||||
}
|
||||
}
|
||||
|
||||
func TestPrefixConnEmptyPrefix(t *testing.T) {
|
||||
live := bytes.NewReader([]byte("ONLY-LIVE"))
|
||||
underlying := fakeConn{rwc{Reader: live, w: &bytes.Buffer{}}}
|
||||
pc := &prefixConn{Conn: underlying}
|
||||
got, _ := io.ReadAll(pc)
|
||||
if string(got) != "ONLY-LIVE" {
|
||||
t.Fatalf("prefixConn read = %q want ONLY-LIVE", got)
|
||||
}
|
||||
}
|
||||
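The two `prefixConn` tests above pin down a simple contract: reads drain the peeked bytes first, then fall through to the live connection, while writes pass straight through. A minimal sketch of a type with that behaviour, assuming nothing about the real implementation (`replayConn` and `readThrough` are hypothetical names):

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// replayConn (hypothetical) serves prefix before reading from the embedded
// conn. Write, Close and the deadline methods are inherited unchanged.
type replayConn struct {
	net.Conn
	prefix []byte
}

func (c *replayConn) Read(p []byte) (int, error) {
	if len(c.prefix) > 0 {
		n := copy(p, c.prefix)
		c.prefix = c.prefix[n:]
		return n, nil
	}
	return c.Conn.Read(p)
}

// readThrough wires a replayConn to one end of an in-memory pipe, sends live
// on the other end, and returns everything the reader sees.
func readThrough(prefix, live string) (string, error) {
	client, server := net.Pipe()
	go func() {
		client.Write([]byte(live))
		client.Close()
	}()
	got, err := io.ReadAll(&replayConn{Conn: server, prefix: []byte(prefix)})
	return string(got), err
}

func main() {
	fmt.Println(readThrough("PEEKED-", "LIVE"))
}
```

Embedding `net.Conn` means only `Read` needs overriding, so the wrapper can be handed to anything expecting a `net.Conn`, such as `tls.Server`.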
60
packages/secubox-toolbox-ng/cmd/sbxmitm/util.go
Normal file
|
|
@ -0,0 +1,60 @@
|
|||
// SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
// Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
|
||||
package main
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"fmt"
|
||||
"io"
|
||||
"net"
|
||||
"net/http"
|
||||
)
|
||||
|
||||
func newReader(c net.Conn) *bufio.Reader { return bufio.NewReader(c) }
|
||||
|
||||
// writeResponse serializes an http.Response (status + headers + body) onto a
|
||||
// (TLS) conn, preserving MULTI-VALUED headers (notably Set-Cookie, which the
|
||||
// poison path rewrites per-cookie). Hop-by-hop framing headers are dropped and
|
||||
// replaced with an explicit Content-Length + Connection: close, because we send
|
||||
// the fully-buffered body.
|
||||
func writeResponse(c io.Writer, resp *http.Response, body []byte) {
|
||||
status := resp.Status
|
||||
if status == "" {
|
||||
status = fmt.Sprintf("%d %s", resp.StatusCode, http.StatusText(resp.StatusCode)) // keep the SP + reason-phrase the status line requires
|
||||
}
|
||||
fmt.Fprintf(c, "HTTP/1.1 %s\r\n", status)
|
||||
for k, vals := range resp.Header {
|
||||
switch http.CanonicalHeaderKey(k) {
|
||||
case "Content-Length", "Transfer-Encoding", "Connection":
|
||||
continue // we set framing ourselves
|
||||
}
|
||||
for _, v := range vals {
|
||||
fmt.Fprintf(c, "%s: %s\r\n", k, v)
|
||||
}
|
||||
}
|
||||
fmt.Fprintf(c, "Content-Length: %d\r\n", len(body))
|
||||
fmt.Fprintf(c, "Connection: close\r\n")
|
||||
io.WriteString(c, "\r\n")
|
||||
if len(body) > 0 {
|
||||
c.Write(body)
|
||||
}
|
||||
}
|
||||
|
||||
// writeRaw writes a minimal HTTP/1.1 response onto a (TLS) conn.
|
||||
func writeRaw(c io.Writer, code int, status string, headers map[string]string, body []byte) {
|
||||
if status == "" {
|
||||
status = http.StatusText(code) // reason phrase matching the code, not a blanket "OK"
|
||||
}
|
||||
fmt.Fprintf(c, "HTTP/1.1 %d %s\r\n", code, status)
|
||||
fmt.Fprintf(c, "Content-Length: %d\r\n", len(body))
|
||||
fmt.Fprintf(c, "Connection: close\r\n")
|
||||
for k, v := range headers {
|
||||
if v != "" {
|
||||
fmt.Fprintf(c, "%s: %s\r\n", k, v)
|
||||
}
|
||||
}
|
||||
io.WriteString(c, "\r\n")
|
||||
if len(body) > 0 {
|
||||
c.Write(body)
|
||||
}
|
||||
}
|
||||
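`writeResponse` ranges over `resp.Header` directly, so header order varies between runs and write errors are dropped. A possible refinement is sketched below: sorted keys, one line per value so repeated `Set-Cookie` entries survive, and the first write error returned. This is illustrative only; `writeHeaderLines`, `render` and `demoHeaders` are hypothetical names, not part of `util.go`.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"sort"
)

// writeHeaderLines (hypothetical) emits every value of every header on its
// own line, in sorted key order, skipping the framing headers the caller
// sets itself.
func writeHeaderLines(w io.Writer, h http.Header) error {
	keys := make([]string, 0, len(h))
	for k := range h {
		switch http.CanonicalHeaderKey(k) {
		case "Content-Length", "Transfer-Encoding", "Connection":
			continue
		}
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		for _, v := range h[k] {
			if _, err := fmt.Fprintf(w, "%s: %s\r\n", k, v); err != nil {
				return err
			}
		}
	}
	return nil
}

// render returns the header block as a string for inspection.
func render(h http.Header) string {
	var buf bytes.Buffer
	_ = writeHeaderLines(&buf, h)
	return buf.String()
}

// demoHeaders builds a header set with a repeated Set-Cookie and a stale
// Content-Length that must be dropped.
func demoHeaders() http.Header {
	h := http.Header{}
	h.Add("Set-Cookie", "a=1")
	h.Add("Set-Cookie", "b=2")
	h.Set("Content-Length", "999")
	h.Set("Content-Type", "text/html")
	return h
}

func main() {
	fmt.Printf("%q\n", render(demoHeaders()))
}
```

Deterministic ordering mainly helps when diffing captured responses between the Go and Python engines; it does not change what a browser sees.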
48
packages/secubox-toolbox-ng/debian/changelog
Normal file
|
|
@ -0,0 +1,48 @@
|
|||
secubox-toolbox-ng (0.1.4-1~bookworm1) bookworm; urgency=medium
|
||||
|
||||
* proxy: do NOT follow upstream redirects — relay 3xx to the client so the
|
||||
browser follows it (correct URL/origin/cookies). Go's default http.Client
|
||||
followed them, collapsing 301/302 into a final 200 under the original URL.
|
||||
(ref #662)
|
||||
|
||||
-- Gerald KERMA <devel@cybermind.fr> Thu, 18 Jun 2026 20:10:00 +0000
|
||||
|
||||
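The 0.1.4 entry relies on a standard `net/http` hook: a `CheckRedirect` function that returns `http.ErrUseLastResponse` makes the client hand back the 3xx response instead of following it. A minimal sketch against a throwaway test server (`newRelayClient` and `probe` are hypothetical names; how sbxmitm actually wires its upstream client is not shown in this diff):

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newRelayClient (hypothetical) returns a client that never follows
// redirects, so the caller can relay the 3xx to the browser.
func newRelayClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// probe starts a local server that redirects /old to /new and reports what
// the relay client observes for /old.
func probe() (int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/old" {
			http.Redirect(w, r, "/new", http.StatusFound)
			return
		}
		fmt.Fprint(w, "final")
	}))
	defer srv.Close()

	resp, err := newRelayClient().Get(srv.URL + "/old")
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	return resp.StatusCode, resp.Header.Get("Location"), nil
}

func main() {
	fmt.Println(probe())
}
```

With the default client the same request would come back as the final 200, which is the collapse the changelog entry describes.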
secubox-toolbox-ng (0.1.3-1~bookworm1) bookworm; urgency=medium
|
||||
|
||||
* banner: inject into COMPRESSED HTML too. Pin upstream Accept-Encoding to gzip
|
||||
(stdlib can't brotli), and in the inject path gunzip → injectLoader → re-gzip
|
||||
(32MiB inflate cap, fail-open on corrupt). Fixes missing banner on the common
|
||||
gzip/br case; non-HTML passes through untouched. (ref #662)
|
||||
|
||||
-- Gerald KERMA <devel@cybermind.fr> Thu, 18 Jun 2026 19:45:00 +0000
|
||||
|
||||
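The gunzip, inject, re-gzip path from the 0.1.3 entry can be sketched with the standard library alone. This is an illustration under assumptions: `injectGzip`, `addMarker`, `gz` and `gunz` are hypothetical names, and `addMarker` merely stands in for `injectLoader`, whose real behaviour is not shown here. Only the 32 MiB cap and the fail-open rule come from the changelog.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

const maxInflate = 32 << 20 // 32 MiB inflate cap

// injectGzip (hypothetical) inflates body, applies inject, and re-compresses.
// On any error, or if the inflated size exceeds the cap, it fails open by
// returning the original bytes and false.
func injectGzip(body []byte, inject func([]byte) []byte) ([]byte, bool) {
	zr, err := gzip.NewReader(bytes.NewReader(body))
	if err != nil {
		return body, false
	}
	defer zr.Close()
	// Read one byte past the cap so an oversized stream is detectable.
	plain, err := io.ReadAll(io.LimitReader(zr, maxInflate+1))
	if err != nil || len(plain) > maxInflate {
		return body, false
	}
	var out bytes.Buffer
	zw := gzip.NewWriter(&out)
	if _, err := zw.Write(inject(plain)); err != nil {
		return body, false
	}
	if err := zw.Close(); err != nil {
		return body, false
	}
	return out.Bytes(), true
}

// addMarker is a placeholder transform, not the real injectLoader.
func addMarker(html []byte) []byte {
	return bytes.Replace(html, []byte("</body>"), []byte("<!--banner--></body>"), 1)
}

// gz and gunz are small helpers for the demonstration.
func gz(s string) []byte {
	var b bytes.Buffer
	w := gzip.NewWriter(&b)
	w.Write([]byte(s))
	w.Close()
	return b.Bytes()
}

func gunz(b []byte) string {
	r, err := gzip.NewReader(bytes.NewReader(b))
	if err != nil {
		return ""
	}
	plain, _ := io.ReadAll(r)
	return string(plain)
}

func main() {
	out, ok := injectGzip(gz("<html><body>hi</body></html>"), addMarker)
	fmt.Println(ok, gunz(out))
}
```

Reading through `io.LimitReader` bounds memory before the size is known, which is what keeps a decompression bomb from exhausting the worker's `MemoryMax`.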
secubox-toolbox-ng (0.1.2-1~bookworm1) bookworm; urgency=medium
|
||||
|
||||
* banner: port the real transparency-banner inject — inject the loader
|
||||
<script src="/__toolbox/loader.js" data-mh=.. data-wg=..> (guard-idempotent,
|
||||
R3 wg flag, mac_hash identity) and reverse-proxy /__toolbox/loader.js +
|
||||
/__toolbox/bundle to the portal (127.0.0.1:8088), replacing the invisible
|
||||
marker comment. Fail-open to 204. (ref #662)
|
||||
|
||||
-- Gerald KERMA <devel@cybermind.fr> Thu, 18 Jun 2026 19:20:00 +0000
|
||||
|
||||
secubox-toolbox-ng (0.1.1-1~bookworm1) bookworm; urgency=medium
|
||||
|
||||
* worker@ unit: forge with the LIVE R3 CA clients trust (mitmproxy confdir
|
||||
bundle, group-readable) instead of the root-only ca-wg WG-CA key; bind
|
||||
transparent on 10.99.1.1:809%i (the nft R3 DNAT target) instead of CONNECT
|
||||
on 127.0.0.1; add wg-quick@wg-toolbox dependency. (ref #662)
|
||||
* loadCA: scan PEM blocks by type so a combined cert+key bundle
|
||||
(mitmproxy-ca.pem) is accepted for --ca-key. (ref #662)
|
||||
|
||||
-- Gerald KERMA <devel@cybermind.fr> Thu, 18 Jun 2026 19:00:00 +0000
|
||||
|
||||
secubox-toolbox-ng (0.1.0-1~bookworm1) bookworm; urgency=medium
|
||||
|
||||
* Initial packaging of the Go MITM engine migration target (#662 Phase 5-prep).
|
||||
Ships /usr/sbin/sbxmitm + a DISABLED systemd template unit
|
||||
(secubox-toolbox-ng-worker@.service). DARK by design: the unit is not
|
||||
enabled or started, no nft DNAT, no live-R3 wiring — enabled only at the
|
||||
Phase 6 cutover.
|
||||
|
||||
-- Gerald KERMA <devel@cybermind.fr> Thu, 18 Jun 2026 22:00:00 +0200
|
||||
22
packages/secubox-toolbox-ng/debian/control
Normal file
|
|
@ -0,0 +1,22 @@
|
|||
Source: secubox-toolbox-ng
|
||||
Section: net
|
||||
Priority: optional
|
||||
Maintainer: Gerald KERMA <devel@cybermind.fr>
|
||||
Build-Depends: debhelper-compat (= 13), golang-go (>= 2:1.22~)
|
||||
Standards-Version: 4.6.2
|
||||
Homepage: https://cybermind.fr/secubox
|
||||
Rules-Requires-Root: no
|
||||
|
||||
Package: secubox-toolbox-ng
|
||||
Architecture: arm64
|
||||
Depends: ${misc:Depends}
|
||||
Description: SecuBox-Deb — Go MITM engine (migration target, DARK)
|
||||
Multi-core Go re-implementation of the R3 toolbox MITM engine (#662),
|
||||
ported off the GIL-bound Python mitmproxy worker fleet. Ships the
|
||||
standalone sbxmitm binary plus a DISABLED systemd template unit.
|
||||
.
|
||||
This package is the Phase-6-cutover migration target. The unit is NOT
|
||||
enabled or started by the maintainer scripts — the live R3 tunnel keeps
|
||||
running on the Python workers until the cutover is performed manually.
|
||||
Installing this package changes NO runtime behaviour (no service start,
|
||||
no nft DNAT).
|
||||
27
packages/secubox-toolbox-ng/debian/postinst
Executable file
|
|
@ -0,0 +1,27 @@
|
|||
#!/bin/sh
|
||||
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
# SecuBox-Deb :: toolbox-ng — postinst
|
||||
#
|
||||
# DARK by design (#662 Phase 5-prep):
|
||||
# - DO reload the systemd unit catalogue so the template is known.
|
||||
# - DO NOT enable or start secubox-toolbox-ng-worker@.service — this is the
|
||||
# Phase-6 cutover target; the live R3 tunnel keeps running on the Python
|
||||
# workers until the operator performs the cutover manually.
|
||||
# - DO NOT touch nftables (no DNAT, no live-R3 rewiring).
|
||||
set -e
|
||||
|
||||
case "$1" in
|
||||
configure)
|
||||
if [ -d /run/systemd/system ]; then
|
||||
systemctl daemon-reload >/dev/null 2>&1 || true
|
||||
fi
|
||||
# Intentionally NO `systemctl enable --now`. See the unit header and
|
||||
# debian/changelog: enabled only at the Phase 6 cutover.
|
||||
;;
|
||||
abort-upgrade|abort-remove|abort-deconfigure)
|
||||
;;
|
||||
esac
|
||||
|
||||
#DEBHELPER#
|
||||
|
||||
exit 0
|
||||
44
packages/secubox-toolbox-ng/debian/rules
Executable file
|
|
@ -0,0 +1,44 @@
|
|||
#!/usr/bin/make -f
|
||||
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
# SecuBox-Deb :: toolbox-ng — Go MITM engine (migration target, DARK)
|
||||
#
|
||||
# The binary is pure-stdlib (no go.sum, no external modules), so it
|
||||
# cross-compiles offline with GOPROXY=off. CI cross-builds for arm64;
|
||||
# this rules file does the same with `GOOS=linux GOARCH=arm64 go build`.
|
||||
|
||||
export DH_VERBOSE = 1
|
||||
|
||||
# Build the static arm64 binary offline (stdlib only — no network, no go.sum).
|
||||
export GOOS = linux
|
||||
export GOARCH = arm64
|
||||
export CGO_ENABLED = 0
|
||||
export GOFLAGS = -mod=mod
|
||||
export GOPROXY = off
|
||||
# Keep the Go build/module cache inside the build tree (sandbox-friendly).
|
||||
export GOCACHE = $(CURDIR)/_gocache
|
||||
export GOPATH = $(CURDIR)/_gopath
|
||||
|
||||
%:
|
||||
dh $@
|
||||
|
||||
override_dh_auto_build:
|
||||
go build -trimpath -ldflags=-s -o sbxmitm ./cmd/sbxmitm
|
||||
|
||||
# No Go unit tests at package-build time (run in CI on the host arch; the
|
||||
# arm64 cross-binary cannot execute its tests here).
|
||||
override_dh_auto_test:
|
||||
|
||||
override_dh_auto_install:
|
||||
install -d debian/secubox-toolbox-ng/usr/sbin
|
||||
install -m 0755 sbxmitm debian/secubox-toolbox-ng/usr/sbin/sbxmitm
|
||||
|
||||
override_dh_auto_clean:
|
||||
rm -f sbxmitm
|
||||
rm -rf _gocache _gopath
|
||||
|
||||
# DARK: install the unit file into the catalogue but DO NOT enable or start it.
|
||||
# This is the Phase-6 cutover target; the live R3 tunnel stays on the Python
|
||||
# workers until the operator enables it manually. The postinst still reloads the
|
||||
# unit catalogue so `systemctl` knows the template exists.
|
||||
override_dh_installsystemd:
|
||||
dh_installsystemd --no-enable --no-start --name=secubox-toolbox-ng-worker@
|
||||
|
|
@ -0,0 +1,66 @@
|
|||
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
|
||||
# SecuBox-Deb :: toolbox-ng — Go MITM engine worker template (#662)
|
||||
#
|
||||
# ── DISABLED BY DESIGN (DARK) ────────────────────────────────────────────────
|
||||
# This is the Phase-6 CUTOVER MIGRATION TARGET. It is NOT enabled or started by
|
||||
# the package (postinst does not `systemctl enable --now`). The live R3 tunnel
|
||||
# keeps running on the Python mitmproxy workers
|
||||
# (secubox-toolbox-mitm-wg-worker@{1..4}, ports 8081-8084) until the cutover is
|
||||
# performed manually.
|
||||
#
|
||||
# Mirrors the Python worker@ fanout: each %i ∈ {1..4} listens TRANSPARENT on
|
||||
# 10.99.1.1:809%i — the SAME wg-toolbox interface IP the nft R3 DNAT targets
|
||||
# (`iif wg-toolbox tcp dport 443/80 → 10.99.1.1:numgen inc mod 4 → 808{1..4}`),
|
||||
# on 809%i ports so the Go and Python fleets coexist during a side-by-side
|
||||
# canary. The engine recovers the original destination via SO_ORIGINAL_DST
|
||||
# (works for this non-root user under NoNewPrivileges, same as mitmdump).
|
||||
#
|
||||
# Forges with the LIVE R3 CA clients already trust — mitmproxy's confdir bundle
|
||||
# (CN "Gondwana ToolBoX R3 CA"), group-readable by secubox-toolbox — NOT the
|
||||
# root-only ca-wg key.pem (CN "WG CA"), which clients do NOT trust.
|
||||
#
|
||||
# Enable ONLY at Phase 6 canary:
|
||||
#
|
||||
# systemctl enable --now secubox-toolbox-ng-worker@1.service # one slot first
|
||||
# # canary: nft ... map { ... 3 : 8091 } (was 3:8084), watch, then widen
|
||||
#
|
||||
# Rollback: re-point the nft DNAT map slot back at the Python 808%i worker,
|
||||
# then disable this unit.
|
||||
|
||||
[Unit]
|
||||
Description=SecuBox ToolBoX-NG Go MITM worker %i (migration target, transparent 10.99.1.1:809%i)
|
||||
Documentation=https://github.com/CyberMind-FR/secubox-deb/issues/662
|
||||
After=network.target wg-quick@wg-toolbox.service
|
||||
Wants=wg-quick@wg-toolbox.service
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
User=secubox-toolbox
|
||||
Group=secubox-toolbox
|
||||
|
||||
# Forge with the LIVE R3 CA the clients trust: cert = mitmproxy-ca-cert.pem,
|
||||
# key = mitmproxy-ca.pem (a combined cert+key bundle — loadCA scans for the
|
||||
# PRIVATE KEY block). Both are group-readable by secubox-toolbox. The anti-track
|
||||
# jar key is best-effort: absent → poison stays off.
|
||||
ExecStart=/usr/sbin/sbxmitm \
|
||||
--transparent \
|
||||
--listen 10.99.1.1:809%i \
|
||||
--ca-cert /etc/secubox/toolbox/ca-wg/mitmproxy-ca-cert.pem \
|
||||
--ca-key /etc/secubox/toolbox/ca-wg/mitmproxy-ca.pem \
|
||||
--jar-key /etc/secubox/secrets/privacy-jar.key
|
||||
|
||||
Restart=on-failure
|
||||
RestartSec=5
|
||||
|
||||
# Hardening (mirrors the Python worker envelope).
|
||||
NoNewPrivileges=yes
|
||||
ProtectSystem=strict
|
||||
ProtectHome=yes
|
||||
PrivateTmp=yes
|
||||
ReadOnlyPaths=/etc/secubox
|
||||
MemoryHigh=100M
|
||||
MemoryMax=128M
|
||||
TasksMax=128
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
1
packages/secubox-toolbox-ng/debian/source/format
Normal file
|
|
@ -0,0 +1 @@
|
|||
3.0 (native)
|
||||
3
packages/secubox-toolbox-ng/go.mod
Normal file
|
|
@ -0,0 +1,3 @@
|
|||
module github.com/CyberMind-FR/secubox-deb/secubox-toolbox-ng
|
||||
|
||||
go 1.22
|
||||
4
packages/secubox-toolbox-ng/testdata/config/ad-allowlist.txt
vendored
Normal file
|
|
@ -0,0 +1,4 @@
|
|||
# SecuBox toolbox-ng parity fixture: operator ad-allowlist.
|
||||
# Allowlist ALWAYS wins (never block, never splice, never record).
|
||||
analytics.example-allowed.com # an allowlisted host
|
||||
criteo-but-allowed.example # would-be-ad registrable, but allowlisted
|
||||
3
packages/secubox-toolbox-ng/testdata/config/learned-trackers.txt
vendored
Normal file
|
|
@ -0,0 +1,3 @@
|
|||
learned-tracker.example
|
||||
pure-tracker.example
|
||||
commented-learned.example # inline comment — _learned_set keeps the FULL line, not comment-stripped
|
||||
3
packages/secubox-toolbox-ng/testdata/config/pure-trackers.txt
vendored
Normal file
|
|
@ -0,0 +1,3 @@
|
|||
# SecuBox toolbox-ng parity fixture: pure trackers — the splice never-set.
|
||||
# A host here is NEVER spliced even if it's a splice-seed/learned candidate.
|
||||
pure-tracker.example # pure tracker AND in splice-learned → never wins → block
|
||||
3
packages/secubox-toolbox-ng/testdata/config/splice-learned.txt
vendored
Normal file
|
|
@ -0,0 +1,3 @@
|
|||
# SecuBox toolbox-ng parity fixture: auto-learned splice (never-HTML) hosts.
|
||||
assets.example-cdn.com # a splice-learned host
|
||||
pure-tracker.example # ALSO in pure-trackers (never) → never wins → not spliced
|
||||
3
packages/secubox-toolbox-ng/testdata/config/tls-splice-seed.conf
vendored
Normal file
|
|
@ -0,0 +1,3 @@
|
|||
# SecuBox toolbox-ng parity fixture: shipped splice seed (pure-asset CDNs).
|
||||
googlevideo.com # YouTube video streams
|
||||
fbcdn.net # Facebook / Instagram media
|
||||
84
packages/secubox-toolbox-ng/testdata/jar-fixtures.json
vendored
Normal file
|
|
@ -0,0 +1,84 @@
|
|||
{
|
||||
"_doc": "Cross-engine JAR (anti-track HMAC fake-identity) parity fixtures (#662 Phase 4). Go core (jar_test.go) and Python (privacy.fake_id via tests/test_jar_parity.py) load THIS file + the fixed test key file (jar-test.key, NOT the real /etc/secubox/secrets/privacy-jar.key), compute fakeID/fake_id per fixture, and MUST agree. Python is the source of truth; expect values are GENERATED by privacy.fake_id (never hand-computed). The key file carries leading/trailing whitespace to exercise .strip()/TrimSpace; key_hex below is the canonical post-strip key.",
|
||||
"key_file": "jar-test.key",
|
||||
"key_hex": "53656375426f780a546573744a61724b65795631aabbccddeeff0011deadbe7f",
|
||||
"fixtures": [
|
||||
{
|
||||
"client": "clientAAA",
|
||||
"tracker": "google-analytics.com",
|
||||
"cookie_name": "_ga",
|
||||
"expect": "GA1.2.3904711466.3108239649",
|
||||
"why": "_ga cookie -> GA1 shape"
|
||||
},
|
||||
{
|
||||
"client": "clientAAA",
|
||||
"tracker": "google-analytics.com",
|
||||
"cookie_name": "_ga_ABC123",
|
||||
"expect": "GA1.2.5796600959.265364931",
|
||||
"why": "GA4 per-property -> still GA1 shape (startswith _ga)"
|
||||
},
|
||||
{
|
||||
"client": "clientAAA",
|
||||
"tracker": "connect.facebook.net",
|
||||
"cookie_name": "_fbp",
|
||||
"expect": "fb.1.6011068296128.8272063998",
|
||||
"why": "_fbp -> fb shape"
|
||||
},
|
||||
{
|
||||
"client": "clientAAA",
|
||||
"tracker": "tracker.example.com",
|
||||
"cookie_name": "uuid",
|
||||
"expect": "a357739e-e6e8-020e-c9ee-cb92950d1a71",
|
||||
"why": "uuid -> uuid shape"
|
||||
},
|
||||
{
|
||||
"client": "clientAAA",
|
||||
"tracker": "matomo.example.com",
|
||||
"cookie_name": "_pk_id",
|
||||
"expect": "7be228ae-3261-d609-1cec-dc0dc05a8abf",
|
||||
"why": "_pk_id -> uuid shape"
|
||||
},
|
||||
{
|
||||
"client": "clientAAA",
|
||||
"tracker": "tracker.example.com",
|
||||
"cookie_name": "abcdefghijklmnopqrstuvwxyz012345",
|
||||
"expect": "416e7233-dfb8-ec7f-a2fe-45ed5dbdcaf4",
|
||||
"why": "name >=32 chars -> uuid shape via len branch"
|
||||
},
|
||||
{
|
||||
"client": "clientAAA",
|
||||
"tracker": "tracker.example.com",
|
||||
"cookie_name": "sid",
|
||||
"expect": "5cb0940c4562a4f76cf638e40ff552af",
|
||||
"why": "generic -> hex[:32]"
|
||||
},
|
||||
{
|
||||
"client": "clientFold",
|
||||
"tracker": "px.doubleclick.net",
|
||||
"cookie_name": "uid",
|
||||
"expect": "c1b6daf8-7ac1-edf6-c67b-3e23ec8eb61d",
|
||||
"why": "registrable folding A (px.doubleclick.net)"
|
||||
},
|
||||
{
|
||||
"client": "clientFold",
|
||||
"tracker": "ads.doubleclick.net",
|
||||
"cookie_name": "uid",
|
||||
"expect": "c1b6daf8-7ac1-edf6-c67b-3e23ec8eb61d",
|
||||
"why": "registrable folding B (ads.doubleclick.net) -> SAME fake_id as A"
|
||||
},
|
||||
{
|
||||
"client": "clientGovuk",
|
||||
"tracker": "ad.example.gov.uk",
|
||||
"cookie_name": "uid",
|
||||
"expect": "75cc2df5-1ee2-da62-9023-aa11c57419af",
|
||||
"why": "DIVERGENCE GUARD: privacy.registrable=example.gov.uk (gov.uk in privacy._MULTI_TLD); ad_ghost._2L lacks gov.uk so policy.registrable would give gov.uk -> forces the jar to use registrableJar"
|
||||
},
|
||||
{
|
||||
"client": "clientIP",
|
||||
"tracker": "9.9.9.9",
|
||||
"cookie_name": "sid",
|
||||
"expect": "53bf4dd57df7a26d6eff83092c869835",
|
||||
"why": "DIVERGENCE GUARD: IP-literal tracker -> privacy.registrable returns as-is (ad_ghost._registrable returns None) -> forces registrableJar"
|
||||
}
|
||||
]
|
||||
}
|
||||
BIN
packages/secubox-toolbox-ng/testdata/jar-test.key
vendored
Normal file
Binary file not shown.
36
packages/secubox-toolbox-ng/testdata/machash-fixtures.json
vendored
Normal file
|
|
@ -0,0 +1,36 @@
|
|||
{
|
||||
"_doc": "Cross-engine mac_hash (WG persona identity) parity fixtures (#662 Phase 6 prep). Go core (machash_test.go, macHashOf with wgPeersPath pointed at wg-peers-fixture.json) and Python (_common.mac_hash_of with _WG_PEERS_DB monkeypatched to the SAME wg-peers-fixture.json) load THIS file and MUST agree. Python is the source of truth: expected = sha256(pubkey.encode()).hexdigest()[:16], generated by Python, never Go-authored. The R0-R2 ARP/HMAC path is intentionally out of scope for the R3 transparent engine (WG-only); off-subnet IPs expect empty.",
|
||||
"wg_peers_file": "wg-peers-fixture.json",
|
||||
"fixtures": [
|
||||
{
|
||||
"ip": "10.99.1.10",
|
||||
"expected": "7d790156855ebeef",
|
||||
"why": "WG peer phone-gk2 -> sha256(pubkey)[:16]"
|
||||
},
|
||||
{
|
||||
"ip": "10.99.1.11",
|
||||
"expected": "6f3663aa06e871c4",
|
||||
"why": "WG peer laptop-admin -> sha256(pubkey)[:16]"
|
||||
},
|
||||
{
|
||||
"ip": "10.99.1.12",
|
||||
"expected": "1db566f7c72180f0",
|
||||
"why": "WG peer tablet-lab -> sha256(pubkey)[:16]"
|
||||
},
|
||||
{
|
||||
"ip": "10.99.1.250",
|
||||
"expected": "",
|
||||
"why": "WG subnet but no peer entry -> empty"
|
||||
},
|
||||
{
|
||||
"ip": "192.168.1.5",
|
||||
"expected": "",
|
||||
"why": "off-subnet (R0-R2 ARP path out of scope in R3) -> empty"
|
||||
},
|
||||
{
|
||||
"ip": "",
|
||||
"expected": "",
|
||||
"why": "empty ip -> empty"
|
||||
}
|
||||
]
|
||||
}
|
||||
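The fixture `_doc` gives the identity rule as `sha256(pubkey.encode()).hexdigest()[:16]`, looked up by the peer's WireGuard IP. A minimal Go sketch of that rule follows. It assumes the peers map has already been decoded from a file shaped like `wg-peers-fixture.json`; `peer` and `macHashSketch` are hypothetical names, not the project's `macHashOf`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// peer mirrors one entry of the "peers" object, keyed by public key.
type peer struct {
	IP   string
	Name string
}

// macHashSketch (hypothetical) returns the first 16 hex characters of
// sha256(pubkey) for the peer that owns ip, or "" when no peer matches.
func macHashSketch(ip string, peers map[string]peer) string {
	if ip == "" {
		return ""
	}
	for pubkey, p := range peers {
		if p.IP == ip {
			sum := sha256.Sum256([]byte(pubkey))
			return hex.EncodeToString(sum[:])[:16]
		}
	}
	return ""
}

func main() {
	peers := map[string]peer{
		"aL3kF2pQ9rZxT7vN1wB4cD6eH8jM0sU2yX5zA7bC1E=": {IP: "10.99.1.10", Name: "phone-gk2"},
	}
	fmt.Println(macHashSketch("10.99.1.10", peers))
	fmt.Println(macHashSketch("192.168.1.5", peers) == "")
}
```

The hash is taken over the base64 text of the key, not the decoded key bytes, which matches the Python expression quoted in the fixture.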
31
packages/secubox-toolbox-ng/testdata/parity-fixtures.json
vendored
Normal file
|
|
@ -0,0 +1,31 @@
|
|||
{
|
||||
"_doc": "Cross-engine parity fixtures (#662 Phase 3). Both the Go core (policy_test.go) and the Python addons (tests/test_engine_parity.py) load THIS file plus the testdata/config snapshot, run their Decide logic on each host, and must agree. Python is the source of truth; Go matches it. action ∈ {allow, block, splice, mitm}.",
|
||||
"config": {
|
||||
"ad_allowlist": "config/ad-allowlist.txt",
|
||||
"learned_trackers": "config/learned-trackers.txt",
|
||||
"splice_seed": "config/tls-splice-seed.conf",
|
||||
"splice_learned": "config/splice-learned.txt",
|
||||
"pure_trackers": "config/pure-trackers.txt",
|
||||
"self_domains": ["secubox.in"],
|
||||
"fortknox_sites": ["mybank.example"]
|
||||
},
|
||||
"fixtures": [
|
||||
{"host": "ads.doubleclick.net", "expect": "block", "why": "static ad host (_AD_HOST dotted-prefix doubleclick)"},
|
||||
{"host": "doubleclick.net", "expect": "block", "why": "static ad host (_AD_HOST bare)"},
|
||||
{"host": "criteo.com", "expect": "block", "why": "static ad host (_AD_HOST criteo)"},
|
||||
{"host": "learned-tracker.example", "expect": "block", "why": "auto-learned tracker (learned-trackers.txt)"},
|
||||
{"host": "pure-tracker.example", "expect": "block", "why": "pure-tracker + splice-learned: never wins (no splice) → falls to block (also learned)"},
|
||||
{"host": "hub.secubox.in", "expect": "allow", "why": "own-infra subdomain (self_domains) — never block/splice"},
|
||||
{"host": "secubox.in", "expect": "allow", "why": "own-infra apex"},
|
||||
{"host": "analytics.example-allowed.com", "expect": "allow", "why": "operator allowlisted host"},
|
||||
{"host": "criteo-but-allowed.example", "expect": "allow", "why": "would-be-ad registrable but allowlisted → allowlist wins"},
|
||||
{"host": "r1.googlevideo.com", "expect": "splice", "why": "splice seed subdomain (CDN shard)"},
|
||||
{"host": "googlevideo.com", "expect": "splice", "why": "splice seed exact"},
|
||||
{"host": "assets.example-cdn.com", "expect": "splice", "why": "splice-learned host"},
|
||||
{"host": "mybank.example", "expect": "mitm", "why": "fortknox site in never-set; not in seed/learned → no splice; not ad/learned → mitm"},
|
||||
{"host": "notdoubleclick.net", "expect": "mitm", "why": "no-false-suffix negative — _AD_HOST requires (^|.) boundary"},
|
||||
{"host": "news.example.com", "expect": "mitm", "why": "plain site"},
|
||||
{"host": "notsecubox.in", "expect": "mitm", "why": "own-infra FALSE-prefix negative — must NOT match self_domains"},
|
||||
{"host": "commented-learned.example", "expect": "mitm", "why": "learned-trackers NOT comment-stripped (_learned_set keeps full line incl ' # ...'); bare host not in set → not blocked. Discriminates loadLinesRaw vs loadLines"}
|
||||
]
|
||||
}
|
||||
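The fixtures above encode a fixed precedence: allow first, then splice unless the host is in the never-set, then block, otherwise mitm. The sketch below reproduces only that ordering and the dotted-boundary match. It is a simplification under stated assumptions: `policy`, `matches`, `decide` and `demoPolicy` are hypothetical, and the real engines use a regex for static ad hosts and registrable-domain folding that are not modelled here.

```go
package main

import (
	"fmt"
	"strings"
)

// policy (hypothetical) holds four plain domain lists.
type policy struct {
	allow, never, splice, block []string
}

// matches reports whether host equals a listed domain or is a subdomain of
// one. The "." boundary stops notsecubox.in from matching secubox.in.
func matches(host string, domains []string) bool {
	for _, d := range domains {
		if host == d || strings.HasSuffix(host, "."+d) {
			return true
		}
	}
	return false
}

// decide applies the precedence: allow > splice (unless never) > block > mitm.
func (p policy) decide(host string) string {
	host = strings.ToLower(host)
	switch {
	case matches(host, p.allow):
		return "allow"
	case !matches(host, p.never) && matches(host, p.splice):
		return "splice"
	case matches(host, p.block):
		return "block"
	}
	return "mitm"
}

// demoPolicy loosely mirrors the testdata/config snapshot.
func demoPolicy() policy {
	return policy{
		allow:  []string{"secubox.in", "analytics.example-allowed.com"},
		never:  []string{"pure-tracker.example", "mybank.example"},
		splice: []string{"googlevideo.com", "assets.example-cdn.com", "pure-tracker.example"},
		block:  []string{"doubleclick.net", "criteo.com", "learned-tracker.example", "pure-tracker.example"},
	}
}

func main() {
	p := demoPolicy()
	for _, h := range []string{"hub.secubox.in", "r1.googlevideo.com", "pure-tracker.example", "notdoubleclick.net"} {
		fmt.Println(h, p.decide(h))
	}
}
```

Checking the never-set inside the splice branch, rather than as a separate early return, is what lets `pure-tracker.example` fall through to the block rule instead of ending up as mitm.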
16
packages/secubox-toolbox-ng/testdata/wg-peers-fixture.json
vendored
Normal file
|
|
@ -0,0 +1,16 @@
|
|||
{
|
||||
"peers": {
|
||||
"aL3kF2pQ9rZxT7vN1wB4cD6eH8jM0sU2yX5zA7bC1E=": {
|
||||
"ip": "10.99.1.10",
|
||||
"name": "phone-gk2"
|
||||
},
|
||||
"bM4lG3qR0sAyU8wO2xC5dE7fI9kN1tV3zY6aB8cD2F=": {
|
||||
"ip": "10.99.1.11",
|
||||
"name": "laptop-admin"
|
||||
},
|
||||
"cN5mH4rS1tBzV9xP3yD6eF8gJ0lO2uW4aZ7bC9dE3G=": {
|
||||
"ip": "10.99.1.12",
|
||||
"name": "tablet-lab"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
@ -3,7 +3,12 @@
|
|||
 #
 # REPLACES the prerouting rules from secubox-toolbox-wg.nft :
 # iif wg-toolbox tcp dport 443 dnat ip to 10.99.1.1:8081 (single port)
-# with a round-robin numgen mapping to ports 8081..8084.
+# with a round-robin numgen mapping to ports 8091..8094.
 #
+# #662 CUTOVER (2026-06-18): the fanout now targets the Go MITM engine
+# (secubox-toolbox-ng-worker@{1..4}, transparent on 10.99.1.1:809%i) instead
+# of the Python mitmproxy workers (808%i). Rollback = change 809x → 808x below
+# and `nft -f` this file (the Python workers are kept warm for that).
+#
 # Why numgen inc and not jhash : nftables 1.0.6 (Debian bookworm) doesn't
 # support `jhash` in numgen yet (lands in 1.0.7+). `inc` is round-robin
|
||||
|
|
@ -25,19 +30,20 @@ table inet wg-toolbox {
|
|||
# Phase 9 (#501) — 4-worker round-robin DNAT. numgen returns
|
||||
# 0..3 ; the map sends each to one of the 4 worker ports on
|
||||
# 10.99.1.1. Conntrack pins the choice for the whole flow.
|
||||
+# #662: ports are 809x (Go engine), was 808x (Python).
 iif "wg-toolbox" tcp dport 443 dnat ip to 10.99.1.1 \
 : numgen inc mod 4 map {
-0 : 8081,
-1 : 8082,
-2 : 8083,
-3 : 8084
+0 : 8091,
+1 : 8092,
+2 : 8093,
+3 : 8094
 }
 iif "wg-toolbox" tcp dport 80 dnat ip to 10.99.1.1 \
 : numgen inc mod 4 map {
-0 : 8081,
-1 : 8082,
-2 : 8083,
-3 : 8084
+0 : 8091,
+1 : 8092,
+2 : 8093,
+3 : 8094
 }
|
||||
|
||||
# Phase 7 (#498) — DNS DNAT for legacy peer configs that hand out
|
||||
|
|
|
|||
125
packages/secubox-toolbox/tests/test_engine_parity.py
Normal file
|
|
@ -0,0 +1,125 @@
|
|||
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
# Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
"""Cross-engine parity harness — Python side (#662 Phase 3).

Loads the SAME ``parity-fixtures.json`` and ``testdata/config`` snapshot the Go
core uses (``../secubox-toolbox-ng/testdata``), drives the production Python
decision logic — ``ad_ghost._allowed`` + ``_AD_HOST`` + the learned-trackers
check, composed with ``splice.should_splice`` — under the SAME precedence as
Go's ``Policy.Decide``, and asserts the action == the fixture's ``expect``.

Python is the source of truth: if Go and Python ever diverge on a fixture, Go
is fixed to match this. Both test files reading the identical inputs is what
makes the parity meaningful.
"""
from __future__ import annotations

import json
import os

import pytest

from mitmproxy_addons import ad_ghost
from secubox_toolbox import splice

_HERE = os.path.dirname(os.path.abspath(__file__))
# tests/ → packages/secubox-toolbox → packages → packages/secubox-toolbox-ng
_NG_TESTDATA = os.path.normpath(
    os.path.join(_HERE, "..", "..", "secubox-toolbox-ng", "testdata"))
_FIXTURES = os.path.join(_NG_TESTDATA, "parity-fixtures.json")


def _load_fixtures():
    with open(_FIXTURES, encoding="utf-8") as f:
        return json.load(f)


def _cfg_path(rel: str) -> str:
    return os.path.join(_NG_TESTDATA, rel.replace("/", os.sep))


def _decide(host: str, sni: str, *, seed, learned_splice, never,
            self_regs) -> str:
    """Mirror Go's Policy.Decide precedence EXACTLY.

    1. own-infra / allowlist (ad_ghost._allowed)                        → "allow"
    2. splice never-set check, then seed/learned (splice.should_splice) → "splice"
    3. _AD_HOST match OR registrable/host in learned-trackers           → "block"
    4. otherwise                                                        → "mitm"
    """
    # 1. allowlist + own-infra ALWAYS win first.
    if ad_ghost._allowed(host):
        return "allow"
    # 2. splice (TLS layer runs first; never-set already excludes trackers).
    if splice.should_splice(sni or host, seed, learned_splice, never):
        return "splice"
    # 3. ad_ghost block decision (request layer).
    blocked = bool(ad_ghost._AD_HOST.search(host))
    if not blocked:
        reg = ad_ghost._registrable(host)
        ls = ad_ghost._learned_set()
        if (reg and reg in ls) or host.lower() in ls:
            blocked = True
    if blocked:
        return "block"
    # 4.
    return "mitm"


@pytest.fixture
def parity_env(monkeypatch):
    """Point the Python addon decision logic at the SAME testdata snapshot the
    Go core loads, and load the splice sets the same way the addon does."""
    data = _load_fixtures()
    cfg = data["config"]

    # ad_ghost: allowlist + learned-trackers paths, self-domains, fresh caches.
    monkeypatch.setattr(ad_ghost, "_ALLOW_PATH", _cfg_path(cfg["ad_allowlist"]))
    monkeypatch.setattr(ad_ghost, "_LEARNED_PATH", _cfg_path(cfg["learned_trackers"]))
    monkeypatch.setattr(ad_ghost, "_SELF_REGS",
                        {d.strip().lower() for d in cfg["self_domains"] if d.strip()})
    # reset module-level caches so the monkeypatched paths are (re)read.
    monkeypatch.setattr(ad_ghost, "_allow", set())
    monkeypatch.setattr(ad_ghost, "_allow_mtime", 0.0)
    monkeypatch.setattr(ad_ghost, "_learned", set())
    monkeypatch.setattr(ad_ghost, "_learned_mtime", 0.0)
    monkeypatch.setattr(ad_ghost, "_learned_check", 0.0)  # bypass the 60s cache

    # splice: load seed/learned the addon way; never = pure-trackers ∪ fortknox.
    seed = splice.load_splice_seed(_cfg_path(cfg["splice_seed"]))
    learned_splice = splice.load_learned_splice(_cfg_path(cfg["splice_learned"]))
    never = splice.load_learned_splice(_cfg_path(cfg["pure_trackers"]))
    for s in cfg.get("fortknox_sites", []) or []:
        never.add(str(s).lower().strip("."))

    return {
        "fixtures": data["fixtures"],
        "seed": seed,
        "learned_splice": learned_splice,
        "never": never,
        "self_regs": ad_ghost._SELF_REGS,
    }


def test_parity_decide(parity_env):
    seed = parity_env["seed"]
    learned_splice = parity_env["learned_splice"]
    never = parity_env["never"]
    self_regs = parity_env["self_regs"]

    failures = []
    for fx in parity_env["fixtures"]:
        host = fx["host"]
        got = _decide(host, host, seed=seed, learned_splice=learned_splice,
                      never=never, self_regs=self_regs)
        if got != fx["expect"]:
            failures.append(
                f"Decide({host!r})={got!r} want {fx['expect']!r} ({fx.get('why')})")
    assert not failures, "Python↔fixture parity mismatches:\n" + "\n".join(failures)


def test_fixtures_present(parity_env):
    # Guard: the fixture set must cover every action class, else "parity" is
    # vacuously true for a missing branch.
    actions = {fx["expect"] for fx in parity_env["fixtures"]}
    assert actions == {"allow", "block", "splice", "mitm"}, actions
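The four-step precedence that `_decide` mirrors can be illustrated without the production modules. The following is a minimal, self-contained sketch only: the sets, the regex, and the naive `registrable` helper are hypothetical stand-ins, not the real `ad_ghost` or `splice` logic, and the hostnames are invented for the example.

```python
import re

# Hypothetical stand-ins for the production data sets (not from the repository).
ALLOW = {"example-bank.test"}          # step 1: allowlist / own-infra
SPLICE = {"pinned-app.test"}           # step 2: splice seed + learned
NEVER_SPLICE = {"tracker.test"}        # step 2: hosts that must never be spliced
AD_HOST = re.compile(r"(^|\.)ads\.")   # step 3: ad-host pattern (illustrative)
LEARNED_TRACKERS = {"tracker.test"}    # step 3: learned trackers


def registrable(host: str) -> str:
    """Naive registrable domain: the last two labels (assumption: no PSL lookup)."""
    parts = host.lower().strip(".").split(".")
    return ".".join(parts[-2:])


def decide(host: str) -> str:
    """Return allow / splice / block / mitm using the same ordering as _decide."""
    host = host.lower().strip(".")
    reg = registrable(host)
    # 1. The allowlist always wins first.
    if host in ALLOW or reg in ALLOW:
        return "allow"
    # 2. Splice, unless the host is in the never-splice set.
    if reg not in NEVER_SPLICE and (host in SPLICE or reg in SPLICE):
        return "splice"
    # 3. Block on an ad-host pattern match or a learned tracker.
    if AD_HOST.search(host) or reg in LEARNED_TRACKERS or host in LEARNED_TRACKERS:
        return "block"
    # 4. Everything else is intercepted.
    return "mitm"
```

The ordering matters: `ads.example-bank.test` matches the ad pattern, but step 1 returns first, so the sketch yields `allow` for it.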
97
packages/secubox-toolbox/tests/test_jar_parity.py
Normal file
@@ -0,0 +1,97 @@
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
# Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
"""Cross-engine JAR parity harness — Python side (#662 Phase 4).

Loads the SAME ``jar-fixtures.json`` + fixed test key the Go core uses
(``../secubox-toolbox-ng/testdata``), points ``privacy.JAR_KEY_PATH`` at the
test key (NOT the real ``/etc/secubox/secrets/privacy-jar.key``), resets the
jar-key cache, and asserts ``privacy.fake_id`` == each fixture's ``expect``.

Python is the source of truth: the ``expect`` values were GENERATED by this
very ``privacy.fake_id`` with the test key. The Go side (jar_test.go) must
reproduce them byte-for-byte. Both files reading identical inputs is what makes
the parity meaningful.
"""
from __future__ import annotations

import json
import os

import pytest

from secubox_toolbox import privacy

_HERE = os.path.dirname(os.path.abspath(__file__))
# tests/ → packages/secubox-toolbox → packages → packages/secubox-toolbox-ng
_NG_TESTDATA = os.path.normpath(
    os.path.join(_HERE, "..", "..", "secubox-toolbox-ng", "testdata"))
_FIXTURES = os.path.join(_NG_TESTDATA, "jar-fixtures.json")


def _load():
    with open(_FIXTURES, encoding="utf-8") as f:
        return json.load(f)


@pytest.fixture
def jar_env(monkeypatch):
    """Point privacy at the test key file and reset the cache so the override
    is (re)read. Mirrors exactly the (path, cache) surface the Go loadJarKey
    reads."""
    data = _load()
    key_path = os.path.join(_NG_TESTDATA, data["key_file"].replace("/", os.sep))
    monkeypatch.setattr(privacy, "JAR_KEY_PATH", key_path)
    monkeypatch.setattr(privacy, "_jar_key_cache", {"v": None})
    return data


def test_jar_key_loads_canonical(jar_env):
    # _jar_key() must strip the file's surrounding whitespace back to the
    # canonical key declared in key_hex (proves .strip() parity with TrimSpace).
    key = privacy._jar_key()
    assert key is not None
    assert key.hex() == jar_env["key_hex"]


def test_jar_parity(jar_env):
    failures = []
    for fx in jar_env["fixtures"]:
        got = privacy.fake_id(fx["client"], fx["tracker"], fx["cookie_name"])
        if got != fx["expect"]:
            failures.append(
                f"fake_id({fx['client']!r},{fx['tracker']!r},{fx['cookie_name']!r})"
                f"={got!r} want {fx['expect']!r} ({fx.get('why')})")
    assert not failures, "Python↔fixture jar parity mismatches:\n" + "\n".join(failures)


def test_jar_shapes_covered(jar_env):
    # Every _shape branch must appear, else parity is vacuous for that branch.
    shapes = set()
    for fx in jar_env["fixtures"]:
        e = fx["expect"]
        if e.startswith("GA1."):
            shapes.add("ga")
        elif e.startswith("fb."):
            shapes.add("fb")
        elif len(e) == 36 and e[8] == "-":
            shapes.add("uuid")
        elif len(e) == 32:
            shapes.add("hex")
    assert shapes == {"ga", "fb", "uuid", "hex"}, shapes


def test_jar_folding(jar_env):
    # Two subdomains of the same registrable tracker fold to the SAME fake id.
    a = privacy.fake_id("foldclient", "px.doubleclick.net", "uid")
    b = privacy.fake_id("foldclient", "ads.doubleclick.net", "uid")
    assert a is not None and a == b


def test_jar_none_cases(jar_env, monkeypatch):
    # fake_id returns None exactly where Go fakeID returns ("", False).
    assert privacy.fake_id("", "t.example", "uid") is None  # empty client
    assert privacy.fake_id("c", "", "uid") is None  # empty tracker
    # empty key → None. Use monkeypatch (not object.__setattr__) so the
    # override is an explicit, tracked patch that is undone at teardown.
    monkeypatch.setattr(privacy, "_jar_key_cache", {"v": b""})
    assert privacy.fake_id("c", "t.example", "uid") is None
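The shape check in `test_jar_shapes_covered` can be read as a small classifier over the fixture `expect` strings. Below is a minimal sketch of that classification as a standalone function. The name `classify_shape` and the sample values are hypothetical and only illustrate the four prefix and length rules used by the test; this is not the `_shape` logic inside `privacy`.

```python
from typing import Optional


def classify_shape(value: str) -> Optional[str]:
    """Classify a fake id by the same surface rules the coverage test applies."""
    if value.startswith("GA1."):
        return "ga"      # Google-Analytics-style id
    if value.startswith("fb."):
        return "fb"      # Facebook-style id
    if len(value) == 36 and value[8] == "-":
        return "uuid"    # 8-4-4-4-12 layout, first dash at index 8
    if len(value) == 32:
        return "hex"     # bare 32-character token
    return None          # unknown shape


# Hypothetical sample values, one per branch.
SAMPLES = {
    "GA1.2.123456789.1700000000": "ga",
    "fb.1.1700000000000.123456789": "fb",
    "123e4567-e89b-12d3-a456-426614174000": "uuid",
    "0123456789abcdef0123456789abcdef": "hex",
}
```

A coverage guard then reduces to collecting `classify_shape(fx["expect"])` over all fixtures and comparing the resulting set with the expected four shapes.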
87
packages/secubox-toolbox/tests/test_machash_parity.py
Normal file
@@ -0,0 +1,87 @@
# SPDX-License-Identifier: LicenseRef-CMSD-1.0
# Copyright (c) 2026 CyberMind — Gérald Kerma <devel@cybermind.fr>
"""Cross-engine mac_hash (WG persona identity) parity harness — Python side
(#662 Phase 6 prep).

Loads the SAME ``machash-fixtures.json`` + ``wg-peers-fixture.json`` the Go core
uses (``../secubox-toolbox-ng/testdata``), points ``_common._WG_PEERS_DB`` at the
fixture WG DB (NOT the real ``/var/lib/secubox/toolbox/wg-peers.json``), resets
the WG cache, and asserts ``_common.mac_hash_of`` == each fixture's ``expected``.

Python is the source of truth: the ``expected`` values were GENERATED by
``sha256(pubkey.encode()).hexdigest()[:16]`` (the very algorithm
``_common._wg_hash_of`` runs). The Go side (machash_test.go) must reproduce them
byte-for-byte. Both files reading identical inputs is what makes the parity
meaningful (and non-circular).
"""
from __future__ import annotations

import json
import os
from pathlib import Path

import pytest

from mitmproxy_addons import _common

_HERE = os.path.dirname(os.path.abspath(__file__))
# tests/ → packages/secubox-toolbox → packages → packages/secubox-toolbox-ng
_NG_TESTDATA = os.path.normpath(
    os.path.join(_HERE, "..", "..", "secubox-toolbox-ng", "testdata"))
_FIXTURES = os.path.join(_NG_TESTDATA, "machash-fixtures.json")


def _load():
    with open(_FIXTURES, encoding="utf-8") as f:
        return json.load(f)


@pytest.fixture
def wg_env(monkeypatch):
    """Point _common at the fixture WG DB and reset the mtime cache so the
    override is (re)read. Mirrors exactly the (path, cache, mtime) surface the
    Go wgHashOf reads (wgPeersPath + resetWGCache)."""
    data = _load()
    wg_path = os.path.join(_NG_TESTDATA, data["wg_peers_file"].replace("/", os.sep))
    monkeypatch.setattr(_common, "_WG_PEERS_DB", Path(wg_path))
    monkeypatch.setattr(_common, "_WG_PEERS_MTIME", 0.0)
    _common._WG_PEERS_CACHE.clear()
    return data


def test_machash_parity(wg_env):
    failures = []
    for fx in wg_env["fixtures"]:
        # _common returns None where Go returns ""; normalise None → "".
        got = _common.mac_hash_of(fx["ip"]) or ""
        if got != fx["expected"]:
            failures.append(
                f"mac_hash_of({fx['ip']!r})={got!r} want {fx['expected']!r}"
                f" ({fx.get('why')})")
    assert not failures, "Python↔fixture mac_hash parity mismatches:\n" + "\n".join(failures)


def test_machash_coverage(wg_env):
    # The fixtures must exercise the discriminating cases, else parity is vacuous.
    resolved = subnet_miss = off_subnet = empty = False
    for fx in wg_env["fixtures"]:
        ip, exp = fx["ip"], fx["expected"]
        if ip == "":
            empty = True
        elif exp != "":
            resolved = True
        elif ip.startswith("10.99.1."):
            subnet_miss = True
        else:
            off_subnet = True
    assert resolved and subnet_miss and off_subnet and empty, (
        f"coverage incomplete: resolved={resolved} subnet_miss={subnet_miss} "
        f"off_subnet={off_subnet} empty={empty}")


def test_machash_missing_db_fail_open(wg_env, monkeypatch):
    # A missing WG DB fails open to None (best-effort), never raises.
    # Patch through monkeypatch so the override is tracked and undone at teardown.
    monkeypatch.setattr(_common, "_WG_PEERS_DB",
                        Path("/nonexistent/secubox/wg-peers.json"))
    monkeypatch.setattr(_common, "_WG_PEERS_MTIME", 0.0)
    _common._WG_PEERS_CACHE.clear()
    assert _common.mac_hash_of("10.99.1.10") is None
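The docstring above states that the expected values come from `sha256(pubkey.encode()).hexdigest()[:16]`. A minimal sketch of that derivation follows, with a fail-open lookup in the spirit of the missing-DB test. The `PEERS` mapping, the placeholder key string, and the function names `wg_hash` and `lookup_hash` are hypothetical: the real WG peers DB layout and `_common` internals are not shown in this diff.

```python
import hashlib
from typing import Optional

# Hypothetical in-memory stand-in for the WG peers DB: client IP -> public key.
# The key below is a placeholder string, not a real WireGuard key.
PEERS = {"10.99.1.10": "hypothetical-public-key-placeholder="}


def wg_hash(pubkey: str) -> str:
    """First 16 hex characters of SHA-256 over the UTF-8 encoded public key."""
    return hashlib.sha256(pubkey.encode()).hexdigest()[:16]


def lookup_hash(ip: str) -> Optional[str]:
    """Resolve an IP to its persona hash; unknown or empty IPs yield None."""
    pubkey = PEERS.get(ip)
    return wg_hash(pubkey) if pubkey else None
```

Because the hash depends only on the public key string, any engine that reads the same peers file and applies the same truncation should produce identical identifiers, which is the property the parity fixtures check.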