19 Commits

Author SHA1 Message Date
6bb7ec4d27 Web UI overhaul: interactive config editor, SSE live updates, log viewer, and SMB reload fixes
- Replace raw TOML textarea with Alpine.js interactive form editor (10 collapsible
  sections with change-tier badges, dynamic array management for connections/shares/
  warmup rules, proper input controls per field type)
- Add SSE-based live dashboard updates replacing htmx polling
- Add log viewer tab with ring buffer backend and incremental polling
- Fix SMB not seeing new shares after config reload: kill entire smbd process group
  (not just parent PID) so forked workers release port 445
- Add SIGHUP-based smbd config reload for share changes instead of full restart,
  preserving existing client connections
- Generate human-readable commented TOML from config editor instead of bare
  toml::to_string_pretty() output
- Fix Alpine.js 2.x __x.$data calls in dashboard/share templates (now Alpine 3.x)
- Fix toggle switch CSS overlap with field labels
- Fix dashboard going blank on tab switch (remove hx-swap-oob from tab content)
- Add htmx:afterSettle → Alpine.initTree() bridge for robust tab switching
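
The log viewer's "ring buffer backend and incremental polling" could look roughly like the sketch below. This is an illustration, not the code from this commit: `LogBuffer`, the sequence-number cursor, and the `since` method are hypothetical names chosen for the example.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-capacity log buffer: the oldest lines are dropped first.
pub struct LogBuffer {
    lines: VecDeque<(u64, String)>, // (sequence number, log line)
    capacity: usize,
    next_seq: u64,
}

impl LogBuffer {
    pub fn new(capacity: usize) -> Self {
        // Clamp to 1 so a zero capacity cannot make the buffer grow unbounded.
        let capacity = capacity.max(1);
        LogBuffer {
            lines: VecDeque::with_capacity(capacity),
            capacity,
            next_seq: 0,
        }
    }

    /// Append a line, evicting the oldest one when the buffer is full.
    pub fn push(&mut self, line: impl Into<String>) {
        if self.lines.len() == self.capacity {
            self.lines.pop_front();
        }
        self.lines.push_back((self.next_seq, line.into()));
        self.next_seq += 1;
    }

    /// Incremental poll: return the retained lines with sequence number
    /// >= `since`, plus the cursor the client should send on its next poll.
    pub fn since(&self, since: u64) -> (Vec<&str>, u64) {
        let lines = self
            .lines
            .iter()
            .filter(|(seq, _)| *seq >= since)
            .map(|(_, line)| line.as_str())
            .collect();
        (lines, self.next_seq)
    }
}
```

A client would start with cursor 0 and pass back the returned cursor on each poll, so it only receives lines it has not seen yet.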

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 18:06:52 +08:00
466ea5cfa8 Add pre-mount remote path probe and per-share health status
Before mounting, probe each share's remote path with `rclone lsf`
(10s timeout, parallel execution). Failed shares are skipped — they
never get mounted or exposed to SMB/NFS/WebDAV — preventing the
silent hang that occurred when rclone mounted a nonexistent directory.

- ShareHealth enum: Pending → Probing → Healthy / Failed(reason)
- Supervisor: probe phase between preflight and mount, protocol
  configs generated after probe with only healthy shares
- Web UI: health-aware badges (OK/FAILED/PROBING/PENDING) with
  error messages on dashboard, status partial, and share detail
- JSON API: health + health_message fields on /api/status
- CLI: `warpgate status` queries daemon API first for tri-state
  display (OK/FAILED/DOWN), falls back to direct mount checks
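
A minimal sketch of the health states and badges listed above. The variant names and badge strings come from this message; the `badge`, `message`, and `healthy_shares` helpers are hypothetical and only illustrate how failed shares could be filtered out before protocol configs are generated.

```rust
/// Health of a share as it moves through the pre-mount probe.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ShareHealth {
    Pending,
    Probing,
    Healthy,
    Failed(String), // reason reported by the probe
}

impl ShareHealth {
    /// Badge text for the web UI (OK/FAILED/PROBING/PENDING).
    pub fn badge(&self) -> &'static str {
        match self {
            ShareHealth::Pending => "PENDING",
            ShareHealth::Probing => "PROBING",
            ShareHealth::Healthy => "OK",
            ShareHealth::Failed(_) => "FAILED",
        }
    }

    /// Error message to show next to the badge, if any (hypothetical helper).
    pub fn message(&self) -> Option<&str> {
        match self {
            ShareHealth::Failed(reason) => Some(reason.as_str()),
            _ => None,
        }
    }
}

/// Names of the shares that passed the probe (hypothetical helper).
/// Only these would be mounted and exposed over SMB/NFS/WebDAV.
pub fn healthy_shares(shares: &[(String, ShareHealth)]) -> Vec<&str> {
    shares
        .iter()
        .filter(|(_, health)| matches!(health, ShareHealth::Healthy))
        .map(|(name, _)| name.as_str())
        .collect()
}
```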

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 15:28:56 +08:00
ba1cae7f75 Add daemon web UI, JSON API, and config hot-reload engine
- New: axum web server on port 8090 with htmx dashboard
- New: JSON API endpoints (/api/status, /api/config, /api/bwlimit)
- New: config diff engine with 4-tier change classification
- New: tiered config hot-reload (live/protocol/per-share/global)
- Refactor: supervisor loop uses mpsc command channel (recv_timeout)
- Refactor: supervisor updates shared DaemonStatus every poll cycle
- Dependencies: tokio, axum, askama, tower-http
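
The supervisor refactor (mpsc command channel with `recv_timeout`) can be sketched as follows. `SupervisorCmd` and its variants are hypothetical, and the returned log stands in for the real work: on a timeout the actual loop would run its poll cycle and update the shared DaemonStatus.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical commands the web server might send to the supervisor.
#[derive(Debug, PartialEq)]
pub enum SupervisorCmd {
    Reload,
    SetBwLimit(String),
    Shutdown,
}

/// Wait for a command for at most `poll`; a timeout counts as one poll cycle.
/// Returns a trace of what the loop did so the behavior is easy to inspect.
pub fn supervise(rx: Receiver<SupervisorCmd>, poll: Duration) -> Vec<String> {
    let mut trace = Vec::new();
    loop {
        match rx.recv_timeout(poll) {
            Ok(SupervisorCmd::Shutdown) => {
                trace.push("shutdown".to_string());
                break;
            }
            Ok(cmd) => trace.push(format!("cmd: {:?}", cmd)),
            // No command arrived in time: this is where health checks and
            // status updates would run.
            Err(RecvTimeoutError::Timeout) => trace.push("poll".to_string()),
            // All senders are gone, so nothing can ever ask us to stop.
            Err(RecvTimeoutError::Disconnected) => {
                trace.push("disconnected".to_string());
                break;
            }
        }
    }
    trace
}
```

Using one blocking call for both the command wait and the poll interval means commands are handled immediately while periodic checks still run when the channel is idle.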

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:18:20 +08:00
08f8fc4667 Per-share independent mounts: each share gets its own rclone process
Replace the hierarchical single-mount design with independent mounts:
each [[shares]] entry is a (name, remote_path, mount_point) triplet
with its own rclone FUSE mount process and dedicated RC API port
(5572 + index). Remove top-level connection.remote_path and [mount]
section. Auto-warmup now runs in a background thread to avoid blocking
the supervision loop.
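
The port scheme (5572 + index) is simple enough to sketch directly. The `Share` fields mirror the triplet named above; `rc_port` and `RC_BASE_PORT` are hypothetical names, and the overflow check is an addition for the example.

```rust
/// Base RC API port; each share gets RC_BASE_PORT + its index in [[shares]].
pub const RC_BASE_PORT: u16 = 5572;

/// One [[shares]] entry: a (name, remote_path, mount_point) triplet.
#[derive(Debug, Clone, PartialEq)]
pub struct Share {
    pub name: String,
    pub remote_path: String,
    pub mount_point: String,
}

/// RC API port for the share at `index`, or None if it would not fit in u16.
pub fn rc_port(index: usize) -> Option<u16> {
    let port = (RC_BASE_PORT as usize).checked_add(index)?;
    if port <= u16::MAX as usize {
        Some(port as u16)
    } else {
        None
    }
}
```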

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:32:18 +08:00
46e592c3a4 Flatten project structure: move warpgate/ contents to repo root
Single-crate project doesn't need a subdirectory. Moves Cargo.toml,
src/, templates/ to root for standard Rust project layout. Updates
.gitignore and test harness binary paths accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:25:15 +08:00
a2d49137f9 Add comprehensive test suite: 63 integration tests + 110 Rust unit tests
Integration tests (tests/):
- 9 categories covering config, lifecycle, signals, supervision,
  cache, writeback, network faults, crash recovery, and CLI
- Shell-based harness with mock NAS (network namespace + SFTP),
  fault injection (tc netem), and power loss simulation
- TAP format runner (run-all.sh) with proper SKIP detection

Rust unit tests (warpgate/src/):
- 110 tests across 14 modules, all passing in 0.01s
- Config parsing, defaults validation, RestartTracker logic,
  RC API response parsing, rclone arg generation, service
  config generation, CLI output formatting, warmup path logic

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 11:21:35 +08:00
e6c48c9bd9 Harden supervisor shutdown: process group isolation, write-back drain
- Spawn all children (rclone, smbd, webdav) in isolated process groups
  so Ctrl+C doesn't reach them directly — supervisor controls shutdown order
- Wait for rclone VFS write-back queue to drain before unmounting (5min cap)
- Prefer fusermount3 over fusermount, skip if already unmounted

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:56:09 +08:00
960ddd20ce Add incremental warmup with cache check and auto-warmup on startup
Warmup now checks the rclone VFS cache directory before reading each file
through the FUSE mount, skipping already-cached files for fast re-runs.
Also adds WarmupConfig with configurable rules that auto-execute when
the supervisor starts (best-effort, non-blocking).
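
The cache check before each read might be sketched like this. It assumes `cache_root` is the directory under which rclone keeps this share's cached files, and it treats "file exists in the cache" as "already cached"; the real check may be stricter. Both function names are hypothetical.

```rust
use std::path::{Path, PathBuf};

/// True if `rel` (a path relative to the share root) has no file in the cache yet.
pub fn needs_warmup(cache_root: &Path, rel: &Path) -> bool {
    !cache_root.join(rel).is_file()
}

/// Filter a warmup list down to the files that still have to be read
/// through the FUSE mount; already-cached files are skipped.
pub fn files_to_warm<'a>(cache_root: &Path, files: &'a [PathBuf]) -> Vec<&'a Path> {
    files
        .iter()
        .map(|p| p.as_path())
        .filter(|p| needs_warmup(cache_root, p))
        .collect()
}
```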

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:39:58 +08:00
9b37c88cd5 Fix warmup to use VFS cache, dynamic SMB share name, smbd long flags
- warmup: read files through FUSE mount instead of rclone copy to temp
  dir. Files now actually land in rclone VFS SSD cache.
- samba: derive share name from mount point dir name instead of
  hardcoded [nas-photos] (e.g. /mnt/projects → [projects])
- supervisor: use smbd long flags (--foreground, --debug-stdout,
  --no-process-group, --configfile) for compatibility with Samba 4.19
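
Deriving the share name from the mount point (e.g. /mnt/projects → [projects]) could be done with `Path::file_name`, as in this sketch. The function names are hypothetical, and returning `None` for a mount point without a final component is an assumption.

```rust
use std::path::Path;

/// Share name taken from the last component of the mount point.
/// Returns None for paths like "/" or names that are not valid UTF-8.
pub fn share_name(mount_point: &Path) -> Option<String> {
    mount_point
        .file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.to_string())
}

/// Section header as it would appear in the generated smb.conf.
pub fn share_header(mount_point: &Path) -> Option<String> {
    share_name(mount_point).map(|name| format!("[{}]", name))
}
```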

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 00:38:42 +08:00
5d8bf52ae9 Add warpgate MVP implementation with hardened supervisor
Full Rust implementation of the warpgate NAS cache proxy:

- CLI: clap-based with subcommands (run, setup, status, cache, warmup,
  bwlimit, speed-test, config-init, log)
- Config: TOML-based with env var override, preset templates
- rclone: VFS mount args builder, config generator, RC API client
- Services: Samba config gen, NFS exports, WebDAV serve args, systemd units
- Deploy: dependency checker, filesystem validation, one-click setup
- Supervisor: single-process tree managing rclone mount + smbd + WebDAV
  as child processes — no systemd dependency for protocol management

Supervisor hardening:
- ProtocolChildren Drop impl prevents orphaned processes on startup failure
- Early rclone exit detection in mount wait loop (fail fast, not 30s timeout)
- Graceful SIGTERM → 3s grace → SIGKILL (prevents orphaned smbd workers)
- RestartTracker with 5-min stability reset and linear backoff (2s/4s/6s)
- Shutdown signal checked during mount wait and before protocol start
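
A rough sketch of the restart policy described above (linear 2s/4s/6s backoff, counter reset after 5 minutes of stability). The cap of three restarts is an assumption inferred from the three listed delays, and the `on_crash` API is hypothetical.

```rust
use std::time::{Duration, Instant};

const STABILITY_WINDOW: Duration = Duration::from_secs(300); // 5-min reset
const BACKOFF_STEP: Duration = Duration::from_secs(2); // 2s, 4s, 6s
const MAX_RESTARTS: u32 = 3; // assumption: give up after the 6s attempt

pub struct RestartTracker {
    count: u32,
    last_restart: Option<Instant>,
}

impl RestartTracker {
    pub fn new() -> Self {
        RestartTracker { count: 0, last_restart: None }
    }

    /// Record a crash observed at `now`. Returns the delay to wait before
    /// restarting, or None when the restart budget is exhausted.
    pub fn on_crash(&mut self, now: Instant) -> Option<Duration> {
        // A child that stayed up for the whole window is considered stable,
        // so earlier crashes no longer count against it.
        if let Some(last) = self.last_restart {
            if now.duration_since(last) >= STABILITY_WINDOW {
                self.count = 0;
            }
        }
        if self.count >= MAX_RESTARTS {
            return None;
        }
        self.count += 1;
        self.last_restart = Some(now);
        Some(BACKOFF_STEP * self.count)
    }
}
```

Passing `now` in explicitly keeps the tracker free of clock calls, which makes the backoff sequence easy to unit test.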

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 23:29:17 +08:00
8f00f86eb4 PRD v4: revert to rclone VFS read-write proxy architecture
Drop the read-only cache + SD Uploader design in favor of rclone VFS
native read-write caching. Key changes:
- SMB shares are now read-write, writes go to SSD and async write-back to NAS
- Remove SD card import/upload, metadata DB, self-built polling
- Simplify remote change detection to rclone --dir-cache-time
- Add dirty file management, write-back config, and related risks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 18:56:12 +08:00
3caddc6370 Simplify architecture: read-only cache + one-way SD upload
Replace OverlayFS + sync daemon with two independent subsystems:
- Read-only cache: rclone --read-only + Samba read only = yes
- SD Uploader: staging → SFTP direct upload to NAS (temp file + rename)

Remove: OverlayFS, sync daemon, three-timestamp model, write-back,
conflict detection, dirty file tracking. Net -299 lines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 14:35:43 +08:00
d40997312b Redesign conflict UX: in-place copies like Dropbox/iCloud (4.16)
Instead of moving conflict files to a separate conflict/ directory,
keep them in the original directory with naming convention:
  {name} (Warpgate Conflict {YYYY-MM-DD HH-mm}).{ext}

Benefits:
- Lightroom/Finder see both versions side by side
- Preserved extension ensures app compatibility
- Matches Dropbox/iCloud behavior users already know
- Conflict copies auto-sync to NAS via rclone (backed up)

Remote-deleted + local-dirty: file stays in place (no rename),
marked as orphan-conflict, user decides whether to re-upload.

Updated: decision matrix diagrams, scenario walkthroughs,
cache_files lifecycle, CLI commands, config section, directory
structure description.
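
The naming convention can be illustrated with a small helper. This is a sketch of the design in the PRD, not shipped code: `conflict_path` is a hypothetical name, the caller is assumed to supply the timestamp already formatted as `YYYY-MM-DD HH-mm`, and the handling of files without an extension is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Build the in-place conflict copy path:
///   {name} (Warpgate Conflict {YYYY-MM-DD HH-mm}).{ext}
/// `stamp` must already be formatted as "YYYY-MM-DD HH-mm".
pub fn conflict_path(original: &Path, stamp: &str) -> PathBuf {
    let stem = original
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("");
    let name = match original.extension().and_then(|e| e.to_str()) {
        // Keep the extension last so applications still recognize the file type.
        Some(ext) => format!("{} (Warpgate Conflict {}).{}", stem, stamp, ext),
        // Assumption: files without an extension just get the suffix.
        None => format!("{} (Warpgate Conflict {})", stem, stamp),
    };
    // Same directory as the original, so both versions sit side by side.
    original.with_file_name(name)
}
```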

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 21:44:51 +08:00
823d20606a Add adaptive throttle for write-back bandwidth (4.14)
Throughput-based congestion detection: when sustained throughput
drops >30% over sliding window with rising RTT, auto-reduce
write-back speed to 50% of current throughput, then probe back up
at +10% every 2 minutes.

- Throttle state visible via `warpgate status`
- User can disable with BW_ADAPTIVE=no
- Only affects write-back uploads, not read fetches
- New config: BW_ADAPTIVE, BW_ADAPTIVE_WINDOW, BW_ADAPTIVE_PROBE_INTERVAL
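
The throttle rules above can be written down as three small functions. This is a sketch under assumptions: congestion is judged by comparing the first and last sample of the sliding window, "+10%" is taken to mean 10% of the current limit, and all names are hypothetical.

```rust
/// One measurement of write-back traffic.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Sample {
    pub bytes_per_sec: u64,
    pub rtt_ms: u64,
}

/// Congested when throughput fell by more than 30% across the window
/// while RTT rose. Integer math avoids floating-point edge cases.
pub fn is_congested(window_start: Sample, window_end: Sample) -> bool {
    let dropped = window_end.bytes_per_sec.saturating_mul(10)
        < window_start.bytes_per_sec.saturating_mul(7);
    let rtt_rising = window_end.rtt_ms > window_start.rtt_ms;
    dropped && rtt_rising
}

/// New write-back limit after congestion: 50% of current throughput.
pub fn throttled_limit(current_bytes_per_sec: u64) -> u64 {
    current_bytes_per_sec / 2
}

/// One probe step (every 2 minutes per the design): raise the limit by 10%.
pub fn probe_up(limit_bytes_per_sec: u64) -> u64 {
    limit_bytes_per_sec.saturating_add(limit_bytes_per_sec / 10)
}
```

Requiring both signals means a throughput drop caused by the application simply writing less does not trigger the throttle unless RTT also rises.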

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 21:42:00 +08:00
7fd1934be5 Clarify metadata.db must be on local filesystem, not FUSE mount
SQLite WAL depends on POSIX file locks and shared memory (-shm),
which FUSE/network filesystems cannot support correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 21:38:08 +08:00
aaf947859f Convert all ASCII art diagrams to Mermaid
Replace 20 ASCII box-drawing diagrams with Mermaid equivalents:
- System architecture → flowchart with subgraphs
- Multi-protocol cache → flowchart LR
- Write-back decision matrix → flowchart with branches
- SD card import decision tree → flowchart
- Read cache validation → markdown table (cleaner than ASCII grid)
- 5 scenario walkthroughs → flowcharts with timeline context
- 4-table ER diagram → erDiagram
- Deletion detection flow → flowchart
- Write-back dual-pipeline → flowchart with subgraphs
- Import state machine → stateDiagram-v2
- Tiered polling strategy → flowchart
- NAS agent push → flowchart LR
- Read/write flows → flowcharts
- Cache eviction → flowchart
- Headscale infrastructure → flowchart BT
- Cloud backup → flowchart with subgraph
- TM write-back strategy → flowchart LR

Kept directory tree structure as plain text (standard convention).
Cache protection measures converted to structured markdown list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 21:28:03 +08:00
ddcfb87b36 Add captive portal setup AP as P1 feature (4.11)
Hotel/airport WiFi requires web-based captive portal authentication,
which is impossible on a headless device without this feature.

- New P1 feature 4.11: Setup AP + Captive Portal proxy
  - Box auto-enters setup mode when no network is available
  - Phone connects to temporary AP, completes portal auth via proxy
  - Requires WiFi AP+STA concurrent mode
- Fallback options: USB tethering, mobile hotspot, ethernet, MAC clone
- New CLI commands: warpgate setup-wifi, warpgate clone-mac
- New config section for setup AP parameters
- Updated hardware requirements: WiFi module must support AP+STA
- Updated roadmap v1.5 to include setup AP
- Added risk entry and glossary terms
- Renumbered 4.12-4.23 accordingly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 18:20:42 +08:00
aa2db2bf5f Rename product to Warpgate — Make your NAS feel local
- Rename NAS Cache Proxy → Warpgate throughout PRD
- Update CLI commands: nas-cache → warpgate
- Update paths: /mnt/ssd/nas-cache → /mnt/ssd/warpgate
- Rename file: nas-cache-proxy-prd-v3.md → warpgate-prd-v3.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 18:05:23 +08:00
c3b458bced Add NAS Cache Proxy PRD v3
Complete product requirements document covering:
- Transparent SMB/NFS/WebDAV cache proxy with rclone VFS
- SD card ingest + auto archive pipeline for photographers
- Three-timestamp consistency model with write-back controller
- Time Machine backup target with independent sparsebundle sync
- Layered SFTP polling for remote change detection
- Cache space protection and import state machine
- Paid services: Headscale + DERP relay, cloud disaster backup
- Hardware appliance roadmap (v1.0 MVP → v3.0 hardware product)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 18:00:44 +08:00