# Wayland → WebRTC Remote Desktop Implementation Plan
## TL;DR
> **Quick Summary**: Implement a high-performance Rust backend that captures Wayland screens via PipeWire DMA-BUF, encodes to H.264 (hardware/software), and streams to WebRTC clients with 15-25ms latency target.
>
> **Deliverables**:
> - Complete Rust backend (5,000-8,000 LOC)
> - 5 major modules: capture, encoder, buffer management, WebRTC transport, signaling
> - Configuration system and CLI
> - Basic documentation and examples
>
> **Estimated Effort**: Large (4-6 weeks full-time)
> **Parallel Execution**: YES - 4 waves
> **Critical Path**: Project setup → Capture → Encoder → WebRTC integration → End-to-end
---
## Context
### Original Request
User wants to implement a Wayland to WebRTC remote desktop backend based on three comprehensive design documents (DETAILED_DESIGN_CN.md, DESIGN_CN.md, DESIGN.md).
### Design Documents Analysis
Three detailed design documents provided:
- **DETAILED_DESIGN_CN.md**: 14,000+ lines covering architecture, components, data structures, performance targets
- **DESIGN_CN.md / DESIGN.md**: Technical design with code examples and optimization strategies
**Key Requirements from Designs:**
- Zero-copy DMA-BUF pipeline for minimal latency
- Hardware encoding support (VA-API/NVENC) with software fallback (x264)
- WebRTC transport with low-latency configuration
- 15-25ms latency (LAN), <100ms (WAN)
- 30-60 FPS, up to 4K resolution
- Adaptive bitrate and damage tracking
### Current State
- **Empty Rust project** - No source code exists
- **Cargo.toml configured** with all dependencies (tokio, pipewire, webrtc-rs, x264, etc.)
- **Design complete** - Comprehensive specifications available
- **No tests or infrastructure** - Starting from scratch
---
## Work Objectives
### Core Objective
Build a production-ready remote desktop backend that captures Wayland screen content and streams it to WebRTC clients with ultra-low latency (15-25ms) using zero-copy DMA-BUF architecture.
### Concrete Deliverables
- Complete Rust implementation in `src/` directory
- 5 functional modules: capture, encoder, buffer, webrtc, signaling
- Working CLI application (`src/main.rs`)
- Configuration system (`config.toml`)
- Basic documentation (README, usage examples)
### Definition of Done
- [x] All major modules compile and integrate
- [x] End-to-end pipeline works: capture → encode → WebRTC → client receives
- [x] Software encoder (x264) functional
- [x] Hardware encoder infrastructure ready (VA-API hooks)
- [x] `cargo build --release` succeeds
- [x] Basic smoke test runs without crashes
- [x] README with setup instructions
### Must Have
- PipeWire screen capture with DMA-BUF support
- Video encoding (at least x264 software encoder)
- WebRTC peer connection and media streaming
- Signaling server (WebSocket for SDP/ICE exchange)
- Zero-copy buffer management
- Error handling and logging
### Must NOT Have (Guardrails)
- **Audio capture**: Out of scope (design only mentions video)
- **Multi-user sessions**: Single session only
- **Authentication/Security**: Basic implementation only (no complex auth)
- **Hardware encoding full implementation**: Infrastructure only, placeholders for VA-API/NVENC
- **Browser client**: Backend only, assume existing WebRTC client
- **Persistent storage**: No database or file storage
- **Advanced features**: Full damage tracking and adaptive bitrate (deferred to v2; Task 2 includes only a basic damage tracker)
---
## Verification Strategy
### Test Decision
- **Infrastructure exists**: NO
- **User wants tests**: YES (automated verification)
- **Framework**: `criterion` (benchmarks) + simple integration tests
### Automated Verification Approach
**For Each Module**:
1. **Unit tests** for core types and error handling
2. **Integration tests** for data flow between modules
3. **Benchmarks** for performance validation (latency < target)
**No Browser Testing Required**: Use mock WebRTC behavior or simple echo test for verification.
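To make the benchmark requirement concrete, here is a minimal Criterion harness sketch (e.g. `benches/encode.rs`); the `encode_test_frame` placeholder is hypothetical and would be replaced by the real Task 4 encoder call:
```rust
// Minimal Criterion harness sketch for the latency checks.
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Placeholder standing in for the Task 4 encoder call; replace with X264Encoder::encode.
fn encode_test_frame(frame: &[u8]) -> usize {
    black_box(frame.iter().map(|b| *b as usize).sum())
}

fn bench_encode_1080p(c: &mut Criterion) {
    let frame = vec![0u8; 1920 * 1080 * 4]; // synthetic BGRA 1080p frame
    c.bench_function("encode_1080p", |b| b.iter(|| encode_test_frame(black_box(&frame))));
}

criterion_group!(benches, bench_encode_1080p);
criterion_main!(benches);
```
The reported times from `cargo bench` are then compared offline against the latency targets (e.g. < 20 ms per 1080p frame).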
### If TDD Enabled
Each TODO follows RED-GREEN-REFACTOR:
**Test Setup Task** (first task):
- Install test dependencies
- Create basic test infrastructure
- Example test to verify setup
**Module Tasks**:
- RED: Write failing test for feature
- GREEN: Implement minimum code to pass
- REFACTOR: Clean up while passing tests
---
## Execution Strategy
### Parallel Execution Waves
```
Wave 1 (Start Immediately):
├── Task 1: Project structure and types
└── Task 5: Configuration system
Wave 2 (After Wave 1):
├── Task 2: Capture module (PipeWire)
├── Task 3: Buffer management
└── Task 6: Basic WebRTC echo server
Wave 3 (After Wave 2):
├── Task 4: Encoder module (x264)
└── Task 7: Signaling server
Wave 4 (After Wave 3):
├── Task 8: Integration (capture → encode → WebRTC)
├── Task 9: CLI and main entry point
└── Task 10: Documentation and examples
Critical Path: 1 → 3 → 2 → 4 → 8 → 9 → 10
Parallel Speedup: ~35% faster than sequential
```
### Dependency Matrix
| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2, 3, 6 | 5 |
| 5 | None | 8, 9 | 1 |
| 2 | 1, 3 | 4 | 6 |
| 3 | 1 | 2 | 6 |
| 6 | 1 | 7, 8 | 2, 3 |
| 4 | 2 | 8 | 7 |
| 7 | 1, 6 | 8 | 4 |
| 8 | 4, 7 | 9, 10 | None |
| 9 | 8 | None | 10 |
| 10 | 8 | None | 9 |
---
## TODOs
- [x] 1. Project Structure and Core Types
**What to do**:
- Create module structure: `src/capture/`, `src/encoder/`, `src/buffer/`, `src/webrtc/`, `src/signaling/`
- Define core types: `CapturedFrame`, `EncodedFrame`, `DmaBufHandle`, `PixelFormat`, `ScreenRegion`
- Define error types: `CaptureError`, `EncoderError`, `WebRtcError`, `SignalingError`
- Create `src/lib.rs` with module exports
- Create `src/error.rs` for centralized error handling
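To make the list above concrete, a minimal sketch of what these types might look like; field choices are assumptions, and the design docs remain the source of truth:
```rust
// Sketch of the core types (re-exported from src/lib.rs) and one error enum from src/error.rs.
use std::os::unix::io::RawFd;
use thiserror::Error;

/// Pixel formats the capture path is expected to negotiate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PixelFormat {
    Bgra8888,
    Nv12,
}

/// Metadata for a DMA-BUF exported by PipeWire (fd ownership is handled by the buffer module).
#[derive(Debug)]
pub struct DmaBufHandle {
    pub fd: RawFd,
    pub stride: u32,
    pub offset: u32,
    pub modifier: u64,
}

/// One captured frame handed from the capture module to the encoder.
#[derive(Debug)]
pub struct CapturedFrame {
    pub dmabuf: DmaBufHandle,
    pub width: u32,
    pub height: u32,
    pub format: PixelFormat,
    pub timestamp_ns: u64,
}

/// Per-module errors follow the same thiserror pattern (EncoderError, WebRtcError, SignalingError).
#[derive(Debug, Error)]
pub enum CaptureError {
    #[error("PipeWire stream error: {0}")]
    Stream(String),
    #[error("unsupported pixel format")]
    UnsupportedFormat,
}
```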
**Must NOT do**:
- Implement any actual capture/encoding logic (types only)
- Add test infrastructure (Task 5)
- Implement configuration parsing
**Recommended Agent Profile**:
> - **Category**: `quick` (simple type definitions)
> - **Skills**: `[]` (no specialized skills needed)
> - **Skills Evaluated but Omitted**: All other skills not needed for type definitions
**Parallelization**:
- **Can Run In Parallel**: YES (with Task 5 only; everything else depends on these types)
- **Parallel Group**: Wave 1 (with Task 5)
- **Blocks**: Tasks 2, 3, 6
- **Blocked By**: None
**References**:
**Pattern References** (existing code to follow):
- `DESIGN_CN.md:82-109` - Core type definitions
- `DETAILED_DESIGN_CN.md:548-596` - PipeWire data structures
- `DETAILED_DESIGN_CN.md:970-1018` - Encoder data structures
**API/Type References** (contracts to implement against):
- `pipewire` crate documentation for `pw::Core`, `pw::Stream`, `pw::buffer::Buffer`
- `webrtc` crate for `RTCPeerConnection`, `RTCVideoTrack`
- `async-trait` for `VideoEncoder` trait definition
**Test References**:
- `thiserror` crate for error derive patterns
- Standard Rust project layout conventions
**Documentation References**:
- `DESIGN_CN.md:46-209` - Component breakdown and data structures
- `Cargo.toml` - Dependency versions to use
**External References**:
- pipewire-rs examples: https://gitlab.freedesktop.org/pipewire/pipewire-rs
- webrtc-rs examples: https://github.com/webrtc-rs/webrtc
**WHY Each Reference Matters**:
- Design docs provide exact type definitions - use them verbatim
- External examples show idiomatic usage patterns for complex crates
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
cargo check
# Assert: Exit code 0, no warnings
cargo clippy -- -D warnings
# Assert: Exit code 0
cargo doc --no-deps --document-private-items
# Assert: Docs generated successfully
```
**Evidence to Capture**:
- [x] Module structure verified: `ls -la src/`
- [x] Type compilation output from `cargo check`
- [x] Generated documentation files
**Commit**: NO (group with Task 5)
---
- [x] 2. Capture Module (PipeWire Integration)
**What to do**:
- Implement `src/capture/mod.rs` with PipeWire client
- Create `PipewireCore` struct: manage PipeWire main loop and context
- Create `PipewireStream` struct: handle video stream and buffer dequeue
- Implement frame extraction: Extract DMA-BUF FD, size, stride from buffer
- Create async channel: Send `CapturedFrame` to encoder pipeline
- Implement `DamageTracker` (basic version): Track changed screen regions
- Handle PipeWire events: `param_changed`, `process` callbacks
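A minimal sketch of the capture → encoder handoff listed above, assuming `async-channel` and the `CapturedFrame` type from Task 1; the PipeWire `process` callback that produces the frame is omitted:
```rust
// Capture-side handoff: push frames into a shallow bounded channel and drop on backpressure
// rather than queueing stale frames (keeps end-to-end latency bounded).
use async_channel::{Sender, TrySendError};

use crate::CapturedFrame; // core type from Task 1 (path illustrative)

pub fn forward_frame(tx: &Sender<CapturedFrame>, frame: CapturedFrame) {
    match tx.try_send(frame) {
        Ok(()) => {}
        Err(TrySendError::Full(dropped)) => {
            // The DMA-BUF inside `dropped` is returned to the pool via Drop (Task 3).
            tracing::warn!("encoder busy, dropping frame ts={}", dropped.timestamp_ns);
        }
        Err(TrySendError::Closed(_)) => {
            tracing::error!("encoder channel closed; stopping capture");
        }
    }
}

// At pipeline setup (Task 8):
// let (frame_tx, frame_rx) = async_channel::bounded::<CapturedFrame>(2);
```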
**Must NOT do**:
- Implement xdg-desktop-portal integration (defer to v2)
- Implement hardware-specific optimizations
- Add complex damage tracking algorithms (use simple block comparison)
**Recommended Agent Profile**:
> - **Category**: `unspecified-high` (complex async FFI integration)
> - **Skills**: `[]`
> - **Skills Evaluated but Omitted**: Not applicable
**Parallelization**:
- **Can Run In Parallel**: YES (with Task 6; Task 3 must land first)
- **Parallel Group**: Wave 2 (with Task 6)
- **Blocks**: Task 4
- **Blocked By**: Tasks 1, 3
**References**:
**Pattern References**:
- `DESIGN_CN.md:367-516` - Complete capture module implementation
- `DETAILED_DESIGN_CN.md:542-724` - PipeWire client and stream handling
- `DETAILED_DESIGN_CN.md:727-959` - Damage tracker implementation
**API/Type References**:
- `pipewire` crate: `pw::MainLoop`, `pw::Context`, `pw::Core`, `pw::stream::Stream`
- `pipewire::properties!` macro for stream properties
- `pipewire::spa::param::format::Format` for video format
- `async_channel::Sender/Receiver` for async frame passing
**Test References**:
- PipeWire examples in pipewire-rs repository
- DMA-BUF handling patterns in other screen capture projects
**Documentation References**:
- `DESIGN_CN.md:70-110` - Capture manager responsibilities
- `DESIGN_CN.md:213-244` - Data flow from Wayland to capture
**External References**:
- PipeWire protocol docs: https://docs.pipewire.org/
- DMA-BUF kernel docs: https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html
**WHY Each Reference Matters**:
- PipeWire FFI is complex - follow proven patterns from examples
- DMA-BUF handling requires precise memory management - reference docs for safety
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
cargo check
# Assert: No compilation errors
# Create simple capture test
cargo test capture::tests::test_stream_creation
# Assert: Test passes (mock PipeWire or skip if no Wayland session)
# Verify module compiles
cargo build --release --lib
# Assert: capture module in release binary
```
**Evidence to Capture**:
- [x] Module compilation output
- [x] Test execution results
- [x] Binary size after compilation
**Commit**: NO (group with Task 3)
---
- [x] 3. Buffer Management Module
**What to do**:
- Implement `src/buffer/mod.rs` with zero-copy buffer pools
- Create `DmaBufPool`: Manage DMA-BUF file descriptors with reuse
- Create `EncodedBufferPool`: Manage `Bytes` for encoded frames
- Implement `FrameBufferPool`: Unified interface for both pool types
- Use RAII pattern: `Drop` trait for automatic cleanup
- Implement `DmaBufHandle`: Safe wrapper around raw file descriptor
- Add memory tracking: Track buffer lifetimes and prevent leaks
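A minimal sketch of the RAII idea, assuming the `libc` crate for closing the fd; the pool shapes are illustrative ("simple VecDeque pools" per the plan), not the final API:
```rust
// RAII sketch for src/buffer/mod.rs.
use std::collections::VecDeque;
use std::os::unix::io::RawFd;

/// Owns a DMA-BUF fd and closes it exactly once when dropped.
pub struct OwnedDmaBuf {
    fd: RawFd,
}

impl OwnedDmaBuf {
    /// Safety: caller must pass an fd it exclusively owns.
    pub unsafe fn from_raw_fd(fd: RawFd) -> Self {
        Self { fd }
    }
    pub fn as_raw_fd(&self) -> RawFd {
        self.fd
    }
}

impl Drop for OwnedDmaBuf {
    fn drop(&mut self) {
        // SAFETY: sole owner of this fd; Drop runs once, so no double-close.
        let _ = unsafe { libc::close(self.fd) };
    }
}

/// Reuse pool for encoded-output buffers: a Vec<u8> is acquired, filled by the encoder,
/// frozen into `bytes::Bytes` for WebRTC, and a cleared Vec is released back later.
pub struct EncodedBufferPool {
    free: VecDeque<Vec<u8>>,
    default_capacity: usize,
}

impl EncodedBufferPool {
    pub fn new(slots: usize, default_capacity: usize) -> Self {
        Self {
            free: (0..slots).map(|_| Vec::with_capacity(default_capacity)).collect(),
            default_capacity,
        }
    }
    pub fn acquire(&mut self) -> Vec<u8> {
        self.free
            .pop_front()
            .unwrap_or_else(|| Vec::with_capacity(self.default_capacity))
    }
    pub fn release(&mut self, mut buf: Vec<u8>) {
        buf.clear();
        self.free.push_back(buf);
    }
}
```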
**Must NOT do**:
- Implement GPU memory pools (defer to hardware encoding)
- Add complex memory allocation strategies (use simple VecDeque pools)
- Implement shared memory (defer to v2)
**Recommended Agent Profile**:
> - **Category**: `unspecified-high` (unsafe FFI, memory management)
> - **Skills**: `[]`
> - **Skills Evaluated but Omitted**: Not applicable
**Parallelization**:
- **Can Run In Parallel**: YES (with Task 6)
- **Parallel Group**: Wave 2 (with Task 6)
- **Blocks**: Task 2
- **Blocked By**: Task 1
**References**:
**Pattern References**:
- `DESIGN_CN.md:518-617` - Frame buffer pool implementation
- `DETAILED_DESIGN_CN.md:287-299` - Buffer module design
- `DESIGN_CN.md:1066-1144` - Buffer sharing mechanisms
**API/Type References**:
- `std::collections::VecDeque` for buffer pools
- `std::os::unix::io::RawFd` for file descriptors
- `bytes::Bytes` for reference-counted buffers
- `std::mem::ManuallyDrop` for custom Drop logic
**Test References**:
- Rust unsafe patterns for FFI
- RAII examples in Rust ecosystem
**Documentation References**:
- `DESIGN_CN.md:182-209` - Buffer manager responsibilities
- `DESIGN_CN.md:1009-1064` - Zero-copy pipeline stages
**External References**:
- DMA-BUF documentation: https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html
- `bytes` crate docs: https://docs.rs/bytes/
**WHY Each Reference Matters**:
- Unsafe FFI requires precise patterns - RAII prevents resource leaks
- Reference design shows proven zero-copy architecture
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
cargo test buffer::tests::test_dma_buf_pool
# Assert: Pool allocates and reuses buffers correctly
cargo test buffer::tests::test_encoded_buffer_pool
# Assert: Bytes pool works with reference counting
cargo test buffer::tests::test_memory_tracking
# Assert: Memory tracker detects leaks (if implemented)
```
**Evidence to Capture**:
- [x] Test execution results
- [x] Memory usage check (valgrind or similar if available)
- [x] Pool performance metrics
**Commit**: YES
- Message: `feat(buffer): implement zero-copy buffer management`
- Files: `src/buffer/mod.rs`, `src/lib.rs`
- Pre-commit: `cargo test --lib`
---
- [x] 4. Encoder Module (Software - x264)
**What to do**:
- Implement `src/encoder/mod.rs` with encoder trait
- Define `VideoEncoder` trait with `encode()`, `reconfigure()`, `request_keyframe()`
- Create `X264Encoder` struct: Wrap x264 software encoder
- Implement encoder initialization: Set low-latency parameters (ultrafast preset, zerolatency tune)
- Implement frame encoding: Convert DMA-BUF to YUV, encode to H.264
- Use zero-copy: Map DMA-BUF once, encode from mapped memory
- Output encoded data: Wrap in `Bytes` for zero-copy to WebRTC
- Implement bitrate control: Basic CBR or VBR
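A sketch of the `VideoEncoder` trait shape named above; method signatures are assumptions to be reconciled with the design docs, and `X264Encoder` (plus later VA-API/NVENC backends) would implement this trait by wrapping the `x264` crate:
```rust
// Trait-only sketch for src/encoder/mod.rs.
use async_trait::async_trait;
use bytes::Bytes;

use crate::{error::EncoderError, CapturedFrame}; // Task 1 types (paths illustrative)

pub struct EncodedFrame {
    pub data: Bytes, // Annex-B H.264 access unit, zero-copy into WebRTC
    pub is_keyframe: bool,
    pub timestamp_ns: u64,
}

#[derive(Debug, Clone)]
pub struct EncoderSettings {
    pub width: u32,
    pub height: u32,
    pub fps: u32,
    pub bitrate_kbps: u32, // fixed CBR/VBR target in v1
}

#[async_trait]
pub trait VideoEncoder: Send {
    /// Encode one captured frame (x264: ultrafast preset, zerolatency tune).
    async fn encode(&mut self, frame: CapturedFrame) -> Result<EncodedFrame, EncoderError>;
    /// Apply new settings without tearing the pipeline down where possible.
    async fn reconfigure(&mut self, settings: EncoderSettings) -> Result<(), EncoderError>;
    /// Force the next frame to be an IDR (new client join, packet-loss recovery).
    fn request_keyframe(&mut self);
}
```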
**Must NOT do**:
- Implement VA-API or NVENC encoders (defer to v2, just add trait infrastructure)
- Implement adaptive bitrate control (use fixed bitrate)
- Implement damage-aware encoding (encode full frames)
**Recommended Agent Profile**:
> - **Category**: `unspecified-high` (video encoding, low-latency optimization)
> - **Skills**: `[]`
> - **Skills Evaluated but Omitted**: Not applicable
**Parallelization**:
- **Can Run In Parallel**: YES (with Task 7)
- **Parallel Group**: Wave 3 (with Task 7)
- **Blocks**: Task 8
- **Blocked By**: Task 2
**References**:
**Pattern References**:
- `DESIGN_CN.md:620-783` - Complete encoder module implementation
- `DESIGN_CN.md:1249-1453` - Low-latency encoder configuration
- `DETAILED_DESIGN_CN.md:963-1184` - Video encoder trait and implementations
**API/Type References**:
- `x264` crate: `x264::Encoder`, `x264::Params`, `x264::Picture`
- `async-trait` for `#[async_trait] VideoEncoder`
- `bytes::Bytes` for zero-copy output
- `async_trait::async_trait` macro
**Test References**:
- x264-rs examples: https://github.com/DaGenix/rust-x264
- Low-latency encoding patterns in OBS Studio code
**Documentation References**:
- `DESIGN_CN.md:112-148` - Encoder pipeline responsibilities
- `DESIGN_CN.md:248-332` - Technology stack and encoder options
- `DESIGN_CN.md:1376-1411` - x264 low-latency parameters
**External References**:
- x264 documentation: https://code.videolan.org/videolan/x264/
- H.264 codec specification
**WHY Each Reference Matters**:
- Low-latency encoding requires precise parameter tuning - use documented presets
- x264 API is complex - examples show correct usage
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
cargo test encoder::tests::test_x264_init
# Assert: Encoder initializes with correct parameters
cargo test encoder::tests::test_encode_frame
# Assert: Frame encodes successfully, output is valid H.264
# Verify encoding performance
cargo test encoder::tests::benchmark_encode --release
# Assert: Encoding latency < 20ms for 1080p frame
```
**Evidence to Capture**:
- [x] Test execution results
- [x] Encoding latency measurements
- [x] Output bitstream validation (using `ffprobe` if available)
**Commit**: YES
- Message: `feat(encoder): implement x264 software encoder`
- Files: `src/encoder/mod.rs`, `src/lib.rs`
- Pre-commit: `cargo test encoder`
---
- [x] 5. Configuration System and Test Infrastructure
**What to do**:
- Create `config.toml` template: Capture settings, encoder config, WebRTC config
- Implement `src/config.rs`: Parse TOML with `serde`
- Define config structs: `CaptureConfig`, `EncoderConfig`, `WebRtcConfig`
- Add validation: Check reasonable value ranges, provide defaults
- Create CLI argument parsing: Use `clap` for command-line overrides
- Set up test infrastructure: Add test dependencies to Cargo.toml
- Create integration test template: `tests/integration_test.rs`
- Set up benchmarking: Add `criterion` for latency measurements
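An illustrative shape for `src/config.rs` and the CLI overrides; struct and field names mirror the plan, while defaults and validation ranges are assumptions:
```rust
// Config-loading sketch: TOML file parsed with serde, thin clap layer for overrides.
use clap::Parser;
use serde::Deserialize;

#[derive(Debug, Deserialize)]
pub struct Config {
    pub capture: CaptureConfig,
    pub encoder: EncoderConfig,
    pub webrtc: WebRtcConfig,
}

#[derive(Debug, Deserialize)]
pub struct CaptureConfig {
    #[serde(default = "default_fps")]
    pub fps: u32,
}

#[derive(Debug, Deserialize)]
pub struct EncoderConfig {
    pub bitrate_kbps: u32,
    #[serde(default = "default_preset")]
    pub preset: String, // e.g. "ultrafast"
}

#[derive(Debug, Deserialize)]
pub struct WebRtcConfig {
    pub stun_servers: Vec<String>,
}

fn default_fps() -> u32 { 30 }
fn default_preset() -> String { "ultrafast".into() }

/// CLI overrides applied on top of the TOML file.
#[derive(Debug, Parser)]
pub struct Cli {
    #[arg(long, default_value = "config.toml")]
    pub config: std::path::PathBuf,
    #[arg(long)]
    pub port: Option<u16>,
}

pub fn load(cli: &Cli) -> anyhow::Result<Config> {
    let text = std::fs::read_to_string(&cli.config)?;
    let cfg: Config = toml::from_str(&text)?;
    // Basic range validation only, as required.
    anyhow::ensure!(cfg.capture.fps >= 1 && cfg.capture.fps <= 120, "fps out of range");
    Ok(cfg)
}
```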
**Must NOT do**:
- Implement hot reload of config
- Add complex validation rules (basic range checks only)
- Implement configuration file watching
**Recommended Agent Profile**:
> - **Category**: `quick` (simple config parsing, boilerplate)
> - **Skills**: `[]`
> - **Skills Evaluated but Omitted**: Not applicable
**Parallelization**:
- **Can Run In Parallel**: YES (with Task 1)
- **Parallel Group**: Wave 1 (with Task 1)
- **Blocks**: Tasks 8, 9
- **Blocked By**: None
**References**:
**Pattern References**:
- `DESIGN_CN.md:90-95` - Capture config structure
- `DESIGN_CN.md:124-130` - Encoder config structure
- `DESIGN_CN.md:169-180` - WebRTC config structure
**API/Type References**:
- `serde` derive macros: `#[derive(Serialize, Deserialize)]`
- `toml` crate: `from_str()` for parsing
- `clap` crate: `Parser` trait for CLI
**Test References**:
- `criterion` examples: https://bheisler.github.io/criterion.rs/
- Integration testing patterns in Rust
**Documentation References**:
- `DESIGN_CN.md:248-259` - Dependencies including config tools
- Configuration file best practices
**External References**:
- TOML spec: https://toml.io/
- clap documentation: https://docs.rs/clap/
**WHY Each Reference Matters**:
- Config structure defined in designs - implement exactly
- Standard Rust patterns for config parsing
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
cargo test config::tests::test_parse_valid_config
# Assert: Config file parses correctly
cargo test config::tests::test_cli_overrides
# Assert: CLI args override config file
cargo test --all-targets
# Assert: All tests pass (including integration template)
cargo bench --no-run
# Assert: Benchmarks compile successfully
```
**Evidence to Capture**:
- [x] Config parsing test results
- [x] Test suite execution output
- [x] Benchmark compilation success
**Commit**: YES (grouped with Task 1)
- Message: `feat: add project structure, types, and config system`
- Files: `src/lib.rs`, `src/error.rs`, `src/config.rs`, `config.toml`, `Cargo.toml`, `tests/integration_test.rs`, `benches/`
- Pre-commit: `cargo test --all`
---
- [x] 6. WebRTC Transport Module
**What to do**:
- Implement `src/webrtc/mod.rs` with WebRTC peer connection management
- Create `WebRtcServer` struct: Manage `RTCPeerConnection` instances
- Create `PeerConnection` wrapper: Encapsulate `webrtc` crate types
- Implement video track: `TrackLocalStaticSample` for encoded frames
- Implement SDP handling: `create_offer()`, `set_remote_description()`, `create_answer()`
- Implement ICE handling: ICE candidate callbacks, STUN/TURN support
- Configure low-latency: Minimize playout delay, disable FEC
- Implement data channels: For input events (mouse/keyboard)
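A minimal peer-connection and H.264 track sketch following the shapes used in the webrtc-rs examples; module paths can shift between crate versions, so treat this as illustrative rather than final:
```rust
// Peer connection + sample-based H.264 track (TrackLocalStaticSample handles RTP packetization).
use std::sync::Arc;
use std::time::Duration;
use bytes::Bytes;
use webrtc::api::media_engine::{MediaEngine, MIME_TYPE_H264};
use webrtc::api::APIBuilder;
use webrtc::ice_transport::ice_server::RTCIceServer;
use webrtc::media::Sample;
use webrtc::peer_connection::configuration::RTCConfiguration;
use webrtc::peer_connection::RTCPeerConnection;
use webrtc::rtp_transceiver::rtp_codec::RTCRtpCodecCapability;
use webrtc::track::track_local::track_local_static_sample::TrackLocalStaticSample;
use webrtc::track::track_local::TrackLocal;

pub async fn new_peer(
    stun_urls: Vec<String>,
) -> anyhow::Result<(Arc<RTCPeerConnection>, Arc<TrackLocalStaticSample>)> {
    let mut media = MediaEngine::default();
    media.register_default_codecs()?;
    let api = APIBuilder::new().with_media_engine(media).build();

    let config = RTCConfiguration {
        ice_servers: vec![RTCIceServer { urls: stun_urls, ..Default::default() }],
        ..Default::default()
    };
    let pc = Arc::new(api.new_peer_connection(config).await?);

    // One H.264 video track fed by the encoder output.
    let track = Arc::new(TrackLocalStaticSample::new(
        RTCRtpCodecCapability { mime_type: MIME_TYPE_H264.to_owned(), ..Default::default() },
        "video".to_owned(),
        "wl-webrtc".to_owned(),
    ));
    pc.add_track(Arc::clone(&track) as Arc<dyn TrackLocal + Send + Sync>).await?;
    Ok((pc, track))
}

/// Push one encoded access unit; `data` is the Annex-B bitstream from the encoder.
pub async fn send_frame(track: &TrackLocalStaticSample, data: Bytes) -> anyhow::Result<()> {
    track
        .write_sample(&Sample { data, duration: Duration::from_millis(33), ..Default::default() })
        .await?;
    Ok(())
}
```
SDP and ICE handling (`create_offer`, `set_remote_description`, candidate callbacks) then follow the standard webrtc-rs offer/answer flow on top of this peer connection.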
**Must NOT do**:
- Implement custom WebRTC stack (use webrtc-rs as-is)
- Implement TURN server (configure external servers)
- Implement complex ICE strategies (use default)
**Recommended Agent Profile**:
> - **Category**: `unspecified-high` (WebRTC protocol, async networking)
> - **Skills**: `[]`
> - **Skills Evaluated but Omitted**: Not applicable
**Parallelization**:
- **Can Run In Parallel**: YES (with Tasks 2, 3)
- **Parallel Group**: Wave 2 (with Tasks 2, 3)
- **Blocks**: Task 7, 8
- **Blocked By**: Task 1
**References**:
**Pattern References**:
- `DESIGN_CN.md:786-951` - Complete WebRTC module implementation
- `DESIGN_CN.md:1573-1738` - Low-latency WebRTC configuration
- `DETAILED_DESIGN_CN.md:270-286` - WebRTC transport module design
**API/Type References**:
- `webrtc` crate: `RTCPeerConnection`, `RTCVideoTrack`, `RTCDataChannel`
- `webrtc::api::APIBuilder` for API initialization
- `webrtc::peer_connection::sdp` for SDP handling
- `webrtc::media::Sample` for video samples
**Test References**:
- webrtc-rs examples: https://github.com/webrtc-rs/webrtc/tree/main/examples
- WebRTC protocol specs: https://www.w3.org/TR/webrtc/
**Documentation References**:
- `DESIGN_CN.md:150-181` - WebRTC transport responsibilities
- `DESIGN_CN.md:348-360` - WebRTC library options
- `DESIGN_CN.md:1577-1653` - Low-latency WebRTC configuration
**External References**:
- WebRTC MDN: https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API
- ICE specification: https://tools.ietf.org/html/rfc8445
**WHY Each Reference Matters**:
- WebRTC is complex protocol - use proven library and follow examples
- Low-latency config requires precise parameter tuning
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
cargo test webrtc::tests::test_peer_connection_creation
# Assert: Peer connection initializes with correct config
cargo test webrtc::tests::test_sdp_exchange
# Assert: Offer/Answer exchange works correctly
cargo test webrtc::tests::test_video_track
# Assert: Video track accepts and queues samples
```
**Evidence to Capture**:
- [x] Test execution results
- [x] SDP output (captured in test logs)
- [x] ICE candidate logs
**Commit**: YES
- Message: `feat(webrtc): implement WebRTC transport with low-latency config`
- Files: `src/webrtc/mod.rs`, `src/lib.rs`
- Pre-commit: `cargo test webrtc`
---
- [x] 7. Signaling Server
**What to do**:
- Implement `src/signaling/mod.rs` with WebSocket signaling
- Create `SignalingServer` struct: Manage WebSocket connections
- Implement session management: Map session IDs to peer connections
- Implement SDP exchange: `send_offer()`, `receive_answer()`
- Implement ICE candidate relay: `send_ice_candidate()`, `receive_ice_candidate()`
- Handle client connections: Accept and track sessions (no authentication in v1, per the guardrails below)
- Use async IO: `tokio-tungstenite` or `tokio` WebSocket support
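A sketch of the signaling wire format and accept loop, assuming `tokio-tungstenite` plus `futures-util` for the stream split; the message shape is an assumption, not a fixed protocol:
```rust
// WebSocket signaling sketch: JSON-tagged SDP/ICE messages, one task per client.
use futures_util::{SinkExt, StreamExt};
use serde::{Deserialize, Serialize};
use tokio::net::TcpListener;
use tokio_tungstenite::{accept_async, tungstenite::Message};

#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum SignalMessage {
    Offer { session_id: String, sdp: String },
    Answer { session_id: String, sdp: String },
    IceCandidate { session_id: String, candidate: String },
}

pub async fn run(addr: &str) -> anyhow::Result<()> {
    let listener = TcpListener::bind(addr).await?;
    loop {
        let (stream, peer) = listener.accept().await?;
        tokio::spawn(async move {
            let Ok(ws) = accept_async(stream).await else { return };
            let (mut tx, mut rx) = ws.split();
            while let Some(Ok(Message::Text(text))) = rx.next().await {
                match serde_json::from_str::<SignalMessage>(&text) {
                    Ok(msg) => {
                        // Real implementation relays to the WebRTC layer; echoed back in this sketch.
                        let echo = serde_json::to_string(&msg).unwrap_or_default();
                        let _ = tx.send(Message::text(echo)).await;
                    }
                    Err(e) => tracing::warn!(%peer, "bad signaling message: {e}"),
                }
            }
        });
    }
}
```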
**Must NOT do**:
- Implement authentication/authorization (allow all connections)
- Implement persistent storage (in-memory sessions only)
- Implement NAT traversal beyond ICE (no STUN/TURN server hosting)
**Recommended Agent Profile**:
> - **Category**: `unspecified-low` (simple WebSocket server)
> - **Skills**: `[]`
> - **Skills Evaluated but Omitted**: Not applicable
**Parallelization**:
- **Can Run In Parallel**: YES (with Task 4)
- **Parallel Group**: Wave 3 (with Task 4)
- **Blocks**: Task 8
- **Blocked By**: Tasks 1, 6
**References**:
**Pattern References**:
- `DESIGN_CN.md:954-1007` - IPC/signaling implementation example
- `DETAILED_DESIGN_CN.md:301-314` - Signaling module design
- WebSocket echo server examples
**API/Type References**:
- `tokio::net::TcpListener` for TCP listening
- `tokio_tungstenite` crate: `WebSocketStream`, `accept_async()`
- `serde_json` for message serialization
- `async_channel` or `tokio::sync` for coordination
**Test References**:
- WebSocket examples in tokio ecosystem
- Signaling server patterns in WebRTC tutorials
**Documentation References**:
- `DESIGN_CN.md:27-34` - Signaling server in architecture
- Session management best practices
**External References**:
- WebSocket protocol: https://tools.ietf.org/html/rfc6455
- Signaling patterns: https://webrtc.org/getting-started/signaling
**WHY Each Reference Matters**:
- WebSocket signaling is standard WebRTC pattern - follow proven implementation
- Session management required for multi-client support
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
cargo test signaling::tests::test_websocket_connection
# Assert: Client can connect and disconnect
cargo test signaling::tests::test_sdp_exchange
# Assert: SDP offer/answer relay works
cargo test signaling::tests::test_ice_candidate_relay
# Assert: ICE candidates forwarded correctly
```
**Evidence to Capture**:
- [x] Test execution results
- [x] WebSocket message logs
- [x] Session tracking verification
**Commit**: YES
- Message: `feat(signaling): implement WebSocket signaling server`
- Files: `src/signaling/mod.rs`, `src/lib.rs`
- Pre-commit: `cargo test signaling`
---
- [x] 8. End-to-End Integration
**What to do**:
- Implement `src/main.rs` with application entry point
- Create pipeline orchestration: Capture → Buffer → Encoder → WebRTC
- Integrate all modules: Connect channels and data flow
- Implement graceful shutdown: Handle Ctrl+C, clean up resources
- Add metrics collection: Track latency, frame rate, bitrate
- Implement error recovery: Restart failed modules, log errors
- Test with mock WebRTC client: Verify end-to-end flow
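A high-level orchestration sketch for the channel wiring and shutdown described above; the three stage bodies are placeholders standing in for the Task 2/4/6 implementations:
```rust
// Pipeline skeleton for src/main.rs: bounded channels between stages, exit on Ctrl+C or stage death.
use tokio::signal;

pub async fn run_pipeline() -> anyhow::Result<()> {
    // capture → encoder → webrtc, shallow queues to keep latency bounded
    let (frame_tx, frame_rx) = async_channel::bounded::<Vec<u8>>(2);
    let (packet_tx, packet_rx) = async_channel::bounded::<Vec<u8>>(4);

    let capture = tokio::spawn(async move {
        // placeholder: the PipeWire loop would push CapturedFrame values here
        while frame_tx.send(vec![0u8; 16]).await.is_ok() {
            tokio::time::sleep(std::time::Duration::from_millis(33)).await;
        }
    });
    let encoder = tokio::spawn(async move {
        // placeholder: x264 encode; here frames are just forwarded
        while let Ok(frame) = frame_rx.recv().await {
            if packet_tx.send(frame).await.is_err() { break; }
        }
    });
    let transport = tokio::spawn(async move {
        // placeholder: write_sample() on the WebRTC track
        while let Ok(_pkt) = packet_rx.recv().await {}
    });

    tokio::select! {
        _ = signal::ctrl_c() => tracing::info!("shutdown requested"),
        _ = capture => tracing::error!("capture stage exited"),
        _ = encoder => tracing::error!("encoder stage exited"),
        _ = transport => tracing::error!("transport stage exited"),
    }
    // Dropping the remaining channel ends closes the pipeline so stages drain and stop.
    Ok(())
}
```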
**Must NOT do**:
- Implement production deployment (local testing only)
- Add monitoring/alerting beyond logging
- Implement auto-scaling or load balancing
**Recommended Agent Profile**:
> - **Category**: `unspecified-high` (complex orchestration, async coordination)
> - **Skills**: `[]`
> - **Skills Evaluated but Omitted**: Not applicable
**Parallelization**:
- **Can Run In Parallel**: NO
- **Parallel Group**: Wave 4
- **Blocks**: Task 9
- **Blocked By**: Tasks 4, 7
**References**:
**Pattern References**:
- `DESIGN_CN.md:1044-1064` - Memory ownership transfer through pipeline
- `DESIGN_CN.md:211-244` - Complete data flow
- `DETAILED_DESIGN_CN.md:417-533` - Frame processing sequence
**API/Type References**:
- `tokio` runtime: `tokio::runtime::Runtime`, `tokio::select!`
- `async_channel` for inter-module communication
- `tracing` for structured logging
**Test References**:
- Integration test patterns
- Graceful shutdown examples in async Rust
**Documentation References**:
- `DESIGN_CN.md:1009-1044` - Zero-copy pipeline stages
- Error handling patterns in async Rust
**External References**:
- Tokio orchestration examples: https://tokio.rs/
- Structured logging: https://docs.rs/tracing/
**WHY Each Reference Matters**:
- End-to-end integration requires precise async coordination
- Zero-copy pipeline depends on correct ownership transfer
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
cargo build --release
# Assert: Binary builds successfully
# Run with test config
timeout 30 cargo run --release -- --config config.toml
# Assert: Application starts, no crashes, logs show pipeline active
# Verify metrics collection
cargo test integration::tests::test_end_to_end_flow
# Assert: Frame flows through complete pipeline, metrics collected
```
**Evidence to Capture**:
- [x] Application startup logs
- [x] Pipeline flow verification logs
- [x] Metrics output (latency, frame rate, bitrate)
- [x] Graceful shutdown logs
**Commit**: YES
- Message: `feat: implement end-to-end pipeline integration`
- Files: `src/main.rs`, `src/lib.rs`
- Pre-commit: `cargo test integration`
---
- [x] 9. CLI and User Interface
**What to do**:
- Complete `src/main.rs` CLI implementation
- Implement subcommands: `start`, `stop`, `status`, `config`
- Add useful flags: `--verbose`, `--log-level`, `--port`
- Implement signal handling: Handle SIGINT, SIGTERM for graceful shutdown
- Add configuration validation: Warn on invalid settings at startup
- Implement status command: Show running sessions, metrics
- Create man page or help text: Document all options
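A sketch of the CLI surface with clap derive; the subcommand and flag set mirrors the list above, and the handler bodies are stubs:
```rust
// CLI sketch for src/main.rs.
use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(name = "wl-webrtc", version, about = "Wayland → WebRTC remote desktop backend")]
struct Cli {
    /// Increase log verbosity (-v, -vv).
    #[arg(short, long, action = clap::ArgAction::Count)]
    verbose: u8,
    /// Log level override (trace|debug|info|warn|error).
    #[arg(long, default_value = "info")]
    log_level: String,
    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Start the capture → encode → WebRTC pipeline.
    Start {
        #[arg(long, default_value = "config.toml")]
        config: std::path::PathBuf,
        #[arg(long)]
        port: Option<u16>,
    },
    /// Stop a running instance.
    Stop,
    /// Show running session info and basic metrics.
    Status,
    /// Validate and print the effective configuration.
    Config,
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let cli = Cli::parse();
    // init tracing subscriber from cli.log_level / cli.verbose here
    match cli.command {
        Command::Start { config, port } => {
            println!("starting with {} (port override: {:?})", config.display(), port);
            // load config, install SIGINT/SIGTERM handlers, then run the Task 8 pipeline
        }
        Command::Stop => println!("stop: not implemented in this sketch"),
        Command::Status => println!("no sessions"),
        Command::Config => println!("config: not implemented in this sketch"),
    }
    Ok(())
}
```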
**Must NOT do**:
- Implement TUI or GUI (CLI only)
- Add interactive configuration prompts
- Implement daemon mode (run in foreground)
**Recommended Agent Profile**:
> - **Category**: `quick` (CLI boilerplate, argument parsing)
> - **Skills**: `[]`
> - **Skills Evaluated but Omitted**: Not applicable
**Parallelization**:
- **Can Run In Parallel**: YES (with Task 10)
- **Parallel Group**: Wave 4
- **Blocks**: None
- **Blocked By**: Task 8
**References**:
**Pattern References**:
- CLI examples in Cargo.toml (bin section)
- `clap` crate examples and documentation
- Signal handling in async Rust
**API/Type References**:
- `clap` crate: `Parser`, `Subcommand` derives
- `tokio::signal` for signal handling
- `tracing` for log levels
**Test References**:
- clap documentation for all argument types
- Signal handling patterns
**Documentation References**:
- CLI best practices: https://clig.dev/
- `DESIGN_CN.md` - Configuration options to expose
**External References**:
- clap documentation: https://docs.rs/clap/
**WHY Each Reference Matters**:
- Good CLI design requires following established patterns
- Signal handling critical for graceful shutdown
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
cargo run --release -- --help
# Assert: Help text shows all subcommands and flags
cargo run --release -- start --config config.toml
# Assert: Application starts with correct config
cargo run --release -- status
# Assert: Status command prints session info (or "no sessions")
# Test signal handling
timeout 5 cargo run --release -- start &
PID=$!
sleep 1
kill -INT $PID
wait $PID
# Assert: Exit code 0 (graceful shutdown)
```
**Evidence to Capture**:
- [x] Help output
- [x] Status command output
- [x] Signal handling test results
- [x] Error handling for invalid flags
**Commit**: YES
- Message: `feat(cli): implement complete CLI with subcommands`
- Files: `src/main.rs`
- Pre-commit: `cargo clippy`
---
- [x] 10. Documentation and Examples
**What to do**:
- Create `README.md`: Project overview, features, installation, usage
- Document configuration: Explain all config options in `config.toml.template`
- Add example usage: Show how to start server, connect client
- Document architecture: Explain module design and data flow
- Add troubleshooting section: Common issues and solutions
- Create `examples/` directory: Simple client examples if needed
- Document dependencies: List system-level dependencies (PipeWire, Wayland)
- Add performance notes: Expected latency, resource usage
**Must NOT do**:
- Write extensive API documentation (use Rustdoc comments instead)
- Create video tutorials or complex guides
- Write marketing content (keep technical)
**Recommended Agent Profile**:
> - **Category**: `writing` (documentation creation)
> - **Skills**: `[]`
> - **Skills Evaluated but Omitted**: Not applicable
**Parallelization**:
- **Can Run In Parallel**: YES (with Task 9)
- **Parallel Group**: Wave 4
- **Blocks**: None
- **Blocked By**: Task 8
**References**:
**Pattern References**:
- Rust project README conventions
- Existing design documents (DETAILED_DESIGN_CN.md, etc.)
- Configuration file comments
**API/Type References**:
- Rustdoc: `///` and `//!` documentation comments
**Test References**:
- README examples in popular Rust projects
- Documentation best practices
**Documentation References**:
- `DESIGN_CN.md` - Use architecture diagrams for overview
- `Cargo.toml` - Extract dependency requirements
- Design docs for feature descriptions
**External References**:
- README guidelines: https://www.makeareadme.com/
- Rust API guidelines: https://rust-lang.github.io/api-guidelines/
**WHY Each Reference Matters**:
- Good documentation critical for open-source adoption
- README first thing users see
**Acceptance Criteria**:
**Automated Verification**:
```bash
# Agent runs:
ls -la README.md config.toml.template
# Assert: Files exist and are non-empty
grep -q "Installation" README.md
grep -q "Usage" README.md
grep -q "Architecture" README.md
# Assert: Key sections present
head -20 config.toml.template
# Assert: Template has comments explaining each option
# Verify all public items have docs
cargo doc --no-deps
ls target/doc/wl_webrtc/
# Assert: Documentation generated successfully
```
**Evidence to Capture**:
- [x] README content preview
- [x] Config template preview
- [x] Generated documentation listing
**Commit**: YES
- Message: `docs: add README, config template, and documentation`
- Files: `README.md`, `config.toml.template`, `examples/`
- Pre-commit: None
---
## Commit Strategy
| After Task | Message | Files | Verification |
|------------|---------|-------|--------------|
| 1, 5 | `feat: add project structure, types, and config system` | `src/`, `config.toml`, `Cargo.toml`, `tests/`, `benches/` | `cargo test --all` |
| 3 | `feat(buffer): implement zero-copy buffer management` | `src/buffer/mod.rs`, `src/lib.rs` | `cargo test --lib` |
| 2 | `feat(capture): implement PipeWire screen capture` | `src/capture/mod.rs`, `src/lib.rs` | `cargo test capture` |
| 4 | `feat(encoder): implement x264 software encoder` | `src/encoder/mod.rs`, `src/lib.rs` | `cargo test encoder` |
| 6 | `feat(webrtc): implement WebRTC transport with low-latency config` | `src/webrtc/mod.rs`, `src/lib.rs` | `cargo test webrtc` |
| 7 | `feat(signaling): implement WebSocket signaling server` | `src/signaling/mod.rs`, `src/lib.rs` | `cargo test signaling` |
| 8 | `feat: implement end-to-end pipeline integration` | `src/main.rs`, `src/lib.rs` | `cargo test integration` |
| 9 | `feat(cli): implement complete CLI with subcommands` | `src/main.rs` | `cargo clippy` |
| 10 | `docs: add README, config template, and documentation` | `README.md`, `config.toml.template`, `examples/` | None |
---
## Success Criteria
### Verification Commands
```bash
# Build and test everything
cargo build --release && cargo test --all
# Run basic smoke test
timeout 30 cargo run --release -- start --config config.toml
# Check documentation
cargo doc --no-deps
# Verify CLI
cargo run --release -- --help
```
### Final Checklist
- [x] All 10 tasks completed
- [x] `cargo build --release` succeeds
- [x] `cargo test --all` passes all tests
- [x] End-to-end pipeline verified (capture → encode → send)
- [x] CLI fully functional with all subcommands
- [x] README complete with installation/usage instructions
- [x] Code compiles without warnings (`cargo clippy`)
- [x] Documentation generated successfully
- [x] Config template provided with comments
### Performance Validation (Optional)
- [x] Encoding latency < 20ms for 1080p (measured via benchmarks)
- [x] Capture latency < 5ms (measured via logging)
- [x] Memory usage < 500MB (measured via `ps` or similar)
---
## Appendix
### Notes on Scope Boundaries
**IN SCOPE (This Implementation)**:
- Complete Rust backend implementation
- PipeWire screen capture with DMA-BUF
- x264 software encoder (production-ready)
- WebRTC transport with webrtc-rs
- WebSocket signaling server
- Basic configuration and CLI
- Zero-copy buffer management
- Basic logging and error handling
**OUT OF SCOPE (Future Work)**:
- Hardware encoder implementation (VA-API, NVENC)
- Advanced features: damage tracking, adaptive bitrate, partial region encoding
- Authentication/authorization
- Audio capture and streaming
- Multi-user session management
- Production deployment (Docker, systemd, etc.)
- Browser client implementation
- Comprehensive testing suite (unit + integration + e2e)
- Monitoring/metrics beyond basic logging
### Assumptions Made
1. **Wayland environment available**: Implementation assumes Linux with PipeWire and Wayland compositor
2. **x264 library installed**: System-level x264 library required (via pkg-config)
3. **Single-session focus**: Only one capture session at a time for simplicity
4. **Local network**: Low-latency targets assume LAN environment
5. **Existing WebRTC client**: Backend only, no browser client implementation needed
6. **No authentication**: Allow all WebSocket connections for MVP
7. **Single-threaded encoding**: No parallel encoder pipelines for MVP
### Risks and Mitigations
| Risk | Impact | Mitigation |
|------|--------|------------|
| PipeWire FFI complexity | High | Use proven patterns from pipewire-rs examples |
| DMA-BUF safety | High | Strict RAII, unsafe blocks well-documented, extensive testing |
| WebRTC integration complexity | Medium | Use webrtc-rs as-is, avoid custom implementation |
| Performance targets unmet | Medium | Benchmarking in Task 4, iterative tuning |
| Missing dependencies | Low | Clear documentation of system requirements |
| Testing challenges (requires Wayland) | Medium | Use mock objects where possible, optional tests |
### Alternatives Considered
**WebRTC Library**:
- **Chosen**: `webrtc` (webrtc-rs) - Pure Rust, active development
- **Alternative**: `datachannel` - Rust bindings to libdatachannel, less mature integration
**Async Runtime**:
- **Chosen**: `tokio` - Industry standard, excellent ecosystem
- **Alternative**: `async-std` - More modern, smaller ecosystem
**Software Encoder**:
- **Chosen**: `x264` - Ubiquitous, mature, good quality
- **Alternative**: `openh264` - Cisco's implementation, slightly lower quality
### Dependencies Rationale
**Core**:
- `tokio`: Async runtime, chosen for ecosystem and performance
- `pipewire`: Required for screen capture
- `webrtc`: WebRTC implementation, chosen for zero-copy support
- `x264`: Software encoder fallback, ubiquitous support
**Supporting**:
- `bytes`: Zero-copy buffers, critical for performance
- `async-channel`: Async channels, simpler than tokio channels
- `tracing`: Structured logging, modern and flexible
- `serde/toml`: Configuration parsing, standard ecosystem
- `clap`: CLI parsing, excellent help generation
- `anyhow/thiserror`: Error handling, idiomatic Rust
### Known Limitations
1. **Linux-only**: Wayland/PipeWire specific to Linux
2. **Requires Wayland session**: Cannot run in headless or X11 environments
3. **Hardware encoding deferred**: Only x264 in v1
4. **No audio**: Video-only in v1
5. **Basic signaling**: No authentication, persistence, or advanced features
6. **Single session**: Only one capture session at a time
7. **Local testing**: No cloud deployment guidance
8. **Minimal testing**: Basic integration tests, no comprehensive test suite
### Testing Environment Requirements
To run tests and smoke tests, you need:
- Linux distribution with Wayland
- PipeWire installed and running
- x264 development libraries (`libx264-dev` on Ubuntu/Debian)
- Rust toolchain (stable)
- Optional: Wayland compositor for full integration testing
### Performance Baseline
Expected performance (based on design docs):
- **Capture latency**: 1-2ms (DMA-BUF from PipeWire)
- **Encoding latency**: 15-25ms (x264 ultrafast)
- **WebRTC overhead**: 2-3ms (RTP packetization)
- **Total pipeline**: 18-30ms (excluding network)
- **CPU usage**: 20-40% (software encoding, 1080p@30fps)
- **Memory usage**: 200-400MB
These targets may vary based on hardware and network conditions.