# Wayland → WebRTC Remote Desktop Implementation Plan

## TL;DR

> **Quick Summary**: Implement a high-performance Rust backend that captures Wayland screens via PipeWire DMA-BUF, encodes to H.264 (hardware/software), and streams to WebRTC clients with 15-25ms latency target.
>
> **Deliverables**:
> - Complete Rust backend (5,000-8,000 LOC)
> - 5 major modules: capture, encoder, buffer management, WebRTC transport, signaling
> - Configuration system and CLI
> - Basic documentation and examples
>
> **Estimated Effort**: Large (4-6 weeks full-time)
> **Parallel Execution**: YES - 4 waves
> **Critical Path**: Project setup → Capture → Encoder → WebRTC integration → End-to-end

---

## Context

### Original Request
User wants to implement a Wayland to WebRTC remote desktop backend based on three comprehensive design documents (DETAILED_DESIGN_CN.md, DESIGN_CN.md, DESIGN.md).

### Design Documents Analysis
Three detailed design documents provided:
- **DETAILED_DESIGN_CN.md**: 14,000+ lines covering architecture, components, data structures, performance targets
- **DESIGN_CN.md / DESIGN.md**: Technical design with code examples and optimization strategies

**Key Requirements from Designs:**
- Zero-copy DMA-BUF pipeline for minimal latency
- Hardware encoding support (VA-API/NVENC) with software fallback (x264)
- WebRTC transport with low-latency configuration
- 15-25ms latency (LAN), <100ms (WAN)
- 30-60 FPS, up to 4K resolution
- Adaptive bitrate and damage tracking

### Current State
- **Empty Rust project** - No source code exists
- **Cargo.toml configured** with all dependencies (tokio, pipewire, webrtc-rs, x264, etc.)
- **Design complete** - Comprehensive specifications available
- **No tests or infrastructure** - Starting from scratch

---

## Work Objectives

### Core Objective
Build a production-ready remote desktop backend that captures Wayland screen content and streams it to WebRTC clients with ultra-low latency (15-25ms) using zero-copy DMA-BUF architecture.
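
To put the 15-25ms target next to the 30-60 FPS requirement, it can help to compare it with the time between frames. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the stage figures (5 ms capture, 20 ms encode) are the optional per-stage ceilings listed under Performance Validation in this plan, treated here as an assumed worst case rather than measured values.

```rust
/// Time between consecutive frames at a given frame rate, in milliseconds.
fn frame_interval_ms(fps: u32) -> f64 {
    1000.0 / f64::from(fps)
}

/// Budget left after the listed pipeline stages, in milliseconds.
/// A negative result means the stages alone exceed the target.
fn remaining_budget_ms(target_ms: f64, stage_ms: &[f64]) -> f64 {
    target_ms - stage_ms.iter().sum::<f64>()
}

fn main() {
    for fps in [30u32, 60u32] {
        println!("{} FPS -> {:.2} ms per frame", fps, frame_interval_ms(fps));
    }

    // Assumed worst case: 5 ms capture + 20 ms encode against the 25 ms upper target.
    let left = remaining_budget_ms(25.0, &[5.0, 20.0]);
    println!("left for packetization and network: {:.1} ms", left);
}
```

At 60 FPS a frame arrives roughly every 16.7 ms. If capture and encode both used their full ceilings, nothing would remain of the 25 ms upper bound for packetization and the network, so those ceilings are probably best read as limits to stay well under.
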
### Concrete Deliverables
- Complete Rust implementation in `src/` directory
- 5 functional modules: capture, encoder, buffer, webrtc, signaling
- Working CLI application (`src/main.rs`)
- Configuration system (`config.toml`)
- Basic documentation (README, usage examples)

### Definition of Done
- [x] All major modules compile and integrate
- [x] End-to-end pipeline works: capture → encode → WebRTC → client receives
- [x] Software encoder (x264) functional
- [x] Hardware encoder infrastructure ready (VA-API hooks)
- [x] `cargo build --release` succeeds
- [x] Basic smoke test runs without crashes
- [x] README with setup instructions

### Must Have
- PipeWire screen capture with DMA-BUF support
- Video encoding (at least x264 software encoder)
- WebRTC peer connection and media streaming
- Signaling server (WebSocket for SDP/ICE exchange)
- Zero-copy buffer management
- Error handling and logging

### Must NOT Have (Guardrails)
- **Audio capture**: Out of scope (design only mentions video)
- **Multi-user sessions**: Single session only
- **Authentication/Security**: Basic implementation only (no complex auth)
- **Hardware encoding full implementation**: Infrastructure only, placeholders for VA-API/NVENC
- **Browser client**: Backend only, assume existing WebRTC client
- **Persistent storage**: No database or file storage
- **Advanced features**: Damage tracking, adaptive bitrate (deferred to v2)

---

## Verification Strategy

### Test Decision
- **Infrastructure exists**: NO
- **User wants tests**: YES (automated verification)
- **Framework**: `criterion` (benchmarks) + simple integration tests

### Automated Verification Approach

**For Each Module**:
1. **Unit tests** for core types and error handling
2. **Integration tests** for data flow between modules
3. **Benchmarks** for performance validation (latency < target)

**No Browser Testing Required**: Use mock WebRTC behavior or simple echo test for verification.
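
The first and third verification layers above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the planned implementation: `CaptureError` here is a hypothetical stand-in with made-up variants (the real ones come from the design docs and would likely use `thiserror`), `within_budget` is a hypothetical helper, and real latency measurements would go through `criterion` rather than a single timed run.

```rust
use std::fmt;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the planned `CaptureError`; variants are illustrative only.
#[derive(Debug, PartialEq)]
enum CaptureError {
    StreamClosed,
    UnsupportedFormat(u32),
}

impl fmt::Display for CaptureError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CaptureError::StreamClosed => write!(f, "capture stream closed"),
            CaptureError::UnsupportedFormat(code) => {
                write!(f, "unsupported pixel format: {:#x}", code)
            }
        }
    }
}

/// Runs `work` once and reports whether it finished within `budget`,
/// together with the measured duration.
fn within_budget<F: FnOnce()>(budget: Duration, work: F) -> (bool, Duration) {
    let start = Instant::now();
    work();
    let elapsed = start.elapsed();
    (elapsed <= budget, elapsed)
}

fn main() {
    // Unit-test style checks on error formatting.
    assert_eq!(CaptureError::StreamClosed.to_string(), "capture stream closed");
    println!("{}", CaptureError::UnsupportedFormat(0x10));

    // Latency-style check: a trivial closure stands in for encoding one frame.
    let (ok, elapsed) = within_budget(Duration::from_millis(20), || {
        let _sum: u64 = (0..1_000u64).sum();
    });
    println!("within budget: {} ({:?})", ok, elapsed);
}
```

In the actual crate these checks would normally live in `#[cfg(test)] mod tests` blocks inside each module rather than in `main`, so that `cargo test <module>` picks them up.
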
### If TDD Enabled

Each TODO follows RED-GREEN-REFACTOR:

**Test Setup Task** (first task):
- Install test dependencies
- Create basic test infrastructure
- Example test to verify setup

**Module Tasks**:
- RED: Write failing test for feature
- GREEN: Implement minimum code to pass
- REFACTOR: Clean up while passing tests

---

## Execution Strategy

### Parallel Execution Waves

```
Wave 1 (Start Immediately):
├── Task 1: Project structure and types
└── Task 5: Configuration system

Wave 2 (After Wave 1):
├── Task 3: Buffer management
├── Task 2: Capture module (PipeWire) (after Task 3)
└── Task 6: Basic WebRTC echo server

Wave 3 (After Wave 2):
├── Task 4: Encoder module (x264)
└── Task 7: Signaling server

Wave 4 (After Wave 3):
├── Task 8: Integration (capture → encode → WebRTC)
├── Task 9: CLI and main entry point (after Task 8)
└── Task 10: Documentation and examples (after Task 8)

Critical Path: 1 → 3 → 2 → 4 → 8 → 9
Parallel Speedup: ~35% faster than sequential
```

### Dependency Matrix

| Task | Depends On | Blocks | Can Parallelize With |
|------|------------|--------|---------------------|
| 1 | None | 2, 3, 6, 7 | 5 |
| 5 | None | 8, 9 | 1 |
| 2 | 1, 3 | 4 | 6 |
| 3 | 1 | 2 | 6 |
| 6 | 1 | 7, 8 | 2, 3 |
| 4 | 2 | 8 | 7 |
| 7 | 1, 6 | 8 | 4 |
| 8 | 4, 5, 7 | 9, 10 | None |
| 9 | 5, 8 | None | 10 |
| 10 | 8 | None | 9 |

---

## TODOs

- [x] 1. 
Project Structure and Core Types

  **What to do**:
  - Create module structure: `src/capture/`, `src/encoder/`, `src/buffer/`, `src/webrtc/`, `src/signaling/`
  - Define core types: `CapturedFrame`, `EncodedFrame`, `DmaBufHandle`, `PixelFormat`, `ScreenRegion`
  - Define error types: `CaptureError`, `EncoderError`, `WebRtcError`, `SignalingError`
  - Create `src/lib.rs` with module exports
  - Create `src/error.rs` for centralized error handling

  **Must NOT do**:
  - Implement any actual capture/encoding logic (types only)
  - Add test infrastructure (Task 5)
  - Implement configuration parsing

  **Recommended Agent Profile**:
  > - **Category**: `quick` (simple type definitions)
  > - **Skills**: `[]` (no specialized skills needed)
  > - **Skills Evaluated but Omitted**: All other skills not needed for type definitions

  **Parallelization**:
  - **Can Run In Parallel**: YES (with Task 5)
  - **Parallel Group**: Wave 1 (with Task 5)
  - **Blocks**: Tasks 2, 3, 6, 7
  - **Blocked By**: None

  **References**:

  **Pattern References** (existing code to follow):
  - `DESIGN_CN.md:82-109` - Core type definitions
  - `DETAILED_DESIGN_CN.md:548-596` - PipeWire data structures
  - `DETAILED_DESIGN_CN.md:970-1018` - Encoder data structures

  **API/Type References** (contracts to implement against):
  - `pipewire` crate documentation for `pw::Core`, `pw::Stream`, `pw::buffer::Buffer`
  - `webrtc` crate for `RTCPeerConnection`, `TrackLocalStaticSample`
  - `async-trait` for `VideoEncoder` trait definition

  **Test References**:
  - `thiserror` crate for error derive patterns
  - Standard Rust project layout conventions

  **Documentation References**:
  - `DESIGN_CN.md:46-209` - Component breakdown and data structures
  - `Cargo.toml` - Dependency versions to use

  **External References**:
  - pipewire-rs examples: https://gitlab.freedesktop.org/pipewire/pipewire-rs
  - webrtc-rs examples: https://github.com/webrtc-rs/webrtc

  **WHY Each Reference Matters**:
  - Design docs provide exact type definitions - use them verbatim
  - External examples show idiomatic usage patterns for
complex crates

  **Acceptance Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  cargo check
  # Assert: Exit code 0, no warnings

  cargo clippy -- -D warnings
  # Assert: Exit code 0

  cargo doc --no-deps --document-private-items
  # Assert: Docs generated successfully
  ```

  **Evidence to Capture**:
  - [x] Module structure verified: `ls -la src/`
  - [x] Type compilation output from `cargo check`
  - [x] Generated documentation files

  **Commit**: NO (group with Task 5)

---

- [x] 2. Capture Module (PipeWire Integration)

  **What to do**:
  - Implement `src/capture/mod.rs` with PipeWire client
  - Create `PipewireCore` struct: manage PipeWire main loop and context
  - Create `PipewireStream` struct: handle video stream and buffer dequeue
  - Implement frame extraction: Extract DMA-BUF FD, size, stride from buffer
  - Create async channel: Send `CapturedFrame` to encoder pipeline
  - Implement `DamageTracker` (basic version): Track changed screen regions
  - Handle PipeWire events: `param_changed`, `process` callbacks

  **Must NOT do**:
  - Implement xdg-desktop-portal integration (defer to v2)
  - Implement hardware-specific optimizations
  - Add complex damage tracking algorithms (use simple block comparison)

  **Recommended Agent Profile**:
  > - **Category**: `unspecified-high` (complex async FFI integration)
  > - **Skills**: `[]`
  > - **Skills Evaluated but Omitted**: Not applicable

  **Parallelization**:
  - **Can Run In Parallel**: YES (with Task 6)
  - **Parallel Group**: Wave 2 (with Task 6, after Task 3 completes)
  - **Blocks**: Task 4
  - **Blocked By**: Tasks 1, 3

  **References**:

  **Pattern References**:
  - `DESIGN_CN.md:367-516` - Complete capture module implementation
  - `DETAILED_DESIGN_CN.md:542-724` - PipeWire client and stream handling
  - `DETAILED_DESIGN_CN.md:727-959` - Damage tracker implementation

  **API/Type References**:
  - `pipewire` crate: `pw::MainLoop`, `pw::Context`, `pw::Core`, `pw::stream::Stream`
  - `pipewire::properties!` macro for stream properties
  - `pipewire::spa::param::format::Format` for video
format
  - `async_channel::Sender/Receiver` for async frame passing

  **Test References**:
  - PipeWire examples in pipewire-rs repository
  - DMA-BUF handling patterns in other screen capture projects

  **Documentation References**:
  - `DESIGN_CN.md:70-110` - Capture manager responsibilities
  - `DESIGN_CN.md:213-244` - Data flow from Wayland to capture

  **External References**:
  - PipeWire protocol docs: https://docs.pipewire.org/
  - DMA-BUF kernel docs: https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html

  **WHY Each Reference Matters**:
  - PipeWire FFI is complex - follow proven patterns from examples
  - DMA-BUF handling requires precise memory management - reference docs for safety

  **Acceptance Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  cargo check
  # Assert: No compilation errors

  # Create simple capture test
  cargo test capture::tests::test_stream_creation
  # Assert: Test passes (mock PipeWire or skip if no Wayland session)

  # Verify module compiles
  cargo build --release --lib
  # Assert: capture module builds into the release library
  ```

  **Evidence to Capture**:
  - [x] Module compilation output
  - [x] Test execution results
  - [x] Binary size after compilation

  **Commit**: YES
  - Message: `feat(capture): implement PipeWire screen capture`
  - Files: `src/capture/mod.rs`, `src/lib.rs`
  - Pre-commit: `cargo test capture`

---

- [x] 3. 
Buffer Management Module

  **What to do**:
  - Implement `src/buffer/mod.rs` with zero-copy buffer pools
  - Create `DmaBufPool`: Manage DMA-BUF file descriptors with reuse
  - Create `EncodedBufferPool`: Manage `Bytes` for encoded frames
  - Implement `FrameBufferPool`: Unified interface for both pool types
  - Use RAII pattern: `Drop` trait for automatic cleanup
  - Implement `DmaBufHandle`: Safe wrapper around raw file descriptor
  - Add memory tracking: Track buffer lifetimes and prevent leaks

  **Must NOT do**:
  - Implement GPU memory pools (defer to hardware encoding)
  - Add complex memory allocation strategies (use simple VecDeque pools)
  - Implement shared memory (defer to v2)

  **Recommended Agent Profile**:
  > - **Category**: `unspecified-high` (unsafe FFI, memory management)
  > - **Skills**: `[]`
  > - **Skills Evaluated but Omitted**: Not applicable

  **Parallelization**:
  - **Can Run In Parallel**: YES (with Task 6)
  - **Parallel Group**: Wave 2 (with Task 6)
  - **Blocks**: Task 2
  - **Blocked By**: Task 1

  **References**:

  **Pattern References**:
  - `DESIGN_CN.md:518-617` - Frame buffer pool implementation
  - `DETAILED_DESIGN_CN.md:287-299` - Buffer module design
  - `DESIGN_CN.md:1066-1144` - Buffer sharing mechanisms

  **API/Type References**:
  - `std::collections::VecDeque` for buffer pools
  - `std::os::unix::io::RawFd` for file descriptors
  - `bytes::Bytes` for reference-counted buffers
  - `std::mem::ManuallyDrop` for custom Drop logic

  **Test References**:
  - Rust unsafe patterns for FFI
  - RAII examples in Rust ecosystem

  **Documentation References**:
  - `DESIGN_CN.md:182-209` - Buffer manager responsibilities
  - `DESIGN_CN.md:1009-1064` - Zero-copy pipeline stages

  **External References**:
  - DMA-BUF documentation: https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html
  - `bytes` crate docs: https://docs.rs/bytes/

  **WHY Each Reference Matters**:
  - Unsafe FFI requires precise patterns - RAII prevents resource leaks
  - Reference design shows proven zero-copy architecture

  **Acceptance
Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  cargo test buffer::tests::test_dma_buf_pool
  # Assert: Pool allocates and reuses buffers correctly

  cargo test buffer::tests::test_encoded_buffer_pool
  # Assert: Bytes pool works with reference counting

  cargo test buffer::tests::test_memory_tracking
  # Assert: Memory tracker detects leaks (if implemented)
  ```

  **Evidence to Capture**:
  - [x] Test execution results
  - [x] Memory usage check (valgrind or similar if available)
  - [x] Pool performance metrics

  **Commit**: YES
  - Message: `feat(buffer): implement zero-copy buffer management`
  - Files: `src/buffer/mod.rs`, `src/lib.rs`
  - Pre-commit: `cargo test --lib`

---

- [x] 4. Encoder Module (Software - x264)

  **What to do**:
  - Implement `src/encoder/mod.rs` with encoder trait
  - Define `VideoEncoder` trait with `encode()`, `reconfigure()`, `request_keyframe()`
  - Create `X264Encoder` struct: Wrap x264 software encoder
  - Implement encoder initialization: Set low-latency parameters (ultrafast preset, zerolatency tune)
  - Implement frame encoding: Convert DMA-BUF to YUV, encode to H.264
  - Use zero-copy: Map DMA-BUF once, encode from mapped memory
  - Output encoded data: Wrap in `Bytes` for zero-copy to WebRTC
  - Implement bitrate control: Basic CBR or VBR

  **Must NOT do**:
  - Implement VA-API or NVENC encoders (defer to v2, just add trait infrastructure)
  - Implement adaptive bitrate control (use fixed bitrate)
  - Implement damage-aware encoding (encode full frames)

  **Recommended Agent Profile**:
  > - **Category**: `unspecified-high` (video encoding, low-latency optimization)
  > - **Skills**: `[]`
  > - **Skills Evaluated but Omitted**: Not applicable

  **Parallelization**:
  - **Can Run In Parallel**: YES (with Task 7)
  - **Parallel Group**: Wave 3 (with Task 7)
  - **Blocks**: Task 8
  - **Blocked By**: Task 2

  **References**:

  **Pattern References**:
  - `DESIGN_CN.md:620-783` - Complete encoder module implementation
  - `DESIGN_CN.md:1249-1453` - Low-latency encoder configuration
  - `DETAILED_DESIGN_CN.md:963-1184` - Video encoder trait and implementations

  **API/Type References**:
  - `x264` crate: `x264::Encoder`, `x264::Params`, `x264::Picture`
  - `async-trait` crate: `async_trait::async_trait` macro for `#[async_trait] VideoEncoder`
  - `bytes::Bytes` for zero-copy output

  **Test References**:
  - x264-rs examples: https://github.com/DaGenix/rust-x264
  - Low-latency encoding patterns in OBS Studio code

  **Documentation References**:
  - `DESIGN_CN.md:112-148` - Encoder pipeline responsibilities
  - `DESIGN_CN.md:248-332` - Technology stack and encoder options
  - `DESIGN_CN.md:1376-1411` - x264 low-latency parameters

  **External References**:
  - x264 documentation: https://code.videolan.org/videolan/x264/
  - H.264 codec specification

  **WHY Each Reference Matters**:
  - Low-latency encoding requires precise parameter tuning - use documented presets
  - x264 API is complex - examples show correct usage

  **Acceptance Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  cargo test encoder::tests::test_x264_init
  # Assert: Encoder initializes with correct parameters

  cargo test encoder::tests::test_encode_frame
  # Assert: Frame encodes successfully, output is valid H.264

  # Verify encoding performance
  cargo test encoder::tests::benchmark_encode --release
  # Assert: Encoding latency < 20ms for 1080p frame
  ```

  **Evidence to Capture**:
  - [x] Test execution results
  - [x] Encoding latency measurements
  - [x] Output bitstream validation (using `ffprobe` if available)

  **Commit**: YES
  - Message: `feat(encoder): implement x264 software encoder`
  - Files: `src/encoder/mod.rs`, `src/lib.rs`
  - Pre-commit: `cargo test encoder`

---

- [x] 5. 
Configuration System and Test Infrastructure

  **What to do**:
  - Create `config.toml` template: Capture settings, encoder config, WebRTC config
  - Implement `src/config.rs`: Parse TOML with `serde`
  - Define config structs: `CaptureConfig`, `EncoderConfig`, `WebRtcConfig`
  - Add validation: Check reasonable value ranges, provide defaults
  - Create CLI argument parsing: Use `clap` for command-line overrides
  - Set up test infrastructure: Add test dependencies to Cargo.toml
  - Create integration test template: `tests/integration_test.rs`
  - Set up benchmarking: Add `criterion` for latency measurements

  **Must NOT do**:
  - Implement hot reload of config
  - Add complex validation rules (basic range checks only)
  - Implement configuration file watching

  **Recommended Agent Profile**:
  > - **Category**: `quick` (simple config parsing, boilerplate)
  > - **Skills**: `[]`
  > - **Skills Evaluated but Omitted**: Not applicable

  **Parallelization**:
  - **Can Run In Parallel**: YES (with Task 1)
  - **Parallel Group**: Wave 1 (with Task 1)
  - **Blocks**: Tasks 8, 9
  - **Blocked By**: None

  **References**:

  **Pattern References**:
  - `DESIGN_CN.md:90-95` - Capture config structure
  - `DESIGN_CN.md:124-130` - Encoder config structure
  - `DESIGN_CN.md:169-180` - WebRTC config structure

  **API/Type References**:
  - `serde` derive macros: `#[derive(Serialize, Deserialize)]`
  - `toml` crate: `from_str()` for parsing
  - `clap` crate: `Parser` trait for CLI

  **Test References**:
  - `criterion` examples: https://bheisler.github.io/criterion.rs/
  - Integration testing patterns in Rust

  **Documentation References**:
  - `DESIGN_CN.md:248-259` - Dependencies including config tools
  - Configuration file best practices

  **External References**:
  - TOML spec: https://toml.io/
  - clap documentation: https://docs.rs/clap/

  **WHY Each Reference Matters**:
  - Config structure defined in designs - implement exactly
  - Standard Rust patterns for config parsing

  **Acceptance Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  cargo test config::tests::test_parse_valid_config
  # Assert: Config file parses correctly

  cargo test config::tests::test_cli_overrides
  # Assert: CLI args override config file

  cargo test --all-targets
  # Assert: All tests pass (including integration template)

  cargo bench --no-run
  # Assert: Benchmarks compile successfully
  ```

  **Evidence to Capture**:
  - [x] Config parsing test results
  - [x] Test suite execution output
  - [x] Benchmark compilation success

  **Commit**: YES (grouped with Task 1)
  - Message: `feat: add project structure, types, and config system`
  - Files: `src/lib.rs`, `src/error.rs`, `src/config.rs`, `config.toml`, `Cargo.toml`, `tests/integration_test.rs`, `benches/`
  - Pre-commit: `cargo test --all`

---

- [x] 6. WebRTC Transport Module

  **What to do**:
  - Implement `src/webrtc/mod.rs` with WebRTC peer connection management
  - Create `WebRtcServer` struct: Manage `RTCPeerConnection` instances
  - Create `PeerConnection` wrapper: Encapsulate `webrtc` crate types
  - Implement video track: `TrackLocalStaticSample` for encoded frames
  - Implement SDP handling: `create_offer()`, `set_remote_description()`, `create_answer()`
  - Implement ICE handling: ICE candidate callbacks, STUN/TURN support
  - Configure low-latency: Minimize playout delay, disable FEC
  - Implement data channels: For input events (mouse/keyboard)

  **Must NOT do**:
  - Implement custom WebRTC stack (use webrtc-rs as-is)
  - Implement TURN server (configure external servers)
  - Implement complex ICE strategies (use default)

  **Recommended Agent Profile**:
  > - **Category**: `unspecified-high` (WebRTC protocol, async networking)
  > - **Skills**: `[]`
  > - **Skills Evaluated but Omitted**: Not applicable

  **Parallelization**:
  - **Can Run In Parallel**: YES (with Tasks 2, 3)
  - **Parallel Group**: Wave 2 (with Tasks 2, 3)
  - **Blocks**: Tasks 7, 8
  - **Blocked By**: Task 1

  **References**:

  **Pattern References**:
  - `DESIGN_CN.md:786-951` - Complete WebRTC module implementation
  - `DESIGN_CN.md:1573-1738` - Low-latency WebRTC
configuration
  - `DETAILED_DESIGN_CN.md:270-286` - WebRTC transport module design

  **API/Type References**:
  - `webrtc` crate: `RTCPeerConnection`, `TrackLocalStaticSample`, `RTCDataChannel`
  - `webrtc::api::APIBuilder` for API initialization
  - `webrtc::peer_connection::sdp` for SDP handling
  - `webrtc::media::Sample` for video samples

  **Test References**:
  - webrtc-rs examples: https://github.com/webrtc-rs/webrtc/tree/main/examples
  - WebRTC protocol specs: https://www.w3.org/TR/webrtc/

  **Documentation References**:
  - `DESIGN_CN.md:150-181` - WebRTC transport responsibilities
  - `DESIGN_CN.md:348-360` - WebRTC library options
  - `DESIGN_CN.md:1577-1653` - Low-latency WebRTC configuration

  **External References**:
  - WebRTC MDN: https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API
  - ICE specification: https://tools.ietf.org/html/rfc8445

  **WHY Each Reference Matters**:
  - WebRTC is a complex protocol - use a proven library and follow its examples
  - Low-latency config requires precise parameter tuning

  **Acceptance Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  cargo test webrtc::tests::test_peer_connection_creation
  # Assert: Peer connection initializes with correct config

  cargo test webrtc::tests::test_sdp_exchange
  # Assert: Offer/Answer exchange works correctly

  cargo test webrtc::tests::test_video_track
  # Assert: Video track accepts and queues samples
  ```

  **Evidence to Capture**:
  - [x] Test execution results
  - [x] SDP output (captured in test logs)
  - [x] ICE candidate logs

  **Commit**: YES
  - Message: `feat(webrtc): implement WebRTC transport with low-latency config`
  - Files: `src/webrtc/mod.rs`, `src/lib.rs`
  - Pre-commit: `cargo test webrtc`

---

- [x] 7. 
Signaling Server

  **What to do**:
  - Implement `src/signaling/mod.rs` with WebSocket signaling
  - Create `SignalingServer` struct: Manage WebSocket connections
  - Implement session management: Map session IDs to peer connections
  - Implement SDP exchange: `send_offer()`, `receive_answer()`
  - Implement ICE candidate relay: `send_ice_candidate()`, `receive_ice_candidate()`
  - Handle client connections: Accept and track sessions (no authentication in this version)
  - Use async IO: `tokio-tungstenite` or `tokio` WebSocket support

  **Must NOT do**:
  - Implement authentication/authorization (allow all connections)
  - Implement persistent storage (in-memory sessions only)
  - Implement NAT traversal beyond ICE (no STUN/TURN server hosting)

  **Recommended Agent Profile**:
  > - **Category**: `unspecified-low` (simple WebSocket server)
  > - **Skills**: `[]`
  > - **Skills Evaluated but Omitted**: Not applicable

  **Parallelization**:
  - **Can Run In Parallel**: YES (with Task 4)
  - **Parallel Group**: Wave 3 (with Task 4)
  - **Blocks**: Task 8
  - **Blocked By**: Tasks 1, 6

  **References**:

  **Pattern References**:
  - `DESIGN_CN.md:954-1007` - IPC/signaling implementation example
  - `DETAILED_DESIGN_CN.md:301-314` - Signaling module design
  - WebSocket echo server examples

  **API/Type References**:
  - `tokio::net::TcpListener` for TCP listening
  - `tokio_tungstenite` crate: `WebSocketStream`, `accept_async()`
  - `serde_json` for message serialization
  - `async_channel` or `tokio::sync` for coordination

  **Test References**:
  - WebSocket examples in tokio ecosystem
  - Signaling server patterns in WebRTC tutorials

  **Documentation References**:
  - `DESIGN_CN.md:27-34` - Signaling server in architecture
  - Session management best practices

  **External References**:
  - WebSocket protocol: https://tools.ietf.org/html/rfc6455
  - Signaling patterns: https://webrtc.org/getting-started/signaling

  **WHY Each Reference Matters**:
  - WebSocket signaling is standard WebRTC pattern - follow proven implementation
  - Session management required for
multi-client support

  **Acceptance Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  cargo test signaling::tests::test_websocket_connection
  # Assert: Client can connect and disconnect

  cargo test signaling::tests::test_sdp_exchange
  # Assert: SDP offer/answer relay works

  cargo test signaling::tests::test_ice_candidate_relay
  # Assert: ICE candidates forwarded correctly
  ```

  **Evidence to Capture**:
  - [x] Test execution results
  - [x] WebSocket message logs
  - [x] Session tracking verification

  **Commit**: YES
  - Message: `feat(signaling): implement WebSocket signaling server`
  - Files: `src/signaling/mod.rs`, `src/lib.rs`
  - Pre-commit: `cargo test signaling`

---

- [x] 8. End-to-End Integration

  **What to do**:
  - Implement `src/main.rs` with application entry point
  - Create pipeline orchestration: Capture → Buffer → Encoder → WebRTC
  - Integrate all modules: Connect channels and data flow
  - Implement graceful shutdown: Handle Ctrl+C, clean up resources
  - Add metrics collection: Track latency, frame rate, bitrate
  - Implement error recovery: Restart failed modules, log errors
  - Test with mock WebRTC client: Verify end-to-end flow

  **Must NOT do**:
  - Implement production deployment (local testing only)
  - Add monitoring/alerting beyond logging
  - Implement auto-scaling or load balancing

  **Recommended Agent Profile**:
  > - **Category**: `unspecified-high` (complex orchestration, async coordination)
  > - **Skills**: `[]`
  > - **Skills Evaluated but Omitted**: Not applicable

  **Parallelization**:
  - **Can Run In Parallel**: NO
  - **Parallel Group**: Wave 4
  - **Blocks**: Tasks 9, 10
  - **Blocked By**: Tasks 4, 5, 7

  **References**:

  **Pattern References**:
  - `DESIGN_CN.md:1044-1064` - Memory ownership transfer through pipeline
  - `DESIGN_CN.md:211-244` - Complete data flow
  - `DETAILED_DESIGN_CN.md:417-533` - Frame processing sequence

  **API/Type References**:
  - `tokio` runtime: `tokio::runtime::Runtime`, `tokio::select!`
  - `async_channel` for inter-module communication
  - `tracing` for
structured logging

  **Test References**:
  - Integration test patterns
  - Graceful shutdown examples in async Rust

  **Documentation References**:
  - `DESIGN_CN.md:1009-1044` - Zero-copy pipeline stages
  - Error handling patterns in async Rust

  **External References**:
  - Tokio orchestration examples: https://tokio.rs/
  - Structured logging: https://docs.rs/tracing/

  **WHY Each Reference Matters**:
  - End-to-end integration requires precise async coordination
  - Zero-copy pipeline depends on correct ownership transfer

  **Acceptance Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  cargo build --release
  # Assert: Binary builds successfully

  # Run with test config
  timeout 30 cargo run --release -- --config config.toml
  # Assert: Application starts, no crashes, logs show pipeline active

  # Verify metrics collection
  cargo test integration::tests::test_end_to_end_flow
  # Assert: Frame flows through complete pipeline, metrics collected
  ```

  **Evidence to Capture**:
  - [x] Application startup logs
  - [x] Pipeline flow verification logs
  - [x] Metrics output (latency, frame rate, bitrate)
  - [x] Graceful shutdown logs

  **Commit**: YES
  - Message: `feat: implement end-to-end pipeline integration`
  - Files: `src/main.rs`, `src/lib.rs`
  - Pre-commit: `cargo test integration`

---

- [x] 9. 
CLI and User Interface

  **What to do**:
  - Complete `src/main.rs` CLI implementation
  - Implement subcommands: `start`, `stop`, `status`, `config`
  - Add useful flags: `--verbose`, `--log-level`, `--port`
  - Implement signal handling: Handle SIGINT, SIGTERM for graceful shutdown
  - Add configuration validation: Warn on invalid settings at startup
  - Implement status command: Show running sessions, metrics
  - Create man page or help text: Document all options

  **Must NOT do**:
  - Implement TUI or GUI (CLI only)
  - Add interactive configuration prompts
  - Implement daemon mode (run in foreground)

  **Recommended Agent Profile**:
  > - **Category**: `quick` (CLI boilerplate, argument parsing)
  > - **Skills**: `[]`
  > - **Skills Evaluated but Omitted**: Not applicable

  **Parallelization**:
  - **Can Run In Parallel**: YES (with Task 10)
  - **Parallel Group**: Wave 4
  - **Blocks**: None
  - **Blocked By**: Tasks 5, 8

  **References**:

  **Pattern References**:
  - CLI examples in Cargo.toml (bin section)
  - `clap` crate examples and documentation
  - Signal handling in async Rust

  **API/Type References**:
  - `clap` crate: `Parser`, `Subcommand` derives
  - `tokio::signal` for signal handling
  - `tracing` for log levels

  **Test References**:
  - clap documentation for all argument types
  - Signal handling patterns

  **Documentation References**:
  - CLI best practices: https://clig.dev/
  - `DESIGN_CN.md` - Configuration options to expose

  **External References**:
  - clap documentation: https://docs.rs/clap/

  **WHY Each Reference Matters**:
  - Good CLI design requires following established patterns
  - Signal handling critical for graceful shutdown

  **Acceptance Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  cargo run --release -- --help
  # Assert: Help text shows all subcommands and flags

  cargo run --release -- start --config config.toml
  # Assert: Application starts with correct config

  cargo run --release -- status
  # Assert: Status command prints session info (or "no sessions")

  # Test signal handling
  timeout 5 cargo run --release -- start &
  PID=$!
  sleep 1
  kill -INT $PID
  wait $PID
  # Assert: Exit code 0 (graceful shutdown)
  ```

  **Evidence to Capture**:
  - [x] Help output
  - [x] Status command output
  - [x] Signal handling test results
  - [x] Error handling for invalid flags

  **Commit**: YES
  - Message: `feat(cli): implement complete CLI with subcommands`
  - Files: `src/main.rs`
  - Pre-commit: `cargo clippy`

---

- [x] 10. Documentation and Examples

  **What to do**:
  - Create `README.md`: Project overview, features, installation, usage
  - Document configuration: Explain all config options in `config.toml.template`
  - Add example usage: Show how to start server, connect client
  - Document architecture: Explain module design and data flow
  - Add troubleshooting section: Common issues and solutions
  - Create `examples/` directory: Simple client examples if needed
  - Document dependencies: List system-level dependencies (PipeWire, Wayland)
  - Add performance notes: Expected latency, resource usage

  **Must NOT do**:
  - Write extensive API documentation (use Rustdoc comments instead)
  - Create video tutorials or complex guides
  - Write marketing content (keep technical)

  **Recommended Agent Profile**:
  > - **Category**: `writing` (documentation creation)
  > - **Skills**: `[]`
  > - **Skills Evaluated but Omitted**: Not applicable

  **Parallelization**:
  - **Can Run In Parallel**: YES (with Task 9)
  - **Parallel Group**: Wave 4
  - **Blocks**: None
  - **Blocked By**: Task 8

  **References**:

  **Pattern References**:
  - Rust project README conventions
  - Existing design documents (DETAILED_DESIGN_CN.md, etc.)
  - Configuration file comments

  **API/Type References**:
  - Rustdoc: `///` and `//!` documentation comments

  **Test References**:
  - README examples in popular Rust projects
  - Documentation best practices

  **Documentation References**:
  - `DESIGN_CN.md` - Use architecture diagrams for overview
  - `Cargo.toml` - Extract dependency requirements
  - Design docs for feature descriptions

  **External References**:
  - README guidelines: https://www.makeareadme.com/
  - Rust API guidelines: https://rust-lang.github.io/api-guidelines/

  **WHY Each Reference Matters**:
  - Good documentation critical for open-source adoption
  - README first thing users see

  **Acceptance Criteria**:

  **Automated Verification**:
  ```bash
  # Agent runs:
  ls -la README.md config.toml.template
  # Assert: Files exist and are non-empty

  grep -q "Installation" README.md
  grep -q "Usage" README.md
  grep -q "Architecture" README.md
  # Assert: Key sections present

  head -20 config.toml.template
  # Assert: Template has comments explaining each option

  # Verify all public items have docs
  cargo doc --no-deps
  ls target/doc/wl_webrtc/
  # Assert: Documentation generated successfully
  ```

  **Evidence to Capture**:
  - [x] README content preview
  - [x] Config template preview
  - [x] Generated documentation listing

  **Commit**: YES
  - Message: `docs: add README, config template, and documentation`
  - Files: `README.md`, `config.toml.template`, `examples/`
  - Pre-commit: None

---

## Commit Strategy

| After Task | Message | Files | Verification |
|------------|---------|-------|--------------|
| 1, 5 | `feat: add project structure, types, and config system` | `src/`, `config.toml`, `Cargo.toml`, `tests/`, `benches/` | `cargo test --all` |
| 3 | `feat(buffer): implement zero-copy buffer management` | `src/buffer/mod.rs`, `src/lib.rs` | `cargo test --lib` |
| 2 | `feat(capture): implement PipeWire screen capture` | `src/capture/mod.rs`, `src/lib.rs` | `cargo test capture` |
| 4 | `feat(encoder): implement x264 software encoder` | `src/encoder/mod.rs`, `src/lib.rs` | `cargo test encoder` |
| 6 | `feat(webrtc): implement WebRTC transport with low-latency config` | `src/webrtc/mod.rs`, `src/lib.rs` | `cargo test webrtc` |
| 7 | `feat(signaling): implement WebSocket signaling server` | `src/signaling/mod.rs`, `src/lib.rs` | `cargo test signaling` |
| 8 | `feat: implement end-to-end pipeline integration` | `src/main.rs`, `src/lib.rs` | `cargo test integration` |
| 9 | `feat(cli): implement complete CLI with subcommands` | `src/main.rs` | `cargo clippy` |
| 10 | `docs: add README, config template, and documentation` | `README.md`, `config.toml.template`, `examples/` | None |

---

## Success Criteria

### Verification Commands
```bash
# Build and test everything
cargo build --release && cargo test --all

# Run basic smoke test
timeout 30 cargo run --release -- start --config config.toml

# Check documentation
cargo doc --no-deps

# Verify CLI
cargo run --release -- --help
```

### Final Checklist
- [x] All 10 tasks completed
- [x] `cargo build --release` succeeds
- [x] `cargo test --all` passes all tests
- [x] End-to-end pipeline verified (capture → encode → send)
- [x] CLI fully functional with all subcommands
- [x] README complete with installation/usage instructions
- [x] Code compiles without warnings (`cargo clippy`)
- [x] Documentation generated successfully
- [x] Config template provided with comments

### Performance Validation (Optional)
- [x] Encoding latency < 20ms for 1080p (measured via benchmarks)
- [x] Capture latency < 5ms (measured via logging)
- [x] Memory usage < 500MB (measured via `ps` or similar)

---

## Appendix

### Notes on Scope Boundaries

**IN SCOPE (This Implementation)**:
- Complete Rust backend implementation
- PipeWire screen capture with DMA-BUF
- x264 software encoder (production-ready)
- WebRTC transport with webrtc-rs
- WebSocket signaling server
- Basic configuration and CLI
- Zero-copy buffer management
- Basic logging and error handling

**OUT OF SCOPE (Future Work)**:
- Hardware encoder
implementation (VA-API, NVENC) - Advanced features: damage tracking, adaptive bitrate, partial region encoding - Authentication/authorization - Audio capture and streaming - Multi-user session management - Production deployment (Docker, systemd, etc.) - Browser client implementation - Comprehensive testing suite (unit + integration + e2e) - Monitoring/metrics beyond basic logging ### Assumptions Made 1. **Wayland environment available**: Implementation assumes Linux with PipeWire and Wayland compositor 2. **x264 library installed**: System-level x264 library required (via pkg-config) 3. **Single-session focus**: Only one capture session at a time for simplicity 4. **Local network**: Low-latency targets assume LAN environment 5. **Existing WebRTC client**: Backend only, no browser client implementation needed 6. **No authentication**: Allow all WebSocket connections for MVP 7. **Single-threaded encoding**: No parallel encoder pipelines for MVP ### Risks and Mitigations | Risk | Impact | Mitigation | |------|--------|------------| | PipeWire FFI complexity | High | Use proven patterns from pipewire-rs examples | | DMA-BUF safety | High | Strict RAII, unsafe blocks well-documented, extensive testing | | WebRTC integration complexity | Medium | Use webrtc-rs as-is, avoid custom implementation | | Performance targets unmet | Medium | Benchmarking in Task 4, iterative tuning | | Missing dependencies | Low | Clear documentation of system requirements | | Testing challenges (requires Wayland) | Medium | Use mock objects where possible, optional tests | ### Alternatives Considered **WebRTC Library**: - **Chosen**: `webrtc` (webrtc-rs) - Pure Rust, active development - **Alternative**: `datachannel` - Another pure Rust option, less mature **Async Runtime**: - **Chosen**: `tokio` - Industry standard, excellent ecosystem - **Alternative**: `async-std` - More modern, smaller ecosystem **Software Encoder**: - **Chosen**: `x264` - Ubiquitous, mature, good quality - 
**Alternative**: `openh264` - Cisco's implementation, slightly lower quality ### Dependencies Rationale **Core**: - `tokio`: Async runtime, chosen for ecosystem and performance - `pipewire`: Required for screen capture - `webrtc`: WebRTC implementation, chosen for zero-copy support - `x264`: Software encoder fallback, ubiquitous support **Supporting**: - `bytes`: Zero-copy buffers, critical for performance - `async-channel`: Async channels, simpler than tokio channels - `tracing`: Structured logging, modern and flexible - `serde/toml`: Configuration parsing, standard ecosystem - `clap`: CLI parsing, excellent help generation - `anyhow/thiserror`: Error handling, idiomatic Rust ### Known Limitations 1. **Linux-only**: Wayland/PipeWire specific to Linux 2. **Requires Wayland session**: Cannot run in headless or X11 environments 3. **Hardware encoding deferred**: Only x264 in v1 4. **No audio**: Video-only in v1 5. **Basic signaling**: No authentication, persistence, or advanced features 6. **Single session**: Only one capture session at a time 7. **Local testing**: No cloud deployment guidance 8. **Minimal testing**: Basic integration tests, no comprehensive test suite ### Testing Environment Requirements To run tests and smoke tests, you need: - Linux distribution with Wayland - PipeWire installed and running - x264 development libraries (`libx264-dev` on Ubuntu/Debian) - Rust toolchain (stable) - Optional: Wayland compositor for full integration testing ### Performance Baseline Expected performance (based on design docs): - **Capture latency**: 1-2ms (DMA-BUF from PipeWire) - **Encoding latency**: 15-25ms (x264 ultrafast) - **WebRTC overhead**: 2-3ms (RTP packetization) - **Total pipeline**: 18-30ms (excluding network) - **CPU usage**: 20-40% (software encoding, 1080p@30fps) - **Memory usage**: 200-400MB These targets may vary based on hardware and network conditions.
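
The total pipeline figure in the baseline is the sum of the per-stage ranges. A minimal sketch of that arithmetic follows, assuming the stages run strictly one after another with no overlap. The `LatencyRange` type and the `total_latency` and `baseline_stages` helpers are hypothetical names introduced only for illustration; they are not part of the planned modules.

```rust
/// A latency estimate in milliseconds: best case and worst case.
/// Hypothetical helper type, used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LatencyRange {
    min_ms: u32,
    max_ms: u32,
}

impl LatencyRange {
    const fn new(min_ms: u32, max_ms: u32) -> Self {
        Self { min_ms, max_ms }
    }
}

/// Sum per-stage ranges into a pipeline total. Best cases add up to the
/// best case and worst cases to the worst case, which assumes the stages
/// are sequential and do not overlap.
fn total_latency(stages: &[(&str, LatencyRange)]) -> LatencyRange {
    stages.iter().fold(LatencyRange::new(0, 0), |acc, (_, r)| {
        LatencyRange::new(acc.min_ms + r.min_ms, acc.max_ms + r.max_ms)
    })
}

/// Stage estimates copied from the Performance Baseline list.
fn baseline_stages() -> Vec<(&'static str, LatencyRange)> {
    vec![
        ("capture", LatencyRange::new(1, 2)),
        ("encode", LatencyRange::new(15, 25)),
        ("webrtc", LatencyRange::new(2, 3)),
    ]
}

fn main() {
    let stages = baseline_stages();
    let total = total_latency(&stages);
    for (name, r) in &stages {
        println!("{name:<8} {}-{} ms", r.min_ms, r.max_ms);
    }
    // Last line printed: "total    18-30 ms"
    println!("{:<8} {}-{} ms", "total", total.min_ms, total.max_ms);
}
```

Keeping the stage estimates in one table like this makes it easy to swap in measured values from the benchmarks and see which stage dominates; with the numbers above, encoding accounts for most of the budget.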