wl-webrtc/.sisyphus/plans/wl-webrtc-implementation.md
2026-02-03 11:14:25 +08:00

Wayland → WebRTC Remote Desktop Implementation Plan

TL;DR

Quick Summary: Implement a high-performance Rust backend that captures Wayland screens via PipeWire DMA-BUF, encodes to H.264 (hardware/software), and streams to WebRTC clients with 15-25ms latency target.

Deliverables:

  • Complete Rust backend (5,000-8,000 LOC)
  • 5 major modules: capture, encoder, buffer management, WebRTC transport, signaling
  • Configuration system and CLI
  • Basic documentation and examples

Estimated Effort: Large (4-6 weeks full-time)
Parallel Execution: YES - 4 waves
Critical Path: Project setup → Capture → Encoder → WebRTC integration → End-to-end


Context

Original Request

User wants to implement a Wayland to WebRTC remote desktop backend based on three comprehensive design documents (DETAILED_DESIGN_CN.md, DESIGN_CN.md, DESIGN.md).

Design Documents Analysis

Three detailed design documents provided:

  • DETAILED_DESIGN_CN.md: 14,000+ lines covering architecture, components, data structures, performance targets
  • DESIGN_CN.md / DESIGN.md: Technical design with code examples and optimization strategies

Key Requirements from Designs:

  • Zero-copy DMA-BUF pipeline for minimal latency
  • Hardware encoding support (VA-API/NVENC) with software fallback (x264)
  • WebRTC transport with low-latency configuration
  • 15-25ms latency (LAN), <100ms (WAN)
  • 30-60 FPS, up to 4K resolution
  • Adaptive bitrate and damage tracking

Current State

  • Empty Rust project - No source code exists
  • Cargo.toml configured with all dependencies (tokio, pipewire, webrtc-rs, x264, etc.)
  • Design complete - Comprehensive specifications available
  • No tests or infrastructure - Starting from scratch

Work Objectives

Core Objective

Build a production-ready remote desktop backend that captures Wayland screen content and streams it to WebRTC clients with ultra-low latency (15-25ms) using zero-copy DMA-BUF architecture.

Concrete Deliverables

  • Complete Rust implementation in src/ directory
  • 5 functional modules: capture, encoder, buffer, webrtc, signaling
  • Working CLI application (src/main.rs)
  • Configuration system (config.toml)
  • Basic documentation (README, usage examples)

Definition of Done

  • All major modules compile and integrate
  • End-to-end pipeline works: capture → encode → WebRTC → client receives
  • Software encoder (x264) functional
  • Hardware encoder infrastructure ready (VA-API hooks)
  • cargo build --release succeeds
  • Basic smoke test runs without crashes
  • README with setup instructions

Must Have

  • PipeWire screen capture with DMA-BUF support
  • Video encoding (at least x264 software encoder)
  • WebRTC peer connection and media streaming
  • Signaling server (WebSocket for SDP/ICE exchange)
  • Zero-copy buffer management
  • Error handling and logging

Must NOT Have (Guardrails)

  • Audio capture: Out of scope (design only mentions video)
  • Multi-user sessions: Single session only
  • Authentication/Security: Basic implementation only (no complex auth)
  • Hardware encoding full implementation: Infrastructure only, placeholders for VA-API/NVENC
  • Browser client: Backend only, assume existing WebRTC client
  • Persistent storage: No database or file storage
  • Advanced features: Full damage tracking and adaptive bitrate (deferred to v2; Task 2 ships only a basic DamageTracker)

Verification Strategy

Test Decision

  • Infrastructure exists: NO
  • User wants tests: YES (automated verification)
  • Framework: criterion (benchmarks) + simple integration tests

Automated Verification Approach

For Each Module:

  1. Unit tests for core types and error handling
  2. Integration tests for data flow between modules
  3. Benchmarks for performance validation (latency < target)

No Browser Testing Required: Use mock WebRTC behavior or simple echo test for verification.

If TDD Enabled

Each TODO follows RED-GREEN-REFACTOR:

Test Setup Task (first task):

  • Install test dependencies
  • Create basic test infrastructure
  • Example test to verify setup

Module Tasks:

  • RED: Write failing test for feature
  • GREEN: Implement minimum code to pass
  • REFACTOR: Clean up while passing tests

Execution Strategy

Parallel Execution Waves

Wave 1 (Start Immediately):
├── Task 1: Project structure and types
└── Task 5: Configuration system

Wave 2 (After Wave 1):
├── Task 3: Buffer management
├── Task 2: Capture module (PipeWire; starts once Task 3's pool types land)
└── Task 6: Basic WebRTC echo server

Wave 3 (After Wave 2):
├── Task 4: Encoder module (x264)
└── Task 7: Signaling server

Wave 4 (After Wave 3):
├── Task 8: Integration (capture → encode → WebRTC)
├── Task 9: CLI and main entry point (starts after Task 8)
└── Task 10: Documentation and examples (starts after Task 8)

Critical Path: 1 → 3 → 2 → 4 → 8 → 9
Parallel Speedup: ~35% faster than sequential

Dependency Matrix

| Task | Depends On | Blocks  | Can Parallelize With |
|------|------------|---------|----------------------|
| 1    | None       | 2, 3, 6 | None (foundational)  |
| 5    | None       | 8, 9    | 1                    |
| 3    | 1          | 2       | 6                    |
| 2    | 1, 3       | 4       | 6                    |
| 6    | 1          | 7, 8    | 2, 3                 |
| 4    | 2          | 8       | 7                    |
| 7    | 1, 6       | 8       | 4                    |
| 8    | 4, 7       | 9, 10   | None                 |
| 9    | 8          | None    | 10                   |
| 10   | 8          | None    | 9                    |

TODOs

  • 1. Project Structure and Core Types

    What to do:

    • Create module structure: src/capture/, src/encoder/, src/buffer/, src/webrtc/, src/signaling/
    • Define core types: CapturedFrame, EncodedFrame, DmaBufHandle, PixelFormat, ScreenRegion
    • Define error types: CaptureError, EncoderError, WebRtcError, SignalingError
    • Create src/lib.rs with module exports
    • Create src/error.rs for centralized error handling
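A minimal sketch of what these core types might look like. Field names beyond those listed above (stride, offset, modifier, timestamp_us) are illustrative assumptions; the design docs are the authoritative source. The plan calls for thiserror, shown here with a manual Display impl to stay dependency-free:

```rust
use std::os::unix::io::RawFd;

/// Pixel formats the capture pipeline is expected to handle (illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PixelFormat {
    Bgra8888,
    Rgba8888,
    Nv12,
}

/// Reference to a kernel DMA-BUF; ownership/RAII semantics are Task 3's job.
#[derive(Debug)]
pub struct DmaBufHandle {
    pub fd: RawFd,
    pub stride: u32,
    pub offset: u32,
    pub modifier: u64,
}

/// A captured frame as handed from the capture module to the encoder.
#[derive(Debug)]
pub struct CapturedFrame {
    pub dma_buf: DmaBufHandle,
    pub width: u32,
    pub height: u32,
    pub format: PixelFormat,
    pub timestamp_us: u64,
}

/// Capture-side errors; the real version would use thiserror's derive.
#[derive(Debug)]
pub enum CaptureError {
    StreamDisconnected,
    UnsupportedFormat(PixelFormat),
}

impl std::fmt::Display for CaptureError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::StreamDisconnected => write!(f, "PipeWire stream disconnected"),
            Self::UnsupportedFormat(p) => write!(f, "unsupported pixel format: {p:?}"),
        }
    }
}

impl std::error::Error for CaptureError {}

fn main() {
    let frame = CapturedFrame {
        dma_buf: DmaBufHandle { fd: 3, stride: 7680, offset: 0, modifier: 0 },
        width: 1920,
        height: 1080,
        format: PixelFormat::Bgra8888,
        timestamp_us: 0,
    };
    println!("{}x{} {:?}", frame.width, frame.height, frame.format);
    println!("{}", CaptureError::UnsupportedFormat(PixelFormat::Nv12));
}
```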

    Must NOT do:

    • Implement any actual capture/encoding logic (types only)
    • Add test infrastructure (Task 5)
    • Implement configuration parsing

    Recommended Agent Profile:

    • Category: quick (simple type definitions)
    • Skills: [] (no specialized skills needed)
    • Skills Evaluated but Omitted: All other skills not needed for type definitions

    Parallelization:

    • Can Run In Parallel: NO (foundational)
    • Parallel Group: None
    • Blocks: Tasks 2, 3, 6 (Task 5 runs in parallel)
    • Blocked By: None

    References:

    Pattern References (existing code to follow):

    • DESIGN_CN.md:82-109 - Core type definitions
    • DETAILED_DESIGN_CN.md:548-596 - PipeWire data structures
    • DETAILED_DESIGN_CN.md:970-1018 - Encoder data structures

    API/Type References (contracts to implement against):

    • pipewire crate documentation for pw::Core, pw::Stream, pw::buffer::Buffer
    • webrtc crate for RTCPeerConnection, RTCVideoTrack
    • async-trait for VideoEncoder trait definition

    Test References:

    • thiserror crate for error derive patterns
    • Standard Rust project layout conventions

    Documentation References:

    • DESIGN_CN.md:46-209 - Component breakdown and data structures
    • Cargo.toml - Dependency versions to use

    External References:

    WHY Each Reference Matters:

    • Design docs provide exact type definitions - use them verbatim
    • External examples show idiomatic usage patterns for complex crates

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    cargo check
    # Assert: Exit code 0, no warnings
    
    cargo clippy -- -D warnings
    # Assert: Exit code 0
    
    cargo doc --no-deps --document-private-items
    # Assert: Docs generated successfully
    

    Evidence to Capture:

    • Module structure verified: ls -la src/
    • Type compilation output from cargo check
    • Generated documentation files

    Commit: NO (group with Task 5)


  • 2. Capture Module (PipeWire Integration)

    What to do:

    • Implement src/capture/mod.rs with PipeWire client
    • Create PipewireCore struct: manage PipeWire main loop and context
    • Create PipewireStream struct: handle video stream and buffer dequeue
    • Implement frame extraction: Extract DMA-BUF FD, size, stride from buffer
    • Create async channel: Send CapturedFrame to encoder pipeline
    • Implement DamageTracker (basic version): Track changed screen regions
    • Handle PipeWire events: param_changed, process callbacks
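The async channel hand-off deserves care for latency: a deep queue between capture and encoder turns into buffered delay. The plan uses async_channel; this stdlib sketch (std::sync::mpsc, type names trimmed) shows the freshest-frame-wins policy a bounded channel with try_send gives you:

```rust
use std::sync::mpsc;

// Trimmed stand-in for the plan's CapturedFrame.
#[derive(Debug)]
struct CapturedFrame {
    seq: u64,
}

fn main() {
    // Depth-1 bounded channel: at most one frame is ever in flight.
    let (tx, rx) = mpsc::sync_channel::<CapturedFrame>(1);

    // try_send never blocks the capture callback; a full queue means the
    // encoder is behind, so the stale frame is simply dropped.
    assert!(tx.try_send(CapturedFrame { seq: 0 }).is_ok());
    assert!(tx.try_send(CapturedFrame { seq: 1 }).is_err()); // queue full: dropped

    let frame = rx.recv().unwrap();
    println!("encoder receives frame {}", frame.seq);
}
```

With async_channel the shape is the same: `bounded(1)` plus `try_send` on the PipeWire process callback, `recv().await` on the encoder task.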

    Must NOT do:

    • Implement xdg-desktop-portal integration (defer to v2)
    • Implement hardware-specific optimizations
    • Add complex damage tracking algorithms (use simple block comparison)

    Recommended Agent Profile:

    • Category: unspecified-high (complex async FFI integration)
    • Skills: []
    • Skills Evaluated but Omitted: Not applicable

    Parallelization:

    • Can Run In Parallel: YES (with Task 6)
    • Parallel Group: Wave 2 (with Task 6; Task 3 must finish first)
    • Blocks: Task 4
    • Blocked By: Tasks 1, 3

    References:

    Pattern References:

    • DESIGN_CN.md:367-516 - Complete capture module implementation
    • DETAILED_DESIGN_CN.md:542-724 - PipeWire client and stream handling
    • DETAILED_DESIGN_CN.md:727-959 - Damage tracker implementation

    API/Type References:

    • pipewire crate: pw::MainLoop, pw::Context, pw::Core, pw::stream::Stream
    • pipewire::properties! macro for stream properties
    • pipewire::spa::param::format::Format for video format
    • async_channel::Sender/Receiver for async frame passing

    Test References:

    • PipeWire examples in pipewire-rs repository
    • DMA-BUF handling patterns in other screen capture projects

    Documentation References:

    • DESIGN_CN.md:70-110 - Capture manager responsibilities
    • DESIGN_CN.md:213-244 - Data flow from Wayland to capture

    External References:

    WHY Each Reference Matters:

    • PipeWire FFI is complex - follow proven patterns from examples
    • DMA-BUF handling requires precise memory management - reference docs for safety

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    cargo check
    # Assert: No compilation errors
    
    # Create simple capture test
    cargo test capture::tests::test_stream_creation
    # Assert: Test passes (mock PipeWire or skip if no Wayland session)
    
    # Verify module compiles
    cargo build --release --lib
    # Assert: capture module compiles into the release library
    

    Evidence to Capture:

    • Module compilation output
    • Test execution results
    • Binary size after compilation

    Commit: NO (group with Task 3)


  • 3. Buffer Management Module

    What to do:

    • Implement src/buffer/mod.rs with zero-copy buffer pools
    • Create DmaBufPool: Manage DMA-BUF file descriptors with reuse
    • Create EncodedBufferPool: Manage Bytes for encoded frames
    • Implement FrameBufferPool: Unified interface for both pool types
    • Use RAII pattern: Drop trait for automatic cleanup
    • Implement DmaBufHandle: Safe wrapper around raw file descriptor
    • Add memory tracking: Track buffer lifetimes and prevent leaks
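A stdlib-only sketch of the VecDeque-based pool the plan prescribes (method names acquire/release are assumptions; a production version would return buffers to the pool automatically via Drop and an Arc back-reference):

```rust
use std::collections::VecDeque;

/// Simple reuse pool for encoded-frame buffers.
pub struct EncodedBufferPool {
    free: VecDeque<Vec<u8>>,
    buf_size: usize,
}

impl EncodedBufferPool {
    pub fn new(count: usize, buf_size: usize) -> Self {
        let free = (0..count).map(|_| vec![0u8; buf_size]).collect();
        Self { free, buf_size }
    }

    /// Reuse a free buffer if available; otherwise allocate (pool grows on demand).
    pub fn acquire(&mut self) -> Vec<u8> {
        self.free.pop_front().unwrap_or_else(|| vec![0u8; self.buf_size])
    }

    /// Return a buffer for reuse; the next writer overwrites its contents.
    pub fn release(&mut self, buf: Vec<u8>) {
        self.free.push_back(buf);
    }

    pub fn free_count(&self) -> usize {
        self.free.len()
    }
}

fn main() {
    let mut pool = EncodedBufferPool::new(2, 4096);
    let a = pool.acquire();
    let b = pool.acquire();
    assert_eq!(pool.free_count(), 0);
    let c = pool.acquire(); // pool exhausted: fresh allocation
    pool.release(a);
    pool.release(b);
    pool.release(c);
    println!("pool holds {} free buffers", pool.free_count());
}
```

For DmaBufHandle, note that std::os::fd::OwnedFd already provides close-on-drop semantics, so the RAII requirement may need little or no unsafe code.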

    Must NOT do:

    • Implement GPU memory pools (defer to hardware encoding)
    • Add complex memory allocation strategies (use simple VecDeque pools)
    • Implement shared memory (defer to v2)

    Recommended Agent Profile:

    • Category: unspecified-high (unsafe FFI, memory management)
    • Skills: []
    • Skills Evaluated but Omitted: Not applicable

    Parallelization:

    • Can Run In Parallel: YES (with Task 6)
    • Parallel Group: Wave 2 (with Task 6)
    • Blocks: Task 2
    • Blocked By: Task 1

    References:

    Pattern References:

    • DESIGN_CN.md:518-617 - Frame buffer pool implementation
    • DETAILED_DESIGN_CN.md:287-299 - Buffer module design
    • DESIGN_CN.md:1066-1144 - Buffer sharing mechanisms

    API/Type References:

    • std::collections::VecDeque for buffer pools
    • std::os::unix::io::RawFd for file descriptors
    • bytes::Bytes for reference-counted buffers
    • std::mem::ManuallyDrop for custom Drop logic

    Test References:

    • Rust unsafe patterns for FFI
    • RAII examples in Rust ecosystem

    Documentation References:

    • DESIGN_CN.md:182-209 - Buffer manager responsibilities
    • DESIGN_CN.md:1009-1064 - Zero-copy pipeline stages

    External References:

    WHY Each Reference Matters:

    • Unsafe FFI requires precise patterns - RAII prevents resource leaks
    • Reference design shows proven zero-copy architecture

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    cargo test buffer::tests::test_dma_buf_pool
    # Assert: Pool allocates and reuses buffers correctly
    
    cargo test buffer::tests::test_encoded_buffer_pool
    # Assert: Bytes pool works with reference counting
    
    cargo test buffer::tests::test_memory_tracking
    # Assert: Memory tracker detects leaks (if implemented)
    

    Evidence to Capture:

    • Test execution results
    • Memory usage check (valgrind or similar if available)
    • Pool performance metrics

    Commit: YES

    • Message: feat(buffer): implement zero-copy buffer management
    • Files: src/buffer/mod.rs, src/lib.rs
    • Pre-commit: cargo test --lib

  • 4. Encoder Module (Software - x264)

    What to do:

    • Implement src/encoder/mod.rs with encoder trait
    • Define VideoEncoder trait with encode(), reconfigure(), request_keyframe()
    • Create X264Encoder struct: Wrap x264 software encoder
    • Implement encoder initialization: Set low-latency parameters (ultrafast preset, zerolatency tune)
    • Implement frame encoding: Convert DMA-BUF to YUV, encode to H.264
    • Use zero-copy: Map DMA-BUF once, encode from mapped memory
    • Output encoded data: Wrap in Bytes for zero-copy to WebRTC
    • Implement bitrate control: Basic CBR or VBR
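A simplified, synchronous sketch of the VideoEncoder trait plus a null implementation useful for pipeline tests before x264 lands. The plan's real trait is async (via async-trait) and returns Bytes rather than Vec<u8>; the fake NAL header bytes below are purely illustrative:

```rust
/// Synchronous stand-in for the plan's async VideoEncoder trait.
pub trait VideoEncoder {
    fn encode(&mut self, frame: &[u8], force_keyframe: bool) -> Result<Vec<u8>, String>;
    fn reconfigure(&mut self, bitrate_kbps: u32) -> Result<(), String>;
    fn request_keyframe(&mut self);
}

/// "Encodes" by prefixing a fake NAL-like header; lets integration tests run
/// the full pipeline without a real codec.
pub struct NullEncoder {
    pub keyframe_pending: bool,
    pub bitrate_kbps: u32,
}

impl VideoEncoder for NullEncoder {
    fn encode(&mut self, frame: &[u8], force_keyframe: bool) -> Result<Vec<u8>, String> {
        // Consume any pending keyframe request.
        let key = force_keyframe || std::mem::take(&mut self.keyframe_pending);
        let mut out = vec![0, 0, 0, 1, if key { 0x65 } else { 0x41 }];
        out.extend_from_slice(&frame[..frame.len().min(8)]); // pretend-compress
        Ok(out)
    }

    fn reconfigure(&mut self, bitrate_kbps: u32) -> Result<(), String> {
        self.bitrate_kbps = bitrate_kbps;
        Ok(())
    }

    fn request_keyframe(&mut self) {
        self.keyframe_pending = true;
    }
}

fn main() {
    let mut enc = NullEncoder { keyframe_pending: false, bitrate_kbps: 8000 };
    enc.request_keyframe();
    let out = enc.encode(&[0u8; 16], false).unwrap();
    println!("first frame marker: {:#04x}", out[4]); // keyframe honored
    let out = enc.encode(&[0u8; 16], false).unwrap();
    println!("second frame marker: {:#04x}", out[4]); // delta frame
}
```

X264Encoder then implements the same trait with x264::Encoder under the hood (ultrafast preset, zerolatency tune).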

    Must NOT do:

    • Implement VA-API or NVENC encoders (defer to v2, just add trait infrastructure)
    • Implement adaptive bitrate control (use fixed bitrate)
    • Implement damage-aware encoding (encode full frames)

    Recommended Agent Profile:

    • Category: unspecified-high (video encoding, low-latency optimization)
    • Skills: []
    • Skills Evaluated but Omitted: Not applicable

    Parallelization:

    • Can Run In Parallel: YES (with Task 7)
    • Parallel Group: Wave 3 (with Task 7)
    • Blocks: Task 8
    • Blocked By: Task 2

    References:

    Pattern References:

    • DESIGN_CN.md:620-783 - Complete encoder module implementation
    • DESIGN_CN.md:1249-1453 - Low-latency encoder configuration
    • DETAILED_DESIGN_CN.md:963-1184 - Video encoder trait and implementations

    API/Type References:

    • x264 crate: x264::Encoder, x264::Params, x264::Picture
    • async-trait for #[async_trait] VideoEncoder
    • bytes::Bytes for zero-copy output
    • async_trait::async_trait macro

    Test References:

    Documentation References:

    • DESIGN_CN.md:112-148 - Encoder pipeline responsibilities
    • DESIGN_CN.md:248-332 - Technology stack and encoder options
    • DESIGN_CN.md:1376-1411 - x264 low-latency parameters

    External References:

    WHY Each Reference Matters:

    • Low-latency encoding requires precise parameter tuning - use documented presets
    • x264 API is complex - examples show correct usage

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    cargo test encoder::tests::test_x264_init
    # Assert: Encoder initializes with correct parameters
    
    cargo test encoder::tests::test_encode_frame
    # Assert: Frame encodes successfully, output is valid H.264
    
    # Verify encoding performance
    cargo test encoder::tests::benchmark_encode --release
    # Assert: Encoding latency < 20ms for a 1080p frame
    

    Evidence to Capture:

    • Test execution results
    • Encoding latency measurements
    • Output bitstream validation (using ffprobe if available)

    Commit: YES

    • Message: feat(encoder): implement x264 software encoder
    • Files: src/encoder/mod.rs, src/lib.rs
    • Pre-commit: cargo test encoder

  • 5. Configuration System and Test Infrastructure

    What to do:

    • Create config.toml template: Capture settings, encoder config, WebRTC config
    • Implement src/config.rs: Parse TOML with serde
    • Define config structs: CaptureConfig, EncoderConfig, WebRtcConfig
    • Add validation: Check reasonable value ranges, provide defaults
    • Create CLI argument parsing: Use clap for command-line overrides
    • Set up test infrastructure: Add test dependencies to Cargo.toml
    • Create integration test template: tests/integration_test.rs
    • Set up benchmarking: Add criterion for latency measurements
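A possible shape for the config.toml template. Every section and key name below is an illustrative assumption; the authoritative schema is whatever src/config.rs defines against the design docs:

```toml
# config.toml (illustrative template)

[capture]
fps = 60                 # target frame rate (design targets 30-60 FPS)
output = "auto"          # which Wayland output to capture

[encoder]
codec = "h264"
backend = "x264"         # "vaapi" / "nvenc" reserved for v2
bitrate_kbps = 8000
preset = "ultrafast"     # low-latency x264 preset
tune = "zerolatency"

[webrtc]
stun_servers = ["stun:stun.l.google.com:19302"]
signaling_port = 8443
```

CLI flags (clap) then override individual fields after the file is parsed with serde + toml.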

    Must NOT do:

    • Implement hot reload of config
    • Add complex validation rules (basic range checks only)
    • Implement configuration file watching

    Recommended Agent Profile:

    • Category: quick (simple config parsing, boilerplate)
    • Skills: []
    • Skills Evaluated but Omitted: Not applicable

    Parallelization:

    • Can Run In Parallel: YES (with Task 1)
    • Parallel Group: Wave 1 (with Task 1)
    • Blocks: Tasks 8, 9
    • Blocked By: None

    References:

    Pattern References:

    • DESIGN_CN.md:90-95 - Capture config structure
    • DESIGN_CN.md:124-130 - Encoder config structure
    • DESIGN_CN.md:169-180 - WebRTC config structure

    API/Type References:

    • serde derive macros: #[derive(Serialize, Deserialize)]
    • toml crate: from_str() for parsing
    • clap crate: Parser trait for CLI

    Test References:

    Documentation References:

    • DESIGN_CN.md:248-259 - Dependencies including config tools
    • Configuration file best practices

    External References:

    WHY Each Reference Matters:

    • Config structure defined in designs - implement exactly
    • Standard Rust patterns for config parsing

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    cargo test config::tests::test_parse_valid_config
    # Assert: Config file parses correctly
    
    cargo test config::tests::test_cli_overrides
    # Assert: CLI args override config file
    
    cargo test --all-targets
    # Assert: All tests pass (including integration template)
    
    cargo bench --no-run
    # Assert: Benchmarks compile successfully
    

    Evidence to Capture:

    • Config parsing test results
    • Test suite execution output
    • Benchmark compilation success

    Commit: YES (grouped with Task 1)

    • Message: feat: add project structure, types, and config system
    • Files: src/lib.rs, src/error.rs, src/config.rs, config.toml, Cargo.toml, tests/integration_test.rs, benches/
    • Pre-commit: cargo test --all

  • 6. WebRTC Transport Module

    What to do:

    • Implement src/webrtc/mod.rs with WebRTC peer connection management
    • Create WebRtcServer struct: Manage RTCPeerConnection instances
    • Create PeerConnection wrapper: Encapsulate webrtc crate types
    • Implement video track: TrackLocalStaticSample for encoded frames
    • Implement SDP handling: create_offer(), set_remote_description(), create_answer()
    • Implement ICE handling: ICE candidate callbacks, STUN/TURN support
    • Configure low-latency: Minimize playout delay, disable FEC
    • Implement data channels: For input events (mouse/keyboard)

    Must NOT do:

    • Implement custom WebRTC stack (use webrtc-rs as-is)
    • Implement TURN server (configure external servers)
    • Implement complex ICE strategies (use default)

    Recommended Agent Profile:

    • Category: unspecified-high (WebRTC protocol, async networking)
    • Skills: []
    • Skills Evaluated but Omitted: Not applicable

    Parallelization:

    • Can Run In Parallel: YES (with Tasks 2, 3)
    • Parallel Group: Wave 2 (with Tasks 2, 3)
    • Blocks: Task 7, 8
    • Blocked By: Task 1

    References:

    Pattern References:

    • DESIGN_CN.md:786-951 - Complete WebRTC module implementation
    • DESIGN_CN.md:1573-1738 - Low-latency WebRTC configuration
    • DETAILED_DESIGN_CN.md:270-286 - WebRTC transport module design

    API/Type References:

    • webrtc crate: RTCPeerConnection, RTCVideoTrack, RTCDataChannel
    • webrtc::api::APIBuilder for API initialization
    • webrtc::peer_connection::sdp for SDP handling
    • webrtc::media::Sample for video samples

    Test References:

    Documentation References:

    • DESIGN_CN.md:150-181 - WebRTC transport responsibilities
    • DESIGN_CN.md:348-360 - WebRTC library options
    • DESIGN_CN.md:1577-1653 - Low-latency WebRTC configuration

    External References:

    WHY Each Reference Matters:

    • WebRTC is complex protocol - use proven library and follow examples
    • Low-latency config requires precise parameter tuning

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    cargo test webrtc::tests::test_peer_connection_creation
    # Assert: Peer connection initializes with correct config
    
    cargo test webrtc::tests::test_sdp_exchange
    # Assert: Offer/Answer exchange works correctly
    
    cargo test webrtc::tests::test_video_track
    # Assert: Video track accepts and queues samples
    

    Evidence to Capture:

    • Test execution results
    • SDP output (captured in test logs)
    • ICE candidate logs

    Commit: YES

    • Message: feat(webrtc): implement WebRTC transport with low-latency config
    • Files: src/webrtc/mod.rs, src/lib.rs
    • Pre-commit: cargo test webrtc

  • 7. Signaling Server

    What to do:

    • Implement src/signaling/mod.rs with WebSocket signaling
    • Create SignalingServer struct: Manage WebSocket connections
    • Implement session management: Map session IDs to peer connections
    • Implement SDP exchange: send_offer(), receive_answer()
    • Implement ICE candidate relay: send_ice_candidate(), receive_ice_candidate()
    • Handle client connections: Accept connections and track sessions (no auth, per guardrails)
    • Use async IO: tokio-tungstenite for WebSocket support on the tokio runtime
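The signaling protocol can be as small as three newline-delimited JSON message types over the WebSocket. Field names here are illustrative, not mandated by the design docs:

```json
{ "type": "offer",     "session": "s1", "sdp": "v=0..." }
{ "type": "answer",    "session": "s1", "sdp": "v=0..." }
{ "type": "candidate", "session": "s1",
  "candidate": "candidate:0 1 UDP 2122252543 192.0.2.1 54321 typ host",
  "sdpMid": "0", "sdpMLineIndex": 0 }
```

A serde-tagged enum (`#[serde(tag = "type")]`) maps these onto Rust types with no hand-written parsing.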

    Must NOT do:

    • Implement authentication/authorization (allow all connections)
    • Implement persistent storage (in-memory sessions only)
    • Implement NAT traversal beyond ICE (no STUN/TURN server hosting)

    Recommended Agent Profile:

    • Category: unspecified-low (simple WebSocket server)
    • Skills: []
    • Skills Evaluated but Omitted: Not applicable

    Parallelization:

    • Can Run In Parallel: YES (with Task 4)
    • Parallel Group: Wave 3 (with Task 4)
    • Blocks: Task 8
    • Blocked By: Tasks 1, 6

    References:

    Pattern References:

    • DESIGN_CN.md:954-1007 - IPC/signaling implementation example
    • DETAILED_DESIGN_CN.md:301-314 - Signaling module design
    • WebSocket echo server examples

    API/Type References:

    • tokio::net::TcpListener for TCP listening
    • tokio_tungstenite crate: WebSocketStream, accept_async()
    • serde_json for message serialization
    • async_channel or tokio::sync for coordination

    Test References:

    • WebSocket examples in tokio ecosystem
    • Signaling server patterns in WebRTC tutorials

    Documentation References:

    • DESIGN_CN.md:27-34 - Signaling server in architecture
    • Session management best practices

    External References:

    WHY Each Reference Matters:

    • WebSocket signaling is standard WebRTC pattern - follow proven implementation
    • Session management required for multi-client support

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    cargo test signaling::tests::test_websocket_connection
    # Assert: Client can connect and disconnect
    
    cargo test signaling::tests::test_sdp_exchange
    # Assert: SDP offer/answer relay works
    
    cargo test signaling::tests::test_ice_candidate_relay
    # Assert: ICE candidates forwarded correctly
    

    Evidence to Capture:

    • Test execution results
    • WebSocket message logs
    • Session tracking verification

    Commit: YES

    • Message: feat(signaling): implement WebSocket signaling server
    • Files: src/signaling/mod.rs, src/lib.rs
    • Pre-commit: cargo test signaling

  • 8. End-to-End Integration

    What to do:

    • Implement src/main.rs with application entry point
    • Create pipeline orchestration: Capture → Buffer → Encoder → WebRTC
    • Integrate all modules: Connect channels and data flow
    • Implement graceful shutdown: Handle Ctrl+C, clean up resources
    • Add metrics collection: Track latency, frame rate, bitrate
    • Implement error recovery: Restart failed modules, log errors
    • Test with mock WebRTC client: Verify end-to-end flow
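The shutdown flow can be sketched with a shared flag. The real implementation would use tokio::select! with tokio::signal::ctrl_c() rather than threads; this stdlib-only sketch only illustrates the ownership pattern (signal sets a flag, pipeline loop observes it, resources drain before exit):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

fn main() {
    let running = Arc::new(AtomicBool::new(true));

    let pipeline = {
        let running = Arc::clone(&running);
        thread::spawn(move || {
            let mut frames = 0u64;
            while running.load(Ordering::Relaxed) {
                // capture -> encode -> send would happen here
                frames += 1;
                thread::sleep(Duration::from_millis(1));
            }
            // drain channels / release buffers here before exiting
            frames
        })
    };

    thread::sleep(Duration::from_millis(20)); // pretend Ctrl+C arrives now
    running.store(false, Ordering::Relaxed);  // the signal handler would set this
    let frames = pipeline.join().expect("pipeline thread panicked");
    println!("clean shutdown after {frames} frames");
}
```

The important property carries over to tokio: shutdown is a state change the pipeline observes, never an abort, so in-flight DMA-BUF handles are returned to their pools.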

    Must NOT do:

    • Implement production deployment (local testing only)
    • Add monitoring/alerting beyond logging
    • Implement auto-scaling or load balancing

    Recommended Agent Profile:

    • Category: unspecified-high (complex orchestration, async coordination)
    • Skills: []
    • Skills Evaluated but Omitted: Not applicable

    Parallelization:

    • Can Run In Parallel: NO
    • Parallel Group: Wave 4
    • Blocks: Tasks 9, 10
    • Blocked By: Tasks 4, 7

    References:

    Pattern References:

    • DESIGN_CN.md:1044-1064 - Memory ownership transfer through pipeline
    • DESIGN_CN.md:211-244 - Complete data flow
    • DETAILED_DESIGN_CN.md:417-533 - Frame processing sequence

    API/Type References:

    • tokio runtime: tokio::runtime::Runtime, tokio::select!
    • async_channel for inter-module communication
    • tracing for structured logging

    Test References:

    • Integration test patterns
    • Graceful shutdown examples in async Rust

    Documentation References:

    • DESIGN_CN.md:1009-1044 - Zero-copy pipeline stages
    • Error handling patterns in async Rust

    External References:

    WHY Each Reference Matters:

    • End-to-end integration requires precise async coordination
    • Zero-copy pipeline depends on correct ownership transfer

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    cargo build --release
    # Assert: Binary builds successfully
    
    # Run with test config
    timeout 30 cargo run --release -- --config config.toml
    # Assert: Application starts, no crashes, logs show pipeline active
    
    # Verify metrics collection
    cargo test integration::tests::test_end_to_end_flow
    # Assert: Frame flows through complete pipeline, metrics collected
    

    Evidence to Capture:

    • Application startup logs
    • Pipeline flow verification logs
    • Metrics output (latency, frame rate, bitrate)
    • Graceful shutdown logs

    Commit: YES

    • Message: feat: implement end-to-end pipeline integration
    • Files: src/main.rs, src/lib.rs
    • Pre-commit: cargo test integration

  • 9. CLI and User Interface

    What to do:

    • Complete src/main.rs CLI implementation
    • Implement subcommands: start, stop, status, config
    • Add useful flags: --verbose, --log-level, --port
    • Implement signal handling: Handle SIGINT, SIGTERM for graceful shutdown
    • Add configuration validation: Warn on invalid settings at startup
    • Implement status command: Show running sessions, metrics
    • Create man page or help text: Document all options

    Must NOT do:

    • Implement TUI or GUI (CLI only)
    • Add interactive configuration prompts
    • Implement daemon mode (run in foreground)

    Recommended Agent Profile:

    • Category: quick (CLI boilerplate, argument parsing)
    • Skills: []
    • Skills Evaluated but Omitted: Not applicable

    Parallelization:

    • Can Run In Parallel: YES (with Task 10)
    • Parallel Group: Wave 4
    • Blocks: None
    • Blocked By: Task 8

    References:

    Pattern References:

    • CLI examples in Cargo.toml (bin section)
    • clap crate examples and documentation
    • Signal handling in async Rust

    API/Type References:

    • clap crate: Parser, Subcommand derives
    • tokio::signal for signal handling
    • tracing for log levels

    Test References:

    • clap documentation for all argument types
    • Signal handling patterns

    Documentation References:

    • CLI best practices: https://clig.dev/
    • DESIGN_CN.md - Configuration options to expose

    External References:

    WHY Each Reference Matters:

    • Good CLI design requires following established patterns
    • Signal handling critical for graceful shutdown

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    cargo run --release -- --help
    # Assert: Help text shows all subcommands and flags
    
    cargo run --release -- start --config config.toml
    # Assert: Application starts with correct config
    
    cargo run --release -- status
    # Assert: Status command prints session info (or "no sessions")
    
    # Test signal handling
    cargo run --release -- start &
    PID=$!
    sleep 1
    kill -INT $PID
    wait $PID
    # Assert: Exit code 0 (graceful shutdown)
    

    Evidence to Capture:

    • Help output
    • Status command output
    • Signal handling test results
    • Error handling for invalid flags

    Commit: YES

    • Message: feat(cli): implement complete CLI with subcommands
    • Files: src/main.rs
    • Pre-commit: cargo clippy

  • 10. Documentation and Examples

    What to do:

    • Create README.md: Project overview, features, installation, usage
    • Document configuration: Explain all config options in config.toml.template
    • Add example usage: Show how to start server, connect client
    • Document architecture: Explain module design and data flow
    • Add troubleshooting section: Common issues and solutions
    • Create examples/ directory: Simple client examples if needed
    • Document dependencies: List system-level dependencies (PipeWire, Wayland)
    • Add performance notes: Expected latency, resource usage

    Must NOT do:

    • Write extensive API documentation (use Rustdoc comments instead)
    • Create video tutorials or complex guides
    • Write marketing content (keep technical)

    Recommended Agent Profile:

    • Category: writing (documentation creation)
    • Skills: []
    • Skills Evaluated but Omitted: Not applicable

    Parallelization:

    • Can Run In Parallel: YES (with Task 9)
    • Parallel Group: Wave 4
    • Blocks: None
    • Blocked By: Task 8

    References:

    Pattern References:

    • Rust project README conventions
    • Existing design documents (DETAILED_DESIGN_CN.md, etc.)
    • Configuration file comments

    API/Type References:

    • Rustdoc: /// and //! documentation comments

    Test References:

    • README examples in popular Rust projects
    • Documentation best practices

    Documentation References:

    • DESIGN_CN.md - Use architecture diagrams for overview
    • Cargo.toml - Extract dependency requirements
    • Design docs for feature descriptions

    External References:

    WHY Each Reference Matters:

    • Good documentation critical for open-source adoption
    • README first thing users see

    Acceptance Criteria:

    Automated Verification:

    # Agent runs:
    ls -la README.md config.toml.template
    # Assert: Files exist and are non-empty
    
    grep -q "Installation" README.md
    grep -q "Usage" README.md
    grep -q "Architecture" README.md
    # Assert: Key sections present
    
    head -20 config.toml.template
    # Assert: Template has comments explaining each option
    
    # Verify all public items have docs
    cargo doc --no-deps
    ls target/doc/wl_webrtc/
    # Assert: Documentation generated successfully
    

    Evidence to Capture:

    • README content preview
    • Config template preview
    • Generated documentation listing

    Commit: YES

    • Message: docs: add README, config template, and documentation
    • Files: README.md, config.toml.template, examples/
    • Pre-commit: None

Commit Strategy

| After Task | Message | Files | Verification |
|------------|---------|-------|--------------|
| 1, 5 | feat: add project structure, types, and config system | src/, config.toml, Cargo.toml, tests/, benches/ | cargo test --all |
| 3 | feat(buffer): implement zero-copy buffer management | src/buffer/mod.rs, src/lib.rs | cargo test --lib |
| 2 | feat(capture): implement PipeWire screen capture | src/capture/mod.rs, src/lib.rs | cargo test capture |
| 4 | feat(encoder): implement x264 software encoder | src/encoder/mod.rs, src/lib.rs | cargo test encoder |
| 6 | feat(webrtc): implement WebRTC transport with low-latency config | src/webrtc/mod.rs, src/lib.rs | cargo test webrtc |
| 7 | feat(signaling): implement WebSocket signaling server | src/signaling/mod.rs, src/lib.rs | cargo test signaling |
| 8 | feat: implement end-to-end pipeline integration | src/main.rs, src/lib.rs | cargo test integration |
| 9 | feat(cli): implement complete CLI with subcommands | src/main.rs | cargo clippy |
| 10 | docs: add README, config template, and documentation | README.md, config.toml.template, examples/ | None |

Success Criteria

Verification Commands

# Build and test everything
cargo build --release && cargo test --all

# Run basic smoke test
timeout 30 cargo run --release -- start --config config.toml

# Check documentation
cargo doc --no-deps

# Verify CLI
cargo run --release -- --help
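One gotcha with the smoke test above: `timeout` exits with status 124 when the deadline fires, which is the expected outcome for a server that keeps running. A sketch of a helper that treats 124 as success (the `sleep` stand-in below is illustrative; substitute the real `cargo run --release -- start --config config.toml` invocation):

```shell
# Run a long-lived command under timeout; exit 124 (deadline hit) counts as a pass,
# since the server is *supposed* to still be running when the timer fires.
smoke() {
  timeout "$1" sh -c "$2"
  status=$?
  [ "$status" -eq 124 ] || [ "$status" -eq 0 ]
}

# Stand-in for the real server invocation: killed after 1 s -> status 124 -> pass.
if smoke 1 'sleep 5'; then echo "smoke test passed"; fi
```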

Final Checklist

  • All 10 tasks completed
  • cargo build --release succeeds
  • cargo test --all passes all tests
  • End-to-end pipeline verified (capture → encode → send)
  • CLI fully functional with all subcommands
  • README complete with installation/usage instructions
  • Code compiles without warnings (cargo clippy)
  • Documentation generated successfully
  • Config template provided with comments

Performance Validation (Optional)

  • Encoding latency < 20ms for 1080p (measured via benchmarks)
  • Capture latency < 5ms (measured via logging)
  • Memory usage < 500MB (measured via ps or similar)

Appendix

Notes on Scope Boundaries

IN SCOPE (This Implementation):

  • Complete Rust backend implementation
  • PipeWire screen capture with DMA-BUF
  • x264 software encoder (production-ready)
  • WebRTC transport with webrtc-rs
  • WebSocket signaling server
  • Basic configuration and CLI
  • Zero-copy buffer management
  • Basic logging and error handling

OUT OF SCOPE (Future Work):

  • Hardware encoder implementation (VA-API, NVENC)
  • Advanced features: damage tracking, adaptive bitrate, partial region encoding
  • Authentication/authorization
  • Audio capture and streaming
  • Multi-user session management
  • Production deployment (Docker, systemd, etc.)
  • Browser client implementation
  • Comprehensive testing suite (unit + integration + e2e)
  • Monitoring/metrics beyond basic logging

Assumptions Made

  1. Wayland environment available: Implementation assumes Linux with PipeWire and Wayland compositor
  2. x264 library installed: System-level x264 library required (via pkg-config)
  3. Single-session focus: Only one capture session at a time for simplicity
  4. Local network: Low-latency targets assume LAN environment
  5. Existing WebRTC client: Backend only, no browser client implementation needed
  6. No authentication: Allow all WebSocket connections for MVP
  7. Single-threaded encoding: No parallel encoder pipelines for MVP

Risks and Mitigations

Risk | Impact | Mitigation
PipeWire FFI complexity | High | Use proven patterns from pipewire-rs examples
DMA-BUF safety | High | Strict RAII, unsafe blocks well-documented, extensive testing
WebRTC integration complexity | Medium | Use webrtc-rs as-is, avoid custom implementation
Performance targets unmet | Medium | Benchmarking in Task 4, iterative tuning
Missing dependencies | Low | Clear documentation of system requirements
Testing challenges (requires Wayland) | Medium | Use mock objects where possible, optional tests
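The "strict RAII" mitigation for DMA-BUF safety can be sketched as an owning wrapper; `DmaBufHandle` and its fields are illustrative names, not the plan's actual API:

```rust
use std::os::fd::{FromRawFd, OwnedFd, RawFd};

/// Illustrative RAII wrapper for a DMA-BUF file descriptor handed out by
/// PipeWire. Ownership moves into `OwnedFd`, so the fd is closed exactly
/// once, automatically, when the handle is dropped -- no leaks, no double close.
struct DmaBufHandle {
    fd: OwnedFd,
    width: u32,
    height: u32,
    stride: u32,
}

impl DmaBufHandle {
    /// # Safety
    /// `raw_fd` must be a valid, open fd whose ownership is transferred to
    /// this handle; nothing else may close it afterwards.
    unsafe fn from_raw(raw_fd: RawFd, width: u32, height: u32, stride: u32) -> Self {
        Self { fd: unsafe { OwnedFd::from_raw_fd(raw_fd) }, width, height, stride }
    }

    /// Size in bytes of one frame as laid out in the buffer.
    fn frame_bytes(&self) -> usize {
        self.stride as usize * self.height as usize
    }
}
```

Confining the single `unsafe` construction point to one well-documented function keeps the rest of the pipeline in safe code, which is the pattern the mitigation calls for.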

Alternatives Considered

WebRTC Library:

  • Chosen: webrtc (webrtc-rs) - Pure Rust, active development
  • Alternative: datachannel - Rust bindings to the C++ libdatachannel library, less mature than webrtc-rs

Async Runtime:

  • Chosen: tokio - Industry standard, excellent ecosystem
  • Alternative: async-std - smaller ecosystem; the project has since been discontinued

Software Encoder:

  • Chosen: x264 - Ubiquitous, mature, good quality
  • Alternative: openh264 - Cisco's implementation, slightly lower quality

Dependencies Rationale

Core:

  • tokio: Async runtime, chosen for ecosystem and performance
  • pipewire: Required for screen capture
  • webrtc: WebRTC implementation, chosen for zero-copy support
  • x264: Software encoder fallback, ubiquitous support

Supporting:

  • bytes: Zero-copy buffers, critical for performance
  • async-channel: Async channels, simpler than tokio channels
  • tracing: Structured logging, modern and flexible
  • serde/toml: Configuration parsing, standard ecosystem
  • clap: CLI parsing, excellent help generation
  • anyhow/thiserror: Error handling, idiomatic Rust
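The rationale above corresponds roughly to a dependency section like the following (crate versions are illustrative; the real pins live in the existing Cargo.toml):

```toml
[dependencies]
tokio = { version = "1", features = ["full"] }   # async runtime
pipewire = "0.8"                                 # screen capture
webrtc = "0.11"                                  # WebRTC transport
bytes = "1"                                      # zero-copy buffers
async-channel = "2"                              # async channels
tracing = "0.1"                                  # structured logging
serde = { version = "1", features = ["derive"] } # config (de)serialization
toml = "0.8"                                     # config parsing
clap = { version = "4", features = ["derive"] }  # CLI parsing
anyhow = "1"                                     # application errors
thiserror = "2"                                  # library error types
```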

Known Limitations

  1. Linux-only: Wayland/PipeWire specific to Linux
  2. Requires Wayland session: Cannot run in headless or X11 environments
  3. Hardware encoding deferred: Only x264 in v1
  4. No audio: Video-only in v1
  5. Basic signaling: No authentication, persistence, or advanced features
  6. Single session: Only one capture session at a time
  7. Local testing: No cloud deployment guidance
  8. Minimal testing: Basic integration tests, no comprehensive test suite

Testing Environment Requirements

To run tests and smoke tests, you need:

  • Linux distribution with Wayland
  • PipeWire installed and running
  • x264 development libraries (libx264-dev on Ubuntu/Debian)
  • Rust toolchain (stable)
  • Optional: Wayland compositor for full integration testing
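These requirements can be checked with a small preflight script; the pkg-config names below (libpipewire-0.3, x264) are the typical identifiers and may differ per distribution:

```shell
#!/bin/sh
# Preflight check: report (without failing) which test prerequisites are present.
missing=""

for tool in cargo pkg-config; do
  command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
done

for lib in libpipewire-0.3 x264; do
  pkg-config --exists "$lib" 2>/dev/null || missing="$missing $lib"
done

if [ -n "$WAYLAND_DISPLAY" ]; then
  echo "Wayland session detected: full integration tests available"
else
  echo "no Wayland session: integration tests will be skipped"
fi

echo "missing prerequisites:${missing:- none}"
```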

Performance Baseline

Expected performance (based on design docs):

  • Capture latency: 1-2ms (DMA-BUF from PipeWire)
  • Encoding latency: 15-25ms (x264 ultrafast)
  • WebRTC overhead: 2-3ms (RTP packetization)
  • Total pipeline: 18-30ms (excluding network)
  • CPU usage: 20-40% (software encoding, 1080p@30fps)
  • Memory usage: 200-400MB

These targets may vary based on hardware and network conditions.
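As a sanity check, the worst-case stage figures above sum to exactly the 30 ms upper bound of the pipeline target; the constants below mirror the numbers in this section:

```rust
// Worst-case per-stage latency budget in milliseconds, taken from the
// baseline above (the best cases, 1 + 15 + 2, give the 18 ms lower bound).
const CAPTURE_MS: f64 = 2.0; // DMA-BUF handoff from PipeWire
const ENCODE_MS: f64 = 25.0; // x264 ultrafast, 1080p
const RTP_MS: f64 = 3.0;     // WebRTC RTP packetization

fn pipeline_budget_ms() -> f64 {
    CAPTURE_MS + ENCODE_MS + RTP_MS
}

fn main() {
    let total = pipeline_budget_ms();
    // Stays within the 18-30 ms total pipeline target (network excluded).
    assert!(total <= 30.0, "over budget: {total} ms");
    println!("worst-case pipeline latency: {total} ms");
}
```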