Wayland → WebRTC Remote Desktop Implementation Plan
TL;DR
Quick Summary: Implement a high-performance Rust backend that captures Wayland screens via PipeWire DMA-BUF, encodes to H.264 (hardware/software), and streams to WebRTC clients with 15-25ms latency target.
Deliverables:
- Complete Rust backend (5,000-8,000 LOC)
- 5 major modules: capture, encoder, buffer management, WebRTC transport, signaling
- Configuration system and CLI
- Basic documentation and examples
Estimated Effort: Large (4-6 weeks full-time)
Parallel Execution: YES - 4 waves
Critical Path: Project setup → Capture → Encoder → WebRTC integration → End-to-end
Context
Original Request
User wants to implement a Wayland to WebRTC remote desktop backend based on three comprehensive design documents (DETAILED_DESIGN_CN.md, DESIGN_CN.md, DESIGN.md).
Design Documents Analysis
Three detailed design documents provided:
- DETAILED_DESIGN_CN.md: 14,000+ lines covering architecture, components, data structures, performance targets
- DESIGN_CN.md / DESIGN.md: Technical design with code examples and optimization strategies
Key Requirements from Designs:
- Zero-copy DMA-BUF pipeline for minimal latency
- Hardware encoding support (VA-API/NVENC) with software fallback (x264)
- WebRTC transport with low-latency configuration
- 15-25ms latency (LAN), <100ms (WAN)
- 30-60 FPS, up to 4K resolution
- Adaptive bitrate and damage tracking
Current State
- Empty Rust project - No source code exists
- Cargo.toml configured with all dependencies (tokio, pipewire, webrtc-rs, x264, etc.)
- Design complete - Comprehensive specifications available
- No tests or infrastructure - Starting from scratch
Work Objectives
Core Objective
Build a production-ready remote desktop backend that captures Wayland screen content and streams it to WebRTC clients with ultra-low latency (15-25ms) using zero-copy DMA-BUF architecture.
Concrete Deliverables
- Complete Rust implementation in `src/`
- 5 functional modules: capture, encoder, buffer, webrtc, signaling
- Working CLI application (`src/main.rs`)
- Configuration system (`config.toml`)
- Basic documentation (README, usage examples)
Definition of Done
- All major modules compile and integrate
- End-to-end pipeline works: capture → encode → WebRTC → client receives
- Software encoder (x264) functional
- Hardware encoder infrastructure ready (VA-API hooks)
- `cargo build --release` succeeds
- Basic smoke test runs without crashes
- README with setup instructions
Must Have
- PipeWire screen capture with DMA-BUF support
- Video encoding (at least x264 software encoder)
- WebRTC peer connection and media streaming
- Signaling server (WebSocket for SDP/ICE exchange)
- Zero-copy buffer management
- Error handling and logging
Must NOT Have (Guardrails)
- Audio capture: Out of scope (design only mentions video)
- Multi-user sessions: Single session only
- Authentication/Security: Basic implementation only (no complex auth)
- Hardware encoding full implementation: Infrastructure only, placeholders for VA-API/NVENC
- Browser client: Backend only, assume existing WebRTC client
- Persistent storage: No database or file storage
- Advanced features: full damage tracking and adaptive bitrate (deferred to v2; Task 2 includes only a basic damage tracker)
Verification Strategy
Test Decision
- Infrastructure exists: NO
- User wants tests: YES (automated verification)
- Framework: `criterion` (benchmarks) + simple integration tests
Automated Verification Approach
For Each Module:
- Unit tests for core types and error handling
- Integration tests for data flow between modules
- Benchmarks for performance validation (latency < target)
No Browser Testing Required: Use mock WebRTC behavior or simple echo test for verification.
If TDD Enabled
Each TODO follows RED-GREEN-REFACTOR:
Test Setup Task (first task):
- Install test dependencies
- Create basic test infrastructure
- Example test to verify setup
Module Tasks:
- RED: Write failing test for feature
- GREEN: Implement minimum code to pass
- REFACTOR: Clean up while passing tests
Execution Strategy
Parallel Execution Waves
Wave 1 (Start Immediately):
├── Task 1: Project structure and types
└── Task 5: Configuration system
Wave 2 (After Wave 1):
├── Task 2: Capture module (PipeWire)
├── Task 3: Buffer management
└── Task 6: Basic WebRTC echo server
Wave 3 (After Wave 2):
├── Task 4: Encoder module (x264)
└── Task 7: Signaling server
Wave 4 (After Wave 3):
├── Task 8: Integration (capture → encode → WebRTC)
├── Task 9: CLI and main entry point
└── Task 10: Documentation and examples
Critical Path: 1 → 2 → 3 → 4 → 8 → 9 → 10
Parallel Speedup: ~35% faster than sequential
Dependency Matrix
| Task | Depends On | Blocks | Can Parallelize With |
|---|---|---|---|
| 1 | None | 2, 3, 5, 6 | None (foundational) |
| 5 | None | 8, 9 | 1 |
| 2 | 1, 3 | 4 | 6 |
| 3 | 1 | 2 | 6 |
| 6 | 1 | 7, 8 | 2, 3 |
| 4 | 2 | 8 | 7 |
| 7 | 1, 6 | 8 | 4 |
| 8 | 4, 7 | 9 | None |
| 9 | 8 | None | 10 |
| 10 | 8 | None | 9 |
TODOs
1. Project Structure and Core Types
What to do:
- Create module structure: `src/capture/`, `src/encoder/`, `src/buffer/`, `src/webrtc/`, `src/signaling/`
- Define core types: `CapturedFrame`, `EncodedFrame`, `DmaBufHandle`, `PixelFormat`, `ScreenRegion`
- Define error types: `CaptureError`, `EncoderError`, `WebRtcError`, `SignalingError`
- Create `src/lib.rs` with module exports
- Create `src/error.rs` for centralized error handling
Must NOT do:
- Implement any actual capture/encoding logic (types only)
- Add test infrastructure (deferred to Task 5)
- Implement configuration parsing
Recommended Agent Profile:
- Category: `quick` (simple type definitions)
- Skills: none (no specialized skills needed)
- Skills Evaluated but Omitted: all other skills not needed for type definitions
Parallelization:
- Can Run In Parallel: NO (foundational)
- Parallel Group: None
- Blocks: Tasks 2, 3, 5, 6
- Blocked By: None
References:
Pattern References (existing code to follow):
- `DESIGN_CN.md:82-109` - Core type definitions
- `DETAILED_DESIGN_CN.md:548-596` - PipeWire data structures
- `DETAILED_DESIGN_CN.md:970-1018` - Encoder data structures
API/Type References (contracts to implement against):
- `pipewire` crate documentation for `pw::Core`, `pw::Stream`, `pw::buffer::Buffer`
- `webrtc` crate for `RTCPeerConnection`, `RTCVideoTrack`
- `async-trait` for the `VideoEncoder` trait definition
Test References:
- `thiserror` crate for error derive patterns
- Standard Rust project layout conventions
Documentation References:
- `DESIGN_CN.md:46-209` - Component breakdown and data structures
- `Cargo.toml` - Dependency versions to use
External References:
- pipewire-rs examples: https://gitlab.freedesktop.org/pipewire/pipewire-rs
- webrtc-rs examples: https://github.com/webrtc-rs/webrtc
WHY Each Reference Matters:
- Design docs provide exact type definitions - use them verbatim
- External examples show idiomatic usage patterns for complex crates
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
cargo check
# Assert: Exit code 0, no warnings
cargo clippy -- -D warnings
# Assert: Exit code 0
cargo doc --no-deps --document-private-items
# Assert: Docs generated successfully
```
Evidence to Capture:
- Module structure verified: `ls -la src/`
- Type compilation output from `cargo check`
- Generated documentation files
Commit: NO (group with Task 5)
2. Capture Module (PipeWire Integration)
What to do:
- Implement `src/capture/mod.rs` with a PipeWire client
- Create `PipewireCore` struct: manage PipeWire main loop and context
- Create `PipewireStream` struct: handle video stream and buffer dequeue
- Implement frame extraction: extract DMA-BUF FD, size, stride from the buffer
- Create async channel: send `CapturedFrame` to the encoder pipeline
- Implement `DamageTracker` (basic version): track changed screen regions
- Handle PipeWire events: `param_changed`, `process` callbacks
Must NOT do:
- Implement xdg-desktop-portal integration (defer to v2)
- Implement hardware-specific optimizations
- Add complex damage tracking algorithms (use simple block comparison)
Recommended Agent Profile:
- Category: `unspecified-high` (complex async FFI integration)
- Skills: none
- Skills Evaluated but Omitted: not applicable
Parallelization:
- Can Run In Parallel: YES (with Tasks 3, 6)
- Parallel Group: Wave 2 (with Tasks 3, 6)
- Blocks: Task 4
- Blocked By: Tasks 1, 3
References:
Pattern References:
- `DESIGN_CN.md:367-516` - Complete capture module implementation
- `DETAILED_DESIGN_CN.md:542-724` - PipeWire client and stream handling
- `DETAILED_DESIGN_CN.md:727-959` - Damage tracker implementation
API/Type References:
- `pipewire` crate: `pw::MainLoop`, `pw::Context`, `pw::Core`, `pw::stream::Stream`
- `pipewire::properties!` macro for stream properties
- `pipewire::spa::param::format::Format` for the video format
- `async_channel::Sender`/`Receiver` for async frame passing
Test References:
- PipeWire examples in pipewire-rs repository
- DMA-BUF handling patterns in other screen capture projects
Documentation References:
- `DESIGN_CN.md:70-110` - Capture manager responsibilities
- `DESIGN_CN.md:213-244` - Data flow from Wayland to capture
External References:
- PipeWire protocol docs: https://docs.pipewire.org/
- DMA-BUF kernel docs: https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html
WHY Each Reference Matters:
- PipeWire FFI is complex - follow proven patterns from examples
- DMA-BUF handling requires precise memory management - reference docs for safety
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
cargo check
# Assert: No compilation errors
# Create simple capture test
cargo test capture::tests::test_stream_creation
# Assert: Test passes (mock PipeWire, or skip if no Wayland session)
# Verify module compiles
cargo build --release --lib
# Assert: capture module in release binary
```
Evidence to Capture:
- Module compilation output
- Test execution results
- Binary size after compilation
Commit: NO (group with Task 3)
3. Buffer Management Module
What to do:
- Implement `src/buffer/mod.rs` with zero-copy buffer pools
- Create `DmaBufPool`: manage DMA-BUF file descriptors with reuse
- Create `EncodedBufferPool`: manage `Bytes` for encoded frames
- Implement `FrameBufferPool`: unified interface for both pool types
- Use the RAII pattern: `Drop` trait for automatic cleanup
- Implement `DmaBufHandle`: safe wrapper around a raw file descriptor
- Add memory tracking: track buffer lifetimes and prevent leaks
Must NOT do:
- Implement GPU memory pools (defer to hardware encoding)
- Add complex memory allocation strategies (use simple VecDeque pools)
- Implement shared memory (defer to v2)
Recommended Agent Profile:
- Category: `unspecified-high` (unsafe FFI, memory management)
- Skills: none
- Skills Evaluated but Omitted: not applicable
Parallelization:
- Can Run In Parallel: YES (with Tasks 2, 6)
- Parallel Group: Wave 2 (with Tasks 2, 6)
- Blocks: Task 2
- Blocked By: Task 1
References:
Pattern References:
- `DESIGN_CN.md:518-617` - Frame buffer pool implementation
- `DETAILED_DESIGN_CN.md:287-299` - Buffer module design
- `DESIGN_CN.md:1066-1144` - Buffer sharing mechanisms
API/Type References:
- `std::collections::VecDeque` for buffer pools
- `std::os::unix::io::RawFd` for file descriptors
- `bytes::Bytes` for reference-counted buffers
- `std::mem::ManuallyDrop` for custom Drop logic
Test References:
- Rust unsafe patterns for FFI
- RAII examples in Rust ecosystem
Documentation References:
- `DESIGN_CN.md:182-209` - Buffer manager responsibilities
- `DESIGN_CN.md:1009-1064` - Zero-copy pipeline stages
External References:
- DMA-BUF documentation: https://www.kernel.org/doc/html/latest/driver-api/dma-buf.html
- `bytes` crate docs: https://docs.rs/bytes/
WHY Each Reference Matters:
- Unsafe FFI requires precise patterns - RAII prevents resource leaks
- Reference design shows proven zero-copy architecture
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
cargo test buffer::tests::test_dma_buf_pool
# Assert: Pool allocates and reuses buffers correctly
cargo test buffer::tests::test_encoded_buffer_pool
# Assert: Bytes pool works with reference counting
cargo test buffer::tests::test_memory_tracking
# Assert: Memory tracker detects leaks (if implemented)
```
Evidence to Capture:
- Test execution results
- Memory usage check (valgrind or similar, if available)
- Pool performance metrics
Commit: YES
- Message: `feat(buffer): implement zero-copy buffer management`
- Files: `src/buffer/mod.rs`, `src/lib.rs`
- Pre-commit: `cargo test --lib`
4. Encoder Module (Software - x264)
What to do:
- Implement `src/encoder/mod.rs` with an encoder trait
- Define a `VideoEncoder` trait with `encode()`, `reconfigure()`, `request_keyframe()`
- Create `X264Encoder` struct: wrap the x264 software encoder
- Implement encoder initialization: set low-latency parameters (ultrafast preset, zerolatency tune)
- Implement frame encoding: convert the DMA-BUF to YUV, encode to H.264
- Use zero-copy: map the DMA-BUF once, encode from mapped memory
- Output encoded data: wrap in `Bytes` for zero-copy handoff to WebRTC
- Implement bitrate control: basic CBR or VBR
Must NOT do:
- Implement VA-API or NVENC encoders (defer to v2, just add trait infrastructure)
- Implement adaptive bitrate control (use fixed bitrate)
- Implement damage-aware encoding (encode full frames)
Recommended Agent Profile:
- Category: `unspecified-high` (video encoding, low-latency optimization)
- Skills: none
- Skills Evaluated but Omitted: not applicable
Parallelization:
- Can Run In Parallel: YES (with Task 7)
- Parallel Group: Wave 3 (with Task 7)
- Blocks: Task 8
- Blocked By: Task 2
References:
Pattern References:
- `DESIGN_CN.md:620-783` - Complete encoder module implementation
- `DESIGN_CN.md:1249-1453` - Low-latency encoder configuration
- `DETAILED_DESIGN_CN.md:963-1184` - Video encoder trait and implementations
API/Type References:
- `x264` crate: `x264::Encoder`, `x264::Params`, `x264::Picture`
- `async_trait::async_trait` macro for the `#[async_trait] VideoEncoder` trait
- `bytes::Bytes` for zero-copy output
Test References:
- x264-rs examples: https://github.com/DaGenix/rust-x264
- Low-latency encoding patterns in OBS Studio code
Documentation References:
- `DESIGN_CN.md:112-148` - Encoder pipeline responsibilities
- `DESIGN_CN.md:248-332` - Technology stack and encoder options
- `DESIGN_CN.md:1376-1411` - x264 low-latency parameters
External References:
- x264 documentation: https://code.videolan.org/videolan/x264/
- H.264 codec specification
WHY Each Reference Matters:
- Low-latency encoding requires precise parameter tuning - use documented presets
- x264 API is complex - examples show correct usage
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
cargo test encoder::tests::test_x264_init
# Assert: Encoder initializes with correct parameters
cargo test encoder::tests::test_encode_frame
# Assert: Frame encodes successfully, output is valid H.264
# Verify encoding performance
cargo test encoder::tests::benchmark_encode --release
# Assert: Encoding latency < 20ms for a 1080p frame
```
Evidence to Capture:
- Test execution results
- Encoding latency measurements
- Output bitstream validation (using `ffprobe` if available)
Commit: YES
- Message: `feat(encoder): implement x264 software encoder`
- Files: `src/encoder/mod.rs`, `src/lib.rs`
- Pre-commit: `cargo test encoder`
5. Configuration System and Test Infrastructure
What to do:
- Create a `config.toml` template: capture settings, encoder config, WebRTC config
- Implement `src/config.rs`: parse TOML with `serde`
- Define config structs: `CaptureConfig`, `EncoderConfig`, `WebRtcConfig`
- Add validation: check for reasonable value ranges, provide defaults
- Create CLI argument parsing: use `clap` for command-line overrides
- Set up test infrastructure: add test dependencies to Cargo.toml
- Create an integration test template: `tests/integration_test.rs`
- Set up benchmarking: add `criterion` for latency measurements
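One possible shape for the template, grouping the three config structs into TOML sections (all section and key names here are assumptions to be kept in sync with `src/config.rs`):

```toml
# config.toml - illustrative template; keys mirror CaptureConfig,
# EncoderConfig, and WebRtcConfig (names are assumptions).

[capture]
fps = 30
output = 0          # 0 = primary output

[encoder]
codec = "h264"
preset = "ultrafast"
tune = "zerolatency"
bitrate_kbps = 6000

[webrtc]
stun_servers = ["stun:stun.l.google.com:19302"]

[signaling]
listen_addr = "127.0.0.1:8443"
```

CLI flags (Task 9) would override individual keys after the file is parsed and validated.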
Must NOT do:
- Implement hot reload of config
- Add complex validation rules (basic range checks only)
- Implement configuration file watching
Recommended Agent Profile:
- Category: `quick` (simple config parsing, boilerplate)
- Skills: none
- Skills Evaluated but Omitted: not applicable
Parallelization:
- Can Run In Parallel: YES (with Task 1)
- Parallel Group: Wave 1 (with Task 1)
- Blocks: Tasks 8, 9
- Blocked By: None
References:
Pattern References:
- `DESIGN_CN.md:90-95` - Capture config structure
- `DESIGN_CN.md:124-130` - Encoder config structure
- `DESIGN_CN.md:169-180` - WebRTC config structure
API/Type References:
- `serde` derive macros: `#[derive(Serialize, Deserialize)]`
- `toml` crate: `from_str()` for parsing
- `clap` crate: `Parser` trait for the CLI
Test References:
- `criterion` examples: https://bheisler.github.io/criterion.rs/
- Integration testing patterns in Rust
Documentation References:
- `DESIGN_CN.md:248-259` - Dependencies, including config tools
- Configuration file best practices
External References:
- TOML spec: https://toml.io/
- clap documentation: https://docs.rs/clap/
WHY Each Reference Matters:
- Config structure defined in designs - implement exactly
- Standard Rust patterns for config parsing
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
cargo test config::tests::test_parse_valid_config
# Assert: Config file parses correctly
cargo test config::tests::test_cli_overrides
# Assert: CLI args override the config file
cargo test --all-targets
# Assert: All tests pass (including the integration template)
cargo bench --no-run
# Assert: Benchmarks compile successfully
```
Evidence to Capture:
- Config parsing test results
- Test suite execution output
- Benchmark compilation success
Commit: YES (grouped with Task 1)
- Message: `feat: add project structure, types, and config system`
- Files: `src/lib.rs`, `src/error.rs`, `src/config.rs`, `config.toml`, `Cargo.toml`, `tests/integration_test.rs`, `benches/`
- Pre-commit: `cargo test --all`
6. WebRTC Transport Module
What to do:
- Implement `src/webrtc/mod.rs` with WebRTC peer connection management
- Create `WebRtcServer` struct: manage `RTCPeerConnection` instances
- Create a `PeerConnection` wrapper: encapsulate `webrtc` crate types
- Implement the video track: `TrackLocalStaticSample` for encoded frames
- Implement SDP handling: `create_offer()`, `set_remote_description()`, `create_answer()`
- Implement ICE handling: ICE candidate callbacks, STUN/TURN support
- Configure low latency: minimize playout delay, disable FEC
- Implement data channels: for input events (mouse/keyboard)
Must NOT do:
- Implement custom WebRTC stack (use webrtc-rs as-is)
- Implement TURN server (configure external servers)
- Implement complex ICE strategies (use default)
Recommended Agent Profile:
- Category: `unspecified-high` (WebRTC protocol, async networking)
- Skills: none
- Skills Evaluated but Omitted: not applicable
Parallelization:
- Can Run In Parallel: YES (with Tasks 2, 3)
- Parallel Group: Wave 2 (with Tasks 2, 3)
- Blocks: Task 7, 8
- Blocked By: Task 1
References:
Pattern References:
- `DESIGN_CN.md:786-951` - Complete WebRTC module implementation
- `DESIGN_CN.md:1573-1738` - Low-latency WebRTC configuration
- `DETAILED_DESIGN_CN.md:270-286` - WebRTC transport module design
API/Type References:
- `webrtc` crate: `RTCPeerConnection`, `RTCVideoTrack`, `RTCDataChannel`
- `webrtc::api::APIBuilder` for API initialization
- `webrtc::peer_connection::sdp` for SDP handling
- `webrtc::media::Sample` for video samples
Test References:
- webrtc-rs examples: https://github.com/webrtc-rs/webrtc/tree/main/examples
- WebRTC protocol specs: https://www.w3.org/TR/webrtc/
Documentation References:
- `DESIGN_CN.md:150-181` - WebRTC transport responsibilities
- `DESIGN_CN.md:348-360` - WebRTC library options
- `DESIGN_CN.md:1577-1653` - Low-latency WebRTC configuration
External References:
- WebRTC MDN: https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API
- ICE specification: https://tools.ietf.org/html/rfc8445
WHY Each Reference Matters:
- WebRTC is complex protocol - use proven library and follow examples
- Low-latency config requires precise parameter tuning
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
cargo test webrtc::tests::test_peer_connection_creation
# Assert: Peer connection initializes with correct config
cargo test webrtc::tests::test_sdp_exchange
# Assert: Offer/answer exchange works correctly
cargo test webrtc::tests::test_video_track
# Assert: Video track accepts and queues samples
```
Evidence to Capture:
- Test execution results
- SDP output (captured in test logs)
- ICE candidate logs
Commit: YES
- Message: `feat(webrtc): implement WebRTC transport with low-latency config`
- Files: `src/webrtc/mod.rs`, `src/lib.rs`
- Pre-commit: `cargo test webrtc`
7. Signaling Server
What to do:
- Implement `src/signaling/mod.rs` with WebSocket signaling
- Create `SignalingServer` struct: manage WebSocket connections
- Implement session management: map session IDs to peer connections
- Implement SDP exchange: `send_offer()`, `receive_answer()`
- Implement ICE candidate relay: `send_ice_candidate()`, `receive_ice_candidate()`
- Handle client connections: accept and track sessions (no real authentication; see guardrails)
- Use async IO: `tokio-tungstenite` or tokio's WebSocket support
Must NOT do:
- Implement authentication/authorization (allow all connections)
- Implement persistent storage (in-memory sessions only)
- Implement NAT traversal beyond ICE (no STUN/TURN server hosting)
Recommended Agent Profile:
- Category: `unspecified-low` (simple WebSocket server)
- Skills: none
- Skills Evaluated but Omitted: not applicable
Parallelization:
- Can Run In Parallel: YES (with Task 4)
- Parallel Group: Wave 3 (with Task 4)
- Blocks: Task 8
- Blocked By: Tasks 1, 6
References:
Pattern References:
- `DESIGN_CN.md:954-1007` - IPC/signaling implementation example
- `DETAILED_DESIGN_CN.md:301-314` - Signaling module design
- WebSocket echo server examples
API/Type References:
- `tokio::net::TcpListener` for TCP listening
- `tokio_tungstenite` crate: `WebSocketStream`, `accept_async()`
- `serde_json` for message serialization
- `async_channel` or `tokio::sync` for coordination
Test References:
- WebSocket examples in tokio ecosystem
- Signaling server patterns in WebRTC tutorials
Documentation References:
- `DESIGN_CN.md:27-34` - Signaling server in the architecture
- Session management best practices
External References:
- WebSocket protocol: https://tools.ietf.org/html/rfc6455
- Signaling patterns: https://webrtc.org/getting-started/signaling
WHY Each Reference Matters:
- WebSocket signaling is standard WebRTC pattern - follow proven implementation
- Session management required for multi-client support
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
cargo test signaling::tests::test_websocket_connection
# Assert: Client can connect and disconnect
cargo test signaling::tests::test_sdp_exchange
# Assert: SDP offer/answer relay works
cargo test signaling::tests::test_ice_candidate_relay
# Assert: ICE candidates forwarded correctly
```
Evidence to Capture:
- Test execution results
- WebSocket message logs
- Session tracking verification
Commit: YES
- Message: `feat(signaling): implement WebSocket signaling server`
- Files: `src/signaling/mod.rs`, `src/lib.rs`
- Pre-commit: `cargo test signaling`
8. End-to-End Integration
What to do:
- Implement `src/main.rs` with the application entry point
- Create pipeline orchestration: capture → buffer → encoder → WebRTC
- Integrate all modules: connect channels and data flow
- Implement graceful shutdown: handle Ctrl+C, clean up resources
- Add metrics collection: track latency, frame rate, bitrate
- Implement error recovery: restart failed modules, log errors
- Test with a mock WebRTC client: verify the end-to-end flow
Must NOT do:
- Implement production deployment (local testing only)
- Add monitoring/alerting beyond logging
- Implement auto-scaling or load balancing
Recommended Agent Profile:
- Category: `unspecified-high` (complex orchestration, async coordination)
- Skills: none
- Skills Evaluated but Omitted: not applicable
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Wave 4
- Blocks: Task 9
- Blocked By: Tasks 4, 7
References:
Pattern References:
- `DESIGN_CN.md:1044-1064` - Memory ownership transfer through the pipeline
- `DESIGN_CN.md:211-244` - Complete data flow
- `DETAILED_DESIGN_CN.md:417-533` - Frame processing sequence
API/Type References:
- `tokio` runtime: `tokio::runtime::Runtime`, `tokio::select!`
- `async_channel` for inter-module communication
- `tracing` for structured logging
Test References:
- Integration test patterns
- Graceful shutdown examples in async Rust
Documentation References:
- `DESIGN_CN.md:1009-1044` - Zero-copy pipeline stages
- Error handling patterns in async Rust
External References:
- Tokio orchestration examples: https://tokio.rs/
- Structured logging: https://docs.rs/tracing/
WHY Each Reference Matters:
- End-to-end integration requires precise async coordination
- Zero-copy pipeline depends on correct ownership transfer
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
cargo build --release
# Assert: Binary builds successfully
# Run with test config
timeout 30 cargo run --release -- --config config.toml
# Assert: Application starts, no crashes, logs show pipeline active
# Verify metrics collection
cargo test integration::tests::test_end_to_end_flow
# Assert: Frame flows through the complete pipeline, metrics collected
```
Evidence to Capture:
- Application startup logs
- Pipeline flow verification logs
- Metrics output (latency, frame rate, bitrate)
- Graceful shutdown logs
Commit: YES
- Message: `feat: implement end-to-end pipeline integration`
- Files: `src/main.rs`, `src/lib.rs`
- Pre-commit: `cargo test integration`
9. CLI and User Interface
What to do:
- Complete the `src/main.rs` CLI implementation
- Implement subcommands: `start`, `stop`, `status`, `config`
- Add useful flags: `--verbose`, `--log-level`, `--port`
- Implement signal handling: handle SIGINT and SIGTERM for graceful shutdown
- Add configuration validation: warn on invalid settings at startup
- Implement the status command: show running sessions and metrics
- Create a man page or help text: document all options
Must NOT do:
- Implement TUI or GUI (CLI only)
- Add interactive configuration prompts
- Implement daemon mode (run in foreground)
Recommended Agent Profile:
- Category: `quick` (CLI boilerplate, argument parsing)
- Skills: none
- Skills Evaluated but Omitted: not applicable
Parallelization:
- Can Run In Parallel: YES (with Task 10)
- Parallel Group: Wave 4
- Blocks: None
- Blocked By: Task 8
References:
Pattern References:
- CLI examples in Cargo.toml (bin section)
- `clap` crate examples and documentation
- Signal handling in async Rust
API/Type References:
- `clap` crate: `Parser`, `Subcommand` derives
- `tokio::signal` for signal handling
- `tracing` for log levels
Test References:
- clap documentation for all argument types
- Signal handling patterns
Documentation References:
- CLI best practices: https://clig.dev/
- `DESIGN_CN.md` - Configuration options to expose
External References:
- clap documentation: https://docs.rs/clap/
WHY Each Reference Matters:
- Good CLI design requires following established patterns
- Signal handling critical for graceful shutdown
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
cargo run --release -- --help
# Assert: Help text shows all subcommands and flags
cargo run --release -- start --config config.toml
# Assert: Application starts with the correct config
cargo run --release -- status
# Assert: Status command prints session info (or "no sessions")
# Test signal handling
cargo run --release -- start &
PID=$!
sleep 1
kill -INT $PID
wait $PID
# Assert: Exit code 0 (graceful shutdown)
```
Evidence to Capture:
- Help output
- Status command output
- Signal handling test results
- Error handling for invalid flags
Commit: YES
- Message: `feat(cli): implement complete CLI with subcommands`
- Files: `src/main.rs`
- Pre-commit: `cargo clippy`
10. Documentation and Examples
What to do:
- Create `README.md`: project overview, features, installation, usage
- Document configuration: explain all config options in `config.toml.template`
- Add example usage: show how to start the server and connect a client
- Document the architecture: explain module design and data flow
- Add a troubleshooting section: common issues and solutions
- Create an `examples/` directory: simple client examples if needed
- Document dependencies: list system-level dependencies (PipeWire, Wayland)
- Add performance notes: expected latency, resource usage
Must NOT do:
- Write extensive API documentation (use Rustdoc comments instead)
- Create video tutorials or complex guides
- Write marketing content (keep technical)
Recommended Agent Profile:
- Category: `writing` (documentation creation)
- Skills: none
- Skills Evaluated but Omitted: not applicable
Parallelization:
- Can Run In Parallel: YES (with Task 9)
- Parallel Group: Wave 4
- Blocks: None
- Blocked By: Task 8
References:
Pattern References:
- Rust project README conventions
- Existing design documents (DETAILED_DESIGN_CN.md, etc.)
- Configuration file comments
API/Type References:
- Rustdoc: `///` and `//!` documentation comments
Test References:
- README examples in popular Rust projects
- Documentation best practices
Documentation References:
- `DESIGN_CN.md` - Use architecture diagrams for the overview
- `Cargo.toml` - Extract dependency requirements
- Design docs for feature descriptions
External References:
- README guidelines: https://www.makeareadme.com/
- Rust API guidelines: https://rust-lang.github.io/api-guidelines/
WHY Each Reference Matters:
- Good documentation critical for open-source adoption
- README first thing users see
Acceptance Criteria:
Automated Verification:
```sh
# Agent runs:
ls -la README.md config.toml.template
# Assert: Files exist and are non-empty
grep -q "Installation" README.md
grep -q "Usage" README.md
grep -q "Architecture" README.md
# Assert: Key sections present
head -20 config.toml.template
# Assert: Template has comments explaining each option
# Verify all public items have docs
cargo doc --no-deps
ls target/doc/wl_webrtc/
# Assert: Documentation generated successfully
```
Evidence to Capture:
- README content preview
- Config template preview
- Generated documentation listing
Commit: YES
- Message: `docs: add README, config template, and documentation`
- Files: `README.md`, `config.toml.template`, `examples/`
- Pre-commit: none
Commit Strategy
| After Task | Message | Files | Verification |
|---|---|---|---|
| 1, 5 | feat: add project structure, types, and config system | `src/`, `config.toml`, `Cargo.toml`, `tests/`, `benches/` | `cargo test --all` |
| 3 | feat(buffer): implement zero-copy buffer management | `src/buffer/mod.rs`, `src/lib.rs` | `cargo test --lib` |
| 2 | feat(capture): implement PipeWire screen capture | `src/capture/mod.rs`, `src/lib.rs` | `cargo test capture` |
| 4 | feat(encoder): implement x264 software encoder | `src/encoder/mod.rs`, `src/lib.rs` | `cargo test encoder` |
| 6 | feat(webrtc): implement WebRTC transport with low-latency config | `src/webrtc/mod.rs`, `src/lib.rs` | `cargo test webrtc` |
| 7 | feat(signaling): implement WebSocket signaling server | `src/signaling/mod.rs`, `src/lib.rs` | `cargo test signaling` |
| 8 | feat: implement end-to-end pipeline integration | `src/main.rs`, `src/lib.rs` | `cargo test integration` |
| 9 | feat(cli): implement complete CLI with subcommands | `src/main.rs` | `cargo clippy` |
| 10 | docs: add README, config template, and documentation | `README.md`, `config.toml.template`, `examples/` | None |
Success Criteria
Verification Commands
```sh
# Build and test everything
cargo build --release && cargo test --all
# Run basic smoke test
timeout 30 cargo run --release -- start --config config.toml
# Check documentation
cargo doc --no-deps
# Verify CLI
cargo run --release -- --help
```
Final Checklist
- All 10 tasks completed
- `cargo build --release` succeeds
- `cargo test --all` passes all tests
- End-to-end pipeline verified (capture → encode → send)
- CLI fully functional with all subcommands
- README complete with installation/usage instructions
- Code compiles without warnings (`cargo clippy`)
- Documentation generated successfully
- Config template provided with comments
Performance Validation (Optional)
- Encoding latency < 20ms for 1080p (measured via benchmarks)
- Capture latency < 5ms (measured via logging)
- Memory usage < 500MB (measured via `ps` or similar)
Appendix
Notes on Scope Boundaries
IN SCOPE (This Implementation):
- Complete Rust backend implementation
- PipeWire screen capture with DMA-BUF
- x264 software encoder (production-ready)
- WebRTC transport with webrtc-rs
- WebSocket signaling server
- Basic configuration and CLI
- Zero-copy buffer management
- Basic logging and error handling
OUT OF SCOPE (Future Work):
- Hardware encoder implementation (VA-API, NVENC)
- Advanced features: damage tracking, adaptive bitrate, partial region encoding
- Authentication/authorization
- Audio capture and streaming
- Multi-user session management
- Production deployment (Docker, systemd, etc.)
- Browser client implementation
- Comprehensive testing suite (unit + integration + e2e)
- Monitoring/metrics beyond basic logging
Assumptions Made
- Wayland environment available: Implementation assumes Linux with PipeWire and Wayland compositor
- x264 library installed: System-level x264 library required (via pkg-config)
- Single-session focus: Only one capture session at a time for simplicity
- Local network: Low-latency targets assume LAN environment
- Existing WebRTC client: Backend only, no browser client implementation needed
- No authentication: Allow all WebSocket connections for MVP
- Single-threaded encoding: No parallel encoder pipelines for MVP
Risks and Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| PipeWire FFI complexity | High | Use proven patterns from pipewire-rs examples |
| DMA-BUF safety | High | Strict RAII, unsafe blocks well-documented, extensive testing |
| WebRTC integration complexity | Medium | Use webrtc-rs as-is, avoid custom implementation |
| Performance targets unmet | Medium | Benchmarking in Task 4, iterative tuning |
| Missing dependencies | Low | Clear documentation of system requirements |
| Testing challenges (requires Wayland) | Medium | Use mock objects where possible, optional tests |
Alternatives Considered
WebRTC Library:
- Chosen: `webrtc` (webrtc-rs) - pure Rust, active development
- Alternative: `datachannel` - a less mature option
Async Runtime:
- Chosen: `tokio` - industry standard, excellent ecosystem
- Alternative: `async-std` - smaller ecosystem
Software Encoder:
- Chosen: `x264` - ubiquitous, mature, good quality
- Alternative: `openh264` - Cisco's implementation, slightly lower quality
Dependencies Rationale
Core:
- `tokio`: async runtime, chosen for ecosystem and performance
- `pipewire`: required for screen capture
- `webrtc`: WebRTC implementation, chosen for zero-copy support
- `x264`: software encoder fallback, ubiquitous support
Supporting:
- `bytes`: zero-copy buffers, critical for performance
- `async-channel`: async channels, simpler than tokio channels
- `tracing`: structured logging, modern and flexible
- `serde`/`toml`: configuration parsing, standard ecosystem
- `clap`: CLI parsing, excellent help generation
- `anyhow`/`thiserror`: error handling, idiomatic Rust
Known Limitations
- Linux-only: Wayland/PipeWire specific to Linux
- Requires Wayland session: Cannot run in headless or X11 environments
- Hardware encoding deferred: Only x264 in v1
- No audio: Video-only in v1
- Basic signaling: No authentication, persistence, or advanced features
- Single session: Only one capture session at a time
- Local testing: No cloud deployment guidance
- Minimal testing: Basic integration tests, no comprehensive test suite
Testing Environment Requirements
To run tests and smoke tests, you need:
- Linux distribution with Wayland
- PipeWire installed and running
- x264 development libraries (`libx264-dev` on Ubuntu/Debian)
- Rust toolchain (stable)
- Optional: Wayland compositor for full integration testing
Performance Baseline
Expected performance (based on design docs):
- Capture latency: 1-2ms (DMA-BUF from PipeWire)
- Encoding latency: 15-25ms (x264 ultrafast)
- WebRTC overhead: 2-3ms (RTP packetization)
- Total pipeline: 18-30ms (excluding network)
- CPU usage: 20-40% (software encoding, 1080p@30fps)
- Memory usage: 200-400MB
These targets may vary based on hardware and network conditions.