Commit Graph

9 Commits

Author SHA1 Message Date
dailz
826f544569 feat(portal): async encode pipeline - decouple capture from encoding
Split synchronous encode pipeline so sws_scale + libx264 runs on a
dedicated thread, leaving only VAAPI import + GPU scale + GPU→CPU
transfer on the main capture thread.

Problem: encode_p95 occasionally hit 74ms, blocking the entire capture
pipeline and causing capture_gap_max=356ms stutter.

Solution:
- avhw.rs: Split SwEncState into SwEncImport (main thread: VAAPI import,
  filter_graph scale, GPU→CPU transfer) and SwEncEncode (encode thread:
  sws_scale NV12→YUV420P, libx264 encode). New CpuNv12Frame struct
  carries owned pixel data across threads via crossbeam channel.
  SwEncState wraps both for backward compat (MP4/sync path untouched).
- state_portal.rs: WebRTC portal path spawns 'wl-webrtc-encode' thread
  with bounded(2) input channel (drop-newest backpressure) and separate
  timing channel. Graceful shutdown: drop webrtc_rx → drop input_tx →
  join encode thread → flush sync encoder.
- stats.rs: Add record_import() + record_encode_thread() for async timing.

Results: encode_p95 stable at 2.9-4.2ms (was 11-74ms), capture_fps
stable 59-60fps, cap_gap_p95 17-19ms. Remaining capture stalls traced
to PipeWire compositor frame delivery (external, not our code).
2026-06-07 16:55:28 +08:00
dailz
46367ef6b5 fix(state): add WebRTC support to wlr-screencopy backend
Fixes #1 -- --port mode with wlr-screencopy backend caused panic at
negotiate_format() because self.args.output is None and .expect() was
called unconditionally.

Changes:
- Introduce StreamingEncoder enum wrapping EncState (MP4) and
  SwEncState (WebRTC) with unified frames_rgb/encode_frame/flush API
- Add WebRTC fields to State<S> (webrtc, webrtc_tx, webrtc_rx,
  webrtc_frames_sent) matching Portal backend pattern
- State::new() returns Result<Self> for clean WebRtcState init failure
- negotiate_format() branches on webrtc_tx: WebRTC path uses
  SwEncState::new_webrtc(), MP4 path unchanged (hardware VAAPI)
- Add poll_webrtc() method to drive signaling + channel drain
- Event loop calls poll_webrtc() each iteration
- Fix pre-existing test/bench Args construction (Option<String> output,
  missing no_persist field)
2026-06-04 22:10:46 +08:00
dailz
b0ed6548a6 feat: add WebRTC streaming via str0m + portal session persistence
- Add src/webrtc.rs: HTTP signaling server + str0m Sans-IO WebRTC transport
  with H.264 Annex-B → RTP packetization and key-frame request handling
- avhw: introduce FrameOutput enum (Muxer | Channel) so SwEncState can
  output to either MP4 muxer or crossbeam channel for WebRTC
- cap_portal: support portal session restore tokens (PersistMode::ExplicitlyRevoked)
  to skip re-authorization dialog; add --no-persist flag to force fresh dialog
- args: make --output optional when --port is used for WebRTC mode
- state_portal: integrate WebRTC pipeline (encoder channel → RTP forwarding)
  with shorter GOP for WebRTC (fps/2, min 10)
- main: redirect tracing to stderr; validate --output or --port required
- Add dependencies: str0m 0.20, serde_json 1, dirs 6
2026-06-04 20:54:16 +08:00
dailz
74f4dc826d perf(portal): achieve 58-60fps PipeWire screen capture
- Force PipeWire quantum=512 via NODE_FORCE_QUANTUM (48000/512=93Hz scheduling)
- Switch to libx264 ultrafast/zerolatency with 6 threads
- Use two-phase poll_and_encode: blocking recv_timeout for first frame,
  non-blocking try_recv drain for subsequent frames
- Remove fps_limit from portal path (PW already rate-limits via quantum/KWin;
  fps_limit's min_interval was silently dropping ~10% of valid frames)
- Remove diagnostic instrumentation (TIMING/PIPEWIRE logs, timing fields,
  pw_stats counters)
- Add lightweight production stats: per-10s fps log + shutdown summary
- Prefer libx264 over libopenh264 (better quality at same speed)
2026-05-30 08:44:15 +08:00
dailz
d80b34f44f feat: GPU-downscale + software H.264 encode pipeline (WIP)
Add SwEncState in avhw.rs: GPU pipeline using scale_vaapi to downscale
4K BGRA -> 2K NV12 on AMD iGPU, then software encode with libopenh264.

- import_dma_buf_to_vaapi: av_hwframe_map based DMA-BUF import
- SwEncState: GPU filter graph (scale_vaapi) + NV12->YUV420P + libopenh264
- state_portal.rs: integrated SwEncState, auto DRM device detection
- vaapi_import_bench.rs: CPU vs GPU pipeline benchmark
- sw_encode_bench.rs: software encode benchmark

Benchmark results: GPU pipeline ~91 FPS theoretical (10.95ms/frame)
vs CPU pipeline ~33 FPS (30.21ms/frame).

Known issue: only 1 frame encoded in production recording,
diagnostic STATS logging added to debug frame flow.
2026-05-29 22:04:12 +08:00
dailz
dcf8d1affb docs: add Chinese documentation comments to core modules
Add comprehensive Chinese documentation comments to cap_portal,
main, and state_portal modules covering architecture, lifecycle,
and data flow for each component.
2026-05-25 08:56:43 +08:00
dailz
d7fbb5256c feat: add KWin/KDE Plasma screen capture via xdg-desktop-portal ScreenCast + PipeWire
Add a second capture backend for compositors without wlr-screencopy
(KWin, GNOME, etc.) using the xdg-desktop-portal ScreenCast interface
and PipeWire DMA-BUF streaming.

New files:
- src/backend_detect.rs: auto-detect wlr-screencopy vs portal backend
- src/cap_portal.rs: Portal session setup + PipeWire DMA-BUF thread
- src/state_portal.rs: StatePortal encoder pipeline (DMA-BUF → VAAPI)

Changes:
- Cargo.toml: add ashpd 0.13, tokio 1, pipewire 0.9, libspa 0.9,
  crossbeam-channel 0.5
- src/args.rs: add --backend CLI flag
- src/avhw.rs: extract create_encoder() from inline State code
- src/main.rs: route to portal or wlr-screencopy based on backend
- src/state.rs: fix params.destroy() on dup failure, cleanup
  in_flight_surface on copy fail, use create_encoder()
- tests/integration_test.rs: add --backend flag tests
2026-05-11 08:49:08 +08:00
dailz
6835f1f6cd fix(main): call fps_limit.flush() before encoder flush on shutdown
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-14 16:45:47 +08:00
dailz
6d49222de8 feat: Phase 1 MVP with audit fixes — Wayland screen capture + VAAPI encoding
Phase 1 MVP implementation of wl-webrtc: Wayland screen capture tool
with hardware-accelerated VAAPI H.264 encoding and WebTransport output.

Includes all 9 runtime bug fixes from code audit (fix-audit-issues plan):

CRITICAL:
- C2: h264_metadata BSF with repeat_sps/repeat_pps in encode pipeline
- C4: FpsLimit wired as timing gate in on_copy_complete

HIGH:
- C3+A2: DRM device discovery via dmabuf feedback MainDevice event,
  unified resolve_drm_path() helper (CLI > compositor > auto > fallback)
- H2: Separate physical_size (mm) from mode_size (pixels) in wl_output
- H1+A3: Multi-output warning + named-output-not-found error

MEDIUM:
- M5: tv_sec u32->u64 to avoid Y2106 timestamp truncation
- M4: Guard against SHM Buffer event (DMA-BUF only)

Key components:
- src/avhw.rs: FFmpeg VAAPI encoder + filter graph + BSF pipeline
- src/state.rs: Wayland event loop + output negotiation + screencopy
- src/cap_wlr_screencopy.rs: wlr-screencopy capture source
- src/fps_limit.rs: Frame rate limiting with configurable target
- src/transform.rs: Frame format conversion utilities
2026-04-05 23:35:00 +08:00