feat(portal): async encode pipeline - decouple capture from encoding

Split the synchronous encode pipeline so that sws_scale + libx264 run
on a dedicated thread, leaving only VAAPI import, GPU scale, and
GPU→CPU transfer on the main capture thread.

Problem: encode_p95 occasionally hit 74ms, blocking the entire capture
pipeline and causing capture_gap_max=356ms stutter.

Solution:
- avhw.rs: Split SwEncState into SwEncImport (main thread: VAAPI import,
  filter_graph scale, GPU→CPU transfer) and SwEncEncode (encode thread:
  sws_scale NV12→YUV420P, libx264 encode). New CpuNv12Frame struct
  carries owned pixel data across threads via crossbeam channel.
  SwEncState wraps both for backward compat (MP4/sync path untouched).
- state_portal.rs: WebRTC portal path spawns 'wl-webrtc-encode' thread
  with bounded(2) input channel (drop-newest backpressure) and separate
  timing channel. Graceful shutdown: drop webrtc_rx → drop input_tx →
  join encode thread → flush sync encoder.
- stats.rs: Add record_import() + record_encode_thread() for async timing.
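
The hand-off described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: it uses `std::sync::mpsc::sync_channel` in place of the crossbeam channel so that it stays self-contained, the fields of `CpuNv12Frame` are assumed, and `run_pipeline` is a hypothetical helper. It shows only the bounded(2) queue, the drop-newest behaviour when the queue is full, and the drop-sender-then-join shutdown order.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;

/// Hypothetical stand-in for CpuNv12Frame: owned NV12 pixel data that can
/// be moved across threads. Field names are assumptions for illustration.
struct CpuNv12Frame {
    width: u32,
    height: u32,
    data: Vec<u8>, // Y plane followed by the interleaved UV plane
    seq: u64,
}

impl CpuNv12Frame {
    fn new(width: u32, height: u32, seq: u64) -> Self {
        // NV12 uses 12 bits per pixel: width * height * 3 / 2 bytes.
        let len = (width * height * 3 / 2) as usize;
        Self { width, height, data: vec![0; len], seq }
    }
}

/// Pushes `frames` frames through a bounded(2) queue to an encode thread.
/// Returns (frames handed to the encoder, frames dropped because the queue was full).
fn run_pipeline(frames: u64) -> (u64, u64) {
    // Bounded queue with capacity 2 (std stand-in for a crossbeam bounded channel).
    let (input_tx, input_rx) = sync_channel::<CpuNv12Frame>(2);

    let encoder = thread::Builder::new()
        .name("wl-webrtc-encode".into())
        .spawn(move || {
            let mut encoded = 0u64;
            // The loop ends once the sender has been dropped and the queue is drained.
            for frame in input_rx {
                // Placeholder for sws_scale NV12→YUV420P + libx264 encode.
                debug_assert_eq!(
                    frame.data.len(),
                    (frame.width * frame.height * 3 / 2) as usize
                );
                let _ = frame.seq;
                encoded += 1;
            }
            encoded
        })
        .expect("failed to spawn encode thread");

    let mut dropped = 0u64;
    for seq in 0..frames {
        // Never block the capture thread: if the queue is full, the frame
        // that was just produced is discarded (drop-newest).
        match input_tx.try_send(CpuNv12Frame::new(64, 48, seq)) {
            Ok(()) => {}
            Err(TrySendError::Full(_)) => dropped += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }

    // Shutdown order: drop the sender first, then join the encode thread.
    drop(input_tx);
    let encoded = encoder.join().expect("encode thread panicked");
    (encoded, dropped)
}
```

Calling `run_pipeline(100)` returns the number of frames that reached the encode thread and the number dropped; the split between the two depends on scheduling, but they always sum to the number of frames produced.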

Results: encode_p95 stable at 2.9-4.2ms (was 11-74ms), capture_fps
stable 59-60fps, cap_gap_p95 17-19ms. Remaining capture stalls traced
to PipeWire compositor frame delivery (external, not our code).
dailz
2026-06-07 16:55:28 +08:00
parent aae030f309
commit 826f544569
8 changed files with 561 additions and 236 deletions


@@ -14,7 +14,7 @@ signal-hook = "0.3"
signal-hook-mio = { version = "0.2", features = ["support-v1_0"] }
clap = { version = "4", features = ["derive"] }
tracing = "0.1"
-tracing-subscriber = "0.3"
+tracing-subscriber = { version = "0.3", features = ["env-filter"] }
anyhow = "1"
drm = "0.12"
drm-fourcc = "2"