Split synchronous encode pipeline so sws_scale + libx264 runs on a dedicated thread, leaving only VAAPI import + GPU scale + GPU→CPU transfer on the main capture thread. Problem: encode_p95 occasionally hit 74ms, blocking the entire capture pipeline and causing capture_gap_max=356ms stutter. Solution: - avhw.rs: Split SwEncState into SwEncImport (main thread: VAAPI import, filter_graph scale, GPU→CPU transfer) and SwEncEncode (encode thread: sws_scale NV12→YUV420P, libx264 encode). New CpuNv12Frame struct carries owned pixel data across threads via crossbeam channel. SwEncState wraps both for backward compat (MP4/sync path untouched). - state_portal.rs: WebRTC portal path spawns 'wl-webrtc-encode' thread with bounded(2) input channel (drop-newest backpressure) and separate timing channel. Graceful shutdown: drop webrtc_rx → drop input_tx → join encode thread → flush sync encoder. - stats.rs: Add record_import() + record_encode_thread() for async timing. Results: encode_p95 stable at 2.9-4.2ms (was 11-74ms), capture_fps stable 59-60fps, cap_gap_p95 17-19ms. Remaining capture stalls traced to PipeWire compositor frame delivery (external, not our code).
66 KiB
66 KiB