lines_read was incorrectly set to max_lines.min(total) (the loop upper bound)
instead of the actual number of successfully read lines. Now tracks lines_read
via get_line(i).is_some() counter and uses max_lines.min(total) as the correct
loop upper bound to handle empty file edge case.
Fixes#43
Replace `static mut OLD_SIGBUS_HANDLER` with AtomicU8 + AtomicPtr to
remove data race UB when concurrent benchmarks call open() from multiple
threads.
Key changes:
- Use `Once::call_once` to guarantee single handler installation
- Publish old handler to atomics BEFORE installing new handler (closes
the handler-active-but-state-unpublished race window)
- Read atomics with Acquire in signal handler (async-signal-safe)
- Align si_addr to page boundary before mmap(MAP_FIXED)
- Add concurrent test: 8 threads open all 5 variants simultaneously