FastLED 3.9.15
FASTLED_FORCE_INLINE FL_IRAM FL_OPTIMIZE_FUNCTION void fl::detail::wave8_transpose_16_bf1(const u8 lanes[16], u8 W0, u8 W1, u8 output[16 * sizeof(Wave8Byte)])
BF1: chipset-aware direct encode for Wave8 16-lane (#2548 deep-dive).
Bypasses the byte_lut entirely by exploiting the algebraic identity:
output_bit(s, p, lane) = M0_p XOR (input_bit_(7-s)_of_lane AND D_p)
where W0 and W1 are the bit-0 and bit-1 waveform patterns (chipset constants), M0_p = (W0 >> (7-p)) & 1, M1_p = (W1 >> (7-p)) & 1, and D_p = M0_p XOR M1_p. An input bit of 0 therefore yields M0_p and an input bit of 1 yields M1_p.
The bit-transpose of the input lane bytes (giving the per-symbol column bytes) is computed ONCE by spread_transpose16_symbol, replacing the prior 8 calls (one per symbol) plus the 16 byte_lut expansions.
Bit-identical to wave8_transpose_16(expand(lanes_input), output). Works for ANY Wave8 chipset/timing — W0/W1 are derived from the byte_lut at runtime.
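The identity can be checked with a small standalone sketch. This is not the FastLED implementation: the names encode_reference, encode_direct and verify_identity are hypothetical, and the output layout (one 16-bit word per symbol and pulse, with bit L carrying lane L) is an assumption made for illustration only. The first function selects a whole waveform per input bit, as a LUT-style expansion would; the second transposes once and then applies M0_p XOR (column AND D_p) to all 16 lanes at a time.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed layout: index = symbol * 8 + pulse, bit L of each word = lane L.
using Words = std::array<std::uint16_t, 64>;

// Reference path: choose W0 or W1 per input bit, then scatter its pulses.
inline Words encode_reference(const std::uint8_t lanes[16],
                              std::uint8_t W0, std::uint8_t W1) {
    Words out{};
    for (int lane = 0; lane < 16; ++lane) {
        for (int s = 0; s < 8; ++s) {
            const bool bit = (lanes[lane] >> (7 - s)) & 1;
            const std::uint8_t wave = bit ? W1 : W0;
            for (int p = 0; p < 8; ++p) {
                if ((wave >> (7 - p)) & 1) {
                    out[s * 8 + p] |= std::uint16_t(1u << lane);
                }
            }
        }
    }
    return out;
}

// Direct path: one bit-transpose, then M0_p XOR (column AND D_p) per word.
inline Words encode_direct(const std::uint8_t lanes[16],
                           std::uint8_t W0, std::uint8_t W1) {
    // column[s] gathers bit (7 - s) of every lane into one 16-bit word.
    std::uint16_t column[8] = {};
    for (int s = 0; s < 8; ++s) {
        for (int lane = 0; lane < 16; ++lane) {
            column[s] |= std::uint16_t(((lanes[lane] >> (7 - s)) & 1u) << lane);
        }
    }
    const std::uint8_t D = W0 ^ W1;  // D_p = M0_p XOR M1_p, for all p at once
    Words out{};
    for (int s = 0; s < 8; ++s) {
        for (int p = 0; p < 8; ++p) {
            // Broadcast the single-bit constants across all 16 lanes.
            const std::uint16_t m0 = ((W0 >> (7 - p)) & 1) ? 0xFFFF : 0x0000;
            const std::uint16_t d  = ((D  >> (7 - p)) & 1) ? 0xFFFF : 0x0000;
            out[s * 8 + p] = std::uint16_t(m0 ^ (column[s] & d));
        }
    }
    return out;
}

// Asserts that both paths agree for one set of lanes and waveforms.
inline void verify_identity(const std::uint8_t lanes[16],
                            std::uint8_t W0, std::uint8_t W1) {
    assert(encode_reference(lanes, W0, W1) == encode_direct(lanes, W0, W1));
}
```

For example, with the hypothetical waveforms W0 = 0xC0 and W1 = 0xFC, lane 3 set to 0x80 and all other lanes zero, pulse 2 of symbol 0 is high only on lane 3, because the two waveforms differ at that pulse and only lane 3 carries a 1 bit in that symbol.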
Measured 6 822 → 1 757 µs/frame (3.88× faster than pipe4, 5.49× vs baseline) when fused with pipe4 (#2548 final).
Definition at line 315 of file wave8.hpp.
References spread_transpose16_symbol(), fl::W0, and fl::W1.
Referenced by fl::wave8Transpose_16_bf1().