FastLED 3.9.15

◆ wave8Transpose_16x4_bf1_pipe4()

FL_OPTIMIZE_FUNCTION FL_IRAM void fl::wave8Transpose_16x4_bf1_pipe4 ( const u8(&) lanes_a[16],
const u8(&) lanes_b[16],
const u8(&) lanes_c[16],
const u8(&) lanes_d[16],
const Wave8ByteExpansionLut & lut,
u8(&) output_a[16 *sizeof(Wave8Byte)],
u8(&) output_b[16 *sizeof(Wave8Byte)],
u8(&) output_c[16 *sizeof(Wave8Byte)],
u8(&) output_d[16 *sizeof(Wave8Byte)] )

BF1 + pipe4: 4-position-pipelined direct encode (#2548 deep-dive).

Combines BF1's algorithmic reduction with pipe4's cross-position instruction-level parallelism (ILP). It was the fastest of all prototypes measured: 9651 → 1757 µs/frame, a 5.49× speedup, on P4 v1.3 with 16 lanes × 256 LEDs. That is 4.4× faster than the 7680 µs WS2812B 16-lane TX target, so encoding takes roughly 23% of the frame transmit time and leaves ample headroom for ISR-driven chunked streaming.
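The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is not part of FastLED. It assumes the usual WS2812B timing of 1.25 µs per bit (800 kHz) and 24 bits per LED, and all names in it are hypothetical. Since the lanes are clocked out in parallel, the lane count does not enter the transmit time.

```cpp
// Assumed WS2812B timing: 800 kHz data rate, i.e. 1.25 us per bit, 24 bits per LED.
constexpr double kBitTimeUs   = 1.25;
constexpr int    kBitsPerLed  = 24;
constexpr int    kLedsPerLane = 256;

// Encode times quoted in the description above (microseconds per frame).
constexpr double kBaselineEncodeUs = 9651.0;
constexpr double kPipe4EncodeUs    = 1757.0;

// Time to clock out one frame on a lane; parallel lanes share this time.
constexpr double frameTxUs() {
    return kLedsPerLane * kBitsPerLed * kBitTimeUs;
}

// Speedup of the pipelined encoder over the baseline encoder.
constexpr double encodeSpeedup() {
    return kBaselineEncodeUs / kPipe4EncodeUs;
}

// How many times the encode fits into one frame transmit time.
constexpr double txHeadroom() {
    return frameTxUs() / kPipe4EncodeUs;
}
```

Under these assumptions `frameTxUs()` evaluates to 7680, matching the TX target. `encodeSpeedup()` is about 5.49 and `txHeadroom()` about 4.37, which the description rounds to 4.4×.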

Definition at line 198 of file wave8.cpp.hpp.

{
    const u8 W0 = lut.lut[0x00].symbols[0].data;
    const u8 W1 = lut.lut[0xFF].symbols[0].data;
    detail::wave8_transpose_16x4_bf1_pipe4(lanes_a, lanes_b, lanes_c, lanes_d,
                                           W0, W1,
                                           output_a, output_b, output_c, output_d);
}
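The wrapper reads its two pulse bytes from LUT entries 0x00 and 0xFF. Every bit of 0x00 is 0 and every bit of 0xFF is 1, so `symbols[0]` of those entries presumably holds the zero-bit and one-bit waveform patterns. The sketch below illustrates that idea with hypothetical stand-in types. They mirror only the member names visible in the listing (`lut`, `symbols`, `data`). The real `Wave8ByteExpansionLut` layout, the MSB-first bit order, and the example pattern values are assumptions.

```cpp
#include <array>
#include <cstdint>

using u8 = std::uint8_t;

// Hypothetical stand-ins that mirror only the member names used in the
// listing (lut[...].symbols[...].data); the real FastLED types may differ.
struct SketchSymbol { u8 data; };
struct SketchByte   { std::array<SketchSymbol, 8> symbols; };
struct SketchLut    { std::array<SketchByte, 256> lut; };

// Build a table in which each bit of the input byte (MSB first, an
// assumption) expands to one of two pulse patterns.
inline SketchLut buildSketchLut(u8 zeroPattern, u8 onePattern) {
    SketchLut table{};
    for (int value = 0; value < 256; ++value) {
        for (int bit = 0; bit < 8; ++bit) {
            const bool isSet = ((value >> (7 - bit)) & 1) != 0;
            table.lut[value].symbols[bit].data = isSet ? onePattern : zeroPattern;
        }
    }
    return table;
}

struct PulsePair { u8 w0; u8 w1; };

// Same lookups as the wrapper: entry 0x00 yields the zero-bit pattern and
// entry 0xFF yields the one-bit pattern.
inline PulsePair extractPulses(const SketchLut& table) {
    return { table.lut[0x00].symbols[0].data, table.lut[0xFF].symbols[0].data };
}
```

With such a table, `extractPulses(buildSketchLut(0xC0, 0xFC))` returns the two patterns it was built from. Passing only two bytes to the inner routine, rather than the whole table, is consistent with the "direct encode" wording in the brief description, though the source does not spell out the motivation.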

References FL_RESTRICT_PARAM, W0, W1, and fl::detail::wave8_transpose_16x4_bf1_pipe4().
