Detailed Description

Shared u32 "spread LUT" bit-matrix transpose primitive (no SIMD, no u64).

Benchmarked (#2533) on the ESP32-P4 (RV32, in-order): about 2× faster than the unrolled-naive transpose, with bit-exact output (16-lane: 6649→3353 µs, 1.98×; 8-lane: 3356→1764 µs, 1.90×).

For each output word it computes acc = OR over 8 lanes of (spread(laneByte) << lane), where spread() pre-positions a byte's 8 pulse bits into 8 separate bytes (two u32 words of 4 bytes each; see spreadA and spreadB). Shifting by the lane index then drops each lane's bit into bit position "lane" of every pulse byte.

All ops are native u32 and independent (no dependency chain, no emulated u64), which is exactly what the in-order core schedules best, and the tiny table is cache-resident. Used by both wave8 (8 symbols) and wave3 (3 symbols); the per-symbol op is an 8-bit transpose, identical for both.
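The accumulation described above can be checked against a bit-by-bit transpose. The following is only a minimal sketch, not the code in bit_spread_lut.hpp: the names spreadHigh, accumulateHigh, naiveHigh and spreadMatchesNaive are hypothetical, the spread is computed arithmetically instead of via the table, and "byte j" is assumed to mean bits 8j..8j+7 of the u32. It covers only the high-nibble half (pulses 7..4).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: place bits 7,6,5,4 of v at bit 0 of bytes 0,1,2,3.
inline std::uint32_t spreadHigh(std::uint8_t v) {
    return ((v >> 7) & 1u) | (((v >> 6) & 1u) << 8) |
           (((v >> 5) & 1u) << 16) | (((v >> 4) & 1u) << 24);
}

// acc = OR over 8 lanes of (spread(laneByte) << lane).
// Each spread byte holds a single bit at position 0, so a shift of at
// most 7 never crosses into the neighbouring byte.
inline std::uint32_t accumulateHigh(const std::uint8_t lanes[8]) {
    std::uint32_t acc = 0;
    for (int lane = 0; lane < 8; ++lane) {
        acc |= spreadHigh(lanes[lane]) << lane;
    }
    return acc;
}

// Naive reference: byte j of the result has bit L = bit (7-j) of lanes[L].
inline std::uint32_t naiveHigh(const std::uint8_t lanes[8]) {
    std::uint32_t acc = 0;
    for (int j = 0; j < 4; ++j) {
        for (int lane = 0; lane < 8; ++lane) {
            if (lanes[lane] & (1u << (7 - j))) {
                acc |= 1u << (8 * j + lane);
            }
        }
    }
    return acc;
}

// Compare both approaches on one arbitrary input pattern.
inline bool spreadMatchesNaive() {
    const std::uint8_t lanes[8] = {0x80, 0x00, 0xF0, 0x0F,
                                   0xA5, 0x5A, 0xFF, 0x01};
    assert(accumulateHigh(lanes) == naiveHigh(lanes));
    return accumulateHigh(lanes) == naiveHigh(lanes);
}
```

The low-nibble half (pulses 3..0) would follow the same pattern with the shifts reduced by 4.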

Definition in file bit_spread_lut.hpp.

#include "fl/stl/compiler_control.h"
#include "fl/stl/int.h"

Namespaces
namespace	fl
	Base definition for an LED controller.

namespace	fl::detail
	Compile-time linker keep-alive hook for a single `fl::Bus`.

Functions
FASTLED_FORCE_INLINE FL_IRAM FL_OPTIMIZE_FUNCTION void	fl::detail::spread_transpose16_symbol (const u8 l[16], u8 out[16])
	Transpose one symbol of 16 lanes (16 input bytes) into 16 output bytes: 8 pulses × 2 bytes, low byte = lanes 0-7, high byte = lanes 8-15, pulse order 7..0 (out[0] = pulse 7 low).

FASTLED_FORCE_INLINE FL_IRAM FL_OPTIMIZE_FUNCTION void	fl::detail::spread_transpose8_symbol (const u8 l[8], u8 out[8])
	Transpose one symbol of 8 lanes (8 input bytes) into 8 output bytes: 8 pulses × 1 byte (bit L = lane L), pulse order 7..0 (out[0] = pulse 7).

FASTLED_FORCE_INLINE u32	fl::detail::spreadA (u8 v)
	Pulses 7,6,5,4 of v (byte j = bit (7-j)). Depends only on the high nibble.

FASTLED_FORCE_INLINE u32	fl::detail::spreadB (u8 v)
	Pulses 3,2,1,0 of v (byte j = bit (3-j)). Depends only on the low nibble.
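The output layouts documented for the two symbol functions can be illustrated as follows. This is a hedged sketch written from the descriptions alone, not the library's implementation: every name ending in Sketch is hypothetical, the table lookup is replaced by arithmetic, and "byte j" is assumed to be bits 8j..8j+7 of the u32. The 16-lane version is expressed here as two 8-lane transposes whose bytes are interleaved, which matches the documented layout but may not be how the real function is structured.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the table: bits 3..0 of n go to bit 0 of bytes 0..3.
inline std::uint32_t spreadNibbleSketch(unsigned n) {
    return ((n >> 3) & 1u) | (((n >> 2) & 1u) << 8) |
           (((n >> 1) & 1u) << 16) | ((n & 1u) << 24);
}
inline std::uint32_t spreadASketch(std::uint8_t v) { return spreadNibbleSketch(v >> 4); }
inline std::uint32_t spreadBSketch(std::uint8_t v) { return spreadNibbleSketch(v & 0x0Fu); }

// 8 lanes in, 8 bytes out: out[0] = pulse 7 ... out[7] = pulse 0, bit L = lane L.
inline void transpose8Sketch(const std::uint8_t l[8], std::uint8_t out[8]) {
    std::uint32_t a = 0, b = 0;
    for (int lane = 0; lane < 8; ++lane) {
        a |= spreadASketch(l[lane]) << lane;  // pulses 7..4
        b |= spreadBSketch(l[lane]) << lane;  // pulses 3..0
    }
    for (int j = 0; j < 4; ++j) {
        out[j]     = static_cast<std::uint8_t>(a >> (8 * j));
        out[4 + j] = static_cast<std::uint8_t>(b >> (8 * j));
    }
}

// 16 lanes in, 16 bytes out: per pulse, low byte (lanes 0-7) then high byte (lanes 8-15).
inline void transpose16Sketch(const std::uint8_t l[16], std::uint8_t out[16]) {
    std::uint8_t lo[8], hi[8];
    transpose8Sketch(l, lo);
    transpose8Sketch(l + 8, hi);
    for (int p = 0; p < 8; ++p) {
        out[2 * p]     = lo[p];
        out[2 * p + 1] = hi[p];
    }
}

// Set one bit in lane 2 (pulse 7) and one in lane 9 (pulse 0), then check placement.
inline bool layoutCheck() {
    std::uint8_t in[16] = {0};
    in[2] = 0x80;
    in[9] = 0x01;
    std::uint8_t out[16];
    transpose16Sketch(in, out);
    // Pulse 7 low byte is out[0], bit 2; pulse 0 high byte is out[15], bit 1.
    bool ok = (out[0] == 0x04) && (out[15] == 0x02);
    for (int i = 1; i < 15; ++i) {
        ok = ok && (out[i] == 0);
    }
    assert(ok);
    return ok;
}
```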

Variables
constexpr u32	fl::detail::kSpreadNibble [16]
	kSpreadNibble[n] places the 4 bits of nibble n at bit 0 of 4 separate bytes: byte0 = bit3(n), byte1 = bit2(n), byte2 = bit1(n), byte3 = bit0(n).
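A table with this layout could be generated at compile time along the following lines. This is an illustrative sketch under stated assumptions: spreadNibble, kSpreadNibbleSketch and tableMatchesDescription are hypothetical names, std::uint32_t stands in for the library's u32, and "byte j" is taken to mean bits 8j..8j+7 of the word. The actual table in bit_spread_lut.hpp may be written out differently.

```cpp
#include <cassert>
#include <cstdint>

// byte0 = bit3(n), byte1 = bit2(n), byte2 = bit1(n), byte3 = bit0(n),
// each placed at bit 0 of its byte.
constexpr std::uint32_t spreadNibble(unsigned n) {
    return ((n >> 3) & 1u) | (((n >> 2) & 1u) << 8) |
           (((n >> 1) & 1u) << 16) | ((n & 1u) << 24);
}

constexpr std::uint32_t kSpreadNibbleSketch[16] = {
    spreadNibble(0),  spreadNibble(1),  spreadNibble(2),  spreadNibble(3),
    spreadNibble(4),  spreadNibble(5),  spreadNibble(6),  spreadNibble(7),
    spreadNibble(8),  spreadNibble(9),  spreadNibble(10), spreadNibble(11),
    spreadNibble(12), spreadNibble(13), spreadNibble(14), spreadNibble(15),
};

static_assert(kSpreadNibbleSketch[0x8] == 0x00000001u, "bit 3 lands in byte 0");
static_assert(kSpreadNibbleSketch[0x1] == 0x01000000u, "bit 0 lands in byte 3");
static_assert(kSpreadNibbleSketch[0xA] == 0x00010001u, "bits 3 and 1 set");

// Runtime cross-check: byte j of entry n must equal bit (3-j) of n.
inline bool tableMatchesDescription() {
    for (unsigned n = 0; n < 16; ++n) {
        for (unsigned j = 0; j < 4; ++j) {
            const unsigned byteJ = (kSpreadNibbleSketch[n] >> (8 * j)) & 0xFFu;
            const unsigned bit = (n >> (3 - j)) & 1u;
            assert(byteJ == bit);
            if (byteJ != bit) return false;
        }
    }
    return true;
}
```

At 16 entries of 4 bytes the table occupies 64 bytes, which is consistent with the description's note that it stays cache-resident.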
