CPU Instructions & Intrinsics
Overview
Assembly
- winasm: The x86 Assembly community and official home of WinAsm Studio and HiEditor
- 0xAX/asm: Learning assembly for linux-x64
SIMD
Intel MMX & SSE
- SSE (Streaming SIMD Extensions)
- SSE - Vectorizing conditional code
- SSE image algorithm optimization series (cnblogs)
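As a rough illustration of what "vectorizing conditional code" means in SSE, the sketch below replaces a per-element branch with a compare mask and a bitwise select. The function name `scale_above` and its parameters are hypothetical and are not taken from the linked articles. The SSE path is only compiled when the compiler defines `__SSE__`; otherwise the scalar loop handles everything.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics
#endif

// Hypothetical example: out[i] = (x[i] > threshold) ? x[i] * scale : x[i]
void scale_above(float* out, const float* x, std::size_t n,
                 float threshold, float scale) {
    assert(n == 0 || (out != nullptr && x != nullptr));
    std::size_t i = 0;
#if defined(__SSE__)
    const __m128 vthr = _mm_set1_ps(threshold);
    const __m128 vscale = _mm_set1_ps(scale);
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(x + i);
        // Lanes where v > threshold become all-ones, the others all-zeros.
        __m128 mask = _mm_cmpgt_ps(v, vthr);
        // Compute the "then" branch for every lane, then select per lane.
        __m128 scaled = _mm_mul_ps(v, vscale);
        __m128 r = _mm_or_ps(_mm_and_ps(mask, scaled),
                             _mm_andnot_ps(mask, v));  // (~mask) & v
        _mm_storeu_ps(out + i, r);
    }
#endif
    // Scalar tail (and full fallback when SSE is unavailable).
    for (; i < n; ++i) {
        out[i] = (x[i] > threshold) ? x[i] * scale : x[i];
    }
}
```

Both branches are evaluated for all four lanes and the mask picks the result, so this pattern only pays off when the branch bodies are cheap and free of side effects.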
ARM NEON
Arm NEON technology is an advanced SIMD (single instruction multiple data) architecture extension for the Arm Cortex-A series and Cortex-R52 processors.
Compiler Options:
- test ARM NEON
- Raspberry Pi 3 Model B
- g++ options
- If compilation fails with error: ‘vfmaq_f32’ was not declared in this scope, you might add the option -mfpu=neon-vfpv4 so that __ARM_FEATURE_FMA is enabled, which arm_neon.h relies on to expose the fused multiply-add intrinsics.
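A minimal sketch of how vfmaq_f32 might be used once the option above is in place. The function name `fma_accumulate` is hypothetical. The NEON path is guarded by `__ARM_NEON` and `__ARM_FEATURE_FMA`, so the same file still builds, with a plain scalar loop, on targets or flag sets where the intrinsic is not available.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical example: acc[i] += a[i] * b[i] for i in [0, n).
void fma_accumulate(float* acc, const float* a, const float* b,
                    std::size_t n) {
    assert(n == 0 || (acc != nullptr && a != nullptr && b != nullptr));
    std::size_t i = 0;
#if (defined(__ARM_NEON) || defined(__ARM_NEON__)) && defined(__ARM_FEATURE_FMA)
    // Process four floats per iteration with fused multiply-add.
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        float32x4_t vacc = vld1q_f32(acc + i);
        vacc = vfmaq_f32(vacc, va, vb);  // vacc + va * vb
        vst1q_f32(acc + i, vacc);
    }
#endif
    // Scalar tail (and full fallback when NEON FMA is unavailable).
    for (; i < n; ++i) {
        acc[i] += a[i] * b[i];
    }
}
```

On a 32-bit ARM target such as the Raspberry Pi 3 Model B, this would be compiled with something like `g++ -O2 -mfpu=neon-vfpv4`. On AArch64 toolchains, NEON and FMA are normally enabled by default and the -mfpu option does not apply.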
Reference Books:
- NEON Programmer’s Guide
- ARM® NEON Intrinsics Reference