How to enable NEON and VFPV3 when compiling in C++ for Beaglebone/RPi?



Viewing 6 posts - 1 through 6 (of 6 total)
  • Author
    Posts
  • #24604
    hanooi
    Participant

    Hello,

    I’m following the example on building OpenCV for the Raspberry Pi 2. I’d like to enable NEON and VFPV3 for my compiled shared objects, but I don’t know where in VisualGDB Project Properties to set these flags. According to the article below (which targets Python), using the NEON instructions can result in a 40% to 50% speedup:

    Optimizing OpenCV on the Raspberry Pi

    In that article, the flags were set by passing “-D ENABLE_NEON=ON -D ENABLE_VFPV3=ON” to cmake.

    Thanks,

    Han Ooi

     

    #24605
    aronrubin
    Participant

    I have the following in my project’s CMakeLists.txt:

    if(CMAKE_SYSTEM_PROCESSOR MATCHES "BCM28|armv7")
      add_compile_options(-march=armv8-a+crc -mcpu=cortex-a53 -mfpu=neon-fp-armv8)
    endif()

    #24608
    support
    Keymaster

    Hi All,

    Thanks @aronrubin for sharing your solution!

    Editing CMakeLists.txt should indeed do the trick. If not, please check the generated .mak files for the compiler command lines (if -mfpu=neon-fp-armv8 does not appear there, the CMAKE_SYSTEM_PROCESSOR might not be matched properly).

    If you would like to follow the tutorial linked above to the letter, you can use the VisualGDB Project Properties -> CMake Project Settings -> Extra CMake Configuration Variables setting to specify flags like ENABLE_NEON, e.g. simply “ENABLE_NEON=ON”.

    #24621
    aronrubin
    Participant

    To determine whether OpenCV is using these instructions at runtime, run the following on the command line:

    opencv_version --hw

    #24622
    support
    Keymaster

    Thanks very much for sharing this!

    #24625
    hanooi
    Participant

    Did just that and got the below. It worked! Thanks so much!

    4.0.1
    OpenCV’s HW features list:
    ID= 1 (MMX) -> N/A
    ID= 2 (SSE) -> N/A
    ID= 3 (SSE2) -> N/A
    ID= 4 (SSE3) -> N/A
    ID= 5 (SSSE3) -> N/A
    ID= 6 (SSE4.1) -> N/A
    ID= 7 (SSE4.2) -> N/A
    ID= 8 (POPCNT) -> N/A
    ID= 9 (FP16) -> ON
    ID= 10 (AVX) -> N/A
    ID= 11 (AVX2) -> N/A
    ID= 12 (FMA3) -> N/A
    ID= 13 (AVX512F) -> N/A
    ID= 14 (AVX512BW) -> N/A
    ID= 15 (AVX512CD) -> N/A
    ID= 16 (AVX512DQ) -> N/A
    ID= 17 (AVX512ER) -> N/A
    ID= 18 (AVX512IFMA) -> N/A
    ID= 19 (AVX512PF) -> N/A
    ID= 20 (AVX512VBMI) -> N/A
    ID= 21 (AVX512VL) -> N/A
    ID=100 (NEON) -> ON
    ID=200 (VSX) -> N/A
    ID=201 (VSX3) -> N/A
    ID=256 (AVX512-SKX) -> N/A
    Total available: 2
