All posts by Ivan Shcherbakov

The new Create-from-use engine makes C++ coding easier

One of the annoyances of C++ compared to higher-level languages like C# is the relatively large overhead of creating new methods or functions. In C# you can just call the not-yet-existing method and then select “create method stub”. C++ makes things harder: creating a new method normally means manual edits to both header and source files, which can quickly get annoying.

To help VisualGDB users improve their productivity the newest VisualGDB 5.0 Preview 2 build contains a special create-from-use engine that automates method creation in a smart way.

OpenOCD usability taken to a new level

OpenOCD is a great tool. It allows debugging many modern ARM-based microcontrollers using a multitude of JTAG/SWD programmers: from the $50 FlySwatter to the $800 J-Link Pro. It allows customizing lots of settings and it’s open-source, so if you encounter problems with it, you can quickly pinpoint them by stepping through its code. Despite all those strengths, it has one major drawback – it’s not easy to set up.

A typical OpenOCD debugging setup involves finding the correct initialization scripts, customizing them to match your device/interface and often manually preparing and installing USB drivers. Although generally doable, this looks like a major inconvenience for anyone who expects a “plug-and-play” experience.

Fixing -rpath-link issues with cross-compilers

If you are using a cross-compiler to build a non-trivial Linux app (e.g. one that uses the Qt libraries), you may encounter messages like this when linking your binary:

ld.exe: warning:, needed by .../, not found (try using -rpath or -rpath-link)
ld.exe: warning:, needed by .../, not found (try using -rpath or -rpath-link)
ld.exe: warning:, needed by .../, not found (try using -rpath or -rpath-link)
ld.exe: warning:, needed by .../, not found (try using -rpath or -rpath-link)
ld.exe: warning:, needed by .../, not found (try using -rpath or -rpath-link)

The output is usually followed by a long list of undefined reference to `xxx’ errors.

Resolving library symbol load errors when debugging with cross-toolchains

Using cross-compilation toolchains to build code for your embedded Linux boards (such as Raspberry PI) can be cool. It’s faster than building your code on a slow embedded box with little RAM and disk space, it’s easier to edit files directly, and it’s more comfortable to have all necessary files at hand.

There’s one common problem, however. If you use your cross-compilation toolchain to debug a remote Linux box using GDBServer, you may end up in a situation when GDB silently fails to load symbols for your libraries unless you manually execute the sharedlibrary command. This article explains why this happens and how to resolve it.

Android Virtual Devices – not painfully slow anymore

If you have ever tried debugging anything (especially the native code) using the Android AVD Manager, you have probably noticed how slow it is. Taking minutes to load symbols, multiple seconds to do a single step, sometimes eternity to load and start a massive app…

The reason for it is how the emulation is implemented. Trying to emulate an Android device with an ARM CPU is not an easy job for your desktop computer or a laptop with an x86/x64 processor. The ARM instructions are completely different from the x86 ones so your host CPU cannot execute them directly. Although the Android emulator seems to be based on QEMU that supports dynamic translation, the results are still far from being impressive – a real Android device is usually around 10 times faster.

Things go differently when you want to emulate an x86-based Android device. As the instruction set is now compatible with the one used by your computer’s CPU, the emulation can be done using hardware virtualization techniques and can get as fast as VirtualBox or VMWare do when virtualizing another desktop OS.

A small tool that makes such blazing fast virtualization possible is the Intel Hardware Execution Manager a.k.a. HAXM – it installs as a background service and automatically comes into play when you start an x86-based virtual Android device.

You can easily use it to test your projects with the Android x86 emulator, but you’ll need to do a bit of additional setup, selecting x86 as the app ABI, if your project uses native code. We’ve published a detailed tutorial explaining how to set up HAXM from A to Z.

Enjoy your debugging!

How to debug VS-Android projects

Native Android development can be really puzzling. Multiple tools, multiple versions, undocumented features, script errors, Google clearly stating that NDK will not benefit most apps…

As of February 2013, there are two ways of building a native Android app with Visual Studio: our VisualGDB tool and the vs-android project. Although vs-android does not include a debugger, it is still used by many developers to build native Android apps. The key is its build system. Being based on the new MSBuild engine introduced with Visual Studio 2010, it creates a separate platform in Visual Studio using the Android GCC instead of the Microsoft C++ compiler. Although this does not involve NDK makefiles and will not automatically reflect any changes made with the new NDK releases, it is still a pretty smooth way of building your Android app.

Having said this, we’re proud to announce that VisualGDB 3.0 is now compatible with vs-android. We have made it really simple and smooth: simply open an existing vs-android project, select it as the startup one and then use the Android->Debug Android App command in Visual Studio to deploy and debug it automatically.

There is also a detailed step-by-step tutorial on debugging vs-android projects with VisualGDB.

Building a Win32 GCC cross-compiler for Debian systems

We’ve just finished building a Win32 toolchain for building Raspberry PI applications (BTW, if you have not checked out Raspberry PI yet, do it, as it’s a pretty amazing gadget for its price). The build process was far from straightforward and slightly different from the way barebone cross-compilers are built, so this post summarizes the problems we ran into and gives workarounds for them.

The build

The main difference between building a cross-compiler for a Linux system and a barebone system (such as arm-eabi) is the immense amount of libraries already available on the target system, each of them having include and library files under /usr/include and /usr/lib. Building separate versions of them on Windows would not only be tricky and time-consuming, but would also cause troubles in case our Windows binaries end up slightly different from the ones on the target Linux machine. So we simply import the headers and the libraries from the Linux machine before building the toolchain.

Importing the files

Identifying what to import is easy: just run this command line on your Linux box:

 echo | gcc -v -E -

You will get something like this:

Using built-in specs.
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 4.6.3-8+rpi1' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --disable-sjlj-exceptions --with-arch=armv6 --with-fpu=vfp --with-float=hard --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 4.6.3 (Debian 4.6.3-8+rpi1) 
COLLECT_GCC_OPTIONS='-v' '-E' '-march=armv6' '-mfloat-abi=hard' '-mfpu=vfp'
 /usr/lib/gcc/arm-linux-gnueabihf/4.6/cc1 -E -quiet -v -imultilib . -imultiarch arm-linux-gnueabihf - -march=armv6 -mfloat-abi=hard -mfpu=vfp
ignoring nonexistent directory "/usr/local/include/arm-linux-gnueabihf"
ignoring nonexistent directory "/usr/lib/gcc/arm-linux-gnueabihf/4.6/../../../../arm-linux-gnueabihf/include"
#include "..." search starts here:
#include <...> search starts here:
End of search list.
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "<stdin>"
COLLECT_GCC_OPTIONS='-v' '-E' '-march=armv6' '-mfloat-abi=hard' '-mfpu=vfp'

The keywords here are #include and LIBRARY_PATH. Most of the directories are empty or do not exist, so on Raspberry PI you end up with three key directories to import:

  • /usr/include
  • /usr/lib
  • /lib

Transfer them to your Windows machine and save them somewhere, preserving the relative paths.

Another important point is to look at the “Configured with” message that the target gcc shows, as we will need to replicate those arguments when building our cross-compiler.

Sysroot vs. Prefix

The usual way of building a barebone cross-compiler is to provide the target and prefix parameters to the configure script and let it decide where to put the includes and libraries:

../gcc-X.Y/configure --prefix=/c/my-gcc-folder/ --target=arm-eabi

This won’t work with cross-compilers for Linux due to the different locations of headers and libraries, so the configuration process is slightly different:

  1. Go to the usual target directory for your toolchain (e.g. c:\folder\arm-linux-gnueabihf)
  2. Create a subdirectory there (e.g. called sysroot, the name is arbitrary)
  3. Copy the imported Linux files to that directory, preserving the paths (e.g. /usr/include becomes c:\folder\arm-linux-gnueabihf\sysroot\usr\include).
  4. When configuring binutils, GCC and GDB, provide an additional --with-sysroot argument pointing to that directory.

That’s not all though. If you are impatient enough to try building your GCC now, you will encounter multiple missing header/library errors when it comes to building libgcc and other target libraries. In fact, the compiler you get won’t find most of the headers. This can be diagnosed by running xgcc.exe the same way we did for the Linux gcc:

echo | xgcc -v -E -

You might find some really strange paths there: e.g. c:/folder/sysrootc:/mingw/msys/1.0/include, or some of the Linux library paths (e.g. /usr/lib/arm-linux-gnueabihf) will be missing.

There are two totally different reasons behind this problem:

  1. Trying to maximize compatibility with Windows, MinGW/MSys silently replaces /usr/include with the MSys include directory. While this is useful when building Windows software, it has one inadvertent side effect: it replaces the NATIVE_SYSTEM_HEADER_DIR definition passed to GCC via command line with the MSys include path! Thus gcc ends up trying to append c:/mingw/msys/1.0/include to your sysroot, as it believes it’s the standard include file location.
  2. The Debian Linux distro (that Raspberry PI is based on) uses a slightly different include/lib path structure (e.g. /usr/include/<target>) and building GCC without the Debian patches would result in a crippled GCC.

The second one can be easily fixed by using the sources from Debian repositories (run apt-get source gcc-<version> on your Linux machine or apply Debian patches manually). The first one can be resolved by adding the following code to cppdefault.c:

#if defined(__MINGW32__) && defined(TARGET_SYSTEM_ROOT)
#undef NATIVE_SYSTEM_HEADER_DIR
#define NATIVE_SYSTEM_HEADER_DIR "/usr/include"
#endif
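
To see where a path like c:/folder/sysrootc:/mingw/msys/1.0/include comes from, here is a minimal user-space sketch. It is not GCC's actual code: join_sysroot() and check_join_sysroot() are hypothetical names, and the sketch assumes only that the sysroot is prepended to the header directory by plain string concatenation. The sample paths are the ones used earlier in this post.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<sysroot><header_dir>" by plain concatenation.
 * Returns 0 on success, -1 if the result does not fit into out. */
static int join_sysroot(char *out, size_t out_size,
                        const char *sysroot, const char *header_dir)
{
    int n = snprintf(out, out_size, "%s%s", sysroot, header_dir);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Compare the intended header directory with the one rewritten by MSys. */
static void check_join_sysroot(void)
{
    char path[256];

    /* Intended value of NATIVE_SYSTEM_HEADER_DIR: a sysroot-relative path. */
    assert(join_sysroot(path, sizeof(path),
                        "c:/folder/sysroot", "/usr/include") == 0);
    assert(strcmp(path, "c:/folder/sysroot/usr/include") == 0);

    /* After MSys has replaced /usr/include with its own include directory,
     * the same concatenation yields a path that cannot exist. */
    assert(join_sysroot(path, sizeof(path),
                        "c:/folder/sysroot", "c:/mingw/msys/1.0/include") == 0);
    assert(strcmp(path, "c:/folder/sysrootc:/mingw/msys/1.0/include") == 0);
}
```

Calling check_join_sysroot() from a test program passes silently, which shows that the bogus directory is simply the sysroot glued to an already absolute Windows path.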

Building target libs

Another surprise will be waiting for you when your GCC build process tries to link libgcc. The immense number of object files stuffed into a single link command line will exceed the Windows maximum command line length and cause a strange CreateProcess() error. The following command causes the overflow:

    $(subst @multilib_flags@,$(CFLAGS) -B./,$(subst \
        @multilib_dir@,$(MULTIDIR),$(subst \
        @shlib_objs@,$(objects),$(subst \
        @shlib_base_name@,libgcc_s,$(subst \
        @shlib_map_file@,$(mapfile),$(subst \
        @shlib_slibdir_qual@,$(MULTIOSSUBDIR),$(subst \

The following simple hack changes it to use a response file instead:

    echo $(objects) > __objects.txt
    $(subst @multilib_flags@,$(CFLAGS) -B./,$(subst \
        @multilib_dir@,$(MULTIDIR),$(subst \
        @shlib_objs@,@__objects.txt,$(subst \
        @shlib_base_name@,libgcc_s,$(subst \
        @shlib_map_file@,$(mapfile),$(subst \
        @shlib_slibdir_qual@,$(MULTIOSSUBDIR),$(subst \

Another problem is a wrong definition of caddr_t caused by conflicting Windows and Linux definitions. It is resolved by adding -DUSED_FOR_TARGET to the CFLAGS used for building libgcc.

Debugging with GDB

The sysroot trick is vital if you are planning to debug your apps with gdb running on Windows via gdbserver. Unless your sysroot structure exactly replicates the Linux directory structure and unless gdb is also configured with the --with-sysroot argument, you’ll get the following error message when trying to debug your programs:

warning: Unable to find dynamic linker breakpoint function.

This means that gdb failed to load symbols for the dynamic linker, rendering most module-related functionality inoperable. Needless to say, your debugging won’t be very usable afterwards…

The summary

To summarize, here’s a checklist for building your Linux cross-compiler on Windows:

  • If you are targeting Debian, apply Debian patches to GCC.
  • Fix NATIVE_SYSTEM_HEADER_DIR inside cppdefault.c
  • Use --with-sysroot when configuring binutils, gcc and gdb
  • Import Linux includes and libs before building gcc
  • Fix $(objects) in libgcc/Makefile
  • Add -DUSED_FOR_TARGET to libgcc CFLAGS
  • Test your toolchain with some C++ code (including STL)
  • Test debugging with gdbserver to ensure all symbols are usable

If you like it simple…

If you don’t feel like reliving the adventurous enterprise of building your own Windows toolchain for Raspberry PI, you can download our pre-built version.




Beware: a bug in memcpy() in MacOS 10.7 Kernel

I was just creating a truncated port of STLPort for MacOS kernel environment for one of our Mac drivers and stumbled upon a nasty bug. Any attempt to initialize an std::string immediately caused a kernel panic.

Investigating the problem revealed that the memcpy() function, which is hand-coded in assembly, does not actually set the return value. That makes some sense: how often do you use the return value of memcpy()? I never did, until I found out that STLPort relies on it heavily and crashes when it is wrong.

I’ve created a simple test case to reproduce the bug.

The __ucopy_trivial() function that is supposed to return the pointer to the end of the destination array actually returns 0x02. Looking more closely at the memcpy() function shows that the authors simply forgot to do anything with the %rax register holding the return value.

That’s consistent with the contents of the xnu-1699.22.81/osfmk/x86_64/bcopy.s file:

/* void *memcpy((void *) to, (const void *) from, (size_t) bcount) */
/*            rdi,              rsi,          rdx   */
/* Note: memcpy does not support overlapping copies */
    movq    %rdx,%rcx
    shrq    $3,%rcx                /* copy by 64-bit words */
    cld                    /* copy forwards */
    movq    %rdx,%rcx
    andq    $7,%rcx                /* any bytes left? */

Oops, what %rax? 🙂

I believe one of the reasons why this bug was not discovered immediately is that in most cases when you call memcpy(), gcc uses the %rax register to hold the first argument before moving it to %rdi. Thus if you try to reproduce the bug with a simple call to memcpy() alone, you won’t see any problem.

The solution

The solution is simple: just make a wrapper around memcpy() and put it somewhere in your global header files so that the memcpy()-related code will actually use it:

static inline void *memcpy_workaround(void *dst,
                                      const void *src, size_t len)
{
    memcpy(dst, src, len);
    return dst;
}

#define memcpy memcpy_workaround
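
To illustrate why the return value matters, here is a minimal user-space sketch of the pattern described above. It is not STLPort's actual code: copy_trivial_end() and check_copy_trivial_end() are hypothetical names, and the helper only assumes what the post states, namely that a __ucopy_trivial()-style function derives the end-of-destination pointer from whatever the copy routine returns. It calls the wrapper by name rather than redefining memcpy, so it compiles without touching any existing macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Wrapper from the post: ignore what memcpy() returns and hand back dst. */
static inline void *memcpy_workaround(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return dst;
}

/* Hypothetical helper in the spirit of __ucopy_trivial(): copy len bytes and
 * return a pointer one past the last byte written. The result is only as
 * good as the pointer returned by the copy routine. */
static void *copy_trivial_end(void *dst, const void *src, size_t len)
{
    return (char *)memcpy_workaround(dst, src, len) + len;
}

/* Sanity check: the end pointer must be dst + len and the data must match. */
static void check_copy_trivial_end(void)
{
    char src[] = "kernel";
    char dst[sizeof(src)];
    void *end = copy_trivial_end(dst, src, sizeof(src));

    assert(end == (void *)(dst + sizeof(src)));
    assert(strcmp(dst, "kernel") == 0);
}
```

If the copy routine returned garbage instead of dst, the computed end pointer would be garbage plus len, which is the kind of value that later gets dereferenced and brings the kernel down.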



Making breakpoints work with VMWare gdb stub

If you are debugging your Linux or MacOS kernel drivers frequently, running the guest OS under VMWare and using the VMWare gdb stub can save you a lot of time: fast reliable debugging experience handled by VMWare on top of the operating system itself.

There’s just one “feature” that can cost you a lot of time and that just took me a whole evening to figure out.

When you add debug stub support to your VM, you add something like this to your VMX file:

debugStub.listen.guest32.remote = "TRUE" 
debugStub.listen.guest64.remote = "TRUE"

Then you start debugging your kernel being so happy about the speed and reliability and then suddenly notice that you can’t set more than 4 breakpoints.

The problem is that VMWare 8 uses the “hide breakpoints” mode by default. That is, instead of implementing your breakpoints with “int 3” instructions like any normal debugger would, it uses the scarce hardware debug registers, limiting the number of breakpoints you can have to 4. This happens regardless of the GDB command you use to set the breakpoint: even the normal “break” is interpreted as “hbreak”.

The solution is very simple. Just add this line to your VMX file:

debugStub.hideBreakpoints = "FALSE"

This disables the breakpoint hiding mode and lets VMWare use the good old int 3 instruction to set the breakpoints so that you can debug normally again.