Home » Android » c++ – Segmentation fault when using dlclose(…) on android platform

c++ – Segmentation fault when using dlclose(…) on android platform

Posted by: admin May 14, 2020 Leave a comment

Questions:

I have some problems when using the dynamic loading API (<dlfcn.h>: dlopen(), dlclose(), etc) on Android.
I’m using NDK standalone toolchain (version 8) to compile the applications and libraries.
The Android version is 2.2.1 Froyo.

Here is the source code of the simple shared library.

#include <stdio.h>

int iii = 0;
int *ptr = NULL;

__attribute__((constructor))
static void init()
{
    iii = 653;
}

__attribute__((destructor))
static void cleanup()
{
}

int aaa(int i)
{
    printf("aaa %d\n", iii);
}

Here is the program source code which uses the mentioned library.

#include <dlfcn.h>
#include <stdlib.h>
#include <stdio.h>

int main()
{
    void *handle;
    typedef int (*func)(int);
    func bbb;

    printf("start...\n");

    handle = dlopen("/data/testt/test.so", RTLD_LAZY);
    if (!handle)
    {
        return 0;
    }

    bbb = (func)dlsym(handle, "aaa");
    if (bbb == NULL)
    {
        return 0;
    }

    bbb(1);

    dlclose(handle);
    printf("exit...\n");

    return 0;
}

With these sources everything is working fine, but when I try to use some STL functions or classes, the program crashes with a segmentation fault, when the main() function exits, for example when using this source code for the shared library.

#include <iostream>

using namespace std;

int iii = 0;
int *ptr = NULL;

__attribute__((constructor))
static void init()
{
    iii = 653;
}

__attribute__((destructor))
static void cleanup()
{
}

int aaa(int i)
{
    cout << iii << endl;
}

With this code, the program crashes with segmentation fault after or the during main() function exit.
I have tried couple of tests and found the following results.

  1. Without using of STL everything is working fine.
  2. When use STL and do not call dlclose() at the end, everything is working fine.
  3. I tried to compile with various compilation flags like -fno-use-cxa-atexit or -fuse-cxa-atexit, the result is the same.

What is wrong in my code that uses the STL?

How to&Answers:

Looks like I found the reason of the bug. I have tried another example with the following source files:
Here is the source code of the simple class:
myclass.h

class MyClass
{
public:
    MyClass();
    ~MyClass();
    void Set();
    void Show();
private:
    int *pArray;
};

myclass.cpp

#include <stdio.h>
#include <stdlib.h>
#include "myclass.h"

MyClass::MyClass()
{
    pArray = (int *)malloc(sizeof(int) * 5);
}

MyClass::~MyClass()
{
    free(pArray);
    pArray = NULL;
}

void MyClass::Set()
{
    if (pArray != NULL)
    {
        pArray[0] = 0;
        pArray[1] = 1;
        pArray[2] = 2;
        pArray[3] = 3;
        pArray[4] = 4;
    }
}

void MyClass::Show()
{
    if (pArray != NULL)
    {
        for (int i = 0; i < 5; i++)
        {
            printf("pArray[%d] = %d\n", i, pArray[i]);
        }
    }
}

As you can see from the code I did not used any STL related stuff.
Here is the source files of the functions library exports.
func.h

#ifdef __cplusplus
extern "C" {
#endif

int SetBabe(int);
int ShowBabe(int);

#ifdef __cplusplus
}
#endif

func.cpp

#include <stdio.h>
#include "myclass.h"
#include "func.h"

MyClass cls;

__attribute__((constructor))
static void init()
{

}

__attribute__((destructor))
static void cleanup()
{

}

int SetBabe(int i)
{
    cls.Set();
    return i;
}

int ShowBabe(int i)
{
    cls.Show();
    return i;
}

And finally this is the source code of the programm that uses the library.
main.cpp

#include <dlfcn.h>
#include <stdlib.h>
#include <stdio.h>
#include "../simple_lib/func.h"

int main()
{
    void *handle;
    typedef int (*func)(int);
    func bbb;

    printf("start...\n");

    handle = dlopen("/data/testt/test.so", RTLD_LAZY);
    if (!handle)
    {
        printf("%s\n", dlerror());
        return 0;
    }

    bbb = (func)dlsym(handle, "SetBabe");
    if (bbb == NULL)
    {
        printf("%s\n", dlerror());
        return 0;
    }
    bbb(1);

    bbb = (func)dlsym(handle, "ShowBabe");
    if (bbb == NULL)
    {
        printf("%s\n", dlerror());
        return 0;
    }
    bbb(1);

    dlclose(handle);
    printf("exit...\n");

    return 0;
}

Again as you can see the program using the library also does not using any STL related stuff, but after run of the program I got the same segmentation fault during main(...) function exit. So the issue is not connected to STL itself, and it is hidden in some other place. Then after some long research I found the bug.
Normally the destructors of static C++ variables are called immediately before main(...) function exit, if they are defined in main program, or if they are defined in some library and you are using it, then the destructors should be called immediately before dlclose(...).
On Android OS all destructors(defined in main program or in some library you are using) of static C++ variables are called during main(...) function exit. So what happens in our case? We have cls static C++ variable defined in library we are using. Then immediately before main(...) function exit we call dlclose(...) function, as a result library closed and cls becomes non valid. But the pointer of cls is stored somewhere and it’s destructor should be called during main(...) function exit, and because at the time of call it is already invalid, we get segmentation fault. So the solution is to not call dlclose(...) and everything should be fine. Unfortunately with this solution we cannot use attribute((destructor)) for deinitializing of something we want to deinitialize, because it is called as a result of dlclose(...) call.

Answer:

I have a general aversion to calling dlclose(). The problem is that you must ensure that nothing will try to execute code in the shared library after it has been unmapped, or you will get a segmentation fault.

The most common way to fail is to create an object whose destructor is defined in or calls code defined in the shared library. If the object still exists after dlclose(), your app will crash when the object is deleted.

If you look at logcat you should see a debuggerd stack trace. If you can decode that with the arm-eabi-addr2line tool you should be able to determine if it’s in a destructor, and if so, for what class. Alternatively, take the crash address, strip off the high 12 bits, and use that as an offset into the library that was dlclose()d and try to figure out what code lives at that address.

Answer:

I encountered the same headache on Linux. A work-around that fixes my segfault is to put these lines in the same file as main(), so that dlclose() is called after main returns:

static void* handle = 0;
void myDLClose(void) __attribute__ ((destructor));
void myDLClose(void)
{
    dlclose(handle);
}

int main()
{
    handle = dlopen(...);
    /* ... real work ... */
    return 0;
}

The root cause of dlclose-induced segfault may be that a particular implementation of dlclose() does not clean up the global variables inside the shared object.

Answer:

You need to compile with -fpic as a compiler flag for the application that is using dlopen() and dlclose(). You should also try error handling via dlerror() and perhaps checking if the assignment of your function pointer is valid, even if it’s not NULL the function pointer could be pointing to something invalid from the initialization, dlsym() is not guaranteed to return NULL on android if it cannot find a symbol. Refer to the android documentation opposed to the posix compliant stuff, not everything is posix compliant on android.

Answer:

You should use extern “C” to declare you function aaa()