Android NativeCrash capture and parsing

In Android development, native crashes (NE) are a problem that cannot be ignored yet is extremely difficult to solve. The reason is that they span several layers of development and analysis: you need to be familiar with Java, C/C++ and NDK development at the same time, and the way to resolve them is not as clear as it is for Java exceptions. To clear up some of these doubts, this article explores how to capture, parse and restore NE.

1, NE introduction

NE is short for NativeCrash, which refers to errors raised while C or C++ code is running. NE differs from an ordinary Java error: the output in logcat cannot be read directly as a stack, and the application-layer engineer generally has no source code and cannot debug it.

As a result, application-layer engineers usually ignore NE errors in daily work, even when cloud diagnosis logs are available internally. Can an application-layer engineer who does not know much about C++ restore the stack and quickly locate or fix an NE?

The following sections focus on the composition of so, the capture and parsing of NE, and the extraction of the symbol table.

1.1 so composition

Let's first understand the composition of an so. A complete so consists of the compiled C/C++ code plus some debug information. The debug information records a lookup table of all the methods in the so, that is, a table mapping each method name to its offset address, which is also called the symbol table. An so in this state is called not stripped, and it is usually relatively large.

The so in a release build usually goes through a strip operation, which removes the debug information and reduces the size of the whole so.

As shown in the figure below:

 

You can see the size comparison before and after strip as follows.

If you are not familiar with NE or so, you can think of the debug information as the mapping file used in Java code obfuscation. Only with this mapping file can you perform stack analysis.

If the debug information is lost, the stack cannot be restored and the problem cannot be located.

The debug information is therefore the key to analyzing NE problems. When compiling an so, be sure to keep a copy of the so that has not been stripped, or of the symbol table extracted from it, for later analysis, and do so for every build. Once the code is modified and recompiled, the symbol table no longer corresponds to the previous build and cannot be used to analyze it.

1.2 viewing so status

You can also check the state of an so from the command line. On a Mac you can use the file command, whose output contains some basic information about the so.

As shown below, stripped means the so has no debug information, while with debug_info, not stripped means the so still carries it.

file libbreakpad-core-s.so
libbreakpad-core-s.so: *******, BuildID[sha1]=54ad86d708f4dc0926ad220b098d2a9e71da235a, stripped
file libbreakpad-core.so
libbreakpad-core.so: ******, BuildID[sha1]=54ad86d708f4dc0926ad220b098d2a9e71da235a, with debug_info, not stripped
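When many builds are archived, the same check can be scripted. The following is a minimal sketch in Python that reads one line of file output, reports whether the so is stripped, and uses the BuildID to confirm that a stripped so and a not stripped so come from the same build. The function names describe_so and same_build are hypothetical, and the sample lines are the two outputs above, copied as they appear.

```python
import re

def describe_so(file_output: str) -> dict:
    """Summarise one line printed by the `file` command for an so."""
    name, _, details = file_output.partition(":")
    build_id = re.search(r"BuildID\[\w+\]=([0-9a-f]+)", details)
    # "not stripped" contains the word "stripped", so check it first.
    not_stripped = "not stripped" in details
    return {
        "name": name.strip(),
        "build_id": build_id.group(1) if build_id else None,
        "stripped": "stripped" in details and not not_stripped,
        "has_debug_info": "with debug_info" in details,
    }

def same_build(a: dict, b: dict) -> bool:
    """True when both files carry the same, non-empty build ID."""
    return a["build_id"] is not None and a["build_id"] == b["build_id"]

release = describe_so(
    "libbreakpad-core-s.so: *******, "
    "BuildID[sha1]=54ad86d708f4dc0926ad220b098d2a9e71da235a, stripped"
)
debug = describe_so(
    "libbreakpad-core.so: ******, "
    "BuildID[sha1]=54ad86d708f4dc0926ad220b098d2a9e71da235a, "
    "with debug_info, not stripped"
)
print(release["stripped"], debug["stripped"], same_build(release, debug))
```

Matching on the BuildID rather than on the file name is a design choice of this sketch: it guards against pairing a crash from one build with the symbol table of another.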

On Windows, it is recommended to install a Linux subsystem and run the same command there to get this information.

Next, let's look at how to obtain the so in each of the two states.

1.3 get the stripped and not stripped so

At present, Android Studio outputs both the stripped and the not stripped so, whether the build uses mk or CMake. The two so files generated by a CMake build are located as follows.

so path before strip: build/intermediates/transforms/mergeJniLibs

so path after strip: build/intermediates/transforms/stripDebugSymbol

You can also strip manually with the aarch64-linux-android-strip tool provided by the Android NDK, which is located under /Users/njvivo/Library/Android/sdk/ndk/21.3.6528147/toolchains.

There are several versions of this tool, one for each mobile phone CPU architecture. If you don't know the CPU architecture of the phone, connect it and run the following command:

adb shell cat /proc/cpuinfo
Processor   : AArch64 Processor rev 12 (aarch64)

As shown above, the CPU of my mobile phone is aarch64, so the matching tool is aarch64-linux-android-strip. The NDK provides many tools, and the same principle applies to all of them:

aarch64 architecture
/Users/njvivo/Library/Android/sdk/ndk/21.3.6528147/toolchains/aarch64-linux-android-4.9/prebuilt/darwin-x86_64/bin/aarch64-linux-android-strip
arm architecture
/Users/njvivo/Library/Android/sdk/ndk/21.3.6528147/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-strip

Use the following command to strip the debug so directly

aarch64-linux-android-strip --strip-all libbreakpad-core.so
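Picking the right prefixed tool by hand is error-prone, so the rule can be written down once. The sketch below, in Python, maps an ABI name to the prefix of the NDK's GNU tools and builds the strip command shown above. The helper names ndk_tool and strip_command are hypothetical, and the x86 entries are added for completeness; the article itself only uses aarch64 and arm.

```python
# ABI name (as used for the lib/ folders in an APK) -> prefix of the NDK tools.
TOOL_PREFIX = {
    "arm64-v8a": "aarch64-linux-android",
    "armeabi-v7a": "arm-linux-androideabi",
    "x86": "i686-linux-android",
    "x86_64": "x86_64-linux-android",
}

def ndk_tool(abi: str, tool: str) -> str:
    """Return the prefixed tool name, e.g. aarch64-linux-android-strip."""
    if abi not in TOOL_PREFIX:
        raise ValueError(f"unknown ABI: {abi}")
    return f"{TOOL_PREFIX[abi]}-{tool}"

def strip_command(abi: str, so_path: str) -> list:
    """Argument list for stripping a debug so, ready for subprocess.run."""
    return [ndk_tool(abi, "strip"), "--strip-all", so_path]

print(" ".join(strip_command("arm64-v8a", "libbreakpad-core.so")))
print(ndk_tool("armeabi-v7a", "addr2line"))
```

The same lookup works for the addr2line and objdump tools used later in this article, since they share the prefix.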

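When compiling with CMake, you can add the following commands to produce a stripped so directly

set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} -s")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -s")

When compiling with an mk file, you can add the following flag to the compile flags (for example through LOCAL_CFLAGS). Note that this flag hides symbols that are not explicitly exported rather than removing the debug information, so it shrinks the exported symbol list but does not replace strip.

-fvisibility=hidden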

2, NE capture and analysis

NE parsing is stack parsing. The premise for everything that follows is that you have kept the symbol table, that is, an so that has not been stripped. If all you have is the stripped so, there is little you can do, and the stack can hardly be restored.

There are generally three ways to capture and restore the stack.

2.1 logcat capture

As the name suggests, the crash is captured through logcat. Open logcat in Android Studio and trigger an NE: all we see are lines such as #00 pc 00000000000011a0, with nothing that can be read directly. What we want is a readable log from logcat.

You can use ndk-stack, a tool shipped with the NDK, which parses the log output of an NE into a readable stack.

ndk-stack is located in the root directory of the NDK. On a Mac the path is

/Users/XXXX/Library/Android/sdk/ndk/21.3.6528147/ndk-stack

Then run the following command in this directory, or in the Terminal of Android Studio

adb shell logcat | <absolute path of the NDK>/ndk-stack -sym <directory containing the so>

With this running, the console prints the following log when an NE occurs in the application. The log shows the so in which the crash occurred and the corresponding method name. If you have the C/C++ source code, it is easy to locate the problem.

promote:~ njvivo$ adb shell logcat | ndk-stack -sym libbreakpad-core.so
********** Crash dump: **********
Build fingerprint: 'vivo/PD1809/PD1809:8.1.0/OPM1.171019.026/compil04252203:user/release-keys'
#00 0x00000000000161a0 /data/app/com.android.necase-lEp0warh8FqicyY1YqGXXA==/lib/arm64/libbreakpad-core.so (Java_com_online_breakpad_BreakpadInit_nUpdateLaunchInfo+16)
#01 0x00000000000090cc /data/app/com.android.necase-lEp0warh8FqicyY1YqGXXA==/oat/arm64/base.odex (offset 0x9000)
Crash dump is completed

Internally, ndk-stack integrates addr2line: it uses it to parse the stack in real time and prints the result to the console.
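Before any address can be resolved, the raw frame lines have to be picked out of logcat. The following Python sketch shows one way to do that for lines in the "#00 pc address path (symbol+offset)" form. It is not the code of ndk-stack; the regular expression and the name parse_frame are assumptions made for illustration.

```python
import re

# "#00 pc <hex address> <path>", optionally followed by "(<symbol>+<decimal offset>)".
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+pc\s+(?P<pc>[0-9a-fA-F]+)\s+(?P<path>\S+)"
    r"(?:\s+\((?P<symbol>\S+?)\+(?P<offset>\d+)\))?"
)

def parse_frame(line: str):
    """Return the fields of one backtrace frame, or None for other log lines."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "index": int(m.group("index")),
        "pc": int(m.group("pc"), 16),
        "library": m.group("path").rsplit("/", 1)[-1],
        # Frames such as "(offset 0x9000)" carry no symbol and stay None.
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset")) if m.group("offset") else None,
    }

frame = parse_frame(
    "#00 pc 00000000000161a0 /data/app/com.android.necase-lEp0warh8FqicyY1YqGXXA=="
    "/lib/arm64/libbreakpad-core.so "
    "(Java_com_online_breakpad_BreakpadInit_nUpdateLaunchInfo+16)"
)
print(frame["library"], hex(frame["pc"]), frame["symbol"], frame["offset"])
```

The library name and the pc value are the two inputs that addr2line needs; the symbol and offset are only what the stripped so on the device could provide.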

At this point some readers may think this is all quite simple. In practice, however, a crash is not always easy to reproduce, and a user's scenario can be hard to simulate. How, then, can NE crashes in production be monitored and located? There are two ways.

2.2 log analysis through DropBox - applicable to system applications

This is very simple. DropBox records the logs of JE (Java exceptions), NE and ANR. You only need to pull the logs stored under DropBox and analyze them. An NE log there is a file such as data_app_native_crash@1605531663898.txt.

 

Resolution scheme 1:

With the ndk-stack tool described above, you can parse a log from DropBox directly into a stack. The output shows that the crash is at line 111 of breakpad.cpp, in the Crash() method.

ndk-stack -sym /Users/njvivo/Desktop/NE -dump data_app_native_crash@1605531663898.txt
********** Crash dump: **********
Build fingerprint: 'vivo/PD1809/PD1809:8.1.0/OPM1.171019.026/compil04252203:user/release-keys'
#00 0x00000000000161a0 /data/app/com.android.necase-lEp0warh8FqicyY1YqGXXA==/lib/arm64/libbreakpad-core.so (Java_com_online_breakpad_BreakpadInit_nUpdateLaunchInfo+16)
Crash()
/Users/njvivo/Documents/project/Breakpad/breakpad-build/src/main/cpp/breakpad.cpp:111:8
Java_com_online_breakpad_BreakpadInit_nUpdateLaunchInfo
/Users/njvivo/Documents/project/Breakpad/breakpad-build/src/main/cpp/breakpad.cpp:122:0
#01 0x00000000000090cc /data/app/com.android.necase-lEp0warh8FqicyY1YqGXXA==/oat/arm64/base.odex (offset 0x9000)
Crash dump is completed

Resolution scheme 2:

Alternatively, use the addr2line tool provided by the NDK, which is located under /Users/njvivo/Library/Android/sdk/ndk and comes in two versions.

aarch64 architecture
/Users/njvivo/Library/Android/sdk/ndk/21.3.6528147/toolchains/aarch64-linux-android-4.9/prebuilt/darwin-x86_64/bin/aarch64-linux-android-addr2line
arm architecture
/Users/njvivo/Library/Android/sdk/ndk/21.3.6528147/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-addr2line

The command is used as follows. Given the so that has not been stripped and the stack address 00000000000161a0 that appears in the log, it resolves the crash location and method.

aarch64-linux-android-addr2line -f -C -e libbreakpad-core.so 00000000000161a0
 
Crash()
/Users/njvivo/Documents/project/Breakpad/breakpad-build/src/main/cpp/breakpad.cpp:111
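For batch work, the same call can be driven from a script. The Python sketch below builds the addr2line argument list used above and, when the tool is actually installed, pairs each address with the two lines addr2line prints for it. The function names addr2line_command and symbolize are hypothetical, and the tool is assumed to be on the PATH.

```python
import subprocess

def addr2line_command(tool: str, so_path: str, addresses) -> list:
    """Build the argument list; addresses may be ints or hex strings."""
    hex_addrs = [a if isinstance(a, str) else hex(a) for a in addresses]
    return [tool, "-f", "-C", "-e", so_path, *hex_addrs]

def symbolize(tool: str, so_path: str, addresses) -> list:
    """Run addr2line and return (function, file:line) for each address."""
    cmd = addr2line_command(tool, so_path, addresses)
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    lines = out.splitlines()
    # With -f and without -i, addr2line prints two lines per address.
    return list(zip(lines[0::2], lines[1::2]))

cmd = addr2line_command(
    "aarch64-linux-android-addr2line", "libbreakpad-core.so", [0x161A0]
)
print(" ".join(cmd))
```

Calling symbolize with the same arguments would run the command shown above, so it only works on a machine that has the NDK tool and the not stripped so.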

All of this looks simple, but there is a fatal limitation: DropBox can only be accessed by system applications, and non-system applications cannot get these logs at all. What should non-system applications do?

2.3 capture and analysis through BreakPad - applicable to all applications

Non-system applications can monitor and analyze NE with BreakPad, an open source tool from Google. The crash SDK also takes this approach. It monitors NE in real time and records the relevant files, so that the crash can be reported together with the launch information and scenario of the application at the time of the crash.

Here is a brief introduction to how to use BreakPad.

2.3.1 functions provided by BreakPad

BreakPad mainly provides two functions, NE monitoring with a callback and generation of a minidump file (a file ending in .dmp), and two tools, the symbol table tool and the stack restore tool.

  • Symbol table tool: extracts the debug information from an so to obtain the symbol table that corresponds to the stack.
  • Stack restore tool: restores the dump file generated by BreakPad into a stack of modules and offset values.

Both tools are generated when the BreakPad source code is compiled.

The stack restore tool produced by the build is minidump_stackwalk. If you don't want to compile it yourself, Android Studio also ships this program, and you can take it directly from its installation directory. The Mac path is:

/Applications/Android Studio.app/Contents/bin/lldb/bin/minidump_stackwalk

 

2.3.2 capture principle of breakpad

As described above, when an NE occurs in the application, BreakPad writes the corresponding minidump file to local storage and calls back to the application layer, which can then process the crash, for example to record statistics. After the minidump file is uploaded, the minidump_stackwalk and addr2line tools together can restore the actual stack. The schematic diagram is as follows:

 

When an NE occurs in the application, BreakPad generates a dump file locally on the mobile phone, as shown in the figure:

With these files alone, we only know that an NE has occurred in the application. The files themselves are not readable and need to be parsed.

Here is how to parse the NE generated above:

2.3.3 parsing dump file

1. Get the dump file of the NE crash. The minidump_stackwalk program obtained above and the dump file can be placed in the same directory or in different ones. If they are in different directories, use absolute paths.

Then run the following command in a terminal in that directory. minidump_stackwalk parses the dump file, and the parsed information is written to the crashLog.txt file in the current directory.

./minidump_stackwalk xxxxxxxx.dmp >crashLog.txt

2. After execution, minidump_stackwalk has written the NE information to crashLog.txt, as shown in the figure:

 

3. In the parsed NE information, pay attention to the red box in the figure. It shows that the crash happened inside libbreakpad-core.so, and 0x161a0 means the crash is at offset 161a0 from the base position of that so.
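Reading the offset out of crashLog.txt can also be automated. The Python sketch below assumes that, without symbol files, minidump_stackwalk prints each frame as "index  module + 0xoffset" under a "Thread N (crashed)" header. Both that layout and the sample report are assumptions of this sketch (only the libbreakpad-core.so and base.odex values come from this article), and the name crashed_frames is hypothetical.

```python
import re

# Frame lines such as " 0  libbreakpad-core.so + 0x161a0".
STACKWALK_FRAME = re.compile(r"^\s*(\d+)\s+(\S+)\s\+\s0x([0-9a-fA-F]+)\s*$")

def crashed_frames(report: str) -> list:
    """Return (index, module, offset) for every frame of the crashed thread."""
    frames = []
    in_crashed_thread = False
    for line in report.splitlines():
        if line.startswith("Thread "):
            in_crashed_thread = "(crashed)" in line
            continue
        m = STACKWALK_FRAME.match(line)
        if in_crashed_thread and m:
            frames.append((int(m.group(1)), m.group(2), int(m.group(3), 16)))
    return frames

# Hypothetical excerpt of crashLog.txt; the libc.so frame is made up.
SAMPLE_REPORT = """\
Thread 0 (crashed)
 0  libbreakpad-core.so + 0x161a0
 1  base.odex + 0x90cc

Thread 1
 0  libc.so + 0x1f00
"""

for index, module, offset in crashed_frames(SAMPLE_REPORT):
    print(index, module, hex(offset))
```

The module and offset of frame 0 are exactly the so file and address that the next step passes to addr2line.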

2.3.4 get crash stack

1. With the addr2line tool mentioned earlier, the crashing method, line number and call stack can be obtained from the crashing so file and the offset address (0x161a0).

2. Run the following command in a terminal in the directory that contains the tool.

arm-linux-androideabi-addr2line -C -f -e ${SOPATH} ${Address}
-C                //Demangle C++ function names into readable form
-f                //Print the name of the function that contains each address
-e                //Specify the so file to read; the source file path and line number of each address are printed
${SOPATH}         //so library path
${Address}        //Address to convert. Several addresses can be given, separated by spaces, for example: 0x161a0

3. The following is an example of a real run

aarch64-linux-android-addr2line -f -C -e libbreakpad-core.so 0x161a0
Crash()
/Users/njvivo/Documents/project/Breakpad/breakpad-build/src/main/cpp/breakpad.cpp:111

As the output above shows, the crash occurred at line 111 of breakpad.cpp, in the function Crash(), which matches the real file. The crash code is as follows:

void Crash() {
    volatile int *a = (int *) (NULL);
    *a = 1; // This is line 111 in the source file
}
 
extern "C"
JNIEXPORT void JNICALL
Java_com_online_breakpad_BreakpadInit_nUpdateLaunchInfo(JNIEnv *env, jobject instance,
                                                        jstring mLaunchInfoStr_) {
 
 
    DO_TRY
    {
        Crash();
        const char *mLaunchInfoStr = env->GetStringUTFChars(mLaunchInfoStr_, 0);
        launch_info = (char *) mLaunchInfoStr;
//        env->ReleaseStringUTFChars(mLaunchInfoStr_, mLaunchInfoStr);
    }
    DO_CATCH("updateLaunchInfo");
 
}

Based on the above, you can parse the collected dump files to obtain the detailed stack information of an NE.

3, extraction of the so symbol table

3.1 extracting the symbol table of so

As described above, an so contains debug information, also known as the symbol table. How can this debug information be separated out? The NDK also provides a tool for this.

aarch64 architecture
/Users/njvivo/Library/Android/sdk/ndk/21.3.6528147/toolchains/aarch64-linux-android-4.9/prebuilt/darwin-x86_64/bin/aarch64-linux-android-objdump
arm architecture
/Users/njvivo/Library/Android/sdk/ndk/21.3.6528147/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-objdump

The command is run as follows. It writes the disassembly of the so, interleaved with the source lines recovered from the debug information, to a file.

promote:~ njvivo$ aarch64-linux-android-objdump -S libbreakpad-core.so > breakpad.asm

 

3.2 symbol table analysis

3.2.1 direct analysis

The following figure shows the output symbol table file. Combining the log above with the symbol table file, we can also analyze the stack.

The log indicates that the crash address is 161a0, and in the symbol table file the code at 161a0 is *a = 1. From the earlier analysis we know the crash is at line 111 of breakpad.cpp, which is exactly the *a = 1 statement, as expected.

backtrace:
#00 pc 00000000000161a0 /data/app/com.android.necase-lEp0warh8FqicyY1YqGXXA==/lib/arm64/libbreakpad-core.so (Java_com_online_breakpad_BreakpadInit_nUpdateLaunchInfo+16)
#01 pc 00000000000090cc /data/app/com.android.necase-lEp0warh8FqicyY1YqGXXA==/oat/arm64/base.odex (offset 0x9000)
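Looking up an address in a large .asm file by eye is tedious, so here is a minimal Python sketch of the lookup. It assumes the usual objdump layout, a "address <function>:" header followed by "address:<TAB>bytes<TAB>instruction" lines. The name locate is hypothetical, and the instruction bytes in the sample are made up; only the addresses 16190 and 161a0 and the function name come from this article.

```python
import re

FUNC_RE = re.compile(r"^([0-9a-f]+) <(.+)>:$")      # "0000000000016190 <name>:"
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\t(.*)$")    # "   161a0:<TAB>..."

def locate(asm_text: str, address: int):
    """Return (function, function_start, instruction) for address, or None."""
    current = None
    for line in asm_text.splitlines():
        func = FUNC_RE.match(line)
        if func:
            current = (func.group(2), int(func.group(1), 16))
            continue
        insn = INSN_RE.match(line)
        if insn and current and int(insn.group(1), 16) == address:
            return current[0], current[1], insn.group(2).strip()
    return None

# Hypothetical excerpt of breakpad.asm; the encodings are invented.
SAMPLE_ASM = (
    "0000000000016190 <Java_com_online_breakpad_BreakpadInit_nUpdateLaunchInfo>:\n"
    "   16190:\td10083ff \tsub\tsp, sp, #0x20\n"
    "    *a = 1;\n"
    "   161a0:\tb900011f \tstr\twzr, [x8]\n"
)

name, start, instruction = locate(SAMPLE_ASM, 0x161A0)
print(name, "+", 0x161A0 - start)
```

Source lines such as *a = 1 match neither pattern and are skipped, so the function only reports the instruction itself; the surrounding source text still has to be read from the file.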

 

3.2.2 tool analysis

A Python tool hosted on Google Code combines the symbol table and the log to analyze the stack directly. It can be downloaded from https://code.google.com/archive/p/android-ndk-stacktrace-analyzer/.

Run the following command to parse the stack. The tool can also be used for batch parsing on a server, which is not described in detail here.

python parse_stack.py <asm-file> <logcat-file>

 

3.2.3 brief analysis of offset position

The text above mentioned the concept of an offset position. My understanding of it is only rough: the code in an so has a base position, and every instruction has an offset relative to that base.

In the example log there is the text (Java_com_online_breakpad_BreakpadInit_nUpdateLaunchInfo+16). The +16 means an offset of 16 bytes forward from the start of the nUpdateLaunchInfo method.

In the symbol table file, the nUpdateLaunchInfo method starts at 16190. Adding the offset of 16 gives 16190 + 10 (decimal 16 is hexadecimal 10) = 161a0, which matches the address in the log.
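The arithmetic is easy to get wrong because the log mixes a hexadecimal address with a decimal offset. A short Python check, using only the numbers from this article (the name frame_address is hypothetical):

```python
def frame_address(function_start: int, decimal_offset: int) -> int:
    """Address of a frame given the function start and the '+N' from the log."""
    return function_start + decimal_offset

start = 0x16190          # start of nUpdateLaunchInfo in the symbol table file
offset = 16              # the "+16" in the log, written in decimal
print(hex(offset))                        # 0x10
print(hex(frame_address(start, offset)))  # 0x161a0
```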

4, summary

That is all for this article. It covers some basic knowledge of so, as well as how NE crashes in Android are captured and parsed. I hope it helps readers who deal with NE. The crash SDK will also support NE parsing in a later release.

Author: vivo Malian

Tags: C++ Android

Posted by rrn on Sat, 16 Apr 2022 07:31:14 +0930