App crawler -- extracting data with Appium

Use appium to automatically control mobile devices and extract data

Learning objectives
  1. Understand how the appium-python-client module locates elements and extracts their text content
  2. Understand how the appium-python-client module performs swipe actions

The running example: control swiping in the Douyin app and collect information about each short video, such as the publisher's nickname and the number of likes

2.1 Install the appium-python-client module and start the environment

2.1.1 Install appium-python-client module

Run pip install appium-python-client in a virtual environment on Windows

2.1.2 Start the Nox (Yeshen) emulator, open a cmd terminal in the bin directory of its installation path, and use adb commands to connect the adb server to the emulator

  1. adb devices
C:\Program Files (x86)\Nox\bin>adb devices
List of devices attached
* daemon not running; starting now at tcp:5037
* daemon started successfully
  2. nox_adb.exe connect 127.0.0.1:62001
C:\Program Files (x86)\Nox\bin>nox_adb.exe connect 127.0.0.1:62001
already connected to 127.0.0.1:62001
  3. adb devices
C:\Program Files (x86)\Nox\bin>adb devices
List of devices attached
127.0.0.1:62001 device
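The connection check above can also be scripted. The following is a minimal sketch, not part of the original walkthrough: the helper names parse_adb_devices and list_devices are hypothetical, and list_devices assumes that adb is on PATH (otherwise pass the full path to the adb or nox_adb.exe binary).

```python
import subprocess


def parse_adb_devices(output):
    """Return (serial, state) pairs from the text printed by `adb devices`."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the header line and daemon status lines ("* daemon ...").
        if not line or line.startswith("*") or line.startswith("List of devices"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices.append((parts[0], parts[1]))
    return devices


def list_devices(adb="adb"):
    """Run `adb devices` and parse its output (assumes the binary can be found)."""
    result = subprocess.run([adb, "devices"], capture_output=True, text=True, check=True)
    return parse_adb_devices(result.stdout)


# Output copied from the last `adb devices` call in the session above.
sample = """List of devices attached
127.0.0.1:62001 device
"""
print(parse_adb_devices(sample))  # [('127.0.0.1:62001', 'device')]
```

An empty list means no emulator is attached yet, in which case the connect command from step 2 needs to be run first.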

2.1.3 Start appium-desktop and click Start Server to start the Appium service

[Appium] Welcome to Appium v1.10.0
[Appium] Appium REST http interface listener started on 0.0.0.0:4723

2.1.4 Use the content learned in the previous section to obtain the Desired Capabilities parameters

  1. Get the model of the emulated device
    • Open Settings - About tablet
    • Read the model number field to get the model of the emulated device
  2. Get the app package name and the activity name
    • Open the Douyin short video app in the emulator
    • With the adb connection established, enter adb shell in a cmd terminal in the bin directory of the Nox emulator installation
    • Inside the adb shell, enter dumpsys activity | grep mFocusedActivity
    • com.ss.android.ugc.aweme is the app package name
    • .main.MainActivity is the activity name (used as appActivity below); note the leading dot
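To make the last two bullets concrete, here is a small sketch that pulls the package and activity names out of an mFocusedActivity line. The sample line is an assumption (the record hash and task id are made up, and the exact layout of this line can differ between Android versions), and parse_focused_activity is a hypothetical helper name.

```python
import re

# Matches "<package>/<activity>" inside an "mFocusedActivity: ActivityRecord{...}" line.
_FOCUSED = re.compile(r"mFocusedActivity:.*?\s([\w.]+)/([\w.$]+)")


def parse_focused_activity(line):
    """Return (package, activity) from an mFocusedActivity line, or None if it does not match."""
    match = _FOCUSED.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Assumed example of what the grep command prints; only the package/activity part comes from the text.
sample = "  mFocusedActivity: ActivityRecord{1a2b3c4 u0 com.ss.android.ugc.aweme/.main.MainActivity t12}"
print(parse_focused_activity(sample))  # ('com.ss.android.ugc.aweme', '.main.MainActivity')
```

The two returned values correspond to the appPackage and appActivity entries of the Desired Capabilities.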

2.2 Initialize and obtain mobile device resolution

Complete the code as follows and run it. If the Douyin app starts in the emulator and the resolution of the emulated device is printed, the setup works

from appium import webdriver

# Initialize the configuration and set the Desired Capabilities parameters
desired_caps = {
    'platformName': 'Android',
    'deviceName': 'SM-G955F',
    'appPackage': 'com.ss.android.ugc.aweme',
    'appActivity': '.main.MainActivity'
}
# Specify Appium Server
server = 'http://localhost:4723/wd/hub'
# Create a new driver
driver = webdriver.Remote(server, desired_caps)
# Get the resolution (px) of the emulator/phone with a single call
size = driver.get_window_size()
width, height = size['width'], size['height']
print(width, height)
  • mobile device resolution
    • driver.get_window_size()['width']
    • driver.get_window_size()['height']
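One plausible use of the resolution is to compute swipe coordinates as fractions of the screen, so the same code works on devices of different sizes. The sketch below is an illustration under assumptions: swipe_up_coords and swipe_up are hypothetical helpers, the 0.75 and 0.25 ratios and the 300 ms duration are arbitrary choices, and it assumes that swiping up moves to the next video. It relies on the client's driver.swipe(start_x, start_y, end_x, end_y, duration) method.

```python
def swipe_up_coords(width, height, start_ratio=0.75, end_ratio=0.25):
    """Return (start_x, start_y, end_x, end_y) for a vertical swipe up the screen centre."""
    x = width // 2
    return x, int(height * start_ratio), x, int(height * end_ratio)


def swipe_up(driver, duration_ms=300):
    """Swipe up once on an Appium driver, using the device resolution to scale the gesture."""
    size = driver.get_window_size()
    start_x, start_y, end_x, end_y = swipe_up_coords(size['width'], size['height'])
    # duration is given in milliseconds
    driver.swipe(start_x, start_y, end_x, end_y, duration_ms)


# Coordinates for an assumed 720 x 1280 screen.
print(swipe_up_coords(720, 1280))  # (360, 960, 360, 320)
```

With a live session, calling swipe_up(driver) after the driver has been created would perform the gesture once.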

2.3 Methods of locating elements and extracting text

2.3.1 Click the magnifying glass icon in the upper right corner of appium desktop

Fill in the configuration as shown in the figure, and click start session

2.3.2 How to use the element-locating (inspector) interface is shown in the figure below

2.3.3 Click on the name of the author of the short video to view and obtain the id of the element

2.3.4 Use Python to get the text content of an element by its id

After the appium driver object has been created, add the following code, then run it and check the output

# Get information about the current video; the element ids were found with the appium desktop inspector
print(driver.find_element_by_id('bc').text)  # publisher name
print(driver.find_element_by_id('al9').text)  # number of likes
print(driver.find_element_by_id('al_').text)  # number of messages (comments)
print(driver.find_element_by_id('a23').text)  # video title; this element may be absent, in which case the call raises an error
  • Methods for locating elements and getting their text content
    • driver.find_element_by_id(element id).text
    • driver.find_element_by_xpath(xpath rule that locates the element).text
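Because some elements, such as the video title, may be missing, a lookup that returns None instead of raising can be convenient. The following is a minimal sketch, assuming the same older client used in this article, where driver.find_elements_by_id returns an empty list when nothing matches (newer client releases replace the find_element(s)_by_* helpers with find_element(s)(by, value)). The helper names text_or_none and read_video_info are hypothetical; the element ids are the ones used above.

```python
def text_or_none(driver, element_id):
    """Return the text of the first element with this id, or None if no such element exists."""
    # find_elements_by_id returns an empty list instead of raising when nothing matches.
    elements = driver.find_elements_by_id(element_id)
    return elements[0].text if elements else None


def read_video_info(driver):
    """Collect the fields read in section 2.3.4 into a dict; missing fields are None."""
    ids = {'publisher': 'bc', 'likes': 'al9', 'messages': 'al_', 'title': 'a23'}
    return {name: text_or_none(driver, element_id) for name, element_id in ids.items()}
```

With a live session, print(read_video_info(driver)) would print one dict per video, with the title set to None when that element is absent.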

Tags: Python Android shell

Posted by Risingstar on Sun, 27 Nov 2022 16:22:40 +1030