Day 675. Tomcat's "seniors" - in-depth dismantling of Tomcat & Jetty

Tomcat's "seniors"

Hi, I'm Achang. Today I studied and took notes on a series of Tomcat classes that manage the startup process and the loading of components.

The startup.sh script in Tomcat's /bin directory starts Tomcat. But do you know what happens after we run this script?

The startup flow is as follows:

  1. Tomcat is essentially a Java program, so the startup.sh script starts a JVM to run Tomcat's startup class, Bootstrap.
  2. Bootstrap's main task is to initialize Tomcat's classloader and create Catalina.
  3. Catalina is a startup class that parses server.xml, creates corresponding components, and calls Server's start method.
  4. The responsibility of the Server component is to manage the Service components, so it calls the start method of each Service.
  5. The responsibility of the Service component is to manage the connector and the top-level container Engine, so it calls the start method of the connector and Engine.
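
The chain of start calls in the list above can be sketched with a toy model. This is only an illustration: every class name below (ToyServer, ToyService, ToyEngine, ToyConnector) is a hypothetical stand-in and not Tomcat's real code, and the tree is built by hand rather than by parsing server.xml.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the start-up chain; these classes are illustrative stand-ins, not Tomcat's. */
final class StartupChainSketch {
    /** Records the order in which components were started. */
    static final List<String> trace = new ArrayList<>();

    interface Component { void start(); }

    static final class ToyConnector implements Component {
        public void start() { trace.add("Connector"); }
    }

    static final class ToyEngine implements Component {
        public void start() { trace.add("Engine"); }
    }

    static final class ToyService implements Component {
        private final ToyEngine engine = new ToyEngine();
        private final ToyConnector connector = new ToyConnector();
        public void start() {
            trace.add("Service");
            engine.start();     // the top-level container
            connector.start();  // the connector
        }
    }

    static final class ToyServer implements Component {
        private final List<ToyService> services = new ArrayList<>();
        void addService(ToyService service) { services.add(service); }
        public void start() {
            trace.add("Server");
            for (ToyService service : services) {
                service.start();
            }
        }
    }

    /** Plays the role of Catalina: builds the component tree, then starts the Server. */
    static List<String> run() {
        trace.clear();
        ToyServer server = new ToyServer();   // in Tomcat this tree comes from server.xml
        server.addService(new ToyService());
        server.start();
        return new ArrayList<>(trace);
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints [Server, Service, Engine, Connector]
    }
}
```

Each level only knows about the level directly below it, which is why starting the Server is enough to start everything.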

Think of Bootstrap as a god: it initializes the class loaders, the tools from which everything else is created.

If we compare Tomcat to a company, then Catalina should be the founder of the company, because Catalina is responsible for forming the team, that is, creating the Server and its subcomponents.

Server is the CEO of the company and is responsible for managing multiple business groups, each of which is a Service.

Service is the general manager of the business group and it manages two functional departments:

  • One is the external marketing department, that is, the connector component;
  • The other is the internal R&D department, that is, the container component.

Engine is the R&D manager, because Engine is the top-level container component. You can see that none of these startup classes or components handle specific requests. Their job is mainly "management": managing the life cycle of lower-level components and assigning tasks to them, that is, routing requests to the components that do the actual "work".

1. Catalina

Catalina's main task is to create the Server. It does not simply create a new Server instance directly. Instead, it parses server.xml, creates the components configured in server.xml one by one, and then calls the init and start methods of the Server component, which starts the whole of Tomcat.

As a "manager", Catalina also needs to deal with various "abnormal" situations, such as when we shut down Tomcat through "Ctrl + C", how will Tomcat stop gracefully and clean up resources? So Catalina registers a "shutdown hook" in the JVM.

public void start() {
    //1. If the Server instance held is empty, parse server.xml to create it
    if (getServer() == null) {
        load();
    }
    //2. If the creation fails, exit with an error
    if (getServer() == null) {
        log.fatal(sm.getString("catalina.noServer"));
        return;
    }

    //3. Start Server
    try {
        getServer().start();
    } catch (LifecycleException e) {
        return;
    }

    //Create and register shutdown hook
    if (useShutdownHook) {
        if (shutdownHook == null) {
            shutdownHook = new CatalinaShutdownHook();
        }
        Runtime.getRuntime().addShutdownHook(shutdownHook);
    }

    //Use the await method to listen for stop requests
    if (await) {
        await();
        stop();
    }
}

So what is a "shutdown hook" and what does it do?

If we need to do some cleanup work when the JVM shuts down, such as flushing cached data to disk, or cleaning up some temporary files, we can register a "shutdown hook" with the JVM.

A "shutdown hook" is actually a thread, and the JVM will try to execute the thread's run method before stopping.
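
As a minimal sketch of the mechanism, independent of Tomcat, the example below registers a hook thread through Runtime.getRuntime().addShutdownHook. The class name ShutdownHookSketch and the cleanedUp flag are made up for illustration; the flag stands in for real cleanup work.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ShutdownHookSketch {
    /** Set to true once the cleanup has run; stands in for real cleanup work. */
    static final AtomicBoolean cleanedUp = new AtomicBoolean(false);

    /** Builds (but does not register) a hook thread whose run method performs the cleanup. */
    static Thread newCleanupHook() {
        return new Thread(() -> {
            // Real code would flush cached data or delete temporary files here.
            cleanedUp.set(true);
            System.out.println("cleanup done");
        }, "cleanup-hook");
    }

    public static void main(String[] args) {
        // Register the hook; the JVM starts this thread when it begins shutting down.
        Runtime.getRuntime().addShutdownHook(newCleanupHook());
        System.out.println("main is returning; the JVM runs the hook before it exits");
    }
}
```

The hook runs on a normal exit or on Ctrl + C, but not when the process is killed forcibly (for example with kill -9), which is why the JVM only "tries" to execute it.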

Let's look at what Tomcat's "shutdown hook", CatalinaShutdownHook, does.

protected class CatalinaShutdownHook extends Thread {

    @Override
    public void run() {
        try {
            if (getServer() != null) {
                Catalina.this.stop();
            }
        } catch (Throwable ex) {
           ...
        }
    }
}

Tomcat's "shutdown hook" actually calls Catalina's stop method, which stops the Server and releases and cleans up all resources.

2. Server components

The specific implementation class of the Server component is StandardServer. Let's take a look at what functions StandardServer implements.

StandardServer inherits from LifecycleBase, so its own life cycle is managed in a uniform way. Its sub-components are Services, so it also has to manage their life cycle: it calls the start method of each Service when it starts, and their stop method when it stops.

Server maintains several Service components internally, stored in an array. How does Server add a Service to the array?

@Override
public void addService(Service service) {

    service.setServer(this);

    synchronized (servicesLock) {
        //Create a new array of length + 1
        Service results[] = new Service[services.length + 1];
        
        //copy old data
        System.arraycopy(services, 0, results, 0, services.length);
        results[services.length] = service;
        services = results;

        //Start the Service component
        if (getState().isAvailable()) {
            try {
                service.start();
            } catch (LifecycleException e) {
                // Ignore
            }
        }

        //Trigger the listener event
        support.firePropertyChange("service", null, service);
    }

}

As the code above shows, Server does not allocate a long array up front. Instead, it grows the array as Services are added: each time a new Service instance is added, it creates a new array one element longer and copies the contents of the original array into it. The purpose is to save memory.
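
The same grow-by-one idea can be tried in isolation. The sketch below is not Tomcat code: GrowOnAddSketch is a hypothetical class, it stores plain strings instead of Service objects, and it uses Arrays.copyOf in place of the explicit System.arraycopy call.

```java
import java.util.Arrays;

final class GrowOnAddSketch {
    private final Object lock = new Object();
    private String[] items = new String[0];   // starts empty, no capacity reserved up front

    void add(String item) {
        synchronized (lock) {
            // Allocate a new array one slot longer and copy the old contents into it.
            String[] results = Arrays.copyOf(items, items.length + 1);
            results[items.length] = item;
            items = results;   // swap in the new array
        }
    }

    /** Returns a copy so callers cannot modify the internal array. */
    String[] snapshot() {
        synchronized (lock) {
            return items.clone();
        }
    }

    public static void main(String[] args) {
        GrowOnAddSketch registry = new GrowOnAddSketch();
        registry.add("Catalina");
        registry.add("Other");
        System.out.println(Arrays.toString(registry.snapshot()));   // prints [Catalina, Other]
    }
}
```

Copying on every add costs time proportional to the current length, which is acceptable here because Services are added rarely and there are usually only a few of them.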

In addition, the Server component has another important task: it opens a Socket that listens on the shutdown port, which is why you can stop Tomcat with the shutdown command.

You may have noticed that the last step in Catalina's start method above is a call to await, which in turn calls the Server's await method.

In the await method, a ServerSocket is created to listen on the shutdown port (8005 by default), and connection requests on it are accepted in an infinite loop. When a new connection arrives, it is accepted and data is read from the Socket;

If the read data is the stop command "SHUTDOWN", exit the loop and enter the stop process.
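
The listen-and-compare loop described above can be sketched as follows. This is a simplified illustration, not StandardServer's code: the class name ShutdownPortSketch is hypothetical, the command is read as a single line with readLine, and details such as read timeouts are left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class ShutdownPortSketch {
    private final ServerSocket serverSocket;
    private final String shutdownCommand;

    ShutdownPortSketch(int port, String shutdownCommand) throws IOException {
        // Bind to the loopback address only; port 0 lets the OS pick a free port.
        this.serverSocket = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
        this.shutdownCommand = shutdownCommand;
    }

    int getPort() {
        return serverSocket.getLocalPort();
    }

    /** Blocks until some client sends the shutdown command, then closes the listener. */
    void await() throws IOException {
        try (ServerSocket listener = serverSocket) {
            while (true) {
                // Accept one connection at a time and read a single line from it.
                try (Socket socket = listener.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(
                             socket.getInputStream(), StandardCharsets.US_ASCII))) {
                    String line = in.readLine();
                    if (shutdownCommand.equals(line)) {
                        return;   // leave the loop; the caller continues with its stop logic
                    }
                    // Anything else is ignored and the loop waits for the next connection.
                }
            }
        }
    }
}
```

A caller would construct it with the configured port and command, for example new ShutdownPortSketch(8005, "SHUTDOWN"), call await() on the main thread, and run its stop logic once await() returns.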

3. Service components

The concrete implementation class of the Service component is StandardService. Let's look at its definition and key member variables.

public class StandardService extends LifecycleBase implements Service {
    //name
    private String name = null;
    
    //Server instance
    private Server server = null;

    //Connector array
    protected Connector connectors[] = new Connector[0];
    private final Object connectorsLock = new Object();

    //Corresponding Engine container
    private Engine engine = null;
    
    //mapper and its listeners
    protected final Mapper mapper = new Mapper();
    protected final MapperListener mapperListener = new MapperListener(this);

    //... other members omitted
}

StandardService inherits the LifecycleBase abstract class, and there are some familiar components in StandardService, such as Server, Connector, Engine and Mapper.

Then why is there a MapperListener? Because Tomcat supports hot deployment: when the deployment of web applications changes, the mapping information in Mapper must change too. MapperListener is a listener that monitors changes in the containers and updates Mapper accordingly. This is a typical use of the observer pattern.
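
To make the observer relationship concrete, here is a small sketch under stated assumptions. All names in it (DeploymentListener, ToyHost, ToyMapper, ToyMapperListener) are hypothetical and much simpler than Tomcat's real interfaces; the routing table maps a context path to an application name only.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

final class MapperObserverSketch {
    /** Observer interface: notified when a web application is added or removed. */
    interface DeploymentListener {
        void appAdded(String contextPath, String appName);
        void appRemoved(String contextPath);
    }

    /** Subject: a toy container that announces deployment changes to its listeners. */
    static final class ToyHost {
        private final List<DeploymentListener> listeners = new CopyOnWriteArrayList<>();
        void addListener(DeploymentListener listener) { listeners.add(listener); }
        void deploy(String contextPath, String appName) {
            for (DeploymentListener listener : listeners) listener.appAdded(contextPath, appName);
        }
        void undeploy(String contextPath) {
            for (DeploymentListener listener : listeners) listener.appRemoved(contextPath);
        }
    }

    /** Toy routing table from context path to application name. */
    static final class ToyMapper {
        private final Map<String, String> routes = new ConcurrentHashMap<>();
        String lookup(String contextPath) { return routes.get(contextPath); }
    }

    /** Observer: keeps the mapper in sync with what the container reports. */
    static final class ToyMapperListener implements DeploymentListener {
        private final ToyMapper mapper;
        ToyMapperListener(ToyMapper mapper) { this.mapper = mapper; }
        public void appAdded(String contextPath, String appName) { mapper.routes.put(contextPath, appName); }
        public void appRemoved(String contextPath) { mapper.routes.remove(contextPath); }
    }

    public static void main(String[] args) {
        ToyMapper mapper = new ToyMapper();
        ToyHost host = new ToyHost();
        host.addListener(new ToyMapperListener(mapper));
        host.deploy("/shop", "shop-app");
        System.out.println(mapper.lookup("/shop"));   // prints shop-app
        host.undeploy("/shop");
        System.out.println(mapper.lookup("/shop"));   // prints null
    }
}
```

The container never touches the mapper directly; it only notifies listeners, so the routing table stays current without the two being tightly coupled.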

For a component in a "management" role, the most important job is to maintain the life cycle of other components.

In addition, when starting various components, pay attention to their dependencies, that is, pay attention to the order of starting. Let's take a look at the Service startup method:

protected void startInternal() throws LifecycleException {

    //1. Trigger the start listener
    setState(LifecycleState.STARTING);

    //2. Start Engine first, Engine will start its sub-container
    if (engine != null) {
        synchronized (engine) {
            engine.start();
        }
    }
    
    //3. Then start the Mapper listener
    mapperListener.start();

    //4. Finally start the connector, the connector will start its subcomponents, such as Endpoint
    synchronized (connectorsLock) {
        for (Connector connector: connectors) {
            if (connector.getState() != LifecycleState.FAILED) {
                connector.start();
            }
        }
    }
}

As you can see from the startup method, the Service first starts the Engine component, then starts the Mapper listener, and finally starts the connector.

This is easy to understand: only after the inner components have started can they serve the outside world, so the outer connector components are started last.

Mapper also depends on the container components, because their changes can only be monitored once they have started. That is why Mapper and MapperListener are started after the container components.

The order in which components are stopped is the reverse of the order in which they are started, based on their dependencies.
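
The "stop in reverse order of start" principle can be sketched with a stack. This illustrates the idea only and does not reproduce StandardService's stop code; the class OrderedLifecycleSketch and its string-based components are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class OrderedLifecycleSketch {
    private final List<String> startOrder = new ArrayList<>();   // dependencies first
    private final Deque<String> started = new ArrayDeque<>();    // stack of running components
    private final List<String> log = new ArrayList<>();

    void register(String name) { startOrder.add(name); }

    void startAll() {
        for (String name : startOrder) {
            log.add("start " + name);
            started.push(name);   // the most recently started component ends up on top
        }
    }

    void stopAll() {
        // Popping the stack stops components in the reverse of their start order.
        while (!started.isEmpty()) {
            log.add("stop " + started.pop());
        }
    }

    List<String> log() { return log; }

    public static void main(String[] args) {
        OrderedLifecycleSketch service = new OrderedLifecycleSketch();
        service.register("Engine");
        service.register("MapperListener");
        service.register("Connector");
        service.startAll();
        service.stopAll();
        System.out.println(service.log());
        // prints [start Engine, start MapperListener, start Connector,
        //         stop Connector, stop MapperListener, stop Engine]
    }
}
```

Because only components that were actually started are on the stack, a partially failed start-up would still be unwound in a consistent order.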

4. Engine components

Finally, let's look at how the container component Engine is implemented.

Engine is essentially a container, so it inherits the ContainerBase base class and implements the Engine interface.

public class StandardEngine extends ContainerBase implements Engine {
}

We know that the sub-containers of Engine are Hosts, so it holds a collection of Host containers. This functionality is abstracted into ContainerBase, which has the following data structure:

protected final HashMap<String, Container> children = new HashMap<>();

ContainerBase uses a HashMap to store its sub-containers. It also implements adding, removing, updating and looking up sub-containers, and even provides a default implementation for starting and stopping them. For example, ContainerBase uses a dedicated thread pool to start sub-containers:

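//Take a snapshot of the sub-containers as an array, then start each one on the thread pool
Container children[] = findChildren();
List<Future<Void>> results = new ArrayList<>();
for (int i = 0; i < children.length; i++) {
   results.add(startStopExecutor.submit(new StartChild(children[i])));
}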

So Engine reuses this method directly when starting the Host subcontainer.
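
The submit-then-wait pattern behind this can be tried on its own. The sketch below is an assumption-laden illustration, not ContainerBase itself: ParallelChildStartSketch and StartChildTask are hypothetical names, children are plain strings, and "starting" a child just records its name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelChildStartSketch {
    /** Names of children that finished starting; thread-safe because tasks run concurrently. */
    static final Set<String> startedChildren = ConcurrentHashMap.newKeySet();

    /** One start task per child, similar in spirit to the StartChild task in the excerpt. */
    static final class StartChildTask implements Callable<Void> {
        private final String child;
        StartChildTask(String child) { this.child = child; }
        @Override
        public Void call() {
            startedChildren.add(child);   // stands in for child.start()
            return null;
        }
    }

    static void startChildren(List<String> children, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService startStopExecutor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> results = new ArrayList<>();
            for (String child : children) {
                results.add(startStopExecutor.submit(new StartChildTask(child)));
            }
            for (Future<Void> result : results) {
                result.get();   // wait for the child; a failure surfaces as ExecutionException
            }
        } finally {
            startStopExecutor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        startChildren(Arrays.asList("host-a", "host-b", "host-c"), 2);
        System.out.println(startedChildren.size());   // prints 3
    }
}
```

Submitting all tasks before waiting on any of them is what lets the children start in parallel; the parent only proceeds once every Future has completed.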

So what did Engine do by itself? We know that the most important function of the container component is to process requests, and the "processing" of the request by the Engine container is actually to forward the request to a certain Host sub-container for processing, which is implemented by Valve.

Each container component has a Pipeline, and the Pipeline has a basic valve (Basic Valve), and the basic valve of the Engine container is defined as follows:

final class StandardEngineValve extends ValveBase {

    public final void invoke(Request request, Response response)
        throws IOException, ServletException {

        //Get the Host container in the request
        Host host = request.getHost();
        if (host == null) {
            return;
        }

        //Call the first Valve in the Pipeline in the Host container
        host.getPipeline().getFirst().invoke(request, response);
    }

}

This basic valve implementation is very simple, just forwarding the request to the Host container.

You may be curious. As you can see from the code, the Host container object that processes the request is obtained from the request. How can there be a Host container in the request object?

This is because the Mapper component has already routed the request before the request reaches the Engine container. The Mapper component locates the corresponding container through the requested URL and saves the container object in the request object.
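
The two-step flow (the mapper stores the target host in the request, then the engine forwards to it) can be sketched like this. The classes ToyHost, ToyRequest and ToyMapper are hypothetical simplifications: the lookup here uses only a host name, whereas the real Mapper resolves more than that.

```java
import java.util.HashMap;
import java.util.Map;

final class HostRoutingSketch {
    static final class ToyHost {
        final String name;
        ToyHost(String name) { this.name = name; }
        String handle(ToyRequest request) { return name + " handled " + request.uri; }
    }

    static final class ToyRequest {
        final String hostName;
        final String uri;
        ToyHost host;   // filled in by the mapper before the engine sees the request
        ToyRequest(String hostName, String uri) {
            this.hostName = hostName;
            this.uri = uri;
        }
    }

    /** Step 1: locate the target container and store it in the request. */
    static final class ToyMapper {
        private final Map<String, ToyHost> hosts = new HashMap<>();
        void addHost(ToyHost host) { hosts.put(host.name, host); }
        void map(ToyRequest request) { request.host = hosts.get(request.hostName); }
    }

    /** Step 2: like the Engine's basic valve, forward to the host found in the request. */
    static String engineInvoke(ToyRequest request) {
        ToyHost host = request.host;
        if (host == null) {
            return null;   // no matching host, nothing to forward to
        }
        return host.handle(request);
    }

    public static void main(String[] args) {
        ToyMapper mapper = new ToyMapper();
        mapper.addHost(new ToyHost("example.com"));
        ToyRequest request = new ToyRequest("example.com", "/index.html");
        mapper.map(request);
        System.out.println(engineInvoke(request));   // prints example.com handled /index.html
    }
}
```

Because routing has already happened by the time the engine is invoked, the engine itself needs no lookup logic, which is why its basic valve can stay so short.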

5. Summary

Tomcat's startup process is carried out by the startup classes and the "high-level" components. They all play a "management" role: they create sub-components, assemble them, and hold the "power of life and death" over them.

So when we are designing such a component, we need to consider two aspects:

  • First, choose a suitable data structure to hold the sub-components. For example, Server stores its Service components in an array that grows on demand, because an array is simple and uses little memory;
  • As another example, ContainerBase stores its sub-containers in a HashMap. A Map uses somewhat more memory, but it lets you look up a sub-container quickly.

Therefore, in actual work, we also need to select the appropriate data structure according to the specific scenarios and requirements.

Second, decide the order in which sub-components start and stop according to their dependencies, and work out how to stop gracefully so that resources do not leak in abnormal situations. This is exactly what "managers" should think about.

Tags: jvm Tomcat Jetty

Posted by a94060 on Thu, 21 Jul 2022 02:25:30 +0930