Flask source code series: Flask routing rules and the request matching process (very detailed, easy to follow)

If you would rather skip the detailed analysis, you can jump straight to the summary in section 3; it is enough to understand the whole flow.

1 Routing-related operations at startup

The routing principle covers two things: how Flask builds its routing system, and how, when a request arrives, it uses that system to locate the right handler function and respond to the request.

This section uses the simplest Flask application example to explain the routing principle, as follows:

from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello_world():
    return 'Hello World!'


if __name__ == '__main__':
    app.run()

(1) Analyze app.route()

First, routes are registered through the route() decorator defined on the Scaffold class (Flask inherits from Scaffold). Its source code is as follows:

def route(self, rule: str, **options: t.Any) -> t.Callable:
    def decorator(f: t.Callable) -> t.Callable:
        # Get the endpoint
        endpoint = options.pop("endpoint", None)
        # Add the route; rule is the route string passed to app.route(), e.g. '/'
        self.add_url_rule(rule, endpoint, f, **options)
        return f

    return decorator

You can see that this decorator mainly does two things: 1. Get endpoint; 2. Add routing.

The most important of these is the function add_url_rule(), which is used to add routing mappings.

Notes:

  1. The endpoint is the key Flask later uses to map a route to its view function. If not specified, it defaults to the name of the decorated function. How it is used is analyzed below;
  2. Because app.route() is just a thin wrapper around add_url_rule(), we can also call that function directly. For usage, please refer to the article Flask-routing.
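To make the relationship between the decorator and add_url_rule() concrete, here is a minimal sketch. MiniApp and its rules list are hypothetical stand-ins that only record registrations; they are not Flask's implementation. Only the shape of route() follows the source shown above.

```python
class MiniApp:
    """Toy stand-in for Flask; it only records registrations."""

    def __init__(self):
        self.rules = []            # (rule, endpoint) pairs, in registration order
        self.view_functions = {}   # endpoint -> view function

    def add_url_rule(self, rule, endpoint=None, view_func=None, **options):
        if endpoint is None:
            endpoint = view_func.__name__  # default endpoint: the function name
        self.rules.append((rule, endpoint))
        self.view_functions[endpoint] = view_func

    def route(self, rule, **options):
        def decorator(f):
            endpoint = options.pop("endpoint", None)
            self.add_url_rule(rule, endpoint, f, **options)
            return f  # the function itself is returned unchanged
        return decorator


app = MiniApp()


@app.route("/")
def hello_world():
    return "Hello World!"


def about():
    return "About"


# Registering without the decorator has the same effect.
app.add_url_rule("/about", view_func=about)

print(app.rules)  # [('/', 'hello_world'), ('/about', 'about')]
```

Because the decorator returns f unchanged, hello_world can still be called directly, for example in a unit test.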

(2) Analyze add_url_rule()

Let's take a look at what add_url_rule does. Its core source code is as follows:

class Flask(Scaffold):
    # Only the main code to be discussed here, other codes are omitted
    
    url_rule_class = Rule
    url_map_class = Map
    
    def __init__(
        self,
        import_name: str,
        static_url_path: t.Optional[str] = None,
        static_folder: t.Optional[t.Union[str, os.PathLike]] = "static",
        static_host: t.Optional[str] = None,
        host_matching: bool = False,
        subdomain_matching: bool = False,
        template_folder: t.Optional[str] = "templates",
        instance_path: t.Optional[str] = None,
        instance_relative_config: bool = False,
        root_path: t.Optional[str] = None,
    ):
        super().__init__(
            import_name=import_name,
            static_folder=static_folder,
            static_url_path=static_url_path,
            template_folder=template_folder,
            root_path=root_path,
        )
        
        self.url_map = self.url_map_class()
        self.url_map.host_matching = host_matching
        self.subdomain_matching = subdomain_matching
  
    def add_url_rule(
        self,
        rule: str,
        endpoint: t.Optional[str] = None,
        view_func: t.Optional[t.Callable] = None,
        provide_automatic_options: t.Optional[bool] = None,
        **options: t.Any,
    ) -> None:
        # 1. If no endpoint is provided, derive the default endpoint from the view function
        if endpoint is None:
            endpoint = _endpoint_from_view_func(view_func)  # type: ignore
        options["endpoint"] = endpoint

        # 2. Get the request methods from the methods parameter of the decorator,
        #    e.g. @app.route('/index', methods=['POST', 'GET']).
        #    If not specified, the default tuple ("GET",) is used.
        #    The handling of provide_automatic_options is not covered here.
        methods = options.pop("methods", None)
        if methods is None:
            methods = getattr(view_func, "methods", None) or ("GET",)
        if isinstance(methods, str):
            raise TypeError(
                "Allowed methods must be a list of strings, for"
                ' example: @app.route(..., methods=["POST"])'
            )
        methods = {item.upper() for item in methods}

        # Methods that should always be added
        required_methods = set(getattr(view_func, "required_methods", ()))

        # starting with Flask 0.8 the view_func object can disable and
        # force-enable the automatic options handling.
        if provide_automatic_options is None:
            provide_automatic_options = getattr(
                view_func, "provide_automatic_options", None
            )

        if provide_automatic_options is None:
            if "OPTIONS" not in methods:
                provide_automatic_options = True
                required_methods.add("OPTIONS")
            else:
                provide_automatic_options = False

        # Add the required methods now.
        methods |= required_methods

        # 3. An important step: url_rule_class instantiates the Rule object
        rule = self.url_rule_class(rule, methods=methods, **options)
        rule.provide_automatic_options = provide_automatic_options  # type: ignore

        # 4. An important step: call the add method of url_map (a Map object)
        self.url_map.add(rule)

        # 5. Check whether a mapping from endpoint to view_func already exists.
        #    If a different view_func already uses this endpoint, raise an error;
        #    otherwise add the new mapping to self.view_functions.
        #    self.view_functions is inherited from Scaffold and is a dictionary.
        if view_func is not None:
            old_func = self.view_functions.get(endpoint)
            if old_func is not None and old_func != view_func:
                raise AssertionError(
                    "View function mapping is overwriting an existing"
                    f" endpoint function: {endpoint}"
                )
            self.view_functions[endpoint] = view_func

Analyzing the above source code, this method mainly does the following things:

  1. If no endpoint is provided, get the default endpoint;
  2. Get the request methods from the methods parameter of the decorator, e.g. @app.route('/index', methods=['POST','GET']); if not specified, the default tuple ("GET",) is used;
  3. self.url_rule_class(): Instantiate the Rule object, and the Rule class will be explained later;
  4. Call the self.url_map.add() method, where self.url_map is a Map object, which will be explained later;
  5. self.view_functions[endpoint] = view_func adds the mapping of endpoint and view_func.

Of these, instantiating the Rule object and calling self.url_map.add() are the core of Flask routing. The Rule class and Map class are analyzed below.
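Step 5 is easy to overlook, so here is a small sketch of the endpoint conflict check in isolation. The names register_view and views are hypothetical; the condition and the error message follow the add_url_rule() source shown above.

```python
def register_view(view_functions, endpoint, view_func):
    """Mimic step 5: refuse to bind one endpoint to two different functions."""
    old_func = view_functions.get(endpoint)
    if old_func is not None and old_func != view_func:
        raise AssertionError(
            "View function mapping is overwriting an existing"
            f" endpoint function: {endpoint}"
        )
    view_functions[endpoint] = view_func


def index():
    return "index"


def other():
    return "other"


views = {}
register_view(views, "index", index)  # first registration succeeds
register_view(views, "index", index)  # the same function again is allowed

try:
    register_view(views, "index", other)  # different function, same endpoint
    conflict = False
except AssertionError:
    conflict = True

print(conflict)  # True
```

This is why two view functions with the same name in one application fail at registration time unless one of them is given an explicit endpoint.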

(3) Analyze the Rule class

The Rule class lives in the werkzeug.routing module and its source is long. Only the parts relevant here are extracted, as follows:

class Rule(RuleFactory):
    def __init__(
        self,
        string: str,
        methods: t.Optional[t.Iterable[str]] = None,
        endpoint: t.Optional[str] = None,
        # Code for other parameters omitted here ...
    ) -> None:
        if not string.startswith("/"):
            raise ValueError("urls must start with a leading slash")
        self.rule = string
        self.is_leaf = not string.endswith("/")
        self.map: "Map" = None  # type: ignore
        self.methods = methods
        self.endpoint: str = endpoint
        # Other initialization code omitted
        
    def get_rules(self, map: "Map") -> t.Iterator["Rule"]:
        """Yield the rules for the given map; a Rule simply yields itself."""
        yield self

    def bind(self, map: "Map", rebind: bool = False) -> None:
        """Bind this Rule to a Map and compile a path regular expression from the
        rule and map settings. The regex is stored in self._regex and used later
        when routes are matched."""
        if self.map is not None and not rebind:
            raise RuntimeError(f"url rule {self!r} already bound to map {self.map!r}")
        # Bind the map object to the Rule object
        self.map = map
        if self.strict_slashes is None:
            self.strict_slashes = map.strict_slashes
        if self.merge_slashes is None:
            self.merge_slashes = map.merge_slashes
        if self.subdomain is None:
            self.subdomain = map.default_subdomain
        # Call the compile method to create a regular expression
        self.compile()
     
    def compile(self) -> None:
        """Build the regular expression and store it in self._regex."""
        # The regex construction code is omitted here
        regex = f"^{''.join(regex_parts)}{tail}$"
        self._regex = re.compile(regex)
        
    def match(
        self, path: str, method: t.Optional[str] = None
    ) -> t.Optional[t.MutableMapping[str, t.Any]]:
        """Check whether the given path matches this rule. Return the converted
        URL variables on success, or None if the path does not match."""
        # Part of the code is omitted; only the main logic is shown
        if not self.build_only:
            require_redirect = False
            # 1. Search the path with the regex compiled in bind() (self._regex)
            m = self._regex.search(path)
            if m is not None:
                groups = m.groupdict()

                # 2. Convert each matched value and collect it into the result dictionary
                result = {}
                for name, value in groups.items():
                    try:
                        value = self._converters[name].to_python(value)
                    except ValidationError:
                        return None
                    result[str(name)] = value
                if self.defaults:
                    result.update(self.defaults)
                return result
        return None

The Rule class inherits from the RuleFactory class. Its main parameters are:

  • string: the route string
  • methods: the HTTP methods the route accepts
  • endpoint: the endpoint name

A Rule instance represents one URL pattern. A WSGI application usually handles many URL patterns and therefore creates many Rule instances, all of which are passed to the Map class.
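Since the regex construction inside compile() is omitted above, the following sketch illustrates the general idea of turning a route string into a regular expression with named groups. It is a heavily simplified model, not Werkzeug's algorithm: CONVERTERS, compile_rule and match_rule are hypothetical names, and only the default and int converters are supported.

```python
import re

# Hypothetical, simplified converters: name -> (regex fragment, to_python)
CONVERTERS = {
    "default": (r"[^/]+", str),
    "int": (r"\d+", int),
}

# Matches placeholders such as <name> or <int:id>
_PLACEHOLDER = re.compile(r"<(?:(?P<conv>\w+):)?(?P<name>\w+)>")


def compile_rule(rule):
    """Turn '/blog/<int:id>' into a compiled regex plus per-variable converters."""
    parts, converters, pos = [], {}, 0
    for m in _PLACEHOLDER.finditer(rule):
        parts.append(re.escape(rule[pos:m.start()]))  # literal text before the placeholder
        pattern, to_python = CONVERTERS[m.group("conv") or "default"]
        parts.append(f"(?P<{m.group('name')}>{pattern})")
        converters[m.group("name")] = to_python
        pos = m.end()
    parts.append(re.escape(rule[pos:]))  # trailing literal text
    return re.compile(f"^{''.join(parts)}$"), converters


def match_rule(rule, path):
    """Return the converted URL variables, or None if the path does not match."""
    regex, converters = compile_rule(rule)
    m = regex.search(path)
    if m is None:
        return None
    return {name: converters[name](value) for name, value in m.groupdict().items()}


print(match_rule("/blog/<int:id>", "/blog/42"))   # {'id': 42}
print(match_rule("/blog/<int:id>", "/blog/abc"))  # None
print(match_rule("/user/<name>", "/user/alice"))  # {'name': 'alice'}
```

The real Rule compiles its regex once, in bind(), and reuses it for every request; the sketch recompiles on each call only to stay short.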

(4) Analyze the Map class

The Map class also lives in the werkzeug.routing module and its source is long. Only the parts relevant here are extracted, as follows:

class Map:
    def __init__(
        self,
        rules: t.Optional[t.Iterable[RuleFactory]] = None
        # Other parameters are omitted here
    ) -> None:
        # A private list, self._rules, holds the Rule objects built from the rules parameter
        self._rules: t.List[Rule] = []
        # Mapping of endpoint to its rules
        self._rules_by_endpoint: t.Dict[str, t.List[Rule]] = {}
        # Other initialization operations are omitted here

    def add(self, rulefactory: RuleFactory) -> None:
        """
        Add a Rule object or a RuleFactory object to the map and bind it to the map.
        The rule must not already be bound.
        """
        for rule in rulefactory.get_rules(self):
            # Call the bind method of the rule object
            rule.bind(self)
            # Add the rule object to the self._rules list
            self._rules.append(rule)
            # Add the mapping of endpoint and rule to the attribute self._rules_by_endpoint
            self._rules_by_endpoint.setdefault(rule.endpoint, []).append(rule)
        self._remap = True
        
    def bind(
        self,
        server_name: str,
        script_name: t.Optional[str] = None,
        subdomain: t.Optional[str] = None,
        url_scheme: str = "http",
        default_method: str = "GET",
        path_info: t.Optional[str] = None,
        query_args: t.Optional[t.Union[t.Mapping[str, t.Any], str]] = None,
    ) -> "MapAdapter":
        """
        Return a new MapAdapter bound to this map.
        """
        server_name = server_name.lower()
        if self.host_matching:
            if subdomain is not None:
                raise RuntimeError("host matching enabled and a subdomain was provided")
        elif subdomain is None:
            subdomain = self.default_subdomain
        if script_name is None:
            script_name = "/"
        if path_info is None:
            path_info = "/"

        try:
            server_name = _encode_idna(server_name)  # type: ignore
        except UnicodeError as e:
            raise BadHost() from e

        return MapAdapter(
            self,
            server_name,
            script_name,
            subdomain,
            url_scheme,
            path_info,
            default_method,
            query_args,
        )

The Map class has two very important attributes:

  1. self._rules: a list that stores the Rule objects;
  2. self._rules_by_endpoint: a dictionary that maps each endpoint to the list of Rule objects registered for it.

Its core method is add(), which is the method called in step 4 of app.add_url_rule() analyzed above. It is explained in detail later.

(5) Analyze the MapAdapter class

The Map class relies on the MapAdapter class. Let's get to know it:

class MapAdapter:

    """Returned by `Map.bind` or `Map.bind_to_environ`.
    Mainly used for matching.
    """

    def match(
        self,
        path_info: t.Optional[str] = None,
        method: t.Optional[str] = None,
        return_rule: bool = False,
        query_args: t.Optional[t.Union[t.Mapping[str, t.Any], str]] = None,
        websocket: t.Optional[bool] = None,
    ) -> t.Tuple[t.Union[str, Rule], t.Mapping[str, t.Any]]:
        """Match the request path against the rules and return the rule (or endpoint) with its arguments"""
        # Only the main code is extracted; a lot of code is omitted...

        # Main step: traverse the map's rule list and try each rule against the path in turn
        for rule in self.map._rules:
            try:
                # Call the match method of the rule object to get the matching result
                rv = rule.match(path, method)
            except RequestPath as e:
                ...  # Exception handling omitted
            if rv is None:
                # This rule did not match; try the next one
                continue
            # A lot of code omitted here...

            # Return the rule object (or its endpoint) and the matched arguments
            if return_rule:
                return rule, rv
            else:
                return rule.endpoint, rv

The Map.bind and Map.bind_to_environ methods both return a MapAdapter object.

The core method of MapAdapter is match. It traverses the map's rule list and tries each rule against the path in turn, calling the rule's own match method to get the result.
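The traversal just described can be sketched with the standard library alone. ToyRule and the module-level match() function below are hypothetical simplifications: the patterns are hand-written regexes, values are not converted, and LookupError stands in for the HTTP exceptions Werkzeug raises when nothing matches.

```python
import re


class ToyRule:
    def __init__(self, pattern, endpoint, methods=("GET",)):
        self.regex = re.compile(pattern)
        self.endpoint = endpoint
        self.methods = set(methods)

    def match(self, path):
        m = self.regex.search(path)
        return None if m is None else m.groupdict()


def match(rules, path, method="GET"):
    """Walk the rules in order and return (endpoint, args) for the first hit."""
    for rule in rules:
        rv = rule.match(path)
        if rv is None:
            continue  # pattern did not match, try the next rule
        if method not in rule.methods:
            continue  # path matched but the method is not allowed
        return rule.endpoint, rv
    raise LookupError(f"no rule matches {method} {path}")


rules = [
    ToyRule(r"^/$", "index"),
    ToyRule(r"^/blog/(?P<id>\d+)$", "blog/detail"),
]

print(match(rules, "/"))         # ('index', {})
print(match(rules, "/blog/42"))  # ('blog/detail', {'id': '42'})
```

Note that the id stays a string here because the sketch skips the converter step that Rule.match() performs.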

Next, let's see how the Map class and Rule objects work together on their own. See the following example:

from werkzeug.routing import Map, Rule

m = Map([
    Rule('/', endpoint='index'),
    Rule('/blog', endpoint='blog/index'),
    Rule('/blog/<int:id>', endpoint='blog/detail')
])

# Returns a MapAdapter object
map_adapter = m.bind("example.com", "/")

# The match method of the MapAdapter object will return the matching result
print(map_adapter.match("/", "GET"))
# ('index', {})

print(map_adapter.match("/blog/42"))
# ('blog/detail', {'id': 42})

print(map_adapter.match("/blog"))
# ('blog/index', {})

It can be seen that the Map object returns a MapAdapter object through bind, and the match method of the MapAdapter object can find the route matching result.

(6) Analyze url_rule_class()

The first major step of add_url_rule is rule = self.url_rule_class(rule, methods=methods, **options), which creates a Rule object.

When analyzing the Rule class, we saw that a Rule object mainly has three attributes: string (the route string), methods, and endpoint. Let's take a concrete example to see what an instantiated Rule object looks like.

Going back to the example at the top, here are the attributes of the Rule object created by the @app.route('/') line, as seen in the debugger:

It can be seen that the rule attribute of the Rule object is the route that was passed in, the endpoint attribute is derived from the function name, and the methods attribute holds the supported request methods, with 'HEAD' and 'OPTIONS' added by default.
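Where 'HEAD' and 'OPTIONS' come from can be modelled in a few lines. This is a sketch under assumptions: flask_side and rule_side are hypothetical names, the OPTIONS step simplifies the provide_automatic_options logic in add_url_rule() shown earlier, and the HEAD step is not part of the Rule excerpt above but follows Werkzeug's documented behaviour of adding HEAD whenever GET is allowed.

```python
def flask_side(methods=None):
    """Simplified add_url_rule(): default to GET, upper-case, add OPTIONS if missing."""
    if methods is None:
        methods = ("GET",)
    if isinstance(methods, str):
        raise TypeError("Allowed methods must be a list of strings")
    methods = {item.upper() for item in methods}
    if "OPTIONS" not in methods:
        methods.add("OPTIONS")
    return methods


def rule_side(methods):
    """Simplified Rule.__init__(): a rule that accepts GET also accepts HEAD."""
    methods = set(methods)
    if "GET" in methods and "HEAD" not in methods:
        methods.add("HEAD")
    return methods


print(sorted(rule_side(flask_side())))          # ['GET', 'HEAD', 'OPTIONS']
print(sorted(rule_side(flask_side(["post"]))))  # ['OPTIONS', 'POST']
```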

(7) Analyze map.add(rule)

The second main step of add_url_rule is self.url_map.add(rule), which calls the add method of the Map object.

This method was mentioned when analyzing the Map class in (4). Now let's go back and take a closer look at what it does:

def add(self, rulefactory: RuleFactory) -> None:
    """
    Add a Rule object or a RuleFactory object to the map and bind it to the map.
    The rule must not already be bound.
    """
    for rule in rulefactory.get_rules(self):
        # Call the bind method of the rule object
        rule.bind(self)
        # Add the rule object to the self._rules list
        self._rules.append(rule)
        # Add the mapping of endpoint to rule to the attribute self._rules_by_endpoint
        self._rules_by_endpoint.setdefault(rule.endpoint, []).append(rule)
    self._remap = True

In essence, it adds the Rule object (or RuleFactory object) instantiated in the previous step to the Map object and binds it to the map.

It does three things:

  1. Calls the rule's bind method, rule.bind(self): this method was covered when analyzing the Rule class. It binds the map object to the Rule object and compiles a regular expression from the rule and map information (stored in the Rule object's self._regex attribute);
  2. Adds the rule object to the self._rules list of the Map object;
  3. Adds the mapping of endpoint to rule to the self._rules_by_endpoint attribute (a dictionary) of the Map object.
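The three steps can be reproduced with two toy classes. ToyRule and ToyMap are hypothetical stand-ins that keep only the bookkeeping from the add() source above; the regex compilation that the real bind() performs is left out.

```python
class ToyRule:
    def __init__(self, rule, endpoint):
        self.rule = rule
        self.endpoint = endpoint
        self.map = None

    def get_rules(self, map):
        yield self  # a single rule acts as its own one-element factory

    def bind(self, map):
        if self.map is not None:
            raise RuntimeError(f"rule {self.rule!r} already bound")
        self.map = map  # the real bind() also compiles the regex here


class ToyMap:
    def __init__(self):
        self._rules = []
        self._rules_by_endpoint = {}

    def add(self, rulefactory):
        for rule in rulefactory.get_rules(self):
            rule.bind(self)                                                     # step 1
            self._rules.append(rule)                                            # step 2
            self._rules_by_endpoint.setdefault(rule.endpoint, []).append(rule)  # step 3


url_map = ToyMap()
url_map.add(ToyRule("/", "index"))
url_map.add(ToyRule("/home", "index"))  # two URLs may share one endpoint
url_map.add(ToyRule("/blog", "blog"))

print([r.rule for r in url_map._rules])                       # ['/', '/home', '/blog']
print([r.rule for r in url_map._rules_by_endpoint["index"]])  # ['/', '/home']
```

The list preserves registration order for matching, while the dictionary groups rules by endpoint, which is why its values are lists rather than single rules.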

We can use an example to see what the Map object looks like after the add. Through debugging, the results are as follows:

It can be seen that both the self._rules and self._rules_by_endpoint attributes of Map contain the data corresponding to the newly added '/' route (where /static/<path:filename> is the route for static files, which is added by default).

This concludes the analysis of how a route mapping is added.

2 The route matching process when the request comes in

(1) Analyze wsgi_app

Let's analyze how route matching is performed against the Map and Rule objects above when a request comes in.

In the previous chapter, we saw that when a request comes in, the WSGI server processes it first and then calls the __call__ method of the Flask app. The code is as follows:

def __call__(self, environ: dict, start_response: t.Callable) -> t.Any:
    """The WSGI server calls the Flask application object as the
    WSGI application. This calls :meth:`wsgi_app`, which can be
    wrapped to apply middleware.
    """
    return self.wsgi_app(environ, start_response)

It can be seen that the wsgi_app() method of the app is actually called. Its code is as follows:

def wsgi_app(self, environ: dict, start_response: t.Callable) -> t.Any:
    # 1. Get the request context
    ctx = self.request_context(environ)
    error: t.Optional[BaseException] = None
    try:
        try:
            # 2. Call the push method of the request context
            ctx.push()
            # 3. Call full_dispatch_request() to distribute the request and get the response result
            response = self.full_dispatch_request()
        except Exception as e:
            error = e
            response = self.handle_exception(e)
        except: 
            error = sys.exc_info()[1]
            raise
        return response(environ, start_response)
    finally:
        if self.should_ignore_error(error):
            error = None
        ctx.auto_pop(error)

There are 3 main steps in it:

  1. Get the request context: ctx = self.request_context(environ)
  2. Call the push method of the request context: ctx.push()
  3. Call full_dispatch_request() to distribute the request and get the response result: response = self.full_dispatch_request()
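The same three-step shape can be tried out with a tiny WSGI application built on the standard library. TinyApp is a hypothetical sketch, not Flask code: its context is a plain dictionary and its dispatch step simply echoes the path, but __call__ delegates to wsgi_app the way the source above does.

```python
from wsgiref.util import setup_testing_defaults


class TinyApp:
    """Hypothetical minimal WSGI app that mirrors the three steps above."""

    def __call__(self, environ, start_response):
        # The WSGI server calls the app object; delegate like Flask does.
        return self.wsgi_app(environ, start_response)

    def wsgi_app(self, environ, start_response):
        ctx = {"environ": environ}          # 1. build a per-request context
        ctx["path"] = environ["PATH_INFO"]  # 2. "push": resolve request data
        try:
            body = self.dispatch(ctx)       # 3. dispatch to get a response
            status = "200 OK"
        except Exception:
            body, status = "Internal Server Error", "500 INTERNAL SERVER ERROR"
        start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
        return [body.encode("utf-8")]

    def dispatch(self, ctx):
        return f"you requested {ctx['path']}"


# Drive the app by hand, the way a WSGI server would.
environ = {"PATH_INFO": "/hello"}
setup_testing_defaults(environ)  # fills in REQUEST_METHOD, SERVER_NAME, ...
captured = {}


def start_response(status, headers):
    captured["status"] = status


result = TinyApp()(environ, start_response)
print(captured["status"], result)  # 200 OK [b'you requested /hello']
```

Keeping the logic in wsgi_app rather than __call__ is what lets middleware wrap app.wsgi_app without replacing the app object itself.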

Let's analyze the function of each step one by one.

(2) Analyze request_context

This is the first major step of the wsgi_app method.

Its main job is to obtain a request context object.

The WSGI environ (environment variables and other request data) must be passed in. Context is an important concept in Flask, and the next chapter analyzes it in depth. This chapter covers only what we need.

def request_context(self, environ: dict) -> RequestContext:
    return RequestContext(self, environ)

This method is to create a RequestContext object. Part of the source code of the RequestContext class is as follows:

class RequestContext:
    def __init__(
        self,
        app: "Flask",
        environ: dict,
        request: t.Optional["Request"] = None,
        session: t.Optional["SessionMixin"] = None,
    ) -> None:
        self.app = app
        if request is None:
            request = app.request_class(environ)
        self.request = request
        self.url_adapter = None
        try:
            # Key point: call the Flask object's create_url_adapter method to get the MapAdapter object
            self.url_adapter = app.create_url_adapter(self.request)
        except HTTPException as e:
            self.request.routing_exception = e
        self.flashes = None
        self.session = session
        # Other code omitted...
    # The source code of other methods is omitted...

A very important step in initializing the RequestContext object is obtaining the MapAdapter object.

We analyzed its function in (5) above; it is mainly used to match routes.

Let's look at the source code of create_url_adapter:

def create_url_adapter(
    self, request: t.Optional[Request]
) -> t.Optional[MapAdapter]:
    if request is not None:
        if not self.subdomain_matching:
            subdomain = self.url_map.default_subdomain or None
        else:
            subdomain = None
        # Key point: call the bind_to_environ method of the Map object
        return self.url_map.bind_to_environ(
            request.environ,
            server_name=self.config["SERVER_NAME"],
            subdomain=subdomain,
        )
    if self.config["SERVER_NAME"] is not None:
        # Key point: call the bind method of the Map object
        return self.url_map.bind(
            self.config["SERVER_NAME"],
            script_name=self.config["APPLICATION_ROOT"],
            url_scheme=self.config["PREFERRED_URL_SCHEME"],
        )

    return None

It can be seen that this method mainly calls the bind_to_environ or bind method of the Map object. As noted when analyzing the Map class, both methods return a MapAdapter object.
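The difference between the two methods is where the request details come from: bind() takes them as explicit arguments, while bind_to_environ() reads them from the WSGI environ. The sketch below shows the kind of information involved. describe_request is a hypothetical helper, and the exact set of keys Werkzeug reads is an assumption based on the standard WSGI environ keys.

```python
from wsgiref.util import setup_testing_defaults


def describe_request(environ):
    """Collect the WSGI environ values that route matching depends on."""
    return {
        "server_name": environ.get("HTTP_HOST") or environ["SERVER_NAME"],
        "script_name": environ.get("SCRIPT_NAME", ""),
        "path_info": environ.get("PATH_INFO", "/"),
        "method": environ["REQUEST_METHOD"],
        "url_scheme": environ["wsgi.url_scheme"],
    }


# Build a fake environ for POST /blog/42; the remaining keys get test defaults.
environ = {"PATH_INFO": "/blog/42", "REQUEST_METHOD": "POST"}
setup_testing_defaults(environ)

info = describe_request(environ)
print(info["method"], info["path_info"], info["url_scheme"])  # POST /blog/42 http
```

Because the adapter already carries the path and method of the current request, MapAdapter.match() can later be called without passing them again.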

(3) Analyze ctx.push

The second major step of the wsgi_app method.

After the context object is obtained in the wsgi_app method, its push method is called. The code is as follows (only the core code is kept):

class RequestContext:
    def __init__(
        self,
        app: "Flask",
        environ: dict,
        request: t.Optional["Request"] = None,
        session: t.Optional["SessionMixin"] = None,
    ) -> None:
        # Code omitted
        pass

    def match_request(self) -> None:
        try:
            # 1. Call the match method of the MapAdapter object, which returns the rule object and the view arguments
            result = self.url_adapter.match(return_rule=True)
            # 2. Put the rule object and parameter object into the request context
            self.request.url_rule, self.request.view_args = result
        except HTTPException as e:
            self.request.routing_exception = e

    def push(self) -> None:
        """Binds the request context to the current context."""
        # The pre-verification processing code (context, session, etc. processing) is omitted here
        if self.url_adapter is not None:
            # Call the match_request method
            self.match_request()

It can be seen that the push method mainly calls match_request, which does the following two things:

  1. Calls the match method of the MapAdapter object, which matches the current request against the routing information stored in the Map object and returns the rule object and the view arguments.
  2. Stores the rule object and the view arguments on the request object in the request context. If matching fails, the exception is stored in request.routing_exception instead.
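One detail of match_request is worth seeing in isolation: a failed match does not raise during push(); the error is stored on the request and handled later. The sketch below models that behaviour. All the Toy* classes are hypothetical stand-ins, with ToyNotFound playing the role of Werkzeug's HTTPException.

```python
class ToyNotFound(Exception):
    """Stand-in for werkzeug's HTTPException subclasses."""


class ToyRequest:
    url_rule = None
    view_args = None
    routing_exception = None


class ToyAdapter:
    def __init__(self, routes):
        self.routes = routes  # path -> endpoint

    def match(self, path):
        if path not in self.routes:
            raise ToyNotFound(path)
        return self.routes[path], {}


def match_request(request, adapter, path):
    """Store the match on the request, or remember the error for later."""
    try:
        request.url_rule, request.view_args = adapter.match(path)
    except ToyNotFound as e:
        request.routing_exception = e


adapter = ToyAdapter({"/": "index"})

ok = ToyRequest()
match_request(ok, adapter, "/")
print(ok.url_rule, ok.view_args, ok.routing_exception)  # index {} None

missing = ToyRequest()
match_request(missing, adapter, "/nope")
print(missing.url_rule, repr(missing.routing_exception))  # None ToyNotFound('/nope')
```

Deferring the error means the request context is always pushed successfully, so the failure can be reported later through the normal request handling path.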

(4) Analyze full_dispatch_request

The third major step of the wsgi_app method.

The source code of the full_dispatch_request method is as follows:

def full_dispatch_request(self) -> Response:
    self.try_trigger_before_first_request_functions()
    try:
        request_started.send(self)
        rv = self.preprocess_request()
        if rv is None:
            # Main step: dispatch the request
            rv = self.dispatch_request()
    except Exception as e:
        rv = self.handle_user_exception(e)
    return self.finalize_request(rv)

Among them, dispatch_request() is the core method.

The source code of the dispatch_request() method is as follows:

def dispatch_request(self) -> ResponseReturnValue:
    # 1. Get the request object from the request context
    req = _request_ctx_stack.top.request
    if req.routing_exception is not None:
        self.raise_routing_exception(req)
    # 2. Get the Rule object that exists in the context from the context
    rule = req.url_rule
    # if we provide automatic options for this URL and the
    # request came with the OPTIONS method, reply automatically
    if (
        getattr(rule, "provide_automatic_options", False)
        and req.method == "OPTIONS"
    ):
        return self.make_default_options_response()
    # Key point: look up the view function in self.view_functions using the rule's endpoint attribute,
    # then call it with the view arguments stored on the request and return its result
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)

The main steps of this method are as follows:

  1. req = _request_ctx_stack.top.request: get the request object from the request context
  2. rule = req.url_rule: get the Rule object stored on the request, which was put there by the ctx.push() method
  3. Look up the view function in self.view_functions using the rule's endpoint attribute, call it with the view arguments stored on the request, and return its result: return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
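The lookup-and-call step can be sketched without Flask. Here the request is modelled with SimpleNamespace objects and dispatch_request is a free function; both are simplifications for illustration, and the automatic OPTIONS handling and ensure_sync are left out.

```python
from types import SimpleNamespace


def dispatch_request(request, view_functions):
    """Look up the view by endpoint and call it with the matched URL variables."""
    if request.routing_exception is not None:
        raise request.routing_exception  # routing error deferred from match_request
    view = view_functions[request.url_rule.endpoint]
    return view(**request.view_args)


def blog_detail(id):
    return f"post {id}"


view_functions = {"blog_detail": blog_detail}

request = SimpleNamespace(
    routing_exception=None,
    url_rule=SimpleNamespace(endpoint="blog_detail"),
    view_args={"id": 42},
)

print(dispatch_request(request, view_functions))  # post 42
```

The endpoint is the only link between the matched Rule and the view function, which is why the URL variable names in the route must match the view function's parameter names.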

At this point, a complete request is processed.

3 Summary

According to the above analysis, the routing rules and request matching are summarized as follows:

When the app starts:

  1. app.route() calls the add_url_rule method;
  2. In the add_url_rule method: call self.url_rule_class() to instantiate the Rule object;
  3. In the add_url_rule method: call self.url_map.add() to store the Rule object, together with the mapping of endpoint to Rule, in the Map object;
  4. In the add_url_rule method: self.view_functions[endpoint] = view_func adds the mapping of endpoint to view function.

Request matching process:

  1. When a request comes in, the WSGI server processes it and calls the __call__ method of the Flask app, which in turn calls the wsgi_app method;
  2. The wsgi_app method creates a context object: ctx = self.request_context(environ). During its initialization, a MapAdapter object is created and stored as an attribute of the context object;
  3. The wsgi_app method calls the push method of the context object: ctx.push(). This method mainly uses the match method of the MapAdapter object;
  4. The match method of the MapAdapter object calls the match method of each rule object. It matches the current request against the routing information stored in the Map object, and the resulting rule object and view arguments are stored on the request in the context object;
  5. The wsgi_app method calls full_dispatch_request, which in turn calls dispatch_request();
  6. In the dispatch_request() method: get the request object from the context, get the Rule object stored on it, look up the view function in self.view_functions using the rule's endpoint attribute, then call the view function with the view arguments and return its result.

The whole flow chart is as follows:

The first half shows how Rule and Map objects are used to establish the routing rules when the project starts; the second half shows how those rules are used to match requests as they come in.

Tags: Back-end Python Flask

Posted by gfadmin on Tue, 07 Mar 2023 02:52:40 +1030