Evaluating custom expressions with parse trees

Not long ago, we encountered such a demand: the project party needs to monitor each business system. If the score of the business system is lower than a predetermined score, the monitoring system will automatically send alarm information to the relevant person in charge.


It doesn't seem difficult. We divide the state of resources from high to low into three levels: fatal, serious and warning. The state of the whole business system is affected by the most serious nodes. For example, if the state of a resource in the business system is fatal, the whole business system is fatal.

However, many projects on the demand side adopt load balancing or distributed deployment. The downtime of a node does not affect the continuous operation of the whole system. This simple single rule can not effectively judge the overall operation state of the system.


In the figure above, although the available area A is hung up, the whole system is still running, and the whole system cannot be simply sentenced to death. It seems that the concept of "score" is necessary. It is necessary to score each node or cluster in the system, finally calculate the score of the whole system, and then judge the final state of the system according to the preset threshold.


In view of this situation, we design three expressions: MIN (take the minimum value of the cluster), MAX (take the maximum value of the cluster) and AVG (take the weighted average of the cluster). Specifically, each expression calculates the score of a cluster. Of course, a cluster can contain only one node or other clusters. In this way, the above figure can be calculated with the following expression:

AVG (zone A * 1, zone B * 1)

             = AVG(MAX(ECSA1, ECSA2) * 1, MAX(ECSB1, ECSB2) * 1) 

Where "* 1" means the weight is 1. For the above formula, the weights of available area A and available area B are the same. The above formula is equal to:

AVG(MAX(20, 20) * 1, MAX(100, 100) * 1) = (20 + 100) / (1 + 1) = 60

The health of the service is 60 points, and a corresponding level of alarm will be generated.

According to this rule, any nesting can be carried out:

MAX(AVG(60*3, 70*2, MAX(40, 70) * 4), MAX(30, 40, AVG(30*3, 90*7)), 50)

The next question is how to calculate the result according to the given expression on the premise of strong readability of the code?


Before releasing the task, I gave several test cases:

AVG(AVG(10*2, 2*2) * 1, 20 * 10) = 19

MAX(MIN(10, 5),10,MAX(20,21) = 21

MAX(AVG(60*3, MAX(40, 70) * 4)) = 66

MAX(AVG(60*3, 70*2, MAX(40, 70) * 4), MAX(30, 40, AVG(30*3, 90*7))) = 72

AVG(30*3, 90*7, AVG(20*4, 30*4) * 1) = 68

Ten minutes later, a remote working partner picked up the task and quickly submitted the results.


His idea is to convert the expression into a parse tree:

In this way, the results can be calculated step by step from bottom to top.

Maybe we can continue to analyze downward and decompose 60 * 3 into:


For this project, only AVG involves weighting operators, so this decomposition is not necessary. In addition, the weighting of clusters, such as AVG(MAX(40,10) * 2), will be decomposed into:


For the child nodes of AVG, it is either a complete expression, such as a*b; Or split into two expressions, such as a and * b. During calculation, all a*b can be changed into a and * b through standardized processing, and finally ensure that the leaf nodes of AVG must be even, and each two can form a pair.

An expression is a string composed of three elements: Operator (AVG, MAX, MIN, multiplier), separator (left and right parentheses, comma) and number. Therefore, the following static variables are defined:


 1 /*
 2   * Symbols in custom rules, where Max, min and AVG are operators and WEIGHT is a weighted symbol
 3 */
 4 final static public String MAX = "MAX";    //Seek maximum
 5 final static public String MIN = "MIN";    //Minimum
 6 final static public String AVG = "AVG";    //weighted mean 
 7 final static public String WEIGHT = "*";   //Weighted multiplication
 8 final static public String SEPERATOR = ",";
 9 final static public String BRACKETS_L = "(";
10 final static public String BRACKETS_R = ")";


Then identify the node by traversing the string and insert the node into the parse tree. The main code is as follows:


 1  static OPTree build(String expression) {
 2         // Expression syntax tree
 3         OPTree tree = null;
 4         // Current node of syntax tree
 5         OPTree.Node currentNode = null;
 6         String sign = ""; // Operator tag
 7         String num = "";  // Digital mark
 8         // Example expression: MAX(10,-1,30,MIN(40,50),AVG(60*3, 70*2))
 9         // take "MAX(10,2)*4" Convert to  "MAX(10,2),*4", So that MAX(10,2)and"*4"Level, simplified processing
10         char[] chars = expression.replaceAll(" ", "").replaceAll("\\)\\*", "),\\*").toCharArray();
11         for(char c : chars) {
12             if(('0' <= c && c <= '9') || c == WEIGHT.charAt(0)) {
13                 num += c;
14                 continue;
15             }
17             sign += c;
18             // 1. encounter MAX, MIN, AVG,Insert it as the root node into the syntax tree and currentNode Point to it;
19             switch (sign) {
20                 case MAX:
21                 case MIN:
22                 case AVG:
23                     if (tree == null) {
24                         tree = new OPTree(sign);
25                         currentNode = tree.getRoot();
26                     } else {
27                         currentNode = tree.insert(sign, currentNode);
28                     }
29                     sign = "";
30                     break;
31                 // 2. Encounter“(", Clear the operator mark and number mark directly
32                 case BRACKETS_L:
33                     sign = "";
34                     num = "";
35                     break;
36                 // 3. Encounter“,": 
37                 //  3.1 If num Non empty, Directly num As currentNode Leaf node of
38                 //  3.2 If num Empty, direct currentNode Point to the upper node
39                 case SEPERATOR:
40                     assert currentNode != null;
41                     if (!num.isEmpty()) {
42                         tree.insertChild(valueTranse(num), currentNode);
43                     } else {
44                         currentNode = currentNode.getParent();
45                     }
46                     num = "";
47                     sign = "";
48                     break;
49                 // 4. Encounter“)",
50                 //  4.1 If num Not empty, first num As currentNode Leaf node of
51                 //  4.2 If num Empty, direct currentNode Point to the upper node
52                 case BRACKETS_R:
53                     assert tree != null;
54                     if (!num.isEmpty()) {
55                         tree.insertChild(valueTranse(num), currentNode);
56                     }
57                     else {
58                         currentNode = currentNode.getParent();
59                     }
60                     num = "";
61                     sign = "";
62                     break;
63             }
64         }
66         return tree;
67     }


If you remove comments and some preset static variables, the core code is about 50 lines, which is much more beautiful than repeatedly using regular expression matching.


The algorithm pre defines the currentnode as the current node of the parse tree, records the symbols and numbers of the expression with sign and num, and traverses each character in the expression from left to right:

  0. When a number or "-" is encountered, splice it to the end of num;

  1. When Max, min and AVG are encountered, insert them into the syntax tree as the root node, point the currentNode to it, and then clear the sign;

  2. When "(", clear sign and num directly;

  3. In case of ",", if num is not empty, directly use num as the leaf node of currentNode; otherwise, directly point currentNode to the node of the previous layer; Then clear sign and num;

  4. In case of ")", if num is not empty, first take num as the leaf node of currentNode, otherwise directly point currentNode to the node of the upper layer; Then clear sign and num;

After traversing the last node, the parse tree construction is completed.

Take MAX(10,-1,30,MIN(40,50),AVG(60*3, 70*2)) as an example to verify the whole process:


Finally, the currentnode will point to the root node of the parse tree.

With the parse tree, the score of the whole business system can be calculated from bottom to top through post order traversal.


Source: WeChat official account "I am the 8."

This article focuses on learning, research and sharing. If you need to reprint, please contact me, indicating the author and source, non-commercial use!  

Scan the two-dimensional code and pay attention to the author's official account. "I am the 8."


Tags: data structure

Posted by jigaruu on Mon, 18 Apr 2022 10:02:15 +0930