ES6 string extensions, template strings, template compilation, tagged templates

ES6 (III): string extensions, template strings, template compilation and tagged templates

1. Unicode representation of characters

ES6 strengthens support for Unicode. A character can be written in the form \uxxxx, where xxxx is the Unicode code point of the character.

"\u0061"
// "a"

However, this representation is limited to characters with code points between \u0000 and \uFFFF. Characters outside this range must be written as two double-byte units (a surrogate pair).

"\uD842\uDFB7"
// "𠮷"

"\u20BB7"
// "₻7"

The above code shows that if \u is directly followed by a value exceeding 0xFFFF (such as \u20BB7), JavaScript reads it as \u20BB followed by 7. Since many fonts have no glyph for \u20BB (₻), it often shows up as a blank followed by a 7.

ES6 improves on this. As long as the code point is placed in braces, the character is interpreted correctly.

"\u{20BB7}"
// "𠮷"

"\u{41}\u{42}\u{43}"
// "ABC"

let hello = 123;
hell\u{6F} // 123

'\u{1F680}' === '\uD83D\uDE80'
// true

In the above code, the last example shows that the brace notation is equivalent to the four-byte UTF-16 encoding (a surrogate pair).
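
The equivalence can be checked by computing the surrogate pair by hand. The following is a minimal sketch: toSurrogatePair is a hypothetical helper name (not a built-in), and it assumes its input is a code point above 0xFFFF.

```javascript
// Hypothetical helper: split a code point above 0xFFFF into its
// UTF-16 high and low surrogates.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1F680);
const fromPair = String.fromCharCode(high, low);

console.log(high.toString(16), low.toString(16)); // d83d de80
console.log(fromPair === '\u{1F680}');            // true
```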

With this representation, JavaScript has six ways to represent a character: the literal character itself and the five escape forms below.

'\z' === 'z'  // true
'\172' === 'z' // true
'\x7A' === 'z' // true
'\u007A' === 'z' // true
'\u{7A}' === 'z' // true

2. The string iterator interface

ES6 adds an Iterator interface to strings (see the chapter "Iterator"), so that strings can be traversed with a for...of loop.

for (let codePoint of 'foo') {
  console.log(codePoint);
}
// "f"
// "o"
// "o"

Besides traversing the string, the greatest advantage of this iterator is that it recognizes code points greater than 0xFFFF, which the traditional for loop cannot.

let text = String.fromCodePoint(0x20BB7);

for (let i = 0; i < text.length; i++) {
  console.log(text[i]);
}
// " "
// " "

for (let i of text) {
  console.log(i);
}
// "𠮷"

In the above code, the string text contains only one character, but the for loop treats it as two characters (neither of which is printable), whereas the for...of loop correctly recognizes it as one.
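
The same difference shows up when counting characters. A minimal sketch, in which countCodePoints is a hypothetical helper name introduced only for illustration:

```javascript
const text = String.fromCodePoint(0x20BB7);

// Hypothetical helper: count characters by code point, not by code unit.
function countCodePoints(str) {
  let count = 0;
  for (const _ of str) {
    count++;
  }
  return count;
}

console.log(text.length);           // 2 (UTF-16 code units)
console.log(countCodePoints(text)); // 1 (code points)
console.log([...text].length);      // 1 (spread also uses the iterator)
```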

3. Entering U+2028 and U+2029 directly

JavaScript strings allow characters to be entered directly or in their escaped form. For example, the Unicode code point of "中" is U+4e2d. You can enter this Chinese character directly in a string or use its escape form \u4e2d. The two are equivalent.

'中' === '\u4e2d' // true

However, JavaScript specifies five characters that cannot be used directly in a string; only their escaped forms are allowed.

  • U+005C: reverse solidus
  • U+000D: carriage return
  • U+2028: line separator
  • U+2029: paragraph separator
  • U+000A: line feed

For example, a string cannot contain a backslash directly. It must be escaped and written as \\ or \u005c.

There is nothing wrong with this rule in itself. The trouble is that the JSON format allows U+2028 (line separator) and U+2029 (paragraph separator) to appear directly in strings. As a result, JSON output by a server may throw an error when it is parsed by JSON.parse.

const json = '"\u2028"';
JSON.parse(json); // Possible error

The JSON format has been frozen (RFC 7159) and cannot be modified. To eliminate this error, ES2019 allows JavaScript strings to contain U+2028 (line separator) and U+2029 (paragraph separator) directly.

const PS = eval("'\u2029'");

According to this proposal, the above code does not report an error.

Note that template strings already allow these two characters to be entered directly. Regular expressions still do not, but that is not a problem, because JSON cannot contain regular expressions directly.
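
For JSON text that may still be evaluated as JavaScript source by an engine older than ES2019, one possible precaution is to escape the two separators before embedding it. The following is only a sketch, and escapeLineSeparators is a hypothetical helper name:

```javascript
// Hypothetical helper: replace raw U+2028 / U+2029 in JSON text with
// their escaped forms, which are valid in both JSON and JavaScript.
function escapeLineSeparators(jsonText) {
  return jsonText
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

const jsonText = JSON.stringify({ note: 'line\u2028break' });
const safeText = escapeLineSeparators(jsonText);

console.log(safeText.includes('\u2028'));                     // false
console.log(JSON.parse(safeText).note === 'line\u2028break'); // true
```

The escaped text still parses to the same value, so the receiving side does not need to change.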

4. Changes to JSON.stringify()

According to the standard, JSON data must be UTF-8 encoded. However, the JSON.stringify() method could return strings that do not conform to the UTF-8 standard.

Specifically, the UTF-8 standard stipulates that code points between 0xD800 and 0xDFFF cannot be used alone; they must be used in pairs. For example, \uD834\uDF06 is two code points, but they must be used together to represent the character 𝌆. This is a workaround for representing characters with code points greater than 0xFFFF. It is illegal to use \uD834 or \uDF06 alone, or to reverse the order, because \uDF06\uD834 has no corresponding character.

The problem with JSON.stringify() is that it could return a single code point between 0xD800 and 0xDFFF.

JSON.stringify('\u{D834}') // "\u{D834}"

To ensure that valid UTF-8 is returned, ES2019 changed the behavior of JSON.stringify(). If it encounters a single code point between 0xD800 and 0xDFFF, or a pair in the wrong order, it returns an escaped string and leaves it to the application to decide what to do next.

JSON.stringify('\u{D834}') // '"\\ud834"'
JSON.stringify('\uDF06\uD834') // '"\\udf06\\ud834"'
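
An application that wants to decide what to do with such input could first check for unpaired surrogates. A minimal sketch, assuming an ES2019-capable engine; hasLoneSurrogate is a hypothetical helper name:

```javascript
// Hypothetical helper: detect a surrogate that is not part of a valid pair.
function hasLoneSurrogate(str) {
  return /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/.test(str);
}

console.log(hasLoneSurrogate('\uD834\uDF06')); // false (a valid pair)
console.log(hasLoneSurrogate('\uD834'));       // true
console.log(hasLoneSurrogate('\uDF06\uD834')); // true (wrong order)

// The escaped output survives a round trip through JSON.parse.
const encoded = JSON.stringify('\uD834');
console.log(JSON.parse(encoded) === '\uD834'); // true
```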

5. Template strings

In traditional JavaScript, an output template is usually written like this (the example below uses jQuery).

$('#result').append(
  'There are <b>' + basket.count + '</b> ' +
  'items in your basket, ' +
  '<em>' + basket.onSale +
  '</em> are on sale!'
);

This style is cumbersome and inconvenient. ES6 introduces template strings to solve the problem.

$('#result').append(`
  There are <b>${basket.count}</b> items
   in your basket, <em>${basket.onSale}</em>
  are on sale!
`);

A template string is an enhanced string, delimited by backquotes (`). It can be used as a normal string, to define multiline strings, or to embed variables in a string.

// Normal string
`In JavaScript '\n' is a line-feed.`

// Multiline string
`In JavaScript this is
 not legal.`

console.log(`string text line 1
string text line 2`);

// Embedding variables in strings
let name = "Bob", time = "today";
`Hello ${name}, how are you ${time}?`

The template strings in the above code are all represented by backquotes. If you need to use backquotes in the template string, you should use a backslash to escape.

let greeting = `\`Yo\` World!`;

If you use a template string to represent a multiline string, all spaces and indents are retained in the output.


In a multiline template string, all spaces and newlines are preserved. For example, if the string starts with a line break followed by a <ul> tag, the output has a newline in front of the <ul> tag. If you do not want this newline, you can use the trim method to remove it.
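
A small sketch of this behavior, using an illustrative list whose markup is an assumption rather than the original listing:

```javascript
const list = `
<ul>
  <li>first</li>
  <li>second</li>
</ul>
`;

const trimmed = list.trim();

console.log(list.startsWith('\n'));      // true: the leading newline is kept
console.log(list.includes('  <li>'));    // true: indentation is kept
console.log(trimmed.startsWith('<ul>')); // true: trim removed the newline
```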


To embed a variable in a template string, write the variable name inside ${}.

function authorize(user, action) {
  if (!user.hasPrivilege(action)) {
    throw new Error(
      // The traditional way of writing this is
      // 'User '
      // + user.name
      // + ' is not authorized to do '
      // + action
      // + '.'
      `User ${user.name} is not authorized to do ${action}.`);
  }
}

Arbitrary JavaScript expressions can be placed inside braces, which can be used for operations and reference object properties.

let x = 1;
let y = 2;

`${x} + ${y} = ${x + y}`
// "1 + 2 = 3"

`${x} + ${y * 2} = ${x + y * 2}`
// "1 + 4 = 5"

let obj = {x: 1, y: 2};
`${obj.x + obj.y}`
// "3"

Functions can also be called inside a template string.

function fn() {
  return "Hello World";
}

`foo ${fn()} bar`
// foo Hello World bar

If the value in braces is not a string, it will be converted to a string according to the general rules. For example, an object in braces will call the toString method of the object by default.
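
The conversion can be observed directly. In this sketch, point and plain are illustrative objects that do not appear in the original text:

```javascript
const point = {
  x: 1,
  y: 2,
  toString() {
    return `(${this.x}, ${this.y})`;
  },
};
const plain = {};

const fromCustom = `${point}`;     // uses the custom toString
const fromPlain = `${plain}`;      // uses Object.prototype.toString
const fromArray = `${[1, 2, 3]}`;  // arrays join their members with commas
const fromOthers = `${null} ${undefined} ${42}`;

console.log(fromCustom); // (1, 2)
console.log(fromPlain);  // [object Object]
console.log(fromArray);  // 1,2,3
console.log(fromOthers); // null undefined 42
```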

If a variable used in the template string has not been declared, an error is thrown.

// The variable place is not declared
let msg = `Hello, ${place}`;
// ReferenceError

Because the code inside the braces is executed as JavaScript, a string literal inside the braces is output as is.

`Hello ${'World'}`
// "Hello World"

Template strings can even be nested.

const tmpl = addrs => `
  <table>
  ${addrs.map(addr => `
    <tr><td>${addr.first}</td></tr>
    <tr><td>${addr.last}</td></tr>
  `).join('')}
  </table>
`;

In the above code, another template string is embedded in a substitution of the outer template string. It is used as follows.

const data = [
    { first: '<Jane>', last: 'Bond' },
    { first: 'Lars', last: '<Croft>' },
];

console.log(tmpl(data));
// <table>
//   <tr><td><Jane></td></tr>
//   <tr><td>Bond</td></tr>
//   <tr><td>Lars</td></tr>
//   <tr><td><Croft></td></tr>
// </table>

If you need to reuse a template string and evaluate it only when needed, you can wrap it in a function.

let func = (name) => `Hello ${name}!`;
func('Jack') // "Hello Jack!"

In the above code, the template string is written as the return value of a function. Executing this function is equivalent to executing the template string.

6. Example: template compilation (review)

Next, let's look at an example of generating a real template from a template string.

let template = `
<ul>
  <% for(let i=0; i < data.supplies.length; i++) { %>
    <li><%= data.supplies[i] %></li>
  <% } %>
</ul>
`;

The above code places an ordinary template inside a template string. The template uses <%...%> to hold JavaScript code and <%= ... %> to output JavaScript expressions.

How can this template string be compiled?

One idea is to convert it into a string of JavaScript statements like the following.

echo('<ul>');
for(let i=0; i < data.supplies.length; i++) {
  echo('<li>');
  echo(data.supplies[i]);
  echo('</li>');
};
echo('</ul>');

This transformation uses regular expressions.

let evalExpr = /<%=(.+?)%>/g;
let expr = /<%([\s\S]+?)%>/g;

template = template
  .replace(evalExpr, '`); \n  echo( $1 ); \n  echo(`')
  .replace(expr, '`); \n $1 \n  echo(`');

template = 'echo(`' + template + '`);';

Then, wrap the template in a function and return it.

let script =
`(function parse(data){
  let output = "";

  function echo(html){
    output += html;
  }

  ${ template }

  return output;
})`;

return script;

Assemble the above pieces into a template compilation function, compile.

function compile(template){
  const evalExpr = /<%=(.+?)%>/g;
  const expr = /<%([\s\S]+?)%>/g;

  template = template
    .replace(evalExpr, '`); \n  echo( $1 ); \n  echo(`')
    .replace(expr, '`); \n $1 \n  echo(`');

  template = 'echo(`' + template + '`);';

  let script =
  `(function parse(data){
    let output = "";

    function echo(html){
      output += html;
    }

    ${ template }

    return output;
  })`;

  return script;
}

The compile function is used as follows.

let parse = eval(compile(template));
div.innerHTML = parse({ supplies: [ "broom", "mop", "cleaner" ] });
//   <ul>
//     <li>broom</li>
//     <li>mop</li>
//     <li>cleaner</li>
//   </ul>
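
As a variant, the same regular expressions can feed new Function instead of eval, which also makes the result easy to check outside a browser. This is only a sketch under that assumption; compileToFunction and sample are hypothetical names, not part of the original example.

```javascript
// Variant sketch: same regular expressions as compile, but the result
// is built with new Function instead of eval.
function compileToFunction(template) {
  const evalExpr = /<%=(.+?)%>/g;
  const expr = /<%([\s\S]+?)%>/g;

  const body = 'echo(`' +
    template
      .replace(evalExpr, '`); \n  echo( $1 ); \n  echo(`')
      .replace(expr, '`); \n $1 \n  echo(`') +
    '`);';

  return new Function('data',
    'let output = "";\n' +
    'function echo(html) { output += html; }\n' +
    body + '\n' +
    'return output;');
}

const sample = `
<ul>
  <% for (let i = 0; i < data.supplies.length; i++) { %>
    <li><%= data.supplies[i] %></li>
  <% } %>
</ul>
`;

const render = compileToFunction(sample);
const html = render({ supplies: ['broom', 'mop', 'cleaner'] });

// Whitespace is removed here only to make the result easy to read.
console.log(html.replace(/\s+/g, ''));
// <ul><li>broom</li><li>mop</li><li>cleaner</li></ul>
```

Like eval, new Function executes whatever code the template contains, so this approach is only suitable for templates you trust.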

7. Tagged templates

Template strings can do more than the above. A template string can immediately follow a function name, and that function is then called to process the template string. This is called the "tagged template" feature.

alert`hello`
// Equivalent to
alert(['hello'])

A tagged template is not really a template but a special form of function call. The "tag" is the function, and the template string that immediately follows it is its argument.

However, if the template string contains variables, it is no longer a simple call. Instead, the template string is first split into multiple arguments, and then the function is called.

let a = 5;
let b = 10;

tag`Hello ${ a + b } world ${ a * b }`;
// Equivalent to
tag(['Hello ', ' world ', ''], 15, 50);

In the above code, the template string is preceded by the identifier tag, which is a function. The value of the whole expression is the value returned by the tag function after it processes the template string.

The tag function receives multiple arguments in turn.

function tag(stringArr, value1, value2){
  // ...
}

// Equivalent to

function tag(stringArr, ...values){
  // ...
}

The first parameter of the tag function is an array whose members are the parts of the template string that are not substituted. In other words, the substitutions sit between the first and second members, between the second and third members, and so on.

The remaining parameters of the tag function are the values of the substituted expressions. In this example the template string contains two substitutions, so tag receives two further arguments, value1 and value2.

The actual values of all the arguments of the tag function are as follows.

  • First parameter: ['Hello ', ' world ', '']
  • Second parameter: 15
  • Third parameter: 50

That is, the tag function is actually called in the following form.

tag(['Hello ', ' world ', ''], 15, 50)
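
One way to confirm this is a tag that simply returns what it receives. A minimal sketch, where inspect is a hypothetical tag name:

```javascript
// Hypothetical tag that returns its arguments unchanged.
function inspect(strings, ...values) {
  return { strings: [...strings], values };
}

const a = 5;
const b = 10;
const parts = inspect`Hello ${a + b} world ${a * b}`;

console.log(parts.strings); // [ 'Hello ', ' world ', '' ]
console.log(parts.values);  // [ 15, 50 ]

// There is always one more string than there are values.
console.log(parts.strings.length === parts.values.length + 1); // true
```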

We can write the tag function as needed. The following is one way to write it, along with the results of running it.

let a = 5;
let b = 10;

function tag(s, v1, v2) {
  console.log(s[0]);
  console.log(s[1]);
  console.log(s[2]);
  console.log(v1);
  console.log(v2);

  return "OK";
}

tag`Hello ${ a + b } world ${ a * b}`;
// "Hello "
// " world "
// ""
// 15
// 50
// "OK"

Here is a more complex example.

let total = 30;
let msg = passthru`The total is ${total} (${total*1.05} with tax)`;

function passthru(literals) {
  let result = '';
  let i = 0;

  while (i < literals.length) {
    result += literals[i++];
    if (i < arguments.length) {
      result += arguments[i];
    }
  }

  return result;
}

msg // "The total is 30 (31.5 with tax)"

The above example shows how to reassemble the pieces in their original positions.

Written with rest parameters, the passthru function looks like this.

function passthru(literals, ...values) {
  let output = "";
  let index;
  for (index = 0; index < values.length; index++) {
    output += literals[index] + values[index];
  }

  output += literals[index]
  return output;
}

An important application of tagged templates is filtering HTML strings to prevent users from entering malicious content.

let message =
  SaferHTML`<p>${sender} has sent you a message.</p>`;

function SaferHTML(templateData) {
  let s = templateData[0];
  for (let i = 1; i < arguments.length; i++) {
    let arg = String(arguments[i]);

    // Escape special characters in the substitution.
    s += arg.replace(/&/g, "&amp;")
            .replace(/</g, "&lt;")
            .replace(/>/g, "&gt;");

    // Don't escape special characters in the template.
    s += templateData[i];
  }
  return s;
}

In the above code, the sender variable is often provided by the user. After being processed by the SaferHTML function, the special characters in it will be escaped.

let sender = '<script>alert("abc")</script>'; // Malicious code
let message = SaferHTML`<p>${sender} has sent you a message.</p>`;

// <p>&lt;script&gt;alert("abc")&lt;/script&gt; has sent you a message.</p>
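
The same idea can be written with rest parameters. The sketch below is a hypothetical variant, not the original function: it additionally escapes quotes, on the assumption that a substitution might end up inside an HTML attribute value. The names safeHtml, escapeHtml and HTML_ESCAPES are illustrative.

```javascript
// Hypothetical variant of SaferHTML written with rest parameters.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

function safeHtml(strings, ...values) {
  // Literal parts are kept as written; only substitutions are escaped.
  return strings.reduce(
    (out, chunk, i) => out + escapeHtml(values[i - 1]) + chunk
  );
}

const sender = '<script>alert("abc")</script>';
const message = safeHtml`<p>${sender} has sent you a message.</p>`;

console.log(message);
// <p>&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt; has sent you a message.</p>
```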

Another application of tagged templates is multilingual conversion (internationalization).

i18n`Welcome to ${siteName}, you are visitor number ${visitorNumber}!`
// "Welcome to xxx, you are the xxxx visitor!"
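
The i18n function itself is not shown here. One possible shape, purely as a sketch, is a lookup in a message catalog keyed by the template with numbered placeholders. The catalog, its French entry and the lookup scheme are all assumptions made for illustration.

```javascript
// Hypothetical message catalog: keys are the template with numbered
// placeholders, values are the translated text.
const catalog = {
  'Welcome to {0}, you are visitor number {1}!':
    'Bienvenue sur {0}, vous êtes le visiteur numéro {1} !',
};

function i18n(strings, ...values) {
  // Rebuild a lookup key such as "Welcome to {0}, ... {1}!"
  const key = strings.reduce((acc, chunk, i) => acc + `{${i - 1}}` + chunk);
  // Fall back to the original wording when no translation exists.
  const translated = catalog[key] || key;
  // Put the substituted values back into the chosen text.
  return translated.replace(/\{(\d+)\}/g, (_, n) => String(values[Number(n)]));
}

const siteName = 'example.com';
const visitorNumber = 42;
const welcome = i18n`Welcome to ${siteName}, you are visitor number ${visitorNumber}!`;

console.log(welcome);
// Bienvenue sur example.com, vous êtes le visiteur numéro 42 !
```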

Template strings by themselves cannot replace template libraries such as Mustache, because they have no conditionals or loops, but you can add these features yourself through a tag function.

// The hashTemplate function below
// is a custom template processing function
let libraryHtml = hashTemplate`
    #for book in ${myBooks}
      <li><i>#{book.title}</i> by #{book.author}</li>
`;

In addition, you can even use tagged templates to embed other languages in JavaScript.

jsx`
    <input
      defaultValue='${this.state.value}' />
`

The above code uses the jsx function to convert a DOM string into a React object. A concrete implementation of the jsx function can be found on GitHub.

The following is a hypothetical example that runs Java code inside JavaScript through a java function.

java`
class HelloWorldApp {
  public static void main(String[] args) {
    System.out.println("Hello World!"); // Display the string.
  }
}
`

The first parameter of the template handler function (the array of template strings) also has a raw property.

console.log`123`
// ["123", raw: Array[1]]

In the above code, the argument that console.log receives is actually an array. The array has a raw property, which holds the original strings with their escape sequences left unprocessed.

Consider the following example.

tag`First line\nSecond line`

function tag(strings) {
  console.log(strings.raw[0]);
  // strings.raw[0] is "First line\\nSecond line"
  // Prints "First line\nSecond line"
}

In the above code, the first parameter of the tag function, strings, has a raw property that also points to an array. The members of this array correspond exactly to those of the strings array. For example, if the strings array is ["First line\nSecond line"], then the strings.raw array is ["First line\\nSecond line"]. The only difference between the two is that the backslashes in the raw strings are escaped. For example, the strings.raw array treats \n as the two characters \ and n rather than as a newline. This is designed to make it easy to obtain the original template before escape processing.
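
The difference can be measured directly. In this sketch, showRaw is a hypothetical tag name; String.raw is the built-in tag that returns the raw form.

```javascript
// Hypothetical tag: return both forms of the first string.
function showRaw(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

const result = showRaw`First line\nSecond line`;

console.log(result.cooked.includes('\n')); // true: a real newline
console.log(result.raw.includes('\n'));    // false: no newline in the raw form
console.log(result.raw.includes('\\n'));   // true: a backslash followed by n
console.log(result.raw.length - result.cooked.length); // 1

// The built-in String.raw tag yields the same raw text.
console.log(String.raw`First line\nSecond line` === result.raw); // true
```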

8. Restrictions on template strings

As mentioned earlier, other languages can be embedded in tagged templates. However, template strings process escape sequences by default, which gets in the way of embedding other languages.

For example, the LaTeX language can be embedded in a tagged template.

function latex(strings) {
  // ...
}

let document = latex`
\newcommand{\fun}{\textbf{Fun!}}  // Works normally
\newcommand{\unicode}{\textbf{Unicode!}} // Error
\newcommand{\xerxes}{\textbf{King!}} // Error

Breve over the h goes \u{h}ere // Error
`

In the above code, the template string assigned to the variable document is perfectly legal LaTeX, but the JavaScript engine reports an error. The cause is string escaping.

Template strings treat \u00FF and \u{42} as Unicode escapes, so \unicode fails to parse; likewise \x56 is treated as a hexadecimal escape, so \xerxes reports an error. In other words, \u and \x have special meanings in LaTeX, but JavaScript tries to process them as escapes.

To solve this problem, ES2018 relaxed the restrictions on string escapes in tagged templates. If an illegal escape is encountered, the corresponding string is set to undefined instead of an error being reported, and the original string can still be obtained from the raw property.

function tag(strs) {
  strs[0] === undefined
  strs.raw[0] === "\\unicode and \\u{55}";
}
tag`\unicode and \u{55}`

In the above code, the template string would previously have reported an error. Because the restriction on string escapes has been relaxed, it no longer does. The JavaScript engine sets the first element to undefined, but the raw property still provides the original string, so the tag function can still process it.
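
A tag that embeds LaTeX can therefore ignore the processed strings and work from raw alone. A minimal sketch, assuming an ES2018-capable engine; this latex implementation is hypothetical and only concatenates the pieces:

```javascript
// Hypothetical latex tag: ignore the processed strings (which may be
// undefined) and rebuild the text from strings.raw.
function latex(strings, ...values) {
  return strings.raw.reduce(
    (out, chunk, i) => out + String(values[i - 1]) + chunk
  );
}

const title = 'Unicode!';
const source = latex`\newcommand{\unicode}{\textbf{${title}}}`;

console.log(source);
// \newcommand{\unicode}{\textbf{Unicode!}}
```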

Note that this relaxation only applies when a tagged template parses the string. An ordinary template string still reports an error.

let bad = `bad escape sequence: \unicode`; // SyntaxError

Tags: ECMAScript string JSON unicode

Posted by anthonyfellows on Sun, 07 Aug 2022 01:41:21 +0930