Python escape character

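Input | repr | print | Hex | Name                  | Common name           | Use
\"    | "    | "     | 22  | Quotation Mark        | Double quotation mark |
\'    | '    | '     | 27  | Apostrophe            | Apostrophe            |
\N    | -    | -     | -   | \N{name}              | character             | Character selected by its Unicode name
\U    | -    | -     | -   | \UXXXXXXXX            | character             | Unicode: 0x00000000~0x0010FFFF
\\    | \\   | \     | 5C  | Reverse Solidus       | Backslash             |
\a    | \x07 | -     | 7   | Same as \7            | Same as \7            |
\b    | \x08 | -     | 8   | Backspace             | Backspace             | Keyboard Backspace
\f    | \x0c | -     | C   | Form Feed             | Page break            | In some contexts, advances to the next page
\n    | \n   | -     | A   | New Line              | Line feed             | Keyboard Enter
\r    | \r   | -     | D   | Carriage Return       | Return                | Returns to the start of the line without a line feed
\t    | \t   | -     | 9   | Horizontal Tabulation | Horizontal tab        | Keyboard Tab
\u    | -    | -     | -   | \uXXXX                | character             | Unicode: 0x0000~0xFFFF
\v    | \x0b | -     | B   | Vertical Tabulation   | Vertical tab          |
\x    | -    | -     | -   | \xXX                  | character             | Unicode: 0x00~0xFF
\000  |      |       |     |                       |                       | Octal form, one to three octal digits
\0    | \x00 | -     | 0   | Null                  | Null                  | Stops copying (string terminator)
\1    | \x01 | -     | 1   | Start of Heading      | Start of heading      |
\2    | \x02 | -     | 2   | Start of Text         | Start of text         | Indents the beginning of the paragraph by two characters
\3    | \x03 | -     | 3   | End of Text           | End of text           |
\4    | \x04 | -     | 4   | End of Transmission   | End of transmission   |
\5    | \x05 | -     | 5   | Enquiry               | Enquiry               |
\6    | \x06 | -     | 6   | Acknowledge           | Acknowledge           |
\7    | \x07 | -     | 7   | Bell                  | Ring the bell         | System alert tone
\     | -    | -     | -   | -                     | Line continuation     | Placed at the end of a line of code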

1. A backslash plus an ASCII letter produces common non-printing editing characters, such as carriage return, backspace and tab
2. Ring the bell as a program alert
3. Escapes required for syntax reasons, such as backslashes and quotation marks
4. Enter Unicode characters by hexadecimal code: \xXX, \uXXXX, \UXXXXXXXX
5. Enter Unicode characters by name: \N{Bell} [note]
6. Enter Unicode characters by octal code: \000, \0, \00
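
As a quick check of the six categories above, the following sketch pairs one escape from each category with the character it should produce. The dictionary name and the labels are illustrative only; the escapes themselves are standard Python string syntax.

```python
# Each entry is (escape written in source, the character it produces).
examples = {
    'letter escape': ('\t', chr(9)),
    'bell': ('\a', chr(7)),
    'syntax': ('\\', chr(0x5C)),
    'hex, 2 digits': ('\x41', 'A'),
    'hex, 4 digits': ('\u9999', chr(0x9999)),
    'hex, 8 digits': ('\U0001F514', chr(0x1F514)),
    'name': ('\N{LATIN CAPITAL LETTER A}', 'A'),
    'octal': ('\101', 'A'),
}

for label, (escaped, expected) in examples.items():
    # True means the escape produced exactly the expected character.
    print(label, repr(escaped), escaped == expected)
```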

\a,\x07,\007,\7,\07,\N{BEL}
These are all meant to ring the bell, but take care with the short octal forms \7 and \07. If the next character in the string is an octal digit (0 ~ 7), it is absorbed into the escape, so in that case use the full three-digit form \007.
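
The pitfall with short octal escapes can be seen in a small sketch (the variable names are only for illustration): an octal escape takes up to three digits, so a digit placed right after \7 becomes part of the escape.

```python
short = '\71'   # the 1 is absorbed: octal 71 is 0x39, the character '9'
full = '\0071'  # three digits close the escape: BEL followed by '1'

print(repr(short), len(short))
print(repr(full), len(full))
```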

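[note] With \N{Bell} I wanted to output \a, but it output 🔔 instead. I had looked the name up online and was misled by an entry that reads Bell (BEL): the name BELL belongs to the emoji U+1F514, while the control character is known by the abbreviation BEL.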
I wanted a way to look up the name of a character. The approach most often suggested online, errors='namereplace', did not do the job: it only replaces characters that the target codec cannot encode, so with ascii it names only characters outside the ASCII range. Python's encode ultimately rests on its underlying codecs, and some code pages are nothing more than a simple mapping table.
After some experimenting, I decided it was better to write one myself:
Python escape character \N{...}: code with full Unicode support
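
For a plain name lookup, the standard library's unicodedata module may already be enough. The sketch below is not the author's own tool; name_of is a hypothetical helper that wraps unicodedata.name and returns a placeholder for characters without a formal name, such as control characters.

```python
import unicodedata

def name_of(ch, default='<no name>'):
    """Return the Unicode name of one character, or `default` if it has none."""
    return unicodedata.name(ch, default)

print(name_of('a'))
print(name_of('\x07'))  # control characters have no formal name

# unicodedata.lookup goes the other way, from a name to a character.
print(unicodedata.lookup('BELL') == '\N{BELL}')
```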

'a'.encode('name',errors='namereplace')
UnicodeEncodeError: 'charmap' codec can't encode character '\x61' in position 0: character maps to <undefined>

'香'.encode('ascii',errors='namereplace')
Out[94]: b'\\N{CJK UNIFIED IDEOGRAPH-9999}'

b'\\N{CJK UNIFIED IDEOGRAPH-9999}'.decode('unicode_escape')
Out[95]: '香'

if 1:print('1\
2')
12
if 1:print('1\
    2')
1    2
if 1:print\
('12')
12
if 1:print('''1
2''')
1
2
if 1:print('''1\
2''')
12
if 1:print(r'''1\
2''')
1\
2
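for i in range(0x80):
    if (c := chr(i)).isprintable():
        # Evaluate the literal '\<c>' to see what Python makes of it.
        try:
            s = eval("'\\%s'" % c)
        except SyntaxError:
            # \N, \U, \u and \x are incomplete on their own.
            print(c, 'Error')
        else:
            # Report only the escapes whose value differs from the literal text.
            if s != '\\' + c:
                print(c, repr(s))

" '"' # '\"' double quotation mark
' "'" # "\'" single quotation mark
0 '\x00' # NUL, the null control character
1 '\x01'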
2 '\x02'
3 '\x03'
4 '\x04'
5 '\x05'
6 '\x06'
7 '\x07'
N Error
U Error
\ '\\'
a '\x07'
b '\x08'
f '\x0c'
n '\n'
r '\r'
t '\t'
u Error
v '\x0b'
x Error

Put simply, an escape sequence uses a backslash plus the character(s) that follow it to stand for a character that cannot be printed or that has another meaning in the syntax.
Escapes are part of Python's syntax. Because whatever follows the backslash has to be typed on the keyboard, the loop above prefixes every printable ASCII character with \. If the value produced differs from the literal text, the pair is an escape sequence.

C0 control

The results for 0 ~ 7 above are C0 control characters of ASCII.

>>> for i in range(0x100):
...     c=chr(i)
...     print(repr(c),end=' ')
...
'\x00' '\x01' '\x02' '\x03' '\x04' '\x05' '\x06' '\x07' '\x08' '\t' '\n' '\x0b' '\x0c' '\r' '\x0e' '\x0f' '\x10' '\x11' '\x12' '\x13' '\x14' '\x15' '\x16' '\x17' '\x18' '\x19' '\x1a' '\x1b' '\x1c' '\x1d' '\x1e' '\x1f' ' ' '!' '"' '#' '$' '%' '&' "'" '(' ')' '*' '+' ',' '-' '.' '/' '0' '1' '2' '3' '4' '5' '6' '7' '8' '9' ':' ';' '<' '=' '>' '?' '@' 'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L' 'M' 'N' 'O' 'P' 'Q' 'R' 'S' 'T' 'U' 'V' 'W' 'X' 'Y' 'Z' '[' '\\' ']' '^' '_' '`' 'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z' '{' '|' '}' '~' '\x7f' '\x80' '\x81' '\x82' '\x83' '\x84' '\x85' '\x86' '\x87' '\x88' '\x89' '\x8a' '\x8b' '\x8c' '\x8d' '\x8e' '\x8f' '\x90' '\x91' '\x92' '\x93' '\x94' '\x95' '\x96' '\x97' '\x98' '\x99' '\x9a' '\x9b' '\x9c' '\x9d' '\x9e' '\x9f' '\xa0' '¡' '¢' '£' '¤' '¥' '¦' '§' '¨' '©' 'ª' '«' '¬' '\xad' '®' '¯' '°' '±' '²' '³' '´' 'µ' '¶' '·' '¸' '¹' 'º' '»' '¼' '½' '¾' '¿' 'À' 'Á' 'Â' 'Ã' 'Ä' 'Å' 'Æ' 'Ç' 'È' 'É' 'Ê' 'Ë' 'Ì' 'Í' 'Î' 'Ï' 'Ð' 'Ñ' 'Ò' 'Ó' 'Ô' 'Õ' 'Ö' '×' 'Ø' 'Ù' 'Ú' 'Û' 'Ü' 'Ý' 'Þ' 'ß' 'à' 'á' 'â' 'ã' 'ä' 'å' 'æ' 'ç' 'è' 'é' 'ê' 'ë' 'ì' 'í' 'î' 'ï' 'ð' 'ñ' 'ò' 'ó' 'ô' 'õ' 'ö' '÷' 'ø' 'ù' 'ú' 'û' 'ü' 'ý' 'þ' 'ÿ'

In fact, every entry above that is shown with a \ is an escape sequence. Each one stands for a single character, but that character cannot be printed.
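
One way to check this claim is to compare what repr escapes with what str.isprintable reports. This is a sketch with illustrative variable names; it assumes the same 0x00~0xFF range as the listing above.

```python
# Code points whose repr contains a backslash, i.e. is shown as an escape.
escaped_by_repr = [i for i in range(0x100) if '\\' in repr(chr(i))]
# Code points that Python considers non-printable.
not_printable = [i for i in range(0x100) if not chr(i).isprintable()]

# Whatever is escaped although it is printable is escaped for syntax reasons.
only_syntax = sorted(set(escaped_by_repr) - set(not_printable))
print([hex(i) for i in only_syntax])
```

In this range the only printable character that repr still escapes should be the backslash itself.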

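This is also where the unicode_escape and raw_unicode_escape codecs of encode differ. unicode_escape writes such characters out as the text of their escape sequence, whereas raw_unicode_escape passes characters in the range 0x00~0xFF through as raw bytes and only escapes the characters above that range.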

'ÿ'.encode('unicode_escape')
Out[24]: b'\\xff'

'ÿ'.encode('raw_unicode_escape')
Out[25]: b'\xff'

'香'.encode('raw_unicode_escape')
Out[26]: b'\\u9999'

'\n'.encode('raw_unicode_escape')
Out[27]: b'\n'

'\n'.encode('unicode_escape')
Out[28]: b'\\n'
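
The five results above can be combined into one round trip. This is a minimal sketch; the sample string is written with escapes so that the file stays pure ASCII, and the variable names are illustrative.

```python
# ÿ (Latin-1), a newline (control character) and 香 (CJK, U+9999).
text = '\xff\n\u9999'

escaped = text.encode('unicode_escape')
raw = text.encode('raw_unicode_escape')

print(escaped)  # every non-ASCII or non-printable character becomes escape text
print(raw)      # only the character above 0xFF becomes escape text

# Both encodings decode back to the original string.
print(escaped.decode('unicode_escape') == text)
print(raw.decode('raw_unicode_escape') == text)
```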

'\t', '\n' and '\r' have the same form in the literal and in the repr of the content
'\a', '\b', '\f' and '\v' can be entered, but the content is shown as the corresponding \x escape
'\\', '\'' and '\"' must be escaped for syntax reasons in order to be output correctly
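
The first two groups can be checked directly with repr. A small sketch, with list names chosen only for illustration:

```python
same_form = ['\t', '\n', '\r']       # repr keeps the letter escape
hex_form = ['\a', '\b', '\f', '\v']  # repr falls back to the \x form

for ch in same_form + hex_form:
    # Show the code point next to the way repr writes the character.
    print('0x%02x' % ord(ch), repr(ch))
```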

Tags: Python

Posted by White_Coffee on Mon, 18 Apr 2022 03:12:19 +0930