Excel for Python automation
Method 1: execute the command with pip
Install openpyxl module pip install openpyxl
Method 2: in pychar: File - > setting - > Project Interpreter on the left, there is a + in the upper right corner, search openpyxl, and then install package
Excel read
Read the corresponding table
- Open an existing Excel table
from openpyxl import load_workbook exl = load_workbook(filename = 'Why don't you say you won't.xlsx') print(exl.sheetnames)
['No, No']
- According to the name or form
exl_1 = load_workbook(filename = 'Why don't you say you won't.xlsx') # print(exl_1.sheetnames) sheet = exl_1['No, No'] # A sheet directly named "no is no"
# If there is only one table: sheet = exl_1.active
- Gets the size occupied by Excel content
print(sheet.dimensions) #See how many rows there are in the data
A1:D7
Read cell
- Get the specific content of a cell
cell = sheet.cell(row=1,column=2) #Specify the number of rows and columns print(cell.value) cell_1 = sheet['A2'] #Specify coordinates print(cell_1.value)
Sales unit price Dora A dream
- Get the row, column and coordinates corresponding to the cell
print(cell_1.row, cell_1.column, cell.coordinate) # 2 1 is the location of cell A2, and B1 is the location of cell
2 1 B1
Read the values of multiple grids
- Specify coordinate range
cells = sheet['A1:C8'] #Value of A1 to C8 area
- Specifies the value of the row
Row = sheet[1] #Value of line 1 Rows = sheet[1:2] #Values in lines 1 to 2
- Specifies the value of the column
Column = sheet['A'] #Column A Columns = sheet['A:C'] #Columns A to C
- Specifies the value of the range
# Row acquisition for row in sheet.iter_rows(min_row = 1, max_row = 5, min_col = 2, max_col = 6): print(row) # A column consists of multiple cells. If you need to get the value of each cell, you can get it circularly for cell in row: print(cell.value) # Column acquisition for col in sheet.iter_cols(min_row = 1, max_row = 5, min_col = 2, max_col = 6): print(col) for cell in col: print(cell.value)
Exercises
Find text_ Table sheet1 in 1.xlsx contains empty grids and outputs the coordinates of these grids
from openpyxl import load_workbook exl = load_workbook('test_1.xlsx') sheet = exl.active for row in sheet.iter_rows(min_row = 1, max_row = 5, min_col = 1, max_col = 4): #View the number of rows and columns of the corresponding table for cell in row: if not cell.value: print(cell.coordinate)
Note: in practical application, you can check which values are empty, find the corresponding data and fill in, such as questionnaire survey.
Excel write
Write cells and save
from openpyxl import load_workbook exl = load_workbook(filename = 'Why don't you say you won't.xlsx') sheet = exl.active sheet['A1'] = 'hello I am fighting against Buddha' #Or cell = sheet['A1 '] #cell.value = 'hello word' exl.save(filename = 'test.xlsx') #Stored in the original Excel table, if a new file is created, it can be named with different names
Write line data and save
- Write a row of data and save
import xlwt workbook = xlwt.Workbook(encoding = 'utf-8') # Create a sheet sheet = workbook.add_sheet('What are you doing?') # Write excel # Parameter corresponding row, column and value sheet.write(1,0,label = 'I didn't do anything') # Save as a new table workbook.save('It's strange to believe you.xlsx')
- Write multiple rows of data and save it. If there is a corresponding table in the following operations, the original table will be directly overwritten.
import xlwt # from openpyxl import load_workbook book = xlwt.Workbook(encoding = 'utf-8') # Create a sheet exl = book.add_sheet('Just numbers, not words') data = [[1,22,1], [3,23,3], [3,25,2]] for i in range(len(data)): for j in range(len(data[i])): exl.write(i,j,data[i][j]) book.save('Mathematics makes me happy.xlsx')
When using openpyxl module in python, an error occurs: File is not a zip file.
Experience from netizens:
The biggest reason is that it is not a real xlsx file. If it is a new file created through the library xlwt, or an xlsx file obtained by modifying the suffix by myself, an error will be reported. The solution I encounter is basically to create an xlsx file by myself using office. On the Internet, it is said that it is due to the encryption of the new version of office, Real xlsx files can only be created through office software.
Python pit:
-
openpyxl and (xlrd,xlwt) cannot be mixed!
-
Project name, package name, file name and module name cannot be repeated!
-
(xlrd, xlwt) never allow openpyxl to read or write read and written files
The following code can change the corresponding position to the corresponding data, and other data will not be overwritten
from openpyxl import load_workbook exl = load_workbook(filename = 'Mathematics makes me happy.xlsx') xml_sheet = exl["Just numbers, not words"] data = [[1, 22, 3], [2, "I'm an alien",5], [3, 255, "It's you, too"]] for i in range(len(data)): for j in range(len(data[i])): print(i, j, data[i][j]) xml_sheet.cell(row=i + 1, column=j + 1).value = data[i][j] exl.save('Mathematics makes me happy.xlsx')
0 0 1 0 1 22 0 2 3 1 0 2 1 1 I'm an alien 1 2 5 2 0 3 2 1 255 2 2 It's you, too
Write formula to cell and save
from openpyxl import load_workbook exl = load_workbook(filename ='test.xlsx') sheet =exl.active # This step is very important because you need to open this table to perform subsequent operations sheet['D2'] = '=B2*C2' exl.save('test_new.xlsx')
Insert column data
- Insert a column
from openpyxl import load_workbook exl = load_workbook(filename ='test.xlsx') sheet =exl.active sheet.insert_cols(idx=2) #idx=2 column 2, insert a column before column 2 exl.save('test_11.xlsx')
- Insert multiple columns
#Insert 5 columns before column 2 as an example sheet.insert_cols(idx=2, amount=5) exl.save('test_22.xlsx')
Insert row data
Insert a line (or lines) before line 2
#Insert a row sheet.insert_rows(idx=2) #Insert multiple rows sheet.insert_rows(idx=2, amount=5) # Remember to save to see the corresponding changes.
delete
- Delete multiple columns
sheet.delete_cols(idx=5, amount=2) #Delete 2 columns before column 5
- Delete multiple rows
sheet.delete_rows(idx=2, amount=5)
move
When the number is positive, it means down or right, and when it is negative, it means up or left
move_range("data area", rows=1,cols=-1,translate=True) moves down 1 row and left 1 column
To move a specified range of cells in a worksheet:
ws.move_range("D4:F10", rows=-1, cols=2)
The above code moves the cells in the range D4: F10 up one column and right two columns. The moved cells will cover the existing cells.
If the cell contains formulas, openpyxl can translate them. Since this is not the desired result, it is disabled by default. Only the formulas in the moved cells will be translated, and the references of other cells or defined names to cells will not be updated. You can use the analytical formula converter to do this:
ws.move_range("G4:H10", rows=1, cols=1, translate=True)
The above code will move the formula within the specified cell range by one row and one column relative to the reference.
from openpyxl import load_workbook exl = load_workbook(filename ='Why don't you say you won't.xlsx') sheet =exl.active sheet.move_range('C1:D4', rows =1, cols=-1) exl.save(filename='I am a ghost spirit.xlsx')
Sheet table operation
- Create a new sheet
from openpyxl import load_workbook # Read the sheet name of the file exl = load_workbook(filename = 'Meter reading boss.xlsx') print(exl.sheetnames)
['Is that you', 'yes', 'no']
# Create a new sheet exl.create_sheet('Oh, none of us')
<Worksheet "Oh, none of us">
- Copy existing sheet
ws = exl['yes'] exl.copy_worksheet(ws)
<Worksheet "yes Copy">
- Modify sheet table name
sheet = exl['Is that you'] sheet.title = 'Is it really you'
All the above must be saved again before you can have the effect you want.
Create a new Excel table
from openpyxl import Workbook wb = Workbook() sheet = wb.active wb.save(filename = 'Luna is not lulu.xlsx')
Excel style
Set font style
-
Set font style
Font(name = "font name", size = font size, bold = bold, italic = italic, color = font color)
- Font note the initial capitalization
from openpyxl import Workbook from openpyxl.styles import Font # Create a new table workbook = Workbook() sheet = workbook.active cell = sheet['A1'] font = Font(name='Cursive script', size=10, bold=True, italic=True, color='FF0000') cell.font = font workbook.save('excel Style setting test run.xlsx')
- Set the font style of multiple grids
from openpyxl import Workbook from openpyxl.styles import Font workbook = Workbook() sheet = workbook.active font = Font(name='official script', size=10, bold=True, italic=True, color='FF000000') for x in range(1,5): sheet.cell(row=x,column=2).font = font #Replace the 1, 2, 3 and 4 row styles of the second column one by one workbook.save('excel style.xlsx')
You want to set the same format for a column / row, as follows:
from openpyxl import load_workbook from openpyxl.styles import Font exl = load_workbook(filename ='Why don't you say you won't.xlsx') sheet =exl.active cells = sheet[2] font = Font(name='typeface', size=10, bold=True, italic=True, color='FF000000') for cell in cells: cell.font = font exl.save('See if I've changed.xlsx')
---------------------------------------------------------------------------
1. Set alignment style
Horizontal alignment: distributed, justify, center, left, fill, centerContinuous, right, general
Vertical alignment: bottom, distributed, justify, center, top
from openpyxl import Workbook from openpyxl.styles import Alignment workbook = Workbook() sheet = workbook.active cell = sheet['A1'] alignment = Alignment(horizontal="center",vertical="center",text_rotation=30,wrap_text=True) cell.alignment = alignment workbook.save('Alignment style.xlsx')
2. Set cell border style
Side(style = edge style, color = edge color), the color can only be a hexadecimal color name
Border(left = left line style, right = right line style, top = upper line style, bottom = lower line style)
Edge style: double, mediumdashdotdot, slatdashdot, dashdotdot, dotted, hair, mediumdashed, dashed, dashdot, thin, mediumdashdot, medium, thick
from openpyxl import Workbook from openpyxl.styles import Side,Border workbook = Workbook() sheet = workbook.active cell = sheet['D6'] side_left = Side(style='double', color='FF0000') side_right = Side(style='dashDot', color='FF000000') #Set the format of the side first border = Border(left=side_left, right=side_right, top=side_right, bottom=side_left) #Substitute into the sideline cell.border = border workbook.save('Cell border style.xlsx')
3. Set cell filling style
- PatternFill(fill_type = fill pattern, fgColor = fill color)
- GradientFill(stop = (gradient color 1, gradient color 2,...))
from openpyxl import Workbook from openpyxl.styles import PatternFill,GradientFill workbook = Workbook() sheet = workbook.active cell1 = sheet['B3'] cell2 = sheet['B4'] #Monochrome fill pattern_fill = PatternFill(fill_type='solid', fgColor ="FF0000") cell1.fill = pattern_fill #Fountain Fill gradient_fill = GradientFill(stop=('FFFFFF', '99ccff','000000')) cell2.fill = gradient_fill workbook.save(filename='Cell fill pattern.xlsx')
4. Set row height and column width
- row_dimensions [line number] Height = row height
- column_dimensions [column number] Width = column width
from openpyxl import Workbook workbook = Workbook() sheet = workbook.active sheet.row_dimensions[1].height = 50 sheet.column_dimensions['C'].width = 20 workbook.save(filename='Row height and column width settings.xlsx')
5. Merge cells
- merge_ Cells (cell number to be merged)
- merage_cells(start_row = start row number, start_column = start column number, end_row = end row number, end_column = end column number)
from openpyxl import Workbook workbook = Workbook() sheet = workbook.active sheet.merge_cells('A1:B2') sheet.merge_cells(start_row=1, start_column=3,end_row=2, end_column=4) workbook.save(filename='merge cell.xlsx')
6. Cancel cell merging
- unmerge_ Cells (cell number to be merged)
- unmerge_cells(start_row = start row number, start_column = start column number, end_row = end row number, end_column = end column number)
from openpyxl import load_workbook workbook = load_workbook(filename="merge cell.xlsx") sheet = workbook.active sheet.unmerge_cells('A1:B2') sheet.unmerge_cells(start_row=1, start_column=3,end_row=2, end_column=4) workbook.save(filename='Unmerge cells.xlsx')
Exercises
Open the "why can't you? xlsx" file, find out the lines with unit price more than 5 in the file, and mark them in red, bold and border.
from openpyxl import load_workbook from openpyxl.styles import Font, Side, Border workbook = load_workbook('Why don't you say you won't.xlsx') sheet = workbook.active price = sheet['B'] row_lst = [] for cell in price: if isinstance(cell.value, int) and cell.value > 5: row_lst.append(cell.row) side_left = Side(style='mediumDashDotDot', color='FF0000') side_right = Side(style='mediumDashDotDot', color='6C91FC') side_top = Side(style='thin', color='3E7AAB') side_bottom = Side(style='double', color='3E7AAB') border = Border(left=side_left, right=side_right, top=side_top, bottom=side_bottom) font = Font(bold=True, color='FF0000') for row in row_lst: for cell in sheet[row]: cell.font = font cell.border = border workbook.save('Exercises.xlsx')