Newer
Older
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"wb.remove_sheet(wb.get_sheet_by_name('Middle Sheet'))\n",
"wb.remove_sheet(wb.get_sheet_by_name('Sheet1'))\n",
"wb.get_sheet_names()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `remove_sheet()` method takes a Worksheet object, not a string of the sheet name, as its argument. If you know only the name of a sheet you want to remove, call `get_sheet_by_name()` and pass its return value into `remove_sheet()`.\n",
"\n",
"Remember to call the `save()` method to save the changes after adding sheets to or removing sheets from the workbook.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Writing Values to Cells\n",
"\n",
"Writing values to cells is much like writing values to keys in a dictionary. Enter this into the interactive shell:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import openpyxl\n",
"wb = openpyxl.Workbook()\n",
"Sheet = wb.get_sheet_by_name('Sheet')\n",
"sheet['A1'] = 'Hello world!'\n",
"sheet['A1'].value\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you have the cell’s coordinate as a string, you can use it just like a dictionary key on the Worksheet object to specify which cell to write to.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Project: Updating a Spreadsheet\n",
"\n",
"In this project, you’ll write a program to update cells in a spreadsheet of produce sales. Your program will look through the spreadsheet, find specific kinds of produce, and update their prices. Download this spreadsheet from http://nostarch.com/automatestuff/. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each row represents an individual sale. The columns are the type of produce sold (A), the cost per pound of that produce (B), the number of pounds sold (C), and the total revenue from the sale (D). The `TOTAL` column is set to the Excel formula `=ROUND(B3*C3, 2)`, which multiplies the cost per pound by the number of pounds sold and rounds the result to the nearest cent. With this formula, the cells in the `TOTAL` column will automatically update themselves if there is a change in column B or C.\n",
"\n",
"Now imagine that the prices of garlic, celery, and lemons were entered incorrectly, leaving you with the boring task of going through thousands of rows in this spreadsheet to update the cost per pound for any garlic, celery, and lemon rows. You can’t do a simple find-and-replace for the price because there might be other items with the same price that you don’t want to mistakenly “correct.” For thousands of rows, this would take hours to do by hand. But you can write a program that can accomplish this in seconds."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Your program does the following:\n",
"\n",
"- Loops over all the rows.\n",
"\n",
"- If the row is for garlic, celery, or lemons, changes the price.\n",
"\n",
"This means your code will need to do the following:\n",
"\n",
"- Open the spreadsheet file.\n",
"\n",
"- For each row, check whether the value in column A is *Celery, Garlic, or Lemon*.\n",
"\n",
"- If it is, update the price in column B.\n",
"\n",
"- Save the spreadsheet to a new file (so that you don’t lose the old spreadsheet, just in case).\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 1: Set Up a Data Structure with the Update Information"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The prices that you need to update are as follows:\n",
"\n",
"`Celery 1.19`\n",
"\n",
"`Garlic 3.07`\n",
"\n",
"`Lemon 1.27`\n",
"\n",
"You could write code like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if produceName == 'Celery':\n",
" cellObj = 1.19\n",
"if produceName == 'Garlic':\n",
" cellObj = 3.07\n",
"if produceName == 'Lemon':\n",
" cellObj = 1.27"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Having the produce and updated price data hardcoded like this is a bit inelegant. If you needed to update the spreadsheet again with different prices or different produce, you would have to change a lot of the code. Every time you change code, you risk introducing bugs.\n",
"\n",
"A more flexible solution is to store the corrected price information in a dictionary and write your code to use this data structure. In a new file editor window, enter the following code:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#! python3\n",
"# updateProduce.py - Corrects costs in produce sales spreadsheet.\n",
"\n",
"import openpyxl\n",
"\n",
"wb = openpyxl.load_workbook('produceSales.xlsx')\n",
"sheet = wb.get_sheet_by_name('Sheet')\n",
"\n",
"# The produce types and their updated prices\n",
"PRICE_UPDATES = {'Garlic': 3.07,\n",
" 'Celery': 1.19,\n",
" 'Lemon': 1.27}\n",
"\n",
"# TODO: Loop through the rows and update the prices."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Save this as updateProduce.py. If you need to update the spreadsheet again, you’ll need to update only the PRICE_UPDATES dictionary, not any other code."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 2: Check All Rows and Update Incorrect Prices\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next part of the program will loop through all the rows in the spreadsheet. Add the following code to the bottom of `updateProduce.py`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#! python3\n",
"# updateProduce.py - Corrects costs in produce sales spreadsheet.\n",
"\n",
"--snip--"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
" # Loop through the rows and update the prices.\n",
"for rowNum in range(2, sheet.max_row): # skip the first row\n",
" produceName = sheet.cell(row=rowNum, column=1).value\n",
" if produceName in PRICE_UPDATES:\n",
" sheet.cell(row=rowNum, column=2).value = PRICE_UPDATES[produceName]\n",
" \n",
"wb.save('updatedProduceSales.xlsx')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We loop through the rows starting at row 2, since row 1 is just the header. The cell in column 1 (that is, column A) will be stored in the variable `produceName`. If produceName exists as a key in the `PRICE_UPDATES` dictionary, then you know this is a row that must have its price corrected. The correct price will be in `PRICE_UPDATES[produceName]`.\n",
"\n",
"Notice how clean using `PRICE_UPDATES` makes the code. Only one if statement, rather than code like `if produceName == 'Garlic':`, is necessary for every type of produce to update. And since the code uses the `PRICE_UPDATES` dictionary instead of hardcoding the produce names and updated costs into the for loop, you modify only the `PRICE_UPDATES` dictionary and not the code if the produce sales spreadsheet needs additional changes.\n",
"\n",
"After going through the entire spreadsheet and making changes, the code saves the Workbook object to `updatedProduceSales.xlsx`. It doesn’t overwrite the old spreadsheet just in case there’s a bug in your program and the updated spreadsheet is wrong. After checking that the updated spreadsheet looks right, you can delete the old spreadsheet.\n",
"\n",
"You can download the complete source code for this program from http://nostarch.com/automatestuff/."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Ideas for Similar Programs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since many office workers use Excel spreadsheets all the time, a program that can automatically edit and write Excel files could be really useful. Such a program could do the following:\n",
"\n",
"- Read data from one spreadsheet and write it to parts of other spreadsheets.\n",
"\n",
"- Read data from websites, text files, or the clipboard and write it to a spreadsheet.\n",
"\n",
"- Automatically “clean up” data in spreadsheets. For example, it could use regular expressions to read multiple formats of phone numbers and edit them to a single, standard format.\n",
"\n"
]
}
],
"metadata": {
"language_info": {
"name": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}