Skip to content
Snippets Groups Projects
Working_with_excel_spreadsheets.ipynb 43.7 KiB
Newer Older
  • Learn to ignore specific revisions
  •    "cell_type": "code",
       "execution_count": null,
       "metadata": {},
       "outputs": [],
       "source": [
        "wb.remove_sheet(wb.get_sheet_by_name('Middle Sheet'))\n",
        "wb.remove_sheet(wb.get_sheet_by_name('Sheet1'))\n",
        "wb.get_sheet_names()"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "The `remove_sheet()` method takes a Worksheet object, not a string of the sheet name, as its argument. If you know only the name of a sheet you want to remove, call `get_sheet_by_name()` and pass its return value into `remove_sheet()`.\n",
        "\n",
        "Remember to call the `save()` method to save the changes after adding sheets to or removing sheets from the workbook.\n"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "### Writing Values to Cells\n",
        "\n",
        "Writing values to cells is much like writing values to keys in a dictionary. Enter this into the interactive shell:\n"
       ]
      },
      {
       "cell_type": "code",
       "execution_count": null,
       "metadata": {},
       "outputs": [],
       "source": [
        "import openpyxl\n",
        "wb = openpyxl.Workbook()\n",
        "Sheet = wb.get_sheet_by_name('Sheet')\n",
        "sheet['A1'] = 'Hello world!'\n",
        "sheet['A1'].value\n"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "If you have the cell’s coordinate as a string, you can use it just like a dictionary key on the Worksheet object to specify which cell to write to.\n"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "### Project: Updating a Spreadsheet\n",
        "\n",
        "In this project, you’ll write a program to update cells in a spreadsheet of produce sales. Your program will look through the spreadsheet, find specific kinds of produce, and update their prices. Download this spreadsheet from http://nostarch.com/automatestuff/. "
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Each row represents an individual sale. The columns are the type of produce sold (A), the cost per pound of that produce (B), the number of pounds sold (C), and the total revenue from the sale (D). The `TOTAL` column is set to the Excel formula `=ROUND(B3*C3, 2)`, which multiplies the cost per pound by the number of pounds sold and rounds the result to the nearest cent. With this formula, the cells in the `TOTAL` column will automatically update themselves if there is a change in column B or C.\n",
        "\n",
        "Now imagine that the prices of garlic, celery, and lemons were entered incorrectly, leaving you with the boring task of going through thousands of rows in this spreadsheet to update the cost per pound for any garlic, celery, and lemon rows. You can’t do a simple find-and-replace for the price because there might be other items with the same price that you don’t want to mistakenly “correct.” For thousands of rows, this would take hours to do by hand. But you can write a program that can accomplish this in seconds."
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Your program does the following:\n",
        "\n",
        "- Loops over all the rows.\n",
        "\n",
        "- If the row is for garlic, celery, or lemons, changes the price.\n",
        "\n",
        "This means your code will need to do the following:\n",
        "\n",
        "- Open the spreadsheet file.\n",
        "\n",
        "- For each row, check whether the value in column A is *Celery, Garlic, or Lemon*.\n",
        "\n",
        "- If it is, update the price in column B.\n",
        "\n",
        "- Save the spreadsheet to a new file (so that you don’t lose the old spreadsheet, just in case).\n",
        "\n"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "### Step 1: Set Up a Data Structure with the Update Information"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "The prices that you need to update are as follows:\n",
        "\n",
        "`Celery 1.19`\n",
        "\n",
        "`Garlic 3.07`\n",
        "\n",
        "`Lemon 1.27`\n",
        "\n",
        "You could write code like this:"
       ]
      },
      {
       "cell_type": "code",
       "execution_count": null,
       "metadata": {},
       "outputs": [],
       "source": [
        "if produceName == 'Celery':\n",
        "    cellObj = 1.19\n",
        "if produceName == 'Garlic':\n",
        "    cellObj = 3.07\n",
        "if produceName == 'Lemon':\n",
        "    cellObj = 1.27"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Having the produce and updated price data hardcoded like this is a bit inelegant. If you needed to update the spreadsheet again with different prices or different produce, you would have to change a lot of the code. Every time you change code, you risk introducing bugs.\n",
        "\n",
        "A more flexible solution is to store the corrected price information in a dictionary and write your code to use this data structure. In a new file editor window, enter the following code:"
       ]
      },
      {
       "cell_type": "code",
       "execution_count": null,
       "metadata": {},
       "outputs": [],
       "source": [
        "#! python3\n",
        "# updateProduce.py - Corrects costs in produce sales spreadsheet.\n",
        "\n",
        "import openpyxl\n",
        "\n",
        "wb = openpyxl.load_workbook('produceSales.xlsx')\n",
        "sheet = wb.get_sheet_by_name('Sheet')\n",
        "\n",
        "# The produce types and their updated prices\n",
        "PRICE_UPDATES = {'Garlic': 3.07,\n",
        "                 'Celery': 1.19,\n",
        "                 'Lemon': 1.27}\n",
        "\n",
        "# TODO: Loop through the rows and update the prices."
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Save this as updateProduce.py. If you need to update the spreadsheet again, you’ll need to update only the PRICE_UPDATES dictionary, not any other code."
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "### Step 2: Check All Rows and Update Incorrect Prices\n"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "The next part of the program will loop through all the rows in the spreadsheet. Add the following code to the bottom of `updateProduce.py`:"
       ]
      },
      {
       "cell_type": "code",
       "execution_count": null,
       "metadata": {},
       "outputs": [],
       "source": [
        "#! python3\n",
        "# updateProduce.py - Corrects costs in produce sales spreadsheet.\n",
        "\n",
        "--snip--"
       ]
      },
      {
       "cell_type": "code",
       "execution_count": null,
       "metadata": {},
       "outputs": [],
       "source": [
        "   # Loop through the rows and update the prices.\n",
        "for rowNum in range(2, sheet.max_row):  # skip the first row\n",
        "    produceName = sheet.cell(row=rowNum, column=1).value\n",
        "    if produceName in PRICE_UPDATES:\n",
        "        sheet.cell(row=rowNum, column=2).value = PRICE_UPDATES[produceName]\n",
        "        \n",
        "wb.save('updatedProduceSales.xlsx')"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "We loop through the rows starting at row 2, since row 1 is just the header. The cell in column 1 (that is, column A) will be stored in the variable `produceName`. If produceName exists as a key in the `PRICE_UPDATES` dictionary, then you know this is a row that must have its price corrected. The correct price will be in `PRICE_UPDATES[produceName]`.\n",
        "\n",
        "Notice how clean using `PRICE_UPDATES` makes the code. Only one if statement, rather than code like `if produceName == 'Garlic':`, is necessary for every type of produce to update. And since the code uses the `PRICE_UPDATES` dictionary instead of hardcoding the produce names and updated costs into the for loop, you modify only the `PRICE_UPDATES` dictionary and not the code if the produce sales spreadsheet needs additional changes.\n",
        "\n",
        "After going through the entire spreadsheet and making changes, the code saves the Workbook object to `updatedProduceSales.xlsx`. It doesn’t overwrite the old spreadsheet just in case there’s a bug in your program and the updated spreadsheet is wrong. After checking that the updated spreadsheet looks right, you can delete the old spreadsheet.\n",
        "\n",
        "You can download the complete source code for this program from http://nostarch.com/automatestuff/."
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "### Ideas for Similar Programs"
       ]
      },
      {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
        "Since many office workers use Excel spreadsheets all the time, a program that can automatically edit and write Excel files could be really useful. Such a program could do the following:\n",
        "\n",
        "- Read data from one spreadsheet and write it to parts of other spreadsheets.\n",
        "\n",
        "- Read data from websites, text files, or the clipboard and write it to a spreadsheet.\n",
        "\n",
        "- Automatically “clean up” data in spreadsheets. For example, it could use regular expressions to read multiple formats of phone numbers and edit them to a single, standard format.\n",
        "\n"
       ]
      }
     ],
     "metadata": {
      "language_info": {
       "name": "python",
       "pygments_lexer": "ipython3"
      }
     },
     "nbformat": 4,
     "nbformat_minor": 2
    }