Welcome to Python Excels!
This is Python Excels, a series of blog posts that describe different techniques for automating tasks in Excel with the Python language.
It’s time to give back. And get back. I’ll be sharing some ideas from the Python programs I’ve written, and talk about what problems were solved. I like to share work that is as complete as possible, so in most cases you should be able to download the script and try it right away for yourself.
Getting started with Python is easy, you just need to download the Python installation package and install onto your computer.
Long before I even started thinking about writing a blog covering the integration possibilities between Excel and Python, the folks at Simplistix were crafting some of the handiest Python modules around: xlrd, xlwt and xlutils.
After completing the last example, I wanted to try something a little more interesting with XLWT this time around. This article describes a short script that uses Python and XLWT to download some raw data from the web, parse it, and write a spreadsheet with a new column derived from the data.
Now it’s getting interesting. Reading and writing spreadsheets with XLRD and XLWT (as discussed in the two previous posts is sufficient for many tasks and you don’t even need a copy of Excel to do it. But to really open up your data and fully extract all the information possible, you’ll need Excel and its powerful set of functions and features like pivot tables and charting.
To get you started, I’ve illustrated a number of common tasks you can do with Python and Excel. Each program below is a self contained example, just copy it, paste it and run it.
A handy feature in Excel is the ability to quickly record a Visual Basic (VB) macro and save it. It’s also fairly simple to take a captured VB macro, tweak it slightly and use it in your Python scripts. I’ve used this capability dozens of times over the years to capture a sequence of operations that modify a spreadsheet and build a pivot table or chart, then integrate the macro into a Python script. It wasn’t always apparent to me how I could take the macro and use it within Python or other scripting languages beside Visual Basic, but I now have a pretty good grasp and want to share the technique with you.
The last post introduced a technique for recording a Visual Basic macro within Excel and migrating it to Python. This exercise will build on those techniques while leveraging Python for more of the work.
The previous posts have used Excel and Python to create and manipulate small spreadsheets. In reality, Python and Excel are especially well suited to tackling large data sets. This post will illustrate some techniques for cleaning up data downloaded from corporate ERP systems such as SAP and Oracle, and getting it ready for some serious data mining with Excel.
In my professional life, I’ve found few people who have a solid knowledge of pivot tables and are really comfortable using them in Excel. If you aren’t aware of pivot tables or haven’t had the time to try out this function in Excel, pivot tables provide a way to cross tabulate, sort, segregate and aggregate tabular data, enabling you to quickly summarize data and extract totals, averages, and other information from the source data.
In this post I’ll develop and explain the Python code to create a set of pivot tables automatically in worksheet.
Python and Excel can help you quickly clean up a spreadsheet, organize data and build useful reports in very few lines of code. Another useful data preparation technique is to build new columns of information based on the available data, as shown in this post.
If you’re going to offer your Windows users a new utility, you better provide an icon to click and let them drag stuff onto it. And if something goes wrong, you better have a decent error message. This post will take the Pivot Table generation script developed in the Extending Pivot Table Data post and turn it into a user friendly Windows program with better flexibility and improved user experience.
Here’s another application for Python and Excel: parsing a collection of spreadsheets for specific data.