With the 2015 Cricket World Cup in full swing, there’s all sorts of folks who are interested in digging into the details about their home team, their upcoming rivals or their favorite players. With all the data available, how do you get it into shape so that you can analyze what’s important to you (or your team)? This is a job for Excel and its data analysis tools—Power Query and Power View.
In this post, we explore how to grab Cricket data from a handful of different sources, mash it up using Excel’s Power Query and then, with Power View, visualize the results in ways that are informative, interactive, interesting and even cool.
One of the PowerView sheets from one of ten Power View sheets in the workbook.
Want to follow along, or see all this for yourself? You can download the workbook and check out what we’ve done: WCRecords_Cricket_v2.xlsx
Let’s take a look at the steps taken to create this workbook. Along the way, feel free to imagine how a data analysis project of your own might be similar.
Find interesting data
First, we need to find some interesting data, and there’s an ocean of interesting data online. For the Cricket World Cup, we used www.cricketarchive.com, which has all sorts of public data organized into tables based on various slices—players, teams, seasons—that is interesting to analyze, mash-up and visualize. (The term mash-up means the same as transform, it’s just more fun to say). Now that we found a data source, we are ready to bring it into our workbook and begin to shape or merge it in ways that were interesting to us. That’s a job for Power Query.
Connect to the data using Power Query
Power Query excels (no pun intended) at connecting to all sorts of different data sources. Whether the data is in a workbook, in a company database, an OData feed or some website that has nifty tables posted, Power Query can connect to any—or all of them. But what’s even more compelling is that Power Query then allows you to take all those different connections and shape, merge, append or otherwise mash them up however you want.
In our case, we wanted to connect to a bunch of the tables from www.cricketarchive.com and create connections among them to see how different players, teams, and countries or regions have fared in their competitions over the years.
For example, we connected to the cricket archive from Power Query to make data available within our workbook. From there we were able to make a bunch of other connections to that site, all of which are available in the workbook. But it’s not enough just to link to data, once we have a connection to the data, we want to shape and transform it in ways that meet our needs.
Clean up and transform data using Power Query
Connecting to online data is pretty cool, but that’s just the tip of the Power Query iceberg of functionality. Beneath the surface, Power Query is a powerhouse of data transformation, letting you shape any data source in all sorts of ways—without changing the underlying data source. That point is worth clarifying: when you transform data (remove columns, change data types, merge data sources, so on) using Power Query, you only transform your view (or instance) of the data. The underlying data source remains unchanged.
Let’s use an analogy, to illustrate this important point. Let’s say you have an app on your phone that displays pictures. With that app, you can put special lenses on any picture to give the picture a special hue, warp it—or apply other fun transformations to the picture. You apply these transformations to the picture, so that your view of the photo is how you want it—but the original photo remains unchanged. The same goes with Power Query transformations—you can adjust your view of the underlying data sources, and bring your view into Excel for further analysis, but the underlying data sources remain unchanged.
Back to our cricket data. In this workbook we performed transformations such as:
- Renamed columns
- Split columns
- Changed data types
- Removed null values from columns
- Merged with additional data sets
- Other transformations
We specified these transformations in Query Editor, which is where Power Query queries are created, modified, managed or changed. You can open Query Editor from the Power Query ribbon in Excel. Simply select Show Pane to display all queries in the workbook and then double-click the query from the Workbook Queries pane.
Steps to launch the Query Editor in order to specify transformations in the selected query.
Once the query is opened in the Query Editor window, transformations made in the query are recorded in the Applied Steps section, in the order they were applied. When Power Query runs the query again (remember, against the original and unchanged data sources) your view of that data (transformed by your applied steps) can be created again with fidelity.
We applied a bunch of transformations—on a bunch of different tables we found at www.cricketarchive.com—and had a pretty good data model against which we could start our analysis. And perhaps the most interesting and compelling way of analyzing data is doing so with visualizations.
To see a step-by-step tutorial on how to combine and transform data using Power Query, check out this article. And remember, you can also download the workbook and play with it all you want.
Now that we used Power Query to transform the data, visualizing the data is a job for Power View. And that brings us to our final step.
Visualize the data using Power View
With the data transformed, it’s time to glean interesting facts, trends and insights from what we how have. To create visualizations, we need a new Power View report sheet, which is created as its own sheet in the workbook. To insert a Power View sheet, select Power View on the ribbon and then click the Power View button.
When a new Power View sheet is created, the Power View Fields pane is displayed, and beside it, a blank canvas awaits your creativity.
Blank Power View sheet.
To get started, simply drag a field from Power View Fields and drop it onto the canvas. Power View creates a basic visualization, based on the field added to the canvas. Note that you can create visualizations by dragging fields to the box at the bottom of the Power View Fields pane—either approach creates a visualization. Drag another field onto the canvas (even onto the visualization Power View just created), and the fun begins.
You can play around with the fields and visualizations until you get the desired visualization.
The following image shows two fields that have been dropped onto the canvas and a third being added. Ground was added first and then Runs margin was dropped on top of it. Losing team is being added to the existing visualization.
The next image shows the Power View visualization after the Losing team field is dropped onto the canvas:
Looks pretty basic, we know. But with a bit of creativity, or some intent on what you want to analyze and learn, you can create all sorts of interesting visualizations, or collections thereof, with Power View. To learn more about how to create visualizations, take a look at this series of tutorials.
The following three images show additional Power View visualizations. The first one shows centuries by players over many years of the Cricket World Cup, the second demonstrates how to interact with Power View visualizations, and the third shows batsman aggregates across Cricket World Cups.
The Power View reports are interactive, too. Here’s the same report, with 2011 selected:
And here’s another Power View report, with a completely different visualization:
There are many more Power View reports in the workbook, which you’re welcome to check out for yourself—just download the workbook and explore.
We hope you enjoyed this post, and best of luck to you and your favorite cricket team! We look forward to writing more posts in the future.
—Selvakumar Rajakumar, senior support escalation engineer for the SQL Business Intelligence team (CSS)
—Muthukumaran Arumugam, support engineer for the SQL Business Intelligence team (CSS)
—Carla Sabotta, senior content developer for the SQL Server team
—David Iseminger, senior content developer for the Power BI team
Power Query for Excel is available with an Office 365 ProPlus subscription, Office 2010 Professional Plus with Software Assurance, Office 2013 Professional Plus or Excel 2013 Standalone. Learn more about how to get started with Power Query for Excel.
Power View for Excel is available with an Office 365 ProPlus subscription, Office 2013 Professional Plus or Excel 2013 Standalone. Learn more about how to get started with Power View for Excel.
Learn about all the powerful data analysis features in Excel and take your analysis further by sharing and collaborating on business insights with colleagues using Power BI.