I have been spending quite a bit of time lately thinking about how to improve my processes. Here is another example.
I was recently asked to review the setup of a number of schools that I am a part of supporting. At the moment there are nearly three hundred schools. Once upon a time when there were only fifty or so schools. Back then, such a task was tedious, but not that time consuming. However, as the numbers increase, such reviews start to become a cumbersome affair.
The focus of my analysis was understanding the rollover rules for each semester. This started with the creation of a unique list of all the schools in the data set using the UNIQUE formula. With this list, I also created an additional number column in which I numbered each school. I then used a VLOOKUP to add this information to the data so that I could visually break-up the data using conditional formatting with the custom formula.
= ISEVEN($B2)
For the next step, I created a data validation to help categorise what sort of setup a school had. In addition to this, I created a data validation for the issues column produced from the column itself. To do this I created a separate UNIQUE column, summarising all the issues. I then referenced the UNIQUE column and simply chose ‘show warning’ for an invalid responses. This way, if there was an issue I had previously entered, I simply selected them from the dropdown list.
Once complete, I created a series of queries MATCHED with the various issues and then used a JOIN to summarise all the schools associated with each issue in one cell:
=ARRAYFORMULA(JOIN(", ",TRANSPOSE(QUERY(A3:J202,"SELECT A WHERE J matches '.*Description of the issue.*'"))))
On reflection, I wondered how I could cut down on this manual process. My intent was to create a template in which I could load the raw data and then have it broken down for me using formulas.
After spending a lot of time trying to reverse engineer the output I wanted using various IF, FILTER and UNIQUE formulas, I realised that rather than trying to embed everything in one formula, I maybe better suited in creating an additional table to help further organise the data.
The first step was to pull the data into this helper table using a QUERY. This means that once loading external data into sheet1 (the default first sheet), I just needed to refresh the query for everything to work.
In addition to this, I created a unique reference for each line by joining together the different pieces of information associated with each line to make a unique reference.
=TEXT(JOIN("",C2:F2),"")
In a new tab, I put together an analysis of the schools. I started this by generating a UNIQUE list of all the schools.
=UNIQUE(DATA!A2:A)
Using this data as a reference, I then created a query for each school, joining the different rollover rules together.
=(JOIN("|",QUERY(DATA!A$2:I,"SELECT I WHERE A = '"&A2&"'")))
I then used this data to produce a summary of the data. This began with a UNIQUE list of combined rollover rules.
=UNIQUE(QUERY(SCHOOLS!A2:C,"SELECT B,C WHERE A IS NOT NULL"))
Using this as a reference, I queried all the schools with the same rule and joined this data together.
="E"&JOIN(",E",QUERY(SCHOOLS!A2:B,"SELECT A WHERE B ='"&C2&"'"))
I also created another columns in which I used a SWITCH formula to describe each of the sets of rules in a more meaningful manner beginning with a category: “standard”, “two semesters” and “needs to be fixed”. Associated with this, I created a column which pulled this category and then used this information to create a series of conditional formulas in order to visually differentiate the different setups.
For anyone interested, a copy of the template can be found here.
After completing this exercise, I have found that I can easily adjust this template for any other data I wish to analysis quickly. It is also testament to the power of making your own tools. I am left wondering if I could use macros or scripts to improve the process, but for now my formulas all duct taped together is working for me. As Tom Woodward sums up:
I’m not a programmer so I have to find ways to get things done until I learn more.
Getting Things Done with Google Sheets – Creating a Template to Review Data by Aaron Davis is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Each year in my work, we ask the question as to how might we improve efficiencies associated with the end of year process. In the…