Whether you share it or not, documenting and recording your process as you go along has its own rewards: You’ll start to see the work you’re doing more clearly and feel like you’re making progress. Austin Kleon ‘Show Your Work!’

The team I am a part of was recently asked if there was any work that we could possibly automate to save time and effort. Leaving aside the strange question of asking the boiling frog how the boiling process might be sped up, I was left thinking about the limits of automation when what is trying to be automated is not always straight-forward and often involves contextual knowledge. My suggestion in response was not to automate the task, but possibly automate the process.

Ever since I have stepped into my new role in the technical team, I have made a conscious effort to capture everything that I have learnt and store it in a canonical space as one of the issues I had early on was that it was too difficult to find anything. This then allows me to produce repeatable processes. (In some respects I imagine that organising information in this manner actually makes it easier for an AI assistant to support?)

In addition to automating the day-to-day, I recently built out a spreadsheet to help with a particular task of cleaning up duplicate records across the database. Given a list of hundreds of records to review, I felt that saving any time and effort was going to make a difference. My thought was to create a series of declare statements to use to query the database, then copy the raw data, review it, create tasks for the relevant teams and produce a summary of the records changed and those that remained the same.

Doubling Down on the Declare

One of the issues I initially found when producing my query was updating all the variables each time. It therefore occurred to me that rather than changing the names and various IDs each time, which was always fiddly and prone to error, I could create a series of declare statements that I updated with the SQL statement each time. This allowed me to leave the rest of the query the same:

="
DECLARE @Sur1 VARCHAR(30) = '"&D2&"'
DECLARE @Sur2 VARCHAR(30) = 'SUR2'
DECLARE @Pre1 VARCHAR(30) = '"&E2&"'
DECLARE @Pre2 VARCHAR(30) = 'PRE2'
DECLARE @Login1 VARCHAR(30) = '"&H2&"'
DECLARE @Login2 VARCHAR(30) = 'CECVIDS\UPN2'
DECLARE @Login3 VARCHAR(30) = 'CECVIDS\UPN3'
DECLARE @CNUM1 VARCHAR(15) = 'CX'
DECLARE @CNUM2 VARCHAR(15) = 'CX'
DECLARE @CNUM3 VARCHAR(15) = 'CX'"

In addition to this, I stripped out any additional information that was included within a text string of possible information and had this represented on unique lines after the initial list:

--"&IFNA(JOIN(CHAR(10)&"--",ARRAYFORMULA(LEFT(QUERY(TRIM(MID(SUBSTITUTE(J2, ":", REPT(" ", LEN(J2))), LEN(J2)*ROW(INDIRECT("1:"&LEN(J2)))-LEN(J2)+1, LEN(J2))),"SELECT * WHERE Col1 LIKE 'C%'"),16))))&""

The output then looks something like this:

--"
DECLARE @Sur1 VARCHAR(30) = 'Davis'
DECLARE @Sur2 VARCHAR(30) = 'SUR2'
DECLARE @Pre1 VARCHAR(30) = 'Aaron'
DECLARE @Pre2 VARCHAR(30) = 'PRE2'
DECLARE @Login1 VARCHAR(30) = 'CECVIDS\01234567'
DECLARE @Login2 VARCHAR(30) = 'CECVIDS\UPN2'
DECLARE @Login3 VARCHAR(30) = 'CECVIDS\UPN3'
DECLARE @CNUM1 VARCHAR(15) = 'C0000001'
DECLARE @CNUM2 VARCHAR(15) = 'CX'
DECLARE @CNUM3 VARCHAR(15) = 'CX'
--CECVIDS\30000001
--C1000001"

I commented the additional information out so that I could use it if required or simply run the updated script it was. It also meant that I did not have to delete the quotations at the start and end that come across when copying from Google Sheets every time. I did tinker with how I could get the additional logins and numbers to prepopulate within the declare statement, but hit a wall as the number of additional values was not consistent.

Streamlining the Analysis and Fix

Once I had all the records in the system, I then had to organise them into those to be retained and those to be updated. The first step was to create a conditional formulas using the following custom formula that quickly highlighted possible duplicate records:

=COUNTIFS(L:L, L1, L:L, "<>BirthDate")

This allowed me to then reorder the list and review the rest of the details. Once I had done this, I then created a summary of the data.
Often I create a config tab where I keep tab where I create various formulas that I can refer to. This is particular important when using variable NAMED RANGES. As I wanted to make this spreadsheet somewhat self-contained, I decided to just use the first row of the sheet, which I then hid. This included a summary of the different community names in a single cell:

=LEFT(JOIN("/",UNIQUE(QUERY(F2:F,"SELECT F WHERE NOT F = 'NameExternal'"))),LEN(JOIN("/",UNIQUE(QUERY(F2:F,"SELECT F WHERE NOT F = 'NameExternal'"))))-1)


A summary of staff numbers in a single cell:

=IFNA(JOIN(",",UNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC"))))


And a summary of login records in a cell

IFNA(JOIN(", ",UNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CEC%'"))))

Each was given a Named Range that could be easily referred to in other formulas.

Initially, I created a formula which combined these different attributes, but I realised I then had to explain what sort of duplicate it was. Overall, there are four types of duplicate records:

  1. User with multiple staff numbers and login records
  2. User with multiple login records
  3. User with multiple staff numbers
  4. User with multiple community records

To accommodate these differences, I created an formula using the named ranges to combine the IF and AND formulas:

=IF( AND( IFNA(COUNTUNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC")))>1, IFNA(COUNTUNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CEC%' ORDER BY H")))>1), F1&" has multiple staff numbers: "&G1&" and multiple login records: "&SUBSTITUTE(H1,"IDS\",""),
IF(AND(IFNA(COUNTUNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC")))=1, IFNA(COUNTUNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CEC%' ORDER BY H")))>1), F1&" ( "&G1&" ) has multiple login records: "&SUBSTITUTE(H1,"IDS\",""),
IF(AND(IFNA(COUNTUNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC")))=1, IFNA(COUNTUNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CEC%' ORDER BY H")))=1), F1&" ( "&G1&" / logins: "&SUBSTITUTE(H1,"IDS\","")&") has duplicate community records",
IF(AND(IFNA(COUNTUNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC")))>1, IFNA(COUNTUNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CECV%' ORDER BY H")))=1), F1&" logins: "&SUBSTITUTE(H1,"IDS\"," ")&" has multiple staff numbers: "&G1&"",FALSE))))

Using the IF and AND combination allowed me to quickly and easily produce a summary of the issue in one sentence to raise in our incident management system.

In addition to this, I created a SWITCH formula to provide the server listener name, as opposed to the IP address provided in the data export:

=IFNA(SWITCH(B2:B,
"IP Address 1","Listener Name 1",
"IP Address 2","Listener Name 2",
"IP Address 3","Listener Name 3",
"IP Address 4","Listener Name 4",
"IP Address 5","Listener Name 5"))

Summarising the Changes

In addition to the summary of issue, I also created a summary of changes that I could easily copy and paste into the incident notes once I had cleaned up the various duplicates:

={"","","","","","";
"Record(s) updated / merged:","","","","","";
"","","","","","";
SORT(Indirect(Table1));
"","","","","","";
"Record(s) retained:","","","","","";
"","","","","","";
SORT(Indirect(Table2))}

As the number of records varied, I created a NAMED RANGE for records updated / merged and records retained and used the INDIRECT formula to refer to them.

The formula used for Table1 was:

="C2:H"&MATCH("Database Name", Indirect(CThreeC), 0) + ROW(INDIRECT(CThree)) - 2

While the formula used for Table2 was:

=ADDRESS(MATCH("Database Name", INDIRECT(CThreeC), 0) + ROW(INDIRECT(CThree))-1, COLUMN(INDIRECT(CThree)))&":H"

As I was continually moving and deleting rows around in the spreadsheet, I found that it kept on breaking these named ranges, so that is why I created two further NAMED RANGES.

This was formula used for CThree:

="$C$3"

And this was formula used for CThreeC:

="$C$3:$C"

Again, all this ‘magic‘ was hidden in the top row of the spreadsheet.

An example of the spreadsheet with the various formulas can be found here.


There has been a lot spoken at my workplace lately about what it means to be proud. Personally, fixing 250+ duplicate records is not necessarily something I am ‘proud’ of or excited by, but I am proud of my ability to improve processes and identify ways in which my actions can make it easier for others to do their job. The challenge I often have is being able to hand over such solutions to others. Most prefer their own way of working, even if that means complaining that they do not have time for this or that. I feel that the issue is finding the right balance between improving a process and creating more work.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

The power of query in sorting out data in Sheets

I get a lot of requests for writing various queries, focusing on different information. Although I am always happy to help out, I am also interested in potential of creating spreadsheet templates that people can copy and use themselves. One template I developed recently was build around interpreting different data exports.

Often these export files have a lot of information, which people get lost in. Although they could just delete the columns that they do not want, the feedback is that that process is tedious. Therefore, I created a spreadsheet where users can import a CSV and then select the fields that they want to work with. I did this using a QUERY formula and a series of checkboxes associated with which columns to SELECT and which columns to ORDER BY.

="{QUERY(Sheet1!A1:CJ,"&char(34)&"SELECT "&JOIN(", ",QUERY(A3:D,"SELECT C WHERE D = TRUE"))&" ORDER BY "&JOIN(", ",IFNA(QUERY(A3:E,"SELECT C WHERE E = TRUE"),"Col1"))&char(34)&",1)}"

Users can then copy the query formula and run it in a separate tab.

The current workflow can be summarised as follows:

  1. Create a copy of the Google Sheet template.
  2. Import the data spreadsheet (File > Import > Upload > Browse > Open > Insert New Sheet(s) > Import data). This will bring the data into a new tab ‘Sheet1’
  3. In the SUMMARY tab, refresh the formula in cell A3, then use the checkboxes to select which fields to be displayed and the field(s) to order by.
  4. Copy the formula created in Cell B2 and paste it in Cell A1 in the data tab, adding ‘=’ at the start to activate it

Although this works fine, I was left wondering if it was possible to instead display the data dynamically, rather than copying and pasting the formula. I imagine that I would need some sort of ‘TO_FORMULA’ type of function, where the text is translated into a formula.

As always, thoughts and comments welcome.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

The Web is binding not just pages but us human beings in new ways. We are the true "small pieces" of the Web, and we are loosely joining ourselves in ways that we're still inventing. David Weinberger Small Pieces, Loosely Joined

Each year in my work, we ask the question as to how might we improve efficiencies associated with the end of year process. In the past I have created a database of school contacts, a process for generating timetables and a template for reviewing data. Two particular requests that were raised this year were the desire to reduce emails associated with updates and personalise correspondences sent to schools.

A Dashboard of One’s Own

In the past we used to have a dashboard with a dropdown as to who completed each task for each school. The problem with this was two fold. One, schools had no visibility to where the end of year process is at as this was only visible to support staff. Two, there was a lot of dependency on emails to know where things are at.

Associated with this, we shared a template with each school which included form allocations. My solution was to create a tab in the school template with a checklist that provided schools with clarity where the process was at and then pull this information into a shared spreadsheet so that the support team had an understanding where things were at. The challenge with this was to create and collect the templates in an efficient manner.

The process for generating the templates in the past was to manually create a copy using the /copy method and then sharing this with schools. This always felt cumbersome. I wondered if there was a more efficient method for creating the templates. I also wanted a more efficient method for collecting all the links. I remember Alice Keeler created a Google Sheets Add-on for making a copy of rubrics, so I knew that it was technically possible. Googling the problem, I found a video (and template) from Carl Arrowsmith that walked through using Google Apps Script to make multiple copies of same file.

What I like about finding such resources is getting into the code and unpacking how it all works, and then making my own changes, especially when the person sharing leaves various notes within the code to guide you. I have always wanted to dig further into Apps Script, but never really found the time to start from scratch. A couple of changes I made was including the school name and school number in both the template and title of the document. You can find a copy of my updates here.

Once I had made the changes, I generated a template for each school. This not only created all the copies, but also provided a copy of each URL in the spreadsheet. I then brought this list back into a dashboard and imported the checklist using the following formula:

=TRANSPOSE(IMPORTRANGE("school spreadsheet","Sheet1!B3:B")

In addition to adding conditional formatting to highlight whether the task was complete or waiting, I added a status cell to provide an update at a glance:

=IF($S2=FALSE,"Waiting on Support",IF($T2=FALSE,"Waiting on School",IF($U2=FALSE,"Waiting on Support",IF($V2=FALSE,"Waiting on Support",IF($W2=FALSE,"Waiting on School",IF($X2=FALSE,"Waiting on Support",IF($Y2=FALSE,"Waiting on Support",IF($Z2=FALSE,"Waiting on Support",IF($AA2=FALSE,"Waiting on School",IF($AB2=FALSE,"Waiting on Support",IF($AC2=FALSE,"Waiting on Support",IF($AD2=FALSE,"Waiting on Support","All Complete"))))))))))))

Although this summary did not provide details about who completed each task, it did however provide more visibility. I guess you can’t have everything?

Personalising Emails

The other improvement related to sending out an email to 300+ schools. In the past we would just use an Outlook template. However, as each of the emails contained a unique link, this no longer worked. As I had a list of these links in a spreadsheet, I worked out that I could just create a unique email for each school, with the link being a variable:

="Dear to whom it may concern,</code></p>
Below you will find a link to the Google Sheet.</p>
"&Sheet1!R4&"
Instructions for filling in this information can be found in the End of Year guide.</p>
School Support"

It then occurred to me that as I had a list of the staff associated with each school that maybe I could replace the cold ‘Dear to whom it may concern’ with ‘Dear NAME’. The problem I was faced with though is that I had all the names in one cell, with each on a new line:

Mahatma Gandhi

Virginia Woolf

Fiona Hardy

My first step then was to SUBSTITUTE the new line (CHAR(10)) with a comma:

=SUBSTITUTE(C4,CHAR(10),", ")

However, I was then left with the following:

Mahatma Gandhi, Virginia Woolf, Fiona Hardy

My next task was to somehow replace the last comma in the string with an ‘and’. Unsure how to go about it, I went online and found a REGEXREPLACE on the Infoinspired website that achieved the desired outcome:

=regexreplace(A1, "(.*),", "$1 and")

This then provided me with the following:

Mahatma Gandhi, Virginia Woolf and Fiona Hardy

I then wondered if I could somehow remove the surname and leave only the first name in the list. After scratching my head for a while, I wondered if I could QUERY the list and select only the names that were not followed by a comma. I found the following REGEX formula:

"^[^,]+$"

I subsequently, SPLIT and TRANSPOSED my list of names and used a QUERY with a match containing the REGEX formula.

=regexreplace(SUBSTITUTE(JOIN(", ",QUERY(TRANSPOSE(SPLIT(SUBSTITUTE(C4,CHAR(10),", ")&","," ")),"SELECT * WHERE Col1 MATCHES '^[^,]+$'")),CHAR(10),", "), "(.*),", "$1 and")

Returning to my email template, I then replaced the ‘to whom it may concern’ with the formula. This meant I could quickly and easily create an email that addressed the actual users and included their spreadsheet.

="Dear "&regexreplace(SUBSTITUTE(JOIN(", ",QUERY(TRANSPOSE(SPLIT(SUBSTITUTE(C2,CHAR(10),", ")&amp;","," ")),"SELECT * WHERE Col1 MATCHES '^[^,]+$'")),CHAR(10),", "), "(.*),", "$1 and")&",
Below you will find a link to the Google Sheet.
"&Sheet1!R2&"
Instructions for filling in this information can be found in the End of Year guide.
School Support 

Providing the following outcome:

Dear Mahatma, Virginia and Thomas,

Below you will find a link to the Google Sheet.

UNIQUE LINK

Instructions for filling in this information can be found in the End of Year guide.

School Support

I am sure there are more efficient ways to achieve the same outcome using different applications that I do not necessarily have at my disposal. For example, it would be even better to automatically send the the email from the Google Sheet, rather than copy and pasting the text into Outlook. I also must be honest, even though I completed Ben Collins’ REGEX course, I still have not got my head around it all yet, but I feel that the first point of learning is not always knowing how to do something, but actually knowing that it is possible.

As always, comments welcome.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

I snickered at QR codes because I couldn’t see any real use for them. It took a pandemic to reveal their serious utility. And while I generally avoid predicting the future, I suspect QR codes might be with us for a while, because they’re turning out to be extremely useful even outside of the realm of pandemic-adaptation. Restaurants are enjoying not having to print and reprint menus; contactless payments are super convenient even in situations where you’re not worrying about fomites. Clive Thompson ‘4 Lessons From the Improbable Rise of QR Codes’

One of the things that I often struggle with is with the purpose my online meanderings. Although I agree about ‘collecting the dots‘, it can sometimes be hard to justify in amongst the everyday hustle and bustle. However, again and again I find myself diving into something I read long after the fact.

Today I was testing barcodes with an electronic sign-in system. I had a Crystal Report which produces student cards that included a barcode associated with the ID. However, I found that although I could get them to work with the scanner, it was very fiddly and was far from optimal.

Knowing that any adjustments to the Crystal Report would take some time as it involved a number of teams and processes, I wondered if I could produce my own cards using a spreadsheet. I noticed that the third-party application used a QR Code when enrolling a new device which has really quick, so I wondered if I could also use a QR Code for the student cards. After doing an initial test with one student ID to confirm that QR Codes would work, I searched up a post from Ben Collins regarding the generation of QR Codes using Google Sheets. After exporting a test copy of the student data and importing this into the Google Sheet in Sheet1, I created a template where a QR Code was generated for the each student ID:

=ARRAYFORMULA(IMAGE("https://chart.googleapis.com/chart?chs=250x250&cht=qr&chl="&ENCODEURL(QUERY(Sheet1!A2:A,"SELECT A WHERE A IS NOT NULL")) ))

Then in the next column I entered the student details, including first name (Column L), last name (Column K), class/form (Column I), year (Column F) and ID (Column A). I also used Char(10) to separate this information onto different lines.

=ARRAYFORMULA(QUERY(Sheet1!L2:L,"SELECT L WHERE L IS NOT NULL")&" "&QUERY(Sheet1!K2:K,"SELECT K WHERE K IS NOT NULL")&""&Char(10)&"Class "&QUERY(Sheet1!I2:I,"SELECT I WHERE I IS NOT NULL")&" / Year "&QUERY(Sheet1!F2:F,"SELECT F WHERE F IS NOT NULL")&""&Char(10)&"ID "&QUERY(Sheet1!A2:A,"SELECT A WHERE A IS NOT NULL"))

I was then left with a list of cards, the problem is that I wanted to print two columns to a page. To achieve this I created two dynamic named ranges, one for column one:

"Sheet2!A1:C"&(ROUNDUP(COUNTUNIQUE(Sheet2!B1:B)/2))

And column two:

="Sheet2!A"&(ROUNDUP(COUNTUNIQUE(Sheet2!B1:B)/2))+1&":C"&(ROUNDUP(COUNTUNIQUE(Sheet2!B1:B)))

Then in a new tab I used the INDIRECT formula to bring in the two columns.

The last step was to adjust the printer settings. This included extending the margins and changing the scale to 55%.

Knowing that this would be used by different people, I made a copy of the template and deleted Sheet1 that could be used as a master. I then put together the following instructions for how it could be used from scratch.

  1. 1. Create a copy of the QR Code Google Sheet https://docs.google.com/spreadsheets/d/1lzfHDR9229g5xWTb16JPRjKDTBHlix1NmOTJE9fp-ck/copy
  2. Import the student spreadsheet previously saved and Insert a New Sheet
  3. Click on Sheet2
  4. Click on Cell A1 and press Enter
  5. Click on Cell B1 and press Enter
  6. In Cell E2 add ‘=’ at the start of the formula
  7. In Cell F2 add ‘=’ at the start of the formula
  8. Click on StudentCards tab
  9. Click on FIle > Print

Not sure if anyone else has used Google Sheets for similar purposes. Comments appreciated.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

I’m not a programmer so I have to find ways to get things done until I learn more. Tom Woodward ‘Text Acrobatics in Google Spreadsheets’

I have been spending quite a bit of time lately thinking about how to improve my processes. Here is another example.

I was recently asked to review the setup of a number of schools that I am a part of supporting. At the moment there are nearly three hundred schools. Once upon a time when there were only fifty or so schools. Back then, such a task was tedious, but not that time consuming. However, as the numbers increase, such reviews start to become a cumbersome affair.

The focus of my analysis was understanding the rollover rules for each semester. This started with the creation of a unique list of all the schools in the data set using the UNIQUE formula. With this list, I also created an additional number column in which I numbered each school. I then used a VLOOKUP to add this information to the data so that I could visually break-up the data using conditional formatting with the custom formula.

= ISEVEN($B2)

For the next step, I created a data validation to help categorise what sort of setup a school had. In addition to this, I created a data validation for the issues column produced from the column itself. To do this I created a separate UNIQUE column, summarising all the issues. I then referenced the UNIQUE column and simply chose ‘show warning’ for an invalid responses. This way, if there was an issue I had previously entered, I simply selected them from the dropdown list.

Once complete, I created a series of queries MATCHED with the various issues and then used a JOIN to summarise all the schools associated with each issue in one cell:

=ARRAYFORMULA(JOIN(", ",TRANSPOSE(QUERY(A3:J202,"SELECT A WHERE J matches '.*Description of the issue.*'"))))

On reflection, I wondered how I could cut down on this manual process. My intent was to create a template in which I could load the raw data and then have it broken down for me using formulas.

After spending a lot of time trying to reverse engineer the output I wanted using various IF, FILTER and UNIQUE formulas, I realised that rather than trying to embed everything in one formula, I maybe better suited in creating an additional table to help further organise the data.

The first step was to pull the data into this helper table using a QUERY. This means that once loading external data into sheet1 (the default first sheet), I just needed to refresh the query for everything to work.

In addition to this, I created a unique reference for each line by joining together the different pieces of information associated with each line to make a unique reference.

=TEXT(JOIN("",C2:F2),"")

In a new tab, I put together an analysis of the schools. I started this by generating a UNIQUE list of all the schools.

=UNIQUE(DATA!A2:A)

Using this data as a reference, I then created a query for each school, joining the different rollover rules together.

=(JOIN("|",QUERY(DATA!A$2:I,"SELECT I WHERE A = '"&A2&"'")))

I then used this data to produce a summary of the data. This began with a UNIQUE list of combined rollover rules.

=UNIQUE(QUERY(SCHOOLS!A2:C,"SELECT B,C WHERE A IS NOT NULL"))

Using this as a reference, I queried all the schools with the same rule and joined this data together.

="E"&JOIN(",E",QUERY(SCHOOLS!A2:B,"SELECT A WHERE B ='"&C2&"'"))

I also created another columns in which I used a SWITCH formula to describe each of the sets of rules in a more meaningful manner beginning with a category: “standard”, “two semesters” and “needs to be fixed”. Associated with this, I created a column which pulled this category and then used this information to create a series of conditional formulas in order to visually differentiate the different setups.

For anyone interested, a copy of the template can be found here.


After completing this exercise, I have found that I can easily adjust this template for any other data I wish to analysis quickly. It is also testament to the power of making your own tools. I am left wondering if I could use macros or scripts to improve the process, but for now my formulas all duct taped together is working for me. As Tom Woodward sums up:

I’m not a programmer so I have to find ways to get things done until I learn more.

If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

If you’re trying to understand complex formulas, peel the layers back until you reach the core (which is hopefully a function you understand!). Then, build it back up in steps to get back to the full formula. Ben Collins ‘Use The Onion Framework To Approach Complex Formulas’

I was recently asked to have a look at spreadsheet that listed all the guides and videos located in a support folder as a reference. This folder also included a number of folders within folders, which created a level of complexity.

The current workflow involved using an Add-on File Cabinet from the developers behind Awesome Table to generate a list. The problem with this list is that the exported data was not in the desired format. Whereas the output focused on file, folder and link, the list produced using File Cabinet included other information such as owner, creation date and file size. In addition to that, there were some folders and files listed that needed to be removed. Therefore, it had become manual heavy exercise to refresh the data and update the directory meaning it did not happen that regularly.

After looking at the list and the current finished product and the list produced using File Cabinet, I realised that rather than displaying the sub-folder, that the directory was showing a top level folder. Therefore to manage this, I used a SWITCH formula to create a new column updating the folder name to the overall area it belonged to.

To create the SWITCH, I created a UNIQUE list of all the folders and then pasted them as values so that I could delete those not required to be displayed in the directory. After culling the list, I then added the area in the cell next to each folder. Once I had these two columns complete, I used this formula to add quotation marks to them all:

=ArrayFormula(""""&Sheet1!A:D&"""")

I then copied this list into the SWITCH formula and added in the commas:

=IFNA(SWITCH(B1:B,
" EOY 2020","EOY",
"Absences and Attendances","eSIS",
"Class Maintenance","eSIS",
"Crystal Reports","eSIS",
"Debtors","eSIS",
"eSIS - Information Sheets","eSIS",
"eSIS Navigation-Overviews","eSIS",
"General _ Misc","eSIS",
"Managing Community","eSIS",
"Managing Student Lifecycle","eSIS",
"Reporting","eSIS",
"Staff Maintenance","eSIS",
"Student Medical Maintenance","eSIS",
"SynWeb","eSIS",
"Timetabling","eSIS",
"eLearn General ","eLearn",
"eLearn QRGs","eLearn",
"Accounts Payable","eFIN",
"General","eFIN",
"Purchasing","eFIN",
"ePortal","ePortal",
"Create New Staff","eHR",
"eHR - General","eHR",
"eHR Materials","eHR",
"Employee Self Service","eHR",
"Employee Self Service (ESS)","eHR",
"End of Year 2020 - eHR & Payroll Guide","eHR",
"Leave","eHR",
"ICON General Information","General Information",
""," "
))

With this addition column, I then used a QUERY to capture the title, area and link to present as a directory in a separate spreadsheet to share with a wider audience:

=SORT(QUERY(IMPORTRANGE("docs.google.com/spreadsheets/d/.../edit","Sheet2!A:D"),"SELECT Col1,Col4,Col3 WHERE Col4 is NOT NULL"),2,FALSE)

Ideally, if creating this solution from scratch, I would probably have started each folder with a prefix indicating which area it belonged to, therefore avoiding the need for the SWITCH formula. However, that horse had long bolted. I also like the idea of using the SWITCH formula to manage which folders are displayed. It is definitely easier to pass on to somebody else.


I feel that it would be plausible to find a different script (see for example these examples from Alice Keeler, mesgarpour and Spreedsheet Dev) that might do some other things, such as run the process as a cron job or pull only the desired data. However, that is beyond my current skillset and patience level to dig any further at this point.. Therefore, I will stick to using various formulas to filter out the data for me.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

The challenge to me is to go beyond the question of instruction and understanding of different languages. Beyond debates about fitting it within an already crowded curriculum. Instead the focus should be on creating the conditions in which students are able to take action and create new possibilities. Maybe this involves Minecraft, Ozobot or Spheros, maybe it doesn’t. Most importantly it involves going beyond worrying about training or competency, as Ian Chunn would have it, and instead embracing the world of making by leading the learning. - Aaron Davis ‘Did Someone Say Coding?’

In my work, I recently had an issue with absence data where the system had generated some duplicate data. Short of creating an actual report within the system (which I neither have the time or access to do), I decided to have a go at creating a reusable template that I could easily share with anyone. The idea was to take the data and produce a number of specific outputs, including the duplicates and a specific set of ‘All Day’ duplicates.

I started the template by creating a spreadsheet with three tabs: Sheet2, DUPLICATES and ALLDAYDUPLICATES. Sheet2 contained a summary of the ID, event data, name, absence type and event type.

=QUERY(Sheet1!A1:R,"SELECT A,R,E,F,G WHERE A IS NOT NULL")

I then made a column of unique data by combining the ID and event date. To do this, I used Ben Collins’ trick using formulas in a named range as a means of capturing the variability of the data.

=ARRAYFORMULA(CONCAT(INDIRECT(IDS),INDIRECT(DATES)))

With IDS named range being:

="'Sheet2'!A2:A"&COUNT(Sheet2!A2:A)+1

and DATES named range being:

="'Sheet2'!B2:B"&COUNT(Sheet2!B2:B)+1

I then used the column of unique data to create a TRUE / FALSE column as to whether the absence might be a duplicate.

=ARRAYFORMULA(countif($F:$F,(INDIRECT(UNICON)))>1)

With the UNICON named range being:

="'Sheet2'!F2:F"&COUNTA(Sheet2!F2:F)+1

I borrowed this formula in part from Collins who used something similar to use Conditional Formatting to highlight duplicates:

=countif(A:A,A1)>1

In the DUPLICATES tab, I created a QUERY that brought in the duplicates ordered by date

=QUERY(Sheet2!A:G, "SELECT A,B,C,D,E WHERE G = TRUE and A IS NOT NULL ORDER BY B")

While in the ALLDAYDUPLICATES I created another QUERY which focused on ‘All Day Absences’

=QUERY(DUPLICATES!A:E, "SELECT A,B,C,D,E WHERE E = 'AllDayAbsence'")

With all this in place, to query any data, I make a copy of the template. I then import the absence data exported from the external system. Although I could possibly copy and paste the data, I have found that this is fraught with dangers. In regards to the ‘Import Location’ I select Insert New Sheet(s). This then loads the new data into Sheet1. I then click on the formula in Sheet2!A1 and press enter to refresh the QUERY formula and populate the rest of the spreadsheet.

For those interested, a copy of the spreadsheet can be found here.

Growing up, I never really went deep into the world Microsoft Excel. It was not until exploring Google Sheets that I started appreciating the power and potentials. One of the consequences of this is that a lot of the habits that I have learnt are not applicable in reverse. For example, Ben Collins has a site discussing the power of moving from Excel to Google Sheets. This includes a discussion of IMPORT features and the QUERY function. However, there is nothing that I am aware of that goes the other way. I guess this is the nature of coding in general. Every language has its own set of affordances and we work within these constraints. What the situation has taught me though is that sometimes solving problems is about working with what is at hand. I guess the challenge is developing our box of tricks so that we do have something at hand. In some ways, this reminds me of something Amy Burvall once said:

“in order to connect dots, one must first have the dots”

So what about you? What small tools have you created to solve a particular problem? What are the strategies that you use to continually add to your toolbox so that you are always ready when a problem may arise? As always, comments and webmentions welcome.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

When I was doing a bit of coding every day, I found I could much more quickly grasp key concepts. But if I stopped for a few days or, every so often, a few weeks, when a crush of work in my day-job and a load of personal-life responsibilities arrived, it was like wiping the slate clean. I’d come back to work on a coding project and I’d have forgotten a shocking amount of basic stuff. Clive Thompson ‘Ten Lessons I Learned While Teaching Myself to Code’

The current pandemic has led to many changes in habits. One of which is that I like to be prepared when I go to the supermarket, especially when doing a big shop. Fine I may not last out the two to three weeks that Zeynep Tufekci flagged early on:

For food, you can just buy two or three weeks’ worth of shelf-stable food that you would eat anyway, and be done; this could include canned food like beans and vegetables, pasta, rice, cereals or oats, oils/fats, nuts and dried fruits. It’s really not that hard because we’re talking two-three weeks, so whatever you get is fine. It doesn’t have to be expensive or super healthy or specialized ready-to-eat meals in camo boxes guaranteed to survive the meteor strike! Rice, beans, salsa, ramen, some sort of cooking oil, oatmeal, nuts and dried or canned fruits and vegetables enough for two weeks can be had at relatively little cost and take up fairly little space.

However, I at least try and limit how often I go out. That is usually the intent of lockdown measures (Melbourne is currently in its fifth lockdown at the point of writing) used to hammer the virus. One strategy I have used is to be clear about what I might need at the shops split into different sections. As a part of this, I wrote out a list of essential items and have been using this to create the weekly shopping list. Today, I decided to have a go at turning this into a spreadsheet using Google Sheets that I could use to generate the list. Here then are my steps:

List of Items

I started by writing something of a complete list of items. Associated with this, I categorised each item in a separate column. To save from writing each category each time, I created a separate list of unique categories and then used this with data validation to create a dynamic drop-down list. This meant that if I added a new category it would then be added as an option. In a third column, I added a checkbox for each item to be used to produce the weekly list.

Switch the Category

Added to the category, I used the SWITCH formula to create a sort order.

=IFNA(ARRAYFORMULA(SWITCH(B2:B,"Veg",1,"Fruit",2,"Meat",3,"Dairy",4,"Bakery",5,"Sweets",6,"Non",7,"Freezer",8,"Other",9)))

Generating the Summary List

Once the items required were ticked, I wanted a summary that I could copy into a message. My initial iteration was a simple query: =QUERY(Sheet1!A2:D,"SELECT C WHERE D = TRUE ORDER BY A") The problem with this is that it did not put each item on a new line. To fix that, I used the JOIN function and the New Line character.

=JOIN(CHAR(10),QUERY(Sheet1!A2:D,"SELECT C WHERE D = TRUE ORDER BY A")) 

Although this put each item on a new line, I then wondered about adding an emoji for each section to break up the information. To do this, I combined the new line and emoji characters, with a separate query for each category.

=""&CHAR(129477)&""&CHAR(10)&""& {JOIN(CHAR(10),QUERY(A2:D,"SELECT C WHERE A = 1 and D = TRUE"))} &""&CHAR(10)&""&CHAR(10)&""&CHAR(127822)&""&CHAR(10)&""& {JOIN(CHAR(10),QUERY(A2:D,"SELECT C WHERE A = 2 and D = TRUE"))} &""&CHAR(10)&""&CHAR(10)&""&CHAR(129385)&""&CHAR(10)&""& {JOIN(CHAR(10),QUERY(A2:D,"SELECT C WHERE A = 3 and D = TRUE"))} &""&CHAR(10)&""&CHAR(10)&""&CHAR(129472)&""&CHAR(10)&""& {JOIN(CHAR(10),QUERY(A2:D,"SELECT C WHERE A = 4 and D = TRUE"))} &""&CHAR(10)&""&CHAR(10)&""&CHAR(127838)&""&CHAR(10)&""& {JOIN(CHAR(10),QUERY(A2:D,"SELECT C WHERE A = 5 and D = TRUE"))} &""&CHAR(10)&""&CHAR(10)&""&CHAR(127851)&""&CHAR(10)&""& {JOIN(CHAR(10),QUERY(A2:D,"SELECT C WHERE A = 6 and D = TRUE"))} &""&CHAR(10)&""&CHAR(10)&""&CHAR(129387)&""&CHAR(10)&""& {JOIN(CHAR(10),QUERY(A2:D,"SELECT C WHERE A = 7 and D = TRUE"))} &""&CHAR(10)&""&CHAR(10)&""&CHAR(129482)&""&CHAR(10)&""& {JOIN(CHAR(10),QUERY(A2:D,"SELECT C WHERE A = 8 and D = TRUE"))} &""&CHAR(10)&""&CHAR(10)&""&CHAR(129531)&""&CHAR(10)&""& {JOIN(CHAR(10),QUERY(A2:D,"SELECT C WHERE A = 9 and D = TRUE"))}

Resetting Selections

The last task was to add a script to reset the checkboxes. For this, I added a script via app scripts I found here and created a button as a trigger. You can make a copy of the whole spreadsheet here.


When it comes to coding, Clive Thompson talks about learning to code by doing something every day and doing so with purpose. Here is another example of a solution that is as much about learning as it is about the solution itself. Comments and recommendations on improvements welcome.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

The power of query in sorting out data in Sheets

One of the challenges we have within the project I am currently a part of is that we have never had a contextual reference point for school information. Although different teams have had various lists floating around, we needed one that we owned as a whole. Added to this, the lists were often kept in an Excel spreadsheet meaning nobody ever really knew if they had the definitive version. Being an organisation that uses Google, I suggested to a colleague that we work together to develop a spreadsheet in Google Sheets and share a link with colleagues. Here then are some of the steps we have taken in developing this:

Data in, Data Out

If there is one thing that I have learnt from Ben Collins’ work, it is the importance of cleaning up your data before you do anything else. This includes avoiding merged cells, an issue we had with older versions of the data. The focus was creating a dataset with each cell telling a particular story. Other than using CTRL+ENTER to separate the different parts of addresses and locking the header, there was no formatting applied to the core data.

We were also deliberate with how we collated the information. With 40+ columns, we spent some time splitting the data into four groups – details, contacts, administration and learning – with empty columns at the end of each in case additional fields needed to be added at a latter date. Although I wanted to use Google Forms to structure all this, we decided not to because of the fluid nature of the dataset. There are also times when we wanted to be able to update the data in bulk for various reasons.

Ranges, Imported and Named

So that people  were able to access the information, a second document was set up, with the data brought in via the IMPORTRANGE formula.

=IMPORTRANGE("https://docs.google.com/spreadsheets/d/abcdefghijklmnopqrstuvwxyz/edit", "CONTACTS!A1:CA500")

This meant that the wider team could have access to the content, without messing up the original data. Although we could have restricted access to the tab or range in the source document, I was concerned that this would be fiddly to maintain, therefore chose a separate document altogether.

With the IMPORTRANGE, I imported this into a sheet and created a named range to reference this.

MAIN!A1:BE500

I tried to embed the IMPORTRANGE within my formulas, but for some reason it would not work, even after I gave permission. Therefore, I resorted to simply bringing in a copy working with this. I also thought then there is only one external call running.

Different Data for Different Purposes

Once I had my data in place, I used a QUERY to reorganise it. In addition to having a single point of reference, my argument to my colleague was that we can represent this data in different ways depending on the purpose. As David Krevitt explains:

QUERY combines all of the capabilities of arithmetic functions (SUM, COUNT, AVERAGE) with the filtering abilities of a function like FILTER.

Using the QUERY formula I created a copy of the core data removing any blank columns.

=QUERY(MAIN,"SELECT B,C,E,D,F,G,J,K,L,M,N,O,P,Q,R,S,T,U,V,Y,Z,AA,AB,AC,AD,AE,AI,AJ,AK,AL,AM,AN,AO,AP,AQ,AR,AS,AT,AZ WHERE C IS NOT NULL ORDER BY B")

As I set the data up with space to grow, I used

WHERE C IS NOT NULL

This removed any blank rows. I also used

ORDER BY B

As this column contained the school numbers (i.e. E1234). In order to get this to work, we removed the E from the source document and used the custom formatting to add the E back to the number, while also being able to sort numerically.

In addition to the main list, I created a number of specific queries, focusing on things such as pay periods, business managers and applications being used.

=QUERY(ICON, "SELECT B,C WHERE X = 'Yes' ", -1)

I tried to make a query where the user would tick a checkbox associated with the information required, but could not figure how to create a variable associated with the SELECT function.

Custom Lookups

Another way of engaging with the data was to focus on a particular school. To do this I created a series of VLOOKUP formulas revolving around the E number which was strategically placed in the first column of the dataset.

=VLOOKUP(B$2,vLookup,5,FALSE)

In the cell above, B2 is the cell where the school number is entered and 5 is the column for the information to be displayed. I then repeated this formula for all the other information to be displayed.

Another use was to create a lookup for the business managers associated with the different schools.

=VLOOKUP(A2,BM!Z1:AH500,2,FALSE)

This involved making a new named range with the name of the business manager in the first column. I also used data validation restricted to the names in the dataset.

I also then combined this search key with the QUERY formula to display the list of schools and their core information:

=QUERY(MAIN,"Select B,C,E,J,AZ WHERE Z = '"&$A$2&"' ORDER BY B")

Mix and Match Formatting

In order to make the data a bit more friendly, I added some conditional formatting. The challenge was that I wanted different colours for different groups of schools. I did this using the school number as the reference:

=left($A2,2)=”E1″

I then used an even formula to break up the lists:

=ISEVEN(ROW())

By placing the even conditional formula at the top of the list, this means that it is prioritised.


I have included this all here as a reference. I am not sure this is much use to anyone. As always, thoughts and questions welcome.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

The power of query in sorting out data in Sheets

There are many challenges to sharing specific data in Google Sheets, some of these can be overcome using the QUERY formula.


I attended a day recently continuing the look into ongoing reporting. One of the focuses involved reflecting on various points pf data. Something that stood out was the willingness of teacher to share data. Whether it be with students, teachers or parents, for some providing access can still be a challenge.

I discussed this with a principal attending the day and one suggestion made is that it can be hard to share particular data, without sharing everyones data. This is especially the case when talking with parents or conferencing a student. An answer is using the QUERY formula.

With Google the chosen platform, a lot of teachers store their data in Google Sheets.

QUERY combines a whole lot of functionality into the one formula. As David Krevitt explains:

QUERY combines all of the capabilities of arithmetic functions (SUM, COUNT, AVERAGE) with the filtering abilities of a function like FILTER.

It is a language developed by Google, using the principles of Structured Query Language (SQL).

For example, you may wish to share the results of just one students:

To do this, you select the columns you want to copy from the MASTER tab and which value you wish to filter by:

=QUERY(MASTER!A:F, "SELECT C,D,E,F WHERE A = 'Donna'", -1)

If you wished to quickly create a tab for each student, Alice Keeler has created a script for generating tabs from a list. This personalised information can then be shared with students (see Jake Miller’s explanantion). Another thing to consider is to protect formulas by adjusting permissions at a celular level, as well as prevent others from copying the file, therefore getting the information that way.

The other option is to create a dynamic selector involving either a name:

Or even a whole form:

These options might be used when working with colleagues or talking with parents. You are able to bring up just the information required. Depending on the data, you can also create dynamic charts. One other benefit to using the QUERY formula is that it allows you to quickly and easily reorder the representation of data. So lets say ‘F’ is associated with literacy testing and you would like that at the start. Rather than writing SELECT C,D,E,F you would write SELECT F,C,D,E.

This is only the tip of the iceberg of what the QUERY formula can do. For more information, see posts from Ben Collins and David Krevitt. Collins also ran two webinars, which you can go back and watch. One on the basics, while the other getting a bit more complicated.


One thing to note when using Google Sheets to store data is what sort of information you are collecting. In some districts and regions there are issues raised about storing ‘sensitive data’ in platforms like Google.

Reflecting on Class Dojo, Ben Williamson explains that ‘sensitive’ can be the consequence of collecting data:

The ‘sensitive information’ contained in ClassDojo is the behavioural record built up from teachers tapping reward points into the app.

This same concern needs to be considered in regards to Sheets, especially with the changes being brought about by GDPR.


As always, comments welcome. Webmentions too.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.