Whether you share it or not, documenting and recording your process as you go along has its own rewards: You’ll start to see the work you’re doing more clearly and feel like you’re making progress. Austin Kleon ‘Show Your Work!’

The team I am a part of was recently asked if there was any work that we could possibly automate to save time and effort. Leaving aside the strange question of asking the boiling frog how the boiling process might be sped up, I was left thinking about the limits of automation when what is trying to be automated is not always straight-forward and often involves contextual knowledge. My suggestion in response was not to automate the task, but possibly automate the process.

Ever since I have stepped into my new role in the technical team, I have made a conscious effort to capture everything that I have learnt and store it in a canonical space as one of the issues I had early on was that it was too difficult to find anything. This then allows me to produce repeatable processes. (In some respects I imagine that organising information in this manner actually makes it easier for an AI assistant to support?)

In addition to automating the day-to-day, I recently built out a spreadsheet to help with a particular task of cleaning up duplicate records across the database. Given a list of hundreds of records to review, I felt that saving any time and effort was going to make a difference. My thought was to create a series of declare statements to use to query the database, then copy the raw data, review it, create tasks for the relevant teams and produce a summary of the records changed and those that remained the same.

Doubling Down on the Declare

One of the issues I initially found when producing my query was updating all the variables each time. It therefore occurred to me that rather than changing the names and various IDs each time, which was always fiddly and prone to error, I could create a series of declare statements that I updated with the SQL statement each time. This allowed me to leave the rest of the query the same:

="
DECLARE @Sur1 VARCHAR(30) = '"&D2&"'
DECLARE @Sur2 VARCHAR(30) = 'SUR2'
DECLARE @Pre1 VARCHAR(30) = '"&E2&"'
DECLARE @Pre2 VARCHAR(30) = 'PRE2'
DECLARE @Login1 VARCHAR(30) = '"&H2&"'
DECLARE @Login2 VARCHAR(30) = 'CECVIDS\UPN2'
DECLARE @Login3 VARCHAR(30) = 'CECVIDS\UPN3'
DECLARE @CNUM1 VARCHAR(15) = 'CX'
DECLARE @CNUM2 VARCHAR(15) = 'CX'
DECLARE @CNUM3 VARCHAR(15) = 'CX'"

In addition to this, I stripped out any additional information that was included within a text string of possible information and had this represented on unique lines after the initial list:

--"&IFNA(JOIN(CHAR(10)&"--",ARRAYFORMULA(LEFT(QUERY(TRIM(MID(SUBSTITUTE(J2, ":", REPT(" ", LEN(J2))), LEN(J2)*ROW(INDIRECT("1:"&LEN(J2)))-LEN(J2)+1, LEN(J2))),"SELECT * WHERE Col1 LIKE 'C%'"),16))))&""

The output then looks something like this:

--"
DECLARE @Sur1 VARCHAR(30) = 'Davis'
DECLARE @Sur2 VARCHAR(30) = 'SUR2'
DECLARE @Pre1 VARCHAR(30) = 'Aaron'
DECLARE @Pre2 VARCHAR(30) = 'PRE2'
DECLARE @Login1 VARCHAR(30) = 'CECVIDS\01234567'
DECLARE @Login2 VARCHAR(30) = 'CECVIDS\UPN2'
DECLARE @Login3 VARCHAR(30) = 'CECVIDS\UPN3'
DECLARE @CNUM1 VARCHAR(15) = 'C0000001'
DECLARE @CNUM2 VARCHAR(15) = 'CX'
DECLARE @CNUM3 VARCHAR(15) = 'CX'
--CECVIDS\30000001
--C1000001"

I commented the additional information out so that I could use it if required or simply run the updated script it was. It also meant that I did not have to delete the quotations at the start and end that come across when copying from Google Sheets every time. I did tinker with how I could get the additional logins and numbers to prepopulate within the declare statement, but hit a wall as the number of additional values was not consistent.

Streamlining the Analysis and Fix

Once I had all the records in the system, I then had to organise them into those to be retained and those to be updated. The first step was to create a conditional formulas using the following custom formula that quickly highlighted possible duplicate records:

=COUNTIFS(L:L, L1, L:L, "<>BirthDate")

This allowed me to then reorder the list and review the rest of the details. Once I had done this, I then created a summary of the data.
Often I create a config tab where I keep tab where I create various formulas that I can refer to. This is particular important when using variable NAMED RANGES. As I wanted to make this spreadsheet somewhat self-contained, I decided to just use the first row of the sheet, which I then hid. This included a summary of the different community names in a single cell:

=LEFT(JOIN("/",UNIQUE(QUERY(F2:F,"SELECT F WHERE NOT F = 'NameExternal'"))),LEN(JOIN("/",UNIQUE(QUERY(F2:F,"SELECT F WHERE NOT F = 'NameExternal'"))))-1)


A summary of staff numbers in a single cell:

=IFNA(JOIN(",",UNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC"))))


And a summary of login records in a cell

IFNA(JOIN(", ",UNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CEC%'"))))

Each was given a Named Range that could be easily referred to in other formulas.

Initially, I created a formula which combined these different attributes, but I realised I then had to explain what sort of duplicate it was. Overall, there are four types of duplicate records:

  1. User with multiple staff numbers and login records
  2. User with multiple login records
  3. User with multiple staff numbers
  4. User with multiple community records

To accommodate these differences, I created an formula using the named ranges to combine the IF and AND formulas:

=IF( AND( IFNA(COUNTUNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC")))>1, IFNA(COUNTUNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CEC%' ORDER BY H")))>1), F1&" has multiple staff numbers: "&G1&" and multiple login records: "&SUBSTITUTE(H1,"IDS\",""),
IF(AND(IFNA(COUNTUNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC")))=1, IFNA(COUNTUNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CEC%' ORDER BY H")))>1), F1&" ( "&G1&" ) has multiple login records: "&SUBSTITUTE(H1,"IDS\",""),
IF(AND(IFNA(COUNTUNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC")))=1, IFNA(COUNTUNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CEC%' ORDER BY H")))=1), F1&" ( "&G1&" / logins: "&SUBSTITUTE(H1,"IDS\","")&") has duplicate community records",
IF(AND(IFNA(COUNTUNIQUE(QUERY(G2:G,"SELECT G WHERE G LIKE 'C%' ORDER BY G ASC")))>1, IFNA(COUNTUNIQUE(QUERY(H2:H,"SELECT H WHERE H LIKE 'CECV%' ORDER BY H")))=1), F1&" logins: "&SUBSTITUTE(H1,"IDS\"," ")&" has multiple staff numbers: "&G1&"",FALSE))))

Using the IF and AND combination allowed me to quickly and easily produce a summary of the issue in one sentence to raise in our incident management system.

In addition to this, I created a SWITCH formula to provide the server listener name, as opposed to the IP address provided in the data export:

=IFNA(SWITCH(B2:B,
"IP Address 1","Listener Name 1",
"IP Address 2","Listener Name 2",
"IP Address 3","Listener Name 3",
"IP Address 4","Listener Name 4",
"IP Address 5","Listener Name 5"))

Summarising the Changes

In addition to the summary of issue, I also created a summary of changes that I could easily copy and paste into the incident notes once I had cleaned up the various duplicates:

={"","","","","","";
"Record(s) updated / merged:","","","","","";
"","","","","","";
SORT(Indirect(Table1));
"","","","","","";
"Record(s) retained:","","","","","";
"","","","","","";
SORT(Indirect(Table2))}

As the number of records varied, I created a NAMED RANGE for records updated / merged and records retained and used the INDIRECT formula to refer to them.

The formula used for Table1 was:

="C2:H"&MATCH("Database Name", Indirect(CThreeC), 0) + ROW(INDIRECT(CThree)) - 2

While the formula used for Table2 was:

=ADDRESS(MATCH("Database Name", INDIRECT(CThreeC), 0) + ROW(INDIRECT(CThree))-1, COLUMN(INDIRECT(CThree)))&":H"

As I was continually moving and deleting rows around in the spreadsheet, I found that it kept on breaking these named ranges, so that is why I created two further NAMED RANGES.

This was formula used for CThree:

="$C$3"

And this was formula used for CThreeC:

="$C$3:$C"

Again, all this ‘magic‘ was hidden in the top row of the spreadsheet.

An example of the spreadsheet with the various formulas can be found here.


There has been a lot spoken at my workplace lately about what it means to be proud. Personally, fixing 250+ duplicate records is not necessarily something I am ‘proud’ of or excited by, but I am proud of my ability to improve processes and identify ways in which my actions can make it easier for others to do their job. The challenge I often have is being able to hand over such solutions to others. Most prefer their own way of working, even if that means complaining that they do not have time for this or that. I feel that the issue is finding the right balance between improving a process and creating more work.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.

She had the odd idea that the terrorism question had become a fad, like body piercing or flares; a fashion that had come and would go like this season’s colours. Maybe, thought the Doll, if it was just like fashion, it was simply about a few people building careers, making money, getting power, and it wasn’t really about making the world safer or better at all. Maybe it was like Botox, something to hide the truth. Richard Flanagan ‘The Unknown Terrorist’

The Unknown Terrorist, the fourth novel by the Australian novelist Richard Flanagan, tells the story of Gina ‘Doll’ Davies, a stripper who becomes embroiled in a terrorist plot. The various elements of her life, whether it be not having a bank account or the job that she chooses to do, mean that she is made the scapegoat for a terrorist threat.

Gina Davies, also known as the Doll, a 26-year-old exotic dancer in a Sydney, Australia, gentlemen’s club, undergoes this Kafkaesque experience. On the night of Sydney’s Mardi Gras parade, she sleeps with a man she has just met, an attractive Syrian computer programmer with a cocaine habit. A day later, after his sudden disappearance, she has been turned into Australia’s most hunted woman — or, as a newspaper would have it, the “Dancer of Death.” By following the desperate flight of this once normal, now supposedly lethal, woman, Flanagan suggests the accused herself has become the victim of an insidious institutional terrorism.

Source: Unusual Suspect by Uzodinma Iweala

The novel progressively unfurls from there, jumping between various interconnected characters, including her friend Wilder, the current affair host Richard Cody, ASIO spook Siv Harmsen, drugs detective Nick Loukakis, manager of Counter Terrorism Unit Tony Buchanan, and the man with a lot of money, Frank Moretti. Along with bouncing between the different characters the novel bounces between the constant noise of the media, the various vested interests associated with each of the characters, and the truth of the situation as seen by the omnipotent reader.

Through the novel, Flanagan explores ideas around power and terrorism:

I’m important to them, Wilder, because if you can make up a terrorist you’ve given people the Devil. They love the Devil. They need the Devil. That’s my job. You get me?”

Source: The Unknown Terrorist by Richard Flanagan

Identity:

At the club you danced for money, and you danced because you were Krystal or Jodie or Amber. The one thing you never dared dance was yourself.

Source: The Unknown Terrorist by Richard Flanagan

Media, truth and reality:

“It is horrifying,” Siv Harmsen agreed, “and we need stories that remind people of what horrifying things might just happen.”

Source: The Unknown Terrorist by Richard Flanagan

And perception:

There could be no doubt about it; they were Australia and, looking around Katie Moretti’s grand dining room and its new furniture and its splendid view, it was readily apparent to them all what Australia was, and all of Australia was as splendid as it was obvious—it was them! It was their success and their prosperity; their mansions and apartments! Their Porsches and Bentleys and Beemers! Their getaways in the tropics! Their yachts and motorcruisers! Their influence, their privileges, their certainties! Who could doubt it? Who would question it? Who would wish to change any of it?

Source: The Unknown Terrorist by Richard Flanagan

I originally read Richard Flanagan’s The Unknown Terrorist in 2008. I was led back to it after reading Trent Dalton’s Lola in the Mirror and thinking about characters who exist outside of or on the edges of society. It is always intriguing to think about how the worlds differ, in particular living a life without a bank account in our progressively cashless society and the development of social media. Published in 2006, it is very much a reflection of a post-9/11 world before platforms, such as Twitter and Facebook, came to the fore.

It was also interesting reading it after the recent stabbing attack at Bondi Junction and the way in which the media misidentified the killer. However, rather than being hunted down, the misidentified killer sued for defamation. In his review of The Unknown Terrorist, David Marr questioned the reality of the situation Flanagan portrays. Marr asks where the defamation lawyers are? Or why Gina Davies does not ring a lawyer?

Where was the defamation lawyer at Channel Six asking “What are our defences?” before Cody’s one-hour mishmash of guesswork and grainy footage – also called The Unknown Terrorist – went to air? And it doesn’t really help if the answer to that question is: stop being picky, this is a thriller. Without a sense that these horrors might happen, there’s not much thrill, either.

Source: The Unknown Terrorist by David Marr

When I originally read The Unknown Terrorist, I had not read any of Flanagan’s other books. However, I was curious coming back to the book after reading Flanagan’s Man Booker winning novel The Narrow Road to the Deep North. Although they are different books, what they share is a respect for the complexity of life and character. Flanagan sets the reader up for certain judgments, such as working as a stripper, only to then peel back layers to add details about past upbringing or past relationships that force the reader to reconsider these simplistic points of view. This is something that Peter Conrad touches on in his review.

Despite all this puffery, Flanagan’s homeland is no longer a community, hardly even a society. Its people, like the lap dancer with her craving for designer clothes, are mired in materialism, obsessed with mortgages, superannuation payments and the acquisition of the latest, shiniest gadget. Their venality makes moral cowards of them, and the government terrorises them into brown-tonguing Bush by appealing to their economic anxiety and to their skulking xenophobia.

Source: Days of thunder erupt Down Under by Peter Conrad

The only problem with this is that The Unknown Terrorist often depends upon the same clichés and stereotypes it is trying to critic in order to dig deeper into the truth. As Magdalena Ball captures in her review:

The book is full of clichés and stereotypes as brutal as those Flanagan criticises. The poll dancers who talk about the Doll are all utterly vacuous. The bad guys, Lee Moon, Frank Moretti, the anchorman Richard Cody, or the wealthy people at Katie Moretti’s party are all characters with no depth or dimension to render them realistic. Sydney itself is seen as a kind of game park with grungy areas like Kings Cross, suburbian areas like the West, or wealthy areas like Double Bay all fulfilling their stereotypical functions

Source: A review of The Unknown Terrorist by Richard Flanagan – Compulsive Reader by Magdalena Ball

Even with all these flaws, I enjoyed The Unknown Terrorist, I guess I bought into the suspense of Flanagan’s thriller and attempt to capture a world beyond the stereotypes, to explore the haves and have nots, even if that is problematic at times.


If you enjoy what you read here, feel free to sign up for my monthly newsletter to catch up on all things learning, edtech and storytelling.