
Troubleshooting Idempotency - if you can pronounce it you're mostly there

  • Writer: Tamara Copple
  • Oct 5, 2024
  • 5 min read

I learned a new term recently: idempotency.

 

Go ahead.  Say it out loud.


Idempotent. Idempotency. I-dem-po-ten-cy.

 

I said it too fast to myself the first time and missed the syllable “-dem-”. Now I have to work at it.

 

Knowing that, if you can pronounce it without giggling, you’re doing better than me.


Okay, back on topic.

 

I am the product manager for a Microsoft canvas app (a type of low-code tool) that helps facilitate letter exchanges between sponsors and kids. The business rule is simple: each child should only need to write one letter at a time. But one day, our agency staff reported that some children had two letter requests open simultaneously. When I investigated, I noticed a pattern: the unique Correspondence IDs for each letter—like 33334 and 33338—were very close in sequence. This implied they were created almost at the same time. Both records were generated by the same Power Automate workflow.

 

It turned out that a batch job to close out letters had failed and the developer restarted it multiple times. Each time it ran, it re-closed the same letter it had already processed before the error. This caused the CRM workflow to create yet another new letter request, even though the previous one was still active. The redundant letter requests were then pushed into my letter application, where they confused the agency staff and created extra work.

 

So, what exactly is “idempotency”? It is a big word for a super-important concept in software development (and higher maths, apparently, but that's another story).


Don’t worry, we’re not going down a technical rabbit hole here. In plain terms, idempotency is the idea that doing something once should have the same effect as doing it multiple times. Imagine pressing a button to submit an online order. Whether you press it once, twice, or ten times, the action only happens once. That’s idempotency in action. No matter how many times you request it, the result stays the same.
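
For readers who like to see it in code, here is a minimal Python sketch of the button example. It is only an illustration: the function names, the in-memory lists, and the `request_id` value are all made up for this post, and real checkout systems are far more involved.

```python
# Hypothetical in-memory "order system"; every name here is illustrative.
orders = []          # naive version: each click adds another order
placed_orders = {}   # idempotent version: one entry per request ID

def submit_naive(item):
    """Not idempotent: calling this three times creates three orders."""
    orders.append(item)

def submit_idempotent(request_id, item):
    """Idempotent: repeat calls with the same request_id change nothing."""
    # setdefault stores the item only the first time this request_id is seen
    placed_orders.setdefault(request_id, item)

# Simulate an impatient shopper clicking the button three times
for _ in range(3):
    submit_naive("book")
    submit_idempotent("click-123", "book")

print(len(orders))         # 3
print(len(placed_orders))  # 1
```

The naive version ends up with three orders, while the idempotent one ends up with a single order no matter how many times the button is pressed.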

 



In my case, when the batch job closed the letter record, it triggered the workflow to create a new request. Subsequent triggers to close the same letter should have been ignored. We should have implemented a ‘one-and-done’ rule in our letter generation workflow.

 

Idempotency is important in software because systems frequently encounter unforeseen challenges. For instance, when a network issue causes a job or a process to fail, the natural reaction might be to retry it. Without idempotency, these retries could create duplicate records or trigger unintended results. So in my case, when our batch job restarted, the system created redundant letter requests because it didn’t know it had already processed one for that child.

 

Why Should We Care About Idempotency?


Here’s why idempotency is important for anyone, especially in a business context. Let’s take online shopping as an example. You click “Pay Now,” but nothing happens. Naturally, you hit the button again. Without idempotency, you might accidentally purchase the same item twice, and be charged twice! Software that implements idempotency ensures that the order processes once, no matter how many times you click. One way Amazon prevents you from placing your order twice with extra button clicks is by including a step to empty out your cart after the transaction processes.

 

Now, apply that to larger systems in business. Imagine billing systems, subscription renewals, or workflows like the letter exchange I manage. If we don’t control for idempotency, those systems can easily spiral out of control when something goes wrong, leading to inefficiencies, confusion, and lost trust. Whether it’s your customers being charged multiple times or receiving duplicate communications, the outcome can be frustrating for everyone - including those of us who have to clean up the mess it leaves.

 

How we fixed the issue

 

We put in two preventative checkpoints to solve the issue. The first checkpoint happens when the batch job sets the letter record to a terminal status like “closed.” If the job is restarted, it compares the existing value on the record to the value it wants to write. If the values are the same, it skips the update.
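
The first checkpoint can be sketched in a few lines of Python. This is not our actual batch job; the `Letter` class, the `close_letter` function, and the `events` list (which stands in for the update event that triggers the workflow) are assumptions made purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Letter:
    letter_id: int
    status: str

def close_letter(letter, events):
    """Set the letter to 'closed' only if it is not already closed.

    Returns True when an update was actually written.
    """
    target = "closed"
    if letter.status == target:
        return False  # already closed: skip the write, so nothing downstream fires
    letter.status = target
    events.append(letter.letter_id)  # stand-in for the event that triggers the workflow
    return True

letter = Letter(33334, "open")
events = []
# Simulate the job being restarted and processing the same letter three times
results = [close_letter(letter, events) for _ in range(3)]
print(results)  # [True, False, False]
print(events)   # [33334]
```

Only the first run writes anything. The restarts see that the record already holds the target value and leave it alone.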

 

Second, we added a step to the Power Automate workflow that triggers next when a letter is set to "closed." Now, at the beginning of the flow, it checks for an existing open letter of the same type, and if it finds one, the workflow ends with no action. If there is no open letter, it proceeds to create the next letter request.
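
In rough Python terms, the guard at the start of the flow looks something like the sketch below. The real check lives in a Power Automate workflow, so the function name, the dictionary fields, and the sample child ID are hypothetical stand-ins.

```python
def create_next_letter(child_id, letter_type, letters):
    """Create a new open letter request unless one is already open
    for this child and letter type."""
    for existing in letters:
        if (existing["child_id"] == child_id
                and existing["type"] == letter_type
                and existing["status"] == "open"):
            return None  # an open request already exists: end with no action
    new_letter = {"child_id": child_id, "type": letter_type, "status": "open"}
    letters.append(new_letter)
    return new_letter

letters = []
first = create_next_letter("child-1", "participation", letters)
second = create_next_letter("child-1", "participation", letters)  # duplicate trigger
print(len(letters))  # 1
print(second)        # None
```

Even if the "closed" trigger fires twice, the second run finds the open request created by the first and exits quietly.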

 

Troubleshooting tips


If part of your job is troubleshooting issues, or quality checking processes, certain patterns might tip you off to a possible idempotency problem. Pointing out these characteristics, much like describing symptoms to a doctor, could help your developers zero in on the offending issue more quickly. The more dead-end investigations you help them avoid, the happier everyone is.

 

  1. Redundant records: As I noticed, if records appear in pairs or multiples and shouldn’t, it’s a red flag.

  2. Unexpected sequential IDs: Pay attention if you see similar IDs created at nearly the same time. In my case, we create hundreds of letter requests per day, but seldom two letters for the same child in one day.

  3. Analyze the output against your business rules. Although there are exceptions, generally, one child should only have one open participation letter at a time. So any time I see a child with more than one open participation letter request, I investigate. As a product or business analysis professional, you should be familiar enough with your business rules to spot anomalous behavior, especially when someone can give you specific records as an example, as our agency staff did.

    1. Pro question 1:  According to your business rules, could this situation ever happen and still be legit?

    2. Pro question 2: Does the evidence fit that edge case?

    3. Pro question 3: Have you uncovered a new edge case?

  4. Repeated actions: If the system unexpectedly takes the same action more than once — like creating multiple orders, letters, or entries that are suspiciously similar to each other in timing, content or other characteristics, and they don’t match your business rules, you may have a problem.

  5. Timing of events: If errors or duplicate entries consistently occur around the same time, especially after a job restart or network failure, this could indicate an idempotency issue. The timing alone can help point to the specific job that caused the issue. Records created within seconds of each other would have made me suspect the CRM system. The fact that the records were created minutes apart made me wonder if some outside job was influencing it. Both are worth investigating because either one could be the case.
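
If you can get an export of your records, the first and third tips above can be turned into a quick check. The Python sketch below assumes a list of rows with `correspondence_id`, `child_id`, `type`, and `status` fields. Those field names and the sample rows are invented for illustration (only the two Correspondence IDs come from my story), so adjust them to whatever your system actually exports.

```python
from collections import defaultdict

def children_with_multiple_open_letters(letters):
    """Return {(child_id, type): [correspondence_ids]} for every child
    who has more than one open letter of the same type."""
    open_ids = defaultdict(list)
    for row in letters:
        if row["status"] == "open":
            open_ids[(row["child_id"], row["type"])].append(row["correspondence_id"])
    return {key: sorted(ids) for key, ids in open_ids.items() if len(ids) > 1}

# Invented sample data: child A breaks the one-open-letter rule, child B does not
sample = [
    {"correspondence_id": 33334, "child_id": "A", "type": "participation", "status": "open"},
    {"correspondence_id": 33338, "child_id": "A", "type": "participation", "status": "open"},
    {"correspondence_id": 33335, "child_id": "B", "type": "participation", "status": "open"},
    {"correspondence_id": 33001, "child_id": "B", "type": "participation", "status": "closed"},
]

flagged = children_with_multiple_open_letters(sample)
print(flagged)  # {('A', 'participation'): [33334, 33338]}
```

A result like this gives your developers specific records to start from, which is usually far more useful than "something looks duplicated."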


By implementing checks and balances—like we did with the CRM and the letter management app—you ensure that even if a process runs multiple times, it only has the intended effect once. The key is making sure the system is “smart” enough to recognize when a process has already occurred and not repeat it unnecessarily.


