Issues with POS API and cart operations

Incident Report for Retail Operations

Postmortem

We sincerely apologize for the disruption caused by the issues affecting the POS API.

Summary of the Issue
We observed a spike in errors originating from the POS API, which were traced to connectivity problems between on-premises POS services and Azure. These connections rely on Azure relays.

Root Cause
Upon investigation, we identified that the root cause was an expired certificate. Although the certificate had been renewed ahead of its expiration, the updated certificate was not added to the relevant application registration in Microsoft Entra ID.
As a result, once the old certificate expired and all associated tokens became invalid, the Azure relay connections failed.

Impact
This failure affected cart operations that depend on communication over the Azure relays, leading to degraded functionality for end users.

Resolution and Preventive Measures
We have now thoroughly documented the certificate renewal process, including its dependencies, to ensure this oversight does not happen again. Additional checks and automation are being considered to further reduce the risk of similar incidents.
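
As an illustration of the kind of automated check being considered, the sketch below compares the thumbprint of the certificate in use against the thumbprints registered on the application registration, and also flags an upcoming expiry. It is a minimal example under stated assumptions, not the check we run: the function names, the 30-day warning window, and the way the registered thumbprints are obtained (passed in here as a plain list) are all hypothetical. The thumbprint is assumed to be the SHA-1 hash of the DER-encoded certificate.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def thumbprint(cert_der: bytes) -> str:
    """SHA-1 thumbprint of a DER-encoded certificate, as uppercase hex."""
    return hashlib.sha1(cert_der).hexdigest().upper()


def check_certificate(cert_der, not_after, registered_thumbprints,
                      now=None, warn_days=30):
    """Return a list of problems found; an empty list means the check passed.

    cert_der               -- DER bytes of the certificate the service uses
    not_after              -- timezone-aware expiry time of that certificate
    registered_thumbprints -- thumbprints present on the application
                              registration (hypothetical input, fetched
                              elsewhere)
    """
    now = now or datetime.now(timezone.utc)
    problems = []

    # Normalize so that spacing and letter case do not cause false alarms.
    registered = {t.replace(" ", "").upper() for t in registered_thumbprints}
    if thumbprint(cert_der) not in registered:
        problems.append(
            "certificate is not registered on the application registration")

    if not_after <= now:
        problems.append("certificate has expired")
    elif not_after - now <= timedelta(days=warn_days):
        problems.append(f"certificate expires within {warn_days} days")

    return problems
```

A check along these lines would have reported the first problem in this incident: the renewed certificate was valid, but its thumbprint was missing from the application registration.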

We deeply regret the inconvenience this caused and appreciate your patience and understanding.

Posted Aug 18, 2025 - 10:22 CEST

Resolved

Communication over the relays is up and running and we have returned to normal operational status.
A post-incident review (PIR) will be issued shortly.
Posted Aug 18, 2025 - 08:41 CEST

Monitoring

The issue has been identified and mitigated. We are in the process of restarting services on-premises to ensure that the connection to the Azure Relay is re-established.
Posted Aug 18, 2025 - 08:03 CEST

Investigating

We are investigating an issue in the POS API where cart operations are failing due to problems accessing the on-premises relays.
Posted Aug 18, 2025 - 07:37 CEST
This incident affected: POS API and Mobile POS Backend.