Customer emails to terrapinworks@umd.edu are not being received by our team

Incident Report for UMD Engineering IT

Postmortem

Incident Postmortem: Email Delivery Service Disruption

Incident ID: nch65p2lsr9l
Duration: Approximately 27 hours (March 31, 2025 14:00 - April 1, 2025 17:00 EDT)
Impact: Email delivery to the Terrapin Works ticketing system was interrupted, affecting customer communications and service request processing.

Issue Summary

On April 1, 2025, we became aware that emails sent to terrapinworks@umd.edu were not being properly delivered to our ticketing system. This resulted in delayed responses to customer inquiries and service requests for approximately 24 hours prior to discovery of the issue. The issue affected all incoming email communications but did not impact existing tickets or other communication channels.

Root Cause

The incident stemmed from an incomplete email infrastructure migration that began two weeks prior. During a planned migration from a Google Group to a Google Account for our primary service email (terrapinworks@umd.edu), a critical step in the process was inadvertently omitted:

  1. The existing Google Group was successfully renamed to terrapinworks-backup@umd.edu
  2. A new Google Account (terrapinworks@umd.edu) was properly created
  3. However, the email alias automatically created during the renaming process was not removed from the backup account

This configuration allowed both addresses to receive emails intended for terrapinworks@umd.edu. The system continued to function temporarily because our JitBit ticketing system was configured to pull emails from a forwarded address (terrapinworks@eng.umd.edu). When this forwarding mechanism unexpectedly failed on March 31, all email ingestion to the ticketing system ceased.

Resolution and Recovery

Our technical team implemented the following actions to resolve the issue:

  1. Identified the email delivery failure through our monitoring alerts and customer reports.
  2. Diagnosed the root cause as an authentication issue between JitBit and our email infrastructure.
  3. Collaborated with UMD's Division of IT (DIT) UNIX/Email Services team to:
     * Reconfigure JitBit to pull emails directly from the new terrapinworks@umd.edu Google Account via OAuth.
     * Remove the conflicting alias from the backup account.
     * Establish proper email forwarding rules.
  4. Executed comprehensive testing to verify email flow through the entire system.
  5. Confirmed successful email delivery to the ticketing system at 17:00 EDT on April 1, 2025.
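
As an illustration of what end-to-end verification of email flow can look like, here is a minimal Python sketch of a "canary" check: build a test message with a unique token in the subject, then confirm that token shows up among the tickets created afterwards. It is not the procedure our team actually ran. The function names, the sender address, and the idea of comparing against a list of ticket subjects are assumptions made for the example; sending the message and fetching subjects from the ticketing system are deliberately left out.

```python
import uuid
from email.message import EmailMessage


def build_canary(to_addr: str, from_addr: str) -> tuple[EmailMessage, str]:
    """Build a test message whose subject carries a unique token."""
    token = uuid.uuid4().hex
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = f"[canary {token}] mail flow check"
    msg.set_content("Automated end-to-end delivery test. Safe to close.")
    return msg, token


def canary_arrived(ticket_subjects: list[str], token: str) -> bool:
    """Return True if any ticket subject contains the canary token."""
    return any(token in subject for subject in ticket_subjects)


# Hypothetical usage: the sender address is a placeholder, and
# `ticket_subjects` would come from the ticketing system after a short wait.
msg, token = build_canary("terrapinworks@umd.edu", "monitor@example.org")
ticket_subjects = ["Re: print job", str(msg["Subject"])]  # simulated arrival
arrived = canary_arrived(ticket_subjects, token)
```

A unique token per run avoids mistaking an older test ticket for a fresh delivery.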

Corrective and Preventative Measures

To prevent similar incidents in the future, we are implementing the following improvements:

  1. Enhanced Migration Procedures:
     * Developed a comprehensive checklist for email infrastructure changes.
     * Instituted mandatory verification testing after each migration step.
     * Created documentation detailing the complete Google Group to Google Account migration process.
  2. Process Improvements:
     * Established clearer communication channels with DIT for account management requests.
     * Scheduled regular audits of email configurations.
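
One check that an email configuration audit could include is looking for addresses claimed by more than one group or account, which is the kind of overlap described under Root Cause. The Python sketch below assumes a hypothetical directory snapshot that maps each primary address to its aliases. How that snapshot is exported is out of scope, and the function and variable names are illustrative rather than part of any tool we use.

```python
from collections import defaultdict


def find_address_conflicts(directory: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each address claimed by more than one entity to its claimants.

    `directory` maps an entity's primary address to its list of aliases.
    """
    claimants = defaultdict(set)
    for primary, aliases in directory.items():
        for address in [primary, *aliases]:
            claimants[address.lower()].add(primary.lower())
    return {
        address: sorted(owners)
        for address, owners in claimants.items()
        if len(owners) > 1
    }


# Hypothetical snapshot mirroring the state described in this report:
# the renamed group still carries the original address as an alias.
snapshot = {
    "terrapinworks@umd.edu": [],
    "terrapinworks-backup@umd.edu": ["terrapinworks@umd.edu"],
}
conflicts = find_address_conflicts(snapshot)
```

With this snapshot, `conflicts` has a single entry for terrapinworks@umd.edu listing both the backup group and the new account as claimants.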

We sincerely apologize for any inconvenience this service disruption may have caused. Our team remains committed to providing reliable service and continuously improving our systems to prevent similar incidents in the future.

Posted Apr 01, 2025 - 17:25 EDT

Resolved

We were able to work with Division of IT to quickly rectify the issue. Inbound emails are now confirmed as being received by our team.
Posted Apr 01, 2025 - 17:09 EDT

Identified

The issue has been identified and a fix is being implemented.
Posted Apr 01, 2025 - 15:47 EDT
This incident affected: Core Services (Ticketing System (JitBit)).