Has SenderBase.org caused widespread email disruptions this week?
Note: This post is about an ongoing issue and is based on available information. Updates will be appended, and corrections made, as new information becomes available.
InfusionPoints, like many of our customers, uses Office365 as its email service provider. On 9-May, we became aware that our email was being bounced by both a large business partner and a major government agency, to each of which we typically send tens to hundreds of emails daily. Specifically, we would receive a delay non-delivery report (NDR) a few hours after sending an email to these organizations. On investigation via Forefront Online Protection for Exchange (FOPE), we found that our messages were in deferral because the recipient system had characterized our mail system as untrusted. An example trace from FOPE is provided below.
From the trace, it is apparent that the recipient system at va.gov is characterizing the email (or the email sender's infrastructure) as untrustworthy.
In Deferral: 452 Your message could not be delivered to the intended recipient. This may because your message is characterized to be unsolicited email or from an untrusted mail delivery system. We s…[sic]
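For readers less familiar with SMTP reply codes: the first digit of the three-digit code tells the sending server what to do next. A 4xx reply such as the 452 above is a transient failure, so the sender keeps the message queued and retries, which is consistent with our seeing delay notices hours later rather than an immediate bounce. The following is a minimal Python sketch of that classification. The function name is ours, chosen for illustration; it is not part of FOPE or of any mail server.

```python
def classify_smtp_reply(reply: str) -> str:
    """Classify an SMTP reply line by the first digit of its status code."""
    parts = reply.strip().split(maxsplit=1)
    code = parts[0][:3] if parts else ""
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an SMTP reply: {reply!r}")

    # Reply classes as defined by the SMTP standard (RFC 5321).
    classes = {
        "2": "success",
        "3": "intermediate (server is waiting for more input)",
        "4": "transient failure (sender should retry later)",
        "5": "permanent failure (sender should bounce)",
    }
    if code[0] not in classes:
        raise ValueError(f"unexpected SMTP reply class: {code}")
    return classes[code[0]]
```

Calling `classify_smtp_reply("452 Your message could not be delivered to the intended recipient.")` returns the transient-failure string, whereas a 550 reply would be classified as permanent.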
Around this time, we found that numerous other Office365 tenants were experiencing the same issue when sending to va.gov, as well as to other government and non-government domains. The play-by-play of the painstaking process of finding these other tenants late last week and over the weekend is documented on this Microsoft community portal thread, which at the time of writing is the most active thread on the community portal. Collectively, we've been able to help Microsoft link more than six open service requests together, and through our contacts we have escalated the issue to the highest levels at Microsoft. From what we've been able to surmise, it is safe to say that all Office365 tenants are being blocked from sending to these domains, and probably others as well.
Microsoft's original suggestion to resolve the issue was to contact our recipient's tech support and ask that they add Microsoft's IP blocks to their whitelists. Yeah, good luck with that! Federal agencies like VA will flat-out refuse these requests based on policy… The following was copied from VA's response to us:
The sending Mail servers from Microsoft have a negative Sender Base Reputation Score (SBRS). Inbound email coming from any one of the servers that have a negative SBRS are going to be throttled to a specific number of recipients per hour. The only way to resolve this is to contact your ISP and have them work with Senderbase.org to clean up their servers and improve their scores.
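To make VA's description concrete, here is a rough Python sketch of a recipient-side throttle keyed on reputation. This is not how IronPort or VA's gateway is actually implemented; the class name, the default limit of 20 recipients per hour, and the reply texts are all invented for illustration. The only behavior taken from VA's response is that servers with a negative SBRS are limited to a fixed number of recipients per hour.

```python
from collections import defaultdict


class ReputationThrottle:
    """Hypothetical per-hour recipient limit for senders with a negative SBRS."""

    def __init__(self, max_recipients_per_hour: int = 20):  # assumed value
        self.limit = max_recipients_per_hour
        # (sender_ip, hour_bucket) -> recipients accepted in that hour
        self._counts = defaultdict(int)

    def check(self, sender_ip: str, sbrs: float, recipients: int, now: float) -> str:
        # Senders without a negative score are not throttled in this sketch.
        if sbrs >= 0:
            return "250 accepted"
        bucket = (sender_ip, int(now // 3600))
        if self._counts[bucket] + recipients > self.limit:
            # Transient failure: the sending server queues and retries later.
            return "452 deferred, try again later"
        self._counts[bucket] += recipients
        return "250 accepted"
```

Because the counter in this sketch is keyed on the sending server's IP address, every sender relaying through that server draws on the same hourly allowance.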
So Microsoft's scores are low with Senderbase (owned by Cisco). Microsoft is telling me to contact the recipient. The recipient is telling me to tell Microsoft (our ISP in this case) to clean up their servers. And now there's a third party, Senderbase.org, involved. So, this is starting to be comical… Here we have three large organizations each pointing at someone else, leaving small companies like ours (who typically have the least means at their disposal) to essentially "crowd-source" a resolution to the problem. And Cloud Services are supposed to be the Best Thing Ever™ for Small Business? But that is a separate blog post…
Senderbase.org is "the world's largest email and Web traffic monitoring network" according to Cisco's about page at Senderbase.org. Their front page also makes it very easy to find the statement that they are not blocking your email, via two links that simply say "Blocked" and link here…
I guess this is a pre-emptive response to anyone who might ask that question? And they are technically correct: they do not directly block email, as far as I can tell… They just score your email infrastructure using some algorithm and then provide that score to their broad list of subscribers, who implicitly trust those scores as a metric to decide whether or not to block your email... back to this in a minute.
In the midst of putting all of this together, we began to hear evidence that this issue may be broader and further reaching, affecting more senders than just Office365 tenants. We heard that a major government contractor has unexpectedly started blocking email from an unusually high number of domains, including fdic.gov, dhs.gov, usda.gov, va.gov and Accenture.com. Because we have no other information available, we can only wonder (assume?) whether this major government contractor also uses senderbase.org to score its senders.
Then, commenter Michshin on the previously mentioned Microsoft community portal thread drops this bomb on us on Sunday, 13-May at 2:50PM EDT.
…Cisco changed their senderbase scoring system, causing various folks to drop from Good to neutral. Anyone who subscribes to the Senderbase.org scoring system to throttle or filter email is having this problem. Cisco need to fix their busted algorithm. After contacting cisco, they said that subscribers should change their settings to allow free passage (whitelisting) of neutral mail. Everyone needs to beat up senderbase to fix their false positive determination. Remember that Microsoft and the end users did not change their email behavior pattern, cisco changed their algorithm in how the rate folks, if you don't send enough email then they changed their threshold for rating you to neutral; doesn't mean that they've confirmed that you're doing anything bad, they just don't have enough data to give you a Good rating.
We have no source to verify the claims in this comment (we have requested sources), but as it stands, this explanation makes the most sense out of everything that has been put together so far. So the question is: did Senderbase.org take the "Ready, Fire, Aim" approach with a change to their algorithm (if their algorithm did in fact change)? If the algorithm did change, were the changes communicated appropriately to those who subscribe to these ratings for the purposes of blocking email?
This whole issue reminds me of the problems with credit agencies several years back. In those days, consumers had little recourse to resolve problem credit scores, and it was difficult to even find out what was in their credit report. The scoring mechanisms were unpublished, and were trusted implicitly by those issuing credit. If a mistake was made with a credit score, it was solely up to the consumer to resolve it, with very little information at their disposal. In light of Cloud Services' increasing prominence, are reforms similar to those made in the credit industry needed? We'll be exploring those issues in future blog posts.
If anyone with more information or interest comes across this thread, please contact me via our contact form. Thanks for reading, and we'll post updates as they become available.
Update 5/14/2012 4:15pm EDT
Cisco has sent out the following tweets this afternoon pointing to Microsoft.
@shr0p Cisco observed that MSFT had a spam incident last week that cuased a negative impact to the reputation on some of their IP addresses— Cisco Security (@CiscoSecurity) May 14, 2012
@shr0p We are working closely with MSFT to help with the recovery of the repuation to the IPs affected. Thanks for reaching out!— Cisco Security (@CiscoSecurity) May 14, 2012
Update 5/14/2012 5:30pm EDT
A VP at Microsoft who is in contact with some of the customers on the community forum thread sent the following...
We have worked with Cisco to set the reputation of our outbound edge servers to 0. That still puts them in the neutral category, but many customers are only queuing mail from servers with Neutral reputation and a negative score. We have posted guidance to this effect for our customers, Cisco has provided guidance on how to configure to work around the issues in IronPort, and we continue to work to improve the reputation and get back into the ‘good’ rating. It will continue to take time as we need enough good mail to be sent to train Cisco’s system on our IPs. Resetting the baseline to 0 should materially accelerate the process thought I have no firm SLA. As you can imagine we are monitoring this very closely across Cisco and Microsoft.
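Why would resetting the score to 0 help if the servers are still only "neutral"? A small Python sketch of the recipient policy described in the quote above may make it clearer. SBRS values run from -10 to +10; the category cut-offs below (+3 and -3), the function names, and the handling of the "poor" category are our own assumptions for illustration, not Cisco's published thresholds or IronPort's defaults.

```python
def sbrs_category(score: float, good_at: float = 3.0, poor_below: float = -3.0) -> str:
    """Map an SBRS value (-10 to +10) to a category. Cut-offs are assumed."""
    if not -10.0 <= score <= 10.0:
        raise ValueError("SBRS values range from -10 to +10")
    if score >= good_at:
        return "good"
    if score < poor_below:
        return "poor"
    return "neutral"


def recipient_action(score: float) -> str:
    """Hypothetical policy: queue mail from neutral senders with a negative score."""
    category = sbrs_category(score)
    if category == "poor":
        return "block"  # assumption; the quote does not cover this case
    if category == "neutral" and score < 0:
        return "queue"
    return "accept"
```

Under these assumed cut-offs, a server at -0.5 is neutral and gets queued, while a server at exactly 0 is still neutral but is accepted, which matches the workaround the VP describes.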
Update 5/15/2012 2:37pm EDT
After Cisco's "meh" reply to my tweets yesterday, I responded with this late last night. No reply received, nor expected.
As of 1:30pm, we are seeing improvements in Microsoft's server scores, and it appears that mail flow to VA.gov is improving; however, we are still experiencing some delays in our emails being sent to VA.gov.