fedops blog

Privacy in Computing

Mon 06 December 2021

Read the M365 EULA

Posted by fedops in Cloud   

In July 2020 The Register, our usual source of unbiased news (ahem), ran an interesting piece on a pending court case against Microsoft in California. Within the article the actual complaint document was linked via courtlistener.com, to wit: https://www.courtlistener.com/recap/gov.uscourts.cand.362635/gov.uscourts.cand.362635.1.0.pdf.

This thing makes for good, if somewhat dry, reading. It's information straight from the horse's mouth, though, and if you skip it you might miss important things that you really want to know about.

The first interesting point, which I'll gloss over here, is the forwarding of data to Facebook. This is an interesting GDPR violation in its own right. And it probably raises even more eyebrows in 2021 after all the Facebook scandals. But Facebook snooping on you is so 2020.

The actual meat of the thing is the terms and conditions set forth by Microsoft in the O365/M365/Azure EULAs as to how they monetize the customer data they get access to as part of your daily operations. This is interesting enough for you as a private individual, and doubly so for any company using these services.

I doubt people actually read this legalese, but perhaps they should. Allow me to elaborate.

Your Data?

If you as a company IT department introduce Office365 (or Microsoft365 or whatever it's called tomorrow) you move most of your data storage from servers you control to servers Microsoft controls. You become a tenant on their Software-as-a-Service platform.

This is sometimes naïvely called "The Cloud", probably to "cloud" the fact that thousands of customers are sharing resources on tens of thousands of servers all controlled by the same entity, Microsoft. Think thousands of apartment dwellers living in the same tenement building, with the drunken facility caretaker holding one single key in his shaking hand which fits every apartment plus basement storage cubicle. And closed circuit TV.

This means that not only the data at rest is stored there, but also that most if not all communication flows operate over their environment, depending on how many of the "included at no charge" extra services you chose to use. And by golly are you going to want to use ALL of them because remember, as an IT boss your annual bonus depends on cost reductions. So why would you use any competing product from other companies, even if they were better?

So this includes emails with their attachments, and calendar appointments, also with attachments, sent through their Exchange systems. Any files your employees store in OneDrive, and everything that's stored in SharePoint sites. All login information in Azure Active Directory. All the chats, phone calls (including speech-to-text and translated calls), and channels including attachments/images/shared files in Teams, which themselves are stored inside dedicated SharePoint folders. And last but not least any source code and other artifacts inside Azure DevOps repos, should you use that.

So in short, Microsoft has complete access to any information, confidential or otherwise, you work with on a daily basis. This might well include data related to your own customers and other third parties who may or may not be aware of that fact.

Reading the complaint, things get interesting on page 17, where they start to quote from Microsoft's own Ts&Cs. Following are a few excerpts which I found interesting for my own employer's use case. Note that "third-party developers" are organizations that have a separate contract with Microsoft outside of and unbeknownst to you as a "customer".

Paragraph 83 on page 19:

"Among other things, Microsoft gives third-party developers information about the documents and projects those non-consenting business customers worked on. Microsoft allows those third-party developers to search the content of its business customers' emails and to access their schedules, locations, and availability status, i.e., whether they are "available" or "away.""

Non-consenting business customers - that would be you. Following on directly in paragraph 84:

"In advertising its developer platform to third-party developers, Microsoft touts the enormous value of its customers' data, highlighting how developers will get data not just about the authorized user, but also about other users who communicate with the authorized user. For example, Microsoft explains to developers that they can "perform searches for people who are relevant to the [Microsoft] user and have expressed an interest in communicating with that user" about specific topics, such as pizzas. Microsoft explains that "[t]opics in this context are just words that have been used most by users in email conversations. Microsoft extracts such words and creates an index for this data to facilitate . . . searches.""

Straightforward. Others that you have no dealings with may freely search through the contents plus metadata of your emails for interesting (to them) things. And of course they will not only see, for example, your emails, but also anybody else's that happen to be quoted or replied to, even if those people are not cloudified yet.
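To make the mechanism quoted in paragraph 84 a bit more tangible, here is a toy sketch of what "topics are just the most-used words, put into an index" could look like. This is emphatically not Microsoft's implementation: the function names (top_words, build_topic_index, search_people), the stopword list, and the two mailboxes are all made up for illustration. It only shows how little machinery is needed to go from "a pile of emails" to "who talks about pizza?".

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real system would use something far larger.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "for", "is", "on", "in", "at"}


def top_words(texts, n):
    """Return the n most frequently used non-stopword words across the texts."""
    counts = Counter(
        word
        for text in texts
        for word in re.findall(r"[a-z]+", text.lower())
        if word not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(n)]


def build_topic_index(mailboxes, n=2):
    """Map each 'topic' (a frequently used word) to the users who use it most."""
    index = defaultdict(set)
    for user, emails in mailboxes.items():
        for word in top_words(emails, n):
            index[word].add(user)
    return index


def search_people(index, topic):
    """Return the users associated with a topic, sorted for stable output."""
    return sorted(index.get(topic.lower(), set()))


# Invented sample data, standing in for the contents of two mailboxes.
mailboxes = {
    "alice@example.com": [
        "Pizza on Friday?",
        "Pizza order for the team",
        "Friday roadmap review",
    ],
    "bob@example.com": [
        "Salary slips attached",
        "Salary review meeting",
        "Budget review",
    ],
}

index = build_topic_index(mailboxes, n=2)
print(search_people(index, "pizza"))  # prints ['alice@example.com']
```

Note that nobody in this sketch tagged anything as a "topic": the index falls out of the email text itself, which is exactly the point the complaint is making.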

Ok, nothing to see here. Moving right along to paragraph 87:

"Microsoft uses and shares business customers' data - including the content of their documents, emails, email attachments, text, audio, and video files - with hundreds of subcontractors (or "subprocessors," as Microsoft sometimes calls them), not only to provide customers with the services they purchased, but also to serve Microsoft's separate commercial ventures, including discovering new business insights and developing new services, products, or features for Microsoft's benefit, such as artificial intelligence applications and development interfaces."

That's interesting. But surely at least these subcontractors would be held to store that data in encrypted form, right? Well, paragraph 89:

"Microsoft does not require its subcontractors to encrypt business customers' data and does not disclose that fact to its business customers. [...]"

Oh. So unsecured Amazon S3 buckets (or the equivalent in Azureworld), here we come...

Then let's see something about the Graph API, outlined in paragraphs 94 and 95:

"Microsoft boasts that Security Graph API is built off the "uniquely broad and deep" insights Microsoft obtained for itself by scanning "400 billion" of its customers' emails and "data from 700 million Azure user accounts." Microsoft also harvests business customer data to develop and sell to others a marketing product called Microsoft Audience Network, which Microsoft admits derives enormous value from processing customer data."

So as of 2020 Microsoft had hoovered up the data contained within 400 billion emails, plus data from 700 million Azure user accounts, and sold it on to its "partners" touting "uniquely broad and deep insights".

Gmail anyone? Except these aren't your usual private emails with pornhub invoices and your flight schedules to Thailand but rather EVERYTHING your employees are sending around all day long. Marketing plans? Check. Product roadmaps? Check. Salary slips, source code review documents, CAD drawings? Check check check. Those PowerPoint slide decks marked "Highly Confidential" you tell your people not to share with anyone? Check.

But wait, it gets even better:

"In Microsoft's own words: What sets Microsoft Audience Ads apart is their rich user understanding that powers high performance. The Microsoft Graph consists of robust data sets, including search and web activity, LinkedIn professional profiles, demographics and more."

Ah yes, web activity and LinkedIn. It's easy to forget: you're constantly logged in with your browser while you're working within Office365. And they've made it really convenient, with long-life session tokens that go for weeks without requiring a re-login. And while logged in ALL your web activity is accessible to them and linked to your specific user id.

Very convenient. And yes, it includes LinkedIn, another Microsoft property that people willingly publish internal information on. Such as who they work with and what they do within your company. And, less obviously, this also includes any other website that uses Microsoft's account system to authenticate/identify users.

Interesting. Surely that much data is only updated every so often and probably condensed into profiles? Hmm...

"The data is continually updated every second based on user activities. By mapping audience data on such an enormous scale, the Graph helps us spot trends and uncover insights, both of which allow you to effectively reach your customers."

And by "your customers", remember we're talking about the customers of the third parties here. That is, Microsoft customers you don't have any dealings with; you don't even know who they are, or who they in turn sell your data to.

Some of us have wondered why using things like Word365 requires a constant online connection, even when working on offline documents. How quaint. Well, now you know.

Summary

So to condense this into digestible facts:

  • Microsoft grants itself access to the complete set of your data as a cloud customer, in flight as well as at rest, including contents, metadata, and behavioral data.
  • It then enriches this with everything else you do on your computers while logged in, and provides said data over a real-time API to paying customers (of theirs) to use as they see fit.
  • It does not discriminate between PII, trade secrets, or anything else, and for the European crowd: yes, the data gets freely exported and sold outside of GDPR land.

If you sign up to any of this you're completely out of your mind.