April 23, 2014

What is known about NSA's PRISM program

In June last year, the Snowden-leaks started with the disclosure of the PRISM program. For many people it stands for NSA surveillance in general, even though they often still have no clear idea of what PRISM is actually about.

Therefore, this article presents a wrap-up of almost everything we know about the PRISM program, combining information from my earlier postings with information from other media and government sources.

The famous internal NSA presentation about PRISM consists of 41 slides. Edward Snowden initially asked The Washington Post to publish the full slide deck, but the paper refused, so at first only four slides were published. A few other slides were revealed later on. In total, 13 slides have been published and another 4 were incidentally or partially shown on television.

All of them are shown here, in an order that probably comes closest to that of the original presentation. The slides are numbered here for reference only. If new slides from this PRISM presentation become available, they will be added here.

1. This slide was one of the first four revealed by The Guardian and The Washington Post on June 6, 2013, and shows the title of the presentation.

All slides are marked TOP SECRET//SI//ORCON/NOFORN, which means they are classified as Top Secret and protected by the control system for Special Intelligence (SI). The dissemination is strictly controlled by the originator, while it's generally prohibited to release them to foreign nationals.

The SIGINT Activity Designator (SIGAD) of the PRISM program is US-984XN, which indicates that PRISM is part of the BLARNEY-family and used for collecting data under the authority of the FISA Amendments Act.

> See also: PRISM as part of the BLARNEY program

The media have redacted the name of the PRISM collection manager, a title which is followed by S35333, NSA's internal organization designator for a unit of Special Source Operations (SSO). The logo of this division is in the top left corner of each slide, with a logo for the PRISM program itself in the opposite corner.

Immediately after the first slides of the presentation were published, some people thought it could be fake or photoshopped because of its rather unprofessional-looking design and copy-paste elements. Now that more slides have become available, we can assume the presentation is genuine.

> See also: Are the NSA's PRISM-slides photoshopped?

This presentation about PRISM was given in April 2013, just a month before Edward Snowden left his job at NSA, and therefore it seems to be one of the most recent documents he was able to download from the internal NSA network.

General aspects of PRISM

The following slides are about the workings of the PRISM program in general:

2. This slide was one of the first four revealed by The Guardian and The Washington Post on June 6, 2013, and shows a short introduction of the world's telecommunications backbone.

The diagram shows that the majority of international communications from Latin America, Europe and even from Asia flow through the United States, which makes it easy for NSA to intercept them on American soil.

Note that most of the communications from Africa (the continent where many terrorists from the Middle East went to in recent years) are going through Europe, which explains why NSA sometimes needs European partner agencies (like from the Netherlands) to access them.

3. This slide was one of the first four revealed by The Guardian and The Washington Post on June 6, 2013, and shows which internet companies are involved and what kinds of communications can be received by the NSA.

We see that under PRISM the NSA is able to collect e-mail, chat, video and voice messages, photos, stored data and the like. But there are also "Notifications of target activity - logins, etc". This was interpreted by The Washington Post as a function that gives NSA analysts live notifications "when a target logs on or sends an e-mail".

But as these notifications are clearly listed as collected data (see also slide 8 down below), it's more likely they refer to the notification messages you get when someone logs in at an internet chatroom or an instant messenger, or when you receive an e-mail through an e-mail client.

It is possible though that NSA analysts can get a notification when new communications from a target they are watching become available in NSA systems. Whether (near) real-time monitoring of a target's communications is possible depends on the way these data are made available to NSA (see slide 5 below).

4. This slide was one of the first four revealed by The Guardian and The Washington Post on June 6, 2013, and shows the dates when PRISM collection began for each provider:
- Microsoft: September 11, 2007
- Yahoo: March 12, 2008
- Google: January 14, 2009
- Facebook: June 3, 2009
- PalTalk: December 7, 2009
- YouTube: September 24, 2010
- Skype: February 2, 2011
- AOL: March 31, 2011
- Apple: October 2012

According to the book 'Der NSA Komplex', published by Der Spiegel in March 2014, PRISM also gained access to Microsoft's cloud service SkyDrive as of March 2013. This was realized after months of cooperation between the FBI and Microsoft.*

The Washington Post reported that in the speaker's notes accompanying the presentation, it's said that "98 percent of PRISM production is based on Yahoo, Google and Microsoft; we need to make sure we don’t harm these sources". The Post also says that "PalTalk, although much smaller, has hosted traffic of substantial intelligence interest during the Arab Spring and in the ongoing Syrian civil war".

The program cost of 20 million dollars per year was initially interpreted as the cost of the program itself, but it later came out that NSA reimburses expenses made by cooperating corporations, so it seems more likely that the 20 million is the total amount NSA pays to the companies involved in the PRISM program.

5. This slide was one of four disclosed by The Washington Post on June 29, 2013 and shows the PRISM tasking process, i.e. how the actual collection facilities are instructed about which data should be gathered.

The process starts with an NSA analyst entering selectors into the Unified Targeting Tool (UTT). In this case, selectors can be e-mail or IP addresses, but not keywords. According to an article in the French paper Le Monde, some 45,000 selectors are involved in the PRISM collection.

Analysts can order data from two different sources:
- Surveillance, which means communications that will happen from the moment the target was selected (although the media interpreted this as the ability to real-time "monitor a voice, text or voice chat as it happens")
- Stored Comms, which are communications stored by the various providers dating from before the moment the target was selected

Edward Snowden vehemently accuses NSA of lacking control and oversight mechanisms, which, according to him, gives analysts unrestricted access to the communications of virtually everyone in the world. But the diagram in the slide clearly shows that there are multiple steps for approving every collection request:

1. For Surveillance, a first review is done by an FAA Adjudicator in the analyst's Product Line (S2); for Stored Comms there's a review by the Special FISA Oversight and Processing unit (SV4).

2. A second and final review is done in both cases by the Targeting and Mission Management (S343) unit. Only after passing both stages is the request released through the UTT and the PRINTAURA distribution managing system.

3. For Stored Comms the Electronic Communications Surveillance Unit (ECSU) of the FBI even does a third check against its own database to filter out known Americans.

Then it's the Data Intercept Technology Unit (DITU) of the FBI that goes to the various internet companies to pick up the requested data and then sends them back to NSA.

As indicated by companies like Google, they deliver the information to the FBI in different ways, like through a secure FTP transfer, an encrypted dropbox or even in person. According to a report by the journalist Declan McCullagh, the companies prefer installing their own monitoring capabilities on their networks and servers, instead of allowing the FBI to plug in government-controlled equipment.

> See also: The PRISM tasking process

6. This slide was shown on Brazilian television and also seems to be about PRISM Tasking, more specifically about a procedure for emergency tasking when lives are in danger. The slide was uploaded to Wikipedia, where there's also a transcript of the text:
[...] your targets meet FAA criteria, you should consider tasking to FAA.
Emergency tasking processes exist for [imminent/immediate] threat to life situations and targets can be placed on [...] within hours (surveillance and stored comms).
Get to know your Product line FAA adjudicators and FAA leads.

According to an NSA report (pdf) published in April 2014, analysts "may seek to query a U.S. person identifier when there is an imminent threat to life, such as a hostage situation".

Just like a number of other slides and fragments thereof shown on television, there seems to be no good reason why slides like this are still not published in a clear and proper way. They contain nothing that endangers the national security of the US; publishing them would instead help us to much better understand how the PRISM program is actually used.

7. This slide was one of four disclosed by The Washington Post on June 29, 2013.

It shows the flow of data which are collected under the PRISM program. Again we see that it's the FBI's DITU that picks up the data at the various providers and sends them to the PRINTAURA system at NSA.

From PRINTAURA some of the data are directed to TRAFFICTHIEF, which is a database for metadata about specifically selected e-mail addresses and is part of the TURBULENCE umbrella program for detecting threats in cyberspace.

The main stream of data is sent through SCISSORS, which seems to be used for separating different types of data and protocols. Metadata and voice content then pass the ingest processing systems FALLOUT and CONVEYANCE respectively. Finally, the data are stored in the following NSA databases:
- MARINA: for internet metadata
- MAINWAY: for phone call metadata
- NUCLEON: for voice content
- PINWALE: for internet content, video content, and "FAA partitions"

> See also: Storage of collected PRISM data

8. This slide was one of four disclosed by The Washington Post on June 29, 2013.

It shows the composition of the Case Notation (CASN) which is assigned to all communications which are intercepted under the PRISM program.

We see that there are positions for identifying the providers, the type of content, the year and a serial number. Also there's a fixed trigraph which denotes the source. For NSA's PRISM collection this trigraph is SQC. From another document (pdf) we learn that the trigraph for FISA data used by the FBI is SQF.

The abbreviations stand for: IM = Instant Messaging; RTN-EDC = Real Time Notification-Electronic Data Communication(?); RTN-IM = Real Time Notification-Instant Messaging; OSN = Online Social Networking.
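To illustrate how such a composed identifier works, here's a minimal sketch of a case notation parser. The field order and widths used below are assumptions for illustration only: the slide describes positions for a provider, a content type, the fixed trigraph, a year and a serial number, but the exact layout isn't fully public, and the letter-to-content-type mapping in this sketch is hypothetical.

```python
import re
from dataclasses import dataclass

# Provider designators as shown on the published slide 8
PROVIDERS = {"P1": "Microsoft", "P2": "Yahoo", "P3": "Google",
             "P4": "Facebook", "P5": "PalTalk", "P6": "YouTube",
             "P7": "Skype", "P8": "AOL", "PA": "Apple"}

# Hypothetical letter codes for the content types mentioned in the slide
CONTENT_TYPES = {"A": "Stored Comms", "B": "IM", "C": "RTN-EDC",
                 "D": "RTN-IM", "E": "E-mail", "F": "OSN"}

@dataclass
class CaseNotation:
    provider: str
    content_type: str
    trigraph: str   # SQC for NSA's PRISM collection, SQF for FBI FISA data
    year: str
    serial: str

# Assumed layout: provider (2) + content type (1) + trigraph (3)
# + two-digit year + six-digit serial
PATTERN = re.compile(r"^(P[1-9A])([A-Z])(SQC|SQF)(\d{2})(\d{6})$")

def parse_casn(casn: str) -> CaseNotation:
    m = PATTERN.match(casn)
    if not m:
        raise ValueError(f"not a valid case notation: {casn!r}")
    prov, ctype, tri, year, serial = m.groups()
    return CaseNotation(PROVIDERS[prov], CONTENT_TYPES.get(ctype, ctype),
                        tri, "20" + year, serial)
```

Under these assumptions, a notation like `P2ESQC13001234` would decode to a Yahoo e-mail record collected by NSA's PRISM program in 2013.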

> See for more about this slide: PRISM case notations

9. This slide was one of four disclosed by The Washington Post on June 29, 2013.

The slide shows a screenshot of a web-based application called REPRISMFISA, which is probably accessible through the web address that is blacked out by the Post. Unfortunately there's no further explanation of what application we see here, but the name REPRISMFISA suggests the application is for going "back to data collected under the PRISM program according to the Foreign Intelligence Surveillance Act (FISA)".

In the center of the page there are three clickable icons: PRISM, FBI FISA and DOJ FISA. This shows that NSA, the FBI and the Department of Justice (DOJ) all use data collected under the authority of the Foreign Intelligence Surveillance Act (FISA), and that the NSA's part is codenamed PRISM.

Below these icons there is a search field for querying one or more databases, resulting in a partial list of records. The search options seem rather limited, as only two keywords can be entered, with an additional "and/or" option. At the left there's a column presenting a number of options for showing totals of PRISM entries.

> See for more about this slide: Searching the collected data

Section 702 FAA Operations

The following slides are about how PRISM can be used to collect various types of data. This collection is governed by section 702 of the FISA Amendments Act (FAA), which in NSA-speak is called FAA702 or simply 702.

Section 702 FAA was enacted in 2008 in order to legalize the interception that had been going on since 2001 and that became known as "warrantless wiretapping" because it was only authorized by a secret order of president George W. Bush. The FAA was re-authorized by Congress in December 2012 and extended for five years.

Under section 702 FAA, NSA is authorized to acquire foreign intelligence information by intercepting the content of communications of non-US persons who are reasonably believed to be located outside the US. This interception takes place inside the United States with the cooperation of American telecommunication and internet companies.

Operations under the original Foreign Intelligence Surveillance Act (FISA) from 1978 require an individual determination by the FISA Court, but under FAA the Attorney General and the Director of National Intelligence (DNI) certify an annual list of targets, which is then reviewed by the FISA Court to determine whether these certifications meet the statutory requirements.

10. This slide was additionally published by The Guardian on June 8, 2013, to clarify that PRISM, which involves data collection from servers, is distinct from the programs FAIRVIEW, STORMBREW, BLARNEY and OAKSTAR. These involve data collection from "fiber cables and infrastructure as data flows past", which is called Upstream collection.

NSA can collect data that flow through the internet backbone cables, as well as data that are stored on the servers of companies like Google, Facebook, Apple, etc. The latter are collected "directly from the servers" as opposed to the communications that are still on their way to those servers when passing through the main internet cables and switches.

The words "directly from the servers" were misinterpreted by The Guardian and The Washington Post, leading to the claim that NSA had "direct access" to the servers of the internet service providers. As the next slide will show, there's no such direct access.

(The claim of NSA having "direct access" was not only based on this slide, but also on misreading a section from the draft of a 2009 NSA Inspector General report about the STELLARWIND program, which on page 17 says: "collection managers sent content tasking instructions directly to equipment installed at company-controlled locations". The Washington Post thought this referred to the companies involved in the PRISM program, but it actually was about Upstream Collection, which has filters installed at major internet switches. This follows from two facts: first, that the STELLARWIND program was terminated in January 2007 while PRISM only started later that year; second, that STELLARWIND only involved companies that operate the internet and telephony backbone cables, like AT&T and Verizon, not internet service providers like Microsoft and Google)

An important thing that wasn't well explained by the media, is that not only PRISM, but also the domestic part of Upstream collection is legally based upon section 702 FAA. Note that NSA also conducts Upstream collection under three other legal authorities: FISA and Transit inside the US and Executive Order 12333 when the collection takes place abroad.

> See for more: Slides about NSA's Upstream collection

From a 2011 FISA Court ruling (pdf) that was declassified at the request of the Electronic Frontier Foundation we learn that under section 702 FAA, NSA acquires some 250 million "internet communications" each year. This number breaks down as follows:
- Upstream: ca. 9% or more than 22 million communications
- PRISM: ca. 91% or more than 227 million communications
The ruling doesn't explain what exactly an "internet communication" is.
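The percentages in the ruling are consistent with the absolute numbers above, as a quick calculation shows:

```python
# Annual "internet communications" acquired under section 702 FAA,
# according to the declassified 2011 FISA Court ruling
total = 250_000_000

upstream = int(total * 0.09)  # ca. 9%  -> "more than 22 million"
prism = int(total * 0.91)     # ca. 91% -> "more than 227 million"

print(upstream)  # 22500000
print(prism)     # 227500000
```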

11. This slide was one of three published on the website of the French paper Le Monde on October 22, 2013. It compares the main features of the PRISM program and the Upstream collection.

Direct Access?

The last line says that for PRISM there is no "Direct Relationship with Comms Providers". Data are collected through the FBI. This clearly contradicts the initial story by The Guardian and The Washington Post, which claimed that NSA had "direct access" to the servers of the internet companies. This led to spectacular headlines, but also a lot of confusion, as it allowed the companies involved to strongly deny any direct relationship with the NSA - because it's actually the FBI that is picking up their data.

Had this slide been published at the very beginning, more adequate questions could have been asked, and we would probably have gotten answers that made more sense.

A direct relationship does exist however with the companies which are involved in the Upstream collection, like AT&T and Verizon, who most likely have high volume filtering devices like the Narus STA 6400 installed at their switching stations. Unlike intercept facilities outside the US, where the XKeyscore system can store and search 3 days of content, the sites inside the US only seem to filter data as they flow past, and hence there's no access to Stored Communications.

About Collection

The slide also shows that the so-called "Abouts" collection is only possible under the Upstream method. As we learned from a hearing of the Privacy and Civil Liberties Oversight Board (PCLOB), this About Collection is not for gathering communications to or from a certain target, but 'about' a specific selector: for example, an e-mail message in which an e-mail address or a phone number of a known suspect is mentioned. About Collection does not look for names or keywords.

Because under Upstream NSA is allowed to do About Collection, which pulls in a broader range of communications, the retention period (the time the data are stored) is only two years. Data collected under PRISM, which are restricted to communications to and from specific addresses, are stored for the standard period of five years. Neither under PRISM nor under Upstream is there collection based upon keywords.

12. The slide was seen in a television report and shows a world map with the undersea fiber-optic cables, sized according to the volumes of data they transmit. This map is used as the background of a number of other slides about FAA 702 Operations. It seems that additional information, like in the next slide, appears by mouse-clicking in the original PowerPoint presentation.

13. The slide shows the same world map with fiber-optic cables and is hardly readable, but according to Wikipedia, the subheader reads "Collection only possible under FAA702 Authority" and in the central cyan-colored box the codenames FAIRVIEW and STORMBREW are shown in succession. Maybe other codenames are in the yellow box at the right side. It's not clear what the irregular blue shapes in the Indian Ocean are. The figure to the right of New Zealand is a stereotypical depiction of a terrorist with a turban.

14. This partial slide was seen on the laptop of Glenn Greenwald in a report by Brazilian television and shows two scenarios for collecting data under FAA 702 authority. It has two boxes with text; the one on the right reads:
Scenario #2
OPI tasks badguy@yahoo.com under FAA702 and 12333 authority in UTT
Badguy sends e-mail from [outside?] U.S. and comms flow inside U.S.
FAIRVIEW sees selector but can't tell if destination end is U.S. or foreign
Collection allowed
Only the target end needs to be foreign
OPI stands for Office of Primary Interest and UTT for Unified Targeting Tool, the NSA application used for instructing the actual collection facilities.

15. This slide was one of three published on the website of the French paper Le Monde on October 22, 2013.

It shows a list of 35 IP addresses and domain names which are the "Higher Volume Domains Collected from FAA Passive". This indicates that data from these domains are collected from fiber optic cables and other internet infrastructures, which is called Passive or Upstream collection.

All IP addresses and domain names are blacked out, except for two French domains: wanadoo.fr (a major French internet service provider) and alcatel-lucent.com (a major French-American telecommunications company). The rest of the list will most likely contain many similar domain names, which shows that redactions of the Snowden-documents are made not only to protect legitimate security interests, but also when the papers, in this case Le Monde, want to keep the revelations strictly focused on their own audience.

Reporting based on PRISM

The following slides show some of the results from the PRISM program:

16. This slide was one of three published on the website of the French paper Le Monde on October 22, 2013.

It shows a highlight of reporting under the section 702 FAA authority, which in this case includes both PRISM and the STORMBREW program of the Upstream collection capability. Information derived from both sources enabled the NSA/CSS Threat Operations Center (NTOC) to figure out that someone had gotten access to the network of a cleared defense contractor (CDC) and was either preparing to exfiltrate, or at least had the ability to exfiltrate, 150 gigabytes of important data. NTOC then alerted the FBI, which alerted the contractor, and the hole was plugged the same day, apparently December 14, 2012.

Another cyber attack detected by PRISM occurred in 2011 and was directed against the Pentagon and major defense contractors. According to the book 'Der NSA Komplex', this attack was codenamed LEGION YANKEE, which indicates that it was most likely conducted by Chinese hackers.*

17. This slide of the PRISM presentation appeared on the website of O Globo. It is titled "A Week in the Life of PRISM Reporting" and shows some samples of reporting topics from early February 2013.

It seems the bottom part of this slide was blacked (or rather whited) out by Brazilian media, as the Indian paper The Hindu disclosed that this slide also mentions "politics, space, nuclear" as topics under "India", as well as information from Asian and African countries, contributing to a total of "589 End product Reports".

These lists show that collection under the PRISM program is not restricted to counter-terrorism, but neither is it about monitoring ordinary people all over the world, as many people still think. PRISM is used for gathering information about a range of targets derived from the topics in the NSA's Strategic Mission List (pdf). The 2007 edition of this list was also among the Snowden-documents and was subsequently published, but got hardly any attention.

In 2012, data collected under PRISM were cited as a source in 1,477 items of the President's Daily Brief, making this program one of the main contributors to the Top Secret intelligence briefing that is provided to the president each morning.

Links and Sources
- MatthewAid.com: New NSA Report on Its Electronic Eavesdropping Programs
- DNI.gov: NSA's Implementation of Foreign Intelligence Surveillance Act Section 702 (pdf)
- TED.com: Edward Snowden: Here's how we take back the Internet
- C-Span.org: Privacy and Civil Liberties Oversight Board Hearing, Government Officials Panel
- TechDirt.com: Why Does The NSA Focus So Much On 'TERROR!' When PRISM's Success Story Is About Cybersecurity?
- SealedAbstract.com: The part of the FISC NSA decision you missed
- GlobalResearch.com: New Documents Shed Light on NSA’s Dragnet Surveillance

March 25, 2014

Some SIGINT and COMSEC during the Nuclear Security Summit

(UPDATED: March 27, 2014)

On March 24 and 25, the third Nuclear Security Summit (NSS) is held in The Hague, the seat of the government of the Netherlands. As 58 world leaders will be present, including US president Obama, the summit takes place under severe security measures.

Here we will take a look at some noticeable things on the Signals Intelligence (SIGINT) and Communications Security (COMSEC) front, especially regarding the American president. When new details or pictures come up, they will be added.

US presidential motorcade

On the morning of Monday, March 24, president Obama flew in aboard Air Force One, accompanied by all the famous vehicles, like the helicopters that become Marine One when he is aboard, and the cars of the presidential motorcade. As can be seen in this video, there are actually 4 identical and heavily armored presidential limousines, so two motorcades, each with two identical limousines, can be formed - so no one knows which one is carrying the president.

One of the last cars in the motorcade is the WHCA Roadrunner which is recognizable by a small satellite dome and a number of VHF whip antennas on a roof platform. Also known as the Mobile Command and Control Vehicle (MC2V), it serves as the communications hub for the motorcade by beaming up encrypted duplex radio and streaming video to a military satellite, which in turn beams that data back down to a ground entry point and through to the switchboard of the White House Communication Agency (WHCA).

The WHCA Roadrunner arriving in Noordwijk, where Obama is staying
(still from a video by VOLMedia)

The WHCA Roadrunner on its way from Noordwijk to The Hague
with the satellite dome and 5 VHF antennas on its roof
(photo: madebymaus posted at NUfoto.nl)

US communications aircraft

Also present is an aircraft most people don't know of and will rarely see. But it was noticed by air traffic spotters: a small US Army Beechcraft RC-12P Huron with tail number 92-13123 entered Dutch airspace around the same time as Air Force One.

A Beechcraft RC-12P surveillance and communications aircraft
(photo: Wikimedia Commons)

The Beechcraft C-12 Huron is a small twin-turboprop aircraft which has been used for many years by the US Army under its Guardrail/Common Sensor System program. In many different versions, the Beechcraft planes are widely used in war zones like Afghanistan, mainly for collecting Signals Intelligence. For that purpose they have highly specialized equipment on board, like DRT devices, which can be used to intercept and monitor short-range radio and cell phone communications.

When following the president, the Beechcraft is probably also used as an additional communications hub, for example between the presidential motorcade and the White House Communication Agency (WHCA), as this aircraft can also serve as a relay for satellite communications. The mission equipment of the RC-12P version includes datalink capability, fibre-optic cabling, and smaller and lighter wing pods.

The Doomsday Plane

Another interesting plane that followed Air Force One is a Boeing E-4B that serves as National Airborne Operations Center (NAOC). It's also known as the "Doomsday Plane" because during the Cold War there was always one of these aircraft in the air to provide a survivable platform in the event of a nuclear attack. Its original codename was "Nightwatch", derived from the famous Rembrandt painting that Obama saw during his visit to Amsterdam.

On March 24, plane spotters noticed an E-4B with tail number 75-0125 following Air Force One when entering European airspace. When the US president travels outside North America, the E-4B lands on an airport within several hundred miles of the president's destination, to be readily available in the event of an emergency that renders Air Force One unusable. It's not yet known where the E-4B landed during the Nuclear Security Summit.

A Boeing E-4B National Airborne Operations Center plane
(photo: Sergio Perez Aguado)

World leaders' telephones

For secure phone calls, president Obama can use his highly secured BlackBerry, which connects to a secure base station that follows him wherever he goes. In the picture below we can see that he also brought his BlackBerry to the opening session of the Nuclear Security Summit on March 24:

Obama and his BlackBerry at the opening session of the Nuclear Security Summit
(photo: Yves Herman/Pool/European Pressphoto Agency)

The WHCA also installs secure wireline phones at every place the president stays. Nowadays that includes a vIPer 'Universal Secure Phone' which can connect through analog, digital or VoIP networks, and a Cisco IP phone that connects to the highly secured Executive Voice over Secure IP-network through dedicated and encrypted satellite links.

Other world leaders will also have brought their own equipment for secure communications with their capitals, including landline and mobile telephones that are able to perform end-to-end encryption of the calls.

For example, German chancellor Angela Merkel can use her secured BlackBerry 10 smartphone or a landline secured by the Elcrodat 6-2 encryption device. The French president François Hollande has a Teorem secure cell phone and can also place secure calls through a wireline DCS 500 telephone set.

The British prime minister David Cameron can place secure calls by using a Brent 2 secure telephone. Apparently he also uses one or two mobile phones, but from the picture below it's difficult to recognize their make and type. It's also not clear whether they have any encryption capability.

British Prime Minister David Cameron attends the
opening session of the Nuclear Security Summit
(photo: Reuters/Sean Gallup/Pool)

Russian spy ships?

There was some speculation about two Russian crab-fishing trawlers lying in the harbour of Scheveningen, a small port next to The Hague. People noticed that these ships carried remarkably many antennas and thought they could be Russian spy ships, trying to intercept communications from the world leaders attending the Nuclear Security Summit.

Two Russian trawlers in Scheveningen
(photo: Ferry Mingelen)

However, it turned out that both ships, from Petropavlovsk-Kamchatsky in the far eastern part of Russia, had come in three weeks earlier with engine problems. These were solved rather quickly, but then problems with some necessary certificates appeared, and both ships had to stay until that issue had been cleared.

Regarding the antennas, some people say they are quite ordinary HF and VHF antennas, of the kind ships used before maritime satellite communication was introduced.

During the Cold War, the Soviet Union had a large fleet of fishing trawlers that conducted covert signals intelligence gathering. Nowadays, known Russian spy ships are often much larger military vessels, which don't have to go into the territorial waters of a country they want to monitor, but can operate safely from international waters. An example is the Viktor Leonov SSV-175 from the Vishnya class in the picture below, but a ship like this one has not been seen near the Netherlands.

The Russian spy ship Viktor Leonov SSV-175
(photo: Stringer/Reuters)

Links and Sources
- 618 photos of Obama in Noordwijk
- Photos of the NSS 2014 at Flickr
- Photo gallery of Obama's trip to Europe and Saudi Arabia
- Russian spy ships? Scheveningse haven door Russen geannexeerd
- More about The Beast and its Escorts
- WashingtonPost.com: Secret Service agents on Obama detail sent home from Netherlands after night of drinking
- DailyMail.co.uk: Revealed: INSIDE the President's $223m 'DOOMSDAY PLANE' that protects him against nuclear war, asteroids and terror attacks

March 23, 2014

Video demonstration of two intelligence analysis tools

In a previous article we provided a very extensive description of a communications analysis tool used by the Canadian agency CSEC. Here we will show two video demonstrations of analysis tools which are used by intelligence and law enforcement agencies all over the world: Sentinel Visualizer and Analyst's Notebook.

Sentinel Visualizer

The first intelligence analysis program is Sentinel Visualizer, which was developed by FMS Advanced Systems Group. This is a 'minority-owned' small business founded in 1986 and based in Vienna, Virginia, which provides custom software solutions to customers in over 100 countries.

This video shows a demonstration of how the Sentinel Visualizer software program can be used to analyse telephony metadata in order to discover new targets:

FMS claims that In-Q-Tel, the CIA's venture capital arm, is an investor in FMS, apparently in order to improve its products so they fit the needs of the CIA. FMS also claims that its product is much cheaper than the alternative: the price of a single-computer license for Sentinel Visualizer starts at USD 2,699, while IBM's Analyst's Notebook starts at USD 7,160.

Analyst's Notebook

Very similar to Sentinel Visualizer is Analyst's Notebook, which was developed in the early 1990s by i2, a UK-based software company producing visual intelligence and investigative analysis software. After a number of acquisitions, i2 became part of IBM in 2011.

Both programs offer similar functions, like metadata/link analysis, call chaining, timeline views, social network analysis, geospatial visualizations, and the import of data from knowledge bases and other data sets.

For analysing telephony metadata, Analyst's Notebook has an extension called Pattern Tracer, which enables rapid pattern analysis for "quickly identifying potential targets and predict future incidents more accurately".

This video demonstrates how a "Pattern-of-Life Analysis" can be conducted by using Analyst's Notebook - Esri Edition:

Analyst's Notebook is said to be used by about 2500 intelligence, security and law enforcement agencies, as well as police forces (such as the Dutch police, the German Federal Criminal Police Office and the London Metropolitan Police) and investigative organizations and companies in over 150 countries. According to a range of job descriptions, Analyst's Notebook is also used by analysts at NSA.


As can be seen in the second video, these intelligence analysis tools are quite powerful and able to provide a deep insight into the life of a targeted person. But the presentation also shows that this kind of surveillance consumes far too much time and too many resources to be used against millions of innocent civilians.

As in the example in the second video, these tools are mainly used for operations against known and potential terrorists and a number of other people of interest, such as drug and weapons traffickers, and also some high-level foreign government and military officials.

Regarding the intrusiveness of these tools, we should also keep in mind that they are used by law enforcement and police forces too. Whereas intelligence agencies generally use these tools for preparing reports for political and military decision makers, their use in numerous criminal investigations by the police can affect ordinary citizens much more directly.


On December 15, 2013 the CBS television program 60 Minutes provided some hitherto unseen views from inside the NSA headquarters. One of those showed an NSA employee giving a demonstration of how the metadata contact chaining method works. The following screenshots show a tool very similar to the ones in the videos above:

Today, the German magazine Der Spiegel published in its print edition a slide from an NSA presentation that shows a contact graph based upon a social network analysis for the CEO and the Chairwoman of the Chinese telecommunications company Huawei:

(image provided by @koenrh)

See our previous article about the Canadian OLYMPIA tool for how intelligence agencies can map such a social communications network by using just one or two e-mail addresses to start with. See also an earlier article about how NSA used similar techniques to create contact graphs about the Mexican and the Brazilian president.

Links and Sources
- FMSASG.com: How Sentinel Visualizer is a Superior Alternative to IBM's i2 Analyst's Notebook

March 13, 2014

OLYMPIA: How Canada's CSEC maps phone and internet connections

On October 6, 2013, the Brazilian television program Fantástico revealed the existence of a software program called OLYMPIA. In this case, the program was used by the Communications Security Establishment Canada (CSEC) to map the telephone and computer connections of the Brazilian Ministry of Mines and Energy (MME).

OLYMPIA is a sophisticated software framework that combines access to a range of databases and analytic tools. It's used to discover and identify the telephone and computer infrastructure used by potential targets. This information can then be used for setting up tapping, bugging and/or hacking operations. OLYMPIA itself does not collect any actual content of communications.

In this article we take a close look at the OLYMPIA tool, based on the powerpoint presentation that was first shown on Brazilian television on October 6, 2013. On November 30, the Canadian newspaper The Globe and Mail published most of the slides on its website. Here, all available slides are pulled together, including one that had to be reconstructed from the video footage (click the slides to enlarge them).

The OLYMPIA presentation was dissected and analysed in depth by a reader of this weblog, who wants to stay anonymous, but kindly allowed me to publish his interpretation here. I did some editing to make his text fit the format of this weblog.

For some readers these explanations may be too complex and detailed, but for those who are interested, they provide a unique look at this part of the signals intelligence tradecraft. We can assume that similar tools are used by NSA, GCHQ and other agencies.

The OLYMPIA presentation was held in June 2012 during the "SD Conference", where SD stands for SIGINT Development - an intelligence term for testing and creating new ways to collect signals intelligence information. According to Fantástico this is an annual conference for members of the Five Eyes partnership, which consists of the United States, United Kingdom, Canada, Australia and New Zealand.

This case study was presented by what seems to be someone from the Advanced Network Tradecraft unit of CSEC, probably because "one of the things Canada does very well is analysis" - according to NSA historian Matthew Aid. (or could Advanced Network Tradecraft be the ANT unit of NSA's Tailored Access Operations (TAO) division?)

This slide gives an overview of the Olympia interface, which can present many different types of information at the same time and probably can be customized by the user. Right in the middle, probably just to have something graphical amidst all the tables, there is a map showing the central part of Brazil, with a purple dot marking the capital Brasilia. It's not a Google map, because that would have replaced the jagged coastline with bathymetric shaded relief and would look much nicer than this geolocated satellite view.

This slide shows the same image of the Olympia interface as in the previous slide, but this time with a pop-up menu open. The list shows eight previously known NSA tools and databases, a GCHQ tool, commercial software, and software tools developed by CSEC staff which are recognizable by their classical Greek names. Arranged in alphabetical order, the tools and databases listed in the pop-up menu and in another list from the interface are:
- ATHENA - Ports Information (CSEC)
- ATLAS - Geolocation and Network Information (CSEC)
- BLACKPEARL - Survey Information (NSA)
- COEUS - WHOIS Information (CSEC)
- EONBLUE - Decoding Hostnames? (commercial)
- EVILOLIVE - Geolocation (NSA)
- GCHQ Geofusion - Geolocation (GCHQ)
- HYPERION - IP-IP Communication Summaries (CSEC)
- LEVITATE - FFU (FireFox User??) Events
- MARINA - TDI Online Events (NSA)
- PEITHO - TDI Online Events (CSEC)
- PEPPERBOX - Targeting Requests
- PROMETHEUS - CNO Event Summaries (CSEC)
- QUOVA - Anonymizers, Geolocation Map (commercial)
- SLINGSHOT - End Product Reports
- STALKER - Web Forum Events
- STARSEARCH - Target Knowledge
- TIDALSURGE - Router Configs
- TOYGRIPPE - VPN Detailed Events (NSA)

Only a handful of Olympia's tools do all the heavy lifting in the slide algorithms. The rest get passing mention in pull-down menus. Thus the presentation provides only a glimpse of Olympia's capabilities - we see for example TRITON for attacking the TOR network and TOYGRIPPE and FRIARTUCK for attacking VPN (virtual private networks) but not examples of their actual use.

The tools of Olympia represent a very large, multi-year team effort at CSEC, which has sheltered nearly all of its database processing resources under the Olympia umbrella. The shelf life of the Olympia environment may well be longer than that of its individual tools.

Of the 13 tools that are used, the following algorithmic slides rely almost entirely on ATLAS, DANAUS, HYPERION, PEITHO and HANDSET. The reporting tool of course finishes every slide; two others, EONBLUE and QUOVA, only appear obliquely as report sources.

This slide presents a simple algorithm, using drag 'n' drop icons linked by arrows to specify its operations step by step. It draws on call metadata records already stored in NSA's huge telephony database MAINWAY. As NSA does the hoovering down in Brazil, the slide does not use fresh Canadian surveillance by intercepts or insertion of malware on Brazilian cell phones or servers - that comes later, in partnership with NSA's Tailored Access Operations (TAO), as warranted and informed by the initial results obtained here.

Olympia is thus modular software that allows a mid-level analyst (who cannot write computer code) to specify and test advanced MySQL database queries from within an intuitive visual environment. Its graphical interface allows the analyst to assemble some 40 component tools into a flexible fit-for-purpose logic pipeline by simple drag and drop of icons. Such pipe-and-flow visual programming environments have a rich history - they match how Unix developers can quickly put together complex processes from the simple ones provided by the operating system.
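
As a very rough sketch of this pipe-and-flow idea (not Olympia's actual code - the widget names below are invented), a seed-enrich-filter-sort pipeline can be modelled with Python generators:

```python
# Hypothetical sketch only: seed, enrich, filter_by and sort_by are
# invented stand-ins for Olympia widgets, modelled as Python generators.

def seed(records):
    # 'Dynamic Configuration' equivalent: supply the initial data
    yield from records

def enrich(stream, lookup):
    # add a field from an external lookup table, with a default for misses
    for rec in stream:
        rec = dict(rec)
        rec["country"] = lookup.get(rec["number"], "unknown")
        yield rec

def filter_by(stream, field, value):
    # pass through only the records matching the filter condition
    for rec in stream:
        if rec[field] == value:
            yield rec

def sort_by(stream, field):
    # terminal widget: materialize and order the results
    yield from sorted(stream, key=lambda r: r[field])

# Chaining the widgets, as an analyst would by dragging icons:
lookup = {"5561+1111": "BR", "5561+2222": "BR", "1202+3333": "US"}
pipeline = sort_by(
    filter_by(
        enrich(seed([{"number": "1202+3333"}, {"number": "5561+1111"}]),
               lookup),
        "country", "BR"),
    "number")
print(list(pipeline))  # → [{'number': '5561+1111', 'country': 'BR'}]
```

Buttoning up such a pipeline into a reusable component then simply amounts to wrapping the chained calls in a single new function.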

Should an analyst drag one of the widgets into the design, a form window will pop up asking for parameters to be supplied. After stepping through the algorithm to fill in various pop-up forms that address database housekeeping issues, Olympia can then button up (compile) the tested product into a new icon that the next analyst can use as a trusted component for an even more complex investigative process. This allows analysts to conduct sophisticated target-development with minimal additional training.

With a database like MARINA consisting of trillions of rows (records) and 13 columns (fields), it is very easy to pose a query (play a design) that, after hours of delay, returns way too much data, or to submit a query so complex or boolean-illogical that it freezes NSA's server. To prevent this, it would make sense to have expert analysts work out the main designs once and for all, with low-level analysts then just entering specific parameter ranges in the forms - but this of course would undercut the whole modular design power of Olympia.
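
The compromise described here - expert-built designs into which low-level analysts only enter parameter ranges - resembles an ordinary parameterized query with a mandatory date range and a row cap. A minimal sketch, with an invented call-records table and SQLite standing in for whatever engine is actually used:

```python
# Hypothetical illustration: table and column names are invented, and
# SQLite stands in for the real database engine.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE calls (caller TEXT, callee TEXT, ts TEXT)")
con.executemany("INSERT INTO calls VALUES (?, ?, ?)", [
    ("A", "B", "2012-05-01"),
    ("A", "C", "2012-06-15"),
    ("B", "D", "2012-06-20"),
])

def calls_from(seed, start, end, limit=1000):
    # Parameters are bound, the date range is mandatory and the row count
    # is capped, so a careless query cannot return "way too much data".
    return con.execute(
        "SELECT caller, callee, ts FROM calls "
        "WHERE caller = ? AND ts BETWEEN ? AND ? LIMIT ?",
        (seed, start, end, limit)).fetchall()

print(calls_from("A", "2012-06-01", "2012-06-30"))
# → [('A', 'C', '2012-06-15')]
```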

So the whole process can be buttoned up, enabling one-button automation from a few business cards to the best phones to turn into meeting listening devices. While Olympia is a MySQL query builder, it does much more than that, notably advanced post-processing analytics of query results (which amount to a derived special-purpose database or QFD in NSA-speak: Question-Focused Dataset) resulting in convenient output to CSEC's reporting tool Tradecraft Navigator.

In the slide we see the following process:
-1- The process begins with a 'TC Init' widget that initializes the processes Olympia needs to run. That may include starting up software, locating Five Eyes network resources, and verifying security authorizations for the analyst's 'thin client' interface to Olympia and NSA's remote network databases. That is, for security purposes following the Jeffrey Delisle spy case, Canadian analysts are given desktop computers without hard drives that can neither copy files to inserted thumb drives nor write to blank CDs. TC is used later in lower case to personalize data field header names, so it could alternatively represent the initials of the analyst (for logging purposes).

-2- The analyst next fills in a pop-up form called 'Dynamic Configuration' to provide initial data and establish project-specific terminology. The form amounts to a small database with one record (row) for each configuration needed and 7-8 fields (columns) with the specifics: configuration name and number, initial data, default value to use if actual value is missing after enrichment, true/false option to govern whether a later filter condition is met, field names to begin with tc_ (for thin client), and field type.

Configuration here seeds the coming discovery process with the MSISDNs (SIM card routing numbers) of nine cell phones linked to staff at Brazil's Ministry of Mines and Energy (MME), either from business cards acquired by Canadian diplomats and mining executives or as metadata incidentally ingested by NSA from rooftop mobile phone intercepts at the American embassy in Brasilia. Recalling that MAINWAY has many billions of records just for Brazil, a narrow date range will keep the number of records, and so the subsequent latency (processing delay), manageable.

-3- The initial set of phone numbers is then greatly expanded (enriched) by contact-chaining in the huge NSA metabase MAINWAY. This process collects the MSISDN of recipients of calls from the seed numbers, and recipients of their calls (two or more hops). Some of these will be just pizza joints or calls home but others will belong to coworkers at MME.
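
Contact chaining as described - expanding seed numbers by one or two hops through a call graph - is essentially a breadth-first search. A toy sketch with invented call records:

```python
# A minimal sketch of two-hop contact chaining; the call records and
# numbers below are invented for illustration.
from collections import deque

def contact_chain(call_records, seeds, hops=2):
    # build a caller → callees adjacency map from (caller, callee) pairs
    graph = {}
    for caller, callee in call_records:
        graph.setdefault(caller, set()).add(callee)
    found = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:                      # breadth-first expansion
        number, depth = frontier.popleft()
        if depth == hops:
            continue
        for contact in graph.get(number, ()):
            if contact not in found:
                found.add(contact)
                frontier.append((contact, depth + 1))
    return found - set(seeds)

records = [("seed1", "colleague"),
           ("colleague", "pizza-joint"),
           ("pizza-joint", "supplier")]      # third hop, not reached
print(sorted(contact_chain(records, ["seed1"])))
# → ['colleague', 'pizza-joint']
```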

TAPERLAY is one of the most common skills listed in LinkedIn profiles, with one SIGINT analyst writing that he "was responsible for entering numbering information for 132 countries and multiple service providers in each country by reviewing forms and reports and conferring with management." It is often used in conjunction with CHALKFUN, an NSA tool that searches the vast FASCIA database of device location information to find the past or current location (notably US roaming) of mobile phones.
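
A numbering-plan lookup of the kind attributed to TAPERLAY can be pictured as a longest-prefix match of an MSISDN against a country-code table; the tiny table below is an invented subset:

```python
# Toy sketch of numbering-plan resolution: longest-prefix match of an
# MSISDN against a country-code table (tiny invented subset).
PREFIXES = {"55": "Brazil", "1": "US/Canada", "44": "United Kingdom"}

def registration_country(msisdn):
    for length in (3, 2, 1):           # country codes are 1-3 digits long
        country = PREFIXES.get(msisdn[:length])
        if country:
            return country
    return "unknown"

print(registration_country("5561998765432"))  # → Brazil
```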

-6- The original phoneNumber field has now been supplemented by Last Seen (last recorded use), City and Country of initial registration, Identity (target's name), FIPS, destination number called and its fields, and others we cannot see on the alphabetical pulldown list. Here FIPS is an open source geolocation code maintained by the US government.

-7- The 'Sort' widget is then configured to re-order the records in some sensible way, say reverse chronological order and most frequent MSISDN.

-9- Prior to writing up a final report, the analyst could return to step 7 and insert further operational icons - 29 options are shown (even with A-E and Q-Z missing from the pop-up menu).

This slide says that the presentation is a case study about how to map the target's communication infrastructure when there's only very little information to start with, in this case:
- One known e-mail domain: @mme.gov.br
- Nine known phone numbers
- Very little data collected earlier

Starting with the single e-mail domain @mme.gov.br for Brazil's Ministry of Mines and Energy (MME), the algorithm works out the IP numbers of MME's mail and internet servers, plus their network owners and backbone carriers. Note that the potential target here is the entire department, not an individual.

-0- After initialization, the input - here just a single domain @mme.gov.br, but optionally a list of thousands - is put in a storage area (buffered) until its entries can be processed.

-1- The CSEC-developed tool DANAUS looks up the domain in its DNS (Domain Name System) repository. For one domain this can easily be done by a Google search on the open internet, but that is inefficient on a larger scale. Olympia not only automates this process but can also re-package it as a meta-tool icon that can be re-used as a component (sub-routine) of more ambitious algorithms.

-2- The DNS records are next sorted by record type, which splits them into two streams (Type A and Type MX records in DNS nomenclature). Here MX (Mail Exchanger) records specify the mail servers accepting e-mail messages on behalf of the recipient's domain. Type A (address) records map hostnames in the domain to the IP numbers of the underlying servers.
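
The record-type split in step -2- can be sketched as follows, with invented record triples standing in for whatever DANAUS actually returns:

```python
# Sketch of the type split; invented (rtype, name, value) triples stand in
# for the DANAUS repository output, which is not public.
def split_dns(records):
    mx = [r for r in records if r[0] == "MX"]   # mail exchangers
    a  = [r for r in records if r[0] == "A"]    # hostname-to-IP records
    return mx, a

records = [
    ("MX", "mme.gov.br", "correio.mme.gov.br"),
    ("MX", "mme.gov.br", "correio2.mme.gov.br"),
    ("A",  "www.mme.gov.br", "198.51.100.10"),  # invented IP
]
mx_stream, a_stream = split_dns(records)
print(len(mx_stream), len(a_stream))  # → 2 1
```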

-3a- The MX fork of the diagram filters records according to analyst specifications (pop-up window not shown), changes out value names, and merges text strings with certain information (extracted by the small 'i' icon, never explained) derived from records rejected by the filter. The output to Tradecraft Navigator is a simple database called 'Mail Servers' having six fields discussed below: Response_MX, Hostname, IPv4, Source, FirstSeen and LastSeen.

-3b- The A fork is filtered differently but here rejects are discarded. A new Canadian tool icon labelled ATLAS acts on the records that have been stored in fastBuffer to look up geospatial locations of the IPs. After a sort, duplicated IP locations can be eliminated by a standard database reporting feature (break on change in geolocation field). Duplicates might arise from a single server location hosting multiple IPs or a server cluster.

-4b- Records passing another filter (e.g. geolocation Brasilia) are then sorted by IP number for orderly output to Tradecraft Navigator for report-generation. Here the resulting database 'Domain's IPs' has 9 columns (fields) for IP Range, Country, ASN, Owner, and Carrier in addition to the ones above. The Autonomous System Number (ASN) provides the officially registered IP routing prefix that uniquely identifies each network on the Internet. Here the IPv4 numbers correspond to Global Village Telecom, Embratel and Pelpro. The analyst wants to know this because some carriers sell access to NSA while others have been hacked.
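
The ASN/owner lookup in step -4b- boils down to finding the registered network block that contains a given IP. A sketch using Python's ipaddress module; the prefixes, ASNs and owner pairings below are invented examples:

```python
# Sketch of an ASN/owner lookup: find the registered block containing an
# IP. The prefix-to-ASN-to-owner pairings below are invented.
import ipaddress

ASN_TABLE = [
    (ipaddress.ip_network("198.51.100.0/24"), 18881, "Global Village Telecom"),
    (ipaddress.ip_network("203.0.113.0/24"), 4230, "Embratel"),
]

def asn_lookup(ip):
    addr = ipaddress.ip_address(ip)
    for net, asn, owner in ASN_TABLE:
        if addr in net:
            return asn, owner
    return None, "unregistered"

print(asn_lookup("198.51.100.10"))  # → (18881, 'Global Village Telecom')
```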

From the mail server records, it turned out the Ministry only used correio.mme.gov.br and correio2.mme.gov.br for their mail servers (correio means mail in Portuguese). Journalists have inexplicably blacked out IPv4 numbers but anyone can look up the IP address for a given domain name at WHOIS websites, or apply the COEUS widget if they work at CSEC.

The analyst has now actually determined the IP addresses, their blocks of consecutive numbers (ranges), and the geolocation of the servers used by MME's internet service providers, plus the identities of the backbone carrier networks. Some 27 IPs shown associated with the domain @mme.gov.br came out of processing the A type records.

Some of this is unremarkable (the hostname www.mme.gov.br is MME's public home page, ns1.mme.gov.br is just a name server) while other entries are of undeterminable relevance (being barely legible) to commercial espionage. One of these, acessovpn.mme.gov.br (running on http port 80 with an A record), comes up later as a potential target for a man-on-the-side attack.

- The Source field is a bit mysterious. It takes on only two values, EONBLUE and QUOVA. These are tool icons within Olympia whose names lie outside the Greek mythology theme, suggesting software from elsewhere. The explanation: a US company named Quova provides online blocking based on the geolocation of a computer's IP address, for example blocking access to a broadcast of a football game in the home team's city so that people purchase stadium tickets. Quova was acquired in 2010 by Neustar, which provides a much broader range of backbone internet registry services. EonBlue is also corporate but more obscure.

- Between them, EONBLUE and QUOVA can report on recorded activities and attributes of the IPs at Brazil's MME: the MX record of correio.mme.gov.br shows it was first seen active from 17 Jun 09 and last seen active on 15 Feb 10; similar dates for correio2.mme.gov.br active are later and don't overlap, namely 21 Jun 10 to 19 Jun 11.

- Later Olympia slides show QUOVA within a diagram, so this one should show both QUOVA and EONBLUE but does neither. QUOVA concerns itself with IP ranges, IP geolocation, and anonymizers (proxy servers relaying on a user's behalf, hiding identifying information), yet ATLAS provided IP geolocation in later slides and HYPERION and PEITHO the IP proxies. So it must be that QUOVA adds value to the in-house DNS lookup tool DANAUS.

This slide shows how the analyst can identify a proxy server at the Ministry of Mines and Energy based on its observed behavior. It's not clear whether a discovered proxy server has been identified for certain or is only the strongest candidate seen, nor whether the full set of MME proxy servers has been located or just one of several. However, this is the most promising site for defeating SSL by a man-on-the-side attack to intercept transiting documents before they can be encrypted.

-1- After initialization, the Dynamic Configuration for the IPs of MME determined above is set with three lines: high, low, and range = high - low + 1 for each block. Here a reverse proxy server (firewall surrogate) often holds the first number of the range block and sits in front of a local network of other computers using the rest of the block as their addresses. Those other IPs don't show up in metabases because the URL requested by an outside visitor passes through the proxy on its way to the server that can actually fulfill the request, and the response is returned as if it came from the proxy server.

-2- The initial data is split at an enhancement fork which is not described further. Buffers should have been created for two subsequent tools PEITHO and HYPERION because they are sent large files (as indicated by the little 2-page icon on the connecting line). Those icons are missing from the algorithm, breaking it. Both PEITHO and HYPERION also need demultiplexing as followup but the De-Mux icons (the all-purpose dummy widget) are also missing from the diagram.

Recall that many different ongoing processes on a given server are sending (and receiving) data simultaneously using the same Internet Protocol software. To accomplish this, packets of different types are intermingled ('multiplexed') in the exit stream. As the stream of packets is received, it is sorted out by type (demultiplexed) and passed to the appropriate application on the receiving client.
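
Demultiplexing in this sense is just sorting an interleaved packet stream by type and retaining the subset of interest; a minimal sketch with invented packet records:

```python
# Sketch of demultiplexing: split an interleaved stream by protocol type,
# keeping only the protocols of interest. Packet dicts are invented.
def demux(packets, keep=("http", "https")):
    streams = {}
    for pkt in packets:
        if pkt["proto"] in keep:        # retain only the subset of interest
            streams.setdefault(pkt["proto"], []).append(pkt)
    return streams

stream = [{"proto": "https", "seq": 1}, {"proto": "dns", "seq": 2},
          {"proto": "http", "seq": 3}, {"proto": "https", "seq": 4}]
out = demux(stream)
print(sorted(out), [p["seq"] for p in out["https"]])  # → ['http', 'https'] [1, 4]
```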

-3a- PEITHO specializes in "TDI events" and has the same iconography as MARINA, tinted blue instead of pink. A menu in another slide ties MARINA to these same mysterious TDI events. MARINA is known to be a vast NSA metabase of internet metadata. An online LinkedIn profile speaks of having "used MARINA as a raw SIGINT data viewer for detection and analysis of priority targets and as a tracking and pattern-of-life tool."

PEITHO can thus be presumed very similar to MARINA, probably a refined subset of it adapted to dissecting out the TCP/IP connection metadata needed here, in particular recognizing and compiling the exchange of SSL certificates that are the hallmark of a secure (https) site. In one scenario, an off-site MME staffer uploading oil lease data points a web browser at the MME server that will host the documents, which sits within a LAN (local area network) behind a proxy server running port 443 for https.

After the exchange of SSL certificates, the content can be sent over the internet encrypted rather than as plain text, and will be decrypted at the MME repository. NSA data trawling - while not specifically seeking them out - intercepts these exchanges and stores them as a SIGINT record subset in MARINA. PEITHO extracts these for the specified IP address ranges. This has nothing to do with defeating SSL - that comes later.

PEITHO can only provide half of a full TCP/IP 4-tuple (the output of this algorithm), namely the connection pairs involving MME IPs and server port numbers. This is done by filtering PEITHO records by the high and low IP values provided by the initial configuration file, partitioning them into passing and non-passing sets. Values from both are renamed and retained in the output because they define the IP blocks.

-3b- Meanwhile, HYPERION works in parallel to PEITHO to provide IP to IP communication summaries, how data flows in and out of MME servers and their IP range blocks, in response to remote IP requests. This data too undergoes similar filtering and re-mapping of value names and formats, again with ultimate retention of both streams as the entity_IP and remote_IP components of the TCP/IP 4-tuple.

-4- The four fields of a TCP/IP 4-tuple are called entity_IP, remote_IP, remote_port, entity_port and will appear as a small table on the proxy output page. They are obtained by merger of the PEITHO 2-tuple with that of HYPERION.
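
The merger in step -4- can be pictured as a join of the two partial records on a shared key; keying on entity_IP below is an assumption, and all addresses come from documentation ranges:

```python
# Sketch of the merger: join PEITHO's port half with HYPERION's IP half.
# Keying on entity_IP is an assumption; addresses are documentation ranges.
def merge_halves(peitho, hyperion):
    ports = {p["entity_IP"]: p for p in peitho}
    tuples = []
    for h in hyperion:
        p = ports.get(h["entity_IP"])
        if p:
            tuples.append((h["entity_IP"], h["remote_IP"],
                           p["remote_port"], p["entity_port"]))
    return tuples

peitho = [{"entity_IP": "198.51.100.1", "entity_port": 49152,
           "remote_port": 443}]
hyperion = [{"entity_IP": "198.51.100.1", "remote_IP": "203.0.113.9"}]
print(merge_halves(peitho, hyperion))
# → [('198.51.100.1', '203.0.113.9', 443, 49152)]
```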

-5- At this point, only https (port 443) and http (port 80) metadata remains as remote_port values. The latter is discarded on the basis of its port value under the assumption that high-value data will be encrypted in transit by a secure socket layer (SSL) using port 443. Note email servers use port 25 - that will show up in the next slide in the context of correio.mme.gov.br.

On the results page provided by Tradecraft Navigator, only the two port columns are visible from the original socket pair 4-tuple. Ports are described by an esoteric compressed four-field format such as 6:443:TS(1) where the second element is the actual port number.

Here every port entry starts with 6: (making it uninformative) followed by 443 in the case of a remote https port, respectively high and variable (ephemeral) port numbers in the case of the entity_port column. The port description is then completed by a cryptic digraph drawn from TS, TC, FS, FC and a small qualifying number in parentheses.

It's not clear whether any more than just the straight port number needed to be retained here to substantiate a discovered proxy. Curiously, Olympia contains a distinct tool called ATHENA that specializes in port information, but it is not applied in this algorithm or in any of the other slides.

The bottom line here is that the analyst seems to have identified MME's proxy server and thereby a line of attack to be described later. That is of interest because closely held documents (such as the extents of offshore oil reserves or the assay grades of mineral deposits being auctioned off) would be sent through this server as a measure to protect them from theft.

This slide presents a more complicated diagram of how an analyst can discover the IP addresses that the target, in this case the Brazilian MME, communicates with. This information can later be used to intercept these communication links.

-1- This starts with a DNS lookup of the hostnames (e.g. correio.mme.gov.br). That process can yield duplicates and records that are empty with respect to the fields of interest; these are discarded.

-2- After appropriate menu enrichments have expanded out from the initial seeds, PEITHO and HYPERION act again in parallel to reconstruct the TCP 4-tuples (or socket pairs). The stream of internet packets sent out by a given server are a mix of packets from whatever processes are running, for example http, https, ftp, smtp and telnet on the TCP side and dns, dhcp, tftp, snmp, rip, voip via UDP.

-3- As only http and https are of interest here, the other packets are discarded via the De-Mux widget. Note the packets are not really multiplexed in the traditional sense used in signal electronics but remain discrete and merely alternate in the packet stream connecting server to client. De-multiplexing in this context simply means separating the packets as they come along, retaining only the subset of interest.

-4- Not everything is of interest here, so the 'select values to carry' widget is necessary to whittle down the fields retained. Since TCP processes are bi-directional, with some of the packets coming from the server and others heading to the server, it's necessary to flip the latter set so that FROM always goes with the MME server and TO goes with the IP addresses it communicates with. The two streams are then sorted by IP contacted, which allows them to be merged coherently into the 4-tuples described before.
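
The direction flip described in step -4- amounts to normalising bidirectional flow records so that FROM is always the MME server, followed by a sort on the IP contacted. A sketch with invented addresses:

```python
# Sketch of the direction flip: make FROM always the server side, then
# sort by the IP contacted. Addresses are invented (documentation ranges).
def normalise(flows, server_ips):
    out = []
    for src, dst in flows:
        if dst in server_ips:          # packet heading to the server: flip
            src, dst = dst, src
        out.append((src, dst))
    return sorted(out, key=lambda f: f[1])

flows = [("198.51.100.1", "203.0.113.9"),   # server → client
         ("203.0.113.7", "198.51.100.1")]   # client → server, to be flipped
print(normalise(flows, {"198.51.100.1"}))
# → [('198.51.100.1', '203.0.113.7'), ('198.51.100.1', '203.0.113.9')]
```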

-5a- The results are duplicated and split with one fork - after a sort and break-on-same field value reduction - sent to Tradecraft Navigator as a summary of the number of times each IP pair has connected, with most frequent presumably on top. No data page is provided in the slides.

-5b- The other duplicate is sorted so that each client is represented just once for geolocation lookup by ATLAS. That needs another version of de-multiplexing, followed by discard of empty rows. ATLAS is mentioned in three slides; from those annotations, it has to do with geolocation of network information and is filterable by date and IP range.

-6- The output to Tradecraft Navigator is sorted by ASN (Autonomous System Number, the unique identifier for an ISP network). The internet had some 42,000 unique autonomous networks in the routing system at the beginning of 2013; ten distinct ASN networks that MME connects with are discovered here. These include ASNs 6453 and 32613 in Canada, 16322 for Iran, 25019 and two others for Saudi Arabia, plus inexplicable IPs in Eritrea, Jordan and Thailand. ASN lookup is readily available and it provides country, date of registration, registrar, and owner name.

The data page is quite instructive. It shows the silliness of newspaper redactions: Fantástico/Greenwald scrubbed out all tool annotations on the algorithm and blocked columns 2, 4, 5, and 8 in the output whereas the Globe & Mail showed the whole algorithm legibly and redacted columns 2, 3 and 8.

Column 2 is merely DNS lookup, freely available on the open internet. Column 3 in the Globe & Mail can be restored using the months-earlier Fantástico publication. The IP ranges of MME's contacts in Column 8 are not too hard to get at using the initial IP contact from Fantástico as they will be a block extending the last 3 digits of the initial IP contact out to 255, e.g. the first row gives the range to, all assigned to Eritrea.
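
The restoration trick for Column 8 - extending the last octet of a seed IP out to 255 - is simply taking the enclosing /24 block. A sketch with a documentation-range IP, since the published ones are redacted:

```python
# Sketch with a documentation-range IP: the /24 block enclosing a seed IP.
import ipaddress

def block_of(seed_ip):
    # strict=False lets ip_network zero out the host bits of the seed
    return ipaddress.ip_network(seed_ip + "/24", strict=False)

net = block_of("203.0.113.77")
print(net, net.num_addresses)  # → 203.0.113.0/24 256
```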

Here MOEM, the Ministry of Energy and Mines in Eritrea, is located at www.moem.gov.er. While their server is often down, the IP address there does not correspond to any result found by the algorithm. The IP addresses in question are assigned to Eritrea but have no hostnames and may be routers. Note that British Telecom provides the ASN network, so all traffic there is routinely ingested by GCHQ and available to the Canadians. However, there is no evidence from this algorithm that MME had any interest in its Eritrean counterpart MOEM.

The algorithm here re-uses tools and widgets seen before with very similar logic: previously determined hostnames associated with Brazil's MME seed the IP address look-up via 'Forward DNS' (DANAUS), followed by DNI enrichment at unspecified NSA databases and the same symmetric split to PEITHO and HYPERION to collect IPs and ports, followed by filters, sorts and field renaming (no pop-up details provided) as seen in slides 2 and 4. After ATLAS provides geolocation of the retained IPs (note the never-explained x5 in the upper left corner of the ATLAS icon), the fields are consolidated, with just the ones geolocated to non-Five Eyes countries retained.

It's not clear why results for the Five Eyes countries are discarded. These countries by agreement don't launch spying operations on each other; Canada could certainly launch attacks on IPs on its own territory, but that may not be within the remit of CSEC. It's hard to believe the analyst would not take a peek at friendly-country IPs - perhaps these were only discarded for the purposes of this presentation (at which NSA and GCHQ analysts were surely represented).

From other Snowden leaks, it's known NSA also runs its own Brazilian espionage program; if Canada installed its own man-in-the-middle malware on top of a pre-existing NSA attack, these could conceivably collide and crash the Brazilian system, or at least alert the Brazilians via degraded network performance. For this reason, the analyst contacted TAO prior to the presentation, turning subsequent man-in-the-middle attack details over to them. TAO maintains the central malware repository and is better positioned to vet installations for redundancy and collisions.

These four output tables provide the best view into what CSEC learned about MME's vulnerabilities from applying the algorithm:

-1- The first table consists of two records for acessovpn.mme.gov.br. This Brazilian server was obtained earlier as record 5 from the slide 2 processing (which started with mme.gov.br and provided IPs and ISPs in the 'Domain's IPs Output' table). Here journalists have blacked out the target column out of internet illiteracy (they are and ), along with the IP it contacts. The port numbers indicate that the target server is using ephemeral ports and the contact http port 80, meaning it is neither a mail server nor secured like https.

This server in Brasilia has been assigned a new database field here with the value Case Notation MA10099(1), which was added by the analyst later (certainly not produced by running the algorithm). It's not clear whether this case notation is CSEC's own or a joint notation with NSA's TAO.

It's instructive to look at what anyone can learn in seconds for free on the open internet -- and how this works. In the case of acessovpn.mme.gov.br, the query starts at the root server i.root-servers.net, which refers the .br TLD (top level domain) to c.dns.br, which in turn refers to two name servers, ns1.mme.gov.br and ns2.mme.gov.br, which themselves have A type records and thus separate IP addresses, both located at the same geolocation in Brazil.
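The order of zones a resolver walks for this name can be sketched as follows. This is a simplification (in practice the .br registry may answer for gov.br as well, so a referral step can be skipped), and does no live lookups:

```python
def delegation_chain(hostname: str) -> list:
    """List the DNS zones consulted, from the root zone down to the full
    name - mirroring the referrals from i.root-servers.net (root) to
    c.dns.br (.br) to the mme.gov.br name servers."""
    labels = hostname.rstrip(".").split(".")
    chain = ["."]  # the root zone
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]))
    return chain

print(delegation_chain("acessovpn.mme.gov.br"))
# ['.', 'br', 'gov.br', 'mme.gov.br', 'acessovpn.mme.gov.br']
```

A command-line equivalent of the full walk, including the actual name server answers, is `dig +trace acessovpn.mme.gov.br`.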

-2- This pair of tables unfortunately has the headers censored. They may simply represent the two IP addresses and . They are sorted by frequency of use - the number of times each ASN was contacted. Thus the ASN contacted the most (26 and 15 times respectively in the time frame considered) was 18881, indicating the ISP was Global Village Telecom, a formerly Brazilian telecom owned since 2010 by the French company Vivendi. After that, the first IP contacted ASN 7738 11 times, whereas the second IP contacted ASN 26599 9 times. Farther down the list, providers in Colombia, Mexico, India and China appear.
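The frequency ordering of these tables can be reconstructed as a simple count-and-sort. The sketch below uses the contact counts read off the slide; the two redacted source addresses are labelled ip_a and ip_b here, and only ASN 18881 is mapped to an owner, since that is the only one the text identifies:

```python
from collections import Counter

# Contact counts per ASN, as read off the slide, for the two source IPs.
contacts_a = Counter({18881: 26, 7738: 11})
contacts_b = Counter({18881: 15, 26599: 9})

# Only ASN 18881 is identified in the slide material (Global Village
# Telecom); the rest would come from a whois/ASN lookup.
asn_owner = {18881: "Global Village Telecom"}

for label, counts in [("ip_a", contacts_a), ("ip_b", contacts_b)]:
    for asn, n in counts.most_common():  # highest-frequency order
        owner = asn_owner.get(asn, "unknown")
        print(f"{label}: AS{asn} ({owner}) contacted {n} times")
```

Counter.most_common() yields exactly the "sorted by number of contacts" presentation seen in the tables.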

-3- The final result table uses two tools not mentioned in the script, suggesting they were applied from within Tradecraft Navigator: Reverse DNS (DANAUS) and EONBLUE. The latter is a closely held corporate tool, apparently used here for decoding hostnames behind proxies, though nothing came of it here. EONBLUE surfaced earlier in slide 2, paired with the corporate tool QUOVA (which was the source of acessovpn.mme.gov.br there). The entire table refers to A records rather than MX records (mail servers).

This slide shows the contact chaining for Brazil's Ministry of Mines and Energy on both the internet and the telephony side, mostly the latter. The process is initialized from a small plaintext file of initial selectors (CSV: comma-separated values, one record per line), which is reconfigured into a standardized database format with administrative oversight (front-door rules: legal and policy justifications for collection) before being passed to the analyst's thin client. This is the only appearance of 'Justification' in the slide set.
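That front-end step - a plaintext seed file re-shaped into records carrying a justification - can be sketched as below. The field names ('selector', 'type', 'justification') and the sample selectors are assumptions for illustration, not taken from the slide:

```python
import csv
import io

# Hypothetical seed file: one selector per line. The phone number is fake.
seed_csv = """selector,type
+556100000000,DNR
mme.gov.br,DNI
"""

def normalize(raw: str, justification: str) -> list:
    """Re-shape the plaintext CSV into standardized records, attaching a
    front-door 'Justification' field before the analyst sees them."""
    records = []
    for row in csv.DictReader(io.StringIO(raw)):
        row["justification"] = justification
        records.append(row)
    return records

records = normalize(seed_csv, "foreign-intelligence tasking (illustrative)")
print(records[0]["selector"], records[0]["justification"])
```

The point is only that the justification travels with each record from the start, rather than being bolted on at output time.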

-1- Another field is added, 'SelectorRealm'. Realm isn't explained here by a pop-up or sample output slide, but in the MonkeyPuzzle memo it meant divisions of a large database (emailAddr, google, msnpassport and yahoo). Realm here might specify a subset of collection SIGADs. Thus this step narrows the field of inquiry by adding a realm field to the input records, restricting subsequent processing to that realm.

-2a- The records are now filtered by their DNR (telephony) selectors in an unspecified manner. The fork meeting the filter conditions is expanded by DNI (internet) chaining via unspecified databases (web email contacts possibly being the realm), using one hop (see below), for output to Tradecraft Navigator. The fork of records failing to meet the filter conditions is discarded.

-2b- The other fork meeting the filter conditions, after date ranges etc. are specified, is sent out to be expanded by DNR contact chaining. This enrichment step is quite instructive: it involves four telephony databases (FASTBAT, DISHFIRE, FASCIA, MAINWAY). Here FASTBAT appears for the first time in the Snowden document releases. It must be at least partially non-redundant with respect to the others, or it would make no sense to include it. It is possibly a SIGAD specific to Brazil or South America, possibly CSEC collection at the Canadian embassy in Brasilia (the other three are NSA databases). DISHFIRE holds SMS records (cell phone texting).
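One-hop contact chaining itself is conceptually simple: collect every selector in direct contact with a seed. The toy sketch below stands in for what queries against those databases would return; all selectors are invented:

```python
from collections import defaultdict

# Toy call-record set: (caller, callee) pairs standing in for FASCIA,
# MAINWAY, DISHFIRE and FASTBAT query results. All identifiers are invented.
call_records = [
    ("seed-1", "contact-A"), ("seed-1", "contact-B"),
    ("contact-C", "seed-2"), ("contact-A", "contact-D"),
]

def one_hop(seeds: set, records) -> set:
    """Return every selector in direct contact with a seed (one hop).
    A second hop would re-run this with the result as the new seed set."""
    graph = defaultdict(set)
    for a, b in records:
        graph[a].add(b)
        graph[b].add(a)   # chaining here treats direction as irrelevant
    out = set()
    for s in seeds:
        out |= graph[s]
    return out - seeds    # the seeds themselves are not 'new' contacts

print(sorted(one_hop({"seed-1", "seed-2"}, call_records)))
# ['contact-A', 'contact-B', 'contact-C']
```

Note that contact-D only surfaces at two hops, which is why hop count is such a sensitive parameter: each additional hop multiplies the result set.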

It would be amazing if this contact-chaining step did not take overnight (or at least involve long latency) - these databases contain many trillions of records, and NSA could be running thousands of multi-hop contact-chaining requests simultaneously for analysts throughout the Five Eyes. It's not clear whether NSA's move to the cloud will expedite such searches or break algorithms such as this one, for which the haystack has grown too large.

-3- Because of how realms, date ranges, country of call origin etc. were initially specified, not all records produced by the contact chaining have any data left in the fields of interest (it is very common for some fields in database records to be blank). These empty records are discarded so they don't contribute rubbish to the output.

-4a- After the fields are renamed for consistent output, the records are sorted by an important field (e.g. the MSISDN phone number) and split, with one fork going to summary statistics (how many records had a given value for the fixed field), as shown by the capital Greek letter sigma (Σ, the mathematical symbol for a sum) in the 'Group by' icon. These are likely sorted into highest-frequency order.
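The Σ 'Group by' widget amounts to counting records per value of the chosen field and presenting them most-frequent first. A minimal sketch, with invented MSISDN values and an assumed field layout:

```python
from collections import Counter

# Toy records after the rename step; all numbers are invented.
records = [
    {"msisdn": "+556100000001", "called": "+551100000009"},
    {"msisdn": "+556100000001", "called": "+551100000010"},
    {"msisdn": "+556100000002", "called": "+551100000009"},
]

# Group by 'msisdn' and count (the Σ step), highest frequency first.
summary = Counter(r["msisdn"] for r in records).most_common()
print(summary)   # [('+556100000001', 2), ('+556100000002', 1)]
```

In SQL terms this is `SELECT msisdn, COUNT(*) ... GROUP BY msisdn ORDER BY COUNT(*) DESC`.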

-4b- The other fork simply outputs all the records to Tradecraft Navigator, which may have its own social networking visualization tool or just pass it on to RENOIR. The original presentation may have contained a sample of output but if so, Greenwald may not have included it or if he did, the Globe and Mail didn't publish it.

In this important Olympia algorithm slide, CSEC leverages an initially modest collection of 9 cell phone call records (called DNR selectors) to successively recover the three identification numbers characterizing a cell phone, which in turn lead the analyst to the identification of two obsolete handset models (Nokia 3120c-1c and Motorola MURQ7) owned at one time by top MME staffers. The handset models might next be checked against NSA's collection of cell phone malware at TAO or NAC to see whether existing tools could hack the phones and turn them into surveillance devices.

A Snowden document disclosed earlier revealed NSA asking the State Department to pass along all cell phone numbers it had been given in the course of normal high-level contacts with foreign counterparts. Thus numbers turned in by American embassy representatives in Brazil with day-to-day dealings with MME were ingested into an NSA database to which Canada had ready access. These 9 selectors probably originated by this route.

What all can be deciphered from this slide?

-1- The overall logic flow is very clear: start from the 9 DNR call record seeds, determine the MSISDN number of the two cell phones, with that find the IMSI, from that the IMEI, and finally the handset model. This is far from trivial due to the properties of cell phone numbers (see below) and devious manufacturing practices in countries such as China. Unlike in previous slides (where anyone online can do reverse DNS lookup in seconds), cell phone owners cannot follow CSEC's logic flow even for their own phone.

-2- The three ellipses show a practically identical logic flow. Even though the tool and widget logos are barely legible, they are evidently the same. In fact, the ellipse processes make very little use of high-powered Olympia tools. The icons primarily represent housekeeping widgets (filter, dummy, rename, sort, delete, etc.) that are useful but don't provide enough muscle to do more than shuffle record formats. The real work is done almost entirely by the large outlined-text H icon, which is not named in the redacted slide or seen elsewhere in menus or other algorithms. It will be called H for HANDSET here.

-3- The output (the smaller orange rectangle on the far right holding the Tradecraft Navigator icon) is key to understanding the steps of the algorithm. The output is provided for us below the schematic in the form of a small database with 8 fields and two records (the upper dark blue line is highlighted). Although it is highly unlikely these phones are still in use, the MSISDN numbers providing the original input are blued out, as are the IMSI and IMEI. Interestingly, their field names include the word 'correlation', suggesting that they cannot be unambiguously determined but are instead inferred from associations. The Motorola model is more specifically the MURQ7-3334411C11.

-4- The last column TOPI (Target Office of Primary Interest) here takes on the value CSEC, suggesting it is Five Eyes terminology. It's not clear why TOPI needs to be included as a database field. Perhaps adding MME to the NSA's target database - where priority, legal authority, resources needed and operational risk are reviewed - requires tracking of the originating partner agency. Since Canada lacks the malware and insert capabilities of NSA, Brazil's MME must go in the queue to compete with many other projects in the works.

-5- The output line 'Bands Supported by IMEI' can be read well enough that a google search can correct any letters initially misread. The result provides a look-up of the frequency bands the cell phone can use - which might be useful down the road for DRTBOX interception - and the various communication protocols, like GSM, WCDMA, FDD, HSUPA and HSDPA.

-6- To understand the main algorithm flow, it is necessary to delve into the meaning of the MSISDN, IMSI and IMEI, the three main numbers associated with a cell phone. While that seems straightforward, nominal explanations have to be corrected for online tools that make end runs around official protocols. Cell phones are commonly lost, stolen, re-sold, unlocked, unblocked, registered in one country but used in another, SIM cards replaced, chip sets re-soldered and so on. And that can take place on phones whose manufacturers violated all the rules for unique serial numbers, billing information and so forth.

MSISDN (Mobile Subscriber ISDN Number) is just the ordinary telephone number of a mobile phone, the one that would be on a business card. CSEC may have asked their Brazilian embassy to scan business cards of high-level MME staff acquired in the course of ordinary interaction. These selectors could account for the 9 DNR records mentioned here as initializers.

-7- Due to the blurred slide and erased annotations, we cannot follow exactly how CSEC gets from the MSISDN to the IMSI to the IMEI to the handset model. This cannot be straightforward, because the headers indicate correlation (possibly via different databases that share the time of call) rather than a determinative algorithm.
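The final step, IMEI to handset model, is at least structurally clear: the first 8 digits of an IMEI form the TAC (Type Allocation Code), which identifies the model, and the 15th digit is a Luhn check digit over the first 14. A minimal sketch - the TAC-to-model mapping below is purely illustrative, and the example IMEI is a commonly used test value, not one from the slide:

```python
def luhn_valid(imei: str) -> bool:
    """Check the IMEI's final Luhn digit (computed over the first 14 digits)."""
    total = 0
    for i, ch in enumerate(reversed(imei)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# The TAC (first 8 digits) identifies the handset model. This mapping is
# hypothetical; the real TAC values for the slide's handsets are unknown.
tac_to_model = {"49015420": "example handset (hypothetical TAC)"}

imei = "490154203237518"        # widely used example IMEI, not from the slide
print(luhn_valid(imei))         # True
print(tac_to_model.get(imei[:8], "unknown model"))
```

The hard part, of course, is the MSISDN-to-IMSI-to-IMEI correlation itself, which requires call records in which the numbers appear together - exactly what the 'correlation' field names hint at.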

In the CO-TRAVELER cloud analytics document, we see two years later that NSA cannot routinely obtain either the MSISDN or the IMEI starting from the IMSI in the SEDB Tower QFD summary database. Thus this slide is in some ways the most interesting of all, more's the pity that it was so poorly disclosed.

This slide provides a summary showing how all the information gathered can be used for BPoA (Backdoor Point of Access?) leading to further actions through:
- CNE (Computer Network Exploitation, such as cookie-replay, man-on-the-side attacks, CDR, etc.)

- Passive tasking (Upstream collection through backbone cable splitting and filtering, router intercept or telecom carrier cooperation)

- HUMINT-enabled (Human Intelligence, like information derived from voluntary, paid or bribed informants)

It's not clear whether CSEC could take things only so far, after which NSA and GCHQ had to step in to aid in an actual tapping, bugging or hacking operation.

This slide is reconstructed from the video footage and shows a diagram containing all the telephone and internet connections discovered in the OLYMPIA case study. At the left side of the slide there are the telephone connections and at the right side the internet links.

It's interesting to see that in this diagram there are also a number of SIGADs, which are codes designating interception facilities. It's not really clear whether they were used to collect the metadata used for the chaining by the OLYMPIA tools, or whether they were eventually used to conduct interception of content on these communication links.

At the telephony side we see DS-800 as the facility for phone lines between the Brazilian ministry and numbers in Ecuador and Venezuela. Telephone communications to some other countries are monitored by facilities designated US-3294 and US-966V.

Internet traffic between IP addresses from Global Village Telecom and internet providers in Africa, the Middle East and Canada are also monitored by DS-800. We can also see that for internet traffic to India there's a facility designated DS-200 (maybe because GCHQ has good access to India?).

> See also: What are SIGADs starting with DS for?

This slide seems to be the final one of the OLYMPIA case study presentation. The analyst writes that he identified mail servers, which have meanwhile been targeted by means of passive collection - that is, by tapping the traffic from internet backbone cables. Analysts have been assessing the value of these e-mail data.

The analyst also says that he is working with NSA's TAO division "to further examine the possibility for a Man on the Side operation". Here he's evidently referring to acessovpn.mme.gov.br. Based on the network information gathered, the Network Analysis Centre (NAC) of the British signals intelligence agency GCHQ has started "a BPoA analysis on the MME".

This shows that the OLYMPIA presentation was not just a software tutorial or an example of coding. The results prove CSEC actually ran this exercise against the Brazilian Ministry of Mines and Energy and got some real results: information about their telephone and internet connections, although probably far from complete.

As OLYMPIA is target-development software, this tool didn't gather any content of phone calls or e-mail messages, but this last slide tells us that as a result of the OLYMPIA effort, at least the e-mail of the Brazilian ministry became the subject of an actual collection operation.

> See also: An NSA eavesdropping case study

Links and Sources
- Vice.com: How does CSEC work with the world's most connected telecom company?
- Theoreti.ca: Interpreting the CSEC Presentation: Watch Out Olympians in the House!
- TheGlobeAndMail.com: Slides reveal Canada’s powerful espionage tool
- Globo.com: American and Canadian Spies target Brazilian Energy and Mining Ministry
- Anonymous: Total tear-down of Canada's Olympia spyware (pdf)