September 22, 2016

Secret report reveals: German BND also uses XKEYSCORE for data collection

(Updated: December 3, 2016)

Over the past few years we learned a lot about Germany's foreign intelligence service BND, although not from leaks, but from the public hearings of the parliamentary commission that investigates NSA spying operations and its cooperation with German agencies.

Recently however a secret government report was leaked to German media, which not only identifies violations of the data protection act but also reveals the codenames for several BND systems and the fact that BND uses the American XKEYSCORE system not only for analysis, but also for collection purposes.

Here, the new information from the secret report is combined with what we know from earlier sources and reporting.

- A secret report
- The SUSLAG liaison office
- Selectors provided by NSA: TND and SCRABBLE
      - BND's selector database: PBDB
- Operations SMARAGD and ZABBO
- Metadata analysis: VERAS
- Analysis and collection: XKEYSCORE
- Integrated analysis: MIRA 4
- Legal defects


The BND satellite intercept station at Bad Aibling, Germany
(Photo: AFP/Getty Images)


A secret report

The report that now has been published goes back to September 2013, when the then federal data protection commissioner Peter Schaar ordered a thorough inspection of the BND satellite intercept station in Bad Aibling, which took place on December 3 and 4 of that year.

In October 2014, Schaar's successor Andrea Voßhoff conducted a second visit to Bad Aibling, which in July 2015 resulted in an extensive and detailed report (German: Sachstandsbericht) about all the systems used at this BND station. This report was (and still is) classified as Top Secret.

Additionally, Voßhoff made a legal assessment based upon the Sachstandsbericht. This was finished in March 2016 and sent to then BND president Schindler and the federal chancellery. It was classified as Secret, but was leaked to regional broadcasters NDR and WDR and a transcription of the full document was published by the digital rights platform Netzpolitik.org on September 1.

Both reports are about the cooperation between BND and NSA, which goes back to 2004, when the Americans turned their satellite intercept station Bad Aibling (codenamed GARLICK) over to German intelligence. In return, BND had to share the results from its satellite collection with the NSA, for which the latter provided selectors, like e-mail addresses, phone numbers, etc. of the targets they were interested in.



Google Maps view of the Mangfall Barracks in Bad Aibling, Germany.
The building at the very top seems to be the BND facility,
the one nearby with the white roof NSA's "Tin Can".


The SUSLAG liaison office

After taking over the Bad Aibling satellite station, BND seems to have moved the control facility to the nearby Mangfall Barracks, which were taken over from the German armed forces (Bundeswehr) in 2002. For the Special US Liaison Activity Germany (SUSLAG), which is the liaison office of NSA for Germany, a new highly secure container building was built on the Mangfall Barracks premises in 2003 (nicknamed "tin can" or Blechdose).

According to the commissioner's report, the SUSLAG building and the building with BND servers and equipment are connected through a 100 MBit/s fiber optic cable. SUSLAG also has a technical data link to the NSA's primary communications hub in Europe, the European Technical Center (ETC) in the Mainz-Kastel district of the city of Wiesbaden.

Cooperation between the US and Germany in the Joint SIGINT Activity (JSA, 2004-2012) took place inside the BND building, for which NSA personnel had access permissions. After the JSA was terminated, SUSLAG personnel kept their entrance rights for the BND building, but it has separate rooms for highly sensitive information to which none of the Americans have access.

A letter from BND from October 15, 2015 says that at that moment, 10 people from NSA worked at SUSLAG, with the following access rights:
- 2 have access to building 7 (SUSLAG) only
- 4 have access to building 7 and building 4 (Administration)
- 4 have access to building 7 and building 8 (BND)

The SUSLAG building is only used by NSA personnel and BND claims that the data protection commissioner has no jurisdiction over the SUSLAG, but she disputes that and says the SUSLAG building is simply part of the BND complex. She also regrets that SUSLAG doesn't recognize her oversight authority.




Selectors provided by NSA: TND and SCRABBLE

For the satellite interception in Bad Aibling, some 4 out of 5 selectors come from NSA, the rest from BND. According to Süddeutsche Zeitung, NSA provided BND with roughly 690,000 phone numbers and 7.8 million internet identifiers between 2002 and 2013. That is an average of something like 60,000 phone numbers and 700,000 internet identifiers a year, or 164 phone numbers and over 1,900 internet identifiers each day.
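The averages above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes the period 2002 to 2013 is counted as eleven years of 365 days each, and it follows the article in rounding the yearly figures down before dividing by days.

```python
# Totals reported by Süddeutsche Zeitung for 2002-2013.
phone_numbers = 690_000
internet_identifiers = 7_800_000

# Assumption: the period is counted as 11 years of 365 days each.
years = 2013 - 2002
days_per_year = 365

phones_per_year = phone_numbers / years
ids_per_year = internet_identifiers / years

# The article rounds the yearly figures to 60,000 and 700,000
# before converting them to daily averages.
phones_per_day = 60_000 / days_per_year
ids_per_day = 700_000 / days_per_year

print(round(phones_per_year), round(ids_per_year))
print(round(phones_per_day), round(ids_per_day))
```

The unrounded yearly averages come out slightly higher than the article's round numbers, so the daily figures are best read as orders of magnitude.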

From the parliamentary hearings we already knew that BND personnel pull the American selectors from an NSA server, and the commissioner's report now reveals that this server is in NSA's ETC in Wiesbaden. BND also puts any results for these selectors back on this server. These data transfers from and to ETC go through the SUSLAG facility, but BND is able to get direct access to the NSA server in Wiesbaden through an FTP-gateway (a "BACOM system").

Selector databases

From an earlier parliamentary hearing we know that BND stores the selectors from NSA in two databases: one for IP selectors (from NSA only), and one for telephone selectors (from both NSA and BND). Each agency had access to its own IP database; the phone database was managed jointly, but BND could only approve or disapprove NSA selectors, and NSA could only do so with those from BND.

The names of these databases were not known until now, but the commissioner's report mentions them, along with some additional details:
- Target Number Database (TND), which has existed since 2008 and holds the telephone selectors from both NSA and BND. The latter either come from BND's own tasking database PBDB or are provided by domestic security services.

- SCRABBLE, which only holds selectors for packet-switched (internet) communications provided by NSA, after their format has been converted. These selectors initially had no description (Deutung, like a justification for the target). Because of this, BND temporarily stopped using them as of May 2015, and for the commissioner any results from them are unlawful because BND was not able to determine whether they are necessary for its mission.

Their names indicate that these database systems were provided by NSA, and together with the fact that they also contain NSA-provided selectors, this is likely the reason why these names were never mentioned during the parliamentary hearings - unlike those of BND's own systems.
Updates:

It was noticed that TND and SCRABBLE were actually mentioned once during the parliamentary hearings, when former BND president Schindler said that "the US has [its own] databases TND and SCRABBLE".

- PBDB - During a parliamentary hearing on November 9, 2016 it came out that BND's own tasking database PBDB (PersonenBezogene DatenBestände) became operational in the Summer of 2014, after a test period that started in late 2012. Both in this system and in the previous system, it is/was logged when for example a selector was deactivated. An even older system had no such logging capability. Before 2014, BND field stations had their own proprietary tasking databases, at least some of them maintaining their selectors using Excel lists.
The PBDB is maintained by the T2-branch from BND headquarters. Analysts can enter any selectors (often multiple ones for a particular target) into PBDB that they assume useful for foreign intelligence purposes. Newly entered selectors are checked (through the DAFIS system) at BND headquarters to make sure they don't pull in German communications.
Results generated by approved and activated selectors are enriched with PBDB data in order to attribute them to their target. Maybe results are also stored in the PBDB database, where they can be accessed by groups of 4 to 5 analysts working on the particular topic. After it came out that BND itself also used selectors related to partner countries, those selectors were moved to a separate partition (called Gruppenliste) of the PBDB database in October 2013, so they couldn't be tasked anymore.

Approval

Before being stored in the SCRABBLE and TND databases, both the telephone and internet selectors have to pass the DAFIS filtering system, which checks whether they belong to German citizens or companies or may otherwise contradict German interests. Accordingly, the selectors are marked as "allowed" or "protected".

Those marked "allowed" are subsequently being activated ("tasked") on the actual data collection systems. The report says that for this, hard selectors like phone numbers and e-mail addresses can be freely combined with content search terms (Inhaltssuchbegriffe) like key words, which could refer to the GENESIS language used for more complex XKEYSCORE searches.

According to the report, selectors marked as "protected" are sent back to NSA and are also deactivated in the TND and SCRABBLE databases - to make sure that they won't get activated when NSA provides them a second time. This confirms that there's no separate database (Ablehnungsdatei) with rejected selectors, as was suggested during the earlier parliamentary commission hearings.
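The approval flow described above can be sketched roughly as follows. This is an illustration only: the class, the function names and especially the looks_german() rule are hypothetical stand-ins, as the actual DAFIS criteria are not public.

```python
from dataclasses import dataclass


@dataclass
class Selector:
    value: str            # e.g. a phone number or e-mail address
    status: str = "new"   # "allowed" or "protected" after filtering
    active: bool = False  # True once tasked on a collection system


def looks_german(value: str) -> bool:
    """Hypothetical stand-in for the DAFIS check (real criteria unknown)."""
    return value.startswith("+49") or value.endswith(".de")


def approve(incoming: list[str], database: dict[str, Selector]) -> list[str]:
    """Mark each selector and return the rejected ones (to be sent back)."""
    rejected = []
    for value in incoming:
        known = database.get(value)
        if known is not None and known.status == "protected":
            # Rejected before: it stays deactivated when provided again.
            rejected.append(value)
            continue
        selector = Selector(value)
        if looks_german(value):
            selector.status = "protected"
            rejected.append(value)
        else:
            selector.status = "allowed"
            selector.active = True
        database[value] = selector
    return rejected


db: dict[str, Selector] = {}
first = approve(["+4930123456", "target@example.org"], db)
second = approve(["+4930123456"], db)  # same selector provided a second time
```

Keeping the rejected selector in the same database, marked as deactivated, is what makes a separate rejection file unnecessary in this sketch.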

BND refused the data protection commissioner access to TND and SCRABBLE, so she wasn't able to check the individual selectors. She regarded that as a massive restriction of her supervision authority.



Operations SMARAGD and ZABBO

Selectors that have been approved are sent to the systems that filter out communications that match those selectors. Some of these systems are in Germany, others are abroad. The report of commissioner Voßhoff for the first time discloses two specific data collection operations and their codewords:

- SMARAGD, a cable tapping operation somewhere outside Europe and in cooperation with another foreign intelligence agency.

- ZABBO, collection in Bad Aibling of satellite communications from Afghanistan.

There's no explanation for why the commissioner only mentions these two operations. The satellite antennas in Bad Aibling undoubtedly collect from many more countries, but maybe these are the only operations from which, during the investigation period, data were shared with NSA.

SMARAGD = WHARPDRIVE ?

The way SMARAGD is described perfectly fits a certain type of operation in which a 3rd Party partner of NSA, in this case BND, cooperates with yet another country that secretly provides access to data traffic, which is then also shared with NSA. According to the book Der NSA Komplex, BND and NSA conducted about half a dozen such operations in recent years.

In its English version of the news report about this issue, the website Netzpolitik.org points to an NSA document that was published earlier by Der Spiegel. In it, we see EMERALD mentioned as an alternate codename for the NSA operation WHARPDRIVE, which is exactly such a trilateral program in which a third secret service participates.

WHARPDRIVE was still active in 2013. In the Spring of that year, employees of the private company that operated the communication cables accidentally discovered the clandestine BND/NSA equipment, but the operation was rescued by providing a plausible cover story.*

The NSA report from April 2013 however said that "WHARPDRIVE has been identified for possible termination due to fiscal constraints", but this may have coincided with the exposure of the program in the book Der NSA Komplex in March 2014.

It should also be noted that Netzpolitik.org came up with this identification by translating the German codename SMARAGD into its English equivalent EMERALD. It is possible that the Americans also translated the German codeword SMARAGD into EMERALD, but just as likely is that it's a different program (maybe as a successor with the same set-up).

Update:
During a parliamentary hearing on November 9, 2016, member of parliament Renner said that SMARAGD is identical with EMERALD and that the operation was deactivated after Snowden, because it was mentioned in documents. BND-employee R.U. said that a cable access which terminates in Bad Aibling (likely the one from the SMARAGD operation) provided just a minimal data stream, through the fault of the foreign intelligence service (probably the 3rd partner involved).

Operation Eikonal

But there's another codeword connection: from 2004 till 2008, NSA cooperated with BND in operation EIKONAL in order to get access to fiber optic cables from Deutsche Telekom in Frankfurt.

From the parliamentary hearings we know that operation EIKONAL had GRANAT as its internal BND codename. And with GRANAT being German for garnet, and SMARAGD for emerald, we see that both operations are actually named after a gemstone, which often indicates some kind of similarity.

In October 2014, the Danish paper Information reported that the WHARPDRIVE access was opened in February 2013 and had the same size as EIKANOL. This operation EIKANOL or EIKONAL was a typical example of the way NSA cooperates with 3rd Party partner agencies under its RAMPART-A program, but unlike in the SMARAGD/WHARPDRIVE operations, the cable access point was inside Germany:


 
Left: bilateral cable access operation (RAMPART-A) - Right: trilateral cable access operation
In the cases discussed here, Germany would be "Country X"


It is tempting to identify SMARAGD and ZABBO as the two collection programs (SIGADs US-987LA and US-987LB) from the BOUNDLESSINFORMANT chart for Germany that was published in July 2013. For both facilities together, more than 552 million metadata records were counted between December 10, 2012 and January 8, 2013.

Provided that this chart shows the only data shared by BND, it's very well possible that the satellite collection program ZABBO is one of them. For the cable access SMARAGD this is less certain and depends on when this program started and whether it is identical with WHARPDRIVE (which started in February 2013).



BOUNDLESSINFORMANT screenshot showing metadata provided by BND

Data transfer

The report of the data protection commissioner also provides an impression of the BND networks through which collected data are brought back to headquarters.

Data collected abroad are sent back to Germany over the operational network ISNoVPN (apparently something that goes "over VPN" for secure tunneling) and then arrive at a dedicated demilitarized zone (DMZ) network for data collection (Datenabholungs-DMZ).

In this DMZ network there's a virtual machine (VM) that acts as a host for data that come in from each collection facility (Erfassungsansatz). The report mentions the virtual machines "Import VM SMARAGD" and "Import VM ZABBO" for the operations SMARAGD and ZABBO respectively.

In these virtual machines, the metadata go through an Application Level Gateway (ALG), which is a security component combined with a firewall. Such an ALG is able to detect, filter and, when necessary, delete data from an incoming data stream. Again, there's an ALG for each collection facility: for example SMARAGD-ALG for data from the SMARAGD collection effort.

Finally, the collected data arrive at a network called NG-Netz, which is the back-end in Bad Aibling of the transfer system that pulls in data collected at a front-end access point (Erfassungskopf) somewhere abroad.
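The idea of one filtering gateway per collection facility, as described above, could look something like the sketch below. The make_alg() helper, the record layout and the blocking rule are all hypothetical; the report doesn't say which rules the real ALGs apply.

```python
from typing import Callable, Iterable, Iterator

Record = dict[str, str]


def make_alg(
    name: str, blocked: Callable[[Record], bool]
) -> Callable[[Iterable[Record]], Iterator[Record]]:
    """Build a filter for one collection facility (e.g. "SMARAGD-ALG")."""

    def alg(stream: Iterable[Record]) -> Iterator[Record]:
        for record in stream:
            if blocked(record):
                continue  # delete the record from the incoming stream
            # Pass the record on, tagged with the gateway it came through.
            yield {**record, "via": name}

    return alg


# Hypothetical rule: drop records tagged with country code "DE".
smaragd_alg = make_alg("SMARAGD-ALG", lambda r: r.get("country") == "DE")
incoming = [{"id": "1", "country": "AF"}, {"id": "2", "country": "DE"}]
passed = list(smaragd_alg(incoming))
```

Because each facility gets its own gateway instance, rules can differ per collection effort without touching the others.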





Metadata analysis: VERAS

The system that BND uses for analysing bulk metadata from circuit-switched communications is called VERAS, which stands for Verkehrs-Analyse-System or Traffic Analysis System. VERAS stores metadata only for up to 90 days and according to the commissioner's report they are derived from two sources:

- Metadata that come with communications collected after matching with specific selectors (the related content goes to the INBE database)

- All the metadata from selected communication links (satellite frequencies and fiber optic channels) that are regarded useful for intelligence purposes, but only after passing the DAFIS filter.

According to the manual for VERAS version 4.3.x from 2010, the system has a topology mode, in which connections can be created level after level, similar to the "hops" we know from the NSA's contact chaining method. There's no limitation to the number of levels that can be added and analysts can also focus on specific targets to create patterns-of-life (Bewegungsprofile) for them.
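Level-by-level expansion of this kind is essentially a breadth-first search over a contact graph. The following is a generic sketch of that technique, not the VERAS implementation; the function name and the toy call records are made up for illustration.

```python
from collections import deque


def contact_chain(calls, seed, max_levels):
    """Return {identifier: level} for all contacts within max_levels hops."""
    # Build an undirected contact graph from (caller, callee) pairs.
    graph = {}
    for a, b in calls:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    levels = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if levels[current] == max_levels:
            continue  # do not expand beyond the requested level
        for neighbour in graph.get(current, ()):
            if neighbour not in levels:
                levels[neighbour] = levels[current] + 1
                queue.append(neighbour)
    return levels


calls = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
two_hops = contact_chain(calls, "A", 2)
```

Here D is three hops away from A and is therefore left out; raising max_levels pulls in ever more people, which is why an unlimited number of levels matters for the legal assessment.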

This kind of contact-chaining and metadata analysis inevitably involves metadata from innocent people. BND distinguishes between directly and indirectly relevant metadata. Directly relevant are metadata related to people who are already known to be, or suspected of being, relevant for intelligence purposes.

Indirectly relevant are metadata related to people who have some kind of connection to directly relevant people, or when such metadata are being stored from a "geographical point of view", which apparently refers to metadata of people being somewhere near a target without having been in direct contact.

The report says that connecting metadata on such a geographical basis results in many more people being involved than when using call or connection chaining. Data related to indirectly relevant people are also used by BND, for example as new selectors.

VERAS was introduced in 2002 and recently, VERAS 4 has been replaced by VERAS version 6, which was developed by the German armed forces (Bundeswehr) as part of the VERBA (VERkehrs-Beziehungs-Analyse) project.

For VERAS 6 there's not yet a database establishing order (see below), but in February 2015 BND sent the commissioner a draft version, which she already considers illegal because BND admits that it is technically impossible to prevent data of innocent people from being used in the VERAS system.



Analysis and collection: XKEYSCORE

Already in July 2013, Der Spiegel reported that BND president Schindler had informed the parliamentary intelligence oversight commission (PKGr) that his agency had been using NSA's XKEYSCORE system since 2007, but only for analysis, not for data collection. This was confirmed by W. K., a sub-division manager in the BND's Signals Intelligence division, during a parliamentary hearing.

But now, the report of the data protection commissioner says that BND uses XKEYSCORE not just for analysis, but also for the collection of both metadata and content.

The report explains that in its data collection, or front-end function, XKEYSCORE uses selectors, single ones or freely combined ones in the form of fingerprints, to search for matches in IP traffic of both public and private networks, and stores anything that matches these selectors.
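To give an idea of what combining selectors can mean, here is a minimal sketch in which a session only matches when it involves a tasked identifier and also contains a search term. This is just one possible combination (a logical AND), chosen for illustration; it is not the actual fingerprint syntax, and the field names are assumptions.

```python
def matches(session, hard_selectors, keywords):
    """True if the session has a tasked identifier AND contains a keyword."""
    has_selector = any(s in session["identifiers"] for s in hard_selectors)
    content = session["content"].lower()
    has_keyword = any(k.lower() in content for k in keywords)
    return has_selector and has_keyword


hard_selectors = ["target@example.org"]  # hard selector, e.g. an e-mail address
keywords = ["shipment"]                  # content search term

sessions = [
    {"identifiers": ["target@example.org"], "content": "The Shipment arrives Monday"},
    {"identifiers": ["target@example.org"], "content": "Happy birthday"},
    {"identifiers": ["someone@example.net"], "content": "shipment delayed"},
]
stored = [s for s in sessions if matches(s, hard_selectors, keywords)]
```

Only the first session is kept: the second lacks the keyword and the third lacks the tasked identifier.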

Remarkably enough, the commissioner writes that XKEYSCORE searches all internet traffic worldwide ("weltweit den gesamten Internetverkehr"), which seems to be a copy/paste from sensationalistic press reports, as XKEYSCORE can only search data which are collected at some physical access points and not even NSA has access to all the world's communications traffic, let alone BND.



Slide from an NSA presentation about the XKEYSCORE system


Besides picking out and storing communications that match specific selectors, XKEYSCORE is also able to store a so-called "full take", a temporary rolling buffer of all data from a particular link. This makes it possible to find files which aren't directly associated with specific selectors - which was heralded as its unique capability.

The commissioner's report only mentions this buffer function when it cites a BND response calling XKEYSCORE "a local and temporary buffering of data" which in their opinion doesn't make it a database. The commissioner disagrees and says it's a database, because even when it's just for a short time, the data are available for usage. This means there should have been a database establishing order for XKEYSCORE (see below).
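The principle of a rolling buffer that can still be searched while the data are in it can be sketched as follows. The class and the three-day retention window are hypothetical; the report doesn't specify how long BND's buffer holds data.

```python
from collections import deque


class RollingBuffer:
    """Keep only items newer than `retention` seconds.

    Timestamps are assumed to arrive in non-decreasing order.
    """

    def __init__(self, retention: float):
        self.retention = retention
        self._items = deque()  # (timestamp, payload), oldest first

    def add(self, timestamp: float, payload: str) -> None:
        self._items.append((timestamp, payload))
        self._expire(timestamp)

    def _expire(self, now: float) -> None:
        # Drop everything that has fallen out of the retention window.
        while self._items and now - self._items[0][0] > self.retention:
            self._items.popleft()

    def search(self, term: str) -> list[str]:
        """Retrospective search over whatever is still buffered."""
        return [p for _, p in self._items if term in p]


DAY = 24 * 3600
buf = RollingBuffer(retention=3 * DAY)  # hypothetical three-day window
buf.add(0, "session one mentions alpha")
buf.add(2 * DAY, "session two mentions beta")
buf.add(4 * DAY, "session three mentions alpha")
hits = buf.search("alpha")
```

The first session has already aged out when the search runs, but everything still inside the window can be queried like any other stored data, which is the point the commissioner makes.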

Front-end and back-end

The report doesn't explain what XKEYSCORE actually does in its function as a back-end analysis tool. But maybe instead of distinguishing between collection and analysis, we should look at the difference between the front-end and the back-end functions of the system, which is explained in a manual for its so-called Deepdive version.

This tells us that the back-end performs high-speed filtering and selection using both strong selectors (like e-mail addresses) and soft selectors (like key words), and also uses various plug-ins to extract and index the metadata, which are also used for the rolling buffer-functionality of XKEYSCORE:



Diagram showing the dataflow for the DeepDive version of XKEYSCORE


The front-end is where the intercepted data streams come in, which are first reassembled by the METTLESOME and xFip components. Then, only the most useful streams are forwarded based upon rules using country codes, keywords and such. Finally, the Defrag component conducts full sessionizing, which means that the separate IP packets that travel over the internet are reassembled into their original readable form again.
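In its most basic form, sessionizing means grouping packets that belong to the same flow and putting their payloads back in order. The sketch below shows only that core idea with made-up packet records; real reassembly also has to deal with ports, protocols, retransmissions and missing packets, and nothing here reflects how the Defrag component is actually built.

```python
from collections import defaultdict


def sessionize(packets):
    """Group packets by flow and join their payloads in sequence order."""
    flows = defaultdict(list)
    for pkt in packets:
        key = (pkt["src"], pkt["dst"])  # simplified flow identifier
        flows[key].append((pkt["seq"], pkt["data"]))
    # Sorting each flow by sequence number restores the original order.
    return {
        key: "".join(data for _, data in sorted(parts))
        for key, parts in flows.items()
    }


# Packets from two flows, arriving interleaved and out of order.
packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "seq": 2, "data": "world"},
    {"src": "10.0.0.3", "dst": "10.0.0.2", "seq": 1, "data": "other"},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "seq": 1, "data": "hello "},
]
sessions = sessionize(packets)
```

Doing this for every flow on a high-capacity link in real time is what makes the task demanding, even though the logic itself is simple.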

The commissioner's report says that initially the sessionizing of data from a particular communications link was conducted by another NSA system codenamed WEALTHYCLUSTER (WC, which is for lower data rates), but that this kind of processing was more and more taken over by XKEYSCORE (XKS).

So, if the distinction between collection and analysis corresponds to that between front-end and back-end, that means that the new thing we learned from the commissioner's report is that BND apparently also uses XKEYSCORE for sessionizing the data they collect, and not only for filtering and analysing them.

This sessionizing might seem rather obvious, but real-time filtering and sessionizing at data rates as high as 10 Gbit/s requires very fast, specialized and expensive equipment. Well-known manufacturers are Narus and Verint, and it seems likely that their equipment is used for XKEYSCORE too.

As XKEYSCORE is only used for internet communications, the NSA selectors are derived from the SCRABBLE database. The results of the collection are transferred to NSA, after having been filtered by DAFIS to get rid of data related to Germans.



Integrated analysis: MIRA 4

Besides all the systems mentioned before, BND also uses MIRA 4, which stands for Modulare Integrierte Ressourcen Architektur or Modular Integrated Resource Architecture, version 4. According to a letter from BND from February 2015, this system is used to store all the content, whether from e-mail, voice, fax or teletype messages, within a certain BND station and apparently also enables software to process and select raw data in order to create intelligence reports (Meldungen).

This was however contradicted by a letter from BND from December 2015 which said that MIRA 4 is only used to store those Meldungen. The commissioner replied that she would be grateful if BND could clarify this discrepancy.

Apparently not noticed by the commissioner is an NSA report from 2006, which was published earlier by Der Spiegel, and which says that German analytic tool suites like MIRA 4:
"integrate multiple database analytic functions (such as viewing voice and listening to fax [sic]), much like NSA Headquarters has UIS (User Integrated Services). In some ways, these tools have features that surpass US SIGINT capabilities. Among a series of interesting items, NSA analysts noted that BND analysts could seamlessly move from VERAS (call-chaining software) to the associated voice cuts."

Later on, the 2006 NSA report says: "The BND responded positively to NSA's request for a copy of MIRA4 and VERAS software, and made several requests from NSA concerning target and tool development and data".

During a parliamentary hearing in October 2014, BND's own data protection officer Ms. H. F. said that in 2010, MIRA 4 was replaced by INBE as a system that apparently not only stores the content of communications, but also makes it available for analysis.

The 2016 commissioner's report says that data stored in MIRA 4 were not migrated to INBE, when the latter system became operational in 2011. Data in MIRA 4 seem to have been automatically "aged off" after 90 days and the last backup of the system was destroyed in the Summer of 2014.



Legal defects

The purpose of the secret report by federal data protection commissioner Andrea Voßhoff was to determine the legality of the data collection, processing, storing and analysing systems at the BND field station in Bad Aibling. The two main problems she identified are about necessity and the lack of database establishing orders.

Necessity

According to the German data protection law, BND is only allowed to receive, store, process and analyse personal data after checking that they are necessary and relevant for its foreign intelligence mission as authorized by German law. In various cases, especially when it comes to bulk collection of metadata and receiving the selectors from NSA, the agency doesn't or cannot check the necessity for each piece of data. This makes it unlawful for BND to possess and use those data.

The problem behind this is that when such laws were made, there was no awareness of secret services using large sets of metadata, which also include those of innocent people. Also in this particular case, almost all data collected in Bad Aibling and shared with NSA will be collected from crisis zones like Afghanistan, which makes them more relevant for BND's mission and less likely to contain German communications.

Database establishing orders

Another major legal defect the commissioner found was that for the BND databases VERAS 4, VERAS 6, XKEYSCORE, TND, SCRABBLE, INBE, and DAFIS there was no database establishing order (Dateianordnung) and that they were also set up without prior approval by the commissioner. This makes the existence of these databases unlawful, which means the data they contain should be deleted immediately as long as no establishing order has been provided.

BND argued that the absence of a database establishing order is just a formal defect and doesn't affect the legal status of a database and its content. The commissioner doesn't agree with that and says that one of the functions of an establishing order is to determine the purpose of a database, which limits and restricts the use of the personal data in it. The lack of such an order also means that there are no rules for when approvals by oversight bodies are required, thus making the use of these databases both unlawful and uncontrolled.

In response

Meanwhile, on September 7, the German interior ministry released a draft for a new data protection act, in which it is proposed that in the future, the data protection commissioner will no longer have the authority to impose sanctions or fines on the secret services - so restricting the commissioner's authority rather than strengthening it.

Finally, on September 15, Edward Snowden also mentioned the commissioner's report on Twitter, saying that it "confirms mass surveillance". Apparently he didn't read the report, as it is actually about the lack of specific legal restrictions, not about the scope of BND's collection efforts.




Links and Sources
- Rolf Weber: Der geleakte BND-Bericht der BfDI Voßhoff -- wie gewohnt bei näherem Hinsehen wenig skandalträchtig
- Netzpolitik: Secret Report: German Federal Intelligence Service BND Violates Laws And Constitution By The Dozen
- Der Spiegel: NSA-Standorte in Deutschland: Wiesbaden
- Wikipedia: Operation Eikonal
