2018 Proceedings dblp program dblp Technical Sessions dblp Programme dblp
2019 Proceedings dblp program dblp Technical Sessions dblp Programme dblp
2020 Nov. 9-13, 2020, Orlando, USA May. 18-20, 2020, San Francisco, USA Aug. 12–14, 2020, Boston, USA Programme






2018 dblp

Session 2D: ML 1

Yet Another Text Captcha Solver: A Generative Adversarial Network Based Approach

Despite several attacks have been proposed, text-based CAPTCHAs are still being widely used as a security mechanism. One of the reasons for the pervasive use of text captchas is that many of the prior attacks are scheme-specific and require a labor-intensive and time-consuming process to construct. This means that a change in the captcha security features like a noisier background can simply invalid an earlier attack. This paper presents a generic, yet effective text captcha solver based on the generative adversarial network. Unlike prior machine-learning-based approaches that need a large volume of manually-labeled real captchas to learn an effective solver, our approach requires significantly fewer real captchas but yields much better performance. This is achieved by first learning a captcha synthesizer to automatically generate synthetic captchas to learn a base solver, and then fine-tuning the base solver on a small set of real captchas using transfer learning. We evaluate our approach by applying it to 33 captcha schemes, including 11 schemes that are currently being used by 32 of the top-50 popular websites including Microsoft, Wikipedia, eBay and Google. Our approach is the most capable attack on text captchas seen to date. It outperforms four state-of-the-art text-captcha solvers by not only delivering a significant higher accuracy on all testing schemes, but also successfully attacking schemes where others have zero chance. We show that our approach is highly efficient as it can solve a captcha within 0.05 second using a desktop GPU. We demonstrate that our attack is generally applicable because it can bypass the advanced security features employed by most modern text captcha schemes. We hope the results of our work can encourage the community to revisit the design and practical use of text captchas.

Model-Reuse Attacks on Deep Learning Systems

Many of today’s machine learning (ML) systems are built by reusing an array of, often pre-trained, primitive models, each fulfilling distinct functionality (e.g., feature extraction). The increasing use of primitive models significantly simplifies and expedites the development cycles of ML systems. Yet, because most of such models are contributed and maintained by untrusted sources, their lack of standardization or regulation entails profound security implications, about which little is known thus far. In this paper, we demonstrate that malicious primitive models pose immense threats to the security of ML systems. We present a broad class of model-reuse attacks wherein maliciously crafted models trigger host ML systems to misbehave on targeted inputs in a highly predictable manner. By empirically studying four deep learning systems (including both individual and ensemble systems) used in skin cancer screening, speech recognition, face verification, and autonomous steering, we show that such attacks are (i) effective - the host systems misbehave on the targeted inputs as desired by the adversary with high probability, (ii) evasive - the malicious models function indistinguishably from their benign counterparts on non-targeted inputs, (iii) elastic - the malicious models remain effective regardless of various system design choices and tuning strategies, and (iv) easy - the adversary needs little prior knowledge about the data used for system tuning or inference. We provide analytical justification for the effectiveness of model-reuse attacks, which points to the unprecedented complexity of today’s primitive models. This issue thus seems fundamental to many ML systems. We further discuss potential countermeasures and their challenges, which lead to several promising research directions.

LEMNA: Explaining Deep Learning based Security Applications

While deep learning has shown a great potential in various domains, the lack of transparency has limited its application in security or safety-critical areas. Existing research has attempted to develop explanation techniques to provide interpretable explanations for each classification decision. Unfortunately, current methods are optimized for non-security tasks ( e.g., image analysis). Their key assumptions are often violated in security applications, leading to a poor explanation fidelity. In this paper, we propose LEMNA, a high-fidelity explanation method dedicated for security applications. Given an input data sample, LEMNA generates a small set of interpretable features to explain how the input sample is classified. The core idea is to approximate a local area of the complex deep learning decision boundary using a simple interpretable model. The local interpretable model is specially designed to (1) handle feature dependency to better work with security applications ( e.g., binary code analysis); and (2) handle nonlinear local boundaries to boost explanation fidelity. We evaluate our system using two popular deep learning applications in security (a malware classifier, and a function start detector for binary reverse-engineering). Extensive evaluations show that LEMNA’s explanation has a much higher fidelity level compared to existing methods. In addition, we demonstrate practical use cases of LEMNA to help machine learning developers to validate model behavior, troubleshoot classification errors, and automatically patch the errors of the target models.

Effective Program Debloating via Reinforcement Learning

Prevalent software engineering practices such as code reuse and the “one-size-fits-all” methodology have contributed to significant and widespread increases in the size and complexity of software. The resulting software bloat has led to decreased performance and increased security vulnerabilities. We propose a system called Chisel to enable programmers to effectively customize and debloat programs. Chisel takes as input a program to be debloated and a high-level specification of its desired functionality. The output is a reduced version of the program that is correct with respect to the specification. Chisel significantly improves upon existing program reduction systems by using a novel reinforcement learning-based approach to accelerate the search for the reduced program and scale to large programs. Our evaluation on a suite of 10 widely used UNIX utility programs each comprising 13-90 KLOC of C source code demonstrates that Chisel is able to successfully remove all unwanted functionalities and reduce attack surfaces. Compared to two state-of-the-art program reducers C-Reduce and Perses, which time out on 6 programs and 2 programs espectively in 12 hours, Chisel runs up to 7.1x and 3.7x faster and finishes on all programs.

Session 3D: ML 2

Tiresias: Predicting Security Events Through Deep Learning

With the increased complexity of modern computer attacks, there is a need for defenders not only to detect malicious activity as it happens, but also to predict the specific steps that will be taken by an adversary when performing an attack. However this is still an open research problem, and previous research in predicting malicious events only looked at binary outcomes (eg. whether an attack would happen or not), but not at the specific steps that an attacker would undertake. To fill this gap we present Tiresias xspace, a system that leverages Recurrent Neural Networks (RNNs) to predict future events on a machine, based on previous observations. We test Tiresias xspace on a dataset of 3.4 billion security events collected from a commercial intrusion prevention system, and show that our approach is effective in predicting the next event that will occur on a machine with a precision of up to 0.93. We also show that the models learned by Tiresias xspace are reasonably stable over time, and provide a mechanism that can identify sudden drops in precision and trigger a retraining of the system. Finally, we show that the long-term memory typical of RNNs is key in performing event prediction, rendering simpler methods not up to the task.

DeepMem: Learning Graph Neural Network Models for Fast and Robust Memory Forensic Analysis

Kernel data structure detection is an important task in memory forensics that aims at identifying semantically important kernel data structures from raw memory dumps. It is primarily used to collect evidence of malicious or criminal behaviors. Existing approaches have several limitations: 1) list-traversal approaches are vulnerable to DKOM attacks, 2) robust signature-based approaches are not scalable or efficient, because it needs to search the entire memory snapshot for one kind of objects using one signature, and 3) both list-traversal and signature-based approaches all heavily rely on domain knowledge of operating system. Based on the limitations, we propose DeepMem, a graph-based deep learning approach to automatically generate abstract representations for kernel objects, with which we could recognize the objects from raw memory dumps in a fast and robust way. Specifically, we implement 1) a novel memory graph model that reconstructs the content and topology information of memory dumps, 2) a graph neural network architecture to embed the nodes in the memory graph, and 3) an object detection method that cross-validates the evidence collected from different parts of objects. Experiments show that DeepMem achieves high precision and recall rate in identify kernel objects from raw memory dumps. Also, the detection strategy is fast and scalable by using the intermediate memory graph representation. Moreover, DeepMem is robust against attack scenarios, like pool tag manipulation and DKOM process hiding.

Property Inference Attacks on Fully Connected Neural Networks using Permutation Invariant Representations

With the growing adoption of machine learning, sharing of learned models is becoming popular. However, in addition to the prediction properties the model producer aims to share, there is also a risk that the model consumer can infer other properties of the training data the model producer did not intend to share. In this paper, we focus on the inference of global properties of the training data, such as the environment in which the data was produced, or the fraction of the data that comes from a certain class, as applied to white-box Fully Connected Neural Networks (FCNNs). Because of their complexity and inscrutability, FCNNs have a particularly high risk of leaking unexpected information about their training sets; at the same time, this complexity makes extracting this information challenging. We develop techniques that reduce this complexity by noting that FCNNs are invariant under permutation of nodes in each layer. We develop our techniques using representations that capture this invariance and simplify the information extraction task. We evaluate our techniques on several synthetic and standard benchmark datasets and show that they are very effective at inferring various data properties. We also perform two case studies to demonstrate the impact of our attack. In the first case study we show that a classifier that recognizes smiling faces also leaks information about the relative attractiveness of the individuals in its training set. In the second case study we show that a classifier that recognizes Bitcoin mining from performance counters also leaks information about whether the classifier was trained on logs from machines that were patched for the Meltdown and Spectre attacks.

Machine Learning with Membership Privacy using Adversarial Regularization

Machine learning models leak significant amount of information about their training sets, through their predictions. This is a serious privacy concern for the users of machine learning as a service. To address this concern, in this paper, we focus on mitigating the risks of black-box inference attacks against machine learning models. We introduce a mechanism to train models with membership privacy, which ensures indistinguishability between the predictions of a model on its training data and other data points (from the same distribution). This requires minimizing the accuracy of the best black-box membership inference attack against the model. We formalize this as a min-max game, and design an adversarial training algorithm that minimizes the prediction loss of the model as well as the maximum gain of the inference attacks. This strategy, which can guarantee membership privacy (as prediction indistinguishability), acts also as a strong regularizer and helps generalizing the model. We evaluate the practical feasibility of our privacy mechanism on training deep neural networks using benchmark datasets. We show that the min-max strategy can mitigate the risks of membership inference attacks (near random guess), and can achieve this with a negligible drop in the model’s prediction accuracy (less than 4%).

Session 6D: Usable Security

Asking for a Friend: Evaluating Response Biases in Security User Studies

The security field relies on user studies, often including survey questions, to query end users’ general security behavior and experiences, or hypothetical responses to new messages or tools. Self-report data has many benefits – ease of collection, control, and depth of understanding – but also many well-known biases stemming from people’s difficulty remembering prior events or predicting how they might behave, as well as their tendency to shape their answers to a perceived audience. Prior work in fields like public health has focused on measuring these biases and developing effective mitigations; however, there is limited evidence as to whether and how these biases and mitigations apply specifically in a computer-security context. In this work, we systematically compare real-world measurement data to survey results, focusing on an exemplar, well-studied security behavior: software updating. We align field measurements about specific software updates (n=517,932) with survey results in which participants respond to the update messages that were used when those versions were released (n=2,092). This allows us to examine differences in self-reported and observed update speeds, as well as examining self-reported responses to particular message features that may correlate with these results. The results indicate that for the most part, self-reported data varies consistently and systematically with measured data. However, this systematic relationship breaks down when survey respondents are required to notice and act on minor details of experimental manipulations. Our results suggest that many insights from self-report security data can, when used with care, translate to real-world environments; however, insights about specific variations in message texts or other details may be more difficult to assess with surveys.

Towards Usable Checksums: Automating the Integrity Verification of Web Downloads for the Masses

Internet users can download software for their computers from app stores (e.g., Mac App Store and Windows Store) or from other sources, such as the developers’ websites. Most Internet users in the US rely on the latter, according to our representative study, which makes them directly responsible for the content they download. To enable users to detect if the downloaded files have been corrupted, developers can publish a checksum together with the link to the program file; users can then manually verify that the checksum matches the one they obtain from the downloaded file. In this paper, we assess the prevalence of such behavior among the general Internet population in the US (N=2,000), and we develop easy-to-use tools for users and developers to automate both the process of checksum verification and generation. Specifically, we propose an extension to the recent W3C specification for sub-resource integrity in order to provide integrity protection for download links. Also, we develop an extension for the popular Chrome browser that computes and verifies checksums of downloaded files automatically, and an extension for the WordPress CMS that developers can use to easily attach checksums to their remote content. Our in situ experiments with 40participants demonstrate the usability and effectiveness issues of checksums verification, and shows user desirability for our extension.

Investigating System Operators’ Perspective on Security Misconfigurations

Nowadays, security incidents have become a familiar “nuisance,” and they regularly lead to the exposure of private and sensitive data. The root causes for such incidents are rarely complex attacks. Instead, they are enabled by simple misconfigurations, such as authentication not being required, or security updates not being installed. For example, the leak of over 140 million Americans’ private data from Equifax’s systems is among most severe misconfigurations in recent history: The underlying vulnerability was long known, and a security patch had been available for months, but was never applied. Ultimately, Equifax blamed an employee for forgetting to update the affected system, highlighting his personal responsibility. In this paper, we investigate the operators’ perspective on security misconfigurations to approach the human component of this class of security issues. We focus our analysis on system operators, who have not received significant attention by prior research. Hence, we investigate their perspective with an inductive approach and apply a multi-step empirical methodology: (i), a qualitative study to understand how to approach the target group and measure the misconfiguration phenomenon (ii) a quantitative survey rooted in the qualitative data. We then provide the first analysis of system operators’ perspective on security misconfigurations, and we determine the factors that operators perceive as the root causes. Based on our findings, we provide practical recommendations on how to reduce security misconfigurations’ frequency and impact.

Peeling the Onion’s User Experience Layer: Examining Naturalistic Use of the Tor Browser

The strength of an anonymity system depends on the number of users. Therefore, User eXperience (UX) and usability of these systems is of critical importance for boosting adoption and use. To this end, we carried out a study with 19 non-expert participants to investigate how users experience routine Web browsing via the Tor Browser, focusing particularly on encountered problems and frustrations. Using a mixed-methods quantitative and qualitative approach to study one week of naturalistic use of the Tor Browser, we uncovered a variety of UX issues, such as broken Web sites, latency, lack of common browsing conveniences, differential treatment of Tor traffic, incorrect geolocation, operational opacity, etc. We applied this insight to suggest a number of UX improvements that could mitigate the issues and reduce user frustration when using the Tor Browser.

Session 7C: TLS

Pseudo Constant Time Implementations of TLS Are Only Pseudo Secure

Today, about 10% of TLS connections are still using CBC-mode cipher suites, despite a long history of attacks and the availability of better options (e.g. AES-GCM). In this work, we present three new types of attack against four popular fully patched implementations of TLS (Amazon’s s2n, GnuTLS, mbed TLS and wolfSSL) which elected to use “pseudo constant time” countermeasures against the Lucky 13 attack on CBC-mode. Our attacks combine several variants of the PRIME+PROBE cache timing technique with a new extension of the original Lucky 13 attack. They apply in a cross-VM attack setting and are capable of recovering most of the plaintext whilst requiring only a moderate number of TLS connections. Along the way, we uncovered additional serious (but easy to patch) bugs in all four of the TLS implementations that we studied; in three cases, these bugs lead to Lucky 13 style attacks that can be mounted remotely with no access to a shared cache. Our work shows that adopting pseudo constant time countermeasures is not sufficient to attain real security in TLS implementations in CBC mode.

Partially Specified Channels: The TLS 1.3 Record Layer without Elision

We advance the study of secure stream-based channels (Fischlin et al., CRYPTO ‘15) by considering the multiplexing of many data streams over a single channel, an essential feature of real world protocols such as TLS. Our treatment adopts the definitional perspective of Rogaway and Stegers (CSF ‘09), which offers an elegant way to reason about what standardizing documents actually provide: a partial specification of a protocol that admits a collection of compliant, fully realized implementations. We formalize partially specified channels as the component algorithms of two parties communicating over a channel. Each algorithm has an oracle that provides specification details ; the algorithms abstract the things that must be explicitly specified, while the oracle abstracts the things that need not be. Our security notions, which capture a variety of privacy and integrity goals, allow the adversary to respond to these oracle queries; security relative to these notions implies that the channel withstands attacks in the presence of worst-case (i.e., adversarial) realizations of the specification details. We apply this framework to a formal treatment of the TLS 13 record and, in doing so, show that its security hinges crucially upon details left unspecified by the standard.

The Multi-user Security of GCM, Revisited: Tight Bounds for Nonce Randomization

Multi-user (mu) security considers large-scale attackers (e.g., state actors) that given access to a number of sessions, attempt to compromise at least one of them. Mu security of authenticated encryption (AE) was explicitly considered in the development of TLS 1.3. This paper revisits the mu security of GCM, which remains to date the most widely used dedicated AE mode. We provide new concrete security bounds which improve upon previous work by adopting a refined parameterization of adversarial resources that highlights the impact on security of (1) nonce re-use across users and of (2) re-keying. As one of the main applications, we give tight security bounds for the nonce-randomization mechanism adopted in the record protocol of TLS 1.3 as a mitigation of large-scale multi-user attacks. We provide tight security bounds that yield the first validation of this method. In particular, we solve the main open question of Bellare and Tackmann (CRYPTO ‘16), who only considered restricted attackers which do not attempt to violate integrity, and only gave non-tight bounds.

Session 8A: Web Security 1

Predicting Impending Exposure to Malicious Content from User Behavior

Many computer-security defenses are reactive—they operate only when security incidents take place, or immediately thereafter. Recent efforts have attempted to predict security incidents before they occur, to enable defenders to proactively protect their devices and networks. These efforts have primarily focused on long-term predictions. We propose a system that enables proactive defenses at the level of a single browsing session. By observing user behavior, it can predict whether they will be exposed to malicious content on the web seconds before the moment of exposure, thus opening a window of opportunity for proactive defenses. We evaluate our system using three months’ worth of HTTP traffic generated by 20,645 users of a large cellular provider in 2017 and show that it can be helpful, even when only very low false positive rates are acceptable, and despite the difficulty of making “on-the-fly’’ predictions. We also engage directly with the users through surveys asking them demographic and security-related questions, to evaluate the utility of self-reported data for predicting exposure to malicious content. We find that self-reported data can help forecast exposure risk over long periods of time. However, even on the long-term, self-reported data is not as crucial as behavioral measurements to accurately predict exposure.

Clock Around the Clock: Time-Based Device Fingerprinting

Physical device fingerprinting exploits hardware features to uniquely identify a machine. This technique has been used for authentication, license binding, or attackers identification, among other tasks. More recently, hardware features have also been introduced to identify web users and perform web tracking. A particular type of hardware fingerprint exploits differences in the computer internal clock signals. However, previous methods to test for these differences relied on complex experiments performed by running native code in the target machine. In this paper, we show a new way to compute a hardware finger- printing, based on timing the execution of sequences of instructions readily available in API functions. Due to its simplicity, this method can also be performed remotely by simply timing few seemingly innocuous lines of JavaScript code. We tested our approach with different functions, such as common string manipulation or widespread cryptographic routines, and found that several of them can be used as basic blocks for fingerprinting. Using this technique, we implemented a tool called CryptoFP. We tested its native implementation in a homogeneous scenario, to distinguish among a perfectly identical (both in software and hardware) set of computers. CryptoFP was able to correctly discriminate all the identical computers in this scenario and recognize the same computer also under different CPU load configurations, outperforming every other hardware fingerprinting method. We then show how CryptoFP can be implemented using a combination of the HTML5 Cryptography API and standard timing API for web device fingerprinting. In this case, we compared our method, both in the same homogeneous scenario and by performing an experiment with real-world users running heterogeneous devices, against other state-of-the-art web device fingerprinting solutions. In both cases, our approach clearly outperforms all existing methods.

The Web’s Sixth Sense: A Study of Scripts Accessing Smartphone Sensors

We present the first large-scale measurement of smartphone sensor API usage and stateless tracking on the mobile web. We extend the OpenWPM web privacy measurement tool to develop OpenWPM-Mobile, adding the ability to emulate plausible sensor values for different smartphone sensors such as motion, orientation, proximity and light. Using OpenWPM-Mobile we find that one or more sensor APIs are accessed on 3695 of the top 100K websites by scripts originating from 603 distinct domains. We also detect fingerprinting attempts on mobile platforms, using techniques previously applied in the desktop setting. We find significant overlap between fingerprinting scripts and scripts accessing sensor data. For example, 63% of the scripts that access motion sensors also engage in browser fingerprinting. To better understand the real-world uses of sensor APIs, we cluster JavaScript programs that access device sensors and then perform automated code comparison and manual analysis. We find a significant disparity between the actual and intended use cases of device sensor as drafted by W3C. While some scripts access sensor data to enhance user experience, such as orientation detection and gesture recognition, tracking and analytics are the most common use cases among the scripts we analyzed. We automated the detection of sensor data exfiltration and observed that the raw readings are frequently sent to remote servers for further analysis. Finally, we evaluate available countermeasures against the misuse of sensor APIs. We find that popular tracking protection lists such as EasyList and Disconnect commonly fail to block most tracking scripts that misuse sensors. Studying nine popular mobile browsers we find that even privacy-focused browsers, such as Brave and Firefox Focus, fail to implement mitigations suggested by W3C, which includes limiting sensor access from insecure contexts and cross-origin iframes. We have reported these issues to the browser vendors.

Session 8B: Usable Passwords

Reinforcing System-Assigned Passphrases Through Implicit Learning

People tend to choose short and predictable passwords that are vulnerable to guessing attacks. Passphrases are passwords consisting of multiple words, initially introduced as more secure authentication keys that people could recall. Unfortunately, people tend to choose predictable natural language patterns in passphrases, again resulting in vulnerability to guessing attacks. One solution could be system-assigned passphrases, but people have difficulty recalling them. With the goal of improving the usability of system-assigned passphrases, we propose a new approach of reinforcing system-assigned passphrases using implicit learning techniques. We design and test a system that implements this approach using two implicit learning techniques: contextual cueing and semantic priming. In a 780-participant online study, we explored the usability of 4-word system-assigned passphrases using our system compared to a set of control conditions. Our study showed that our system significantly improves usability of system-assigned passphrases, both in terms of recall rates and login time.

“What was that site doing with my Facebook password?”: Designing Password-Reuse Notifications

Password reuse is widespread, so a breach of one provider’s password database threatens accounts on other providers. When companies find stolen credentials on the black market and notice potential password reuse, they may require a password reset and send affected users a notification. Through two user studies, we provide insight into such notifications. In Study 1, 180 respondents saw one of six representative notifications used by companies in situations potentially involving password reuse. Respondents answered questions about their reactions and understanding of the situation. Notifications differed in the concern they elicited and intended actions they inspired. Concerningly, less than a third of respondents reported intentions to change any passwords. In Study 2, 588 respondents saw one of 15 variations on a model notification synthesizing results from Study 1. While the variations’ impact differed in small ways, respondents’ intended actions across all notifications would leave them vulnerable to future password-reuse attacks. We discuss best practices for password-reuse notifications and how notifications alone appear insufficient in solving password reuse.

On the Accuracy of Password Strength Meters

Password strength meters are an important tool to help users choose secure passwords. Strength meters can only then provide reasonable guidance when they are accurate, i.e., their score correctly reflect password strength. A strength meter with low accuracy may do more harm than good and guide the user to choose passwords with a high score but low actual security. While a substantial number of different strength meters is proposed in the literature and deployed in practice, we are lacking a clear picture of which strength meters provide high accuracy, and thus are most helpful for guiding users. Furthermore, we lack a clear understanding of how to compare accuracies of strength meters. In this work, (i) we propose a set of properties that a strength meter needs to fulfill to be considered to have high accuracy, (ii) we use these properties to select a suitable measure that can determine the accuracy of strength meters, and (iii) we use the selected measure to compare a wide range of strength meters proposed in the academic literature, provided by password managers, operating systems, and those used on websites. We expect our work to be helpful in the selection of good password strength meters by service operators, and to aid the further development of improved strength meters.

Session 9A: Web Security 2

Mystique: Uncovering Information Leakage from Browser Extensions

Browser extensions are small JavaScript, CSS and HTML programs that run inside the browser with special privileges. These programs, often written by third parties, operate on the pages that the browser is visiting, giving the user a programmatic way to configure the browser. The privacy implications that arise by allowing privileged third-party code to execute inside the users’ browser are not well understood. In this paper, we develop a taint analysis framework for browser extensions and use it to perform a large scale study of extensions in regard to their privacy practices. We first present a hybrid approach to traditional taint analysis: by leveraging the fact that extension source code is available to the runtime JavaScript engine, we implement as well as enhance traditional taint analysis using information gathered from static data flow and control-flow analysis of the JavaScript source code. Based on this, we further modify the Chromium browser to support taint tracking for extensions. We analyzed 178,893 extensions crawled from the Chrome Web Store between September 2016 and March 2018, as well as a separate set of all available extensions (2,790 in total) for the Opera browser at the time of analysis. From these, our analysis flagged 3,868 (2.13%) extensions as potentially leaking privacy-sensitive information. The top 10 most popular Chrome extensions that we confirmed to be leaking privacy-sensitive information have more than 60 million users combined. We ran the analysis on a local Kubernetes cluster and were able to finish within a month, demonstrating the feasibility of our approach for large-scale analysis of browser extensions. At the same time, our results emphasize the threat browser extensions pose to user privacy, and the need for countermeasures to safeguard against misbehaving extensions that abuse their privileges.

How You Get Shot in the Back: A Systematical Study about Cryptojacking in the Real World

As a new mechanism to monetize web content, cryptocurrency mining is becoming increasingly popular. The idea is simple: a webpage delivers extra workload (JavaScript) that consumes computational resources on the client machine to solve cryptographic puzzles, typically without notifying users or having explicit user consent. This new mechanism, often heavily abused and thus considered a threat termed “cryptojacking”, is estimated to affect over 10 million web users every month; however, only a few anecdotal reports exist so far and little is known about its severeness, infrastructure, and technical characteristics behind the scene. This is likely due to the lack of effective approaches to detect cryptojacking at a large-scale (e.g., VirusTotal). In this paper, we take a first step towards an in-depth study over cryptojacking. By leveraging a set of inherent characteristics of cryptojacking scripts, we build CMTracker, a behavior-based detector with two runtime profilers for automatically tracking Cryptocurrency Mining scripts and their related domains. Surprisingly, our approach successfully discovered 2,770 unique cryptojacking samples from 853,936 popular web pages, including 868 among top 100K in Alexa list. Leveraging these samples, we gain a more comprehensive picture of the cryptojacking attacks, including their impact, distribution mechanisms, obfuscation, and attempts to evade detection. For instance, a diverse set of organizations benefit from cryptojacking based on the unique wallet ids. In addition, to stay under the radar, they frequently update their attack domains (fastflux) on the order of days. Many attackers also apply evasion techniques, including limiting the CPU usage, obfuscating the code, etc.

MineSweeper: An In-depth Look into Drive-by Cryptocurrency Mining and Its Defense

A wave of alternative coins that can be effectively mined without specialized hardware, and a surge in cryptocurrencies’ market value has led to the development of cryptocurrency mining ( cryptomining ) services, such as Coinhive, which can be easily integrated into websites to monetize the computational power of their visitors. While legitimate website operators are exploring these services as an alternative to advertisements, they have also drawn the attention of cybercriminals: drive-by mining (also known as cryptojacking ) is a new web-based attack, in which an infected website secretly executes JavaScript code and/or a WebAssembly module in the user’s browser to mine cryptocurrencies without her consent. In this paper, we perform a comprehensive analysis on Alexa’s Top 1 Million websites to shed light on the prevalence and profitability of this attack. We study the websites affected by drive-by mining to understand the techniques being used to evade detection, and the latest web technologies being exploited to efficiently mine cryptocurrency. As a result of our study, which covers 28 Coinhive-like services that are widely being used by drive-by mining websites, we identified 20 active cryptomining campaigns. Motivated by our findings, we investigate possible countermeasures against this type of attack. We discuss how current blacklisting approaches and heuristics based on CPU usage are insufficient, and present MineSweeper, a novel detection technique that is based on the intrinsic characteristics of cryptomining code, and, thus, is resilient to obfuscation. Our approach could be integrated into browsers to warn users about silent cryptomining when visiting websites that do not ask for their consent.

Pride and Prejudice in Progressive Web Apps: Abusing Native App-like Features in Web Applications

Progressive Web App (PWA) is a new generation of Web application designed to provide native app-like browsing experiences even when a browser is offline. PWAs make full use of new HTML5 features which include push notification, cache, and service worker to provide short-latency and rich Web browsing experiences. We conduct the first systematic study of the security and privacy aspects unique to PWAs. We identify security flaws in main browsers as well as design flaws in popular third-party push services, that exacerbate the phishing risk. We introduce a new side-channel attack that infers the victim’s history of visited PWAs. The proposed attack exploits the offline browsing feature of PWAs using a cache. We demonstrate a cryptocurrency mining attack which abuses service workers. Defenses and recommendations to mitigate the identified security and privacy risks are suggested with in-depth understanding.

Session 9D: Vulnerability Detection

Block Oriented Programming: Automating Data-Only Attacks

With the widespread deployment of Control-Flow Integrity (CFI), control-flow hijacking attacks, and consequently code reuse attacks, are significantly more difficult. CFI limits control flow to well-known locations, severely restricting arbitrary code execution. Assessing the remaining attack surface of an application under advanced control-flow hijack defenses such as CFI and shadow stacks remains an open problem. We introduce BOPC, a mechanism to automatically assess whether an attacker can execute arbitrary code on a binary hardened with CFI/shadow stack defenses. BOPC computes exploits for a target program from payload specifications written in a Turing-complete, high-level language called SPL that abstracts away architecture and program-specific details. SPL payloads are compiled into a program trace that executes the desired behavior on top of the target binary. The input for BOPC is an SPL payload, a starting point (e.g., from a fuzzer crash) and an arbitrary memory write primitive that allows application state corruption. To map SPL payloads to a program trace, BOPC introduces Block Oriented Programming (BOP), a new code reuse technique that utilizes entire basic blocks as gadgets along valid execution paths in the program, i.e., without violating CFI or shadow stack policies. We find that the problem of mapping payloads to program traces is NP-hard, so BOPC first reduces the search space by pruning infeasible paths and then uses heuristics to guide the search to probable paths. BOPC encodes the BOP payload as a set of memory writes. We execute 13 SPL payloads applied to 10 popular applications. BOPC successfully finds payloads and complex execution traces – which would likely not have been found through manual analysis – while following the target’s Control-Flow Graph under an ideal CFI policy in 81% of the cases.

Threat Intelligence Computing

Cyber threat hunting is the process of proactively and iteratively formulating and validating threat hypotheses based on security-relevant observations and domain knowledge. To facilitate threat hunting tasks, this paper introduces threat intelligence computing as a new methodology that models threat discovery as a graph computation problem. It enables efficient programming for solving threat discovery problems, equipping threat hunters with a suite of potent new tools for agile codifications of threat hypotheses, automated evidence mining, and interactive data inspection capabilities. A concrete realization of a threat intelligence computing platform is presented through the design and implementation of a domain-specific graph language with interactive visualization support and a distributed graph database. The platform was evaluated in a two-week DARPA competition for threat detection on a test bed comprising a wide variety of systems monitored in real time. During this period, sub-billion records were produced, streamed, and analyzed, dozens of threat hunting tasks were dynamically planned and programmed, and attack campaigns with diverse malicious intent were discovered. The platform exhibited strong detection and analytics capabilities coupled with high efficiency, resulting in a leadership position in the competition. Additional evaluations on comprehensive policy reasoning are outlined to demonstrate the versatility of the platform and the expressiveness of the language.

Check It Again: Detecting Lacking-Recheck Bugs in OS Kernels

Operating system kernels carry a large number of security checks to validate security-sensitive variables and operations. For example, a security check should be embedded in a code to ensure that a user-supplied pointer does not point to the kernel space. Using security-checked variables is typically safe. However, in reality, security-checked variables are often subject to modification after the check. If a recheck is lacking after a modification, security issues may arise, e.g., adversaries can control the checked variable to launch critical attacks such as out-of-bound memory access or privilege escalation. We call such cases lacking-recheck (LRC) bugs, a subclass of TOCTTOU bugs, which have not been explored yet. In this paper, we present the first in-depth study of LRC bugs and develop LRSan, a static analysis system that systematically detects LRC bugs in OS kernels. Using an inter-procedural analysis and multiple new techniques, LRSan first automatically identifies security checks, critical variables, and uses of the checked variables, and then reasons about whether a modification is present after a security check. A case in which a modification is present but a recheck is lacking is an LRC bug. We apply LRSan to the latest Linux kernel and evaluate the effectiveness of LRSan. LRSan reports thousands of potential LRC cases, and we have confirmed 19 new LRC bugs. We also discuss patching strategies of LRC bugs based on our study and bug-fixing experience.

Revery: From Proof-of-Concept to Exploitable

Automatic exploit generation is an open challenge. Existing solutions usually explore in depth the crashing paths, i.e., paths taken by proof-of-concept (POC) inputs triggering vulnerabilities, and generate exploits when exploitable states are found along the paths. However, exploitable states do not always exist in crashing paths. Moreover, existing solutions heavily rely on symbolic execution and are not scalable in path exploration and exploit generation. In addition, few solutions could exploit heap-based vulnerabilities. In this paper, we propose a new solution revery to search for exploitable states in paths diverging from crashing paths, and generate control-flow hijacking exploits for heap-based vulnerabilities. It adopts three novel techniques:(1) a digraph to characterize a vulnerability’s memory layout and its contributor instructions;(2) a fuzz solution to explore diverging paths, which have similar memory layouts as the crashing paths, in order to search more exploitable states and generate corresponding diverging inputs;(3) a stitch solution to stitch crashing paths and diverging paths together, and synthesize EXP inputs able to trigger both vulnerabilities and exploitable states. We have developed a prototype of revery based on the binary analysis engine angr, and evaluated it on a set of 19 real world CTF (capture the flag) challenges. Experiment results showed that it could generate exploits for 9 (47%) of them, and generate EXP inputs able to trigger exploitable states for another 5 (26%) of them.

Session 10A: TOR

Deep Fingerprinting: Undermining Website Fingerprinting Defenses with Deep Learning

Website fingerprinting enables a local eavesdropper to determine which websites a user is visiting over an encrypted connection. State-of-the-art website fingerprinting attacks have been shown to be effective even against Tor. Recently, lightweight website fingerprinting defenses for Tor have been proposed that substantially degrade existing attacks: WTF-PAD and Walkie-Talkie. In this work, we present Deep Fingerprinting (DF), a new website fingerprinting attack against Tor that leverages a type of deep learning called Convolutional Neural Networks (CNN) with a sophisticated architecture design, and we evaluate this attack against WTF-PAD and Walkie-Talkie. The DF attack attains over 98% accuracy on Tor traffic without defenses, better than all prior attacks, and it is also the only attack that is effective against WTF-PAD with over 90% accuracy. Walkie-Talkie remains effective, holding the attack to just 49.7% accuracy. In the more realistic open-world setting, our attack remains effective, with 0.99 precision and 0.94 recall on undefended traffic. Against traffic defended with WTF-PAD in this setting, the attack still can get 0.96 precision and 0.68 recall. These findings highlight the need for effective defenses that protect against this new attack and that could be deployed in Tor.

Privacy-Preserving Dynamic Learning of Tor Network Traffic

Experimentation tools facilitate exploration of Tor performance and security research problems and allow researchers to safely and privately conduct Tor experiments without risking harm to real Tor users. However, researchers using these tools configure them to generate network traffic based on simplifying assumptions and outdated measurements and without understanding the efficacy of their configuration choices. In this work, we design a novel technique for dynamically learning Tor network traffic models using hidden Markov modeling and privacy-preserving measurement techniques. We conduct a safe but detailed measurement study of Tor using 17 relays (~2% of Tor bandwidth) over the course of 6 months, measuring general statistics and models that can be used to generate a sequence of streams and packets. We show how our measurement results and traffic models can be used to generate traffic flows in private Tor networks and how our models are more realistic than standard and alternative network traffic generation~methods.

DeepCorr: Strong Flow Correlation Attacks on Tor Using Deep Learning

Flow correlation is the core technique used in a multitude of deanonymization attacks on Tor. Despite the importance of flow correlation attacks on Tor, existing flow correlation techniques are considered to be ineffective and unreliable in linking Tor flows when applied at a large scale, i.e., they impose high rates of false positive error rates or require impractically long flow observations to be able to make reliable correlations. In this paper, we show that, unfortunately, flow correlation attacks can be conducted on Tor traffic with drastically higher accuracies than before by leveraging emerging learning mechanisms. We particularly design a system, called DeepCorr, that outperforms the state-of-the-art by significant margins in correlating Tor connections. DeepCorr leverages an advanced deep learning architecture to learn a flow correlation function tailored to Tor’s complex network- this is in contrast to previous works’ use of generic statistical correlation metrics to correlate Tor flows. We show that with moderate learning, DeepCorr can correlate Tor connections (and therefore break its anonymity) with accuracies significantly higher than existing algorithms, and using substantially shorter lengths of flow observations. For instance, by collecting only about 900 packets of each target Tor flow (roughly 900KB of Tor data), DeepCorr provides a flow correlation accuracy of 96% compared to 4% by the state-of-the-art system of RAPTOR using the same exact setting. We hope that our work demonstrates the escalating threat of flow correlation attacks on Tor given recent advances in learning algorithms, calling for the timely deployment of effective countermeasures by the Tor community.

Measuring Information Leakage in Website Fingerprinting Attacks and Defenses

Tor provides low-latency anonymous and uncensored network access against a local or network adversary. Due to the design choice to minimize traffic overhead (and increase the pool of potential users) Tor allows some information about the client’s connections to leak. Attacks using (features extracted from) this information to infer the website a user visits are called Website Fingerprinting (WF) attacks. We develop a methodology and tools to measure the amount of leaked information about a website. We apply this tool to a comprehensive set of features extracted from a large set of websites and WF defense mechanisms, allowing us to make more fine-grained observations about WF attacks and defenses.

Session 10D: Fuzzing, Exploitation, and Side Channels

Hawkeye: Towards a Desired Directed Grey-box Fuzzer

Grey-box fuzzing is a practically effective approach to test real-world programs. However, most existing grey-box fuzzers lack directedness, i.e. the capability of executing towards user-specified target sites in the program. To emphasize existing challenges in directed fuzzing, we propose Hawkeye to feature four desired properties of directed grey-box fuzzers. Owing to a novel static analysis on the program under test and the target sites, Hawkeye precisely collects the information such as the call graph, function and basic block level distances to the targets. During fuzzing, Hawkeye evaluates exercised seeds based on both static information and the execution traces to generate the dynamic metrics, which are then used for seed prioritization, power scheduling and adaptive mutating. These strategies help Hawkeye to achieve better directedness and gravitate towards the target sites. We implemented Hawkeye as a fuzzing framework and evaluated it on various real-world programs under different scenarios. The experimental results showed that Hawkeye can reach the target sites and reproduce the crashes much faster than state-of-the-art grey-box fuzzers such as AFL and AFLGo. Specially, Hawkeye can reduce the time to exposure for certain vulnerabilities from about 3.5 hours to 0.5 hour. By now, Hawkeye has detected more than 41 previously unknown crashes in projects such as Oniguruma, MJS with the target sites provided by vulnerability prediction tools; all these crashes are confirmed and 15 of them have been assigned CVE IDs.

ret2spec: Speculative Execution Using Return Stack Buffers

Speculative execution is an optimization technique that has been part of CPUs for over a decade. It predicts the outcome and target of branch instructions to avoid stalling the execution pipeline. However, until recently, the security implications of speculative code execution have not been studied. In this paper, we investigate a special type of branch predictor that is responsible for predicting return addresses. To the best of our knowledge, we are the first to study return address predictors and their consequences for the security of modern software. In our work, we show how return stack buffers (RSBs), the core unit of return address predictors, can be used to trigger misspeculations. Based on this knowledge, we propose two new attack variants using RSBs that give attackers similar capabilities as the documented Spectre attacks. We show how local attackers can gain arbitrary speculative code execution across processes, e.g., to leak passwords another user enters on a shared system. Our evaluation showed that the recent Spectre countermeasures deployed in operating systems can also cover such RSB-based cross-process attacks. Yet we then demonstrate that attackers can trigger misspeculation in JIT environments in order to leak arbitrary memory content of browser processes. Reading outside the sandboxed memory region with JIT-compiled code is still possible with 80% accuracy on average.

Evaluating Fuzz Testing

Fuzz testing has enjoyed great success at discovering security critical bugs in real software. Recently, researchers have devoted significant effort to devising new fuzzing techniques, strategies, and algorithms. Such new ideas are primarily evaluated experimentally so an important question is: What experimental setup is needed to produce trustworthy results? We surveyed the recent research literature and assessed the experimental evaluations carried out by 32 fuzzing papers. We found problems in every evaluation we considered. We then performed our own extensive experimental evaluation using an existing fuzzer. Our results showed that the general problems we found in existing experimental evaluations can indeed translate to actual wrong or misleading assessments. We conclude with some guidelines that we hope will help improve experimental evaluations of fuzz testing algorithms, making reported results more robust.

Rendered Insecure: GPU Side Channel Attacks are Practical

Graphics Processing Units (GPUs) are commonly integrated with computing devices to enhance the performance and capabilities of graphical workloads. In addition, they are increasingly being integrated in data centers and clouds such that they can be used to accelerate data intensive workloads. Under a number of scenarios the GPU can be shared between multiple applications at a fine granularity allowing a spy application to monitor side channels and attempt to infer the behavior of the victim. For example, OpenGL and WebGL send workloads to the GPU at the granularity of a frame, allowing an attacker to interleave the use of the GPU to measure the side-effects of the victim computation through performance counters or other resource tracking APIs. We demonstrate the vulnerability using two applications. First, we show that an OpenGL based spy can fingerprint websites accurately, track user activities within the website, and even infer the keystroke timings for a password text box with high accuracy. The second application demonstrates how a CUDA spy application can derive the internal parameters of a neural network model being used by another CUDA application, illustrating these threats on the cloud. To counter these attacks, the paper suggests mitigations based on limiting the rate of the calls, or limiting the granularity of the returned information.

2019 dblp

Session 1A: Attack I

1 Trillion Dollar Refund: How To Spoof PDF Signatures

The Portable Document Format (PDF) is the de-facto standard for document exchange worldwide. To guarantee the authenticity and integrity of documents, digital signatures are used. Several public and private services ranging from governments, public enterprises, banks, and payment services rely on the security of PDF signatures.

In this paper, we present the first comprehensive security evaluation on digital signatures in PDFs. We introduce three novel attack classes which bypass the cryptographic protection of digitally signed PDF files allowing an attacker to spoof the content of a signed PDF. We analyzed 22 different PDF viewers and found 21 of them to be vulnerable, including prominent and widely used applications such as Adobe Reader DC and Foxit. We additionally evaluated eight online validation services and found six to be vulnerable. A possible explanation for these results could be the absence of a standard algorithm to verify PDF signatures – each client verifies signatures differently, and attacks can be tailored to these differences. We, therefore, propose the standardization of a secure verification algorithm, which we describe in this paper.

All findings have been responsibly disclosed, and the affected vendors were supported during fixing the issues. As a result, three generic CVEs for each attack class were issued [50-52]. Our research on PDF signatures and more information is also online available at https://www.pdf-insecurity.org/.

Practical Decryption exFiltration: Breaking PDF Encryption

The Portable Document Format, better known as PDF, is one of the most widely used document formats worldwide, and in order to ensure information confidentiality, this file format supports document encryption. In this paper, we analyze PDF encryption and show two novel techniques for breaking the confidentiality of encrypted documents. First, we abuse the PDF feature of partially encrypted documents to wrap the encrypted part of the document within attacker-controlled content and therefore, exfiltrate the plaintext once the document is opened by a legitimate user. Second, we abuse a flaw in the PDF encryption specification to arbitrarily manipulate encrypted content. The only requirement is that a single block of known plaintext is needed, and we show that this is fulfilled by design. Our attacks allow the recovery of the entire plaintext of encrypted documents by using exfiltration channels which are based on standard compliant PDF properties. We evaluated our attacks on 27 widely used PDF viewers and found all of them to be vulnerable. We responsibly disclosed the vulnerabilities and supported the vendors in fixing the issues.

Session 1C: Cloud Security I

A Machine-Checked Proof of Security for AWS Key Management Service

We present a machine-checked proof of security for the domain management protocol of Amazon Web Services’ KMS (Key Management Service) a critical security service used throughout AWS and by AWS customers. Domain management is at the core of AWS KMS; it governs the top-level keys that anchor the security of encryption services at AWS. We show that the protocol securely implements an ideal distributed encryption mechanism under standard cryptographic assumptions. The proof is machine-checked in the EasyCrypt proof assistant and is the largest EasyCrypt development to date.

Mitigating Leakage in Secure Cloud-Hosted Data Structures: Volume-Hiding for Multi-Maps via Hashing

Volume leakage has recently been identified as a major threat to the security of cryptographic cloud-based data structures by Kellaris \em et al. [CCS’16] (see also the attacks in Grubbs \em et al. [CCS’18] and Lacharité \em et al. [S&P’18]). In this work, we focus on volume-hiding implementations of \em encrypted multi-maps as first considered by Kamara and Moataz [Eurocrypt’19]. Encrypted multi-maps consist of outsourcing the storage of a multi-map to an untrusted server, such as a cloud storage system, while maintaining the ability to perform private queries. Volume-hiding encrypted multi-maps ensure that the number of responses (volume) for any query remains hidden from the adversarial server. As a result, volume-hiding schemes can prevent leakage attacks that leverage the adversary’s knowledge of the number of query responses to compromise privacy. We present both conceptual and algorithmic contributions towards volume-hiding encrypted multi-maps. We introduce the first formal definition of volume-hiding leakage functions. In terms of design, we present the first volume-hiding encrypted multi-map dprfMM whose storage and query complexity are both asymptotically optimal. Furthermore, we experimentally show that our construction is practically efficient. Our server storage is smaller than the best previous construction while we improve query complexity by a factor of 10-16x. In addition, we introduce the notion of differentially private volume-hiding leakage functions which strikes a better, tunable balance between privacy and efficiency. To accompany our new notion, we present a differentially private volume-hiding encrypted multi-map dpMM whose query complexity is the volume of the queried key plus an additional logarithmic factor. This is a significant improvement compared to all previous volume-hiding schemes whose query overhead was the maximum volume of any key. In natural settings, our construction improves the average query overhead by a factor of 150-240x over the previous best volume-hiding construction even when considering small privacy budget of ε=0.2.

Session 2B: ML Security I

Neural Network Inversion in Adversarial Setting via Background Knowledge Alignment

The wide application of deep learning technique has raised new security concerns about the training data and test data. In this work, we investigate the model inversion problem under adversarial settings, where the adversary aims at inferring information about the target model’s training data and test data from the model’s prediction values. We develop a solution to train a second neural network that acts as the inverse of the target model to perform the inversion. The inversion model can be trained with black-box accesses to the target model. We propose two main techniques towards training the inversion model in the adversarial settings. First, we leverage the adversary’s background knowledge to compose an auxiliary set to train the inversion model, which does not require access to the original training data. Second, we design a truncation-based technique to align the inversion model to enable effective inversion of the target model from partial predictions that the adversary obtains on victim user’s data. We systematically evaluate our approach in various machine learning tasks and model architectures on multiple image datasets. We also confirm our results on Amazon Rekognition, a commercial prediction API that offers “machine learning as a service”. We show that even with partial knowledge about the black-box model’s training data, and with only partial prediction values, our inversion approach is still able to perform accurate inversion of the target model, and outperform previous approaches.

Privacy Risks of Securing Machine Learning Models against Adversarial Examples

The arms race between attacks and defenses for machine learning models has come to a forefront in recent years, in both the security community and the privacy community. However, one big limitation of previous research is that the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain will have any unexpected impact on the other domain. In this paper, we take a step towards resolving this limitation by combining the two domains. In particular, we measure the success of membership inference attacks against six state-of-the-art defense methods that mitigate the risk of adversarial examples (i.e., evasion attacks). Membership inference attacks determine whether or not an individual data record has been part of a model’s training set. The accuracy of such attacks reflects the information leakage of training algorithms about individual members of the training set. Adversarial defense methods against adversarial examples influence the model’s decision boundaries such that model predictions remain unchanged for a small area around each input. However, this objective is optimized on training data. Thus, individual data records in the training set have a significant influence on robust models. This makes the models more vulnerable to inference attacks. To perform the membership inference attacks, we leverage the existing inference methods that exploit model predictions. We also propose two new inference methods that exploit structural properties of robust models on adversarially perturbed data. Our experimental evaluation demonstrates that compared with the natural training (undefended) approach, adversarial defense methods can indeed increase the target model’s risk against membership inference attacks. When using adversarial defenses to train the robust models, the membership inference advantage increases by up to 4.5 times compared to the naturally undefended models. Beyond revealing the privacy risks of adversarial defenses, we further investigate the factors, such as model capacity, that influence the membership information leakage.

MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples

In a membership inference attack, an attacker aims to infer whether a data sample is in a target classifier’s training dataset or not. Specifically, given a black-box access to the target classifier, the attacker trains a binary classifier, which takes a data sample’s confidence score vector predicted by the target classifier as an input and predicts the data sample to be a member or non-member of the target classifier’s training dataset. Membership inference attacks pose severe privacy and security threats to the training dataset. Most existing defenses leverage differential privacy when training the target classifier or regularize the training process of the target classifier. These defenses suffer from two key limitations: 1) they do not have formal utility-loss guarantees of the confidence score vectors, and 2) they achieve suboptimal privacy-utility tradeoffs. In this work, we propose MemGuard,the first defense with formal utility-loss guarantees against black-box membership inference attacks. Instead of tampering the training process of the target classifier, MemGuard adds noise to each confidence score vector predicted by the target classifier. Our key observation is that attacker uses a classifier to predict member or non-member and classifier is vulnerable to adversarial examples.Based on the observation, we propose to add a carefully crafted noise vector to a confidence score vector to turn it into an adversarial example that misleads the attacker’s classifier. Specifically, MemGuard works in two phases. In Phase I, MemGuard finds a carefully crafted noise vector that can turn a confidence score vector into an adversarial example, which is likely to mislead the attacker’s classifier to make a random guessing at member or non-member. We find such carefully crafted noise vector via a new method that we design to incorporate the unique utility-loss constraints on the noise vector. In Phase II, MemGuard adds the noise vector to the confidence score vector with a certain probability, which is selected to satisfy a given utility-loss budget on the confidence score vector. Our experimental results on three datasets show that MemGuard can effectively defend against membership inference attacks and achieve better privacy-utility tradeoffs than existing defenses. Our work is the first one to show that adversarial examples can be used as defensive mechanisms to defend against membership inference attacks.

Procedural Noise Adversarial Examples for Black-Box Attacks on Deep Convolutional Networks

Deep Convolutional Networks (DCNs) have been shown to be vulnerable to adversarial examples—perturbed inputs specifically designed to produce intentional errors in the learning algorithms at test time. Existing input-agnostic adversarial perturbations exhibit interesting visual patterns that are currently unexplained. In this paper, we introduce a structured approach for generating Universal Adversarial Perturbations (UAPs) with procedural noise functions. Our approach unveils the systemic vulnerability of popular DCN models like Inception v3 and YOLO v3, with single noise patterns able to fool a model on up to 90% of the dataset. Procedural noise allows us to generate a distribution of UAPs with high universal evasion rates using only a few parameters. Additionally, we propose Bayesian optimization to efficiently learn procedural noise parameters to construct inexpensive untargeted black-box attacks. We demonstrate that it can achieve an average of less than 10 queries per successful attack, a 100-fold improvement on existing methods. We further motivate the use of input-agnostic defences to increase the stability of models to adversarial perturbations. The universality of our attacks suggests that DCN models may be sensitive to aggregations of low-level class-agnostic features. These findings give insight on the nature of some universal adversarial perturbations and how they could be generated in other applications.

Session 2E: Internet Security

SICO: Surgical Interception Attacks by Manipulating BGP Communities

The Border Gateway Protocol (BGP) is the primary routing protocol for the Internet backbone, yet it lacks adequate security mechanisms. While simple BGP hijack attacks only involve an adversary hijacking Internet traffic destined to a victim, more complex and challenging interception attacks require that adversary intercept a victim’s traffic and forward it on to the victim. If an interception attack is launched incorrectly, the adversary’s attack will disrupt its route to the victim making it impossible to forward packets. To overcome these challenges, we introduce SICO attacks (Surgical Interception using COmmunities): a novel method of launching interception attacks that leverages BGP communities to scope an adversary’s attack and ensure a route to the victim. We then show how SICO attacks can be targeted to specific source IP addresses for reducing attack costs. Furthermore, we ethically perform SICO attacks on the real Internet backbone to evaluate their feasibility and effectiveness. Results suggest that SICO attacks can achieve interception even when previously proposed attacks would not be feasible and outperforms them by attracting traffic from an additional 16% of Internet hosts (worst case) and 58% of Internet hosts (best case). Finally, we analyze the Internet topology to find that at least 83% of multi-homed ASes are capable of launching these attacks.

Just the Tip of the Iceberg: Internet-Scale Exploitation of Routers for Cryptojacking

The release of an efficient browser-based cryptominer, as introduced by Coinhive in 2017, has quickly spread throughout the web either as a new source of revenue for websites or exploited within the context of hacks and malicious advertisements. Several studies have analyzed the Alexa Top 1M and found 380 - 3,200 (0.038% - 0.32%) to be actively mining, with an estimated $41,000 per month revenue for the top 10 perpetrators. While placing a cryptominer on a popular website supplies considerable returns from its visitors’ web browsers, it only generates revenue while a client is visiting the page. Even though large popular websites attract millions of visitors, the relatively low number of exploiting websites limits the total revenue that can be made. In this paper, we report on a new attack vector that drastically overshadows all existing cryptojacking activity discovered to date. Through a firmware vulnerability in MikroTik routers, cyber criminals are able to rewrite outgoing user traffic and embed cryptomining code in every outgoing web connection. Thus, every web page visited by any user behind an infected router would mine to profit the criminals. Based on NetFlows recorded in a Tier 1 network, semiweekly crawls and telescope traffic, we followed their activities over a period of 10 months, and report on the modus operandi and coordinating infrastructure of the perpetrators, which were during this period in control of up to 1.4M routers, approximately 70% of all MikroTik devices deployed worldwide. We observed different levels of sophistication among adversaries, ranging from individual installations to campaigns involving large numbers of routers. Our results show that cryptojacking through MITM attacks is highly lucrative, a factor of 30 more than previous attack vectors.

Network Hygiene, Incentives, and Regulation: Deployment of Source Address Validation in the Internet

The Spoofer project has collected data on the deployment and characteristics of IP source address validation on the Internet since 2005. Data from the project comes from participants who install an active probing client that runs in the background. The client automatically runs tests both periodically and when it detects a new network attachment point. We analyze the rich dataset of Spoofer tests in multiple dimensions: across time, networks, autonomous systems, countries, and by Internet protocol version. In our data for the year ending August 2019, at least a quarter of tested ASes did not filter packets with spoofed source addresses leaving their networks. We show that routers performing Network Address Translation do not always filter spoofed packets, as 6.4% of IPv4/24 tested in the year ending August 2019 did not filter. Worse, at least two thirds of tested ASes did not filter packets entering their networks with source addresses claiming to be from within their network that arrived from outside their network. We explore several approaches to encouraging remediation and the challenges of evaluating their impact. While we have been able to remediate 352 IPv4/24, we have found an order of magnitude more IPv4/24 that remains unremediated, despite myriad remediation strategies, with 21% unremediated for more than six months. Our analysis provides the most complete and confident picture of the Internet’s susceptibility to date of this long-standing vulnerability. Although there is no simple solution to address the remaining long-tail of unremediated networks, we conclude with a discussion of possible non-technical interventions, and demonstrate how the platform can support evaluation of the impact of such interventions over time.

Security Certification in Payment Card Industry: Testbeds, Measurements, and Recommendations

The massive payment card industry (PCI) involves various entities such as merchants, issuer banks, acquirer banks, and card brands. Ensuring security for all entities that process payment card information is a challenging task. The PCI Security Standards Council requires all entities to be compliant with the PCI Data Security Standard (DSS), which specifies a series of security requirements. However, little is known regarding how well PCI DSS is enforced in practice. In this paper, we take a measurement approach to systematically evaluate the PCI DSS certification process for e-commerce websites. We develop an e-commerce web application testbed, BuggyCart, which can flexibly add or remove 35 PCI DSS related vulnerabilities. Then we use the testbed to examine the capability and limitations of PCI scanners and the rigor of the certification process. We find that there is an alarming gap between the security standard and its real-world enforcement. None of the 6 PCI scanners we tested are fully compliant with the PCI scanning guidelines, issuing certificates to merchants that still have major vulnerabilities. To further examine the compliance status of real-world e-commerce websites, we build a new lightweight scanning tool named PciCheckerLite and scan 1,203 e-commerce websites across various business sectors. The results confirm that 86% of the websites have at least one PCI DSS violation that should have disqualified them as non-compliant. Our in-depth accuracy analysis also shows that PciCheckerLite’s output is more precise than w3af. We reached out to the PCI Security Council to share our research results to improve the enforcement in practice.

Session 3A: Fuzzing: Methods and Applications

Matryoshka: Fuzzing Deeply Nested Branches

Greybox fuzzing has made impressive progress in recent years, evolving from heuristics-based random mutation to approaches for solving individual branch constraints. However, they have difficulty solving path constraints that involve deeply nested conditional statements, which are common in image and video decoders, network packet analyzers, and checksum tools. We propose an approach for addressing this problem. First, we identify all the control flow-dependent conditional statements of the target conditional statement. Next, we select the taint flow-dependent conditional statements. Finally, we use three strategies to find an input that satisfies all conditional statements simultaneously. We implemented this approach in a tool called Matryoshka and compared its effectiveness on 13 open source programs against other state-of-the-art fuzzers. Matryoshka has significantly higher cumulative line and branch coverage than AFL, QSYM, and Angora. We manually classified the crashes found by Matryoshka into 41 unique new bugs and obtained 12 CVEs. Our evaluation also uncovered the key technique contributing to Matryoshka’s impressive performance: it collects only the nesting constraints that may cause the target conditional statement unreachable, which greatly simplifies the path constraints that it has to solve.

Intriguer: Field-Level Constraint Solving for Hybrid Fuzzing

Hybrid fuzzing, which combines fuzzing and concolic execution, is promising in light of the recent performance improvements in concolic engines. We have observed that there is room for further improvement: symbolic emulation is still slow, unnecessary constraints dominate solving time, resources are overly allocated, and hard-to-trigger bugs are missed. To address these problems, we present a new hybrid fuzzer named Intriguer. The key idea of Intriguer is field-level constraint solving, which optimizes symbolic execution with field-level knowledge. Intriguer performs instruction-level taint analysis and records execution traces without data transfer instructions like mov. Intriguer then reduces the execution traces for tainted instructions that accessed a wide range of input bytes, and infers input fields to build field transition trees. With these optimizations, Intriguer can efficiently perform symbolic emulation for more relevant instructions and invoke a solver for complicated constraints only. Our evaluation results indicate that Intriguer outperforms the state-of-the-art fuzzers: Intriguer found all the bugs in the LAVA-M(5h) benchmark dataset for ground truth performance, and also discovered 43 new security bugs in seven real-world programs. We reported the bugs and received 23 new CVEs.

Learning to Fuzz from Symbolic Execution with Application to Smart Contracts

Fuzzing and symbolic execution are two complementary techniques for discovering software vulnerabilities. Fuzzing is fast and scalable, but can be ineffective when it fails to randomly select the right inputs. Symbolic execution is thorough but slow and often does not scale to deep program paths with complex path conditions. In this work, we propose to learn an effective and fast fuzzer from symbolic execution, by phrasing the learning task in the framework of imitation learning. During learning, a symbolic execution expert generates a large number of quality inputs improving coverage on thousands of programs. Then, a fuzzing policy, represented with a suitable architecture of neural networks, is trained on the generated dataset. The learned policy can then be used to fuzz new programs. We instantiate our approach to the problem of fuzzing smart contracts, a domain where contracts often implement similar functionality (facilitating learning) and security is of utmost importance. We present an end-to-end system, ILF (for Imitation Learning based Fuzzer), and an extensive evaluation over >18K contracts. Our results show that ILF is effective: (i) it is fast, generating 148 transactions per second, (ii) it outperforms existing fuzzers (e.g., achieving 33% more coverage), and (iii) it detects more vulnerabilities than existing fuzzing and symbolic execution tools for Ethereum.

Session 5C: Cloud Security II

Houdini’s Escape: Breaking the Resource Rein of Linux Control Groups

Linux Control Groups, i.e., cgroups, are the key building blocks to enable operating-system-level containerization. The cgroups mechanism partitions processes into hierarchical groups and applies different controllers to manage system resources, including CPU, memory, block I/O, etc. Newly spawned child processes automatically copy cgroups attributes from their parents to enforce resource control. Unfortunately, inherited cgroups confinement via process creation does not always guarantee consistent and fair resource accounting. In this paper, we devise a set of exploiting strategies to generate out-of-band</>workloads via de-associating processes from their original process groups. The system resources consumed by such workloads will not be charged to the appropriate cgroups. To further demonstrate the feasibility, we present five case studies within Docker containers to demonstrate how to break the resource rein of cgroups in realistic scenarios. Even worse, by exploiting those cgroups’ insufficiencies in a multi-tenant container environment, an adversarial container is able to greatly amplify the amount of consumed resources, significantly slow-down other containers on the same host, and gain extra unfair advantages on the system resources. We conduct extensive experiments on both a local testbed and an Amazon EC2 cloud dedicated server. The experimental results demonstrate that a container can consume system resources (e.g., CPU) as much as $200\times$ of its limit, and reduce both computing and I/O performance of particular workloads in other co-resident containers by 95%.

Insecure Until Proven Updated: Analyzing AMD SEV’s Remote Attestation

Cloud computing is one of the most prominent technologies to host Internet services that unfortunately leads to an increased risk of data theft. Customers of cloud services have to trust the cloud providers, as they control the building blocks that form the cloud. This includes the hypervisor enabling the sharing of a single hardware platform among multiple tenants. Executing in a higher-privileged CPU mode, the hypervisor has direct access to the memory of virtual machines. While data at rest can be protected using well-known disk encryption methods, data residing in main memory is still threatened by a potentially malicious cloud provider. AMD Secure Encrypted Virtualization (SEV) claims a new level of protection in such cloud scenarios. AMD SEV encrypts the main memory of virtual machines with VM-specific keys, thereby denying the higher-privileged hypervisor access to a guest’s memory. To enable the cloud customer to verify the correct deployment of his virtual machine, SEV additionally introduces a remote attestation protocol. This protocol is a crucial component of the SEV technology that can prove that SEV protection is in place and that the virtual machine was not subject to manipulation. This paper analyzes the firmware components that implement the SEV remote attestation protocol on the current AMD Epyc Naples CPU series. We demonstrate that it is possible to extract critical</> CPU-specific keys that are fundamental for the security of the remote attestation protocol. Building on the extracted keys, we propose attacks that allow a malicious cloud provider a complete circumvention of the SEV protection mechanisms. Although the underlying firmware issues were already fixed by AMD, we show that the current series of AMD Epyc CPUs, i.e., the Naples series, does not prevent the installation of previous firmware versions. We show that the severity of our proposed attacks is very high as no purely software-based mitigations are possible. This effectively renders the SEV technology on current AMD Epyc CPUs useless when confronted with an untrusted cloud provider. To overcome these issues, we also propose robust changes to the SEV design that allow future generations of the SEV technology to mitigate the proposed attacks.

Session 5E: Fingerprinting

Triplet Fingerprinting: More Practical and Portable Website Fingerprinting with N-shot Learning

Website Fingerprinting (WF) attacks pose a serious threat to users’ online privacy, including for users of the Tor anonymity system. By exploiting recent advances in deep learning, WF attacks like Deep Fingerprinting (DF) have reached up to 98% accuracy. The DF attack, however, requires large amounts of training data that needs to be updated regularly, making it less practical for the weaker attacker model typically assumed in WF. Moreover, research on WF attacks has been criticized for not demonstrating attack effectiveness under more realistic and more challenging scenarios. Most research on WF attacks assumes that the testing and training data have similar distributions and are collected from the same type of network at about the same time. In this paper, we examine how an attacker could leverage N-shot learning—a machine learning technique requiring just a few training samples to identify a given class—to reduce the effort of gathering and training with a large WF dataset as well as mitigate the adverse effects of dealing with different network conditions. In particular, we propose a new WF attack called Triplet Fingerprinting (TF) that uses triplet networks for N-shot learning. We evaluate this attack in challenging settings such as where the training and testing data are collected multiple years apart on different networks, and we find that the TF attack remains effective in such settings with 85% accuracy or better. We also show that the TF attack is also effective in the open world and outperforms traditional transfer learning. On top of that, the attack requires only five examples to recognize a website, making it dangerous in a wide variety of scenarios where gathering and training on a complete dataset would be impractical.

DeMiCPU: Device Fingerprinting with Magnetic Signals Radiated by CPU

With the widespread use of smart devices, device authentication has received much attention. One popular method for device authentication is to utilize internally-measured device fingerprints, such as device ID, software or hardware-based characteristics. In this paper, we propose DeMiCPU, a stimulation-response-based device fingerprinting technique that relies on externally-measured information, i.e., magnetic induction (MI) signals emitted from the CPU module that consists of the CPU chip and its affiliated power supply circuits. The key insight of DeMiCPU is that hardware discrepancies essentially exist among CPU modules and thus the corresponding MI signals make promising device fingerprints, which are difficult to be modified or mimicked. We design a stimulation and a discrepancy extraction scheme and evaluate them with 90 mobile devices, including 70 laptops (among which 30 are of totally identical CPU and operating system) and 20 smartphones. The results show that DeMiCPU can achieve 99.1% precision and recall on average, and 98.6% precision and recall for the 30 identical devices, with a fingerprinting time of 0.6 s. In addition, the performance can be further improved to 99.9% with multi-round fingerprinting.

Session 6B: ML Security II

QUOTIENT: Two-Party Secure Neural Network Training and Prediction

Recently, there has been a wealth of effort devoted to the design of secure protocols for machine learning tasks. Much of this is aimed at enabling secure prediction from highly-accurate Deep Neural Networks (DNNs). However, as DNNs are trained on data, a key question is how such models can be also trained securely. The few prior works on secure DNN training have focused either on designing custom protocols for existing training algorithms, or on developing tailored training algorithms and then applying generic secure protocols. In this work, we investigate the advantages of designing training algorithms alongside a novel secure protocol, incorporating optimizations on both fronts. We present QUOTIENT, a new method for discretized training of DNNs, along with a customized secure two-party protocol for it. QUOTIENT incorporates key components of state-of-the-art DNN training such as layer normalization and adaptive gradient methods, and improves upon the state-of-the-art in DNN training in two-party computation. Compared to prior work, we obtain an improvement of 50X in WAN time and 6% in absolute accuracy.

Quantitative Verification of Neural Networks and Its Security Applications

Neural networks are increasingly employed in safety-critical domains. This has prompted interest in verifying or certifying logically encoded properties of neural networks. Prior work has largely focused on checking existential properties, wherein the goal is to check whether there exists any input that violates a given property of interest. However, neural network training is a stochastic process, and many questions arising in their analysis require probabilistic and quantitative reasoning, i.e., estimating how many inputs satisfy a given property. To this end, our paper proposes a novel and principled framework to quantitative verification of logical properties specified over neural networks. Our framework is the first to provide PAC-style soundness guarantees, in that its quantitative estimates are within a controllable and bounded error from the true count. We instantiate our algorithmic framework by building a prototype tool called NPAQ that enables checking rich properties over binarized neural networks. We show how emerging security analyses can utilize our framework in 3 applications: quantifying robustness to adversarial inputs, efficacy of trojan attacks, and fairness/bias of given neural networks.

ABS: Scanning Neural Networks for Back-doors by Artificial Brain Stimulation

This paper presents a technique to scan neural network based AI models to determine if they are trojaned. Pre-trained AI models may contain back-doors that are injected through training or by transforming inner neuron weights. These trojaned models operate normally when regular inputs are provided, and mis-classify to a specific output label when the input is stamped with some special pattern called trojan trigger. We develop a novel technique that analyzes inner neuron behaviors by determining how output activations change when we introduce different levels of stimulation to a neuron. The neurons that substantially elevate the activation of a particular output label regardless of the provided input is considered potentially compromised. Trojan trigger is then reverse-engineered through an optimization procedure using the stimulation analysis results, to confirm that a neuron is truly compromised. We evaluate our system ABS on 177 trojaned models that are trojaned with various attack methods that target both the input space and the feature space, and have various trojan trigger sizes and shapes, together with 144 benign models that are trained with different data and initial weight values. These models belong to 7 different model structures and 6 different datasets, including some complex ones such as ImageNet, VGG-Face and ResNet110. Our results show that ABS is highly effective, can achieve over 90% detection rate for most cases (and many 100%), when only one input sample is provided for each output label. It substantially out-performs the state-of-the-art technique Neural Cleanse that requires a lot of input samples and small trojan triggers to achieve good performance.

Lifelong Anomaly Detection Through Unlearning

Anomaly detection is essential towards ensuring system security and reliability. Powered by constantly generated system data, deep learning has been found both effective and flexible to use, with its ability to extract patterns without much domain knowledge. Existing anomaly detection research focuses on a scenario referred to as zero-positive, which means that the detection model is only trained for normal (i.e., negative) data. In a real application scenario, there may be additional manually inspected positive data provided after the system is deployed. We refer to this scenario as lifelong anomaly detection. However, we find that existing approaches are not easy to adopt such new knowledge to improve system performance. In this work, we are the first to explore the lifelong anomaly detection problem, and propose novel approaches to handle corresponding challenges. In particular, we propose a framework called unlearning, which can effectively correct the model when a false negative (or a false positive) is labeled. To this aim, we develop several novel techniques to tackle two challenges referred to as exploding loss and catastrophic forgetting. In addition, we abstract a theoretical framework based on generative models. Under this framework, our unlearning approach can be presented in a generic way to be applied to most zero-positive deep learning-based anomaly detection algorithms to turn them into corresponding lifelong anomaly detection solutions. We evaluate our approach using two state-of-the-art zero-positive deep learning anomaly detection architectures and three real-world tasks. The results show that the proposed approach is able to significantly reduce the number of false positives and false negatives through unlearning.

Session 6E: Passwords and Accounts

How to (not) Share a Password: Privacy Preserving Protocols for Finding Heavy Hitters with Adversarial Behavior

Bad choices of passwords were and are a pervasive problem. Users choosing weak passwords do not only compromise themselves, but the whole ecosystem. E.g, common and default passwords in IoT devices were exploited by hackers to create botnets and mount severe attacks on large Internet services, such as the Mirai botnet DDoS attack. We present a method to help protect the Internet from such large scale attacks. Our method enables a server to identify popular passwords (heavy hitters), and publish a list of over-popular passwords that must be avoided. This filter ensures that no single password can be used to compromise a large percentage of the users. The list is dynamic and can be changed as new users are added or when current users change their passwords. We apply maliciously secure two-party computation and differential privacy to protect the users’ password privacy. Our solution does not require extra hardware or cost, and is transparent to the user. Our private heavy hitters construction is secure even against a malicious coalition of devices which tries to manipulate the protocol to hide the popularity of some password that the attacker is exploiting. It also ensures differential privacy under continual observation of the blacklist as it changes over time. As a reality check we conducted three tests: computed the guarantees that the system provides wrt a few publicly available databases, ran full simulations on those databases, and implemented and analyzed a proof-of-concept on an IoT device. Our construction can also be used in other settings to privately learn heavy hitters in the presence of an active malicious adversary. E.g., learning the most popular sites accessed by the Tor network.

Protocols for Checking Compromised Credentials

To prevent credential stuffing attacks, industry best practice now proactively checks if user credentials are present in known data breaches. Recently, some web services, such as HaveIBeenPwned (HIBP) and Google Password Checkup (GPC), have started providing APIs to check for breached passwords. We refer to such services as compromised credential checking (C3) services. We give the first formal description of C3 services, detailing different settings and operational requirements, and we give relevant threat models. One key security requirement is the secrecy of a user’s passwords that are being checked. Current widely deployed C3 services have the user share a small prefix of a hash computed over the user’s password. We provide a framework for empirically analyzing the leakage of such protocols, showing that in some contexts knowing the hash prefixes leads to a 12x increase in the efficacy of remote guessing attacks. We propose two new protocols that provide stronger protection for users’ passwords, implement them, and show experimentally that they remain practical to deploy.

User Account Access Graphs

The primary authentication method for a user account is rarely the only way to access that account. Accounts can often be accessed through other accounts, using recovery methods, password managers, or single sign-on. This increases each account’s attack surface, giving rise to subtle security problems. These problems cannot be detected by considering each account in isolation, but require analyzing the links between a user’s accounts. Furthermore, to accurately assess the security of accounts, the physical world must also be considered. For example, an attacker with access to a physical mailbox could obtain credentials sent by post.

Despite the manifest importance of understanding these interrelationships and the security problems they entail, no prior methods exist to perform an analysis thereof in a precise way. To address this need, we introduce account access graphs, the first formalism that enables a comprehensive modeling and analysis of a user’s entire setup, incorporating all connections between the user’s accounts, devices, credentials, keys, and documents. Account access graphs support systematically identifying both security vulnerabilities and lockout risks in a user’s accounts. We give analysis algorithms and illustrate their effectiveness in a case study, where we automatically detect significant weaknesses in a user’s setup and suggest improvement options.

Detecting Fake Accounts in Online Social Networks at the Time of Registrations

Online social networks are plagued by fake information. In particu- lar, using massive fake accounts (also called Sybils), an attacker can disrupt the security and privacy of benign users by spreading spam, malware, and disinformation. Existing Sybil detection methods rely on rich content, behavior, and/or social graphs generated by Sybils. The key limitation of these methods is that they incur significant delays in catching Sybils, i.e., Sybils may have already performed many malicious activities when being detected. In this work, we propose Ianus, a Sybil detection method that leverages account registration information. Ianus aims to catch Sybils immediately after they are registered. First, using a real- world registration dataset with labeled Sybils from WeChat (the largest online social network in China), we perform a measurement study to characterize the registration patterns of Sybils and benign users. We find that Sybils tend to have synchronized and abnormal registration patterns. Second, based on our measurement results, we model Sybil detection as a graph inference problem, which allows us to integrate heterogeneous features. In particular, we extract synchronization and anomaly based features for each pair of accounts, use the features to build a graph in which Sybils are densely connected with each other while a benign user is isolated or sparsely connected with other benign users and Sybils, and finally detect Sybils via analyzing the structure of the graph. We evaluate Ianus using real-world registration datasets of WeChat. Moreover, WeChat has deployed Ianus on a daily basis, i.e., WeChat uses Ianus to analyze newly registered accounts on each day and detect Sybils. Via manual verification by the WeChat security team, we find that Ianus can detect around 400K per million new registered accounts each day and achieve a precision of over 96% on average.

Session 8A: Attack II

Gollum: Modular and Greybox Exploit Generation for Heap Overflows in Interpreters

We present the first approach to automatic exploit generation for heap overflows in interpreters. It is also the first approach to exploit generation in any class of program that integrates a solution for automatic heap layout manipulation. At the core of the approach is a novel method for discovering exploit primitives—inputs to the target program that result in a sensitive operation, such as a function call or a memory write, utilizing attacker-injected data. To produce an exploit primitive from a heap overflow vulnerability, one has to discover a target data structure to corrupt, ensure an instance of that data structure is adjacent to the source of the overflow on the heap, and ensure that the post-overflow corrupted data is used in a manner desired by the attacker. Our system addresses all three tasks in an automatic, greybox, and modular manner. Our implementation is called GOLLUM, and we demonstrate its capabilities by producing exploits from 10 unique vulnerabilities in the PHP and Python interpreters, 5 of which do not have existing public exploits.

SLAKE: Facilitating Slab Manipulation for Exploiting Vulnerabilities in the Linux Kernel Share on

To determine the exploitability for a kernel vulnerability, a secu- rity analyst usually has to manipulate slab and thus demonstrate the capability of obtaining the control over a program counter or performing privilege escalation. However, this is a lengthy process because (1) an analyst typically has no clue about what objects and system calls are useful for kernel exploitation and (2) he lacks the knowledge of manipulating a slab and obtaining the desired layout. In the past, researchers have proposed various techniques to facilitate exploit development. Unfortunately, none of them can be easily applied to address these challenges. On the one hand, this is because of the complexity of the Linux kernel. On the other hand, this is due to the dynamics and non-deterministic of slab variations. In this work, we tackle the challenges above from two perspectives. First, we use static and dynamic analysis techniques to explore the kernel objects, and the corresponding system calls useful for exploitation. Second, we model commonly-adopted exploitation methods and develop a technical approach to facilitate the slab layout adjustment. By extending LLVM as well as Syzkaller, we implement our techniques and name their combination after SLAKE. We evaluate SLAKE by using 27 real-world kernel vulnerabilities, demonstrating that it could not only diversify the ways to perform kernel exploitation but also sometimes escalate the exploitability of kernel vulnerabilities.

Session 8E: Web Security

HideNoSeek: Camouflaging Malicious JavaScript in Benign ASTs

In the malware field, learning-based systems have become popular to detect new malicious variants. Nevertheless, attackers with specific and internal knowledge of a target system may be able to produce input samples which are misclassified. In practice, the assumption of strong attackers is not realistic as it implies access to insider information. We instead propose HideNoSeek, a novel and generic camouflage attack, which evades the entire class of detectors based on syntactic features, without needing any information about the system it is trying to evade. Our attack consists of changing the constructs of malicious JavaScript samples to reproduce a benign syntax. For this purpose, we automatically rewrite the Abstract Syntax Trees (ASTs) of malicious JavaScript inputs into existing benign ones. In particular, HideNoSeek uses malicious seeds and searches for isomorphic subgraphs between the seeds and traditional benign scripts. Specifically, it replaces benign sub-ASTs by their malicious equivalents (same syntactic structure) and adjusts the benign data dependencies–without changing the AST–so that the malicious semantics is kept. In practice, we leveraged 23 malicious seeds to generate 91,020 malicious scripts, which perfectly reproduce ASTs of Alexa top 10,000 web pages. Also, we can produce on average 14 different malicious samples with the same AST as each Alexa top 10. Overall, a standard trained classifier has 99.98% false negatives with HideNoSeek inputs, while a classifier trained on such samples has over 88.74% false positives, rendering the targeted static detectors unreliable.

Your Cache Has Fallen: Cache-Poisoned Denial-of-Service Attack ✔👍

Web caching enables the reuse of HTTP responses with the aim to reduce the number of requests that reach the origin server, the volume of network traffic resulting from resource requests, and the user-perceived latency of resource access. For these reasons, a cache is a key component in modern distributed systems as it enables applications to scale at large. In addition to optimizing performance metrics, caches promote additional protection against Denial of Service (DoS) attacks. In this paper we introduce and analyze a new class of web cache poisoning attacks. By provoking an error on the origin server that is not detected by the intermediate caching system, the cache gets poisoned with the server-generated error page and instrumented to serve this useless content instead of the intended one, rendering the victim service unavailable. In an extensive study of fifteen web caching solutions we analyzed the negative impact of the CachePoisoned DoS (CPDoS) attack-as we coined it. We show the practical relevance by identifying one proxy cache product and five CDN services that are vulnerable to CPDoS. Amongst them are prominent solutions that in turn cache high-value websites. The consequences are severe as one simple request is sufficient to paralyze a victim website within a large geographical region. The awareness of the newly introduced CPDoS attack is highly valuable for researchers for obtaining a comprehensive understanding of causes and countermeasures as well as practitioners for implementing robust and secure distributed systems.

Session 9A: User Study

Passwords are an often unavoidable authentication mechanism, despite the availability of additional alternative means. In the case of smartphones, usability problems are aggravated because interaction happens through small screens and multilayer keyboards. While password managers (PMs) can improve this situation and contribute to hardening security, their adoption is far from widespread. To understand the underlying reasons, we conducted the first empirical usability study of mobile PMs, covering both quantitative and qualitative evaluations. Our findings show that popular PMs are barely acceptable according to the standard System Usability Scale, and that there are three key areas for improvement: integration with external applications, security, and user guidance and interaction. We build on the collected evidence to suggest recommendations that can fill this gap.

Matched and Mismatched SOCs: A Qualitative Study on Security Operations Center Issues

Organizations, such as companies and governments, created Security Operations Centers (SOCs) to defend against computer security attacks. SOCs are central defense groups that focus on security incident management with capabilities such as monitoring, preventing, responding, and reporting. They are one of the most critical defense components of a modern organization’s defense. Despite their critical importance to organizations, and the high frequency of reported security incidents, only a few research studies focus on problems specific to SOCs. In this study, to understand and identify the issues of SOCs, we conducted 18 semi-structured interviews with SOC analysts and managers who work for organizations from different industry sectors. Through our analysis of the interview data, we identified technical and non-technical issues that exist in SOC. Moreover, we found inherent disagreements between SOC managers and their analysts that, if not addressed, could entail a risk to SOC efficiency and effectiveness. We distill these issues into takeaways that apply both to future academic research and to SOC management. We believe that research should focus on improving the efficiency and effectiveness of SOCs.

A Usability Evaluation of Let’s Encrypt and Certbot: Usable Security Done Right

The correct configuration of HTTPS is a complex set of tasks, which many administrators have struggled with in the past. Let’s Encrypt and Electronic Frontier Foundation’s Certbot aim to improve the TLS ecosystem by offering free trusted certificates (Let’s Encrypt) and by providing user-friendly support to configure and harden TLS (Certbot). Although adoption rates have increased, to date, there has been only a little scientific evidence of the actual usability and security benefits of this semi-automated approach. Therefore, we conducted a randomized control trial to evaluate the usability of Let’s Encrypt and Certbot in comparison to the traditional certificate authority approach. We performed a within-subjects lab study with 31 participants. The study sheds light on the security and usability enhancements that Let’s Encrypt and Certbot provide. We highlight how usability improvements aimed at administrators can have a large impact on security and discuss takeaways for Certbot and other security-related tasks that experts struggle with.

Session 9B: ML Security III

Seeing isn’t Believing: Towards More Robust Adversarial Attack Against Real World Object Detectors

Recently Adversarial Examples (AEs) that deceive deep learning models have been a topic of intense research interest. Compared with the AEs in the digital space, the physical adversarial attack is considered as a more severe threat to the applications like face recognition in authentication, objection detection in autonomous driving cars, etc. In particular, deceiving the object detectors practically, is more challenging since the relative position between the object and the detector may keep changing. Existing works attacking object detectors are still very limited in various scenarios, e.g., varying distance and angles, etc. In this paper, we presented systematic solutions to build robust and practical AEs against real world object detectors. Particularly, for Hiding Attack (HA), we proposed thefeature-interference reinforcement (FIR) method and theenhanced realistic constraints generation (ERG) to enhance robustness, and for Appearing Attack (AA), we proposed thenested-AE, which combines two AEs together to attack object detectors in both long and short distance. We also designed diverse styles of AEs to make AA more surreptitious. Evaluation results show that our AEs can attack the state-of-the-art real-time object detectors (i.e., YOLO V3 and faster-RCNN) at the success rate up to 92.4% with varying distance from 1m to 25m and angles from -60º to 60º. Our AEs are also demonstrated to be highly transferable, capable of attacking another three state-of-the-art black-box models with high success rate.

AdVersarial: Perceptual Ad Blocking meets Adversarial Machine Learning

Perceptual ad-blocking is a novel approach that detects online advertisements based on their visual content. Compared to traditional filter lists, the use of perceptual signals is believed to be less prone to an arms race with web publishers and ad networks. We demonstrate that this may not be the case. We describe attacks on multiple perceptual ad-blocking techniques, and unveil a new arms race that likely disfavors ad-blockers. Unexpectedly, perceptual ad-blocking can also introduce new vulnerabilities that let an attacker bypass web security boundaries and mount DDoS attacks. We first analyze the design space of perceptual ad-blockers and present a unified architecture that incorporates prior academic and commercial work. We then explore a variety of attacks on the ad-blocker’s detection pipeline, that enable publishers or ad networks to evade or detect ad-blocking, and at times even abuse its high privilege level to bypass web security boundaries. On one hand, we show that perceptual ad-blocking must visually classify rendered web content to escape an arms race centered on obfuscation of page markup. On the other, we present a concrete set of attacks on visual ad-blockers by constructing adversarial examples in a real web page context. For seven ad-detectors, we create perturbed ads, ad-disclosure logos, and native web content that misleads perceptual ad-blocking with 100% success rates. In one of our attacks, we demonstrate how a malicious user can upload adversarial content, such as a perturbed image in a Facebook post, that fools the ad-blocker into removing another users’ non-ad content. Moving beyond the Web and visual domain, we also build adversarial examples for AdblockRadio, an open source radio client that uses machine learning to detects ads in raw audio streams.

Attacking Graph-based Classification via Manipulating the Graph Structure

Graph-based classification methods are widely used for security analytics. Roughly speaking, graph-based classification methods include collective classification and graph neural network. Attacking a graph-based classification method enables an attacker to evade detection in security analytics. However, existing adversarial machine learning studies mainly focused on machine learning for non-graph data. Only a few recent studies touched adversarial graph-based classification methods. However, they focused on graph neural network, leaving collective classification largely unexplored. We aim to bridge this gap in this work. We consider an attacker’s goal is to evade detection via manipulating the graph structure. We formulate our attack as a graph-based optimization problem, solving which produces the edges that an attacker needs to manipulate to achieve its attack goal. However, it is computationally challenging to solve the optimization problem exactly. To address the challenge, we propose several approximation techniques to solve the optimization problem. We evaluate our attacks and compare them with a recent attack designed for graph neural networks using four graph datasets. Our results show that our attacks can effectively evade graph-based classification methods. Moreover, our attacks outperform the existing attack for evading collective classification methods and some graph neural network methods.

Latent Backdoor Attacks on Deep Neural Networks

Recent work proposed the concept of backdoor attacks on deep neural networks (DNNs), where misclassification rules are hidden inside normal models, only to be triggered by very specific inputs. However, these “traditional” backdoors assume a context where users train their own models from scratch, which rarely occurs in practice. Instead, users typically customize “Teacher” models already pretrained by providers like Google, through a process called transfer learning. This customization process introduces significant changes to models and disrupts hidden backdoors, greatly reducing the actual impact of backdoors in practice. In this paper, we describe latent backdoors, a more powerful and stealthy variant of backdoor attacks that functions under transfer learning. Latent backdoors are incomplete backdoors embedded into a “Teacher” model, and automatically inherited by multiple “Student” models through transfer learning. If any Student models include the label targeted by the backdoor, then its customization process completes the backdoor and makes it active. We show that latent backdoors can be quite effective in a variety of application contexts, and validate its practicality through real-world attacks against traffic sign recognition, iris identification of volunteers, and facial recognition of public figures (politicians). Finally, we evaluate 4 potential defenses, and find that only one is effective in disrupting latent backdoors, but might incur a cost in classification accuracy as tradeoff.

Session 9E: Web Censorship and Auditing

Geneva: Evolving Censorship Evasion Strategies

Researchers and censoring regimes have long engaged in a cat-and-mouse game, leading to increasingly sophisticated Internet-scale censorship techniques and methods to evade them. In this paper, we take a drastic departure from the previously manual evade-detect cycle by developing techniques to automate the discovery of censorship evasion strategies. We present Geneva, a novel genetic algorithm that evolves packet-manipulation-based censorship evasion strategies against nation-state level censors. Geneva composes, mutates, and evolves sophisticated strategies out of four basic packet manipulation primitives (drop, tamper headers, duplicate, and fragment). With experiments performed both in-lab and against several real censors (in China, India, and Kazakhstan), we demonstrate that Geneva is able to quickly and independently re-derive most strategies from prior work, and derive novel subspecies and altogether new species of packet manipulation strategies. Moreover, Geneva discovers successful strategies that prior work posited were not effective, and evolves extinct strategies into newly working variants. We analyze the novel strategies Geneva creates to infer previously unknown behavior in censors. Geneva is a first step towards automating censorship evasion; to this end, we have made our code and data publicly available.

Conjure: Summoning Proxies from Unused Address Space

Refraction Networking (formerly known as “Decoy Routing”) has emerged as a promising next-generation approach for circumventing Internet censorship. Rather than trying to hide individual circumvention proxy servers from censors, proxy functionality is implemented in the core of the network, at cooperating ISPs in friendly countries. Any connection that traverses these ISPs could be a conduit for the free flow of information, so censors cannot easily block access without also blocking many legitimate sites. While one Refraction scheme, TapDance, has recently been deployed at ISP-scale, it suffers from several problems: a limited number of “decoy” sites in realistic deployments, high technical complexity, and undesirable tradeoffs between performance and observability by the censor. These challenges may impede broader deployment and ultimately allow censors to block such techniques. We present Conjure, an improved Refraction Networking approach that overcomes these limitations by leveraging unused address space at deploying ISPs. Instead of using real websites as the decoy destinations for proxy connections, our scheme connects to IP addresses where no web server exists leveraging proxy functionality from the core of the network. These phantom hosts are difficult for a censor to distinguish from real ones, but can be used by clients as proxies. We define the Conjure protocol, analyze its security, and evaluate a prototype using an ISP testbed. Our results suggest that Conjure can be harder to block than TapDance, is simpler to maintain and deploy, and offers substantially better network performance.

You Shall Not Join: A Measurement Study of Cryptocurrency Peer-to-Peer Bootstrapping Techniques

Cryptocurrencies are digital assets which depend upon the use of distributed peer-to-peer networks. The method a new peer uses to initially join a peer-to-peer network is known as bootstrapping. The ability to bootstrap without the use of a centralized resource is an unresolved challenge. In this paper we survey the bootstrapping techniques used by 74 cryptocurrencies and find that censorship-prone methods such as DNS seeding and IP hard-coding are the most prevalent. In response to this finding, we test two other bootstrapping techniques less susceptible to censorship, Tor and ZMap, to determine if they are operationally feasible alternatives more resilient to censorship. We perform a global measurement study of DNS query responses for each the 92 DNS seeds discovered across 42 countries using the distributed RIPE Atlas network. This provides details of each cryptocurrencies’ peer-to-peer network topology and also highlights instances of DNS outages and query manipulation impacting the bootstrapping process. Our study also reveals that the source code of the cryptocurrencies researched comes from only five main repositories; hence accounting for the inheritance of legacy bootstrapping methods. Finally, we discuss the implications of our findings and provide recommendations to mitigate the risks exposed.

SAMPL: Scalable Auditability of Monitoring Processes using Public Ledgers

Organized surveillance, especially by governments poses a major challenge to individual privacy, due to the resources governments have at their disposal, and the possibility of overreach. Given the impact of invasive monitoring, in most democratic countries, government surveillance is, in theory, monitored and subject to public oversight to guard against violations. In practice, there is a difficult fine balance between safeguarding individual’s privacy rights and not diluting the efficacy of national security investigations, as exemplified by reports on government surveillance programs that have caused public controversy, and have been challenged by civil and privacy rights organizations. Surveillance is generally conducted through a mechanism where federal agencies obtain a warrant from a federal or state judge (e.g., the US FISA court, Supreme Court in Canada) to subpoena a company or service-provider (e.g., Google, Microsoft) for their customers’ data. The courts provide annual statistics on the requests (accepted, rejected), while the companies provide annual transparency reports for public auditing. However, in practice, the statistical information provided by the courts and companies is at a very high level, generic, is released after-the-fact, and is inadequate for auditing the operations. Often this is attributed to the lack of scalable mechanisms for reporting and transparent auditing. In this paper, we present SAMPL, a novel auditing framework which leverages cryptographic mechanisms, such as zero knowledge proofs, Pedersen commitments, Merkle trees, and public ledgers to create a scalable mechanism for auditing electronic surveillance processes involving multiple actors. SAMPL is the first framework that can identify the actors (e.g., agencies and companies) that violate the purview of the court orders. We experimentally demonstrate the scalability for SAMPL for handling concurrent monitoring processes without undermining their secrecy and auditability.


2018 program(有视频) dblp

Machine Learning

AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation

We present AI2, the first sound and scalable analyzer for deep neural networks. Based on overapproximation, AI2 can automatically prove safety properties (e.g., robustness) of realistic neural networks (e.g., convolutional neural networks).

The key insight behind AI2 is to phrase reasoning about safety and robustness of neural networks in terms of classic abstract interpretation, enabling us to leverage decades of advances in that area. Concretely, we introduce abstract transformers that capture the behavior of fully connected and convolutional neural network layers with rectified linear unit activations (ReLU), as well as max pooling layers. This allows us to handle real-world neural networks, which are often built out of those types of layers.

We present a complete implementation of AI2 together with an extensive evaluation on 20 neural networks. Our results demonstrate that: (i) AI2 is precise enough to prove useful specifications (e.g., robustness), (ii) AI2 can be used to certify the effectiveness of state-of-the-art defenses for neural networks, (iii) AI2 is significantly faster than existing analyzers based on symbolic analysis, which often take hours to verify simple fully connected networks, and (iv) AI2 can handle deep convolutional networks, which are beyond the reach of existing methods.

Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning

As machine learning becomes widely used for automated decisions, attackers have strong incentives to manipulate the results and models generated by machine learning algorithms. In this paper, we perform the first systematic study of poisoning attacks and their countermeasures for linear regression models. In poisoning attacks, attackers deliberately influence the training data to manipulate the results of a predictive model. We propose a theoretically-grounded optimization framework specifically designed for linear regression and demonstrate its effectiveness on a range of datasets and models. We also introduce a fast statistical attack that requires limited knowledge of the training process. Finally, we design a new principled defense method that is highly resilient against all poisoning attacks. We provide formal guarantees about its convergence and an upper bound on the effect of poisoning attacks when the defense is deployed. We evaluate extensively our attacks and defenses on three realistic datasets from health care, loan assessment, and real estate domains.

Stealing Hyperparameters in Machine Learning

Hyperparameters are critical in machine learning, as different hyperparameters often result in models with significantly different performance. Hyperparameters may be deemed confidential because of their commercial value and the confidentiality of the proprietary algorithms that the learner uses to learn them. In this work, we propose attacks on stealing the hyperparameters that are learned by a learner. We call our attacks hyperparameter stealing attacks. Our attacks are applicable to a variety of popular machine learning algorithms such as ridge regression, logistic regression, support vector machine, and neural network. We evaluate the effectiveness of our attacks both theoretically and empirically. For instance, we evaluate our attacks on Amazon Machine Learning. Our results demonstrate that our attacks can accurately steal hyperparameters. We also study countermeasures. Our results highlight the need for new defenses against our hyperparameter stealing attacks for certain machine learning algorithms.

A Machine Learning Approach To Prevent Malicious Calls Over Telephony Networks

Malicious calls, i.e., telephony spams and scams, have been a long-standing challenging issue that causes billions of dollars of annual financial loss worldwide. This work presents the first machine learning-based solution without relying on any particular assumptions on the underlying telephony network infrastructures.

The main challenge of this decade-long problem is that it is unclear how to construct effective features without the access to the telephony networks’ infrastructures. We solve this problem by combining several innovations. We first develop a TouchPal user interface on top of a mobile App to allow users tagging malicious calls. This allows us to maintain a large-scale call log database. We then conduct a measurement study over three months of call logs, including 9 billion records. We design 29 features based on the results, so that machine learning algorithms can be used to predict malicious calls. We extensively evaluate different state-of-the-art machine learning approaches using the proposed features, and the results show that the best approach can reduce up to 90% unblocked malicious calls while maintaining a precision over 99.99% on the benign call traffic. The results also show the models are efficient to implement without incurring a significant latency overhead. We also conduct ablation analysis, which reveals that using 10 out of the 29 features can reach a performance comparable to using all features.

Surveylance: Automatically Detecting Online Survey Scams

Online surveys are a popular mechanism for performing market research in exchange for monetary compensation. Unfortunately, fraudulent survey websites are similarly rising in popularity among cyber-criminals as a means for executing social engineering attacks. In addition to the sizable population of users that participate in online surveys as a secondary revenue stream, unsuspecting users who search the web for free content or access codes to commercial software can also be exposed to survey scams. This occurs through redirection to websites that ask the user to complete a survey in order to receive the promised content or a reward.

In this paper, we present SURVEYLANCE , the first system that automatically identifies survey scams using machine learning techniques. Our evaluation demonstrates that SURVEYLANCE works well in practice by identifying 8,623 unique websites involved in online survey attacks. We show that SURVEYLANCE is suitable for assisting human analysts in survey scam detection at scale. Our work also provides the first systematic analysis of the survey scam ecosystem by investigating the capabilities of these services, mapping all the parties involved in the ecosystem, and quantifying the consequences to users that are exposed to these services. Our analysis reveals that a large number of survey scams are easily reachable through the Alexa top 30K websites, and expose users to a wide range of security issues including identity fraud, deceptive advertisements, potentially unwanted programs (PUPs), malicious extensions, and malware .

Understanding Users

Hackers vs. Testers: A Comparison of Software Vulnerability Discovery Processes

Identifying security vulnerabilities in software is a critical task that requires significant human effort. Currently, vulnerability discovery is often the responsibility of software testers before release and white-hat hackers (often within bug bounty programs) afterward. This arrangement can be ad-hoc and far from ideal; for example, if testers could identify more vulnerabilities, software would be more secure at release time. Thus far, however, the processes used by each group - and how they compare to and interact with each other - have not been well studied. This paper takes a first step toward better understanding, and eventually improving, this ecosystem: we report on a semi-structured interview study (n=25) with both testers and hackers, focusing on how each group finds vulnerabilities, how they develop their skills, and the challenges they face. The results suggest that hackers and testers follow similar processes, but get different results due largely to differing experiences and therefore different underlying knowledge of security concepts. Based on these results, we provide recommendations to support improved security training for testers, better communication between hackers and developers, and smarter bug bounty policies to motivate hacker participation.

Towards Security and Privacy for Multi-User Augmented Reality: Foundations with End Users

Immersive augmented reality (AR) technologies are becoming a reality. Prior works have identified security and privacy risks raised by these technologies, primarily considering individual users or AR devices. However, we make two key observations: (1) users will not always use AR in isolation, but also in ecosystems of other users, and (2) since immersive AR devices have only recently become available, the risks of AR have been largely hypothetical to date.
To provide a foundation for understanding and addressing the security and privacy challenges of emerging AR technologies, grounded in the experiences of real users, we conduct a qualitative lab study with an immersive AR headset, the Microsoft HoloLens. We conduct our study in pairs - 22 participants across 11 pairs - wherein participants engage in paired and individual (but physically co-located) HoloLens activities. Through semi-structured interviews, we explore participants’ security, privacy, and other concerns, raising key findings. For example, we find that despite the HoloLens’s limitations, participants were easily immersed, treating virtual objects as real (e.g., stepping around them for fear of tripping). We also uncover numerous security, privacy, and safety concerns unique to AR (e.g., deceptive virtual objects misleading users about the real world), and a need for access control among users to manage shared physical spaces and virtual content embedded in those spaces. Our findings give us the opportunity to identify broader lessons and key challenges to inform the design of emerging single- and multi-user AR technologies.

Computer Security and Privacy for Refugees in the United States

In this work, we consider the computer security and privacy practices and needs of recently resettled refugees in the United States. We ask: How do refugees use and rely on technology as they settle in the US? What computer security and privacy practices do they have, and what barriers do they face that may put them at risk? And how are their computer security mental models and practices shaped by the advice they receive? We study these questions through in-depth qualitative interviews with case managers and teachers who work with refugees at a local NGO, as well as through focus groups with refugees themselves. We find that refugees must rely heavily on technology (e.g., email) as they attempt to establish their lives and find jobs; that they also rely heavily on their case managers and teachers for help with those technologies; and that these pressures can push security practices into the background or make common security “best practices’’ infeasible. At the same time, we identify fundamental challenges to computer security and privacy for refugees, including barriers due to limited technical expertise, language skills, and cultural knowledge–for example, we find that scams as a threat are a new concept for many of the refugees we studied, and that many common security practices (e.g., password creation techniques and security questions) rely on US cultural knowledge. From these and other findings, we distill recommendations for the computer security community to better serve the computer security and privacy needs and constraints of refugees, a potentially vulnerable population that has not been previously studied in this context.

On Enforcing the Digital Immunity of a Large Humanitarian Organization

Humanitarian action, the process of aiding individuals in situations of crises, poses unique information-security challenges due to natural or manmade disasters, the adverse environments in which it takes place, and the scale and multi-disciplinary nature of the problems. Despite these challenges, humanitarian organizations are transitioning towards a strong reliance on the digitization of collected data and digital tools, which improves their effectiveness but also exposes them to computer security threats. In this paper, we conduct a qualitative analysis of the computer-security challenges of the International Committee of the Red Cross (ICRC), a large humanitarian organization with over sixteen thousand employees, an international legal personality, which involves privileges and immunities, and over 150 years of experience with armed conflicts and other situations of violence worldwide. To investigate the computer security needs and practices of the ICRC from an operational, technical, legal, and managerial standpoint by considering individual, organizational, and governmental levels, we interviewed 27 field workers, IT staff, lawyers, and managers. Our results provide a first look at the unique security and privacy challenges that humanitarian organizations face when collecting, processing, transferring, and sharing data to enable humanitarian action for a multitude of sensitive activities. These results highlight, among other challenges, the trade offs between operational security and requirements stemming from all stakeholders, the legal barriers for data sharing among jurisdictions; especially, the need to complement privileges and immunities with robust technological safeguards in order to avoid any leakages that might hinder access and potentially compromise the neutrality, impartiality, and independence of humanitarian action.

The Spyware Used in Intimate Partner Violence

Survivors of intimate partner violence increasingly report that abusers install spyware on devices to track their location, monitor communications, and cause emotional and physical harm. To date there has been only cursory investigation into the spyware used in such intimate partner surveillance (IPS). We provide the first in-depth study of the IPS spyware ecosystem. We design, implement, and evaluate a measurement pipeline that combines web and app store crawling with machine learning to find and label apps that are potentially dangerous in IPS contexts. Ultimately we identify several hundred such IPS-relevant apps. While we find dozens of overt spyware tools, the majority are “dual-use” apps - they have a legitimate purpose (e.g., child safety or anti-theft), but are easily and effectively repurposed for spying on a partner. We document that a wealth of online resources are available to educate abusers about exploiting apps for IPS. We also show how some dual-use app developers are encouraging their use in IPS via advertisements, blogs, and customer support services. We analyze existing anti-virus and anti-spyware tools, which universally fail to identify dual-use apps as a threat.

Networked Systems

Distance-Bounding Protocols: Verification without Time and Location

Distance-bounding protocols are cryptographic protocols that securely establish an upper bound on the physical distance between the participants. Existing symbolic verification frameworks for distance-bounding protocols consider timestamps and the location of agents. In this work we introduce a causality-based characterization of secure distance-bounding that discards the notions of time and location. This allows us to verify the correctness of distance-bounding protocols with standard protocol verification tools. That is to say, we provide the first fully automated verification framework for distance-bounding protocols. By using our framework, we confirmed known vulnerabilities in a number of protocols and discovered unreported attacks against two recently published protocols.

Sonar: Detecting SS7 Redirection Attacks With Audio-Based Distance Bounding

The global telephone network is relied upon by billions every day. Central to its operation is the Signaling System 7 (SS7) protocol, which is used for setting up calls, managing mobility, and facilitating many other network services. This protocol was originally built on the assumption that only a small number of trusted parties would be able to directly communicate with its core infrastructure. As a result, SS7 — as a feature — allows all parties with core access to redirect and intercept calls for any subscriber anywhere in the world. Unfortunately, increased interconnectivity with the SS7 network has led to a growing number of illicit call redirection attacks. We address such attacks with Sonar, a system that detects the presence of SS7 redirection attacks by securely measuring call audio round-trip times between telephony devices. This approach works because redirection attacks force calls to travel longer physical distances than usual, thereby creating longer end-to-end delay. We design and implement a distance bounding-inspired protocol that allows us to securely characterize the round-trip time between the two endpoints. We then use custom hardware deployed in 10 locations across the United States and a redirection testbed to characterize how distance affects round trip time in phone networks. We develop a model using this testbed and show Sonar is able to detect 70.9% of redirected calls between call endpoints of varying attacker proximity (300–7100 miles) with low false positive rates (0.3%). Finally, we ethically perform actual SS7 redirection attacks on our own devices with the help of an industry partner to demonstrate that Sonar detects 100% of such redirections in a real network (with no false positives). As such, we demonstrate that telephone users can reliably detect SS7 redirection attacks and protect the integrity of their calls.

OmniLedger: A Secure, Scale-Out, Decentralized Ledger via Sharding

Designing a secure permissionless distributed ledger (blockchain) that performs on par with centralized payment processors, such as Visa, is a challenging task. Most existing distributed ledgers are unable to scale-out, i.e., to grow their total processing capacity with the number of validators; and those that do, compromise security or decentralization. We present OmniLedger, a novel scale-out distributed ledger that preserves long- term security under permissionless operation. It ensures security and correctness by using a bias-resistant public-randomness protocol for choosing large, statistically representative shards that process transactions, and by introducing an efficient cross- shard commit protocol that atomically handles transactions affecting multiple shards. OmniLedger also optimizes performance via parallel intra-shard transaction processing, ledger pruning via collectively-signed state blocks, and low-latency “trust-but- verify” validation for low-value transactions. An evaluation of our experimental prototype shows that OmniLedger’s throughput scales linearly in the number of active validators, supporting Visa-level workloads and beyond, while confirming typical transactions in under two seconds.

Routing Around Congestion: Defeating DDoS Attacks and Adverse Network Conditions via Reactive BGP Routing

In this paper, we present Nyx, the first system to both effectively mitigate modern Distributed Denial of Service (DDoS) attacks regardless of the amount of traffic under adversarial control and function without outside cooperation or an Internet redesign. Nyx approaches the problem of DDoS mitigation as a routing problem rather than a filtering problem. This conceptual shift allows Nyx to avoid many of the common shortcomings of existing academic and commercial DDoS mitigation systems. By leveraging how Autonomous Systems (ASes) handle route advertisement in the existing Border Gateway Protocol (BGP), Nyx allows the deploying AS to achieve isolation of traffic from a critical upstream AS off of attacked links and onto alternative, uncongested, paths. This isolation removes the need for filtering or de-prioritizing attack traffic. Nyx controls outbound paths through normal BGP path selection, while return paths from critical ASes are controlled through the use of specific techniques we developed using existing traffic engineering principles and require no outside coordination. Using our own realistic Internet-scale simulator, we find that in more than 98% of cases our system can successfully route critical traffic around network segments under transit-link DDoS attacks; a new form of DDoS attack where the attack traffic never reaches the victim AS, thus invaliding defensive filtering, throttling, or prioritization strategies. More significantly, in over 95% of those cases, the alternate path provides complete congestion relief from transit-link DDoS. Nyx additionally provides complete congestion relief in over 75% of cases when the deployer is being directly attacked.

Tracking Ransomware End-to-end

Ransomware is a type of malware that encrypts the files of infected hosts and demands payment, often in a crypto-currency like Bitcoin. In this paper, we create a measurement framework that we use to perform a large-scale, two-year, end-to-end measurement of ransomware payments, victims, and operators. By combining an array of data sources, including ransomware binaries, seed ransom payments, victim telemetry from infections, and a large database of bitcoin addresses annotated with their owners, we sketch the outlines of this burgeoning ecosystem and associated third-party infrastructure. In particular, we are able to trace the financial transactions, from the acquisition of bitcoins by victims, through the payment of ransoms, to the cash out of bitcoins by the ransomware operators. We find that many ransomware operators cashed out using BTC-e, a now-defunct Bitcoin exchange. In total we are able to track over $16 million USD in likely ransom payments made by 19,750 potential victims during a two-year period. While our study focuses on ransomware, our methods are potentially applicable to other cybercriminal operations that have similarly adopted Bitcoin as their payment channel.

Program Analysis

The Rise of the Citizen Developer: Assessing the Security Impact of Online App Generators

Mobile apps are increasingly created using online application generators (OAGs) that automate app development, distribution, and maintenance. These tools significantly lower the level of technical skill that is required for app development, which makes them particularly appealing to citizen developers, i.e., developers with little or no software engineering background. However, as the pervasiveness of these tools increases, so does their overall influence on the mobile ecosystem’s security, as security lapses by such generators affect thousands of generated apps. The security of such generated apps, as well as their impact on the security of the overall app ecosystem, has not yet been investigated.

We present the first comprehensive classification of commonly used OAGs for Android and show how to fingerprint uniquely generated apps to link them back to their generator. We thereby quantify the market penetration of these OAGs based on a corpus of 2,291,898 free Android apps from Google Play and discover that at least 11.1% of these apps were created using OAGs. Using a combination of dynamic, static, and manual analysis, we find that the services’ app generation model is based on boilerplate code that is prone to reconfiguration attacks in 7/13 analyzed OAGs. Moreover, we show that this boilerplate code includes well-known security issues such as code injection vulnerabilities and insecure WebViews. Given the tight coupling of generated apps with their services’ backends, we further identify security issues in their infrastructure. Due to the blackbox development approach, citizen developers are unaware of these hidden problems that ultimately put the end-users sensitive data and privacy at risk and violate the user’s trust assumption. A particular worrisome result of our study is that OAGs indeed have a significant amplification factor for those vulnerabilities, notably harming the health of the overall mobile app ecosystem.

Learning from Mutants: Using Code Mutation to Learn and Monitor Invariants of a Cyber-Physical System

Cyber-physical systems (CPS) consist of sensors, actuators, and controllers all communicating over a network; if any subset becomes compromised, an attacker could cause significant damage. With access to data logs and a model of the CPS, the physical effects of an attack could potentially be detected before any damage is done. Manually building a model that is accurate enough in practice, however, is extremely difficult. In this paper, we propose a novel approach for constructing models of CPS automatically, by applying supervised machine learning to data traces obtained after systematically seeding their software components with faults (“mutants”). We demonstrate the efficacy of this approach on the simulator of a real-world water purification plant, presenting a framework that automatically generates mutants, collects data traces, and learns an SVM-based model. Using cross-validation and statistical model checking, we show that the learnt model characterises an invariant physical property of the system. Furthermore, we demonstrate the usefulness of the invariant by subjecting the system to 55 network and code-modification attacks, and showing that it can detect 85% of them from the data logs generated at runtime.

Precise and Scalable Detection of Double-Fetch Bugs in OS Kernels

During system call execution, it is common for operating system kernels to read userspace memory multiple times (multi-reads). A critical bug may exist if the fetched userspace memory is subject to change across these reads, i.e., a race condition, which is known as a double-fetch bug. Prior works have attempted to detect these bugs both statically and dynamically. However, due to their improper assumptions and imprecise definitions regarding double-fetch bugs, their multi-read detection is inherently limited and suffers from significant false positives and false negatives. For example, their approach is unable to support device emulation, inter-procedural analysis, loop handling, etc. More importantly, they completely leave the task of finding real double-fetch bugs from the haystack of multi-reads to manual verification, which is expensive if possible at all.

In this paper, we first present a formal and precise definition of double-fetch bugs and then implement a static analysis system -Deadline - to automatically detect double-fetch bugs in OS kernels. Deadline uses static program analysis techniques to systematically find multi-reads throughout the kernel and employs specialized symbolic checking to vet each multi-read for double-fetch bugs. We apply Deadline to Linux and FreeBSD kernels and find 23 new bugs in Linux and one new bug in FreeBSD. We further propose four generic strategies to patch and prevent double-fetch bugs based on our study and the discussion with kernel maintainers.

CollAFL: Path Sensitive Fuzzing

Coverage-guided fuzzing is a widely used and ef- fective solution to find software vulnerabilities. Tracking code coverage and utilizing it to guide fuzzing are crucial to coverage- guided fuzzers. However, tracking full and accurate path coverage is infeasible in practice due to the high instrumentation overhead. Popular fuzzers (e.g., AFL) often use coarse coverage information, e.g., edge hit counts stored in a compact bitmap, to achieve highly efficient greybox testing. Such inaccuracy and incompleteness in coverage introduce serious limitations to fuzzers. First, it causes path collisions, which prevent fuzzers from discovering potential paths that lead to new crashes. More importantly, it prevents fuzzers from making wise decisions on fuzzing strategies.
In this paper, we propose a coverage sensitive fuzzing solution CollAFL. It mitigates path collisions by providing more accurate coverage information, while still preserving low instrumentation overhead. It also utilizes the coverage information to apply three new fuzzing strategies, promoting the speed of discovering new paths and vulnerabilities. We emented a prototype of CollAFL based on the popular fuzzer AFL and evaluated it on 24 popular applications. The results showed that path collisions are common, i.e., up to 75% of edges could collide with others in some applications, and CollAFL could reduce the edge collision ratio to nearly zero. Moreover, armed with the three fuzzing strategies, CollAFL outperforms AFL in terms of both code coverage and vulnerability discovery. On average, CollAFL covered 20% more program paths, found 320% more unique crashes and 260% more bugs than AFL in 200 hours. In total, CollAFL found 157 new security bugs with 95 new CVEs assigned.

T-Fuzz: fuzzing by program transformation

Fuzzing is a simple yet effective approach to discover software bugs utilizing randomly generated inputs. However, it is limited by coverage and cannot find bugs hidden in deep execution paths of the program because the randomly generated inputs fail complex sanity checks, e.g., checks on magic values, checksums, or hashes. To improve coverage, existing approaches rely on imprecise heuristics or complex input mutation techniques (e.g., symbolic execution or taint analysis) to bypass sanity checks. Our novel method tackles coverage from a different angle: by removing sanity checks in the target program. T-Fuzz leverages a coverage-guided fuzzer to generate inputs. Whenever the fuzzer can no longer trigger new code paths, a light-weight, dynamic tracing based technique detects the input checks that the fuzzer-generated inputs fail. These checks are then removed from the target program. Fuzzing then continues on the transformed program, allowing the code protected by the removed checks to be triggered and potential bugs discovered. Fuzzing transformed programs to find bugs poses two challenges: (1) removal of checks leads to over-approximation and false positives, and (2) even for true bugs, the crashing input on the transformed program may not trigger the bug in the original program. As an auxiliary post-processing step, T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program. By transforming the program as well as mutating the input, T-Fuzz covers more code and finds more true bugs than any existing technique. We have evaluated T-Fuzz on the DARPA Cyber Grand Challenge dataset, LAVA-M dataset and 4 real-world programs (pngfix, tiffinfo, magick and pdftohtml). For the CGC dataset, T-Fuzz finds bugs in 166 binaries, Driller in 121, and AFL in 105. In addition, found 3 new bugs in previously-fuzzed programs and libraries.

Fuzzing is a popular technique for finding software bugs. However, the performance of the state-of-the-art fuzzers leaves a lot to be desired. Fuzzers based on symbolic execution produce quality inputs but run slow, while fuzzers based on random mutation run fast but have difficulty producing quality inputs. We propose Angora, a new mutation-based fuzzer that outperforms the state-of-the-art fuzzers by a wide margin. The main goal of Angora is to increase branch coverage by solving path constraints without symbolic execution. To solve path constraints efficiently, we introduce several key techniques: scalable byte-level taint tracking, context-sensitive branch count, search based on gradient descent, and input length exploration. On the LAVA-M data set, Angora found almost all the injected bugs, found more bugs than any other fuzzer that we compared with, and found eight times as many bugs as the second-best fuzzer in the program who. Angora also found 103 bugs that the LAVA authors injected but could not trigger. We also tested Angora on eight popular, mature open source programs. Angora found 6, 52, 29, 40 and 48 new bugs in file, jhead, nm, objdump and size, respectively. We measured the coverage of Angora and evaluated how its key techniques contribute to its impressive performance.


FP-STALKER: Tracking Browser Fingerprint Evolutions Along Time

Browser fingerprinting has emerged as a technique to track users without their consent. Unlike cookies, fingerprinting is a stateless technique that does not store any information on devices, but instead exploits unique combinations of attributes handed over freely by browsers. The uniqueness of fingerprints allows them to be used for identification. However, browser fingerprints change over time and the effectiveness of tracking users over longer durations has not been properly addressed.

In this paper, we show that browser fingerprints tend to change frequently-from every few hours to days-due to, for example, software updates or configuration changes. Yet, despite these frequent changes, we show that browser fingerprints can still be linked, thus enabling long-term tracking.

FP-STALKER is an approach to link browser fingerprint evolutions. It compares fingerprints to determine if they originate from the same browser. We created two variants of FP-STALKER, a rule-based variant that is faster, and a hybrid variant that exploits machine learning to boost accuracy. To evaluate FP-STALKER , we conduct an empirical study using 98,598 fingerprints we collected from 1, 905 distinct browser instances. We compare our algorithm with the state of the art and show that, on average, we can track browsers for 54.48 days, and 26 % of browsers can be tracked for more than 100 days.

Study and Mitigation of Origin Stripping Vulnerabilities in Hybrid-postMessage Enabled Mobile Applications

postMessage is popular in HTML5 based web apps to allow the communication between different origins. With the increasing popularity of the embedded browser (i.e., WebView) in mobile apps (i.e., hybrid apps), postMessage has found utility in these apps. However, different from web apps, hybrid apps have a unique requirement that their native code (e.g., Java for Android) also needs to exchange messages with web code loaded in WebView. To bridge the gap, developers typically extend postMessage by treating the native context as a new frame, and allowing the communication between the new frame and the web frames. We term such extended postMessage “hybrid postMessage” in this paper. We find that hybrid postMessage introduces new critical security flaws: all origin information of a message is not respected or even lost during the message delivery in hybrid postMessage. If adversaries inject malicious code into WebView, the malicious code may leverage the flaws to passively monitor messages that may contain sensitive information, or actively send messages to arbitrary message receivers and access their internal functionalities and data. We term the novel security issue caused by hybrid postMessage “Origin Stripping Vulnerability” (OSV).

In this paper, our contributions are fourfold. First, we conduct the first systematic study on OSV. Second, we propose a lightweight detection tool against OSV, called OSV-Hunter. Third, we evaluate OSV-Hunter using a set of popular apps. We found that 74 apps implemented hybrid postMessage, and all these apps suffered from OSV, which might be exploited by adversaries to perform remote real-time microphone monitoring, data race, internal data manipulation, denial of service (DoS) attacks and so on. Several popular development frameworks, libraries (such as the Facebook React Native framework, and the Google cloud print library) and apps (such as Adobe Reader and WPS office) are impacted. Lastly, to mitigate OSV from the root, we design and implement three new postMessage APIs, called OSV-Free. Our evaluation shows that OSV-Free is secure and fast, and it is generic and resilient to the notorious Android fragmentation problem. We also demonstrate that OSV-Free is easy to use, by applying OSV-Free to harden the complex “Facebook React Native” framework. OSV-Free is open source, and its source code and more implementation and evaluation details are available online.

Mobile Application Web API Reconnaissance: Web-to-Mobile Inconsistencies & Vulnerabilities

Modern mobile apps use cloud-hosted HTTP-based API services and heavily rely on the Internet infrastructure for data communication and storage. To improve performance and leverage the power of the mobile device, input validation and other business logic required for interfacing with web API services are typically implemented on the mobile client. However, when a web service implementation fails to thoroughly replicate input validation, it gives rise to inconsistencies that could lead to attacks that can compromise user security and privacy. Developing automatic methods of auditing web APIs for security remains challenging.

In this paper, we present a novel approach for automatically analyzing mobile app-to-web API communication to detect inconsistencies in input validation logic between apps and their respective web API services. We present our system, \sysname, which implements a static analysis-based web API reconnaissance approach to uncover inconsistencies on real world API services that can lead to attacks with severe consequences for potentially millions of users throughout the world. Our system utilizes program analysis techniques to automatically extract HTTP communication templates from Android apps that encode the input validation constraints imposed by the apps on outgoing web requests to web API services. WARDroid is also enhanced with blackbox testing of server validation logic to identify inconsistencies that can lead to attacks.

We evaluated our system on a set of 10,000 popular free apps from the Google Play Store. We detected problematic logic in APIs used in over 4,000 apps, including 1,743 apps that use unencrypted HTTP communication. We further tested 1,000 apps to validate web API hijacking vulnerabilities that can lead to potential compromise of user privacy and security and found that millions of users are potentially affected from our sample set of tested apps.

Enumerating Active IPv6 Hosts for Large-scale Security Scans via DNSSEC-signed Reverse Zones

Security research has made extensive use of exhaustive Internet-wide scans over the recent years, as they can provide significant insights into the overall state of security of the Internet, and ZMap made scanning the entire IPv4 address space practical. However, the IPv4 address space is exhausted, and a switch to IPv6, the only accepted long-term solution, is inevitable. In turn, to better understand the security of devices connected to the Internet, including in particular Internet of Things devices, it is imperative to include IPv6 addresses in security evaluations and scans. Unfortunately, it is practically infeasible to iterate through the entire IPv6 address space, as it is 2^96 times larger than the IPv4 address space. Therefore, enumeration of active hosts prior to scanning is necessary. Without it, we will be unable to investigate the overall security of Internet-connected devices in the future.

In this paper, we introduce a novel technique to enumerate an active part of the IPv6 address space by walking DNSSEC-signed IPv6 reverse zones. Subsequently, by scanning the enumerated addresses, we uncover significant security problems: the exposure of sensitive data, and incorrectly controlled access to hosts, such as access to routing infrastructure via administrative interfaces, all of which were accessible via IPv6. Furthermore, from our analysis of the differences between accessing dual-stack hosts via IPv6 and IPv4, we hypothesize that the root cause is that machines automatically and by default take on globally routable IPv6 addresses. This is a practice that the affected system administrators appear unaware of, as the respective services are almost always properly protected from unauthorized access via IPv4.

Our findings indicate (i) that enumerating active IPv6 hosts is practical without a preferential network position contrary to common belief, (ii) that the security of active IPv6 hosts is currently still lagging behind the security state of IPv4 hosts, and (iii) that unintended IPv6 connectivity is a major security issue for unaware system administrators.

Tracking Certificate Misissuance in the Wild

Certificate Authorities (CAs) regularly make mechanical errors when issuing certificates. To quantify these errors, we introduce ZLint, a certificate linter that codifies the policies set forth by the CA/Browser Forum Baseline Requirements and RFC 5280 that can be tested in isolation. We run ZLint on browser-trusted certificates in Censys and systematically analyze how well CAs construct certificates. We find that the number errors has drastically reduced since 2012. In 2017, only 0.02% of certificates have errors. However, this is largely due to a handful of large authorities that consistently issue correct certificates. There remains a long tail of small authorities that regularly issue non-conformant certificates. We further find that issuing certificates with errors is correlated with other types of mismanagement and for large authorities, browser action. Drawing on our analysis, we conclude with a discussion on how the community can best use lint data to identify authorities with worrisome organizational practices and ensure long-term health of the Web PKI.

A Formal Treatment of Accountable Proxying over TLS

Much of Internet traffic nowadays passes through active proxies, whose role is to inspect, filter, cache, or trans- form data exchanged between two endpoints. To perform their tasks, such proxies modify channel-securing protocols, like TLS, resulting in serious vulnerabilities. Such problems are exacerbated by the fact that middleboxes are often invisible to one or both endpoints, leading to a lack of accountability. A recent protocol, called mcTLS, pioneered accountability for proxies, which are authorized by the endpoints and given limited read/write permissions to application traffic.

Unfortunately, we show that mcTLS is insecure: the protocol modifies the TLS protocol, exposing it to a new class of middlebox-confusion attacks. Such attacks went unnoticed mainly because mcTLS lacked a formal analysis and security proofs. Hence, our second contribution is to formalize the goal of accountable proxying over secure channels. Third, we propose a provably-secure alternative to soon-to-be-standardized mcTLS: a generic and modular protocol-design that care- fully composes generic secure channel-establishment protocols, which we prove secure. Finally, we present a proof-of-concept implementation of our design, instantiated with unmodified TLS 1.3, and evaluate its overheads.


Secure Device Bootstrapping without Secrets Resistant to Signal Manipulation Attacks

In this paper, we address the fundamental problem of securely bootstrapping a group of wireless devices to a hub, when none of the devices share prior associations (secrets) with the hub or between them. This scenario aligns with the secure deployment of body area networks, IoT, medical devices, industrial automation sensors, autonomous vehicles, and others. We develop VERSE, a physical-layer group message integrity verification primitive that effectively detects advanced wireless signal manipulations that can be used to launch man-in-the-middle (MitM) attacks over wireless. Without using shared secrets to establish authenticated channels, such attacks are notoriously difficult to thwart and can undermine the authentication and key establishment processes. VERSE exploits the existence of multiple devices to verify the integrity of the messages exchanged within the group. We then use VERSE to build a bootstrapping protocol, which securely introduces new devices to the network. Compared to the state-of-the-art, VERSE achieves in-band message integrity verification during secure pairing using only the RF modality without relying on out-of-band channels or extensive human involvement. It guarantees security even when the adversary is capable of fully controlling the wireless channel by annihilating and injecting wireless signals. We study the limits of such advanced wireless attacks and prove that the introduction of multiple legitimate devices can be leveraged to increase the security of the pairing process. We validate our claims via theoretical analysis and extensive experimentations on the USRP platform. We further discuss various implementation aspects such as the effect of time synchronization between devices and the effects of multipath and interference. Note that the elimination of shared secrets, default passwords, and public key infrastructures effectively addresses the related key management challenges when these are considered at scale.

Do You Feel What I Hear? Enabling Autonomous IoT Device Pairing using Different Sensor Types

Context-based pairing solutions increase the usability of IoT device pairing by eliminating any human involvement in the pairing process. This is possible by utilizing on-board sensors (with same sensing modalities) to capture a common physical context (e.g., ambient sound via each device’s microphone). However, in a smart home scenario, it is impractical to assume that all devices will share a common sensing modality. For example, a motion detector is only equipped with an infrared sensor while Amazon Echo only has microphones. In this paper, we develop a new context-based pairing mechanism called Perceptio that uses time as the common factor across differing sensor types. By focusing on the event timing, rather than the specific event sensor data, Perceptio creates event fingerprints that can be matched across a variety of IoT devices. We propose Perceptio based on the idea that devices co-located within a physically secure boundary (e.g., single family house) can observe more events in common over time, as opposed to devices outside. Devices make use of the observed contextual information to provide entropy for Perceptio’s pairing protocol. We design and implement Perceptio, and evaluate its effectiveness as an autonomous secure pairing solution. Our implementation demonstrates the ability to sufficiently distinguish between legitimate devices (placed within the boundary) and attacker devices (placed outside) by imposing a threshold on fingerprint similarity. Perceptio demonstrates an average fingerprint similarity of 94.9% between legitimate devices while even a hypothetical impossibly well-performing attacker yields only 68.9% between itself and a valid device.

On the Economics of Offline Password Cracking

We develop an economic model of an offline password cracker which allows us to make quantitative predictions about the fraction of accounts that a rational password attacker would crack in the event of an authentication server breach. We apply our economic model to analyze recent massive password breaches at Yahoo!, Dropbox, LastPass and AshleyMadison. All four organizations were using key-stretching to protect user passwords. In fact, LastPass’ use of PBKDF2-SHA256 with $10^5$ hash iterations exceeds 2017 NIST minimum recommendation by an order of magnitude. Nevertheless, our analysis paints a bleak picture: the adopted key-stretching levels provide insufficient protection for user passwords. In particular, we present strong evidence that most user passwords follow a Zipf’s law distribution, and characterize the behavior of a rational attacker when user passwords are selected from a Zipf’s law distribution. We show that there is a finite threshold which depends on the Zipf’s law parameters that characterizes the behavior of a rational attacker — if the value of a cracked password (normalized by the cost of computing the password hash function) exceeds this threshold then the adversary’s optimal strategy is always to continue attacking until each user password has been cracked. In all cases (Yahoo!, Dropbox, LastPass and AshleyMadison) we find that the value of a cracked password almost certainly exceeds this threshold meaning that a rational attacker would crack all passwords that are selected from the Zipf’s law distribution (i.e., most user passwords). This prediction holds even if we incorporate an aggressive model of diminishing returns for the attacker (e.g., the total value of $500$ million cracked passwords is less than $100$ times the total value of $5$ million passwords). On a positive note our analysis demonstrates that memory hard functions (MHFs) such as SCRYPT or Argon2i can significantly reduce the damage of an offline attack. In particular, we find that because MHFs substantially increase guessing costs a rational attacker will give up well before he cracks most user passwords and this prediction holds even if the attacker does not encounter diminishing returns for additional cracked passwords. Based on our analysis we advocate that password hashing standards should be updated to require the use of memory hard functions for password hashing and disallow the use of non-memory hard functions such as BCRYPT or PBKDF2.

A Tale of Two Studies: The Best and Worst of YubiKey Usability

Two-factor authentication (2FA) significantly improves the security of password-based authentication. Recently, there has been increased interest in Universal 2nd Factor (U2F) security keys-small hardware devices that require users to press a button on the security key to authenticate. To examine the usability of security keys in non-enterprise usage, we conducted two user studies of the YubiKey, a popular line of U2F security keys. The first study tasked 31 participants with configuring a Windows, Google, and Facebook account to authenticate using a YubiKey. This study revealed problems with setup instructions and workflow including users locking themselves out of their operating system or thinking they had successfully enabled 2FA when they had not. In contrast, the second study had 25 participants use a YubiKey in their daily lives over a period of four weeks, revealing that participants generally enjoyed the experience. Conducting both a laboratory and longitudinal study yielded insights into the usability of security keys that would not have been evident from either study in isolation. Based on our analysis, we recommend standardizing the setup process, enabling verification of success, allowing shared accounts, integrating with operating systems, and preventing lockouts.

When Your Fitness Tracker Betrays You: Quantifying the Predictability of Biometric Features Across Contexts

Attacks on behavioral biometrics have become increasingly popular. Most research has been focused on presenting a previously obtained feature vector to the biometric sensor, often by the attacker training themselves to change their behavior to match that of the victim. However, obtaining the victim’s biometric information may not be easy, especially when the user’s template on the authentication device is adequately secured. As such, if the authentication device is inaccessible, the attacker may have to obtain data elsewhere. In this paper, we present an analytic framework that enables us to measure how easily features can be predicted based on data gathered in a different context (e.g., different sensor, performed task or environment). This framework is used to assess how resilient individual features or entire biometrics are against such cross-context attacks. In order to be able to compare existing biometrics with regard to this property, we perform a user study to gather biometric data from 30 participants and ?ve biometrics (ECG, eye movements, mouse movements, touchscreen dynamics and gait) in a variety of contexts. We make this dataset publicly available online. Our results show that many attack scenarios are viable in practice as features are easily predicted from a variety of contexts. All biometrics include features that are particularly predictable (e.g., amplitude features for ECG or curvature for mouse movements). Overall, we observe that cross-context attacks on eye movements, mouse movements and touchscreen inputs are comparatively easy while ECG and gait exhibit much more chaotic cross-context changes.

2019 program(有视频和PPT) dblp

Session 3: Web Security

Does Certificate Transparency Break the Web? Measuring Adoption and Error Rate

Certificate Transparency (CT) is an emerging system for enabling the rapid discovery of malicious or misissued certificates. Initially standardized in 2013, CT is now finally beginning to see widespread support. Although CT provides desirable security benefits, web browsers cannot begin requiring all websites to support CT at once, due to the risk of breaking large numbers of websites. We discuss challenges for deployment, analyze the adoption of CT on the web, and measure the error rates experienced by users of the Google Chrome web browser. We find that CT has so far been widely adopted with minimal breakage and warnings.

Security researchers often struggle with the tradeoff between security and user frustration: rolling out new security requirements often causes breakage. We view CT as a case study for deploying ecosystem-wide change while trying to minimize end user impact. We discuss the design properties of CT that made its success possible, as well as draw lessons from its risks and pitfalls that could be avoided in future large-scale security deployments.

EmPoWeb: Empowering Web Applications with Browser Extensions

Browser extensions are third party programs, tightly integrated to browsers, where they execute with elevated privileges in order to provide users with additional functionalities. Unlike web applications, extensions are not subject to the Same Origin Policy (SOP) and therefore can read and write user data on any web application. They also have access to sensitive user information including browsing history, bookmarks, credentials (cookies) and list of installed extensions. They have access to a permanent storage in which they can store data as long as they are installed in the user’s browser. They can trigger the download of arbitrary files and save them on the user’s device.

For security reasons, browser extensions and web applications are executed in separate contexts. Nonetheless, in all major browsers, extensions and web applications can interact by exchanging messages. Through these communication channels, a web application can exploit extension privileged capabilities and thereby access and exfiltrate sensitive user information.

In this work, we analyzed the communication interfaces exposed to web applications by Chrome, Firefox and Opera browser extensions. As a result, we identified many extensions that web applications can exploit to access privileged capabilities. Through extensions’ APIS, web applications can bypass SOP and access user data on any other web application, access user credentials (cookies), browsing history, bookmarks, list of installed extensions, extensions storage, and download and save arbitrary files in the user’s device.

Our results demonstrate that the communications between browser extensions and web applications pose serious security and privacy threats to browsers, web applications and more importantly to users. We discuss countermeasures and proposals, and believe that our study and in particular the tool we used to detect and exploit these threats, can be used as part of extensions review process by browser vendors to help them identify and fix the aforementioned problems in extensions.

“If HTTPS Were Secure, I Wouldn’t Need 2FA” - End User and Administrator Mental Models of HTTPS

HTTPS is one of the most important protocols used to secure communication and is, fortunately, becoming more pervasive. However, especially the long tail of websites is still not sufficiently secured.
HTTPS involves different types of users, e.g., end users who are forced to make critical security decisions when faced with warnings or administrators who are required to deal with cryptographic fundamentals and complex decisions concerning compatibility.

In this work, we present the first qualitative study of both end user and administrator mental models of HTTPS. We interviewed 18 end users and 12 administrators; our findings reveal misconceptions about security benefits and threat models from both groups. We identify protocol components that interfere with secure configurations and usage behavior and reveal differences between administrator and end user mental models.

Our results suggest that end user mental models are more conceptual while administrator models are more protocol-based. We also found that end users often confuse encryption with authentication, significantly underestimate the security benefits of HTTPS, and ignore and distrust security indicators while administrators often do not understand the interplay of functional protocol components. Based on the different mental models, we discuss implications and provide actionable recommendations for future designs of user interfaces and protocols.

Fidelius: Protecting User Secrets from Compromised Browsers

Users regularly enter sensitive data, such as passwords, credit card numbers, or tax information, into the browser window. While modern browsers provide powerful client-side privacy measures to protect this data, none of these defenses prevent a browser compromised by malware from stealing it. In this work, we present Fidelius, a new architecture that uses trusted hardware enclaves integrated into the browser to enable protection of user secrets during web browsing sessions, even if the entire underlying browser and OS are fully controlled by a malicious attacker.

Fidelius solves many challenges involved in providing protection for browsers in a fully malicious environment, offering support for integrity and privacy for form data, JavaScript execution, XMLHttpRequests, and protected web storage, while minimizing the TCB. Moreover, interactions between the enclave and the browser, the keyboard, and the display all require new protocols, each with their own security considerations. Finally, Fidelius takes into account UI considerations to ensure a consistent and simple interface for both developers and users.

As part of this project, we develop the first open source system that provides a trusted path from input and output peripherals to a hardware enclave with no reliance on additional hypervisor security assumptions. These components may be of independent interest and useful to future projects.

We implement and evaluate Fidelius to measure its performance overhead, finding that Fidelius imposes acceptable overhead on page load and user interaction for secured pages and has no impact on pages and page components that do not use its enhanced security features.

Postcards from the Post-HTTP World: Amplification of HTTPS Vulnerabilities in the Web Ecosystem

HTTPS aims at securing communication over the Web by providing a cryptographic protection layer that ensures the confidentiality and integrity of communication and enables client/server authentication. However, HTTPS is based on the SSL/TLS protocol suites that have been shown to be vulnerable to various attacks in the years. This has required fixes and mitigations both in the servers and in the browsers, producing a complicated mixture of protocol versions and implementations in the wild, which makes it unclear which attacks are still effective on the modern Web and what is their import on web application security. In this paper, we present the first systematic quantitative evaluation of web application insecurity due to cryptographic vulnerabilities. We specify attack conditions against TLS using attack trees and we crawl the Alexa Top 10k to assess the import of these issues on page integrity, authentication credentials and web tracking. Our results show that the security of a consistent number of websites is severely harmed by cryptographic weaknesses that, in many cases, are due to external or related-domain hosts. This empirically, yet systematically demonstrates how a relatively limited number of exploitable HTTPS vulnerabilities are amplified by the complexity of the web ecosystem.

Session 6: Protocols and Authentication

Reasoning Analytically About Password-Cracking Software

A rich literature has presented efficient techniques for estimating password strength by modeling password-cracking algorithms. Unfortunately, these previous techniques only apply to probabilistic password models, which real attackers seldom use. In this paper, we introduce techniques to reason analytically and efficiently about transformation-based password cracking in software tools like John the Ripper and Hashcat. We define two new operations, rule inversion and guess counting, with which we analyze these tools without needing to enumerate guesses. We implement these techniques and find orders-of-magnitude reductions in the time it takes to estimate password strength. We also present four applications showing how our techniques enable increased scientific rigor in optimizing these attacks’ configurations. In particular, we show how our techniques can leverage revealed password data to improve orderings of transformation rules and to identify rules and words potentially missing from an attack configuration. Our work thus introduces some of the first principled mechanisms for reasoning scientifically about the types of password-guessing attacks that occur in practice.

True2F: Backdoor-Resistant Authentication Tokens

We present True2F, a system for second-factor authentication that provides the benefits of conventional authentication tokens in the face of phishing and software compromise, while also providing strong protection against token faults and backdoors. To do so, we develop new lightweight two-party protocols for generating cryptographic keys and ECDSA signatures, and we implement new privacy defenses to prevent cross-origin token-fingerprinting attacks. To facilitate real-world deployment, our system is backwards-compatible with today’s U2F-enabled web services and runs on commodity hardware tokens after a firmware modification. A True2F-protected authentication takes just 57ms to complete on the token, compared with 23ms for unprotected U2F.

Beyond Credential Stuffing: Password Similarity Models using Neural Networks

Attackers increasingly use passwords leaked from one website to compromise associated accounts on other websites. Such targeted attacks work because users reuse, or pick similar, passwords for different websites. We recast one of the core technical challenges underlying targeted attacks as the task of modeling similarity of human-chosen passwords. We show how to learn good password similarity models using a compilation of 1.4 billion leaked email, password pairs. Using our trained models of password similarity, we exhibit the most damaging targeted attack to date. Simulations indicate that our attack compromises more than 16% of user accounts in less than a thousand guesses, should one of their other passwords be known to the attacker and despite the use of state-of-the art countermeasures. We show via a case study involving a large university authentication service that the attacks are also effective in practice. We go on to propose the first-ever defense against such targeted attacks, by way of personalized password strength meters (PPSMs). These are password strength meters that can warn users when they are picking passwords that are vulnerable to attacks, including targeted ones that take advantage of the user’s previously compromised passwords. We design and build a PPSM that can be compressed to less than 3 MB, making it easy to deploy in order to accurately estimate the strength of a password against all known guessing attacks.

The 9 Lives of Bleichenbacher’s CAT: New Cache ATtacks on TLS Implementations

At CRYPTO’98, Bleichenbacher published his seminal paper which described a padding oracle attack against RSA implementations that follow the PKCS #1 v1.5 standard. Over the last twenty years researchers and implementors had spent a huge amount of effort in developing and deploying numerous mitigation techniques which were supposed to plug all the possible sources of Bleichenbacher-like leakages. However, as we show in this paper, most implementations are still vulnerable to several novel types of attack based on leakage from various microarchitectural side channels: Out of nine popular implementations of TLS that we tested, we were able to break the security of seven implementations with practical proof-of-concept attacks. We demonstrate the feasibility of using those Cache-like ATacks (CATs) to perform a downgrade attack against any TLS connection to a vulnerable server, using a BEAST-like Man in the Browser attack. The main difficulty we face is how to perform the thousands of oracle queries required before the browser’s imposed timeout (which is 30 seconds for almost all browsers, with the exception of Firefox which can be tricked into extending this period). Due to its use of adaptive chosen ciphertext queries, the attack seems to be inherently sequential, but we describe a new way to parallelize Bleichenbacher-like padding attacks by exploiting any available number of TLS servers that share the same public key certificate. With this improvement, we can demonstrate the feasibility of a downgrade attack which could recover all the 2048 bits of the RSA plaintext (including the premaster secret value, which suffices to establish a secure connection) from five available TLS servers in under 30 seconds. This sequential-to-parallel transformation of such attacks can be of independent interest, speeding up and facilitating other side channel attacks on RSA implementations.

An Extensive Formal Security Analysis of the OpenID Financial-grade API

Forced by regulations and industry demand, banks worldwide are working to open their customers’ online banking accounts to third-party services via web-based APIs. By using these so-called Open Banking APIs, third-party companies, such as FinTechs, are able to read information about and initiate payments from their users’ bank accounts. Such access to financial data and resources needs to meet particularly high security requirements to protect customers. One of the most promising standards in this segment is the OpenID Financial-grade API (FAPI), currently under development in an open process by the OpenID Foundation and backed by large industry partners. The FAPI is a profile of OAuth 2.0 designed for high-risk scenarios and aiming to be secure against very strong attackers. To achieve this level of security, the FAPI employs a range of mechanisms that have been developed to harden OAuth 2.0, such as Code and Token Binding (including mTLS and OAUTB), JWS Client Assertions, and Proof Key for Code Exchange. In this paper, we perform a rigorous, systematic formal analysis of the security of the FAPI, based on an existing comprehensive model of the web infrastructure - the Web Infrastructure Model (WIM) proposed by Fett, Küsters, and Schmitz. To this end, we first develop a precise model of the FAPI in the WIM, including different profiles for read-only and read-write access, different flows, different types of clients, and different combinations of security features, capturing the complex interactions in a web-based environment. We then use our model of the FAPI to precisely define central security properties. In an attempt to prove these properties, we uncover partly severe attacks, breaking authentication, authorization, and session integrity properties. We develop mitigations against these attacks and finally are able to formally prove the security of a fixed version of the FAPI. Although financial applications are high-stakes environments, this work is the first to formally analyze and, importantly, verify an Open Banking security profile. By itself, this analysis is an important contribution to the development of the FAPI since it helps to define exact security properties and attacker models, and to avoid severe security risks before the first implementations of the standard go live. Of independent interest, we also uncover weaknesses in the aforementioned security mechanisms for hardening OAuth 2.0. We illustrate that these mechanisms do not necessarily achieve the security properties they have been designed for.

Session 8: Machine Learning

Certified Robustness to Adversarial Examples with Differential Privacy

Adversarial examples that fool machine learning models, particularly deep neural networks, have been a topic of intense research interest, with attacks and defenses being developed in a tight back-and-forth. Most past defenses are best effort and have been shown to be vulnerable to sophisticated attacks. Recently a set of certified defenses have been introduced, which provide guarantees of robustness to norm-bounded attacks. However these defenses either do not scale to large datasets or are limited in the types of models they can support. This paper presents the first certified defense that both scales to large networks and datasets (such as Google’s Inception network for ImageNet) and applies broadly to arbitrary model types. Our defense, called PixelDP, is based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired privacy formalism, that provides a rigorous, generic, and flexible foundation for defense.

DeepSec: A Uniform Platform for Security Analysis of Deep Learning Models

Deep learning (DL) models are inherently vulnerable to adversarial examples – maliciously crafted inputs to trigger target DL models to misbehave – which significantly hinders the application of DL in security-sensitive domains. Intensive research on adversarial learning has led to an arms race between adversaries and defenders. Such plethora of emerging attacks and defenses raise many questions: Which attacks are more evasive, preprocessing-proof, or transferable? Which defenses are more effective, utility-preserving, or general? Are ensembles of multiple defenses more robust than individuals? Yet, due to the lack of platforms for comprehensive evaluation on adversarial attacks and defenses, these critical questions remain largely unsolved. In this paper, we present the design, implementation, and evaluation of DEEPSEC, a uniform platform that aims to bridge this gap. In its current implementation, DEEPSEC incorporates 16 state-of-the-art attacks with 10 attack utility metrics, and 13 state-of-the-art defenses with 5 defensive utility metrics. To our best knowledge, DEEPSEC is the first platform that enables researchers and practitioners to (i) measure the vulnerability of DL models, (ii) evaluate the effectiveness of various attacks/defenses, and (iii) conduct comparative studies on attacks/defenses in a comprehensive and informative manner. Leveraging DEEPSEC, we systematically evaluate the existing adversarial attack and defense methods, and draw a set of key findings, which demonstrate DEEPSEC’s rich functionality, such as (1) the trade-off between misclassification and imperceptibility is empirically confirmed; (2) most defenses that claim to be universally applicable can only defend against limited types of attacks under restricted settings; (3) it is not necessary that adversarial examples with higher perturbation magnitude are easier to be detected; (4) the ensemble of multiple defenses cannot improve the overall defense capability, but can improve the lower bound of the defense effectiveness of individuals. Extensive analysis on DEEPSEC demonstrates its capabilities and advantages as a benchmark platform which can benefit future adversarial learning research.

Exploiting Unintended Feature Leakage in Collaborative Learning

Collaborative machine learning and related techniques such as federated learning allow multiple participants, each with his own training dataset, to build a joint model by training locally and periodically exchanging model updates.
We demonstrate that these updates leak unintended information about participants’ training data and develop passive and active inference attacks to exploit this leakage. First, we show that an adversarial participant can infer the presence of exact data points – for example, specific locations – in others’ training data (i.e., membership inference). Then, we show how this adversary can infer properties that hold only for a subset of the training data and are independent of the properties that the joint model aims to capture. For example, he can infer when a specific person first appears in the photos used to train a binary gender classifier.
We evaluate our attacks on a variety of tasks, datasets, and learning configurations, analyze their limitations, and discuss possible defenses.

Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks

Lack of transparency in deep neural networks (DNNs) make them susceptible to backdoor attacks, where hidden associations or triggers override normal classification to produce unexpected results. For example, a model with a
backdoor always identifies a face as Bill Gates if a specific symbol is present in the input. Backdoors can stay hidden indefinitely until activated by an input, and present a serious security risk to many security or safety related applications, e.g. biometric authentication systems or self-driving cars.

We present the first robust and generalizable detection and mitigation system for DNN backdoor attacks. Our techniques identify backdoors and reconstruct possible triggers. We identify multiple mitigation techniques via input filters, neuron pruning and unlearning. We demonstrate their efficacy via extensive experiments on a variety of DNNs, against two types of backdoor injection methods identified by prior work. Our techniques also prove robust against a number of variants of the backdoor attack.

Helen: Maliciously Secure Coopetitive Learning for Linear Models

Many organizations wish to collaboratively train machine learning models on their combined datasets for a common benefit (e.g., better medical research, or fraud detection). However, they often cannot share their plaintext datasets due to privacy concerns and/or business competition. In this paper, we design and build Helen, a system that allows multiple parties to train a linear model without revealing their data, a setting we call coopetitive learning. Compared to prior secure training systems, Helen protects against a much stronger adversary who is malicious and can compromise m−1 out of m parties. Our evaluation shows that Helen can achieve up to five orders of magnitude of performance improvement when compared to training using an existing state-of-the-art secure multi-party computation framework.

Comprehensive Privacy Analysis of Deep Learning

Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for both centralized and federated learning, with respect to passive and active inference attackers, and assuming different adversary prior knowledge. We evaluate our novel white-box membership inference attacks against deep learning algorithms to trace their training data records. We show that a straightforward extension of the known black-box attacks to the white-box setting (through analyzing the outputs of activation functions) is ineffective. We therefore design new algorithms tailored to the white-box setting by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks. We investigate the reasons why deep learning models may leak information about their training data. We then show that even well-generalized models are significantly susceptible to white-box membership inference attacks, by analyzing state-of-the-art pre-trained and publicly available models for the CIFAR dataset. We also show how adversarial participants, in the federated learning setting, can successfully run active membership inference attacks against other participants, even when the global model achieves high prediction accuracies.

Session 9: Fuzzing

Razzer: Finding Kernel Race Bugs through Fuzzing

A data race in a kernel is an important class of bugs, critically impacting the reliability and security of the associated system. As a result of a race, the kernel may become unresponsive. Even worse, an attacker may launch a privilege escalation attack to acquire root privileges.
In this paper, we propose Razzer, a tool to find race bugs in kernels. The core of Razzer is in guiding fuzz testing towards potential data race spots in the kernel. Razzer employs two techniques to find races efficiently: a static analysis and a deterministic thread interleaving technique. Using a static analysis, Razzer identifies over-approximated potential data race spots, guiding the fuzzer to search for data races in the kernel more efficiently. Using the deterministic thread interleaving technique implemented at the hypervisor, Razzer tames the non-deterministic behavior of the kernel such that it can deterministically trigger a race. We implemented a prototype of Razzer and ran the latest Linux kernel (from v4.16-rc3 to v4.18-rc3) using Razzer. As a result, Razzer discovered 30 new races in the kernel, with 16 subsequently confirmed and accordingly patched by kernel developers after they were reported.

ProFuzzer: On-the-fly Input Type Probing for Better Zero-day Vulnerability Discovery

Existing mutation based fuzzers tend to randomly mutate the input of a program without understanding its underlying syntax and semantics. In this paper, we propose a novel on-the-fly probing technique (called ProFuzzer) that automatically recovers and understands input fields of critical importance to vulnerability discovery during a fuzzing process and intelligently adapts the mutation strategy to enhance the chance of hitting zero-day targets. Since such probing is transparently piggybacked to the regular fuzzing, no prior knowledge of the input specification is needed. During fuzzing, individual bytes are first mutated and their fuzzing results are automatically analyzed to link those related together and identify the type for the field connecting them; these bytes are further mutated together following type-specific strategies, which substantially prunes the search space. We define the probe types generally across all applications, thereby making our technique application agnostic. Our experiments on standard benchmarks and real-world applications show that ProFuzzer substantially outperforms AFL and its optimized version AFLFast, as well as other state-of-art fuzzers including VUzzer, Driller and QSYM. Within two months, it exposed 42 zero-days in 10 intensively tested programs, generating 30 CVEs.

Full-speed Fuzzing: Reducing Fuzzing Overhead through Coverage-guided Tracing

Coverage-guided fuzzing is one of the most successful approaches for discovering software bugs and security vulnerabilities. Of its three main components: (1) test case generation, (2) code coverage tracing, and (3) crash triage, code coverage tracing is a dominant source of overhead. Coverage-guided fuzzers trace every test case’s code coverage through either static or dynamic binary instrumentation, or more recently, using hardware support. Unfortunately, tracing all test cases incurs significant performance penalties–-even when the overwhelming majority of test cases and their coverage information are discarded because they do not increase code coverage. To eliminate needless tracing by coverage-guided fuzzers, we introduce the notion of coverage-guided tracing. Coverage-guided tracing leverages two observations: (1) only a fraction of generated test cases increase coverage, and thus require tracing; and (2) coverage-increasing test cases become less frequent over time. Coverage-guided tracing encodes the current frontier of coverage in the target binary so that it self-reports when a test case produces new coverage–-without tracing. This acts as a filter for tracing; restricting the expense of tracing to only coverage-increasing test cases. Thus, coverage-guided tracing trades increased time handling coverage-increasing test cases for decreased time handling non-coverage-increasing test cases. To show the potential of coverage-guided tracing, we create an implementation based on the static binary instrumentor Dyninst called UnTracer. We evaluate UnTracer using eight real-world binaries commonly used by the fuzzing community. Experiments show that after only an hour of fuzzing, UnTracer’s average overhead is below 1%, and after 24-hours of fuzzing, UnTracer approaches 0% overhead, while tracing every test case with popular white- and black-box-binary tracers AFL-Clang, AFL-QEMU, and AFL-Dyninst incurs overheads of 36%, 612%, and 518%, respectively. We further integrate UnTracer with the state-of-the-art hybrid fuzzer QSYM and show that in 24-hours of fuzzing, QSYM-UnTracer executes 79% and 616% more test cases than QSYM-Clang and QSYM-QEMU, respectively.

NEUZZ: Efficient Fuzzing with Neural Program Smoothing

Fuzzing has become the de facto standard technique for finding software vulnerabilities. However, even state-of-the-art fuzzers are not very efficient at finding hard-to-trigger software bugs. Most popular fuzzers use evolutionary guidance to generate inputs that can trigger different bugs. Such evolutionary algorithms, while fast and simple to implement, often get stuck in fruitless sequences of random mutations. Gradient-guided optimization presents a promising alternative to evolutionary guidance. Gradient-guided techniques have been shown to significantly outperform evolutionary algorithms at solving high-dimensional structured optimization problems in domains like machine learning by efficiently utilizing gradients or higher-order derivatives of the underlying function.

However, gradient-guided approaches are not directly applicable to fuzzing as real-world program behaviors contain many discontinuities, plateaus, and ridges where the gradient-based methods often get stuck. We observe that this problem can be addressed by creating a smooth surrogate function approximating the target program’s discrete branching behavior. In this paper, we propose a novel program smoothing technique using surrogate neural network models that can incrementally learn smooth approximations of a complex, real-world program’s branching behaviors. We further demonstrate that such neural network models can be used together with gradient-guided input generation schemes to significantly increase the efficiency of the fuzzing process.

Our extensive evaluations demonstrate that NEUZZ significantly outperforms 10 state-of-the-art graybox fuzzers on 10 popular real-world programs both at finding new bugs and achieving higher edge coverage. NEUZZ found 31 previously unknown bugs (including two CVEs) that other fuzzers failed to find in 10 real-world programs and achieved 3X more edge coverage than all of the tested graybox fuzzers over 24 hour runs. Furthermore, NEUZZ also outperformed existing fuzzers on both LAVA-M and DARPA CGC bug datasets.

Fuzzing File Systems via Two-Dimensional Input Space Exploration

File systems, a basic building block of an OS, are too big and too complex to be bug free. Nevertheless, file systems rely on regular stress-testing tools and formal checkers to find bugs, which are limited due to the ever-increasing complexity of both file systems and OSes. Thus, fuzzing, proven to be an effective and a practical approach, becomes a preferable choice, as it does not need much knowledge about a target. However, three main challenges exist in fuzzing file systems: mutating a large image blob that degrades overall performance, generating image-dependent file operations, and reproducing found bugs, which is difficult for existing OS fuzzers.
Hence, we present JANUS, the first feedback-driven fuzzer that explores the two-dimensional input space of a file system, i.e., mutating metadata on a large image, while emitting image-directed file operations. In addition, JANUS relies on a library OS rather than on traditional VMs for fuzzing, which enables JANUS to load a fresh copy of the OS, thereby leading to better reproducibility of bugs. We evaluate JANUS on eight file systems and found 90 bugs in the upstream Linux kernel, 62 of which have been acknowledged. Forty-three bugs have been fixed with 32 CVEs assigned. In addition, JANUS achieves higher code coverage on all the file systems after fuzzing 12 hours, when compared with the state-of-the-art fuzzer Syzkaller for fuzzing file systems. JANUS visits 4.19x and 2.01x more code paths in Btrfs and ext4, respectively. Moreover, JANUS is able to reproduce 88–100% of the crashes, while Syzkaller fails on all of them.

Session 13: Network Security

Breaking LTE on Layer Two

Long Term Evolution (LTE) is the latest mobile communication standard and has a pivotal role in our information society: LTE combines performance goals with modern security mechanisms and serves casual use cases as well as critical infrastructure and public safety communications. Both scenarios are demanding towards a resilient and secure specification and implementation of LTE, as outages and open attack vectors potentially lead to severe risks. Previous work on LTE protocol security identified crucial attack vectors for both the physical (layer one) and network (layer three) layers. Data link layer (layer two) protocols, however, remain a blind spot in existing LTE security research.
In this paper, we present a comprehensive layer two security analysis and identify three attack vectors. These attacks impair the confidentiality and/or privacy of LTE communication. More specifically, we first present a passive identity mapping attack that matches volatile radio identities to longer lasting network identities, enabling us to identify users within a cell and serving as a stepping stone for follow-up attacks. Second, we demonstrate how a passive attacker can abuse the resource allocation as a side channel to perform website fingerprinting that enables the attacker to learn the websites a user accessed. Finally, we present the A LTE R attack that exploits the fact that LTE user data is encrypted in counter mode (AES-CTR) but not integrity protected, which allows us to modify the message payload. As a proof-of-concept demonstration, we show how an active attacker can redirect DNS requests and then perform a DNS spoofing attack. As a result, the user is redirected to a malicious website. Our experimental analysis demonstrates the real-world applicability of all three attacks and emphasizes the threat of open attack vectors on LTE layer two protocols.

HOLMES: Real-time APT Detection through Correlation of Suspicious Information Flows

In this paper, we present HOLMES, a system that implements a new approach to the detection of Advanced and Persistent Threats (APTs). HOLMES is inspired by several case studies of real-world APTs that highlight some common goals of APT actors. In a nutshell, HOLMES aims to produce a detection signal that indicates the presence of a coordinated set of activities that are part of an APT campaign. One of the main challenges addressed by our approach involves developing a suite of techniques that make the detection signal robust and reliable. At a high-level, the techniques we develop effectively leverage the correlation between suspicious information flows that arise during an attacker campaign. In addition to its detection capability, HOLMES is also able to generate a high-level graph that summarizes the attacker’s actions in real-time. This graph can be used by an analyst for an effective cyber response. An evaluation of our approach against some real-world APTs indicates that HOLMES can detect APT campaigns with high precision and low false alarm rate. The compact high-level graphs produced by HOLMES effectively summarizes an ongoing attack campaign and can assist real-time cyber-response operations.

Touching the Untouchables: Dynamic Security Analysis of the LTE Control Plane

This paper presents our extensive investigation of the security aspects of control plane procedures based on dynamic testing of the control components in operational Long Term Evolution (LTE) networks. For dynamic testing in LTE networks, we implemented a semi-automated testing tool, named LTEFuzz, by using open-source LTE software over which the user has full control. We systematically generated test cases by defining three basic security properties by closely analyzing the standards. Based on the security property, LTEFuzz generates and sends the test cases to a target network, and classifies the problematic behavior by only monitoring the device-side logs. Accordingly, we uncovered 36 vulnerabilities, which have not been disclosed previously. These findings are categorized into five types: Improper handling of (1) unprotected initial procedure, (2) crafted plain requests, (3) messages with invalid integrity protection, (4) replayed messages, and (5) security procedure bypass. We confirmed those vulnerabilities by demonstrating proof-of-concept attacks against operational LTE networks. The impact of the attacks is to either deny LTE services to legitimate users, spoof SMS messages, or eavesdrop/manipulate user data traffic. Precise root cause analysis and potential countermeasures to address these problems are presented as well. Cellular carriers were partially involved to maintain ethical standards as well as verify our findings in commercial LTE networks.

On the Feasibility of Rerouting-Based DDoS Defenses

Large botnet-based flooding attacks have recently demonstrated unprecedented damage. However, the best-known end-to-end availability guarantees against flooding attacks require costly global-scale coordination among autonomous systems (ASes). A recent proposal called routing around congestion (or RAC) attempts to offer strong end-to-end availability to a selected critical flow by dynamically rerouting it to an uncongested detour path without requiring any inter-AS coordination. This paper presents an in-depth analysis of the (in)feasibility of the RAC defense and points out that its rerouting approach, though intriguing, cannot possibly solve the challenging flooding problem. An effective RAC solution should find an inter-domain detour path for its critical flow with the two following desired properties: (1) it guarantees the establishment of an arbitrary detour path of its choice, and (2) it isolates the established detour path from non-critical flows so that the path is used exclusively for its critical flow. However, we show a fundamental trade-off between the two desired properties, and as a result, only one of them can be achieved but not both. Worse yet, we show that failing to achieve either of the two properties makes the RAC defense not just ineffective but nearly unusable. When the newly established detour path is not isolated, a new adaptive adversary can detect it in real time and immediately congest the path, defeating the goals of the RAC defense. Conversely, when the establishment of an arbitrary detour path is not guaranteed, more than 80% of critical flows we test have only a small number (e.g., three or less) of detour paths that can actually be established and disjoint from each other, which significantly restricts the available options for the reliable RAC operation. The first lesson of this study is that BGP-based rerouting solutions in the current inter-domain infrastructure seem to be impractical due to implicit assumptions (e.g., the invisibility of poisoning messages) that are unattainable in BGP’s current practice. Second, we learn that the analysis of protocol specifications alone is insufficient for the feasibility study of any new defense proposal and, thus, additional rigorous security analysis and various network evaluations, including real-world testing, are required. Finally, our findings in this paper agree well with the conclusion of the major literature about end-to-end guarantees; that is, strong end-to-end availability should be a security feature of the Internet routing by design, not an ad hoc feature obtained via exploiting current routing protocols.

Resident Evil: Understanding Residential IP Proxy as a Dark Service

An emerging Internet business is residential proxy (RESIP) as a service, in which a provider utilizes the hosts within residential networks (in contrast to those running in a datacenter) to relay their customers’ traffic, in an attempt to avoid server- side blocking and detection. With the prominent roles the services could play in the underground business world, little has been done to understand whether they are indeed involved in Cybercrimes and how they operate, due to the challenges in identifying their RESIPs, not to mention any in-depth analysis on them.
In this paper, we report the first study on RESIPs, which sheds light on the behaviors and the ecosystem of these elusive gray services. Our research employed an infiltration framework, including our clients for RESIP services and the servers they visited, to detect 6 million RESIP IPs across 230+ countries and 52K+ ISPs. The observed addresses were analyzed and the hosts behind them were further fingerprinted using a new profiling system. Our effort led to several surprising findings about the RESIP services unknown before. Surprisingly, despite the providers’ claim that the proxy hosts are willingly joined, many proxies run on likely compromised hosts including IoT devices. Through cross-matching the hosts we discovered and labeled PUP (potentially unwanted programs) logs provided by a leading IT company, we uncovered various illicit operations RESIP hosts performed, including illegal promotion, Fast fluxing, phishing, malware hosting, and others. We also reverse engi- neered RESIP services’ internal infrastructures, uncovered their potential rebranding and reselling behaviors. Our research takes the first step toward understanding this new Internet service, contributing to the effective control of their security risks.

Session 15: Web and Cloud Security

Why Does Your Data Leak? Uncovering the Data Leakage in Cloud from Mobile Apps

Increasingly, more and more mobile applications (apps for short) are using the cloud as the back-end, in particular the cloud APIs, for data storage, data analytics, message notification, and monitoring. Unfortunately, we have recently witnessed massive data leaks from the cloud, ranging from personally identifiable information to corporate secrets. In this paper, we seek to understand why such significant leaks occur and design tools to automatically identify them. To our surprise, our study reveals that lack of authentication, misuse of various keys (e.g., normal user keys and superuser keys) in authentication, or misconfiguration of user permissions in authorization are the root causes. Then, we design a set of automated program analysis techniques including obfuscation-resilient cloud API identification and string value analysis, and implement them in a tool called LeakScope to identify the potential data leakage vulnerabilities from mobile apps based on how the cloud APIs are used. Our evaluation with over 1.6 million mobile apps from the Google Play Store has uncovered 15, 098 app servers managed by mainstream cloud providers such as Amazon, Google, and Microsoft that are subject to data leakage attacks. We have made responsible disclosure to each of the cloud service providers, and they have all confirmed the vulnerabilities we have identified and are actively working with the mobile app developers to patch their vulnerable services.

Measuring and Analyzing Search Engine Poisoning of Linguistic Collisions

Misspelled keywords have become an appealing target in search poisoning, since they are less competitive to promote than the correct queries and account for a considerable amount of search traffic. Search engines have adopted several countermeasure strategies, e.g., Google applies automated corrections on queried keywords and returns search results of the corrected versions directly. However, a sophisticated class of attack, which we term as linguistic-collision misspelling, can evade auto-correction and poison search results. Cybercriminals target special queries where the misspelled terms are existent words, even in other languages (e.g., “idobe”, a misspelling of the English word “adobe”, is a legitimate word in the Nigerian language).

In this paper, we perform the first large-scale analysis on linguistic-collision search poisoning attacks. In particular, we check 1.77 million misspelled search terms on Google and Baidu and analyze both English and Chinese languages, which are the top two languages used by Internet users. We leverage edit distance operations and linguistic properties to generate misspelling candidates. To more efficiently identify linguistic-collision search terms, we design a deep learning model that can improve collection rate by 2.84x compared to random sampling. Our results show that the abuse is prevalent: around 1.19% of linguistic-collision search terms on Google and Baidu have results on the first page directing to malicious websites. We also find that cybercriminals mainly target categories of gambling, drugs, and adult content. Mobile-device users disproportionately search for misspelled keywords, presumably due to small screen for input. Our work highlights this new class of search engine poisoning and provides insights to help mitigate the threat.

How Well Do My Results Generalize? Comparing Security and Privacy Survey Results from MTurk, Web, and Telephone Samples

Security and privacy researchers often rely on data collected from Amazon Mechanical Turk (MTurk) to evaluate security tools, to understand users’ privacy preferences and to measure online behavior. Yet, little is known about how well Turkers’ survey responses and performance on security- and privacy-related tasks generalizes to a broader population. This paper takes a first step toward understanding the generalizability of security and privacy user studies by comparing users’ self-reports of their security and privacy knowledge, past experiences, advice sources, and behavior across samples collected using MTurk (n=480), a census-representative web-panel (n=428), and a probabilistic telephone sample (n=3,000) statistically weighted to be accurate within 2.7% of the true prevalence in the U.S.

Surprisingly, the results suggest that: (1) MTurk responses regarding security and privacy experiences, advice sources, and knowledge are more representative of the U.S. population than are responses from the census-representative panel; (2) MTurk and general population reports of security and privacy experiences, knowledge, and advice sources are quite similar for respondents who are younger than 50 or who have some college education; and (3) respondents’ answers to the survey questions we ask are stable over time and robust to relevant, broadly-reported news events. Further, differences in responses cannot be ameliorated with simple demographic weighting, possibly because MTurk and panel participants have more internet experience compared to their demographic peers. Together, these findings lend tempered support for the generalizability of prior crowdsourced security and privacy user studies; provide context to more accurately interpret the results of such studies; and suggest rich directions for future work to mitigate experience- rather than demographic-related sample biases.

PhishFarm: A Scalable Framework for Measuring the Effectiveness of Evasion Techniques Against Browser Phishing Blacklists

Phishing attacks have reached record volumes in recent years. Simultaneously, modern phishing websites are growing in sophistication by employing diverse cloaking techniques to avoid detection by security infrastructure. In this paper, we present PhishFarm: a scalable framework for methodically testing the resilience of anti-phishing entities and browser blacklists to attackers’ evasion efforts. We use PhishFarm to deploy 2,380 live phishing sites (on new, unique, and previously-unseen .com domains) each using one of six different HTTP request filters based on real phishing kits. We reported subsets of these sites to 10 distinct anti-phishing entities and measured both the occurrence and timeliness of native blacklisting in major web browsers to gauge the effectiveness of protection ultimately extended to victim users and organizations. Our experiments revealed shortcomings in current infrastructure, which allows some phishing sites to go unnoticed by the security community while remaining accessible to victims. We found that simple cloaking techniques representative of real-world attacks— including those based on geolocation, device type, or JavaScript— were effective in reducing the likelihood of blacklisting by over 55% on average. We also discovered that blacklisting did not function as intended in popular mobile browsers (Chrome, Safari, and Firefox), which left users of these browsers particularly vulnerable to phishing attacks. Following disclosure of our findings, anti-phishing entities are now better able to detect and mitigate several cloaking techniques (including those that target mobile users), and blacklisting has also become more consistent between desktop and mobile platforms— but work remains to be done by anti-phishing entities to ensure users are adequately protected. Our PhishFarm framework is designed for continuous monitoring of the ecosystem and can be extended to test future state-of-the-art evasion techniques used by malicious websites.

USENIX Security

2018 Technical Sessions(有视频音频和PPT) dblp

Censorship and Web Privacy

Fp-Scanner: The Privacy Implications of Browser Fingerprint Inconsistencies

By exploiting the diversity of device and browser configurations, browser fingerprinting established itself as a viable technique to enable stateless user tracking in production. Companies and academic communities have responded with a wide range of countermeasures. However, the way these countermeasures are evaluated does not properly assess their impact on user privacy, in particular regarding the quantity of information they may indirectly leak by revealing their presence.

In this paper, we investigate the current state of the art of browser fingerprinting countermeasures to study the inconsistencies they may introduce in altered finger- prints, and how this may impact user privacy. To do so, we introduce FP-Scanner as a new test suite that explores browser fingerprint inconsistencies to detect potential alterations, and we show that we are capable of detecting countermeasures from the inconsistencies they introduce. Beyond spotting altered browser fingerprints, we demonstrate that FP-Scanner can also reveal the original value of altered fingerprint attributes, such as the browser or the operating system. We believe that this result can be exploited by fingerprinters to more accurately target browsers with countermeasures.

Nowadays, cookies are the most prominent mechanism to identify and authenticate users on the Internet. Although protected by the Same Origin Policy, popular browsers include cookies in all requests, even when these are cross-site. Unfortunately, these third-party cookies enable both cross-site attacks and third-party tracking. As a response to these nefarious consequences, various countermeasures have been developed in the form of browser extensions or even protection mechanisms that are built directly into the browser.

In this paper, we evaluate the effectiveness of these defense mechanisms by leveraging a framework that automatically evaluates the enforcement of the policies imposed to third-party requests. By applying our framework, which generates a comprehensive set of test cases covering various web mechanisms, we identify several flaws in the policy implementations of the 7 browsers and 46 browser extensions that were evaluated. We find that even built-in protection mechanisms can be circumvented by multiple novel techniques we discover. Based on these results, we argue that our proposed framework is a much-needed tool to detect bypasses and evaluate solutions to the exposed leaks. Finally, we analyze the origin of the identified bypass techniques, and find that these are due to a variety of implementation, configuration and design flaws.

Effective Detection of Multimedia Protocol Tunneling using Machine Learning

Multimedia protocol tunneling enables the creation of covert channels by modulating data into the input of popular multimedia applications such as Skype. To be effective, protocol tunneling must be unobservable, i.e., an adversary should not be able to distinguish the streams that carry a covert channel from those that do not. However, existing multimedia protocol tunneling systems have been evaluated using ad hoc methods, which casts doubts on whether such systems are indeed secure, for instance, for censorship-resistant communication.

In this paper, we conduct an experimental study of the unobservability properties of three state of the art systems: Facet, CovertCast, and DeltaShaper. Our work unveils that previous claims regarding the unobservability of the covert channels produced by those tools were flawed and that existing machine learning techniques, namely those based on decision trees, can uncover the vast majority of those channels while incurring in comparatively lower false positive rates. We also explore the application of semi-supervised and unsupervised machine learning techniques. Our findings suggest that the existence of manually labeled samples is a requirement for the successful detection of covert channels.

Quack: Scalable Remote Measurement of Application-Layer Censorship

Remote censorship measurement tools can now detect DNS- and IP-based blocking at global scale. However, a major unmonitored form of interference is blocking triggered by deep packet inspection of application-layer data. We close this gap by introducing Quack, a scalable, remote measurement system that can efficiently detect application-layer interference.

We show that Quack can effectively detect application- layer blocking triggered on HTTP and TLS headers, and it is flexible enough to support many other diverse protocols. In experiments, we test for blocking across 4458 autonomous systems, an order of magnitude larger than provided by country probes used by OONI. We also test a corpus of 100,000 keywords from vantage points in 40 countries to produce detailed national blocklists. Finally, we analyze the keywords we find blocked to provide in- sight into the application-layer blocking ecosystem and compare countries’ behavior. We find that the most consistently blocked services are related to circumvention tools, pornography, and gambling, but that there is significant country-to-country variation.

Understanding How Humans Authenticate

Better managed than memorized? Studying the Impact of Managers on Password Strength and Reuse

Despite their well-known security problems, passwords are still the incumbent authentication method for virtually all online services. To remedy the situation, users are very often referred to password managers as a solution to the password reuse and weakness problems. However, to date the actual impact of password managers on password strength and reuse has not been studied systematically.

We provide the first large-scale study of the password managers’ influence on users’ real-life passwords. By combining qualitative data on users’ password creation and management strategies, collected from 476 participants of an online survey, with quantitative data (incl. password metrics and entry methods) collected in situ with a browser plugin from 170 users, we were able to gain a more complete picture of the factors that influence our participants’ password strength and reuse. Our approach allows us to quantify for the first time that password managers indeed influence the password security, however, whether this influence is beneficial or aggravating existing problems depends on the users’ strategies and how well the manager supports the users’ password management right from the time of password creation. Given our results, we think research should further investigate how managers can better support users’ password strategies in order to improve password security as well as stop aggravating the existing problems.

Forgetting of Passwords: Ecological Theory and Data

It is well known that text-based passwords are hard to remember and that users prefer simple (and non-secure) passwords. However, despite extensive research on the topic, no principled account exists for explaining when a password will be forgotten. This paper contributes new data and a set of analyses building on the ecological theory of memory and forgetting. We propose that human memory naturally adapts according to an estimate of how often a password will be needed, such that often used, important passwords are less likely to be forgotten. We derive models for login duration and odds of recall as a function of rate of use and number of uses thus far. The models achieved a root-mean-square error (RMSE) of 1.8 seconds for login duration and 0.09 for recall odds for data collected in a month-long field experiment where frequency of password use was controlled. The theory and data shed new light on password management, account usage, password security and memorability.

The Rewards and Costs of Stronger Passwords in a University: Linking Password Lifetime to Strength

We present an opportunistic study of the impact of a new password policy in a university with 100,000 staff and students. The goal of the IT staff who conceived the policy was to encourage stronger passwords by varying password lifetime according to password strength. Strength was measured through Shannon entropy (acknowledged to be a poor measure of password strength by the academic community, but still widely used in practice). When users change their password, a password meter informs them of the lifetime of their new password, which may vary from 100 days (50 bits of entropy) to 350 days (120 bits of entropy).

We analysed data of nearly 200,000 password changes and 115,000 resets of passwords that were forgotten/expired over a period of 14 months. The new policy took over 100 days to gain traction, but after that, average entropy rose steadily. After another 12 months, the average password lifetime increased from 146 days (63 bits) to 170 days (70 bits).

We also found that passwords with more than 300 days of lifetime are 4 times as likely to be reset as passwords of 100 days of lifetime. Users who reset their password more than once per year (27% of users) choose passwords with over 10 days fewer lifetime, and while they also respond to the policy, maintain this deficit.

We conclude that linking password lifetime to strength at the point of password creation is a viable strategy for encouraging users to choose stronger passwords (at least when measured by Shannon entropy).

Rethinking Access Control and Authentication for the Home Internet of Things (IoT)

Computing is transitioning from single-user devices to the Internet of Things (IoT), in which multiple users with complex social relationships interact with a single device. Currently deployed techniques fail to provide usable access-control specification or authentication in such settings. In this paper, we begin reenvisioning access control and authentication for the home IoT. We propose that access control focus on IoT capabilities (i.e., certain actions that devices can perform), rather than on a per-device granularity. In a 425-participant online user study, we find stark differences in participants’ desired access-control policies for different capabilities within a single device, as well as based on who is trying to use that capability. From these desired policies, we identify likely candidates for default policies. We also pinpoint necessary primitives for specifying more complex, yet desired, access-control policies. These primitives range from the time of day to the current location of users. Finally, we discuss the degree to which different authentication methods potentially support desired policies.

Vulnerability Discovery

ATtention Spanned: Comprehensive Vulnerability Analysis of AT Commands Within the Android Ecosystem

AT commands, originally designed in the early 80s for controlling modems, are still in use in most modern smartphones to support telephony functions. The role of AT commands in these devices has vastly expanded through vendor-specific customizations, yet the extent of their functionality is unclear and poorly documented. In this paper, we systematically retrieve and extract 3,500 AT commands from over 2,000 Android smartphone firmware images across 11 vendors. We methodically test our corpus of AT commands against eight Android devices from four different vendors through their USB interface and characterize the powerful functionality exposed, including the ability to rewrite device firmware, bypass Android security mechanisms, exfiltrate sensitive device information, perform screen unlocks, and inject touch events solely through the use of AT commands. We demonstrate that the AT command interface contains an alarming amount of unconstrained functionality and represents a broad attack surface on Android devices.

Charm: Facilitating Dynamic Analysis of Device Drivers of Mobile Systems

Mobile systems, such as smartphones and tablets, incorporate a diverse set of I/O devices, such as camera, audio devices, GPU, and sensors. This in turn results in a large number of diverse and customized device drivers running in the operating system kernel of mobile systems. These device drivers contain various bugs and vulnerabilities, making them a top target for kernel exploits [78]. Unfortunately, security analysts face important challenges in analyzing these device drivers in order to find, understand, and patch vulnerabilities. More specifically, using the state-of-the-art dynamic analysis techniques such as interactive debugging, fuzzing, and record-and-replay for analysis of these drivers is difficult, inefficient, or even completely inaccessible depending on the analysis.

In this paper, we present Charm, a system solution that facilitates dynamic analysis of device drivers of mobile systems. Charm’s key technique is remote device driver execution, which enables the device driver to execute in a virtual machine on a workstation. Charm makes this possible by using the actual mobile system only for servicing the low-level and infrequent I/O operations through a low-latency and customized USB channel. Charm does not require any specialized hardware and is immediately available to analysts. We show that it is feasible to apply Charm to various device drivers, including camera, audio, GPU, and IMU sensor drivers, in different mobile systems, including LG Nexus 5X, Huawei Nexus 6P, and Samsung Galaxy S7. In an extensive evaluation, we show that Charm enhances the usability of fuzzing of device drivers, enables record-and-replay of driver’s execution, and facilitates detailed vulnerability analysis. Altogether, these capabilities have enabled us to find 25 bugs in device drivers, analyze 3 existing ones, and even build an arbitrary-code-execution kernel exploit using one of them.

Inception: System-Wide Security Testing of Real-World Embedded Systems Software

Connected embedded systems are becoming widely deployed, and their security is a serious concern. Current techniques for security testing of embedded software rely either on source code or on binaries. Detecting vulnerabilities by testing binary code is harder, because source code semantics are lost. Unfortunately, in embedded systems, high-level source code (C/C++) is often mixed with hand-written assembly, which cannot be directly handled by current source-based tools.

In this paper we introduce Inception, a framework to perform security testing of complete real-world embedded firmware. Inception introduces novel techniques for symbolic execution in embedded systems. In particular, Inception Translator generates and merges LLVM bitcode from high-level source code, hand-written assembly, binary libraries, and part of the processor hardware behavior. This design reduces differences with real execution as well as the manual effort. The source code semantics are preserved, improving the effectiveness of security checks. Inception Symbolic Virtual Machine, based on KLEE, performs symbolic execution, using several strategies to handle different levels of memory abstractions, interaction with peripherals, and interrupts. Finally, the Inception Debugger is a high-performance JTAG debugger which performs redirection of memory accesses to the real hardware.

We first validate our implementation using 53000 tests comparing Inception’s execution to concrete execution on an Arm Cortex-M3 chip. We then show Inception’s advantages on a benchmark made of 1624 synthetic vulnerable programs, four real-world open source and industrial applications, and 19 demos. We discovered eight crashes and two previously unknown vulnerabilities, demonstrating the effectiveness of Inception as a tool to assist embedded device firmware testing.

Acquisitional Rule-based Engine for Discovering Internet-of-Things Devices

The rapidly increasing landscape of Internet-of-Thing (IoT) devices has introduced significant technical challenges for their management and security, as these IoT devices in the wild are from different device types, vendors, and product models. The discovery of IoT devices is the pre-requisite to characterize, monitor, and protect these devices. However, manual device annotation impedes a large-scale discovery, and the device classification based on machine learning requires large training data with labels. Therefore, automatic device discovery and annotation in large-scale remains an open problem in IoT. In this paper, we propose an Acquisitional Rule-based Engine (ARE), which can automatically generate rules for discovering and annotating IoT devices without any training data. ARE builds device rules by leveraging application-layer response data from IoT devices and product descriptions in relevant websites for device annotations. We define a transaction as a mapping between a unique response to a product description. To collect the transaction set, ARE extracts relevant terms in the response data as the search queries for crawling websites. ARE uses the association algorithm to generate rules of IoT device annotations in the form of (type, vendor, and product). We conduct experiments and three applications to validate the effectiveness of ARE.

Web Applications

A Sense of Time for JavaScript and Node.js: First-Class Timeouts as a Cure for Event Handler Poisoning

The software development community is adopting the Event-Driven Architecture (EDA) to provide scalable web services, most prominently through Node.js. Though the EDA scales well, it comes with an inherent risk: the Event Handler Poisoning (EHP) Denial of Service attack. When an EDA-based server multiplexes many clients onto few threads, a blocked thread (EHP) renders the server unresponsive. EHP attacks are a serious threat, with hundreds of vulnerabilities already reported in the wild.

We make three contributions against EHP attacks. First, we describe EHP attacks, and show that they are a common form of vulnerability in the largest EDA community, the Node.js ecosystem. Second, we design a defense against EHP attacks, First-Class Timeouts, which incorporates timeouts at the EDA framework level. Our Node.cure prototype defends Node.js applications against all known EHP attacks with overheads between 0% and 24% on real applications. Third, we promote EHP awareness in the Node.js community. We analyzed Node.js for vulnerable APIs and documented or corrected them, and our guide on avoiding EHP attacks is available on nodejs.org.

Freezing the Web: A Study of ReDoS Vulnerabilities in JavaScript-based Web Servers

Regular expression denial of service (ReDoS) is a class of algorithmic complexity attacks where matching a regular expression against an attacker-provided input takes unexpectedly long. The single-threaded execution model of JavaScript makes JavaScript-based web servers particularly susceptible to ReDoS attacks. Despite this risk and the increasing popularity of the server-side Node.js platform, there is currently little reported knowledge about the severity of the ReDoS problem in practice. This paper presents a large-scale study of ReDoS vulnerabilities in real-world web sites. Underlying our study is a novel methodology for analyzing the exploitability of deployed servers. The basic idea is to search for previously unknown vulnerabilities in popular libraries, hypothesize how these libraries may be used by servers, and to then craft targeted exploits. In the course of the study, we identify 25 previously unknown vulnerabilities in popular modules and test 2,846 of the most popular websites against them. We find that 339 of these web sites suffer from at least one ReDoS vulnerability. Since a single request can block a vulnerable site for several seconds, and sometimes even much longer, ReDoS poses a serious threat to the availability of these sites. Our results are a call-to-arms for developing techniques to detect and mitigate ReDoS vulnerabilities in JavaScript.

(官方没给,来自论文摘要)Modern multi-tier web applications are composed of several dynamic features, which make their vulnerability analysis challenging from a purely static analysis perspective. We describe an approach that overcomes the challenges posed by the dynamic nature of web applications. Our approach combines dynamic analysis that is guided by static analysis techniques in order to automatically identify vulnerabilities and build working exploits. Our approach is implemented and evaluated in NAVEX, a tool that can scale the process of automatic vulnerability analysis and exploit generation to large applications and to multiple classes of vulnerabilities. In our experiments, we were able to use NAVEX over a codebase of 3.2 million lines of PHP code, and construct 204 exploits in the code that was analyzed.

Rampart: Protecting Web Applications from CPU-Exhaustion Denial-of-Service Attacks

Denial-of-Service (DoS) attacks pose a severe threat to the availability of web applications. Traditionally, attackers have employed botnets or amplification techniques to send a significant amount of requests to exhaust a target web server’s resources, and, consequently, prevent it from responding to legitimate requests. However, more recently, highly sophisticated DoS attacks have emerged, in which a single, carefully crafted request results in significant resource consumption and ties up a web application’s back-end components for a non-negligible amount of time. Unfortunately, these attacks require only few requests to overwhelm an application, which makes them difficult to detect by state-of-the-art detection systems.

In this paper, we present Rampart, which is a defense that protects web applications from sophisticated CPU-exhaustion DoS attacks. Rampart detects and stops sophisticated CPU-exhaustion DoS attacks using statistical methods and function-level program profiling. Furthermore, it synthesizes and deploys filters to block subsequent attacks, and it adaptively updates them to minimize any potentially negative impact on legitimate users.

We implemented Rampart as an extension to the PHP Zend engine. Rampart has negligible performance overhead and it can be deployed for any PHP application without having to modify the application’s source code. To evaluate Rampart’s effectiveness and efficiency, we demonstrate that it protects two of the most popular web applications, WordPress and Drupal, from real-world and synthetic CPU-exhaustion DoS attacks, and we also show that Rampart preserves web server performance with low false positive rate and low false negative rate.


How Do Tor Users Interact With Onion Services?

Onion services are anonymous network services that are exposed over the Tor network. In contrast to conventional Internet services, onion services are private, generally not indexed by search engines, and use self-certifying domain names that are long and difficult for humans to read. In this paper, we study how people perceive, understand, and use onion services based on data from 17 semi-structured interviews and an online survey of 517 users. We find that users have an incomplete mental model of onion services, use these services for anonymity, and have vary- ing trust in onion services in general. Users also have difficulty discovering and tracking onion sites and authenticating them. Finally, users want technical improvements to onion services and better information on how to use them. Our findings suggest various improvements for the security and usability of Tor onion services, including ways to automatically detect phishing of onion services, clearer security indicators, and better ways to manage onion domain names that are difficult to remember.

Towards Predicting Efficient and Anonymous Tor Circuits

The Tor anonymity system provides online privacy for millions of users, but it is slower than typical web browsing. To improve Tor performance, we propose PredicTor, a path selection technique that uses a Random Forest classifier trained on recent measurements of Tor to predict the performance of a proposed path. If the path is predicted to be fast, then the client builds a circuit using those relays. We implemented PredicTor in the Tor source code and show through live Tor experiments and Shadow simulations that PredicTor improves Tor network performance by 11% to 23% compared to Vanilla Tor and by 7% to 13% compared to the previous state-of-the-art scheme. Our experiments show that PredicTor is the first path selection algorithm to dynamically avoid highly congested nodes during times of high congestion and avoid long-distance paths during times of low congestion. We evaluate the anonymity of PredicTor using standard entropy-based and time-to-first-compromise metrics, but these cannot capture the possibility of leakage due to the use of location in path selection. To better address this, we propose a new anonymity metric called CLASI: Client Autonomous System Inference. CLASI is the first anonymity metric in Tor that measures an adversary’s ability to infer client Autonomous Systems (ASes) by fingerprinting circuits at the network, country, and relay level. We find that CLASI shows anonymity loss for location-aware path selection algorithms, where entropy-based metrics show little to no loss of anonymity. Additionally, CLASI indicates that PredicTor has similar sender AS leakage compared to the current Tor path selection algorithm due to PredicTor building circuits that are independent of client location.

BurnBox: Self-Revocable Encryption in a World Of Compelled Access

Dissidents, journalists, and others require technical means to protect their privacy in the face of compelled access to their digital devices (smartphones, laptops, tablets, etc.). For example, authorities increasingly force disclosure of all secrets, including passwords, to search devices upon national border crossings. We therefore present the design, implementation, and evaluation of a new system to help victims of compelled searches. Our system, called BurnBox, provides self-revocable encryption: the user can temporarily disable their access to specific files stored remotely, without revealing which files were revoked during compelled searches, even if the adversary also compromises the cloud storage service. They can later restore access. We formalize the threat model and provide a construction that uses an erasable index, secure erasure of keys, and standard cryptographic tools in order to provide security supported by our formal analysis. We report on a prototype implementation, which showcases the practicality of BurnBox.

An Empirical Analysis of Anonymity in Zcash

Among the now numerous alternative cryptocurrencies derived from Bitcoin, Zcash is often touted as the one with the strongest anonymity guarantees, due to its basis in well-regarded cryptographic research. In this paper, we examine the extent to which anonymity is achieved in the deployed version of Zcash. We investigate all facets of anonymity in Zcash’s transactions, ranging from its transparent transactions to the interactions with and within its main privacy feature, a shielded pool that acts as the anonymity set for users wishing to spend coins privately. We conclude that while it is possible to use Zcash in a private way, it is also possible to shrink its anonymity set considerably by developing simple heuristics based on identifiable patterns of usage.

Network Defenses

NetHide: Secure and Practical Network Topology Obfuscation

Simple path tracing tools such as traceroute allow malicious users to infer network topologies remotely and use that knowledge to craft advanced denial-of-service (DoS) attacks such as Link-Flooding Attacks (LFAs). Yet, despite the risk, most network operators still allow path tracing as it is an essential network debugging tool.

In this paper, we present NetHide, a network topology obfuscation framework that mitigates LFAs while preserving the practicality of path tracing tools. The key idea behind NetHide is to formulate network obfuscation as a multi-objective optimization problem that allows for a flexible tradeoff between security (encoded as hard constraints) and usability (encoded as soft constraints). While solving this problem exactly is hard, we show that NetHide can obfuscate topologies at scale by only considering a subset of the candidate solutions and without reducing obfuscation quality. In practice, NetHide obfuscates the topology by intercepting and modifying path tracing probes directly in the data plane. We show that this process can be done at line-rate, in a stateless fashion, by leveraging the latest generation of programmable network devices.

We fully implemented NetHide and evaluated it on realistic topologies. Our results show that NetHide is able to obfuscate large topologies (>150 nodes) while preserving near-perfect debugging capabilities. In particular, we show that operators can still precisely trace back >90 % of link failures despite obfuscation.

Towards a Secure Zero-rating Framework with Three Parties

Zero-rating services provide users with free access to contracted or affiliated Content Providers (CPs), but also incur new types of free-riding attacks. Specifically, a malicious user can masquerade a zero-rating CP or alter an existing zero-rating communication to evade charges enforced by the Internet Service Provider (ISP). According to our study, major commercial ISPs, such as T-Mobile, China Mobile, and airport WiFi, are all vulnerable to such free-riding attacks. In this paper, we propose a secure, backward compatible, zero-rating framework, called ZFree, which only allows network traffic authorized by the correct CP to be zero-rated. We perform a formal security analysis using ProVerif, and the results show that ZFree is secure, i.e., preserving both packet integrity and CP server authenticity. We have implemented an open-source prototype of ZFree available at this repository (https://github.com/zfree2018/ZFREE). A working demo is at this link (http://zfree.org/). Our evaluation shows that ZFree is lightweight, scalable and secure.

Fuzzing and Exploit Generation

MoonShine: Optimizing OS Fuzzer Seed Selection with Trace Distillation

OS fuzzers primarily test the system call interface between the OS kernel and user-level applications for security vulnerabilities. The effectiveness of evolutionary OS fuzzers depends heavily on the quality and diversity of their seed system call sequences. However, generating good seeds for OS fuzzing is a hard problem as the behavior of each system call depends heavily on the OS kernel state created by the previously executed system calls. Therefore, popular evolutionary OS fuzzers often rely on hand-coded rules for generating valid seed sequences of system calls that can bootstrap the fuzzing process. Unfortunately, this approach severely restricts the diversity of the seed system call sequences and therefore limits the effectiveness of the fuzzers.

In this paper, we develop MoonShine, a novel strategy for distilling seeds for OS fuzzers from system call traces of real-world programs while still maintaining the dependencies across the system calls. MoonShine leverages light-weight static analysis for efficiently detecting dependencies across different system calls.

We designed and implemented MoonShine as an extension to Syzkaller, a state-of-the-art evolutionary fuzzer for the Linux kernel. Starting from traces containing 2.8 million system calls gathered from 3,220 real-world programs, MoonShine distilled down to just over 14,000 calls while preserving 86% of the original code coverage. Using these distilled seed system call sequences, MoonShine was able to improve Syzkaller’s achieved code coverage for the Linux kernel by 13% on average. MoonShine also found 14 new vulnerabilities in the Linux kernel that were not found by Syzkaller.

QSYM : A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing

Recently, hybrid fuzzing has been proposed to address the limitations of fuzzing and concolic execution by combining both approaches. The hybrid approach has shown its effectiveness in various synthetic benchmarks such as DARPA Cyber Grand Challenge (CGC) binaries, but it still suffers from scaling to find bugs in complex, real-world software. We observed that the performance bottleneck of the existing concolic executor is the main limiting factor for its adoption beyond a small-scale study.

To overcome this problem, we design a fast concolic execution engine, called QSYM, to support hybrid fuzzing. The key idea is to tightly integrate the symbolic emulation with the native execution using dynamic binary translation, making it possible to implement more fine-grained, so faster, instruction-level symbolic emulation. Additionally, QSYM loosens the strict soundness requirements of conventional concolic executors for better performance, yet takes advantage of a faster fuzzer for validation, providing unprecedented opportunities for performance optimizations, e.g., optimistically solving constraints and pruning uninteresting basic blocks.

Our evaluation shows that QSYM does not just outperform state-of-the-art fuzzers (i.e., found 14× more bugs than VUzzer in the LAVA-M dataset, and outperformed Driller in 104 binaries out of 126), but also found 13 previously unknown security bugs in eight real-world programs like Dropbox Lepton, ffmpeg, and OpenJPEG, which have already been intensively tested by the state-of-the-art fuzzers, AFL and OSS-Fuzz.

Automatic Heap Layout Manipulation for Exploitation

Heap layout manipulation is integral to exploiting heap-based memory corruption vulnerabilities. In this paper we present the first automatic approach to the problem, based on pseudo-random black-box search. Our approach searches for the inputs required to place the source of a heap-based buffer overflow or underflow next to heap-allocated objects that an exploit developer, or automatic exploit generation system, wishes to read or corrupt. We present a framework for benchmarking heap layout manipulation algorithms, and use it to evaluate our approach on several real-world allocators, showing that pseudo-random black box search can be highly effective. We then present SHRIKE, a novel system that can perform automatic heap layout manipulation on the PHP interpreter and can be used in the construction of control-flow hijacking exploits. Starting from PHP’s regression tests, SHRIKE discovers fragments of PHP code that interact with the interpreter’s heap in useful ways, such as making allocations and deallocations of particular sizes, or allocating objects containing sensitive data, such as pointers. SHRIKE then uses our search algorithm to piece together these fragments into programs, searching for one that achieves a desired heap layout. SHRIKE allows an exploit developer to focus on the higher level concepts in an exploit, and to defer the resolution of heap layout constraints to SHRIKE. We demonstrate this by using SHRIKE in the construction of a control-flow hijacking exploit for the PHP interpreter.

FUZE: Towards Facilitating Exploit Generation for Kernel Use-After-Free Vulnerabilities

Software vendors usually prioritize their bug remediation based on ease of their exploitation. However, accurately determining exploitability typically takes tremendous hours and requires significant manual efforts. To address this issue, automated exploit generation techniques can be adopted. In practice, they however exhibit an insufficient ability to evaluate exploitability particularly for the kernel Use-After-Free (UAF) vulnerabilities. This is mainly because of the complexity of UAF exploitation as well as the scalability of an OS kernel.

In this paper, we therefore propose FUZE, a new framework to facilitate the process of kernel UAF exploitation. The design principle behind this technique is that we expect the ease of crafting an exploit could augment a security analyst with the ability to expedite exploitability evaluation. Technically, FUZE utilizes kernel fuzzing along with symbolic execution to identify, analyze and evaluate the system calls valuable and useful for kernel UAF exploitation.

To demonstrate the utility of FUZE, we implement FUZE on a 64-bit Linux system by extending a binary analysis framework and a kernel fuzzer. Using 15 real-world kernel UAF vulnerabilities on Linux systems, we then demonstrate FUZE could not only escalate kernel UAF exploitability and but also diversify working exploits. In addition, we show that FUZE could facilitate security mitigation bypassing, making exploitability evaluation less labor-intensive and more efficient.


The Secure Socket API: TLS as an Operating System Service

SSL/TLS libraries are notoriously hard for developers to use, leaving system administrators at the mercy of buggy and vulnerable applications. We explore the use of the standard POSIX socket API as a vehicle for a simplified TLS API, while also giving administrators the ability to control applications and tailor TLS configuration to their needs. We first assess OpenSSL and its uses in open source software, recommending how this functionality should be accommodated within the POSIX API. We then propose the Secure Socket API (SSA), a minimalist TLS API built using existing network functions and find that it can be employed by existing network applications by modifications requiring as little as one line of code. We next describe a prototype SSA implementation that leverages network system calls to provide privilege separation and support for other programming languages. We end with a discussion of the benefits and limitations of the SSA and our accompanying implementation, noting avenues for future work.

Return Of Bleichenbacher’s Oracle Threat (ROBOT)

In 1998 Bleichenbacher presented an adaptive chosen-ciphertext attack on the RSA PKCS~#1~v1.5 padding scheme. The attack exploits the availability of a server which responds with different messages based on the ciphertext validity. This server is used as an oracle and allows the attacker to decrypt RSA ciphertexts. Given the importance of this attack, countermeasures were defined in TLS and other cryptographic standards using RSA PKCS~#1~v1.5.

We perform the first large-scale evaluation of Bleichenbacher’s RSA vulnerability. We show that this vulnerability is still very prevalent in the Internet and affected almost a third of the top 100 domains in the Alexa Top 1 Million list, including Facebook and Paypal.

We identified vulnerable products from nine different vendors and open source projects, among them F5, Citrix, Radware, Palo Alto Networks, IBM, and Cisco. These implementations provide novel side-channels for constructing Bleichenbacher oracles: TCP resets, TCP timeouts, or duplicated alert messages. In order to prove the importance of this attack, we have demonstrated practical exploitation by signing a message with the private key of \texttt{facebook.com}’s HTTPS certificate. Finally, we discuss countermeasures against Bleichenbacher attacks in TLS and recommend to deprecate the RSA encryption key exchange in TLS and the RSA PKCS~#1~v1.5 standard.

Bamboozling Certificate Authorities with BGP

The Public Key Infrastructure (PKI) protects users from malicious man-in-the-middle attacks by having trusted Certificate Authorities (CAs) vouch for the domain names of servers on the Internet through digitally signed certificates. Ironically, the mechanism CAs use to issue certificates is itself vulnerable to man-in-the-middle attacks by network-level adversaries. Autonomous Systems (ASes) can exploit vulnerabilities in the Border Gateway Protocol (BGP) to hijack traffic destined to a victim’s domain. In this paper, we rigorously analyze attacks that an adversary can use to obtain a bogus certificate. We perform the first real-world demonstration of BGP attacks to obtain bogus certificates from top CAs in an ethical manner. To assess the vulnerability of the PKI, we collect a dataset of 1.8 million certificates and find that an adversary would be capable of gaining a bogus certificate for the vast majority of domains. Finally, we propose and evaluate two countermeasures to secure the PKI: 1) CAs verifying domains from multiple vantage points to make it harder to launch a successful attack, and 2) a BGP monitoring system for CAs to detect suspicious BGP routes and delay certificate issuance to give network operators time to react to BGP attacks.

The Broken Shield: Measuring Revocation Effectiveness in the Windows Code-Signing PKI

Recent measurement studies have highlighted security threats against the code-signing public key infrastructure (PKI), such as certificates that had been compromised or issued directly to the malware authors. The primary mechanism for mitigating these threats is to revoke the abusive certificates. However, the distributed yet closed nature of the code signing PKI makes it difficult to evaluate the effectiveness of revocations in this ecosystem. In consequence, the magnitude of signed malware threat is not fully understood.

In this paper, we collect seven datasets, including the largest corpus of code-signing certificates, and we combine them to analyze the revocation process from end to end. Effective revocations rely on three roles: (1) discovering the abusive certificates, (2) revoking the certificates effectively, and (3) disseminating the revocation information for clients. We assess the challenge for discovering compromised certificates and the subsequent revocation delays. We show that erroneously setting revocation dates causes signed malware to remain valid even after the certificate has been revoked. We also report failures in disseminating the revocations, leading clients to continue trusting the revoked certificates.

Vulnerability Mitigations

Debloating Software through Piece-Wise Compilation and Loading

Programs are bloated. Our study shows that only 5% of libc is used on average across the Ubuntu Desktop envi- ronment (2016 programs); the heaviest user, vlc media player, only needed 18%. In this paper: (1) We present a debloating framework built on a compiler toolchain that can successfully de- bloat programs (shared/static libraries and executables). Our solution can successfully compile and load most li- braries on Ubuntu Desktop 16.04. (2) We demonstrate the elimination of over 79% of code from coreutils and 86% of code from SPEC CPU 2006 benchmark pro- grams without affecting functionality. We show that even complex programs such as Firefox and curl can be debloated without a need to recompile. (3) We demon- strate the security impact of debloating by eliminating over 71% of reusable code gadgets from the coreutils suite, and show that unused code that contains real-world vulnerabilities can also be successfully eliminated with- out adverse effects on the program. (4) We incur a low load time overhead.

Precise and Accurate Patch Presence Test for Binaries

Patching is the main resort to battle software vulnerabilities. It is critical to ensure that patches are propagated to all affected software timely, which, unfortunately, is often not the case. Thus the capability to accurately test the security patch presence in software distributions is crucial, for both defenders and attackers.

Inspired by human analysts’ behaviors to inspect only small and localized code areas, we present FIBER, an automated system that leverages this observation in its core design. FIBER works by first parsing and analyzing the open-source security patches carefully and then generating fine-grained binary signatures that faithfully reflect the most representative syntax and semantic changes introduced by the patch, which are used to search against target binaries. Compared to previous work, FIBER leverages the source-level insight strategically by primarily focusing on small changes of patches and minimal contexts, instead of the whole function or file. We have systematically evaluated FIBER using 107 real-world security patches and 8 Android kernel images from 3 different mainstream vendors, the results show that FIBER can achieve an average accuracy of 94% with no false positives.

From Patching Delays to Infection Symptoms: Using Risk Profiles for an Early Discovery of Vulnerabilities Exploited in the Wild

At any given time there exist a large number of software vulnerabilities in our computing systems, but only a fraction of them are ultimately exploited in the wild. Advanced knowledge of which vulnerabilities are being or likely to be exploited would allow system administrators to prioritize patch deployments, enterprises to assess their security risk more precisely, and security companies to develop intrusion protection for those vulnerabilities. In this paper, we present a novel method based on the notion of community detection for early discovery of vulnerability exploits. Specifically, on one hand, we use symptomatic botnet data (in the form of a set of spam blacklists) to discover a community structure which reveals how similar Internet entities behave in terms of their malicious activities. On the other hand, we analyze the risk behavior of end-hosts through a set of patch deployment measurements that allow us to assess their risk to different vulnerabilities. The latter is then compared to the former to quantify whether the underlying risks are consistent with the observed global symptomatic community structure, which then allows us to statistically determine whether a given vulnerability is being actively exploited in the wild. Our results show that by observing up to 10 days’ worth of data, we can successfully detect vulnerability exploitation with a true positive rate of 90% and a false positive rate of 10%. Our detection is shown to be much earlier than the standard discovery time records for most vulnerabilities. Experiments also demonstrate that our community based detection algorithm is robust against strategic adversaries.

Understanding the Reproducibility of Crowd-reported Security Vulnerabilities

Today’s software systems are increasingly relying on the “power of the crowd” to identify new security vulnerabilities. And yet, it is not well understood how reproducible the crowd-reported vulnerabilities are. In this paper, we perform the first empirical analysis on a wide range of real-world security vulnerabilities (368 in total) with the goal of quantifying their reproducibility. Following a carefully controlled workflow, we organize a focused group of security analysts to carry out reproduction experiments. With 3600 man-hours spent, we obtain quantitative evidence on the prevalence of missing information in vulnerability reports and the low reproducibility of the vulnerabilities. We find that relying on a single vulnerability report from a popular security forum is generally difficult to succeed due to the incomplete information. By widely crowdsourcing the information gathering, security analysts could increase the reproduction success rate, but still face key challenges to troubleshoot the non-reproducible cases. To further explore solutions, we surveyed hackers, researchers, and engineers who have extensive domain expertise in software security (N=43). Going beyond Internet-scale crowdsourcing, we find that, security professionals heavily rely on manual debugging and speculative guessing to infer the missed information. Our result suggests that there is not only a necessity to overhaul the way a security forum collects vulnerability reports, but also a need for automated mechanisms to collect information commonly missing in a report.


Plug and Prey? Measuring the Commoditization of Cybercrime via Online Anonymous Markets

Researchers have observed the increasing commoditization of cybercrime, that is, the offering of capabilities, services, and resources as commodities by specialized suppliers in the underground economy. Commoditization enables outsourcing, thus lowering entry barriers for aspiring criminals, and potentially driving further growth in cybercrime. While there is evidence in the literature of specific examples of cybercrime commoditization, the overall phenomenon is much less understood. Which parts of cybercrime value chains are successfully commoditized, and which are not? What kind of revenue do criminal business-to-business (B2B) services generate and how fast are they growing?

We use longitudinal data from eight online anonymous marketplaces over six years, from the original Silk Road to AlphaBay, and track the evolution of commoditization on these markets. We develop a conceptual model of the value chain components for dominant criminal business models. We then identify the market supply for these components over time. We find evidence of commoditization in most components, but the outsourcing options are highly restricted and transaction volume is often modest. Cash-out services feature the most listings and generate the largest revenue. Consistent with behavior observed in the context of narcotic sales, we also find a significant amount of revenue in retail cybercrime, i.e., business-to-consumer (B2C) rather than business-to-business. We conservatively estimate the overall revenue for cybercrime commodities on online anonymous markets to be at least US \$15M between 2011-2017. While there is growth, commoditization is a spottier phenomenon than previously assumed.

Reading Thieves’ Cant: Automatically Identifying and Understanding Dark Jargons from Cybercrime Marketplaces

Underground communication is invaluable for understanding cybercrimes. However, it is often obfuscated by the extensive use of dark jargons, innocently-looking terms like “popcorn” that serves sinister purposes (buying/selling drug, seeking crimeware, etc.). Discovery and understanding of these jargons have so far relied on manual effort, which is error-prone and cannot catch up with the fast evolving underground ecosystem. In this paper, we present the first technique, called Cantreader, to automatically detect and understand dark jargon. Our approach employs a neural-network based embedding technique to analyze the semantics of words, detecting those whose contexts in legitimate documents are significantly different from those in underground communication. For this purpose, we enhance the existing word embedding model to support semantic comparison across good and bad corpora, which leads to the detection of dark jargons. To further understand them, our approach utilizes projection learning to identify a jargon’s hypernym that sheds light on its true meaning when used in underground communication. Running Cantreader over one million traces collected from four underground forums, our approach automatically reported 3,462 dark jargons and their hypernyms, including 2,491 never known before. The study further reveals how these jargons are used (by 25% of the traces) and evolve and how they help cybercriminals communicate on legitimate forums.

Schrödinger’s RAT: Profiling the Stakeholders in the Remote Access Trojan Ecosystem

Remote Access Trojans (RATs) are a class of malware that give an attacker direct, interactive access to a victim’s personal computer, allowing the attacker to steal private data stored on the machine, spy on the victim in real-time using the camera and microphone, and interact directly with the victim via a dialog box. RATs have been used for surveillance, information theft, and extortion of victims.

In this work, we report on the attackers and victims for two popular RATs, njRAT and DarkComet. Using the malware repository VirusTotal, we find all instances of these RATs and identify the domain names of the controllers. We then register those domains that have expired and direct them to our measurement infrastructure, allowing us to determine the victims of these campaigns. We investigated several techniques for excluding network scanners and sandbox executions of the malware sample in order to exclude apparent infections that are not real victims of the campaign. Our results show that over 99% of the 828,137 IP addresses that connected to our sinkhole are likely not real victims. We report on the number of victims, how long RAT campaigns remain active, and the geographic relationship between victims and attackers.

The aftermath of a crypto-ransomware attack at a large academic institution

In 2016, a large North American university was subject to a significant crypto-ransomware attack and did not pay the ransom. We conducted a survey with 150 respondents and interviews with 30 affected students, staff, and faculty in the immediate aftermath to understand their experiences during the attack and the recovery process. We provide analysis of the technological, productivity, and personal and social impact of ransomware attacks, including previously unaccounted secondary costs. We suggest strategies for comprehensive cyber-response plans that include human factors, and highlight the importance of communication. We conclude with a Ransomware Process for Organizations diagram summarizing the additional contributing factors beyond those relevant to individual infections.

Web and Network Measurement

We Still Don’t Have Secure Cross-Domain Requests: an Empirical Study of CORS

The default Same Origin Policy essentially restricts access of cross-origin network resources to be “write-only”. However, many web applications require “read’’ access to contents from a different origin, so developers have come up with workarounds, such as JSON-P, to bypass the default Same Origin Policy restriction. Such ad-hoc workarounds leave a number of inherent security issues. CORS (cross-origin resource sharing) is a more disciplined mechanism supported by all web browsers to handle cross-origin network accesses. This paper presents our empirical study about the real-world uses of CORS. We find that the design, implementation, and deployment of CORS are subject to a number of new security issues: 1) CORS relaxes the cross-origin “write’’ privilege in a number of subtle ways that are problematic in practice; 2) CORS brings new forms of risky trust dependencies into web interactions; 3) CORS is generally not well understood by developers, possibly due to its inexpressive policy and its complex and subtle interactions with other web mechanisms, leading to various misconfigurations. Finally, we propose protocol simplifications and clarifications to mitigate the security problems uncovered in our study. Some of our proposals have been adopted by both CORS specification and major browsers.

End-to-End Measurements of Email Spoofing Attacks

Spear phishing has been a persistent threat to users and organizations, and yet email providers still face key challenges to authenticate incoming emails. As a result, attackers can apply spoofing techniques to impersonate a trusted entity to conduct highly deceptive phishing attacks. In this work, we study email spoofing to answer three key questions: (1) How do email providers detect and handle forged emails? (2) Under what conditions can forged emails penetrate the defense to reach user inbox? (3) Once the forged email gets in, how email providers warn users? Is the warning truly effective?

We answer these questions by conducting an end-to-end measurement on 35 popular email providers and examining user reactions to spoofing through a real-world spoofing/phishing test. Our key findings are three folds. First, we observe that most email providers have the necessary protocols to detect spoofing, but still allow forged emails to reach the user inbox (e.g., Yahoo Mail, iCloud, Gmail). Second, once a forged email gets in, most email providers have no warning for users, particularly for mobile email apps. Some providers (e.g., Gmail Inbox) even have misleading UIs that make the forged email look authentic. Third, a few email providers (9/35) have implemented visual security cues on unverified emails. Our phishing experiment shows that security cues have a positive impact on reducing risky user actions, but cannot eliminate the risk. Our study reveals a major miscommunication between email providers and end-users. Improvements at both ends (server-side protocols and UIs) are needed to bridge the gap.

Who Is Answering My Queries: Understanding and Characterizing Interception of the DNS Resolution Path

DNS queries from end users are handled by recursive DNS servers for scalability. For convenience, Internet Service Providers (ISPs) assign recursive servers for their clients automatically when the clients choose the default network settings. But users should also have flexibility to use their preferred recursive servers, like public DNS servers. This kind of trust, however, can be broken by the hidden interception of the DNS resolution path (which we term as DNSIntercept). Specifically, on-path devices could spoof the IP addresses of user-specified DNS servers and intercept the DNS queries surreptitiously, introducing privacy and security issues.

In this paper, we perform a large-scale analysis of on-path DNS interception and shed light on its scope and characteristics. We design novel approaches to detect DNS interception and leverage 148,478 residential and cellular IP addresses around the world for analysis. As a result, we find that 259 of the 3,047 ASes (8.5%) that we inspect exhibit DNS interception behavior, including large providers, such as China Mobile. Moreover, we find that the DNS servers of the ASes which intercept requests may use outdated vulnerable software (deprecated before 2009) and lack security-related functionality, such as handling DNSSEC requests. Our work highlights the issues around on-path DNS interception and provides new insights for addressing such issues.

End-Users Get Maneuvered: Empirical Analysis of Redirection Hijacking in Content Delivery Networks

The success of Content Delivery Networks (CDNs) relies on the mapping system that leverages dynamically generated DNS records to distribute client’s request to a proximal server for achieving optimal content delivery. However, the mapping system is vulnerable to malicious hijacks, as (1) it is very difficult to provide pre-computed DNSSEC signatures for dynamically generated records and (2) even considering DNSSEC enabled, DNSSEC itself is vulnerable to replay attacks. By leveraging crafted but legitimate mapping between end-user and edge server, adversaries can hijack CDN’s request redirection and nullify the benefits offered by CDNs, such as proximal access, load balancing, and DoS protection, while remaining undetectable by existing security practices. In this paper, we investigate the security implications of dynamic mapping that remain understudied in security and CDN community. We perform a characterization of CDN’s service delivery and assess this fundamental vulnerability in DNS-based CDNs in the wild. We demonstrate that DNSSEC is ineffective to address this problem, even with the newly adopted ECDSA that is capable of achieving live signing. We then discuss practical countermeasures against such manipulation.

Attacks on Systems That Learn

With Great Training Comes Great Vulnerability: Practical Attacks against Transfer Learning

Transfer learning is a powerful approach that allows users to quickly build accurate deep-learning (Student) models by “learning” from centralized (Teacher) models pretrained with large datasets, e.g. Google’s InceptionV3. We hypothesize that the centralization of model training increases their vulnerability to misclassification attacks leveraging knowledge of publicly accessible Teacher models. In this paper, we describe our efforts to understand and experimentally validate such attacks in the context of image recognition. We identify techniques that allow attackers to associate Student models with their Teacher counterparts, and launch highly effective misclassification attacks on black-box Student models. We validate this on widely used Teacher models in the wild. Finally, we propose and evaluate multiple approaches for defense, including a neuron-distance technique that successfully defends against these attacks while also obfuscates the link between Teacher and Student models.

When Does Machine Learning FAIL? Generalized Transferability for Evasion and Poisoning Attacks

Recent results suggest that attacks against supervised machine learning systems are quite effective, while defenses are easily bypassed by new attacks. However, the specifications for machine learning systems currently lack precise adversary definitions, and the existing attacks make diverse, potentially unrealistic assumptions about the strength of the adversary who launches them. We propose the FAIL attacker model, which describes the adversary’s knowledge and control along four dimensions. Our model allows us to consider a wide range of weaker adversaries who have limited control and incomplete knowledge of the features, learning algorithms and training instances utilized. To evaluate the utility of the FAIL model, we consider the problem of conducting targeted poisoning attacks in a realistic setting: the crafted poison samples must have clean labels, must be individually and collectively inconspicuous, and must exhibit a generalized form of transferability, defined by the FAIL model. By taking these constraints into account, we design StingRay, a targeted poisoning attack that is practical against 4 machine learning applications, which use 3 different learning algorithms, and can bypass 2 existing defenses. Conversely, we show that a prior evasion attack is less effective under generalized transferability. Such attack evaluations, under the FAIL adversary model, may also suggest promising directions for future defenses.

Web Authentication

Vetting Single Sign-On SDK Implementations via Symbolic Reasoning

Encouraged by the rapid adoption of Single Sign-On (SSO) technology in web services, mainstream identity providers, such as Facebook and Google, have developed Software Development Kits (SDKs) to facilitate the implementation of SSO for 3rd-party application developers. These SDKs have become a critical foundation for web services. Despite its importance, little effort has been devoted to a systematic testing on the implementations of SSO SDKs, especially in the public domain. In this paper, we design and implement S3KVetter (Single-Sign-on SdK Vetter), an automated, efficient testing tool, to check the logical correctness and identify vulnerabilities of SSO SDKs. To demonstrate the efficacy of S3KVetter, we apply it to test ten popular SSO SDKs which enjoy millions of downloads by application developers. Among these carefully engineered SDKs, S3KVetter has surprisingly discovered 7 classes of logic flaws, 4 of which were previously unknown. These vulnerabilities can lead to severe consequences, ranging from the sniffing of user activities to the hijacking of user accounts.

O Single Sign-Off, Where Art Thou? An Empirical Analysis of Single Sign-On Account Hijacking and Session Management on the Web

Single Sign-On (SSO) allows users to effortlessly navigate the Web and obtain a personalized experience without the hassle of creating and managing accounts across different services. Due to its proliferation, user accounts in identity providers are now keys to the kingdom and pose a massive security risk. In this paper we investigate the security implications of SSO and offer an in-depth analysis of account hijacking in the modern Web. Our experimental methodology explores multiple aspects of the attack workflow and reveals significant variance in how services deploy SSO. We first present a cookie hijacking attack for Facebook that results in complete account takeover, which in turn can be used to compromise accounts in services that support SSO. Next we introduce several novel attacks that leverage SSO for maintaining long-term control of user accounts. We empirically evaluate our attacks against 95 major web and mobile services and demonstrate their severity and stealthy nature. Next we explore what session and account management options are available to users after an account is compromised. Our findings highlight the inherent limitations of prevalent SSO schemes as most services lack the functionality that would allow users to remediate an account takeover. This is exacerbated by the scale of SSO coverage, rendering manual remediation attempts a futile endeavor. To remedy this we propose Single Sign-Off, an extension to OpenID Connect for universally revoking access to all the accounts associated with the hijacked identity provider account.

WPSE: Fortifying Web Protocols via Browser-Side Security Monitoring

We present WPSE, a browser-side security monitor for web protocols designed to ensure compliance with the intended protocol flow, as well as confidentiality and integrity properties of messages. We formally prove that WPSE is expressive enough to protect web applications from a wide range of protocol implementation bugs and web attacks. We discuss concrete examples of attacks which can be prevented by WPSE on OAuth 2.0 and SAML 2.0, including a novel attack on the Google implementation of SAML 2.0 which we discovered by formalizing the protocol specification in WPSE. Moreover, we use WPSE to carry out an extensive experimental evaluation of OAuth 2.0 in the wild. Out of 90 tested websites, we identify security flaws in 55 websites (61.1%), including new critical vulnerabilities introduced by tracking libraries such as Facebook Pixel, all of which fixable by WPSE. Finally, we show that WPSE works flawlessly on 83 websites (92.2%), with the 7 compatibility issues being caused by custom implementations deviating from the OAuth 2.0 specification, one of which introducing a critical vulnerability.

Man-in-the-Machine: Exploiting Ill-Secured Communication Inside the Computer

Operating systems provide various inter-process communication (IPC) mechanisms. Software applications typically use IPC for communication between front-end and back-end components, which run in different processes on the same computer. This paper studies the security of how the IPC mechanisms are used in PC, Mac and Linux software. We describe attacks where a nonprivileged process impersonates the IPC communication endpoints. The attacks are closely related to impersonation and man-in-the-middle attacks on computer networks but take place inside one computer. The vulnerable IPC methods are ones where a server process binds to a name or address and waits for client communication. Our results show that application developers are often unaware of the risks and secure practices in using IPC. We find attacks against several security-critical applications including password managers and hardware tokens, in which another user’s process is able to steal and misuse sensitive data such as the victim’s credentials. The vulnerabilities can be exploited in enterprise environments with centralized access control that gives multiple users remote or local login access to the same host. Computers with guest accounts and shared computers at home are similarly vulnerable.

Neural Networks

Formal Security Analysis of Neural Networks using Symbolic Intervals

Due to the increasing deployment of Deep Neural Networks (DNNs) in real-world security-critical domains including autonomous vehicles and collision avoidance systems, formally checking security properties of DNNs, especially under different attacker capabilities, is becoming crucial. Most existing security testing techniques for DNNs try to find adversarial examples without providing any formal security guarantees about the non-existence of such adversarial examples. Recently, several projects have used different types of Satisfiability Modulo Theory (SMT) solvers to formally check security properties of DNNs. However, all of these approaches are limited by the high overhead caused by the solver.

In this paper, we present a new direction for formally checking security properties of DNNs without using SMT solvers. Instead, we leverage interval arithmetic to compute rigorous bounds on the DNN outputs. Our approach, unlike existing solver-based approaches, is easily parallelizable. We further present symbolic interval analysis along with several other optimizations to minimize overestimations of output bounds.

We design, implement, and evaluate our approach as part of ReluVal, a system for formally checking security properties of Relu-based DNNs. Our extensive empirical results show that ReluVal outperforms Reluplex, a state-of-the-art solver-based system, by 200 times on average. On a single 8-core machine without GPUs, within 4 hours, ReluVal is able to verify a security property that Reluplex deemed inconclusive due to timeout after running for more than 5 days. Our experiments demonstrate that symbolic interval analysis is a promising new direction towards rigorously analyzing different security properties of DNNs.

Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring

Deep Neural Networks have recently gained lots of success after enabling several breakthroughs in notoriously challenging problems. Training these networks is computationally expensive and requires vast amounts of training data. Selling such pre-trained models can, therefore, be a lucrative business model. Unfortunately, once the models are sold they can be easily copied and redistributed. To avoid this, a tracking mechanism to identify models as the intellectual property of a particular vendor is necessary. In this work, we present an approach for watermarking Deep Neural Networks in a black-box way. Our scheme works for general classification tasks and can easily be combined with current learning algorithms. We show experimentally that such a watermark has no noticeable impact on the primary task that the model is designed for and evaluate the robustness of our proposal against a multitude of practical attacks. Moreover, we provide a theoretical analysis, relating our approach to previous work on backdooring.

A4NT: Author Attribute Anonymity by Adversarial Training of Neural Machine Translation

Text-based analysis methods enable an adversary to reveal privacy relevant author attributes such as gender, age and can identify the text’s author. Such methods can compromise the privacy of an anonymous author even when the author tries to remove privacy sensitive content. In this paper, we propose an automatic method, called the Adversarial Author Attribute Anonymity Neural Translation ($\text{A}^{4}\text{NT}$), to combat such text-based adversaries. Unlike prior works on obfuscation, we propose a system that is fully automatic and learns to perform obfuscation entirely from the data. This allows us to easily apply the $\text{A}^{4}\text{NT}$ system to obfuscate different author attributes. We propose a sequence-to-sequence language model, inspired by machine translation, and an adversarial training framework to design a system which learns to transform the input text to obfuscate the author attributes without paired data. We also propose and evaluate techniques to impose constraints on our $\text{A}^{4}\text{NT}$ model to preserve the semantics of the input text. $\text{A}^{4}\text{NT}$ learns to make minimal changes to the input to successfully fool author attribute classifiers, while preserving the meaning of the input text. Our experiments on two datasets and three settings show that the proposed method is effective in fooling the attribute classifiers and thus improves the anonymity of authors.

GAZELLE: A Low Latency Framework for Secure Neural Network Inference

The growing popularity of cloud-based machine learning raises natural questions about the privacy guarantees that can be provided in such settings. Our work tackles this problem in the context of prediction-as-a-service wherein a server has a convolutional neural network (CNN) trained on its private data and wishes to provide classifications on clients’ private images. Our goal is to build efficient protocols whereby the client can acquire the classification result without revealing their input to the server, while guaranteeing the privacy of the server’s neural network.

To this end, we design Gazelle, a scalable and low-latency system for secure neural network inference, using an intricate combination of homomorphic encryption and traditional two-party computation techniques (such as garbled circuits). Gazelle makes three contributions. First, we design the Gazelle homomorphic encryption library which provides fast algorithms for basic homomorphic operations such as SIMD (single instruction multiple data) addition, SIMD multiplication and ciphertext permutation. Second, we implement the Gazelle homomorphic linear algebra kernels which map neural network layers to optimized homomorphic matrix-vector multiplication and convolution routines. Third, we design optimized encryption switching protocols which seamlessly convert between homomorphic and garbled circuit encodings to enable implementation of complete neural network inference.

We evaluate our protocols on benchmark neural networks trained on the MNIST and CIFAR-10 datasets and show that Gazelle outperforms the best existing systems such as MiniONN (ACM CCS 2017) and Chameleon (Crypto Eprint 2017/1164) by 20–30x in online runtime. When compared with fully homomorphic approaches like CryptoNets (ICML 2016), we demonstrate three orders of magnitude faster online run-time.

2019 Technical Sessions(有视频音频和PPT) dblp

Protecting Users Everywhere

Computer Security and Privacy in the Interactions Between Victim Service Providers and Human Trafficking Survivors

A victim service provider, or VSP, is a crucial partner in a human trafficking survivor’s recovery. VSPs provide or connect survivors to services such as medical care, legal services, employment opportunities, etc. In this work, we study VSP-survivor interactions from a computer security and privacy perspective. Through 17 semi-structured interviews with staff members at VSPs and survivors of trafficking, we surface the role technology plays in VSP-survivor interactions as well as related computer security and privacy concerns and mitigations. Our results highlight various tensions that VSPs must balance, including building trust with their clients (often by giving them as much autonomy as possible) while attempting to guide their use of technology to mitigate risks around revictimization. We conclude with concrete recommendations for computer security and privacy technologists who wish to partner with VSPs to support and empower trafficking survivors.

Clinical Computer Security for Victims of Intimate Partner Violence

Digital insecurity in the face of targeted, persistent attacks increasingly leaves victims in debilitating or even life-threatening situations. We propose an approach to helping victims, what we call clinical computer security, and explore it in the context of intimate partner violence (IPV). IPV is widespread and abusers exploit technology to track, harass, intimidate, and otherwise harm their victims. We report on the iterative design, refinement, and deployment of a consultation service that we created to help IPV victims obtain in-person security help from a trained technologist. To do so we created and tested a range of new technical and non-technical tools that systematize the discovery and investigation of the complicated, multimodal digital attacks seen in IPV. An initial field study with 44 IPV survivors showed how our procedures and tools help victims discover account compromise, exploitable misconfigurations, and potential spyware.

Evaluating the Contextual Integrity of Privacy Regulation: Parents’ IoT Toy Privacy Norms Versus COPPA

Increased concern about data privacy has prompted new and updated data protection regulations worldwide. However, there has been no rigorous way to test whether the practices mandated by these regulations actually align with the privacy norms of affected populations. Here, we demonstrate that surveys based on the theory of contextual integrity provide a quantifiable and scalable method for measuring the conformity of specific regulatory provisions to privacy norms. We apply this method to the U.S. Children’s Online Privacy Protection Act (COPPA), surveying 195 parents and providing the first data that COPPA’s mandates generally align with parents’ privacy expectations for Internet-connected “smart” children’s toys. Nevertheless, variations in the acceptability of data collection across specific smart toys, information types, parent ages, and other conditions emphasize the importance of detailed contextual factors to privacy norms, which may not be adequately captured by COPPA.

Secure Multi-User Content Sharing for Augmented Reality Applications

Augmented reality (AR), which overlays virtual content on top of the user’s perception of the real world, has now begun to enter the consumer market. Besides smartphone platforms, early-stage head-mounted displays such as the Microsoft HoloLens are under active development. Many compelling uses of these technologies are multi-user: e.g., in-person collaborative tools, multiplayer gaming, and telepresence. While prior work on AR security and privacy has studied potential risks from AR applications, new risks will also arise among multiple human users. In this work, we explore the challenges that arise in designing secure and private content sharing for multi-user AR. We analyze representative application case studies and systematize design goals for security and functionality that a multi-user AR platform should support. We design an AR content sharing control module that achieves these goals and build a prototype implementation (ShareAR) for the HoloLens. This work builds foundations for secure and private multi-user AR interactions.

Understanding and Improving Security and Privacy in Multi-User Smart Homes: A Design Exploration and In-Home User Study

Smart homes face unique security, privacy, and usability challenges because they are multi-user, multi-device systems that affect the physical environment of all inhabitants of the home. Current smart home technology is often not well designed for multiple users, sometimes lacking basic access control and other affordances for making the system intelligible and accessible for all users. While prior work has shed light on the problems and needs of smart home users, it is not obvious how to design and build solutions. Such questions have certainly not been answered for challenging adversarial situations (e.g., domestic abuse), but we observe that they have not even been answered for tensions in otherwise functional, non-adversarial households. In this work, we explore user behaviors, needs, and possible solutions to multi-user security and privacy issues in generally non-adversarial smart homes. Based on design principles grounded in prior work, we built a prototype smart home app that includes concrete features such as location-based access controls, supervisory access controls, and activity notifications, and we tested our prototype though a month-long in-home user study with seven households. From the results of the user study, we re-evaluate our initial design principles, we surface user feedback on security and privacy features, and we identify challenges and recommendations for smart home designers and researchers.

Machine Learning Applications

The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks

This paper describes a testing methodology for quantitatively assessing the risk that rare or unique training-data sequences are unintentionally memorized by generative sequence models—a common type of machine-learning model. Because such models are sometimes trained on sensitive data (e.g., the text of users’ private messages), this methodology can benefit privacy by allowing deep-learning practitioners to select means of training that minimize such memorization.

In experiments, we show that unintended memorization is a persistent, hard-to-avoid issue that can have serious consequences. Specifically, for models trained without consideration of memorization, we describe new, efficient procedures that can extract unique, secret sequences, such as credit card numbers. We show that our testing strategy is a practical and easy-to-use first line of defense, e.g., by describing its application to quantitatively limit data exposure in Google’s Smart Compose, a commercial text-completion neural network trained on millions of users’ email messages.

Improving Robustness of ML Classifiers against Realizable Evasion Attacks Using Conserved Features

Machine learning (ML) techniques are increasingly common in security applications, such as malware and intrusion detection. However, ML models are often susceptible to evasion attacks, in which an adversary makes changes to the input (such as malware) in order to avoid being detected. A conventional approach to evaluate ML robustness to such attacks, as well as to design robust ML, is by considering simplified feature-space models of attacks, where the attacker changes ML features directly to effect evasion, while minimizing or constraining the magnitude of this change. We investigate the effectiveness of this approach to designing robust ML in the face of attacks that can be realized in actual malware (realizable attacks). We demonstrate that in the context of structure-based PDF malware detection, such techniques appear to have limited effectiveness, but they are effective with content-based detectors. In either case, we show that augmenting the feature space models with conserved features (those that cannot be unilaterally modified without compromising malicious functionality) significantly improves performance. Finally, we show that feature space models enable generalized robustness when faced with a variety of realizable attacks, as compared to classifiers which are tuned to be robust to a specific realizable attack.

ALOHA: Auxiliary Loss Optimization for Hypothesis Augmentation

Malware detection is a popular application of Machine Learning for Information Security (ML-Sec), in which an ML classifier is trained to predict whether a given file is malware or benignware. Parameters of this classifier are typically optimized such that outputs from the model over a set of input samples most closely match the samples’ true malicious/benign (1/0) target labels. However, there are often a number of other sources of contextual metadata for each malware sample, beyond an aggregate malicious/benign label, including multiple labeling sources and malware type information (e.g. ransomware, trojan, etc.), which we can feed to the classifier as auxiliary prediction targets. In this work, we fit deep neural networks to multiple additional targets derived from metadata in a threat intelligence feed for Portable Executable (PE) malware and benignware, including a multi-source malicious/benign loss, a count loss on multi-source detections, and a semantic malware attribute tag loss. We find that incorporating multiple auxiliary loss terms yields a marked improvement in performance on the main detection task. We also demonstrate that these gains likely stem from a more informed neural network representation and are not due to a regularization artifact of multi-target learning. Our auxiliary loss architecture yields a significant reduction in detection error rate (false negatives) of 42.6% at a false positive rate (FPR) of 10-3 when compared to a similar model with only one target, and a decrease of 53.8% at 10-5 FPR.

Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks

Transferability captures the ability of an attack against a machine-learning model to be effective against a different, potentially unknown, model. Empirical evidence for transferability has been shown in previous work, but the underlying reasons why an attack transfers or not are not yet well understood. In this paper, we present a comprehensive analysis aimed to investigate the transferability of both test-time evasion and training-time poisoning attacks. We provide a unifying optimization framework for evasion and poisoning attacks, and a formal definition of transferability of such attacks. We highlight two main factors contributing to attack transferability: the intrinsic adversarial vulnerability of the target model, and the complexity of the surrogate model used to optimize the attack. Based on these insights, we define three metrics that impact an attack’s transferability. Interestingly, our results derived from theoretical analysis hold for both evasion and poisoning attacks, and are confirmed experimentally using a wide range of linear and non-linear classifiers and datasets.

Stack Overflow Considered Helpful! Deep Learning Security Nudges Towards Stronger Cryptography

Stack Overflow is the most popular discussion platform for software developers. Recent research found a large amount of insecure encryption code in production systems that has been inspired by examples given on Stack Overflow. By copying and pasting functional code, developers introduced exploitable software vulnerabilities into security-sensitive high-profile applications installed by millions of users every day. Proposed mitigations of this problem suffer from usability flaws and push developers to continue shopping for code examples on Stack Overflow once again. This points us to fighting the proliferation of insecure code directly at the root before it even reaches the clipboard. By viewing Stack Overflow as a market, implementation of cryptography becomes a decision-making problem: i. e. how to simplify the selection of helpful and secure examples. We focus on supporting software developers in making better decisions by applying nudges, a concept borrowed from behavioral science. This approach is motivated by one of our key findings: for 99.37% of insecure code examples on Stack Overflow, similar alternatives are available that serve the same use case and provide strong cryptography. Our system design is based on several nudges that are controlled by a deep neural network. It learns a representation for cryptographic API usage patterns and classification of their security, achieving average AUC-ROC of 0.992. With a user study we demonstrate that nudge-based security advice significantly helps tackling the most popular and error-prone cryptographic use cases in Android.

Machine Learning, Adversarial and Otherwise

Seeing is Not Believing: Camouflage Attacks on Image Scaling Algorithms

Image scaling algorithms are intended to preserve the visual features before and after scaling, which is commonly used in numerous visual and image processing applications. In this paper, we demonstrate an automated attack against common scaling algorithms, i.e. to automatically generate camouflage images whose visual semantics change dramatically after scaling. To illustrate the threats from such camouflage attacks, we choose several computer vision applications as targeted victims, including multiple image classification applications based on popular deep learning frameworks, as well as main-stream web browsers. Our experimental results show that such attacks can cause different visual results after scaling and thus create evasion or data poisoning effect to these victim applications. We also present an algorithm that can successfully enable attacks against famous cloud-based image services (such as those from Microsoft Azure, Aliyun, Baidu, and Tencent) and cause obvious misclassification effects, even when the details of image processing (such as the exact scaling algorithm and scale dimension parameters) are hidden in the cloud. To defend against such attacks, this paper suggests a few potential countermeasures from attack prevention to detection.

CT-GAN: Malicious Tampering of 3D Medical Imagery using Deep Learning

In 2018, clinics and hospitals were hit with numerous attacks leading to significant data breaches and interruptions in medical services. An attacker with access to medical records can do much more than hold the data for ransom or sell it on the black market. In this paper, we show how an attacker can use deep-learning to add or remove evidence of medical conditions from volumetric (3D) medical scans. An attacker may perform this act in order to stop a political candidate, sabotage research, commit insurance fraud, perform an act of terrorism, or even commit murder. We implement the attack using a 3D conditional GAN and show how the framework (CT-GAN) can be automated. Although the body is complex and 3D medical scans are very large, CT-GAN achieves realistic results which can be executed in milliseconds. To evaluate the attack, we focused on injecting and removing lung cancer from CT scans. We show how three expert radiologists and a state-of-the-art deep learning AI are highly susceptible to the attack. We also explore the attack surface of a modern radiology network and demonstrate one attack vector: we intercepted and manipulated CT scans in an active hospital network with a covert penetration test.

Misleading Authorship Attribution of Source Code using Adversarial Learning

In this paper, we present a novel attack against authorship attribution of source code. We exploit that recent attribution methods rest on machine learning and thus can be deceived by adversarial examples of source code. Our attack performs a series of semantics-preserving code transformations that mislead learning-based attribution but appear plausible to a developer. The attack is guided by Monte-Carlo tree search that enables us to operate in the discrete domain of source code. In an empirical evaluation with source code from 204 programmers, we demonstrate that our attack has a substantial effect on two recent attribution methods, whose accuracy drops from over 88% to 1% under attack. Furthermore, we show that our attack can imitate the coding style of developers with high accuracy and thereby induce false attributions. We conclude that current approaches for authorship attribution are inappropriate for practical application and there is a need for resilient analysis techniques.

Terminal Brain Damage: Exposing the Graceless Degradation in Deep Neural Networks Under Hardware Fault Attacks

Deep neural networks (DNNs) have been shown to tolerate “brain damage”: cumulative changes to the network’s parameters (e.g., pruning, numerical perturbations) typically result in a graceful degradation of classification accuracy. However, the limits of this natural resilience are not well understood in the presence of small adversarial changes to the DNN parameters’ underlying memory representation, such as bit-flips that may be induced by hardware fault attacks. We study the effects of bitwise corruptions on 19 DNN models—six architectures on three image classification tasks—and we show that most models have at least one parameter that, after a specific bit-flip in their bitwise representation, causes an accuracy loss of over 90%. For large models, we employ simple heuristics to identify the parameters likely to be vulnerable and estimate that 40–50% of the parameters in a model might lead to an accuracy drop greater than 10% when individually subjected to such single-bit perturbations. To demonstrate how an adversary could take advantage of this vulnerability, we study the impact of an exemplary hardware fault attack, Rowhammer, on DNNs. Specifically, we show that a Rowhammer-enabled attacker co-located in the same physical machine can inflict significant accuracy drops (up to 99%) even with single bit corruptions and no knowledge of the model. Our results expose the limits of DNNs’ resilience against parameter perturbations induced by real-world fault attacks. We conclude by discussing possible mitigations and future research directions towards fault attack-resilient DNNs.

CSI NN: Reverse Engineering of Neural Network Architectures Through Electromagnetic Side Channel

Machine learning has become mainstream across industries. Numerous examples prove the validity of it for security applications. In this work, we investigate how to reverse engineer a neural network by using side-channel information such as timing and electromagnetic (EM) emanations. To this end, we consider multilayer perceptron and convolutional neural networks as the machine learning architectures of choice and assume a non-invasive and passive attacker capable of measuring those kinds of leakages.

We conduct all experiments on real data and commonly used neural network architectures in order to properly assess the applicability and extendability of those attacks. Practical results are shown on an ARM Cortex-M3 microcontroller, which is a platform often used in pervasive applications using neural networks such as wearables, surveillance cameras, etc. Our experiments show that a side-channel attacker is capable of obtaining the following information: the activation functions used in the architecture, the number of layers and neurons in the layers, the number of output classes, and weights in the neural network. Thus, the attacker can effectively reverse engineer the network using merely side-channel information such as timing or EM.

Intelligence and Vulnerabilities

Reading the Tea leaves: A Comparative Analysis of Threat Intelligence

The term “threat intelligence” has swiftly become a staple buzzword in the computer security industry. The entirely reasonable premise is that, by compiling up-to-date information about known threats (i.e., IP addresses, domain names, file hashes, etc.), recipients of such information may be able to better defend their systems from future attacks. Thus, today a wide array of public and commercial sources distribute threat intelligence data feeds to support this purpose. However, our understanding of this data, its characterization and the extent to which it can meaningfully support its intended uses, is still quite limited. In this paper, we address these gaps by formally defining a set of metrics for characterizing threat intelligence data feeds and using these measures to systematically characterize a broad range of public and commercial sources. Further, we ground our quantitative assessments using external measurements to qualitatively investigate issues of coverage and accuracy. Unfortunately, our measurement results suggest that there are significant limitations and challenges in using existing threat intelligence data for its purported goals.

Towards the Detection of Inconsistencies in Public Security Vulnerability Reports

Public vulnerability databases such as Common Vulnerabilities and Exposures (CVE) and National Vulnerability Database (NVD) have achieved a great success in promoting vulnerability disclosure and mitigation. While these databases have accumulated massive data, there is a growing concern for their information quality and consistency.

In this paper, we propose an automated system VIEM to detect inconsistent information between the fully standardized NVD database and the unstructured CVE descriptions and their referenced vulnerability reports. VIEM allows us, for the first time, to quantify the information consistency at a massive scale, and provides the needed tool for the community to keep the CVE/NVD databases up-to date. VIEM is developed to extract vulnerable software names and vulnerable versions from unstructured text. We introduce customized designs to deep-learning-based named entity recognition (NER) and relation extraction (RE) so that VIEM can recognize previous unseen software names and versions based on sentence structure and contexts. Ground-truth evaluation shows the system is highly accurate (0.941 precision and 0.993 recall). Using VIEM, we examine the information consistency using a large dataset of 78,296 CVE IDs and 70,569 vulnerability reports in the past 20 years. Our result suggests that inconsistent vulnerable software versions are highly prevalent. Only 59.82% of the vulnerability reports/CVE summaries strictly match the standardized NVD entries, and the inconsistency level increases over time. Case studies confirm the erroneous information of NVD that either overclaims or underclaims the vulnerable software versions.

Understanding and Securing Device Vulnerabilities through Automated Bug Report Analysis

Recent years have witnessed the rise of Internet-of-Things (IoT) based cyber attacks. These attacks, as expected, are launched from compromised IoT devices by exploiting security flaws already known. Less clear, however, are the fundamental causes of the pervasiveness of IoT device vulnerabilities and their security implications, particularly in how they affect ongoing cybercrimes. To better understand the problems and seek effective means to suppress the wave of IoT-based attacks, we conduct a comprehensive study based on a large number of real-world attack traces collected from our honeypots, attack tools purchased from the underground, and information collected from high-profile IoT attacks. This study sheds new light on the device vulnerabilities of today’s IoT systems and their security implications: ongoing cyber attacks heavily rely on these known vulnerabilities and the attack code released through their reports. We found that the reliance on known vulnerabilities can actually be used against adversaries. The same bug reports that enable the development of an attack at an exceedingly low cost can also be leveraged to extract vulnerability-specific features that help stop the attack. In particular, we leverage Natural Language Processing (NLP) to automatically collect and analyze more than 7,500 security reports (with 12,286 security critical IoT flaws in total) scattered across bug-reporting blogs, forums, and mailing lists on the Internet. We show that signatures can be automatically generated through an NLP-based report analysis, and used by intrusion detection or firewall systems to effectively mitigate the threats from today’s IoT-based attacks.

ATTACK2VEC: Leveraging Temporal Word Embeddings to Understand the Evolution of Cyberattacks

Despite the fact that cyberattacks are constantly growing in complexity, the research community still lacks effective tools to easily monitor and understand them. In particular, there is a need for techniques that are able to not only track how prominently certain malicious actions, such as the exploitation of specific vulnerabilities, are exploited in the wild, but also (and more importantly) how these malicious actions factor in as attack steps in more complex cyberattacks. In this paper we present ATTACK2VEC, a system that uses word embeddings to model how attack steps are exploited in the wild, and track how they evolve. We test ATTACK2VEC on a dataset of billions of security events collected from the customers of a commercial Intrusion Prevention System over a period of two years, and show that our approach is effective in monitoring the emergence of new attack strategies in the wild and in flagging which attack steps are often used together by attackers (e.g., vulnerabilities that are frequently exploited together). ATTACK2VEC provides a useful tool for research and practitioners to better understand cyberattacks and their evolution, and use this knowledge to improve situational awareness and develop proactive defenses.

Web Attacks

Leaky Images: Targeted Privacy Attacks in the Web

Sharing files with specific users is a popular service provided by various widely used websites, e.g., Facebook, Twitter, Google, and Dropbox. A common way to ensure that a shared file can only be accessed by a specific user is to authenticate the user upon a request for the file. This paper shows a novel way of abusing shared image files for targeted privacy attacks. In our attack, called leaky images, an image shared with a particular user reveals whether the user is visiting a specific website. The basic idea is simple yet effective: an attacker-controlled website requests a privately shared image, which will succeed only for the targeted user whose browser is logged into the website through which the image was shared. In addition to targeted privacy attacks aimed at single users, we discuss variants of the attack that allow an attacker to track a group of users and to link user identities across different sites. Leaky images require neither JavaScript nor CSS, exposing even privacy-aware users, who disable scripts in their browser, to the leak. Studying the most popular websites shows that the privacy leak affects at least eight of the 30 most popular websites that allow sharing of images between users, including the three most popular of all sites. We disclosed the problem to the affected sites, and most of them have been fixing the privacy leak in reaction to our reports. In particular, the two most popular affected sites, Facebook and Twitter, have already fixed the leaky images problem. To avoid leaky images, we discuss potential mitigation techniques that address the problem at the level of the browser and of the image sharing website.

All Your Clicks Belong to Me: Investigating Click Interception on the Web

Click is the prominent way that users interact with web applications. For example, we click hyperlinks to navigate among different pages on the Web, click form submission buttons to send data to websites, and click player controls to tune video playback. Clicks are also critical in online advertising, which fuels the revenue of billions of websites. Because of the critical role of clicks in the Web ecosystem, attackers aim to intercept genuine user clicks to either send malicious commands to another application on behalf of the user or fabricate realistic ad click traffic. However, existing studies mainly consider one type of click interceptions in the cross-origin settings via iframes, i.e., clickjacking. This does not comprehensively represent various types of click interceptions that can be launched by malicious third-party JavaScript code.

In this paper, we therefore systematically investigate the click interception practices on the Web. We developed a browser-based analysis framework, Observer, to collect and analyze click related behaviors. Using Observer, we identified three different techniques to intercept user clicks on the Alexa top 250K websites, and detected 437 third-party scripts that intercepted user clicks on 613 websites, which in total receive around 43 million visits on a daily basis.

We revealed that some websites collude with third-party scripts to hijack user clicks for monetization. In particular, our analysis demonstrated that more than 36% of the 3,251 unique click interception URLs were related to online advertising, which is the primary monetization approach on the Web. Further, we discovered that users can be exposed to malicious contents such as scamware through click interceptions. Our research demonstrated that click interception has become an emerging threat to web users.

What Are You Searching For? A Remote Keylogging Attack on Search Engine Autocomplete

Many search engines have an autocomplete feature that presents a list of suggested queries to the user as they type. Autocomplete induces network traffic from the client upon changes to the query in a web page. We describe a remote keylogging attack on search engine autocomplete. The attack integrates information leaked by three independent sources: the timing of keystrokes manifested in packet inter-arrival times, percent-encoded Space characters in a URL, and the static Huffman code used in HTTP2 header compression. While each source is a relatively weak predictor in its own right, combined, and by leveraging the relatively low entropy of English language, up to 15% of search queries are identified among a list of 50 hypothesis queries generated from a dictionary with over 12k words. The attack succeeds despite network traffic being encrypted. We demonstrate the attack on two popular search engines and discuss some countermeasures to mitigate attack success.

Iframes/Popups Are Dangerous in Mobile WebView: Studying and Mitigating Differential Context Vulnerabilities

In this paper, we present a novel class of Android WebView vulnerabilities (called Differential Context Vulnerabilities or DCVs) associated with web iframe/popup behaviors. To demonstrate the security implications of DCVs, we devise several novel concrete attacks. We show an untrusted web iframe/popup inside WebView becomes dangerous that it can launch these attacks to open holes on existing defense solutions, and obtain risky privileges and abilities, such as breaking web messaging integrity, stealthily accessing sensitive mobile functionalities, and performing phishing attacks.

Then, we study and assess the security impacts of DCVs on real-world apps. For this purpose, we develop a novel technique, DCV-Hunter, that can automatically vet Android apps against DCVs. By applying DCV-Hunter on a large number of most popular apps, we find DCVs are prevalent. Many high-profile apps are verified to be impacted, such as Facebook, Instagram, Facebook Messenger, Google News, Skype, Uber, Yelp, and U.S. Bank. To mitigate DCVs, we design a multi-level solution that enhances the security of WebView. Our evaluation on real-world apps shows the mitigation solution is effective and scalable, with negligible overhead.

Small World with High Risks: A Study of Security Threats in the npm Ecosystem

The popularity of JavaScript has lead to a large ecosystem of third-party packages available via the npm software package registry. The open nature of npm has boosted its growth, providing over 800,000 free and reusable software packages. Unfortunately, this open nature also causes security risks, as evidenced by recent incidents of single packages that broke or attacked software running on millions of computers. This paper studies security risks for users of npm by systematically analyzing dependencies between packages, the maintainers responsible for these packages, and publicly reported security issues. Studying the potential for running vulnerable or malicious code due to third-party dependencies, we find that individual packages could impact large parts of the entire ecosystem. Moreover, a very small number of maintainer accounts could be used to inject malicious code into the majority of all packages, a problem that has been increasing over time. Studying the potential for accidentally using vulnerable code, we find that lack of maintenance causes many packages to depend on vulnerable code, even years after a vulnerability has become public. Our results provide evidence that npm suffers from single points of failure and that unmaintained packages threaten large code bases. We discuss several mitigation techniques, such as trusted maintainers and total first-party security, and analyze their potential effectiveness.

Phishing and Scams

Detecting and Characterizing Lateral Phishing at Scale

We present the first large-scale characterization of lateral phishing attacks, based on a dataset of 113 million employee-sent emails from 92 enterprise organizations. In a lateral phishing attack, adversaries leverage a compromised enterprise account to send phishing emails to other users, benefitting from both the implicit trust and the information in the hijacked user’s account. We develop a classifier that finds hundreds of real-world lateral phishing emails, while generating under four false positives per every one-million employee-sent emails. Drawing on the attacks we detect, as well as a corpus of user-reported incidents, we quantify the scale of lateral phishing, identify several thematic content and recipient targeting strategies that attackers follow, illuminate two types of sophisticated behaviors that attackers exhibit, and estimate the success rate of these attacks. Collectively, these results expand our mental models of the ‘enterprise attacker’ and shed light on the current state of enterprise phishing attacks.

High Precision Detection of Business Email Compromise

Business email compromise (BEC) and employee impersonation have become one of the most costly cyber-security threats, causing over $12 billion in reported losses. Impersonation emails take several forms: for example, some ask for a wire transfer to the attacker’s account, while others lead the recipient to following a link, which compromises their credentials. Email security systems are not effective in detecting these attacks, because the attacks do not contain a clearly malicious payload, and are personalized to the recipient.

We present BEC-Guard, a detector used at Barracuda Networks that prevents business email compromise attacks in real-time using supervised learning. BEC-Guard has been in production since July 2017, and is part of the Barracuda Sentinel email security product. BEC-Guard detects attacks by relying on statistics about the historical email patterns that can be accessed via cloud email provider APIs. The two main challenges when designing BEC-Guard are the need to label millions of emails to train its classifiers, and to properly train the classifiers when the occurrence of employee impersonation emails is very rare, which can bias the classification. Our key insight is to split the classification problem into two parts, one analyzing the header of the email, and the second applying natural language processing to detect phrases associated with BEC or suspicious links in the email body. BEC-Guard utilizes the public APIs of cloud email providers both to automatically learn the historical communication patterns of each organization, and to quarantine emails in real-time. We evaluated BEC-Guard on a commercial dataset containing more than 4,000 attacks, and show it achieves a precision of 98.2% and a false positive rate of less than one in five million emails.

Cognitive Triaging of Phishing Attacks

In this paper we employ quantitative measurements of cognitive vulnerability triggers in phishing emails to predict the degree of success of an attack. To achieve this we rely on the cognitive psychology literature and develop an automated and fully quantitative method based on machine learning and econometrics to construct a triaging mechanism built around the cognitive features of a phishing email; we showcase our approach relying on data from the anti-phishing division of a large financial organization in Europe. Our evaluation shows empirically that an effective triaging mechanism for phishing success can be put in place by response teams to effectively prioritize remediation efforts (e.g. domain takedowns), by first acting on those attacks that are more likely to collect high response rates from potential victims.

Users Really Do Answer Telephone Scams

As telephone scams become increasingly prevalent, it is crucial to understand what causes recipients to fall victim to these scams. Armed with this knowledge, effective countermeasures can be developed to challenge the key foundations of successful telephone phishing attacks.

In this paper, we present the methodology, design, execution, results, and evaluation of an ethical telephone phishing scam. The study performed 10 telephone phishing experiments on 3,000 university participants without prior awareness over the course of a workweek. Overall, we were able to identify at least one key factor—spoofed Caller ID—that had a significant effect in tricking the victims into revealing their Social Security number.

Platforms in Everything: Analyzing Ground-Truth Data on the Anatomy and Economics of Bullet-Proof Hosting

This paper presents the first empirical study based on ground-truth data of a major Bullet-Proof Hosting (BPH) provider, a company called Maxided. BPH allows miscreants to host criminal activities in support of various cybercrime business models such as phishing, botnets, DDoS, spam, and counterfeit pharmaceutical websites. Maxided was legally taken down by law enforcement and its backend servers were seized. We analyze data extracted from its backend databases and connect it to various external data sources to characterize Maxided’s business model, supply chain, customers and finances. We reason about what the inside’’ view reveals about potential chokepoints for disrupting BPH providers. We demonstrate the BPH landscape to have further shifted from agile resellers towards marketplace platforms with an oversupply of resources originating from hundreds of legitimate upstream hosting providers. We find the BPH provider to have few choke points in the supply chain amendable to intervention, though profit margins are very slim, so even a marginal increase in operating costs might already have repercussions that render the business unsustainable. The other intervention option would be to take down the platform itself.


Birthday, Name and Bifacial-security: Understanding Passwords of Chinese Web Users

Much attention has been paid to passwords chosen by English speaking users, yet only a few studies have examined how non-English speaking users select passwords. In this paper, we perform an extensive, empirical analysis of 73.1 million real-world Chinese web passwords in comparison with 33.2 million English counterparts. We highlight a number of interesting structural and semantic characteristics in Chinese passwords. We further evaluate the security of these passwords by employing two state-of-the-art cracking techniques. In particular, our cracking results reveal the bifacial-security nature of Chinese passwords. They are weaker against online guessing attacks (i.e., when the allowed guess number is small, 1∼104) than English passwords. But out of the remaining Chinese passwords, they are stronger against offline guessing attacks (i.e., when the guess number is large, >105) than their English counterparts. This reconciles two conflicting claims about the strength of Chinese passwords made by Bonneau (IEEE S&P’12) and Li et al. (Usenix Security’14 and IEEE TIFS’16). At 107 guesses, the success rate of our improved PCFG-based attack against the Chinese datasets is 33.2%~49.8%, indicating that our attack can crack 92% to 188% more passwords than the state of the art. We also discuss the implications of our findings for password policies, strength meters and cracking.

Protecting accounts from credential stuffing with password breach alerting

Protecting accounts from credential stuffing attacks remains burdensome due to an asymmetry of knowledge: attackers have wide-scale access to billions of stolen usernames and passwords, while users and identity providers remain in the dark as to which accounts require remediation. In this paper, we propose a privacy-preserving protocol whereby a client can query a centralized breach repository to determine whether a specific username and password combination is publicly exposed, but without revealing the information queried. Here, a client can be an end user, a password manager, or an identity provider. To demonstrate the feasibility of our protocol, we implement a cloud service that mediates access to over 4 billion credentials found in breaches and a Chrome extension serving as an initial client. Based on anonymous telemetry from nearly 670,000 users and 21 million logins, we find that 1.5% of logins on the web involve breached credentials. By alerting users to this breach status, 26% of our warnings result in users migrating to a new password, at least as strong as the original. Our study illustrates how secure, democratized access to password breach alerting can help mitigate one dimension of account hijacking.

Probability Model Transforming Encoders Against Encoding Attacks

Honey encryption (HE) is a novel encryption scheme for resisting brute-force attacks even using low-entropy keys (e.g., passwords). HE introduces a distribution transforming encoder (DTE) to yield plausible-looking decoy messages for incorrect keys. Several HE applications were proposed for specific messages with specially designed probability model transforming encoders (PMTEs), DTEs transformed from probability models which are used to characterize the intricate message distributions.

We propose attacks against three typical PMTE schemes. Using a simple machine learning algorithm, we propose a distribution difference attack against genomic data PMTEs, achieving 76.54%–100.00% accuracy in distinguishing real data from decoy one. We then propose a new type of attack—encoding attacks—against two password vault PMTEs, achieving 98.56%–99.52% accuracy. Different from distribution difference attacks, encoding attacks do not require any knowledge (statistics) about the real message distribution.

We also introduce a generic conceptual probability model—generative probability model (GPM)—to formalize probability models and design a generic method for transforming an arbitrary GPM to a PMTE. We prove that our PMTEs are information-theoretically indistinguishable from the corresponding GPMs. Accordingly, they can resist encoding attacks. For our PMTEs transformed from existing password vault models, encoding attacks cannot achieve more than 52.56% accuracy, which is slightly better than the randomly guessing attack (50% accuracy).

Web Defenses

Rendered Private: Making GLSL Execution Uniform to Prevent WebGL-based Browser Fingerprinting

Browser fingerprinting, a substitute of cookies-based tracking, extracts a list of client-side features and combines them as a unique identifier for the target browser. Among all these features, one that has the highest entropy and the ability for an even sneakier purpose, i.e., cross-browser fingerprinting, is the rendering of WebGL tasks, which produce different results across different installations of the same browser on different computers, thus being considered as fingerprintable.

Such WebGL-based fingerprinting is hard to defend against, because the client browser executes a program written in OpenGL Shading Language (GLSL). To date, it remains unclear, in either the industry or the research community, about how and why the rendering of GLSL programs could lead to result discrepancies. Therefore, all the existing defenses, such as these adopted by Tor Browser, can only disable WebGL, i.e., a sacrifice of functionality over privacy, to prevent WebGL-based fingerprinting.

In this paper, we propose a novel system, called UNIGL, to rewrite GLSL programs and make uniform WebGL rendering procedure with the support of existing WebGL functionalities. Particularly, we, being the first in the community, point out that such rendering discrepancies in state-of-the-art WebGL-based fingerprinting are caused by floating-point operations. After realizing the cause, we design UNIGL so that it redefines all the floating-point operations, either explicitly written in GLSL programs or implicitly invoked by WebGL, to mitigate the fingerprinting factors.

We implemented a prototype of UNIGL as an open-source browser add-on (https://www.github.com/unigl/). We also created a demo website (http://test.unigl.org/), i.e., a modified version of an existing fingerprinting website, which directly integrates our add-on at the server-side to demonstrate the effectiveness of UNIGL. Our evaluation using crowdsourcing workers shows that UNIGL can prevent state-of-the-art WebGL-based fingerprinting with reasonable FPSes.

Site Isolation: Process Separation for Web Sites within the Browser

Current production web browsers are multi-process but place different web sites in the same renderer process, which is not sufficient to mitigate threats present on the web today. With the prevalence of private user data stored on web sites, the risk posed by compromised renderer processes, and the advent of transient execution attacks like Spectre and Meltdown that can leak data via microarchitectural state, it is no longer safe to render documents from different web sites in the same process. In this paper, we describe our successful deployment of the Site Isolation architecture to all desktop users of Google Chrome as a mitigation for process-wide attacks. Site Isolation locks each renderer process to documents from a single site and filters certain cross-site data from each process. We overcame performance and compatibility challenges to adapt a production browser to this new architecture. We find that this architecture offers the best path to protection against compromised renderer processes and same-process transient execution attacks, despite current limitations. Our performance results indicate it is practical to deploy this level of isolation while sufficiently preserving compatibility with existing web content. Finally, we discuss future directions and how the current limitations of Site Isolation might be addressed.

Everyone is Different: Client-side Diversification for Defending Against Extension Fingerprinting

Browser fingerprinting refers to the extraction of attributes from a user’s browser which can be combined into a near-unique fingerprint. These fingerprints can be used to re-identify users without requiring the use of cookies or other stateful identifiers. Browser extensions enhance the client-side browser experience; however, prior work has shown that their website modifications are fingerprintable and can be used to infer sensitive information about users.

In this paper we present CloakX, the first client-side anti-fingerprinting countermeasure that works without requiring browser modification or requiring extension developers to modify their code. CloakX uses client-side diversification to prevent extension detection using anchorprints (fingerprints comprised of artifacts directly accessible to any webpage) and to reduce the accuracy of extension detection using structureprints (fingerprints built from an extension’s behavior). Despite the complexity of browser extensions, CloakX automatically incorporates client-side diversification into the extensions and maintains equivalent functionality through the use of static and dynamic program analysis. We evaluate the efficacy of CloakX on 18,937 extensions using large-scale automated analysis and in-depth manual testing. We conducted experiments to test the functionality equivalence, the detectability, and the performance of CloakX-enabled extensions. Beyond extension detection, we demonstrate that client-side modification of extensions is a viable method for the late-stage customization of browser extensions.

Less is More: Quantifying the Security Benefits of Debloating Web Applications

As software becomes increasingly complex, its attack surface expands enabling the exploitation of a wide range of vulnerabilities. Web applications are no exception since modern HTML5 standards and the ever-increasing capabilities of JavaScript are utilized to build rich web applications, often subsuming the need for traditional desktop applications. One possible way of handling this increased complexity is through the process of software debloating, i.e., the removal not only of dead code but also of code corresponding to features that a specific set of users do not require. Even though debloating has been successfully applied on operating systems, libraries, and compiled programs, its applicability on web applications has not yet been investigated.

In this paper, we present the first analysis of the security benefits of debloating web applications. We focus on four popular PHP applications and we dynamically exercise them to obtain information about the server-side code that executes as a result of client-side requests. We evaluate two different debloating strategies (file-level debloating and function-level debloating) and we show that we can produce functional web applications that are 46% smaller than their original versions and exhibit half their original cyclomatic complexity. Moreover, our results show that the process of debloating removes code associated with tens of historical vulnerabilities and further shrinks a web application’s attack surface by removing unnecessary external packages and abusable PHP gadgets.

The Web’s Identity Crisis: Understanding the Effectiveness of Website Identity Indicators

Users must understand the identity of the website that they are visiting in order to make trust decisions. Web browsers indicate website identity via URLs and HTTPS certificates, but users must understand and act on these indicators for them to be effective. In this paper, we explore how browser identity indicators affect user behavior and understanding. First, we present a large-scale field experiment measuring the effects of the HTTPS Extended Validation (EV) certificate UI on user behavior. Our experiment is many orders of magnitude larger than any prior study of EV indicators, and it is the first to examine the EV indicator in a naturalistic scenario. We find that most metrics of user behavior are unaffected by its removal, providing evidence that the EV indicator adds little value in its current form. Second, we conduct three experimental design surveys to understand how users perceive UI variations in identity indicators for login pages, looking at EV UI in Chrome and Safari and URL formatting designs in Chrome. In 14 iterations on browsers’ EV and URL formats, no intervention significantly impacted users’ understanding of the security or identity of login pages. Informed by our experimental results, we provide recommendations to build more effective website identity mechanisms.


Fuzzification: Anti-Fuzzing Techniques

Fuzzing is a software testing technique that quickly and automatically explores the input space of a program without knowing its internals. Therefore, developers commonly use fuzzing as part of test integration throughout the software development process. Unfortunately, it also means that such a blackbox and the automatic natures of fuzzing are appealing to adversaries who are looking for zero-day vulnerabilities.

To solve this problem, we propose a new mitigation approach, called Fuzzification , that helps developers protect the released, binary-only software from attackers who are capable of applying state-of-the-art fuzzing techniques. Given a performance budget, this approach aims to hinder the fuzzing process from adversaries as much as possible. We propose three Fuzzification techniques: 1) SpeedBump, which amplifies the slowdown in normal executions by hundreds of times to the fuzzed execution, 2) BranchTrap, interfering with feedback logic by hiding paths and polluting coverage maps, and 3) AntiHybrid, hindering taint-analysis and symbolic execution. Each technique is designed with best-effort, defensive measures that attempt to hinder adversaries from bypassing Fuzzification .

Our evaluation on popular fuzzers and real-world applications shows that Fuzzification effectively reduces the number of discovered paths by 70.3% and decreases the number of identified crashes by 93.0% from real-world binaries, and decreases the number of detected bugs by 67.5% from LAVA-M dataset while under user-specified overheads for common workloads. We discuss the robustness of Fuzzification techniques against adversarial analysis techniques. We open-source our Fuzzification system to foster future research.

AntiFuzz: Impeding Fuzzing Audits of Binary Executables

A general defense strategy in computer security is to increase the cost of successful attacks in both computational resources as well as human time. In the area of binary security, this is commonly done by using obfuscation methods to hinder reverse engineering and the search for software vulnerabilities. However, recent trends in automated bug finding changed the modus operandi. Nowadays it is very common for bugs to be found by various fuzzing tools. Due to ever-increasing amounts of automation and research on better fuzzing strategies, large-scale, dragnet-style fuzzing of many hundreds of targets becomes viable. As we show, current obfuscation techniques are aimed at increasing the cost of human understanding and do little to slow down fuzzing. In this paper, we introduce several techniques to protect a binary executable against an analysis with automated bug finding approaches that are based on fuzzing, symbolic/concolic execution, and taint-assisted fuzzing (commonly known as hybrid fuzzing). More specifically, we perform a systematic analysis of the fundamental assumptions of bug finding tools and develop general countermeasures for each assumption. Note that these techniques are not designed to target specific implementations of fuzzing tools, but address general assumptions that bug finding tools necessarily depend on. Our evaluation demonstrates that these techniques effectively impede fuzzing audits, while introducing a negligible performance overhead. Just as obfuscation techniques increase the amount of human labor needed to find a vulnerability, our techniques render automated fuzzing-based approaches futile.

MOPT: Optimized Mutation Scheduling for Fuzzers

Mutation-based fuzzing is one of the most popular vulnerability discovery solutions. Its performance of generating interesting test cases highly depends on the mutation scheduling strategies. However, existing fuzzers usually follow a specific distribution to select mutation operators, which is inefficient in finding vulnerabilities on general programs. Thus, in this paper, we present a novel mutation scheduling scheme MOPT, which enables mutation-based fuzzers to discover vulnerabilities more efficiently. MOPT utilizes a customized Particle Swarm Optimization (PSO) algorithm to find the optimal selection probability distribution of operators with respect to fuzzing effectiveness, and provides a pacemaker fuzzing mode to accelerate the convergence speed of PSO. We applied MOPT to the state-of-the-art fuzzers AFL, AFLFast and VUzzer, and implemented MOPT-AFL, -AFLFast and -VUzzer respectively, and then evaluated them on 13 real world open-source programs. The results showed that, MOPT-AFL could find 170% more security vulnerabilities and 350% more crashes than AFL. MOPT-AFLFast and MOPT-VUzzer also outperform their counterparts. Furthermore, the extensive evaluation also showed that MOPT provides a good rationality, compatibility and steadiness, while introducing negligible costs.

EnFuzz: Ensemble Fuzzing with Seed Synchronization among Diverse Fuzzers

Fuzzing is widely used for vulnerability detection. There are various kinds of fuzzers with different fuzzing strategies, and most of them perform well on their targets. However, in industrial practice, it is found that the performance of those well-designed fuzzing strategies is challenged by the complexity and diversity of real-world applications. In this paper, we systematically study an ensemble fuzzing approach. First, we define the diversity of base fuzzers in three heuristics: diversity of coverage information granularity, diversity of input generation strategy and diversity of seed selection and mutation strategy. Based on those heuristics, we choose several of the most recent base fuzzers that are as diverse as possible, and propose a globally asynchronous and locally synchronous (GALS) based seed synchronization mechanism to seamlessly ensemble those base fuzzers and obtain better performance. For evaluation, we implement EnFuzz based on several widely used fuzzers such as QSYM and FairFuzz, and then we test them on LAVA-M and Google’s fuzzing-test-suite, which consists of 24 widely used real-world applications. This experiment indicates that, under the same constraints for resources, these base fuzzers perform differently on different applications, while EnFuzz always outperforms other fuzzers in terms of path coverage, branch coverage and bug discovery. Furthermore, EnFuzz found 60 new vulnerabilities in several well-fuzzed projects such as libpng and libjpeg, and 44 new CVEs were assigned.

GRIMOIRE: Synthesizing Structure while Fuzzing

In the past few years, fuzzing has received significant attention from the research community. However, most of this attention was directed towards programs without a dedicated parsing stage. In such cases, fuzzers which leverage the input structure of a program can achieve a significantly higher code coverage compared to traditional fuzzing approaches. This advancement in coverage is achieved by applying large-scale mutations in the application’s input space. However, this improvement comes at the cost of requiring expert domain knowledge, as these fuzzers depend on structure input specifications (e.g., grammars). Grammar inference, a technique which can automatically generate such grammars for a given program, can be used to address this shortcoming. Such techniques usually infer a program’s grammar in a pre-processing step and can miss important structures that are uncovered only later during normal fuzzing.

In this paper, we present the design and implementation of GRIMOIRE, a fully automated coverage-guided fuzzer which works without any form of human interaction or pre-configuration; yet, it is still able to efficiently test programs that expect highly structured inputs. We achieve this by performing large-scale mutations in the program input space using grammar-like combinations to synthesize new highly structured inputs without any pre-processing step. Our evaluation shows that GRIMOIRE outperforms other coverage-guided fuzzers when fuzzing programs with highly structured inputs. Furthermore, it improves upon existing grammar-based coverage-guided fuzzers. Using GRIMOIRE, we identified 19 distinct memory corruption bugs in real-world programs and obtained 11 new CVEs.


2018 programme(有视频PPT) dblp

Session 1B: Attacks and Vulnerabilities

Didn’t You Hear Me? – Towards More Successful Web Vulnerability Notifications

After treating the notification of vulnerable parties as mere side-notes in research, the security community has recently put more focus on how to conduct vulnerability disclosure at scale. The first works in this area have shown that while notifications are helpful to a significant fraction of operators, the vast majority of systems remain unpatched. In this paper, we build on these previous works, aiming to understand why the effects are not more significant. To that end, we report on a notification experiment targeting more than 24,000 domains, which allowed us to analyze what technical and human aspects are roadblocks to a successful campaign. As part of this experiment, we explored potential alternative notification channels beyond email, including social media and phone. In addition, we conducted an anonymous survey with the notified operators, investigating their perspectives on our notifications. We show the pitfalls of email-based communications, such as the impact of anti-spam filters, the lack of trust by recipients, and the hesitation in fixing vulnerabilities despite awareness. However, our exploration of alternative communication channels did not suggest a more promising medium. Seeing these results, we pinpoint future directions in improving security notifications.

Exposing Congestion Attack on Emerging Connected Vehicle based Traffic Signal

Connected vehicle (CV) technology will soon transform today’s transportation systems by connecting vehicles and the transportation infrastructure through wireless communication. Having demonstrated the potential to greatly improve transportation mobility efficiency, such dramatically increased connectivity also opens a new door for cyber attacks. In this work, we perform the first detailed security analysis of the nextgeneration CV-based transportation systems. As a first step, we target the USDOT (U.S. Department of Transportation) sponsored CV-based traffic control system, which has been tested and shown high effectiveness in real road intersections. In the analysis, we target a realistic threat, namely CV data spoofing from one single attack vehicle, with the attack goal of creating traffic congestion. We first analyze the system design and identify data spoofing strategies that can potentially influence the traffic control. Based on the strategies, we perform vulnerability analysis by exhaustively trying all the data spoofing options for these strategies to understand the upper bound of the attack effectiveness. For the highly effective cases, we analyze the causes and find that the current signal control algorithm design and implementation choices are highly vulnerable to data spoofing attacks from even a single attack vehicle. These vulnerabilities can be exploited to completely reverse the benefit of the CV-based signal control system by causing the traffic mobility to be 23.4% worse than that without adopting such system. We then construct practical exploits and evaluate them under real-world intersection settings. The evaluation results are consistent with our vulnerability analysis, and we find that the attacks can even cause a blocking effect to jam an entire approach. In the jamming period, 22% of the vehicles need to spend over 7 minutes for an original halfminute trip, which is 14 times higher. We also discuss defense directions leveraging the insights from our analysis.

Removing Secrets from Android’s TLS

Cryptographic libraries that implement Transport Layer Security (TLS) have a responsibility to delete cryptographic keys once they’re no longer in use. Any key that’s left in memory can potentially be recovered through the actions of an attacker, up to and including the physical capture and forensic analysis of a device’s memory. This paper describes an analysis of the TLS library stack used in recent Android distributions, combining a C language core (BoringSSL) with multiple layers of Java code (Conscrypt, OkHttp, and Java Secure Sockets). We first conducted a black-box analysis of virtual machine images, allowing us to discover keys that might remain recoverable. After identifying several such keys, we subsequently pinpointed undesirable interactions across these layers, where the higherlevel use of BoringSSL’s reference counting features, from Java code, prevented BoringSSL from cleaning up its keys. This interaction poses a threat to all Android applications built on standard HTTPS libraries, exposing master secrets to memory disclosure attacks. We found all versions we investigated from Android 4 to the latest Android 8 are vulnerable, showing that this problem has been long overlooked. The Android Chrome application is proven to be particularly problematic. We suggest modest changes to the Android codebase to mitigate these issues, and have reported these to Google to help them patch the vulnerability in future Android systems.

rtCaptcha: A Real-Time CAPTCHA Based Liveness Detection System

Facial/voice-based authentication is becoming increasingly popular (e.g., already adopted by MasterCard and AliPay), because it is easy to use. In particular, users can now authenticate themselves to online services by using their mobile phone to show themselves performing simple tasks like blinking or smiling in front of its built-in camera. Our study shows that many of the publicly available facial/voice recognition services (e.g. Microsoft Cognitive Services or Amazon Rekognition) are vulnerable to even the most primitive attacks. Furthermore, recent work on modeling a person’s face/voice (e.g. Face2Face [1]) allows an adversary to create very authentic video/audio of any target victim to impersonate that target. All it takes to launch such attacks are a few pictures and voice samples of a victim, which can all be obtained by either abusing the camera and microphone of the victim’s phone, or through the victim’s social media account. In this work, we propose the Real Time Captcha (rtCaptcha) system, which stops/slows down such an attack by turning the adversary’s task from creating authentic video/audio of the target victim performing known authentication tasks (e.g., smile, blink) to figuring out what is the authentication task, which is encoded as a Captcha. Specifically, when a user tries to authenticate using rtCaptcha, they will be presented a Captcha and will be asked to take a “selfie” video while announcing the answer to the Captcha. As such, the security guarantee of our system comes from the strength of Captcha, and not how well we can distinguish real faces/voices from synthesized ones. To demonstrate the usability and security of rtCaptcha, we conducted a user study to measure human response times to the most popular Captcha schemes. Our experiments show that, thanks to the humans’ speed of solving Captchas, adversaries will have to solve Captchas in less than 2 seconds in order to appear live/human and defeat rtCaptcha, which is not possible for the best settings on the attack side.

Session 2A: Network Security/Cellular Networks

Automated Attack Discovery in TCP Congestion Control Using a Model-guided Approach

One of the most important goals of TCP is to ensure fairness and prevent congestion collapse by implementing congestion control. Various attacks against TCP congestion control have been reported over the years, most of which have been discovered through manual analysis. In this paper, we propose an automated method that combines the generality of implementation-agnostic fuzzing with the precision of runtime analysis to find attacks against implementations of TCP congestion control. It uses a model-guided approach to generate abstract attack strategies, by leveraging a state machine model of TCP congestion control to find vulnerable state machine paths that an attacker could exploit to increase or decrease the throughput of a connection to his advantage. These abstract strategies are then mapped to concrete attack strategies, which consist of sequences of actions such as injection or modification of acknowledgements and a logical time for injection. We design and implement a virtualized platform, TCPWN, that consists of a a proxy-based attack injector and a TCP congestion control state tracker that uses only network traffic to create and inject these concrete attack strategies. We evaluated 5 TCP implementations from 4 Linux distributions and Windows 8.1. Overall, we found 11 classes of attacks, of which 8 are new.

Preventing (Network) Time Travel with Chronos

The Network Time Protocol (NTP) synchronizes time across computer systems over the Internet. Unfortunately, NTP is highly vulnerable to “time shifting attacks”, in which the attacker’s goal is to shift forward/backward the local time at an NTP client. NTP’s security vulnerabilities have severe implications for time-sensitive applications and for security mechanisms, including TLS certificates, DNS and DNSSEC, RPKI, Kerberos, BitCoin, and beyond. While technically NTP supports cryptographic authentication, it is very rarely used in practice and, worse yet, timeshifting attacks on NTP are possible even if all NTP communications are encrypted and authenticated. We present Chronos, a new NTP client that achieves good synchronization even in the presence of powerful attackers who are in direct control of a large number of NTP servers. Importantly, Chronos is backwards compatible with legacy NTP and involves no changes whatsoever to NTP servers. Chronos leverages ideas from distributed computing literature on clock synchronization in the presence of adversarial (Byzantine) behavior. A Chronos client iteratively “crowdsources” time queries across multiple NTP servers and applies a provably secure algorithm for eliminating “suspicious” responses and averaging over the remaining responses. Chronos is carefully engineered to minimize communication overhead so as to avoid overloading NTP servers. We evaluate Chronos’ security and network efficiency guarantees via a combination of theoretical analyses and experiments with a prototype implementation. Our results indicate that to succeed in shifting time at a Chronos client by over 100ms from the UTC, even a powerful man-in-the-middle attacker requires over 20 years of effort in expectation.

LTEInspector: A Systematic Approach for Adversarial Testing of 4G LTE

In this paper, we investigate the security and privacy of the three critical procedures of the 4G LTE protocol (i.e., attach, detach, and paging), and in the process, uncover potential design flaws of the protocol and unsafe practices employed by the stakeholders. For exposing vulnerabilities, we propose a modelbased testing approach LTEInspector which lazily combines a symbolic model checker and a cryptographic protocol verifier in the symbolic attacker model. Using LTEInspector, we have uncovered 10 new attacks along with 9 prior attacks, categorized into three abstract classes (i.e., security, user privacy, and disruption of service), in the three procedures of 4G LTE. Notable among our findings is the authentication relay attack that enables an adversary to spoof the location of a legitimate user to the core network without possessing appropriate credentials. To ensure that the exposed attacks pose real threats and are indeed realizable in practice, we have validated 8 of the 10 new attacks and their accompanying adversarial assumptions through experimentation in a real testbed.

GUTI Reallocation Demystified: Cellular Location Tracking with Changing Temporary Identifier

To keep subscribers’ identity confidential, a cellular network operator must use a temporary identifier instead of a permanent one according to the 3GPP standard. Temporary identifiers include Temporary Mobile Subscriber Identity (TMSI) and Globally Unique Temporary Identifier (GUTI) for GSM/3G and Long-Term Evolution (LTE) networks, respectively. Unfortunately, recent studies have shown that carriers fail to protect subscribers in both GSM/3G and LTE mainly because these identifiers have static and persistent values. These identifiers can be used to track subscribers’ locations. These studies have suggested that temporary identifiers must be reallocated frequently to solve this privacy problem. The only mechanism to update the temporary identifier in current LTE implementations is called GUTI reallocation. We investigate whether the current implementation of the GUTI reallocation mechanism can provide enough security to protect subscribers’ privacy. To do this, we collect data by performing GUTI reallocation more than 30,000 times with 28 carriers across 11 countries using 78 SIM cards. Then, we investigate whether (1) these reallocated GUTIs in each carrier show noticeable patterns and (2) if they do, these patterns are consistent among different SIM cards within each carrier. Among 28 carriers, 19 carriers have easily predictable and consistent patterns in their GUTI reallocation mechanisms. Among the remaining 9 carriers, we revisit 4 carriers to investigate them in greater detail. For all these 4 carriers, we could find interesting yet predictable patterns after invoking GUTI reallocation multiple times within a short time period. By using this predictability, we show that an adversary can track subscribers’ location as in previous studies. Finally, we present a lightweight and unpredictable GUTI reallocation mechanism as a solution.

Session 3A: Deep Learning and Adversarial ML

Automated Website Fingerprinting through Deep Learning

Several studies have shown that the network traffic that is generated by a visit to a website over Tor reveals information specific to the website through the timing and sizes of network packets. By capturing traffic traces between users and their Tor entry guard, a network eavesdropper can leverage this meta-data to reveal which website Tor users are visiting. The success of such attacks heavily depends on the particular set of traffic features that are used to construct the fingerprint. Typically, these features are manually engineered and, as such, any change introduced to the Tor network can render these carefully constructed features ineffective. In this paper, we show that an adversary can automate the feature engineering process, and thus automatically deanonymize Tor traffic by applying our novel method based on deep learning. We collect a dataset comprised of more than three million network traces, which is the largest dataset of web traffic ever used for website fingerprinting, and find that the performance achieved by our deep learning approaches is comparable to known methods which include various research efforts spanning over multiple years. The obtained success rate exceeds 96% for a closed world of 100 websites and 94% for our biggest closed world of 900 classes. In our open world evaluation, the most performant deep learning model is 2% more accurate than the state-ofthe-art attack. Furthermore, we show that the implicit features automatically learned by our approach are far more resilient to dynamic changes of web content over time. We conclude that the ability to automatically construct the most relevant traffic features and perform accurate traffic recognition makes our deep learning based approach an efficient, flexible and robust technique for website fingerprinting.

VulDeePecker: A Deep Learning-Based System for Vulnerability Detection

The automatic detection of software vulnerabilities is an important research problem. However, existing solutions to this problem rely on human experts to define features and often miss many vulnerabilities (i.e., incurring high false negative rate). In this paper, we initiate the study of using deep learning-based vulnerability detection to relieve human experts from the tedious and subjective task of manually defining features. Since deep learning is motivated to deal with problems that are very different from the problem of vulnerability detection, we need some guiding principles for applying deep learning to vulnerability detection. In particular, we need to find representations of software programs that are suitable for deep learning. For this purpose, we propose using code gadgets to represent programs and then transform them into vectors, where a code gadget is a number of (not necessarily consecutive) lines of code that are semantically related to each other. This leads to the design and implementation of a deep learning-based vulnerability detection system, called Vulnerability Deep Pecker (VulDeePecker). In order to evaluate VulDeePecker, we present the first vulnerability dataset for deep learning approaches. Experimental results show that VulDeePecker can achieve much fewer false negatives (with reasonable false positives) than other approaches. We further apply VulDeePecker to 3 software products (namely Xen, Seamonkey, and Libav) and detect 4 vulnerabilities, which are not reported in the National Vulnerability Database but were “silently” patched by the vendors when releasing later versions of these products; in contrast, these vulnerabilities are almost entirely missed by the other vulnerability detection systems we experimented with.

Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection

Neural networks have become an increasingly popular solution for network intrusion detection systems (NIDS). Their capability of learning complex patterns and behaviors make them a suitable solution for differentiating between normal traffic and network attacks. However, a drawback of neural networks is the amount of resources needed to train them. Many network gateways and routers devices, which could potentially host an NIDS, simply do not have the memory or processing power to train and sometimes even execute such models. More importantly, the existing neural network solutions are trained in a supervised manner. Meaning that an expert must label the network traffic and update the model manually from time to time. In this paper, we present Kitsune: a plug and play NIDS which can learn to detect attacks on the local network, without supervision, and in an efficient online manner. Kitsune’s core algorithm (KitNET) uses an ensemble of neural networks called autoencoders to collectively differentiate between normal and abnormal traffic patterns. KitNET is supported by a feature extraction framework which efficiently tracks the patterns of every network channel. Our evaluations show that Kitsune can detect various attacks with a performance comparable to offline anomaly detectors, even on a Raspberry PI. This demonstrates that Kitsune can be a practical and economic NIDS.

Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks

Although deep neural networks (DNNs) have achieved great success in many tasks, they can often be fooled by adversarial examples that are generated by adding small but purposeful distortions to natural examples. Previous studies to defend against adversarial examples mostly focused on refining the DNN models, but have either shown limited success or required expensive computation. We propose a new strategy, feature squeezing, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample. By comparing a DNN model’s prediction on the original input with that on squeezed inputs, feature squeezing detects adversarial examples with high accuracy and few false positives. This paper explores two feature squeezing methods: reducing the color bit depth of each pixel and spatial smoothing. These simple strategies are inexpensive and complementary to other defenses, and can be combined in a joint detection framework to achieve high detection rates against state-of-the-art attacks.

Trojaning Attack on Neural Networks

With the fast spread of machine learning techniques, sharing and adopting public machine learning models become very popular. This gives attackers many new opportunities. In this paper, we propose a trojaning attack on neural networks. As the models are not intuitive for human to understand, the attack features stealthiness. Deploying trojaned models can cause various severe consequences including endangering human lives (in applications like autonomous driving). We first inverse the neural network to generate a general trojan trigger, and then retrain the model with reversed engineered training data to inject malicious behaviors to the model. The malicious behaviors are only activated by inputs stamped with the trojan trigger. In our attack, we do not need to tamper with the original training process, which usually takes weeks to months. Instead, it takes minutes to hours to apply our attack. Also, we do not require the datasets that are used to train the model. In practice, the datasets are usually not shared due to privacy or copyright concerns. We use five different applications to demonstrate the power of our attack, and perform a deep analysis on the possible factors that affect the attack. The results show that our attack is highly effective and efficient. The trojaned behaviors can be successfully triggered (with nearly 100% possibility) without affecting its test accuracy for normal input and even with better accuracy on public dataset. Also, it only takes a small amount of time to attack a complex neuron network model. In the end, we also discuss possible defense against such attacks.

Session 3B: Authentication

Broken Fingers: On the Usage of the Fingerprint API in Android

Smartphones are increasingly used for very important tasks such as mobile payments. Correspondingly, new technologies are emerging to provide better security on smartphones. One of the most recent and most interesting is the ability to recognize fingerprints, which enables mobile apps to use biometric-based authentication and authorization to protect security-sensitive operations. In this paper, we present the first systematic analysis of the fingerprint API in Android, and we show that this API is not well understood and often misused by app developers. To make things worse, there is currently confusion about which threat model the fingerprint API should be resilient against. For example, although there is no official reference, we argue that the fingerprint API is designed to protect from attackers that can completely compromise the untrusted OS. After introducing several relevant threat models, we identify common API usage patterns and show how inappropriate choices can make apps vulnerable to multiple attacks. We then design and implement a new static analysis tool to automatically analyze the usage of the fingerprint API in Android apps. Using this tool, we perform the first systematic study on how the fingerprint API is used. The results are worrisome: Our tool indicates that 53.69% of the analyzed apps do not use any cryptographic check to ensure that the user actually touched the fingerprint sensor. Depending on the specific use case scenario of a given app, it is not always possible to make use of cryptographic checks. However, a manual investigation on a subset of these apps revealed that 80% of them could have done so, preventing multiple attacks. Furthermore, the tool indicates that only the 1.80% of the analyzed apps use this API in the most secure way possible, while many others, including extremely popular apps such as Google Play Store and Square Cash, use it in weaker ways. To make things worse, we find issues and inconsistencies even in the samples provided by the official Google documentation. We end this work by suggesting various improvements to the fingerprint API to prevent some of these problematic attacks.

K-means++ vs. Behavioral Biometrics: One Loop to Rule Them All

Behavioral biometrics, a field that studies patterns in an individual’s unique behavior, has been researched actively as a means of authentication for decades. Recently, it has even been adopted in many real world scenarios. In this paper, we study keystroke dynamics, the most researched of such behavioral biometrics, from the perspective of an adversary. We designed two adversarial agents with a standard accuracy convenience tradeoff: Targeted K-means++, which is an expensive, but extremely effective adversarial agent, and Indiscriminate K-means++, which is slightly less powerful, but adds no overhead cost to the attacker. With Targeted K-means++ we could compromise the security of 40-70% of users within ten tries. In contrast, with Indiscriminate K-means++, the security of 30-50% of users was compromised. Therefore, we conclude that while keystroke dynamics has potential, it is not ready for security critical applications yet. Future keystroke dynamics research should use such adversaries to benchmark the performance of the detection algorithms, and design better algorithms to foil these. Finally, we show that the K-means++ adversarial agent generalizes well to even other types of behavioral biometrics data by applying it on a dataset of touchscreen swipes.

ABC: Enabling Smartphone Authentication with Built-in Camera

Reliably identifying and authenticating smartphones is critical in our daily life since they are increasingly being used to manage sensitive data such as private messages and financial data. Recent researches on hardware fingerprinting show that each smartphone, regardless of the manufacturer or make, possesses a variety of hardware fingerprints that are unique, robust, and physically unclonable. There is a growing interest in designing and implementing hardware-rooted smartphone authentication which authenticates smartphones through verifying the hardware fingerprints of their built-in sensors. Unfortunately, previous fingerprinting methods either involve large registration overhead or suffer from fingerprint forgery attacks, rendering them infeasible in authentication systems. In this paper, we propose ABC, a real-time smartphone Authentication protocol utilizing the photo-response non-uniformity (PRNU) of the Built-in Camera. In contrast to previous works that require tens of images to build reliable PRNU features for conventional cameras, we are the first to observe that one image alone can uniquely identify a smartphone due to the unique PRNU of a smartphone image sensor. This new discovery makes the use of PRNU practical for smartphone authentication. While most existing hardware fingerprints are vulnerable against forgery attacks, ABC defeats forgery attacks by verifying a smartphone’s PRNU identity through a challenge response protocol using a visible light communication channel. A user captures two time-variant QR codes and sends the two images to a server, which verifies the identity by fingerprint and image content matching. The time-variant QR codes can also defeat replay attacks. Our experiments with 16,000 images over 40 smartphones show that ABC can efficiently authenticate user devices with an error rate less than 0.5%.

Device Pairing at the Touch of an Electrode

Device pairing is the problem of having two devices securely establish a key that can be used to secure subsequent communication. The problem arises every time two devices that do not already share a secret need to bootstrap a secure communication channel. Many solutions exist, all suited to different situations, and all with their own strengths and weaknesses. In this paper, we propose a novel approach to device pairing that applies whenever a user wants to pair two devises that can be physically touched at the same time. The pairing process is easy to perform, even for novice users. A central problem for a device (Alice) running a device pairing protocol, is determining whether the other party (Bob) is in fact the device that we are supposed to establish a key with. Our scheme is based on the idea that two devices can perform device pairing, if they are physically held by the same person (at the same time). In order to pair two devices, a person touches a conductive surface on each device. While the person is in contact with both devices, the human body acts as a transmission medium for intra-body communication and the two devices can communicate through the body. This body channel is used as part of a pairing protocol which allows the devices to agree on a mutual secret and, at the same time, extract physical features to verify that they are being held by the same person. We prove that our device pairing protocol is secure in our threat model and we build a proof of concept set-up and conduct experiments with 15 people to verify the idea in practice.

Face Flashing: a Secure Liveness Detection Protocol based on Light Reflections

Face authentication systems are becoming increasingly prevalent, especially with the rapid development of Deep Learning technologies. However, human facial information is easy to be captured and reproduced, which makes face authentication systems vulnerable to various attacks. Liveness detection is an important defense technique to prevent such attacks, but existing solutions did not provide clear and strong security guarantees, especially in terms of time. To overcome these limitations, we propose a new liveness detection protocol called Face Flashing that significantly increases the bar for launching successful attacks on face authentication systems. By randomly flashing well-designed pictures on a screen and analyzing the reflected light, our protocol has leveraged physical characteristics of human faces: reflection processing at the speed of light, unique textual features, and uneven 3D shapes. Cooperating with working mechanism of the screen and digital cameras, our protocol is able to detect subtle traces left by an attacking process. To demonstrate the effectiveness of Face Flashing, we implemented a prototype and performed thorough evaluations with large data set collected from real-world scenarios. The results show that our Timing Verification can effectively detect the time gap between legitimate authentications and malicious cases. Our Face Verification can also differentiate 2D plain from 3D objects accurately. The overall accuracy of our liveness detection system is 98.8%, and its robustness was evaluated in different scenarios. In the worst case, our system’s accuracy decreased to a still-high 97.3%.

Session 4A: Measurements

A Large-scale Analysis of Content Modification by Open HTTP Proxies

Open HTTP proxies offer a quick and convenient solution for routing web traffic towards a destination. In contrast to more elaborate relaying systems, such as anonymity networks or VPN services, users can freely connect to an open HTTP proxy without the need to install any special software. Therefore, open HTTP proxies are an attractive option for bypassing IPbased filters and geo-location restrictions, circumventing content blocking and censorship, and in general, hiding the client’s IP address when accessing a web server. Nevertheless, the consequences of routing traffic through an untrusted third party can be severe, while the operating incentives of the thousands of publicly available HTTP proxies are questionable. In this paper, we present the results of a large-scale analysis of open HTTP proxies, focusing on determining the extent to which user traffic is manipulated while being relayed. We have designed a methodology for detecting proxies that, instead of passively relaying traffic, actively modify the relayed content. Beyond simple detection, our framework is capable of macroscopically attributing certain traffic modifications at the network level to well-defined malicious actions, such as ad injection, user fingerprinting, and redirection to malware landing pages. We have applied our methodology on a large set of publicly available HTTP proxies, which we monitored for a period of two months, and identified that 38% of them perform some form of content modification. The majority of these proxies can be considered benign, as they do not perform any harmful content modification. However, 5.15% of the tested proxies were found to perform modification or injection that can be considered as malicious or unwanted. Specifically, 47% of the malicious proxies injected ads, 39% injected code for collecting user information that can be used for tracking and fingerprinting, and 12% attempted to redirect the user to pages that contain malware. Our study reveals the true incentives of many of the publicly available web proxies. Our findings raise several concerns, as we uncover multiple cases where users can be severely affected by connecting to an open proxy. As a step towards protecting users against unwanted content modification, we built a service that leverages our methodology to automatically collect and probe public proxies, and generates a list of safe proxies that do not perform any content modification, on a daily basis.

Measuring and Disrupting Anti-Adblockers Using Differential Execution Analysis

Millions of people use adblockers to remove intrusive and malicious ads as well as protect themselves against tracking and pervasive surveillance. Online publishers consider adblockers a major threat to the ad-powered “free” Web. They have started to retaliate against adblockers by employing antiadblockers which can detect and stop adblock users. To counter this retaliation, adblockers in turn try to detect and filter anti-adblocking scripts. This back and forth has prompted an escalating arms race between adblockers and anti-adblockers. We want to develop a comprehensive understanding of antiadblockers, with the ultimate aim of enabling adblockers to bypass state-of-the-art anti-adblockers. In this paper, we present a differential execution analysis to automatically detect and analyze anti-adblockers. At a high level, we collect execution traces by visiting a website with and without adblockers. Through differential execution analysis, we are able to pinpoint the conditions that lead to the differences caused by anti-adblocking code. Using our system, we detect anti-adblockers on 30.5% of the Alexa top10K websites which is 5-52 times more than reported in prior literature. Unlike prior work which is limited to detecting visible reactions (e.g., warning messages) by anti-adblockers, our system can discover attempts to detect adblockers even when there is no visible reaction. From manually checking one third of the detected websites, we find that the websites that have no visible reactions constitute over 90% of the cases, completely dominating the ones that have visible warning messages. Finally, based on our findings, we further develop JavaScript rewriting and API hooking based solutions (the latter implemented as a Chrome extension) to help adblockers bypass state-of-the-art anti-adblockers.

Towards Measuring the Effectiveness of Telephony Blacklists

The convergence of telephony with the Internet has led to numerous new attacks that make use of phone calls to defraud victims. In response to the increasing number of unwanted or fraudulent phone calls, a number of call blocking applications have appeared on smartphone app stores, including a recent update to the default Android phone app that alerts users of suspected spam calls. However, little is known about the methods used by these apps to identify malicious numbers, and how effective these methods are in practice. In this paper, we are the first to systematically investigate multiple data sources that may be leveraged to automatically learn phone blacklists, and to explore the potential effectiveness of such blacklists by measuring their ability to block future unwanted phone calls. Specifically, we consider four different data sources: user-reported call complaints submitted to the Federal Trade Commission (FTC), complaints collected via crowd-sourced efforts (e.g., 800notes.com), call detail records (CDR) from a large telephony honeypot [1], and honeypot-based phone call audio recordings. Overall, our results show that phone blacklists are capable of blocking a significant fraction of future unwanted calls (e.g., more than 55%). Also, they have a very low false positive rate of only 0.01% for phone numbers of legitimate businesses. We also propose an unsupervised learning method to identify prevalent spam campaigns from different data sources, and show how effective blacklists may be as a defense against such campaigns.

Things You May Not Know About Android (Un)Packers: A Systematic Study based on Whole-System Emulation

The prevalent usage of runtime packers has complicated Android malware analysis, as both legitimate and malicious apps are leveraging packing mechanisms to protect themselves against reverse engineer. Although recent efforts have been made to analyze particular packing techniques, little has been done to study the unique characteristics of Android packers. In this paper, we report the first systematic study on mainstream Android packers, in an attempt to understand their security implications. For this purpose, we developed DROIDUNPACK, a whole-system emulation based Android packing analysis framework, which compared with existing tools, relies on intrinsic characteristics of Android runtime (rather than heuristics), and further enables virtual machine inspection to precisely recover hidden code and reveal packing behaviors. Running our tool on 6 major commercial packers, 93,910 Android malware samples and 3 existing state-of-the-art unpackers, we found that not only are commercial packing services abused to encrypt malicious or plagiarized contents, they themselves also introduce securitycritical vulnerabilities to the apps being packed. Our study further reveals the prevalence and rapid evolution of custom packers used by malware authors, which cannot be defended against using existing techniques, due to their design weaknesses.

Session 6A: Cloud

Reduced Cooling Redundancy: A New Security Vulnerability in a Hot Data Center

Data centers have been growing rapidly in recent years to meet the surging demand of cloud services. However, the expanding scale and powerful servers generate a great amount of heat, resulting in significant cooling costs. A trend in modern data centers is to raise the temperature and maintain all servers in a relatively hot environment. While this can save on cooling costs given benign workloads running in servers, the hot environment increases the risk of cooling failure. In this paper, we unveil a new vulnerability of existing data centers with aggressive cooling energy saving policies. Such a vulnerability might be exploited to launch thermal attacks that could severely worsen the thermal conditions in a data center. Specifically, we conduct thermal measurements and uncover effective thermal attack vectors at the server, rack, and data center levels. We also present damage assessments of thermal attacks. Our results demonstrate that thermal attacks can (1) largely increase the temperature of victim servers degrading their performance and reliability, (2) negatively impact on thermal conditions of neighboring servers causing local hotspots, (3) raise the cooling cost, and (4) even lead to cooling failures. Finally, we propose effective defenses to prevent thermal attacks from becoming a serious security threat to data centers.

OBLIVIATE: A Data Oblivious Filesystem for Intel SGX

Intel SGX provides con dentiality and integrity of a program running within the con nes of an enclave, and is expected to enable valuable security applications such as private information retrieval. This paper is concerned with the security aspects of SGX in accessing a key system resource, les. Through concrete attack scenarios, we show that all existing SGX lesystems are vulnerable to either system call snooping, page fault, or cache based side-channel attacks. To address this security limitations in current SGX lesystems, we present OBLIVIATE, a data oblivious lesystem for Intel SGX. The key idea behind OBLIVIATE is in adapting the ORAM protocol to read and write data from a le within an SGX enclave. OBLIVIATE redesigns the conceptual components of ORAM for SGX environments, and it seamlessly supports an SGX program without requiring any changes in the application layer. OBLIVIATE also employs SGX-speci c defenses and optimizations in order to ensure complete security with acceptable overhead. The evaluation of the prototype of OBLIVIATE demonstrated its practical effectiveness in running popular server applications such as SQLite and Lighttpd, while also achieving a throughput improvement of 2×- 8× over a baseline ORAM-based solution, and less than 2× overhead over an in-memory SGX lesystem.

Microarchitectural Minefields: 4K-Aliasing Covert Channel and Multi-Tenant Detection in Iaas Clouds

We introduce a new microarchitectural timing covert channel using the processor memory order buffer (MOB). Specifically, we show how an adversary can infer the state of a spy process on the Intel 64 and IA-32 architectures when predicting dependent loads through the store buffer, called 4K-aliasing. The 4K-aliasing event is a side-effect of memory disambiguation misprediction while handling write-after-read data hazards wherein the lower 12-bits of a load address will falsely match with store addresses resident in the MOB. In this work, we extensively analyze 4K-aliasing and demonstrate a new timing channel measureable across processes when executed as hyperthreads. We then use 4K-aliasing to build a robust covert communication channel on both the Amazon EC2 and Google Compute Engine capable of communicating at speeds of 1.28 Mbps and 1.49 Mbps, respectively. In addition, we show that 4K-aliasing can also be used to reliably detect multi-tenancy.

Cloud Strife: Mitigating the Security Risks of Domain-Validated Certificates

Infrastructure-as-a-Service (IaaS), and more generally the “cloud,” like Amazon Web Services (AWS) or Microsoft Azure, have changed the landscape of system operations on the Internet. Their elasticity allows operators to rapidly allocate and use resources as needed, from virtual machines, to storage, to bandwidth, and even to IP addresses, which is what made them popular and spurred innovation. In this paper, we show that the dynamic component paired with recent developments in trust-based ecosystems (e.g., SSL certificates) creates so far unknown attack vectors. Specifically, we discover a substantial number of stale DNS records that point to available IP addresses in clouds, yet, are still actively attempted to be accessed. Often, these records belong to discontinued services that were previously hosted in the cloud. We demonstrate that it is practical, and time and cost efficient for attackers to allocate IP addresses to which stale DNS records point. Considering the ubiquity of domain validation in trust ecosystems, like SSL certificates, an attacker can impersonate the service using a valid certificate trusted by all major operating systems and browsers. The attacker can then also exploit residual trust in the domain name for phishing, receiving and sending emails, or possibly distribute code to clients that load remote code from the domain (e.g., loading of native code by mobile apps, or JavaScript libraries by websites). Even worse, an aggressive attacker could execute the attack in less than 70 seconds, well below common time-to-live (TTL) for DNS records. In turn, it means an attacker could exploit normal service migrations in the cloud to obtain a valid SSL certificate for domains owned and managed by others, and, worse, that she might not actually be bound by DNS records being (temporarily) stale, but that she can exploit caching instead. We introduce a new authentication method for trust-based domain validation that mitigates staleness issues without incurring additional certificate requester effort by incorporating existing trust of a name into the validation process. Furthermore, we provide recommendations for domain name owners and cloud operators to reduce their and their clients’ exposure to DNS staleness issues and the resulting domain takeover attacks.

Session 7A: Web Security

Game of Missuggestions: Semantic Analysis of Search-Autocomplete Manipulations

As a new type of blackhat Search Engine Optimization (SEO), autocomplete manipulations are increasingly utilized by miscreants and promotion companies alike to advertise desired suggestion terms when related trigger terms are entered by the user into a search engine. Like other illicit SEO, such activities game the search engine, mislead the querier, and in some cases, spread harmful content. However, little has been done to understand this new threat, in terms of its scope, impact and techniques, not to mention any serious effort to detect such manipulated terms on a large scale. Systematic analysis of autocomplete manipulation is challenging, due to the scale of the problem (tens or even hundreds of millions suggestion terms and their search results) and the heavy burdens it puts on the search engines. In this paper, we report the first technique that addresses these challenges, making a step toward better understanding and ultimately eliminating this new threat. Our technique, called Sacabuche, takes a semantics-based, two-step approach to minimize its performance impact: it utilizes Natural Language Processing (NLP) to analyze a large number of trigger and suggestion combinations, without querying search engines, to filter out the vast majority of legitimate suggestion terms; only a small set of suspicious suggestions are run against the search engines to get query results for identifying truly abused terms. This approach achieves a 96.23% precision and 95.63% recall, and its scalability enables us to perform a measurement study on 114 millions of suggestion terms, an unprecedented scale for this type of studies. The findings of the study bring to light the magnitude of the threat (0.48% Google suggestion terms we collected manipulated), and its significant security implications never reported before (e.g., exceedingly long lifetime of campaigns, sophisticated techniques and channels for spreading malware and phishing content).

SYNODE: Understanding and Automatically Preventing Injection Attacks on NODE.JS

The Node.js ecosystem has lead to the creation of many modern applications, such as serverside web applications and desktop applications. Unlike client-side JavaScript code, Node.js applications can interact freely with the operating system without the benefits of a security sandbox. As a result, command injection attacks can cause significant harm, which is compounded by the fact that independently developed Node.js modules interact in uncontrolled ways. This paper presents a large-scale study across 235,850 Node.js modules to explore injection vulnerabilities. We show that injection vulnerabilities are prevalent in practice, both due to eval, which was previously studied for browser code, and due to the powerful exec API introduced in Node.js. Our study suggests that thousands of modules may be vulnerable to command injection attacks and that fixing them takes a long time, even for popular projects. Motivated by these findings, we present Synode, an automatic mitigation technique that combines static analysis and runtime enforcement of security policies to use vulnerable modules in a safe way. The key idea is to statically compute a template of values passed to APIs that are prone to injections, and to synthesize a grammar-based runtime policy from these templates. Our mechanism is easy to deploy: it does not require any modification of the Node.js platform, it is fast (sub-millisecond runtime overhead), and it protects against attacks of vulnerable modules, while inducing very few false positives (less than 10%).

JavaScript Zero: Real JavaScript and Zero Side-Channel Attacks

Modern web browsers are ubiquitously used by billions of users, connecting them to the world wide web. From the other side, web browsers do not only provide a unified interface for businesses to reach customers, but they also provide a unified interface for malicious actors to reach users. The highly optimized scripting language JavaScript plays an important role in the modern web, as well as for browser-based attacks. These attacks include microarchitectural attacks, which exploit the design of the underlying hardware. In contrast to software bugs, there is often no easy fix for microarchitectural attacks. We propose JavaScript Zero, a highly practical and generic fine-grained permission model in JavaScript to reduce the attack surface in modern browsers. JavaScript Zero facilitates advanced features of the JavaScript language to dynamically deflect usage of dangerous JavaScript features. To implement JavaScript Zero in practice, we overcame a series of challenges to protect potentially dangerous features, guarantee the completeness of our solution, and provide full compatibility with all websites. We demonstrate that our proof-of-concept browser extension Chrome Zero protects against 11 unfixed state-of-the-art microarchitectural and sidechannel attacks. As a side effect, Chrome Zero also protects against 50% of the published JavaScript 0-day exploits since Chrome 49. Chrome Zero has a performance overhead of 1.82% on average. In a user study, we found that for 24 websites in the Alexa Top 25, users could not distinguish browsers with and without Chrome Zero correctly, showing that Chrome Zero has no perceivable effect on most websites. Hence, JavaScript Zero is a practical solution to mitigate JavaScript-based state-of-the-art microarchitectural and side-channel attacks.

Riding out DOMsday: Towards Detecting and Preventing DOM Cross-Site Scripting

Cross-site scripting (XSS) vulnerabilities are the most frequently reported web application vulnerability. As complex JavaScript applications become more widespread, DOM (Document Object Model) XSS vulnerabilities—a type of XSS vulnerability where the vulnerability is located in client-side JavaScript, rather than server-side code—are becoming more common. As the first contribution of this work, we empirically assess the impact of DOM XSS on the web using a browser with taint tracking embedded in the JavaScript engine. Building on the methodology used in a previous study that crawled popular websites, we collect a current dataset of potential DOM XSS vulnerabilities. We improve on the methodology for confirming XSS vulnerabilities, and using this improved methodology, we find 83% more vulnerabilities than previous methodology applied to the same dataset. As a second contribution, we identify the causes of and discuss how to prevent DOM XSS vulnerabilities. One example of our findings is that custom HTML templating designs—a design pattern that could prevent DOM XSS vulnerabilities analogous to parameterized SQL—can be buggy in practice, allowing DOM XSS attacks. As our third contribution, we evaluate the error rates of three static-analysis tools to detect DOM XSS vulnerabilities found with dynamic analysis techniques using in-the-wild examples. We find static-analysis tools to miss 90% of bugs found by our dynamic analysis, though some tools can have very few false positives and at the same time find vulnerabilities not found using the dynamic analysis.

Session 7B: Audit Logs

Towards Scalable Cluster Auditing through Grammatical Inference over Provenance Graphs

Investigating the nature of system intrusions in large distributed systems remains a notoriously difficult challenge. While monitoring tools (e.g., Firewalls, IDS) provide preliminary alerts through easy-to-use administrative interfaces, attack reconstruction still requires that administrators sift through gigabytes of system audit logs stored locally on hundreds of machines. At present, two fundamental obstacles prevent synergy between system-layer auditing and modern cluster monitoring tools: 1) the sheer volume of audit data generated in a data center is prohibitively costly to transmit to a central node, and 2) systemlayer auditing poses a “needle-in-a-haystack” problem, such that hundreds of employee hours may be required to diagnose a single intrusion. This paper presents Winnower, a scalable system for auditbased cluster monitoring that addresses these challenges. Our key insight is that, for tasks that are replicated across nodes in a distributed application, a model can be defined over audit logs to succinctly summarize the behavior of many nodes, thus eliminating the need to transmit redundant audit records to a central monitoring node. Specifically, Winnower parses audit records into provenance graphs that describe the actions of individual nodes, then performs grammatical inference over individual graphs using a novel adaptation of Deterministic Finite Automata (DFA) Learning to produce a behavioral model of many nodes at once. This provenance model can be efficiently transmitted to a central node and used to identify anomalous events in the cluster. We have implemented Winnower for Docker Swarm container clusters and evaluate our system against real-world applications and attacks. We show that Winnower dramatically reduces storage and network overhead associated with aggregating system audit logs, by as much as 98%, without sacrificing the important information needed for attack investigation. Winnower thus represents a significant step forward for security monitoring in distributed systems.

MCI : Modeling-based Causality Inference in Audit Logging for Attack Investigation

In this paper, we develop a model based causality inference technique for audit logging that does not require any application instrumentation or kernel modification. It leverages a recent dynamic analysis, dual execution (LDX), that can infer precise causality between system calls but unfortunately requires doubling the resource consumption such as CPU time and memory consumption. For each application, we use LDX to acquire precise causal models for a set of primitive operations. Each model is a sequence of system calls that have inter-dependences, some of them caused by memory operations and hence implicit at the system call level. These models are described by a language that supports various complexity such as regular, context-free, and even context-sensitive. In production run, a novel parser is deployed to parse audit logs (without any enhancement) to model instances and hence derive causality. Our evaluation on a set of real-world programs shows that the technique is highly effective. The generated models can recover causality with 0% false-positives (FP) and false-negatives (FN) for most programs and only 8.3% FP and 5.2% FN in the worst cases. The models also feature excellent composibility, meaning that the models derived from primitive operations can be composed together to describe causality for large and complex real world missions. Applying our technique to attack investigation shows that the system-wide attack causal graphs are highly precise and concise, having better quality than the state-of-the-art.

Towards a Timely Causality Analysis for Enterprise Security

The increasingly sophisticated Advanced Persistent Threat (APT) attacks have become a serious challenge for enterprise IT security. Attack causality analysis, which tracks multi-hop causal relationships between files and processes to diagnose attack provenances and consequences, is the first step towards understanding APT attacks and taking appropriate responses. Since attack causality analysis is a time-critical mission, it is essential to design causality tracking systems that extract useful attack information in a timely manner. However, prior work is limited in serving this need. Existing approaches have largely focused on pruning causal dependencies totally irrelevant to the attack, but fail to differentiate and prioritize abnormal events from numerous relevant, yet benign and complicated system operations, resulting in long investigation time and slow responses. To address this problem, we propose PRIOTRACKER, a backward and forward causality tracker that automatically prioritizes the investigation of abnormal causal dependencies in the tracking process. Specifically, to assess the priority of a system event, we consider its rareness and topological features in the causality graph. To distinguish unusual operations from normal system events, we quantify the rareness of each event by developing a reference model which records common routine activities in corporate computer systems. We implement PRIOTRACKER, in 20K lines of Java code, and a reference model builder in 10K lines of Java code. We evaluate our tool by deploying both systems in a real enterprise IT environment, where we collect 1TB of 2.5 billion OS events from 150 machines in one week. Experimental results show that PRIOTRACKER can capture attack traces that are missed by existing trackers and reduce the analysis time by up to two orders of magnitude.

JSgraph: Enabling Reconstruction of Web Attacks via Efficient Tracking of Live In-Browser JavaScript Executions

In this paper, we propose JSgraph, a forensic engine that is able to efficiently record fine-grained details pertaining to the execution of JavaScript (JS) programs within the browser, with particular focus on JS-driven DOM modifications. JSgraph’s main goal is to enable a detailed, post-mortem reconstruction of ephemeral JS-based web attacks experienced by real network users. In particular, we aim to enable the reconstruction of social engineering attacks that result in the download of malicious executable files or browser extensions, among other attacks. We implement JSgraph by instrumenting Chromium’s code base at the interface between Blink and V8, the rendering and JavaScript engines. We design JSgraph to be lightweight, highly portable, and to require low storage capacity for its fine-grained audit logs. Using a variety of both in-the-wild and lab-reproduced web attacks, we demonstrate how JSgraph can aid the forensic investigation process. We then show that JSgraph introduces acceptable overhead, with a median overhead on popular website page loads between 3.2% and 3.9%.

Session 10: Social Networks and Anonymity

Investigating Ad Transparency Mechanisms in Social Media: A Case Study of Facebooks Explanations

Targeted advertising has been subject to many privacy complaints from both users and policy makers. Despite this attention, users still have little understanding of what data the advertising platforms have about them and why they are shown particular ads. To address such concerns, Facebook recently introduced two transparency mechanisms: a “Why am I seeing this?” button that provides users with an explanation of why they were shown a particular ad (ad explanations), and an Ad Preferences Page that provides users with a list of attributes Facebook has inferred about them and how (data explanations). In this paper, we investigate the level of transparency provided by these two mechanisms. We first define a number of key properties of explanations and then evaluate empirically whether Facebook’s explanations satisfy them. For our experiments, we develop a browser extension that collects the ads users receive every time they browse Facebook, their respective explanations, and the attributes listed on the Ad Preferences Page; we then use controlled experiments where we create our own ad campaigns and target the users that installed our extension. Our results show that ad explanations are often incomplete and sometimes misleading while data explanations are often incomplete and vague. Taken together, our findings have significant implications for users, policy makers, and regulators as social media advertising services mature.

Inside Job: Applying Traffic Analysis to Measure Tor from Within

In this paper, we explore traffic analysis attacks on Tor that are conducted solely with middle relays rather than with relays from the entry or exit positions. We create a methodology to apply novel Tor circuit and website fingerprinting from middle relays to detect onion service usage; that is, we are able to identify websites with hidden network addresses by their traffic patterns. We also carry out the first privacypreserving popularity measurement of a single social networking website hosted as an onion service by deploying our novel circuit and website fingerprinting techniques in the wild. Our results show: (i) that the middle position enables wide-scale monitoring and measurement not possible from a comparable resource deployment in other relay positions, (ii) that traffic fingerprinting techniques are as effective from the middle relay position as prior works show from a guard relay, and (iii) that an adversary can use our fingerprinting methodology to discover the popularity of onion services, or as a filter to target specific nodes in the network, such as particular guard relays.

Smoke Screener or Straight Shooter: Detecting Elite Sybil Attacks in User-Review Social Networks

Popular User-Review Social Networks (URSNs)— such as Dianping, Yelp, and Amazon—are often the targets of reputation attacks in which fake reviews are posted in order to boost or diminish the ratings of listed products and services. These attacks often emanate from a collection of accounts, called Sybils, which are collectively managed by a group of real users. A new advanced scheme, which we term elite Sybil attacks, recruits organically highly-rated accounts to generate seeminglytrustworthy and realistic-looking reviews. These elite Sybil accounts taken together form a large-scale sparsely-knit Sybil network for which existing Sybil fake-review defense systems are unlikely to succeed. In this paper, we conduct the first study to define, characterize, and detect elite Sybil attacks. We show that contemporary elite Sybil attacks have a hybrid architecture, with the first tier recruiting elite Sybil workers and distributing tasks by Sybil organizers, and with the second tier posting fake reviews for profit by elite Sybil workers. We design ELSIEDET, a three-stage Sybil detection scheme, which first separates out suspicious groups of users, then identifies the campaign windows, and finally identifies elite Sybil users participating in the campaigns. We perform a large-scale empirical study on ten million reviews from Dianping, by far the most popular URSN service in China. Our results show that reviews from elite Sybil users are more spread out temporally, craft more convincing reviews, and have higher filter bypass rates. We also measure the impact of Sybil campaigns on various industries (such as cinemas, hotels, restaurants) as well as chain stores, and demonstrate that monitoring elite Sybil users over time can provide valuable early alerts against Sybil campaigns.

2019 programme(有PPT和视频) dblp

1B: Web Security

Don’t Trust The Locals: Investigating the Prevalence of Persistent Client-Side Cross-Site Scripting in the Wild

The Web has become highly interactive and an important driver for modern life, enabling information retrieval, social exchange, and online shopping. From the security perspective, Cross-Site Scripting (XSS) is one of the most nefarious attacks against Web clients. Research has long since focused on three categories of XSS: Reflected, Persistent, and DOM-based XSS. In this paper, we argue that our community must consider at least four important classes of XSS, and present the first systematic study of the threat of Persistent Client-Side XSS, caused by the insecure use of client-side storage. While the existence of this class has been acknowledged, especially by the non-academic community like OWASP, prior works have either only found such flaws as side effects of other analyses or focused on a limited set of applications to analyze. Therefore, the community lacks in-depth knowledge about the actual prevalence of Persistent Client-Side XSS in the wild.

To close this research gap, we leverage taint tracking to identify suspicious flows from client-side persistent storage (Web Storage, cookies) to dangerous sinks (HTML, JavaScript, and script.src).
We discuss two attacker models capable of injecting malicious payloads into storage, i.e., a Network Attacker capable of temporarily hijacking HTTP communication (e.g., in a public WiFi), and a Web Attacker who can leverage flows into storage or an existing reflected XSS flaw to persist their payload. With our taint-aware browser and these models in mind, we study the prevalence of Persistent Client-Side XSS in the Alexa Top 5,000 domains.
We find that more than 8% of them have unfiltered data flows from persistent storage to a dangerous sink, which showcases the developers’ inherent trust in the integrity of storage content. Even worse, if we only consider sites that make use of data originating from storage, 21% of the sites are vulnerable. For those sites with vulnerable flows from storage to sink, we find that at least 70% are directly exploitable by our attacker models. Finally, investigating the vulnerable flows originating from storage allows us to categorize them into four disjoint categories and propose appropriate mitigations.

Master of Web Puppets: Abusing Web Browsers for Persistent and Stealthy Computation

The proliferation of web applications has essentially transformed modern browsers into small but powerful operating systems. Upon visiting a website, user devices run implicitly trusted script code, the execution of which is confined within the browser to prevent any interference with the user’s system. Recent JavaScript APIs, however, provide advanced capabilities that not only enable feature-rich web applications, but also allow attackers to perform malicious operations despite the confined nature of JavaScript code execution.
In this paper, we demonstrate the powerful capabilities that modern browser APIs provide to attackers by presenting MarioNet: a framework that allows a remote malicious entity to control a visitor’s browser and abuse its resources for unwanted computation or harmful operations, such as cryptocurrency mining, password-cracking, and DDoS. MarioNet relies solely on already available HTML5 APIs, without requiring the installation of any additional software. In contrast to previous browser- based botnets, the persistence and stealthiness characteristics of MarioNet allow the malicious computations to continue in the background of the browser even after the user closes the window or tab of the initially visited malicious website. We present the design, implementation, and evaluation of our prototype system, which is compatible with all major browsers, and discuss potential defense strategies to counter the threat of such persistent in- browser attacks. Our main goal is to raise awareness about this new class of attacks, and inform the design of future browser APIs so that they provide a more secure client-side environment for web applications.

Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation

In order to evaluate the prevalence of security and privacy practices on a representative sample of the Web, researchers rely on website popularity rankings such as the Alexa list. While the validity and representativeness of these rankings are rarely questioned, our findings show the contrary: we show for four main rankings how their inherent properties (similarity, stability, representativeness, responsiveness and benignness) affect their composition and therefore potentially skew the conclusions made in studies. Moreover, we find that it is trivial for an adversary to manipulate the composition of these lists. We are the first to empirically validate that the ranks of domains in each of the lists are easily altered, in the case of Alexa through as little as a single HTTP request. This allows adversaries to manipulate rankings on a large scale and insert malicious domains into whitelists or bend the outcome of research studies to their will. To overcome the limitations of such rankings, we propose improvements to reduce the fluctuations in list composition and guarantee better defenses against manipulation. To allow the research community to work with reliable and reproducible rankings, we provide Tranco, an improved ranking that we offer through an online service available at https://tranco-list.eu.

JavaScript Template Attacks: Automatically Inferring Host Information for Targeted Exploits

Today, more and more web browsers and extensions provide anonymity features to hide user details. Primarily used to evade tracking by websites and advertisements, these features are also used by criminals to prevent identification. Thus, not only tracking companies but also law-enforcement agencies have an interest in finding flaws which break these anonymity features. For instance, for targeted exploitation using zero days, it is essential to have as much information about the target as possible. A failed exploitation attempt, e.g., due to a wrongly guessed operating system, can burn the zero-day, effectively costing the attacker money. Also for side-channel attacks, it is of the utmost importance to know certain aspects of the victim’s hardware configuration, e.g., the instruction-set architecture. Moreover, knowledge about specific environmental properties, such as the operating system, allows crafting more plausible dialogues for phishing attacks.

In this paper, we present a fully automated approach to find subtle differences in browser engines caused by the environment. Furthermore, we present two new side-channel attacks on browser engines to detect the instruction-set architecture and the used memory allocator. Using these differences, we can deduce information about the system, both about the software as well as the hardware. As a result, we cannot only ease the creation of fingerprints, but we gain the advantage of having a more precise picture for targeted exploitation. Our approach allows automating the cumbersome manual search for such differences. We collect all data available to the JavaScript engine and build templates from these properties. If a property of such a template stays the same on one system but differs on a different system, we found an environment-dependent property.

We found environment-dependent properties in Firefox, Chrome, Edge, and mobile Tor, allowing us to reveal the underlying operating system, CPU architecture, used privacy-enhancing plugins, as well as exact browser version. We stress that our method should be used in the development of browsers and privacy extensions to automatically find flaws in the implementation.

Latex Gloves: Protecting Browser Extensions from Probing and Revelation Attacks

Browser extensions enable rich experience for the users of today’s web. Being deployed with elevated privileges, extensions are given the power to overrule web pages. As a result, web pages often seek to detect the installed extensions, sometimes for benign adoption of their behavior but sometimes as part of privacy-violating user fingerprinting. Researchers have studied a class of attacks that allow detecting extensions by probing for Web Accessible Resources (WARs) via URLs that include public extension IDs. Realizing privacy risks associated with WARs, Firefox has recently moved to randomize a browser extension’s ID, prompting the Chrome team to plan for following the same path. However, rather than mitigating the issue, the randomized IDs can in fact exacerbate the extension detection problem, enabling attackers to use a randomized ID as a reliable fingerprint of a user. We study a class of extension revelation attacks, where extensions reveal themselves by injecting their code on web pages. We demonstrate how a combination of revelation and probing can uniquely identify 90% out of all extensions injecting content, in spite of a randomization scheme. We perform a series of large-scale studies to estimate possible implications of both classes of attacks. As a countermeasure, we propose a browser-based mechanism that enables control over which extensions are loaded on which web pages and present a proof of concept implementation which blocks both classes of attacks.

maTLS: How to Make TLS middlebox-aware?

Middleboxes (MBs) are widely deployed in order to enhance security and performance in networking.
However, as the communications over the TLS become increasingly common, the end-to-end channel model of the TLS undermines the efficacy of MBs.
Existing solutions, such as `split TLS’ that intercepts TLS sessions, often introduce significant security risks by installing a custom root certificate or sharing a private key.
Many studies have confirmed the vulnerabilities of combining the TLS with MBs, which include certificate validation failures, unwanted content modification, and using obsolete ciphersuites.
To address the above issues, we introduce an MB-aware TLS protocol, dubbed maTLS, that allows MBs to participate in the TLS in a visible and accountable fashion.
Every participating MB now splits a session into two segments with its own security parameters in collaboration with the two endpoints.
However, the session is still secure as the maTLS protocol is designed to achieve the authentication of MBs, the audit of MBs’ operations, and the verification of security parameters of segments.
We carry out testbed-based experiments to show that maTLS achieves the above security goals with marginal overhead.
We also prove the security model of maTLS by using Tamarin, a security verification tool.

2B: Malware and Threats

Cracking the Wall of Confinement: Understanding and Analyzing Malicious Domain Take-downs

Take-down operations aim to disrupt cybercrime involving malicious domains. In the past decade, many successful take-down operations have been reported, including those against the Conficker worm, and most recently, against VPNFilter. Although it plays an important role in fighting cybercrime, the domain take-down procedure is still surprisingly opaque. There seems to be no in-depth understanding about how the take-down operation works and whether there is due diligence to ensure its security and reliability.

In this paper, we report the first systematic study on domain takedown. Our study was made possible via a large collection of data, including various sinkhole feeds and blacklists, passive DNS data spanning six years, and historical Whois information. Over these datasets, we built a unique methodology that extensively used various reverse lookups and other data analysis techniques to address the challenges in identifying taken-down domains, sinkhole operators, and take-down durations. Applying the methodology on the data, we discovered over 620K taken-down domains and conducted a longitudinal analysis on the take-down process, thus facilitating a better understanding of the operation and its weaknesses. We found that more than 14% of domains taken-down over the past ten months have been released back to the domain market and that some of the released domains have been repurchased by the malicious actor again before being captured and seized, either by the same or different sinkholes. In addition, we showed that the misconfiguration of DNS records corresponding to the sinkholed domains allowed us to hijack a domain that was seized by the FBI. Further, we found that expired sinkholes have caused the transfer of around 30K taken-down domains whose traffic is now under the control of new owners.

Cleaning Up the Internet of Evil Things: Real-World Evidence on ISP and Consumer Efforts to Remove Mirai

With the rise of IoT botnets, the remediation of infected devices has become a critical task. As over 87% of these devices reside in broadband networks, this task will fall primarily to consumers and the Internet Service Providers. We present the first empirical study of IoT malware cleanup in the wild – more specifically, of removing Mirai infections in the network of a medium-sized ISP. To measure remediation rates, we combine data from an observational study and a randomized controlled trial involving 220 consumers who suffered a Mirai infection together with data from honeypots and darknets. We find that quarantining and notifying infected customers via a walled garden, a best practice from ISP botnet mitigation for conventional malware, remediates 92% of the infections within 14 days. Email-only notifications have no observable impact compared to a control group where no notifications were sent. We also measure surprisingly high natural remediation rates of 58-74% for this control group and for two reference networks where users were also not notified. Even more surprising, reinfection rates are low. Only 5% of the customers who remediated suffered another infection in the five months after our first study. This stands in contrast to our lab tests, which observed reinfection of real IoT devices within minutes – a discrepancy for which we explore various different possible explanations, but find no satisfactory answer. We gather data on customer experiences and actions via 76 phone interviews and the communications logs of the ISP. Remediation succeeds even though many users are operating from the wrong mental model – e.g., they run anti-virus software on their PC to solve the infection of an IoT device. While quarantining infected devices is clearly highly effective, future work will have to resolve several remaining mysteries. Furthermore, it will be hard to scale up the walled garden solution because of the weak incentives of the ISPs.

Measurement and Analysis of Hajime, a Peer-to-peer IoT Botnet

The Internet of Things (IoT) introduces an unprecedented diversity and ubiquity to networked computing. It also introduces new attack surfaces that are a boon to attackers. The recent Mirai botnet showed the potential and power of a collection of compromised IoT devices. A new botnet, known as Hajime, targets many of the same devices as Mirai, but differs considerably in its design and operation. Hajime uses a public peer-to-peer system as its command and control infrastructure, and regularly introduces new exploits, thereby increasing its resilience.

We show that Hajime’s distributed design makes it a valuable tool for better understanding IoT botnets. For instance, Hajime cleanly separates its bots into different peer groups depending on their underlying hardware architecture. Through detailed measurement—active scanning of Hajime’s peer-to-peer infrastructure and passive, longitudinal collection of root DNS backscatter traffic—we show that Hajime can be used as a lens into how IoT botnets operate, what kinds of devices they compromise, and what countries are more (or less) susceptible. Our results show that there are more compromised IoT devices than previously reported; that these devices use an assortment of CPU architectures, the popularity of which varies widely by country; that churn is high among IoT devices; and that new exploits can quickly and drastically increase the size and power of IoT botnets. Our code and data are available to assist future efforts to measure and mitigate the growing threat of IoT botnets.

Countering Malicious Processes with Process-DNS Association

Modern malware and cyber attacks depend heavily on DNS services to make their campaigns reliable and difficult to track. Monitoring network DNS activities and blocking suspicious domains have been proven an effective technique in countering such attacks. However, recent successful campaigns reveal that at- tackers adapt by using seemingly benign domains and public web storage services to hide malicious activity. Also, the recent support for encrypted DNS queries provides attacker easier means to hide malicious traffic from network-based DNS monitoring.

We propose PDNS, an end-point DNS monitoring system based on DNS sensor deployed at each host in a network, along with a centralized backend analysis server. To detect such attacks, PDNS expands the monitored DNS activity context and examines process context which triggered that activity. Specifically, each deployed PDNS sensor matches domain name and the IP address related to the DNS query with process ID, binary signature, loaded DLLs, and code signing information of the program that initiated it. We evaluate PDNS on a DNS activity dataset collected from 126 enterprise hosts and with data from multiple malware sources. Using ML Classifiers including DNN, our results outperform most previous works with high detection accuracy: a true positive rate at 98.55% and a low false positive rate at 0.03%.

ExSpectre: Hiding Malware in Speculative Execution

Recently, the Spectre and Meltdown attacks revealed serious vulnerabilities in modern CPU designs, allowing an attacker to exfiltrate data from sensitive programs. These vulnerabilities take advantage of speculative execution to coerce a processor to perform computation that would otherwise not occur, leaking the resulting information via side channels to an attacker.

In this paper, we extend these ideas in a different direction, and leverage speculative execution in order to hide malware from both static and dynamic analysis. Using this technique, critical portions of a malicious program’s computation can be shielded from view, such that even a debugger following an instruction-level trace of the program cannot tell how its results were computed.

We introduce ExSpectre, which compiles arbitrary malicious code into a seemingly-benign payload binary. When a separate trigger program runs on the same machine, it mistrains the CPU’s branch predictor, causing the payload program to speculatively execute its malicious payload, which communicates speculative results back to the rest of the payload program to change its real-world behavior.

We study the extent and types of execution that can be performed speculatively, and demonstrate several computations that can be performed covertly. In particular, within speculative execution we are able to decrypt memory using AES-NI instructions at over 11 kbps. Building on this, we decrypt and interpret a custom virtual machine language to perform arbitrary computation and system calls in the real world. We demonstrate this with a proof-of-concept dial back shell, which takes only a few milliseconds to execute after the trigger is issued. We also show how our corresponding trigger program can be a pre-existing benign application already running on the system, and demonstrate this concept with OpenSSL driven remotely by the attacker as a trigger program.

ExSpectre demonstrates a new kind of malware that evades existing reverse engineering and binary analysis techniques. Because its true functionality is contained in seemingly unreachable dead code, and its control flow driven externally by potentially any other program running at the same time, ExSpectre poses a novel threat to state-of-the-art malware analysis techniques.

3A: Adversarial Machine Learning

ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models

Machine learning (ML) has become a core component of many real-world applications and training data is a key factor that drives current progress. This huge success has led Internet companies to deploy machine learning as a service (MLaaS). Recently, the first membership inference attack has shown that extraction of information on the training set is possible in such MLaaS settings, which has severe security and privacy implications.

However, the early demonstrations of the feasibility of such attacks have many assumptions on the adversary, such as using multiple so-called shadow models, knowledge of the target model structure, and having a dataset from the same distribution as the target model’s training data. We relax all these key assumptions, thereby showing that such attacks are very broadly applicable at low cost and thereby pose a more severe risk than previously thought. We present the most comprehensive study so far on this emerging and developing threat using eight diverse datasets which show the viability of the proposed attacks across domains.

In addition, we propose the first effective defense mechanisms against such broader class of membership inference attacks that maintain a high level of utility of the ML model.

MBeacon: Privacy-Preserving Beacons for DNA Methylation Data

The advancement of molecular profiling techniques fuels biomedical research with a deluge of data. To facilitate data sharing, the Global Alliance for Genomics and Health established the Beacon system, a search engine designed to help researchers find datasets of interest. While the current Beacon system only supports genomic data, other types of biomedical data, such as DNA methylation, are also essential for advancing our understanding in the field. In this paper, we propose the first Beacon system for DNA methylation data sharing: MBeacon. As the current genomic Beacon is vulnerable to privacy attacks, such as membership inference, and DNA methylation data is highly sensitive, we take a privacy-by-design approach to construct MBeacon. First, we demonstrate the privacy threat, by proposing a membership inference attack tailored specifically to unprotected methylation Beacons. Our experimental results show that 100 queries are sufficient to achieve a successful attack with AUC (area under the ROC curve) above 0.9. To remedy this situation, we propose a novel differential privacy mechanism, namely SVT^2, which is the core component of MBeacon. Extensive experiments over multiple datasets show that SVT^2 can successfully mitigate membership privacy risks without significantly harming utility. We further implement a fully functional prototype of MBeacon which we make available to the research community.

Stealthy Adversarial Perturbations Against Real-Time Video Classification Systems

Recent research has demonstrated the brittleness of machine learning systems to adversarial perturbations. However, the studies have been mostly limited to perturbations on images and more generally, classification tasks that do not deal with real-time stream inputs. In this paper we ask ”Are adversarial perturbations that cause misclassification in real-time video classification systems possible, and if so what properties must they satisfy?” Real-time video classification systems find application in surveillance applications, smart vehicles, and smart elderly care and thus, misclassification could be particularly harmful (e.g., a mishap at an elderly care facility may be missed). Video classification systems take video clips as inputs and these clip boundaries are not deterministic. We show that perturbations that do not take “the indeterminism in the clip boundaries input to the video classifier” into account, do not achieve high attack success rates. We propose novel approaches for generating 3D adversarial perturbations (perturbation clips) that exploit recent advances in generative models to not only overcome this key challenge but also provide stealth. In particular, our most potent 3D adversarial perturbations cause targeted activities in video streams to be misclassified with rates over 80%. At the same time, they also ensure that the perturbations leave other (untargeted) activities largely unaffected making them extremely stealthy. Finally, we also derive a single-frame (2D) perturbation that can be applied to every frame in a video stream, and which in many cases, achieves extremely high misclassification rates.

NIC: Detecting Adversarial Samples with Neural Network Invariant Checking

Deep Neural Networks (DNN) are vulnerable to adversarial samples that are generated by perturbing correctly classified inputs to cause DNN models to misbehave (e.g., misclassification). This can potentially lead to disastrous consequences especially in security-sensitive applications. Existing defense and detection techniques work well for specific attacks under various assumptions (e.g., the set of possible attacks are known beforehand). However, they are not sufficiently general to protect against a broader range of attacks. In this paper, we analyze the internals of DNN models under various attacks and identify two common exploitation channels: the provenance channel and he activation value distribution channel. We then propose a novel technique to extract DNN invariants and use them to perform runtime adversarial sample detection. Our experimental results of 11 different kinds of attacks on popular datasets including ImageNet and 13 models show that our technique can effectively detect all these attacks (over 90% accuracy) with limited false positives. We also compare it with three state-of-the-art techniques including the Local Intrinsic Dimensionality (LID) based method, denoiser based methods (i.e., MagNet and HGD), and the prediction inconsistency based approach (i.e., feature squeezing). Our experiments show promising results.

TextBugger: Generating Adversarial Text Against Real-world Applications

Deep Learning-based Text Understanding (DLTU) is the backbone technique behind various applications, including question answering, machine translation, and text classification. Despite its tremendous popularity, the security vulnerabilities of DLTU are still largely unknown, which is highly concerning given its increasing use in security-sensitive applications such as user sentiment analysis and toxic content detection. In this paper, we show that DLTU is inherently vulnerable to adversarial text attacks, in which maliciously crafted text triggers target DLTU systems and services to misbehave. Specifically, we present TextBugger, a general attack framework for generating adversarial text. In contrast of prior work, TextBugger differs in significant ways: (i) effective – it outperforms state-of-the-art attacks in terms of attack success rate; (ii) evasive – it preserves the utility of benign text, with 94.9% of the adversarial text correctly recognized by human readers; and (iii) efficient – it generates adversarial text with computational complexity sub-linear to the text length. We empirically evaluate TextBugger on a set of real-world DLTU systems and services used for sentiment analysis and toxic content detection, demonstrating its effectiveness, evasiveness, and efficiency. For instance, TextBugger achieves 100% success rate on the IMDB dataset based on Amazon AWS Comprehend within 4.61 seconds and preserves 97% semantic similarity. We further discuss possible defense mechanisms to mitigate such attack and the adversary’s potential countermeasures, which leads to promising directions for further research.

3B-2: Censorship

The use of TLS in Censorship Circumvention

TLS, the Transport Layer Security protocol, has quickly become the most popular protocol on the Internet, already used to load over 70% of web pages in Mozilla Firefox. Due to its ubiquity, TLS is also a popular protocol for censorship circumvention tools, including Tor and Signal, among others.

However, the wide range of features supported in TLS makes it possible to distinguish implementations from one another by what set of cipher suites, elliptic curves, signature algorithms, and other extensions they support. Already, censors have used deep packet inspection (DPI) to identify and block popular circumvention tools based on the fingerprint of their TLS implementation.

In response, many circumvention tools have attempted to mimic popular TLS implementations such as browsers, but this technique has several challenges. First, it is burdensome to keep up with the rapidly-changing browser TLS implementations, and know what fingerprints would be good candidates to mimic. Second, TLS implementations can be difficult to mimic correctly, as they offer many features that may not be supported by the relatively lightweight libraries used in typical circumvention tools. Finally, dependency changes and updates to the underlying libraries can silently impact what an application’s TLS fingerprint looks like, making it difficult for tools to control.

In this paper, we collect and analyze real-world TLS traffic from over 11.8 billion TLS connections over 9 months to identify a wide range of TLS client implementations actually used on the Internet. We use our data to analyze TLS implementations of several popular censorship circumvention tools, including Lantern, Psiphon, Signal, Outline, Tapdance, and Tor (Snowflake and meek). We find that the many of these tools use TLS configurations that are easily distinguishable from the real-world traffic they attempt to mimic, even when these tools have put effort into parroting popular TLS implementations.

To address this problem, we have developed a library, uTLS, that enables tool maintainers to automatically mimic other popular TLS implementations. Using our real-world traffic dataset, we observe many popular TLS implementations we are able to correctly mimic with uTLS, and we describe ways our tool can more flexibly adopt to the dynamic TLS ecosystem with minimal manual effort.

On the Challenges of Geographical Avoidance for Tor

Traffic-analysis attacks are a persisting threat for Tor users. When censors or law enforcement agencies try to identify users, they conduct traffic-confirmation attacks and monitor encrypted transmissions to extract metadata—in combination with routing attacks, these attacks become sufficiently powerful to de-anonymize users. While traffic-analysis attacks are hard to detect and expensive to counter in practice, geographical avoidance provides an option to reject circuits that might be routed through an untrusted area. Unfortunately, recently proposed solutions introduce severe security issues by imprudent design decisions.

In this paper, we approach geographical avoidance starting from a thorough assessment of its challenges. These challenges serve as the foundation for the design of an empirical avoidance concept that considers actual transmission characteristics for justified decisions. Furthermore, we address the problems of untrusted or intransparent ground truth information that hinder a reliable assessment of circuits. Taking these features into account, we conduct an empirical simulation study and compare the performance of our novel avoidance concept with existing
approaches. Our results show that we outperform existing systems by 22 % fewer rejected circuits, which reduces the collateral damage of overly restrictive avoidance decisions. In a second evaluation step, we extend our initial system concept and implement the prototype MultilateraTor. This prototype is the first to satisfy the requirements of a practical deployment, as it maintains Tor’s original level of security, provides reasonable performance, and overcomes the fundamental security flaws of existing systems.

4A: Fuzzing

PeriScope: An Effective Probing and Fuzzing Framework for the Hardware-OS Boundary

The OS kernel is an attractive target for remote attackers. If compromised, the kernel gives adversaries full system access, including the ability to install rootkits, extract sensitive information, and perform other malicious actions, all while evading detection. Most of the kernel’s attack surface is situated along the system call boundary. Ongoing kernel protection efforts have focused primarily on securing this boundary; several capable analysis and fuzzing frameworks have been developed for this purpose.

However, there are additional paths to kernel compromise that do not involve system calls, as demonstrated by several recent exploits. For example, by compromising the firmware of a peripheral device such as a Wi-Fi chipset and subsequently sending malicious inputs from the Wi-Fi chipset to the Wi-Fi driver, adversaries have been able to gain control over the kernel without invoking a single system call. Unfortunately, there are currently no practical probing and fuzzing frameworks that can help developers find and fix such vulnerabilities occurring along the hardware-OS boundary.

We present PeriScope, a Linux kernel based probing framework that enables fine-grained analysis of device-driver interactions. PeriScope hooks into the kernel’s page fault handling mechanism to either passively monitor and log traffic between device drivers and their corresponding hardware, or mutate the data stream on-the-fly using a fuzzing component, PeriFuzz, thus mimicking an active adversarial attack. PeriFuzz accurately models the capabilities of an attacker on peripheral devices, to expose different classes of bugs including, but not limited to, memory corruption bugs and double-fetch bugs. To demonstrate the risk that peripheral devices pose, as well as the value of our framework, we have evaluated PeriFuzz on the Wi-Fi drivers of two popular chipset vendors, where we discovered 15 unique vulnerabilities, 9 of which were previously unknown.

REDQUEEN: Fuzzing with Input-to-State Correspondence

Automated software testing based on fuzzing has experienced a revival in recent years. Especially feedback-driven fuzzing has become well-known for its ability to efficiently perform randomized testing with limited input corpora. Despite a lot of progress, two common problems are magic numbers and (nested) checksums. Computationally expensive methods such as taint tracking and symbolic execution are typically used to overcome such roadblocks. Unfortunately, such methods often require access to source code, a rather precise description of the environment (e.g., behavior of library calls or the underlying OS), or the exact semantics of the platform’s instruction set.

In this paper, we introduce a lightweight, yet very effective alternative to taint tracking and symbolic execution to facilitate and optimize state-of-the-art feedback fuzzing that easily scales to large binary applications and unknown environments. We observe that during the execution of a given program, parts of the input often end up directly (i.e., nearly unmodified) in the program state. This input-to-state correspondence can be exploited to create a robust method to overcome common fuzzing roadblocks in a highly effective and efficient manner. Our prototype implementation, called REDQUEEN, is able to solve magic bytes and (nested) checksum tests automatically for a given binary executable. Additionally, we show that our techniques outperform various state-of-the-art tools on a wide variety of targets across different privilege levels (kernel-space and userland) with no platform-specific code. REDQUEEN is the first method to find more than 100% of the bugs planted in LAVA-M across all targets. Furthermore, we were able to discover 65 new bugs and obtained 16 CVEs in multiple programs and OS kernel drivers. Finally, our evaluation demonstrates that REDQUEEN is fast, widely applicable and outperforms concurrent approaches by up to three orders of magnitude.

NAUTILUS: Fishing for Deep Bugs with Grammars

Fuzzing is a well-known method for efficiently identifying bugs in programs. Unfortunately, when fuzzing targets that require highly-structured inputs such as interpreters, many fuzzing methods struggle to pass the syntax checks. More specifically, interpreters often process inputs in multiple stages: first syntactic, then semantic correctness is checked. Only if these checks are passed, the interpreted code gets executed. This prevents fuzzers from executing deeper’’ — and hence potentially more interesting — code. Typically two valid inputs that lead to the execution of different features in the target application require too many mutations for simple mutation-based fuzzers to discover: making small changes like bit flips usually only leads to the execution of error paths in the parsing engine. So-called grammar fuzzers are able to pass the syntax checks by using Context-Free Grammars. Using feedback can significantly increase the efficiency of fuzzing engines. Hence, it is commonly used in state-of-the-art mutational fuzzers that do not use grammars. Yet, grammar fuzzers do not make use of code coverage, i.e., they do not know whether any input triggers new functionality or not.

In this paper, we propose NAUTILUS, a method to efficiently fuzz programs that require highly-structured inputs by combining the use of grammars with the use of code coverage feedback. This allows us to recombine aspects of interesting inputs that were learned individually, and to dramatically increase the probability that any generated input will be accepted by the parser. We implemented a proof-of-concept fuzzer that we tested on multiple targets, including ChakraCore (the JavaScript engine of Microsoft Edge), PHP, mruby, and Lua. NAUTILUS identified multiple bugs in all of the targets: Seven in mruby, three in PHP, two in hakraCore, and one in Lua. Reporting these bugs was awarded with a sum of 2600 USD and 6 CVEs were assigned. Our experiments show that combining context-free grammars and feedback-driven fuzzing significantly outperforms state-of-the-art approaches like American Fuzzy Lop (AFL) by an order of magnitude and grammar fuzzers by more than a factor of two when measuring code coverage.

Analyzing Semantic Correctness with Symbolic Execution: A Case Study on PKCS#1 v1.5 Signature Verification

We discuss how symbolic execution can be used to not only find low-level errors but also analyze the semantic correctness of protocol implementations. To avoid manually crafting test cases, we propose a strategy of meta-level search, which leverages constraints stemmed from the input formats to automatically generate concolic test cases. Additionally, to aid root-cause analysis, we develop constraint provenance tracking (CPT), a mechanism that associates atomic sub-formulas of path constraints with their corresponding source level origins. We demonstrate the power of symbolic analysis with a case study on PKCS#1 v1.5 signature verification. Leveraging meta-level search and CPT, we analyzed 15 recent open-source implementations using symbolic execution and found semantic flaws in 6 of them. Further analysis of these flaws showed that 4 implementations are susceptible to new variants of the Bleichenbacher low- exponent RSA signature forgery. One implementation suffers from potential denial of service attacks with purposefully crafted signatures. All our findings have been responsibly shared with the affected vendors. Among the flaws discovered, 6 new CVEs have been assigned to the immediately exploitable ones.

Send Hardest Problems My Way: Probabilistic Path Prioritization for Hybrid Fuzzing

Hybrid fuzzing which combines fuzzing and concolic execution has become an advanced technique for software vulnerability detection. Based on the observation that fuzzing and concolic execution are complementary in nature, the state-of-the-art hybrid fuzzing systems deploy demand launch'' andoptimal switch’’ strategies. Although these ideas sound intriguing, we point out several fundamental limitations in them, due to oversimplified assumptions. We then propose a novel discriminative dispatch’’ strategy to better utilize the capability of concolic execution. We design a novel Monte Carlo based probabilistic path prioritization model to quantify each path’s difficulty and prioritize them for concolic execution. This model treats fuzzing as a random sampling process. It calculates each path’s probability based on the sampling information. Finally, our model prioritizes and assigns the most difficult paths to concolic execution. We implement a prototype system DigFuzz and evaluate our system with two representative datasets. Results show that the concolic execution in DigFuzz outperforms than that in a state-of-the-art hybrid fuzzing system Driller in every major aspect. In particular, the concolic execution in DigFuzz contributes to discovering more vulnerabilities (12 vs. 5) and producing more code coverage (18.9% vs. 3.8%) on the CQE dataset than the concolic execution in Driller.

4B: Privacy on the Web

Measuring the Facebook Advertising Ecosystem

The Facebook advertising platform has been subject to a number of controversies in the past years regarding privacy violations, lack of transparency, as well as its capacity to be used by dishonest actors for discrimination or propaganda. In this study, we aim to provide a better understanding of the Facebook advertising ecosystem, focusing on how it is being used by advertisers. We first analyze the set of advertisers and then investigate how those advertisers are targeting users and customizing ads via the platform. Our analysis is based on the data we collected from over 600 real-world users via a browser extension that collects the ads our users receive when they browse their Facebook timeline, as well as the explanations for why users received these ads.

Our results reveal that users are targeted by a wide range of advertisers (e.g., from popular to niche advertisers); that a non-negligible fraction of advertisers are part of potentially sensitive categories such as news and politics, health or religion; that a significant number of advertisers employ targeting strategies that could be either invasive or opaque; and that many advertisers use a variety of targeting parameters and ad texts. Overall, our work emphasizes the need for better mechanisms to audit ads and advertisers in social media and provides an overview of the platform usage that can help move towards such mechanisms.

We Value Your Privacy … Now Take Some Cookies: Measuring the GDPR’s Impact on Web Privacy

The European Union’s General Data Protection Regulation (GDPR) went into effect on May 25, 2018. Its privacy regulations apply to any service and company collecting or processing personal data in Europe. Many companies had to adjust their data handling processes, consent forms, and privacy policies to comply with the GDPR’s transparency requirements. We monitored this rare event by analyzing changes on popular websites in all 28 member states of the European Union. For each country, we periodically examined its 500 most popular websites – 6,579 in total – for the presence of and updates to their privacy policy between December 2017 and October 2018. While many websites already had privacy policies, we find that in some countries up to 15.7 % of websites added new privacy policies by May 25, 2018, resulting in 84.5 % of websites having privacy policies. 72.6 % of websites with existing privacy policies updated them close to the date. After May this positive development slowed down noticeably. Most visibly, 62.1 % of websites in Europe now display cookie consent notices, 16 % more than in January 2018. These notices inform users about a site’s cookie use and user tracking practices. We categorized all observed cookie consent notices and evaluated 28 common implementations with respect to their technical realization of cookie consent. Our analysis shows that core web security mechanisms such as the same-origin policy pose problems for the implementation of consent according to GDPR rules, and opting out of third-party cookies requires the third party to cooperate. Overall, we conclude that the web became more transparent at the time GDPR came into force, but there is still a lack of both functional and usable mechanisms for users to consent to or deny processing of their personal data on the Internet.

How Bad Can It Git? Characterizing Secret Leakage in Public GitHub Repositories

GitHub and similar platforms have made public collaborative development of software commonplace. However, a problem arises when this public code must manage authentication secrets, such as API keys or cryptographic secrets. These secrets must be kept private for security, yet common development practices like adding these secrets to code make accidental leakage frequent. In this paper, we present the first large-scale and longitudinal analysis of secret leakage on GitHub. We examine billions of files collected using two complementary approaches: a nearly six-month scan of real-time public GitHub commits and a public snapshot covering 13% of open-source repositories. We focus on private key files and 11 high-impact platforms with distinctive API key formats. This focus allows us to develop conservative detection techniques that we manually and automatically evaluate to ensure accurate results. We find that not only is secret leakage pervasive — affecting over 100,000 repositories— but that thousands of new, unique secrets are leaked every day. We also use our data to explore possible root causes of leakage and to evaluate potential mitigation strategies. This work shows that secret leakage on public repository platforms is rampant and far from a solved problem, placing developers and services at persistent risk of compromise and abuse.

DNS Cache-Based User Tracking

We describe a novel user tracking technique that is based on assigning statistically unique DNS records per user. This new tracking technique is unique in being able to distinguish between machines that have identical hardware and software, and track users even if they use “privacy mode” browsing, or use multiple browsers (on the same machine).
The technique overcomes issues related to the caching of DNS answers in resolvers, and utilizes per-device caching of DNS answers at the client. We experimentally demonstrate that it covers the technologies used by a very large fraction of Internet users (in terms of browsers, operating systems, and DNS resolution platforms).
Our technique can track users for up to a day (typically), and therefore works best when combined with other, narrower yet longer-lived techniques such as regular cookies - we briefly
explain how to combine such techniques.
We suggest mitigations to this tracking technique but note that it is not easily mitigated. There are possible workarounds, yet these are not without setup overhead, performance overhead or convenience overhead. A complete mitigation requires software modifications in both browsers and resolver software.

Quantity vs. Quality: Evaluating User Interest Profiles Using Ad Preference Managers

Widely reported privacy issues concerning major online advertising platforms (e.g., Facebook) have heightened concerns among users about the data that is collected about them. However, while we have a comprehensive understanding who collects data on users, as well as how tracking is implemented, there is still a significant gap in our understanding: what information do advertisers actually infer about users, and is this information accurate?

In this study, we leverage Ad Preference Managers (APMs) as a lens through which to address this gap. APMs are transparency tools offered by some advertising platforms that allow users to see the interest profiles that are constructed about them. We recruited 220 participants to install an IRB approved browser extension that collected their interest profiles from four APMs (Google, Facebook, Oracle BlueKai, and Neilsen eXelate), as well as behavioral and survey data. We use this data to analyze the size and correctness of interest profiles, compare their composition across the four platforms, and investigate the origins of the data underlying these profiles.

5A: Bugs and Vulnerabilities

Thunderclap: Exploring Vulnerabilities in Operating System IOMMU Protection via DMA from Untrustworthy Peripherals

Direct Memory Access (DMA) attacks have been known for many years: DMA-enabled I/O peripherals have complete access to the state of a computer and can fully compromise it including reading and writing all of system memory.

With the popularity of Thunderbolt 3 over USB Type-C and smart internal devices, opportunities for these attacks to be performed casually with only seconds of physical access to a computer have greatly broadened. In response, commodity hardware and operating-system (OS) vendors have incorporated support for Input-Output Memory Management Units (IOMMUs), which impose memory protection on DMA, and are widely believed to protect against DMA attacks.

We investigate the state-of-the-art in IOMMU protection across OSes using a novel I/O security research platform, and find that current protections fall short when faced with a functional network peripheral that uses its complex interactions with the OS for ill intent, and demonstrate compromises against macOS, FreeBSD, and Linux, which notionally utilize IOMMUs to protect against DMA attackers. Windows only uses the IOMMU in limited cases and remains vulnerable.

Using Thunderclap, an open-source FPGA research platform we built, we explore a number of novel exploit techniques to expose new classes of OS vulnerability. The complex vulnerability space for IOMMU-exposed shared memory available to DMA-enabled peripherals allows attackers to extract private data (sniffing cleartext VPN traffic) and hijack kernel control flow (launching a root shell) in seconds using devices such as USB-C projectors or power adapters.

We have worked closely with OS vendors to remedy these vulnerability classes, and they have now shipped substantial feature improvements and mitigations as a result of our work.

One Engine To Serve ‘em All: Inferring Taint Rules Without Architectural Semantics

Dynamic binary taint analysis has wide applications in the security analysis of commercial-off-the-shelf (COTS) binaries. One of the key challenges in dynamic binary analysis is to specify the taint rules that capture how taint information propagates for each instruction on an architecture. Most of the existing solutions specify taint rules using a deductive approach by summarizing the rules manually after analyzing the instruction semantics. Intuitively, taint propagation reflects on how an instruction input affects its output and thus can be observed from instruction executions. In this work, we propose an inductive method for taint propagation and develop a universal taint tracking engine that is architecture-agnostic. Our taint engine, TAINTINDUCE, can learn taint rules with minimal architectural knowledge by observing the execution behavior of instructions. To measure its correctness and guide taint rule generation, we define the precise notion of soundness for bit-level taint tracking in this novel setup. In our evaluation, we show that TAINT INDUCE automatically learns rules for 4 widely used architectures: x86, x64, AArch64, and MIPS-I. It can detect vulnerabilities for 24 CVEs in 15 applications on both Linux and Windows over millions of instructions and is comparable with other mature existing tools (TEMU [51], libdft [32], Triton [42]). TAINTINDUCE can be used as a standalone taint engine or be used to complement existing taint engines for unhandled instructions. Further, it can be used as a cross-referencing tool to uncover bugs in taint engines, emulation implementations and ISA documentations.

Automating Patching of Vulnerable Open-Source Software Versions in Application Binaries

Mobile application developers rely heavily on open-source software (OSS) to offload common functionalities such as the implementation of protocols and media format playback. Over the past years, several vulnerabilities have been found in popular open-source libraries like OpenSSL and FFmpeg. Mobile applications that include such libraries inherit these flaws, which make them vulnerable. Fortunately, the open-source community is responsive and patches are made available within days. However, mobile application developers are often left unaware of these flaws. The App Security Improvement Program (ASIP) is a commendable effort by Google to notify application developers of these flaws, but recent work has shown that many developers do not act on this information.

Our work addresses vulnerable mobile applications through automatic binary patching from source patches provided by the OSS maintainers and without involving the developers. We propose novel techniques to overcome difficult challenges like patching feasibility analysis, source-code-to-binary-code matching, and in-memory patching. Our technique uses a novel variability-aware approach, which we implement as OSSPatcher. We evaluated OSSPatcher with 39 OSS and a collection of 1,000 Android applications using their vulnerable versions. OSSPatcher generated 675 function-level patches that fixed the affected mobile applications without breaking their binary code. Further, we evaluated 10 vulnerabilities in popular apps such as Chrome with public exploits, which OSSPatcher was able to mitigate and thwart their exploitation.

CRCount: Pointer Invalidation with Reference Counting to Mitigate Use-after-free in Legacy C/C++

Pointer invalidation has been a popular approach adopted in many recent studies to mitigate use-after-free errors. The approach can be divided largely into two different schemes: explicit invalidation and implicit invalidation. The former aims to eradicate the root cause of use-after-free errors by invalidating every dangling pointer one by one explicitly. In contrast, the latter aims to prevent dangling pointers by freeing an object only if there is no pointer referring to it. A downside of the explicit scheme is that it is expensive, as it demands high-cost algorithms or a large amount of space to maintain every up-to-date list of pointer locations linking to each object at all times. Implicit invalidation is more efficient in that even without any explicit effort, it can eliminate dangling pointers by leaving objects undeleted until all the links between the objects and their referring pointers vanish by themselves during program execution. However, such an argument only holds if the scheme knows exactly when each link is created and deleted. Reference counting is a traditional method to determine the existence of reference links between objects and pointers. Unfortunately, impeccable reference counting for legacy C/C++ code is very difficult and expensive to achieve in practice, mainly because of the type unsafe operations in the code. In this paper, we present a solution, called CRCount, to the use-after-free problem in legacy C/C++. For effective and efficient problem solving, CRCount is armed with the pointer footprinting technique that enables us to compute, with high accuracy, the reference count of every object referred to by the pointers in the legacy code. Our experiments demonstrate that CRCount mitigates the use-after-free errors with a lower performance-wise and space-wise overhead than the existing pointer invalidation solutions.

CodeAlchemist: Semantics-Aware Code Generation to Find Vulnerabilities in JavaScript Engines

JavaScript engines are an attractive target for attackers due to their popularity and flexibility in building exploits. Current state-of-the-art fuzzers for finding JavaScript engine vulnerabilities focus mainly on generating syntactically correct test cases based on either a predefined context-free grammar or a trained probabilistic language model. Unfortunately, syntactically correct JavaScript sentences are often semantically invalid at runtime. Furthermore, statically analyzing the semantics of JavaScript code is challenging due to its dynamic nature: JavaScript code is generated at runtime, and JavaScript expressions are dynamically-typed. To address this challenge, we propose a novel test case generation algorithm that we call semantics-aware assembly, and implement it in a fuzz testing tool termed CodeAlchemist. Our tool can generate arbitrary JavaScript code snippets that are both semantically and syntactically correct, and it effectively yields test cases that can crash JavaScript engines. We found numerous vulnerabilities of the latest JavaScript engines with CodeAlchemist and reported them to the vendors.

6A: Authentication

Coconut: Threshold Issuance Selective Disclosure Credentials with Applications to Distributed Ledgers

Coconut is a novel selective disclosure credential scheme supporting distributed threshold issuance, public and private attributes, re-randomization, and multiple unlinkable selective attribute revelations. Coconut integrates with Blockchains to ensure confidentiality, authenticity and availability even when a subset of credential issuing authorities are malicious or offline. We implement and evaluate a generic Coconut smart contract library for Chainspace and Ethereum; and present three applications related to anonymous payments, electronic petitions, and distribution of proxies for censorship resistance.
Coconut uses short and computationally efficient credentials, and our evaluation shows that most Coconut cryptographic primitives take just a few milliseconds on average, with verification taking the longest time (10 milliseconds).

Distinguishing Attacks from Legitimate Authentication Traffic at Scale

Online guessing attacks against password servers can be hard to address. Approaches that throttle or block repeated guesses on an account (e.g., three strikes type lockout rules) can be effective against depth-first attacks, but are of little help against breadth-first attacks that spread guesses very widely. At large providers with tens or hundreds of millions of accounts breadth-first attacks offer a way to send millions or even billions of guesses without ever triggering the depth-first defenses. The absence of labels and non-stationarity of attack traffic make it challenging to apply machine learning techniques.

We show how to accurately estimate the odds that an observation $x$ associated with a request is malicious. Our main assumptions are that successful malicious logins are a small fraction of the total, and that the distribution of $x$ in the legitimate traffic is stationary, or very-slowly varying. From these we show how we can estimate the ratio of bad-to-good traffic among any set of requests; how we can then identify subsets of the request data that contain least (or even no) attack traffic; how these least-attacked subsets allow us to estimate the distribution of values of $x$ over the legitimate data, and hence calculate the odds ratio. A sensitivity analysis shows that even when we fail to identify a subset with little attack traffic our odds ratio estimates are very robust.

Robust Performance Metrics for Authentication Systems

Research has produced many types of authentication systems that use machine learning. However, there is no consistent approach for reporting performance metrics and the reported metrics are inadequate. In this work, we show that several of the common metrics used for reporting performance, such as maximum accuracy (ACC), equal error rate (EER) and area under the ROC curve (AUROC), are inherently flawed. These common metrics hide the details of the inherent trade-offs a system must make when implemented. Our findings show that current metrics give no insight into how system performance degrades outside the ideal conditions in which they were designed. We argue that adequate performance reporting must be provided to enable meaningful evaluation and that current, commonly used approaches fail in this regard. We present the unnormalized frequency count of scores (FCS) to demonstrate the mathematical underpinnings that lead to these failures and show how they can be avoided. The FCS can be used to augment the performance reporting to enable comparison across systems in a visual way. When reported with the Receiver Operating Characteristics curve (ROC), these two metrics provide a solution to the limitations of currently reported metrics. Finally, we show how to use the FCS and ROC metrics to evaluate and compare different authentication systems.

Total Recall: Persistence of Passwords in Android

A good security practice for handling sensitive data, such as passwords, is to overwrite the data buffers with zeros once the data is no longer in use. This protects against attackers who gain a snapshot of a device’s physical memory, whether by in-person physical attacks, or by remote attacks like Meltdown and Spectre. This paper looks at unnecessary password retention in Android phones by popular apps, secure password management apps, and even the lockscreen system process. We have performed a comprehensive analysis of the Android framework and a variety of apps, and discovered that passwords can survive in a variety of locations, including UI widgets where users enter their passwords, apps that retain passwords rather than exchange them for tokens, old copies not yet reused by garbage collectors, and buffers in keyboard apps. We have developed solutions that successfully fix these problems with modest code changes.

How to End Password Reuse on the Web

We present a framework by which websites can coordinate to make it difficult for users to set similar passwords at these websites, in an effort to break the culture of password reuse on the web today.
Though the design of such a framework is fraught with risks to users’ security and privacy, we show that these risks can be effectively mitigated through careful scoping of the goals for such a framework and through principled design. At the core of our framework is a private set-membership-test protocol that enables one website to determine, upon a user setting a password for use at it, whether that user has already set a similar password at another participating website, but with neither side disclosing to the other the password(s) it employs in the protocol. Our framework then layers over this protocol a collection of techniques to mitigate the leakage necessitated by such a test. We verify via probabilistic model checking that these techniques are effective in maintaining account security, and since these mechanisms are consistent with common user experience today, our framework should be unobtrusive to users who do not reuse similar passwords across websites (e.g., due to having adopted a password manager). Through a working implementation of our framework and optimization of its parameters based on insights of how passwords tend to be reused, we show that our design can meet the scalability challenges facing such a service.

11: Machine Learning & Game Theory Applications

Graph-based Security and Privacy Analytics via Collective Classification with Joint Weight Learning and Propagation

Many security and privacy problems can be modeled as a graph classification problem, where nodes in the graph are classified by collective classification simultaneously. State- of-the-art collective classification methods for such graph-based security and privacy analytics follow the following paradigm: assign weights to edges of the graph, iteratively propagate reputation scores of nodes among the weighted graph, and use the final reputation scores to classify nodes in the graph. The key challenge is to assign edge weights such that an edge has a large weight if the two corresponding nodes have the same label, and a small weight otherwise. Although collective classification has been studied and applied for security and privacy problems for more than a decade, how to address this challenge is still an open question. For instance, most existing methods simply set a constant weight to all edges.

In this work, we propose a novel collective classification framework to address this long-standing challenge. We first formulate learning edge weights as an optimization problem, which quantifies the goals about the final reputation scores that we aim to achieve. However, it is computationally hard to solve the optimization problem because the final reputation scores depend on the edge weights in a very complex way. To address the computational challenge, we propose to jointly learn the edge weights and propagate the reputation scores, which is essentially an approximate solution to the optimization problem. We compare our framework with state-of-the-art methods for graph-based security and privacy analytics using four large-scale real-world datasets from various application scenarios such as Sybil detection in social networks, fake review detection in Yelp, and attribute inference attacks. Our results demonstrate that our framework achieves higher accuracies than state-of-the-art methods with an acceptable computational overhead.

Enemy At the Gateways: Censorship-Resilient Proxy Distribution Using Game Theory

A core technique used by popular proxy-based circumvention systems like Tor is to privately and selectively distribute the IP addresses of circumvention proxies among censored clients to keep them unknown to the censors.

In Tor, for instance, such privately shared proxies are known as bridges. A key challenge to this mechanism is the insider attack problem: censoring agents can impersonate benign censored clients in order to learn (and then block) the privately shared circumvention proxies. To minimize the risks of the insider attack threat, in-the-wild circumvention systems like Tor use various proxy assignment mechanisms in order to minimize the risk of proxy enumeration by the censors, while providing access to a large fraction of censored clients.

Unfortunately, existing proxy assignment mechanisms (like the one used by Tor) are based on ad hoc heuristics that offer no theoretical guarantees and are easily evaded in practice. In this paper, we take a systematic approach to the problem of proxy distribution in circumvention systems by establishing a game-theoretic framework. We model the proxy assignment problem as a game between circumvention system operators and the censors, and use game theory to derive the optimal strategies of each of the parties. Using our framework, we derive the best (optimal) proxy assignment mechanism of a circumvention system like Tor in the presence of the strongest censorship adversary who takes her best censorship actions.

We perform extensive simulations to evaluate our optimal proxy assignment algorithm under various adversarial and network settings. We show that the algorithm has superior performance compared to the state of the art, i.e., provides stronger resistance to censorship even against the strongest censorship adversary. Our study establishes a generic framework for optimal proxy assignment that can be applied to various types of circumvention systems and under various threat models. We conclude with lessons and recommendations for the design of proxy-based circumvention systems.

Neuro-Symbolic Execution: Augmenting Symbolic Execution with Neural Constraints

Symbolic execution is a powerful technique for program analysis. However, it has many limitations in practical applicability: the path explosion problem encumbers scalability, the need for language-specific implementation, the inability to handle complex dependencies, and the limited expressiveness of theories supported by underlying satisfiability checkers. Often, relationships between variables of interest are not expressible directly as purely symbolic constraints. To this end, we present a new approach—neuro-symbolic execution—which learns an approximation of the relationship between program values of interest, as a neural network. We develop a procedure for checking satisfiability of mixed constraints, involving both symbolic expressions and neural representations. We implement our new approach in a tool called NeuEx as an extension of KLEE, a state-of-the-art dynamic symbolic execution engine. NeuEx finds 33 exploits in a benchmark of 7 programs within 12 hours. This is an improvement in the bug finding efficacy of 94% over vanilla KLEE. We show that this new approach drives execution down difficult paths on which KLEE and other DSE extensions get stuck, eliminating limitations of purely SMT-based techniques.

Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs

Binary code analysis allows analyzing binary code without having access to the corresponding source code. It is widely used for vulnerability discovery, malware dissection, attack investigation, etc. A binary, after disassembly, is expressed in an assembly language. This inspires us to approach binary analysis by leveraging ideas and techniques from Natural Language Processing (NLP), a rich area focused on processing text of various natural languages. We notice that binary code analysis and NLP share a lot of analogical topics, such as semantics extraction, summarization, and classification. This work utilizes these ideas to address two important code similarity comparison problems. (I) Given a pair of basic blocks for different
instruction set architectures, determining whether their semantics is similar or not; and (II) given a piece of code of interest, determining if it is contained in another piece of assembly code from a different architecture. The solutions to these two problems have many applications, such as cross-architecture code plagiarism detection, malware identification, and vulnerability discovery.

Despite the evident importance of Problem I, existing solutions are either inefficient or imprecise. Inspired by Neural Machine Translation (NMT), which is a new approach that tackles text across natural languages very well, we regard instructions as words and basic blocks as sentences, and propose a novel cross-(assembly)-lingual deep learning approach to solving the first problem, attaining high efficiency and precision. Regarding Problem II, many solutions have been proposed recently to solve this issue at the function level. However, performing cross-architecture code similarity comparison beyond function pairs is a new and more challenging endeavor. Employing our technique for cross-architecture basic-block comparison, we propose an effective solution to Problem II. We implement a prototype system and perform a comprehensive evaluation. A comparison between our approach and existing approaches to Problem I shows that our system outperforms them in terms of accuracy, efficiency and scalability. And the case studies utilizing the system demonstrate that our solution to Problem II is effective. Moreover, this research showcases how to apply ideas and techniques from NLP to large-scale binary code analysis.

2020 Programme(有视频) dblp(暂未更新)

Session 1A: Web

FUSE: Finding File Upload Bugs via Penetration Testing ✔👍

An Unrestricted File Upload (UFU) vulnerability is a critical security threat that enables an adversary to upload her choice of a forged file to a target web server. This bug evolves into an Unrestricted Executable File Upload (UEFU) vulnerability when the adversary is able to conduct remote code execution of the uploaded file via triggering its URL. We design and implement FUSE, the first penetration testing tool designed to discover UFU and UEFU vulnerabilities in server-side PHP web applications. The goal of FUSE is to generate upload requests; each request becomes an exploit payload that triggers a UFU or UEFU vulnerability. However, this approach entails two technical challenges: (1) it should generate an upload request that bypasses all content-filtering checks present in a target web application; and (2) it should preserve the execution semantic of the resulting uploaded file. We address these technical challenges by mutating standard upload requests with carefully designed mutation operations that enable the bypassing of content- filtering checks and do not tamper with the execution of uploaded files. FUSE discovered 30 previously unreported UEFU vulnerabilities, including 15 CVEs from 33 real-world web applications, thereby demonstrating its efficacy in finding code execution bugs via file uploads.

Melting Pot of Origins: Compromising the Intermediary Web Services that Rehost Websites

Intermediary web services such as web proxies, web translators, and web archives have become pervasive as a means to enhance the openness of the web. These services aim to remove the intrinsic obstacles to web access; i.e., access blocking, language barriers, and missing web pages. In this study, we refer to these services as web rehosting services and make the first exploration of their security flaws. The web rehosting services use a single domain name to rehost several websites that have distinct domain names; this characteristic makes web rehosting services intrinsically vulnerable to violating the same origin policy if not operated carefully. Based on the intrinsic vulnerability of web rehosting services, we demonstrate that an attacker can perform five different types of attacks that target users who make use of web rehosting services: persistent man-in-the-middle attack, abusing privileges to access various resources, stealing credentials, stealing browser history, and session hijacking/injection. Our extensive analysis of 21 popular web rehosting services, which have more than 200 million visits per day, revealed that these attacks are feasible. In response to this observation, we provide effective countermeasures against each type of attack.

Social media has become a primary mean of content and information sharing, thanks to its speed and simplicity. In this scenario, link previews play the important role of giving a meaningful first glance to users, summarizing the content of the shared webpage within its title, description and image. In our work, we analyzed the preview-rendering process, observing how it is possible to misuse it to obtain benign-looking previews for malicious links. Concrete use-case of this research field is phishing and spam spread, considering targeted attacks in addition to large-scale campaigns.

We designed a set of experiments for 20 social media platforms including social networks and instant messenger applications and found out how most of the platforms follow their own preview design and format, sometimes providing partial information. Four of these platforms allow preview crafting so as to hide the malicious target even to a tech-savvy user, and we found that it is possible to create misleading previews for the remaining 16 platforms when an attacker can register their own domain. We also observe how 18 social media platforms do not employ active nor passive countermeasures against the spread of known malicious links or software, and that existing cross-checks on malicious URLs can be bypassed through client- and server-side redirections. To conclude, we suggest seven recommendations covering the spectrum of our findings, to improve the overall preview-rendering mechanism and increase users’ overall trust in social media platforms.

Cross-Origin State Inference (COSI) Attacks: Leaking Web Site States through XS-Leaks

In a Cross-Origin State Inference (COSI) attack, an attacker convinces a victim into visiting an attack web page, which leverages the cross-origin interaction features of the victim’s web browser to infer the victim’s state at a target web site. Multiple instances of COSI attacks have been found in the past under different names such as login detection or access detection attacks. But, those attacks only consider two states (e.g., logged in or not) and focus on a specific browser leak method (or XS-Leak).

This work shows that mounting more complex COSI attacks such as deanonymizing the owner of an account, determining if the victim owns sensitive content, and determining the victim’s account type often requires considering more than two states. Furthermore, robust attacks require supporting a variety of browsers since the victim’s browser cannot be predicted apriori. To address these issues, we present a novel approach to identify and build complex COSI attacks that differentiate more than
two states and support multiple browsers by combining multiple attack vectors, possibly using different XS-Leaks. To enable our approach, we introduce the concept of a COSI attack class. We propose two novel techniques to generalize existing COSI attack instances into COSI attack classes and to discover new COSI attack classes. We systematically study existing attacks and apply our techniques to them, identifying 40 COSI attack classes. As part of this process, we discover a novel XS-Leak based on window.postMessage. We implement our approach into Basta-COSI, a tool to find COSI attacks in a target web site. We apply Basta-COSI to test four stand-alone web applications and 58 popular web sites, finding COSI attacks against each of them.

Carnus: Exploring the Privacy Threats of Browser Extension Fingerprinting

With users becoming increasingly privacy-aware and browser vendors incorporating anti-tracking mechanisms, browser fingerprinting has garnered significant attention. Accordingly, prior work has proposed techniques for identifying browser extensions and using them as part of a device’s fingerprint. While previous studies have demonstrated how extensions can be detected through their web accessible resources, there exists a significant gap regarding techniques that indirectly detect extensions through behavioral artifacts. In fact, no prior study has demonstrated that this can be done in an automated fashion. In this paper, we bridge this gap by presenting the first fully automated creation and detection of behavior-based extension fingerprints. We also introduce two novel fingerprinting techniques that monitor extensions’ communication patterns, namely outgoing HTTP requests and intra-browser message exchanges. These techniques comprise the core of Carnus, a modular system for the static and dynamic analysis of extensions, which we use to create the largest set of extension fingerprints to date. We leverage our dataset of 29,428 detectable extensions to conduct a comprehensive investigation of extension fingerprinting in realistic settings and demonstrate the practicality of our attack. Our experimental evaluation against a state-of-the-art countermeasure confirms the robustness of our techniques as 87.92% of our behavior-based fingerprints remain effective.

Subsequently, we aim to explore the true extent of the privacy threat that extension fingerprinting poses to users, and present a novel study on the feasibility of inference attacks that reveal private and sensitive user information based on the functionality and nature of their extensions. We first collect over 1.44 million public user reviews of our detectable extensions, which provide a unique macroscopic view of the browser extension ecosystem and enable a more precise evaluation of the discriminatory power of extensions as well as a new deanonymization vector. We also automatically categorize extensions based on the developers’ descriptions and identify those that can lead to the inference of personal data (religion, medical issues, etc.). Overall, our research sheds light on previously unexplored dimensions of the privacy threats of extension fingerprinting and highlights the need for more effective countermeasures that can prevent our attacks.

Session 1B: Fuzzing

HYPER-CUBE: High-Dimensional Hypervisor Fuzzing

Applying modern fuzzers to novel targets is often a very lucrative venture. Hypervisors are part of a very critical code base: compromising them could allow an attacker to compromise the whole cloud infrastructure of any cloud provider. In this paper, we build a novel fuzzer that aims explicitly at testing modern hypervisors.
Our high throughput fuzzer design for long running interactive targets allows us to fuzz a large number of hypervisors, both open source, and proprietary. In contrast to one-dimensional fuzzers such as AFL, HYPER-CUBE can interact with any number of interfaces in any order.

Our evaluation shows that we can find more bugs (over 2x) and coverage (as much as 2x) than state of the art hypervisor fuzzers. Additionally, in most cases, we were able to do so using multiple orders of magnitude less time than comparable fuzzers. HYPER-CUBE was also able to rediscover a set of well-known vulnerabilities for hypervisors, such as VENOM, in less than five minutes. In total, HYPER-CUBE found 54 novel bugs, and so far we obtained 37 CVEs.
Our evaluation results demonstrates that next generation coverage-guided fuzzers should incorporate a higher-throughput design for long running targets such as hypervisors.

HFL: Hybrid Fuzzing on the Linux Kernel

Hybrid fuzzing, combining symbolic execution and fuzzing, is a promising approach for vulnerability discovery because each approach can complement the other. However, we observe that applying hybrid fuzzing to kernel testing is challenging because the following unique characteristics of the kernel make a naive adoption of hybrid fuzzing inefficient: 1) having many implicit control transfers determined by syscall arguments, 2) controlling and matching internal system state via system calls, and 3) inferring nested argument type for invoking system calls. Failure to handling such challenges will render both fuzzing and symbolic execution inefficient, and thereby, will result in an inefficient hybrid fuzzing. Although these challenges are essential to both fuzzing and symbolic execution, however, to the best of our knowledge, existing kernel testing approaches either naively use each technique separately without handling such challenges or imprecisely handle a part of challenges only by static analysis.

To this end, this paper proposes HFL, which not only combines fuzzing with symbolic execution for hybrid fuzzing but also addresses kernel-specific fuzzing challenges via three distinct features: 1) converting implicit control transfers to explicit transfers, 2) inferring system call sequence to build a consistent system state, and 3) identifying nested arguments types of system calls. As a result, HFL found 24 previously unknown vulnerabilities in recent Linux kernels. Additionally, HFL achieves 14% higher code coverage than Syzkaller, and over S2E/TriforceAFL, achieving even eight times better coverage, using the same amount of resource (CPU, time, etc.). Regarding vulnerability discovery performance, HFL found 13 known vulnerabilities more than three times faster than Syzkaller.

HotFuzz: Discovering Algorithmic Denial-of-Service Vulnerabilities Through Guided Micro-Fuzzing

Fifteen billion devices run Java and many of them are connected to the Internet. As this ecosystem continues to grow, it remains an important task to discover the unknown security threats these devices face. Fuzz testing repeatedly runs software on random inputs in order to trigger unexpected program behaviors, such as crashes or timeouts, and has historically revealed serious security vulnerabilities. Contemporary fuzz testing techniques focus on identifying memory corruption vulnerabilities that allow adversaries to achieve remote code execution. Meanwhile, algorithmic complexity (AC) vulnerabilities, which are a common attack vector for denial-of-service attacks, remain an understudied threat.

In this paper, we present HotFuzz, a framework for automatically discovering AC vulnerabilities in Java libraries. HotFuzz uses micro-fuzzing, a genetic algorithm that evolves arbitrary Java objects in order to trigger the worst-case performance for a method under test. We define Small Recursive Instantiation (SRI) which provides seed inputs to micro-fuzzing represented as Java objects. After micro-fuzzing, HotFuzz synthesizes test cases that triggered AC vulnerabilities into Java programs and monitors their execution in order to reproduce vulnerabilities outside the analysis framework. HotFuzz outputs those programs that exhibit high CPU utilization as witnesses for AC vulnerabilities in a Java library.

We evaluate HotFuzz over the Java Runtime Environment (JRE), the 100 most popular Java libraries on Maven, and challenges contained in the DARPA Space and Time Analysis for Cyber-Security (STAC) program. We compare the effectiveness of using seed inputs derived using SRI against using empty values. In this evaluation, we verified known AC vulnerabilities, discovered previously unknown AC vulnerabilities that we responsibly reported to vendors, and received confirmation from both IBM and Oracle. Our results demonstrate micro-fuzzing finds AC vulnerabilities in real-world software, and that micro-fuzzing with SRI derived seed inputs complements using empty seed inputs.

Not All Coverage Measurements Are Equal: Fuzzing by Coverage Accounting for Input Prioritization

Coverage-based fuzzing has been actively studied and widely adopted for finding vulnerabilities in real-world software applications. With code coverage, such as statement coverage and transition coverage, as the guidance of input mutation, coverage-based fuzzing can generate inputs that cover more code and thus find more vulnerabilities without prerequisite information such as input format. Current coverage-based fuzzing tools treat covered code equally. All inputs that contribute to new statements or transitions are kept for future mutation no matter what the statements or transitions are and how much they impact security. Although this design is reasonable from the perspective of software testing, which aims to full code coverage, it is inefficient for vulnerability discovery since that 1) current techniques are still inadequate to reach full coverage within a reasonable amount of time, and that 2) we always want to discover vulnerabilities early so that it can be patched promptly. Even worse, due to the non-discriminative code coverage treatment, current fuzzing tools suffer from recent anti-fuzzing techniques and become much less effective in finding real-world vulnerabilities.

To resolve the issue, we propose coverage accounting, an innovative approach that evaluates code coverage by security impacts. Based on the proposed metrics, we design a new scheme to prioritize fuzzing inputs and develop TortoiseFuzz, a greybox fuzzer for memory corruption vulnerabilities. We evaluated TortoiseFuzz on 30 real-world applications and compared it with 5 state-of-the-art greybox and hybrid fuzzers (AFL, AFLFast, FairFuzz, QSYM, and Angora). TortoiseFuzz outperformed all greybox fuzzers and most hybrid fuzzers. It also had comparative results for other hybrid fuzzers yet consumed much fewer resources. Additionally, TortoiseFuzz found 18 new real-world vulnerabilities and has got 8 new CVEs so far. We will open source TortoiseFuzz to foster future research.

Session 2A: Censorship

Detecting Probe-resistant Proxies

Censorship circumvention proxies have to resist active probing attempts, where censors connect to suspected servers and attempt to communicate using known proxy protocols. If the server responds in a way that reveals it is a proxy, the censor can block it with minimal collateral risk to other non-proxy services. Censors such as the Great Firewall of China have previously been observed using basic forms of this technique to find and block proxy servers as soon as they are used. In response, circumventors have created new “probe-resistant” proxy protocols, including obfs4, Shadowsocks, and Lampshade, that attempt to prevent censors from discovering them. These proxies require knowledge of a secret in order to use, and the servers remain silent when probed by a censor that doesn’t have the secret in an attempt to make it more difficult for censors to detect them.

In this paper, we identify ways that censors can still distinguish such probe-resistant proxies from other innocuous hosts on the Internet, despite their design. We discover unique TCP behaviors of five probe-resistant protocols used in popular circumvention software that could allow censors to effectively confirm suspected proxies with minimal false positives. We evaluate and analyze our attacks on hundreds of thousands of servers collected from a 10 Gbps university ISP vantage point over several days as well as active scanning using ZMap. We find that our attacks are able to efficiently identify proxy servers with only a handful of probing connections, with negligible false positives. Using our datasets, we also suggest defenses to these attacks that make it harder for censors to distinguish proxies from other common servers, and we work with proxy developers to implement these changes in several popular circumvention tools.

Decentralized Control: A Case Study of Russia

Until now, censorship research has largely focused on highly centralized networks that rely on government-run technical choke-points, such as the Great Firewall of China. Although it was previously thought to be prohibitively difficult, large-scale censorship in decentralized networks are on the rise. Our in-depth investigation of the mechanisms underlying decentralized information control in Russia shows that such large-scale censorship can be achieved in decentralized networks through inexpensive commodity equipment. This new form of information control presents a host of problems for censorship measurement, including difficulty identifying censored content, requiring measurements from diverse perspectives, and variegated censorship mechanisms that require significant effort to identify in a robust manner.

By working with activists on the ground in Russia, we obtained five leaked blocklists signed by Roskomnadzor, the Russian government’s federal service for mass communications, along with seven years of historical blocklist data. This authoritative list contains domains, IPs, and subnets that ISPs have been required to block since November 1st, 2012. We used the blocklist from April 24 2019, that contains 132,798 domains, 324,695 IPs, and 39 subnets, to collect active measurement data from residential, data center and infrastructural vantage points. Our vantage points span 408 unique ASes that control ~ 65% of Russian IP address space.

Our findings suggest that data centers block differently from the residential ISPs both in quantity and in method of blocking, resulting in different experiences of the Internet for residential network perspectives and data center perspectives. As expected, residential vantage points experience high levels of censorship. While we observe a range of blocking techniques, such as TCP/IP blocking, DNS manipulation, or keyword based filtering, we find that residential ISPs are more likely to inject blockpages with explicit notices to users when censorship is enforced. Russia’s censorship architecture is a blueprint, and perhaps a forewarning of what and how national censorship policies could be implemented in many other countries that have similarly diverse ISP ecosystems to Russia’s. Understanding decentralized control will be key to continuing to preserve Internet freedom for years to come.

Measuring the Deployment of Network Censorship Filters at Global Scale

Content filtering technologies are often used for Internet censorship, but even as these technologies have become cheaper and easier to deploy, the censorship measurement community lacks a systematic approach to monitor their proliferation. Past research has focused on a handful of specific filtering technologies, each of which required cumbersome manual detective work to identify. Researchers and policymakers require a more comprehensive picture of the state and evolution of censorship based on content filtering in order to establish effective policies that protect Internet freedom.

In this work, we present FilterMap, a novel framework that can scalably monitor content filtering technologies based on their blockpages. FilterMap first compiles in-network and new remote censorship measurement techniques to gather blockpages from filter deployments. We then show how the observed blockpages can be clustered, generating signatures for longitudinal tracking. FilterMap outputs a map of regions of address space in which the same blockpages appear (corresponding to filter deployments), and each unique blockpage is manually verified to avoid false positives.

By collecting and analyzing more than 379 million measurements from 45,000 vantage points against more than 18,000 sensitive test domains, we are able to identify filter deployments associated with 90 vendors and actors and observe filtering in 103 countries. We detect the use of commercial filtering technologies for censorship in 36 out of 48 countries labeled as ‘Not Free’ or ‘Partly Free’ by the Freedom House ‘’Freedom on the Net’’ report. The unrestricted transfer of content filtering technologies have led to high availability, low cost, and highly effective filtering techniques becoming easier to deploy and harder to circumvent. Identifying these filtering deployments highlights policy and corporate social responsibility issues, and adds accountability to filter manufacturers. Our continued publication of FilterMap data will help the international community track the scope, scale and evolution of content-based censorship.

SymTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy Discovery

A key characteristic of commonly deployed deep packet inspection (DPI) systems is that they implement a simpli- fied state machine of the network stack that often differs from that of the end hosts. The discrepancies between the two state machines have been exploited to bypass such DPI middleboxes. However, most prior approaches to do so rely on manually crafted adversarial packets, which not only is labor-intensive but may not work well across a plurality of DPI-based middleboxes. Our goal in this work is to develop an automated way to craft such candidate packets, targeting TCP implementations in particular. Our approach to achieve this goal hinges on the key insight that while the TCP state machines of DPI implementations are obscure, those of the end hosts are well established. Thus, in our system SYMTCP, using symbolic execution, we systematically explore the TCP implementation of an end host, identifying candidate packets that can reach critical points in the code (e.g., which causes the packets to be accepted or dropped/ignored); such automatically identified packets are then fed through the DPI middlebox to determine if a discrepancy is induced and the middlebox can be bypassed. We find that our approach is extremely effective. It can generate tens of thousands of candidate adversarial packets in less than an hour. When evaluating against multiple state-of-the-art DPI middleboxes such as Zeek and Snort, as well as a state-level censorship firewall, Great Firewall of China, we identify not only previously known evasion strategies, but also novel ones that were never previously reported (e.g., involving urgent pointer). The system can extend easily to test other combinations of operating systems and DPI middleboxes, and serve as a valuable testing tool of future DPIs’ robustness against evasion attempts.

MassBrowser: Unblocking the Censored Web for the Masses, by the Masses

Existing censorship circumvention systems fail to offer reliable circumvention without sacrificing their users’ QoS and privacy, or undertaking high costs of operation. We have designed and implemented a censorship circumvention system, called SwarmProxy (anonymized name), whose goal is to offer emph{effective censorship circumvention} to a large body of censored users, with emph{high QoS}, emph{low costs of operation}, and emph{adjustable privacy protection}. Towards this, we have made several key decisions in designing our system. First, we argue that circumvention systems should not bundle strong privacy protections (like anonymity) with censorship circumvention. Additional privacy properties should be offered as optional features to the users of circumvention users, which can be enabled by specific users or on specific connections (perhaps by trading off QoS). Second, we combine various state-of-the-art circumvention techniques (such as using censored clients to proxy circumvention traffic for other censored clients, using volunteer NATed proxies, and leveraging CDN hosting) to make SwarmProxy significantly resistant to blocking, while keeping its cost of operation small ($0.001 per censored client per month).

We have built and deployed SwarmProxy as a fully operational system with end-user GUI software for major operating systems. Our system has been in beta release for over a year with hundreds of users from major censoring countries testing it on a daily basis. A key part of SwarmProxy’s design is using non-censored Internet users to run volunteer proxies to help censored users. We have performed the first user study on the willingness of typical Internet users in helping circumvention operators. We have used the findings of our user study in the design of SwarmProxy to encourage wide adoption by volunteers; particularly, our GUI software offers high transparency, control, and safety to the volunteers.

Session 4A: Future Networks

When Match Fields Do Not Need to Match: Buffered Packets Hijacking in SDN

Software-Defined Networking (SDN) greatly meets the need in industry for programmable, agile, and dynamic networks by deploying diversified SDN applications on a centralized controller. However, SDN application ecosystem inevitably introduces new security threats since compromised or malicious applications can significantly disrupt network operations. A number of effective security enhancement systems have been developed to defend against potential attacks from SDN applications, including data provenance systems to protect applications from being poisoned by malicious applications, rule conflict detection systems to prevent data packets from bypassing network security policies, and application isolation systems to prevent applications from corrupting controllers. In this paper, we identify a new design flaw on flow rule installation in SDN, and this vulnerability can be exploited by malicious applications to launch effective attacks bypassing existing defense systems. We discover that SDN systems do not check the inconsistency between the buffer ID and match fields when an application attempts to install flow rules, so that a malicious application can manipulate the buffer ID to hijack buffered packets even though the installed flow rule from the application does not match the packet with that buffer ID. We name this new vulnerability as buffered packet hijacking, which can be exploited to launch attacks that disrupt all three SDN layers, namely, application layer, data plane layer, and control layer. First, by modifying buffered packets and resending them to controllers, a malicious application can poison other applications. Second, by manipulating forwarding behaviors of buffered packets, a malicious application can not only disrupt TCP connections of flows but also make flows bypass network security policies. Third, by copying massive buffered packets to controllers, a malicious application can saturate the bandwidth of the SDN control channel and computing resources. We demonstrate the feasibility and effectiveness of these attacks with both theoretical analysis and experiments in a real SDN testbed. Finally, we develop a lightweight defense system that can be readily deployed in existing SDN controllers as a patch.

Automated Discovery of Cross-Plane Event-Based Vulnerabilities in Software-Defined Networking

Software-defined networking (SDN) achieves a programmable control plane through the use of logically centralized, event-driven controllers and through network applications (apps) that extend the controllers’ functionality. As control plane decisions are often based on the data plane, it is possible for carefully-crafted malicious data plane inputs to direct the control plane towards unwanted states that bypass network security restrictions (i.e., cross-plane attacks). Unfortunately, due to the complex interplay between controllers, apps, and data plane inputs, at present it is difficult to systematically identify and analyze these cross-plane vulnerabilities.

We present EventScope, a vulnerability detection tool that automatically analyzes SDN control plane event usage, discovers candidate vulnerabilities based on missing event handling routines, and validates vulnerabilities based on data plane effects. To accurately detect missing event handlers without ground truth or developer aid, we cluster apps according to similar event usage and mark inconsistencies as candidates. We create an event flow graph to observe a global view of events and control flows within the control plane and use it to validate vulnerabilities that affect the data plane. We applied EventScope to the ONOS SDN controller and uncovered 14 new vulnerabilities.

SVLAN: Secure & Scalable Network Virtualization

Network isolation is a critical modern Internet service. To date, network operators have created a logical network of distributed systems to provide communication isolation between different parties. However, the current network isolation is limited in scalability and flexibility. It limits the number of virtual networks and it only supports isolation at host (or virtual-machine) granularity. In this paper, we introduce Scalable Virtual Local Area Networking (SVLAN) that scales to a large number of distributed systems and offers improved flexibility in providing secure network isolation. With the notion of destination-driven reachability and packet-carrying forwarding state, SVLAN not only offers communication isolation but isolation can be specified at different granularities, e.g., per-application or per-process. Our proof-of-concept SVLAN implementation demonstrates its feasibility and practicality for real-world applications.

Session 5A: Network Crime and Privacy

A Practical Approach for Taking Down Avalanche Botnets Under Real-World Constraints

In 2016, law enforcement dismantled the infrastructure of the Avalanche bulletproof hosting service, the largest takedown of a cybercrime operation so far. The malware families supported by Avalanche use Domain Generation Algorithms (DGAs) to generate random domain names for controlling their botnets. The takedown proactively targets these presumably malicious domains; however, as coincidental collisions with legitimate domains are possible, investigators must first classify domains to prevent undesirable harm to website owners and botnet victims.

The constraints of this real-world takedown (proactive decisions without access to malware activity, no bulk patterns and no active connections to domains) mean that approaches from the state of the art cannot be applied. The problem of classifying thousands of registered DGA domain names therefore required an extensive, painstaking manual effort by law enforcement investigators. To significantly reduce this effort without compromising correctness, we develop a model that automates the classification. We achieve an accuracy of 95.8% on the ground truth of the 2017 Avalanche takedown, which translates into a reduction of up to 94.0% in manual investigation effort. Furthermore, we interpret the model to provide investigators with insights into how benign and malicious domains differ in behavior, which features and data sources are most important, and how the model can be configured according to the practical requirements of a real-world takedown.

Designing a Better Browser for Tor with BLAST

Tor is an anonymity network that allows clients to browse web pages privately, but loading web pages with Tor is slow. To analyze how the browser loads web pages, we examine their resource trees using our new browser logging and simulation tool, BLAST. We find that the time it takes to load a web page with Tor is almost entirely determined by the number of round trips incurred, not its bandwidth, and Tor Browser incurs unnecessary round trips. Resources sit in the browser queue excessively waiting for the TCP, TLS or ALPN handshakes, each of which takes a separate round trip. We show that increasing resource loading capacity with larger pipelines and even HTTP/2 do not decrease load time because they do not save round trips.

We set out to minimize round trips with a number of protocol and browser improvements,
including TCP Fast Open, optimistic data, zero-RTT TLS. We also recommend the use of databases to assist the client with redirection, identifying HTTP/2 servers, and prefetching. All of these features are designed to cut down on the number of round trips incurred in loading web pages. To evaluate these proposed improvements, we create a simulation tool and validate that it is highly accurate in predicting mean page load times. We use the simulator to analyze these features and it predicts that they will decrease the mean page load time by 61% in total over HTTP/2. Our large improvement to user experience comes at trivial cost to the Tor network.

Encrypted DNS –> Privacy? A Traffic Analysis Perspective

Virtually every connection to an Internet service is preceded by a DNS lookup which is performed without any traffic-level protection, thus enabling manipulation, redirection, surveillance, and censorship. To address these issues, large organizations such as Google and Cloudflare are deploying recently standardized protocols that encrypt DNS traffic between end users and recursive resolvers such as DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH). In this paper, we examine whether encrypting DNS traffic can protect users from traffic analysis-based monitoring and censoring. We propose a novel feature set to perform the attacks, as those used to attack HTTPS or Tor traffic are not suitable for DNS’ characteristics. We show that traffic analysis enables the identification of domains with high accuracy in closed and open world settings, using 124 times less data than attacks on HTTPS flows. We find that factors such as location, resolver, platform, or client do mitigate the attacks performance but they are far from completely stopping them. Our results indicate that DNS-based censorship is still possible on encrypted DNS traffic. In fact, we demonstrate that the standardized padding schemes are not effective. Yet, Tor — which does not effectively mitigate traffic analysis attacks on web traffic— is a good defense against DoH traffic analysis.

On Using Application-Layer Middlebox Protocols for Peeking Behind NAT Gateways

Typical port scanning approaches do not achieve a full coverage of all devices connected to the Internet as not all devices are directly reachable via a public (IPv4) address: due to IP address space exhaustion, firewalls, and many other reasons, an end-to-end connectivity is not achieved in today’s Internet anymore. Especially Network Address Translation (NAT) is widely deployed in practice and it has the side effect of “hiding” devices from being scanned. Some protocols, however, require end-to-end connectivity to function properly and hence several methods were developed in the past to enable crossing network borders.

In this paper, we explore how an attacker can take advantage of such application-layer middlebox protocols to access devices hidden behind these gateways. More specifically, we investigate different methods for identifying such devices by (ab)using legitimate protocol features. We categorize the available protocols into two classes: First, there are persistent protocols that are typically port forwarding based. Such protocols are used to allow local network devices to open and forward external ports to them. Second, there are non-persistent protocols that are typically proxy-based to route packets between network edges, such as HTTP and SOCKS proxies. We perform a comprehensive, Internet-wide analysis to obtain an accurate overview of how prevalent and widespread such protocols are in practice. Our results indicate that hundreds of thousands of hosts are vulnerable for different types of attacks, e. g., we detect over 400.000 hosts that are likely vulnerable for attacks involving the UPnP IGD protocol. More worrisome, we find empirical evidence that attackers are already actively exploiting such protocols in the wild to access devices located behind NAT gateways. Amongst other findings, we discover that at least 24 % of all open Internet proxies are misconfigured to allow accessing hosts on non-routable addresses.

Session 6A: Network Defenses

Hold the Door! Fingerprinting Your Car Key to Prevent Keyless Entry Car Theft

Recently, the traditional way to unlock car doors has been replaced with a keyless entry system which proves more convenient for automobile owners. When a driver with a key fob is in vicinity of the vehicle, doors automatically unlock on user command. However, unfortunately, it has been known that these keyless entry systems are vulnerable to signal-relaying attacks. While it is evident that automobile manufacturers incorporate preventative methods to secure these keyless entry systems, a range of attacks continue to occur. Relayed signals fit into the valid packets that are verified as legitimate, and this makes it is difficult to distinguish a legitimate request for doors to be unlocked from malicious signals. In response to this vulnerability, this paper presents an RF-fingerprinting method (coined “HOld the DOoR”, HODOR) to detect attacks on keyless entry systems, which is the first attempt to exploit RF-fingerprint technique in automotive domain. HODOR is designed as a sub-authentication system that supports existing authentication systems for keyless entry systems and does not require any modification of the main system to perform. Through a series of experiments, the results demonstrate that HODOR competently and reliably detects attacks on keyless entry systems. HODOR achieves both an average false positive rate (FPR) of 0.27% with a false negative rate (FNR) of 0% for the detection of simulated attacks corresponding to the current issue on keyless entry car theft. Furthermore, HODOR was also observed under environmental factors: temperature variation, non-line-of-sight (NLoS) conditions and battery aging. HODOR yields a false positive rate of 1.32% for the identification of a legitimated key fob which is even under NLoS condition. Based on the experimental results, it is expected that HODOR will provide a secure service for keyless entry systems, while remaining convenient.

Poseidon: Mitigating Volumetric DDoS Attacks with Programmable Switches

Distributed Denial-of-Service (DDoS) attacks have become a critical threat to the Internet. Due to the increasing number of vulnerable Internet of Things (IoT) devices, attackers can easily compromise a large set of nodes to form botnets and launch high-volume DDoS attacks. State-of-the-art DDoS defenses, however, have not caught up with the fast evolution of the attacks and the requirements of latency-sensitive services in data centers, as most existing defense systems are high in cost and low in agility. In this paper, we propose POSEIDON, a framework that is designed to address those key limitations in today’s DDoS defenses, leveraging emerging programmable switches that can be reconfigured in the field without additional hardware cost. In designing POSEIDON, we address three key challenges in terms of intent expression, resource orchestration, and runtime management. Evaluations using our prototype demonstrate that POSEIDON can potentially defend against ∼Tbps attack traffic, simplify the defense intent expression within tens of lines of code, accommodate to policy changes in seconds, and adapt to dynamic attacks with negligible overheads. Moreover, compared with the state of the art, POSEIDON reduces the defense costs and the end-to-end packet processing latency by two orders of magnitude, making it a promising defense against modern advanced DDoS attacks.

EASI: Edge-Based Sender Identification on Resource-Constrained Platforms for Automotive Networks

In vehicles, internal Electronic Control Units (ECUs) are increasingly prone to adversarial exploitation over wireless connections due to ongoing digitalization. Controlling an ECU allows an adversary to send messages to the internal vehicle bus and thereby to control various vehicle functions. Access to the Controller Area Network (CAN), the most widely used bus technology, is especially severe as it controls brakes and steering. However, state of the art receivers are not able to identify the sender of a frame. Retrofitting frame authenticity, e.g. through Message Authentication Codes (MACs), is only possible to a limited extent due to reduced bandwidth, low payload and limited computational resources. To address this problem, observation in analog differences of the CAN signal was proposed to determine the actual sender. These prior approaches, which exhibit good identification rates in some cases, require high sample rates and a high computational effort. With EASI we significantly reduce the required resources and at the same time show increased identification rates of 99.98% by having no false positives in a prototype structure and two series production vehicles. In comparison to the most lightweight approach so far, we have reduced the memory footprint and the computational requirements by a factor of 168 and 142, respectively. In addition, we show the feasibility of EASI and thus for the first time that sender identification is realizable using comprehensive signal characteristics on resource-constrained platforms. Due to the lightweight design, we achieved a classification in under 100,$mu$s with a training time of 2.61 seconds. We also showed the ability to adapt the system to incremental changes during operation. Since cost effectiveness is of utmost importance in the automotive industry due to high production volumes, the achieved improvements are significant and necessary to realize sender identification.

BLAG: Improving the Accuracy of Blacklists

IP address blacklists are a useful source of information about repeat attackers. Such information can be used to prioritize which traffic to divert for deeper inspection (e.g., repeat offender traffic), or which traffic to serve first (e.g., traffic from sources that are not blacklisted). But blacklists also suffer from overspecialization – each list is geared towards a specific purpose – and they may be inaccurate due to misclassification or stale information. We propose BLAG, a system that evaluates and aggregates multiple blacklists feeds, producing a more useful, accurate and timely master blacklist, tailored to the specific customer network. BLAG uses a sample of the legitimate sources of the customer network’s inbound traffic to evaluate the accuracy of each blacklist over regions of address space. It then leverages recommendation systems to select the most accurate information to aggregate into its master blacklist. Finally, BLAG identifies portions of the master blacklist that can be expanded into larger address regions (e.g. /24 prefixes) to uncover more malicious addresses with minimum collateral damage. Our evaluation of 157 blacklists of various attack types and three ground-truth datasets shows that BLAG achieves high specificity up to 99%, improves recall by up to 114 times compared to competing approaches, and detects attacks up to 13.7 days faster, which makes it a promising approach for blacklist generation.

DefRec: Establishing Physical Function Virtualization to Disrupt Reconnaissance of Power Grids’ Cyber-Physical Infrastructures

Reconnaissance is critical for adversaries to prepare attacks causing physical damage in industrial control systems (ICS) like smart power grids. Disrupting the reconnaissance is challenging. The state-of-the-art moving target defense (MTD) techniques based on mimicking and simulating system behaviors do not consider the physical infrastructure of power grids and can be easily identified.

To overcome those challenges, we propose physical function virtualization (PFV) that hooks’’ network interactions with real physical devices and uses them to build lightweight virtual nodes following the actual implementation of network stacks, system invariants, and physical state variations of real devices. On top of PFV, we propose DefRec, a defense mechanism that significantly increases the reconnaissance efforts for adversaries to obtain the knowledge of power grids’ cyber-physical infrastructures. By randomizing communications and crafting decoy data for the virtual physical nodes, DefRec can mislead adversaries into designing damage-free attacks. We implement PFV and DefRec in the ONOS network operating system and evaluate them in a cyber-physical testbed, which uses real devices from different vendors and HP physical switches to simulate six power grids. The experiment results show that with negligible overhead, PFV can accurately follow the behavior of real devices. DefRec can significantly delay passive attacks for at least five months and isolate proactive attacks with less than $10^{-30}$ false negatives.

Session 7A: Network Attacks

Withdrawing the BGP Re-Routing Curtain: Understanding the Security Impact of BGP Poisoning through Real-World Measurements

The security of the Internet’s routing infrastructure has underpinned much of the past two decades of distributed systems security research. However, the converse is increasingly true. Routing and path decisions are now important for the security properties of systems built on top of the Internet. In particular, BGP poisoning leverages the de facto routing protocol between Autonomous Systems (ASes) to maneuver the return paths of upstream networks onto previously unusable, new paths. These new paths can be used to avoid congestion, censors, geo-political boundaries, or any feature of the topology which can be expressed at an AS-level. Given the increase in use of BGP poisoning as a security primitive for security systems, we set out to evaluate the feasibility of poisoning in practice, going beyond simulation.

To that end, using a multi-country and multi-router Internet-scale measurement infrastructure, we capture and analyze over 1,400 instances of BGP poisoning across thousands of ASes as a mechanism to maneuver return paths of traffic. We analyze in detail the performance of steering paths, the graph-theoretic aspects of available paths, and re-evaluate simulated systems with this data. We find that the real-world evidence does not completely support the findings from simulated systems published in the literature. We also analyze filtering of BGP poisoning across types of ASes and ISP working groups. We explore the connectivity concerns when poisoning by reproducing a decade old experiment to uncover the current state of an Internet triple the size. We build predictive models for understanding an ASes vulnerability to poisoning. Finally, an exhaustive measurement of an upper bound on the maximum path length of the Internet is presented, detailing how recent and future security research should react to ASes leveraging poisoning with long paths. In total, our results and analysis attempt to expose the real-world impact of BGP poisoning on past and future security research.

IMP4GT: IMPersonation Attacks in 4G NeTworks

Long Term Evolution (LTE/4G) establishes mutual authentication with a provably secure Authentication and Key Agreement protocol on layer three of the network stack. Permanent integrity protection of the control plane safeguards the traffic against manipulations. However, missing integrity protection of the user plane still allows an adversary to manipulate and redirect IP packets, as recently demonstrated.

In this work, we introduce a novel cross-layer attack that exploits the existing vulnerability on layer two and extends it with an attack mechanism on layer three. More precisely, we take advantage of the default IP stack behavior of operating systems, which allows an active attacker to impersonate a user towards the network and vice versa; we name these attacks IMP4GT (IMPersonation attacks in 4G neTworks). In contrast to a simple redirection attack as demonstrated in prior work, our attack dramatically extends the possible attack scenarios and thus emphasizes the need for user plane integrity protection in mobile communication standards. The results of our work imply that providers can no longer rely on mutual authentication for billing, access control, and legal prosecution. On the other side, users are exposed to any incoming IP connection as an adversary can bypass the provider’s firewall. To demonstrate the practical impact of our attack, we conduct two IMP4GT attack variants in a commercial network, which—for the first time—completely break the mutual authentication aim of LTE on the user plane in a real-world setting.

Practical Traffic Analysis Attacks on Secure Messaging Applications

Instant Messaging (IM) applications like Telegram, Signal, and WhatsApp have become extremely popular in recent years. Unfortunately, such IM services have been the target of continuous governmental surveillance and censorship, as these services are home to public and private communication channels on socially and politically sensitive topics. To protect their clients, popular IM services deploy state-of-the-art encryption mechanisms. In this paper, we show that despite the use of advanced encryption, popular IM applications leak sensitive information about their clients to adversaries who merely monitor their encrypted IM traffic, with no need for leveraging any software vulnerabilities of IM applications. Specifically, we devise traffic analysis attacks that enable an adversary to identify administrators as well as members of target IM channels (e.g., forums) with high accuracies. We believe that our study demonstrates a significant, real-world threat to the users of such services given the increasing attempts by oppressive governments at cracking down controversial IM channels.

We demonstrate the practicality of our traffic analysis attacks through extensive experiments on real-world IM communications. We show that standard countermeasure techniques such as adding cover traffic can degrade the effectiveness of the attacks we introduce in this paper. We hope that our study urges IM providers to integrate effective traffic obfuscation countermeasures into their software. In the meantime, we have designed and deployed an open-source, publicly available countermeasure system, called IMProxy, that can be used by IM clients with no need for any support from IM providers. We have demonstrated the effectiveness of IMProxy through experiments.

CDN Judo: Breaking the CDN DoS Protection with Itself

Content Delivery Network (CDN) improves the websites’ accessing performance and availability with its globally distributed network infrastructures, which contributes to the flourish of CDN-powered websites on the Internet. As CDN-powered websites are normally operating important businesses or critical services, the attackers are mostly interested to take down these high-value websites, achieving severe damage with maximum influence. As the CDN absorbs distributed attacking traffic with its massive bandwidth resources, CDN vendors have always claimed that they provide effective DoS protection for the CDN-powered websites.

However, we reveal that, implementation or protocol weaknesses in the CDN’s forwarding mechanism can be exploited to break the CDN protection. By sending crafted but legal requests, an attacker can launch an efficient DoS attack against the website Origin behind.
In particular, we present three CDN threats in this study.
Through abusing the CDN’s HTTP/2 request converting behavior and HTTP pre-POST behavior, an attacker can saturate the CDN-Origin bandwidth and exhaust the Origin’s connection limits.
What is more concerning is that, some CDN vendors only use a small set of traffic forwarding IPs with lower IP-churning ratio to establish connections with the Origin. This characteristic provides a great opportunity for an attacker to effectively degrade the website’s global availability, by just cutting off specific CDN-Origin connections.

In this work, we examine the CDN’s request-forwarding behaviors across six well-known CDN vendors, and we perform real-world experiments to evaluate the severity of the threats. As the threats are caused by the CDN vendor’s poor trade-offs between usability and security, we discuss the possible mitigations, and we receive positive feedback after responsible disclosure to related CDN vendors.

Session 9B: Authentication

OcuLock: Exploring Human Visual System for Authentication in Virtual Reality Head-mounted Display

The increasing popularity of virtual reality (VR) in a wide spectrum of applications has generated sensitive personal data such as medical records and credit card information. While protecting such data from unauthorized access is critical, directly applying traditional authentication methods (e.g., PIN) through new VR input modalities such as remote controllers and head navigation would cause security issues. The authentication action can be purposefully observed by attackers to infer the authentication input. Unlike any other mobile devices, VR presents immersive experience via a head-mounted display (HMD) that fully covers users’ eye area without public exposure. Leveraging this feature, we explore human visual system (HVS) as a novel biometric authentication tailored for VR platforms. While previous works used eye globe movement (gaze) to authenticate smartphones or PCs, they suffer from a high error rate and low stability since eye gaze is highly dependent on cognitive states. In this paper, we explore the HVS as a whole to consider not just the eye globe movement but also the eyelid, extraocular muscles, cells, and surrounding nerves in the HVS. Exploring HVS biostructure and unique HVS features triggered by immersive VR content can enhance authentication stability. To this end, we present OcuLock, an HVS-based system for reliable and unobservable VR HMD authentication. OcuLock is empowered by an electrooculography (EOG) based HVS sensing framework and a record-comparison driven authentication scheme. Experiments through 70 subjects show that OcuLock is resistant against common types of attacks such as impersonation attack and statistical attack with Equal Error Rates as low as 3.55% and 4.97% respectively. More importantly, OcuLock maintains a stable performance over a 2-month period and is preferred by users when compared to other potential approaches.

On the Resilience of Biometric Authentication Systems against Random Inputs

We assess the security of machine learning based biometric authentication systems against an attacker who submits uniform random inputs, either as feature vectors or raw inputs, in order to find an emph{accepting sample} of a target user. The average false positive rate (FPR) of the system, i.e., the rate at which an impostor is incorrectly accepted as the legitimate user, may be interpreted as a measure of the success probability of such an attack. However, we show that the success rate is often higher than the FPR. In particular, for one reconstructed biometric system with an average FPR of 0.03, the success rate was as high as 0.78. This has implications for the security of the system, as an attacker with only the knowledge of the length of the feature space can impersonate the user with less than 2 attempts on average. We provide detailed analysis of why the attack is successful, and validate our results using four different biometric modalities and four different machine learning classifiers. Finally, we propose mitigation techniques that render such attacks ineffective, with little to no effect on the accuracy of the system.

Strong Authentication without Temper-Resistant Hardware and Application to Federated Identities

Shared credential is currently the most widespread form of end user authentication with its convenience, but it is also criticized for being vulnerable to credential database theft and phishing attacks. While several alternative mechanisms are proposed to offer strong authentication with cryptographic challenge-response protocols, they are cumbersome to use due to the need of tamper-resistant hardware modules at user end. In this paper, we propose the first strong authentication mechanism without the reliance on tamper-resistant hardware at user end. A user authenticates with a password-based credential via generating designated-verifiable authentication tokens. Our scheme is resistant to offline dictionary attacks in spite that the attacker can steal the password-protected credentials, and thus can be implemented for general-purpose device.
More specifically, we first introduce and formalize the notion of Password-Based Credential (PBC), which models the resistance of offline attacks and the unforageability of authentication tokens even if attackers can see authentication tokens and capture password-wrapped credentials of honest users. We then present a highly-efficient construction of PBC using a “randomize-then-prove” approach, and prove its security. The construction doesn’t involve bilinear-pairings, and can be implemented with common cryptographic libraries for many platforms. We also present a technique to transform the PBC scheme to be publicly-verifiable, and present an application of PBC in federated identity systems to provide holder-of-key assertion mechanisms. Compared with current certificate-based approaches, it is more convenient and user-friendly, and can be used with the federation systems that employs privacy-preserving measures (e.g., Sign-in with Apple). We also implement the PBC scheme and evaluate its performance for different applications over various network environment. When PBC is used as a strong authentication mechanism for end users, it saves 26%-36% of time than the approach based on ECDSA with a tamper-resistant hardware module. As for its application in federation, it could even save more time when the user proves its possession of key to a Relying Party.

Session 10A: Case Studies & Human Factors

A View from the Cockpit: Exploring Pilot Reactions to Attacks on Avionic Systems

Many wireless communications systems found in aircraft lack standard security mechanisms, leaving them fundamentally vulnerable to attack. With affordable software-defined radios available, a novel threat has emerged, allowing a wide range of attackers to easily interfere with wireless avionic systems. Whilst these vulnerabilities are known, concrete attacks that exploit them are still novel and not yet well understood. This is true in particular with regards to their kinetic impact on the handling of the attacked aircraft and consequently its safety. To investigate this, we invited 30 Airbus A320 type-rated pilots to fly simulator scenarios in which they were subjected to attacks on their avionics. We implement and analyze novel wireless attacks on three safety-related systems: Traffic Collision Avoidance System (TCAS), Ground Proximity Warning System (GPWS) and the Instrument Landing System (ILS). We found that all three analyzed attack scenarios created significant control impact and cost of disruption through turnarounds, avoidance manoeuvres, and diversions. They further increased workload, distrust in the affected system, and in 38% of cases caused the attacked safety system to be switched off entirely. All pilots felt the scenarios were useful, with 93.3% feeling that specific simulator training for wireless attacks could be valuable.

Genotype Extraction and False Relative Attacks: Security Risks to Third-Party Genetic Genealogy Services Beyond Identity Inference

Here, we evaluate the security of a consumer-facing, third party genetic analysis service, called GEDmatch, that specializes in genetic genealogy: a field that uses genetic data to identify relatives. GEDmatch is one of the most prominent third-party genetic genealogy services due to its size (over 1 million genetic data files) and the large role it now plays in criminal investigations. In this work, we focus on security risks particular to genetic genealogy, namely relative matching queries – the algorithms used to identify genetic relatives – and the resulting relative predictions. We experimentally demonstrate that GEDmatch is vulnerable to a number of attacks by an adversary that only uploads normally formatted genetic data files and runs relative matching queries. Using a small number of specifically designed files and queries, an attacker can extract a large percentage of the genetic markers from other users; 92% of markers can be extracted with 98% accuracy, including hundreds of medically sensitive markers. We also find that an adversary can construct genetic data files that falsely appear like relatives to other samples in the database; in certain situations, these false relatives can be used to make the de-identification of genetic data more difficult. These vulnerabilities exist because of particular design choices meant to improve functionality. However, our results show how security and the goals of genetic genealogy can come in conflict. We conclude with a discussion of the broader impact of these results to the entire consumer genetic testing community and provide recommendations for genetic genealogy services.

Complex Security Policy? A Longitudinal Analysis of Deployed Content Security Policies

The Content Security Policy (CSP) mechanism was developed as a mitigation against script injection attacks in 2010. In this paper, we leverage the unique vantage point of the Internet Archive to conduct a historical and longitudinal analysis of how CSP deployment has evolved for a set of 10,000 highly ranked domains. In doing so, we document the long-term struggle site operators face when trying to roll out CSP for content restriction and highlight that even seemingly secure whitelists can be bypassed through expired or typo domains. Next to these new insights, we also shed light on the usage of CSP for other use cases, in particular, TLS enforcement and framing control. Here, we find that CSP can be easily deployed to fit those security scenarios, but both lack wide-spread adoption. Specifically, while the underspecified and thus inconsistently implemented X-Frame-Options header is increasingly used on the Web, CSP’s well-specified and secure alternative cannot keep up. To understand the reasons behind this, we run a notification campaign and subsequent survey, concluding that operators have often experienced the complexity of CSP (and given up), utterly unaware of the easy-to-deploy components of CSP. Hence, we find the complexity of secure, yet functional content restriction gives CSP a bad reputation, resulting in operators not leveraging its potential to secure a site against the non-original attack vectors.

Into the Deep Web: Understanding E-commerce Fraud from Autonomous Chat with Cybercriminals

E-commerce miscreants heavily rely on instant messaging (IM) to promote their illicit businesses and coordinate their operations. The threat intelligence provided by IM communication, therefore, becomes invaluable for understanding and mitigating the threats of e-commerce frauds. However, such information is hard to get since it is usually shared only through one-on-one conversations with the criminals. In this paper, we present the first chatbot, called Aubrey, to actively collect such intelligence through autonomous chats with real-world e-commerce miscreants. Our approach leverages the question-driven conversation pattern of small-time workers, who seek from e-commerce fraudsters jobs and/or attack resources, to model the interaction process as a finite state machine, thereby enabling an autonomous conversation. Aubrey successfully chatted with 470 real-world e-commerce miscreants and gathered a large amount of fraud-related artifact, including 40 SIM gateways, 323K fraud phone numbers, and previously-unknown attack toolkits, etc. Further, the conversations reveal the supply chain of e-commerce fraudulent activities on the deep web and the complicated relations (e.g., complicity and reselling) among miscreant roles.

Compliance Cautions: Investigating Security Issues Associated with U.S. Digital-Security Standards

Digital security compliance programs and policies serve as powerful tools for protecting organizations’ intellectual property, sensitive resources, customers, and employees through mandated security controls. Organizations place a significant emphasis on compliance and often conflate high compliance audit scores with strong security; however, no compliance standard has been systemically evaluated for security concerns that may exist even within fully-compliant organizations. In this study, we describe our approach for auditing three exemplar compliance standards that affect nearly every person within the United States: standards for federal tax information, credit card transactions, and the electric grid. We partner with organizations that use these standards to validate our findings within enterprise environments and provide first-hand narratives describing impact.

We find that when compliance standards are used literally as checklists — a common occurrence, as confirmed by compliance experts — their technical controls and processes are not always sufficient. Security concerns can exist even with perfect compliance. We identified 148 issues of varying severity across three standards; our expert partners assessed 49 of these issues and validated that 36 were present in their own environments and 10 could plausibly occur elsewhere. We also discovered that no clearly-defined process exists for reporting security concerns associated with compliance standards; we report on our varying levels of success in responsibly disclosing our findings and influencing revisions to the affected standards. Overall, our results suggest that auditing compliance standards can provide valuable benefits to the security posture of compliant organizations.