CoNEXT 2019 Conference Program

  • 08:00 - 09:00 - Breakfast
    • Keynote: Riding along with the Time-Traveling Networking Researcher
      Abstract: A networking researcher, traveling forward in time from 1985 to the present, would be shocked by many things – not the least of which is the fact that people are still doing networking research in 2019. In this talk we will first have fun riding along with this time-traveler as they shuttle between 1985 and 2019. What a difference 34 years make! For those of us who cannot afford a time machine, we have observed a more gradual evolution. So getting off the ride-along for a moment, I will argue that this evolution has been governed by a specific fundamental mechanism for many decades, ultimately yielding the Internet as we know it today, a common and shared global networking infrastructure that delivers almost all services. At some point, many have argued, this evolutionary mechanism stopped working and the Internet has become ossified and unable to change in response to new demands. However, novel service requirements and scale increases continue to exert significant pressure on this ossified infrastructure. The result, I will conjecture, will be a fragmentation, the beginnings of which are evident today, that will ultimately fundamentally change the character of the network infrastructure. By ushering in a ManyNets world, this fragmentation will enable networking evolution to continue unabated. To confirm this conjecture, we will then continue our ride-along with our time traveler to the year 2050 and take a look at what networks and networking research look like then.

  • 10:30 - 11:00 - Coffee Break
    • PURR: A Primitive for Reconfigurable Fast Reroute  long
      Marco Chiesa (KTH Royal Institute of Technology); Roshan Sedar (Independent Researcher); Gianni Antichi (Queen Mary, University of London); Michael Borokhovich (Independent Researcher); Andrzej Kamisiński (AGH University of Science and Technology in Kraków); Georgios Nikolaidis (Barefoot Networks); Stefan Schmid (University of Vienna)
      Abstract: Highly dependable communication networks usually rely on some kind of Fast Re-Route (FRR) mechanism that quickly re-routes traffic upon failures, entirely in the data plane. This paper studies the design of FRR mechanisms for emerging reconfigurable switches.

      Our main contribution is an FRR primitive for programmable data planes, PURR, which provides low failover latency and high switch throughput, by avoiding packet recirculation. PURR tolerates multiple concurrent failures and comes with minimal memory requirements, ensuring compact forwarding tables, by unveiling an intriguing connection to classic "string theory" (i.e., stringology), and in particular, the shortest common supersequence problem. PURR is well-suited for high-speed match-action forwarding architectures (e.g., PISA) and supports the implementation of arbitrary network-wide FRR mechanisms. Our simulations and prototype implementation (on an FPGA and Tofino) show that PURR improves TCAM memory occupancy by a factor of 1.5x-10.8x compared to a naïve encoding when implementing state-of-the-art FRR mechanisms. PURR also improves the latency and throughput of datacenter traffic up to a factor of 2.8x-5.5x and 1.2x-2x, respectively, compared to approaches based on recirculating packets.
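The shortest common supersequence (SCS) connection can be illustrated in miniature (an illustrative sketch, not the paper's encoding algorithm): if every failover port sequence is a subsequence of one shared supersequence, a single compactly-ordered table can serve all failure cases. A textbook dynamic-programming SCS for two sequences:

```python
def scs(a, b):
    """Shortest common supersequence of two sequences via LCS DP."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]   # LCS lengths
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if a[i] == b[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    # Backtrack, emitting elements of both sequences in order.
    out, i, j = [], m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1]); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            out.append(a[i - 1]); i -= 1
        else:
            out.append(b[j - 1]); j -= 1
    out.extend(reversed(a[:i]))
    out.extend(reversed(b[:j]))
    return list(reversed(out))
```

For backup port lists `[p1, p2, p3]` and `[p2, p1]`, the result has length 4 (= 3 + 2 - LCS), and both inputs are subsequences of it.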

    • Fine-Grained Queue Measurement in the Data Plane  long
      Xiaoqi Chen, Shir Landau-Feibish (Princeton University); Yaron Koral (AT&T Labs); Jennifer Rexford (Princeton University); Ori Rottenstreich (Technion); Steven A Monetti, Tzuu-Yi Wang (AT&T Labs)
      Abstract: Short-lived surges in traffic can cause periods of high queue utilization, leading to packet loss and delay. To diagnose and alleviate performance problems, networks need support for real-time, fine-grained queue measurement. By identifying the flows that contribute significantly to queue build-up directly in the data plane, switches can make targeted decisions to mark, drop, or reroute these flows in real time. However, collecting fine-grained queue statistics is challenging even with modern programmable switch hardware, due to limited memory and processing resources in the data plane. We present ConQuest, a compact data structure that identifies the flows making a significant contribution to the queue. ConQuest operates entirely in the data plane, while working within the hardware constraints of programmable switches. Additionally, we show how to measure queues in legacy devices through link tapping and an off-path switch running ConQuest. Simulations show that ConQuest can identify contributing flows with 90% precision on a 1ms timescale, using less than 65KB of memory. Experiments with our Barefoot Tofino prototype show that ConQuest-enabled active queue management reduces flow-completion time.
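The flavor of this kind of data-plane queue measurement can be conveyed with a toy software model (my sketch under simplifying assumptions, ignoring the hardware constraints the paper addresses; class names are invented): small count-min sketches are rotated across time windows, and a flow's recent byte count across windows approximates its contribution to the current queue build-up.

```python
import hashlib

class CountMin:
    """Tiny count-min sketch (never underestimates a key's count)."""
    def __init__(self, width=64, depth=3):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _idx(self, key, row):
        h = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=4).digest()
        return int.from_bytes(h, "big") % self.width

    def add(self, key, value):
        for r in range(self.depth):
            self.rows[r][self._idx(key, r)] += value

    def query(self, key):
        return min(self.rows[r][self._idx(key, r)] for r in range(self.depth))

class QueueWatcher:
    """Rotate sketches over short time windows; a flow's bytes summed
    across recent windows approximate its share of the queue."""
    def __init__(self, windows=4):
        self.sketches = [CountMin() for _ in range(windows)]
        self.cur = 0

    def tick(self):                       # advance to the next time window
        self.cur = (self.cur + 1) % len(self.sketches)
        self.sketches[self.cur] = CountMin()

    def packet(self, flow, size):
        self.sketches[self.cur].add(flow, size)

    def contribution(self, flow):
        return sum(s.query(flow) for s in self.sketches)
```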
    • HyperTester: High-performance Network Testing Driven by Programmable Switches  long
      Yu Zhou, Zhaowei Xi, Dai Zhang, Yangyang Wang, Jinqiu Wang, Mingwei Xu, Jianping Wu (Tsinghua University)
      Abstract: Modern network research and operations are inseparable from network testers, which are used to evaluate the performance limits of proofs-of-concept, troubleshoot failures, and so on. Existing network testers offer either constrained flexibility or poor cost-performance ratios. In this paper, we propose a new network tester, HyperTester. The core idea of HyperTester is to leverage new-generation programmable switches to generate and capture test packets with high performance, low cost, and remarkable flexibility. We design a series of efficient mechanisms, including template-based packet generation, false-positive-free counter-based queries, and stateless connections, to implement complicated network testing tasks on switches with limited programmability and resources. Meanwhile, to facilitate developing testing tasks on HyperTester, we provide a high-level API dedicated to specifying network testing intents. We have implemented HyperTester on the Tofino switch and built dozens of network testing applications. Evaluations on our hardware testbed show that HyperTester supports line-rate packet generation with highly accurate rate control, while saving $38,400 per Tbps and 7,150 W per Tbps compared with software network testers.
    • Normal Forms for Match-Action Programs  short
      Felicián Németh (Budapest University of Technology and Economics); Marco Chiesa (KTH Royal Institute of Technology); Gábor Rétvári (Budapest University of Technology and Economics)
      Abstract: Packet processing programs may have multiple semantically equivalent representations in terms of the match-action abstraction exposed by the underlying data plane. Some representations may encode the entire packet processing program into one large table allowing packets to be matched in a single lookup, while others may encode the same functionality decomposed into a pipeline of smaller match-action tables, maximizing modularity at the cost of increased lookup latency. In this paper, we provide the first systematic study of match-action program representations in order to assist network programmers in navigating this vast design space. Borrowing from relational database and formal language theory, we define a framework for the equivalent transformation of match-action programs to obtain certain irredundant representations that we call "normal forms". We find that normalization generally improves the capacity of the control plane to program the data plane and to observe its state, while having negligible or even positive performance impact.
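The design space described above can be pictured with a toy relational composition (my simplification; identifiers are invented): collapsing a two-stage pipeline into one table trades table size for lookup count, and equivalent transformations move programs along this trade-off.

```python
def compose(stage1, stage2, default="drop"):
    """Collapse a two-stage match-action pipeline into a single table.
    stage1 maps a header value to an intermediate tag; stage2 maps the
    tag to a final action. The merged table needs one lookup instead of
    two, at the cost of repeating stage2's actions per stage1 entry."""
    return {key: stage2.get(tag, default) for key, tag in stage1.items()}
```

For example, an address-to-tenant table composed with a tenant-to-port policy yields a single address-to-port table.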
    • PEERING: Virtualizing BGP at the Edge for Research  long
      Brandon Schlinker (University of Southern California); Todd Arnold (Columbia University); Italo Cunha (Universidade Federal de Minas Gerais); Ethan Katz-Bassett (Columbia University)
      Abstract: Internet routing research has long been hindered by obstacles to executing the wide class of experiments necessary to characterize problems and opportunities, and evaluate candidate solutions. Prior works proposed a platform that would provide experiments with control of an Internet-connected AS. However, because BGP does not natively support multiplexing or the requisite security policies for building such a platform, prior works were ultimately unable to realize this vision.

      We present PEERING, a community platform that provides multiple parallel experiments with control and visibility equivalent to directly operating a production AS. PEERING is built atop vBGP, our design for virtualizing the data and control planes of a BGP edge router while simultaneously enforcing security policies to prevent experiments from disrupting the Internet and each other. With PEERING, experiments operate in an environment qualitatively similar to that of a cloud provider, and can exchange routes and traffic with hundreds of neighboring networks and the broader Internet at locations around the world. To date, PEERING's rich connectivity and flexibility have enabled it to support over 40 experiments and 15 publications in key research areas such as security, traffic engineering, and routing policies.

    • Comparing the Performance of State-of-the-Art Software Switches  long
      Tianzhu Zhang, Leonardo Linguaglossa (Telecom ParisTech); Massimo Gallo (Nokia Bell Labs); Paolo Giaccone (Politecnico di Torino); Luigi Iannone (ParisTech); James Roberts (Telecom ParisTech)
      Abstract: Software switches are increasingly used in network function virtualization (NFV) to route traffic between virtualized network functions (VNFs) and physical network interface cards (NICs). Understanding of alternative switch designs remains deficient, however, in the absence of a comprehensive, comparative performance analysis. In this paper, we propose a methodology intended to be fair and use it to compare the performance of seven state-of-the-art software switches. We first explore their respective design spaces and then compare their performance under four representative test scenarios. Each scenario corresponds to a specific case of routing NFV traffic between NICs and/or VNFs. Our experimental results show that no single software switch prevails in all scenarios. It is therefore important to choose the one that is best adapted to a given use-case. The presented results and analysis bring a better understanding of design tradeoffs and identify potential bottlenecks that limit the performance of software switches.
    • Steering Hyper-Giants' Traffic at Scale  long
      Enric Pujol, Ingmar Poese (BENOCS); Johannes Zerwas (Technische Universität München); Georgios Smaragdakis (TU Berlin); Anja Feldmann (MPI-Informatics)
      Abstract: Large content providers, known as hyper-giants, are responsible for sending the majority of traffic to content consumers. These hyper-giants operate complex and highly distributed infrastructures to cope with the ever-increasing demand for online content. To achieve commercial-grade performance of Web applications, enhanced end-user experience, improved reliability, and scaled network capacity, hyper-giants increasingly interconnect with eyeball networks at multiple locations. This poses new challenges for both (1) the eyeball networks having to perform complex inbound traffic engineering, and (2) hyper-giants having to map end-user requests to appropriate servers.

      We report on our experience in designing, building, rolling out, and operating the first-ever large-scale system of this kind, the Flow Director, which enables automated cooperation between one of the largest eyeball networks and a leading hyper-giant. We use empirical data collected at the eyeball network to evaluate its impact over two years of operation. We find very high compliance by the hyper-giant with the Flow Director's recommendations, resulting in (1) close-to-optimal user-server mapping, and (2) a 15% reduction of the hyper-giant's traffic overhead on the ISP's long-haul links, i.e., benefits for both parties and end-users alike.

  • 15:15 - 15:45 - Coffee Break
    • Challenges in Inferring Spoofed Traffic at IXPs  long
      Lucas Müller (UFRGS / CAIDA); Matthew Luckie (University of Waikato); Bradley Huffaker, kc claffy (CAIDA / UC San Diego); Marinho Barcellos (UFRGS / University of Waikato)
      Abstract: Ascertaining that a network will forward spoofed traffic usually requires an active probing vantage point in that network, effectively preventing a comprehensive view of this global Internet vulnerability. Recently, researchers have proposed using Internet Exchange Points (IXPs) as observatories to detect spoofed packets, by leveraging Autonomous System (AS) topology knowledge extracted from Border Gateway Protocol (BGP) data to infer which source addresses should legitimately appear across parts of the IXP switch fabric. We demonstrate that the existing literature does not capture several fundamental challenges to this approach, including noise in BGP data sources, heuristic AS relationship inference, and idiosyncrasies in IXP interconnectivity fabrics. We propose a novel method to navigate these challenges, leveraging *customer cone* semantics of AS relationships to guide precise classification of inter-domain traffic as in-cone, out-of-cone (*spoofed*), unverifiable, bogon, and unassigned. We apply our method to a mid-size IXP with approximately 200 members, and find the upper-bound volume of out-of-cone traffic to be more than an order of magnitude smaller than what the previous method inferred on the same data. Our work illustrates the subtleties of scientific assessments of operational Internet infrastructure, and the need for a community focus on reproducing and repeating previous methods.
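The five-way classification can be sketched as follows (a simplified model of the customer-cone semantics, not the authors' implementation; all AS numbers and structures are invented):

```python
def classify(src_as, member_as, cone, bogons, assigned):
    """Classify traffic seen from IXP member `member_as` whose packets
    claim origin `src_as`. cone: member AS -> set of ASes in its
    customer cone (ASes whose addresses it may legitimately source)."""
    if src_as in bogons:
        return "bogon"
    if src_as not in assigned:
        return "unassigned"
    if member_as not in cone:
        return "unverifiable"   # no relationship data for this member
    return "in-cone" if src_as in cone[member_as] else "out-of-cone"
```

A source outside the sending member's customer cone is the "out-of-cone" (potentially spoofed) case that the paper shows is easy to over-count.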
    • Beyond content analysis: Detecting targeted ads via distributed counting  long
      Costas Iordanou (MPI); Nicolas Kourtellis (Telefonica Research); Juan Miguel Carrascosa (LSTech); Claudio Soriente (NEC Laboratories Europe); Ruben Cuevas (Universidad Carlos III de Madrid); Nikolaos Laoutaris (IMDEA Networks Institute)
      Abstract: Being able to check whether an online advertisement has been targeted is essential for resolving privacy controversies and for implementing data protection regulations such as GDPR, CCPA, and COPPA in practice. In this paper we describe the design, implementation, and deployment of an advertisement auditing system called eyeWnder that uses crowdsourcing to reveal in real time whether a display advertisement has been targeted or not. Crowdsourcing simplifies the detection of targeted advertising, but requires reporting to a central repository the impressions seen by different users, thereby jeopardizing their privacy. We break this deadlock with a privacy-preserving data sharing protocol that allows eyeWnder to compute the global statistics required to detect targeting, while keeping the advertisements seen by individual users and their browsing history private. We conduct a simulation study to explore the effect of different parameters and a live validation to demonstrate the accuracy of our approach. Unlike previous solutions, eyeWnder can even detect indirect targeting, i.e., marketing campaigns that promote a product or service whose description bears no semantic overlap with its targeted audience.
    • An Investigation on Information Leakage of DNS over TLS  long
      Rebekah Houser (University of Delaware); Zhou Li (UC Irvine); Haining Wang, Chase Cotton (University of Delaware)
      Abstract: DNS over TLS (DoT) protects the confidentiality and integrity of DNS communication by encrypting DNS messages transmitted between users and resolvers. In recent years, DoT has been deployed by popular recursive resolvers like Cloudflare and Google. While DoT is supposed to prevent on-path adversaries from learning and tampering with victims' DNS requests and responses, it is unclear how much information can be deduced through traffic analysis on DoT messages. To answer this question, in this work, we develop a DoT fingerprinting method to analyze DoT traffic and determine if a user has visited websites of interest to adversaries. Given that a visit to a website typically introduces a sequence of DNS packets, we can infer the visited websites by modeling the temporal patterns of packet sizes. Our method can identify DoT traffic for websites with a false negative rate of less than 17% and a false positive rate of less than 0.5% when DNS messages are not padded. Moreover, we show that information leakage is still possible even when DoT messages are padded. These findings highlight the challenges of protecting DNS privacy, and indicate the necessity of a thorough analysis of the threats underlying DNS communications for effective defenses.
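The traffic-analysis threat can be made concrete with a toy nearest-neighbor fingerprint over packet-size sequences (purely illustrative; the paper's method models temporal patterns far more carefully, and all names here are mine):

```python
def seq_distance(a, b):
    """L1 distance between two packet-size sequences (toy metric;
    the shorter sequence is zero-padded)."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return sum(abs(x - y) for x, y in zip(a, b))

def fingerprint(trace, known):
    """Attribute an observed DoT packet-size trace to the website whose
    recorded traces are closest. known: site -> list of size sequences."""
    return min(known,
               key=lambda site: min(seq_distance(trace, t)
                                    for t in known[site]))
```

Padding defeats exactly this kind of matching on sizes, which is why the paper's result that padded DoT still leaks is notable.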
    • DNS Privacy in Practice and Preparation  short
      Casey Deccio (Brigham Young University); Jacob Davis (Sandia National Laboratories)
      Abstract: An increased demand for privacy in Internet communications has resulted in privacy-centric enhancements to the Domain Name System (DNS), including the use of Transport Layer Security (TLS) and Hypertext Transfer Protocol Secure (HTTPS) for DNS queries. In this paper, we seek to answer questions about their deployment, including their prevalence and their characteristics. Our work includes an analysis of DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH) availability at open resolvers and authoritative DNS servers. We find that DoT and DoH services exist on just a fraction of open resolvers, but among them are the major vendors of public DNS services. We also analyze the state of TCP Fast Open (TFO), which is considered key to reducing the latency associated with TCP-based DNS queries, required by DoT and DoH. The uptake of TFO is extremely low, both on the server side and the client side, and it must be improved to avoid performance degradation with continued adoption of DNS Privacy enhancements.
    • 08:00 - 09:00 - Breakfast
      • Perceiving QUIC: Do Users Notice or Even Care?  short
        Jan Rüth, Konrad Wolsing, Klaus Wehrle (RWTH Aachen University); Oliver Hohlfeld (Brandenburg University of Technology)
        Abstract: QUIC, as the foundation for HTTP/3, is becoming an Internet reality. A plethora of studies already show that QUIC excels beyond TCP+TLS+HTTP/2. Yet, these studies compare a highly optimized QUIC Web stack against an unoptimized TCP-based stack. In this paper, we bring TCP up to speed to perform an eye-level comparison. Instead of relying on technical metrics, we perform two extensive user studies to investigate QUIC's impact on the quality of experience. First, we investigate if users can distinguish two protocol versions in a direct comparison, and we find that QUIC is indeed rated faster than TCP and even a tuned TCP. Yet, our second study shows that this perceived performance increase mostly does not matter to users: they rate QUIC and TCP as indistinguishable.
      • Q-Tag: A transparent solution to measure ads viewability rate in online advertising campaigns  short
        Patricia Callejo (IMDEA Networks Institute / Universidad Carlos III de Madrid); Antonio Pastor, Rubén Cuevas, Ángel Cuevas (Universidad Carlos III de Madrid)
        Abstract: Viewability is one of the most important metrics used in ad-tech to measure the performance quality of ad campaigns. The viewability standard defines the visibility conditions an ad impression must meet to achieve a sufficient marketing effect to be considered viewed. The ad-tech industry offers opaque measures of viewability whose performance is questionable. To address this issue, we propose a novel methodology for measuring viewability in ad campaigns. The disclosure of the functional details of this technique makes it reproducible and auditable.

        Our solution has been deployed in production by a Demand Side Platform (DSP) to measure the viewability rate of its ad campaigns. Leveraging the infrastructure of this DSP, we compare the performance of our methodology with a commercial solution. Both techniques report a similar overall viewability rate of 50%. However, our solution measured viewability in 93% of the ads served by the DSP, versus 74% for the commercial solution. A rough estimation indicates that this increase in the measured rate may lead to a revenue increase of $3.5 million per year for a mid-sized DSP serving 100M ads per day.

      • ABR Streaming with Separate Audio and Video Tracks: Measurements and Best Practices  short
        Yanyuan Qin (University of Connecticut); Subhabrata Sen (AT&T Labs Research); Bing Wang (University of Connecticut)
        Abstract: Adaptive bitrate (ABR) streaming is the predominant approach for video streaming over the Internet. When the audio and video tracks are stored separately (i.e., in demuxed mode), the client needs to dynamically determine which audio and which video track to select for each chunk/playback position. Somewhat surprisingly, there is very little literature on how to best mesh together audio and video adaptation in ABR streaming. In this paper, we first examine the state of the art in the handling of demuxed audio and video tracks in predominant ABR protocols (DASH and HLS), as well as in real ABR client implementations in three popular players covering both browsers and mobile platforms. Combining experimental insights with code analysis, we shed light on a number of limitations in existing practices, both in the protocols and the player implementations, which can cause undesirable behaviors such as stalls or the selection of problematic combinations (e.g., very low-quality video paired with very high-quality audio). Based on the insights gained, we identify the underlying root causes of these issues and propose a number of practical design best practices and principles whose collective adoption will help avoid them and lead to better QoE.
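One of the best practices argued for above, choosing audio and video tracks jointly rather than independently, can be sketched as follows (a toy selector, not the authors' proposal; the bitrate/quality numbers are invented):

```python
def pick_tracks(video, audio, budget_kbps):
    """video/audio: lists of (bitrate_kbps, quality_score).
    Pick the pair maximizing combined quality within the bandwidth
    budget (ties broken toward lower total bitrate); fall back to the
    lowest bitrates if nothing fits. Joint selection avoids pathologies
    like pairing minimal video with maximal audio."""
    best = None
    for vb, vq in video:
        for ab, aq in audio:
            if vb + ab <= budget_kbps:
                cand = (vq + aq, -(vb + ab), (vb, ab))
                if best is None or cand > best:
                    best = cand
    if best is None:
        return (min(video)[0], min(audio)[0])
    return best[2]
```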
      • Analyzing Viewport Prediction Under Different VR Interactions  short
        Tan Xu, Bo Han (AT&T Labs - Research); Feng Qian (University of Minnesota - Twin Cities)
        Abstract: In this paper, we study the problem of predicting a user's viewport movement in a networked VR system (i.e., predicting which direction the viewer will look at shortly). This critical knowledge will guide the VR system through making judicious content fetching decisions, leading to efficient network bandwidth utilization and improved Quality of Experience (QoE). For this study, we collect viewport trajectory traces from 275 users who have watched popular 360° panoramic videos for a total duration of 156 hours. Leveraging this unique dataset, we compare viewport movement patterns of different interaction modes: wearing a head-mounted device, tilting a smartphone, and dragging a mouse on a PC. We then apply diverse machine learning algorithms, from simple regression to sophisticated deep learning that leverages crowd-sourced data, to analyze the performance of viewport prediction. We find that the deep learning approach is robust for all interaction modes and yields superior performance, especially when the viewport is more challenging to predict, e.g., for a longer prediction window, or with a more dynamic movement. Overall, our analysis provides key insights on how to intelligently perform viewport prediction in networked VR systems.
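The "simple regression" end of the spectrum can be as small as a linear extrapolation of the viewing angle (an illustrative sketch; the function and representation are mine, not the paper's):

```python
def predict_viewport(history, horizon):
    """Linearly extrapolate yaw from the last two samples.
    history: list of (time_s, yaw_deg); horizon: seconds ahead.
    (Naive: ignores any 359->0 wrap within the history itself.)"""
    (t0, y0), (t1, y1) = history[-2], history[-1]
    rate = (y1 - y0) / (t1 - t0)        # degrees per second
    return (y1 + rate * horizon) % 360.0
```

Such baselines work for short horizons and smooth motion; the abstract's point is that deep models are needed exactly where this breaks down (long windows, dynamic movement).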
    • 10:00 - 10:30 - Coffee Break
      • Veni Vidi Dixi: Reliable Wireless Communication with Depth Images  long
        Serkut Ayvaşık, Murat Gürsu, Wolfgang Kellerer (Technical University of Munich)
        Abstract: The upcoming industrial revolution requires deployment of critical wireless sensor networks for automation and monitoring purposes. However, the reliability of wireless communication is rendered unpredictable by mobile elements in the communication environment, such as humans or mobile robots, which lead to dynamically changing radio environments. Changes in the wireless channel can be monitored with frequent pilot transmission, but that would stress the battery life of sensors. In this work, a new wireless channel estimation technique, Veni Vidi Dixi (VVD), is proposed. VVD leverages the redundant information in depth images obtained from the surveillance camera(s) in the communication environment and utilizes Convolutional Neural Networks (CNNs) to map the depth images of the communication environment to complex wireless channel estimates. VVD increases wireless communication reliability without the need for frequent pilot transmission and with no additional complexity at the receiver. The proposed method is tested by conducting measurements in an indoor environment with a single mobile human. To the best of the authors' knowledge, our work is the first to obtain complex wireless channel estimates from depth images alone, without any pilot transmission. The collected wireless traces, depth images, and code are publicly available.
      • EasyPass: Combating IoT Delay with Multiple Access Wireless Side Channels  long
        Haoyang Lu, Ruirong Chen, Wei Gao (University of Pittsburgh)
        Abstract: Many IoT applications have stringent requirements on wireless transmission delay, but have to compete for channel access with other wireless traffic. Traditional techniques enable multiple access to wireless channels, but yield severe delay when the channel is congested. In this paper, we present EasyPass, a wireless PHY technique that allows multiple IoT devices to simultaneously transmit data over a congested wireless link without being delayed. The key idea of EasyPass is to exploit the excess SNR margin in a wireless channel as a dedicated side channel for IoT traffic, and to allow multiple access to the side channel by separating signals from different transmitters over the air. We implemented EasyPass on software-defined radio platforms. Experimental results demonstrate that EasyPass reduces the data transmission delay in congested IoT networks by 90%, while providing a throughput of up to 2.5 Mbps over a narrowband 20 MHz wireless link that can be accessed by more than 100 IoT devices.
      • Smartphone Positioning with Radio Measurements from a Single WiFi Access Point  short
        Maurizio Rea (IMDEA Networks Institute); Traian Emanuel Abrudan (Nokia Bell Labs, Dublin, Ireland); Domenico Giustiniano (IMDEA Networks Institute); Holger Claussen (Nokia Bell Labs, Dublin, Ireland); Veli-Matti Kolmonen (Nokia Bell Labs, Espoo, Southern Finland, Finland)
        Abstract: Despite the large literature on localization, there is no solution yet to localize a commercial off-the-shelf smartphone using radio measurements from a single WiFi AP. We present SPRING, Smartphone Positioning with Radio measurements from a sINGle wifi access point. SPRING combines Fine Time Measurements (FTM) and Angle of Arrival (AOA) extracted from commercial chipsets, building on the recent 802.11-2016 specification and the 802.11ac amendment, to obtain both distance and direction from the AP to the client for positioning. Our system has the potential to bring indoor positioning to homes and small businesses, which typically have a single access point. We exploit physical layer (PHY) information to detect the number of paths and their directions, and use this information to derive a new method for filtering ranging measurements obtained with the FTM protocol. We achieve sub-meter distance estimation accuracy, eliminating the adverse effect of multipath on FTM using calibrated inputs from Channel State Information (CSI). Our evaluation in multipath-rich indoor scenarios demonstrates that the combination of AOA estimation and the proposed FTM refinement can locate a Google Pixel 3 smartphone with a median positioning error of 0.9-2.15 m across an area comparable to a typical flat.
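The core geometric idea, one range plus one bearing from a single AP, reduces to a polar-to-Cartesian conversion (an idealized sketch that ignores multipath and measurement noise, which is what SPRING's filtering addresses):

```python
import math

def position(ap_xy, distance_m, aoa_deg):
    """2-D position fix from one AP: FTM gives the range, AoA gives
    the bearing from the AP toward the client."""
    x, y = ap_xy
    theta = math.radians(aoa_deg)
    return (x + distance_m * math.cos(theta),
            y + distance_m * math.sin(theta))
```

With two or more APs one could trilaterate from ranges alone; the single-AP setting is why the angle is essential.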
      • ADS: Accurate Decoding of RFID tags at Scale  long
        Tanmoy Das, Prasun Sinha (The Ohio State University)
        Abstract: RFID tags are popular as they are inexpensive and batteryless. These tags are used for tracking and monitoring objects at short range. According to recent studies, such applications can save billions of dollars in shopping malls, airports, and warehouses. However, existing approaches fail to decode large numbers of RFID tags due to poor scalability and accuracy. This poor performance stems from the estimation techniques used by existing solutions.

        ADS proposes an estimation technique that is highly accurate and scalable. ADS uses multiple frequencies in the excitation signal to receive packets from tags. This lets ADS exploit channel independence across several frequencies, at the cost of reduced communication range. By exploiting this channel independence, ADS reduces the number of parameters in the estimation technique, which improves performance. ADS provides 1.6x better scalability and 1.5x higher accuracy. In our evaluations, ADS achieves 1.7x better throughput on average, and up to 3x better throughput, than existing methods.

    • 12:00 - 13:30 - Lunch (Rooms 109 and 101C)
      • zD: A Scalable Zero-Drop Network Stack at End Hosts  long
        Yimeng Zhao, Ahmed Saeed, Ellen Zegura, Mostafa Ammar (Georgia Institute of Technology)
        Abstract: Modern end-host network stacks have to handle traffic from tens of thousands of flows and hundreds of virtual machines per host to keep up with the scale of modern clouds. This can cause congestion for traffic egressing from the end host, and the effects of this congestion have received little attention. Currently, an overflowing queue, like a kernel queuing discipline, will drop incoming packets. Packet drops lead to worse network and CPU performance by inflating the time to transmit a packet and by spending extra effort on retransmissions. In this paper, we show that current end-host mechanisms can lead to high CPU utilization, high tail latency, and low throughput when egress traffic is congested within the end host. We present zD, a framework for applying backpressure from a congested queue to traffic sources at end hosts that can scale to thousands of flows. We implement zD to apply backpressure in two settings: (i) between TCP sources and the kernel queuing discipline, and (ii) between VMs as traffic sources and the kernel queuing discipline in the hypervisor. zD improves throughput by up to 60% and improves tail RTT by at least 10x at high loads, compared to the standard kernel implementation.
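The backpressure idea can be sketched as a toy queue that pauses sources instead of dropping (my simplification; zD's actual mechanisms span the kernel and hypervisor, and the class here is invented):

```python
from collections import deque

class Backpressure:
    """Toy zD-style backpressure: when the queue is full, signal the
    source to stop sending rather than dropping its packet."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.q = deque()
        self.paused = set()

    def enqueue(self, src, pkt):
        if src in self.paused or len(self.q) >= self.capacity:
            self.paused.add(src)        # push back: source must stop
            return False
        self.q.append((src, pkt))
        return True

    def dequeue(self):
        if not self.q:
            return None
        src, pkt = self.q.popleft()
        if len(self.q) < self.capacity:
            self.paused.clear()         # room again: resume all sources
        return pkt
```

The packet the source would have dropped stays in the source's own buffer until it is unpaused, which is what avoids the retransmission and CPU cost of drops.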
      • Enabling ECN for Datacenter Networks with RTT Variations  long
        Junxue Zhang (Hong Kong University of Science and Technology); Wei Bai (Microsoft Research); Kai Chen (Hong Kong University of Science and Technology)
        Abstract: ECN has been widely employed in production datacenters to deliver high-throughput, low-latency communication. Despite being successful, prior ECN-based transports have an important drawback: they adopt a fixed RTT value when calculating the instantaneous ECN marking threshold, overlooking RTT variations in practice.

        In this paper, we reveal that the current practice of using a fixed high-percentile RTT for ECN threshold calculation can lead to persistent queue buildups, significantly increasing packet latency. On the other hand, directly adopting lower percentile RTTs results in throughput degradation. To handle the problem, we introduce ECN, a simple yet effective solution to enable ECN for RTT variations. At its heart, ECN inherits the current instantaneous ECN marking (based on a high-percentile RTT) to achieve high throughput and burst tolerance, while further marking packets (conservatively) upon detecting long-term queue buildups to eliminate unnecessary queueing delay without degrading throughput. We implement ECN on a Barefoot Tofino switch and evaluate it through extensive testbed experiments and large-scale simulations. Our evaluation confirms that ECN can effectively reduce latency without hurting throughput. For example, compared to the current practice, ECN achieves up to 23.4% (31.2%) lower average (99th percentile) flow completion time (FCT) for short flows while delivering similar FCT for large flows under production workloads.
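The trade-off described above can be caricatured in a toy marking rule (my illustration, not the paper's algorithm; the DCTCP-style constant and parameter names are assumptions): an instantaneous threshold derived from a high-percentile RTT, plus conservative marking when a standing queue persists.

```python
def make_marker(link_gbps, rtt_high_us, persist_us, low_frac=0.5):
    """Two-part ECN marking sketch: mark immediately above a
    DCTCP-style threshold K ~ C*RTT/7 (bytes, from a high-percentile
    RTT), and also mark when the queue sits above low_frac*K for
    longer than persist_us (a long-term buildup)."""
    k = link_gbps * 1e9 / 8 * rtt_high_us * 1e-6 / 7   # bytes
    state = {"above_since": None}

    def mark(queue_bytes, now_us):
        if queue_bytes > k:                      # instantaneous marking
            state["above_since"] = state["above_since"] or now_us
            return True
        if queue_bytes > low_frac * k:           # persistent standing queue?
            if state["above_since"] is None:
                state["above_since"] = now_us
            return now_us - state["above_since"] > persist_us
        state["above_since"] = None              # queue drained
        return False

    return mark
```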

      • Reducing Tail Latency using Duplication: A Multi-Layered Approach  long
        Hafiz Muhammad Mohsin Bashir, Abdullah Bin Faisal (Tufts University); M. Asim Jamshed (Intel Labs); Peter Vondras (Indigo Ag); Ali Musa Iftikhar (Tufts University); Ihsan Ayyub Qazi (LUMS); Fahad Dogar (Tufts University)
        Abstract: Duplication can be a powerful strategy for overcoming stragglers in cloud services, but is often used conservatively because of the risk of overloading the system. We call for making duplication a first-class concept in cloud systems, and make two contributions in this regard. First, we present duplicate-aware scheduling or DAS, an aggressive duplication policy that duplicates every job, but keeps the system safe by providing suitable support (prioritization and purging) at multiple layers of the cloud system. Second, we present the D-Stage abstraction, which supports DAS and other duplication policies across diverse layers of a cloud system (e.g., network, storage, etc.). The D-Stage abstraction decouples the duplication policy from the mechanism, and facilitates working with legacy layers of a system. Using this abstraction, we evaluate the benefits of DAS for two data parallel applications (HDFS, an in-memory workload generator) and a network function (snort-based IDS cluster). Our experiments on the public cloud and Emulab show that DAS is safe to use, and the tail latency improvement holds across a wide range of workloads.
      • Sift: Resource-Efficient Consensus with RDMA  long
        Mikhail Kazhamiaka, Babar Memon, Chathura Kankanamge, Siddhartha Sahu, Sajjad Rizvi, Bernard Wong, Khuzaima Daudjee (University of Waterloo)
        Abstract: Sift is a new consensus protocol for replicating state machines. It disaggregates CPU and memory consumption by creating a novel system architecture enabled by one-sided RDMA operations. We show that this system architecture allows us to develop a consensus protocol design which centralizes the replication logic. This simplifies the design of the protocol by preventing complex interactions between the participants of the consensus group. Our disaggregated design enables Sift to reduce deployment costs by sharing backup computational nodes across consensus groups deployed within the same cloud environment. We can further reduce the amount of required resources by integrating erasure codes, made simple by our architecture. Our evaluation results show that in a cloud environment with 100 groups where each group can support up to 2 simultaneous failures, Sift can reduce the cost by 56% compared to an RDMA-based Raft deployment.
    • 15:15 - 15:45 - Coffee Break
      • UNARI: An Uncertainty-aware Approach to AS Relationships Inference  long
        Guoyao Feng, Srinivasan Seshan, Peter Steenkiste (Carnegie Mellon University)
        Abstract: Over the last two decades, several algorithms have been proposed to infer the type of relationship between Autonomous Systems (ASes). While recent works have achieved increasingly higher accuracy, there has not been a systematic study on the uncertainty of AS relationship inference. In this paper, we analyze the factors contributing to this uncertainty and introduce a new paradigm to explicitly model the uncertainty and reflect it in the inference result. We also present UNARI, an exemplary algorithm implementing this paradigm, which leverages a novel technique to capture the interdependence of relationship inference across AS links.
      • QPipe: Quantiles Sketch Fully in the Data Plane  short
        Nikita Ivkin (Amazon); Zhuolong Yu, Vladimir Braverman, Xin Jin (Johns Hopkins University)
        Abstract: Efficient network management requires collecting a variety of statistics over the packet flows. Monitoring the flows directly in the data plane allows the system to detect anomalies faster. However, monitoring algorithms have to handle a throughput of 10^9 packets per second and to maintain a very low memory footprint. Widely adopted sampling-based approaches suffer from low accuracy in estimations. Thus, it is natural to ask: "Is it possible to maintain important statistics in the data plane using a small memory footprint?". In this paper, we answer this question in the affirmative for an important case of quantiles. We introduce QPipe, the first quantiles sketching algorithm that can be implemented entirely in the data plane. Our main technical contribution is a data-plane implementation of a variant of the SweepKLL algorithm. Specifically, we give novel implementations of argmin(), the major building block of SweepKLL, which is usually not supported in the data plane of commodity switches. We prototype QPipe in P4 and compare its performance with a sampling-based baseline. Our evaluations demonstrate 10x memory reduction for a fixed approximation error and 90x error improvement for a fixed amount of memory. We conclude that QPipe can be an attractive alternative to sampling-based methods.
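The building block behind KLL-family quantile sketches is the compaction step, which can be sketched in a few lines: when a buffer fills, sort it and keep a random half, with survivors implicitly carrying double weight. This halves memory at the cost of a bounded rank error; it is the kind of primitive that QPipe's SweepKLL variant reworks to fit data-plane constraints (such as the lack of argmin()). Illustrative sketch only, not the paper's algorithm.

```python
import random

def compact(buffer):
    """One KLL-style compaction step on a full buffer.

    Sorts the buffer and keeps every other element, starting from a
    random parity so the sketch stays unbiased. Each survivor stands
    in for two original items (its weight doubles).
    """
    buffer.sort()
    offset = random.randrange(2)  # random parity keeps rank estimates unbiased
    return buffer[offset::2]      # survivors carry weight 2
```

Chaining compactions across levels yields the logarithmic-memory quantile estimate that the sketch family is known for.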
      • Tuple Space Explosion: A Denial-of-Service Attack Against a Software Packet Classifier  long
        Levente Csikor, Dinil Mon Divakaran, Min Suk Kang (National University of Singapore); Attila Korosi, Balázs Sonkoly, David Haja (Budapest University of Technology and Economics); Dimitrios Pezaros (University of Glasgow); Stefan Schmid (University of Vienna); Gábor Rétvári (Budapest University of Technology and Economics)
        Abstract: Efficient and highly available packet classification is fundamental for various security primitives. In this paper, we evaluate whether the de facto Tuple Space Search (TSS) packet classification algorithm used in popular software networking stacks such as the Open vSwitch is robust against low-rate denial-of-service attacks. We present the Tuple Space Explosion (TSE) attack that exploits the fundamental space/time complexity of the TSS algorithm. TSE can degrade the switch performance to 12% of its full capacity with a very low packet rate (0.7 Mbps) when the target only has simple policies such as "allow some, but drop others". Worse, an adversary with additional partial knowledge of these policies can virtually bring down the target with the same low attack rate. Interestingly, TSE does not generate any specific traffic patterns but only requires arbitrary headers and payloads, which makes it particularly hard to detect.

        Due to the fundamental complexity characteristics of TSS, unfortunately, there seems to be no complete mitigation to the problem. As a long-term solution, we suggest the use of other algorithms (e.g., HaRP) that are not vulnerable to the TSE attack. As a short-term countermeasure, we propose MFCguard that carefully manages the tuple space and keeps packet classification fast.
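The complexity that TSE exploits is visible in a minimal model of Tuple Space Search: the classifier keeps one hash table per distinct mask (the "tuple"), and a lookup must probe every table. An attacker who forces the ruleset to contain many distinct masks therefore makes every packet lookup proportionally slower. The field layout below is heavily simplified relative to Open vSwitch; it is a sketch of the algorithm's structure, not its implementation.

```python
class TupleSpaceClassifier:
    """Minimal Tuple Space Search over integer headers."""

    def __init__(self):
        self.tables = {}  # mask -> {masked header value -> action}

    def add_rule(self, mask, value, action):
        self.tables.setdefault(mask, {})[value & mask] = action

    def lookup(self, header):
        probes = 0
        # Lookup cost grows linearly with the number of distinct masks:
        # this is the lever the Tuple Space Explosion attack pulls.
        for mask, table in self.tables.items():
            probes += 1
            action = table.get(header & mask)
            if action is not None:
                return action, probes
        return None, probes
```

A miss is the worst case: every table is probed, so attack traffic that matches no rule extracts the maximum work per packet.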

    • 17:30 - 22:30 - Social Dinner
    • 08:00 - 09:00 - Breakfast
      • AViC: A Cache for Adaptive Bitrate Video  long
        Zahaib Akhtar, Yaguang Li, Ramesh Govindan (University of Southern California); Emir Halepovic, Shuai Hao (AT&T Labs Research); Yan Liu (University of Southern California); Subhabrata Sen (AT&T Labs Research)
        Abstract: Video dominates Internet traffic today. Users retrieve on-demand video from Content Delivery Networks (CDNs), which cache video chunks at front-ends. In this paper, we describe AViC, a caching algorithm that leverages properties of video delivery, such as request predictability and the presence of highly unpopular chunks. AViC's eviction policy exploits request predictability to estimate a chunk's future request time and evict the chunk with the furthest future request time. Its admission control policy uses a classifier to predict singletons, chunks evicted before a second reference. Using real-world CDN traces from a commercial video service, we show that AViC outperforms a range of algorithms including LRU, GDSF, AdaptSize, and LHD. In particular, LRU requires up to 3.5x the cache size to match AViC's performance. Further, AViC has low time complexity and memory complexity comparable to GDSF.
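The eviction rule in the abstract is a Belady-style policy: evict the cached chunk whose estimated next request is furthest in the future, which AViC can approximate because video chunk requests are predictable. A one-line sketch, with the estimates dict invented for illustration:

```python
def evict_candidate(next_request_time):
    """Return the chunk id with the furthest estimated next request time.

    next_request_time maps chunk id -> predicted time of its next request;
    the chunk we can most afford to lose is the one needed last.
    """
    return max(next_request_time, key=next_request_time.get)
```

Belady's rule is optimal with perfect foresight; AViC's contribution is that video request predictability makes the estimates good enough in practice.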
      • RSS++: load and state-aware receive side scaling  long
        Tom Barbette, Georgios P. Katsikas, Gerald Q. Maguire Jr., Dejan Kostic (KTH Royal Institute of Technology)
        Abstract: While the current literature typically focuses on load-balancing among multiple servers, in this paper, we demonstrate the importance of load-balancing within a single machine (potentially with hundreds of CPU cores). In this context, we propose a new load-balancing technique (RSS++) that dynamically modifies the receive side scaling (RSS) indirection table to spread the load more evenly across the CPU cores. RSS++ incurs up to 14x lower 95th percentile tail latency and orders of magnitude fewer packet drops compared to RSS under high CPU utilization. RSS++ allows higher CPU utilization and dynamic scaling of the number of allocated CPU cores to accommodate the input load, while avoiding the typical 25% over-provisioning.

        RSS++ has been implemented for both (i) DPDK and (ii) the Linux kernel. Additionally, we implement a new state migration technique, which facilitates sharding and reduces contention between CPU cores accessing per-flow data. RSS++ keeps flow state in groups that can be migrated at once, leading to 20% higher efficiency than a state-of-the-art shared flow table.
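The rebalancing idea can be sketched as a greedy step over the RSS indirection table: move a bucket from the most loaded core to the least loaded one. Real RSS++ tracks per-bucket load counters and solves an optimization to pick moves; this single greedy move, with invented data structures, only illustrates the mechanism.

```python
def rebalance(indirection, load_per_bucket, cores):
    """One greedy rebalancing step over an RSS indirection table.

    indirection maps bucket id -> core; load_per_bucket maps bucket id ->
    observed load. Moves one bucket from the hottest to the coldest core.
    """
    core_load = {c: 0.0 for c in cores}
    for bucket, core in indirection.items():
        core_load[core] += load_per_bucket[bucket]
    hot = max(core_load, key=core_load.get)
    cold = min(core_load, key=core_load.get)
    # Move the lightest bucket off the hot core: enough to shed load,
    # while minimizing the amount of per-flow state that must migrate.
    movable = [b for b, c in indirection.items() if c == hot]
    bucket = min(movable, key=load_per_bucket.get)
    indirection[bucket] = cold
    return indirection
```

Because all flows hashing to a bucket move together, migrating whole buckets is what lets RSS++ move flow state in groups rather than per flow.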

      • Egret: Simplifying Traffic Management for Physical and Virtual Network Functions  short
        Yikai Lin (University of Michigan); Ajay Mahimkar, Bo Han, Zihui Ge, Vijay Gopalakrishnan (AT&T Labs Research); Z. Morley Mao (University of Michigan)
        Abstract: Traffic migration is a common procedure performed by operators during planned maintenance and unexpected incidents to prevent/reduce service disruptions. However, current practices of traffic migration often couple operators' intentions (e.g., device upgrades) with network setups (e.g., load-balancers), resulting in poor reusability and substantial operational complexities. Our study of 205 Methods of Procedure (MOPs) from a major U.S. carrier suggests that generalizing traffic migration with a unified model is feasible. Such generalization along with SDN's automation capability is key to scalable and flexible management of traffic, especially for virtualized network functions with unprecedented scale, heterogeneity, and fast iteration. In this paper, we propose Egret, a generic traffic migration system that simplifies traffic management for physical and virtual network functions. Egret (1) hides intricate implementation details from operators with generic intention-based interfaces, and (2) modularizes common traffic migration procedures to enable plug-and-play by developers and vendors. Leveraging a novel mask-based abstraction of traffic migration jobs, Egret can further simplify reverse traffic migration and enable job interleaving.
    • 10:15 - 10:45 - Coffee Break
      • Network topology design at 27,000 km/hour  long
        Debopam Bhattacherjee, Ankit Singla (ETH Zürich)
        Abstract: Upstart space companies are actively developing massive constellations of low-flying satellites to provide global Internet service. We examine the problem of designing the inter-satellite network for low latency and high capacity. We posit that the high density of these new constellations and the high-velocity nature of such systems render traditional approaches for network design ineffective, motivating new methods specialized for this problem setting.

        We propose one such method, explicitly aimed at tackling the high temporal dynamism inherent to low-Earth orbit satellites. We exploit repetitive patterns in the network topology such that we avoid expensive link changes over time, while still providing near-minimal latencies at nearly 2x the throughput of standard past methods. Further, we observe that the geometry of satellite constellations admits more efficient designs, if a small, controlled amount of dynamism in links is permissible. For the leading Starlink constellation, our approach enables an efficiency improvement of 54%.

      • Loko: Predictable Latency in Small Networks  long
        Amaury Van Bemten, Nemanja Deric, Johannes Zerwas, Andreas Blenk (Technical University of Munich); Stefan Schmid (University of Vienna); Wolfgang Kellerer (Technical University of Munich)
        Abstract: A predictable network performance is mission-critical for many applications and yet hard to provide due to difficulties in modeling the behavior of the increasingly complex network equipment. This paper studies the problem of providing deterministic latency guarantees in small networks based on low-capacity hardware (e.g., in-cabin and industrial networks): such networks are of increasing importance, need to meet stringent performance requirements, but have hardly been explored so far. Our main contribution is the design, implementation, and evaluation of Loko, a system which provides predictable latency guarantees in programmable networks using low-cost hardware. Loko relies on a novel measurement-based methodology and uses deterministic network calculus to derive a reliable performance model of a given switch. To this end, we also show that state-of-the-art models in the literature like QJump and Silo fall short in modeling the behavior of such switches, due to incorrect architectural and performance assumptions. As a case study, we implement Loko for the Zodiac FX switch. Our experiments are encouraging: we find that the derived models are indeed accurate, allowing Loko to provide deterministic end-to-end guarantees with low-cost programmable devices.
      • Flash: Efficient Dynamic Routing for Offchain Networks  long
        Peng Wang, Hong Xu (City University of Hong Kong); Xin Jin (Johns Hopkins University); Tao Wang (New York University)
        Abstract: Offchain networks emerge as a promising solution to address the scalability challenge of blockchain. Participants make payments through offchain networks instead of committing transactions onchain. Routing is critical to the performance of offchain networks. Existing solutions use either static routing with poor performance or dynamic routing with high overhead to obtain the dynamic channel balance information. In this paper, we propose Flash, a new dynamic routing solution that leverages the unique transaction characteristics in offchain networks to strike a better tradeoff between path optimality and probing overhead. By studying the traces of real offchain networks, we find that the payment sizes are heavy-tailed, and most payments are highly recurrent. Flash thus differentiates the treatment of elephant payments from that of mice payments. It uses a modified max-flow algorithm for elephant payments to find paths with sufficient capacity, and strategically routes the payment across paths to minimize the transaction fees. Mice payments are sent directly by looking up a routing table with a few precomputed paths to reduce probing overhead. Testbed experiments and trace-driven simulations show that Flash improves the success volume of payments by up to 2.3x compared to the state-of-the-art routing algorithm.
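The fee-minimizing split of an elephant payment across candidate paths can be sketched greedily: fill the cheapest-fee path first, up to its capacity, then move to the next. This is a simplification of Flash's optimization over max-flow paths; the (capacity, fee_rate) tuples are invented inputs for the sketch, not Flash's actual interfaces.

```python
def split_elephant(amount, paths):
    """Greedily split an elephant payment across candidate paths.

    paths is a list of (capacity, fee_rate) tuples. Returns a list of
    (amount_sent, fee_rate) allocations, or None if total capacity is
    insufficient and the payment fails.
    """
    allocation = []
    remaining = amount
    for capacity, fee in sorted(paths, key=lambda p: p[1]):  # cheapest fee first
        if remaining <= 0:
            break
        take = min(capacity, remaining)
        allocation.append((take, fee))
        remaining -= take
    if remaining > 0:
        return None  # insufficient capacity across all probed paths
    return allocation
```

Mice payments skip all of this and go straight out on a precomputed path, which is where the probing-overhead savings come from.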
    • 12:00 - Boxed Lunch (Room 109)