Matvey Arye is presenting his Pre FPO on April 30, 2015 at 3pm in CS 401. The members of his committee are: Mike Freedman (advisor), Kai Li (reader), Andrea LaPaugh (reader), Jen Rexford (non-reader), and Nick Feamster (non-reader). Everyone is invited to attend his talk. The talk title and abstract follow below:

Title: "Data Processing Across Continents"

Abstract: An increasing number of data sources create data across the globe. These include everything from server logs owned by Internet-scale companies to military intelligence systems. This thesis addresses the question of how to enable near-real-time analytical queries on such data. Existing systems tend to centralize such data into a single datacenter before analyzing it. However, in light of low and asymmetric bandwidth provisioning in and between certain geographic regions, centralizing all data can be both slow and costly.

I have addressed this problem with three complementary research directions. First, I describe a system that queries the data in a distributed manner and centralizes only the data that is needed to fulfill the query. It dynamically adjusts the accuracy of its answers to trade off data quality against responsiveness. Second, I explore some challenges due to the interaction between an application-level dynamic quality adaptation control loop and TCP (which has its own control loop). These two control loops can create negative feedback effects, and I propose a way to overcome them. We translate these insights into the domain of Internet video streaming, and show improvements in streaming performance over leading industry players. Finally, I present a case study of how to optimize the top-k query for wide-area analysis. The top-k query addresses questions of popularity and is thus ubiquitous in modern computer systems. My algorithms reduce both the bandwidth usage and the number of rounds used by such queries.