Yaping Zhu will present her preFPO on Tuesday March 1 at 10:30AM in Room 402. The members of her committee are: Jen Rexford, advisor; Vivek Pai and Aman Shaikh (AT&T Labs Research), readers; Mike Freedman and Andrea LaPaugh, nonreaders. Everyone is invited to attend her talk. Her abstract follows below.

--------------------------------------

Minimizing Wide-area Performance Disruptions in Inter-domain Routing

Abstract

The Internet is the platform for most of our communication needs today. However, network changes such as routing changes or congestion lead to performance disruptions, which affect the user experience. Minimizing performance disruptions is therefore crucial, and network operators have to react and fix the problems.

Diagnosing wide-area performance disruptions is challenging. First, each network has limited visibility into the root cause of a performance disruption, so network operators must collect and analyze measurements of routing and traffic data. Second, many potential factors can lead to performance disruptions, and these factors are usually interdependent. As a result, network diagnosis is usually done in an ad-hoc manner, and there are no formalized ways to define metrics and classify performance disruptions according to their causes.

The thesis conducts two case studies to diagnose wide-area performance disruptions from the perspectives of a large tier-1 ISP and a large CDN:

i) From the ISP's perspective, we designed and implemented a system that tracks inter-domain route changes at scale and in real time. Our system could be used as the building block for many diagnosis scenarios for ISPs.

ii) From the CDN's perspective, we focus on diagnosing wide-area network changes that caused latency increases for clients accessing services in the CDN.
We proposed a method for automatically classifying large increases in latency, and evaluated our techniques on one month of measurement data to identify major sources of high latency for a large CDN.

Stepping back to the protocol designer's perspective, we refactor the inter-domain routing protocol BGP (Border Gateway Protocol), based on the lessons learned from the case studies. First, since each network has limited visibility and control, confined to its own network and its neighbors, we propose to select a route based only on the next-hop AS. Second, the BGP protocol was not designed with the operational challenges of performance and security in mind. Thus, there are many proposals to add BGP attributes to satisfy operational needs. These proposals make the protocol and its configuration complicated and error-prone, and make it difficult for network operators to diagnose problems. Instead, we argue for separating the performance and security requirements out of the protocol. Our proposal of next-hop BGP simplifies the protocol, and has the benefits of fast convergence, incentive compatibility, and easier support for multi-path routing.