How Facebook’s Project SEISMIC helps bring greener telecom infrastructure

All deployment site photos from Peru were taken by our partners at Mayu Telecomunicaciones and are used here with permission. To request permission to use the photos, contact servicios@mayutel.com.

Facebook Connectivity’s mission is to enable better, broader global connectivity to bring more people online to a faster internet. This mission has become more important, with ever-increasing data consumption and need for coverage. We collaborate with others in the industry — including telecom operators, community leaders, technology developers, and researchers — in order to find solutions that are scalable and sustainable. One of our most recent collaborations is Project SEISMIC: Smart Energy Infrastructure for Mobile Internet Connectivity. In this project, we are developing a solution to smartly manage the power and functionality of telecom sites. For example, we can reduce the capacity and transmission power of the site during less busy periods. By doing so, we want to better design and operate off-grid sites in order to reduce cost and improve their sustainability.

Many parts of the world still lack coverage and capacity, especially in rural areas. To help close this gap, telecommunications providers need to build new telecom sites and links. However, many rural areas lack access to an electrical grid. This presents a major challenge, as telecom sites and networks consume a significant amount of power, and this consumption is expected to rise even further.

In places where there is no reliable electricity grid, we have to rely on solar power, diesel power, or hydropower. Each has its own set of requirements: Solar-powered sites require solar panels and batteries to be brought on-site, diesel-powered sites need periodic resupply of diesel, and hydropowered sites require the construction of hydro generators. All this leads to significant challenges in cost, logistics, and transportation, presenting a barrier to providing connectivity in remote areas. To help remove these barriers and help make rural connectivity more accessible, Facebook is exploring innovations like Project SEISMIC that enable us to build and operate telecom sites more efficiently.

Bringing more sustainable connectivity

Project SEISMIC applies smart, dynamic power management to telecom sites so that off-grid sites can be better designed and operated.

A major challenge that we are addressing in Project SEISMIC is how to provide high reliability and availability over time as the supply of power varies. For example, the output power from solar panels depends on the amount of sunlight, which changes with many factors, including weather conditions, the time of day, and the day of the year. This means that more solar panels and batteries have to be provisioned to meet availability requirements in areas that are subject to longer rainy and cloudy periods, as well as in areas that receive less sunlight. Similarly, the output of a hydropower generator depends on its water supply.

Conventional power system sizing for a solar-powered telecom site is based on (1) the worst-case historic irradiance at the installation site, which can be much worse than the average irradiance, and (2) the average power consumption of the telecom system, which typically remains static over time, regardless of the weather and even when most people are asleep and traffic is close to zero.
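To make the sizing arithmetic concrete, here is a minimal sketch comparing the panel area needed under the conventional rule (worst-case irradiance, static load) with a site whose load can be reduced on low-sunlight days. Every number, the function name `panel_area_m2`, and the simplified energy-balance formula are illustrative assumptions rather than the project's actual design method; battery sizing and system losses are ignored.

```python
def panel_area_m2(load_w, irradiance_kwh_m2_day, efficiency=0.18):
    """Panel area (m^2) whose daily output matches the daily load.

    Simplified energy balance (an assumption for illustration):
    area * irradiance * efficiency = load energy per day.
    """
    daily_load_kwh = load_w * 24 / 1000
    return daily_load_kwh / (irradiance_kwh_m2_day * efficiency)


# Hypothetical site figures, chosen only to illustrate the arithmetic.
WORST_CASE_IRRADIANCE = 2.0   # kWh/m^2/day on the darkest historical days
TYPICAL_IRRADIANCE = 5.0      # kWh/m^2/day on an ordinary day
FULL_LOAD_W = 200.0           # static load assumed by conventional sizing
REDUCED_LOAD_W = 150.0        # load after throttling on dark days

# Conventional: the full load must be carried even under worst-case irradiance.
conventional = panel_area_m2(FULL_LOAD_W, WORST_CASE_IRRADIANCE)

# Dynamic: reduced load on worst-case days, full load on typical days;
# the array must cover whichever case needs more area.
dynamic = max(
    panel_area_m2(REDUCED_LOAD_W, WORST_CASE_IRRADIANCE),
    panel_area_m2(FULL_LOAD_W, TYPICAL_IRRADIANCE),
)

print(f"conventional: {conventional:.1f} m^2, dynamic: {dynamic:.1f} m^2")
```

With these made-up inputs, the dynamically managed site needs a smaller array because the worst-case days no longer have to carry the full static load. The size of the difference depends entirely on the assumed figures.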

Finally, many off-grid sites also lack accessible transportation, and in many cases, the equipment must be brought on-site by pack animals, boat, or even on foot. Inclement weather, floods, and inaccessible tracks all present incredible logistical challenges. All this leads to high telecom site costs and can make connectivity unfeasible.


Mayutel engineers and local workers load a solar panel onto a boat to take it to our test sites in rural Peru.

However, just as power supply varies over time, so do usage patterns. With this in mind, we considered the potential of smart, dynamic power management of a telecom site. What if we could adjust performance parameters (such as transmit power, bandwidth, number of channels, and bit rates) to better match power supply variations while maintaining the right level of connectivity performance at the right time? This is the inspiration behind Project SEISMIC.
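One way to picture this kind of adjustment is a controller that, for each planning window, picks the highest-performance operating mode that the available energy can sustain. The sketch below is a simplified illustration under stated assumptions: the `Mode` fields, the three modes and their power draws, the reserve margin, and the `choose_mode` rule are all hypothetical and are not taken from the SEISMIC implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    tx_power_w: float     # hypothetical radio transmit power
    bandwidth_mhz: float  # hypothetical channel bandwidth
    site_load_w: float    # hypothetical total site power draw


# Ordered from highest to lowest performance (illustrative values only).
MODES = [
    Mode("full", 20.0, 20.0, 200.0),
    Mode("reduced", 10.0, 10.0, 150.0),
    Mode("minimal", 5.0, 5.0, 100.0),
]


def choose_mode(battery_wh, forecast_solar_wh, hours, reserve_wh=500.0):
    """Return the best mode whose energy need fits the energy budget.

    Budget = stored energy + forecast solar input - a safety reserve.
    Falls back to the lowest-power mode when nothing fits.
    """
    budget_wh = battery_wh + forecast_solar_wh - reserve_wh
    for mode in MODES:
        if mode.site_load_w * hours <= budget_wh:
            return mode
    return MODES[-1]


sunny = choose_mode(battery_wh=3000, forecast_solar_wh=2500, hours=24)
cloudy = choose_mode(battery_wh=3000, forecast_solar_wh=1200, hours=24)
print(sunny.name, cloudy.name)
```

A real controller would also need to account for traffic demand, so that capacity is reduced mainly during less busy periods; that input is left out here to keep the sketch short.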

Conventional telecom sites are designed and operated with few to no adjustments done during operation. This means that availability requirements are derived from peak power consumption. With SEISMIC, we believe that we can better design telecom sites that are greener — requiring fewer solar panels, batteries, and other power system elements — to drive down cost, improve sustainability, and overcome cost challenges.

SEISMIC uses predictive analytics, smart telecom site management, smart telecom site elements, and cloud services to improve the power efficiency of a telecom site while maintaining the right level of performance to meet availability requirements.

Proving more sustainable connectivity

In order to prove our research concept, we developed partnerships with several key players. Mayu Telecomunicaciones, the first rural mobile infrastructure operator in Peru, agreed to become an operator partner and collaborate with us to deploy SEISMIC test sites. They work with the local communities in rural Peru to build the telecom sites, deploy 4G radio systems, and provide broadband connectivity for the first time to many in the community.

Clear Blue Technologies, a smart power management solutions and services company, provided a power management module, software, and cloud service to enable dynamic power management. Aviat Networks provided wireless microwave technology to enable backhaul connectivity to the test sites. BaiCells provided radio access network units.


Mayutel engineers travel by boat to our test sites in rural Peru.

To obtain the best set of data, we commissioned two active telecom sites in Peru. One is the baseline site that uses conventional telecom sizing and operational methodology. The other is a smartly designed site that uses fewer solar panels and batteries. By commissioning these two sites side by side, we can compare their performance over time and track relevant telecom performance indicators, such as number of connections, bandwidth, and reliability. We believe that significant savings in power costs — on the order of 40 percent to 60 percent — are possible while maintaining relevant telecom performance.


Our smart power test sites under construction during dry and rainy periods.

Now that the two sites we commissioned are live in Mayutel’s network, we have started collecting data as we test the functionality of both sites. As we collect more data over time, we will refine our analysis and share an update on the performance, availability, and power reliability of our telecom sites.

A call to action

Power management is key to connectivity and networking infrastructure, bringing performance, economic, and sustainability benefits. To learn about our smart power management solution in the Telecom Infra Project, please join the Network as a Service Solutions project group. For more about the Telecom Infra Project, visit its website. You can also learn about other initiatives on the Facebook Connectivity website.

In addition to the current focus on rural and deep rural applications, we believe that this idea can be applicable to a wide range of other telecom deployment use cases, including urban small cell sites and edge computing nodes. In the case of urban small cells, for example, being able to efficiently power small cells without the need for grid connection may provide a significant cost benefit, provided that the solar battery system sizing can be made suitably small. We welcome interested parties to explore these and other use cases with us.

Further, besides the current project focus on solar battery powering solutions, the concept can be readily applied to benefit various other powering architectures such as diesel battery, wind battery, and other deployment scenarios.

Thanks to our partners

This project would not be possible without the indefatigable commitment and help of our partners. We thank Mayu Telecomunicaciones for agreeing to become our operator partner and for providing their expertise, access to their sites, engineering, and support.

We also thank our technology partners for their engineering support: Clear Blue Technologies, Aviat Networks, Parallel Wireless, and BaiCells. We are grateful for the excellent collaboration and teamwork that has resulted in successful deployment of this demonstration.

The post How Facebook’s Project SEISMIC helps bring greener telecom infrastructure appeared first on Facebook Research.

Q&A with Tianyin Xu, visiting scientist at Facebook Core Systems and assistant professor at UIUC

In this monthly interview series, we turn the spotlight on members of the academic community and the important research they do — as thought partners, collaborators, and independent contributors.

For May, we nominated Tianyin Xu, a visiting scientist from the University of Illinois at Urbana-Champaign (UIUC). Before starting his professorship at UIUC, Xu joined Facebook’s Core Systems Disaster Recovery team in order to explore real-world systems applications. Visiting scientists are short-term employees (STEs) sponsored by research teams, and open positions are posted on the Facebook Careers page.

In this Q&A, Xu shares his experience as a visiting scientist at Facebook, discusses the research projects he’s worked on so far, and offers advice for academics thinking about spending some time in industry.

Q: Tell us about your role at UIUC and the type of research you and your department specialize in.

Tianyin Xu: I’m an assistant professor in the computer science department at UIUC. My research interests are broadly in computer systems, with a focus on software and system reliability. I’m particularly interested in computer systems being operated at the cloud and data center scale.

UIUC has a very strong, active computer science department, with more than a hundred faculty members. With such a big department, we have a strong presence in pretty much every field of computer science.

Q: What inspired you to spend some time at Facebook Core Systems at the beginning of your professorship?

TX: Taking a 6–12 month stint (a so-called prebbatical) is a common practice for new assistant professors of computer science nowadays. I also liked the idea — it would help me take a break to be physically and mentally ready, and, more important, would allow me to spend time thinking about the type of research I would like to do for my faculty job.

As a PhD graduate with a faculty job lined up, I was looking for an environment drastically different from that of an ivory tower. Particularly, I was seeking opportunities that allowed me to step into real-world large-scale systems and to understand the important problems that truly matter. I believed such experiences would be invaluable for my growth as a systems researcher. For example, a key question I always seek answers for is “Why do existing systems still fail in practice, despite the rigorous software engineering process and the wide adoption of reliability techniques?” Answers to such questions open doors for me to think clearly and to make relevant technical contributions; however, it is challenging to accurately and comprehensively answer such questions in a purely academic environment.

Facebook Core Systems provides a fantastic environment, where I can get firsthand experience with large-scale production systems and develop a deep, comprehensive understanding of real-world challenges. The open culture lets me access almost all the resources and encourages me to connect with researchers and engineers with diverse expertise and experiences. One really special thing I find is the incredibly flat organization: everyone sits in the same open space and is close to one another, no matter whether they’re a VP, a director, or a level-3 engineer. I used to constantly look folks up, walk to their desks, ask them questions, and have great conversations.

Q: What is it like being a visiting scientist at Facebook?

TX: The position provides the luxury to understand large-scale distributed systems from the inside out, while thinking about fundamental research problems. Very few jobs provide both at the same time. I had a wonderful experience. I learned a huge amount (much of which can never be learned in an ivory tower), did really interesting research, had a lot of fun, built strong connections, made very close friends, and ate too much gourmet (and free!) food.

Q: What research projects have you worked on?

TX: I worked on two infrastructure systems, Maelstrom and Taiji. Maelstrom is a system for mitigating data center–level disasters by draining interdependent traffic safely and efficiently, and Taiji is a system for managing global user traffic for large-scale internet services at the edge. We later published the two systems at premier computer system conferences, with Maelstrom published at OSDI 2018 and Taiji at SOSP 2019.

One question I frequently received is why I didn’t choose to work on configuration management systems. Configuration management was my PhD thesis topic, and it was what connected me to Facebook researchers (I met CQ Tang and the Configurator team at SOSP 2015, where they published the paper “Holistic configuration management at Facebook”). In fact, I always thought I would join the Configurator team.

When I finally showed up in Menlo Park in October 2017, CQ suggested that I meet a few teams at Core Systems to explore more potential collaboration opportunities. In one of those meetings, I talked to Kaushik Veeraraghavan and Justin Meza from the Disaster Recovery (DR) team. Kaushik threw me an incredibly intriguing research problem: What can we do when an entire data center is failing (for example, due to fiber cuts)? I had no answer, as all the reliability techniques I had in mind could not handle such widespread failures to that scale. That was the problem Maelstrom tried to address.

When I joined the DR team, my initial plan was to switch to a new team after six months (so I could see different systems and research problems). However, I ended up spending my entire prebbatical on the DR team because I enjoyed the work and my colleagues so much.

Q: What is the impact of your STE experience on your research and teaching?

TX: This is also a common question I get! There are too many impacts, which will overflow this interview if I try to list them all. So, let me give some examples.

Doing a PhD is more about depth. I worked on one research problem (misconfiguration detection and prevention) and dug deep on this topic to claim a PhD. However, upon graduating, I found myself being very narrowly focused, as I only knew my thesis topic. I asked myself, “How can I be a professor who is supposed to have broad knowledge?” Yes, I did read papers on many other systems topics, but I often find it hard to get to the bottom of the problems from reading papers.

The STE experience helped me develop a direct, holistic understanding of many real-world problems and answered many of my questions/doubts. Furthermore, my work on the DR team pushed me to understand various types of production systems and how each system fits in place (our mission was to make every system at Facebook disaster-ready). Based on my understanding and experience, I later created a new course at UIUC entitled “Reliability of Cloud-Scale Systems.” It was a success. The course was ranked as “excellent,” and it was especially praised for the relevance and importance of the materials.

The STE experience also greatly benefits my research. In particular, it helps me think much more deeply about the practicality of my work, which matters a great deal to me. For example, I took some time to rethink my PhD work on configuration management based on the configuration-related failures at Facebook. The rethinking and reflection led to my recent project, configuration testing (also known as ctest), which is a more practical technique to defend against misconfigurations and prevent production failures. The work was published at OSDI 2020 and is supported by the Facebook Distributed Systems research award.

Q: What advice would you give to university researchers looking to become visiting scientists at Facebook?

TX: I’ve internally shared my experience transitioning from a PhD student to a Facebook engineer because I learned a lot. It was not easy in the beginning. I suddenly found that I was no longer good at my work and that I lacked many skills. I later changed the way I worked at Facebook, became effective, and started to enjoy myself. Here is what I learned:

  1. Don’t work alone. Many PhD students tend to work alone because independence is required in grad school. However, independence doesn’t mean working alone. Working alone is a common pitfall, and teamwork is a key to success. If you want to understand something, don’t try to spend two weeks reading the code and documentation yourself. Instead, talk to a colleague, and you will probably get things done much faster. You will be much more effective if you know how to work with people.
  2. Focus on impact, not papers. If you do great work and make an impact at Facebook, you won’t have a problem publishing your work at top academic venues. This may not work the other way around. Note that impact is always much harder to get than a paper and will be weighted heavily during your job search (for both academic and industry jobs).
  3. Learn more than your project. Facebook has a very open culture, and you can access almost all of its many resources. My advice is to take the opportunity to learn about more than your own project and to gain a broad understanding of the important systems and problems.
  4. Build connections. There are many senior and junior, well-known and rising-star researchers at Facebook. Build your connections! I still keep connections with my colleagues and my mentors at Facebook, and constantly bug them for feedback and advice.
  5. Make friends. I made a lot of personal friends at Facebook. For example, the year I started at UIUC, my tech lead, David Chou, flew all the way from Menlo Park to Champaign only to visit and check on me.

Q: Where can people learn more about your research?

TX: People can find more information on my website. If anyone ever wants to discuss anything about my research, they can always feel free to reach out.

The post Q&A with Tianyin Xu, visiting scientist at Facebook Core Systems and assistant professor at UIUC appeared first on Facebook Research.

Announcing the recipients of the 2021 Facebook Fellowship awards

The Facebook Fellowship program provides awards to PhD candidates conducting research on important topics across computer science and engineering, such as computer vision, programming languages, computational social science, and more. Recipients of the award receive tuition and fees paid for up to two academic years and a stipend of $42,000, which includes conference travel support.

The Fellows are also invited to Facebook HQ in Menlo Park to attend the annual Fellowship Summit. This summit serves as an opportunity for Fellows to network with the rest of their cohort, share their research, and learn more about what researchers at Facebook are working on. As in 2020, we will host the summit virtually this year.

The program is now in its 10th year and has supported more than 144 PhD candidates from a broad range of universities. This year, we received 2,163 applications from over 100 universities worldwide, and we selected 26 outstanding Fellows from 19 universities.

Congratulations to this year’s winners, and thank you to everyone who took the time to submit an application.

2021 Facebook Fellows

Applied statistics


Hsiang Hsu
Harvard University

Finalists: Ayush Jain, University of California San Diego; Hanyu Song, Duke University

AR/VR photonics and optics


Prachi Tureja
California Institute of Technology

Finalists: Nathan Tessema Ersumo, University of California, Berkeley; Geun Ho Ahn, Stanford University; Christina Maria Spaegele, Harvard University

AR/VR future technologies


Logan Clark
University of Virginia


Caitlin Morris
Massachusetts Institute of Technology (MIT)

Finalists: Dishita Turakhia, MIT; Adam Williams, Colorado State University; Feiyu Lu, Virginia Polytechnic Institute and State University (Virginia Tech)

Blockchain and cryptoeconomics


Yan Ji
Cornell University

Finalists: Vibhaalakshmi Sivaraman, MIT; Itay Tsabary, Technion — Israel Institute of Technology

Computational social science


Manoel Horta Ribeiro
École Polytechnique Fédérale de Lausanne

Finalists: Kelsey Gonzalez, University of Arizona; Marianne Aubin Le Quere, Cornell University

AR/VR computer graphics


Cheng Zhang
University of California, Irvine


Liang Shi
MIT

Finalist: Joey Litalien, McGill University

Computer vision


Shuang Li
MIT


Xingyi Zhou
University of Texas at Austin

Finalists: Xinshuo Weng, Carnegie Mellon University; Yunzhu Li, MIT; Jiayuan Mao, MIT; Yinpeng Dong, Tsinghua University

Distributed systems


Yunhao Zhang
Cornell University

Finalists: Vikram Narayanan, University of California, Irvine; Ahmed Alquraan, University of Waterloo

Economics and computation


Andrés Ignacio Cristi Espinosa
Universidad de Chile

Finalist: Hanrui Zhang, Duke University

Networking


Jiaxin Lin
University of Wisconsin–Madison

Finalists: Siva Kesava Reddy Kakarla, University of California, Los Angeles; Junzhi Gong, Harvard University; Fabian Ruffy Varga, New York University (NYU)

Programming languages


Yuanbo Li
Georgia Institute of Technology

Finalists: Jenna Wise, Carnegie Mellon University; Victor A. Ying, MIT

Security and privacy


Jiaheng Zhang
University of California, Berkeley


Marina Minkin
University of Michigan

Finalists: Lillian Yow Tsai, MIT; Praneeth Vepakomma, MIT; Alexander Bienstock, NYU; Amrita Roy Chowdhury, University of Wisconsin–Madison; Trishita Tiwari, Cornell University; Harjasleen Malvai, Cornell University; Jiameng Pu, Virginia Tech

Database systems


Leonhard Spiegelberg
Brown University


Jialin Ding
MIT

Finalists: Tobias Ziegler, Technical University of Darmstadt; Ian Neal, University of Michigan; Pedro Thiago Timbó Holanda, Centrum Wiskunde & Informatica; Avinash Kumar, University of California, Irvine

Systems for machine learning


Weizhe Hua
Cornell University

Finalist: Qinyi Luo, University of Southern California

Instagram/Facebook app well-being and safety


Yasaman Sadat Sefidgar
University of Washington

Finalists: Nicholas Santer, University of California, Santa Cruz; Brian Ward Bauer, University of Southern Mississippi; Morgan Klaus Scheuerman, University of Colorado Boulder

Privacy and data use


Reza Ghaiumy Anaraky
Clemson University

Finalist: Yixin Zou, University of Michigan

Machine learning


Mikhail Khodak
Carnegie Mellon University


Yuval Dagan
MIT

Natural language processing


Tiago Pimentel Martins da Silva
University of Cambridge


Kawin Ethayarajh
Stanford University

Finalists: Haoyue “Freda” Shi, Toyota Technological Institute at Chicago; Tom McCoy, Johns Hopkins University

Spoken language processing and audio classification


Paul Pu Liang
Carnegie Mellon University

Finalists: Jonah Casebeer, University of Illinois at Urbana-Champaign; Efthymios Tzinis, University of Illinois at Urbana-Champaign; Karan Ahuja, Carnegie Mellon University

To learn more about application requirements and program details, visit the Facebook Fellowship Program page.

The post Announcing the recipients of the 2021 Facebook Fellowship awards appeared first on Facebook Research.

Database researchers discuss research award opportunity in next-generation data infrastructure

On April 19, 2021, Facebook launched a request for proposals (RFP) on next-generation data infrastructure. With this RFP, which closes on June 2, 2021, the Facebook Core Data and Data Infra teams hope to deepen their ties to the academic research community by seeking out innovative solutions to the challenges that still remain in the data management community. To provide an inside look from the teams behind the RFP, we reached out to Stavros Harizopoulos and Shrikanth Shankar, who are leading the effort within their respective teams.

Shankar is a Director of Engineering on the Core Data team, which builds and supports the online data serving stack for Facebook, providing the databases, caches, and worldwide distribution that power Facebook, Instagram, WhatsApp, and more. Harizopoulos is a Software Engineer within Data Infrastructure, which delivers efficient platforms and end-user tools for the collection, management, and analysis of Facebook data. In this Q&A, Shankar and Harizopoulos contextualize the RFP by providing more background to database research at Facebook. They also discuss what inspired this RFP and where people can stay updated about what their teams are up to.

Q: What does database research look like at Facebook, and how has it evolved over the years?

A: Facebook has had a long history of making contributions to the database space — Hive, Presto, RocksDB, and MyRocks all being examples of innovative work that started within the company. The scale we run at and the unique constraints of our workloads make many existing solutions infeasible and provide a perspective that leads to new ideas. This has become increasingly true over the years as the company has grown and new challenges associated with this scale have shown up. We aspire to continue our tradition of building new, innovative database technologies.

Q: What’s the goal of this RFP?

A: As businesses and organizations become increasingly data driven and products and services are further built around intelligence derived from data, the need for highly reliable, flexible, and efficient data infrastructure becomes even more important. Modern data infrastructure architectures inherit from decades of database research, but recent trends and developments, such as the decoupling of compute and storage and the need to operate efficiently at global scale, as well as the emergence of new use cases such as data science and machine learning workloads, pose new challenges and opportunities.

With this RFP, we seek out innovative approaches to a number of problems that have the potential to set the defining characteristics of next-generation data infrastructure. Many of these problems are not unique to Facebook, and we are keen to learn about the great research done in this area as well as to strengthen our relations with academia.

Q: How does this RFP fit into the bigger picture for database research at Facebook?

A: Defining the underpinnings of data infrastructure that is reliable, resilient, flexible, efficient, and performant at global scale is at the core of database research at Facebook. Our research efforts, however, extend in several directions, including modeling, managing, and visualizing different types of data, ranging from structured data to machine-generated logs and time-series data. We innovate in areas such as data storage and indexing, query processing, data modeling, transaction processing, and distributed systems, as well as novel approaches to privacy and security in data management.

Q: What inspired this RFP?

A: While we share our experiences by writing papers and publishing them and, in turn, we benefit from all the innovation in the database space, we’ve seen a couple of ways we could be making this better. Concretely, we’ve seen that certain areas may not be perceived externally as being impactful or important even when they are critical for us. On our side, we recognize that the solutions we have in place or are considering may be limited by our specific systems and the history behind them. We began this RFP process as a way for us to collaborate with academia by highlighting specific problems and looking for innovative approaches that tackle these issues.

Q: Where can people stay updated and learn more?

A: We actively participate each year in major database conferences, such as ICDE, SIGMOD, VLDB, and CIDR. This is where the academic community can reach out to us with questions and ideas. We also contribute a lot of our work through open source. Here are some examples:

Applications for the Next-Generation Data Infrastructure RFP close on June 2, 2021, and winners will be announced the following month. To receive updates about new research award opportunities and deadline notifications, subscribe to our RFP email list.

The post Database researchers discuss research award opportunity in next-generation data infrastructure appeared first on Facebook Research.

Facebook invites telecommunications experts to register for the 2021 Connectivity Research Workshop

Facebook Connectivity invites telecommunications experts from around the world to attend the 2021 Connectivity Research Workshop on May 18. Those interested can register at the link below by 5:00 pm PDT on Monday, May 17.

Register

Internet access has become more important than ever, and the COVID-19 pandemic has underscored the need for internet inclusion. It gives us an important channel to connect with our friends and family, prepares us for the technology-intensive jobs of tomorrow, and enables us to participate in the broader economy. Further, according to the recent World Bank report on 4G/5G universal broadband, internet access is increasingly seen as a cornerstone for sustainable development.

However, many people around the world still do not have access to mobile connectivity, and technology innovations are needed to help close this gap. To provide connectivity in rural regions, for example, we have to overcome many challenges, including lack of infrastructure, high development costs, and lack of tailored technical solutions. In many urban areas, affordability is a persistent challenge. In addition, the rapid increase in data traffic means that telecommunication networks are stressed and future internet performance is at risk.

These complex challenges require innovative business and tech solutions. At Facebook Connectivity, our mission is to bring more people online to a faster internet. To ensure relevance and impact, we need input from thought leaders in industry and academia to identify the most relevant challenges and the most promising solution elements. We need to work together to ensure that our research and development efforts contribute to impactful solution sets.

Over the past several years, we’ve used input from experts to guide our research activities, including the V-band mmWave Channel Sounder program and the diffractive non-line-of-sight wireless backhaul project.

To continue our progress in rural connectivity, Facebook Connectivity is inviting some of our key partners to share their thoughts on our collaborative research activities. The workshop will feature presentations from the following attendees:

  • Alex Aimé, Director of Network Investments, Facebook
  • Renan Ruiz, CTO, Internet para Todos de Peru
  • Omar Tupayachi Calderón, CEO, Mayu Telecomunicaciones
  • Roberto Nogueira, CEO, Brisanet
  • Grace Chen, Product Manager, Facebook

To register for the workshop, click the link below. Registration closes at 5:00 pm PDT on Monday, May 17.

Register

The post Facebook invites telecommunications experts to register for the 2021 Connectivity Research Workshop appeared first on Facebook Research.


Cost-sensitive exploration in multi-armed bandits: Application to SMS routing

What the research is:

Many businesses, including Facebook, send SMS messages to their users for phone number verification, two-factor authentication, and notifications. To deliver these SMS messages, companies generally rely on aggregators (e.g., Twilio) that have deals with operators throughout the world. These aggregators are responsible for delivering the SMS message to users and offer different quality and cost attributes. Quality, in this context, is the likelihood that a message will be successfully delivered to a user. A key decision faced by the business is identifying the best aggregator through which to route these messages. However, a significant challenge here is the nonstationarity in aggregators’ quality, which varies substantially over time. This necessitates a balanced exploration-exploitation approach, where we need to learn the best aggregator at any given time and maximize the number of messages we route through it. Multi-armed bandits (MAB) are a natural framework to formulate this problem. However, existing multi-armed bandit literature largely focuses on optimizing a single objective function and cannot be easily generalized to the setting where we have multiple metrics, like quality and cost.

Motivated by the above problem, in this paper, we propose a novel variant of the MAB problem, which factors in the costs associated with playing an arm and introduces new metrics that capture the features of multiple real-world applications. We establish the hardness of this problem by proving fundamental limits on the performance of any online algorithm. We further show that naive generalizations of existing algorithms perform poorly from both a theoretical and an empirical standpoint. Lastly, we propose a simple algorithm that balances two asymmetric objectives and achieves near-optimal performance.

How it works:

In the traditional (stochastic) multi-armed bandit problem, the learning agent has access to a set of K actions (arms) with unknown but fixed reward distributions and has to repeatedly select an arm to maximize the cumulative reward. Here, the challenge is developing a policy that balances the tension between acquiring information about actions with few historical observations and exploiting the most rewarding arm based on existing information. The regret metric is typically used to measure the effectiveness of any such adaptive policy. Briefly, regret measures the cumulative difference between the expected reward from the best action, had we known the true reward distributions, and the expected reward from the action preferred by the policy. Existing literature has extensively studied this setting, leading to simple but extremely effective algorithms like Upper Confidence Bound (UCB) and Thompson Sampling (TS), which have been further generalized and applied to a wide range of application domains. The central focus of these algorithms is incentivizing sufficient exploration of actions that appear promising. In particular, these approaches ensure that the best action always has a chance to recover from situations where its expected reward is underestimated due to unfavorable randomness. However, there is also a fundamental limitation on how well an online learning algorithm can perform in general settings. In situations where the number of decision epochs is small and/or there are many actions with reward distributions similar to the optimal action, it becomes hard to effectively learn the optimal action. Mathematically speaking, in the traditional problem, it has been established that any online learning algorithm must incur a regret of Ω(√(KT)), where K is the number of arms and T is the number of decision epochs.
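To make the classical setting concrete, here is a minimal sketch of UCB1 on Bernoulli arms, tracking the regret described above. The arm means, horizon, seed, and the function name ucb1 are illustrative assumptions, not values or code from the paper.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return (pull counts, expected regret)."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of times each arm was pulled
    sums = [0.0] * k      # total observed reward per arm
    best = max(means)     # known only to the simulator, not to the policy
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull every arm once to initialize the estimates
        else:
            # empirical mean plus an exploration bonus that shrinks with pulls
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]   # expected regret of this pull
    return counts, regret

# Hypothetical example: three arms with success probabilities 0.2, 0.5, 0.8.
counts, regret = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts, round(regret, 1))
```

The exploration bonus is what lets an arm with an unlucky early estimate be tried again, which is the behavior the paragraph above attributes to UCB-style algorithms.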

In this work, we generalize the MAB framework to the multiobjective setting. In particular, we consider the problem setting where the learning agent has to balance both the traditional exploration-exploitation trade-off and the trade-offs associated with multiple objectives, for example, reward and cost. Keeping the SMS application in mind, to manage costs, we allow the agent to be indifferent between actions whose expected reward (quality) is at least a (1−α) fraction of the highest expected reward (quality). We call these actions feasible. We refer to α as the subsidy factor and assume it is a known parameter specified by the problem domain. The agent’s objective is to explore various actions and play the cheapest arm among these high-quality arms as frequently as possible. To measure the performance of any policy, we define two notions of regret: quality regret and cost regret. Quality regret is defined as the cumulative difference between the (1−α)-adjusted expected reward from the highest-quality action and the expected reward from the action preferred by the policy. Similarly, cost regret is defined as the cumulative difference between the cost of the action preferred by our policy and the cost of the cheapest feasible arm, had the qualities and costs been known beforehand.
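The two regret notions can be sketched in a few lines of code. This is an illustration under assumptions: the function names, the example qualities and costs, and the choice to clip each per-round term at zero (so that playing a better-than-required arm earns no credit) are not taken from the text above.

```python
def feasible_arms(qualities, alpha):
    """Arms whose expected quality is at least (1 - alpha) times the best."""
    threshold = (1 - alpha) * max(qualities)
    return [i for i, q in enumerate(qualities) if q >= threshold]

def per_round_regrets(qualities, costs, alpha, arm):
    """Quality regret and cost regret incurred by playing `arm` once."""
    feasible = feasible_arms(qualities, alpha)
    cheapest_cost = min(costs[i] for i in feasible)
    # Assumption: both terms are clipped at zero.
    quality_regret = max(0.0, (1 - alpha) * max(qualities) - qualities[arm])
    cost_regret = max(0.0, costs[arm] - cheapest_cost)
    return quality_regret, cost_regret

# Hypothetical aggregators: arm 0 is the best but expensive, arm 1 is nearly
# as good and cheaper, arm 2 is the cheapest but unreliable.
qualities = [0.90, 0.85, 0.40]
costs = [1.0, 0.4, 0.1]
alpha = 0.2

feasible = feasible_arms(qualities, alpha)              # arms 0 and 1
r_best = per_round_regrets(qualities, costs, alpha, 0)  # pays extra cost only
r_cheap = per_round_regrets(qualities, costs, alpha, 2) # loses quality only
```

With these numbers the quality threshold is 0.72, so arm 1 is the target: it is the cheapest arm that still clears the threshold. Playing arm 0 incurs only cost regret, and playing arm 2 incurs only quality regret.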

For this problem, we show that naively extending existing algorithms, like TS, will perform poorly on the cost regret. In particular, we consider the variant of TS where we form the feasible set based on the corresponding quality estimates and pick the cheapest feasible action. For this variant, we can show that TS can perform arbitrarily badly (i.e., it incurs linear regret). This is primarily because existing algorithms are designed to incentivize exploration of promising actions and can end up incurring large cost regret in settings where there are two actions with similar rewards but vastly different costs. We then establish a fundamental lower bound of Ω(K^(1/3) T^(2/3)) on the performance of any online learning algorithm for this problem, highlighting the hardness of our problem in comparison to the classical MAB problem. Building on these insights, we develop a simple algorithm based on the explore-then-commit idea, which balances the tension between two asymmetric objectives and achieves near-optimal performance up to logarithmic factors. We also demonstrate the superior performance of our algorithm through numerical simulations.
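The explore-then-commit idea can be sketched as follows. This is a simplified illustration, not the algorithm from the paper: it commits based on plain empirical means, and the function name, the example arms, and the exploration length of roughly (T/K)^(2/3) pulls per arm are assumptions chosen to match the rate quoted above.

```python
import random

def explore_then_commit(qualities, costs, alpha, horizon, pulls_per_arm, seed=0):
    """Explore every arm equally, then commit to the cheapest arm that looks feasible."""
    k = len(qualities)
    if k * pulls_per_arm > horizon:
        raise ValueError("exploration phase does not fit in the horizon")
    rng = random.Random(seed)
    successes = [0] * k
    history = []
    # Exploration phase: round-robin over all arms, ignoring cost.
    for _ in range(pulls_per_arm):
        for arm in range(k):
            if rng.random() < qualities[arm]:   # simulated delivery success
                successes[arm] += 1
            history.append(arm)
    # Commit phase: estimate the feasible set, then pick its cheapest arm.
    estimates = [s / pulls_per_arm for s in successes]
    threshold = (1 - alpha) * max(estimates)
    feasible = [i for i in range(k) if estimates[i] >= threshold]
    committed = min(feasible, key=lambda i: costs[i])
    history.extend([committed] * (horizon - len(history)))
    return committed, history

# Hypothetical setup: arm 1 is the cheapest arm within 20% of the best quality.
qualities = [0.90, 0.85, 0.40]
costs = [1.0, 0.4, 0.1]
horizon = 10_000
pulls = max(1, round((horizon / len(qualities)) ** (2 / 3)))
committed, history = explore_then_commit(qualities, costs, 0.2, horizon, pulls)
```

Exploring each arm about (T/K)^(2/3) times costs on the order of K^(1/3) T^(2/3) rounds in total, which is why this exploration length is a natural fit for the lower bound above.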

Why it matters:

The multi-armed bandit (MAB) framework is the most popular paradigm for balancing the exploration-exploitation trade-off that is intrinsic to online decision making under uncertainty. It is applied to a wide range of domains, including drug trials and online experiments, to ensure that the maximal number of samples is offered to the most promising candidate. Similar trade-offs are typical in recommendation systems, where the possible options to recommend and user preferences are constantly evolving. Facebook has also leveraged the MAB framework to improve various products, including identifying the ideal video bandwidth allocation for users and the best aggregator for sending authentication messages. Though one can model cost in the traditional MAB framework by using the reward minus the cost as the modified objective, such a modification is not always meaningful, particularly in settings where the reward and cost associated with an action represent different quantities (for example, quality and cost of an aggregator). In such problems, it is natural for the learning agent to optimize for both metrics, typically avoiding exorbitant costs for a marginal increase in cumulative reward. To the best of our knowledge, this paper takes a first step in generalizing the multi-armed bandit framework to problems with two metrics and presents both fundamental theoretical performance limits and easy-to-implement algorithms to balance the multiobjective trade-offs.

Lastly, we perform extensive simulations to understand various regimes of the problem parameters and compare different algorithms. More specifically, we consider both scenarios where naive generalizations of UCB and TS, which have been adopted in real-life implementations, perform well and scenarios where they perform poorly, a comparison that is of interest to practitioners.

Read the full paper:

Multi-armed bandits with cost subsidy

The post Cost-sensitive exploration in multi-armed bandits: Application to SMS routing appeared first on Facebook Research.
