How Facebook Core Systems’ Shruti Padmanabha transitioned from PhD candidate to Research Scientist

Our expert teams of Facebook scientists and engineers work quickly and collaboratively to solve some of the world’s most complex technology challenges. Many researchers join Facebook from top research institutions around the world, often maintaining their academic connections through industry collaborations, partnerships, workshops, and other programs.

While some researchers come to work at Facebook after extensive careers in academia, others make the transition toward the beginning of their career. Shruti Padmanabha, who joined Facebook after receiving her PhD in computer science and engineering, is one such example.

Padmanabha is a Research Scientist on the Disaster Recovery team within Facebook Core Systems. Her focus is on building distributed systems that are reliable and tolerant of failures in their hardware and service dependency stack. Padmanabha’s journey with Facebook Core Systems began with a PhD internship, an experience that encouraged her to switch fields from computer architecture to distributed systems.

We sat down with Padmanabha to learn more about her experience joining Facebook full-time directly after earning her PhD, as well as her current research projects, the differences between industry and PhD research, and her academic community engagement efforts. Padmanabha also offers advice for PhDs looking to transition to industry after graduation.

Tell us about your experience in academia before joining Facebook.

After earning an undergraduate degree in electrical engineering in India, I moved to Michigan to pursue my master’s and PhD in computer science and engineering at the University of Michigan, in Ann Arbor. I was interested in how low-level circuits came together to form computers, and started working on research problems in computer architecture. Specifically, I focused on energy-efficient general-purpose architectures (like those that drive desktops).

What excited me about grad school was the opportunity to continuously extend state-of-the-art technology. There, I learned the importance of communication, both written and in person, and learned to navigate a male-dominated environment (I was the first and only woman in my adviser’s lab!) by leaning on and volunteering in groups for underrepresented grad students. I also found a passion for teaching and mentorship activities.

But foremost, I valued the network and the company of bright peers that being in grad school brought me. It was a proud feeling to be treated as a peer by professors that I greatly respected!

What has your journey with Facebook been like so far?

My first experience at Facebook was a 2015 summer internship in Core Systems. This internship was a leap for me since my expertise was in computer architecture, not in distributed systems. I worked on a proposal to improve energy efficiency of Facebook data centers by dynamically scaling machine tiers based on traffic patterns. The insights we gained from this early experimental work influenced the design of some of our production capacity management systems today.

During my internship, I had the opportunity to work on challenging problems for state-of-the-art technologies and to work with smart, driven peers. I liked that I could innovate on problems with more immediate, real-world applications than in academia. This is why I decided to pursue a full-time position at Facebook after 27 years in school.

When I joined, I was given the choice of being a computer architect on another Facebook team or switching fields and rejoining Core Systems. I had a very positive internship experience on the Core Systems team, and I liked that they maintained strong academic collaborations, so I decided to permanently switch fields and move to distributed systems.

My first position was on an exploratory team that was trying to think of novel ways to distribute Facebook systems in data centers of different sizes and in different geographical locations, which gave me a crash course in real-world distributed systems of all kinds. It also provided me with the unique opportunity to learn about the physical side of designing data centers at Facebook’s scale, from power transformers to submarine network backbones to CDNs.

I’ve learned that there’s always room to learn new things and grow within Core Systems. To keep fresh information flowing, I lead biweekly brown bag sessions where teams present technical achievements and design challenges, and I help present introductory infrastructure classes to Facebook n00bs.

What are you currently working on?

My current focus is on improving fault tolerance and reliability at Facebook. Facebook’s service infrastructure consists of a complex web of hundreds of interconnected microservices that change dynamically throughout the day. These services run on 11 data centers, which are designed in house and located across the globe.

At this scale, failures are bound to happen — like a single host losing power, a power transformer getting hit by lightning, a hurricane posing a risk to an entire data center, a single code change leading to a cascading failure, and so on. Our team’s focus is to help design Facebook’s services in a way that such failures are tolerated gracefully and transparently to the user.

Our OSDI paper from 2018 talks about one mitigation approach: draining traffic away from failing data centers. Justin Meza and I also gave a talk at Systems@Scale 2019 where we described the problem of handling power outages at a sub–data center scale. The solution spans the stack, from respreading hardware resources across the data center floor to balancing services across them.

What are some of the ways you’ve shown up in the academic community?

I’ve stayed in touch with the academic community by attending conferences, participating in program committees, and mentoring interns. I served on the program committees for a few academic conferences (ISCA, HPCA), as well as for Grace Hopper. I enjoy participating in events where I can reach out to PhD candidates directly, especially those from underrepresented groups. For instance, I moderated a panel at MICRO 2020 on tips and strategies for PhD candidates looking for jobs in industry, and participated in the Research@ panel at GHC 2017.

At Facebook Core Systems, we maintain and encourage a strong tie to academia, in terms of both publishing at top systems conferences and awarding research grants to academic researchers, for which I also had the opportunity to help out.

What are some main differences between doing PhD research and doing industry research?

In a PhD candidacy, one’s work tends to be fairly independent and self-driven. In industry, on the other hand, projects are distributed across team members who are equally invested in their success. I had to deliberately change my academic mindset of being solely responsible for the delivery of a project to working in a collaborative environment as an individual contributor.

Industry research also might have the advantage of mature infrastructure and access to real-world data. To me, not having to build/hack simulators and micro-benchmarks was a breath of fresh air. At Core Systems, development progresses from a proof of concept to being tested and rolled out in production relatively quickly.

I’ve also observed a difference in how projects are prioritized in industry and how success is measured. Within Core Systems, we have the freedom to choose the most important problems that need to be worked on within the scope of company-wide goals and build long-term visions for their solutions. Half-yearly reviews are the checkpoints for measuring success toward this vision.

For computer science PhDs curious about transitioning to industry after completing their dissertation, where would you recommend they start?

Explore different sides of the industry through internships, and leverage your professional networks. We’re fortunate that, in our field, internships are usually plentiful. Experiment with different kinds of internships at research labs and product groups if you can. These opportunities will not only give you a window into the kind of career you could build in the industry but also help you build connections with industry researchers. Many industry jobs and internships have rather generic-sounding descriptions that make it hard to envision the work involved, so it’s best to get hands-on experience.

Also, leverage connections through your adviser and alumni network, and engage with industry researchers at conferences. Be sure to discuss the breadth of research at their company and their job interview processes. Lastly, don’t be afraid to step outside of your research comfort zone to try something new!

The post How Facebook Core Systems’ Shruti Padmanabha transitioned from PhD candidate to Research Scientist appeared first on Facebook Research.

Registration now open for the 2020 Testing and Verification Symposium

The fourth annual Facebook Testing and Verification (TAV) Symposium brings together academia and industry in an open environment to exchange ideas and showcase the top experts from testing and verification scientific research and practice. Taking place virtually this year, the symposium is open to all testing and verification practitioners and researchers and is free to attend. Those interested in attending may submit their registration request below.

Attendees are invited to join the event over the course of three days, from December 1 to 3. The symposium’s agenda will include several talks that will offer opportunities for Q&A via the event platform.

“The TAV Symposium is all about bringing communities together: testing and verification, and academia and industry,” says Peter O’Hearn, Facebook researcher and University College London professor. “Speakers include leading academic researchers in TAV as well as folks from industry who deploy TAV techniques to practicing engineers. Both of these groups of people are pushing boundaries on what is known and on how TAV techniques can be used to help people. We believe that cross-fertilization of ideas is incredibly valuable, helping us all go further together.”

At the 2019 TAV Symposium, Facebook Software Engineer Nadia Alshahwan gave a talk on Sapienz testing and shared some of her team’s challenges and solutions with the community. For the 2020 symposium, she hopes to continue to have valuable discussions with attendees. “I can’t wait to see what this year will bring,” says Alshahwan. “The TAV Symposium has allowed my team to discover new collaborations and gain valuable feedback from this diverse audience.”

Below is the list of confirmed speakers, which can also be found on the registration page. As speakers are confirmed, they will be added on the registration site leading up to the event.

Confirmed speakers

Nadia Alshahwan (Facebook)

David Clark (University College London)

David Dill (Novi at Facebook)

Alistair Donaldson (Google; Imperial College London)

Philippa Gardner (Imperial College London)

Mark Harman (Facebook; University College London)

John Hughes (Quviq AB; Chalmers University, Gothenburg)

Bryan O’Sullivan (Facebook)

Sukyoung Ryu (KAIST)

For more information about speakers, including full bios and topics, visit the registration page.

To learn more about TAV research at Facebook, watch the 2018 TAV RFP video and read the summaries from the 2018 TAV Symposium and the 2019 TAV Symposium.

The post Registration now open for the 2020 Testing and Verification Symposium appeared first on Facebook Research.

Facebook announces winners of the People’s Expectations and Experiences with Digital Privacy research awards

Facebook is committed to honoring people’s privacy in our products, policies, and services. We serve a diverse global community of users, and we need to understand people’s unique privacy needs around the world. This is why we launched a request for research proposals in August to help broaden and deepen our collective knowledge of global privacy expectations and experiences. After a thorough review process, we’ve selected the following as recipients of these awards.

“I am thrilled by the quality of the proposed work we received in our latest funding opportunity about users’ privacy expectations and experiences,” says Head of Facebook Privacy Research Liz Keneski. “I look forward to seeing the results of these carefully selected funded studies. I expect these projects to have an impact on inclusive privacy at Facebook and in the tech industry at large, including for vulnerable populations in the diverse communities that we serve, and on the ways we measure and understand people’s privacy expectations reliably.”

We sought applications from across the social sciences and technical disciplines, with particular interest in (1) improving understanding of users’ privacy attitudes, concerns, preferences, needs, behaviors, and outcomes, and (2) novel interventions for digital transparency and control that are meaningful for diverse populations, context, and data types.

“We received an impressive 147 proposals from 34 countries and 120 universities,” says Sharon Ayalde, Facebook Research Program Manager. “This was our first year offering funding opportunities in the privacy research space. We look forward to continuing these efforts into 2021 and to strengthening our collaborations with key experts globally.”

Thank you to everyone who took the time to submit a proposal, and congratulations to the winners. Recipients of our two earlier privacy research award opportunities were announced in May.

Research award recipients

Deploying visual interventions for users of varying digital literacy levels
Xinru Page, Brian Smith, Mainack Mondal, Norman Makoto Su (Brigham Young University)

Empowering under-resourced parents in teens’ social media privacy education
Natalie Bazarova, Dan Cosley, Ellen Wenting Zou, Leslie Park, Martha Brandt (Cornell University)

Helping people manage privacy settings using social influences
Jason Hong, Laura Dabbish (Carnegie Mellon University)

Measuring privacy across cultural, political, and expressive contexts
Dmitry Epstein, Aysenur Dal, Elizabeth Stoycheff, Erik Nisbet, Kelly Quinn, Olga Kamenchuk, Thorsten Faas (The Hebrew University of Jerusalem)

Privacy personas in the MENA region: A large-scale analysis of 21 countries
Bernard J. Jansen, Joni Salminen, Justin D. Martin, Kareem Darwish, Lene Nielsen, Soon-gyo Jung, Yin (David) Yang (Hamad Bin Khalifa University)

Finalists

Bridging the urban-rural gap in digital privacy autonomy among older adults
Kaileigh Byrne, Bart Knijnenburg (Clemson University)

Digital afterlife and the displaced: Challenges and practices among Rohingyas
Faheem Hussain, Md. Arifur Rahman (Arizona State University)

Digital privacy attitudes and misperceptions among key stakeholders
Joseph Turow, Yphtach Lelkes (University of Pennsylvania)

Digital privacy in a connected world — awareness and expectations of children
Serdar Abaci (University of Edinburgh)

Digital privacy rights and government data collection during COVID-19
Ariadne Vromen, Francesco Bailo, Kimberlee Weatherall (Australian National University)

Informed consent: Conditions of effective consent to privacy messages
Glenna Read, Jonathan Peters (University of Georgia)

Managing your personal “data bank”
Aija E. Leiponen, Joy Z. Wu (Cornell University)

Privacy design for vulnerable and digitally low-literate populations
Maryam Mustafa, Mobin Javed (Lahore University of Management Sciences)

Understanding folk theories of privacy
Jeff Hancock, Xun “Sunny” Liu (Stanford University)

For more information about topics of interest, eligibility, and requirements, visit the award page. To receive updates on new research award opportunities, subscribe to our email newsletter.

The post Facebook announces winners of the People’s Expectations and Experiences with Digital Privacy research awards appeared first on Facebook Research.

What video infrastructure research looks like at Facebook

Research informs everything we do at Facebook, from improving real-time augmented reality experiences to keeping people safe and secure on our platforms. In the realm of video infrastructure, the Facebook Video Quality and Research team works on technology challenges that come from implementing videos at a large scale.

Researchers in video infrastructure work on two fundamental and related problems: (1) how to improve video coding efficiency, or how to spend fewer bits to compress a given video at a certain quality, and (2) how to accurately measure video quality, or how to predict a viewer’s perception of video quality through automated algorithms. An important constraint in their work is that both of these tasks need to be performed with the highest compute efficiency possible, given the billion-scale of Facebook and Instagram videos.

We have a top team of engineers and researchers in video infrastructure, some coming from industry and some straight from PhD programs in the field of video processing at top research universities. To learn more about the Video Quality and Research team’s contributions to the field so far and where to find their research, we talked with Ioannis Katsavounidis, Research Scientist in video infrastructure at Facebook.

Facebook Video@Scale 2019 and SPIE 2020

In Katsavounidis’s quality keynote at Facebook’s 2019 Video@Scale event, the team introduced the concept of a compute-efficiency/compression-efficiency convex hull. This allows different encoders, even those using different coding standards such as AVC, VP9, and AV1, to be compared and prioritized for videos of varying popularity.
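One plausible way to picture such a hull is as the lower-left boundary of encoder operating points plotted as compute cost against the bitrate needed to reach a fixed quality. The sketch below is only an illustration under that assumption: the `OperatingPoint` type, the preset labels, and every number are made up, and it is not the team's actual method.

```python
from typing import List, NamedTuple


class OperatingPoint(NamedTuple):
    label: str      # hypothetical encoder/preset name
    compute: float  # relative compute cost of the encode (lower is better)
    bitrate: float  # bitrate needed to reach a fixed target quality (lower is better)


def _cross(o: OperatingPoint, a: OperatingPoint, b: OperatingPoint) -> float:
    """Cross product of vectors o->a and o->b in the (compute, bitrate) plane."""
    return (a.compute - o.compute) * (b.bitrate - o.bitrate) - (
        a.bitrate - o.bitrate
    ) * (b.compute - o.compute)


def efficiency_hull(points: List[OperatingPoint]) -> List[OperatingPoint]:
    """Return the lower-left convex hull, ordered by increasing compute."""
    hull: List[OperatingPoint] = []
    for p in sorted(points, key=lambda q: (q.compute, q.bitrate)):
        # Spending more compute is only worthwhile if it saves bits.
        if hull and p.bitrate >= hull[-1].bitrate:
            continue
        # Drop earlier points that lie on or above the segment ending at p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


# Made-up numbers, for illustration only.
candidates = [
    OperatingPoint("avc-fast", 1.0, 100.0),
    OperatingPoint("vp9-fast", 3.0, 95.0),
    OperatingPoint("avc-slow", 4.0, 80.0),
    OperatingPoint("vp9-slow", 10.0, 65.0),
    OperatingPoint("preset-x", 12.0, 90.0),
    OperatingPoint("av1-fast", 20.0, 60.0),
    OperatingPoint("av1-slow", 100.0, 50.0),
]
frontier = [p.label for p in efficiency_hull(candidates)]
```

With these invented inputs, `frontier` keeps only the presets that offer the best bit savings per unit of extra compute, regardless of which coding standard they belong to. A scheduler could then walk further along the hull for more popular videos, where extra compute is amortized over more views.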

At the most recent SPIE Applications of Digital Image Processing conference, which took place online in August 2020, Katsavounidis invited researchers from industry and academia to contribute to a special session on energy-efficient video processing. There were a total of 18 papers, covering all aspects of video processing, including video quality metrics, video encoders, and software and hardware architectures. Among these special session papers were four contributions from two different teams at Facebook.

“All these papers, but also others presented at SPIE, represent major steps toward our goal to secure the highest possible video quality for Facebook videos within the constraints of our data center power and physical capacity,” says Katsavounidis. “Attendees reacted positively to the prerecorded video presentations of each paper, and we received a lot of engagement that still continues today.”

ICIP, Video@Scale, and more

The team is continuing to conduct research in this space and share results with the academic community. A recent example is the IEEE International Conference on Image Processing (ICIP) 2020. “We are proud to be Platinum sponsors of this flagship conference,” says Katsavounidis. “At the conference, we held an industry workshop on efficient video compression and quality measurement at Facebook, where we talked about the types of problems we are facing and the solutions we are seeking.”

Katsavounidis and others also announced short updates and research highlights at the recent Video@Scale event on October 22 and will be following up on their energy-efficiency research during the 2021 Picture Coding Symposium. “I am a general co-chair, and the team is working to have a number of papers submitted to that conference, which we are excited about,” says Katsavounidis.

Staying updated

The Facebook Video Quality and Research team plans to continue engagement efforts at the 2021 Picture Coding Symposium, as well as at SPIE and ICIP 2021. All updates will be posted to our blog. For all the conferences that Facebook sponsors, be sure to check our Events page.

The post What video infrastructure research looks like at Facebook appeared first on Facebook Research.

Inside blockchain and cryptoeconomics research at Facebook

Blockchains provide an excellent example of how foundational computer science and economic research can change the trajectory of entire industries. As a result of the work of many researchers over decades, we are starting to see a transformation in the technology underlying payment systems.

At Facebook, we’re contributing to this effort with a world-class research team that sits within Novi, a regulated financial company building for the Libra payment system. We’re a group of research scientists, economists, and software engineers working across interdisciplinary fields, including cryptography, programming languages, distributed systems, program verification, game theory, security, privacy, financial inclusion, economic development, macroeconomics, and market design. Check out the Blockchain and Cryptoeconomics page to learn more about us.

As an impact-driven research team, we’re focused on supporting the development of the Novi wallet and contributing to a variety of blockchain and research communities, including the Libra project, which is built on an open source blockchain. We are addressing the limitations of blockchain systems by incubating open source tech and pursuing research that will make these systems more scalable, safer, and more accessible.

We’re excited to share more about the technical advancements we’ve made and the economic research we’re undertaking to advance Novi’s goal of making money move more freely for more people.

Novi technical publications

We share many of the same ambitions outlined in the Initiative for CryptoCurrencies and Contracts’ Seven Grand Challenges, which identify the major blockers to blockchain adoption. Based on the work we’ve published to date, we’ve made some early progress toward addressing issues of scaling and performance, correctness, and safety — but we are just getting started. Here’s a shortlist of recent publications that highlight our work across several key dimensions:

Scaling and performance

  • State machine replication in the Libra blockchain: A state-of-the-art Byzantine fault tolerance algorithm for Libra (LibraBFT) for forming agreement on ordering and finalizing transactions among a configurable set of validators.
  • Cogsworth: A new Byzantine view synchronization algorithm that has optimistically linear communication complexity and constant latency. Faced with benign failures, Cogsworth has expected linear communication and constant latency.
  • FastPay: A set of distributed authorities that allows organizations to leverage prefunded quorums to settle transactions at ~80,000 tps.

Correctness by design and construction

  • Move: A language with programmable resources: An executable bytecode language used to implement custom transactions and smart contracts. With Move, a resource can never be copied or implicitly discarded, only moved between program storage locations.
  • The Move Prover: A formal verification system that enables automatic verification of functional correctness for Move modules.
  • Twins: White-glove approach for BFT testing: A novel approach to emulate common Byzantine behaviors. The main idea of Twins is that we can emulate Byzantine behavior by running two instances of a node with the same identity. Each of the two instances (or Twins) runs unmodified, correct code.
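The Twins idea described in the list above can be illustrated with a toy example. In this minimal sketch, two instances of the same correct node share one identity, and a test harness shows them different proposals, so the identity appears to equivocate even though neither instance misbehaves. The `HonestNode` class, the vote format, and `find_equivocations` are hypothetical names chosen for illustration and are not taken from LibraBFT or the Twins paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vote = Tuple[str, int, str]  # (node identity, round number, block voted for)


@dataclass
class HonestNode:
    """A toy validator that follows the rules: at most one vote per round."""
    node_id: str
    voted: Dict[int, str] = field(default_factory=dict)

    def on_proposal(self, round_no: int, block: str) -> Optional[Vote]:
        if round_no in self.voted:
            return None  # already voted in this round, so stay silent
        self.voted[round_no] = block
        return (self.node_id, round_no, block)


def find_equivocations(votes: List[Vote]) -> List[Tuple[str, int]]:
    """Return (identity, round) pairs that voted for two different blocks."""
    first_vote: Dict[Tuple[str, int], str] = {}
    conflicts: List[Tuple[str, int]] = []
    for node_id, round_no, block in votes:
        key = (node_id, round_no)
        if key in first_vote and first_vote[key] != block:
            conflicts.append(key)
        first_vote.setdefault(key, block)
    return conflicts


# Two unmodified instances with the same identity: the "twins".
twin_a = HonestNode("validator-1")
twin_b = HonestNode("validator-1")

# The harness plays the network and shows each twin a different proposal.
votes = [twin_a.on_proposal(1, "block-X"), twin_b.on_proposal(1, "block-Y")]
conflicts = find_equivocations([v for v in votes if v is not None])
```

Here `conflicts` contains `("validator-1", 1)`: from the outside, the shared identity has voted for two blocks in one round, which is Byzantine-looking behavior produced without writing any faulty node code.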

Safety and compliance

  • Proof of liabilities: A novel algorithm for proving liabilities with privacy that allows entities to undergo a distributed audit of their liabilities. It is the first scheme to protect users against dishonest entities, without leaking individual identifiable data.
  • Taming the many EdDSAs: A novel technique to check compatibility of cryptographic libraries that implement the EdDSA signature scheme. It surfaces discrepancies between libraries and the standards, and it justifies the best way of implementing the scheme securely, focusing on practical aspects.

Novi Cryptoeconomics

In addition to our open source technical advancements, economists on the Novi cryptoeconomics team conduct research into financial inclusion, applied microeconomics, macroeconomics, and market design. We work closely with computer scientists, translating economic theory into real-world, practical solutions for blockchain-based problems. Here are more details on where the team is focused within each of our core research areas:

  • Financial inclusion: Designing, testing, and scaling low-cost financial tools that could improve the lives of the world’s unbanked and underbanked populations. Using the power of smart contracts and programmable payments to reduce costs and to increase efficiency and access to financial services.
  • Applied microeconomics: Using experiments to learn about the behavior of people and businesses in a global payment system.
  • Macroeconomics: Investigating stablecoin systems and reserves that are solvent, liquid, and support financial stability. Creating a competitive environment that allows for financial opportunity and innovation.
  • Market design: Designing objective and transparent mechanisms that help establish and maintain proper governance in stablecoin systems, to ensure that operations are aligned with the long-run needs of their users.

Getting involved

If you’re inspired by any of this work, we’d love to hear from you. As new opportunities arise for roles on the team, including for visiting researchers, we will post them to our careers page. Be sure to check back often for updates.

We also regularly offer PhD fellowships in the area of blockchain and cryptoeconomics. Learn more by visiting the Facebook Fellowship Program page. Applications open every fall, and winners are announced the following January.

We also actively participate in industry and academic conferences, whether they’re virtual or in-person. Feel free to say hello if you spot any members of our group.

The post Inside blockchain and cryptoeconomics research at Facebook appeared first on Facebook Research.

From academia to industry: How Facebook Engineer Jason Flinn started his journey in Core Systems

Partnering with university faculty helps us drive impactful, innovative solutions to real-world technology challenges. From collaborations to funding research through requests for proposals, working with academia is important to our mission of giving people the power to build community and bringing the world closer together.

Many members of our Facebook research community come from long and accomplished careers in academia. One example is Jason Flinn, a former professor at the University of Michigan. After an extensive academic career in software systems, which recently earned him the prestigious Test of Time award, Flinn became a Software Engineer on Facebook’s Core Systems, a team that performs forward-looking research in the area of distributed systems and applies key systems architecture techniques at Facebook’s scale.

Flinn’s first industry collaboration with Facebook was with one of his PhD students, Mike Chow, who was a PhD intern at the time. This experience gave Flinn a preview of what it would be like to work in industry as a researcher. “I do my best work when I build systems that have real-world use,” he explains. “In my early career in mobile computing, I was the person using the system, and I learned the right research questions to ask from examining my own experiences. Today, with Core Systems, I have thousands of engineers using the systems that I am building, and I am learning the right research questions to ask from deploying these systems at scale.”

We sat down with Flinn to learn more about how he came to work at Facebook after a career in academia, the differences between industry and academia for someone in Core Systems, his current research projects, advice for those looking to follow a similar path, and more.

Q: Tell us about your experience in academia before joining Facebook.

Jason Flinn: Prior to joining Facebook, I was a professor at the University of Michigan for over two decades. My research interests over the years have been really varied. I’ve always enjoyed the opportunity afforded by computer science research to explore new topics and branch out into related subfields. My PhD dissertation looked at software power management and developed ways to extend the battery lifetime of mobile computers. I’ve returned to mobile computing throughout my career, developing distributed storage systems and investigating ways to improve wireless networking through better power management and strategic use of multiple networks. I also was fortunate to get involved in some of the earliest work in edge computing and vehicle-to-infrastructure networking. For another large part of my career, I studied topics in storage systems and distributed computing, including distributed file systems and software applications of speculative execution and deterministic replay.

Since joining Facebook, I have run into so many former students who now work for the company and took classes with me. This has been a great reminder that one of our primary contributions in academia is our impact on the students we teach.

Q: What has your journey with Facebook been like so far?

JF: When I was at the University of Michigan, I participated in a couple of joint research projects with Facebook engineers. In both cases, the collaborations were kicked off by discussions with my former PhD students who had joined Facebook as full-time engineers. One of my then-PhD students, Mike Chow, joined Facebook for an extended nine-month internship, and we jointly developed a tool with Dan Peek (another former student), David Meisner, and Thomas Wenisch called the Mystery Machine. The key insight in this paper was that we could apply data at massive scale to learn the relationships and dependencies between execution points in software systems without needing to annotate or fully instrument such systems by hand.
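To make the idea of learning dependencies from data concrete, here is a minimal sketch of one way it could work: start by assuming every pair of events is ordered, then discard any ordering that some trace contradicts. The function name, the trace format, and the event names are assumptions for illustration, not the Mystery Machine's actual implementation.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple


def infer_happens_before(traces: List[Dict[str, float]]) -> Set[Tuple[str, str]]:
    """Keep each hypothesis 'a happens before b' until some trace refutes it.

    Each trace maps an event name to the timestamp at which it was logged.
    """
    events = set().union(*traces)
    hypotheses = set(permutations(sorted(events), 2))
    for trace in traces:
        for a, b in list(hypotheses):
            # A single counterexample is enough to reject the ordering.
            if a in trace and b in trace and trace[a] > trace[b]:
                hypotheses.discard((a, b))
    return hypotheses


# Two made-up request traces in which the db and cache lookups swap order.
example_traces = [
    {"recv": 0.0, "db": 1.0, "cache": 2.0, "send": 3.0},
    {"recv": 0.0, "cache": 1.0, "db": 2.0, "send": 3.0},
]
orderings = infer_happens_before(example_traces)
```

In this toy input, the orderings between `recv`, the two lookups, and `send` survive, while neither ordering between `db` and `cache` does, which suggests the two lookups do not depend on each other. The more traces are available, the more false orderings get eliminated, which is why scale helps.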

This paper received a lot of visibility when it was published at OSDI, and has proved to be quite influential in showing the community the potential of applying machine learning and data at scale to software tracing and debugging. This collaboration was so successful that Mike did a subsequent internship with another of my PhD students, Kaushik Veeraraghavan, resulting in the DQBarge paper at OSDI 2016.

In 2018, I was eligible for a sabbatical and looking for a change of pace. I wound up talking to Mahesh Balakrishnan about the Delos project he had recently started at Facebook around the idea of virtualizing consensus through the use of a reconfigurable, distributed shared log. Delos offered me the chance to dive right in and design new, cutting-edge protocols, so I quickly jumped into this project. We were originally only a small team of four people, but within my first few months on the project, we were deploying our code at the heart of the Facebook control plane. After about nine months, I decided to join Facebook as a full-time employee.

Q: What are you currently working on?

JF: I’ve worked on two major projects at Facebook. The first is the Delos project mentioned above. Our team built a strongly consistent metadata store for Facebook control plane services like the container scheduler and resource allocation systems. Such systems are notoriously complex and fraught with peril to develop and deploy, often because they are a low-level building block on which all higher levels of the software stack depend.

One of the most fun parts of this project for me was when we deployed this new protocol in production for the first time. We executed a single command and the Delos virtualized architecture swapped an entire data center to the new protocol with zero downtime and no fuss. I don’t think anything like this had ever been done before, so it felt like quite an achievement to see it happen. The team has leveraged virtualized consensus in lots of different ways since then: for example, in deploying a point-in-time database restore capability, swapping protocols for Delos’s own internal metadata storage, and swapping between disaggregated and aggregated logs to help mitigate production issues.
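As a rough illustration of why a virtualized log makes this kind of swap possible, the toy sketch below puts a thin indirection layer in front of interchangeable log implementations. Clients only see stable positions in the virtual log, so the backing protocol can change underneath them. All class and method names here are hypothetical and are not Delos's API; a real system would also have to handle replication, failures, and concurrent appends.

```python
from typing import List, Tuple


class InMemoryLog:
    """Stand-in for one consensus protocol's log implementation."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.entries: List[str] = []
        self.sealed = False

    def append(self, entry: str) -> int:
        if self.sealed:
            raise RuntimeError(f"log {self.name} is sealed")
        self.entries.append(entry)
        return len(self.entries) - 1

    def read(self, index: int) -> str:
        return self.entries[index]

    def seal(self) -> int:
        """Stop accepting appends and report the final length."""
        self.sealed = True
        return len(self.entries)


class VirtualLog:
    """Maps one contiguous address space onto a chain of underlying logs."""

    def __init__(self, backend: InMemoryLog) -> None:
        self._segments: List[Tuple[int, InMemoryLog]] = [(0, backend)]

    def append(self, entry: str) -> int:
        start, backend = self._segments[-1]
        return start + backend.append(entry)

    def read(self, position: int) -> str:
        for start, backend in reversed(self._segments):
            if position >= start:
                return backend.read(position - start)
        raise IndexError(position)

    def reconfigure(self, new_backend: InMemoryLog) -> None:
        """Seal the active log and route all later appends to the new one."""
        start, old = self._segments[-1]
        self._segments.append((start + old.seal(), new_backend))


vlog = VirtualLog(InMemoryLog("protocol-a"))
first = vlog.append("create-job")
second = vlog.append("assign-host")
vlog.reconfigure(InMemoryLog("protocol-b"))  # the "single command" swap
third = vlog.append("resize-job")
```

After the swap, positions keep increasing without a gap and older entries remain readable through the same interface, so a client of `vlog` never needs to know which implementation served a given entry.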

My second project is called Hedwig. This project is unifying the delivery of large, hot data content to very large numbers of consumer machines distributed around the world. In academic research and in industry, there has been a lot of prior work on highly decentralized systems for delivering such content (BitTorrent is one example of a system in this space). Yet, with the deployment of public and private clouds, there is an opportunity to reexamine this space and optimize such systems for a managed data center environment. In that environment, the system has greater visibility into network topology and resource availability, and we can also leverage highly reliable, well-maintained centralized services.

Hedwig aims to achieve the best of both worlds by providing a simple, decentralized peer-to-peer data plane in combination with a well-managed, centralized control plane. Hedwig is also designed to be highly customizable through flexible policy modules in its centralized control plane. These policies let Hedwig employ different caching, routing, and failure handling policies for different use cases. In turn, this lets us easily optimize Hedwig for different workloads and services.

Q: What’s the difference between working in systems at Facebook versus in academia?

JF: I have always admired the industry papers that appeared in early SOSP conferences that described experiences building production software systems (Unix, AFS, etc.). What makes these papers great is that they not only contain big ideas, but they also combine these ideas with practical lessons and observations that come from deploying and using the systems. Reading the papers, I can feel how the deployment of the systems really helped the authors understand what was most important and innovative about their work (for example, the simplicity of the Unix interface, or the concept of scalability in the AFS paper that was decades ahead of its time).

Working in Core Systems gives me the opportunity to replicate some of the ingredients that helped make these papers so great. In academia, my focus was on writing papers and working with my students. My students and I built systems to validate our ideas, and together we might write several papers about a particular system as we were developing the ideas. At Facebook Core Systems, my focus has been first on building the systems, deploying them at scale, and learning from them. I can let the systems bake and evolve over time before writing a paper that describes what we did. This process leads to fewer papers, but I hope it also leads to stronger papers like the early industry papers I admire.

We followed this path with our Delos paper that’s appearing at OSDI this year, and I hope to take a similar approach to describing my current work on Hedwig.

Q: You recently earned a Test of Time award for your work in adaptable battery use in mobile apps. What influenced this research?

JF: It’s often said that asking the right questions is the hardest part of research, and I think this is especially true in this situation. It was all about being in the right place at the right time.

I was really fortunate to attend grad school at Carnegie Mellon when they had just deployed the first campus-wide wireless network. This gave me the opportunity to take my laptop outside and work with an actual internet connection. (Although hard to imagine today, this was incredibly novel at the time.) Almost the first thing I noticed was that my laptop battery would quickly die. This was the “aha!” moment — that reducing energy usage was going to be incredibly vital for any type of mobile computer. This led to all sorts of interesting questions: Can we measure energy usage and attribute that energy to the software running on the computer? What types of strategies can software employ to extend the battery lifetime of the computer? Can the operating system adapt the behavior of the software to optimize for energy savings or quality/performance?

Q: For someone in academia curious about collaborating with or working at Facebook, where would you recommend they start?

JF: My best collaborations (both with Facebook and elsewhere) have involved sending a student to work directly with industry teams for a period of time (i.e., an internship) or working directly on the project myself (e.g., on sabbatical or for a few hours every week). My Facebook collaborations started out with long conversations with Facebook engineers at conferences where we would kick a bunch of ideas around. The final project wound up in the same general area as these conversations, but it was really the process of embedding with a Facebook team that generated the best research directions.

Working with Facebook, there is a tremendous opportunity to collect real-world systems measurements at scale to validate ideas. It’s important to utilize this opportunity during any collaboration.

I also learned to budget some time after any internship or sabbatical to keep working on the idea in academia, where one can build a smaller-scale replica of the system and then tweak and measure it in ways that are not possible in production. Combining these two styles of research can result in really strong work.

The post From academia to industry: How Facebook Engineer Jason Flinn started his journey in Core Systems appeared first on Facebook Research.
