Diverse representation in advertising: Q&A with Creative Shop Researcher Fernanda de Lima Alcantara

International Women’s Day celebrates the social, economic, cultural, and political achievements of women. To highlight the impactful work that women researchers are doing at Facebook, we reached out to Fernanda de Lima Alcantara, Marketing Science Researcher at Facebook’s Creative Shop.

The Creative Shop is an internal team of creative strategists, designers, writers, producers, and data experts who collaborate with advertisers to help them run effective campaigns on Facebook’s apps and services. Within this team, De Lima focuses on helping businesses succeed by providing them with marketing and advertising insights, with a current emphasis on representation in online ads.

In this Q&A, we ask De Lima about her journey at Facebook, her background, and her current research projects. She shares insights from her recent white paper, “Diverse and inclusive representation in online advertising: An exploration of the current landscape and people’s expectations,” and explains what marketers should take away from this research.

Q: Tell us about your experience in academia before joining Facebook.

Fernanda de Lima Alcantara: I first started my career in Brazil as a telecommunications technician, but soon I found my passion for data analysis and earned an undergraduate degree in computer science. For almost six years, I worked with data mining and decision science in the finance sector. I also obtained many certifications in analytical and statistical tools. To continue growing my skills in quantitative and qualitative analysis, I moved to Europe to pursue a master’s in machine learning at University College London.

What excited me about machine learning is that it can be applied to multiple domains (like neuroscience, bioinformatics, machine vision, and so on) to solve real-world problems using a data-driven approach. I learned to design, develop, and evaluate appropriate algorithms and methods for new applications, as well as some new techniques to analyze data. I felt the machine learning master’s program was strongly aligned with my business experience and my field of interest.

Q: What has your journey with Facebook been like so far?

FDLA: I joined Facebook in 2012 in the São Paulo office. In Brazil, I helped many businesses grow by transforming current marketing practices and developing new strategies, always grounded in our foundational measurement practices. Over the years, I worked on projects using simple aggregation, descriptive analysis, or more advanced analyses using data models and causal inference.

I officially joined the research team five years ago, when I moved to the United States to work from the Facebook Menlo Park office in California. In the first two years, I was dedicated to consumer insights and spent time studying the intersection of advertiser value and consumer behavior within ads products. I worked on a range of projects, some focused on the consumer journey and others focused on understanding how people feel about our products. It was very exciting to work with a breadth of methodologies like behavioral lab and consumer neuroscience, passive measurement in sales touchpoints, surveys, focus groups, and in-depth interviews.

For the last three years, I’ve been working in New York as a Marketing Science Researcher in Creative Shop. Every day, I’m provided with the unique opportunity to explore the creative potential of Facebook platforms and help businesses connect with people in meaningful ways and succeed. In my day-to-day, I use experimental design, online surveys, and Facebook data to build tools for statistical, qualitative, and quantitative analysis. My goal is to learn, share, and inspire business with new possibilities through data, creativity, and storytelling. And I love working at the intersection of art and science.

Q: What are you currently working on?

FDLA: Every half we are presented with exciting challenges to advance the industry. Currently, I have two projects that are top of mind: The first one promotes diverse and inclusive representation in online advertising, and the second explores the creative opportunities in emerging platforms.

The first project is very close to my heart because it promotes social justice and business equality. The objective is to identify opportunities to better represent people in online ads, inspire more inclusive and authentic advertising content, and uncover the positive impact of inclusive portrayal — for people and businesses.

The second project investigates the new ways people are connecting online and the new creative potential for people and businesses. In this project, I explore creative ideas to help businesses succeed in AR, VR, and other immersive experiences.

Both projects bring me a sense of community and meaningfulness because they aim to create a positive social impact by improving people’s representation in ads and their experience with Facebook, and they support business growth.

Q: Social impact, diversity, and inclusion continue to play a big role in the advertising industry. What should marketers take away from this research?

FDLA: Our research in diverse and inclusive representation in online advertising showed that stereotypes and bias still exist within advertising, with some groups practically absent or portrayed in stereotypical ways. In contrast, people expect the advertising industry to ensure diverse voices and experiences are represented authentically, and they want to see ads that reflect their lived experiences and communities more accurately.

While there’s no single path to progress, part of this process involves getting more comfortable having conversations around inclusivity and ensuring diversity of people both building and leading creative development. It is also part of the challenge to support creative development with mechanisms to spot bias and track progress with data.

Fundamentally, people expect brands to get involved and promote better representation and portrayal of people in advertising. In doing so, they might see a range of positive effects on business outcomes.

Advertising aims to tell stories, evoke emotions, and compel actions. But to improve the representation and portrayal of people in advertising, we must close the gap between what people want to see in advertising and what the ad creative — that is, characters and storyline — is actually showing them. This is how we can better reflect the full breadth of people we serve and make progress.

More details can be found in the white paper.

Q: Where can people learn more about your research?

FDLA: You can find an article about this research at fb.me/representationinads.

To learn more about how Facebook is celebrating the achievements of women during Women’s History Month, visit Newsroom.

The post Diverse representation in advertising: Q&A with Creative Shop Researcher Fernanda de Lima Alcantara appeared first on Facebook Research.


Inside Magma: A look at the team behind the software

Facebook Connectivity’s mission is to bring more people online to a faster internet. Together with partners around the world, we’re developing programs and technologies that increase the availability, affordability, and awareness of high-quality internet access. In this blog, we take a closer look at Magma, an open source software platform that enables operators and internet service providers to deploy mobile networks in hard-to-reach areas.

To help tell the story of Magma, we reached out to two members of the Facebook Magma team, Brian Barritt (Software Engineering Manager) and Ulas Kozat (Software Engineer). As experts in this space, they provide more details about Magma, its use cases, and its growing community of academics, developers, and industry partners.

About Magma

Every mobile network needs a high-performance packet core at the center of its network. But the market has made it difficult for communications providers to buy, deploy, and maintain the latest technologies at a reasonable cost. According to Kozat, Magma is an open-source, enhanced packet core solution that delivers flexibility, openness, and lower costs to communications service providers. This ultimately means people can experience better connectivity, whether through 4G, 5G, Wi-Fi, or other wireless access technologies.

Kozat names the following potential use cases for Magma:

  • Providing connectivity solutions for smaller populations (such as remote locations, enterprises, and factories) that need more localized, self-managed networking
  • Providing regional or national operators with a solution to fill gaps in coverage or capacity in both rural and urban areas
  • Providing low-latency, high-bandwidth access to edge cloud (like AR/VR applications), to proliferate the next generation of applications and services

Facebook Connectivity’s work by nature is highly collaborative and spans several fields of expertise, and Magma is no exception. “Facebook Connectivity open-sourced Magma in 2019, and we continue to be major contributors to the code base,” says Kozat. “Our partner engineers, marketing team, and management team build partnerships with vendors, system integrators, academics, and service providers to accelerate market adoption and bring millions of real users online powered by Magma.”

Partnerships, collaborations, and community

The Magma team actively solicits researchers to join its advanced research arm through the Magma Academic Partnership Program, which was launched in 2020. “The program aims to foster strong participation from academic researchers to advance edge connectivity over open wireless research testbeds and platforms. The program also supports research projects that more directly explore advanced use cases using the Magma platform,” says Kozat.

In line with this vision, members of the Magma team and others within Facebook Connectivity served on the organizing committee for the academic program and spoke at the first OpenWireless Workshop, which was organized as part of ACM MobiSys in June 2020.

The Magma team further fosters collaboration and community among industry and academic partners through events like the Magma Developers Conference, which took place this year on February 3, 2021. The annual event brings together developers, communications service providers, field experts, academia, and technology leaders to discuss opportunities, challenges, and new ways to improve and expand global connectivity. “The major theme of the event this year echoes Facebook Connectivity’s mission and underscores Magma’s role in connecting people to a faster internet,” says Barritt.

This year’s conference featured three talks led by academic collaborators: Sylvia Ratnasamy (University of California, Berkeley), Kurtis Heimerl (University of Washington), and Rahman Doost-Mohammady (Rice University). For those interested in learning more about the event, all the sessions are on the Open Infra Foundation’s YouTube channel.

Magma is expanding its community of developers through an open-source industry collaboration with the Linux Foundation. The Linux Foundation will provide a neutral governance framework for Magma, and is joined by other open-source communities including the Open Infrastructure Foundation and the OpenAirInterface Software Alliance. Many other partner companies of varying sizes have also joined the project.

“The sustainability of open-source projects depends on a healthy ecosystem. For Magma, there are many partners actively contributing to the codebase and are actively deploying Magma. Their business success is intertwined with the success of Magma,” Kozat says. More information about the collaboration is available on the Linux Foundation blog.

Next steps

For 2021, the Magma team will continue to emphasize the importance of collaboration. “Our efforts to include the research community in the Magma ecosystem will continue in 2021 with full thrust,” says Kozat. “New funding opportunities and support mechanisms for universities will be offered to push the envelope further than the near-term industry needs.”

To get involved with the Magma developer community, check out Magma’s GitHub page. Here you will find information about Slack channels and mailing lists. For marketing and industry-related news and announcements, visit the Magma website and subscribe to receive updates. For general updates from the Facebook research community, follow our Facebook page.

The post Inside Magma: A look at the team behind the software appeared first on Facebook Research.


Understanding the intentions of Child Sexual Abuse Material (CSAM) sharers

No matter the reason, sharing images or videos of child sexual abuse (CSAM) online has a devastating impact on the child depicted in that content. Every time that content is shared, it revictimizes that child. Preventing and eradicating online child sexual exploitation and abuse requires a cross-industry approach, and Facebook is committed to doing our part to protect children on and off our apps. To that end, we have taken a careful, research-informed approach to understand the basis for sharing child sexual abuse and exploitation material on our platform, to ultimately develop effective and targeted solutions for our apps and help others committed to protecting children.

Over the past year, we’ve consulted with the world’s leading experts in child exploitation, including the National Center for Missing and Exploited Children (NCMEC) and Professor Ethel Quayle, to improve our understanding of why people may share child exploitation material on our platform. Research, such as Ken Lanning’s work in 2010, and our own child safety investigative team’s experiences suggested that people who share these images are not a homogeneous group; they share this imagery for different reasons. Understanding the possible or apparent intent of a sharer is important to developing effective interventions. For example, to be effective, the intervention we make to stop those who share this imagery based on a sexual interest in children will be different from the action we take to stop someone who shares this content in a poor attempt to be funny.

We set out to answer the following questions: Did we see possible evidence of different intentions among people who shared CSAM on our platforms, and if so, what behaviors were usually associated with them? Did some users likely share CSAM to intentionally exploit children (for example, out of sexual interest or for commercial benefit)? Were there other users who shared CSAM without necessarily intending to harm children (for example, a person sharing it out of outrage or shock, or two teens sexting)?

Why did we want to understand differences in sharers? 

Protecting children and addressing the sharing of child sexual abuse material cannot be solely rooted in a “detect, report, and remove” model. Prevention must also be at the core of the work we do to protect children, alongside our continual reporting and removal responsibilities. By attempting to understand the differences in CSAM sharers, we hope to: 

  • Provide additional context to NCMEC and law enforcement to improve our reports to NCMEC of cases of child sexual abuse and exploitation found on our apps. Our CyberTips allow for more effective triaging of cases, helping them quickly identify children who are presently being abused.  
  • Develop more effective and targeted interventions to prevent the sharing of these images. 
  • Tailor our responses to people who share this imagery based on their likely intent, in order to reduce the sharing and resharing of these images. Responses range from the most severe product actions (for example, removing from the platform) to prevention education messaging (for example, our recently announced proactive warnings).
  • Develop a deeper understanding of why people share CSAM, to support a prevention-first approach to child exploitation in the future, backed by a more effective “detect and response” model, and share our learnings with all those dedicated to safeguarding children to inform their important work.

Research review 

We reviewed 10 pieces of research from the world’s leading experts on child exploitation, focused on the intentions, behaviors, or typologies of CSAM offenders. The papers included Ken Lanning’s work in 2010, “Child Molesters: A Behavioural Analysis”; Ethel Quayle’s 2016 review of typologies of internet offenders; Tony Krone’s foundational 2004 paper, “A Typology of Child Pornography Offending”; and Elliott and Beech’s 2009 work, “Understanding Online Child Pornography Use: Applying Sexual Offense Theory to Internet Offenders.”

The research noted a number of key themes around the types of involvement people can have with child sexual abuse material, such as the intersections between online and offline offending, the spectrum of offending involvement (from browsing to producing imagery), the diverse characteristics of the populations involved in this behavior, and the distinct behaviors of different categories of offenders.

From research to a draft taxonomy of intent   

Much of the foundational research on why people engage with CSAM involved access to and evaluations of individuals’ psychological makeup. However, Facebook’s application of this research to our platforms involves relying on behavioral signals from a fixed point in time and from a snapshot of users’ lives on our platforms. We do not label users in any specific clinical or medical way, but try to understand likely intentions and potential trajectories of behavior in order to provide the most effective online response to prevent this abuse from occurring in the first place. A diverse range of people can and do offend against children.

Our taxonomy, built in consultation with the National Center for Missing and Exploited Children and Professor Ethel Quayle, was most heavily influenced by Lanning’s 2010 work. His research outlined a number of categories of people who engage with harmful behavior or content involving children. Lanning broke down those who offend against children into two key groups: preferential sex offenders and situational sex offenders. 

Lanning categorized the situational offender as someone who “does not usually have compulsive-paraphilic sexual preferences including a preference for children. They may, however, engage in sex with children for varied and sometimes complex reasons.” A preferential sex offender, according to Lanning, has “definite sexual inclinations” toward children, such as pedophilia. 

Lanning also wrote about miscellaneous offenders, in which he included a number of different types of personae of sharers. This group captures media reporters (or vigilante groups or individuals), “pranksters” (people who share out of humor or outrage), and other groups. This category for Lanning was a catchall to explain the intention and behavior of those who were still breaking the law but who “are obviously less likely to be prosecuted.”

Development for Facebook users

We used an evidence-informed approach to understand the presentation of child exploitation material offenders on our platform. This means that we used the best information available and combined it with our experiences at Facebook to create an initial taxonomy of intent for those who share child exploitative material on our platforms. 

The prevalence of this content on our platform is very low, meaning sends and views of it are very infrequent. But when we do find this type of violating content, regardless of the context or the person’s motivation for sharing it, we remove it and report it to NCMEC.

When we applied these groupings to what we were seeing at Facebook, we developed the following taxonomic groupings: 

  1. “Malicious” users
    1. Preferential offenders 
    2. Commercial offenders 
    3. Situational offenders 
  2. “Nonmalicious” users 
    1. Unintentional offenders 
    2. Minor nonexploitative users 
    3. Situational “risky” offenders 

Within the taxonomy, we have two overarching categories. In the “malicious” group are people we believe intended to harm children with their behavior. In the “nonmalicious” group are people whose behavior is problematic and potentially harmful, but who we believe, based on contextual clues and other behaviours, likely did not intend to cause harm to children. For example, they shared the content with an expression of abhorrence.

In the subcategories of “malicious” users, we have leaned on work by Lanning and Elliott and Beech. Preferential and situational offenders follow Lanning’s definitions above, while commercial offenders come from Elliott and Beech’s work: criminally minded individuals, not necessarily motivated by sexual gratification but by the desire to profit from child-exploitative imagery.

In the subcategories of “nonmalicious” users, we parse Lanning’s “miscellaneous” grouping. The “unintentional offenders” category groups individuals who have shared imagery that depicts the exploitation of children, but who did so out of intentions such as outrage, attempted humor (e.g., through the creation of a meme), or vigilante motives. This behavior is still illegal; we will still report it to NCMEC, but the user experience for these people might need to be different from that of those with malicious intent.

The category “minor nonexploitative” came from feedback from our partners about the consensual, developmentally appropriate behavior that older teens may engage in. Throughout our consultations, a number of global experts, organizations, and academic researchers highlighted that children are, at times, a distinct grouping in some ways. Children can and do sexually offend against other children, but children also can engage in consensual, developmentally appropriate sexual behavior with one another. In a systematic review of sexting behavior among youth in 2018, it was noted that “sexting” is “becoming a normative component of teen sexual behavior and development.” We added the “minor nonexploitative” category to our taxonomy in accord with this research. While the content produced is technically illegal and the behavior risky (this imagery can later be exploited by others), it was important for us to separate out the nonexploitative sharing of sexual imagery between teens.

The final grouping of nonmalicious offenders came from initial analysis of Facebook data. Our investigators and data scientists observed users who were sharing a large amount of adult sexual content that included child exploitative imagery, potentially without being aware that the imagery depicted a child (for example, imagery of children in their late teens whose primary and secondary sexual characteristics appear developed to the level of adulthood). We report these images to NCMEC as they depict the abuse of children that we are aware of, but the users in these situations may not be aware that the image depicts a child. We are still concerned about their behavior and may want to offer interventions to prevent any escalation in behavior.

Application 

Using the taxonomy above, a group of child-safety investigators at Facebook analysed over 200 accounts that we reported to NCMEC for uploading CSAM, drawn across taxonomic classes during the period 2019 to mid-2020, to identify on-platform behaviors that they believed were indicative of the different intent classes. While only the individual who uploads CSAM will ever truly know their own intent, these indicators generally surfaced patterns of persistent, conscious engagement with CSAM and other minor-sexualising content where it existed. These indicators include behaviours such as obfuscation of identity, child-sexualising search terms, and creating connections with groups, pages, and accounts whose clear purpose is to sexualise children.

We have now been testing these indicators to identify individuals who exhibit or lack malicious intent. Our application of the intent taxonomy is very new and we continue to develop our methodology, but we can share some early results. We evaluated 150 accounts that we reported to NCMEC for uploading CSAM in July and August of 2020 and January 2021, and we estimate that more than 75% of these did not exhibit malicious intent (i.e. did not intend to harm a child), but appeared to share for other reasons, such as outrage or poor humor. While this study represents our best understanding, these findings should not be considered a precise measure of the child safety ecosystem. 

Our work is now to develop technology that will apply the intent taxonomy to our data at scale. We believe understanding the intentions of sharers of CSAM will help us to effectively target messaging, interventions, and enforcement to drive down the sharing of CSAM overall. 


Definitions and Examples for each taxonomic group

Malicious Users 

  • Preferential Offenders: People whose motivation is based on an inherent and underlying sexual interest in children (i.e. pedophiles/hebephiles). They are only sexually interested in children. 
    • Example: A user who is connected with a number of minors and is coercing them to produce CSAM, with threats to share the existing CSAM they have obtained.
  • Commercial Offenders: People who facilitate child sexual abuse for the purpose of financial gain. These individuals profit from the creation of CSAM and may not have a sexual interest in children.
    • Example: A parent who is making their child available for child abuse via live stream in exchange for payment.
  • Situational Offenders: People who take advantage of situations and opportunities that present themselves to engage with CSAM and minors. They may be morally indiscriminate, or they may be interested in many paraphilic topics, of which CSAM is one part.
    • Example: A user who is reaching out to multiple other users to solicit sexual imagery (adults and children). If a child shares back imagery, they will engage with that imagery and child.

Non-malicious Users 

  • Unintentional Offenders: This is a broad category of people who may not mean to cause harm to the child depicted in the CSAM share but are sharing out of humor, outrage, or ignorance. 
    • Example: User shares a CSAM meme of a child’s genitals being bitten by an animal because they think it’s funny.
  • Minor Non-Exploitative Users: Children who are engaging in developmentally normative behaviour that, while technically illegal or against policy, is not inherently exploitative but does contain risk.
    • Example: Two 16-year-olds sending sexual imagery to each other. They know each other from school and are currently in a relationship.
  • Situational “Risky” Offenders: Individuals who habitually consume and share adult sexual content, and who come into contact with and share CSAM as part of this behaviour, potentially without awareness of the age of subjects in the imagery they have received or shared.
    • Example: A user received CSAM that depicts a 17-year-old and is unaware that the content is CSAM. They reshare it in a group where people are sharing adult sexual content.

The post Understanding the intentions of Child Sexual Abuse Material (CSAM) sharers appeared first on Facebook Research.


Validating symptom responses from the COVID-19 Survey with COVID-19 outcomes

In collaboration with Carnegie Mellon University (CMU) and the University of Maryland (UMD), Facebook has been helping facilitate a large-scale and privacy-focused daily survey to monitor the spread and impact of the COVID-19 pandemic in the United States and around the world. The COVID-19 Survey is an ongoing operation, taken by about 45,000 people in the U.S. each day. Respondents provide information about COVID-related symptoms, vaccine acceptance, contacts and behaviors, risk factors, and demographics, allowing researchers to examine regional trends throughout the world. To date, the survey has collected more than 50 million responses worldwide.

In addition to visualizing this data on the Facebook Data for Good website, researchers can find publicly available aggregate data through the COVIDcast API and UMD API, and downloadable CSVs (USA, world). The analyses shown here are all based on publicly available data from CMU and other public data sources (e.g., the U.S. Census Bureau and the Institute for Health Metrics and Evaluation). Microdata is also available upon request to academic and nonprofit researchers under data license agreements.

Now that the survey has run for several months, the aggregated, publicly available data sets can be analyzed to determine important properties of the COVID-related symptom signals obtained through the survey. Here, we first investigate whether survey responses provide leading indicators of COVID-19 outbreaks. We find that survey signals related to symptoms can lead COVID-19-related deaths and even cases by many days, although the strength of the correlation can depend on population size and the height of the peak of the pandemic.

Following this observation, we analyzed under which conditions these leading indicators are detectable. We find that small-sample-size statistics and the presence of a small but significant “confuser” signal can contribute an offset that obscures actual changes in the COVID-19 Survey signals.

Survey responses can provide leading indicators for COVID-19 outbreaks

For the following analyses, we used publicly available aggregate data from the CMU downloadable CSV that has been smoothed and weighted. We focus on illness indicators (symptoms) that surveyed individuals reported having personally or knowing about in their local community between May 1, 2020, and January 4, 2021. To determine whether symptom signals from the survey act as leading indicators of new COVID-19 cases or deaths, we take data at the U.S. state level, lag symptom signals in time, and note the correlation with COVID-19 outcomes (e.g., new daily cases or new daily deaths).

In the figure below, we show how Community CLI (COVID-like illness in the local community) from the survey is a leading signal of new daily deaths in Texas, as tabulated by the Institute for Health Metrics and Evaluation (IHME). In the upper row, we compare the estimated percentage of survey respondents who know people in their local community with CLI symptoms (fever along with cough, shortness of breath, or difficulty breathing) with new daily deaths over time when lagging the symptom signal by 0, 12, or 24 days. In the lower row, we plot the time-lagged Community CLI against new daily deaths and determine the Pearson correlation coefficient (Pearson’s r: 0.57, 0.86, 0.98, respectively).
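The lag-and-correlate step described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code behind the figure: the two series are synthetic, the 12-day delay is built in by construction, and the helper names `pearson_r` and `lagged_r` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_r(symptom, outcome, lag):
    """Correlate symptom[t] with outcome[t + lag].

    A positive lag asks whether the symptom signal leads the outcome
    by `lag` days; a negative lag asks whether it trails.
    """
    if lag >= 0:
        xs, ys = symptom[:len(symptom) - lag], outcome[lag:]
    else:
        xs, ys = symptom[-lag:], outcome[:len(outcome) + lag]
    return pearson_r(xs, ys)


# Synthetic daily series (not survey data): one outbreak wave, with the
# outcome built as a copy of the symptom curve delayed by 12 days.
days = range(120)
symptom = [math.exp(-((d - 50) / 15) ** 2) for d in days]
outcome = [math.exp(-((d - 62) / 15) ** 2) for d in days]

for lag in (0, 12, 24):
    print(lag, round(lagged_r(symptom, outcome, lag), 3))
```

In this toy setup the correlation is highest at the lag that matches the built-in delay and falls off on either side. Real survey and outcome series are noisier, so the peak is less sharp.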

We can use this approach with the various illness indicators captured in the survey and multiple lag times to determine how “leading” the signal is to COVID-19 outcomes, as in the figure below. In the upper row, we show time series plots of symptom signals in the survey (% CLI, % Community CLI, % CLI + Anosmia, and % Anosmia), new daily cases, and new daily deaths in Texas and Arizona from May 2020 through December 2020. In the lower row, we plot the Pearson correlation coefficient of symptoms and new daily deaths when lagging the symptom signal between -10 and 40 days.

For each U.S. state, we can approximate how leading a symptom signal is by determining the optimal time lag, or the time lag that gives the highest Pearson’s r for that symptom. However, this method will not find an optimal time lag when a region 1) has poor outcome ascertainment (e.g., insufficient testing), 2) is less populated and has too few survey samples (see below), or 3) has data only for one side of an outcome peak (e.g., cases constantly falling or constantly rising), as the optimal lag is ambiguous. In the figure below, we show the optimal time lag (days, mean ± 95 percent c.i.) for four symptom signals (CLI, Community CLI, CLI + Anosmia, and Anosmia) in 39 U.S. states with large populations that experienced large COVID-19 outbreaks.
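As a rough sketch of the optimal-lag search, the snippet below scans candidate lags from -10 to 40 days and keeps the one with the highest Pearson’s r. The data are synthetic with a two-sided peak and a built-in 21-day delay, and the function name `best_lag` is hypothetical. It does not reproduce the per-state confidence intervals reported in the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_lag(symptom, outcome, lags):
    """Return (lag, r) for the candidate lag with the highest Pearson's r."""
    best = None
    for lag in lags:
        # Pair symptom[t] with outcome[t + lag], trimming the unmatched ends.
        if lag >= 0:
            xs, ys = symptom[:len(symptom) - lag], outcome[lag:]
        else:
            xs, ys = symptom[-lag:], outcome[:len(outcome) + lag]
        r = pearson_r(xs, ys)
        if best is None or r > best[1]:
            best = (lag, r)
    return best


# Synthetic two-sided peak (not survey data). The outcome is a scaled and
# offset copy of the symptom curve that trails it by 21 days.
days = range(200)
symptom = [math.exp(-((d - 80) / 20) ** 2) for d in days]
outcome = [3.0 * math.exp(-((d - 101) / 20) ** 2) + 0.5 for d in days]

lag, r = best_lag(symptom, outcome, range(-10, 41))
print(lag, round(r, 3))
```

Because Pearson’s r ignores scale and constant offsets, the search recovers the delay even though the two series have different units. With data from only one side of a peak, many lags give similarly high correlations, which is the ambiguity noted above.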

While all four symptom signals lead new deaths by many days, the symptom signal CLI appears to lead new deaths by more time than new cases do (left, CLI: 21.3±3.0 days, new cases: 17.7±2.3 days). This is confirmed when running the same analysis for all four symptom signals using new daily cases as the outcome (right, CLI leads new cases by 8.2±4.0 days).

Detectability of COVID-19-related signals

In regions with relatively large outbreaks and reliable COVID prevalence data, the strength of symptom-outcome correlations depends on the height of the peak of the pandemic. The plot below shows that states with larger populations or that experienced a high pandemic peak (maximum number of COVID-19 cases per million people) show better correlation between CLI and new cases than smaller states or states that avoided a large outbreak.

We observed two major influences that reduced the survey signal quality for these states with poor correlations: 1) statistical noise arising from a limited number of survey responses, and 2) the presence of a confuser signal in the data.

The statistical noise in the COVID-19 Survey originates from the fact that surveys are conducted on small samples of a population (read more about our sampling and weighting methodology here). That is, if you were to ask a random person on a random day about their health, it is unlikely that they would be experiencing COVID-19 symptoms at that time. Further, if not enough people are sampled, the survey will be unlikely to identify even one person with COVID-19 symptoms. This means that survey signals for rare symptoms like CLI will have higher relative variance than Community CLI, since the probability of a person knowing another with COVID-like symptoms is typically higher than the probability of the respondent’s having symptoms.
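
To make the sampling argument concrete, the sketch below uses the textbook standard error of a proportion under simple random sampling, which ignores the survey weighting described in the methodology. The function names and any prevalence values passed in are illustrative assumptions, not survey estimates.

```python
from math import sqrt


def relative_standard_error(p, n):
    """Standard error of an estimated proportion divided by the proportion.

    Assumes n independent respondents, each reporting the symptom with
    probability p (simple random sampling, no weighting).
    """
    return sqrt((1.0 - p) / (n * p))


def prob_no_positive(p, n):
    """Probability that none of n independent respondents reports the symptom."""
    return (1.0 - p) ** n
```

For a fixed number of responses, a rarer signal has a larger relative standard error: with 1,000 hypothetical respondents, a signal at 1 percent prevalence has a relative error of roughly 31 percent, while one at 10 percent prevalence has roughly 9 percent. This is the sense in which % CLI is noisier than % Community CLI.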

Turning to the second point, our analysis revealed that even in the absence of a COVID-19 outbreak, there exists a persistent baseline in symptom signals like CLI and Community CLI. Irrespective of the origin of this confuser signal (one explanation being survey respondents who happen to have COVID-like symptoms but not COVID-19), it can obscure actual outbreaks even in situations with a large number of survey responses and low statistical noise.

Take the state of Washington, for example, from April 2020 to December 2020. In the left panel, % CLI (green) shows high relative variance and never falls below the confuser baseline of ~0.25 percent, obscuring the summer 2020 COVID-19 outbreak and rendering the fall outbreak barely visible. On the other hand, % Community CLI (orange) has lower relative variance, and both the summer and fall outbreaks are clearly visible. The right panel affirms this, showing that the Community CLI survey signal with approximately 7 days’ lag correlates very well with new deaths, while CLI does not.

Summary

In our preliminary exploration of the illness indicators available in the public COVID-19 Survey data sets, we find that symptom signals like COVID-like illness (CLI) and Community CLI correlate with COVID-19 outcomes, sometimes leading new COVID-19 cases and deaths by weeks. Additionally, we observe two main components of these signals that are not COVID-related and that can ultimately obscure the real effects of an outbreak in the data. More work will be needed to quantify this in more detail, and to expand this analysis globally.

Because the COVID-19 Surveys are run daily, worldwide, and are not subject to the types of reporting delays associated with COVID test results, for example, survey responses may represent the current pandemic situation better than official case counts. In agreement with past work showing that COVID-19 Survey signals can improve short-term forecasts, our analysis here demonstrates the potential of the survey to power COVID-19 hotspot detection algorithms or improve pandemic forecasting.

Facebook and our partners encourage researchers, public health officials, and the public to make use of the COVID-19 survey data (available through the COVIDcast API and UMD API) and other data sets (such as Facebook Data for Good’s population density maps and disease prevention maps) for new analyses and insights. Microdata from the surveys is also available upon request to academic and nonprofit researchers under data license agreements.

The post Validating symptom responses from the COVID-19 Survey with COVID-19 outcomes appeared first on Facebook Research.

Introducing new election-related ad data sets for researchers

We previously announced that starting February 1, 2021, we would make targeting information for more than 1.3 million social issues, electoral, and political Facebook ads available to academic researchers for the first time. This data package includes ads that ran during the three-month period prior to Election Day in the United States, from August 3 to November 3, 2020, and is accessible through the Facebook Open Research and Transparency (FORT) platform.

As part of this launch, we are sharing access to two new data sets:

  • Ad Targeting data set: Includes the targeting logic of social issues, election, and political ads that ran between August 3, 2020, and November 3, 2020. We exclude ads that had fewer than 100 impressions, which is one of several steps we take to protect users’ privacy.
  • Ad Library data set: Includes social issues, election, and political ads that are part of the Ad Library product. It is included so that researchers can analyze the ads and targeting information in the same environment. In other words, this data set is a copy of corresponding Ad Library data made available in the FORT platform and is different from the Ad Library API product.

We created this tool to enable academic researchers to better study the impact of Facebook’s products on elections, and we included measures to protect people’s privacy and keep the platform secure.

How to access the data sets

To apply for access to these data sets, please fill out this form.

After you provide your information, Facebook will contact you with details on next steps, which include details on the Facebook Research Data Agreement (RDA) and the ID verification process. Once you and your university have signed the RDA and you are ID verified, you will gain access to the Facebook Open Research and Transparency platform and these two new data sets.

More information about the election-related ad data sets

Ad Targeting data set

This includes the targeting options selected by advertisers when creating an ad. You can learn more about Facebook ads here.

Overview of targeting options

Learn about more targeting options here.

  • Location: Cities, communities, and countries
  • Demographics: Age, gender, education, job title, and more
  • Interests: Interests and hobbies of the people advertisers want to reach — these help make ads more relevant
  • Behaviors: Consumer behavior, such as prior purchases and device usage
  • Connections: Audiences based on people who are connected to the advertiser’s Facebook Page, app, or event
  • Custom audiences: Options that enable an advertiser to find their existing audiences among people who are on Facebook, e.g., through customer lists, website or app traffic, or engagement on Facebook. Learn more about the different types of custom audiences here. In the data set, we indicate whether the custom audience used is based specifically on a customer list.
  • Lookalike audiences: Help advertisers reach new people potentially interested in their business because they’re similar to their existing customers. Learn more about how advertisers set up a lookalike audience.

Wherever applicable, we indicate whether the targeting options were selected for inclusion or exclusion targeting.

Key considerations for the Ad Targeting data set

Location data

This data set includes the location targeting chosen by an advertiser for the ad. Advertisers can input location targeting in a number of ways, such as by selecting zip codes, countries, designated market areas (DMAs), or pindrops/addresses/places with a specified radius.

We’ve provided the location targeting selected by advertisers. When an advertiser selects an address, a place, or a location pindrop, we note the type of selection, the city it falls in, and the radius specified by the advertiser. Larger geographic areas, such as zip codes, cities, or countries, are included in the data set as selected by the advertiser.

The following examples help illustrate the transformations:

On the left, the targeting selection by the advertiser; on the right, the transformation found in the data set. You will see that addresses, pindrops (longitude and latitude), and places are replaced by the redacted text <address>, <location>, and <place>, respectively.

  1. Seattle + 5 miles —> Seattle (+5 miles)
  2. 1 North Almaden Blvd, San Jose + 5 miles —> <address> San Jose (+5 miles)
  3. 95110, San Jose —> 95110, San Jose (so no change)
  4. 95110, San Jose + 5 miles —> 95110, San Jose (+5 miles)
  5. 37.335080; -121.895480 + 5 miles —> <location> San Jose (+5 miles)
  6. Acme Park, San Jose (+ 1 mile) —> <place> San Jose (+1 mile)
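
Researchers who want to work with these transformed values programmatically might split them into their parts. The following is a minimal sketch that assumes the strings in the data set look exactly like the right-hand sides of the six examples above; the real column layout may differ, and `parse_location` is a hypothetical helper, not part of the FORT platform.

```python
import re

# Optional redaction tag, then the area text, then an optional "(+N unit)" radius.
_LOCATION = re.compile(
    r"(?:<(?P<redacted>address|location|place)>\s*)?"
    r"(?P<area>.*?)"
    r"(?:\s*\(\+\s*(?P<radius>\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-z]+)\))?"
)


def parse_location(text):
    """Split a transformed location string into its components."""
    match = _LOCATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    radius = match["radius"]
    return {
        "redacted": match["redacted"],  # "address", "location", "place", or None
        "area": match["area"],          # city, or zip code plus city
        "radius": float(radius) if radius is not None else None,
        "unit": match["unit"],          # "mile" or "miles" as written, or None
    }
```

For instance, `parse_location("<address> San Jose (+5 miles)")` yields the redaction type `"address"`, the area `"San Jose"`, and a radius of 5.0 miles.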

Joining targeting data with Ad Library data

If researchers want to understand more about an ad (its creative, spend, etc.) and want to map this ad to its targeting information, they can perform a join between the column ad_archive_id of the ad_archive_api table (Ad Library data) and the column archive_id of the ad_library_targeting table (Ad Targeting data).

However, you may see inconsistencies between ad_archive_api and ad_library_targeting tables:

If you perform a join between the two tables, you will see that there are ads in the targeting table that do not have a corresponding entry in the ad library table (or vice versa).

This happens because ads could be classified as political/nonpolitical long after they have been run. When this happens, the ad library is updated. But since the targeting data set is a one-time data release that occurred on January 22, 2021, the reclassification would not be reflected in it.

However, this doesn’t happen often. For example, when we made the one-time data release of targeting data on January 22, 2021, we found nine ads (out of roughly 1.3 million) that were in the targeting data set but not in the ad library because they were later found to be incorrectly labeled as political ads and were removed from the library. However, since the targeting data set had already been generated by then, these nine ads were included in this data set.

Over the next few weeks, we will monitor this situation, and if we notice a large volume of ads where such an issue exists, then we will evaluate whether to update this data set with these new ads.
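
The join described in this section, together with a check for ads that appear in only one table, can be sketched as follows. The table names and the two key columns (ad_archive_id and archive_id) come from the description above; the in-memory SQLite database, the `location` column, and the function name are assumptions made so the example runs on its own, and the actual tables on the FORT platform have more columns.

```python
import sqlite3


def join_targeting_with_library(library_rows, targeting_rows):
    """Join the two tables and list the ads found in only one of them.

    library_rows:   iterable of (ad_archive_id, page_name)
    targeting_rows: iterable of (archive_id, location)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ad_archive_api (ad_archive_id TEXT, page_name TEXT)")
    conn.execute("CREATE TABLE ad_library_targeting (archive_id TEXT, location TEXT)")
    conn.executemany("INSERT INTO ad_archive_api VALUES (?, ?)", library_rows)
    conn.executemany("INSERT INTO ad_library_targeting VALUES (?, ?)", targeting_rows)

    # Ads present in both tables, with their targeting attached.
    matched = conn.execute(
        """
        SELECT a.ad_archive_id, a.page_name, t.location
        FROM ad_archive_api AS a
        JOIN ad_library_targeting AS t ON a.ad_archive_id = t.archive_id
        ORDER BY a.ad_archive_id
        """
    ).fetchall()

    # Ads with targeting data but no Ad Library entry.
    targeting_only = [row[0] for row in conn.execute(
        """
        SELECT t.archive_id
        FROM ad_library_targeting AS t
        LEFT JOIN ad_archive_api AS a ON a.ad_archive_id = t.archive_id
        WHERE a.ad_archive_id IS NULL
        ORDER BY t.archive_id
        """
    )]

    # Ads in the Ad Library with no targeting data.
    library_only = [row[0] for row in conn.execute(
        """
        SELECT a.ad_archive_id
        FROM ad_archive_api AS a
        LEFT JOIN ad_library_targeting AS t ON a.ad_archive_id = t.archive_id
        WHERE t.archive_id IS NULL
        ORDER BY a.ad_archive_id
        """
    )]

    conn.close()
    return matched, targeting_only, library_only
```

Reporting the unmatched IDs separately, rather than silently dropping them with an inner join, makes the small number of inconsistent ads visible.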

Ad Library data set

The Ad Library data set contains the following fields:

  • ad_archive_id: ID for the archived ad object
  • ad_creation_time: The UTC date and time when someone created the ad. This is not the same time as when the ad ran. Includes date and time separated by T. Example: 2019-01-24T19:02:04+0000, where +0000 is the UTC offset.
  • ad_creative_body: The text that displays in the ad. Typically 90 characters. See Reference, Ad Creative.
  • ad_creative_link_caption: If an ad contains a link, the text that appears in the link
  • ad_creative_link_description: If an ad contains a link, any text description that appears next to the link, such as a caption or description
  • ad_creative_link_title: If an ad contains a link, any title provided
  • ad_delivery_start_time: Date and time when an advertiser wants Facebook to start delivering any of the ads. Provided in UTC as in ad_creation_time
  • ad_delivery_stop_time: The time when an advertiser wants to stop delivery of their ad. If this is blank, Facebook runs the ad until the advertiser stops it or they spend their entire campaign budget. In UTC.
  • ad_snapshot_url: String with URL link which displays the archived ad
  • currency: The currency used to pay for the ad, as an ISO currency code
  • impressions: A string containing the number of times the ad created an impression. In ranges of <1000, 1K-5K, 5K-10K, 10K-50K, 50K-100K, 100K-200K, 200K-500K, >1M.
  • demographic_distribution: The demographic distribution of people reached by the ad. Provided as age ranges and gender:
    • Age ranges can be one of the following: 18-24, 25-34, 35-44, 45-54, 55-64, 65+
    • Gender can be any of the following strings: “Male”, “Female”, “Unknown”
  • funding_entity: A string containing the name of the person, company, or entity that provided funding for the ad. Provided by the purchaser of the ad.
  • page_id: ID of the Facebook Page that ran the ad
  • page_name: Name of the Facebook Page that ran the ad
  • region_distribution: Regional distribution of people reached by the ad. Provided as a percentage and where regions are at a subcountry level.
  • spend: A string showing the amount of money spent running the ad as specified in currency. This is reported in ranges: <100, 100-499, 500-999, 1K-5K, 5K-10K, 10K-50K, 50K-100K, 100K-200K, 200K-500K, >1M.
  • is_active: Binary; describes whether an ad is active
  • reached_countries: Facebook delivered the ads in these countries. Provided as ISO country codes.
  • publisher_platforms: Search for ads based on whether they appear on a particular platform, such as Instagram or Facebook. You can provide one platform or a comma-separated list of platforms.
  • potential_reach: This is an estimate of the size of the audience that’s eligible to see this ad. It’s based on targeting criteria, ad placements, and how many people were shown ads on Facebook apps and services in the past 30 days. This is not an estimate of how many people will actually see this ad, and the number may change over time. It isn’t designed to match population or census estimates.
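
Several of these fields arrive as strings that need parsing before analysis. The sketch below, with hypothetical helper names, reads the timestamp format shown for ad_creation_time and the range strings used by impressions and spend. The source does not say whether range endpoints are inclusive, so the helper only records the printed bounds and uses None for an open end.

```python
from datetime import datetime

_MULTIPLIER = {"K": 1_000, "M": 1_000_000}


def parse_ad_time(value):
    """Parse a timestamp such as 2019-01-24T19:02:04+0000 into an aware datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")


def _to_number(token):
    """Convert '500', '5K', or '1M' to an integer."""
    token = token.strip()
    suffix = token[-1].upper()
    if suffix in _MULTIPLIER:
        return int(float(token[:-1]) * _MULTIPLIER[suffix])
    return int(token)


def parse_range(value):
    """Return (lower, upper) for strings like '<1000', '1K-5K', or '>1M'."""
    value = value.strip()
    if value.startswith("<"):
        return (None, _to_number(value[1:]))
    if value.startswith(">"):
        return (_to_number(value[1:]), None)
    low, high = value.split("-")
    return (_to_number(low), _to_number(high))
```

Because impressions and spend are reported only as ranges, any aggregate built from them is itself a range rather than a point value.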

About the Facebook Open Research and Transparency platform

The Facebook Open Research and Transparency (FORT) platform facilitates responsible research by providing flexible access to valuable data. The platform is built with validated privacy and security protections, such as data access controls, and has been penetration-tested by internal and external experts.

The FORT platform runs on a configured version of JupyterHub, an open source tool that is widely used by the academic community. Hosted on Amazon Web Services on servers in Ireland, the FORT platform supports multiple standard programs, including SQL, Python, and R, and a specialized bridge to specific Facebook Graph APIs.

Publication guidelines

Researchers may publish research conducted using this data without Facebook’s permission. Note that the terms of the Facebook Research Data Agreement require researchers to submit publications of any kind to Facebook for a privacy review at least 30 days prior to publication.

The post Introducing new election-related ad data sets for researchers appeared first on Facebook Research.

What precautions do people take for COVID-19?

To prevent the spread of COVID-19, people have been encouraged to adopt preventative measures such as hand-washing and mask-wearing. Through a survey of over 200,000 Facebook users in 26 countries conducted on Facebook between May and October 2020, we examined how the adoption of such precautions has ebbed and flowed over time, and how precaution-taking relates to both offline activity (e.g., seeing people in person) and online activity (e.g., viewing COVID-19-related content on Facebook).

In this survey, we asked about whether people adopted any of seven precautions:

  • Washing hands or using hand sanitizer
  • Using a face mask
  • Avoiding crowded places
  • Practicing social distancing
  • Avoiding public locations
  • Not leaving home except for essentials
  • Not leaving home at all

Precaution-taking

How common were different precautions?

From May through October 2020, hand-washing and mask-wearing were the two most common precautions that people took worldwide. Of the people we surveyed, 73 percent reported washing their hands or using hand sanitizer, and 72 percent reported wearing a face mask.


Figure 1: Hand-washing and mask-wearing were the most commonly reported precautions that people took between May and October 2020 because of the COVID-19 pandemic. Country flags indicate prevalence in the respective countries (Taiwan, Peru, and the United States).

Figure 1 also shows how these overall rates compare with those of three countries: Taiwan (0.29 deaths per million as of early October 2020), the United States (640 deaths per million), and Peru (1,000 deaths per million). While people in Taiwan reported lower rates of following all precautions, people in Peru reported the highest rates of the more restrictive precautions (such as not leaving home at all).

How did precaution-taking change over time?

Despite the COVID-19 case rate rising and the death rate holding relatively steady from May to October 2020, likely through a combination of improved testing and treatment, people took fewer precautions over the same time period (Figure 2). In May 2020, three in five people reported avoiding public locations; by October 2020, only two in five people did.


Figure 2: Globally, between May and October 2020, people were less likely to take precautions over time.

Precaution-taking decreased over time. For all precautions surveyed except for mask-wearing, the proportion of respondents who reported taking them decreased over the period of the study. Mask-wearing increased between May and September 2020 but decreased in October 2020. In the United States, there were significant decreases in people washing their hands, avoiding crowded places, avoiding public locations, and not leaving home except for essentials.

Women took more precautions than men (Figure 3). This aligns with recent research that women were more likely to perceive COVID-19 as a serious problem and to adopt preventive measures.


Figure 3: Women took more COVID-19-related precautions than men.

Age differences. Older adults took more precautions, primarily in hand-washing and social distancing (Figure 4), while 18- to 30-year-olds were most likely to not take any precautions at all. These trends are largely consistent with the increase in risk from COVID-19 as one gets older (e.g., the case fatality rate is about 10 times higher for those above 60 than it is for those under 40).


Figure 4: Older adults took more precautions than younger adults.

How did precaution-taking vary by country?

Precaution-taking differed substantially by country. Not only were there large differences across countries in measures taken, such as mask-wearing and social distancing, but depending on the country, related precautions (e.g., mask-wearing vs. social distancing) were adopted at vastly differing rates.

Differences in mask-wearing (Figure 5). A majority of Americans wear masks: On average, three out of four people reported wearing a mask as a precaution. People in Japan were most likely to wear masks (87 percent), while Swedes were least likely to wear masks (7 percent), likely because Sweden’s public health authorities discouraged people from wearing masks. And though only 68 percent of people in Taiwan reported wearing a mask, Taiwan has one of the lowest death rates in the world, likely because of the successful countermeasures taken from the beginning of the pandemic.


Figure 5: Almost nine in 10 people in Japan reported wearing a mask as a preventive measure, while fewer than one in 10 people in Sweden reported wearing a mask.

The relatively high compliance in the United States is notable, despite poor guidance from the federal government at the time. Mask-wearing was highest in Maryland (93 percent) and lowest in Tennessee (58 percent) (Figure 6). These rates largely correspond to whether states had instituted a mask-wearing mandate. For example, Tennessee’s governor did not support a mask mandate. These findings on mask-wearing largely mirror existing surveys (e.g., by YouGov). Still, even higher rates of mask-wearing may be necessary to substantially reduce the spread of COVID-19, as described in a recent CMU blog post.


Figure 6: Between May and October 2020, in the United States, mask-wearing was highest in Maryland (MD) and lowest in Tennessee (TN). Grayed-out states had fewer than 50 responses.

Social distancing. People in Canada, the United Kingdom, and the United States reported some of the highest rates of social distancing in the world (Figure 7). Notably, 70 percent of Swedes reported social distancing, despite only 7 percent wearing masks. Like the mask-wearing rate, the social distancing rate was lower in Taiwan (27 percent), likely reflecting the already-low prevalence of COVID-19.


Figure 7: People in Canada, the United Kingdom, and the United States reported some of the highest rates of social distancing. In contrast with Swedes’ low rate of mask-wearing, they reported a relatively high rate of social distancing.

Precaution-taking was correlated with the deadliness of the disease. People reported being more cautious when the risk of dying from COVID-19 was higher. In countries with more COVID-19 deaths per million, people were more likely to take one or more precautions, and this trend was strongest for social distancing in particular (Figure 8).


Figure 8: Social distancing was correlated with the death rate in each country.

Surprising differences. While one may expect the adoption of different precautions to be largely correlated with each other, there were some countries where some precautions were relatively more prevalent than others:

  • Sweden: Sweden had high rates of hand-washing (81 percent) and social distancing (71 percent), but it had low rates of mask-wearing (7 percent), staying at home except for essentials (30 percent), and the avoidance of public locations (34 percent).
  • Japan: Other than for mask-wearing (88 percent), Japan had a comparatively low rate of following most other precautions (e.g., 36 percent avoided public locations).
  • Italy: While Italy had a relatively higher rate of mask-wearing (83 percent), people were less likely to stay home (33 percent) and less likely to avoid public locations (26 percent).


Figure 9: There are substantial differences in how common different preventive measures are, both within and across countries. Blue indicates relatively higher prevalence of a precaution in a country compared with other countries, while red indicates relatively lower prevalence.

Why do these country differences exist? Multiple factors play a role: the death rate, perceptions of risk, preexisting norms (e.g., Japan’s culture of mask-wearing), and differences in government directives (e.g., Sweden discouraged mask-wearing). Other factors such as population density (e.g., it may be harder to avoid crowds in cities) also likely play a role.

Precaution-taking, social interactions, and feeling better

In the previous section, we described how precaution-taking varied by the type of precaution, a person’s age and gender, and the country that they live in. In this section, we explore how precaution-taking related to whether people saw others in person, and how it related to generally feeling better or feeling worse.

Was precaution-taking associated with seeing more people in person?

COVID-19 is a communicable disease, so many precautions are aimed at reducing the risk of person-to-person transmission. As Figure 10 shows, people who adopted “protective” measures (e.g., mask-wearing and hand-washing) reported having interacted with more people offline outside of their homes, while those who adopted “isolating” measures (e.g., staying home) reported having interacted with fewer people offline.


Figure 10: People who adopted more “protective” measures (e.g., hand-washing) had more offline interactions; people who adopted more “isolating” measures (e.g., not leaving home) had fewer offline interactions. Error bars represent 95 percent confidence intervals; darker bars indicate statistically significant associations (p < 0.05).

Does taking more precautions lead to more offline interactions (or vice versa)? On one hand, some researchers have hypothesized that precautions such as mask-wearing may give people a false sense of security, leading them to become less careful in their interactions with others (e.g., how often a person interacts with others outside their household). On the other hand, people with more offline interactions likely have a stronger incentive to adopt “protective” precautions.

Our findings show little evidence of the former. When we examined the extent to which precaution-taking in the past was correlated with future offline interactions, we found that taking more precautions was associated with a small subsequent increase in offline interactions in Spain and a decrease in offline interactions in Indonesia and the Philippines. Additionally, in Canada and Japan, having more offline interactions was associated with subsequent decreases in the number of precautions people took. More research is necessary to understand and verify these observations.

Was precaution-taking associated with feeling better or feeling worse?

Some studies have warned that “quarantine and social distancing […] lead to elevated levels of loneliness and social isolation.” Past research found that loneliness is associated with negative affect, so we expected largely similar findings with respect to precaution-taking. However, this was not the case.

People who took precautions reported feeling better. To understand whether people felt better or worse, the survey included a question about the extent to which a person “felt good most of the time” and a question about the extent to which they “felt bad most of the time.” People who reported taking precautions were more likely to feel good most of the time (Figure 11) and less likely to feel bad most of the time (not shown). Only in the extreme case of not leaving home at all did people report feeling worse.

Why might people who take precautions feel better? They may feel safer from taking these precautions or feel better from actively playing a role in their community to stem the spread of the disease. Taking “protective” precautions such as mask-wearing may also have allowed people to better maintain social connections with each other and thus feel more supported.


Figure 11: Precaution-taking is generally associated with feeling better.

Precaution-taking and Facebook

In this section, we show how seeing COVID-19-related content on Facebook may be related to precaution-taking, and how precaution-taking may be related to use of the “care” reaction.

Was viewing COVID-19-related content on Facebook associated with precaution-taking?

People who saw more COVID-19-related content on Facebook were more likely to take one or more precautions. After adjusting for differences in demographics and on-Facebook activity (e.g., the total number of posts that a person saw), we found a significant association between the proportion of COVID-19-related content seen and whether any precautionary measures were taken. On one hand, people who see more COVID-19-related content may end up becoming more informed and thus take more precautions; on the other hand, people who are more concerned about COVID-19 may also be the people who follow and view more COVID-19-related content.

Which precautions were associated with viewing more COVID-19-related content? People who viewed proportionally more COVID-19-related content were more likely to wear a mask (Figure 12). They were less likely to not leave their homes at all.


Figure 12: People who viewed proportionally more COVID-19-related content were more likely to also report wearing a face mask or washing their hands.

Was precaution-taking associated with using the “care” reaction?

Mask-wearing was associated with using the “care” reaction more. One commonly emphasized benefit of mask-wearing is that it protects others from being infected by the wearer. This benefit is largely different from many of the other precautionary measures, where the perceived benefit accrues largely to oneself. As such, mask-wearing may be more strongly associated with caring for others compared with other precautionary measures. To understand if this may be the case, we examined how people used the “care” reaction, which typically indicates support or care for others.

In regression analyses, mask-wearing was associated with proportionally greater use of the “care” reaction (Figure 13). This corroborates prior work that indicated that antisocial traits such as risk-taking and a lack of empathy were associated with a lower likelihood of mask-wearing or social distancing. While there were also significant associations with respect to not leaving home or avoiding crowded places, follow-up analyses indicate that these were because mask-wearing was correlated with both.


Figure 13: Mask-wearing was associated with using the “care” reaction more, suggesting a link between mask-wearing and empathy.

Conclusion

In summary, from May through October 2020, people took fewer preventive measures against COVID-19. Across 26 countries, including the United States, hand-washing and mask-wearing were the two most common preventive measures. Women adopted more precautions than men, and older adults adopted more precautions than younger adults.

Still, there were differences in how often precautions were taken in different countries. While Japan had the highest rate of mask-wearing (88 percent), social distancing was far less prevalent (39 percent). Conversely, Sweden had the lowest rate of mask-wearing (7 percent), but social distancing was more common (71 percent).

And while some precautions were associated with seeing more people in person (e.g., wearing masks), others were not (e.g., avoiding public locations). Precaution-taking was also associated with feeling better.

Additionally, people who saw more COVID-19-related content on Facebook were also more likely to take one or more precautions. Mask-wearing was also associated with using the “care” reaction more, so people who wear masks may tend to be more supportive of others.

We hope that these findings begin to shed light on the relationship between COVID-19, the precautions that people take, and the online and offline interactions that people have. For more information about what Facebook is doing to keep people safe and informed about the coronavirus, read the latest updates on Newsroom.

Read Appendix

The post What precautions do people take for COVID-19? appeared first on Facebook Research.

How Facebook partners with academia to help drive innovation in energy-efficient technology

Facebook is committed to sustainability and the fight against climate change. That’s why in September 2020 we announced our commitment to reaching net-zero emissions across our value chain in 2030. Part of Facebook’s sustainability efforts involves data center efficiency, from building servers that require less energy to run to developing a liquid cooling system that uses less water.

To learn more about these data center sustainability efforts at Facebook and how we’re engaging with the academic community in this space, we sat down with Dharmesh Jani (“DJ”), the Open Ecosystem Lead on the hardware engineering team and the Open Compute IC Chair, and Dr. Katharine Schmidtke, Director of Sourcing at Facebook for application-specific integrated circuits and custom silicon. DJ and Schmidtke’s teams are working to achieve four goals:

  1. Extend Facebook data center equipment lifecycle and make our gear reusable by others.
  2. Improve energy efficiency in Facebook infrastructure via hardware and software innovations.
  3. Reduce carbon-heavy content in our data centers.
  4. Work with industry and academia to drive innovation on sustainability across our value chain.

DJ and Schmidtke discuss why it’s important to build data centers that are as energy efficient as possible, how we’re working with and supporting academia and industry partners in this space, and potential research challenges that Facebook researchers and engineers could tackle next.

Building energy-efficient data centers

For over a decade, Facebook has been committed to sustainability and energy efficiency. In 2009, Facebook built its first company-owned data center in Prineville, Oregon, one of the world’s most energy-efficient data centers with a power usage effectiveness (PUE) ratio between 1.06 and 1.08. In 2011, Facebook shared its designs with the public and — along with other industry experts — launched the Open Compute Project (OCP), a rapidly growing global community whose mission is to design, use, and enable mainstream delivery of the most efficient designs for scalable computing. However, there’s more to be done.
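For readers unfamiliar with the metric, PUE is commonly defined as the total energy a facility draws divided by the energy delivered to its IT equipment, so a value of 1.0 would mean no overhead at all. The short sketch below illustrates that ratio; the function names and the energy figures are hypothetical and are not taken from Facebook's measurements.

```python
def power_usage_effectiveness(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return PUE: total facility energy divided by IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_fraction(pue: float) -> float:
    """Energy spent on cooling, power conversion, etc., per unit of IT energy."""
    return pue - 1.0


# Hypothetical example: 1,070 kWh drawn by the facility for 1,000 kWh of IT load.
example_pue = power_usage_effectiveness(1070.0, 1000.0)
example_overhead = overhead_fraction(example_pue)
print(f"PUE = {example_pue:.2f}, overhead = {example_overhead:.0%} of IT energy")
```

Under this definition, a PUE between 1.06 and 1.08 means that only about 6% to 8% of additional energy is spent on everything other than the computing equipment itself.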

“On average, data centers use 205 TWh of electricity per year, which is the equivalent of 145 metric tons of CO2 emissions,” explains DJ. “With the growth of hyperscale data centers in the coming years, this emission is going to increase dramatically if mitigation is not considered today (source 1, source 2). Facebook wants to work to address this growing emission, as well, to ensure we run efficient operations and achieve our goal of net-carbon zero in 2030.”

According to DJ, Facebook is doing multiple things to address these problems: “The sustainability team within Facebook is working across organizations to align on the goals that lead to reduction in carbon. Circularity is one of the emerging efforts within infrastructure to increase equipment life cycle, which has the biggest impact on the net-zero-carbon effort. We’re driving sustainability and circularity efforts in the industry through the Open Compute Project,” he says.

Data center construction itself also contributes to carbon emissions. Using already-built data centers as efficiently as possible is the key to reducing demand for new construction. Over the years, Facebook has been developing a suite of industry-leading technologies to control and manage the peak power demand of data centers. As a result, many more servers can be hosted in existing data centers with limited power capacity, which has reduced data center construction demand by more than 50%. The technology is developed in-house with the help of academic collaborations and research internship programs. Some key research findings and hyperscale operational experience are also shared back with the community through publications at top academic conferences. Two examples are “Dynamo: Facebook’s Data Center-Wide Power Management System” and “Coordinated Priority-aware Charging of Distributed Batteries in Oversubscribed Data Centers.”

Learn more about Facebook data center efficiency on the Tech@ blog, and read our latest Sustainability Report on Newsroom.

Partnerships and collaborations

Developing energy-efficient technology isn’t something that industry can do alone, which is why we often partner with experts in academia and support their pioneering work. “Facebook has launched a number of research collaborations directed at power reduction and energy efficiency over the past few years,” Schmidtke says. “Recently, Facebook sponsored the Institute of Energy Efficiency at UC Santa Barbara with a gift of $1.5 million over three years. We hope our contribution will help foster research in data center energy efficiency.”

“Another example is the ongoing research collaboration with Professor Clint Schow at UCSB,” Schmidtke says. “The project is focused on increasing the efficiency of optical interconnect data transmission between servers in our data center network. The research has just entered its second phase and is targeting highly efficient coherent optical links for data transmission.”

Facebook is also an industry member of the Center for Energy-Smart Electronic Systems (partnering with the University of Texas at Arlington) and the Future Renewable Electric Energy Delivery and Management Systems Engineering Research Center (at North Carolina State University).

In addition to fostering innovation within the academic community, Facebook is leveraging industry partners. According to DJ, “We’re looking to drive sustainability-related initiatives within the OCP community to align other industry players across the value chain. We plan to define sustainability as one of the OCP tenets so that all future contributions can focus on it.”

What’s next

DJ offers three sustainability challenges that researchers in the field could tackle next, all of which would involve industry collaborations with academia and other research organizations.

One research challenge is making computation more carbon neutral. The AI field’s computing demands have grown exponentially: since 2012, the amount of compute used in the largest AI training runs has doubled roughly every 3.4 months, a 300,000x increase in compute from AlexNet to AlphaGo Zero. “How can we make AI more efficient while the current approach of increasing computation is not viable?” says DJ. “This is one of the biggest challenges in the field, so I’m eager to see more Green AI initiatives.”
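As a rough sanity check on those figures, the arithmetic below shows how a 3.4-month doubling time relates to a 300,000x increase. It is only a back-of-the-envelope sketch: the helper names are made up for illustration, and it assumes perfectly steady exponential growth.

```python
import math


def doublings_for_growth(growth_factor: float) -> float:
    """Number of doublings needed to reach the given growth factor."""
    return math.log2(growth_factor)


def months_for_growth(growth_factor: float, doubling_time_months: float) -> float:
    """Months needed to reach the growth factor at a fixed doubling time."""
    return doublings_for_growth(growth_factor) * doubling_time_months


doublings = doublings_for_growth(300_000)
months = months_for_growth(300_000, 3.4)
print(f"{doublings:.1f} doublings over about {months:.0f} months ({months / 12:.1f} years)")
```

In other words, a 300,000x increase corresponds to roughly 18 doublings, which at 3.4 months each spans a little over five years.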

Another challenge is scheduling workloads (WL) within data centers when carbon intensity is low. “We have to think of the amount of WL coming into the data centers and complex interactions to optimize for such use cases,” explains DJ. “I hope to see novel algorithmic ways of reducing energy consumption, distributing workloads, and impacting carbon emissions.”
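To make the idea concrete, here is a deliberately simplified sketch of carbon-aware scheduling for a single deferrable job. It assumes an hourly forecast of grid carbon intensity is available and ignores capacity limits and interactions between workloads, which are exactly the hard parts DJ describes. The function name, the forecast values, and the units are hypothetical.

```python
from typing import Sequence


def best_start_hour(intensity: Sequence[float], duration: int, deadline: int) -> int:
    """Pick the start hour that minimizes total carbon intensity for one job.

    intensity: forecast carbon intensity per hour (e.g., gCO2/kWh), index = hour.
    duration:  number of consecutive hours the job needs.
    deadline:  hour by which the job must have finished (exclusive end index).
    """
    if duration <= 0 or duration > deadline or deadline > len(intensity):
        raise ValueError("need 0 < duration <= deadline <= len(intensity)")

    # Start with the window beginning at hour 0, then slide it one hour at a time.
    window_cost = sum(intensity[:duration])
    best_start, best_cost = 0, window_cost
    for start in range(1, deadline - duration + 1):
        window_cost += intensity[start + duration - 1] - intensity[start - 1]
        if window_cost < best_cost:
            best_start, best_cost = start, window_cost
    return best_start


# Hypothetical forecast for the next eight hours.
forecast = [500, 450, 300, 200, 220, 400, 480, 520]
start = best_start_hour(forecast, duration=2, deadline=8)
print(f"Run the 2-hour job starting at hour {start}")
```

A production scheduler would also have to account for power capacity, job priorities, and many workloads competing for the same low-carbon hours, so this should be read as an illustration of the objective rather than a workable design.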

An additional potential area of focus is technology that utilizes chiplets. Chiplets can be thought of as reusable, mix-and-match building blocks that come together to form more complex chips, an approach that is more efficient and has a smaller carbon footprint. “I’m looking forward to new computer architectures that are domain specific and driven by chiplets,” says DJ. “We have only explored the tip of the iceberg in terms of sustainability. There is much we can do together in this space to further the goal of a greener tomorrow.”

Facebook is committed to open science, and we value our partnerships with industry and academia. We are confident that together we can help drive technology and innovation forward in this space.

The post How Facebook partners with academia to help drive innovation in energy-efficient technology appeared first on Facebook Research.
