Announcing the Winners of the 2021 PyTorch Annual Hackathon

More than 1,900 people worked hard in this year’s PyTorch Annual Hackathon to create unique tools and applications for PyTorch developers and researchers.

Notice: None of the projects submitted to the hackathon are associated with or offered by Meta Platforms, Inc.

This year, participants could enter their projects into the following three categories:

  • PyTorch Developer Tools: a tool or library for improving productivity and efficiency for PyTorch researchers and developers.
  • Web and Mobile Applications Powered by PyTorch: a web or mobile interface and/or an embedded device built using PyTorch.
  • PyTorch Responsible AI Development Tools: a tool, library, or web/mobile app to support researchers and developers in creating responsible AI that factors in fairness, security, privacy, and more throughout its entire development process.

The virtual hackathon ran from September 8 through November 2, 2021, with more than 1,900 registered participants from 110 countries, submitting a total of 65 projects. Entrants were judged on their idea’s quality, originality, potential impact, and how well they implemented it. All projects can be viewed here.

Meet the winners of each category below!

PYTORCH DEVELOPER TOOLS

First Place: RaNNC

RaNNC is middleware that automates hybrid model/data parallelism for training very large-scale neural networks. It is capable of training models with 100 billion parameters without any manual tuning.

Second Place: XiTorch

XiTorch provides first- and higher-order gradients of functional routines, such as optimization, root finding, and ODE solving. It also contains operations for implicit linear operators (e.g., a large matrix that is expressed only by its matrix-vector multiplication), such as symmetric eigendecomposition, linear solve, and singular value decomposition.

Third Place: TorchLiberator

TorchLiberator automates model surgery, finding the maximum correspondence between weights in two networks.

Honorable Mentions

  • PADL manages your entire PyTorch workflow with a single Python abstraction and a beautiful functional API, so there’s no more complex configuration or juggling preprocessing, postprocessing and forward passes.
  • PyTree is a PyTorch package for recursive neural networks that provides highly generic recursive neural network implementations as well as efficient batching methods.
  • IndicLP makes it easier for developers and researchers to build applications and models in Indian Languages, thus making NLP a more diverse field.

WEB/MOBILE APPLICATIONS POWERED BY PYTORCH

First Place: PyTorch Driving Guardian

PyTorch Driving Guardian is a tool that monitors driver alertness, emotional state, and potential blind spots on the road.

Second Place: Kronia

Kronia is an Android mobile app built to maximize the harvest outputs for farmers.

Third Place: Heyoh camera for Mac

Heyoh is a Mac virtual camera for Zoom and Meets that augments live video by recognizing hand gestures and smiles and shows animated effects to other video participants.

Honorable Mentions

  • Mamma AI is a tool that helps doctors with the breast cancer identification process by identifying areas likely to have cancer using ultrasonic and x-ray images.
  • AgingClock is a tool that predicts biological age first with methylation genome data, then blood test data and eventually with multimodal omics and lifestyle data.
  • Iris is an open source photos platform, an alternative to Google Photos, with features such as listing photos, detecting categories, detecting and classifying faces, and detecting and clustering photos by location and by the things in them.

PYTORCH RESPONSIBLE AI DEVELOPMENT TOOLS

First Place: FairWell

FairWell aims to address model bias on specific groups of people by allowing data scientists to evaluate their dataset and model predictions and take steps to make their datasets more inclusive and their models less biased.

Second Place: promp2slip

Promp2slip is a library that tests the ethics of language models by using natural adversarial texts.

Third Place: Phorch

Phorch adversarially attacks the data using FIGA (Feature Importance Guided Attack) and creates 3 different attack sets of data based on certain parameters. These features are utilized to implement adversarial training as a defense against FIGA using neural net architecture in PyTorch.

Honorable Mentions

  • Greenops helps measure the footprint of deep learning models during training, testing, and evaluation, with the aim of reducing energy consumption and carbon footprint.
  • Xaitk-saliency is an open-source, explainable AI toolkit for visual saliency algorithm interfaces and implementations, built for analytic and autonomy applications.

Thank you,

Team PyTorch

Machines that see the world more like humans do

Computer vision systems sometimes make inferences about a scene that fly in the face of common sense. For example, if a robot were processing a scene of a dinner table, it might completely ignore a bowl that is visible to any human observer, estimate that a plate is floating above the table, or misperceive a fork to be penetrating a bowl rather than leaning against it.

Move that computer vision system to a self-driving car and the stakes become much higher: such systems have failed to detect emergency vehicles and pedestrians crossing the street.

To overcome these errors, MIT researchers have developed a framework that helps machines see the world more like humans do. Their new artificial intelligence system for analyzing scenes learns to perceive real-world objects from just a few images, and perceives scenes in terms of these learned objects.

The researchers built the framework using probabilistic programming, an AI approach that enables the system to cross-check detected objects against input data, to see if the images recorded from a camera are a likely match to any candidate scene. Probabilistic inference allows the system to infer whether mismatches are likely due to noise or to errors in the scene interpretation that need to be corrected by further processing.
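The cross-checking idea can be illustrated with a toy sketch. This is not the researchers’ implementation: it assumes a tiny one-dimensional “depth image,” independent Gaussian noise on each pixel, and hypothetical function names and thresholds chosen only for illustration.

```python
import math

def log_likelihood(observed, rendered, noise_std):
    """Log-probability of the observed pixels given a rendered candidate scene,
    assuming independent Gaussian noise per pixel (an illustrative assumption)."""
    total = 0.0
    for obs, ren in zip(observed, rendered):
        residual = obs - ren
        total += -0.5 * (residual / noise_std) ** 2
        total -= math.log(noise_std * math.sqrt(2 * math.pi))
    return total

def best_candidate(observed, candidates, noise_std=0.1):
    """Return the name of the candidate scene that best explains the observation."""
    return max(candidates,
               key=lambda name: log_likelihood(observed, candidates[name], noise_std))

def mismatch_is_plausible_noise(observed, rendered, noise_std=0.1, threshold=3.0):
    """True if every pixel residual lies within `threshold` standard deviations."""
    return all(abs(o - r) <= threshold * noise_std
               for o, r in zip(observed, rendered))

# Made-up data: the plate either rests on the table or floats above it.
observed = [1.00, 1.02, 0.79, 0.81, 1.01]
candidates = {
    "plate_on_table": [1.0, 1.0, 0.8, 0.8, 1.0],
    "plate_floating": [1.0, 1.0, 0.5, 0.5, 1.0],
}
chosen = best_candidate(observed, candidates)
print(chosen)
print(mismatch_is_plausible_noise(observed, candidates["plate_floating"]))
```

In this sketch, small residuals are attributed to sensor noise, while a large residual signals that the scene interpretation itself should be revised.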

This common-sense safeguard allows the system to detect and correct many errors that plague the “deep-learning” approaches that have also been used for computer vision. Probabilistic programming also makes it possible to infer probable contact relationships between objects in the scene, and use common-sense reasoning about these contacts to infer more accurate positions for objects.

“If you don’t know about the contact relationships, then you could say that an object is floating above the table — that would be a valid explanation. As humans, it is obvious to us that this is physically unrealistic and the object resting on top of the table is a more likely pose of the object. Because our reasoning system is aware of this sort of knowledge, it can infer more accurate poses. That is a key insight of this work,” says lead author Nishad Gothoskar, an electrical engineering and computer science (EECS) PhD student with the Probabilistic Computing Project.

In addition to improving the safety of self-driving cars, this work could enhance the performance of computer perception systems that must interpret complicated arrangements of objects, like a robot tasked with cleaning a cluttered kitchen.

Gothoskar’s co-authors include recent EECS PhD graduate Marco Cusumano-Towner; research engineer Ben Zinberg; visiting student Matin Ghavamizadeh; Falk Pollok, a software engineer in the MIT-IBM Watson AI Lab; recent EECS master’s graduate Austin Garrett; Dan Gutfreund, a principal investigator in the MIT-IBM Watson AI Lab; Joshua B. Tenenbaum, the Paul E. Newton Career Development Professor of Cognitive Science and Computation in the Department of Brain and Cognitive Sciences (BCS) and a member of the Computer Science and Artificial Intelligence Laboratory; and senior author Vikash K. Mansinghka, principal research scientist and leader of the Probabilistic Computing Project in BCS. The research is being presented at the Conference on Neural Information Processing Systems in December.

A blast from the past

To develop the system, called “3D Scene Perception via Probabilistic Programming (3DP3),” the researchers drew on a concept from the early days of AI research, which is that computer vision can be thought of as the “inverse” of computer graphics.

Computer graphics focuses on generating images based on the representation of a scene; computer vision can be seen as the inverse of this process. Gothoskar and his collaborators made this technique more learnable and scalable by incorporating it into a framework built using probabilistic programming.

“Probabilistic programming allows us to write down our knowledge about some aspects of the world in a way a computer can interpret, but at the same time, it allows us to express what we don’t know, the uncertainty. So, the system is able to automatically learn from data and also automatically detect when the rules don’t hold,” Cusumano-Towner explains.

In this case, the model is encoded with prior knowledge about 3D scenes. For instance, 3DP3 “knows” that scenes are composed of different objects, and that these objects often lie flat on top of each other, though they may not always be in such simple relationships. This enables the model to reason about a scene with more common sense.

Learning shapes and scenes

To analyze an image of a scene, 3DP3 first learns about the objects in that scene. After being shown only five images of an object, each taken from a different angle, 3DP3 learns the object’s shape and estimates the volume it would occupy in space.

“If I show you an object from five different perspectives, you can build a pretty good representation of that object. You’d understand its color, its shape, and you’d be able to recognize that object in many different scenes,” Gothoskar says.

Mansinghka adds, “This is way less data than deep-learning approaches. For example, the Dense Fusion neural object detection system requires thousands of training examples for each object type. In contrast, 3DP3 only requires a few images per object, and reports uncertainty about the parts of each object’s shape that it doesn’t know.”

The 3DP3 system generates a graph to represent the scene, where each object is a node and the lines that connect the nodes indicate which objects are in contact with one another. This enables 3DP3 to produce a more accurate estimation of how the objects are arranged. (Deep-learning approaches rely on depth images to estimate object poses, but these methods don’t produce a graph structure of contact relationships, so their estimations are less accurate.)
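A scene graph of this kind can be sketched in a few lines. The class names, fields, and the simple “rest on the supporter’s top surface” rule below are hypothetical simplifications for illustration, not 3DP3’s actual data structures or inference procedure.

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    name: str
    bottom_z: float  # height of the object's lowest point, in meters
    height: float    # vertical extent of the object, in meters

    @property
    def top_z(self):
        return self.bottom_z + self.height

@dataclass
class SceneGraph:
    objects: dict = field(default_factory=dict)    # nodes, keyed by object name
    contacts: list = field(default_factory=list)   # edges: (supporting, supported)

    def add(self, obj):
        self.objects[obj.name] = obj

    def add_contact(self, supporting, supported):
        self.contacts.append((supporting, supported))

    def enforce_contacts(self):
        """Move each supported object so it rests on its supporter's top surface.
        Edges are processed in insertion order, so supporters should be added first."""
        for supporting, supported in self.contacts:
            self.objects[supported].bottom_z = self.objects[supporting].top_z

scene = SceneGraph()
scene.add(SceneObject("table", bottom_z=0.0, height=0.75))
scene.add(SceneObject("bowl", bottom_z=0.78, height=0.10))  # estimated 3 cm above the table
scene.add_contact("table", "bowl")
scene.enforce_contacts()
print(scene.objects["bowl"].bottom_z)
```

Here the contact edge is what allows the floating bowl estimate to be corrected; a pose estimator without such edges has no basis for preferring the resting pose.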

Outperforming baseline models

The researchers compared 3DP3 with several deep-learning systems, all tasked with estimating the poses of 3D objects in a scene.

In nearly all instances, 3DP3 generated more accurate poses than other models and performed far better when some objects were partially obstructing others. And 3DP3 only needed to see five images of each object, while each of the baseline models it outperformed needed thousands of images for training.

When used in conjunction with another model, 3DP3 was able to improve its accuracy. For instance, a deep-learning model might predict that a bowl is floating slightly above a table, but because 3DP3 has knowledge of the contact relationships and can see that this is an unlikely configuration, it is able to make a correction by aligning the bowl with the table.

“I found it surprising to see how large the errors from deep learning could sometimes be — producing scene representations where objects really didn’t match with what people would perceive. I also found it surprising that only a little bit of model-based inference in our causal probabilistic program was enough to detect and fix these errors. Of course, there is still a long way to go to make it fast and robust enough for challenging real-time vision systems — but for the first time, we’re seeing probabilistic programming and structured causal models improving robustness over deep learning on hard 3D vision benchmarks,” Mansinghka says.

In the future, the researchers would like to push the system further so it can learn about an object from a single image, or a single frame in a movie, and then be able to detect that object robustly in different scenes. They would also like to explore the use of 3DP3 to gather training data for a neural network. It is often difficult for humans to manually label images with 3D geometry, so 3DP3 could be used to generate more complex image labels.

The 3DP3 system “combines low-fidelity graphics modeling with common-sense reasoning to correct large scene interpretation errors made by deep learning neural nets. This type of approach could have broad applicability as it addresses important failure modes of deep learning. The MIT researchers’ accomplishment also shows how probabilistic programming technology previously developed under DARPA’s Probabilistic Programming for Advancing Machine Learning (PPAML) program can be applied to solve central problems of common-sense AI under DARPA’s current Machine Common Sense (MCS) program,” says Matt Turek, DARPA Program Manager for the Machine Common Sense Program, who was not involved in this research, though the program partially funded the study.

Additional funders include the Singapore Defense Science and Technology Agency collaboration with the MIT Schwarzman College of Computing, Intel’s Probabilistic Computing Center, the MIT-IBM Watson AI Lab, the Aphorism Foundation, and the Siegel Family Foundation.

Language modelling at scale: Gopher, ethical considerations, and retrieval

Language, and its role in demonstrating and facilitating comprehension – or intelligence – is a fundamental part of being human. It gives people the ability to communicate thoughts and concepts, express ideas, create memories, and build mutual understanding. These are foundational parts of social intelligence. It’s why our teams at DeepMind study aspects of language processing and communication, both in artificial agents and in humans.

Creating Interactive Agents with Imitation Learning

We show that imitation learning of human-human interactions in a simulated world, in conjunction with self-supervised learning, is sufficient to produce a multimodal interactive agent, which we call MIA, that successfully interacts with non-adversarial humans 75% of the time. We further identify architectural and algorithmic techniques that improve performance, such as hierarchical action selection.

Q&A: More-sustainable concrete with machine learning

As a building material, concrete withstands the test of time. Its use dates back to early civilizations, and today it is the most popular composite choice in the world. However, it’s not without its faults. Production of its key ingredient, cement, contributes 8-9 percent of the global anthropogenic CO2 emissions and 2-3 percent of energy consumption, which is only projected to increase in the coming years. With aging United States infrastructure, the federal government recently passed a milestone bill to revitalize and upgrade it, along with a push to reduce greenhouse gas emissions where possible, putting concrete in the crosshairs for modernization, too.

Elsa Olivetti, the Esther and Harold E. Edgerton Associate Professor in the MIT Department of Materials Science and Engineering, and Jie Chen, MIT-IBM Watson AI Lab research scientist and manager, think artificial intelligence can help meet this need by designing and formulating new, more sustainable concrete mixtures, with lower costs and carbon dioxide emissions, while improving material performance and reusing manufacturing byproducts in the material itself. Olivetti’s research improves environmental and economic sustainability of materials, and Chen develops and optimizes machine learning and computational techniques, which he can apply to materials reformulation. Olivetti and Chen, along with their collaborators, have recently teamed up for an MIT-IBM Watson AI Lab project to make concrete more sustainable for the benefit of society, the climate, and the economy.

Q: What applications does concrete have, and what properties make it a preferred building material?

Olivetti: Concrete is the dominant building material globally with an annual consumption of 30 billion metric tons. That is over 20 times the next most produced material, steel, and the scale of its use leads to considerable environmental impact, approximately 5-8 percent of global greenhouse gas (GHG) emissions. It can be made locally, has a broad range of structural applications, and is cost-effective. Concrete is a mixture of fine and coarse aggregate, water, cement binder (the glue), and other additives.

Q: Why isn’t it sustainable, and what research problems are you trying to tackle with this project?

Olivetti: The community is working on several ways to reduce the impact of this material, including alternative fuels use for heating the cement mixture, increasing energy and materials efficiency and carbon sequestration at production facilities, but one important opportunity is to develop an alternative to the cement binder.

While cement is 10 percent of the concrete mass, it accounts for 80 percent of the GHG footprint. This impact is derived from the fuel burned to heat and run the chemical reaction required in manufacturing, but also the chemical reaction itself releases CO2 from the calcination of limestone. Therefore, partially replacing the input ingredients to cement (traditionally ordinary Portland cement or OPC) with alternative materials from waste and byproducts can reduce the GHG footprint. But use of these alternatives is not inherently more sustainable because wastes might have to travel long distances, which adds to fuel emissions and cost, or might require pretreatment processes. The optimal way to make use of these alternate materials will be situation-dependent. But because of the vast scale, we also need solutions that account for the huge volumes of concrete needed. This project is trying to develop novel concrete mixtures that will decrease the GHG impact of the cement and concrete, moving away from the trial-and-error processes towards those that are more predictive.

Chen: If we want to fight climate change and make our environment better, are there alternative ingredients or a reformulation we could use so that less greenhouse gas is emitted? We hope that through this project using machine learning we’ll be able to find a good answer.

Q: Why is this problem important to address now, at this point in history?

Olivetti: There is urgent need to address greenhouse gas emissions as aggressively as possible, and the road to doing so isn’t necessarily straightforward for all areas of industry. For transportation and electricity generation, there are paths that have been identified to decarbonize those sectors. We need to move much more aggressively to achieve those in the time needed; further, the technological approaches to achieve that are more clear. However, for tough-to-decarbonize sectors, such as industrial materials production, the pathways to decarbonization are not as mapped out.

Q: How are you planning to address this problem to produce better concrete?

Olivetti: The goal is to predict mixtures that both meet performance criteria, such as strength and durability, and balance economic and environmental impact. A key to this is to use industrial wastes in blended cements and concretes. To do this, we need to understand the glass and mineral reactivity of constituent materials. This reactivity not only determines the limit of the possible use in cement systems but also controls concrete processing, and the development of strength and pore structure, which ultimately control concrete durability and life-cycle CO2 emissions.

Chen: We investigate using waste materials to replace part of the cement component. This is something that we’ve hypothesized would be more sustainable and economical; waste materials are common, and they cost less. Because of the reduction in the use of cement, the final concrete product would be responsible for much less carbon dioxide production. Figuring out the right concrete mixture proportion that makes durable concrete while achieving other goals is a very challenging problem. Machine learning is giving us an opportunity to explore the advancement of predictive modeling, uncertainty quantification, and optimization to solve the issue. What we are doing is exploring options using deep learning as well as multi-objective optimization techniques to find an answer. These efforts are now more feasible to carry out, and they will produce results with reliability estimates that we need to understand what makes a good concrete.

Q: What kinds of AI and computational techniques are you employing for this?

Olivetti: We use AI techniques to collect data on individual concrete ingredients, mix proportions, and concrete performance from the literature through natural language processing. We also add data obtained from industry and/or high throughput atomistic modeling and experiments to optimize the design of concrete mixtures. Then we use this information to develop insight into the reactivity of possible waste and byproduct materials as alternatives to cement materials for low-CO2 concrete. By incorporating generic information on concrete ingredients, the resulting concrete performance predictors are expected to be more reliable and transformative than existing AI models.

Chen: The final objective is to figure out what constituents, and how much of each, to put into the recipe for producing the concrete that optimizes the various factors: strength, cost, environmental impact, performance, etc. For each of the objectives, we need certain models: We need a model to predict the performance of the concrete (like, how long does it last and how much weight does it sustain?), a model to estimate the cost, and a model to estimate how much carbon dioxide is generated. We will need to build these models by using data from literature, from industry, and from lab experiments.

We are exploring Gaussian process models to predict the concrete strength, going forward into days and weeks. This model can give us an uncertainty estimate of the prediction as well. Such a model needs specification of parameters, which we will use another model to calculate. At the same time, we also explore neural network models because we can inject domain knowledge from human experience into them. Some models are as simple as multilayer perceptrons, while some are more complex, like graph neural networks. The goal here is that we want to have a model that is not only accurate but also robust: the input data is noisy, and the model must embrace the noise, so that its prediction is still accurate and reliable for the multi-objective optimization.

Once we have built models that we are confident with, we will inject their predictions and uncertainty estimates into the optimization of multiple objectives, under constraints and under uncertainties.
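To make the idea of a Gaussian process prediction with an uncertainty estimate concrete, here is a minimal sketch in plain Python. It is not the project’s model: the curing-time and strength numbers are made up, and the RBF kernel, its fixed hyperparameters, and the function names are assumptions chosen for illustration (the project, as described, would set such parameters with another model).

```python
import math

def rbf(a, b, length_scale=10.0, variance=100.0):
    """Squared-exponential kernel; the hyperparameters are illustrative assumptions."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def gp_predict(train_x, train_y, query_x, noise_var=1.0):
    """Return the posterior mean and standard deviation at query_x."""
    mean_y = sum(train_y) / len(train_y)
    centered = [y - mean_y for y in train_y]
    # Kernel matrix of the training inputs, with observation noise on the diagonal.
    K = [[rbf(a, b) + (noise_var if i == j else 0.0)
          for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    k_star = [rbf(query_x, a) for a in train_x]
    alpha = solve(K, centered)
    mean = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    var = rbf(query_x, query_x) - sum(k * w for k, w in zip(k_star, v))
    return mean, math.sqrt(max(var, 0.0))

# Hypothetical measurements: curing time in days, compressive strength in MPa.
days = [1.0, 3.0, 7.0, 14.0, 28.0]
strength_mpa = [8.0, 17.0, 25.0, 31.0, 36.0]
mean_near, std_near = gp_predict(days, strength_mpa, 10.0)
mean_far, std_far = gp_predict(days, strength_mpa, 60.0)
print(mean_near, std_near)
print(mean_far, std_far)
```

The prediction at day 60, far from any measurement, comes with a larger standard deviation than the one at day 10. That extra information is what a downstream optimization under uncertainty can take into account.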

Q: How do you balance cost-benefit trade-offs?

Chen: The multiple objectives we consider are not necessarily consistent, and sometimes they are at odds with each other. The goal is to identify scenarios where the values for our objectives cannot be further pushed simultaneously without compromising one or a few. For example, if you want to further reduce the cost, you probably have to sacrifice performance or accept a greater environmental impact. Eventually, we will give the results to policymakers and they will look into the results and weigh the options. For example, they may be able to tolerate a slightly higher cost under a significant reduction in greenhouse gas. Alternatively, if the cost varies little but the concrete performance changes drastically, say, doubles or triples, then this is definitely a favorable outcome.
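The scenarios Chen describes, where no objective can be improved without compromising another, correspond to what is commonly called a Pareto front. A minimal sketch follows; the mixes and their numbers are entirely hypothetical, and every objective is expressed so that lower is better (strength is negated).

```python
def dominates(a, b):
    """True if option a is at least as good as b on every objective
    and strictly better on at least one (lower is better throughout)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(options):
    """Keep only the options that no other option dominates."""
    return {
        name: objectives
        for name, objectives in options.items()
        if not any(dominates(other, objectives)
                   for other_name, other in options.items()
                   if other_name != name)
    }

# Hypothetical mixes: (cost per cubic meter, kg CO2 per cubic meter, -strength in MPa)
mixes = {
    "A": (100.0, 300.0, -40.0),
    "B": (110.0, 220.0, -40.0),
    "C": (120.0, 320.0, -38.0),  # worse than A on every objective
    "D": (95.0, 310.0, -30.0),
}
front = pareto_front(mixes)
print(sorted(front))
```

Mix C drops out because A beats it on all three objectives. The remaining mixes each trade one objective against another, which is the kind of choice left to policymakers.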

Q: What kinds of challenges do you face in this work?

Chen: The data we get either from industry or from literature are very noisy; the concrete measurements can vary a lot, depending on where and when they are taken. There are also substantial missing data when we integrate them from different sources, so we need to spend a lot of effort to organize and make the data usable for building and training machine learning models. We also explore imputation techniques that substitute missing features, as well as models that tolerate missing features, in our predictive modeling and uncertainty estimation.
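As one simple example of imputation, missing values can be replaced with the mean of the observed values in the same column. The sketch below uses made-up records and a hypothetical column layout; the project may well use more sophisticated techniques.

```python
def impute_column_means(rows):
    """Return a copy of rows with None entries replaced by the mean of the
    observed values in the same column (left as None if nothing was observed)."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[c] if value is None else value
             for c, value in enumerate(row)]
            for row in rows]

# Hypothetical columns: cement content (kg/m^3), water-cement ratio, slag content (kg/m^3)
records = [
    [350.0, 0.45, None],
    [None, 0.50, 60.0],
    [300.0, None, 80.0],
]
filled = impute_column_means(records)
print(filled)
```

Mean imputation is easy to apply but hides how uncertain the filled-in values are, which is one reason to also consider models that tolerate missing features directly.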

Q: What do you hope to achieve through this work?

Chen: In the end, we are suggesting either one or a few concrete recipes, or a continuum of recipes, to manufacturers and policymakers. We hope that this will provide invaluable information for both the construction industry and for the effort of protecting our beloved Earth.

Olivetti: We’d like to develop a robust way to design cements that make use of waste materials to lower their CO2 footprint. Nobody is trying to make waste, so we can’t rely on one stream as a feedstock if we want this to be massively scalable. We have to be flexible and robust to shift with feedstock changes, and for that we need improved understanding. Our approach to develop local, dynamic, and flexible alternatives is to learn what makes these wastes reactive, so we know how to optimize their use and do so as broadly as possible. We do that through predictive model development, using software developed in my group that automatically extracts data from the literature, covering more than 5 million texts and patents on various topics. We link this to the creative capabilities of our IBM collaborators to design methods that predict the final impact of new cements. If we are successful, we can lower the emissions of this ubiquitous material and play our part in achieving carbon emissions mitigation goals.

Other researchers involved with this project include Stefanie Jegelka, the X-Window Consortium Career Development Associate Professor in the MIT Department of Electrical Engineering and Computer Science; Richard Goodwin, IBM principal researcher; Soumya Ghosh, MIT-IBM Watson AI Lab research staff member; and Kristen Severson, former research staff member. Collaborators included Nghia Hoang, former research staff member with MIT-IBM Watson AI Lab and IBM Research, and Jeremy Gregory, executive director of the MIT Climate & Sustainability Consortium.

This research is supported by the MIT-IBM Watson AI Lab.
