Artificial intelligence that understands object relationships

When humans look at a scene, they see objects and the relationships between them. On top of your desk, there might be a laptop that is sitting to the left of a phone, which is in front of a computer monitor.

Many deep learning models struggle to see the world this way because they don’t understand the entangled relationships between individual objects. Without knowledge of these relationships, a robot designed to help someone in a kitchen would have difficulty following a command like “pick up the spatula that is to the left of the stove and place it on top of the cutting board.”

In an effort to solve this problem, MIT researchers have developed a model that understands the underlying relationships between objects in a scene. Their model represents individual relationships one at a time, then combines these representations to describe the overall scene. This enables the model to generate more accurate images from text descriptions, even when the scene includes several objects that are arranged in different relationships with one another.

This work could be applied in situations where industrial robots must perform intricate, multistep manipulation tasks, like stacking items in a warehouse or assembling appliances. It also moves the field one step closer to enabling machines that can learn from and interact with their environments more like humans do.

“When I look at a table, I can’t say that there is an object at XYZ location. Our minds don’t work like that. In our minds, when we understand a scene, we really understand it based on the relationships between the objects. We think that by building a system that can understand the relationships between objects, we could use that system to more effectively manipulate and change our environments,” says Yilun Du, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and co-lead author of the paper.

Du wrote the paper with co-lead authors Shuang Li, a CSAIL PhD student, and Nan Liu, a graduate student at the University of Illinois at Urbana-Champaign; as well as Joshua B. Tenenbaum, a professor of computational cognitive science in the Department of Brain and Cognitive Sciences and a member of CSAIL; and senior author Antonio Torralba, the Delta Electronics Professor of Electrical Engineering and Computer Science and a member of CSAIL. The research will be presented at the Conference on Neural Information Processing Systems in December.

One relationship at a time

The framework the researchers developed can generate an image of a scene based on a text description of objects and their relationships, like “A wood table to the left of a blue stool. A red couch to the right of a blue stool.”

Their system would break these sentences down into two smaller pieces that describe each individual relationship (“a wood table to the left of a blue stool” and “a red couch to the right of a blue stool”), and then model each part separately. Those pieces are then combined through an optimization process that generates an image of the scene.

The researchers used a machine-learning technique called energy-based models to represent the individual object relationships in a scene description. This technique enables them to encode each relational description with its own energy-based model, and then compose those models in a way that infers all of the objects and relationships.
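To make the idea of composing energy-based models concrete, here is a minimal, hypothetical sketch (not the authors' released code): each relation gets its own small energy network, and an image is generated by gradient descent on the sum of those energies. The class names, network sizes, and Langevin-style update are illustrative assumptions.

```python
# Minimal sketch of composing per-relation energy-based models. Hypothetical
# architecture and update rule, not the authors' implementation: each relation
# gets its own energy function, and an image is generated by descending the sum.
import torch
import torch.nn as nn

class RelationEnergy(nn.Module):
    """Scores how well an image satisfies one relational description."""
    def __init__(self, image_dim=64 * 64 * 3, cond_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(image_dim + cond_dim, 256), nn.SiLU(),
            nn.Linear(256, 1),
        )

    def forward(self, image, relation_embedding):
        x = torch.cat([image.flatten(1), relation_embedding], dim=1)
        return self.net(x)  # lower energy = better match

def compose_and_generate(energy_fns, relation_embeddings, steps=200, lr=0.1):
    """Langevin-style generation: descend the sum of per-relation energies."""
    image = torch.randn(1, 3, 64, 64, requires_grad=True)
    for _ in range(steps):
        total_energy = sum(
            E(image, emb) for E, emb in zip(energy_fns, relation_embeddings)
        )
        grad, = torch.autograd.grad(total_energy.sum(), image)
        with torch.no_grad():
            image -= lr * grad
            image += 0.005 * torch.randn_like(image)  # small exploration noise
    return image.detach()

# Two relations ("table left of stool", "couch right of stool"), one model each.
ebms = [RelationEnergy(), RelationEnergy()]
embs = [torch.randn(1, 32), torch.randn(1, 32)]
scene = compose_and_generate(ebms, embs)
print(scene.shape)  # torch.Size([1, 3, 64, 64])
```

Because the total is just a sum of per-relation energies, handling a description with a third or fourth relation means appending another term rather than retraining a single monolithic model, which mirrors the composability the researchers describe.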

By breaking the sentences down into shorter pieces for each relationship, the system can recombine them in a variety of ways, so it is better able to adapt to scene descriptions it hasn’t seen before, Li explains.

“Other systems would take all the relations holistically and generate the image one-shot from the description. However, such approaches fail when we have out-of-distribution descriptions, such as descriptions with more relations, since these models can’t really adapt one shot to generate images containing more relationships. However, as we are composing these separate, smaller models together, we can model a larger number of relationships and adapt to novel combinations,” Du says.

The system also works in reverse — given an image, it can find text descriptions that match the relationships between objects in the scene. In addition, their model can be used to edit an image by rearranging the objects in the scene so they match a new description.

Understanding complex scenes

The researchers compared their model to other deep learning methods that were given text descriptions and tasked with generating images that displayed the corresponding objects and their relationships. In each instance, their model outperformed the baselines.

They also asked humans to evaluate whether the generated images matched the original scene description. In the most complex examples, where descriptions contained three relationships, 91 percent of participants concluded that the new model performed better.

“One interesting thing we found is that for our model, we can increase our sentence from having one relation description to having two, or three, or even four descriptions, and our approach continues to be able to generate images that are correctly described by those descriptions, while other methods fail,” Du says.

The researchers also showed the model images of scenes it hadn’t seen before, as well as several different text descriptions of each image, and it was able to successfully identify the description that best matched the object relationships in the image.

And when the researchers gave the system two relational scene descriptions that described the same image but in different ways, the model was able to understand that the descriptions were equivalent.

The researchers were impressed by the robustness of their model, especially when working with descriptions it hadn’t encountered before.

“This is very promising because that is closer to how humans work. Humans may only see several examples, but we can extract useful information from just those few examples and combine them together to create infinite combinations. And our model has such a property that allows it to learn from fewer data but generalize to more complex scenes or image generations,” Li says.

While these early results are encouraging, the researchers would like to see how their model performs on real-world images that are more complex, with noisy backgrounds and objects that are blocking one another.

They are also interested in eventually incorporating their model into robotics systems, enabling a robot to infer object relationships from videos and then apply this knowledge to manipulate objects in the world.

“Developing visual representations that can deal with the compositional nature of the world around us is one of the key open problems in computer vision. This paper makes significant progress on this problem by proposing an energy-based model that explicitly models multiple relations among the objects depicted in the image. The results are really impressive,” says Josef Sivic, a distinguished researcher at the Czech Institute of Informatics, Robotics, and Cybernetics at Czech Technical University, who was not involved with this research.

This research is supported, in part, by Raytheon BBN Technologies Corp., Mitsubishi Electric Research Laboratory, the National Science Foundation, the Office of Naval Research, and the IBM Thomas J. Watson Research Center.


In MIT visit, Dropbox CEO Drew Houston ’05 explores the accelerated shift to distributed work

When the cloud storage firm Dropbox decided to shut down its offices with the outbreak of the Covid-19 pandemic, co-founder and CEO Drew Houston ’05 had to send the company’s nearly 3,000 employees home and tell them they were not coming back to work anytime soon. “It felt like I was announcing a snow day or something.”

In the early days of the pandemic, Houston says that Dropbox reacted as many others did to ensure that employees were safe and customers were taken care of. “It’s surreal, there’s no playbook for running a global company in a pandemic over Zoom. For a lot of it we were just taking it as we go.”

Houston talked about his experience leading Dropbox through a public health crisis and how Covid-19 has accelerated a shift to distributed work in a fireside chat on Oct. 14 with Dan Huttenlocher, dean of the MIT Stephen A. Schwarzman College of Computing.

During the discussion, Houston also spoke about his $10 million gift to MIT, which will endow the first shared professorship between the MIT Schwarzman College of Computing and the MIT Sloan School of Management, as well as provide a catalyst startup fund for the college.

“The goal is to find ways to unlock more of our brainpower through a multidisciplinary approach between computing and management,” says Houston. “It’s often at the intersection of these disciplines where you can bring people together from different perspectives, where you can have really big unlocks. I think academia has a huge role to play [here], and I think MIT is super well-positioned to lead. So, I want to do anything I can to help with that.”

Virtual first

While the abrupt swing to remote work was unexpected, Houston says it was pretty clear that the entire way of working as we knew it was going to change indefinitely for knowledge workers. “There’s a silver lining in every crisis,” says Houston, noting that people have been using Dropbox for years to work more flexibly so it made sense for the company to lean in and become early adopters of a distributed work paradigm in which employees work in different physical locations.

Dropbox proceeded to redesign the work experience throughout the company, unveiling a “virtual first” working model in October 2020 in which remote work is the primary experience for all employees. Individual work spaces went by the wayside and offices located in areas with a high concentration of employees were converted into convening and collaborative spaces called Dropbox Studios for in-person work with teammates.

“There’s a lot we could say about Covid, but for me, the most significant thing is that we’ll look back at 2020 as the year we shifted permanently from working out of offices to primarily working out of screens. It’s a transition that’s been underway for a while, but Covid completely finished the swing,” says Houston.

Designing for the future workplace

Houston says the pandemic also prompted Dropbox to reevaluate its product line and begin thinking of ways to make improvements. “We’ve had this whole new way of working sort of forced on us. No one designed it; it just happened. Even tools like Zoom, Slack, and Dropbox were designed in and for the old world.”

Undergoing that process helped Dropbox gain clarity on where they could add value and led to the realization that they needed to get back to their roots. “In a lot of ways, what people need today in principle is the same thing they needed in the beginning — one place for all their stuff,” says Houston.

Dropbox reoriented its product roadmap to refocus efforts from syncing files to organizing cloud content. The company is focused on building toward this new direction with the release of new automation features that users can easily implement to better organize their uploaded content and find it quickly. Dropbox also recently announced the acquisition of Command E, a universal search and productivity company, to help accelerate its efforts in this space.

Houston views Dropbox as still evolving and sees many opportunities ahead in this new era of distributed work. “We need to design better tools and smarter systems. It’s not just the individual parts, but how they’re woven together.” He’s surprised by how little intelligence is actually integrated into current systems and believes that rapid advances in AI and machine learning will soon lead to a new generation of smart tools that will ultimately reshape the nature of work — “in the same way that we had a new generation of cloud tools revolutionize how we work and had all these advantages that we couldn’t imagine not having now.”

Founding roots

Houston famously turned his frustration with carrying USB drives and emailing files to himself into a demo for what became Dropbox.

After graduating from MIT in 2005 with a bachelor’s degree in electrical engineering and computer science, he teamed up with classmate Arash Ferdowsi to found Dropbox in 2007 and led the company’s growth from a simple idea to a service used by 700 million people around the world today.

Houston credits MIT for preparing him well for his entrepreneurial journey, recalling that what surprised him most about his student experience was how much he learned outside the classroom. At the event, he stressed the importance of developing both sides of the brain to a select group of computer science and management students who were in attendance, and a broader live stream audience. “One thing you learn about starting a company is that the hardest problems are usually not technical problems; they’re people problems.” He says that he didn’t realize it at the time, but some of his first lessons in management were gained by taking on responsibilities in his fraternity and in various student organizations that evoked a sense of being “on the hook.”

As CEO, Houston has had a chance to look behind the curtain at how things happen and has come to appreciate that problems don’t solve themselves. While individual people can make a huge difference, he explains that many of the challenges the world faces right now are inherently multidisciplinary ones, which sparked his interest in the MIT Schwarzman College of Computing.

He says that the mindset embodied by the college to connect computing with other disciplines resonated and inspired him to initiate his biggest philanthropic effort to date sooner rather than later because “we don’t have that much time to address these problems.”


Design’s new frontier

In the 1960s, the advent of computer-aided design (CAD) sparked a revolution in design. For his PhD thesis in 1963, MIT Professor Ivan Sutherland developed Sketchpad, a game-changing software program that enabled users to draw, move, and resize shapes on a computer. Over the course of the next few decades, CAD software reshaped how everything from consumer products to buildings and airplanes were designed.

“CAD was part of the first wave in computing in design. The ability of researchers and practitioners to represent and model designs using computers was a major breakthrough and still is one of the biggest outcomes of design research, in my opinion,” says Maria Yang, Gail E. Kendall Professor and director of MIT’s Ideation Lab.

Innovations in 3D printing during the 1980s and 1990s expanded CAD’s capabilities beyond traditional injection molding and casting methods, providing designers even more flexibility. Designers could sketch, ideate, and develop prototypes or models faster and more efficiently. Meanwhile, with the push of a button, software like that developed by Professor Emeritus David Gossard of MIT’s CAD Lab could solve equations simultaneously to produce a new geometry on the fly.

In recent years, mechanical engineers have expanded the computing tools they use to ideate, design, and prototype. More sophisticated algorithms and the explosion of machine learning and artificial intelligence technologies have sparked a second revolution in design engineering.

Researchers and faculty at MIT’s Department of Mechanical Engineering are utilizing these technologies to re-imagine how the products, systems, and infrastructures we use are designed. These researchers are at the forefront of the new frontier in design.

Computational design

Faez Ahmed wants to reinvent the wheel, or at least the bicycle wheel. He and his team at MIT’s Design Computation & Digital Engineering Lab (DeCoDE) use an artificial intelligence-driven design method that can generate entirely novel and improved designs for a range of products — including the traditional bicycle. They create advanced computational methods to blend human-driven design with simulation-based design.

“The focus of our DeCoDE lab is computational design. We are looking at how we can create machine learning and AI algorithms to help us discover new designs that are optimized based on specific performance parameters,” says Ahmed, an assistant professor of mechanical engineering at MIT.

For their work using AI-driven design for bicycles, Ahmed and his collaborator Professor Daniel Frey wanted to make it easier to design customizable bicycles, and by extension, encourage more people to use bicycles over transportation methods that emit greenhouse gases.

To start, the group gathered a dataset of 4,500 bicycle designs. Using this massive dataset, they tested the limits of what machine learning could do. First, they developed algorithms to group bicycles that looked similar together and explore the design space. They then created machine learning models that could successfully predict what components are key in identifying a bicycle style, such as a road bike versus a mountain bike.
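As a rough illustration of that pipeline (not the DeCoDE lab's actual code), the sketch below clusters synthetic bicycle feature vectors and then trains a classifier whose feature importances hint at which components distinguish styles. The feature columns and labels are invented stand-ins.

```python
# Illustrative sketch, not the DeCoDE lab's code: cluster bicycle designs by
# geometric features, then train a classifier to see which features best
# predict a style label such as road vs. mountain. All features are synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 4500                                     # size of the dataset in the article
features = rng.normal(size=(n, 6))           # stand-ins for wheel size, angles, etc.
styles = (features[:, 0] + features[:, 3] > 0).astype(int)  # toy road/mountain label

# 1) Explore the design space by grouping similar-looking bikes.
clusters = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(features)

# 2) Predict style from components and inspect which features matter most.
X_tr, X_te, y_tr, y_te = train_test_split(features, styles, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("style accuracy:", clf.score(X_te, y_te))
print("feature importances:", clf.feature_importances_.round(2))
```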

Once the algorithms were good enough at identifying bicycle designs and parts, the team proposed novel machine learning tools that could use this data to create a unique and creative design for a bicycle based on certain performance parameters and rider dimensions.

Ahmed used a generative adversarial network — or GAN — as the basis of this model. GAN models utilize neural networks that can create new designs based on vast amounts of data. However, using GAN models alone would result in homogeneous designs that lack novelty and can’t be assessed in terms of performance. To address these issues in design problems, Ahmed has developed a new method which he calls “PaDGAN,” performance augmented diverse GAN.
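The sketch below conveys the general flavor of a performance-augmented, diversity-promoting generator objective: a batch of generated designs is scored with a quality-weighted similarity kernel, and the generator is penalized when that kernel's determinant is small, that is, when samples are too similar or low-performing. The kernel, weighting, and placeholder inputs are illustrative assumptions, not the published PaDGAN objective verbatim.

```python
# Rough sketch of a performance-augmented, diversity-promoting generator term
# (illustrative only; not the published PaDGAN objective verbatim).
import torch

def diversity_quality_loss(samples, quality):
    """DPP-style penalty: a batch kernel weighted by per-sample quality scores.
    samples: (B, D) generated designs; quality: (B,) performance estimates."""
    sim = torch.exp(-torch.cdist(samples, samples) ** 2)   # similarity kernel
    q = quality.clamp(min=1e-3).sqrt()
    kernel = sim * q[:, None] * q[None, :]                  # quality weighting
    kernel = kernel + 1e-4 * torch.eye(len(samples))        # numerical jitter
    return -torch.logdet(kernel)  # lower = more diverse and higher-performing

# Stand-in usage inside a generator update (generator, discriminator, and
# performance estimator are placeholders here):
B, D = 16, 8
fake = torch.randn(B, D, requires_grad=True)     # would come from the generator
quality = torch.sigmoid(fake.sum(dim=1))         # would come from a surrogate model
adversarial_loss = torch.zeros(())               # would come from the discriminator
loss = adversarial_loss + 0.5 * diversity_quality_loss(fake, quality)
loss.backward()
```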

“When we apply this type of model, what we see is that we can get large improvements in the diversity, quality, as well as novelty of the designs,” Ahmed explains.

Using this approach, Ahmed’s team developed an open-source computational design tool for bicycles freely available on their lab website. They hope to further develop a set of generalizable tools that can be used across industries and products.

Longer term, Ahmed has his sights set on loftier goals. He hopes the computational design tools he develops could lead to “design democratization,” putting more power in the hands of the end user.

“With these algorithms, you can have more individualization where the algorithm assists a customer in understanding their needs and helps them create a product that satisfies their exact requirements,” he adds.

Using algorithms to democratize the design process is a goal shared by Stefanie Mueller, an associate professor in electrical engineering and computer science and mechanical engineering.

Personal fabrication

Platforms like Instagram give users the freedom to instantly edit their photographs or videos using filters. In one click, users can alter the palette, tone, and brightness of their content by applying filters that range from bold colors to sepia-toned or black-and-white. Mueller, X-Window Consortium Career Development Professor, wants to bring this concept of the Instagram filter to the physical world.

“We want to explore how digital capabilities can be applied to tangible objects. Our goal is to bring reprogrammable appearance to the physical world,” explains Mueller, director of the HCI Engineering Group based out of MIT’s Computer Science and Artificial Intelligence Laboratory.

Mueller’s team utilizes a combination of smart materials, optics, and computation to advance personal fabrication technologies that would allow end users to alter the design and appearance of the products they own. They tested this concept in a project they dubbed “Photo-Chromeleon.”

First, a mix of photochromic cyan, magenta, and yellow dyes is airbrushed onto an object — in this instance, a 3D sculpture of a chameleon. Using software they developed, the team sketches the exact color pattern they want to achieve on the object itself. An ultraviolet light shines on the object to activate the dyes.

To actually create the physical pattern on the object, Mueller has developed an optimization algorithm to use alongside a normal office projector outfitted with red, green, and blue LED lights. These lights shine on specific pixels on the object for a given period of time to physically change the makeup of the photochromic pigments.

“This fancy algorithm tells us exactly how long we have to shine the red, green, and blue light on every single pixel of an object to get the exact pattern we’ve programmed in our software,” says Mueller.
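As a toy stand-in for that optimization (not the Photo-Chromeleon implementation), assume each LED channel bleaches each dye at a fixed linear rate; the exposure times then come from a non-negative least-squares solve. The rate matrix and target values below are invented for illustration.

```python
# Toy version of the exposure-time optimization, under a deliberately simplified
# assumption: each LED channel bleaches each dye at a fixed linear rate. The
# rate matrix and target values below are invented for illustration.
import numpy as np
from scipy.optimize import nnls

# Hypothetical bleaching-rate matrix: rows = cyan, magenta, yellow dyes;
# columns = red, green, blue LED channels.
A = np.array([[0.90, 0.10, 0.05],
              [0.10, 0.80, 0.10],
              [0.05, 0.10, 0.85]])

def exposure_times(desired_dye_change):
    """Solve A @ t ~= desired change with non-negative exposure times t."""
    t, _residual = nnls(A, desired_dye_change)
    return t

# One pixel that should end up mostly magenta: bleach cyan and yellow strongly.
print(exposure_times(np.array([0.9, 0.1, 0.8])))
```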

Giving this freedom to the end user enables limitless possibilities. Mueller’s team has applied this technology to iPhone cases, shoes, and even cars. In the case of shoes, Mueller envisions a shoebox embedded with UV and LED light projectors. Users could put their shoes in the box overnight and the next day have a pair of shoes in a completely new pattern.

Mueller wants to expand her personal fabrication methods to the clothes we wear. Rather than utilize the light projection technique developed in the Photo-Chromeleon project, her team is exploring the possibility of weaving LEDs directly into clothing fibers, allowing people to change their shirt’s appearance as they wear it. These personal fabrication technologies could completely alter consumer habits.

“It’s very interesting for me to think about how these computational techniques will change product design on a high level,” adds Mueller. “In the future, a consumer could buy a blank iPhone case and update the design on a weekly or daily basis.”

Computational fluid dynamics and participatory design

Another team of mechanical engineers, including Sili Deng, the Brit (1961) & Alex (1949) d’Arbeloff Career Development Professor, is developing a different kind of design tool that could have a large impact on individuals in low- and middle-income countries across the world.

As Deng walked down the hallway of Building 1 on MIT’s campus, a monitor playing a video caught her eye. The video featured work done by mechanical engineers and MIT D-Lab on developing cleaner burning briquettes for cookstoves in Uganda. Deng immediately knew she wanted to get involved.

“As a combustion scientist, I’ve always wanted to work on such a tangible real-world problem, but the field of combustion tends to focus more heavily on the academic side of things,” explains Deng.

After reaching out to colleagues in MIT D-Lab, Deng joined a collaborative effort to develop a new cookstove design tool for the 3 billion people across the world who burn solid fuels to cook and heat their homes. These stoves often emit soot and carbon monoxide, leading not only to millions of deaths each year, but also worsening the world’s greenhouse gas emission problem.

The team is taking a three-pronged approach to developing this solution, using a combination of participatory design, physical modeling, and experimental validation to create a tool that will lead to the production of high-performing, low-cost energy products.

Deng and her team in the Deng Energy and Nanotechnology Group use physics-based modeling for the combustion and emission process in cookstoves.

“My team is focused on computational fluid dynamics. We use computational and numerical studies to understand the flow field where the fuel is burned and releases heat,” says Deng.

These flow mechanics are crucial to understanding how to minimize heat loss and make cookstoves more efficient, as well as learning how dangerous pollutants are formed and released in the process.

Using computational methods, Deng’s team performs three-dimensional simulations of the complex chemistry and transport coupling at play in the combustion and emission processes. They then use these simulations to build a combustion model for how fuel is burned and a pollution model that predicts carbon monoxide emissions.

Deng’s models are used by a group led by Daniel Sweeney in MIT D-Lab, which validates them experimentally against stove prototypes. Finally, Professor Maria Yang uses participatory design methods to integrate user feedback, ensuring the design tool can actually be used by people across the world.

The end goal for this collaborative team is not only to provide local manufacturers with a prototype they could produce themselves, but also to provide them with a tool that can tweak the design based on local needs and available materials.

Deng sees wide-ranging applications for the computational fluid dynamics her team is developing.

“We see an opportunity to use physics-based modeling, augmented with a machine learning approach, to come up with chemical models for practical fuels that help us better understand combustion. Therefore, we can design new methods to minimize carbon emissions,” she adds.

While Deng is utilizing simulations and machine learning at the molecular level to improve designs, others are taking a more macro approach.

Designing intelligent systems

When it comes to intelligent design, Navid Azizan thinks big. He hopes to help create future intelligent systems that are capable of making decisions autonomously by using the enormous amounts of data emerging from the physical world. From smart robots and autonomous vehicles to smart power grids and smart cities, Azizan focuses on the analysis, design, and control of intelligent systems.

Achieving such massive feats takes a truly interdisciplinary approach that draws upon various fields such as machine learning, dynamical systems, control, optimization, statistics, and network science, among others.

“Developing intelligent systems is a multifaceted problem, and it really requires a confluence of disciplines,” says Azizan, assistant professor of mechanical engineering with a dual appointment in MIT’s Institute for Data, Systems, and Society (IDSS). “To create such systems, we need to go beyond standard approaches to machine learning, such as those commonly used in computer vision, and devise algorithms that can enable safe, efficient, real-time decision-making for physical systems.”

For robot control to work in the complex dynamic environments that arise in the real world, real-time adaptation is key. If, for example, an autonomous vehicle is going to drive in icy conditions or a drone is operating in windy conditions, they need to be able to adapt to their new environment quickly.

To address this challenge, Azizan and his collaborators at MIT and Stanford University have developed a new algorithm that combines adaptive control, a powerful methodology from control theory, with meta learning, a new machine learning paradigm.

“This ‘control-oriented’ learning approach outperforms the existing ‘regression-oriented’ methods, which are mostly focused on just fitting the data, by a wide margin,” says Azizan.

Another critical aspect of deploying machine learning algorithms in physical systems that Azizan and his team hope to address is safety. Deep neural networks are a crucial part of autonomous systems. They are used for interpreting complex visual inputs and making data-driven predictions of future behavior in real time. However, Azizan urges caution.

“These deep neural networks are only as good as their training data, and their predictions can often be untrustworthy in scenarios not covered by their training data,” he says. Making decisions based on such untrustworthy predictions could lead to fatal accidents in autonomous vehicles or other safety-critical systems.

To avoid these potentially catastrophic events, Azizan proposes that it is imperative to equip neural networks with a measure of their uncertainty. When the uncertainty is high, they can then be switched to a “safe policy.”
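In control terms, the resulting logic is simple. A minimal sketch of such an uncertainty-gated fallback (hypothetical function names, not the SCOD implementation) might look like this:

```python
# Minimal sketch of the fallback logic described above (hypothetical function
# names, not the SCOD implementation): trust the network only when its
# uncertainty estimate is low, otherwise hand control to a conservative policy.
def choose_action(observation, predict_with_uncertainty, nominal_policy,
                  safe_policy, threshold=0.2):
    prediction, uncertainty = predict_with_uncertainty(observation)
    if uncertainty > threshold:           # likely out-of-distribution input
        return safe_policy(observation)   # e.g., slow down or pull over
    return nominal_policy(observation, prediction)
```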

In pursuit of this goal, Azizan and his collaborators have developed a new algorithm known as SCOD — Sketching Curvature for Out-of-Distribution Detection. This framework could be embedded within any deep neural network to equip it with a measure of its uncertainty.

“This algorithm is model-agnostic and can be applied to neural networks used in various kinds of autonomous systems, whether it’s drones, vehicles, or robots,” says Azizan.

Azizan hopes to continue working on algorithms for even larger-scale systems. He and his team are designing efficient algorithms to better control supply and demand in smart energy grids. According to Azizan, even if we create the most efficient solar panels and batteries, we can never achieve a sustainable grid powered by renewable resources without the right control mechanisms.

Mechanical engineers like Ahmed, Mueller, Deng, and Azizan serve as the key to realizing the next revolution of computing in design.

“MechE is in a unique position at the intersection of the computational and physical worlds,” Azizan says. “Mechanical engineers build a bridge between theoretical, algorithmic tools and real, physical world applications.”

Sophisticated computational tools, coupled with the ground truth mechanical engineers have in the physical world, could unlock limitless possibilities for design engineering, well beyond what could have been imagined in those early days of CAD.


MIT Lincoln Laboratory wins nine R&D 100 Awards for 2021

Nine technologies developed at MIT Lincoln Laboratory have been selected as R&D 100 Award winners for 2021. Since 1963, this awards program has recognized the 100 most significant technologies transitioned to use or introduced into the marketplace over the past year. The winners are selected by an independent panel of expert judges. R&D World, an online publication that serves research scientists and engineers worldwide, announces the awards.

The winning technologies are diverse in their applications. One technology empowers medics to initiate life-saving interventions at the site of an emergency; another could help first responders find survivors buried under rubble. Others present new approaches to building motors at the microscale, combining arrays of optical fibers, and reducing electromagnetic interference in circuit boards. A handful of the awardees leverage machine learning to enable novel capabilities.

Field-programmable imaging array

Advanced imagers, such as lidars and high-resolution wide-field-of-view sensors, need the ability to process huge amounts of data directly in the system, or “on chip.” However, developing this capability for novel or niche applications is prohibitively expensive. To help designers overcome this barrier, Lincoln Laboratory developed a field-programmable imaging array to make high-performance on-chip digital processing available to a broad spectrum of new imaging applications.

The technology serves as a universal digital back end, adaptable to any type of optical detector. Once a front end for a specific detector type is integrated, the design cycle for new applications of that detector type can be greatly shortened.

Free-space Quantum Network Link Architecture

The Free-space Quantum Network Link Architecture enables the generation, distribution, and interaction of entangled photons across free-space links. These capabilities are crucial for the development of emerging quantum network applications, such as networked computing and distributed sensing.

Three primary technologies make up this system: a gigahertz clock-rate, three-stage pump laser system; a source of spectrally pure and long-duration entangled photons; and a pump-forwarding architecture that synchronizes quantum systems across free-space links with high precision. This architecture was successfully demonstrated over a 3.2-kilometer free-space atmospheric link between two buildings on Hanscom Air Force Base.

Global Synthetic Weather Radar

The Global Synthetic Weather Radar (GSWR) provides radar-like weather imagery and radar-forward forecasts for regions where actual weather radars are not deployed or are limited in range. The technology generates these synthetic images by using advanced machine learning techniques that combine satellite, lightning, numerical weather model, and radar truth data to produce its predictions.

The laboratory collaborated with the U.S. Air Force on this technology, which will help mission planners schedule operations in remote regions of the world. GSWR’s reliable imagery and forecasts can also provide decision-making guidance for emergency responders and for the transportation, agriculture, and tourism industries.

Guided Ultrasound Intervention Device

The Guided Ultrasound Intervention Device (GUIDE) is the first technology to enable a medic or emergency medical technician to catheterize a major blood vessel in a pre-hospital environment. This procedure can save lives from hemorrhage after traumatic injury.

To use GUIDE, a medic scans a target area of a patient with an ultrasound probe integrated with the device. The device then uses artificial intelligence software to locate a femoral vessel in real time and direct the medic to it via a gamified display. Once in position, the device inserts a needle and guide wire into the vessel, after which the medic can easily complete the process of catheterization. Similar to the impact of automated external defibrillators, GUIDE can empower non-experts to take life-saving measures at the scene of an emergency.

Microhydraulic motors

Microhydraulic motors provide a new way of making things move on a microscale. These tiny actuators are constructed by layering thin, disc-shaped polymer sheets on top of microfabricated electrodes and inserting droplets of water and oil in between the layers. A voltage applied to the electrodes distorts the surface tension of the droplets, causing them to move and rotate the entire disk with them.

These precise, powerful, and efficient motors could enable shape-changing materials, self-folding displays, or microrobots for medical procedures.

Monolithic fiber array launcher

A fiber array launcher is a subsystem that holds an array of optical fibers in place and shapes the laser beams emanating from the fibers. Traditional launchers are composed of many small components, which can become misaligned with vibration and are made of inefficient materials that absorb light. To address these problems, the laboratory developed a monolithic fiber array launcher.

Built out of a single piece of glass, this launcher is one-tenth the volume of traditional arrays and less susceptible to thermo-optic effects, allowing it to scale to much higher laser powers and channel counts.

Motion Under Rubble Measured Using Radar

The Motion Under Rubble Measured Using Radar (MURMUR) technology was created to help rescue teams save lives in complex disaster environments. This remote-controlled system is mounted on a robotic ground vehicle for rapid deployment and uses radar to transmit low-frequency signals that penetrate walls, rubble, and debris. 

Signals that reflect back to the radar are digitized and then processed using both classical signal processing techniques and novel machine learning algorithms to determine the range in depth at which there is life-indicating motion, such as breathing, from someone buried under the rubble. Search-and-rescue personnel monitor these detections in real time on a mobile device, reducing time-consuming search efforts and enabling timely recovery of survivors.
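For intuition, the classical half of such a pipeline can be sketched as follows (a toy example, not MURMUR's actual processing): for each range bin, take a Fourier transform across slow time and flag bins with excess energy in the breathing band of roughly 0.1 to 0.5 Hz. The frame rate, band edges, and detection threshold are assumptions.

```python
# Toy sketch of the classical half of such a pipeline (not MURMUR's actual
# processing): for each range bin, look for slow periodic motion in the
# breathing band of roughly 0.1-0.5 Hz. Frame rate and threshold are assumed.
import numpy as np

def breathing_ranges(radar_cube, frame_rate_hz, band=(0.1, 0.5), snr_db=10.0):
    """radar_cube: (n_frames, n_range_bins) real-valued radar returns."""
    spectrum = np.abs(np.fft.rfft(radar_cube, axis=0)) ** 2
    freqs = np.fft.rfftfreq(radar_cube.shape[0], d=1.0 / frame_rate_hz)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    band_power = spectrum[in_band].sum(axis=0)
    noise_floor = np.median(spectrum, axis=0) * in_band.sum() + 1e-12
    detections = 10 * np.log10(band_power / noise_floor) > snr_db
    return np.flatnonzero(detections)   # range bins with likely breathing motion

# Synthetic example: 60 s of data at 20 frames/s, breathing at 0.3 Hz in bin 7.
t = np.arange(1200) / 20.0
cube = 0.1 * np.random.randn(1200, 16)
cube[:, 7] += np.sin(2 * np.pi * 0.3 * t)
print(breathing_ranges(cube, frame_rate_hz=20.0))   # expected output: [7]
```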

Spectrally Efficient Digital Logic

Spectrally Efficient Digital Logic (SEDL) is a set of digital logic building blocks that operate with intrinsically low electromagnetic interference (EMI) emissions.

EMI emissions cause interference between electrical components and present security risks. These emission levels are often discovered late in the electronics development process, once all the pieces are put together, and are thus costly to fix. SEDL is designed to reduce EMI problems while being compatible with traditional logic, giving designers the freedom to construct systems composed entirely of SEDL components or a hybrid of traditional logic and SEDL. SEDL components are also comparable to traditional logic in size, cost, and clock speed.

Traffic Flow Impact Tool

Developed in collaboration with the Federal Aviation Administration, the Traffic Flow Impact Tool helps air traffic control managers handle disruptions to air traffic caused by dangerous weather, such as thunderstorms.

The tool uses a novel machine learning technique to fuse multiple convective weather forecast models and compute a metric called permeability, a measure of the amount of usable airspace in a given area. These permeability predictions are displayed on a user interface and allow managers to plan ahead for weather impacts to air traffic.

Since 2010, Lincoln Laboratory has received 75 R&D 100 Awards. The awards are a recognition of the laboratory’s transfer of unclassified technologies to industry and government. Each year, many technology transitions also occur for classified projects. This transfer of technology is central to the laboratory’s role as a federally funded research and development center.

“Our R&D 100 Awards recognize the significant, ongoing technology development and transition success at the laboratory. We have had much success with our classified work as well,” says Eric Evans, the director of Lincoln Laboratory. “We are very proud of everyone involved in these programs.”

Editors of R&D World announced the 2021 R&D 100 Award winners at virtual ceremonies broadcast on October 19, 20, and 21.


Electrochemistry, from batteries to brains

Bilge Yildiz’s research impacts a wide range of technologies. The members of her lab study fuel cells, which convert hydrogen and oxygen into electricity (and water). They study electrolyzers, which go the other way, using electricity to convert water into hydrogen and oxygen. They study batteries. They study corrosion. They even study computers that attempt to mimic the way the brain processes information in learning. What brings all this together in her lab is the electrochemistry of ionic-electronic oxides and their interfaces.

“It may seem like we’ve been contributing to different technologies,” says Yildiz, MIT’s Breene M. Kerr (1951) Professor in the Department of Nuclear Science and Engineering (NSE) and the Department of Materials Science and Engineering, who was recently named a fellow of the American Physical Society. “It’s true. But fundamentally, it’s the same phenomena that we’re after in all these.” That is, the behavior of ions — charged atoms — in materials, particularly on surfaces and interfaces.

Yildiz’s comfort crossing scientific borders may come from her trek to where she is — or vice versa. She grew up in the seaside city of Izmir, Turkey, the daughter of two math teachers. She spent a lot of fun time by the sea, and also tinkered with her dad on repair and construction projects at home. She enjoyed studying and attended a science-focused high school, where she vividly recalls a particular two-year project. The city sat on a polluted bay, and her biology teacher connected her and a friend with a university professor who got them working on ways to clean the water using algae. “We had a lot of fun in the lab with limited supplies, collecting samples from the bay, and oxygenating them in the lab with algae,” she says. They wrote a report for the municipality. She’s no longer in biology, but “it made me aware of the research process and the importance of the environment,” she says, “that still stays.”

Before entering university, Yildiz decided to study nuclear energy engineering, because it sounded interesting, although she didn’t yet know the field’s importance for mitigating global warming. She ended up enjoying the combination of math, physics, and engineering. Turkey didn’t have much of a nuclear energy program, so she ventured to MIT for her PhD in nuclear engineering, studying artificial intelligence for the safe operation of nuclear power plants. She liked applying computer science to nuclear systems, but came to realize she preferred the physical sciences over algorithms.

Yildiz stayed at MIT for a postdoctoral fellowship, between the nuclear engineering and mechanical engineering departments, studying electrochemistry in fuel cells. “My postdoc advisors at the time were, I think, taking a risk by hiring me, because I really didn’t know anything” about electrochemistry, she says. “It was an extremely helpful and defining experience for me — eye-opening — and allowed me to move in the direction of electrochemistry and materials.” She then headed in another new direction, at Argonne National Laboratory in Illinois, learning about X-ray spectroscopy, blasting materials with powerful synchrotron X-rays to probe their structure and chemistry.

At MIT, where Yildiz returned in 2007, she still uses Argonne’s instruments, as well as other synchrotrons in the United States and abroad. In a typical experiment, she and her group might first create a material that could be used, for example, in a fuel cell. They’ll then use X-rays in her lab or at synchrotrons to characterize its surface under various operational conditions. They’ll build computational models on the atomic or electron level to help interpret the results, and to guide the next experiment. In fuel cells, this work allowed them to identify and circumvent a surface degradation problem. Connecting the dots between surface chemistry and performance allows her to predict better material surfaces to increase the efficiency and durability of fuel cells and batteries. “These are findings that we have built over many years,” she says, “from having identified the problem to identifying the reasons for that problem, then to proposing some solutions for that problem.”

Solid oxide fuel cells use materials called perovskite oxides to catalyze reactions with oxygen. Substitutions — for instance, strontium atoms — added to the crystal enhance its ability to transport electrons and oxygen ions. But these atoms, also called dopants, often precipitate at the surface of the material, reducing both its stability and its performance. Yildiz’s group uncovered the reason: The negatively charged dopants migrate toward positively charged oxygen vacancies near the crystal’s surface. They then engineered a solution. Removing some of the excess oxygen vacancies by oxidizing the surface with another element, hafnium, prevented the movement of strontium to the surface, keeping the fuel cell functioning longer and more efficiently.

“The coupling of mechanics to chemistry has also been a very exciting theme in our research,” she says. She has investigated the effects of strain on materials’ ion transport and surface catalytic activity properties. She’s found that certain types of elastic strain can facilitate diffusion of ions as well as surface reactivity. Accelerating ion transport and surface reactions improves the performance of solid oxide fuel cells and batteries.

In her recent work, she considers analog, brain-guided computing. Most computers we use daily are digital, flipping electrical switches on and off, but the brain operates with many orders of magnitude more energy efficiency, in part because it stores and processes information in the same location, and does so by varying the local electrical properties on a continuum. Yildiz is using small ions to vary the resistance of a given material continuously, as ions enter or exit the material. She controls the ions electrochemically, similar to a process in the brain. In effect, she’s replicating some functionality of biological synapses, in particular the strengthening and weakening of synapses, by creating tiny, energy-efficient batteries.

She is collaborating with colleagues across the Institute — Ju Li from NSE, Jesus del Alamo from the Department of Electrical Engineering and Computer Science, and Michale Fee and Ila Fiete from the Department of Brain and Cognitive Sciences. Their team is investigating different ions, materials, and device geometries, and is working with the MIT Quest for Intelligence to translate learning rules from brain studies to the design of brain-guided machine intelligence hardware.

In retrospect, Yildiz says, the leap from her formal training in nuclear engineering into electrochemistry and materials was a big one. “I work on a research problem, because it sparks my curiosity, I am very motivated and excited to work on it and it makes me happy. I never think whether this problem is easy or difficult when I am working on it. I really just want to do it, no matter what. When I look back now, I notice this leap was not trivial.” She adds, “But now I also see that we do this in our faculty work all the time. We identify new questions that are not necessarily in our direct expertise. And we learn, contribute, and evolve.”

Describing her return to MIT, after an “exciting and gratifying” time at Argonne, Yildiz says she preferred the intellectual flexibility of having her own academic lab — as well as the chance to teach and mentor her students and postdocs. “We get to work with young students who are energetic, motivated, smart, hardworking,” she says. “Luckily, they don’t know what’s difficult. Like I didn’t.”


Dexterous robotic hands manipulate thousands of objects with ease

At just one year old, a baby is more dexterous than a robot. Sure, machines can do more than just pick up and put down objects, but we’re not quite there when it comes to replicating a natural pull toward exploratory or sophisticated dexterous manipulation.

Artificial intelligence firm OpenAI gave it a try with Dactyl (meaning “finger,” from the Greek word “daktylos”), using their humanoid robot hand to solve a Rubik’s cube with software that’s a step toward more general AI, and a step away from the common single-task mentality. DeepMind created “RGB-Stacking,” a vision-based system that challenges a robot to learn how to grab items and stack them. 

Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), in the ever-present quest to get machines to replicate human abilities, created a framework that’s more scaled up: a system that can reorient over 2,000 different objects, with the robotic hand facing both upwards and downwards. This ability to manipulate anything from a cup to a tuna can to a Cheez-It box could help the hand quickly pick-and-place objects in specific ways and locations — and even generalize to unseen objects. 

This deft “handiwork” — which is usually limited to single tasks and upright positions — could be an asset in speeding up logistics and manufacturing, helping with common demands such as packing objects into slots for kitting, or dexterously manipulating a wider range of tools. The team used a simulated, anthropomorphic hand with 24 degrees of freedom, and showed evidence that the system could be transferred to a real robotic system in the future. 

“In industry, a parallel-jaw gripper is most commonly used, partially due to its simplicity in control, but it’s physically unable to handle many tools we see in daily life,” says MIT CSAIL PhD student Tao Chen, member of the MIT Improbable AI Lab and the lead researcher on the project. “Even using a plier is difficult because it can’t dexterously move one handle back and forth. Our system will allow a multi-fingered hand to dexterously manipulate such tools, which opens up a new area for robotics applications.”

This type of “in-hand” object reorientation has been a challenging problem in robotics, due to the large number of motors to be controlled and the frequent change in contact state between the fingers and the objects. And with over 2,000 objects, the model had a lot to learn. 

The problem becomes even trickier when the hand is facing downwards. Not only does the robot need to manipulate the object, it also has to work against gravity so the object doesn’t fall.

The team found that a simple approach could solve complex problems. They used a model-free reinforcement learning algorithm (meaning the system has to figure out value functions from interactions with the environment) with deep learning, and something called a “teacher-student” training method. 

For this to work, the “teacher” network is trained on information about the object and robot that’s easily available in simulation but not in the real world, such as the location of fingertips or object velocity. To ensure that the robots can work outside of the simulation, the teacher’s knowledge is distilled into a “student” network that relies only on observations that can be acquired in the real world, such as depth images captured by cameras, object pose, and the robot’s joint positions. They also used a “gravity curriculum,” in which the robot first learns the skill in a zero-gravity environment and the controller is then slowly adapted to normal gravity, a gradual progression that substantially improved the overall performance.
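The gravity-curriculum idea itself is easy to sketch: the toy training loop below ramps gravity from zero toward Earth gravity over the first half of training. The environment setter and rollout helpers are hypothetical, not the authors' API.

```python
# Toy sketch of a gravity curriculum (hypothetical environment API, not the
# authors' code): start in zero gravity and ramp toward Earth gravity.
def gravity_schedule(step, total_steps, g_final=-9.81):
    """Linear ramp from 0 to g_final over the first half of training."""
    fraction = min(1.0, step / (0.5 * total_steps))
    return fraction * g_final

def train(env, policy, total_steps=100_000):
    for step in range(total_steps):
        env.set_gravity(gravity_schedule(step, total_steps))  # assumed setter
        rollout = env.collect_rollout(policy)                 # assumed helper
        policy.update(rollout)                                # RL update of choice
```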

Perhaps counterintuitively, a single controller (the “brain” of the robot, so to speak) could reorient a large number of objects it had never seen before, with no knowledge of their shapes.

“We initially thought that visual perception algorithms for inferring shape while the robot manipulates the object was going to be the primary challenge,” says MIT Professor Pulkit Agrawal, an author on the paper about the research. “To the contrary, our results show that one can learn robust control strategies that are shape-agnostic. This suggests that visual perception may be far less important for manipulation than what we are used to thinking, and simpler perceptual processing strategies might suffice.” 

Many small, round objects (apples, tennis balls, marbles) had close to 100 percent success rates when reoriented with the hand facing up or down, while the lowest success rates, unsurprisingly, came with more complex objects, like a spoon, a screwdriver, or scissors, at closer to 30 percent.

Beyond bringing the system out into the wild, the team notes that, since success rates varied with object shape, training the model on object shape information could improve its performance in the future.

Chen wrote a paper about the research alongside MIT CSAIL PhD student Jie Xu and MIT Professor Pulkit Agrawal. The research is funded by the Toyota Research Institute, an Amazon Research Award, and the DARPA Machine Common Sense program. It will be presented at the 2021 Conference on Robot Learning (CoRL).


Toward speech recognition for uncommon spoken languages

Automated speech-recognition technology has become more common with the popularity of virtual assistants like Siri, but many of these systems only perform well with the most widely spoken of the world’s roughly 7,000 languages.

Because these systems largely don’t exist for less common languages, the millions of people who speak them are cut off from many technologies that rely on speech, from smart home devices to assistive technologies and translation services.

Recent advances have enabled machine learning models that can learn the world’s uncommon languages, which lack the large amount of transcribed speech needed to train algorithms. However, these solutions are often too complex and expensive to be applied widely.

Researchers at MIT and elsewhere have now tackled this problem by developing a simple technique that reduces the complexity of an advanced speech-learning model, enabling it to run more efficiently and achieve higher performance.

Their technique involves removing unnecessary parts of a common, but complex, speech recognition model and then making minor adjustments so it can recognize a specific language. Because only small tweaks are needed once the larger model is cut down to size, it is much less expensive and time-consuming to teach this model an uncommon language.

This work could help level the playing field and bring automatic speech-recognition systems to many areas of the world where they have yet to be deployed. The systems are important in some academic environments, where they can assist students who are blind or have low vision, and are also being used to improve efficiency in health care settings through medical transcription and in the legal field through court reporting. Automatic speech-recognition can also help users learn new languages and improve their pronunciation skills. This technology could even be used to transcribe and document rare languages that are in danger of vanishing.  

“This is an important problem to solve because we have amazing technology in natural language processing and speech recognition, but taking the research in this direction will help us scale the technology to many more underexplored languages in the world,” says Cheng-I Jeff Lai, a PhD student in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and first author of the paper.

Lai wrote the paper with fellow MIT PhD students Alexander H. Liu, Yi-Lun Liao, Sameer Khurana, and Yung-Sung Chuang; his advisor and senior author James Glass, senior research scientist and head of the Spoken Language Systems Group in CSAIL; MIT-IBM Watson AI Lab research scientists Yang Zhang, Shiyu Chang, and Kaizhi Qian; and David Cox, the IBM director of the MIT-IBM Watson AI Lab. The research will be presented at the Conference on Neural Information Processing Systems in December.

Learning speech from audio

The researchers studied a powerful neural network that has been pretrained to learn basic speech from raw audio, called wav2vec 2.0.

A neural network is a series of algorithms that can learn to recognize patterns in data; modeled loosely on the human brain, neural networks are arranged into layers of interconnected nodes that process data inputs.

Wav2vec 2.0 is a self-supervised learning model, so it learns to recognize a spoken language after being fed a large amount of unlabeled speech. Only a few minutes of transcribed speech are then needed to finetune it for recognition. This opens the door for speech recognition of uncommon languages that lack large amounts of transcribed speech, like Wolof, which is spoken by 5 million people in West Africa.

However, the neural network has about 300 million individual connections, so it requires a massive amount of computing power to train on a specific language.

The researchers set out to improve the efficiency of this network by pruning it. Just like a gardener cuts off superfluous branches, neural network pruning involves removing connections that aren’t necessary for a specific task, in this case, learning a language. Lai and his collaborators wanted to see how the pruning process would affect this model’s speech recognition performance.

After pruning the full neural network to create a smaller subnetwork, they trained the subnetwork with a small amount of labeled Spanish speech and then again with French speech, a process called finetuning.  

“We would expect these two models to be very different because they are finetuned for different languages. But the surprising part is that if we prune these models, they will end up with highly similar pruning patterns. For French and Spanish, they have 97 percent overlap,” Lai says.

They ran experiments using 10 languages, from Romance languages like Italian and Spanish to languages that have completely different alphabets, like Russian and Mandarin. The results were the same — the finetuned models all had a very large overlap.
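Measuring that kind of overlap is straightforward once each finetuned model's pruning pattern is expressed as a binary "keep" mask over its weights. The snippet below (with random stand-in weights rather than real wav2vec 2.0 checkpoints) shows one way to compare two masks.

```python
# Small sketch of how the overlap between two pruning patterns can be measured:
# express each finetuned model's pruning as a binary "keep" mask and compare.
# Random stand-in weights here, not real wav2vec 2.0 checkpoints.
import torch

def prune_mask(weights, sparsity=0.7):
    """Keep the largest-magnitude weights; mark the rest as pruned."""
    threshold = weights.abs().flatten().quantile(sparsity)
    return weights.abs() > threshold

def mask_overlap(mask_a, mask_b):
    """Fraction of positions where both subnetworks agree (keep or prune)."""
    return (mask_a == mask_b).float().mean().item()

pretrained = torch.randn(300, 1000)                                  # stand-in layer
finetuned_spanish = pretrained + 0.01 * torch.randn_like(pretrained)
finetuned_french = pretrained + 0.01 * torch.randn_like(pretrained)
print(f"overlap: {mask_overlap(prune_mask(finetuned_spanish), prune_mask(finetuned_french)):.1%}")
```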

A simple solution

Drawing on that finding, they developed PARP (Prune, Adjust, and Re-Prune), a simple technique that improves the efficiency and boosts the performance of the neural network.

In the first step, a pretrained speech recognition neural network like wav2vec 2.0 is pruned by removing unnecessary connections. Then in the second step, the resulting subnetwork is adjusted for a specific language, and then pruned again. During this second step, connections that had been removed are allowed to grow back if they are important for that particular language.

Because connections are allowed to grow back during the second step, the model only needs to be finetuned once, rather than over multiple iterations, which vastly reduces the amount of computing power required.
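Put together, the recipe described above looks roughly like the sketch below, using simple magnitude pruning on a generic PyTorch model. The finetune callback and sparsity level are placeholders; this illustrates the steps, not the authors' released implementation.

```python
# Sketch of the Prune, Adjust, and Re-Prune recipe as described above, using
# simple magnitude pruning on a generic PyTorch model. The finetune callback
# and sparsity level are placeholders; this illustrates the steps, not the
# authors' released implementation.
import torch

def magnitude_masks(model, sparsity=0.5):
    """Mark the smallest-magnitude weights in each weight matrix for removal."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:                    # prune weight matrices, keep biases
            threshold = p.abs().flatten().quantile(sparsity)
            masks[name] = (p.abs() > threshold).float()
    return masks

def apply_masks(model, masks):
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])        # zero out the pruned connections

def parp(model, finetune, sparsity=0.5):
    masks = magnitude_masks(model, sparsity)   # 1) prune the pretrained network
    apply_masks(model, masks)
    finetune(model)                            # 2) adjust for the target language;
                                               #    zeroed weights still receive
                                               #    gradients, so important ones
                                               #    can grow back during finetuning
    final_masks = magnitude_masks(model, sparsity)   # ...then re-prune once
    apply_masks(model, final_masks)
    return model, final_masks
```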

Testing the technique

The researchers put PARP to the test against other common pruning techniques and found that it outperformed them all for speech recognition. It was especially effective when there was only a very small amount of transcribed speech to train on.

They also showed that PARP can create one smaller subnetwork that can be finetuned for 10 languages at once, eliminating the need to prune separate subnetworks for each language, which could also reduce the expense and time required to train these models.

Moving forward, the researchers would like to apply PARP to text-to-speech models and also see how their technique could improve the efficiency of other deep learning networks.

“There are increasing needs to put large deep-learning models on edge devices. Having more efficient models allows these models to be squeezed onto more primitive systems, like cell phones. Speech technology is very important for cell phones, for instance, but having a smaller model does not necessarily mean it is computing faster. We need additional technology to bring about faster computation, so there is still a long way to go,” Zhang says.

Self-supervised learning (SSL) is changing the field of speech processing, so making SSL models smaller without degrading performance is a crucial research direction, says Hung-yi Lee, associate professor in the Department of Electrical Engineering and the Department of Computer Science and Information Engineering at National Taiwan University, who was not involved in this research.

“PARP trims the SSL models, and at the same time, surprisingly improves the recognition accuracy. Moreover, the paper shows there is a subnet in the SSL model, which is suitable for ASR tasks of many languages. This discovery will stimulate research on language/task agnostic network pruning. In other words, SSL models can be compressed while maintaining their performance on various tasks and languages,” he says.

This work is partially funded by the MIT-IBM Watson AI Lab and the 5k Language Learning Project.

3 Questions: Blending computing with other disciplines at MIT

The demand for computing-related training is at an all-time high. At MIT, there has been a remarkable tide of interest in computer science programs, with heavy enrollment from students studying everything from economics to life sciences who are eager to learn how computational techniques and methodologies can be applied within their primary field.

Launched in 2020, the Common Ground for Computing Education was created through the MIT Stephen A. Schwarzman College of Computing to meet the growing need for enhanced curricula that connect computer science and artificial intelligence with different domains. In order to advance this mission, the Common Ground is bringing experts across MIT together and facilitating collaborations among multiple departments to develop new classes and approaches that blend computing topics with other disciplines.

Dan Huttenlocher, dean of the MIT Schwarzman College of Computing, and the chairs of the Common Ground Standing Committee — Jeff Grossman, head of the Department of Materials Science and Engineering and the Morton and Claire Goulder and Family Professor of Environmental Systems; and Asu Ozdaglar, deputy dean of academics for the MIT Schwarzman College of Computing, head of the Department of Electrical Engineering and Computer Science, and the MathWorks Professor of Electrical Engineering and Computer Science — discuss here the objectives of the Common Ground, pilot subjects that are underway, and ways they’re engaging faculty to create new curricula for MIT’s class of “computing bilinguals.”

Q: What are the objectives of the Common Ground and how does it fit into the mission of the MIT Schwarzman College of Computing?

Huttenlocher: One of the core components of the college mission is to educate students who are fluent in both the “language” of computing and that of other disciplines. Machine learning classes, for example, attract a lot of students outside of electrical engineering and computer science (EECS) majors. These students are interested in machine learning for modeling within the context of their fields of interest, rather than in the inner workings of machine learning itself as taught in Course 6. So, we need new approaches to how we develop computing curricula in order to provide students with a thorough grounding in computing that is relevant to their interests: not just to enable them to use computational tools, but to understand conceptually how such tools can be developed and applied in their primary field, whether it be science, engineering, humanities, business, or design.

The core goals of the Common Ground are to infuse computing education throughout MIT in a coordinated manner, as well as to serve as a platform for multi-departmental collaborations. All classes and curricula developed through the Common Ground are intended to be created and offered jointly by multiple academic departments to meet ‘common’ needs. We’re bringing the forefront of rapidly-changing computer science and artificial intelligence fields together with the problems and methods of other disciplines, so the process has to be collaborative. As much as computing is changing thinking in the disciplines, the disciplines are changing the way people develop new computing approaches. It can’t be a stand-alone effort — otherwise it won’t work.

Q: How is the Common Ground facilitating collaborations and engaging faculty across MIT to develop new curricula?

Grossman: The Common Ground Standing Committee was formed to oversee the activities of the Common Ground and is charged with evaluating how best to support and advance program objectives. There are 29 members on the committee — all are faculty experts in various computing areas, and they represent 18 academic departments across all five MIT schools and the college. The structure of the committee very much aligns with the mission of the Common Ground in that it draws from all parts of the Institute. Members are organized into subcommittees currently centered on three primary focus areas: fundamentals of computational science and engineering; fundamentals of programming/computational thinking; and machine learning, data science, and algorithms. The subcommittees, with extensive input from departments, framed prototypes for what Common Ground subjects would look like in each area, and a number of classes have already been piloted to date.

It has been wonderful working with colleagues from different departments. The level of commitment that everyone on the committee has put into this effort has truly been amazing to see, and I share their enthusiasm for pursuing opportunities in computing education.

Q: Can you tell us more about the subjects that are already underway?

Ozdaglar: So far, we have four offerings for students to choose from: in the fall, there’s Linear Algebra and Optimization with the Department of Mathematics and EECS, and Programming Skills and Computational Thinking in-Context with the Experimental Study Group and EECS; Modeling with Machine Learning: From Algorithms to Applications in the spring, with disciplinary modules developed by multiple engineering departments and MIT Supply Chain Management; and Introduction to Computational Science and Engineering during both semesters, which is a collaboration between the Department of Aeronautics and Astronautics and the Department of Mathematics. 

We have had students from a range of majors take these classes, including mechanical engineering, physics, chemical engineering, economics, and management, among others. The response has been very positive. It is very exciting to see MIT students having access to these unique offerings. Our goal is to enable them to frame disciplinary problems using a rich computational framework, which is one of the objectives of the Common Ground.

We are planning to expand Common Ground offerings in the years to come and welcome ideas for new subjects. Some ideas that we currently have in the works include classes on causal inference, creative programming, and data visualization with communication. In addition, this fall, we put out a call for proposals to develop new subjects. We invited instructors from all across the campus to submit ideas for pilot computing classes that are useful across a range of areas and support the educational mission of individual departments. The selected proposals will receive seed funding from the Common Ground to assist in the design, development, and staffing of new, broadly-applicable computing subjects and revision of existing subjects in alignment with the Common Ground’s objectives. We are looking explicitly to facilitate opportunities in which multiple departments would benefit from coordinated teaching.

Taming the data deluge

An oncoming tsunami of data threatens to overwhelm huge data-rich research projects in areas ranging from the tiny neutrino and exploding supernovas to the mysteries deep within the brain.

When LIGO picks up a gravitational-wave signal from a distant collision of black holes and neutron stars, a clock starts ticking for capturing the earliest possible light that may accompany them: time is of the essence in this race. Data collected from electrical sensors monitoring brain activity are outpacing computing capacity. Information from the Large Hadron Collider (LHC)’s smashed particle beams will soon exceed 1 petabit per second. 

To tackle this approaching data bottleneck in real time, a team of researchers from nine institutions, led by the University of Washington and including MIT, has received $15 million in funding to establish the Accelerated AI Algorithms for Data-Driven Discovery (A3D3) Institute. From MIT, the research team includes Philip Harris, assistant professor of physics, who will serve as the deputy director of the A3D3 Institute; Song Han, assistant professor of electrical engineering and computer science, who will serve as a co-PI of A3D3; and Erik Katsavounidis, senior research scientist with the MIT Kavli Institute for Astrophysics and Space Research.

Infused with this five-year Harnessing the Data Revolution Big Idea grant, and jointly funded by the Office of Advanced Cyberinfrastructure, A3D3 will focus on three data-rich fields: multi-messenger astrophysics, high-energy particle physics, and brain imaging neuroscience. By pairing AI algorithms with new processors, A3D3 seeks to speed up AI algorithms for solving fundamental problems in collider physics, neutrino physics, astronomy, gravitational-wave physics, computer science, and neuroscience.

“I am very excited about the new Institute’s opportunities for research in nuclear and particle physics,” says Laboratory for Nuclear Science Director Boleslaw Wyslouch. “Modern particle detectors produce an enormous amount of data, and we are looking for extraordinarily rare signatures. The application of extremely fast processors to sift through these mountains of data will make a huge difference in what we will measure and discover.”

The seeds of A3D3 were planted in 2017, when Harris and his colleagues at Fermilab and CERN decided to integrate real-time AI algorithms to process the incredible rates of data at the LHC. Through email correspondence with Han, Harris’ team built a compiler, HLS4ML, that could run an AI algorithm in nanoseconds.

“Before the development of HLS4ML, the fastest processing that we knew of was roughly a millisecond per AI inference, maybe a little faster,” says Harris. “We realized all the AI algorithms were designed to solve much slower problems, such as image and voice recognition. To get to nanosecond inference timescales, we recognized we could make smaller algorithms and rely on custom implementations with Field Programmable Gate Array (FPGA) processors in an approach that was largely different from what others were doing.”
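
For a sense of what that workflow looks like in practice, the sketch below uses the open-source hls4ml package to convert a deliberately tiny trained Keras model into an FPGA project through high-level synthesis. The model, layer sizes, and output directory are placeholders, and the API details may differ between hls4ml versions.

```python
# A hedged sketch of the kind of workflow HLS4ML enables: converting a small,
# trained Keras network into an FPGA project via high-level synthesis.
import hls4ml
from tensorflow import keras

# A deliberately tiny classifier, the kind that can reach very low latencies on an FPGA.
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(16,)),
    keras.layers.Dense(5, activation="softmax"),
])

config = hls4ml.utils.config_from_keras_model(model, granularity="model")
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir="hls4ml_prj"
)
hls_model.compile()   # builds a C simulation of the generated firmware for validation
```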

A few months later, Harris presented their research at a physics faculty meeting, where Katsavounidis became intrigued. Over coffee in Building 7, they discussed combining Harris’ FPGA work with Katsavounidis’s use of machine learning for finding gravitational waves. FPGAs and other new processor types, such as graphics processing units (GPUs), accelerate AI algorithms to more quickly analyze huge amounts of data.

“I had worked with the first FPGAs that were out in the market in the early ’90s and have witnessed first-hand how they revolutionized front-end electronics and data acquisition in big high-energy physics experiments I was working on back then,” recalls Katsavounidis. “The ability to have them crunch gravitational-wave data has been in the back of my mind since joining LIGO over 20 years ago.”

Two years ago they received their first grant, and the University of Washington’s Shih-Chieh Hsu joined in. The team initiated the Fast Machine Lab, published about 40 papers on the subject, built the group to about 50 researchers, and “launched a whole industry of how to explore a region of AI that has not been explored in the past,” says Harris. “We basically started this without any funding. We’ve been getting small grants for various projects over the years. A3D3 represents our first large grant to support this effort.”  

“What makes A3D3 so special and suited to MIT is its exploration of a technical frontier, where AI is implemented not in high-level software, but rather in lower-level firmware, reconfiguring individual gates to address the scientific question at hand,” says Rob Simcoe, director of MIT Kavli Institute for Astrophysics and Space Research and the Francis Friedman Professor of Physics. “We are in an era where experiments generate torrents of data. The acceleration gained from tailoring reprogrammable, bespoke computers at the processor level can advance real-time analysis of these data to new levels of speed and sophistication.”

Huge Data from the Large Hadron Collider

With data rates already exceeding 500 terabits per second, the LHC processes more data than any other scientific instrument on earth. Its future aggregate data rates will soon exceed 1 petabit per second, the biggest data rate in the world. 

“Through the use of AI, A3D3 aims to perform advanced analyses, such as anomaly detection and particle reconstruction, on all collisions happening 40 million times per second,” says Harris.

The goal is to find within all of this data a way to identify the few collisions out of the 3.2 billion collisions per second that could reveal new forces, explain how dark matter is formed, and complete the picture of how fundamental forces interact with matter. Processing all of this information requires a customized computing system capable of interpreting the collider information within ultra-low latencies.  
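
As a generic illustration of what anomaly detection means in this setting, one common approach is to train an autoencoder on ordinary collisions and flag events that it reconstructs poorly. The sketch below is hypothetical, not A3D3 code, and its feature count and layer sizes are arbitrary.

```python
# A generic illustration of anomaly detection for event selection: train an
# autoencoder on ordinary collisions and flag events it reconstructs poorly.
import torch
import torch.nn as nn

class EventAutoencoder(nn.Module):
    def __init__(self, n_features: int = 57, latent: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, latent))
        self.decoder = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, n_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

def anomaly_score(model: nn.Module, event: torch.Tensor) -> float:
    """Mean squared reconstruction error; large values suggest an atypical collision."""
    with torch.no_grad():
        return torch.mean((model(event) - event) ** 2).item()
```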

“The challenge of running this on all of the hundreds of terabits per second in real time is daunting and requires a complete overhaul of how we design and implement AI algorithms,” says Harris. “With large increases in the detector resolution leading to data rates that are even larger, the challenge of finding the one collision, among many, will become even more daunting.”

The Brain and the Universe

Thanks to advances in techniques such as medical imaging and electrical recordings from implanted electrodes, neuroscience is also gathering ever-larger amounts of data on how the brain’s neural networks respond to stimuli and carry out motor functions. A3D3 plans to develop and implement high-throughput and low-latency AI algorithms to process, organize, and analyze massive neural datasets in real time, in order to probe brain function and enable new experiments and therapies.

With Multi-Messenger Astrophysics (MMA), A3D3 aims to quickly identify astronomical events by efficiently processing data from gravitational waves, gamma-ray bursts, and neutrinos picked up by telescopes and detectors. 

The A3D3 team also includes a multidisciplinary group of 15 other researchers from the lead institution, the University of Washington, along with Caltech, Duke University, Purdue University, UC San Diego, the University of Illinois Urbana-Champaign, the University of Minnesota, and the University of Wisconsin-Madison. The institute will include neutrino research at IceCube and DUNE, as well as visible astronomy at the Zwicky Transient Facility, and will organize deep-learning workshops and boot camps to train students and researchers to contribute to the framework and widen the use of fast AI strategies.

“We have reached a point where detector network growth will be transformative, both in terms of event rates and in terms of astrophysical reach and ultimately, discoveries,” says Katsavounidis. “‘Fast’ and ‘efficient’ is the only way to fight the ‘faint’ and ‘fuzzy’ that is out there in the universe, and the path for getting the most out of our detectors. A3D3 on one hand is going to bring production-scale AI to gravitational-wave physics and multi-messenger astronomy; but on the other hand, we aspire to go beyond our immediate domains and become the go-to place across the country for applications of accelerated AI to data-driven disciplines.”

3 Questions: Investigating a long-standing neutrino mystery

Neutrinos are one of the most mysterious members of the Standard Model, a framework for describing fundamental forces and particles in nature. While they are among the most abundant known particles in the universe, they interact very rarely with matter, making their detection a challenging experimental feat. One of the long-standing puzzles in neutrino physics comes from the Mini Booster Neutrino Experiment (MiniBooNE), which ran from 2002 to 2017 at the Fermi National Accelerator Laboratory, or Fermilab, in Illinois. MiniBooNE observed significantly more neutrino interactions that produce electrons than one would expect given our best knowledge of the Standard Model — and physicists are trying to understand why.

In 2007, researchers developed the idea for a follow-up experiment, MicroBooNE, which recently finished collecting data at Fermilab. MicroBooNE is an ideal test of the MiniBooNE excess thanks to its use of a novel detector technology known as the liquid argon time projection chamber (LArTPC), which yields high-resolution pictures of the particles that get created in neutrino interactions.  

Physics graduate students Nicholas Kamp and Lauren Yates, along with Professor Janet Conrad, all within the MIT Laboratory for Nuclear Science, have played a leading role in MicroBooNE’s deep-learning-based search for an excess of neutrinos in the Fermilab Booster Neutrino Beam. In this interview, Kamp discusses the future of the MiniBooNE anomaly within the context of MicroBooNE’s latest findings.

Q: Why is the MiniBooNE anomaly a big deal?

A: One of the big open questions in neutrino physics concerns the possible existence of a hypothetical particle called the “sterile neutrino.” Finding a new particle would be a very big deal because it can give us clues to the larger theory that explains the many particles we see. The most common explanation of the MiniBooNE excess involves the addition of such a sterile neutrino to the Standard Model. Due to the effects of neutrino oscillations, this sterile neutrino would manifest itself as an enhancement of electron neutrinos in MiniBooNE.
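
For the quantitatively inclined, the conventional two-flavor approximation for short-baseline “3+1” oscillations gives the appearance probability below, in which mixing with a fourth neutrino mass state lets a muon neutrino occasionally show up as an electron neutrino at MiniBooNE’s baseline.

```latex
% Two-flavor short-baseline ("3+1") appearance probability, with L in km and E in GeV;
% the mixing amplitude is set by how strongly the electron and muon flavors
% couple to the fourth mass state.
P(\nu_\mu \to \nu_e) \;\simeq\; \sin^2(2\theta_{\mu e})\,
  \sin^2\!\left( \frac{1.27\,\Delta m^2_{41}\,[\mathrm{eV}^2]\; L\,[\mathrm{km}]}{E\,[\mathrm{GeV}]} \right),
\qquad
\sin^2(2\theta_{\mu e}) = 4\,|U_{e4}|^2\,|U_{\mu 4}|^2 .
```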

There are many additional anomalies seen in neutrino physics that indicate this particle might exist. However, it is difficult to explain these anomalies along with MiniBooNE through a single sterile neutrino — the full picture doesn’t quite fit. Our group at MIT is interested in new physics models that can potentially explain this full picture.

Q: What is our current understanding of the MiniBooNE excess?

A: Our understanding has progressed significantly of late thanks to developments in both the experimental and theoretical realms.

Our group has worked with physicists from Harvard, Columbia, and Cambridge universities to explore new sources of photons that can appear in a theoretical model that also has a 20 percent electron signature. We developed a “mixed model” that involves two types of exotic neutrinos — one which morphs to electron flavor and one which decays to a photon. This work is forthcoming in Physical Review D.

On the experimental end, more recent MicroBooNE results — including a deep-learning-based analysis in which our MIT group played an important role — observed no excess of neutrinos that produce electrons in the MicroBooNE detector. Keeping in mind the level at which MicroBooNE can make the measurement, this suggests that the MiniBooNE excess cannot be attributed entirely to extra neutrino interactions. If it isn’t electrons, then it must be photons, because that is the only particle that can produce a similar signature in MiniBooNE. But we are sure it is not photons produced by interactions that we know about because those are restricted to a low level. So, they must be coming from something new, such as the exotic neutrino decay in the mixed model. Next, MicroBooNE is working on a search that could isolate and identify these additional photons. Stay tuned!

Q: You mentioned that your group is involved in deep-learning-based MicroBooNE analysis. Why use deep learning in neutrino physics?

A: When humans look at images of cats, they can tell the difference between species without much difficulty. Similarly, when physicists look at images coming from a LArTPC, they can tell the difference between the particles produced in neutrino interactions without much difficulty. However, due to the nuance of the differences, both tasks turn out to be difficult for conventional algorithms.

MIT is a nexus of deep-learning ideas. Recently, for example, it became the site of the National Science Foundation AI Institute for Artificial Intelligence and Fundamental Interactions. It made sense for our group to build on the extensive local expertise in the field. We have also had the opportunity to work with fantastic groups at SLAC, Tufts University, Columbia University, and IIT, each with a strong knowledge base in the ties between deep learning and neutrino physics.

One of the key ideas in deep learning is that of a “neural network,” an algorithm that makes decisions (such as identifying particles in a LArTPC) based on previous exposure to a suite of training data. Our group produced the first paper on particle identification using deep learning in neutrino physics, proving it to be a powerful technique. This is a major reason why the recently released results of MicroBooNE’s deep-learning-based analysis place strong constraints on an electron neutrino interpretation of the MiniBooNE excess.
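
For illustration only, a convolutional classifier of the general kind used for this sort of particle identification might look like the following sketch; it is a hypothetical toy rather than the MicroBooNE analysis network, and the class list and image size are placeholders.

```python
# A hypothetical toy convolutional classifier of the general kind used for
# particle identification in LArTPC images; not the MicroBooNE network.
import torch
import torch.nn as nn

class ParticleID(nn.Module):
    def __init__(self, n_classes: int = 5):   # e.g. electron, photon, muon, proton, pion
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(nn.Flatten(), nn.LazyLinear(n_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (batch, 1, height, width) wire-vs-time image
        return self.classifier(self.features(x))

# Class scores for a batch of four random example "images".
scores = ParticleID()(torch.randn(4, 1, 128, 128))
```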

All in all, it’s very fortunate that much of the groundwork for this analysis was done in the AI-rich environment at MIT.
