How well do explanation methods for machine-learning models work?

Imagine a team of physicians using a neural network to detect cancer in mammogram images. Even if this machine-learning model seems to be performing well, it might be focusing on image features that are accidentally correlated with tumors, like a watermark or timestamp, rather than actual signs of tumors.

To test these models, researchers use “feature-attribution methods,” techniques that are supposed to tell them which parts of the image are the most important for the neural network’s prediction. But what if the attribution method misses features that are important to the model? Since the researchers don’t know which features are important to begin with, they have no way of knowing that their evaluation method isn’t effective.

To help solve this problem, MIT researchers have devised a process to modify the original data so they will be certain which features are actually important to the model. Then they use this modified dataset to evaluate whether feature-attribution methods can correctly identify those important features.

They find that even the most popular methods often miss the important features in an image, and some methods barely manage to perform as well as a random baseline. This could have major implications, especially if neural networks are applied in high-stakes situations like medical diagnoses. If the network isn’t working properly, and attempts to catch such anomalies aren’t working properly either, human experts may have no idea they are being misled by the faulty model, explains lead author Yilun Zhou, an electrical engineering and computer science graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL).

“All these methods are very widely used, especially in some really high-stakes scenarios, like detecting cancer from X-rays or CT scans. But these feature-attribution methods could be wrong in the first place. They may highlight something that doesn’t correspond to the true feature the model is using to make a prediction, which we found to often be the case. If you want to use these feature-attribution methods to justify that a model is working correctly, you better ensure the feature-attribution method itself is working correctly in the first place,” he says.

Zhou wrote the paper with fellow EECS graduate student Serena Booth, Microsoft Research researcher Marco Tulio Ribeiro, and senior author Julie Shah, who is an MIT professor of aeronautics and astronautics and the director of the Interactive Robotics Group in CSAIL.

Focusing on features

In image classification, each pixel in an image is a feature that the neural network can use to make predictions, so there are literally millions of possible features it can focus on. If researchers want to design an algorithm to help aspiring photographers improve, for example, they could train a model to distinguish photos taken by professional photographers from those taken by casual tourists. This model could be used to assess how much the amateur photos resemble the professional ones, and even provide specific feedback on improvement. Researchers would want this model to focus on identifying artistic elements in professional photos during training, such as color space, composition, and postprocessing. But it just so happens that a professionally shot photo likely contains a watermark of the photographer’s name, while few tourist photos have it, so the model could just take the shortcut of finding the watermark.

“Obviously, we don’t want to tell aspiring photographers that a watermark is all you need for a successful career, so we want to make sure that our model focuses on the artistic features instead of the watermark presence. It is tempting to use feature attribution methods to analyze our model, but at the end of the day, there is no guarantee that they work correctly, since the model could use artistic features, the watermark, or any other features,” Zhou says.

“We don’t know what those spurious correlations in the dataset are. There could be so many different things that might be completely imperceptible to a person, like the resolution of an image,” Booth adds. “Even if it is not perceptible to us, a neural network can likely pull out those features and use them to classify. That is the underlying problem. We don’t understand our datasets that well, but it is also impossible to understand our datasets that well.”

The researchers modified the dataset to weaken all the correlations between the original image and the data labels, which guarantees that none of the original features will be important anymore.

Then, they added a new feature to each image, one so obvious that the neural network has to focus on it to make its prediction, such as bright rectangles of different colors for different image classes.

“We can confidently assert that any model achieving really high confidence has to focus on that colored rectangle that we put in. Then we can see if all these feature-attribution methods rush to highlight that location rather than everything else,” Zhou says.
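The dataset modification described above can be sketched in a few lines of NumPy. This is only an illustration, not the researchers' code: it assumes the correlations are removed by assigning labels at random, and the function name, the color table, and the fixed rectangle position are all hypothetical.

```python
import numpy as np

# Hypothetical class-to-color table; the article only mentions
# "bright rectangles of different colors for different image classes".
CLASS_COLORS = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]], dtype=np.uint8)


def make_ground_truth_dataset(images, num_classes, box=(0, 0, 8, 8), seed=0):
    """Return (modified_images, new_labels, mask).

    images: uint8 array of shape (N, H, W, 3).
    box: (top, left, height, width) of the injected rectangle (assumed fixed).
    """
    if num_classes > len(CLASS_COLORS):
        raise ValueError("not enough colors for the requested number of classes")
    rng = np.random.default_rng(seed)
    # Assumption: labels drawn independently of image content carry no
    # correlation with any of the original pixels.
    new_labels = rng.integers(0, num_classes, size=len(images))
    top, left, h, w = box
    modified = images.copy()
    # Stamp every image with the color that encodes its new label.
    colors = CLASS_COLORS[new_labels][:, None, None, :]  # shape (N, 1, 1, 3)
    modified[:, top:top + h, left:left + w, :] = colors
    # Boolean mask marking where the only informative feature lives.
    mask = np.zeros(images.shape[1:3], dtype=bool)
    mask[top:top + h, left:left + w] = True
    return modified, new_labels, mask
```

A model that classifies such images with high accuracy can only be relying on the stamped rectangle, so the returned mask serves as the known answer that a saliency map can be checked against.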

“Especially alarming” results

They applied this technique to a number of different feature-attribution methods. For image classification, these methods produce what is known as a saliency map, which shows the concentration of important features spread across the entire image. For instance, if the neural network is classifying images of birds, the saliency map might show that 80 percent of the important features are concentrated around the bird’s beak.

After removing all the correlations in the image data, they manipulated the photos in several ways, such as blurring parts of the image, adjusting the brightness, or adding a watermark. If the feature-attribution method is working correctly, nearly 100 percent of the important features should be located around the area the researchers manipulated.

The results were not encouraging. None of the feature-attribution methods got close to the 100 percent goal, most barely reached a random baseline level of 50 percent, and some even performed worse than the baseline in some instances. So, even though the new feature is the only one the model could use to make a prediction, the feature-attribution methods sometimes fail to pick that up.
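One plausible way to turn "percent of important features in the manipulated area" into a number is to measure how much of a saliency map's total weight falls inside the known region. The sketch below assumes that definition; the article does not spell out the exact metric or how its 50 percent baseline was computed, and the function name is hypothetical.

```python
import numpy as np


def attribution_in_region(saliency, mask):
    """Fraction of total absolute attribution that falls inside `mask`.

    saliency: 2-D array of per-pixel attribution scores.
    mask: boolean 2-D array, True where the image was manipulated.
    """
    weights = np.abs(np.asarray(saliency, dtype=float))
    total = weights.sum()
    if total == 0:
        return 0.0  # an all-zero map highlights nothing
    return float(weights[mask].sum() / total)
```

Under this definition, a map that spreads its weight uniformly scores the fraction of the image the mask covers, while a map concentrated entirely inside the manipulated region scores 1.0.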

“None of these methods seem to be very reliable, across all different types of spurious correlations. This is especially alarming because, in natural datasets, we don’t know which of those spurious correlations might apply,” Zhou says. “It could be all sorts of factors. We thought that we could trust these methods to tell us, but in our experiment, it seems really hard to trust them.”

All feature-attribution methods they studied were better at detecting an anomaly than the absence of an anomaly. In other words, these methods could find a watermark more easily than they could identify that an image does not contain a watermark. So, in this case, it would be more difficult for humans to trust a model that gives a negative prediction.

The team’s work shows that it is critical to test feature-attribution methods before applying them to a real-world model, especially in high-stakes situations.

“Researchers and practitioners may employ explanation techniques like feature-attribution methods to engender a person’s trust in a model, but that trust is not founded unless the explanation technique is first rigorously evaluated,” Shah says. “An explanation technique may be used to help calibrate a person’s trust in a model, but it is equally important to calibrate a person’s trust in the explanations of the model.”

Moving forward, the researchers want to use their evaluation procedure to study more subtle or realistic features that could lead to spurious correlations. Another area of work they want to explore is helping humans understand saliency maps so they can make better decisions based on a neural network’s predictions.

This research was supported, in part, by the National Science Foundation.


“Hey, Alexa! Are you trustworthy?”

A family gathers around their kitchen island to unbox the digital assistant they just purchased. They will be more likely to trust this new voice-user interface, which might be a smart speaker like Amazon’s Alexa or a social robot like Jibo, if it exhibits some humanlike social behaviors, according to a new study by researchers in MIT’s Media Lab.

The researchers found that family members tend to think a device is more competent and emotionally engaging if it can exhibit social cues, like moving to orient its gaze at a speaking person. In addition, their study revealed that branding — specifically, whether the manufacturer’s name is associated with the device — has a significant effect on how members of a family perceive and interact with different voice-user interfaces.

When a device had a higher level of social embodiment, such as the ability to give verbal and nonverbal social cues through motion or expression, family members also interacted with one another more frequently while engaging with the device as a group, the researchers found.

Their results could help designers create voice-user interfaces that are more engaging and more likely to be used by members of a family in the home, while also improving the transparency of these devices. The researchers also outline ethical concerns that could come from certain personality and embodiment designs.

“These devices are new technology coming into the home and they are still very under-explored,” says Anastasia Ostrowski, a research assistant in the Personal Robotics Group in the Media Lab, and lead author of the paper. “Families are in the home, so we were very interested in looking at this from a generational approach, including children and grandparents. It was super interesting for us to understand how people are perceiving these, and how families interact with these devices together.”

Coauthors include Vasiliki Zygouras, a recent Wellesley College graduate working in the Personal Robotics Group at the time of this research; Research Scientist Hae Won Park; Cornell University graduate student Jenny Fu; and senior author Cynthia Breazeal, professor of media arts and sciences, director of MIT RAISE, and director of the Personal Robotics Group, as well as a developer of the Jibo robot. The paper is published today in Frontiers in Robotics and AI.

“The human-centered insights of this work are relevant to the design of all kinds of personified AI devices, from smart speakers and intelligent agents to personal robots,” says Breazeal.

Investigating interactions

This work grew out of an earlier study where the researchers explored how people use voice-user interfaces at home. At the start of the study, users familiarized themselves with three devices before taking one home for a month. The researchers noticed that people spent more time interacting with a Jibo social robot than they did with the smart speakers, Amazon Alexa and Google Home. They wondered why people engaged more with the social robot.

To get to the bottom of this, they designed three experiments that involved family members interacting as a group with different voice-user interfaces. Thirty-four families, comprising 92 people between ages 4 and 69, participated in the studies.

The experiments were designed to mimic a family’s first encounter with a voice-user interface. Families were video recorded as they interacted with three devices, working through a list of 24 actions (like “ask about the weather” or “try to learn the agent’s opinions”). Then they answered questions about their perception of the devices and categorized the voice-user interfaces’ personalities.

In the first experiment, participants interacted with a Jibo robot, Amazon Echo, and Google Home, with no modifications. Most found the Jibo to be far more outgoing, dependable, and sympathetic. Because the users perceived that Jibo had a more humanlike personality, they were more likely to interact with it, Ostrowski explains.

An unexpected result

In the second experiment, researchers set out to understand how branding affected participants’ perspectives. They changed the “wake word” (the word the user says aloud to engage the device) of the Amazon Echo to “Hey, Amazon!” instead of “Hey, Alexa!,” but kept the “wake word” the same for the Google Home (“Hey, Google!”) and the Jibo robot (“Hey, Jibo!”). They also provided participants with information about each manufacturer. When branding was taken into account, users viewed Google as more trustworthy than Amazon, despite the fact that the devices were very similar in design and functionality.

“It also drastically changed how much people thought the Amazon device was competent or like a companion,” Ostrowski says. “I was not expecting it to have that big of a difference between the first and second study. We didn’t change any of the abilities, how they function, or how they respond. Just the fact that they were aware the device is made by Amazon made a huge difference in their perceptions.”

Changing the “wake word” of a device can have ethical implications. A personified name, which can make a device seem more social, could mislead users by masking the connection between the device and the company that made it, which is also the company that now has access to the user’s data, she says.

In the third experiment, the team wanted to see how interpersonal movement affected the interactions. For instance, the Jibo robot turns its gaze to the individual who is speaking. For this study, the researchers used the Jibo along with an Amazon Echo Show (a rectangular screen) with the modified wake word “Hey, Computer,” and an Amazon Echo Spot (a sphere with a circular screen) that had a rotating flag on top which sped up when someone called its wake word, “Hey, Alexa!”

Users found the modified Amazon Echo Spot to be no more engaging than the Amazon Echo Show, suggesting that repetitive movement without social embodiment may not be an effective way to increase user engagement, Ostrowski says.

Fostering deeper relationships

Deeper analysis of the third study also revealed that users interacted more among themselves, like glancing at each other, laughing together, or having side conversations, when the device they were engaging with had more social abilities.

“In the home, we have been wondering how these systems promote engagement between users. That is always a big concern for people: How are these devices going to shape people’s relationships? We want to design systems that can promote a more flourishing relationship between people,” Ostrowski says.

The researchers used their insights to lay out several voice-user interface design considerations, including the importance of developing warm, outgoing, and thoughtful personalities; understanding how the wake word influences user acceptance; and conveying nonverbal social cues through movement.

With these results in hand, the researchers want to continue exploring how families engage with voice-user interfaces that have varying levels of functionality. For instance, they might conduct a study with three different social robots. They would also like to replicate these studies in a real-world environment and explore which design features are best suited for specific interactions.

This research was funded by the Media Lab Consortia.


Q&A: Dolapo Adedokun on computer technology, Ireland, and all that jazz

Adedolapo Adedokun has a lot to look forward to in 2023. After completing his degree in electrical engineering and computer science next spring, he will travel to Ireland to undertake an MS in intelligent systems at Trinity College Dublin as MIT’s fourth student to receive the prestigious George J. Mitchell Scholarship. But there’s more to Adedokun, who goes by Dolapo, than just academic achievement. Besides being a talented computer scientist, the senior is an accomplished musician, an influential member of student government, and an anime fan.

Q: What excites you the most about going to Ireland to study for a year?

A: One of the reasons I was interested in Ireland was when I learned about Music Generation, a national music education initiative in Ireland, with the goal of giving every child in Ireland access to the arts through access to music tuition, performance opportunities, and music education in and outside of the classroom. It made me think, “Wow, this is a country that recognizes the importance of arts and music education and has invested to make it accessible for people of all backgrounds.” I am inspired by this initiative and wish it was something I could have had growing up.

I am also really inspired by the work of Louis Stewart, an amazing jazz guitarist who was born and raised in Dublin. I am excited to explore his musical influences and to dive into the rich musical community of Dublin. I hope to join a jazz band, maybe a trio or a quartet, and perform all around the city, immersing myself in the rich Irish musical scene, but also sharing my own styles and musical influences with the community there.

Q: Of course, while you’re there, you’ll be working on your MS in intelligent systems. I’m intrigued by your invention of a smart-home system that lets users layer different melodies as they enter and leave a building. Can you tell us a little more about that system: how it works, how you envision users interacting with it and experiencing it, and what you learned from developing it?

A: Funny enough, it actually started as a system I worked on in my freshman year in 6.08 (Introduction to Embedded Systems) with a few classmates. We called it Smart HOMiE, an IoT [internet-of-things] Arduino smart-home device that gathered basic information like location, weather, and interfaced with Amazon Alexa. I had forgotten about having worked on it until I took 21M.080 (Introduction to Music Technology) and 6.033 (Computer System Engineering) in my junior year, and began to learn about the creative applications of machine learning and computer science in areas like audio synthesis and digital instrument design. I learned about amazing projects like Google Magenta’s Tone Transfer ML — models that use machine learning models to transform sounds into legitimate musical instruments. Learning about this unique intersection combining music and technology, I began to think about bigger questions, like, “What kind of creative future can technology create? How can technology enable anyone to be expressive?”

When I had some downtime while being at home for a year, I wanted to play around with some of the audio synthesis tools I had learned about. I took Smart HOMiE and upgraded it a bit — made it a bit more musical. It worked in three main steps. First, multiple people could sing and record melodies that the device would save and store. Then, using a few pitch correction and audio synthesis Python libraries, Smart HOMiE corrected the recorded melodies until they fit together, or generally fit inside the same key, in music terms. Lastly, it then would combine the melodies, add some harmony or layer the track over a backing track, and by the end, you’ve made something really unique and expressive. It was definitely a bit scrappy, but it was one of my first times messing around and exploring all the work that has already been done by amazing people in this space. Technology has this incredible potential to make anyone a creator — I’d like to build the tools to make it happen.
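The key-matching step in that description can be illustrated with a small sketch. It is not Smart HOMiE's code: real pitch correction works on recorded audio, whereas this simplified version assumes each melody has already been transcribed to MIDI note numbers, and the function names are hypothetical.

```python
# Semitone offsets of a major scale relative to its root note.
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)


def nearest_in_key(note, allowed):
    """Return the pitch closest to `note` whose pitch class is allowed.

    Ties between a lower and a higher candidate resolve to the lower one.
    """
    for distance in range(7):  # any pitch class is at most 6 semitones away
        for candidate in (note - distance, note + distance):
            if candidate % 12 in allowed:
                return candidate
    raise ValueError("scale must contain at least one pitch class")


def snap_to_key(midi_notes, root=0, scale=MAJOR_SCALE):
    """Move every MIDI note to the nearest pitch in the given key."""
    allowed = {(root + step) % 12 for step in scale}
    return [nearest_in_key(note, allowed) for note in midi_notes]


# Two separately recorded melodies, both pulled into C major (root=0).
melody_a = snap_to_key([60, 61, 63, 66])
melody_b = snap_to_key([67, 68, 70])
```

Once every melody sits in the same key, layering them or placing them over a backing track is far less likely to produce clashing notes.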

Q: You’re a jazz instrumentalist yourself. Tell us more!

A: I’ve always had an affinity for music, but haven’t always felt like I could become a musician. I had played saxophone in middle school but it never really stuck. When I got to MIT, I was fortunate enough to take 21M.051 (Fundamentals of Music) and dive into proper music theory for the first time. It was in that class that I was exposed to jazz and completely fell in love. I’ll never forget walking back to New House from Barker Library in my freshman year and stumbling upon “Undercurrent,” by Bill Evans and Jim Hall — I think that was when I decided I wanted to learn jazz guitar.

Jazz, and in particular improvisation, has taught me so much about what it means to be creative: to be willing to experiment, take risks, build upon the work of others, and accept failure — all skills that I wholeheartedly believe have made me a better technologist and leader. Most importantly, though, I think music and jazz have taught me patience and discipline, and that mastery of a skill takes a lifetime. I’d be lying if I said I was satisfied with where I am currently at, but each day, I’m eager to take one step forward towards my goals.

Q: You’ve focused in on music and arts education, and the potential of technology to bolster both. Is there a particularly influential class, technology, or teacher in your past that you can point to as a change-maker in your life?

A: Wow, tough question! I think there are a few inflection points that have really been change-makers for me. The first was in high school when I first learned about Guitar Hero, the music rhythm video game that started as a project in the MIT Media Lab attempting to bring the joy of music-making to people of all backgrounds. It was then that I was able to see the multidisciplinary outreach of technology in service of others.

The next I would say was taking 6.033 at MIT. From the first day of class, Professor [Katrina] LaCurts emphasized understanding the people we design for. That we ought to see system design as inherently people-oriented — before we think of designing a system, we must first consider the people that will be using them. We must consider their goals, their personas, their backgrounds, the barriers that they face, and most importantly, the consequences of our design and implementation choices. I envision a future where music, arts, and the creative process are accessible to everyone, and I believe 6.033 has given me the foundation to build the technology to reach that goal.

Q: You’ve also developed a passion for broadband infrastructure, which at first glance, people might not connect with music and education, your other two focuses. Why is broadband such an important factor?

A: Before we can think about the potential of technology to democratize accessibility to music and the arts, we first have to take a step back and think about accessibility. What communities have more and less access to the proper technology that we often take for granted? I think broadband is just one factor in the realm of the bigger problem, which is accessibility, particularly in minority and low-income communities. I see technology as being the key to democratizing access to music and the arts for people of all backgrounds, but that technology can only be the key if the foundational infrastructure is in place for all people to take advantage of it. Just like I learned in 6.033, that means understanding the barriers of the people and communities with the least access and investing in crucial, basic technological resources like equitable broadband internet access.

Q: Between your work on the Undergraduate Student Advisory Group in EECS, the Harvard/MIT Cooperative Society, the MIT Chapter of the National Society of Black Engineers, and of course all your research and many academic interests, many readers must wonder if you ever eat or sleep! How have you balanced your busy MIT life and maintained a sense of self while accomplishing so much as an undergraduate?

A: Great question! I’ll start by saying it took me a while to figure out. There were semesters where I had to drop classes and/or extracurricular commitments to find some sense of balance. It’s always difficult, being surrounded by the world’s brightest students who are all doing incredible and amazing things, to not feel like you should add one more class or an extra UROP.

I think the most important thing, though, is to stay true to you — figuring out the things that bring you joy, that excite you, and how much of those commitments is reasonable to take on each semester. I’m not a student who can take a million-and-one classes, research, internships, and clubs all at the same time — but that’s totally OK. It took me a while to find the things I enjoyed, and understand the academic load that’s appropriate for me each semester, but once I did, I was happier than ever before. I realized things like playing tennis and basketball, jamming with friends, and even sneaking in a few episodes of anime here and there are really important to me. As long as I can look back each week, month, semester, and year and say I’ve taken a step forward towards my academic, social, and music goals, even just the tiniest amount, then I think I am taking steps in the right direction.


The promise and pitfalls of artificial intelligence explored at TEDxMIT event

Scientists, students, and community members came together last month to discuss the promise and pitfalls of artificial intelligence at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) for the fourth TEDxMIT event held at MIT. 

Attendees were entertained and challenged as they explored “the good and bad of computing,” explained CSAIL Director Professor Daniela Rus, who organized the event with John Werner, an MIT fellow and managing director of Link Ventures; MIT sophomore Lucy Zhao; and grad student Jessica Karaguesian. “As you listen to the talks today,” Rus told the audience, “consider how our world is made better by AI, and also our intrinsic responsibilities for ensuring that the technology is deployed for the greater good.”

Rus mentioned some new capabilities that could be enabled by AI: an automated personal assistant that could monitor your sleep phases and wake you at the optimal time, as well as on-body sensors that monitor everything from your posture to your digestive system. “Intelligent assistance can help empower and augment our lives. But these intriguing possibilities should only be pursued if we can simultaneously resolve the challenges that these technologies bring,” said Rus. 

The next speaker, CSAIL principal investigator and professor of electrical engineering and computer science Manolis Kellis, started off by suggesting what sounded like an unattainable goal — using AI to “put an end to evolution as we know it.” Looking at it from a computer science perspective, he said, what we call evolution is basically a brute force search. “You’re just exploring all of the search space, creating billions of copies of every one of your programs, and just letting them fight against each other. This is just brutal. And it’s also completely slow. It took us billions of years to get here.” Might it be possible, he asked, to speed up evolution and make it less messy?

The answer, Kellis said, is that we can do better, and that we’re already doing better: “We’re not killing people like Sparta used to, throwing the weaklings off the mountain. We are truly saving diversity.”

Knowledge, moreover, is now being widely shared, passed on “horizontally” through accessible information sources, he noted, rather than “vertically,” from parent to offspring. “I would like to argue that competition in the human species has been replaced by collaboration. Despite having a fixed cognitive hardware, we have software upgrades that are enabled by culture, by the 20 years that our children spend in school to fill their brains with everything that humanity has learned, regardless of which family came up with it. This is the secret of our great acceleration” — the fact that human advancement in recent centuries has vastly out-clipped evolution’s sluggish pace.

The next step, Kellis said, is to harness insights about evolution in order to combat an individual’s genetic susceptibility to disease. “Our current approach is simply insufficient,” he added. “We’re treating manifestations of disease, not the causes of disease.” A key element in his lab’s ambitious strategy to transform medicine is to identify “the causal pathways through which genetic predisposition manifests. It’s only by understanding these pathways that we can truly manipulate disease causation and reverse the disease circuitry.” 

Kellis was followed by Aleksander Madry, MIT professor of electrical engineering and computer science and CSAIL principal investigator, who told the crowd, “progress in AI is happening, and it’s happening fast.” Computer programs can routinely beat humans in games like chess, poker, and Go. So should we be worried about AI surpassing humans? 

Madry, for one, is not afraid — or at least not yet. And some of that reassurance stems from research that has led him to the following conclusion: Despite its considerable success, AI, especially in the form of machine learning, is lazy. “Think about being lazy as this kind of smart student who doesn’t really want to study for an exam. Instead, what he does is just study all the past years’ exams and just look for patterns. Instead of trying to actually learn, he just tries to pass the test. And this is exactly the same way in which current AI is lazy.”

A machine-learning model might recognize grazing sheep, for instance, simply by picking out pictures that have green grass in them. If a model is trained to identify fish from photos of anglers proudly displaying their catches, Madry explained, “the model figures out that if there’s a human holding something in the picture, I will just classify it as a fish.” The consequences can be more serious for an AI model intended to pick out malignant tumors. If the model is trained on images containing rulers that indicate the size of tumors, the model may end up selecting only those photos that have rulers in them.

This leads to Madry’s biggest concerns about AI in its present form. “AI is beating us now,” he noted. “But the way it does it [involves] a little bit of cheating.” He fears that we will apply AI “in some way in which this mismatch between what the model actually does versus what we think it does will have some catastrophic consequences.” People relying on AI, especially in potentially life-or-death situations, need to be much more mindful of its current limitations, Madry cautioned.

There were 10 speakers altogether, and the last to take the stage was MIT associate professor of electrical engineering and computer science and CSAIL principal investigator Marzyeh Ghassemi, who laid out her vision for how AI could best contribute to general health and well-being. But in order for that to happen, its models must be trained on accurate, diverse, and unbiased medical data.

It’s important to focus on the data, Ghassemi stressed, because these models are learning from us. “Since our data is human-generated … a neural network is learning how to practice from a doctor. But doctors are human, and humans make mistakes. And if a human makes a mistake, and we train an AI from that, the AI will, too. Garbage in, garbage out. But it’s not like the garbage is distributed equally.”

She pointed out that many subgroups receive worse care from medical practitioners, and members of these subgroups die from certain conditions at disproportionately high rates. This is an area, Ghassemi said, “where AI can actually help. This is something we can fix.” Her group is developing machine-learning models that are robust, private, and fair. What’s holding them back is neither algorithms nor GPUs. It’s data. Once we collect reliable data from diverse sources, Ghassemi added, we might start reaping the benefits that AI can bring to the realm of health care.

In addition to CSAIL speakers, there were talks from members across MIT’s Institute for Data, Systems, and Society; the MIT Mobility Initiative; the MIT Media Lab; and the SENSEable City Lab.

The proceedings concluded on that hopeful note. Rus and Werner then thanked everyone for coming. “Please continue to reflect about the good and bad of computing,” Rus urged. “And we look forward to seeing you back here in May for the next TEDxMIT event.”

The exact theme of the spring 2022 gathering will have something to do with “superpowers.” But — if December’s mind-bending presentations were any indication — the May offering is almost certain to give its attendees plenty to think about. And maybe provide the inspiration for a startup or two.

Physics and the machine-learning “black box”

Machine-learning algorithms are often referred to as a “black box.” Once data are put into an algorithm, it’s not always known exactly how the algorithm arrives at its prediction. This can be particularly frustrating when things go wrong. A new mechanical engineering (MechE) course at MIT teaches students how to tackle the “black box” problem, through a combination of data science and physics-based engineering.

In class 2.C01 (Physical Systems Modeling and Design Using Machine Learning), Professor George Barbastathis demonstrates how mechanical engineers can use their unique knowledge of physical systems to keep algorithms in check and develop more accurate predictions.

“I wanted to take 2.C01 because machine-learning models are usually a ‘black box,’ but this class taught us how to construct a system model that is informed by physics so we can peek inside,” explains Crystal Owens, a mechanical engineering graduate student who took the course in spring 2021.

As chair of the Committee on the Strategic Integration of Data Science into Mechanical Engineering, Barbastathis has had many conversations with mechanical engineering students, researchers, and faculty to better understand the challenges and successes they’ve had using machine learning in their work.

“One comment we heard frequently was that these colleagues can see the value of data science methods for problems they are facing in their mechanical engineering-centric research; yet they are lacking the tools to make the most out of it,” says Barbastathis. “Mechanical, civil, electrical, and other types of engineers want a fundamental understanding of data principles without having to convert themselves to being full-time data scientists or AI researchers.”

Additionally, as mechanical engineering students move on from MIT to their careers, many will need to manage data scientists on their teams someday. Barbastathis hopes to set these students up for success with class 2.C01.

Bridging MechE and the MIT Schwarzman College of Computing

Class 2.C01 is part of the MIT Schwarzman College of Computing’s Common Ground for Computing Education. The goal of these classes is to connect computer science and artificial intelligence with other disciplines, for example, connecting data science with physics-based disciplines like mechanical engineering. Students take the course alongside 6.C01 (Modeling with Machine Learning: from Algorithms to Applications), taught by professors of electrical engineering and computer science Regina Barzilay and Tommi Jaakkola.

The two classes are taught concurrently during the semester, exposing students to both fundamentals in machine learning and domain-specific applications in mechanical engineering.

In 2.C01, Barbastathis highlights how complementary physics-based engineering and data science are. Physical laws present a number of ambiguities and unknowns, ranging from temperature and humidity to electromagnetic forces. Data science can be used to predict these physical phenomena. Meanwhile, having an understanding of physical systems helps ensure the resulting output of an algorithm is accurate and explainable.

“What’s needed is a deeper combined understanding of the associated physical phenomena and the principles of data science, machine learning in particular, to close the gap,” adds Barbastathis. “By combining data with physical principles, the new revolution in physics-based engineering is relatively immune to the ‘black box’ problem facing other types of machine learning.”

Equipped with a working knowledge of machine-learning topics covered in class 6.C01 and a deeper understanding of how to pair data science with physics, students are charged with developing a final project that solves for an actual physical system.

Developing solutions for real-world physical systems

For their final project, students in 2.C01 are asked to identify a real-world problem that requires data science to address the ambiguity inherent in physical systems. After obtaining all relevant data, students are asked to select a machine-learning method, implement their chosen solution, and present and critique the results.

Topics this past semester ranged from weather forecasting to the flow of gas in combustion engines, with two student teams drawing inspiration from the ongoing Covid-19 pandemic.

Owens and her teammates, fellow graduate students Arun Krishnadas and Joshua David John Rathinaraj, set out to develop a model for the Covid-19 vaccine rollout.

“We developed a method of combining a neural network with a susceptible-infected-recovered (SIR) epidemiological model to create a physics-informed prediction system for the spread of Covid-19 after vaccinations started,” explains Owens.

The team accounted for various unknowns including population mobility, weather, and political climate. This combined approach resulted in a prediction of Covid-19’s spread during the vaccine rollout that was more reliable than using either the SIR model or a neural network alone.
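For readers unfamiliar with the SIR model mentioned above, the following is a minimal sketch of its basic dynamics. It is not the students' model: the function name, the forward-Euler integration, and the example values of `beta` (infection rate) and `gamma` (recovery rate) are all illustrative assumptions. In a physics-informed approach, a neural network might, for instance, estimate how a parameter such as `beta` changes over time, while equations like these constrain the overall behavior.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Integrate the classic SIR equations with forward Euler.

    s, i, r are fractions of the population (susceptible, infected,
    recovered). beta and gamma are assumed constant here; they are
    illustrative values, not fitted to any real data.
    """
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # S -> I
            new_recoveries = gamma * i * dt      # I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))  # one snapshot per simulated day
    return history


# Hypothetical scenario: 1 percent of the population initially infected.
history = simulate_sir(s0=0.99, i0=0.01, r0=0.0, beta=0.3, gamma=0.1, days=100)
final_s, final_i, final_r = history[-1]
print(f"after 100 days: S={final_s:.3f}, I={final_i:.3f}, R={final_r:.3f}")
```

Because every person who leaves one compartment enters another, the three fractions always sum to one, which is the kind of built-in physical constraint a purely data-driven model would have to learn on its own.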

Another team, including graduate student Yiwen Hu, developed a model to predict mutation rates in Covid-19, a topic that became all too pertinent as the delta variant began its global spread.

“We used machine learning to predict the time-series-based mutation rate of Covid-19, and then incorporated that as an independent parameter into the prediction of pandemic dynamics to see if it could help us better predict the trend of the Covid-19 pandemic,” says Hu.

Hu, who had previously conducted research into how vibrations on coronavirus protein spikes affect infection rates, hopes to apply the physics-based machine-learning approaches she learned in 2.C01 to her research on de novo protein design.

Whatever the physical system students addressed in their final projects, Barbastathis was careful to stress one unifying goal: the need to assess ethical implications in data science. While more traditional computing methods like face or voice recognition have proven to be rife with ethical issues, there is an opportunity to combine physical systems with machine learning in a fair, ethical way.

“We must ensure that collection and use of data are carried out equitably and inclusively, respecting the diversity in our society and avoiding well-known problems that computer scientists in the past have run into,” says Barbastathis.

Barbastathis hopes that by encouraging mechanical engineering students to be both ethics-literate and well-versed in data science, they can move on to develop reliable, ethically sound solutions and predictions for physics-based engineering challenges.

Meet the 2021-22 Accenture Fellows

Launched in October of 2020, the MIT and Accenture Convergence Initiative for Industry and Technology underscores the ways in which industry and technology come together to spur innovation. The five-year initiative aims to achieve its mission through research, education, and fellowships. To that end, Accenture has once again awarded five annual fellowships to MIT graduate students working on research in industry and technology convergence who are underrepresented, including by race, ethnicity, and gender.

This year’s Accenture Fellows work across disciplines including robotics, manufacturing, artificial intelligence, and biomedicine. Their research covers a wide array of subjects, including: advancing manufacturing through computational design, with the potential to benefit global vaccine production; designing low-energy robotics for both consumer electronics and the aerospace industry; developing robotics and machine learning systems that may aid the elderly in their homes; and creating ingestible biomedical devices that can help gather medical data from inside a patient’s body.

Student nominations from each unit within the School of Engineering, as well as from the four other MIT schools and the MIT Schwarzman College of Computing, were invited as part of the application process. Five exceptional students were selected as fellows in the initiative’s second year.

Xinming (Lily) Liu is a PhD student in operations research at MIT Sloan School of Management. Her work is focused on behavioral and data-driven operations for social good, incorporating human behaviors into traditional optimization models, designing incentives, and analyzing real-world data. Her current research looks at the convergence of social media, digital platforms, and agriculture, with particular attention to expanding technological equity and economic opportunity in developing countries. Liu earned her BS from Cornell University, with a double major in operations research and computer science.

Caris Moses is a PhD student in electrical engineering and computer science specializing in artificial intelligence. Moses’ research focuses on using machine learning, optimization, and electromechanical engineering to build robotics systems that are robust, flexible, intelligent, and can learn on the job. The technology she is developing holds promise for industries including flexible, small-batch manufacturing; robots to assist the elderly in their households; and warehouse management and fulfillment. Moses earned her BS in mechanical engineering from Cornell University and her MS in computer science from Northeastern University.

Sergio Rodriguez Aponte is a PhD student in biological engineering. He is working on the convergence of computational design and manufacturing practices, which have the potential to impact industries such as biopharmaceuticals, food, and wellness/nutrition. His current research aims to develop strategies for applying computational tools, such as multiscale modeling and machine learning, to the design and production of manufacturable and accessible vaccine candidates that could eventually be available globally. Rodriguez Aponte earned his BS in industrial biotechnology from the University of Puerto Rico at Mayaguez.

Soumya Sudhakar SM ’20 is a PhD student in aeronautics and astronautics. Her work is focused on the co-design of new algorithms and integrated circuits for autonomous low-energy robotics that could have novel applications in aerospace and consumer electronics. Her contributions bring together the emerging robotics industry, integrated circuits industry, aerospace industry, and consumer electronics industry. Sudhakar earned her BSE in mechanical and aerospace engineering from Princeton University and her MS in aeronautics and astronautics from MIT.

So-Yoon Yang is a PhD student in electrical engineering and computer science. Her work on the development of low-power, wireless, ingestible biomedical devices for health care is at the intersection of the medical device, integrated circuit, artificial intelligence, and pharmaceutical fields. Currently, the majority of wireless biomedical devices can only provide a limited range of medical data measured from outside the body. Ingestible devices hold promise for the next generation of personal health care because they do not require surgical implantation, can be useful for detecting physiological and pathophysiological signals, and can also function as therapeutic alternatives when treatment cannot be done externally. Yang earned her BS in electrical and computer engineering from Seoul National University in South Korea and her MS in electrical engineering from Caltech.

Perfecting pitch perception

New research from MIT neuroscientists suggests that natural soundscapes have shaped our sense of hearing, optimizing it for the kinds of sounds we most often encounter.

In a study reported Dec. 14 in the journal Nature Communications, researchers led by McGovern Institute for Brain Research associate investigator Josh McDermott used computational modeling to explore factors that influence how humans hear pitch. Their model’s pitch perception closely resembled that of humans — but only when it was trained using music, voices, or other naturalistic sounds.

Humans’ ability to recognize pitch — essentially, the rate at which a sound repeats — gives melody to music and nuance to spoken language. Although this is arguably the best-studied aspect of human hearing, researchers are still debating which factors determine the properties of pitch perception, and why it is more acute for some types of sounds than others. McDermott, who is also an associate professor in MIT’s Department of Brain and Cognitive Sciences, and an Investigator with the Center for Brains, Minds, and Machines (CBMM) at MIT, is particularly interested in understanding how our nervous system perceives pitch because cochlear implants, which send electrical signals about sound to the brain in people with profound deafness, don’t replicate this aspect of human hearing very well.

“Cochlear implants can do a pretty good job of helping people understand speech, especially if they’re in a quiet environment. But they really don’t reproduce the percept of pitch very well,” says Mark Saddler, a graduate student and CBMM researcher who co-led the project and an inaugural graduate fellow of the K. Lisa Yang Integrative Computational Neuroscience Center. “One of the reasons it’s important to understand the detailed basis of pitch perception in people with normal hearing is to try to get better insights into how we would reproduce that artificially in a prosthesis.”

Artificial hearing

Pitch perception begins in the cochlea, the snail-shaped structure in the inner ear where vibrations from sounds are transformed into electrical signals and relayed to the brain via the auditory nerve. The cochlea’s structure and function help determine how and what we hear. And although it hasn’t been possible to test this idea experimentally, McDermott’s team suspected our “auditory diet” might shape our hearing as well.

To explore how both our ears and our environment influence pitch perception, McDermott, Saddler, and Research Assistant Ray Gonzalez built a computer model called a deep neural network. Neural networks are a type of machine learning model widely used in automatic speech recognition and other artificial intelligence applications. Although the structure of an artificial neural network coarsely resembles the connectivity of neurons in the brain, the models used in engineering applications don’t actually hear the same way humans do — so the team developed a new model to reproduce human pitch perception. Their approach combined an artificial neural network with an existing model of the mammalian ear, uniting the power of machine learning with insights from biology. “These new machine-learning models are really the first that can be trained to do complex auditory tasks and actually do them well, at human levels of performance,” Saddler explains.

The researchers trained the neural network to estimate pitch by asking it to identify the repetition rate of sounds in a training set. This gave them the flexibility to change the parameters under which pitch perception developed. They could manipulate the types of sound they presented to the model, as well as the properties of the ear that processed those sounds before passing them on to the neural network.
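To make "repetition rate" concrete, here is a minimal sketch that estimates it for a synthetic tone using classical autocorrelation. This is a textbook signal-processing technique swapped in for illustration, not the deep neural network or the ear model the researchers built. The function names, the 50-to-500-hertz search range, and the test tone are all assumptions.

```python
import math


def harmonic_tone(f0, sample_rate, duration, harmonics=(1, 2, 3)):
    """Synthesize a tone whose waveform repeats f0 times per second."""
    n = int(sample_rate * duration)
    return [
        sum(math.sin(2 * math.pi * k * f0 * i / sample_rate) for k in harmonics)
        for i in range(n)
    ]


def estimate_repetition_rate(samples, sample_rate, min_hz=50.0, max_hz=500.0):
    """Return the repetition rate (Hz) whose lag best matches the signal to itself."""
    min_lag = int(sample_rate / max_hz)  # shortest period considered, in samples
    max_lag = int(sample_rate / min_hz)  # longest period considered, in samples
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation: large when the signal shifted by
        # `lag` samples lines up with the unshifted signal.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


tone = harmonic_tone(f0=200.0, sample_rate=8000, duration=0.1)
print(estimate_repetition_rate(tone, sample_rate=8000))
```

A hand-written estimator like this works on clean synthetic tones. The study asks a different question: what strategy a learned system arrives at when its training sounds and its simulated ear are varied.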

When the model was trained using sounds that are important to humans, like speech and music, it learned to estimate pitch much as humans do. “We very nicely replicated many characteristics of human perception … suggesting that it’s using similar cues from the sounds and the cochlear representation to do the task,” Saddler says.

But when the model was trained using more artificial sounds or in the absence of any background noise, its behavior was very different. For example, Saddler says, “If you optimize for this idealized world where there’s never any competing sources of noise, you can learn a pitch strategy that seems to be very different from that of humans, which suggests that perhaps the human pitch system was really optimized to deal with cases where sometimes noise is obscuring parts of the sound.”

The team also found the timing of nerve signals initiated in the cochlea to be critical to pitch perception. In a healthy cochlea, McDermott explains, nerve cells fire precisely in time with the sound vibrations that reach the inner ear. When the researchers skewed this relationship in their model, so that the timing of nerve signals was less tightly correlated to vibrations produced by incoming sounds, pitch perception deviated from normal human hearing. 

McDermott says it will be important to take this into account as researchers work to develop better cochlear implants. “It does very much suggest that for cochlear implants to produce normal pitch perception, there needs to be a way to reproduce the fine-grained timing information in the auditory nerve,” he says. “Right now, they don’t do that, and there are technical challenges to making that happen — but the modeling results really pretty clearly suggest that’s what you’ve got to do.”

Characters for good, created by artificial intelligence

As it becomes easier to create hyper-realistic digital characters using artificial intelligence, much of the conversation around these tools has centered on misleading and potentially dangerous deepfake content. But the technology can also be used for positive purposes — to revive Albert Einstein to teach a physics class, talk through a career change with your older self, or anonymize people while preserving facial communication.

To encourage the technology’s positive possibilities, MIT Media Lab researchers and their collaborators at the University of California at Santa Barbara and Osaka University have compiled an open-source, easy-to-use character generation pipeline that combines AI models for facial gestures, voice, and motion and can be used to create a variety of audio and video outputs. 

The pipeline also marks the resulting output with a traceable, as well as human-readable, watermark to distinguish it from authentic video content and to show how it was generated — an addition to help prevent its malicious use.

By making this pipeline easily available, the researchers hope to inspire teachers, students, and health-care workers to explore how such tools can help them in their respective fields. If more students, educators, health-care workers, and therapists have a chance to build and use these characters, the results could improve health and well-being and contribute to personalized education, the researchers write in Nature Machine Intelligence.

“It will be a strange world indeed when AIs and humans begin to share identities. This paper does an incredible job of thought leadership, mapping out the space of what is possible with AI-generated characters in domains ranging from education to health to close relationships, while giving a tangible roadmap on how to avoid the ethical challenges around privacy and misrepresentation,” says Jeremy Bailenson, founding director of the Stanford Virtual Human Interaction Lab, who was not associated with the study.

Although the world mostly knows the technology from deepfakes, “we see its potential as a tool for creative expression,” says the paper’s first author Pat Pataranutaporn, a PhD student in professor of media technology Pattie Maes’ Fluid Interfaces research group.  

Other authors on the paper include Maes; Fluid Interfaces master’s student Valdemar Danry and PhD student Joanne Leong; Media Lab Research Scientist Dan Novy; Osaka University Assistant Professor Parinya Punpongsanon; and University of California at Santa Barbara Assistant Professor Misha Sra.

Deeper truths and deeper learning

Generative adversarial networks, or GANs, a combination of two neural networks that compete against each other, have made it easier to create photorealistic images, clone voices, and animate faces. Pataranutaporn, with Danry, first explored its possibilities in a project called Machinoia, where he generated multiple alternative representations of himself — as a child, as an old man, as female — to have a self-dialogue of life choices from different perspectives. The unusual deepfaking experience made him aware of his “journey as a person,” he says. “It was deep truth — to uncover something about yourself you’ve never thought of before, using your own data on your own self.”

Self-exploration is only one of the positive applications of AI-generated characters, the researchers say. Experiments show, for instance, that these characters can make students more enthusiastic about learning and improve cognitive task performance. The technology offers a way for instruction to be “personalized to your interest, your idols, your context, and can be changed over time,” Pataranutaporn explains, as a complement to traditional instruction.

For instance, the MIT researchers used their pipeline to create a synthetic version of Johann Sebastian Bach, which had a live conversation with renowned cellist Yo Yo Ma in Media Lab Professor Tod Machover’s musical interfaces class — to the delight of both the students and Ma.

Other applications might include characters who help deliver therapy, to alleviate a growing shortage of mental health professionals and reach the estimated 44 percent of Americans with mental health issues who never receive counseling, or AI-generated content that delivers exposure therapy to people with social anxiety. In a related use case, the technology can be used to anonymize faces in video while preserving facial expressions and emotions, which may be useful for sessions where people want to share personally sensitive information such as health and trauma experiences, or for whistleblowers and witness accounts.

But there are also more artistic and playful use cases. In this fall’s Experiments in Deepfakes class, led by Maes and research affiliate Roy Shilkrot, students used the technology to animate the figures in a historical Chinese painting and to create a dating “breakup simulator,” among other projects.

Legal and ethical challenges

Many of the applications of AI-generated characters raise legal and ethical issues that must be discussed as the technology evolves, the researchers note in their paper. For instance, how will we decide who has the right to digitally recreate a historical character? Who is legally liable if an AI clone of a famous person promotes harmful behavior online? And is there any danger that we will prefer interacting with synthetic characters over humans?

“One of our goals with this research is to raise awareness about what is possible, ask questions and start public conversations about how this technology can be used ethically for societal benefit. What technical, legal, policy and educational actions can we take to promote positive use cases while reducing the possibility for harm?” states Maes.

By sharing the technology widely, while clearly labeling it as synthesized, Pataranutaporn says, “we hope to stimulate more creative and positive use cases, while also educating people about the technology’s potential benefits and harms.”

Q&A: Cathy Wu on developing algorithms to safely integrate robots into our world

Cathy Wu is the Gilbert W. Winslow Assistant Professor of Civil and Environmental Engineering and a member of the MIT Institute for Data, Systems, and Society. As an undergraduate, Wu won MIT’s toughest robotics competition, and as a graduate student took the University of California at Berkeley’s first-ever course on deep reinforcement learning. Now back at MIT, she’s working to improve the flow of robots in Amazon warehouses under the Science Hub, a new collaboration between the tech giant and the MIT Schwarzman College of Computing. Outside of the lab and classroom, Wu can be found running, drawing, pouring lattes at home, and watching YouTube videos on math and infrastructure via 3Blue1Brown and Practical Engineering. She recently took a break from all of that to talk about her work.

Q: What put you on the path to robotics and self-driving cars?

A: My parents always wanted a doctor in the family. However, I’m bad at following instructions and became the wrong kind of doctor! Inspired by my physics and computer science classes in high school, I decided to study engineering. I wanted to help as many people as a medical doctor could.

At MIT, I looked for applications in energy, education, and agriculture, but the self-driving car was the first to grab me. It has yet to let go! Ninety-four percent of serious car crashes are caused by human error and could potentially be prevented by self-driving cars. Autonomous vehicles could also ease traffic congestion, save energy, and improve mobility.

I first learned about self-driving cars from Seth Teller during his guest lecture for the course Mobile Autonomous Systems Lab (MASLAB), in which MIT undergraduates compete to build the best fully functioning robot from scratch. Our ball-fetching bot, Putzputz, won first place. From there, I took more classes in machine learning, computer vision, and transportation, and joined Teller’s lab. I also competed in several mobility-related hackathons, including one sponsored by Hubway, now known as Bluebikes.

Q: You’ve explored ways to help humans and autonomous vehicles interact more smoothly. What makes this problem so hard?

A: Both systems are highly complex, and our classical modeling tools are woefully insufficient. Integrating autonomous vehicles into our existing mobility systems is a huge undertaking. For example, we don’t know whether autonomous vehicles will cut energy use by 40 percent, or double it. We need more powerful tools to cut through the uncertainty. My PhD thesis at Berkeley tried to do this. I developed scalable optimization methods in the areas of robot control, state estimation, and system design. These methods could help decision-makers anticipate future scenarios and design better systems to accommodate both humans and robots.

Q: How is deep reinforcement learning, combining deep and reinforcement learning algorithms, changing robotics?

A: I took John Schulman and Pieter Abbeel’s reinforcement learning class at Berkeley in 2015 shortly after DeepMind published their breakthrough paper in Nature. They had trained an agent via deep learning and reinforcement learning to play “Space Invaders” and a suite of Atari games at superhuman levels. That created quite some buzz. A year later, I started to incorporate reinforcement learning into problems involving mixed traffic systems, in which only some cars are automated. I realized that classical control techniques couldn’t handle the complex nonlinear control problems I was formulating.

Deep RL is now mainstream but it’s by no means pervasive in robotics, which still relies heavily on classical model-based control and planning methods. Deep learning continues to be important for processing raw sensor data like camera images and radio waves, and reinforcement learning is gradually being incorporated. I see traffic systems as gigantic multi-robot systems. I’m excited for an upcoming collaboration with Utah’s Department of Transportation to apply reinforcement learning to coordinate cars with traffic signals, reducing congestion and thus carbon emissions.

Q: You’ve talked about the MIT course, 6.003 (Signals and Systems), and its impact on you. What about it spoke to you?

A: The mindset. That problems that look messy can be analyzed with common, and sometimes simple, tools. Signals are transformed by systems in various ways, but what do these abstract terms mean, anyway? A mechanical system can take a signal like gears turning at some speed and transform it into a lever turning at another speed. A digital system can take binary digits and turn them into other binary digits or a string of letters or an image. Financial systems can take news and transform it via millions of trading decisions into stock prices. People take in signals every day through advertisements, job offers, gossip, and so on, and translate them into actions that in turn influence society and other people. This humble class on signals and systems linked mechanical, digital, and societal systems and showed me how foundational tools can cut through the noise.

Q: In your project with Amazon you’re training warehouse robots to pick up, sort, and deliver goods. What are the technical challenges?

A: This project involves assigning robots to a given task and routing them there. [Professor] Cynthia Barnhart’s team is focused on task assignment, and mine, on path planning. Both problems are considered combinatorial optimization problems because the solution involves a combination of choices. As the number of tasks and robots increases, the number of possible solutions grows exponentially. It’s called the curse of dimensionality. Both problems are what we call NP-hard; there may not be an efficient algorithm to solve them. Our goal is to devise a shortcut.
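The growth in the number of candidate solutions can be seen in a toy brute-force search. The sketch below is an illustration only and does not represent the Amazon project: the cost matrix is invented, and it assumes a simplified setting in which each robot takes exactly one task. That simplified version can in fact be solved efficiently by specialized algorithms; the point here is only how quickly exhaustive enumeration becomes impractical.

```python
from itertools import permutations
from math import factorial


def brute_force_assignment(cost):
    """Try every one-task-per-robot assignment and keep the cheapest.

    cost[t][r] is a hypothetical cost (e.g. travel time) of robot r
    performing task t. Returns (best_total, best_assignment, examined),
    where best_assignment[t] is the robot chosen for task t.
    """
    num_tasks = len(cost)
    num_robots = len(cost[0])
    best_total, best_assignment, examined = float("inf"), None, 0
    for assignment in permutations(range(num_robots), num_tasks):
        examined += 1
        total = sum(cost[t][r] for t, r in enumerate(assignment))
        if total < best_total:
            best_total, best_assignment = total, assignment
    return best_total, best_assignment, examined


# Invented costs for three tasks and three robots.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
print(brute_force_assignment(cost))  # (5, (1, 0, 2), 6)

# The number of candidates explodes as the fleet grows.
for n in (3, 10, 20):
    print(n, "robots and tasks:", factorial(n), "assignments")
```

With three robots there are only six assignments to check; with 20 there are more than two quintillion, which is why exhaustive search is not an option at warehouse scale.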

Routing a single robot for a single task isn’t difficult. It’s like using Google Maps to find the shortest path home. It can be solved efficiently with several algorithms, including Dijkstra’s. But warehouses resemble small cities with hundreds of robots. When traffic jams occur, customers can’t get their packages as quickly. Our goal is to develop algorithms that find the most efficient paths for all of the robots.
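The single-robot case can be sketched with Dijkstra's algorithm, which the answer above mentions by name. The warehouse graph, its location names, and its edge weights are invented for illustration, and the sketch deliberately ignores the hard part of the real problem: other robots competing for the same aisles.

```python
import heapq


def dijkstra(graph, source, target):
    """Shortest path in a graph with non-negative edge weights.

    graph maps each node to a dict of {neighbor: travel_cost}.
    Returns (total_cost, path); cost is infinity and path is empty
    if the target cannot be reached.
    """
    dist = {source: 0}
    prev = {}
    visited = set()
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale entry; a shorter route was already finalized
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return float("inf"), []
    # Walk backward from the target to reconstruct the route.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical warehouse layout with travel times between locations.
warehouse = {
    "dock": {"aisle_1": 4, "aisle_2": 1},
    "aisle_2": {"aisle_1": 2, "packing": 7},
    "aisle_1": {"packing": 3},
    "packing": {},
}
print(dijkstra(warehouse, "dock", "packing"))
# (6, ['dock', 'aisle_2', 'aisle_1', 'packing'])
```

Running this independently for hundreds of robots would send many of them down the same "shortest" aisle at once, which is exactly the traffic-jam problem the multi-robot algorithms are meant to avoid.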

Q: Are there other applications?

A: Yes. The algorithms we test in Amazon warehouses might one day help to ease congestion in real cities. Other potential applications include controlling planes on runways, swarms of drones in the air, and even characters in video games. These algorithms could also be used for other robotic planning tasks like scheduling and routing.

Q: AI is evolving rapidly. Where do you hope to see the big breakthroughs coming?

A: I’d like to see deep learning and deep RL used to solve societal problems involving mobility, infrastructure, social media, health care, and education. Deep RL now has a toehold in robotics and industrial applications like chip design, but we still need to be careful in applying it to systems with humans in the loop. Ultimately, we want to design systems for people. Currently, we simply don’t have the right tools.

Q: What worries you most about AI taking on more and more specialized tasks?

A: AI has the potential for tremendous good, but it could also help to accelerate the widening gap between the haves and the have-nots. Our political and regulatory systems could help to integrate AI into society and minimize job losses and income inequality, but I worry that they’re not equipped yet to handle the firehose of AI.

Q: What’s the last great book you read?

A: “How to Avoid a Climate Disaster,” by Bill Gates. I absolutely loved the way that Gates was able to take an overwhelmingly complex topic and distill it down into words that everyone can understand. His optimism inspires me to keep pushing on applications of AI and robotics to help avoid a climate disaster.

Nonsense can make sense to machine-learning models

For all that neural networks can accomplish, we still don’t really understand how they operate. Sure, we can program them to learn, but making sense of a machine’s decision-making process remains much like a fancy puzzle with a dizzying, complex pattern where plenty of integral pieces have yet to be fitted. 

If a model was trying to classify an image of said puzzle, for example, it could encounter well-known, but annoying adversarial attacks, or even more run-of-the-mill data or processing issues. But a new, more subtle type of failure recently identified by MIT scientists is another cause for concern: “overinterpretation,” where algorithms make confident predictions based on details that don’t make sense to humans, like random patterns or image borders. 

This could be particularly worrisome for high-stakes environments, like split-second decisions for self-driving cars and medical diagnostics for diseases that need more immediate attention. Autonomous vehicles in particular rely heavily on systems that can accurately understand surroundings and then make quick, safe decisions. In the study, the network used specific backgrounds, edges, or particular patterns of the sky to classify traffic lights and street signs, irrespective of what else was in the image.

The team found that neural networks trained on popular datasets like CIFAR-10 and ImageNet suffered from overinterpretation. Models trained on CIFAR-10, for example, made confident predictions even when 95 percent of each input image was missing and the remainder was senseless to humans.

“Overinterpretation is a dataset problem that’s caused by these nonsensical signals in datasets. Not only are these high-confidence images unrecognizable, but they contain less than 10 percent of the original image in unimportant areas, such as borders. We found that these images were meaningless to humans, yet models can still classify them with high confidence,” says Brandon Carter, MIT Computer Science and Artificial Intelligence Laboratory PhD student and lead author on a paper about the research. 

Deep-image classifiers are widely used. In addition to medical diagnosis and boosting autonomous vehicle technology, there are use cases in security, gaming, and even an app that tells you if something is or isn’t a hot dog, because sometimes we need reassurance. The tech in question works by processing individual pixels from tons of pre-labeled images so the network can “learn.”

Image classification is hard, because machine-learning models have the ability to latch onto these nonsensical subtle signals. Then, when image classifiers are trained on datasets such as ImageNet, they can make seemingly reliable predictions based on those signals. 

Although these nonsensical signals can lead to model fragility in the real world, the signals are actually valid in the datasets, meaning overinterpretation can’t be diagnosed using typical evaluation methods based on accuracy on those datasets.

To find the rationale for the model’s prediction on a particular input, the methods in the present study start with the full image and repeatedly ask: What can I remove from this image? Essentially, they keep covering up the image until you’re left with the smallest piece that still yields a confident decision.
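The "what can I remove?" loop can be illustrated with a minimal sketch. This is not the researchers' implementation: the toy classifier `toy_confidence`, its weights, the 0.9 threshold, and masking with zeros are all assumptions chosen for illustration, and a simple greedy removal stands in for whatever search the paper actually uses.

```python
import math

def toy_confidence(pixels):
    """Hypothetical stand-in for a trained classifier; returns a confidence in (0, 1)."""
    # Assumed weights: this toy model keys almost entirely on pixels 0 and 1,
    # much as a real model might key on an image border.
    weights = [4.0, 3.0, 0.2, 0.1, 0.1, 0.1]
    score = sum(w * p for w, p in zip(weights, pixels)) - 3.0
    return 1.0 / (1.0 + math.exp(-score))

def smallest_confident_subset(pixels, confidence_fn, threshold=0.9, mask_value=0.0):
    """Greedily mask pixels for as long as confidence stays at or above `threshold`."""
    kept = set(range(len(pixels)))
    current = list(pixels)
    while True:
        best_idx, best_conf = None, -1.0
        # Find the pixel whose removal hurts confidence the least.
        for i in sorted(kept):
            trial = list(current)
            trial[i] = mask_value
            conf = confidence_fn(trial)
            if conf > best_conf:
                best_idx, best_conf = i, conf
        # Stop when nothing is left or any further removal breaks confidence.
        if best_idx is None or best_conf < threshold:
            return sorted(kept)
        current[best_idx] = mask_value
        kept.remove(best_idx)

image = [1.0] * 6
rationale = smallest_confident_subset(image, toy_confidence)
print(rationale)  # [0, 1]
```

In this toy case the surviving subset is just two of six pixels, which is the kind of result that would prompt a closer look at what the model has actually learned.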

To that end, it could also be possible to use these methods as a type of validation criterion. For example, if you have an autonomous car that uses a trained machine-learning method for recognizing stop signs, you could test that method by identifying the smallest input subset that constitutes a stop sign. If that consists of a tree branch, a particular time of day, or something that’s not a stop sign, you could be concerned that the car might come to a stop at a place it’s not supposed to.
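One way such a validation check might look is sketched below. The overlap metric, the 0.5 cutoff, and the names `rationale_overlap` and `is_suspicious` are hypothetical and not taken from the study; the sketch also assumes you already have both the smallest confident pixel subset and a hand-labeled region for the stop sign.

```python
def rationale_overlap(rationale, object_region):
    """Fraction of rationale pixels that fall inside the labeled object region."""
    rationale = set(rationale)
    if not rationale:
        return 0.0
    return len(rationale & set(object_region)) / len(rationale)

def is_suspicious(rationale, object_region, min_overlap=0.5):
    """Flag a prediction whose rationale lies mostly outside the object (assumed cutoff)."""
    return rationale_overlap(rationale, object_region) < min_overlap

# Pixel coordinates (row, col) covered by the stop sign in a small example image.
stop_sign = {(2, 2), (2, 3), (3, 2), (3, 3)}

# A rationale made only of top-border pixels never touches the sign.
border_rationale = {(0, 0), (0, 1), (0, 2)}
# A rationale that mostly sits on the sign itself.
on_sign_rationale = {(2, 2), (3, 3), (0, 0)}

print(is_suspicious(border_rationale, stop_sign))   # True
print(is_suspicious(on_sign_rationale, stop_sign))  # False
```

A flagged prediction does not prove the model is wrong, only that it is confident for reasons a human would not recognize as a stop sign.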

While it may seem that the model is the likely culprit here, the datasets are more likely to blame. “There’s the question of how we can modify the datasets in a way that would enable models to be trained to more closely mimic how a human would think about classifying images and therefore, hopefully, generalize better in these real-world scenarios, like autonomous driving and medical diagnosis, so that the models don’t have this nonsensical behavior,” says Carter. 

This may mean creating datasets in more controlled environments. Currently, datasets are just pictures that are extracted from public domains and then classified. But if you want to do object identification, for example, it might be necessary to train models with objects with an uninformative background.

This work was supported by Schmidt Futures and the National Institutes of Health. Carter wrote the paper alongside Siddhartha Jain and Jonas Mueller, scientists at Amazon, and MIT Professor David Gifford. They are presenting the work at the 2021 Conference on Neural Information Processing Systems.
