MIT-Takeda program launches

In February, researchers from MIT and Takeda Pharmaceuticals joined together to celebrate the official launch of the MIT-Takeda Program. The MIT-Takeda Program aims to fuel the development and application of artificial intelligence (AI) capabilities to benefit human health and drug development. Centered within the Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic), the program brings together the MIT School of Engineering and Takeda Pharmaceuticals, to combine knowledge and address challenges of mutual interest.   

Following a competitive proposal process, nine inaugural research projects were selected. The program’s flagship research projects include principal investigators from departments and labs spanning the School of Engineering and the Institute. Research includes diagnosis of diseases, prediction of treatment response, development of novel biomarkers, process control and improvement, drug discovery, and clinical trial optimization.

“We were truly impressed by the creativity and breadth of the proposals we received,” says Anantha P. Chandrakasan, dean of the School of Engineering, Vannevar Bush Professor of Electrical Engineering and Computer Science, and co-chair of the MIT-Takeda Program Steering Committee.

Engaging with researchers and industry experts from Takeda, each project team will bring together different disciplines, merging theory and practical implementation, while combining algorithm and platform innovations.

“This is an incredible opportunity to merge the cross-disciplinary and cross-functional expertise of both MIT and Takeda researchers,” says Chandrakasan. “This particular collaboration between academia and industry is of great significance as our world faces enormous challenges pertaining to human health. I look forward to witnessing the evolution of the program and the impact its research aims to have on our society.” 

“The shared enthusiasm and combined efforts of researchers from across MIT and Takeda have the opportunity to shape the future of health care,” says Anne Heatherington, senior vice president and head of Data Sciences Institute (DSI) at Takeda, and co-chair of the MIT-Takeda Program Steering Committee. “Together we are building capabilities and addressing challenges through interrogation of multiple data types that we have not been able to solve with the power of humans alone that have the potential to benefit both patients and the greater community.”

The following are the inaugural projects of the MIT-Takeda Program. Included are the MIT teams collaborating with Takeda researchers, who are leveraging AI to positively impact human health.

“AI-enabled, automated inspection of lyophilized products in sterile pharmaceutical manufacturing”: Duane Boning, the Clarence J. LeBel Professor of Electrical Engineering and faculty co-director of the Leaders for Global Operations program; Luca Daniel, professor of electrical engineering and computer science; Sanjay Sarma, the Fred Fort Flowers and Daniel Fort Flowers Professor of Mechanical Engineering and vice president for open learning; and Brian Subirana, research scientist and director MIT Auto-ID Laboratory within the Department of Mechanical Engineering.

“Automating adverse effect assessments and scientific literature review”: Regina Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science and Jameel Clinic faculty co-lead; Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society; and Jacob Andreas, assistant professor of electrical engineering and computer science.

“Automated analysis of speech and language deficits for frontotemporal dementia”: James Glass, senior research scientist in the MIT Computer Science and Artificial Intelligence Laboratory; Sanjay Sarma, the Fred Fort Flowers and Daniel Fort Flowers Professor of Mechanical Engineering and vice president for open learning; and Brian Subirana, research scientist and director of the MIT Auto-ID Laboratory within the Department of Mechanical Engineering.

“Discovering human-microbiome protein interactions with continuous distributed representation”: Jim Collins, the Termeer Professor of Medical Engineering and Science in MIT’s Institute for Medical Engineering and Science and Department of Biological Engineering, Jameel Clinic faculty co-lead, and MIT-Takeda Program faculty lead; and Timothy Lu, associate professor of electrical engineering and computer science and of biological engineering.

“Machine learning for early diagnosis, progression risk estimation, and identification of non-responders to conventional therapy for inflammatory bowel disease”: Peter Szolovits, professor of computer science and engineering, and David Sontag, associate professor of electrical engineering and computer science.

“Machine learning for image-based liver phenotyping and drug discovery”: Polina Golland, professor of electrical engineering and computer science; Brian W. Anthony, principal research scientist in the Department of Mechanical Engineering; and Peter Szolovits, professor of computer science and engineering.

“Predictive in silico models for cell culture process development for biologics manufacturing”: Connor W. Coley, assistant professor of chemical engineering, and J. Christopher Love, the Raymond A. (1921) and Helen E. St. Laurent Professor of Chemical Engineering.

“Automated data quality monitoring for clinical trial oversight via probabilistic programming”: Vikash Mansinghka, principal research scientist in the Department of Brain and Cognitive Sciences; Tamara Broderick, associate professor of electrical engineering and computer science; David Sontag, associate professor of electrical engineering and computer science; Ulrich Schaechtle, research scientist in the Department of Brain and Cognitive Sciences; and Veronica Weiner, director of special projects for the MIT Probabilistic Computing Project.

“Time series analysis from video data for optimizing and controlling unit operations in production and manufacturing”: Allan S. Myerson, professor of chemical engineering; George Barbastathis, professor of mechanical engineering; Richard Braatz, the Edwin R. Gilliland Professor of Chemical Engineering; and Bernhardt Trout, the Raymond F. Baddour, ScD, (1949) Professor of Chemical Engineering.

“The flagship research projects of the MIT-Takeda Program offer real promise to the ways we can impact human health,” says Jim Collins. “We are delighted to have the opportunity to collaborate with Takeda researchers on advances that leverage AI and aim to shape health care around the globe.”

Read More

What jumps out in a photo changes the longer we look

What seizes your attention at first glance might change with a closer look. That elephant dressed in red wallpaper might initially grab your eye until your gaze moves to the woman on the living room couch and the surprising realization that the pair appear to be sharing a quiet moment together.

In a study being presented at the virtual Computer Vision and Pattern Recognition conference this week, researchers show that our attention moves in distinctive ways the longer we stare at an image, and that these viewing patterns can be replicated by artificial intelligence models. The work suggests immediate ways of improving how visual content is teased and eventually displayed online. For example, an automated cropping tool might zoom in on the elephant for a thumbnail preview or zoom out to include the intriguing details that become visible once a reader clicks on the story.

“In the real world, we look at the scenes around us and our attention also moves,” says Anelise Newman, the study’s co-lead author and a master’s student at MIT. “What captures our interest over time varies.” The study’s senior authors are Zoya Bylinskii PhD ’18, a research scientist at Adobe Research, and Aude Oliva, co-director of the MIT Quest for Intelligence and a senior research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory.

What researchers know about saliency, and how humans perceive images, comes from experiments in which participants are shown pictures for a fixed period of time. But in the real world, human attention often shifts abruptly. To simulate this variability, the researchers used a crowdsourcing user interface called CodeCharts to show participants photos at three durations — half a second, 3 seconds, and 5 seconds — in a set of online experiments. 

When the image disappeared, participants were asked to report where they had last looked by typing in a three-digit code on a gridded map corresponding to the image. In the end, the researchers were able to gather heat maps of where in a given image participants had collectively focused their gaze at different moments in time. 

At the split-second interval, viewers focused on faces or a visually dominant animal or object. By 3 seconds, their gaze had shifted to action-oriented features, like a dog on a leash, an archery target, or an airborne frisbee. At 5 seconds, their gaze either shot back, boomerang-like, to the main subject, or it lingered on the suggestive details. 

“We were surprised at just how consistent these viewing patterns were at different durations,” says the study’s other lead author, Camilo Fosco, a PhD student at MIT.

With real-world data in hand, the researchers next trained a deep learning model to predict the focal points of images it had never seen before, at different viewing durations. To reduce the size of their model, they included a recurrent module that works on compressed representations of the input image, mimicking the human gaze as it explores an image at varying durations. When tested, their model outperformed the state of the art at predicting saliency across viewing durations.

The model has potential applications for editing and rendering compressed images and even improving the accuracy of automated image captioning. In addition to guiding an editing tool to crop an image for shorter or longer viewing durations, it could prioritize which elements in a compressed image to render first for viewers. By clearing away the visual clutter in a scene, it could improve the overall accuracy of current photo-captioning techniques. It could also generate captions for images meant for split-second viewing only. 

“The content that you consider most important depends on the time you have to look at it,” says Bylinskii. “If you see the full image at once, you may not have time to absorb it all.”

As more images and videos are shared online, the need for better tools to find and make sense of relevant content is growing. Research on human attention offers insights for technologists. Just as computers and camera-equipped mobile phones helped create the data overload, they are also giving researchers new platforms for studying human attention and designing better tools to help us cut through the noise.

In a related study accepted to the ACM Conference on Human Factors in Computing Systems, researchers outline the relative benefits of four web-based user interfaces, including CodeCharts, for gathering human attention data at scale. All four tools capture attention without relying on traditional eye-tracking hardware in a lab, either by collecting self-reported gaze data, as CodeCharts does, or by recording where subjects click their mouse or zoom in on an image.

“There’s no one-size-fits-all interface that works for all use cases, and our paper focuses on teasing apart these trade-offs,” says Newman, lead author of the study.

By making it faster and cheaper to gather human attention data, the platforms may help to generate new knowledge on human vision and cognition. “The more we learn about how humans see and understand the world, the more we can build these insights into our AI tools to make them more useful,” says Oliva.

Other authors of the CVPR paper are Pat Sukhum, Yun Bin Zhang, and Nanxuan Zhao. The research was supported by the Vannevar Bush Faculty Fellowship program, an Ignite grant from the SystemsThatLearn@CSAIL, and cloud computing services from MIT Quest.

Read More

Photorealistic simulator made MIT robot racing competition a live online experience

Every spring, the basement of the Ray and Maria Stata Center becomes a racetrack for tiny self-driving cars that tear through the halls one by one. Sprinting behind each car on foot is a team of three to six students, sometimes carrying wireless routers or open laptops extended out like Olympic torches. Lining the basement walls, their classmates cheer them on, knowing the effort it took to program the algorithms steering the cars around the course during this annual MIT autonomous racing competition.

The competition is the final project for Course 6.141/16.405 (Robotics: Science and Systems). It’s an end-of-semester event that gets pulses speeding, and prizes are awarded for finishing different race courses with the fastest times out of 20 teams.

With campus evacuated this spring due to the Covid-19 pandemic, however, not a single robotic car burned rubber in the Stata Center basement. Instead, a new race was on as Luca Carlone, the Charles Stark Draper Assistant Professor of Aeronautics and Astronautics and member of the Institute for Data, Systems, and Society; Nicholas Roy, professor of aeronautics and astronautics; and teaching assistants (TAs) including Marcus Abate, Lukas Lao Beyer, and Caris Mariah Moses had only four weeks to figure out how to bring the excitement of this highly-anticipated race online.

Because the lab sometimes uses a simple simulator for other research, Carlone says they considered taking the race in that direction. With this simple simulator, students could watch as their self-driving cars snaked around a flat map, like a car depicted by a dot moving along a GPS navigation system. Ultimately, they decided that wasn’t the right route. The racing competition needed to be noisy. Realistic. Exciting. The dynamics of the car needed to be nearly as complex as the robotic cars the students had planned to use. Building on his prior research in collaboration with MIT Lincoln Laboratory, Abate worked with Lao Beyer and engineering graduate student Sabina Chen to develop a new photorealistic simulator at the last minute.

The race was back on, and Carlone was impressed by how everything from the cityscape to the sleek car designs looked “as realistic as possible.”

“The modifications involved introducing an outdoor environment based on open-source assets, building in realistic car dynamics for the agent, and adding lidar sensors,” Abate says. “I also had to revamp the interfacing with Python and Robot Operating System (ROS) to make it all plug-and-play for the students.”

What that means is that the race ran a lot like a racing game, such as Gran Turismo or Forza. Only instead of sitting on your couch thumbing the joystick to direct the car, students developed algorithms to anticipate every roadblock and bend ahead. For students, programming for this new environment was perhaps the biggest adjustment. “The simulator used an outdoor scene and a full-sized car with a very different dynamics model than the real-life race car in the Stata basement,” Abate says.

The TAs also had to adjust to complications behind the scenes of the race’s new setting. “A huge amount of effort was put into the new simulator, as well as into the logistics of obtaining and evaluating students’ software,” Lao Beyer says. “Usually, teams are able to configure the software on their race car however they want, but it is very difficult to accommodate for such a diversity of software setups in the virtual race.”

Once the simulator was ready, there was no time to troubleshoot, so TAs made themselves available to debug on the fly any issues that arose. “I think that saved the day for the final project and the final race,” Carlone says.

Programming their autonomous racing code wasn’t the only way that students customized their race experience, though. Co-instructor Jane Abbott brought Writing, Rhetoric, and Professional Communication (WRAP) into the course. As coordinator of the communication-intensive team that focused on helping teams work effectively, she says she was worried the silence that often looms on Zoom would suck out all the energy of the race. She suggested the TAs add a soundtrack.

In the end, the remote race ran for nearly four hours, bringing together more than 100 people in one Zoom call with commentators and Mario Kart music playing. “We got to watch every student’s solution with some cool visualization code running that showed the trajectory and any obstacles hit,” says Samuel Ubellacker, an electrical engineering and computer science student who raced this year. “We got to see how each team’s solution ran much clearer in the simulator because the camera was always following the race car.”

For Yorai Shaoul, another electrical engineering and computer science student in the race, getting out of the basement helped him become more engaged with other teams’ projects. “Before leaving campus, we found ourselves working long hours in the Stata basement,” Shaoul says. “So focused on our robot, we failed to notice that other teams were right there next to us the whole time.”

During the race, other programming solutions his team had overlooked became clear. “The TAs showcased and narrated each team’s run, finally allowing us to see the diverse approaches other teams were developing,” Shaoul says.

“One thing that was nice: When we’ve done it live in the tunnels, you can only see a part of it,” Abbott says. “You sort of stand at a fixed point and you see the car go by. It’s like watching the marathon: you see the runners for 100 yards and then they’re gone.”

Over Zoom, participants could watch every impressive cruise and spectacular crash as it happened, plus replays. Many stayed to watch, and Lao Beyer says, “We managed to retain as much excitement and suspense about the final challenge as possible.” Ubellacker agrees: “It was certainly an unforgettable experience!”

For those students who don’t bro down with Mario, they could also choose the music they wanted to accompany their races. “Near, far, wherever you are,” these lyrics from one team’s choice to use the “Titanic” movie theme “My Heart Will Go On” are a wink to the extra challenge of collaborating as teams at a distance.

One of the masters of ceremonies for the 2020 race, Marwa Abdulhai ’20, was a TA last year and says one obvious benefit of the online race is that it’s a lot easier to figure out why your car crashed. “Pros of this virtual approach have been allowing students to race through the track multiple times and knowing that the car’s performance was primarily due to the algorithm and not any physical constraints,” Abdulhai says.

For Ubellacker that was actually a con, though: “The biggest element that I missed without having a physical car was not being able to experience the differences between simulation and real life.” He says, “Part of the fun to me is designing a system that works perfectly in the simulator, and then getting to figure out all the crazy ways it will fail in the real world!”

Shaoul says instead of working on one car, sometimes it felt like they were working on five individual cars that lived on each team member’s computer. “With one car, it was easy to see how well it did and what required fixing, whereas virtually it was more ambiguous,” Shaoul says. “We faced challenges with keeping track of up-to-date code versions and also simple communication.”

Carlone was concerned students wouldn’t be as invested in their algorithms without the experience of seeing the car’s performance play out in real life to motivate them to push harder. “Every year, the record time on that Stata Center track was getting better and better,” he says. “This year, we were a bit concerned about the performance.”

Fortunately, many students were very much still in the race, with some teams beating the most optimistic predictions, despite having to adjust to new racing conditions and greater challenges collaborating as a team fully online. The winning students completed the race courses sometimes three times faster than other teams, without any collisions. “It was just beyond expectation,” Carlone says.

Although this shift in the final project somewhat changed the takeaways from the course, Carlone says the experience will still advance algorithmic skills for students working on robotics, as well as introducing them to the intensity of communication required to work effectively as remote teams. “Many robotics groups are doing research using photorealistic simulation, because you can test conditions that you cannot test on the real robot,” he says. Co-instructor Roy says it worked so well, the new simulator might become a permanent feature of the course — not to replace the physical race, but as an extra element. “The robotics experience was good,” Carlone says of the 2020 race, but still: “The human experience is, of course, different.”

Read More

Learning the ropes and throwing lifelines

In March, as her friends and neighbors were scrambling to pack up and leave campus due to the Covid-19 pandemic, Geeticka Chauhan found her world upended in yet another way. Just weeks earlier, she had been elected council president of MIT’s largest graduate residence, Sidney-Pacific. Suddenly the fourth-year PhD student was plunged into rounds of emergency meetings with MIT administrators.

From her apartment in Sidney-Pacific, where she has stayed put due to travel restrictions in her home country of India, Chauhan is still learning the ropes of her new position. With others, she has been busy preparing to meet the future challenge of safely redensifying the living space of more than 1,000 people: how to regulate high-density common areas, handle noise complaints as people spend more time in their rooms, and care for the mental and physical well-being of a community that can only congregate virtually. “It’s just such a crazy time,” she says.

She’s prepared for the challenge. During her time at MIT, while pursuing her research using artificial intelligence to understand human language, Chauhan has worked to strengthen the bonds of her community in numerous ways, often drawing on her experience as an international student to do so.

Adventures in brunching

When Chauhan first came to MIT in 2017, she quickly fell in love with Sidney-Pacific’s thriving and freewheeling “helper culture.” “These are all researchers, but they’re maybe making brownies, doing crazy experiments that they would do in lab, except in the kitchen,” she says. “That was my first introduction to the MIT spirit.”

Next thing she knew, she was teaching Budokon yoga, mashing chickpeas into guacamole, and immersing herself in the complex operations of a monthly brunch attended by hundreds of graduate students, many of whom came to MIT from outside the U.S. In addition to the genuine thrill of cracking 300 eggs in 30 minutes, working on the brunches kept her grounded in a place thousands of miles from her home in New Delhi. “It gave me a sense of community and made me feel like I have a family here,” she says.

Chauhan has found additional ways to address the particular difficulties that international students face. As a member of the Presidential Advisory Council this year, she gathered international student testimonies on visa difficulties and presented them to MIT’s president and the director of the International Students Office. And when a friend from mainland China had to self-quarantine on Valentine’s Day, Chauhan knew she had to act. As brunch chair, she organized food delivery, complete with chocolates and notes, for Sidney-Pacific residents who couldn’t make it to the monthly event. “Initially when you come back to the U.S. from your home country, you really miss your family,” she says. “I thought self-quarantining students should feel their MIT community cares for them.”

Culture shock

Growing up in New Delhi, math was initially one of her weaknesses, Chauhan says, and she was scared and confused by her early introduction to coding. Her mother and grandmother, with stern kindness and chocolates, encouraged her to face these fears. “My mom used to teach me that with hard work, you can make your biggest weakness your biggest strength,” she explains. She soon set her sights on a future in computer science.

However, as Chauhan found her life increasingly dominated by the high-pressure culture of preparing for college, she began to long for a feeling of wholeness, and for the person she left behind on the way. “I used to have a lot of artistic interests but didn’t get to explore them,” she says. She quit her weekend engineering classes, enrolled in a black and white photography class, and after learning about the extracurricular options at American universities, landed a full scholarship to attend Florida International University.

It was a culture shock. She didn’t know many Indian students in Miami and felt herself struggling to reconcile the individualistic mindset around her with the community and family-centered life at home. She says the people she met got her through, including Mark Finlayson, a professor studying the science of narrative from the viewpoint of natural language processing. Under Finlayson’s guidance she developed a fascination with the way AI techniques could be used to better understand the patterns and structures in human narratives. She learned that studying AI wasn’t just a way of imitating human thinking, but rather an approach for deepening our understanding of ourselves as reflected by our language. “It was due to Mark’s mentorship that I got involved in research” and applied to MIT, she says.

The holistic researcher

Chauan now works in the Clinical Decision Making Group led by Peter Szolovits at the Computer Science and Artificial Intelligence Laboratory, where she is focusing on the ways natural language processing can address health care problems. For her master’s project, she worked on the problem of relation extraction and built a tool to digest clinical literature that would, for example, help pharamacologists easily assess negative drug interactions. Now, she’s finishing up a project integrating visual analysis of chest radiographs and textual analysis of radiology reports for quantifying pulmonary edema, to help clinicians manage the fluid status of their patients who have suffered acute heart failure.

“In routine clinical practice, patient care is interweaved with a lot of bureaucratic work,” she says. “The goal of my lab is to assist with clinical decision making and give clinicians the full freedom and time to devote to patient care.”

It’s an exciting moment for Chauhan, who recently submitted a paper she co-first authored with another grad student, and is starting to think about her next project: interpretability, or how to elucidate a decision-making model’s “thought process” by highlighting the data from which it draws its conclusions. She continues to find the intersection of computer vision and natural language processing an exciting area of research. But there have been challenges along the way.

After the initial flurry of excitement her first year, personal and faculty expectations of students’ independence and publishing success grew, and she began to experience uncertainty and imposter syndrome. “I didn’t know what I was capable of,” she says. “That initial period of convincing yourself that you belong is difficult. I am fortunate to have a supportive advisor that understands that.”

Finally, one of her first-year projects showed promise, and she came up with a master’s thesis plan in a month and submitted the project that semester. To get through, she says, she drew on her “survival skills”: allowing herself to be a full person beyond her work as a researcher so that one setback didn’t become a sense of complete failure. For Chauhan, that meant working as a teaching assistant, drawing henna designs, singing, enjoying yoga, and staying involved in student government. “I used to try to separate that part of myself with my work side,” she says. “I needed to give myself some space to learn and grow, rather than compare myself to others.”

Citing a study showing that women are more likely to drop out of STEM disciplines when they receive a B grade in a challenging course, Chauhan says she wishes she could tell her younger self not to compare herself with an ideal version of herself. Dismantling imposter syndrome requires an understanding that qualification and success can come from a broad range of experiences, she says: It’s about “seeing people for who they are holistically, rather than what is seen on the resume.”

Read More

Engineers put tens of thousands of artificial brain synapses on a single chip

MIT engineers have designed a “brain-on-a-chip,” smaller than a piece of confetti, that is made from tens of thousands of artificial brain synapses known as memristors — silicon-based components that mimic the information-transmitting synapses in the human brain.

The researchers borrowed from principles of metallurgy to fabricate each memristor from alloys of silver and copper, along with silicon. When they ran the chip through several visual tasks, the chip was able to “remember” stored images and reproduce them many times over, in versions that were crisper and cleaner compared with existing memristor designs made with unalloyed elements.

Their results, published today in the journal Nature Nanotechnology, demonstrate a promising new memristor design for neuromorphic devices — electronics that are based on a new type of circuit that processes information in a way that mimics the brain’s neural architecture. Such brain-inspired circuits could be built into small, portable devices, and would carry out complex computational tasks that only today’s supercomputers can handle.

“So far, artificial synapse networks exist as software. We’re trying to build real neural network hardware for portable artificial intelligence systems,” says Jeehwan Kim, associate professor of mechanical engineering at MIT. “Imagine connecting a neuromorphic device to a camera on your car, and having it recognize lights and objects and make a decision immediately, without having to connect to the internet. We hope to use energy-efficient memristors to do those tasks on-site, in real-time.”

Wandering ions

Memristors, or memory transistors, are an essential element in neuromorphic computing. In a neuromorphic device, a memristor would serve as the transistor in a circuit, though its workings would more closely resemble a brain synapse — the junction between two neurons. The synapse receives signals from one neuron, in the form of ions, and sends a corresponding signal to the next neuron.

A transistor in a conventional circuit transmits information by switching between one of only two values, 0 and 1, and doing so only when the signal it receives, in the form of an electric current, is of a particular strength. In contrast, a memristor would work along a gradient, much like a synapse in the brain. The signal it produces would vary depending on the strength of the signal that it receives. This would enable a single memristor to have many values, and therefore carry out a far wider range of operations than binary transistors.

Like a brain synapse, a memristor would also be able to “remember” the value associated with a given current strength, and produce the exact same signal the next time it receives a similar current. This could ensure that the answer to a complex equation, or the visual classification of an object, is reliable — a feat that normally involves multiple transistors and capacitors.

Ultimately, scientists envision that memristors would require far less chip real estate than conventional transistors, enabling powerful, portable computing devices that do not rely on supercomputers, or even connections to the Internet.

Existing memristor designs, however, are limited in their performance. A single memristor is made of a positive and negative electrode, separated by a “switching medium,” or space between the electrodes. When a voltage is applied to one electrode, ions from that electrode flow through the medium, forming a “conduction channel” to the other electrode. The received ions make up the electrical signal that the memristor transmits through the circuit. The size of the ion channel (and the signal that the memristor ultimately produces) should be proportional to the strength of the stimulating voltage.

Kim says that existing memristor designs work pretty well in cases where voltage stimulates a large conduction channel, or a heavy flow of ions from one electrode to the other. But these designs are less reliable when memristors need to generate subtler signals, via thinner conduction channels.

The thinner a conduction channel, and the lighter the flow of ions from one electrode to the other, the harder it is for individual ions to stay together. Instead, they tend to wander from the group, disbanding within the medium. As a result, it’s difficult for the receiving electrode to reliably capture the same number of ions, and therefore transmit the same signal, when stimulated with a certain low range of current.

Borrowing from metallurgy

Kim and his colleagues found a way around this limitation by borrowing a technique from metallurgy, the science of melding metals into alloys and studying their combined properties.

“Traditionally, metallurgists try to add different atoms into a bulk matrix to strengthen materials, and we thought, why not tweak the atomic interactions in our memristor, and add some alloying element to control the movement of ions in our medium,” Kim says.

Engineers typically use silver as the material for a memristor’s positive electrode. Kim’s team looked through the literature to find an element that they could combine with silver to effectively hold silver ions together, while allowing them to flow quickly through to the other electrode.

The team landed on copper as the ideal alloying element, as it is able to bind both with silver, and with silicon.

“It acts as a sort of bridge, and stabilizes the silver-silicon interface,” Kim says.

To make memristors using their new alloy, the group first fabricated a negative electrode out of silicon, then made a positive electrode by depositing a slight amount of copper, followed by a layer of silver. They sandwiched the two electrodes around an amorphous silicon medium. In this way, they patterned a millimeter-square silicon chip with tens of thousands of memristors.

As a first test of the chip, they recreated a gray-scale image of the Captain America shield. They equated each pixel in the image to a corresponding memristor in the chip. They then modulated the conductance of each memristor that was relative in strength to the color in the corresponding pixel.

The chip produced the same crisp image of the shield, and was able to “remember” the image and reproduce it many times, compared with chips made of other materials.

The team also ran the chip through an image processing task, programming the memristors to alter an image, in this case of MIT’s Killian Court, in several specific ways, including sharpening and blurring the original image. Again, their design produced the reprogrammed images more reliably than existing memristor designs.

“We’re using artificial synapses to do real inference tests,” Kim says. “We would like to develop this technology further to have larger-scale arrays to do image recognition tasks. And some day, you might be able to carry around artificial brains to do these kinds of tasks, without connecting to supercomputers, the internet, or the cloud.”

This research was funded, in part, by the MIT Research Support Committee funds, the MIT-IBM Watson AI Lab, Samsung Global Research Laboratory, and the National Science Foundation.

Read More

If transistors can’t get smaller, then coders have to get smarter

In 1965, Intel co-founder Gordon Moore predicted that the number of transistors that could fit on a computer chip would grow exponentially — and they did, doubling about every two years. For half a century, Moore’s Law has endured: Computers have gotten smaller, faster, cheaper, and more efficient, enabling the rapid worldwide adoption of PCs, smartphones, high-speed internet, and more.

This miniaturization trend has led to silicon chips today that have almost unimaginably small circuitry. Transistors, the tiny switches that implement computer microprocessors, are so small that 1,000 of them laid end-to-end are no wider than a human hair. And for a long time, the smaller the transistors were, the faster they could switch. But today, we’re approaching the limit of how small transistors can get. As a result, over the past decade researchers have been scratching their heads to find other ways to improve performance so that the computer industry can continue to innovate.

While we wait for the maturation of new computing technologies like quantum, carbon nanotubes, or photonics (which may take a while), other approaches will be needed to get performance as Moore’s Law comes to an end. In a recent journal article published in Science, a team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) identifies three key areas to prioritize to continue to deliver computing speed-ups: better software, new algorithms, and more streamlined hardware.

Senior author Charles E. Leiserson says that the performance benefits from miniaturization have been so great that, for decades, programmers have been able to prioritize making code-writing easier rather than making the code itself run faster. The inefficiency that this tendency introduces has been acceptable, because faster computer chips have always been able to pick up the slack.

“But nowadays, being able to make further advances in fields like machine learning, robotics, and virtual reality will require huge amounts of computational power that miniaturization can no longer provide,” says Leiserson, the Edwin Sibley Webster Professor in MIT’s Department of Electrical Engineering and Computer Science. “If we want to harness the full potential of these technologies, we must change our approach to computing.”

Leiserson co-wrote the paper, published this week, with Research Scientist Neil Thompson, Professor Daniel Sanchez, Adjunct Professor Butler Lampson, and research scientists Joel Emer, Bradley Kuszmaul, and Tao Schardl.

No more Moore

The authors make recommendations about three areas of computing: software, algorithms, and hardware architecture.

With software, they say that programmers’ previous prioritization of productivity over performance has led to problematic strategies like “reduction”: taking code that worked on problem A and using it to solve problem B. For example, if someone has to create a system to recognize yes-or-no voice commands, but doesn’t want to code a whole new custom program, they could take an existing program that recognizes a wide range of words and tweak it to respond only to yes-or-no answers.

While this approach reduces coding time, the inefficiencies it creates quickly compound: if a single reduction is 80 percent as efficient as a custom solution, and you then add 20 layers of reduction, the code will ultimately be 100 times less efficient than it could be.

“These are the kinds of strategies that programmers have to rethink as hardware improvements slow down,” says Thompson. “We can’t keep doing ‘business as usual’ if we want to continue to get the speed-ups we’ve grown accustomed to.”

Instead, the researchers recommend techniques like parallelizing code. Much existing software has been designed using ancient assumptions that processors can only do only one operation at a time. But in recent years multicore technology has enabled complex tasks to be completed thousands of times faster and in a much more energy-efficient way. 

“Since Moore’s Law will not be handing us improved performance on a silver platter, we will have to deliver performance the hard way,” says Moshe Vardi, a professor in computational engineering at Rice University. “This is a great opportunity for computing research, and the [MIT CSAIL] report provides a road map for such research.” 

As for algorithms, the team suggests a three-pronged approach that includes exploring new problem areas, addressing concerns about how algorithms scale, and tailoring them to better take advantage of modern hardware.

Lastly, in terms of hardware architecture, the team advocates that hardware be streamlined so that problems can be solved with fewer transistors and less silicon. Streamlining includes using simpler processors and creating hardware tailored to specific applications, like the graphics-processing unit is tailored for computer graphics. 

“Hardware customized for particular domains can be much more efficient and use far fewer transistors, enabling applications to run tens to hundreds of times faster,” says Schardl. “More generally, hardware streamlining would further encourage parallel programming, creating additional chip area to be used for more circuitry that can operate in parallel.”

While these approaches may be the best path forward, the researchers say that it won’t always be an easy one. Organizations that use such techniques may not know the benefits of their efforts until after they’ve invested a lot of engineering time. Plus, the speed-ups won’t be as consistent as they were with Moore’s Law: they may be dramatic at first, and then require large amounts of effort for smaller improvements. 

Certain companies have already gotten the memo.

“For tech giants like Google and Amazon, the huge scale of their data centers means that even small improvements in software performance can result in large financial returns,” says Thompson.  “But while these firms may be leading the charge, many others will need to take these issues seriously if they want to stay competitive.”

Getting improvements in the areas identified by the team will also require building up the infrastructure and workforce that make them possible.  

“Performance growth will require new tools, programming languages, and hardware to facilitate more and better performance engineering,” says Leiserson. “It also means computer scientists being better educated about how we can make software, algorithms, and hardware work together, instead of putting them in different silos.”

This work was supported, in part, by the National Science Foundation.

Read More

Giving soft robots feeling

One of the hottest topics in robotics is the field of soft robots, which utilizes squishy and flexible materials rather than traditional rigid materials. But soft robots have been limited due to their lack of good sensing. A good robotic gripper needs to feel what it is touching (tactile sensing), and it needs to sense the positions of its fingers (proprioception). Such sensing has been missing from most soft robots.

In a new pair of papers, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) came up with new tools to let robots better perceive what they’re interacting with: the ability to see and classify items, and a softer, delicate touch. 

“We wish to enable seeing the world by feeling the world. Soft robot hands have sensorized skins that allow them to pick up a range of objects, from delicate, such as potato chips, to heavy, such as milk bottles,” says CSAIL Director Daniela Rus, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science and the deputy dean of research for the MIT Stephen A. Schwarzman College of Computing. 

One paper builds off last year’s research from MIT and Harvard University, where a team developed a soft and strong robotic gripper in the form of a cone-shaped origami structure. It collapses in on objects much like a Venus’ flytrap, to pick up items that are as much as 100 times its weight. 

To get that newfound versatility and adaptability even closer to that of a human hand, a new team came up with a sensible addition: tactile sensors, made from latex “bladders” (balloons) connected to pressure transducers. The new sensors let the gripper not only pick up objects as delicate as potato chips, but it also classifies them — letting the robot better understand what it’s picking up, while also exhibiting that light touch. 

When classifying objects, the sensors correctly identified 10 objects with over 90 percent accuracy, even when an object slipped out of grip.

“Unlike many other soft tactile sensors, ours can be rapidly fabricated, retrofitted into grippers, and show sensitivity and reliability,” says MIT postdoc Josie Hughes, the lead author on a new paper about the sensors. “We hope they provide a new method of soft sensing that can be applied to a wide range of different applications in manufacturing settings, like packing and lifting.” 

In a second paper, a group of researchers created a soft robotic finger called “GelFlex” that uses embedded cameras and deep learning to enable high-resolution tactile sensing and “proprioception” (awareness of positions and movements of the body). 

The gripper, which looks much like a two-finger cup gripper you might see at a soda station, uses a tendon-driven mechanism to actuate the fingers. When tested on metal objects of various shapes, the system had over 96 percent recognition accuracy. 

“Our soft finger can provide high accuracy on proprioception and accurately predict grasped objects, and also withstand considerable impact without harming the interacted environment and itself,” says Yu She, lead author on a new paper on GelFlex. “By constraining soft fingers with a flexible exoskeleton, and performing high-resolution sensing with embedded cameras, we open up a large range of capabilities for soft manipulators.” 

Magic ball senses 

The magic ball gripper is made from a soft origami structure, encased by a soft balloon. When a vacuum is applied to the balloon, the origami structure closes around the object, and the gripper deforms to its structure. 

While this motion lets the gripper grasp a much wider range of objects than ever before, such as soup cans, hammers, wine glasses, drones, and even a single broccoli floret, the greater intricacies of delicacy and understanding were still out of reach — until they added the sensors.  

When the sensors experience force or strain, the internal pressure changes, and the team can measure this change in pressure to identify when it will feel that again. 

In addition to the latex sensor, the team also developed an algorithm which uses feedback to let the gripper possess a human-like duality of being both strong and precise — and 80 percent of the tested objects were successfully grasped without damage. 

The team tested the gripper-sensors on a variety of household items, ranging from heavy bottles to small, delicate objects, including cans, apples, a toothbrush, a water bottle, and a bag of cookies. 

Going forward, the team hopes to make the methodology scalable, using computational design and reconstruction methods to improve the resolution and coverage using this new sensor technology. Eventually, they imagine using the new sensors to create a fluidic sensing skin that shows scalability and sensitivity. 

Hughes co-wrote the new paper with Rus, which they will present virtually at the 2020 International Conference on Robotics and Automation. 

GelFlex

In the second paper, a CSAIL team looked at giving a soft robotic gripper more nuanced, human-like senses. Soft fingers allow a wide range of deformations, but to be used in a controlled way there must be rich tactile and proprioceptive sensing. The team used embedded cameras with wide-angle “fisheye” lenses that capture the finger’s deformations in great detail.

To create GelFlex, the team used silicone material to fabricate the soft and transparent finger, and put one camera near the fingertip and the other in the middle of the finger. Then, they painted reflective ink on the front and side surface of the finger, and added LED lights on the back. This allows the internal fish-eye camera to observe the status of the front and side surface of the finger. 

The team trained neural networks to extract key information from the internal cameras for feedback. One neural net was trained to predict the bending angle of GelFlex, and the other was trained to estimate the shape and size of the objects being grabbed. The gripper could then pick up a variety of items such as a Rubik’s cube, a DVD case, or a block of aluminum. 

During testing, the average positional error while gripping was less than 0.77 millimeter, which is better than that of a human finger. In a second set of tests, the gripper was challenged with grasping and recognizing cylinders and boxes of various sizes. Out of 80 trials, only three were classified incorrectly. 

In the future, the team hopes to improve the proprioception and tactile sensing algorithms, and utilize vision-based sensors to estimate more complex finger configurations, such as twisting or lateral bending, which are challenging for common sensors, but should be attainable with embedded cameras.

Yu She co-wrote the GelFlex paper with MIT graduate student Sandra Q. Liu, Peiyu Yu of Tsinghua University, and MIT Professor Edward Adelson. They will present the paper virtually at the 2020 International Conference on Robotics and Automation.

Read More

Undergraduates develop next-generation intelligence tools

The coronavirus pandemic has driven us apart physically while reminding us of the power of technology to connect. When MIT shut its doors in March, much of campus moved online, to virtual classes, labs, and chatrooms. Among those making the pivot were students engaged in independent research under MIT’s Undergraduate Research Opportunities Program (UROP). 

With regular check-ins with their advisors via Slack and Zoom, many students succeeded in pushing through to the end. One even carried on his experiments from his bedroom, after schlepping his Sphero Bolt robots home in a backpack. “I’ve been so impressed by their resilience and dedication,” says Katherine Gallagher, one of three artificial intelligence engineers at MIT Quest for Intelligence who works with students each semester on intelligence-related applications. “There was that initial week of craziness and then they were right back to work.” Four projects from this spring are highlighted below.

Learning to explore the world with open eyes and ears

Robots rely heavily on images beamed through their built-in cameras, or surrogate “eyes,” to get around. MIT senior Alon Kosowsky-Sachs thinks they could do a lot more if they also used their microphone “ears.” 

From his home in Sharon, Massachusetts, where he retreated after MIT closed in March, Kosowsky-Sachs is training four baseball-sized Sphero Bolt robots to roll around a homemade arena. His goal is to teach the robots to pair sights with sounds, and to exploit this information to build better representations of their environment. He’s working with Pulkit Agrawal, an assistant professor in MIT’s Department of Electrical Engineering and Computer Science, who is interested in designing algorithms with human-like curiosity.

While Kosowsky-Sachs sleeps, his robots putter away, gliding through an object-strewn rink he built for them from two-by-fours. Each burst of movement becomes a pair of one-second video and audio clips. By day, Kosowsky-Sachs trains a “curiosity” model aimed at pushing the robots to become bolder, and more skillful, at navigating their obstacle course.

“I want them to see something through their camera, and hear something from their microphone, and know that these two things happen together,” he says. “As humans, we combine a lot of sensory information to get added insight about the world. If we hear a thunder clap, we don’t need to see lightning to know that a storm has arrived. Our hypothesis is that robots with a better model of the world will be able to accomplish more difficult tasks.”

Training a robot agent to design a more efficient nuclear reactor 

One important factor driving the cost of nuclear power is the layout of its reactor core. If fuel rods are arranged in an optimal fashion, reactions last longer, burn less fuel, and need less maintenance. As engineers look for ways to bring down the cost of nuclear energy, they are eying the redesign of the reactor core.

“Nuclear power emits very little carbon and is surprisingly safe compared to other energy sources, even solar or wind,” says third-year student Isaac Wolverton. “We wanted to see if we could use AI to make it more efficient.” 

In a project with Josh Joseph, an AI engineer at the MIT Quest, and Koroush Shirvan, an assistant professor in MIT’s Department of Nuclear Science and Engineering, Wolverton spent the year training a reinforcement learning agent to find the best way to lay out fuel rods in a reactor core. To simulate the process, he turned the problem into a game, borrowing a machine learning technique for producing agents with superhuman abilities at chess and Go.

He started by training his agent on a simpler problem: arranging colored tiles on a grid so that as few tiles as possible of the same color would touch. As Wolverton increased the number of options, from two colors to five, and four tiles to 225, he grew excited as the agent continued to find the best strategy. “It gave us hope we could teach it to swap the cores into an optimal arrangement,” he says.

Eventually, Wolverton moved to an environment meant to simulate a 36-rod reactor core, with two enrichment levels and 2.1 million possible core configurations. With input from researchers in Shirvan’s lab, Wolverton trained an agent that arrived at the optimal solution.

The lab is now building on Wolverton’s code to try to train an agent in a life-sized 100-rod environment with 19 enrichment levels. “There’s no breakthrough at this point,” he says. “But we think it’s possible, if we can find enough compute resources.”

Making more livers available to patients who need them

About 8,000 patients in the United States receive liver transplants each year, but that’s only half the number who need one. Many more livers might be made available if hospitals had a faster way to screen them, researchers say. In a collaboration with Massachusetts General Hospital, MIT Quest is evaluating whether automation could help to boost the nation’s supply of viable livers.  

In approving a liver for transplant, pathologists estimate its fat content from a slice of tissue. If it’s low enough, the liver is deemed ready for transplant. But there are often not enough qualified doctors to review tissue samples on the tight timeline needed to match livers with recipients. A shortage of doctors, coupled with the subjective nature of analyzing tissue, means that viable livers are inevitably discarded.

This loss represents a huge opportunity for machine learning, says third-year student Kuan Wei Huang, who joined the project to explore AI applications in health care. The project involves training a deep neural network to pick out globules of fat on liver tissue slides to estimate the liver’s overall fat content.

One challenge, says Huang, has been figuring out how to handle variations in how various pathologists classify fat globules. “This makes it harder to tell whether I’ve created the appropriate masks to feed into the neural net,” he says. “However, after meeting with experts in the field, I received clarifications and was able to continue working.”

Trained on images labeled by pathologists, the model will eventually learn to isolate fat globules in unlabeled images on its own. The final output will be a fat content estimate with pictures of highlighted fat globules showing how the model arrived at its final count. “That’s the easy part — we just count up the pixels in the highlighted globules as a percentage of the overall biopsy and we have our fat content estimate,” says the Quest’s Gallagher, who is leading the project.

Huang says he’s excited by the project’s potential to help people. “Using machine learning to address medical problems is one of the best ways that a computer scientist can impact the world.”

Exposing the hidden constraints of what we mean in what we say

Language shapes our understanding of the world in subtle ways, with slight variations in the words we use conveying sharply different meanings. The sentence, “Elephants live in Africa and Asia,” looks a lot like the sentence “Elephants eat twigs and leaves.” But most readers will conclude that the elephants in the first sentence are split into distinct groups living on separate continents but not apply the same reasoning to the second sentence, because eating twigs and eating leaves can both be true of the same elephant in a way that living on different continents cannot.

Karen Gu is a senior majoring in computer science and molecular biology, but instead of putting cells under a microscope for her SuperUROP project, she chose to look at sentences like the ones above. “I’m fascinated by the complex and subtle things that we do to constrain language understanding, almost all of it subconsciously,” she says.

Working with Roger Levy, a professor in MIT’s Department of Brain and Cognitive Sciences, and postdoc MH Tessler, Gu explored how prior knowledge guides our interpretation of syntax and ultimately, meaning. In the sentences above, prior knowledge about geography and mutual exclusivity interact with syntax to produce different meanings.

After steeping herself in linguistics theory, Gu built a model to explain how, word by word, a given sentence produces meaning. She then ran a set of online experiments to see how human subjects would interpret analogous sentences in a story. Her experiments, she says, largely validated intuitions from linguistic theory.

One challenge, she says, was having to reconcile two approaches for studying language. “I had to figure out how to combine formal linguistics, which applies an almost mathematical approach to understanding how words combine, and probabilistic semantics-pragmatics, which has focused more on how people interpret whole utterances.’ “

After MIT closed in March, she was able to finish the project from her parents’ home in East Hanover, New Jersey. “Regular meetings with my advisor have been really helpful in keeping me motivated and on track,” she says. She says she also got to improve her web-development skills, which will come in handy when she starts work at Benchling, a San Francisco-based software company, this summer.

Spring semester Quest UROP projects were funded, in part, by the MIT-IBM Watson AI Lab and Eric Schmidt, technical advisor to Alphabet Inc., and his wife, Wendy.

Read More

Microsoft President Brad Smith talks data, Covid-19, and a potential “digital 9/11”

In a virtual discussion hosted by MIT last week, viewers learned that there are many problems that concern Microsoft President Brad Smith: things like climate change, Covid-19, and the work of the future.

Attendees also learned how seriously he takes the issue of computer security: 45 minutes into the event, his Windows system automatically rebooted for a lightning-quick software update.

“There are a lot of benefits to working from home,” he said with a laugh after rejoining, “but it certainly also adds a level of unpredictability.”

Smith’s conversation with MIT Professor Daniela Rus on May 14 spanned a wide range of topics, from the challenges of Covid-19 to the security of online voting. The fireside chat was held as part of MIT’s “Hot Topics in Computing” series, founded by the Computer Science and Artificial Intelligence Laboratory (CSAIL). The series is now an Institute-wide effort being co-presented with the MIT Stephen A. Schwarzman College of Computing (SCC). SCC Dean Dan Huttenlocher opened the event with a welcome to the audience and introduced Smith and Rus.

Having worked at Microsoft for 25 years and served as its president since 2015, Smith gave an inside look at what it’s been like to be there in recent months — and what the tech company has done to try to help curb the spread of Covid-19. 

In March, for example, Microsoft developed a chatbot to help people determine if they might need a Covid-19 test. Within weeks of deploying the app at a hospital in Seattle, Washington, the company started rolling it out more broadly. The chatbot was ultimately used 190 million times in April, and is now available at 1,500 institutions across 23 countries. 

Smith also discussed the promising work being done with Bluetooth-based contact tracing, but expressed skepticism that it could be adopted at a meaningful scale.

“Not everyone is going to walk around with an app on their phone,” he said. “I think we should recognize that it is a tool, and not a panacea.”

Smith and his colleague Carol Ann Browne recently co-wrote the book “Tools and Weapons,” which examines the promise and peril of technology for both good and bad. The authors draw on lessons in history, from Edward Snowden’s revelations about government surveillance to the Cambridge Analytica scandal, to explore the future of technology and how it needs to be managed. 

While he agreed with Rus’ assessment of the book as one “where the geeks are the heroes,” Smith also warned of the dangers of a future “digital 9/11” with respect to the electric grid and future presidential elections. 

“You don’t need to be a PhD in computer science to have an important role to play,” he said. “As consumers, citizens, and voters, we’re at a point of time where we’d all benefit from being better informed.” 

The long-time sustainability advocate also spoke about Microsoft’s ambitious goals to not just be carbon-negative by 2030, but to remove all of the carbon that the company has emitted since 1975 in the next 30 years. At a higher level, Smith advocated for making what he calls “the biggest R&D investment of our century” to develop new techniques to remove carbon from the environment. 

“We’re going to need huge breakthroughs in the next three decades if we are going to achieve this fundamental goal of protecting the planet the way we need to,” he said.

In the hourlong conversation, Smith often implored his audience of computer scientists and technologists to recognize the responsibility they have to solve tangible real-world problems. He pointed out that, for all the buzz about big data over the last 20 years, the Covid-19 pandemic has actually led to government officials making decisions directly informed by data.

“Data is running the economy and deciding who can leave their house and who needs to stay home,” Smith said. “The world will need the kind of technology that we can create … [and] you all have an opportunity to make a more positive impact in the world than perhaps any generation of MIT students has had before. How can that not get you excited about getting up in the morning?”

Read More

Fireflies helps companies get more out of meetings

Many decisions are made and details sorted out in a productive business meeting. But in order for that meeting to translate into results, participants have to remember all those details, understand their assignments, and follow through on commitments.

The startup Fireflies.ai is helping people get the most out of their meetings with a note-taking, information-organizing virtual assistant named Fred. Fred transcribes every word of meetings and then uses artificial intelligence to help people sort and share that information later on.

“There’s a tremendous amount of data generated in meetings that can help your team stay on the same page,” says Sam Udotong ’16, who founded the company with Krish Ramineni in 2016. “We let people capture that data, search through it, and then share it to the places that matter most.”

The tool integrates with popular meeting and scheduling software like Zoom and Google Calendar so users can quickly add Fred to calls. It also works with collaboration platforms like Slack and customer management software like Salesforce to help ensure plans turn into coordinated action.

Fireflies is used by people working in roles including sales, recruiting, and product management. They can use the service to automate project management tasks, screen candidates, and manage internal team communications.

In the last few months, driven in part by the Covid-19 pandemic, Fred has sat through millions of minutes of meetings involving more than half a million people. And the founders believe Fred can do more than simply help people adjust to remote work; it can also help them collaborate more effectively than ever before.

“[Fred] is giving you perfect memory,” says Udotong, who serves as Firelies’ chief technology officer. “The dream is for everyone to have perfect recall and make all their decisions based on the right information. So being able to search back to exact points in conversation and remember that is powerful. People have told us it makes them look smarter in front of clients.”

Taking the leap

Udotong was introduced to the power of machine learning in his first year at MIT while working on a project in which students built a drone that could lead people on campus tours. Later, during his first MIT hackathon, he sought to use machine learning in a cryptography solution. That’s when he met Ramineni, who was a student at the University of Pennsylvania. That’s also when Fireflies was born — although the founders would go on to change everything about the company besides its name as they sought to use artificial intelligence to improve efficiency in a range of fields.

“We ended up building six iterations of Fireflies before this current meeting assistant,” Udotong remembers. “And every time we would build a different iteration, we would tell our friends, ‘Download it, use it, and get back to us next week, we’ll grab coffee.’ We were making all these agreements and promises, and it became really challenging to keep track of all the conversations we were having to get our products out there. We thought, ‘What if we just had an AI that could keep track of conversations for us?’”

The founders’ initial note-taking solution, built in short bursts between classes and homework, tracked action items written in messages, sending reminders to users later on.

Following Udotong’s graduation with a degree in aeronautics and astronautics in 2016, the founders decided to use a $25,000 stipend they received from Rough Draft Ventures, along with $5,000 from the MIT Sandbox Innovation Fund, to work on Fireflies through the summer.

The plan was to work on Fireflies for another short burst: Ramineni was already making plans to attend Cambridge University for his master’s degree in the fall, and Udotong was weighing acceptance letters from graduate schools as well as job offers. By July, however, the founders had changed their plans.

“I think deciding [on a career path] is really hard these days, even if you identify your passion,” Udotong says. “The easy path for someone in tech is to follow the money and go work for Google or Facebook. We decided to go a different route and take the risk.”

They moved to Ramineni’s hometown of San Francisco to officially launch the company. Udotong remembers getting to San Francisco with $100 dollars in his bank account.

The founders had fully committed themselves to Fireflies, but it didn’t make starting the company any easier. They decided not to raise venture capital in the company’s early years, and Ramineni admits to questioning whether going all in on Fireflies was the right decision as recently at 2018.

The founders also weren’t sure a radically new software category would be embraced so readily by businesses. They continued to invest in the voice AI space, as they believed that the need for their technology was growing and the timing was right.

“We realized that there’s a ton of data generated every day through speech, either in meetings like Zoom or in person,” Ramineni says. “Today, two hours after your meeting, unless you’re taking good notes or recording, you’re not going to be able to recall everything. You might not even remember what action items you agreed to a few hours ago. It’s such a common problem that people don’t even know it’s an issue. You have meetings and you expect things to slip through the cracks.”

Illuminating conversations

Today the Fireflies solution shows little trace of the arduous journey the founders took to get to this point. In fact, building simplicity into the tool has been a major focus for the founders.

Fred can join calendar events automatically or be added to meetings using the fred@fireflies.ai address. Fred joins Zoom, Google Meet, Skype, or Microsoft calls as a participant, silently transcribing and generating notes from the meeting. After the meeting, the AI assistant sends a full transcript to whomever the organizer chooses, allowing users to click on sections of the transcript to hear that part of the meeting audio. Users can also search the transcript and go through an hourlong meeting in five minutes, according to the company. The transcript can also surface action items, tasks, metrics, pricing, and other topics of interest.

After each meeting, Fireflies can automatically sync all this meeting data into apps from companies like Slack, Salesforce, and Hubspot.

“Fireflies is like a personal assistant that helps connect your systems of communication with your systems of record,” Udotong says. “If you’re having these meetings over Zoom and Google Meet every day, and you’re interacting with Slack or Trello, Fireflies is that middle router that can bring synchronicity to your work life.”

In the midst of the Covid-19 pandemic, millions of companies have been forced to operate remotely, and the founders think the impact of that response will be felt for far longer than the virus.

“I think the world’s now realizing that people can be fully distributed,” says Ramineni, who notes Fireflies’ team has been remote since he and Udotong began working together in college hackathons from different campuses in 2014.

And as the company has grown, customers have begun using Fred for use cases the founders hadn’t even considered, like sending Fred to meetings that they can’t attend and reviewing the notes later on. Customers, the founders believe, are realizing that being able to quickly search, sort, and otherwise collaborate across audio data unlocks a world of new possibilities.

“It’s kind of like what Google did with search,” Udotong says. “There was five to 10 years of web data building up, and there was no way for people to find what they were looking for. The same thing is true today of audio and meeting data. It’s out there, but there’s no way to actually find what you’re looking for because it’s never even stored in the first place.”

Read More