NVIDIA Delivers Streaming AR and VR from the Cloud with AWS

NVIDIA and AWS are bringing the future of XR streaming to the cloud.

Announced today, the NVIDIA CloudXR platform will be available on Amazon EC2 P3 and G4 instances, which support NVIDIA V100 and T4 GPUs, allowing cloud users to stream high-quality immersive experiences to remote VR and AR devices.

The CloudXR platform includes the NVIDIA CloudXR software development kit, NVIDIA Virtual Workstation software and NVIDIA AI SDKs to deliver photorealistic graphics, with the mobile convenience of all-in-one XR headsets. XR is a collective term for VR, AR and mixed reality.

With the ability to stream from the cloud, professionals can now easily set up, scale and access immersive experiences from anywhere — they no longer need to be tethered to expensive workstations or external VR tracking systems.

The growing availability of advanced tools like CloudXR is paving the way for enhanced collaboration, streamlined workflows and high fidelity virtual environments. XR solutions are also introducing new possibilities for adding AI features and functionality.

With the CloudXR platform, many early access customers and partners across industries like manufacturing, media and entertainment, healthcare and others are enhancing immersive experiences by combining photorealistic graphics with the mobility of wireless head-mounted displays.

Lucid Motors recently announced the new Lucid Air, a powerful and efficient electric vehicle that users can experience through a custom implementation of the ZeroLight platform. Lucid Motors is developing a virtual design showroom using the CloudXR platform. By streaming the experience from AWS, shoppers can enter the virtual environment and see the advanced features of Lucid Air.

“NVIDIA CloudXR allows people all over the world to experience an incredibly immersive, personalized design with the new Lucid Air,” said Thomas Orenz, director of digital interactive marketing at Lucid Motors. “By using the AWS cloud, we can save on infrastructure costs by removing the need for onsite servers, while also dynamically scaling the VR configuration experiences for our customers.”

Another early adopter of CloudXR on AWS is The Gettys Group, a hospitality design, branding and development company based in Chicago. Gettys frequently partners with visualization company Theia Interactive to turn the design process into interactive Unreal Engine VR experiences.

When the coronavirus pandemic hit, Gettys and Theia used NVIDIA CloudXR to deliver customer projects to a local Oculus Quest HMD, streaming from an Amazon EC2 P3 instance with NVIDIA Virtual Workstations.

“This is a game changer — by streaming collaborative experiences from AWS, we can digitally bring project stakeholders together on short notice for quick VR design alignment meetings,” said Ron Swidler, chief innovation officer at The Gettys Group. “This is going to save a ton of time and money, but more importantly it’s going to increase client engagement, understanding and satisfaction.”


Next-Level Streaming from the Cloud

CloudXR is built on NVIDIA RTX GPUs to allow streaming of immersive AR, VR or mixed reality experiences from anywhere.

The platform includes:

  • NVIDIA CloudXR SDK, which provides support for all OpenVR apps and includes broad client support for phones, tablets and HMDs. Its adaptive streaming protocol delivers the richest experiences with the lowest perceived latency by constantly adapting to network conditions.
  • NVIDIA Virtual Workstation software to deliver the most immersive, highest-quality graphics at the fastest frame rates. It’s available from cloud providers such as AWS, or can be deployed from an enterprise data center.
  • NVIDIA AI SDKs to accelerate performance and enhance immersive presence.

With the NVIDIA CloudXR platform on Amazon EC2 G4 and P3 instances supporting NVIDIA T4 and V100 GPUs, companies can deliver high-quality virtual experiences to any user, anywhere in the world.

Availability Coming Soon

NVIDIA CloudXR on AWS will be generally available early next year, with a private beta available in the coming months. Sign up now to get the latest news and updates on upcoming CloudXR releases, including the private beta.


Triaging COVID-19 Patients: 20 Hospitals in 20 Days Build AI Model that Predicts Oxygen Needs

Researchers at NVIDIA and Massachusetts General Brigham Hospital have developed an AI model that determines whether a person showing up in the emergency room with COVID-19 symptoms will need supplemental oxygen hours or even days after an initial exam.

The original model, named CORISK, was developed by scientist Dr. Quanzheng Li at Mass General Brigham. It combines medical imaging and health records to help clinicians more effectively manage hospitalizations at a time when many countries may start seeing a second wave of COVID-19 patients.

Oxygen prediction AI workflow

To develop an AI model that doctors trust and that generalizes to as many hospitals as possible, NVIDIA and Mass General Brigham embarked on an initiative called EXAM (EMR CXR AI Model), the largest, most diverse federated learning initiative, with 20 hospitals from around the world.

In just two weeks, the global collaboration achieved a model with an area under the curve (AUC) of 0.94 (a perfect score is 1.0), resulting in excellent prediction of the level of oxygen required by incoming patients. The federated learning model will be released as part of NVIDIA Clara on NGC in the coming weeks.

Looking Inside the ‘EXAM’ Initiative

Using NVIDIA Clara Federated Learning Framework, researchers at individual hospitals were able to use a chest X-ray, patient vitals and lab values to train a local model and share only a subset of model weights back with the global model in a privacy-preserving technique called federated learning.

The ultimate goal of this model is to predict the likelihood that a person showing up in the emergency room will need supplemental oxygen, which can aid physicians in determining the appropriate level of care for patients, including ICU placement.

Dr. Ittai Dayan, who leads development and deployment of AI at Mass General Brigham, co-led the EXAM initiative with NVIDIA, and facilitated the use of CORISK as the starting point for the federated learning training. The improvements were achieved by training the model on distributed data from a multinational, diverse dataset of patients across North and South America, Canada, Europe and Asia.

In addition to Mass Gen Brigham and its affiliated hospitals, other participants included: Children’s National Hospital in Washington, D.C.; NIHR Cambridge Biomedical Research Centre; The Self-Defense Forces Central Hospital in Tokyo; National Taiwan University MeDA Lab and MAHC and Taiwan National Health Insurance Administration; Kyungpook National University Hospital in South Korea; Faculty of Medicine, Chulalongkorn University in Thailand; Diagnosticos da America SA in Brazil; University of California, San Francisco; VA San Diego; University of Toronto; National Institutes of Health in Bethesda, Maryland; University of Wisconsin-Madison School of Medicine and Public Health; Memorial Sloan Kettering Cancer Center in New York; and Mount Sinai Health System in New York.

Each of these hospitals used NVIDIA Clara to train its local models and participate in EXAM.

Rather than needing to pool patient chest X-rays and other confidential information into a single location, each institution uses a secure, in-house server for its data. A separate server, hosted on AWS, holds the global deep neural network, and each participating hospital gets a copy of the model to train on its own dataset.
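The aggregation step at the heart of this setup is conceptually simple. Below is a minimal sketch of federated averaging in plain NumPy; it illustrates the general technique, not NVIDIA Clara's implementation, and the site class, dataset sizes and weighting scheme are assumptions made for the example.

import numpy as np

class HospitalSite:
    """Stand-in for one hospital's secure, in-house training server."""
    def __init__(self, n_examples, n_params, seed):
        self.n_examples = n_examples
        self.n_params = n_params
        self.rng = np.random.default_rng(seed)

    def train(self, global_weights):
        # Placeholder for local training on the site's own chest X-rays,
        # vitals and lab values; only model weights leave the hospital.
        local_weights = global_weights + 0.01 * self.rng.standard_normal(self.n_params)
        return local_weights, self.n_examples

def federated_round(global_weights, sites):
    # One round of federated averaging: each site trains locally, and the
    # server combines the updates weighted by each site's dataset size.
    updates, sizes = zip(*(site.train(global_weights) for site in sites))
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(updates, sizes))

sites = [HospitalSite(n_examples=1000 * (i + 1), n_params=8, seed=i) for i in range(5)]
global_weights = np.zeros(8)
for _ in range(10):
    global_weights = federated_round(global_weights, sites)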

Collaboration on a Global Scale

Large-scale federated learning projects also are underway, aimed at improving drug discovery and bringing AI benefits to the point of care.

Owkin is teaming up with NVIDIA, King’s College London and more than a dozen other organizations on MELLODDY, a drug-discovery consortium based in the U.K., to demonstrate how federated learning techniques could give pharmaceutical partners the best of both worlds: the ability to leverage the world’s largest collaborative drug compound dataset for AI training without sacrificing data privacy.

King’s College London is hoping that its work with federated learning, as part of its London Medical Imaging and Artificial Intelligence Centre for Value-Based Healthcare project, could lead to breakthroughs in classifying stroke and neurological impairments, determining the underlying causes of cancers, and recommending the best treatment for patients.

Learn more about another AI model for COVID-19 utilizing a multinational dataset in this paper, and about the science behind federated learning in this paper.


American Express Adopts NVIDIA AI to Help Prevent Fraud and Foil Cybercrime

Financial fraud is surging along with waves of cybersecurity breaches.

Cybercrime costs the global economy $600 billion annually, or 0.8 percent of worldwide GDP, according to a 2018 estimate from McAfee. And consulting firm Accenture forecasts cyberattacks could cost companies $5.2 trillion worldwide by 2024.

Credit and bank cards are a major target. American Express, which handles more than eight billion transactions a year, is using deep learning on the NVIDIA GPU computing platform to combat fraud.

American Express has now deployed deep-learning-based models optimized with NVIDIA TensorRT and running on NVIDIA Triton Inference Server to detect fraud, NVIDIA CEO Jensen Huang announced at the GPU Technology Conference on Monday.

NVIDIA TensorRT is a high-performance deep learning inference optimizer and runtime that minimizes latency and maximizes throughput.

NVIDIA Triton Inference Server software simplifies model deployment at scale and can be used as a microservice that enables applications to run AI models in data center production.

“Our fraud algorithms monitor in real time every American Express transaction around the world for more than $1.2 trillion spent annually, and we generate fraud decisions in mere milliseconds,” said Manish Gupta, vice president of Machine Learning and Data Science Research at American Express.

Online Shopping Spree

Online shopping has spiked since the pandemic. In the U.S. alone, online commerce rose 49 percent in April compared with early March, according to Adobe’s Digital Economy Index.

That means less cash and more digital dollars. And more digital dollars mean more bank and credit card usage, which has already seen increased fraud.

“Card fraud netted criminals $3.88 billion more in 2018 than in 2017,” said David Robertson, publisher of The Nilson Report, which tracks information about the global payments industry.

American Express, with more than 115 million active credit cards, has maintained the lowest fraud rate in the industry for 13 years in a row, according to The Nilson Report.

“Having our card members and merchants’ back is our top priority, so keeping our fraud rates low is key to achieving that goal,” said Gupta.

Anomaly Detection with GPU Computing

With online transactions rising, fraudsters are waging more complex attacks as financial firms step up security measures.

One area that is easier to monitor is anomalous spending patterns. These types of transactions on one card — known as “out of pattern” — could show a coffee was purchased in San Francisco and then five minutes later a tank of gas was purchased in Los Angeles.

Such anomalies are red-flagged using recurrent neural networks, or RNNs, which are particularly good at guessing what comes next in a sequence of data.

American Express has deployed long short-term memory networks, or LSTMs, a type of RNN that can provide improved performance.

That helps close gaps in latency and accuracy, two areas where American Express has made leaps. Its teams used NVIDIA DGX systems to accelerate the building and training of these LSTM models on mountains of structured and unstructured data using TensorFlow.
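As a rough illustration of this kind of sequence model (not American Express’s actual architecture), the sketch below scores the latest transaction on a card given a window of recent transactions; the sequence length, feature count and layer sizes are assumptions.

import tensorflow as tf

SEQ_LEN = 20      # number of recent transactions per card (assumed)
N_FEATURES = 32   # per-transaction features such as amount, merchant category, time gaps (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    tf.keras.layers.LSTM(128),                        # summarize the card's recent behavior
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # probability the latest transaction is fraudulent
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=[tf.keras.metrics.AUC()])

# model.fit(transaction_sequences, fraud_labels, batch_size=1024, epochs=5)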

50x Gains Over CPUs

The recently released TensorRT-optimized LSTM network aids the system that analyzes transaction data on tens of millions of daily transactions in real time. This LSTM is now deployed using the NVIDIA Triton Inference Server on NVIDIA T4 GPUs for split-second inference.
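For context, a Python client querying a model hosted on Triton looks roughly like the sketch below, using the open-source tritonclient package; the model name, tensor names and shapes here are placeholders rather than details of the American Express deployment.

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# One batch of transaction sequences: [batch, sequence, features] (shapes are assumptions).
batch = np.random.rand(1, 20, 32).astype(np.float32)

inputs = [httpclient.InferInput("input__0", list(batch.shape), "FP32")]
inputs[0].set_data_from_numpy(batch)
outputs = [httpclient.InferRequestedOutput("output__0")]

result = client.infer(model_name="fraud_lstm", inputs=inputs, outputs=outputs)
fraud_score = result.as_numpy("output__0")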

Results are in: American Express was able to implement this enhanced, real-time fraud detection system for improved accuracy. It operates within a tight two-millisecond latency requirement, and this new system delivers a 50x improvement over a CPU-based configuration, which couldn’t meet the goal.

The financial services giant’s GPU-accelerated LSTM deep neural network combined with its long-standing gradient boosting machine (GBM) model — used for regression and classification — has improved fraud detection accuracy by up to six percent in specific segments.

Accuracy matters. A false positive that denies a customer’s legitimate transaction is an unpleasant experience for card members and merchants alike, says American Express.

“Especially in this environment, our customers need us now more than ever, so we’re supporting them with best-in-class fraud protection and servicing,” Gupta said.

It’s not too late to get access to hundreds of live and on-demand talks at GTC. Register now through Oct. 9 using promo code CMB4KN to get 20 percent off.


Plan2Explore: Active Model-Building for Self-Supervised Visual Reinforcement Learning

To operate successfully in unstructured open-world environments, autonomous intelligent agents need to solve many different tasks and learn new tasks quickly. Reinforcement learning has enabled artificial agents to solve complex tasks both in simulation and in the real world. However, it requires collecting large amounts of experience in the environment for each individual task. Self-supervised reinforcement learning has emerged as an alternative, where the agent follows only an intrinsic objective that is independent of any individual task, analogously to unsupervised representation learning. After acquiring general and reusable knowledge about the environment through self-supervision, the agent can adapt to specific downstream tasks more efficiently.


In this post, we explain our recent publication that develops Plan2Explore. While many recent papers on self-supervised reinforcement learning have focused on model-free agents, our agent learns an internal world model that predicts the future outcomes of potential actions. The world model captures general knowledge, allowing Plan2Explore to quickly solve new tasks through planning in its own imagination. It also enables the agent to explore what it expects to be novel, rather than repeating what it found novel in the past. Plan2Explore obtains state-of-the-art zero-shot and few-shot performance on continuous control benchmarks with high-dimensional input images. To make it easy to experiment with our agent, we are open-sourcing the complete source code.
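As a toy illustration of the exploration objective (a schematic of the latent-disagreement idea, not the released implementation), expected novelty can be estimated as the disagreement among an ensemble of learned one-step prediction models; the model class and dimensions below are made up for the example.

import numpy as np

class OneStepModel:
    """Toy stand-in for one member of the learned ensemble of forward models."""
    def __init__(self, seed, dim=8):
        self.rng = np.random.default_rng(seed)
        self.w = self.rng.standard_normal((dim, dim))

    def predict_next(self, state, action):
        return np.tanh(self.w @ state) + 0.1 * action

def expected_novelty(ensemble, state, action):
    # Disagreement among the ensemble's predictions serves as an intrinsic
    # reward: the agent plans toward states where its world model is uncertain.
    preds = np.stack([m.predict_next(state, action) for m in ensemble])
    return preds.var(axis=0).mean()

ensemble = [OneStepModel(seed=i) for i in range(5)]
print(expected_novelty(ensemble, state=np.zeros(8), action=np.ones(8)))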

Simplify data management with new APIs in Amazon Personalize

Amazon Personalize now makes it easier to manage your growing item and user catalogs with new APIs to incrementally add items and users in your datasets to create personalized recommendations. With the new putItems and putUsers APIs, you can simplify the process of managing your datasets. You no longer need to upload an entire dataset containing historical records and new records just to include new records in your recommendations. Providing new records to Amazon Personalize when they become available reduces your latency for incorporating new information, ensuring your recommendations remain relevant to your users and item catalog.

Based on over 20 years of personalization experience at Amazon.com, Amazon Personalize enables you to improve customer engagement by powering personalized product and content recommendations and targeted marketing promotions. Amazon Personalize uses machine learning (ML) to create higher-quality recommendations for your websites and applications. You can get started without any prior ML experience and use simple APIs to easily build sophisticated personalization capabilities in just a few clicks. Amazon Personalize processes and examines your data, identifies what is meaningful, and trains and optimizes a personalization model that is customized for your data. All your data is encrypted to be private and secure, and is only used to create recommendations for your users.

This post walks you through the process of incrementally modifying your items and users datasets in Amazon Personalize.

Adding new items and users to your datasets

For this use case, we create a dataset group with an interaction dataset, an item dataset (item metadata) and a user dataset using the Amazon Personalize CLI. For instructions on creating a dataset group, see Getting Started (CLI).

  1. Create an Interactions dataset using the following schema and import data using the interactions-100k.csv data file:
{
	"type": "record",
	"name": "Interactions",
	"namespace": "com.amazonaws.personalize.schema",
	"fields": [
		{
			"name": "USER_ID",
			"type": "string"
		},
		{
			"name": "ITEM_ID",
			"type": "string"
		},
		{
			"name": "EVENT_TYPE",
			"type": [
				"string"
			]
		},
		{
			"name": "EVENT_VALUE",
			"type": [
				"null",
				"float"
			]
		},
		{
			"name": "TIMESTAMP",
			"type": "long"
		}
	]
}
  2. Create an Items dataset using the following schema and import data using the csv data file:
{
	"type": "record",
	"name": "Items",
	"namespace": "com.amazonaws.personalize.schema",
	"fields": [
		{
			"name": "ITEM_ID",
			"type": "string"
		},
		{
			"name": "GENRE",
			"type”: "null”
			 "categorical": true
		}
	],
	"version": "1.0"
}
  3. Create a Users dataset using the following schema and import data using the csv data file:
{
	"type": "record",
	"name": "Users",
	"namespace": "com.amazonaws.personalize.schema",
	"fields": [
		{
			"name": "USER_ID",
			"type": "string"
		},
		{
			"name": "AGE",
			"type": "int"
		},
		{
			"name": "GENDER",
			"type": "string"
		}
	],
	"version": "1.0"
}
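With the schemas defined, you can register them and create the corresponding datasets programmatically. The following is a minimal sketch using boto3 for the Items dataset; the dataset group ARN, names, and schema file path are placeholders.

import boto3

personalize = boto3.client('personalize')

with open('items_schema.json') as f:   # the Items schema shown above, saved locally
    items_schema = f.read()

schema_response = personalize.create_schema(
    name='crud-test-items-schema',
    schema=items_schema
)

dataset_response = personalize.create_dataset(
    name='crud-test-items',
    schemaArn=schema_response['schemaArn'],
    datasetGroupArn='arn:aws:personalize:region:acctID:dataset-group/crud-test',
    datasetType='Items'
)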

Now that you have created your datasets, you can add data to them in two different ways:

  • Using bulk import for Items and Users datasets from Amazon Simple Storage Service (Amazon S3). For more information, see Preparing and Importing Data.
  • Using the new putUsers and putItems APIs. You can incrementally add up to 10 records per call to the Users dataset using the putUsers API and to the Items dataset using the putItems API.

For the putUsers call, the Users dataset required schema field (USER_ID) is mapped to the camel case userId. For the putItems call, the Items dataset required schema field (ITEM_ID) is mapped to the camel case itemId.

The following code adds two new users to the Users dataset via the putUsers API:

import boto3

personalize_events = boto3.client('personalize-events')

personalize_events.put_users(
    datasetArn="arn:aws:personalize:region:acctID:dataset/crud-test/USERS",
    users=[
        {
            'userId': "489",
            'properties': '{"AGE": "29", "GENDER": "F"}'
        },
        {
            'userId': "650",
            'properties': '{"AGE": "65", "GENDER": "F"}'
        }
    ]
)

The following code adds a new item to the Items dataset via the putItems API:

personalize_events.put_items(
    datasetArn="arn:aws:personalize:region:acctID:dataset/crud-test/ITEMS",
    items=[
        {
            'itemId': "432",
            'properties': '{"GENRE": "Action"}'
        }
    ]
)

An HTTP/1.1 200 response is returned for successful record creation. In cases where your new item or user doesn’t match your dataset’s defined schema, you receive an InvalidInputException detailing the total number of records in your request that don’t match the schema.
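A minimal way to handle that validation error in code is sketched below; the exception class follows the usual boto3 pattern for modeled service errors, so treat the exact class name as an assumption.

try:
    personalize_events.put_items(
        datasetArn="arn:aws:personalize:region:acctID:dataset/crud-test/ITEMS",
        items=[{'itemId': "432", 'properties': '{"GENRE": "Action"}'}]
    )
except personalize_events.exceptions.InvalidInputException as err:
    # The error message reports how many records in the request failed schema validation.
    print(f"Records did not match the schema: {err}")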

For new records created (incrementally or via bulk upload) with the same userId or itemId as a record that already exists in the Users or Items dataset, the most recently created record (ingested by Amazon Personalize) is used in new solutions or solution versions.

Additionally, records added using putUsers or putItems are persisted until your dataset is deleted, so be sure to delete your dataset in the dataset group before importing a refreshed dataset. Amazon Personalize doesn’t replace your catalog or user data management systems.

Incorporating the newly added users and items in recommendations and filters

Now that you’ve added new items and new users to your datasets, incorporating this information into your Amazon Personalize solutions makes sure that recommendations remain timely and relevant for your users. When not using the aws-user-personalization recipe, solution re-training is needed to include these new items in your personalized recommendations.

If you have exploration enabled in an Amazon Personalize recipe, your new items are included in recommendations as soon as your next campaign update is complete. New events generated by your users’ interactions with these items are incorporated when you train a new solution or solution version in this dataset group.

Any filters you created in the dataset group are updated with your new item and user data within 15 minutes from the last dataset import job completion or the last incremental record. This update allows your campaigns to use your most recent data when filtering recommendations for your users.

Summary

Amazon Personalize allows you to easily manage your growing item and user catalogs so your personalized product and content recommendations keep pace with your business and your customers. For more information about optimizing your user experience with Amazon Personalize, see What Is Amazon Personalize?


About the Authors

Matt Chwastek is a Senior Product Manager for Amazon Personalize. He focuses on delivering products that make it easier to build and use machine learning solutions. In his spare time, he enjoys reading and photography.

Gaurav Singh Chauhan is a Software Engineer for Amazon Personalize and works on architecting software systems and big data pipelines that serve customers at scale. Gaurav has a B.Tech in Computer Science from IIT Bombay, India. Outside of work, he likes all things outdoors and is an avid runner. In his spare time, he likes reading about and exploring new technologies. He tweets on startups, technology, and India at @bazingaurav.

How TensorFlow docs uses Jupyter notebooks

Posted by Billy Lamberta, TensorFlow Team

Jupyter notebooks are an important part of our TensorFlow documentation infrastructure. With the JupyterCon 2020 conference underway, the TensorFlow docs team would like to share some tools we use to manage a large collection of Jupyter notebooks as a first-class documentation format published on tensorflow.org.

As the TensorFlow ecosystem has grown, the TensorFlow documentation has grown into a substantial software project in its own right. We publish ~270 notebook guides and tutorials on tensorflow.org—all tested and available in GitHub. We also publish an additional ~400 translated notebooks for many languages—all tested like their English counterpart. The tooling we’ve developed to work with Jupyter notebooks helps us manage all this content.

Graph showing Notebooks published

When we published our first notebook on tensorflow.org over two years ago for the 2018 TensorFlow Developer Summit, the community response was fantastic. Users love that they can immediately jump from webpage documentation to an interactive computing experience in Google Colab. This setup allows you to run—and experiment with—our guides and tutorials right in the browser, without installing any software on your machine. This tensorflow.org integration with Colab made it much easier to get started and changed how we could teach TensorFlow using Jupyter notebooks. Other machine learning projects soon followed. Notebooks can be loaded directly from GitHub into Google Colab with just the URL:

https://colab.research.google.com/github/<repo>/blob/<branch>/<path>/notebook.ipynb

For compute-intensive tasks, Colab provides TPUs and GPUs at no cost. The TensorFlow documentation, such as this quickstart tutorial, has buttons that link both to its notebook source in GitHub and to a version that opens in Colab.

Better collaboration

Software documentation is a team effort, and notebooks are an expressive, education-focused format that allows engineers and writers to build up an interactive demonstration. Jupyter notebooks are JSON-formatted files that contain text cells and code cells, typically executed in sequential order from top-to-bottom. They are an excellent way to communicate programming ideas, and, with some discipline, a way to share reproducible results.

On the TensorFlow team, notebooks allow engineers, technical writers, and open source contributors to collaborate on the same document without the tension that exists between a separate code example and its published explanation. We write TensorFlow notebooks so that the documentation is the code—self-contained, easily shared, and tested.

Notebook translations with GitLocalize

Documentation needs to reach everyone around the world—something the TensorFlow team values. The TensorFlow community translation project has grown to 10 languages over the past two years. Translation sprints are a great way to engage with the community on open source documentation projects.

To make TensorFlow documentation accessible to even more developers, we worked with Alconost to add Jupyter notebook support to their GitLocalize translation tool. GitLocalize makes it easy to create translated notebooks and sync documentation updates from the source files. Open source contributors can submit pull requests and provide reviews using the TensorFlow GitLocalize project: gitlocalize.com/tensorflow/docs-l10n.

Jupyter notebook support in GitLocalize not only benefits TensorFlow, but is now available for all open source translation projects that use notebooks with GitHub.

TensorFlow docs notebook tools

Incorporating Jupyter notebooks into our docs infrastructure allows us to run and test all the published guides and tutorials to ensure everything on the site works for a new TensorFlow release—using stable or nightly packages.

Benefits aside, there are challenges with managing Jupyter notebooks as source code. To make pull requests and reviews easier for contributors and project maintainers, we created the TensorFlow docs notebook tools to automate common fixes and communicate issues to contributors with continuous integration (CI) tests. You can install the tensorflow-docs pip package directly from the tensorflow/docs GitHub repository:

$ python3 -m pip install -U git+https://github.com/tensorflow/docs

nbfmt

While the Jupyter notebook format is straightforward, notebook authoring environments are often inconsistent with JSON formatting or embed their own metadata in the file. These unnecessary changes can cause diff churn in pull requests that make content reviews difficult. The solution is to use an auto-formatter that outputs consistent notebook JSON.

nbfmt is a notebook formatter with a preference for the TensorFlow docs notebook style. It formats the JSON and strips unneeded metadata except for some Colab-specific fields used for our integration. To run:

$ python3 -m tensorflow_docs.tools.nbfmt [options] notebook.ipynb

For TensorFlow docs projects, notebooks saved without output cells are executed and tested; notebooks saved with output cells are published as-is. We prefer to remove outputs to test our notebooks, but nbfmt can be used with either format.
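Conceptually, stripping outputs is a simple pass over the notebook JSON. The snippet below shows the idea using the nbformat library; it is an illustration, not the nbfmt implementation.

import nbformat

nb = nbformat.read("notebook.ipynb", as_version=4)
for cell in nb.cells:
    if cell.cell_type == "code":
        cell.outputs = []             # drop stored output cells
        cell.execution_count = None   # reset execution counters
nbformat.write(nb, "notebook.ipynb")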

The --test flag is available for continuous integration tests. Instead of updating the notebook, it returns an error if the notebook is not formatted. We use this in a CI test for one of our GitHub Actions workflows. And with some further bot integration, formatting patches can be automatically applied to the contributor’s pull request.

nblint

The easiest way to scale reviews is to let the machine do it. Every project has recurring issues that pop up in reviews, and style questions are often best settled with a style guide (TensorFlow likes the Google developer docs style guide). For a large project, the more patterns you can catch and fix automatically, the more time you’ll have available for other goals.

nblint is a notebook linting tool that checks documentation style rules. We use it to catch common style and structural issues in TensorFlow notebooks:

$ python3 -m tensorflow_docs.tools.nblint [options] notebook.ipynb

Lints are assertions that test specific sections of the notebook. These lints are collected into style modules. nblint tests the google and tensorflow styles by default, and other style modules can be loaded at the command-line. Some styles require arguments that are also passed at the command-line, for example, setting a different repo when linting the TensorFlow translation notebooks:

$ python3 -m tensorflow_docs.tools.nblint \
  --styles=tensorflow,tensorflow_docs_l10n \
  --arg=repo:tensorflow/docs-l10n \
  notebook.ipynb

Lint tests can have an associated fix that makes it easy to update notebooks to pass style checks automatically. Use the --fix argument to apply lint fixes that overwrite the notebook, for example:

$ python3 -m tensorflow_docs.tools.nblint --fix \
  --arg=repo:tensorflow/docs notebook.ipynb

Learn more

TensorFlow is a big fan of Project Jupyter and Jupyter notebooks. Along with Google Colab, notebooks changed how we teach TensorFlow and scale a large open source documentation project with tested guides, tutorials, and translations. We hope that sharing some of the tools will help other open source projects that want to use notebooks as documentation.

Read a TensorFlow tutorial and then run the notebook in Google Colab. To contribute to the TensorFlow documentation project, submit a pull request or a translation review to our GitLocalize project.

Special thanks to Mark Daoust, Wolff Dobson, Yash Katariya, the TensorFlow docs team, and all TensorFlow docs authors, reviewers, contributors, and supporters.


How we make moral decisions

Imagine that one day you’re riding the train and decide to hop the turnstile to avoid paying the fare. It probably won’t have a big impact on the financial well-being of your local transportation system. But now ask yourself, “What if everyone did that?” The outcome is much different — the system would likely go bankrupt and no one would be able to ride the train anymore.

Moral philosophers have long believed this type of reasoning, known as universalization, is the best way to make moral decisions. But do ordinary people spontaneously use this kind of moral judgment in their everyday lives?

In a study of several hundred people, MIT and Harvard University researchers have confirmed that people do use this strategy in particular situations called “threshold problems.” These are social dilemmas in which harm can occur if everyone, or a large number of people, performs a certain action. The authors devised a mathematical model that quantitatively predicts the judgments people are likely to make. They also showed, for the first time, that children as young as 4 years old can use this type of reasoning to judge right and wrong.

“This mechanism seems to be a way that we spontaneously can figure out what are the kinds of actions that I can do that are sustainable in my community,” says Sydney Levine, a postdoc at MIT and Harvard and the lead author of the study.

Other authors of the study are Max Kleiman-Weiner, a postdoc at MIT and Harvard; Laura Schulz, an MIT professor of cognitive science; Joshua Tenenbaum, a professor of computational cognitive science at MIT and a member of MIT’s Center for Brains, Minds, and Machines and Computer Science and Artificial Intelligence Laboratory (CSAIL); and Fiery Cushman, an assistant professor of psychology at Harvard. The paper is appearing this week in the Proceedings of the National Academy of Sciences.

Judging morality

The concept of universalization has been included in philosophical theories since at least the 1700s. Universalization is one of several strategies that philosophers believe people use to make moral judgments, along with outcome-based reasoning and rule-based reasoning. However, there have been few psychological studies of universalization, and many questions remain regarding how often this strategy is used, and under what circumstances.

To explore those questions, the MIT/Harvard team asked participants in their study to evaluate the morality of actions taken in situations where harm could occur if too many people perform the action. In one hypothetical scenario, John, a fisherman, is trying to decide whether to start using a new, more efficient fishing hook that will allow him to catch more fish. However, if every fisherman in his village decided to use the new hook, there would soon be no fish left in the lake.

The researchers found that many subjects did use universalization to evaluate John’s actions, and that their judgments depended on a variety of factors, including the number of people who were interested in using the new hook and the number of users that would trigger a harmful outcome.

To tease out the impact of those factors, the researchers created several versions of the scenario. In one, no one else in the village was interested in using the new hook, and in that scenario, most participants deemed it acceptable for John to use it. However, if others in the village were interested but chose not to use it, then John’s decision to use it was judged to be morally wrong.

The researchers also found that they could use their data to create a mathematical model that explains how people take different factors into account, such as the number of people who want to perform the action and the number of people performing it at which harm would occur. The model accurately predicts how people’s judgments change when these factors change.

In their last set of studies, the researchers created scenarios that they used to test judgments made by children between the ages of 4 and 11. One story featured a child who wanted to take a rock from a path in a park for his rock collection. Children were asked to judge if that was OK, under two different circumstances: In one, only one child wanted a rock, and in the other, many other children also wanted to take rocks for their collections.

The researchers found that most of the children deemed it wrong to take a rock if everyone wanted to, but permissible if there was only one child who wanted to do it. However, the children were not able to specifically explain why they had made those judgments.

“What’s interesting about this is we discovered that if you set up this carefully controlled contrast, the kids seem to be using this computation, even though they can’t articulate it,” Levine says. “They can’t introspect on their cognition and know what they’re doing and why, but they seem to be deploying the mechanism anyway.”

In future studies, the researchers hope to explore how and when the ability to use this type of reasoning develops in children.

Collective action

In the real world, there are many instances where universalization could be a good strategy for making decisions, but it’s not necessary because rules are already in place governing those situations.

“There are a lot of collective action problems in our world that can be solved with universalization, but they’re already solved with governmental regulation,” Levine says. “We don’t rely on people to have to do that kind of reasoning, we just make it illegal to ride the bus without paying.”

However, universalization can still be useful in situations that arise suddenly, before any government regulations or guidelines have been put in place. For example, at the beginning of the Covid-19 pandemic, before many local governments began requiring masks in public places, people contemplating wearing masks might have asked themselves what would happen if everyone decided not to wear one.

The researchers now hope to explore the reasons why people sometimes don’t seem to use universalization in cases where it could be applicable, such as combating climate change. One possible explanation is that people don’t have enough information about the potential harm that can result from certain actions, Levine says.

The research was funded by the John Templeton Foundation, the Templeton World Charity Foundation, and the Center for Brains, Minds, and Machines.


Announcing the winners of request for proposals on agent-based user interaction simulation

In May, we launched a request for research proposals in agent-based user interaction simulation to find and fix integrity and privacy issues. Today, we’re announcing the recipients of these research awards.
Software systems increasingly support communities of users who interact through the platform, elevating the importance and impact of research on integrity and privacy. How do we ensure that such communities remain safe and their data remains private? To tackle these challenges, Facebook is undertaking research and development on a web-enabled simulation (WES) system called WW.

“The WES research agenda offers so many fascinating new scientific challenges. We cannot hope to tackle them all ourselves,” says Facebook Research Scientist Mark Harman. “We are really excited that the strong response to this call will help to build collaboration and partnerships aimed at tackling these challenges.”

We received 86 proposals from 18 countries and 63 universities. Thank you to everyone who took the time to submit a proposal, and congratulations to the winners.

Research award recipients

A game-theoretic approach to evolving and analysing mechanism design in WES
Aldeida Aleti, Chong Chun Yong, Julian Garcia Gallego (Monash University)

Agent-based simulation for public procurement efficiency
Marcelin Joanis, Andrea Lodi, Igor Sadoune (Polytechnique Montréal)

Empirical game-theoretic analysis for web-enabled simulation
Michael P. Wellman, Mithun Chakraborty (University of Michigan)

Identify metamorphic relations for testing web-enabled simulation systems
Pak Lok Poon, Tsong Yueh Chen (Central Queensland University)

MoCA: Multi-objective co-evolutionary learning agents
Kalyanmoy Deb, Vishnu Boddeti (Michigan State University)

Odbody: An ethics and privacy guardian angel for social media users
Munindar P. Singh, Nirav Ajmeri (North Carolina State University)

Planning to induce emotion labels in a social media network
R. Michael Young (University of Utah)

Simulating a bad actor with knowledge graph-assisted action set generation
Ling Chen, Ivor Tsang (University of Technology Sydney)

Finalists

A bot scheduler for web-enabled simulations
Giovanni Denaro, Martin Tappler, Mauro Pezzè, Valerio Terragni (University of Milano-Bicocca)

AgenTest: A collaborative platform for human testers and test agents
Filippo Ricca, Lorenzo Rosasco, Viviana Mascardi (University of Genova)

Co-evolutionary iterated games to dynamically model bad-actor behaviour
Martin Shepperd (Brunel University)

Co-opetitive game theory for web-enabled simulation
Dr. Shaurya Agarwal (University of Florida)

Combinatorial reap-reward approach to expose privacy and trust attacks
Hyunsook Do (University of North Texas)

Detecting privacy leaks in WES via differential testing and diversification
Kangjie Lu (University of Minnesota Twin Cities)

DOTCOM: Deriving automated tests from conversation mutations
Rumyana Neykova, Giuseppe Destefanis, Stephen Swift, Steve Counsell (Brunel University)

Looking for interactions in the crowd: Using search and self-adaptation
Myra Cohen (Iowa State University Foundation)

MINDSET: Multi-agent-based socio-emotional testing
Rui Filipe Fernandes Prada, Manuel Lopes, Pedro Fernandes, Saba Ansari, Tanja E. J. Vos, Wishnu Prasety (INESC-ID)

Multi-agent-based automated data privacy testing for mobile apps
Yuan Tian, Christian Muise, Xuan-Bach D. Le (Queen’s University)

NLP-driven search-based fuzzing of systems with natural language interfaces
Phil McMinn, Gregory M. Kapfhammer, Mark Stevenson, Owain Parry (University of Sheffield)

SANS-T: Strategic agents network for social testing
Rocco Oliveto, Simone Scalabrino (University of Molise)

Synthesize realistic agents based on behavior examples
Harald C. Gall, Pasquale Salza (University of Zurich)

Taming deep learning (making it faster, more explainable)
Tim Menzies (North Carolina State University)

Towards multi-agent imitation learning in real world
Changyou Chen (University at Buffalo, SUNY)

Using SBSE and web-enabled simulation to detect adversaries
Kevin Leach, Westley Weimer (University of Michigan)


Massively Large-Scale Distributed Reinforcement Learning with Menger

Posted by Amir Yazdanbakhsh, Research Scientist, and Junchaeo Chen, Software Engineer, Google Research

In the last decade, reinforcement learning (RL) has become one of the most promising research areas in machine learning and has demonstrated great potential for solving sophisticated real-world problems, such as chip placement and resource management, and solving challenging games (e.g., Go, Dota 2, and hide-and-seek). In simplest terms, an RL infrastructure is a loop of data collection and training, where actors explore the environment and collect samples, which are then sent to the learners to train and update the model. Most current RL techniques require many iterations over batches of millions of samples from the environment to learn a target task (e.g., Dota 2 learns from batches of 2 million frames every 2 seconds). As such, an RL infrastructure should not only scale efficiently (e.g., increase the number of actors) and collect an immense number of samples, but also be able to swiftly iterate over these extensive amounts of samples during training.

Overview of an RL system in which an actor sends trajectories (e.g., multiple samples) to a learner. The learner trains a model using the sampled data and pushes the updated model back to the actor (e.g. TF-Agents, IMPALA).
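In code, this loop can be sketched roughly as follows (a plain-Python schematic of the actor side, not Menger itself; the environment, policy, replay client and model store objects are stand-ins):

def actor_loop(env, policy, replay_client, model_store):
    version = model_store.latest_version()
    policy.set_weights(model_store.get(version))         # pull the current model
    while True:
        trajectory, obs, done = [], env.reset(), False
        while not done:
            action = policy.act(obs)                      # local inference on the actor
            next_obs, reward, done, _ = env.step(action)
            trajectory.append((obs, action, reward))
            obs = next_obs
        replay_client.insert(trajectory)                  # send samples to the learner side
        if model_store.latest_version() != version:       # refresh when the learner publishes
            version = model_store.latest_version()
            policy.set_weights(model_store.get(version))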

Today we introduce Menger1, a massive large-scale distributed RL infrastructure with localized inference that scales up to several thousand actors across multiple processing clusters (e.g., Borg cells), reducing the overall training time in the task of chip placement. In this post we describe how we implement Menger using Google TPU accelerators for fast training iterations, and present its performance and scalability on the challenging task of chip placement. Menger reduces the training time by up to 8.6x compared to a baseline implementation.

Menger System Design
There are various distributed RL systems, such as Acme and SEED RL, each of which focuses on optimizing a single particular design point in the space of distributed reinforcement learning systems. For example, while Acme uses local inference on each actor with frequent model retrieval from the learner, SEED RL benefits from a centralized inference design by allocating a portion of TPU cores for performing batched calls. The tradeoffs between these design points are (1) paying the communication cost of sending/receiving observations and actions to/from a centralized inference server or paying the communication cost of model retrieval from a learner and (2) the cost of inference on actors (e.g., CPUs) compared to accelerators (e.g., TPUs/GPUs). Because of the requirements of our target application (e.g., size of observations, actions, and model size), Menger uses local inference in a manner similar to Acme, but pushes the scalability of actors to a virtually unbounded limit. The main challenges to achieving massive scalability and fast training on accelerators include:

  1. Servicing a large number of read requests from actors to a learner for model retrieval can easily throttle the learner and quickly become a major bottleneck (e.g., significantly increasing the convergence time) as the number of actors increases.
  2. The TPU performance is often limited by the efficiency of the input pipeline in feeding the training data to the TPU compute cores. As the number of TPU compute cores increases (e.g., TPU Pod), the performance of the input pipeline becomes even more critical for the overall training runtime.

Efficient Model Retrieval
To address the first challenge, we introduce transparent and distributed caching components between the learner and the actors optimized in TensorFlow and backed by Reverb (similar approach used in Dota). The main responsibility of the caching components is to strike a balance between the large number of requests from actors and the learner job. Adding these caching components not only significantly reduces the pressure on the learner to service the read requests, but also further distributes the actors across multiple Borg cells with a marginal communication overhead. In our study, we show that for a 16 MB model with 512 actors, the introduced caching components reduce the average read latency by a factor of ~4.0x leading to faster training iterations, especially for on-policy algorithms such as PPO.

Overview of a distributed RL system with multiple actors placed in different Borg cells. Servicing the frequent model update requests from a massive number of actors across different Borg cells throttles the learner and the communication network between learner and actors, which leads to a significant increase in the overall convergence time. The dashed lines represent gRPC communication between different machines.
Overview of a distributed RL system with multiple actors placed in different Borg cells with the introduced transparent and distributed caching service. The learner only sends the updated model to the distributed caching services. Each caching service handles the model request updates from the nearby actors (i.e., actors placed on the same Borg cells) and the caching service. The caching service not only reduces the load on the learner for servicing the model update requests, but also reduces the average read latency by the actors.

High Throughput Input Pipeline
To deliver a high throughput input data pipeline, Menger uses Reverb, a recently open-sourced data storage system designed for machine learning applications that provides an efficient and flexible platform to implement experience replay in a variety of on-policy/off-policy algorithms. However, using a single Reverb replay buffer service does not currently scale well in a distributed RL setting with thousands of actors, and simply becomes inefficient in terms of write throughput from actors.

A distributed RL system with a single replay buffer. Servicing a massive number of write requests from actors throttles the replay buffer and reduces its overall throughput. In addition, as we scale the learner to a setting with multiple compute engines (e.g., TPU Pod), feeding the data to these engines from a single replay buffer service becomes inefficient, which negatively impacts the overall convergence time.

To better understand the efficiency of the replay buffer in a distributed setting, we evaluate the average write latency for various payload sizes from 16 MB to 512 MB and a number of actors ranging from 16 to 2048. We repeat the experiment when the replay buffer and actors are placed on the same Borg cell. As the number of actors grows the average write latency also increases significantly. Expanding the number of actors from 16 to 2048, the average write latency increases by a factor of ~6.2x and ~18.9x for payload size 16 MB and 512 MB, respectively. This increase in the write latency negatively impacts the data collection time and leads to inefficiency in the overall training time.

The average write latency to a single Reverb replay buffer for various payload sizes (16 MB – 512 MB) and various number of actors (16 to 2048) when the actors and replay buffer are placed on the same Borg cells.

To mitigate this, we use the sharding capability provided by Reverb to increase the throughput between actors, learner, and replay buffer services. Sharding balances the write load from the large number of actors across multiple replay buffer servers, instead of throttling a single replay buffer server, and also minimizes the average write latency for each replay buffer server (as fewer actors share the same server). This enables Menger to scale efficiently to thousands of actors across multiple Borg cells.

A distributed RL system with sharded replay buffers. Each replay buffer service is a dedicated data storage for a collection of actors, generally located on the same Borg cells. In addition, the sharded replay buffer configuration provides a higher throughput input pipeline to the accelerator cores.
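A rough sketch of what sharding looks like with the open-source Reverb API is shown below; the table name, sizes, ports and the assignment of actors to shards are illustrative assumptions, not Menger's configuration.

import numpy as np
import reverb

def start_replay_shard(port, max_size=1_000_000):
    # One replay buffer server; in practice one or more shards per Borg cell.
    return reverb.Server(
        tables=[
            reverb.Table(
                name="experience",
                sampler=reverb.selectors.Uniform(),
                remover=reverb.selectors.Fifo(),
                max_size=max_size,
                rate_limiter=reverb.rate_limiters.MinSize(1),
            )
        ],
        port=port,
    )

ports = (8000, 8001, 8002, 8003)
servers = [start_replay_shard(port) for port in ports]            # e.g., four shards
clients = [reverb.Client(f"localhost:{port}") for port in ports]

# An actor assigned to shard 0 writes one step of experience to its local shard.
observation, action, reward = np.zeros(84, np.float32), np.int64(1), np.float32(0.0)
clients[0].insert([observation, action, reward], priorities={"experience": 1.0})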

Case Study: Chip Placement
We studied the benefits of Menger in the complex task of chip placement for a large netlist. Using 512 TPU cores, Menger achieves significant improvements in training time compared to a strong baseline (up to ~8.6x, reducing the training time from ~8.6 hours down to merely one hour in the fastest configuration). While Menger was optimized for TPUs, we believe the key factor in this performance gain is the architecture, and we would expect to see similar gains when it is tailored for GPUs.

The improvement in training time using Menger with variable number of TPU cores compared to a baseline in the task of chip placement.

We believe that the Menger infrastructure and its promising results on the intricate task of chip placement demonstrate an innovative path toward further shortening the chip design cycle, with the potential to enable further innovations not only in the chip design process but in other challenging real-world tasks as well.

Acknowledgments
Most of the work was done by Amir Yazdanbakhsh, Junchaeo Chen, and Yu Zheng. We would like to also thank Robert Ormandi, Ebrahim Songhori, Shen Wang, TF-Agents team, Albin Cassirer, Aviral Kumar, James Laudon, John Wilkes, Joe Jiang, Milad Hashemi, Sat Chatterjee, Piotr Stanczyk, Sabela Ramos, Lasse Espeholt, Marcin Michalski, Sam Fishman, Ruoxin Sang, Azalia Mirhosseini, Anna Goldie, and Eric Johnson for their help and support.


1 A Menger cube is a three-dimensional fractal curve, and the inspiration for the name of this system, given that the proposed infrastructure can virtually scale ad infinitum.


Announcing the winner of the AWS DeepComposer Chartbusters The Sounds of Science challenge

We’re excited to announce the top 10 compositions and the winner of the AWS DeepComposer Chartbusters The Sounds of Science challenge. AWS DeepComposer provides a creative and hands-on experience for learning generative AI and machine learning (ML). Chartbusters is a global monthly challenge where you can use AWS DeepComposer to create original compositions and compete to top the charts and win prizes. To participate in The Sounds of Science, developers composed background music for a video clip using the Autoregressive CNN (AR-CNN) algorithm and edited notes with the newly launched Edit melody feature to better match the provided video.

Top 10 compositions

The high-quality submissions made it challenging for our judges to select the chart-toppers. Our panel of experts—Kesha Williams, Sally Revell, and Prachi Kumar—selected the top 10 ranked compositions by evaluating the quality of the music, creativity, and how well the music matched the video clip.

The winner of The Sounds of Science is… (cue drum roll) Sungin Lee! You can listen to his winning composition and the top 10 compositions on SoundCloud or on the AWS DeepComposer console.

Sungin will receive an AWS DeepComposer Chartbusters gold record and will tell his story in an upcoming post, right here on the AWS ML blog.

Congratulations, Sungin Lee!

It’s time to move on to the next Chartbusters challenge, Track or Treat, which is Halloween-themed. The challenge launches today and is open until October 23, 2020.


About the Author

Maryam Rezapoor is a Senior Product Manager with AWS AI Ecosystem team. As a former biomedical researcher and entrepreneur, she finds her passion in working backward from customers’ needs to create new impactful solutions. Outside of work, she enjoys hiking, photography, and gardening.
