Making a Splash: AI Can Help Protect Ocean Goers From Deadly Rips

Surfers, swimmers and beachgoers face a hidden danger in the ocean: rip currents. These narrow channels of water can flow away from the shore at speeds up to 2.5 meters per second, making them one of the biggest safety risks for those enjoying the ocean.

To help keep beachgoers safe, Christo Rautenbach, a coastal and estuarine physical processes scientist, has teamed up with the National Institute of Water and Atmospheric Research in New Zealand to develop a real-time rip current identification tool using deep learning.

On this episode of the NVIDIA AI Podcast, host Noah Kravitz interviews Rautenbach about how AI can be used to identify rip currents and the potential for the tool to be used globally to help reduce the number of fatalities caused by rip currents.

Developed in collaboration with Surf Lifesaving New Zealand, the rip current identification tool has achieved a detection rate of roughly 90% in trials. Rautenbach also shares the research behind the technology, which was published in the November 22 edition of the journal Remote Sensing.

You Might Also Like

Art(ificial) Intelligence: Pindar Van Arman Builds Robots That Paint
Pindar Van Arman, an American artist and roboticist, designs painting robots that explore the differences between human and computational creativity. Since his first system in 2005, he has built multiple artificially creative robots. The most famous, Cloud Painter, was awarded first place at Robotart 2018.

Real or Not Real? Attorney Steven Frank Uses Deep Learning to Authenticate Art
Steven Frank is a partner at the law firm Morgan Lewis, specializing in intellectual property and commercial technology law. He’s also half of the husband-wife team that used convolutional neural networks to authenticate artistic masterpieces, including da Vinci’s Salvator Mundi, with AI’s help.

GANTheftAuto: Harrison Kinsley on AI-Generated Gaming Environments
Humans playing games against machines is nothing new, but now computers can develop games for people to play. Programming enthusiast and social media influencer Harrison Kinsley created GANTheftAuto, an AI-based neural network that generates a playable chunk of the classic video game Grand Theft Auto V.

Subscribe to the AI Podcast on Your Favorite Platform

You can now listen to the AI Podcast through Amazon Music, Apple Music, Google Podcasts, Google Play, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Featured image credit: T. Caulfield

Read More

Google Research, 2022 & beyond: Robotics

(This is Part 6 in our series of posts covering different topical areas of research at Google. You can find other posts in the series here.)

Within our lifetimes, we will see robotic technologies that can help with everyday activities, enhancing human productivity and quality of life. Before robots can be broadly useful in helping with practical day-to-day tasks in people-centered spaces — spaces designed for people, not machines — they need to be able to safely and competently provide assistance to people.

In 2022, we focused on challenges that come with enabling robots to be more helpful to people: 1) allowing robots and humans to communicate more efficiently and naturally; 2) enabling robots to understand and apply common sense knowledge in real-world situations; and 3) scaling the number of low-level skills robots need to effectively perform tasks in unstructured environments.

An undercurrent this past year has been the exploration of how large, generalist models, like PaLM, can work alongside other approaches to surface capabilities allowing robots to learn from a breadth of human knowledge and allowing people to engage with robots more naturally. As we do this, we’re transforming robot learning into a scalable data problem so that we can scale learning of generalized low-level skills, like manipulation. In this blog post, we’ll review key learnings and themes from our explorations in 2022.

Bringing the capabilities of LLMs to robotics

An incredible feature of large language models (LLMs) is their ability to encode descriptions and context into a format that’s understandable by both people and machines. When applied to robotics, LLMs let people task robots more easily — just by asking — with natural language. When combined with vision models and robotics learning approaches, LLMs give robots a way to understand the context of a person’s request and make decisions about what actions should be taken to complete it.

One of the underlying concepts is using LLMs to prompt other pretrained models for information that can build context about what is happening in a scene and make predictions about multimodal tasks. This is similar to the Socratic method in teaching, where a teacher asks students questions to lead them through a rational thought process. In “Socratic Models”, we showed that this approach can achieve state-of-the-art performance in zero-shot image captioning and video-to-text retrieval tasks. It also enables new capabilities, like answering free-form questions about and predicting future activity from video, multimodal assistive dialogue, and, as we’ll discuss next, robot perception and planning.

In “Towards Helpful Robots: Grounding Language in Robotic Affordances”, we partnered with Everyday Robots to ground the PaLM language model in a robotics affordance model to plan long horizon tasks. In previous machine-learned approaches, robots were limited to short, hard-coded commands, like “Pick up the sponge,” because they struggled with reasoning about the steps needed to complete a task — which is even harder when the task is given as an abstract goal like, “Can you help clean up this spill?”

With PaLM-SayCan, the robot acts as the language model’s “hands and eyes,” while the language model supplies high-level semantic knowledge about the task.

For this approach to work, one needs both an LLM that can predict the sequence of steps to complete long-horizon tasks and an affordance model representing the skills a robot can actually perform in a given situation. In “Extracting Skill-Centric State Abstractions from Value Functions”, we showed that the value function in reinforcement learning (RL) models can be used to build the affordance model — an abstract representation of the actions a robot can perform under different states. This lets us connect long-horizon real-world tasks, like “tidy the living room”, to the short-horizon skills needed to complete them, like correctly picking, placing, and arranging items.
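
To make the idea concrete, here is a minimal, hypothetical sketch of this kind of scoring loop: candidate skills are ranked by combining the language model's estimate that a skill is a useful next step with the value-function (affordance) estimate that the skill can succeed in the current state. The function names and inputs are illustrative assumptions, not the actual PaLM-SayCan implementation.

# Hypothetical sketch of SayCan-style skill selection (illustrative, not the released code).
def select_next_skill(instruction, history, state, skills, llm_score, affordance_value):
    """Pick the skill that maximizes LLM usefulness times affordance feasibility."""
    best_skill, best_score = None, float("-inf")
    for skill in skills:
        usefulness = llm_score(instruction, history, skill)   # how useful the LLM thinks this step is
        feasibility = affordance_value(state, skill)          # how likely the skill is to succeed here
        score = usefulness * feasibility
        if score > best_score:
            best_skill, best_score = skill, score
    return best_skill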

Having both an LLM and an affordance model doesn’t mean that the robot will actually be able to complete the task successfully. However, with Inner Monologue, we closed the loop on LLM-based task planning with other sources of information, like human feedback or scene understanding, to detect when the robot fails to complete the task correctly. Using a robot from Everyday Robots, we show that LLMs can effectively replan if the current or previous plan steps failed, allowing the robot to recover from failures and complete complex tasks like “Put a coke in the top drawer,” as shown in the video below.

An emergent capability from closing the loop on LLM-based task planning that we saw with Inner Monologue is that the robot can react to changes in the high-level goal mid-task. For example, a person might tell the robot to change its behavior as it is happening, by offering quick corrections or redirecting the robot to another task. This behavior is especially useful to let people interactively control and customize robot tasks when robots are working near people.

While natural language makes it easier for people to specify and modify robot tasks, one of the challenges is being able to react in real time to the full vocabulary people can use to describe tasks that a robot is capable of doing. In “Talking to Robots in Real Time”, we demonstrated a large-scale imitation learning framework for producing real-time, open-vocabulary, language-conditionable robots. With one policy we were able to address over 87,000 unique instructions, with an estimated average success rate of 93.5%. As part of this project, we released Language-Table, the largest available language-annotated robot dataset, which we hope will drive further research focused on real-time language-controllable robots.

Examples of long horizon goals reached under real time human language guidance.

We’re also excited about the potential for LLMs to write code that can control robot actions. Code-writing approaches, like in “Robots That Write Their Own Code”, show promise in increasing the complexity of tasks robots can complete by autonomously generating new code that re-composes API calls, synthesizes new functions, and expresses feedback loops to assemble new behaviors at runtime.

Code as Policies uses code-writing language models to map natural language instructions to robot code to complete tasks. Generated code can call existing perception action APIs, third party libraries, or write new functions at runtime.
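
To give a flavor of what such generated policy code can look like, here is a hypothetical example in the spirit of Code as Policies. The perception and action APIs (detect_objects, pick, place) are invented for illustration and are not the APIs used in the actual system.

# Hypothetical policy code an LLM might generate for: "Put the red block on the blue bowl."
import numpy as np

def put_red_block_on_blue_bowl(detect_objects, pick, place):
    objects = detect_objects()                            # assumed perception API
    block, bowl = objects["red block"], objects["blue bowl"]
    pick(block.position)                                  # assumed action API
    target = bowl.position + np.array([0.0, 0.0, 0.05])   # place slightly above the bowl
    place(target)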

Turning robot learning into a scalable data problem

Large language and multimodal models help robots understand the context in which they’re operating, like what’s happening in a scene and what the robot is expected to do. But robots also need low-level physical skills to complete tasks in the physical world, like picking up and precisely placing objects.

While we often take these physical skills for granted, executing them hundreds of times every day without even thinking, they present significant challenges to robots. For example, to pick up an object, the robot needs to perceive and understand the environment, reason about the spatial relation and contact dynamics between its gripper and the object, actuate the high degrees-of-freedom arm precisely, and exert the right amount of force to stably grasp the object without breaking it. The difficulty of learning these low-level skills is known as Moravec’s paradox: reasoning requires very little computation, but sensorimotor and perception skills require enormous computational resources.

Inspired by the recent success of LLMs, which shows that the generalization and performance of large Transformer-based models scale with the amount of data, we are taking a data-driven approach, turning the problem of learning low-level physical skills into a scalable data problem. With Robotics Transformer-1 (RT-1), we trained a robot manipulation policy on a large-scale, real-world robotics dataset of 130k episodes covering 700+ tasks, using a fleet of 13 robots from Everyday Robots, and showed the same trend for robotics — increasing the scale and diversity of data improves the model’s ability to generalize to new tasks, environments, and objects.

Example PaLM-SayCan-RT1 executions of long-horizon tasks in real kitchens.

Behind both language models and many of our robotics learning approaches, like RT-1, are Transformers, which allow models to make sense of Internet-scale data. Unlike LLMs, robotics is challenged by multimodal representations of constantly changing environments and limited compute. In 2020, we introduced Performers as an approach to make Transformers more computationally efficient, which has implications for many applications beyond robotics. In Performer-MPC, we applied this to introduce a new class of implicit control policies combining the benefits of imitation learning with the robust handling of system constraints from Model Predictive Control (MPC). We show a >40% improvement on the robot reaching its goal and a >65% improvement on social metrics when navigating around humans in comparison to a standard MPC policy. Performer-MPC provides 8 ms latency for the 8.3M parameter model, making on-robot deployment of Transformers practical.

Navigation robot maneuvering through highly constrained spaces using: Regular MPC, Explicit Policy, and Performer-MPC.

In the last year, our team has shown that data-driven approaches are generally applicable on different robotic platforms in diverse environments to learn a wide range of tasks, including mobile manipulation, navigation, locomotion and table tennis. This shows us a clear path forward for learning low-level robot skills: scalable data collection. Unlike video and text data that is abundant on the Internet, robotic data is extremely scarce and hard to acquire. Finding approaches to collect and efficiently use rich datasets representative of real-world interactions is the key for our data-driven approaches.

Simulation is a fast, safe, and easily parallelizable option, but it is difficult to replicate the full environment, especially physics and human-robot interactions, in simulation. In i-Sim2Real, we showed an approach to address the sim-to-real gap and learn to play table tennis with a human opponent by bootstrapping from a simple model of human behavior and alternating between training in simulation and deploying in the real world. In each iteration, both the human behavior model and the policy are refined.

Learning to play table tennis with a human opponent.

While simulation helps, collecting data in the real world is essential for fine-tuning simulation policies or adapting existing policies in new environments. While learning, robots are prone to failure, which can damage themselves and their surroundings — especially in the early stages of learning, when they are exploring how to interact with the world. We need to collect training data safely, even while the robot is learning, and enable the robot to autonomously recover from failure. In “Learning Locomotion Skills Safely in the Real World”, we introduced a safe RL framework that switches between a “learner policy” optimized to perform the desired task and a “safe recovery policy” that prevents the robot from reaching unsafe states. In “Legged Robots that Keep on Learning”, we trained a reset policy so the robot can recover from failures, like learning to stand up by itself after falling.

Automatic reset policies enable the robot to continue learning in a lifelong fashion without human supervision.
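
The switching logic described above can be sketched very simply. This is an illustrative outline under assumed helper functions (is_unsafe, learner_policy, recovery_policy), not the framework's actual code.

# Illustrative sketch of the safe-learning switch between a learner and a recovery policy.
def safe_step(state, learner_policy, recovery_policy, is_unsafe):
    """Use the recovery policy whenever the state is predicted to be unsafe."""
    if is_unsafe(state):
        return recovery_policy(state)   # steer the robot back to a safe region
    return learner_policy(state)        # otherwise keep optimizing the task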

While robot data is scarce, videos of people performing different tasks are abundant. Of course, robots aren’t built like people — so the idea of robotic learning from people raises the problem of transferring learning across different embodiments. In “Robot See, Robot Do”, we developed Cross-Embodiment Inverse Reinforcement Learning to learn new tasks by watching people. Instead of trying to replicate the task exactly as a person would, we learn the high-level task objective, and summarize that knowledge in the form of a reward function. This type of demonstration learning could allow robots to learn skills by watching videos readily available on the internet.

We’re also progressing towards making our learning algorithms more data efficient so that we’re not relying only on scaling data collection. We improved the efficiency of RL approaches by incorporating prior information, including predictive information, adversarial motion priors, and guide policies. Further improvements are gained by utilizing a novel structured dynamical systems architecture and combining RL with trajectory optimization, supported by novel solvers. These types of prior information helped alleviate the exploration challenges, served as good regularizers, and significantly reduced the amount of data required. Furthermore, our team has invested heavily in more data-efficient imitation learning. We showed that a simple imitation learning approach, BC-Z, can enable zero-shot generalization to new tasks that were not seen during training. We also introduced an iterative imitation learning algorithm, GoalsEye, which combined Learning from Play and Goal-Conditioned Behavior Cloning for high-speed and high-precision table tennis games. On the theoretical front, we investigated dynamical-systems stability for characterizing the sample complexity of imitation learning, and the role of capturing failure-and-recovery within demonstration data to better condition offline learning from smaller datasets.

Closing

Advances in large models across the field of AI have spurred a leap in capabilities for robot learning. This past year, we’ve seen the sense of context and sequencing of events captured in LLMs help solve long-horizon planning for robotics and make robots easier for people to interact with and task. We’ve also seen a scalable path to learning robust and generalizable robot behaviors by applying a transformer model architecture to robot learning. We continue to open source data sets, like “Scanned Objects: A Dataset of 3D-Scanned Common Household Items”, and models, like RT-1, in the spirit of participating in the broader research community. We’re excited about building on these research themes in the coming year to enable helpful robots.

Acknowledgements

We would like to thank everyone who supported our research. This includes the entire Robotics at Google team, and collaborators from Everyday Robots and Google Research. We also want to thank our external collaborators, including UC Berkeley, Stanford, Gatech, University of Washington, MIT, CMU and U Penn.

Google Research, 2022 & beyond

This was the sixth blog post in the “Google Research, 2022 & Beyond” series. Other posts in this series are listed in the table below:

  • Language Models
  • Computer Vision
  • Multimodal Models
  • Generative Models
  • Responsible AI
  • ML & Computer Systems
  • Efficient Deep Learning
  • Algorithmic Advances
  • Robotics
  • Health*
  • General Science & Quantum
  • Community Engagement
* Articles will be linked as they are released.

Read More

Building AI chatbots using Amazon Lex and Amazon Kendra for filtering query results based on user context

Amazon Kendra is an intelligent search service powered by machine learning (ML). It indexes the documents stored in a wide range of repositories and finds the most relevant document based on the keywords or natural language questions the user has searched for. In some scenarios, you need the search results to be filtered based on the context of the user making the search. Additional refinement is needed to find the documents specific to that user or user group as the top search result.

In this blog post, we focus on retrieving custom search results that apply to a specific user or user group. For instance, faculty members in an educational institution belong to different departments; if a professor from the computer science department signs in to the application and searches with the keywords “faculty courses,” documents relevant to that department come up as the top results, based on data source availability.

Solution overview

To solve this problem, you can identify one or more metadata attributes associated with the documents being indexed and searched. When the user signs in to an Amazon Lex chatbot, user context information can be derived from Amazon Cognito. The Amazon Lex chatbot can be integrated with Amazon Kendra using a direct integration or via an AWS Lambda function. Using an AWS Lambda function gives you fine-grained control of the Amazon Kendra API calls, allowing you to pass contextual information from the Amazon Lex chatbot to Amazon Kendra to fine-tune the search queries.

In Amazon Kendra, you provide document metadata attributes using custom attributes. To customize the document metadata during the ingestion process, refer to the Amazon Kendra Developer Guide. After completing the document metadata generation and indexing steps, you need to focus on refining the search results using the metadata attributes. Based on this, for example, you can ensure that users from the computer science department get search results ranked according to their relevance to that department. That is, if there’s a document relevant to that department, it should be at the top of the search-result list, preceding any document with no department information or a nonmatching department.

Let’s now explore how to build this solution in more detail.

Solution walkthrough

Figure 1: Architecture diagram of proposed solution

The sample architecture used in this blog to demonstrate the use case is shown in Figure 1. You will set up an Amazon Kendra document index that consumes data from an Amazon Simple Storage Service (Amazon S3) bucket. You will set up a simple chatbot using Amazon Lex that connects to the Amazon Kendra index via an AWS Lambda function. Users will rely on Amazon Cognito to authenticate and gain access to the Amazon Lex chatbot user interface. For the purposes of the demo, you will have two different users in Amazon Cognito belonging to two different departments. Using this setup, when you sign in as User 1 in Department A, search results will be filtered to documents belonging to Department A, and vice versa for the Department B user.

Prerequisites

Before you can try to integrate the Amazon Lex chatbot with an Amazon Kendra index, you need to set up the basic building blocks for the solution. At a high level, you need to perform the following steps to enable this demo:

  1. Set up an S3 bucket data source with the appropriate documents and folder structure. For instructions on creating S3 buckets, please refer to AWS Documentation – Creating a bucket. Store the required document metadata along with the documents statically in the S3 bucket. To understand how to store document metadata for your documents in the S3 bucket, please refer to AWS Documentation – Amazon S3 document metadata. A sample metadata file could look like the one below:
{
    "DocumentId": "Faculty Certification Course-Computer Science",
    "Attributes": {
        "_category": "dosanddonts",
        "department": "Computer Science",
        "document_type": "job aid"
    },
    "Title": "Faculty Certification Course-Computer Science",
    "ContentType": "PDF"
}
  2. Set up an Amazon Kendra index by following the AWS documentation – Creating an index.
  3. Add the S3 bucket as a data source to your index by following the AWS Documentation – Using an Amazon S3 data source. Ensure that Amazon Kendra is aware of the metadata information and allows the department information to be faceted.
  4. Ensure the custom attributes in the Amazon Kendra index are set to be facetable, searchable, and displayable. You can do this in the Amazon Kendra console by going to Data management and choosing Facet definition. To do this using the AWS command line interface (AWS CLI), you can leverage the kendra update-index command; a hedged boto3 sketch of the equivalent API call is shown after this list.
  5. Set up an Amazon Cognito user pool with two users. Associate a custom attribute with each user to capture their department value.
  6. Build a simple Amazon Lex v2 chatbot with the required intents, slots, and utterances to drive the use case. In this blog, we will not provide detailed guidance on setting up the basic bot, as the focus of the blog is to understand how to send user context information from the front end to the Amazon Kendra index. For details on creating a simple Amazon Lex bot, refer to the Building bots documentation. For the rest of the blog, it is assumed that the Amazon Lex chatbot has the following:
    • Intent – SearchCourses
    • Utterance – “What courses are available in {subject_types}?”
    • Slot – elective_year (can have values – elective, nonelective)
  7. You need to create a chatbot interface that the user will use to authenticate and interact with the chatbot. You can use the Sample Amazon Lex Web Interface (lex-web-ui) provided by AWS to get started. This will simplify the process of testing the integrations as it already integrates Amazon Cognito for user authentication and passes the required contextual information and Amazon Cognito JWT identity token to the backend Amazon Lex chatbot.
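
As referenced in step 4 above, the following is a minimal sketch of marking the department attribute as facetable, searchable, and displayable with boto3. The index ID is a placeholder, and you should verify the configuration shape against the current Amazon Kendra API reference; treat this as an illustrative assumption rather than a drop-in command.

import boto3

kendra = boto3.client('kendra')

# Mark the custom "department" attribute as facetable, searchable, and displayable.
kendra.update_index(
    Id='your-kendra-index-id',  # placeholder index ID
    DocumentMetadataConfigurationUpdates=[
        {
            'Name': 'department',
            'Type': 'STRING_VALUE',
            'Search': {
                'Facetable': True,
                'Searchable': True,
                'Displayable': True
            }
        }
    ]
)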

Once the basic building blocks are in place, your next step will be to create the AWS Lambda function that will tie together the Amazon Lex chatbot intent fulfillment with the Amazon Kendra index. The rest of this blog will specifically focus on this step and provide details on how to achieve this integration.

Integrating Amazon Lex with Amazon Kendra to pass user context

Now that the prerequisites are in place, you can start working on integrating your Amazon Lex chatbot with the Amazon Kendra index. As part of the integration, you will need to perform the following tasks:

  • Write an AWS Lambda function that will be attached to your Amazon Lex chatbot. In this Lambda function, you will parse the incoming input event to extract the user information, such as the user ID and additional attributes for the user from the Amazon Cognito identity token that is passed in as part of the session attributes in the event object.
  • Once all the information to form the Amazon Kendra query is in place, you submit a query to the Amazon Kendra index, including all the custom attributes that you want to use to scope down the search results view.
  • Finally, once the Amazon Kendra query returns the results, you generate a proper Amazon Lex response object to send the search results response back to the user.
  • Associate the AWS Lambda function with the Amazon Lex chatbot so that whenever the chatbot receives a query from the user, it triggers the AWS Lambda function.

Let’s look at these steps in more detail below.

Extracting user context in AWS Lambda function

The first thing you need to do is code and set up the Lambda function that acts as a bridge between the Amazon Lex chatbot intent and the Amazon Kendra index. The input event format documentation provides the full JavaScript Object Notation (JSON) structure of the input event. If the authentication system provides the user ID as part of the HTTP POST request to Amazon Lex, then the value will be available in the “userId” key of the JSON object. When the authentication is performed using Amazon Cognito, the “sessionState”.“sessionAttributes”.“idtokenjwt” key will contain a JSON Web Token (JWT) object. If you are programming the AWS Lambda function in Python, the two lines of code to read these attributes from the event object are as follows:

userid = event['userId']
token = event['sessionState']['sessionAttributes']['idtokenjwt']

The JWT token is encoded. Once you’ve decoded the JWT token, you will be able to read the value of the custom attribute associated with the Amazon Cognito user. Refer to How can I decode and verify the signature of an Amazon Cognito JSON Web Token to understand how to decode the JWT token, verify it, and retrieve the custom values. Once you have the claims from the token, you can extract the custom attribute, like “department” in Python, as follows:

userDept = claims['custom:department']
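
For completeness, here is one hedged way the claims dictionary used above could be obtained inside the Lambda function, using the third-party python-jose library (an assumption; any JWT library works). The snippet only extracts unverified claims; a production function must also verify the token signature against your user pool’s JSON Web Key Set, as described in the linked article.

from jose import jwt  # assumes the python-jose package is bundled with the Lambda deployment

def extract_department(event):
    """Read the Cognito ID token from the session attributes and return the department claim."""
    token = event['sessionState']['sessionAttributes']['idtokenjwt']
    claims = jwt.get_unverified_claims(token)  # verify the signature against the user pool JWKS in production
    return claims.get('custom:department')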

When using a third-party identity provider (IdP) to authenticate against the chatbot, you need to ensure that the IdP sends a token with the required attributes. The token should include the data required for the custom attributes, such as department, group memberships, and so on. This will be passed to the Amazon Lex chatbot in the session context variables. If you are using the lex-web-ui as the chatbot interface, refer to the credential management section of the lex-web-ui readme documentation to understand how Amazon Cognito is integrated with lex-web-ui. To understand how you can integrate third-party identity providers with an Amazon Cognito identity pool, refer to the documentation on Identity pools (federated identities) external identity providers.

You can extract the user’s query topic from the event object by reading the values of the slots identified by Amazon Lex. The actual value of a slot can be read from the attribute with the key “sessionState”.“intent”.“slots”.“slot name”.“value”.“interpretedValue”, based on the identified data type. In the example in this blog, using Python, you could use the following line of code to read the query value:

slotValue = event['sessionState']['intent']['slots']['elective_year']['value']['interpretedValue']

As described in the documentation for the input event format, the slots value is an object that can contain multiple entries of different data types. The data type for any given value is indicated by “sessionState”.“intent”.“slots”.“slot name”.“shape”. If this attribute is empty or missing, the data type is a string. In the example in this blog, using Python, you could use the following line of code to read the slot’s shape:

slotType = event['sessionState']['intent']['slots']['elective_year']['shape']

Once you know the data format for the slot, you can interpret the value of ‘slotValue’ based on the data type identified in ‘slotType’.

Query Amazon Kendra index from AWS Lambda

Now that you’ve managed to extract all the relevant information from the input event object, you need to construct an Amazon Kendra query within the Lambda. Amazon Kendra lets you filter queries via specific attributes. When you submit a query to Amazon Kendra using the Query API, you can provide a document attribute as an attribute filter so that your users’ search results will be based on values matching that filter. Filters can be logically combined when you need to query on a hierarchy of attributes. A sample-filtered query will look as follows:

response=kendra.query(
    QueryText = query,
    IndexId = index,
    AttributeFilter = {
        Filter Conditions Object
    }
)

To understand filtering queries in Amazon Kendra in more detail, refer to the AWS documentation – Filtering queries. Based on the above query, search results from Amazon Kendra will be scoped to documents whose “department” metadata attribute matches the value provided in the filter. In Python, this will look as follows:

response = kendra.query(
    QueryText = slotValue,
    IndexId = index_id,
    QueryResultTypeFilter = 'ANSWER',
    AttributeFilter = {'AndAllFilters':
        [
            {'EqualsTo': {'Key': 'department', 'Value': {'StringValue': userDept}}}
        ]
    }
)

As highlighted earlier, please refer to Amazon Kendra Query API documentation to understand all the various attributes that can be provided into the query, including complex filter conditions for filtering the user search.

Handle Amazon Kendra response in AWS Lambda function

Upon a successful query of the Amazon Kendra index, you will receive a JSON object back as a response from the Query API. The full structure of the response object, including all of its attribute details, is listed in the Amazon Kendra Query API documentation. You can read “TotalNumberOfResults” to check the total number of results returned for the query you submitted. Do note that the SDK will only let you retrieve up to a maximum of 100 items. The query results are returned in the “ResultItems” attribute as an array of “QueryResultItem” objects. From the “QueryResultItem”, the attributes of immediate interest are “DocumentTitle”, “DocumentExcerpt”, and “DocumentURI”. In Python, you can use the code below to extract these values from the first “ResultItems” entry in the Amazon Kendra response:

docTitle = response['ResultItems'][0]['DocumentTitle']['Text']
docURI = response['ResultItems'][0]['DocumentURI']
docExcerpt = response['ResultItems'][0]['DocumentExcerpt']['Text']

Ideally, you should check the value of “TotalNumberOfResults” and iterate through the “ResultItems” array to retrieve all the results of interest. You need to then pack it properly into a valid AWS Lambda response object to be sent to the Amazon Lex chatbot. The structure of the expected Amazon Lex v2 chatbot response is documented in the Response format section. At a minimum, you need to populate the following attributes in the response object before returning it to the chatbot:

  • sessionState object – The mandatory attribute in this object is “dialogAction”. This is used to define what state or action the chatbot should transition to next. If this is the end of the conversation because you’ve retrieved all the required results and are ready to present them, set the type to Close. You also need to indicate which intent your response relates to and the fulfillment state the chatbot should transition into. This can be done as follows:
response = {
                'sessionState': {
                    'activeContexts': [],
                    'dialogAction': {
                        'type': 'Close'
                    },
                    'intent': {
                        'name': 'SearchCourses',
                        'slots': {'elective_year': None},  # slot values keyed by slot name
                        'state': 'Fulfilled'
                    }
                }
            }
  • messages object – You need to submit your search results back into the chatbot by populating the messages object in the response based on the values you’ve extracted from the Amazon Kendra query. You can use the following code as an example to accomplish this:
response.update({
                    'messages': [{
                        'contentType': 'PlainText',
                        'content': docTitle
                    }]
                })
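
Putting the pieces above together, the following is a hedged sketch of a fulfillment handler that iterates over all returned “ResultItems” (rather than only the first one) and builds the Amazon Lex response. The variable names are placeholders that follow the patterns shown earlier, not a verbatim production implementation.

import boto3

kendra = boto3.client('kendra')

def build_lex_response(slot_value, user_dept, index_id):
    """Query Amazon Kendra with the user's department filter and return an Amazon Lex v2 Close response."""
    result = kendra.query(
        QueryText=slot_value,
        IndexId=index_id,
        AttributeFilter={'AndAllFilters': [
            {'EqualsTo': {'Key': 'department', 'Value': {'StringValue': user_dept}}}
        ]}
    )
    # Collect the title and excerpt of every result item returned by Amazon Kendra.
    messages = [{
        'contentType': 'PlainText',
        'content': f"{item['DocumentTitle']['Text']}: {item['DocumentExcerpt']['Text']}"
    } for item in result.get('ResultItems', [])]

    return {
        'sessionState': {
            'dialogAction': {'type': 'Close'},
            'intent': {'name': 'SearchCourses', 'state': 'Fulfilled'}
        },
        'messages': messages or [{'contentType': 'PlainText',
                                  'content': 'No matching documents were found for your department.'}]
    }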

Hooking up the AWS Lambda function with Amazon Lex chatbot

At this point, you have a complete AWS Lambda function in place that can extract the user context from the incoming event, perform a filtered query against Amazon Kendra based on user context, and respond back to the Amazon Lex chatbot. The next step is to configure the Amazon Lex chatbot to use this AWS Lambda function as part of the intent fulfillment process. You can accomplish this by following the documented steps at Attaching a Lambda function to a bot alias. At this point, you now have a fully functioning Amazon Lex chatbot integrated with the Amazon Kendra index that can perform contextual queries based on the user interacting with the chatbot.

In our example, we have two users: User 1, from the computer science department, and User 2, from the civil engineering department. Based on their department context, Figure 2 depicts how the same conversation can produce different results, shown in a side-by-side screenshot of the two chatbot interactions:

Figure 2: Side-by-side comparison of multiple user chat sessions

Cleanup

If you followed along the example setup, then you should clean up any resources you created to avoid additional charges in the long run. To perform a cleanup of the resources, you need to:

  1. Delete the Amazon Kendra index and associated Amazon S3 data source
  2. Delete the Amazon Lex chatbot
  3. Empty the S3 bucket
  4. Delete the S3 bucket
  5. Delete the Lambda function by following the Clean up section.
  6. Delete the lex-web-ui resources by deleting the associated AWS CloudFormation stack
  7. Delete the Amazon Cognito resources

Conclusion

Amazon Kendra is a highly accurate enterprise search service. Combining its natural language processing feature with an intelligent chatbot creates a solution that is robust for any use case needing custom outputs based on user context. Here we considered a sample use case of an organization with multiple departments, but this mechanism can be applied to any other relevant use cases with minimal changes.

Ready to get started? The Accenture AWS Business Group (AABG) helps customers accelerate their pace of digital innovation and realize incremental business value from cloud adoption and transformation. Connect with our team at accentureaws@amazon.com to learn how to build intelligent chatbot solutions for your customers.


About the Author

Rohit Satyanarayana is a Partner Solutions Architect at AWS in Singapore and is part of the AWS GSI team working with Accenture globally. His hobbies are reading fantasy and science fiction, watching movies and listening to music.

Leo An is a Senior Solutions Architect who has demonstrated the ability to design and deliver cost-effective, high-performance infrastructure solutions in a private and public cloud. He enjoys helping customers in using cloud technologies to address their business challenges and is specialized in machine learning and is focused on helping customers leverage AI/ML for their business outcomes.

Hemalatha Katari is a Solution Architect at Accenture. She is part of rapid prototyping team within the Accenture AWS Business Group (AABG). She helps organizations migrate and run their businesses in AWS cloud. She enjoys growing ornamental indoor plants and loves going for long nature trail walks.

Sruthi Mamidipalli is an AWS solutions architect at Accenture, where she is helping clients with successful adoption of cloud native architecture. Outside of work, she loves gardening, cooking, and spending time with her toddler.

Read More

Measure the Business Impact of Amazon Personalize Recommendations

We’re excited to announce that Amazon Personalize now lets you measure how your personalized recommendations can help you achieve your business goals. After specifying the metrics that you want to track, you can identify which campaigns and recommenders are most impactful and understand the impact of recommendations on your business metrics.

All customers want to track the metric that is most important for their business. For example, an online shopping application may want to track two metrics: the click-through rate (CTR) for recommendations and the total number of purchases. A video-on-demand platform that has carousels with different recommenders providing recommendations may wish to compare the CTR or watch duration. You can also monitor the total revenue or margin of a specified event type, for example when a user purchases an item. This new capability lets you measure the impact of Amazon Personalize campaigns and recommenders, as well as interactions generated by third-party solutions.

In this post, we demonstrate how to track your metrics and evaluate the impact of your Personalize recommendations in an e-commerce use case.

Solution overview

Previously, to understand the effect of personalized recommendations, you had to manually orchestrate workflows to capture business metrics data, and then present them in meaningful representations to draw comparisons. Now, Amazon Personalize has eliminated this operational overhead by allowing you to define and monitor the metrics that you wish to track. Amazon Personalize can send performance data to Amazon CloudWatch for visualization and monitoring, or alternatively into an Amazon Simple Storage Service (Amazon S3) bucket where you can access metrics and integrate them into other business intelligence tools. This lets you effectively measure how events and recommendations impact business objectives, and observe the outcome of any event that you wish you monitor.

To measure the impact of recommendations, you define a “metric attribution,” which is a list of event types that you want to report on using either the Amazon Personalize console or APIs. For each event type, you simply define the metric and function that you want to calculate (sum or sample count), and Amazon Personalize performs the calculation, sending the generated reports to CloudWatch or Amazon S3.

The following diagram shows how you can track metrics from a single recommender or campaign:

Figure 1. Feature Overview: The interactions dataset is used to train a recommender or campaign. Then, when users interact with recommended items, these interactions are sent to Amazon Personalize and attributed to the corresponding recommender or campaign. Next, these metrics are exported to Amazon S3 and CloudWatch so that you can monitor them and compare the metrics of each recommender or campaign.

Metric attributions also let you provide an eventAttributionSource, for each interaction, which specifies the scenario that the user was experiencing when they interacted with an item. The following diagram shows how you can track metrics from two different recommenders using the Amazon Personalize metric attribution.

Figure 2. Measuring the business impact of recommendations in two scenarios: The interactions dataset is used to train two recommenders or campaigns, in this case designated “Blue” and “Orange”. Then, when users interact with the recommended items, these interactions are sent to Amazon Personalize and attributed to the corresponding recommender, campaign, or scenario to which the user was exposed when they interacted with the item. Next, these metrics are exported to Amazon S3 and CloudWatch so that you can monitor them and compare the metrics of each recommender or campaign.

In this example, we walk through the process of defining metric attributions for your interaction data in Amazon Personalize. First, you import your data and create two metric attributions to measure the business impact of the recommendations. Then, you create two retail recommenders (the process is the same if you’re using a custom recommendation solution) and send events that are tracked using the metrics. To get started, you only need the interactions dataset. However, since one of the metrics we track in this example is margin, we also show you how to import the items dataset. A code sample for this use case is available on GitHub.

Prerequisites

You can use the AWS Console or supported APIs to create recommendations using Amazon Personalize, for example using the AWS Command Line Interface or AWS SDK for Python.

To calculate and report the impact of recommendations, you first need to set up some AWS resources.

You must create an AWS Identity and Access Management (IAM) role that Amazon Personalize will assume with a relevant assume role policy document. You must also attach policies to let Amazon Personalize access data from an S3 bucket and to send data to CloudWatch. For more information, see Giving Amazon Personalize access to your Amazon S3 bucket and Giving Amazon Personalize access to CloudWatch.

Then, you must create some Amazon Personalize resources. Create your dataset group, load your data, and train recommenders. For full instructions, see Getting started.

  1. Create a dataset group. You can use metric attributions in domain dataset groups and custom dataset groups.
  2. Create an Interactions dataset using the following schema:
    { "type": "record", 
    "name": "Interactions",
     "namespace": "com.amazonaws.personalize.schema", 
    "fields": [ 
        {
            "name": "USER_ID",
            "type": "string"
        },
        {
            "name": "ITEM_ID",
            "type": "string"
        },
        {
            "name": "TIMESTAMP",
            "type": "long"
        },
        {
            "name": "EVENT_TYPE",
            "type": "string"
        }
    ],
     "version": "1.0" 
    }

  3. Create an Items dataset using the following schema:
    {
        "type": "record",
        "name": "Items",
        "namespace": "com.amazonaws.personalize.schema",
        "fields": [
            {
                "name": "ITEM_ID",
                "type": "string"
            },
            {
                "name": "PRICE",
                "type": "float"
            },
            {
                "name": "CATEGORY_L1",
                "type": ["string"],
                "categorical": True
            },
            {
                "name": "CATEGORY_L2",
                "type": ["string"],
                "categorical": True
            },
            {
                "name": "MARGIN",
                "type": "double"
            }
        ],
    "version": "1.0"
    }

Before importing our data to Amazon Personalize, we will define the metric attribution.

Creating Metric Attributions

To begin generating metrics, you specify the list of events for which you’d like to gather metrics. For each of the event types chosen, you define the function that Amazon Personalize will apply as it collects data – the two functions available are  SUM(DatasetType.COLUMN_NAME) and SAMPLECOUNT(), where DatasetType can be the INTERACTIONS or ITEMS dataset. Amazon Personalize can send metrics data to CloudWatch for visualization and monitoring, or alternatively export it to an S3 bucket.

After you create a metric attribution and record events or import incremental bulk data, you’ll incur some monthly CloudWatch cost per metric. For information about CloudWatch pricing, see the CloudWatch pricing page. To stop sending metrics to CloudWatch, delete the metric attribution.

In this example, we’ll create two metric attributions:

  1. Count the total number of “View” events using the SAMPLECOUNT() function. This function only requires the INTERACTIONS dataset.
  2. Calculate the total margin when purchase events occur using the SUM(DatasetType.COLUMN_NAME) function. In this case, the DatasetType is ITEMS and the column is MARGIN because we’re tracking the margin of the item when it was purchased. The Purchase event is recorded in the INTERACTIONS dataset. Note that, in order for the margin to be triggered by the purchase event, you would send a purchase event for each individual unit of each item purchased, even if they’re repeats – for example, two shirts of the same type. If your users can purchase multiples of each item when they check out, and you’re only sending one purchase event for all of them, then a different metric will be more appropriate.

The function to calculate sample count is available only for the INTERACTIONS dataset. However, total margin requires you to have the ITEMS dataset and to configure the calculation. For each of them we specify the eventType that we’ll track, the function used, and give it a metricName that will identify the metrics once we export them. For this example, we’ve given them the names “countViews” and “sumMargin”.

The code sample is in Python.

import boto3 
personalize = boto3.client('personalize')

metrics_list = [{
        "eventType": "View",
        "expression": "SAMPLECOUNT()",
        "metricName": "countViews"
    },
    {
        "eventType": "Purchase",
        "expression": "SUM(ITEMS.MARGIN)",
        "metricName": "sumMargin"
    }]

We also define where the data will be exported. In this case to an S3 bucket.

output_config = {
    "roleArn": role_arn,
    "s3DataDestination": {
        "path": path_to_bucket
    }
}

Then we generate the metric attribution.

response = personalize.create_metric_attribution(
    name = metric_attribution_name,
    datasetGroupArn = dataset_group_arn,
    metricsOutputConfig = output_config,
    metrics = metrics_list
)
metric_attribution_arn = response['metricAttributionArn']

You give the metric attribution a name, indicate the dataset group from which the metrics will be attributed using datasetGroupArn, and pass in the metricsOutputConfig and metrics objects we created previously.

Now with the metric attribution created, you can proceed with the dataset import job which will load our items and interactions datasets from our S3 bucket into the dataset groups that we previously configured.

For information on how to modify or delete an existing metric attribution, see Managing a metric attribution.
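
As a hedged illustration (verify against the current API reference), stopping the CloudWatch charges mentioned earlier comes down to deleting the metric attribution, for example:

# Delete the metric attribution to stop publishing (and paying for) its CloudWatch metrics.
personalize.delete_metric_attribution(
    metricAttributionArn = metric_attribution_arn
)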

Importing Data and creating Recommenders

First, import the interaction data to Amazon Personalize from Amazon S3. For this example, we use the following data file. We generated the synthetic data based on the code in the Retail Demo Store project. Refer to the GitHub repository to learn more about the synthetic data and potential uses.

Then, create a recommender. In this example, we create two recommenders:

  1. “Recommended for you” recommender. This type of recommender creates personalized item recommendations based on a user that you specify.
  2. “Customers who viewed X also viewed” recommender. This type of recommender creates recommendations for items that customers also viewed, based on an item that you specify.
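
The following is a minimal sketch of creating these two recommenders with boto3, assuming an ECOMMERCE domain dataset group. The recommender names are placeholders, and the recipe ARNs should be verified against the current Amazon Personalize documentation.

# Create the two domain recommenders used in this example.
rfy_response = personalize.create_recommender(
    name = 'recommended_for_you',
    datasetGroupArn = dataset_group_arn,
    recipeArn = 'arn:aws:personalize:::recipe/aws-ecomm-recommended-for-you'
)

viewed_also_viewed_response = personalize.create_recommender(
    name = 'customers_who_viewed_x_also_viewed',
    datasetGroupArn = dataset_group_arn,
    recipeArn = 'arn:aws:personalize:::recipe/aws-ecomm-customers-who-viewed-x-also-viewed'
)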

Send events to Amazon Personalize and attribute them to the recommenders

To send interactions to Amazon Personalize, you must create an Event Tracker.
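
A hedged sketch of creating the event tracker with boto3 follows; the tracking ID it returns is what the PutEvents call below expects (the tracker name is a placeholder).

# Create an event tracker for the dataset group; its tracking ID is used in put_events calls.
tracker_response = personalize.create_event_tracker(
    name = 'ecommerce_event_tracker',
    datasetGroupArn = dataset_group_arn
)
tracking_id = tracker_response['trackingId']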

For each event, Amazon Personalize can record the eventAttributionSource. It can be inferred from the recommendationId or you can specify it explicitly and identify it in reports in the EVENT_ATTRIBUTION_SOURCE column. An eventAttributionSource can be a recommender, scenario, or third-party-managed part of the page where interactions occurred.

  • If you provide a recommendationId, then Amazon Personalize automatically infers the source campaign or recommender.
  • If you provide both attributes, then Amazon Personalize uses only the source.
  • If you don’t provide a source or a recommendationId, then Amazon Personalize labels the source SOURCE_NAME_UNDEFINED in reports.

The following code shows how to provide an eventAttributionSource for an event in a PutEvents operation.

personalize_events = boto3.client('personalize-events')  # separate client for the PutEvents API

response = personalize_events.put_events(
    trackingId = 'eventTrackerId',
    userId = 'userId',
    sessionId = 'sessionId123',
    eventList = [{
        'eventId': event_id,
        'eventType': event_type,
        'itemId': item_id,
        'metricAttribution': {'eventAttributionSource': attribution_source},
        'sentAt': timestamp_in_unix_format
    }]
)
print(response)

Viewing your Metrics

Amazon Personalize sends the metrics to Amazon CloudWatch or Amazon S3:

For all bulk data, if you provide an Amazon S3 bucket when you create your metric attribution, you can choose to publish metric reports to your Amazon S3 bucket. You need to do this each time you create a dataset import job for interactions data.

import boto3

personalize = boto3.client('personalize')

response = personalize.create_dataset_import_job(
    jobName = 'YourImportJob',
    datasetArn = 'dataset_arn',
    dataSource = {'dataLocation':'s3://bucket/file.csv'},
    roleArn = 'role_arn',
    importMode = 'INCREMENTAL',
    publishAttributionMetricsToS3 = True
)

print (response)

When importing your data, select the correct import mode INCREMENTAL or FULL and instruct Amazon Personalize to publish the metrics by setting publishAttributionMetricsToS3 to True. For more information on publishing metric reports to Amazon S3, see Publishing metrics to Amazon S3.

For PutEvents data sent via the Event Tracker and for incremental bulk data imports, Amazon Personalize automatically sends metrics to CloudWatch. You can view data from the previous 2 weeks in Amazon CloudWatch – older data is ignored.

You can graph a metric directly in the CloudWatch console by specifying the name that you gave the metric when you created the metric attribution as the search term. For more information on how you can view these metrics in CloudWatch, see Viewing metrics in CloudWatch.

Figure 3: An example of comparing two CTRs from two recommenders viewed in the CloudWatch Console.

Importing and publishing metrics to Amazon S3

When you upload your data to Amazon Personalize via a dataset import job, and you have provided a path to your Amazon S3 bucket in your metric attribution, you can view your metrics in Amazon S3 when the job completes.

Each time that you publish metrics, Amazon Personalize creates a new file in your Amazon S3 bucket. The file name specifies the import method and date. The EVENT_ATTRIBUTION_SOURCE field specifies the event source, i.e., the scenario under which the interaction took place. Amazon Personalize lets you specify the EVENT_ATTRIBUTION_SOURCE explicitly using this field; it can be, for example, a third-party recommender. For more information, see Publishing metrics to Amazon S3.

Summary

Adding a metric attribution lets you track the effect that recommendations have on business metrics. You create these metrics by adding a metric attribution to your dataset group and selecting the events that you want to track, as well as the function that counts the events or aggregates a dataset field. Afterward, you can see the metrics you’re interested in either in CloudWatch or in the exported file in Amazon S3.

For more information about Amazon Personalize, see What Is Amazon Personalize?


About the authors

Anna Grüebler is a Specialist Solutions Architect at AWS focusing on Artificial Intelligence. She has more than 10 years of experience helping customers develop and deploy machine learning applications. Her passion is taking new technologies and putting them in the hands of everyone, and solving difficult problems by leveraging the advantages of using AI in the cloud.


Gabrielle Dompreh is a Specialist Solutions Architect at AWS in Artificial Intelligence and Machine Learning. She enjoys learning about the new innovations in machine learning and helping customers leverage their full capability with well-architected solutions.

Read More

Updates: TensorFlow Decision Forests is production ready

Posted by Mathieu Guillame-Bert, Richard Stotz, Luiz Gustavo Martins

Two years ago, we open sourced the experimental version of TensorFlow Decision Forests and Yggdrasil Decision Forests, a pair of libraries to train and use decision forest models such as Random Forests and Gradient Boosted Trees in TensorFlow. Since then, we’ve added a lot of new features and improvements.

TensorFlow Decision Forests

Today, we are happy to announce that TensorFlow Decision Forests is production ready. In this post, we are going to show you all the new features that come with it 🙂. Buckle up!

First, what are decision forests?

Decision forests are a type of machine learning model that train fast and work extremely well on tabular datasets. Informally, a decision forest is composed of many small decision trees. Together, they make better predictions thanks to the wisdom of the crowd principle. If you want to learn more, check out our class.

Illustration of a simple decision tree that selects an animal: if it has fewer than four legs, it is a penguin; otherwise, if it has three or more eyes, it is a spider, and if not, a dog.

If you’re new to TensorFlow Decision Forests, we recommend that you try the beginner tutorial. Here is how easy it is to use TF-DF:

import pandas as pd
import tensorflow_decision_forests as tfdf

train_df = pd.read_csv("train.csv")
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label="species")
model = tfdf.keras.GradientBoostedTreesModel()
model.fit(train_ds)
model.save("my_model")

Following are the main new features introduced to TensorFlow Decision Forests (TF-DF) in the 1.x release.

Easier hyper-parameter tuning

Like all machine learning algorithms, Decision Forests have hyper-parameters. The default values of those parameters give good results, but, if you really want the best possible results for your model, you need to “tune” those parameters.

TF-DF makes it easy to tune parameters. For example, the objective function and the configuration for distribution are selected automatically, and you specify the hyper-parameters you wish to tune as follows:

tuner = tfdf.tuner.RandomSearch(num_trials=50)
tuner.choice("min_examples", [2, 5, 7, 10])
tuner.choice("categorical_algorithm", ["CART", "RANDOM"])
tuner.choice("max_depth", [3, 4, 5, 6, 8])
tuner.choice("use_hessian_gain", [True, False])
tuner.choice("shrinkage", [0.02, 0.05, 0.10, 0.15])
tuner.choice("growing_strategy", ["LOCAL"]).choice("max_depth", [3, 4, 5, 6, 8])
tuner.choice("growing_strategy", ["BEST_FIRST_GLOBAL"], merge=True).choice("max_num_nodes", [16, 32, 64, 128, 256])
# ... Add all the parameters to tune

model = tfdf.keras.GradientBoostedTreesModel(verbose=2, tuner=tuner)
model.fit(training_dataset)

Starting with TF-DF 1.0, you can use the pre-configured hyper-parameter tuning search space. Simply add use_predefined_hps=True when creating the tuner and the tuning will be done automatically:

# No need to configure each hyper-parameter
tuner = tfdf.tuner.RandomSearch(num_trials=50, use_predefined_hps=True)

tuned_model = tfdf.keras.GradientBoostedTreesModel(verbose=2, tuner=tuner)
tuned_model.fit(train_ds, verbose=2)

Check the hyper-parameter tuning tutorial for more details. And, if your dataset is large, or if you have a lot of parameters to optimize, you can even use distributed training to tune your hyper-parameters.

Hyper-parameters templates

As mentioned above, to maximize the quality of your model you need to tune the hyper-parameters. However, this operation takes time. If you don’t have the time to tune your hyper-parameters, we have a new solution for you: Hyper-parameter templates.

Hyper-parameter templates are a set of hyper-parameters that have been discovered by testing hundreds of datasets. To use them, you simply need to set the hyperparameter_template argument.

model = tfdf.keras.GradientBoostedTreesModel(hyperparameter_template="benchmark_rank1")
model.fit(training_dataset)

In our paper called “Yggdrasil Decision Forests: A Fast and Extensible Decision Forests Library”, we show experimentally that the results are almost as good as with manual hyper-parameter tuning.

See the “hyper-parameter templates” sections in the hyper-parameter index for more details.

Serving models on Google Cloud


TensorFlow Decision Forests is now included in the official release of TensorFlow Serving and in Google Cloud’s Vertex AI. Without any special configuration or custom images, you can now run TensorFlow Decision Forests in Google Cloud.

See our examples for TensorFlow Serving.
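As a rough illustration (not taken from the official examples), a model exported with model.save("my_model") and served by TensorFlow Serving under the name "my_model" can be queried through the standard REST predict endpoint. The feature names below are placeholders for your own columns:

import requests

# Hypothetical feature names; replace them with the columns your model was trained on.
payload = {"instances": [{"bill_length_mm": 39.1, "island": "Torgersen"}]}

response = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",  # default TF Serving REST port
    json=payload,
    timeout=10,
)
print(response.json())  # e.g. {"predictions": [...]}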

Distributed training on billions of examples


Training TF-DF on datasets with fewer than a million examples is almost instantaneous. On larger datasets, however, training takes longer. TF-DF now supports distributed training. If your dataset contains multiple millions or even billions of examples, you can use distributed training on tens or even hundreds of machines.

Here is an example:

cluster_resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()
strategy = tf.distribute.experimental.ParameterServerStrategy(cluster_resolver)

with strategy.scope():
    model = tfdf.keras.DistributedGradientBoostedTreesModel(
        temp_directory=...,
        num_threads=30,
    )

model.fit_on_dataset_path(
    train_path=os.path.join(dataset_path, "train@60"),
    valid_path=os.path.join(dataset_path, "valid@20"),
    label_key="my_label",
    dataset_format="csv")

See our end-to-end example and documentation for more details and examples.

Training models in Google Sheets

To make it even easier to train decision forests, we created Simple ML for Sheets. Simple ML for Sheets makes it possible to train, evaluate, and interpret TensorFlow Decision Forests models in Google Sheets without any coding!


And once you have trained your model in Google Sheets, you can export it back to TensorFlow Decision Forests and use it like any other model.
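For example, assuming the exported model is a standard TensorFlow SavedModel directory (the paths below are placeholders), loading and running it looks like this minimal sketch:

import pandas as pd
import tensorflow as tf
import tensorflow_decision_forests as tfdf  # importing registers the TF-DF ops and Keras model classes

model = tf.keras.models.load_model("exported_from_sheets")  # placeholder path
new_df = pd.read_csv("new_data.csv")                        # placeholder dataset
predictions = model.predict(tfdf.keras.pd_dataframe_to_tf_dataset(new_df))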

Check the Simple ML for Sheets tutorial for more details.

Next steps

We hope you enjoyed this announcement and that the new version of TensorFlow Decision Forests will be useful for your work.

To learn more about the TensorFlow Decision Forests library, see the following resources:

  • See tutorials on this page.
  • Learn more about advanced usages of TensorFlow Decision Forests and Yggdrasil Decision Forests on this page.

And if you have questions, please ask them on discuss.tensorflow.org using the tag “TFDF”, and we’ll do our best to help. Thanks again.

— The TensorFlow Decision Forests team

Read More

3D Creators Share Art From the Heart This Week ‘In the NVIDIA Studio’

3D Creators Share Art From the Heart This Week ‘In the NVIDIA Studio’

Editor’s note: This post is part of our weekly In the NVIDIA Studio series, which celebrates featured artists, offers creative tips and tricks, and demonstrates how NVIDIA Studio technology improves creative workflows. We’re also deep diving on new GeForce RTX 40 Series GPU features, technologies and resources, and how they dramatically accelerate content creation.

Love and creativity are in the air this Valentine’s Day In the NVIDIA Studio, as 3D artist Molly Brady presents Birth of Venus (Redux), a parody scene inspired by Sandro Botticelli’s iconic painting The Birth of Venus.

Plus, join the #ShareYourHeART challenge by sharing what Valentine’s Day means to you in a scene built with NVIDIA Omniverse, a platform for creating and operating metaverse applications. Use the hashtag to post artwork — whether heartened by love, chocolate, teddy bears or anything else Valentine’s-themed — for a chance to be featured across NVIDIA social media channels.

3D artist Tanja Langgner’s delectable scene with chocolate hearts, featured below, is just one example.

Also, get a chance to win a GeForce RTX 3090 Ti GPU in the NVIDIA Instant NeRF VR sweepstakes. Named by TIME Magazine as one of the best inventions of 2022, NVIDIA Instant NeRF enables creators to rapidly create 3D models from 2D images and use them in virtual scenes. The tool provides a glimpse into the future of photography, 3D graphics and virtual worlds. Enter the sweepstakes by creating your own NeRF scene, and look to influencer Paul Trillo’s Instagram for inspiration.

New NVIDIA Studio laptops powered by GeForce RTX 40 Series Laptop GPUs are now available, including MSI’s Stealth 17 Studio and Razer’s 16 and 18 models — with more on the way. Learn why PC Gamer said the “RTX 4090 pushes laptops to blistering new frontiers: Yes, it’s fast, but also much more.”

Download the latest NVIDIA Studio Driver to enhance existing app features and reduce repetitive tasks. ON1 NoNoise AI, an app that quickly removes image noise while preserving and enhancing photo details, released an update speeding this process by an average of 50% on GeForce RTX 40 Series GPUs.

And NVIDIA GTC, a global conference for the era of AI and the metaverse, is running online March 20-23, with a slew of creator sessions, Omniverse tutorials and more — all free with registration. Learn more below.

A Satirical Valentine

Molly Brady is a big fan of caricatures.

“I love parody,” she gleefully admitted. “Nothing pleases me more than taking the air out of something serious and stoic.”

Botticelli’s The Birth of Venus painting, often referenced and revered, presented Brady with an opportunity for humor through her signature visual style.

According to Brady, “3D allows you to mix stylistic artwork with real-world limitations,” which is why the touchable, cinematic look of stop-motion animation heavily inspires her work.

“Stop-motion reforms found items into set pieces for fantastical worlds, giving them a new life and that brings me immense joy,” she said.

Brady’s portfolio features colorful, vibrant visuals with a touch of whimsical humor.

Brady composited Birth of Venus (Redux) with placeholder meshes, focusing on the central creature figure, before confirming the composition and scale were to her liking. She then sculpted finer details in the flexible 3D modeling app Foundry Modo, assisted by RTX acceleration in OTOY OctaneRender, made possible by her GeForce RTX 4090 GPUs.

Advanced sculpting was completed in Modo.

She then applied materials and staged lighting with precision, and added speed with the RTX-accelerated ray tracing renderer. Brady has the option to deploy Octane Render, her preferred 3D renderer, in over 20 3D applications, including Autodesk 3ds Max, Blender and Maxon’s Cinema 4D.

After rendering the image, Brady deployed several post-processing features in Adobe Photoshop to help ensure the colors popped, as well as to add grain to compensate for any compression when posted on social media. Her RTX GPU affords over 30 GPU-accelerated features, such as blur gallery, object selection, liquify and smart sharpen.

“Art has been highly therapeutic for me, not just as an outlet to express emotion but to reflect how I see myself and what I value,” Brady said. “Whenever I feel overwhelmed by the pressure of expectation, whether internal or external, I redirect my efforts and instead create something that brings me joy.”

3D artist Molly Brady.

View more of Brady’s artwork on Instagram.

Valen(time) to Join the #ShareYourHeART Challenge

The photorealistic, chocolate heart plate beside a rose-themed mug and napkins, featured below, is 3D artist and illustrator Tanja Langgner’s stunning #ShareYourHeART challenge entry.

Hungry?

Langgner gathered assets and sculpted the heart shape using McNeel Rhino and Maxon ZBrush. Next, she assembled the pieces in Blender and added textures using Adobe Substance 3D Painter. The scene was then exported from Blender as a USD file and brought into Omniverse Create, where the artist added lighting and virtual cameras to capture the sweets with the perfect illuminations and angles.

“The main reason I started using Omniverse was its capability to link all my favorite apps,” Langgner said. “Saving time on exporting, importing and recreating materials in each app is a dream come true.”

Learn more about Langgner’s creative journey at the upcoming Community Spotlight livestream on the Omniverse Twitch channel and YouTube on Wednesday, Feb. 22, from 11 a.m. to 12 p.m. PT.

Join the #ShareYourHeART challenge by posting your own Valentine’s-themed Omniverse scene on social media using the hashtag. Entries could be featured on the NVIDIA Omniverse Twitter, LinkedIn and Instagram accounts.

Creative Boosts at GTC 

Experience this spring’s GTC for more inspiring content, expert-led sessions and a must-see keynote to accelerate your life’s creative work.

Catch these sessions live or watch on demand:

  • 3D Art Goes Multiplayer: Behind the Scenes of Adobe Substance’s “End of Summer” Project With Omniverse [S51239]
  • 3D and Beyond: How 3D Artists Can Build a Side Hustle in the Metaverse [SE52117]
  • NVIDIA Omniverse User Group [SE52047]
  • Accelerate the Virtual Production Pipeline to Produce an Award-Winning Sci-Fi Short Film [S51496]
  • 3D by AI: How Generative AI Will Make Building Virtual Worlds Easier [S52163]
  • Custom World Building With AI Avatars: The Little Martians Sci-Fi Project [S51360]
  • AI-Powered, Real-Time, Markerless: The New Era of Motion Capture [S51845]

Search the GTC session catalog or check out the Media and Entertainment and Omniverse topics for additional creator-focused talks.

Follow NVIDIA Studio on Instagram, Twitter and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter. Learn more about Omniverse on Instagram, Medium, Twitter and YouTube for additional resources and inspiration. Check out the Omniverse forums, and join our Discord server and Twitch channel to chat with the community.

Read More

Democratizing AI with PyTorch Foundation and ROCm™ support for PyTorch

Democratizing AI with PyTorch Foundation and ROCm™ support for PyTorch

AMD Founding Member

Last year, Meta announced that PyTorch joined the Linux Foundation as a neutral home for growing the machine learning project and community with AMD representation as a part of the founding membership and governing board.

The PyTorch Foundation’s mission is to drive AI adoption by democratizing its software ecosystem through open-source principles, which aligns with AMD’s core principle of an open software ecosystem. AMD strives to foster innovation by supporting the latest generations of hardware, tools, libraries, and other components to simplify and accelerate the adoption of AI across a broad range of scientific discoveries.

AMD, along with key PyTorch codebase developers (including those at Meta AI), delivered a set of updates to the ROCm™ open software ecosystem that brings stable support for AMD Instinct™ accelerators as well as many Radeon™ GPUs. This gives PyTorch developers the ability to build their next great AI solutions leveraging AMD GPU accelerators and ROCm. The support from the PyTorch community in identifying gaps, prioritizing key updates, providing feedback for performance optimization, and supporting our journey from “Beta” to “Stable” was immensely helpful, and we deeply appreciate the strong collaboration between the two teams at AMD and PyTorch. The move of ROCm support from “Beta” to “Stable” came in the PyTorch 1.12 release (June 2022) and makes it easy to run PyTorch in a native environment without having to configure custom Docker images. This is a sign of confidence in the quality of support and performance of PyTorch using AMD Instinct and ROCm. The results of these collaborative efforts are evident in the performance measured on key industry benchmarks like Microsoft’s SuperBench, shown below in Graph 1.
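In practice, this means typical PyTorch code runs unchanged on a ROCm build: the familiar torch.cuda API targets AMD GPUs via HIP. A minimal sanity-check sketch, assuming a ROCm build of PyTorch is installed:

import torch

print(torch.cuda.is_available())             # True if a supported GPU is visible
print(getattr(torch.version, "hip", None))   # ROCm/HIP version string on ROCm builds, None on CUDA builds

x = torch.randn(1024, 1024, device="cuda")   # "cuda" maps to the AMD GPU under ROCm
y = x @ x                                    # matrix multiply executed on the GPU through ROCm libraries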

“We are excited to see the significant impact of developers at AMD to contribute to and extend features within PyTorch to make AI models run in a more performant, efficient, and scalable way. A great example of this is the thought-leadership around unified memory approaches between the framework and future hardware systems, and we look forward to seeing that feature progress.”

– Soumith Chintala, PyTorch lead-maintainer and Director of Engineering, Meta AI

Progressive improvements across the AMD CDNA™ architecture, ROCm, and PyTorch show single-GPU model throughput increasing from the AMD Instinct MI100 to the latest-generation AMD Instinct MI200 family GPUs, going from ROCm 4.2 to ROCm 5.3 and from PyTorch 1.7 to PyTorch 1.12.


Graph 1: ML model performance over generation using Microsoft Superbench Suite 1, 2, 3

Below are a few of the key updates for ROCm support since the PyTorch 1.12 release:

Full Continuous Integration (CI) for ROCm on PyTorch

With ROCm support for PyTorch moving from “Beta” to “Stable,” all function and feature commits are now verified through a full Continuous Integration (CI) process. The CI process helps ensure the proper build and test process ahead of an expected Docker and PIP wheel release, with stable commits forthcoming.

Support for Kineto Profiler

The addition of Kineto profiler support to ROCm now helps developers and users understand performance bottlenecks through effective diagnosis and profiling tools. The tool also provides recommendations to improve known issues and visualization through the TensorBoard UI.
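The profiler is exposed through the standard torch.profiler API. A hedged sketch of collecting a trace for TensorBoard might look like the following (the model and log directory are placeholders):

import torch
from torch.profiler import profile, schedule, tensorboard_trace_handler, ProfilerActivity

model = torch.nn.Linear(512, 512).to("cuda")
inputs = torch.randn(64, 512, device="cuda")

with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],     # GPU activity reported via the CUDA/HIP path
    schedule=schedule(wait=1, warmup=1, active=3),
    on_trace_ready=tensorboard_trace_handler("./profiler_logs"),  # view with: tensorboard --logdir ./profiler_logs
    record_shapes=True,
) as prof:
    for _ in range(6):
        model(inputs)
        prof.step()  # advances the profiling schedule each iteration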

Key PyTorch Libraries support added

PyTorch ecosystem libraries like TorchText (text classification), TorchRec (recommender systems – RecSys), TorchVision (computer vision), and TorchAudio (audio and signal processing) have been fully supported since ROCm 5.1 and upstreamed with PyTorch 1.12.

Key libraries provided with the ROCm software stack including MIOpen (Convolution models), RCCL (ROCm Collective Communications) and rocBLAS (BLAS for transformers) were further optimized to offer new potential efficiencies and higher performance.

MIOpen innovates on several fronts, such as implementing fusion to optimize for memory bandwidth and GPU launch overheads, providing an auto-tuning infrastructure to overcome the large design space of problem configurations, and implementing different algorithms to optimize convolutions for different filter and input sizes. MIOpen is one of the first libraries to publicly support the bfloat16 data type for convolutions, allowing efficient training at lower precision while maintaining expected accuracy.
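Purely as an illustration of the data type (not MIOpen-specific code), bfloat16 convolutions can be exercised from PyTorch with automatic mixed precision:

import torch

conv = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1).to("cuda")
images = torch.randn(8, 3, 224, 224, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    features = conv(images)   # convolution dispatched in bfloat16 (backed by MIOpen on ROCm)
print(features.dtype)         # torch.bfloat16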

RCCL (pronounced “Rickle”) is a stand-alone library of standard collective communication routines for GPUs, implementing all-reduce, all-gather, reduce, broadcast, reduce-scatter, gather, scatter, and all-to-all. There is support for direct GPU-to-GPU send and receive operations. It has been optimized to achieve high bandwidth on platforms using PCIe®, Infinity Fabric™ (GPU to GPU) as well as networking using InfiniBand Verbs or TCP/IP sockets. RCCL supports an arbitrary number of GPUs installed in single or multiple nodes and can be used in either single- or multi-process (e.g., MPI) applications.
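Because RCCL backs PyTorch’s usual “nccl” backend name on ROCm builds, standard torch.distributed code exercises it directly. A minimal sketch, assuming a launch via torchrun so the rank environment variables are set:

import os
import torch
import torch.distributed as dist

def main():
    # On ROCm builds, the "nccl" backend is implemented by RCCL for AMD GPUs.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    tensor = torch.ones(4, device="cuda") * dist.get_rank()
    dist.all_reduce(tensor, op=dist.ReduceOp.SUM)  # sums contributions from all ranks
    print(f"rank {dist.get_rank()}: {tensor}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()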

Along with the above key highlights, over 50 features and functionality improvements were completed jointly between AMD and PyTorch to add stable support for ROCm. These include improvements to tools, compilers, runtime, graph optimizations through TorchScript, INT8 quantization path usage, and ONNX Runtime integration, including support for the Navi 21-based Radeon™ PRO datacenter graphics card, to name a few.

AITemplate Inference Engine

Meta AI recently published a blog announcing the release of its open-source AITemplate (link), a unified inference system that supports AMD Instinct GPU accelerators using the AMD ROCm stack. This Python-based framework can significantly improve performance through increased utilization of AMD matrix cores for transformer blocks. This is achieved through the AMD Composable Kernel (CK) library, which provides performance-critical kernels for ML/AI workloads across multiple architectures, including GPUs and CPUs, through HIP and C++.

Moreover, AITemplate provides out-of-the-box support for widely used AI models like BERT, ResNet, Vision Transformer, and Stable Diffusion, simplifying the deployment process through these pretrained models.

What’s coming with future ROCm releases?

Unified memory models for CPU + GPU

As system architecture evolves to address the complexity of large problem sizes and data sets, memory management becomes a key performance bottleneck that needs a cohesive strategy, addressed through innovations at both the hardware and software levels. AMD is uniquely positioned to address this problem with its effective data center solutions integrating AMD EPYC™ CPU cores with AMD Instinct GPU compute units in a truly unified datacenter APU (Accelerated Processing Unit) form factor, set to launch in 2H 2023.

The software work to leverage the unified CPU + GPU memory has already started in collaboration with the PyTorch team, to enable the use of a fast, low-latency, synchronized memory model that allows not only AMD but also other AI accelerators to address the complex memory management problems of today. We look forward to sharing more about this joint effort soon.

Acknowledgement

The content in this blog highlights the joint work between AMD and key PyTorch contributors including Meta, working on many of the core features, as well as Microsoft enabling ONNX Runtime support. We are looking forward to working with the other founding members at the PyTorch Foundation on the next steps and improvements to democratize and grow adoption of PyTorch across the industry.

CAUTIONARY STATEMENT


This blog contains forward-looking statements concerning Advanced Micro Devices, Inc. (AMD) such as the availability, timing and expected benefits of an AMD datacenter APU form factor, which are made pursuant to the Safe Harbor provisions of the Private Securities Litigation Reform Act of 1995. Forward-looking statements are commonly identified by words such as “would,” “may,” “expects,” “believes,” “plans,” “intends,” “projects” and other terms with similar meaning. Investors are cautioned that the forward-looking statements in this blog are based on current beliefs, assumptions and expectations, speak only as of the date of this blog and involve risks and uncertainties that could cause actual results to differ materially from current expectations. Such statements are subject to certain known and unknown risks and uncertainties, many of which are difficult to predict and generally beyond AMD’s control, that could cause actual results and other future events to differ materially from those expressed in, or implied or projected by, the forward-looking information and statements. Investors are urged to review in detail the risks and uncertainties in AMD’s Securities and Exchange Commission filings, including but not limited to AMD’s most recent reports on Forms 10-K and 10-Q. AMD does not assume, and hereby disclaims, any obligation to update forward-looking statements made in this blog, except as may be required by law.

Endnotes

  1. MI100D-01 SuperBench v0.5 model training results based on AMD internal testing as of 11/09/2022 measuring the total training throughput, at half precision, using a 2P AMD EPYC™ 7763 CPU server tested with 1x AMD Instinct™ MI100 (32GB HBM2e) 300W GPU, SBIOS 2.2, Ubuntu® 20.04.5 LTS, host ROCm™ 5.2.0, guest ROCm 4.2, PyTorch 1.7.0. Server manufacturers may vary configurations, yielding different results. Performance may vary based on factors including use of the latest drivers and optimizations.
  2. MI200D-01 SuperBench v0.6 model training results based on AMD internal testing as of 11/09/2022 measuring the total training throughput, at half precision, using a 2P AMD EPYC™ 7763 CPU server tested with 1x AMD Instinct™ MI210 (64GB HBM2e) 300W GPU, SBIOS 2.2, Ubuntu 20.04.5 LTS, host ROCm 5.3.0, guest ROCm 5.3, PyTorch 1.12. Server manufacturers may vary configurations, yielding different results. Performance may vary based on factors including use of the latest drivers and optimizations.
  3. MI200D-02: SuperBench v0.6 model training results based on AMD internal testing as of 11/09/2022 measuring the total training throughput, at half precision, using a 2P AMD EPYC™ 7763 CPU server tested with 1x AMD Instinct™ MI250 (128GB HBM2e) 560W GPU, SBIOS M12, Ubuntu 20.04 LTS, host ROCm 5.3.0, guest ROCm 5.3, PyTorch 1.12. Server manufacturers may vary configurations, yielding different results. Performance may vary based on factors including use of the latest drivers and optimizations.

Read More

Configure an AWS DeepRacer environment for training and log analysis using the AWS CDK

Configure an AWS DeepRacer environment for training and log analysis using the AWS CDK

This post is co-written by Zdenko Estok, Cloud Architect at Accenture, and Selimcan Sakar, AWS DeepRacer SME at Accenture.

With the increasing use of artificial intelligence (AI) and machine learning (ML) across the vast majority of industries (ranging from healthcare to insurance, from manufacturing to marketing), the primary focus shifts to efficiency when building and training models at scale. Creating a scalable and hassle-free data science environment is key. It can take a considerable amount of time to launch and configure an environment tailored to a specific use case, and it’s even harder to onboard colleagues to collaborate.

According to Accenture, companies that manage to efficiently scale AI and ML can achieve nearly triple the return on their investments. Still, not all companies meet their expected returns on their AI/ML journey. Toolkits to automate the infrastructure become essential for horizontal scaling of AI/ML efforts within a corporation.

AWS DeepRacer is a simple and fun way to get started with reinforcement learning (RL), an ML technique where an agent discovers the optimal actions to take in a given environment. In our case, that would be an AWS DeepRacer vehicle, trying to race fast around a track. You can get started with RL quickly with hands-on tutorials that guide you through the basics of training RL models and test them in an exciting, autonomous car racing experience.

This post shows how companies can use infrastructure as code (IaC) with the AWS Cloud Development Kit (AWS CDK) to accelerate the creation and replication of highly transferable infrastructure and easily compete for AWS DeepRacer events at scale.

“IaC combined with a managed Jupyter environment gave us best of both worlds: repeatable, highly transferable data science environments for us to onboard our AWS DeepRacer competitors to focus on what they do the best: train fast models fast.”

– Selimcan Sakar, AWS DeepRacer SME at Accenture.

Solution overview

Orchestrating all the necessary services takes a considerable amount of time when creating a scalable template that can be applied to multiple use cases. In the past, AWS CloudFormation templates were created to automate the creation of these services. As IaC tools have added higher levels of abstraction for configuring and setting up different environments, the AWS CDK has been widely adopted across enterprises. The AWS CDK is an open-source software development framework to define your cloud application resources. It uses the familiarity and expressive power of programming languages for modeling your applications, while provisioning resources in a safe and repeatable manner.

In this post, we enable the provisioning of different components required for performing log analysis using Amazon SageMaker on AWS DeepRacer via AWS CDK constructs.

Although the analysis graph provided in the DeepRacer console is effective and straightforward regarding the rewards granted and progress achieved, it doesn’t give insight into how fast the car moves through the waypoints or what kind of line the car prefers around the track. This is where advanced log analysis comes into play. Our advanced log analysis aims to bring efficiency to training by retrospectively showing which reward functions and action spaces work better than others when training multiple models, and whether a model is overfitting, so that racers can train smarter and achieve better results with less training.
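To make that concrete, the kind of artifact being compared is the reward function itself. The following is a generic, illustrative center-line reward function (not part of this solution), using the standard parameters that AWS DeepRacer passes to the function:

# Purely illustrative reward function: rewards the agent for staying close to the track center.
def reward_function(params):
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Three bands around the center line, rewarded progressively less.
    if distance_from_center <= 0.1 * track_width:
        reward = 1.0
    elif distance_from_center <= 0.25 * track_width:
        reward = 0.5
    elif distance_from_center <= 0.5 * track_width:
        reward = 0.1
    else:
        reward = 1e-3  # likely off track

    return float(reward)

Advanced log analysis then compares how models trained with different such functions (and different action spaces) actually behave around the track.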

Our solution describes an AWS DeepRacer environment configuration using the AWS CDK to accelerate the journey of users experimenting with SageMaker log analysis and reinforcement learning on AWS for an AWS DeepRacer event.

An administrator can run the AWS CDK script provided in the GitHub repo via the AWS Management Console or in the terminal after loading the code in their environment. The steps are as follows:

  1. Open AWS Cloud9 on the console.
  2. Load the AWS CDK module from GitHub into the AWS Cloud9 environment.
  3. Configure the AWS CDK module as described in this post.
  4. Open the cdk.context.json file and inspect all the parameters.
  5. Modify the parameters as needed and run the AWS CDK command with the intended persona to launch the configured environment suited for that persona.

The following diagram illustrates the solution architecture.

cdk-arch

With the help of the AWS CDK, we can version control our provisioned resources and have a highly transportable environment that complies with enterprise-level best practices.

Prerequisites

In order to provision ML environments with the AWS CDK, complete the following prerequisites:

  1. Have access to an AWS account and permissions within the Region to deploy the necessary resources for different personas. Make sure you have the credentials and permissions to deploy the AWS CDK stack into your account.
  2. We recommend following certain best practices that are highlighted through the concepts detailed in the following resources:
  3. Clone the GitHub repo into your environment.

Deploy the portfolio into your account

In this deployment, we use AWS Cloud9 to create a data science environment using the AWS CDK.

  1. Navigate to the AWS Cloud9 console.
  2. Specify your environment type, instance type, and platform.

  3. Specify your AWS Identity and Access Management (IAM) role, VPC, and subnet.

  4. In your AWS Cloud9 environment, create a new folder called DeepRacer.
  5. Run the following command to install the AWS CDK, and make sure you have the right dependencies to deploy the portfolio:
npm install -g aws-cdk
  6. To verify that the AWS CDK has been installed and to access the docs, run the following command in your terminal (it should redirect you to the AWS CDK documentation):
cdk docs
  7. Now we can clone the AWS DeepRacer repository from GitHub.
  8. Open the cloned repo in AWS Cloud9:
cd DeepRacer_cdk

When you review the content of the DeepRacer_cdk directory, you’ll find a file called package.json with all the required modules and dependencies defined. This is where you can define your resources in a module.

  9. Next, install all required modules and dependencies for the AWS CDK app, then synthesize the CloudFormation template:
npm install

cdk synth

This will synthesize the corresponding CloudFormation template.

  10. To run the deployment, either change the context.json file with parameter names or explicitly define them during runtime:
cdk deploy

The following components are created for AWS DeepRacer log analysis based on running the script:

  • An IAM role for the SageMaker notebook with a managed policy
  • A SageMaker notebook instance with the instance type either explicitly added as a cdk context parameter or taken from the default value stored in the context.json file
  • A VPC with CIDR as specified in the context.json file along with four public subnets configured
  • A new security group for the SageMaker notebook instance allowing communication within the VPC
  • A SageMaker lifecycle policy with a bash script that is preloading the content of another GitHub repository, which contains the files we use for running the log analysis on the AWS DeepRacer models

  11. You can run the AWS CDK stack as follows:
$ cdk deploy
  12. Go to the AWS CloudFormation console in the Region where the stack is deployed to verify the resources.

Now users can start using those services to work with log analysis and deep RL model training on SageMaker for AWS DeepRacer.
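For orientation, the following is a simplified, hypothetical sketch of what such a stack looks like when expressed with the AWS CDK in Python; the actual module in the GitHub repo is written in TypeScript and additionally provisions the VPC, security group, and notebook lifecycle configuration listed above:

from aws_cdk import App, Stack
from aws_cdk import aws_iam as iam
from aws_cdk import aws_sagemaker as sagemaker
from constructs import Construct

class DeepRacerLogAnalysisStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # IAM role assumed by the SageMaker notebook instance (broad managed policy for brevity).
        role = iam.Role(
            self, "NotebookRole",
            assumed_by=iam.ServicePrincipal("sagemaker.amazonaws.com"),
            managed_policies=[iam.ManagedPolicy.from_aws_managed_policy_name("AmazonSageMakerFullAccess")],
        )

        # Notebook instance used to run the DeepRacer log analysis notebooks.
        sagemaker.CfnNotebookInstance(
            self, "LogAnalysisNotebook",
            instance_type="ml.t3.medium",   # placeholder; the module reads this from cdk.context.json
            role_arn=role.role_arn,
        )

app = App()
DeepRacerLogAnalysisStack(app, "DeepRacerLogAnalysisStack")
app.synth()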

Module testing

You can also run some unit tests before deploying the stack to verify that you didn’t accidentally remove any required resources. The unit tests are located in DeepRacer/test/deep_racer.test.ts and can be run with the following code:

npm run test

Generate diagrams using cdk-dia

To generate diagrams, complete the following steps:

  1. Install graphviz using your operating system tools, then install the cdk-dia package:
npm install -g cdk-dia

This installs the cdk-dia application.

  2. Now run the following code:
cdk-dia

A graphical representation of your AWS CDK stack will be stored in .png format.

After you run the preceding steps, you should be able to see the notebook instance being created with the status Pending. When the status of the notebook instance is InService (as shown in the following screenshot), you can proceed with the next steps.

  1. Choose Open Jupyter to start running the Python script for performing the log analysis.

For additional details on log analysis using AWS DeepRacer and associated visualizations, refer to Using log analysis to drive experiments and win the AWS DeepRacer F1 ProAm Race.

Clean up

To avoid ongoing charges, complete the following steps:

  1. Use cdk destroy to delete the resources created via the AWS CDK.
  2. On the AWS CloudFormation console, delete the CloudFormation stack.

Conclusion

AWS DeepRacer events are a great way to raise interest and increase ML knowledge across all pillars and levels of an organization. In this post, we shared how you can configure a dynamic AWS DeepRacer environment and set up selective services to accelerate the journey of users on the AWS platform. We discussed how to create services such as an Amazon SageMaker notebook instance, IAM roles, a SageMaker notebook lifecycle configuration with best practices, a VPC, and Amazon Elastic Compute Cloud (Amazon EC2) instances based on the identified context using the AWS CDK, and how to scale for different AWS DeepRacer users.

Configure the AWS CDK environment and run the advanced log analysis notebook to get the most out of the module, help racers achieve better results in less time, and gain granular insights into reward functions and action spaces.

References

More information is available at the following resources:

  1. Automate Amazon SageMaker Studio setup using AWS CDK
  2. AWS SageMaker CDK API reference

About the Authors

 Zdenko Estok works as a cloud architect and DevOps engineer at Accenture. He works with AABG to develop and implement innovative cloud solutions, and specializes in infrastructure as code and cloud security. Zdenko likes to bike to the office and enjoys pleasant walks in nature.

Selimcan “Can” Sakar is a cloud first developer and solution architect at Accenture with a focus on artificial intelligence and a passion for watching models converge.

Shikhar Kwatra is an AI/ML specialist solutions architect at Amazon Web Services, working with a leading Global System Integrator. Shikhar aids in architecting, building, and maintaining cost-efficient, scalable cloud environments for the organization, and supports the GSI partner in building strategic industry solutions on AWS. Shikhar enjoys playing guitar, composing music, and practicing mindfulness in his spare time.

Read More

Improving Human Annotation Effectiveness for Fact Collection by Identifying the Most Relevant Answers

This paper was accepted at the Workshops on Data Science with Human in the Loop at EMNLP 2022
Identifying and integrating missing facts is a crucial task for knowledge graph completion to ensure robustness towards downstream applications such as question answering. Adding new facts to a knowledge graph in a real-world system often involves human verification effort, where candidate facts are verified for accuracy by human annotators. This process is labor-intensive, time-consuming, and inefficient since only a small number of missing facts can be identified. This paper proposes a simple but… (Apple Machine Learning Research)