How InfoJobs (Adevinta) improves NLP model prediction performance with AWS Inferentia and Amazon SageMaker

This is a guest post co-written by Juan Francisco Fernandez, ML Engineer in Adevinta Spain, and AWS AI/ML Specialist Solutions Architects Antonio Rodriguez and João Moura.

InfoJobs, a subsidiary company of the Adevinta group, provides the perfect match between candidates looking for their next job position and employers looking for the best hire for the openings they need to fill. To this end, we use natural language processing (NLP) models such as BERT, through PyTorch, to automatically extract relevant information from users’ CVs as soon as they upload them to our portal.

Given the complexity and variety of the fields extracted, performing inference with these NLP models can take several seconds when they are hosted on typical CPU-based instances, which degrades the user experience on the job listing web portal. Hosting the models on GPU-based instances, on the other hand, can prove costly, making that option unfeasible for our business. We were therefore looking for a way to optimize prediction latency while keeping costs to a minimum.

To solve this challenge, we initially considered some possible solutions along two axes:

  • Vertical scaling by using bigger general-purpose instances as well as GPU-powered instances.
  • Optimizing our models using openly available techniques such as quantization or open tools such as ONNX.

Neither option, individually or combined, provided the needed performance at an affordable cost. After benchmarking our full range of options with the help of AWS AI/ML Specialists, we found that compiling our PyTorch models with AWS Neuron and using AWS Inferentia to host them on Amazon SageMaker endpoints offered a reduction of up to 92% in prediction latency, at 75% lower cost when compared to our best initial alternatives. It was, in other words, like having the best of GPU power at CPU cost.

We also considered Amazon Comprehend, a plug-and-play managed NLP service that uses machine learning to automatically uncover valuable insights and connections in text. However, in this particular case we wanted to use our own fine-tuned models for the task.

In this post, we share a summary of the benchmarks performed and an example of how to use AWS Inferentia with SageMaker to compile and host NLP models. We also describe how InfoJobs is using this solution to optimize the inference performance of NLP models, extracting key information from users’ CVs in a cost-efficient way.

Overview of solution

First, we had to evaluate the different options available on AWS to find the best balance between performance and cost to host our NLP models. The following diagram summarizes the most common alternatives for real-time inference, most of which were explored during our collaboration with AWS.

Inference options diagram

Hosting options benchmark on SageMaker

We started our tests with a publicly available pre-trained model from the Hugging Face model hub, bert-base-multilingual-uncased. This is the same base model used by InfoJobs’s CV key value extraction model. For this purpose, we deployed the model to a SageMaker endpoint using different instance types: CPU-based, GPU-based, and AWS Inferentia-based. We also explored optimization with Amazon SageMaker Neo and compilation with AWS Neuron where appropriate.

In this scenario, deploying our model to a SageMaker endpoint backed by an AWS Inferentia instance yielded 96% faster inference times than CPU instances and 44% faster inference times than GPU instances in the same range of cost and specs. This allows us to serve 15 times more inferences than with CPU instances, or 4 times more than with GPU instances, at the same cost.

Based on these encouraging first results, our next step was to validate our tests on the actual model used by InfoJobs. This is a more complex model that requires PyTorch quantization for performance improvement, so we expected smaller gains than in the previous standard case with bert-base-multilingual-uncased. The results of our tests for this model are summarized in the following table (based on public pricing in the us-east-1 Region as of February 20, 2022).

Category       | Mode      | Instance type example | p50 inference latency (ms) | TPS | Cost per hour (USD) | Inferences per hour | Cost per million inferences (USD)
CPU            | Normal    | m5.xlarge             | 1400                       | 2   | 0.23                | 5606                | 41.03
CPU            | Optimized | m5.xlarge             | 1105                       | 2   | 0.23                | 7105                | 32.37
GPU            | Normal    | g4dn.xlarge           | 800                        | 18  | 0.736               | 64800               | 11.36
GPU            | Optimized | g4dn.xlarge           | 700                        | 21  | 0.736               | 75600               | 9.74
AWS Inferentia | Compiled  | inf1.xlarge           | 57                         | 33  | 0.297               | 120000              | 2.48
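
The last column follows directly from the hourly price and the measured throughput: cost per million inferences equals cost per hour divided by inferences per hour, scaled to one million. The following short Python sketch simply reproduces that arithmetic from the table values:

# Reproduce the "Cost per million inferences" column from the table above
rows = {
    "m5.xlarge (CPU, Normal)":      (0.23, 5606),
    "m5.xlarge (CPU, Optimized)":   (0.23, 7105),
    "g4dn.xlarge (GPU, Normal)":    (0.736, 64800),
    "g4dn.xlarge (GPU, Optimized)": (0.736, 75600),
    "inf1.xlarge (AWS Inferentia)": (0.297, 120000),
}

for name, (cost_per_hour, inferences_per_hour) in rows.items():
    cost_per_million = cost_per_hour / inferences_per_hour * 1_000_000
    print(f"{name}: ~{cost_per_million:.2f} USD per million inferences")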

The following graph shows real-time inference response times for the InfoJobs model (lower is better). In this case, inference latency is 75-92% lower than with the CPU and GPU options.

Inference latency graph

This also means a 4-13 times lower cost for running inferences compared to the CPU and GPU options, as shown in the following graph of cost per million inferences.

Inference cost graph

We must highlight that these tests were not exhaustive, and that no further optimizations were made to the inference code. Even so, the performance and cost benefits we saw from using AWS Inferentia exceeded our initial expectations and enabled us to proceed to production. In the future, we will continue to optimize with other features of Neuron, such as NeuronCore Pipeline or the PyTorch-specific DataParallel API. We encourage you to explore and compare the results for your specific use case and model.

Compiling for AWS Inferentia with SageMaker Neo

You don’t need to use the Neuron SDK directly to compile your model and be able to host it on AWS Inferentia instances.

SageMaker Neo automatically optimizes machine learning (ML) models for inference on cloud instances and edge devices to run faster with no loss in accuracy. In particular, Neo is capable of compiling a wide variety of transformer-based models, making use of the Neuron SDK in the background. This allows you to get the benefit of AWS Inferentia by using APIs that are integrated with the familiar SageMaker SDK, with no required context switch.

In this section, we go through an example in which we show you how to compile a BERT model with Neo for AWS Inferentia. We then deploy that model to a SageMaker endpoint. You can find a sample notebook describing the whole process in detail on GitHub.

First, we need to create a sample input to trace our model with PyTorch and create a tar.gz file, with the model being its only content. This is a required step to have Neo compile our model artifact (for more information, see Prepare Model for Compilation). For demonstration purposes, the model is initialized as a mock model for sequence classification that hasn’t been fine-tuned on the task at all. In reality, you would replace the model identifier with your selected model from the Hugging Face model hub or a locally saved model artifact. See the following code:

import transformers
import torch
import tarfile

tokenizer = transformers.AutoTokenizer.from_pretrained("distilbert-base-multilingual-uncased")
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-multilingual-uncased", return_dict=False
)

seq_0 = "This is just sample text for model tracing, the length of the sequence does not matter because we will pad to the max length that Bert accepts."
seq_1 = seq_0
max_length = 512

tokenized_sequence_pair = tokenizer.encode_plus(
    seq_0, seq_1, max_length=max_length, padding="max_length", truncation=True, return_tensors="pt"
)

example = tokenized_sequence_pair["input_ids"], tokenized_sequence_pair["attention_mask"]

traced_model = torch.jit.trace(model.eval(), example)
traced_model.save("model.pth")

with tarfile.open('model.tar.gz', 'w:gz') as f:
    f.add('model.pth')

It’s important to set the return_dict parameter to False when loading a pre-trained model, because Neuron compilation does not support dictionary-based model outputs. We upload our model.tar.gz file to Amazon Simple Storage Service (Amazon S3), saving its location in a variable named traced_model_url.
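
If you use the SageMaker Python SDK, one way to perform this upload is shown in the following sketch; the key prefix is an arbitrary choice, and the default SageMaker bucket is used here for convenience:

import sagemaker

# Upload the traced model archive to Amazon S3 and keep the S3 URI for the next step.
# The key prefix below is an arbitrary example.
sagemaker_session = sagemaker.Session()
sm_bucket = sagemaker_session.default_bucket()

traced_model_url = sagemaker_session.upload_data(
    path="model.tar.gz",
    bucket=sm_bucket,
    key_prefix="inf1-bert/model",
)
print(traced_model_url)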

We then use the PyTorchModel SageMaker API to instantiate and compile our model:

from sagemaker.pytorch.model import PyTorchModel
from sagemaker.predictor import Predictor
import json

traced_sm_model = PyTorchModel(
    model_data=traced_model_url,
    predictor_cls=Predictor,
    framework_version="1.5.1",
    role=role,
    sagemaker_session=sagemaker_session,
    entry_point="inference_inf1.py",
    source_dir="code",
    py_version="py3",
    name="inf1-bert-base-multilingual-uncased ",
)

compiled_inf1_model = traced_sm_model.compile(
    target_instance_family="ml_inf1",
    input_shape={"input_ids": [1, 512], "attention_mask": [1, 512]},
    job_name="testing_inf1_neo",
    role=role,
    framework="pytorch",
    framework_version="1.5.1",
    output_path=f"s3://{sm_bucket}/{your_model_destination}",
    compiler_options=json.dumps("--dtype int64"),
)

Compilation may take a few minutes. As you can see, our entry_point to model inference is our inference_inf1.py script. It determines how our model is loaded, how input and output are preprocessed, and how the model is used for prediction. Check out the full script on GitHub.
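
For orientation, the following is a simplified, illustrative sketch of what such an entry point script can look like. It follows the standard SageMaker PyTorch serving contract (model_fn, input_fn, predict_fn, output_fn); the model file name, tokenizer choice, and output format below are assumptions made for illustration, so refer to the actual inference_inf1.py on GitHub for the exact implementation.

# Simplified, illustrative sketch of an Inferentia inference script; not the actual
# inference_inf1.py from the repository.
import os
import json

import torch
import torch_neuron  # registers the Neuron ops needed to load the compiled model
from transformers import AutoTokenizer

JSON_CONTENT_TYPE = "application/json"
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-uncased")


def model_fn(model_dir):
    # The artifact name produced by the compilation job is an assumption here
    return torch.jit.load(os.path.join(model_dir, "model.pth"))


def input_fn(request_body, content_type=JSON_CONTENT_TYPE):
    # Expect a JSON-encoded pair of sequences and tokenize to the fixed 512-token shape
    seq_0, seq_1 = json.loads(request_body)
    tokens = tokenizer.encode_plus(
        seq_0, seq_1, max_length=512, padding="max_length",
        truncation=True, return_tensors="pt",
    )
    return tokens["input_ids"], tokens["attention_mask"]


def predict_fn(inputs, model):
    input_ids, attention_mask = inputs
    with torch.no_grad():
        return model(input_ids, attention_mask)


def output_fn(prediction, accept=JSON_CONTENT_TYPE):
    return json.dumps([t.tolist() for t in prediction])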

Finally, we can deploy our model to a SageMaker endpoint on an AWS Inferentia instance, and get predictions from it in real time:

from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

compiled_inf1_predictor = compiled_inf1_model.deploy(
    instance_type="ml.inf1.xlarge",
    initial_instance_count=1,
    endpoint_name=f"test-neo-inf1-bert",
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)

payload = seq_0, seq_1
print(compiled_inf1_predictor.predict(payload))

As you can see, we were able to get all the benefits of using AWS Inferentia instances on SageMaker by using simple APIs that complement the standard flow of the SageMaker SDK.

Final solution

The following architecture illustrates the solution deployed in AWS.

Architecture diagram

All the testing and evaluation analysis described in this post was done with the help of AWS AI/ML Specialist Solutions Architects in under 3 weeks, thanks to the ease of use of SageMaker and AWS Inferentia.

Conclusion

In this post, we shared how InfoJobs (Adevinta) uses AWS Inferentia with SageMaker endpoints to optimize the performance of NLP model inference in a cost-effective way, reducing inference times by up to 92% at a 75% lower cost than the best initial alternative. You can follow the process and code shared here to easily compile and deploy your own models using SageMaker, the Neuron SDK for PyTorch, and AWS Inferentia.

The results of the benchmarking tests performed between AWS AI/ML Specialist Solutions Architects and InfoJobs engineers were also validated in InfoJobs’s environment. This solution is now being deployed in production, handling the processing of all the CVs uploaded by users to the InfoJobs portal in real time.

As a next step, we will be exploring ways to optimize model training and our ML pipeline with SageMaker by relying on the Hugging Face integration with SageMaker and SageMaker Training Compiler, among other features.

We encourage you to try out AWS Inferentia with SageMaker, and connect with AWS to discuss your specific ML needs. For more examples on SageMaker and AWS Inferentia, you can also check out SageMaker examples on GitHub and AWS Neuron tutorials.


About the Authors

Juan Francisco Fernandez is an ML Engineer with Adevinta Spain. He joined InfoJobs to tackle the challenge of automating model development, thereby providing more time for data scientists to think about new experiments and models and freeing them of the burden of engineering tasks. In his spare time, he enjoys spending time with his son, playing basketball and video games, and learning languages.

Antonio Rodriguez is an AI & ML Specialist Solutions Architect at Amazon Web Services. He helps companies solve their challenges through innovation with the AWS Cloud and AI/ML services. Apart from work, he loves to spend time with his family and play sports with his friends.

João Moura is an AI & ML Specialist Solutions Architect at Amazon Web Services. He focuses mostly on NLP use cases and helping customers optimize deep learning model deployments.

Read More

Festo Develops With Isaac Sim to Drive Its Industrial Automation

Dionysios Satikidis was playing FIFA 19 when he realized the simulated soccer game’s realism offered a glimpse into the future for training robots.

An expert in AI and autonomous systems at Festo, a German industrial control and automation company, he believed the worlds of gaming and robotics would intersect.

“I’ve always been passionate about technology and gaming, and for me and my close colleagues it was clear that someday we will need the gaming tools to create autonomous robots,” said Satikidis, who is based in Esslingen, Germany.

It was a view shared in 2019 by teammate Jan Seyler, head of advanced control and analytics at Festo, and by Dimitrios Lagamtzis, who worked at Festo at the time.

Satikidis and his colleagues had begun keeping close tabs on NVIDIA and grew increasingly curious about Isaac Sim, a robotics simulation application and synthetic data generation tool built on NVIDIA Omniverse, the 3D design and simulation platform.

Finally, watching from the sidelines of the field wasn’t enough.

“I set up a call with NVIDIA, and when Dieter Fox, senior director of robotics research at NVIDIA, came on the call, I just asked if they were willing to work with us,” he said.

And that’s when it really started.

Tackling Sim-to-Real Challenge

Today Satikidis and a small team at Festo are developing AI for robotics automation. As a player in hardware and pneumatics used in robotics, Festo is making a move into AI-driven simulation, aiming at future Festo products.

Festo uses Isaac Sim to develop skills for its collaborative robots, or cobots. That requires building an awareness of their environments, human partners and tasks.

The lab is focused on narrowing the sim-to-real gap for a robotic arm, developing simulation that improves perception for real robots.

For building perception, its AI models are trained on synthetic data generated by Omniverse Replicator.

“Festo is working on its own cobots, which they plan to ship in 2023 in Europe,” said Satikidis.

Applying Cortex for Automation 

Festo uses Isaac Cortex, a tool in Isaac Sim, to simplify programming for cobot skills. Cortex is a framework for coordinating the Isaac tools into a cohesive robotic system to control virtual robots in Omniverse and physical robots in the real world.

“Our goal is to make programming task-aware robots as easy as programming gaming AIs,” said Nathan Ratliff, director of systems software at NVIDIA, in a recent GTC presentation.

Isaac Sim is a simulation suite that provides a diverse set of tools for robotics simulation. It enables sensor simulation, synthetic data generation, world representation, robot modeling and other capabilities.

The Omniverse platform and its Isaac Sim tools have been a game changer for Festo.

“This is incredible because you can manifest a video game to a real robot,” said Satikidis.

To learn more, check out the GTC session Isaac Cortex: A Decision Framework for Virtual and Physical Robots.

The post Festo Develops With Isaac Sim to Drive Its Industrial Automation appeared first on NVIDIA Blog.

Read More

What Is Zero Trust?

For all its sophistication, the Internet age has brought on a digital plague of security breaches. The steady drumbeat of data and identity thefts spawned a new movement and a modern mantra that’s even been the subject of a U.S. presidential mandate — zero trust.

So, What Is Zero Trust?

Zero trust is a cybersecurity strategy for verifying every user, device, application and transaction in the belief that no user or process should be trusted.

That definition comes from the NSTAC report, a 56-page document on zero trust compiled in 2021 by the U.S. National Security Telecommunications Advisory Committee, a group that included dozens of security experts led by a former AT&T CEO.

In an interview, John Kindervag, the former Forrester Research analyst who created the term, noted that he defines it this way in his Zero Trust Dictionary: Zero trust is a strategic initiative that helps prevent data breaches by eliminating digital trust in a way that can be deployed using off-the-shelf technologies that will improve over time.

What Are the Basic Tenets of Zero Trust?

In his 2010 report that coined the term, Kindervag laid out three basic tenets of zero trust. Because all network traffic should be untrusted, he said users must:

  • verify and secure all resources,
  • limit and strictly enforce access control, and
  • inspect and log all network traffic.

That’s why zero trust is sometimes known by the motto, “Never Trust, Always Verify.”

How Do You Implement Zero Trust?

As the definitions suggest, zero trust is not a single technique or product, but a set of principles for a modern security policy.

In its seminal 2020 report, the U.S. National Institute for Standards and Technology (NIST) detailed guidelines for implementing zero trust.

Zero Trust architecture from NIST

Its general approach is described in the chart above. It uses a security information and event management (SIEM) system to collect data and continuous diagnostics and mitigation (CDM) to analyze it and respond to insights and events it uncovers.

It’s an example of a security plan also called a zero trust architecture (ZTA) that creates a more secure network called a zero trust environment.

But one size doesn’t fit all in zero trust. There’s no “single deployment plan for ZTA [because each] enterprise will have unique use cases and data assets,” the NIST report said.

Five Steps to Zero Trust

The job of deploying zero trust can be boiled down to five main steps.

It starts by defining a so-called protect surface: what users want to secure. A protect surface can span systems inside a company’s offices, the cloud and the edge.

From there, users create a map of the transactions that typically flow across their networks and a zero trust architecture to protect them. Then they establish security policies for the network.

Finally, they monitor network traffic to make sure transactions stay within the policies.

Five step process for zero trust

Both the NSTAC report (above) and Kindervag suggest these same steps to create a zero trust environment.

It’s important to note that zero trust is a journey not a destination. Consultants and government agencies recommend users adopt a zero trust maturity model to document an organization’s security improvements over time.

The Cybersecurity and Infrastructure Security Agency, part of the U.S. Department of Homeland Security, described one such model (see chart below) in a 2021 document.

Zero Trust maturity model from CISA

In practice, users in zero trust environments request access to each protected resource separately. They typically use multi-factor authentication (MFA) such as providing a password on a computer, then a code sent to a smartphone.

The NIST report lists ingredients for an algorithm (below) that determines whether or not a user gets access to a resource.

NIST algorithm for zero trust access

“Ideally, a trust algorithm should be contextual, but this may not always be possible,” given a company’s resources, it said.

Some argue the quest for an algorithm to measure trustworthiness is counter to the philosophy of zero trust. Others note that machine learning has much to offer here, capturing context across many events on a network to help make sound decisions on access.
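
As a purely illustrative toy example (not taken from the NIST report), a contextual policy engine might weigh a handful of signals before granting access. The signal names, weights and threshold below are invented for this sketch:

# Toy illustration of a contextual access decision; all signals, weights and the
# threshold are made up for this sketch and are not part of any standard.
def access_decision(request):
    score = 0.0
    score += 0.4 if request["mfa_passed"] else 0.0
    score += 0.3 if request["device_managed"] else 0.0
    score += 0.2 if request["geolocation_expected"] else 0.0
    score += 0.1 if request["behavior_typical"] else 0.0  # e.g., from an ML anomaly model
    # Deny by default; each protected resource could carry its own threshold.
    return "allow" if score >= 0.7 else "deny"

print(access_decision({
    "mfa_passed": True,
    "device_managed": True,
    "geolocation_expected": False,
    "behavior_typical": True,
}))  # prints "allow" (score 0.8)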

The Big Bang of Zero Trust

In May 2021, President Joe Biden released an executive order mandating zero trust for the government’s computing systems.

The order gave federal agencies 60 days to adopt zero trust architectures based on the NIST recommendations. It also called for a playbook on dealing with security breaches, a safety board to review major incidents — even a program to establish cybersecurity warning labels for some consumer products.

It was a big bang moment for zero trust that’s still echoing around the globe.

“The likely effect this had on advancing zero trust conversations within boardrooms and among information security teams cannot be overstated,” the NSTAC report said.

What’s the History of Zero Trust?

Around 2003, ideas that led to zero trust started bubbling up inside the U.S. Department of Defense, leading to a 2007 report. About the same time, an informal group of industry security experts called the Jericho Forum coined the term “de-perimeterisation.”

Kindervag crystalized the concept and gave it a name in his bombshell September 2010 report.

The industry’s focus on building a moat around organizations with firewalls and intrusion detection systems was wrongheaded, he argued. Bad actors and inscrutable data packets were already inside organizations, threats that demanded a radically new approach.

Security Goes Beyond Firewalls

From his early days installing firewalls, “I realized our trust model was a problem,” he said in an interview. “We took a human concept into the digital world, and it was just silly.”

At Forrester, he was tasked with finding out why cybersecurity wasn’t working. In 2008, he started using the term zero trust in talks describing his research.

After some early resistance, users started embracing the concept.

“Someone once told me zero trust would become my entire job. I didn’t believe him, but he was right,” said Kindervag, who, in various industry roles, has helped hundreds of organizations build zero trust environments.

An Expanding Zero Trust Ecosystem

Indeed, Gartner projects that by 2025 at least 70% of new remote access deployments will use what it calls zero trust network access (ZTNA), up from less than 10% at the end of 2021. (Gartner, Emerging Technologies: Adoption Growth Insights for Zero Trust Network Access, G00764424, April 2022)

That’s in part because the COVID lockdown accelerated corporate plans to boost security for remote workers. And many firewall vendors now include ZTNA capabilities in their products.

Market watchers estimate at least 50 vendors from Appgate to Zscaler now offer security products aligned with the zero trust concepts.

AI Automates Zero Trust

Users in some zero trust environments express frustration with repeated requests for multi-factor authentication. It’s a challenge that some experts see as an opportunity for automation with machine learning.

For example, Gartner suggests applying analytics in an approach it calls continuous adaptive trust. CAT (see chart below) can use contextual data — such as device identity, network identity and geolocation — as a kind of digital reality check to help authenticate users.

Gartner on MFA to CAT for zero trust journey
Gartner lays out zero trust security steps. Source: Gartner, Shift Focus From MFA to Continuous Adaptive Trust, G00745072, December 2021.

In fact, networks are full of data that AI can sift in real time to automatically enhance security.

“We do not collect, maintain and observe even half the network data we could, but there’s intelligence in that data that will form a holistic picture of a network’s security,” said Bartley Richardson, senior manager of AI infrastructure and cybersecurity engineering at NVIDIA.

Human operators can’t track all the data a network spawns or set policies for all possible events. But they can apply AI to scour data for suspicious activity, then respond fast.

“We want to give companies the tools to build and automate robust zero trust environments with defenses that live throughout the fabric of their data centers,” said Richardson, who leads development on NVIDIA Morpheus, an open AI cybersecurity framework.

NVIDIA Morpheus for zero trust

NVIDIA provides pretrained AI models for Morpheus, or users can choose a model from a third party or build one themselves.

“The backend engineering and pipeline work is hard, but we have expertise in that, and we can architect it for you,” he said.

It’s the kind of capability experts like Kindervag see as part of the future for zero trust.

“Manual response by security analysts is too difficult and ineffective,” he wrote in a 2014 report. “The maturity of systems is such that a valuable and reliable level of automation is now achievable.”

To learn more about AI and zero trust, read this blog or watch the video below.

The post What Is Zero Trust? appeared first on NVIDIA Blog.

Read More

Feel the Need … for Speed as ‘Top Goose’ Debuts In the NVIDIA Studio

Editor’s note: This post is part of our weekly In the NVIDIA Studio series, which celebrates featured artists, offers creative tips and tricks, and demonstrates how NVIDIA Studio technology accelerates creative workflows. 

You can be my wing-wing anytime.

This week In the NVIDIA Studio takes off with the debut of Top Goose, a short animation created with Omniverse Machinima and inspired by one of the greatest fictional pilots to ever grace the big screen.

The project was powered by PCs using the same breed of GPU that has produced every Best Visual Effects nominee at the Academy Awards for 14 years: multiple systems with NVIDIA RTX A6000 GPUs and an NVIDIA Studio laptop — the Razer Blade 15 with a GeForce RTX 3070 Laptop GPU.

The team took Top Goose from concept to completion in just two weeks. It likely would’ve taken at least twice as long without the remote collaboration NVIDIA Omniverse offers NVIDIA RTX and GeForce RTX users.

 

Built to showcase the #MadeinMachinima contest, the project drew on a simple inspiration. One of the NVIDIANs involved in the project, Dane Johnston, succinctly noted, “How do you get a midcentury legionnaire on an aircraft carrier and what would he be doing? He’d be getting chased by a goose, of course.”

Ready to Take Off

Johnston and fellow NVIDIANs Dave Tyner, Matthew Harwood and Terry Naas began the project by prepping models for the static assets in Autodesk 3ds Max. Several of the key models came from TurboSquid by Shutterstock, including the F14 fighter jet, aircraft carrier, goose and several props.

High-quality models such as the F14 fighter jet, courtesy of TurboSquid by Shutterstock, are available to all Omniverse users.

TurboSquid has a huge library of 3D models to begin creating within Omniverse. Simply drag and drop models into Omniverse and start collaborating with team members — regardless of the 3D application they’re using or where they’re physically located.

Tyner could easily integrate 3D models he already owned by simply dropping them into the scene from the new Asset Store browser in Omniverse.

Texture details were added within Omniverse in real time using Adobe Photoshop.

The team worked seamlessly between apps within Omniverse, in real time, including Adobe Photoshop.

From there, Adobe Photoshop was used to edit character uniforms and various props within the scene, including the Top Goose badge at the end of the cinematic.

Animators, Mount Up!

Once models were ready, animation could begin. The team used Reallusion’s iClone Character Creator Omniverse Connector to import characters to Machinima.

Omniverse-ready USD animations from Reallusion ActorCore were dragged and dropped into the Omniverse Machinima content browser for easy access.

 

The models and animations were brought into Machinima by Tyner, where he used the retargeting function to instantly apply the animations to different characters, including the top knight from Mount & Blade II: Bannerlord — one of the hundreds of assets included with Omniverse.

Tyner, a generalist 3D artist, supplemented the project by creating custom animations from motion capture using an Xsens suit that was exported to FBX. Using a series of Omniverse Connectors, he brought the FBX files into Autodesk 3ds Max and ran a quick script to create a rudimentary skin.

Then, Tyner sent the skinned character and animation into Autodesk Maya for USD skeleton export to Machinima, using the Autodesk Maya Connector. The animation was automatically retargeted onto the main character inside Machinima. Once the data was captured, the entire mocap workflow took only a few minutes using NVIDIA Studio tools.

If Tyner didn’t have a motion-capture suit, he could have used Machinima’s AI Pose Estimation — a tool within Omniverse that lets anyone with a camera capture movement and create a 3D animation.

Static objects were all animated in Machinima with the Curve Editor and Sequencer. These tools allowed the team to animate anything they wanted, exactly how they wanted. For instance, the team animated the fighter jet barrel rolls with gravity keyed on a y-axis — allowing gravity to be turned on and off.

This technique, coupled with NVIDIA PhysX, also allowed the team to animate the cockpit scene with the flying bread and apples simply by turning off the gravity. The objects in the scene all obeyed the laws of physics and flew naturally without any manual animation.

The team collaborates virtually to achieve realistic animations using the Omniverse platform.

Animating the mighty wings of the goose was no cheap trick. While some of the animations were integrated as part of the asset from TurboSquid, the team collaborated within Omniverse to animate the inverted scenes.

Tyner used Omniverse Cloud Simple Share Early Access to package and send the entire USD project to Johnston and Harwood, NVIDIA’s resident audiophile. Harwood added sounds like the fly-bys and goose honks. Johnston brought the Mount & Blade II: Bannerlord character to life by recording custom audio and animating the character’s face with Omniverse Audio2Face.

Traditional audio workflows usually involve multiple pieces of audio recordings sent piecemeal to the animators. With Simple Share, Tyner packaged and sent the entire USD project to Harwood, who was able to add audio directly to the file and return it with a single click.

Revvin’ Up the Engine

Working in Omniverse meant the team could make adjustments and see the changes, with full-quality resolution, in real time. This saved the team a massive amount of time by not having to wait for single shots to render out.

The 3D artist team works together to finish the scene in Omniverse Machinima and Audio2Face.

With individuals working hundreds of miles apart, the team leveraged Omniverse’s collaboration capabilities with Omniverse Nucleus. They were able to complete set dressing, layout and lighting adjustments in a single real-time jam session.

 

The new constraints system in Machinima was integral to the camera work. Tyner created the shaky camera that helps bring the feeling of being on an aircraft carrier by animating a shaking ball in Autodesk 3ds Max, bringing it in via its Omniverse Connector, and constraining a camera to it using OmniGraph.

Equally important are the new Curve Editor and Sequencer. They gave the team complete intuitive control of the creative process. They used Sequencer to quickly and easily choreograph animated characters, lights, constraints and cameras — including field of view and depth of field.

With all elements in place, all that was left was the final render — conveniently and quickly handled using the Omniverse RTX renderer and without any file transfers in Omniverse Nucleus.

Tyner noted, “This is the first major project that I’ve done where I was never blocked. With Omniverse, everything just worked and was really easy to use.”

Not only was it easy to use individually, but Omniverse, part of the NVIDIA Studio suite of software, let this team of artists easily collaborate while working in and out of various apps from multiple locations.

Top Prizes in the #MadeinMachinima Contest

Top Goose is a showcase for #MadeinMachinima. The contest, which is currently running and closes June 27, asks artists to build and animate a cinematic short story with the Omniverse Machinima app for a chance to win RTX-accelerated NVIDIA Studio laptops.

RTX creators everywhere can remix and animate characters from Squad, Mount & Blade II: Bannerlord, Shadow Warrior 3, Post Scriptum, Beyond the Wire and Mechwarrior Mercenaries 5 using the Omniverse Machinima app.

Experiment with the AI-enabled tools like Audio2Face for instant facial animation from just an audio track; create intuitively with PhysX-powered tools to help you build as if building in reality; or add special effects with Blast for destruction and Flow for smoke and fire. You can use any third-party tools to help with your workflow, just assemble and render your final submission using Omniverse Machinima.

Learn more about NVIDIA Omniverse, including tips, tricks and more on the Omniverse YouTube channel. For additional support, explore the Omniverse forums or join the Discord server to chat with the community. Check out the Omniverse Twitter, Instagram and Medium page to stay up to date.

Follow NVIDIA Studio on Instagram, Twitter and Facebook. Access a wide range of tutorials on the Studio YouTube channel and get updates in your inbox by subscribing to the Studio newsletter.

The post Feel the Need … for Speed as ‘Top Goose’ Debuts In the NVIDIA Studio appeared first on NVIDIA Blog.

Read More

Amazon SageMaker Studio and SageMaker Notebook Instance now come with JupyterLab 3 notebooks to boost developer productivity

Amazon SageMaker comes with two options to spin up fully managed notebooks for exploring data and building machine learning (ML) models. The first option is fast-start, collaborative notebooks accessible within Amazon SageMaker Studio – a fully integrated development environment (IDE) for machine learning. You can quickly launch notebooks in Studio, easily dial the underlying compute resources up or down without interrupting your work, and even share your notebook as a link in a few simple clicks. In addition to creating notebooks, you can perform all the ML development steps to build, train, debug, track, deploy, and monitor your models in a single pane of glass in Studio. The second option is Amazon SageMaker Notebook Instance – a single, fully managed ML compute instance running notebooks in the cloud, offering customers more control over their notebook configurations.

Today, we’re excited to announce that SageMaker Studio and SageMaker Notebook Instance now come with JupyterLab 3 notebooks. The new notebooks provide data scientists and developers a modern IDE complete with developer productivity tools for code authoring, refactoring and debugging, and support for the latest open-source Jupyter extensions. AWS is a major contributor to the Jupyter open-source community and we’re happy to bring the latest Jupyter capabilities to our customers.

In this post, we showcase some of the exciting new features built into SageMaker notebooks and call attention to some of our favorite open-source extensions that improve the developer experience when using SageMaker to build, train, and deploy your ML models.

What’s new with notebooks on SageMaker

The new notebooks come with several features out of the box that improve the SageMaker developer experience, including the following:

  • An integrated debugger with support for breakpoints and variable inspection
  • A table of contents panel to more easily navigate notebooks
  • A filter bar for the file browser
  • Support for multiple display languages
  • The ability to install extensions through pip, Conda, and Mamba

With the integrated debugger, you can inspect variables and step through breakpoints while you interactively build your data science and ML code. You can access the debugger by simply choosing the debugger icon on the notebook toolbar.

As of this writing, the debugger is available for our newly launched Base Python 2.0 and Data Science 2.0 images in SageMaker Studio and amazonei_pytorch_latest_p37, pytorch_p38, and tensorflow2_p38 kernels in SageMaker Notebook Instance, with plans to support more in the near future.

The table of contents panel provides an excellent utility to navigate notebooks and more easily share your findings with colleagues.

JupyterLab extensions

With the upgraded notebooks in SageMaker, you can take advantage of the ever-growing community of open-source JupyterLab extensions. In this section, we highlight a few that fit naturally into the SageMaker developer workflow, but we encourage you to browse the available extensions or even create your own.

The first extension we highlight is the Language Server Protocol extension. This open-source extension enables modern IDE functionality such as tab completion, syntax highlighting, jump to reference, variable renaming across notebooks and modules, diagnostics, and much more. This extension is very useful for those developers who want to author Python modules as well as notebooks.

Another useful extension for the SageMaker developer workflow is the jupyterlab-s3-browser. This extension picks up your SageMaker execution role’s credentials and allows you to browse, load, and write files directly to Amazon Simple Storage Service (Amazon S3).

Install extensions

JupyterLab 3 now makes the process of packaging and installing extensions significantly easier. You can install the aforementioned extensions through bash scripts. For example, in SageMaker Studio, open the system terminal from the Studio launcher and run the following commands. Note that the upgraded Studio has a separate, isolated Conda environment for managing the Jupyter Server runtime, so you need to install extensions into the studio Conda environment. To install extensions in SageMaker Notebook Instance, there is no need to switch Conda environments.

In addition, you can automate the installation of these extensions using lifecycle configurations so they’re persisted between Studio restarts. You can configure this for all the users in the domain or at an individual user level.

For Python Language Server, use the following code to install the extensions:

conda init
conda activate studio
pip install jupyterlab-lsp
pip install 'python-lsp-server[all]'
conda deactivate
nohup supervisorctl -c /etc/supervisor/conf.d/supervisord.conf restart jupyterlabserver

For Amazon S3 filebrowser, use the following:

conda init
conda activate studio
pip install jupyterlab_s3_browser
jupyter serverextension enable --py jupyterlab_s3_browser
conda deactivate
nohup supervisorctl -c /etc/supervisor/conf.d/supervisord.conf restart jupyterlabserver

Be sure to refresh your browser after installation.

For more information about writing similar lifecycle scripts for SageMaker Notebook Instance, refer to Customize a Notebook Instance Using a Lifecycle Configuration Script and Customize your Amazon SageMaker notebook instances with lifecycle configurations and the option to disable internet access. Additionally, for more information on extension management, including how to write lifecycle configurations that work for both versions 1 and 3 of JupyterLab notebooks for backward compatibility, see Installing JupyterLab and Jupyter Server extensions.
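
As one possible way to automate this (a sketch, not a prescribed procedure), you can register the commands shown earlier as a Studio lifecycle configuration with the boto3 SageMaker client. The script file name and configuration name below are placeholders; attach the returned ARN to your domain or user profile (under JupyterServerAppSettings) so the script runs whenever the Jupyter server app starts:

import base64
import boto3

# Register a bash script (for example, the extension-install commands shown earlier,
# saved locally as install-lsp-extension.sh) as a Studio lifecycle configuration.
with open("install-lsp-extension.sh", "rb") as f:
    script = f.read()

sm_client = boto3.client("sagemaker")
response = sm_client.create_studio_lifecycle_config(
    StudioLifecycleConfigName="install-jupyterlab-lsp",          # placeholder name
    StudioLifecycleConfigContent=base64.b64encode(script).decode("utf-8"),
    StudioLifecycleConfigAppType="JupyterServer",
)
print(response["StudioLifecycleConfigArn"])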

Get started with JupyterLab 3 notebooks in Studio

If you’re creating a new Studio domain, you can specify the default notebook version directly from the AWS Management Console or using the API.

On the SageMaker Control Panel, change your notebook version when editing your domain settings, in the Jupyter Lab version section.

To use the API, configure the JupyterServerAppSettings parameter as follows:

aws --region <REGION> \
sagemaker create-domain \
--domain-name <NEW_DOMAIN_NAME> \
--auth-mode <AUTHENTICATION_MODE> \
--subnet-ids <SUBNET-IDS> \
--vpc-id <VPC-ID> \
--default-user-settings '{
  "JupyterServerAppSettings": {
    "DefaultResourceSpec": {
      "SageMakerImageArn": "arn:aws:sagemaker:<REGION>:<ACCOUNT_ID>:image/jupyter-server-3",
      "InstanceType": "system"
    }
  }
}'

If you’re an existing Studio user, you can modify your notebook version by choosing your user profile on the SageMaker Control Panel and choosing Edit.

Then choose your preferred version in the Jupyter Lab version section.

For more information, see JupyterLab Versioning.

Get started with JupyterLab 3 on SageMaker Notebook Instance

SageMaker Notebook Instance users can also specify the default notebook version, both from the console and using our API. If using the console, note that the option to choose JupyterLab 3 notebooks is only available for the latest generation of SageMaker notebook instances, which come with Amazon Linux 2.

On the SageMaker console, choose your version while creating your notebook instance, under Platform identifier.

If using the API, use the following code:

aws sagemaker create-notebook-instance --notebook-instance-name <NEW_NOTEBOOK_NAME> \
--instance-type <INSTANCE_TYPE> \
--role-arn <YOUR_ROLE_ARN> \
--platform-identifier notebook-al2-v2

For more information, see Creating a notebook with your JupyterLab version.

Conclusion

SageMaker Studio and SageMaker Notebook Instance now offer an upgraded notebook experience to users. We encourage you to try out the new capabilities and further boost developer productivity with these enhancements!


About the Authors

Sean Morgan is an AI/ML Solutions Architect at AWS. He has experience in the semiconductor and academic research fields, and uses his experience to help customers reach their goals on AWS. In his free time, Sean is an active open-source contributor/maintainer and is the special interest group lead for TensorFlow Add-ons.

Arkaprava De is a Senior Software Engineer at AWS. He has been at Amazon for over 7 years and is currently working on improving the Amazon SageMaker Studio IDE experience.

Kunal Jha is a Senior Product Manager at AWS. He is focused on building Amazon SageMaker Studio as the IDE of choice for all ML development steps. In his spare time, Kunal enjoys skiing and exploring the Pacific Northwest. You can find him on LinkedIn.

Read More

Hallucinating to better text translation

As babies, we babble and imitate our way to learning languages. We don’t start off reading raw text, which requires fundamental knowledge and understanding about the world, as well as the advanced ability to interpret and infer descriptions and relationships. Rather, humans begin our language journey slowly, by pointing and interacting with our environment, basing our words and perceiving their meaning through the context of the physical and social world. Eventually, we can craft full sentences to communicate complex ideas.

Similarly, when humans begin learning and translating into another language, the incorporation of other sensory information, like multimedia, paired with the new and unfamiliar words, like flashcards with images, improves language acquisition and retention. Then, with enough practice, humans can accurately translate new, unseen sentences in context without the accompanying media; however, imagining a picture based on the original text helps.

This is the basis of a new machine learning model, called VALHALLA, by researchers from MIT, IBM, and the University of California at San Diego, in which a trained neural network sees a source sentence in one language, hallucinates an image of what it looks like, and then uses both to translate into a target language. The team found that their method demonstrates improved accuracy of machine translation over text-only translation. Further, it provided an additional boost for cases with long sentences, under-resourced languages, and instances where part of the source sentence is inaccessible to the machine translator.

As a core task within the AI field of natural language processing (NLP), machine translation is an “eminently practical technology that’s being used by millions of people every day,” says study co-author Yoon Kim, assistant professor in MIT’s Department of Electrical Engineering and Computer Science with affiliations in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the MIT-IBM Watson AI Lab. With recent, significant advances in deep learning, “there’s been an interesting development in how one might use non-text information — for example, images, audio, or other grounding information — to tackle practical tasks involving language,” says Kim, because “when humans are performing language processing tasks, we’re doing so within a grounded, situated world.” The pairing of hallucinated images and text during inference, the team postulated, imitates that process, providing context for improved performance over current state-of-the-art techniques, which utilize text-only data.

This research will be presented at the IEEE / CVF Computer Vision and Pattern Recognition Conference this month. Kim’s co-authors are UC San Diego graduate student Yi Li and Professor Nuno Vasconcelos, along with research staff members Rameswar Panda, Chun-fu “Richard” Chen, Rogerio Feris, and IBM Director David Cox of IBM Research and the MIT-IBM Watson AI Lab.

Learning to hallucinate from images

When we learn new languages and to translate, we’re often provided with examples and practice before venturing out on our own. The same is true for machine-translation systems; however, if images are used during training, these AI methods also require visual aids for testing, limiting their applicability, says Panda.

“In real-world scenarios, you might not have an image with respect to the source sentence. So, our motivation was basically: Instead of using an external image during inference as input, can we use visual hallucination — the ability to imagine visual scenes — to improve machine translation systems?” says Panda.

To do this, the team used an encoder-decoder architecture with two transformers, a type of neural network model that’s suited for sequence-dependent data, like language, and that can pay attention to key words and the semantics of a sentence. One transformer generates a visual hallucination, and the other performs multimodal translation using outputs from the first transformer.

During training, there are two streams of translation: a source sentence paired with its ground-truth image, and the same source sentence paired with a visually hallucinated image to make a text-image pair. First, the ground-truth image and sentence are tokenized into representations that can be handled by transformers; for the case of the sentence, each word is a token. The source sentence is tokenized again, but this time passed through the visual hallucination transformer, outputting a hallucination, a discrete image representation of the sentence. The researchers incorporated an autoregression that compares the ground-truth and hallucinated representations for congruency — e.g., homonyms: a reference to an animal “bat” isn’t hallucinated as a baseball bat. The hallucination transformer then uses the difference between them to optimize its predictions and visual output, making sure the context is consistent.

The two sets of tokens are then simultaneously passed through the multimodal translation transformer, each containing the sentence representation and either the hallucinated or ground-truth image. The tokenized text translation outputs are compared with the goal of being similar to each other and to the target sentence in another language. Any differences are then relayed back to the translation transformer for further optimization.

For testing, the ground-truth image stream drops off, since images likely wouldn’t be available in everyday scenarios.

“To the best of our knowledge, we haven’t seen any work which actually uses a hallucination transformer jointly with a multimodal translation system to improve machine translation performance,” says Panda.

Visualizing the target text

To test their method, the team put VALHALLA up against other state-of-the-art multimodal and text-only translation methods. They used public benchmark datasets containing ground-truth images with source sentences, and a dataset for translating text-only news articles. The researchers measured its performance over 13 tasks, ranging from translation for well-resourced languages (like English, German, and French) and under-resourced languages (like English to Romanian) to non-English translation (like Spanish to French). The group also tested varying transformer model sizes, how accuracy changes with sentence length, and translation under limited textual context, where portions of the text were hidden from the machine translators.

The team observed significant improvements over text-only translation methods, improving data efficiency, and that smaller models performed better than the larger base model. As sentences became longer, VALHALLA’s performance over other methods grew, which the researchers attributed to the addition of more ambiguous words. In cases where part of the sentence was masked, VALHALLA could recover and translate the original text, which the team found surprising.

Further unexpected findings arose: “Where there weren’t as many training [image and] text pairs, [like for under-resourced languages], improvements were more significant, which indicates that grounding in images helps in low-data regimes,” says Kim. “Another thing that was quite surprising to me was this improved performance, even on types of text that aren’t necessarily easily connectable to images. For example, maybe it’s not so surprising if this helps in translating visually salient sentences, like the ‘there is a red car in front of the house.’ [However], even in text-only [news article] domains, the approach was able to improve upon text-only systems.”

While VALHALLA performs well, the researchers note that it does have limitations, requiring pairs of sentences to be annotated with an image, which could make it more expensive to obtain. It also performs better in its ground domain and not the text-only news articles. Moreover, Kim and Panda note, a technique like VALHALLA is still a black box, with the assumption that hallucinated images are providing helpful information, and the team plans to investigate what and how the model is learning in order to validate their methods.

In the future, the team plans to explore other means of improving translation. “Here, we only focus on images, but there are other types of multimodal information — for example, speech, video or touch, or other sensory modalities,” says Panda. “We believe such multimodal grounding can lead to even more efficient machine translation models, potentially benefiting translation across many low-resource languages spoken in the world.”

This research was supported, in part, by the MIT-IBM Watson AI Lab and the National Science Foundation.

Read More

Reinventing retail with no-code machine learning: Sales forecasting using Amazon SageMaker Canvas

Retail businesses are data-driven—they analyze data to get insights about consumer behavior, understand shopping trends, make product recommendations, optimize websites, plan for inventory, and forecast sales.

A common approach for sales forecasting is to use historical sales data to predict future demand. Forecasting future demand is critical for planning and impacts inventory, logistics, and even marketing campaigns. Sales forecasting is generated at many levels such as product, sales channel (store, website, partner), warehouse, city, or country.

Sales managers and planners have domain expertise and knowledge of sales history, but lack data science and programming skills to create machine learning (ML) models to generate accurate sales forecasts. They need an intuitive, easy-to-use tool to create ML models without writing code.

To help achieve the agility and effectiveness that business analysts seek, we’ve introduced Amazon SageMaker Canvas, a no-code ML solution that helps companies accelerate delivery of ML solutions down to hours or days. Canvas enables analysts to easily use available data in data lakes, data warehouses, and operational data stores; build ML models; and use them to make predictions interactively and for batch scoring on bulk datasets—all without writing a single line of code.

In this post, we show how to use Canvas to generate sales forecasts at the retail store level.

Solution overview

Canvas can import data from local disk files, Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Snowflake (as of this writing).

In this post, we use Amazon Redshift cluster-based data with Canvas to build ML models to generate sales forecasts. Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. Retail industry customers use Amazon Redshift to store and analyze large-scale, enterprise-level structured and semi-structured business data. It helps them accelerate data-driven business decisions in a performant and scalable way.

Generally, data engineers are responsible for ingesting and curating sales data in Amazon Redshift. Many retailers have a data lake where this has been done, but we show the steps here for clarity, and to illustrate how the data engineer can help the business analyst (such as the sales manager) by curating data for their use. This allows the data engineers to enable self-service data for use by business analysts.

In this post, we use a sample dataset that consists of two tables: storesales and storepromotions. You can prepare this sample dataset using your own sales data.

The storesales table keeps historical time series sales data for the stores. The table details are as follows:

Column Name | Data Type
store       | INT
saledate    | TIMESTAMP
totalsales  | DECIMAL

The storepromotions table contains historical data from the stores regarding promotions and school holidays, on a daily time frame. The table details are as follows:

Column Name   | Data Type
store         | INT
saledate      | TIMESTAMP
promo         | INT (0/1)
schoolholiday | INT (0/1)

We combine data from these two tables to train an ML model that can generate forecasts for the store sales.

Canvas is a visual, point-and-click service that makes it easy to build ML models and generate accurate predictions. There are four steps involved in building the forecasting model:

  1. Select data from the data source (Amazon Redshift in this case).
  2. Configure and build (train) your model.
  3. View model insights such as accuracy and column impact on the prediction.
  4. Generate predictions (sales forecasts in this case).

Before we can start using Canvas, we need to prepare our data and configure an AWS Identity and Access Management (IAM) role for Canvas.

Create tables and load sample data

To use the sample dataset, complete the following steps:

  1. Upload the storesales and storepromotions sample data files store_sales.csv and store_promotions.csv to an Amazon S3 bucket. Make sure the bucket is in the same Region where you run your Amazon Redshift cluster.
  2. Create an Amazon Redshift cluster (if not running).
  3. Access the Amazon Redshift query editor.
  4. Create the tables and run the COPY command to load data. Use the appropriate IAM role for the Amazon Redshift cluster in the following code:
create table storesales
(
store INT,
saledate VARCHAR,
totalsales DECIMAL
);

create table storepromotions
(
store INT,
saledate VARCHAR,
promo INT,
schoolholiday INT
);

copy storesales (store,saledate,totalsales)
from 's3://<YOUR_BUCKET_NAME>/store_sales.csv'
iam_role '<REDSHIFT_IAM_ROLE_ARN>'
CSV
IGNOREHEADER 1;

copy storepromotions (store,saledate,promo,schoolholiday)
from 's3://<YOUR_BUCKET_NAME>/store_promotions.csv'
iam_role '<REDSHIFT_IAM_ROLE_ARN>'
CSV
IGNOREHEADER 1;

By default, the sample data is loaded in the storesales and storepromotions tables in the public schema of the dev database. But you can choose to use a different database and schema.
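
If you want a quick sanity check that the load worked, one option (a sketch, not required for this walkthrough) is to count the rows with the Amazon Redshift Data API. The cluster identifier below is a placeholder, while the database and user match the values used later in this post:

import time
import boto3

# Count the rows just loaded into both tables using the Redshift Data API.
client = boto3.client("redshift-data")

stmt = client.execute_statement(
    ClusterIdentifier="<YOUR_CLUSTER_ID>",   # placeholder
    Database="dev",
    DbUser="awsuser",
    Sql="SELECT 'storesales' AS table_name, COUNT(*) FROM storesales "
        "UNION ALL SELECT 'storepromotions', COUNT(*) FROM storepromotions;",
)

# Wait for the statement to finish, then print the row counts
while client.describe_statement(Id=stmt["Id"])["Status"] not in ("FINISHED", "FAILED", "ABORTED"):
    time.sleep(1)
for record in client.get_statement_result(Id=stmt["Id"])["Records"]:
    print(record)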

Create an IAM role for Canvas

Canvas uses an IAM role to access other AWS services. To configure your role, complete the following steps:

  1. Create your role. For instructions, refer to Give your users permissions to perform time series forecasting.
  2. Replace the code in the Trusted entities field on the Trust relationships tab.

The following code is the new trust policy for the IAM role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [ "sagemaker.amazonaws.com", 
            "forecast.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
  3. Provide the IAM role permission to Amazon Redshift. For instructions, refer to Give users permissions to import Amazon Redshift data.

The following screenshot shows your permission policies.

The IAM role should be assigned as the execution role for Canvas in the Amazon SageMaker domain configuration.

  4. On the SageMaker console, assign the IAM role you created as the execution role when configuring your SageMaker domain.

Both the data in the Amazon Redshift cluster and the Canvas configuration are now ready. You can use Canvas to build the forecasting model.

Launch Canvas

After the data engineers prepare the data in the Amazon Redshift data warehouse, the sales managers can use Canvas to generate forecasts.

To launch Canvas, the AWS account administrator first performs the following steps:

  • Create a SageMaker domain.
  • Create user profiles for the SageMaker domain.

For instructions, refer to Getting started with using Amazon SageMaker Canvas, or contact your AWS account administrator for guidance.

Launch the Canvas app from the SageMaker console. Make sure to launch Canvas in the same AWS Region as the Amazon Redshift cluster.

When Canvas is launched, you can start with the first step of selecting data from the data source.

Import data in Canvas

To import your data, complete the following steps:

  1. In the Canvas application, on the Datasets menu, choose Import.
  2. On the Import page, choose the Add connection menu and choose Redshift.

    The data engineer or cloud administrator can provide Amazon Redshift connection information to the sales manager. We show an example of the connection information in this post.
  3. For Type, choose IAM.
  4. For Cluster identifier, enter your Amazon Redshift cluster ID.
  5. For Database name, enter dev.
  6. For Database user, enter awsuser.
  7. For Unload IAM role, enter the IAM role you created earlier for the Amazon Redshift cluster.
  8. For Connection name, enter redshiftconnection.
  9. Choose Add connection.

    The connection between Canvas and the Amazon Redshift cluster is established. You can see the redshiftconnection icon on the top of the page.
  10. Drag and drop the storesales and storepromotions tables from the public schema to the right panel.
    Canvas automatically creates an inner join between the tables on their matching column names, store and saledate.

    You can update the joins and decide which fields to select from each table to create your desired dataset. You can configure this in two ways: by dragging and dropping tables in the Canvas user interface, or by editing the SQL script in Canvas if the sales manager knows SQL. We include an example of editing the SQL for completeness, and for the many business analysts who are trained in SQL; an alternative based on a pre-joined Amazon Redshift view is sketched at the end of this section. The end goal is a SQL statement that returns the desired dataset to import into Canvas.
  11. Choose Edit in SQL to see the SQL script used for the join.
  12. Modify the SQL statement with the following code:
    WITH DvtV AS (SELECT store, saledate, promo, schoolholiday FROM dev.public."storepromotions"),
         L394 AS (SELECT store, saledate, totalsales FROM dev.public."storesales")
    SELECT
        DvtV.promo,
        DvtV.schoolholiday,
        L394.totalsales,
        DvtV.saledate AS saledate,
        DvtV.store AS store
    FROM DvtV
    INNER JOIN L394
        ON DvtV.saledate = L394.saledate AND DvtV.store = L394.store;

  13. Choose Run SQL to run the query.

    When the query is complete, you can see a preview of the output. This is the final data that you want to import into Canvas for model training and forecasting.
  14. Choose Import data to import the data into Canvas.

When importing the data, provide a suitable name for the dataset, such as store_daily_sales_dataset.

The dataset is ready in Canvas. Now you can start training a model to forecast total sales across stores.
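
Alternatively, if the data engineer prefers to keep the join logic out of Canvas entirely, they can publish a pre-joined view in Amazon Redshift and let the sales manager import that single object. The following is a minimal sketch, assuming the tables and columns created earlier; the view name store_daily_sales is illustrative:

-- Pre-join the sales and promotions data so analysts can import one object
CREATE VIEW public.store_daily_sales AS
SELECT
    p.promo,
    p.schoolholiday,
    s.totalsales,
    p.saledate,
    p.store
FROM public.storepromotions p
INNER JOIN public.storesales s
    ON p.saledate = s.saledate
   AND p.store = s.store;

In Canvas, the view appears under the public schema alongside the tables and can be dragged to the right panel as a single dataset, with no join configuration needed.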

Configure and train the model

To configure model training in Canvas, complete the following steps:

  1. Choose the Models menu option and choose New Model.
  2. For the new model, give a suitable name such as store_sales_forecast_model.
  3. Select the dataset store_daily_sales_dataset.
  4. Choose Select dataset.

    On the Build tab, you can see data and column-level statistics as well as the configuration area for the model training.
  5. Select totalsales for the target column.
    Canvas automatically selects Time series forecasting as the model type.
  6. Choose Configure to start configuration of the model training.
  7. In the Time series forecasting configuration section, choose store as the unique identity column because we want to generate forecasts for each store.
  8. Choose saledate as the time stamps column because it represents the historical time series.
  9. Enter 120 as the number of days because we want to forecast sales for roughly the next 4 months (120 days).
  10. Choose Save.
  11. When the model training configuration is complete, choose Standard build to start the model training.

The Quick build and Preview model options aren’t available for the time series forecasting model type at the time of this writing. After you choose the standard build, the Analyze tab shows the estimated time for the model training.

Model training can take 1–4 hours to complete, depending on the data size. For the sample data used in this post, training took around 3 hours. When the model is ready, you can use it to generate forecasts.
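
While the model trains, it can be useful to note the last date in the historical data, because the forecast horizon begins immediately after it. The following is a minimal sketch of a quick check against the tables created earlier:

-- The forecast horizon starts right after the most recent saledate
SELECT MAX(saledate) AS last_historical_day FROM public.storesales;

For the sample data used in this post, the forecast range shown in the next section starts on 2015-07-31, which implies the history ends just before that date.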

Analyze results and generate forecasts

When the model training is complete, Canvas shows the prediction accuracy of the model on the Analyze tab. For this example, it shows prediction accuracy as 79.13%. We can also see the impact of the columns on the prediction; in this example, promo and schoolholiday don’t influence the prediction. Column impact information is useful in fine-tuning the dataset and optimizing the model training.

The forecasts are generated on the Predict tab. You can generate forecasts for all items (all stores) or for a selected single item (a single store). The tab also shows the date range for which forecasts can be generated.

As an example, we choose to view a single item and enter 2 as the store to generate sales forecasts for store 2 for the date range 2015-07-31 00:00:00 through 2015-11-28 00:00:00.

The generated forecasts show the average forecast as well as the upper and lower bounds of the forecast. These bounds help you take an aggressive or a balanced approach when acting on the forecast.

You can also download the generated forecasts as a CSV file or an image. The downloaded CSV file is useful for working with the forecast data offline.
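
If you want to analyze the downloaded forecasts alongside the historical data in Amazon Redshift, one option is to load the CSV back into a table with COPY, the same way the sample data was loaded. The following is a hypothetical sketch; the table name, column names, and file name are illustrative and depend on the export that Canvas produces:

-- Hypothetical target table for the exported forecasts
create table storeforecasts
(
store INT,
forecastdate TIMESTAMP,
lower_bound DECIMAL,
average_forecast DECIMAL,
upper_bound DECIMAL
);

-- Hypothetical file name for the exported CSV
copy storeforecasts (store, forecastdate, lower_bound, average_forecast, upper_bound)
from 's3://<YOUR_BUCKET_NAME>/canvas_forecasts.csv'
iam_role '<REDSHIFT_IAM_ROLE_ARN>'
CSV
IGNOREHEADER 1;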

The forecasts are based on the time series data that was available at training time. When a new baseline of data becomes available, you can upload it and change the dataset in Canvas to retrain the forecast model using the new data.

You can retrain the model multiple times as new source data is available.
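
On the Amazon Redshift side, refreshing the baseline can be as simple as appending the newly arrived rows to the existing tables before re-importing the dataset in Canvas. The following is a minimal sketch, assuming a new export file in Amazon S3; the file name new_store_sales.csv is illustrative:

-- COPY appends the new rows to the existing storesales table
copy storesales (store, saledate, totalsales)
from 's3://<YOUR_BUCKET_NAME>/new_store_sales.csv'
iam_role '<REDSHIFT_IAM_ROLE_ARN>'
CSV
IGNOREHEADER 1;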

Conclusion

Generating sales forecasts using Canvas is a configuration-driven, easy-to-use process. We showed how data engineers can help curate data for business analysts to use, and how business analysts can gain insights from their data. The business analyst can connect to data sources such as a local disk, Amazon S3, Amazon Redshift, or Snowflake to import data, join data across multiple tables, and train an ML forecasting model, which is then used to generate sales forecasts. As the historical sales data is updated, you can retrain the forecast model to maintain forecast accuracy.

Sales managers and operations planners can use Canvas without expertise in data science and programming. This expedites decision-making time, enhances productivity, and helps build operational plans.

To get started and learn more about Canvas, refer to the Amazon SageMaker Canvas documentation.


About the Authors

Brajendra Singh is a solutions architect at Amazon Web Services working with enterprise customers. He has a strong developer background and is a keen enthusiast of data and machine learning solutions.

Davide Gallitelli is a Specialist Solutions Architect for AI/ML in the EMEA region. He is based in Brussels and works closely with customers throughout Benelux. He has been a developer since he was very young, starting to code at the age of 7. He started learning AI/ML at university, and has fallen in love with it since then.
