Bringing a Novel Idea to Life, NVIDIA Artists Create Retro Writer’s Room in Omniverse With ‘The Storyteller’

Real-time rendering and photorealistic graphics used to be tall tales, but NVIDIA Omniverse has turned them from fiction into fact.

NVIDIA’s own artists are writing new chapters in Omniverse, an accelerated 3D design platform that connects and enhances 3D apps and creative workflows, to showcase these stories.

Combined with the NVIDIA Studio platform, Omniverse and Studio-validated hardware enable creatives to push the limits of their imagination and design rich, captivating virtual worlds like never before.

One of the latest projects in Omniverse, The Storyteller, showcases a stunning retro-style writer’s room filled with leather-bound books, metallic typewriters and wooden furniture. Artists from NVIDIA used Omniverse, Autodesk 3ds Max and Substance 3D Painter to capture the essence of the room, creating detailed 3D models with realistic lighting.

Just for Reference — It Begins With Images

To kick off the project, lead environment artist Andrew Averkin looked at various interior images to use as references for the scene. From retro furniture and toys to vintage record players and sturdy bookshelves, these images were used as guidance and inspiration throughout the creative process.

The team of artists also collected various 3D models to create the assets that would populate and bring mood and atmosphere to the scene.

For one key element, the writer’s table, the team added extra details, such as texturing done in Substance 3D Painter, to create more layers of realism.

3D Assets, Assemble!

Once the 3D assets were completed, Averkin used Autodesk 3ds Max to assemble the scene connected to Omniverse Create, a scene composition app that can handle complex 3D scenes and objects.

With Autodesk 3ds Max connected to Create, Averkin had a much more iterative workflow — he was able to place 3D models in the scene, make changes to them on the spot, and continue assembling the scene until he achieved the look and feel he wanted.

“The best part was that I used all the tools in Autodesk 3ds Max to quickly assemble the scene. And with Omniverse Create, I used path-traced render mode to get high-quality, photorealistic renders of the scene in real time,” said Averkin. “I also used Assembly Tool, which is a set of tools that allowed me to work with the 3D models in a more efficient way — from scattering objects to painting surfaces.”

Averkin used the Autodesk 3ds Max Omniverse Connector — a plug-in that enables users to quickly and easily convert 3D content to Universal Scene Description, or USD — to export the scene from Autodesk 3ds Max to Omniverse Create. This made it easier to sync his work from one app to another, and continue working on the project inside Omniverse.

A Story Rendered Complete

To put the final touches on the Storyteller project, the artists worked with the simple-to-use tools in Omniverse Create to add realistic, ambient lighting and shadows.

“I wanted the lighting to look like the kind you see after the rain, or on a cloudy day,” said Averkin. “I also used rectangular light behind the window, so it could brighten the indoor part of the room and provide some nice shadows.”

To stage the composition, the team placed 30 or so cameras around the room to capture its different angles and perspectives, so viewers could be immersed in the scene.

For the final render of The Storyteller, the artists used Omniverse RTX Renderer in path-traced mode to get the most realistic result.

Some shots were rendered on an NVIDIA Studio system powered by two NVIDIA RTX A6000 GPUs. The team also used Omniverse Farm — a system layer that lets users create their own render farm — to accelerate the rendering process and achieve the final design significantly faster.

Watch the final cut of The Storyteller, and learn more about Omniverse at GTC, taking place on March 21-24.

GTC, which is free to attend, will feature sessions that dive into virtual worlds and digital twins, including Averkin’s talk, “An Artist’s Omniverse: How to Build Virtual Worlds.”

Creators can download NVIDIA Omniverse for free and get started with step-by-step tutorials on our Omniverse YouTube channel. For additional resources and inspiration, follow Omniverse on Instagram, Twitter and Medium. To chat with the community, check out the Omniverse forums and join our Discord Server.



This Googler wants to ‘add every voice’ to AI

Early in his career, Laurence Moroney was working on an equation — not something related to his job in tech, but to his bank account. “At one point, I calculated I was about three weeks away from being homeless,” Laurence says. “My motivation was to put a meal on the table and keep a roof over my head.”

Today Laurence is a developer advocate at Google focusing on artificial intelligence (AI) and machine learning (ML). “It’s my goal to inform and inspire the world about what we can do with AI and ML, and help developers realize these possibilities.” Laurence applied at Google in 2013 after hearing then-CEO Larry Page talk about Google’s vision to make the world a better place. “I was hired on my third attempt — so yes, I failed twice!”

Now he focuses on inviting and introducing more people to roles in the AI and ML fields through coursework, workshops and bootcamps that help developers gain job skills through professional certificates. “I try to meet developers where they are, whether that’s on YouTube, social media or in-person events,” he says. He’s particularly motivated to reach out to groups who have been historically underrepresented in tech. “Often they look and see everyone is one ethnicity and one gender and they think they don’t belong, but that’s not the case: Everyone, all ages, disabilities, whatever your background is, you should be here,” he says. “It’s so important for AI and ML work to include the entire scope of people which is why I’m so motivated to try and make everyone feel like they belong in this work.”

But it wasn’t an easy or straightforward path: his early years were tumultuous. Originally from Cyprus, Laurence and his family were forced to leave their home when a civil war resulted in an invasion. Exposure to chemicals used in the war zone permanently stained Laurence’s teeth, and he was also left with shaky hands. After moving to four different countries before the age of 8 (and learning four different languages), they settled in Ireland. “When you’re young, you don’t notice how difficult these things are, you just think…this is your life and this is normal,” he says.

He didn’t have the luxury to find his “passion” at work. “I needed a job and I needed a career. And around that time, the internet was starting to open up all of these new possibilities and opportunities.” In 1992, while bouncing around between odd jobs after receiving his degree in physics, Laurence heard about a government AI training program in the U.K. — one that worked as a sort of fellowship helping participants earn their master’s degree while also working on ways that AI systems could benefit the country.

“Hundreds of people descended on the testing center, where they looked at things like IQ, reasoning skills and so on,” Laurence says. Laurence also went — and ended up with the highest score. “They signed me up without realizing my background or ethnicity, and I was glad for that because I had experienced a lot of discrimination for being Irish,” he notes. “By that time I had gotten in the habit of disguising my accent. I tried not to talk much when I spoke to the government official who was running the program.” Despite his nerves, Laurence was asked to be the first person to sign on…though, the program itself was short-lived.

Every voice we add enriches what we’re doing — and every voice we lose diminishes it.


What Is Edge AI and How Does It Work?

Recent strides in the efficacy of AI, the adoption of IoT devices and the power of edge computing have come together to unlock the power of edge AI.

This has opened new opportunities for edge AI that were previously unimaginable — from helping radiologists identify pathologies in the hospital, to driving cars down the freeway, to helping us pollinate plants.

Countless analysts and businesses are talking about and implementing edge computing, which traces its origins to the 1990s, when content delivery networks were created to serve web and video content from edge servers deployed close to users.

Today, almost every business has job functions that can benefit from the adoption of edge AI. In fact, edge applications are driving the next wave of AI in ways that improve our lives at home, at work, in school and in transit.

Learn more about what edge AI is, its benefits and how it works, examples of edge AI use cases, and the relationship between edge computing and cloud computing.

What Is Edge AI? 

Edge AI is the deployment of AI applications in devices throughout the physical world. It’s called “edge AI” because the AI computation is done near the user at the edge of the network, close to where the data is located, rather than centrally in a cloud computing facility or private data center.

Since the internet has global reach, the edge of the network can connote any location. It can be a retail store, factory, hospital or devices all around us, like traffic lights, autonomous machines and phones.

Edge AI: Why Now? 

Organizations from every industry are looking to increase automation to improve processes, efficiency and safety.

To help them, computer programs need to recognize patterns and execute tasks repeatedly and safely. But the world is unstructured and the range of tasks that humans perform covers infinite circumstances that are impossible to fully describe in programs and rules.

Advances in edge AI have opened opportunities for machines and devices, wherever they may be, to operate with the “intelligence” of human cognition. AI-enabled smart applications learn to perform similar tasks under different circumstances, much like real life.

The efficacy of deploying AI models at the edge arises from three recent innovations.

  1. Maturation of neural networks: Neural networks and related AI infrastructure have finally developed to the point of allowing for generalized machine learning. Organizations are learning how to successfully train AI models and deploy them in production at the edge.
  2. Advances in compute infrastructure: Powerful distributed computational power is required to run AI at the edge. Recent advances in highly parallel GPUs have been adapted to execute neural networks.
  3. Adoption of IoT devices: The widespread adoption of the Internet of Things has fueled the explosion of big data. With the sudden ability to collect data in every aspect of a business — from industrial sensors, smart cameras, robots and more — we now have the data and devices necessary to deploy AI models at the edge. Moreover, 5G is providing IoT a boost with faster, more stable and secure connectivity.

Why Deploy AI at the Edge? What Are the Benefits of Edge AI? 

Since AI algorithms are capable of understanding language, sights, sounds, smells, temperature, faces and other analog forms of unstructured information, they’re particularly useful in places occupied by end users with real-world problems. These AI applications would be impractical or even impossible to deploy in a centralized cloud or enterprise data center due to issues related to latency, bandwidth and privacy.

The benefits of edge AI include:

  • Intelligence: AI applications are more powerful and flexible than conventional applications that can respond only to inputs that the programmer had anticipated. In contrast, an AI neural network is not trained how to answer a specific question, but rather how to answer a particular type of question, even if the question itself is new. Without AI, applications couldn’t possibly process infinitely diverse inputs like texts, spoken words or video.
  • Real-time insights: Since edge technology analyzes data locally rather than in a faraway cloud delayed by long-distance communications, it responds to users’ needs in real time.
  • Reduced cost: By bringing processing power closer to the edge, applications need less internet bandwidth, greatly reducing networking costs.
  • Increased privacy: AI can analyze real-world information without ever exposing it to a human being, greatly increasing privacy for anyone whose appearance, voice, medical image or any other personal information needs to be analyzed. Edge AI further enhances privacy by containing that data locally, uploading only the analysis and insights to the cloud. Even if some of the data is uploaded for training purposes, it can be anonymized to protect user identities. By preserving privacy, edge AI simplifies the challenges associated with data regulatory compliance.
  • High availability: Decentralization and offline capabilities make edge AI more robust since internet access is not required for processing data. This results in higher availability and reliability for mission-critical, production-grade AI applications.
  • Persistent improvement: AI models grow increasingly accurate as they train on more data. When an edge AI application confronts data that it cannot accurately or confidently process, it typically uploads it so that the AI can retrain and learn from it. So the longer a model is in production at the edge, the more accurate the model will be.

How Does Edge AI Technology Work?

Lifecycle of an edge AI application.

For machines to see, perform object detection, drive cars, understand speech, speak, walk or otherwise emulate human skills, they need to functionally replicate human intelligence.

AI employs a data structure called a deep neural network to replicate human cognition. These DNNs are trained to answer specific types of questions by being shown many examples of that type of question along with correct answers.

This training process, known as “deep learning,” often runs in a data center or the cloud due to the vast amount of data required to train an accurate model, and the need for data scientists to collaborate on configuring the model. After training, the model graduates to become an “inference engine” that can answer real-world questions.

In edge AI deployments, the inference engine runs on some kind of computer or device in far-flung locations such as factories, hospitals, cars, satellites and homes. When the AI stumbles on a problem, the troublesome data is commonly uploaded to the cloud for further training of the original AI model, which at some point replaces the inference engine at the edge. This feedback loop plays a significant role in boosting model performance; once edge AI models are deployed, they only get smarter and smarter.
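To make that feedback loop concrete, here is a minimal Python sketch of an edge inference loop that runs a local model and uploads low-confidence samples for later retraining. The model file, sensor read, endpoint URL and confidence threshold are all hypothetical placeholders, not part of any particular product.

```python
# Minimal sketch of the edge AI feedback loop: run inference locally and upload
# low-confidence samples so the model can be retrained in the cloud. The model
# file, sensor read, endpoint URL, and threshold are all hypothetical.
import time
import numpy as np
import onnxruntime as ort   # assumes the edge model has been exported to ONNX
import requests

MODEL_PATH = "model.onnx"                          # hypothetical local model file
UPLOAD_URL = "https://example.com/retrain-queue"   # hypothetical cloud endpoint
CONFIDENCE_THRESHOLD = 0.80

session = ort.InferenceSession(MODEL_PATH)
input_name = session.get_inputs()[0].name

def read_sensor() -> np.ndarray:
    """Placeholder for a camera or sensor read; returns one input sample."""
    return np.random.rand(1, 3, 224, 224).astype(np.float32)

while True:
    sample = read_sensor()
    scores = session.run(None, {input_name: sample})[0]   # assume class probabilities
    confidence = float(scores.max())
    prediction = int(scores.argmax())

    if confidence < CONFIDENCE_THRESHOLD:
        # Troublesome data goes back to the cloud for the next round of training.
        requests.post(UPLOAD_URL, json={
            "prediction": prediction,
            "confidence": confidence,
            "sample": sample.tolist(),
        })

    time.sleep(1.0)  # act on the prediction locally, then read the next sample
```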

What Are Examples of Edge AI Use Cases? 

AI is the most powerful technology force of our time, and it is now revolutionizing the world’s largest industries.

Across manufacturing, healthcare, financial services, transportation, energy and more, edge AI is driving new business outcomes in every sector, including:

  • Intelligent forecasting in energy: For critical industries such as energy, in which discontinuous supply can threaten the health and welfare of the general population, intelligent forecasting is key. Edge AI models help to combine historical data, weather patterns, grid health and other information to create complex simulations that inform more efficient generation, distribution and management of energy resources to customers.
  • Predictive maintenance in manufacturing: Sensor data can be used to detect anomalies early and predict when a machine will fail. Sensors on equipment scan for flaws and alert management if a machine needs a repair so the issue can be addressed early, avoiding costly downtime.
  • AI-powered instruments in healthcare: Modern medical instruments at the edge are becoming AI-enabled with devices that use ultra-low-latency streaming of surgical video to allow for minimally invasive surgeries and insights on demand.
  • Smart virtual assistants in retail: Retailers are looking to improve the digital customer experience by introducing voice ordering to replace text-based searches with voice commands. With voice ordering, shoppers can easily search for items, ask for product information and place online orders using smart speakers or other intelligent mobile devices.

What Role Does Cloud Computing Play in Edge Computing? 

AI applications can run in a data center like those in public clouds, or out in the field at the network’s edge, near the user. Cloud computing and edge computing each offer benefits that can be combined when deploying edge AI.

The cloud offers benefits related to infrastructure cost, scalability, high utilization, resilience from server failure, and collaboration. Edge computing offers faster response times, lower bandwidth costs and resilience from network failure.

There are several ways in which cloud computing can support an edge AI deployment:

  • The cloud can run the model during its training period.
  • The cloud continues to run the model as it is retrained with data that comes from the edge.
  • The cloud can run AI inference engines that supplement the models in the field when high compute power is more important than response time. For example, a voice assistant might respond to its name, but send complex requests back to the cloud for parsing (see the sketch after this list).
  • The cloud serves up the latest versions of the AI model and application.
  • The same edge AI often runs across a fleet of devices in the field, with supporting software in the cloud.
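As a rough illustration of the voice-assistant pattern above, the following sketch handles simple commands on the device and defers complex requests to a cloud inference engine. The service URL, command set and responses are hypothetical.

```python
# Hypothetical sketch of the voice-assistant pattern: handle simple commands on
# the device, defer complex requests to a cloud inference engine.
import requests

CLOUD_NLU_URL = "https://example.com/parse"   # hypothetical cloud parsing service
SIMPLE_COMMANDS = {"stop", "pause", "resume", "volume up", "volume down"}

def handle_utterance(text: str) -> str:
    """Answer locally when possible; otherwise ask the cloud for a full parse."""
    normalized = text.strip().lower()
    if normalized in SIMPLE_COMMANDS:
        return f"handled on device: {normalized}"
    # Complex request: compute power matters more than a few milliseconds of latency.
    response = requests.post(CLOUD_NLU_URL, json={"utterance": text}, timeout=5)
    return response.json().get("reply", "sorry, I didn't catch that")

print(handle_utterance("pause"))
print(handle_utterance("what's the weather like this weekend?"))
```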

Learn more about the best practices for hybrid edge architectures.

The Future of Edge AI 

Thanks to the commercial maturation of neural networks, proliferation of IoT devices, advances in parallel computation and 5G, there is now robust infrastructure for generalized machine learning. This is allowing enterprises to capitalize on the colossal opportunity to bring AI into their places of business and act upon real-time insights, all while decreasing costs and increasing privacy.

We are only in the early innings of edge AI, and still the possible applications seem endless.

Learn how your organization can deploy edge AI by checking out the top considerations for deploying AI at the edge.



Performance You Can Feel: Putting GeForce NOW RTX 3080 Membership’s Ultra-Low Latency to the Test This GFN Thursday

GeForce NOW’s RTX 3080 membership is the next generation of cloud gaming. This GFN Thursday looks at one of the tier’s major benefits: ultra-low-latency streaming from the cloud.

This week also brings a new app update that lets members log in via Discord, a members-only World of Warships reward and eight titles joining the GeForce NOW library.

Full Speed Ahead

The GeForce NOW RTX 3080 membership tier is kind of like magic. When members play on underpowered PCs, Macs, Chromebooks, SHIELD TVs, Android devices, iPhones and iPads, they’re streaming the full PC games that they own with all of the benefits of GeForce RTX 3080 GPUs — like ultra-low latency.

A few milliseconds of latency — the time it takes from a keystroke or mouse click to seeing the result on screen — can be the difference between a round-winning moment and a disappointing delay in the game.

The GeForce NOW RTX 3080 membership, powered by GeForce NOW SuperPODs, reduces latency through faster game rendering, more efficient encoding and higher streaming frame rates. Each step helps deliver cloud gaming that rivals many local gaming experiences.

With game rendering on the new SuperPODs, all GeForce NOW RTX 3080 members will feel a discernible reduction in latency. However, GeForce NOW RTX 3080 members playing on the PC, Mac and Android apps will observe the greatest benefits by streaming at up to 120 frames per second.

The result? Digital Foundry proclaimed it’s “the best streaming system we’ve played.”

That means whether you’re hitting your shots in Rainbow Six Siege, using a game-changing ability in Apex Legends or even sweating out a fast-paced shooter like Counter-Strike: Global Offensive, it’s all streaming in real time on GeForce NOW.

But don’t just take our word for it. We asked a few cloud gaming experts to put the GeForce NOW RTX 3080 membership to the test. Cloud Gaming Xtreme looked at how close RTX 3080 click-to-pixel latency is to their local PC, while GameTechPlanet called it “the one to beat when it comes to cloud gaming, for picture quality, graphics, and input latency.”

And Virtual Cloud noted the latency results “really shocked me just how good it felt,” and that “when swapping from my local PC to play on GeForce NOW, I really couldn’t tell a difference at all.”

Live in the Future

On top of this, the membership comes with the longest gaming session length for GeForce NOW — clocking in at eight glorious hours. It also enables full control to customize in-game graphics settings and RTX ON, rendering environments in cinematic quality for supported games.

Cyberpunk 2077 on GeForce NOW
Explore Night City with RTX ON, all streaming from the cloud with RTX 3080-class performance.

That means RTX 3080 members can experience titles like Cyberpunk 2077 with maximized, uninterrupted playtime at beautiful, immersive cinematic quality. Gamers can also enjoy the title’s new update – Patch 1.5 – released earlier this week, adding ray-traced shadows from local lights and introducing a variety of improvements and quality-of-life changes across the board, along with fresh pieces of free additional content.

With improved Fixer and Gig gameplay, enhanced UI and crowd reaction AI, map and open world tweaks, new narrative interactions, weapons, gear, customization options and more, now is the best time to stream Cyberpunk 2077 with RTX ON.

Ready to play? Start gaming today with a six-month GeForce NOW RTX 3080 membership for $99.99, with 10 percent off for Founders. Check out our membership FAQ for more information.

Dive Into the Cloud with Discord in the 2.0.38 Update

The new GeForce NOW update improves login options for gamers by supporting Discord as a convenient new account creation and login option for their NVIDIA accounts. Members can now use their Discord logins to access their GeForce NOW accounts. That’s one less password to remember.

The 2.0.38 update on GeForce NOW PC and Mac apps also supports Discord’s Rich Presence feature, which lets members easily display the game they’re currently playing in their Discord user status. The feature can be enabled or disabled through the GeForce NOW settings menu.

‘World of Warships’ Rewards

GeForce NOW delivers top-level performance and gaming goodies to members playing on the cloud.

World of Warships Rewards on GeForce NOW
Say ‘ahoy’ to these two awesome ships arriving today.

In celebration of the second anniversary of GeForce NOW, GFN Thursdays in February are full of rewards.

Today, members can get rewards for the naval warfare game World of Warships. Add two new ships to your fleet this week with the Charleston or the Dreadnought, redeemable for players via the Epic Games Store based on playtime in the game.

Getting membership rewards for streaming games on the cloud is easy. Log in to your NVIDIA account and select “GEFORCE NOW” from the header, scroll down to “REWARDS” and click the “UPDATE REWARDS SETTINGS” button. Check the box in the dialogue window that shows up to start receiving special offers and in-game goodies.

Sign up for the GeForce NOW newsletter, including notifications for when rewards are available, by logging into your NVIDIA account and selecting “PREFERENCES” from the header. Check the “Gaming & Entertainment” box, and “GeForce NOW” under topic preferences, to receive the latest updates.

Ready, Set … Game!

SpellMaster the Saga on GeForce NOW
Master your magic skills to become a sorcerer and save an uncharted world from impending disaster in SpellMaster: The Saga.

GFN Thursday always brings in a new batch of games for members to play. Catch the following eight new titles ready to stream this week:

  • SpellMaster: The Saga (New release on Steam)
  • Ashes of the Singularity: Escalation (Steam)
  • Citadel: Forged With Fire (Steam)
  • Galactic Civilizations III (Steam)
  • Haven (Steam)
  • People Playground (Steam)
  • Train Valley 2 (Steam)
  • Valley (Steam)

We make every effort to launch games on GeForce NOW as close to their release as possible, but, in some instances, games may not be available immediately.

Great gaming is calling. Let us know who you’re playing with on Twitter.



How SIGNAL IDUNA operationalizes machine learning projects on AWS

This post is co-authored with Jan Paul Assendorp, Thomas Lietzow, Christopher Masch, Alexander Meinert, Dr. Lars Palzer, Jan Schillemans of SIGNAL IDUNA.

At SIGNAL IDUNA, a large German insurer, we are currently reinventing ourselves with our transformation program VISION2023 to become even more customer oriented. Two aspects are central to this transformation: the reorganization of large parts of the workforce into cross-functional and agile teams, and becoming a truly data-driven company. Here, the motto “You build it, you run it” is an important requirement for a cross-functional team that builds a data or machine learning (ML) product. This places tight constraints on how much work a team can spend to productionize and run a product.

This post shows how SIGNAL IDUNA tackles this challenge and utilizes the AWS Cloud to enable cross-functional teams to build and operationalize their own ML products. To this end, we first introduce the organizational structure of agile teams, which sets the central requirements for the cloud infrastructure used to develop and run a product. Next, we show how three central teams at SIGNAL IDUNA enable cross-functional teams to build data products in the AWS Cloud with minimal assistance, by providing a suitable workflow and infrastructure solutions that can easily be used and adapted. Finally, we review our approach and compare it with a more classical approach where development and operation are separated more strictly.

Agile@SI – the Foundation of Organizational Change

Since the start of 2021, SIGNAL IDUNA has begun placing its strategy Agile@SI into action and establishing agile methods for developing customer-oriented solutions across the entire company [1]. Previous tasks and goals are now undertaken by cross-functional teams, called squads. These squads employ agile methods (such as the Scrum framework), make their own decisions, and build customer-oriented products. Typically, the squads are located in business divisions, such as marketing, and many have a strong emphasis on building data-driven and ML powered products. As an example, typical use cases in insurance are customer churn prediction and product recommendation.

Due to the complexity of ML, creating an ML solution by a single squad is challenging, and thus requires the collaboration of different squads.

SIGNAL IDUNA has three essential teams that support creating ML solutions. Surrounded by these three squads is the team that is responsible for the development and the long-term operation of the ML solution. This approach follows the AWS shared responsibility model [3].

In the image above, all of the squads are represented in an overview.

Cloud Enablement

The underlying cloud infrastructure for the entire organization is provided by the squad Cloud Enablement. It is their task to enable the teams to build products upon cloud technologies on their own. This improves time to market when building new products such as ML solutions, and it follows the principle of “You build it, you run it”.

Data Office/Data Lake

Moving data into the cloud, as well as finding the right dataset, is supported by the squad Data Office/Data Lake. They set up a data catalogue that can be used to search and select required datasets. Their aim is to establish data transparency and governance. Additionally, they are responsible for establishing and operating a Data Lake that helps teams to access and process relevant data.

Data Analytics Platform

Our squad Data Analytics Platform (DAP) is a cloud and ML focused team at SIGNAL IDUNA that is proficient in ML engineering, data engineering, as well as data science. We enable internal teams using public cloud for ML by providing infrastructure components and knowledge. Our products and services are presented in detail in the following section.

Enabling Cross-Functional Teams to Build ML Solutions

To enable cross-functional teams at SIGNAL IDUNA to build ML solutions, we need a fast and versatile way to provision reusable cloud infrastructure as well as an efficient workflow for onboarding teams to utilize the cloud capabilities.

To this end, we created a standardized onboarding and support process, and provided modular infrastructure templates as Infrastructure as Code (IaC). These templates contain infrastructure components designed for common ML use cases that can be easily tailored to the requirements of a specific use case.

The Workflow of Building ML Solutions

There are three main technical roles involved in building and operating ML solutions: the data scientist, the ML engineer, and the data engineer. Each role is part of the cross-functional squad and has different responsibilities. The data scientist has the required domain knowledge of the functional as well as technical requirements of the use case. The ML engineer specializes in building automated ML solutions and model deployment. And the data engineer makes sure that data flows from on-premises systems into and within the cloud.

The process of providing the platform is as follows:

The infrastructure of the specific use case is defined in IaC and versioned in a central project repository. This also includes pipelines for model training and deployment, as well as other data science related code artifacts. Data scientists, ML engineers, and data engineers have access to the project repository and can configure and update all of the infrastructure code autonomously. This enables the team to rapidly alter the infrastructure if needed. However, the ML engineer can always support in developing and updating infrastructure or ML models.

Reusable and Modular Infrastructure Components

The hierarchical and modular IaC resources are implemented in Terraform and include infrastructure for common data science and ETL use cases. This lets us reuse infrastructure code and enforce required security and compliance policies, such as using AWS Key Management Service (KMS) encryption for data, as well as encapsulating infrastructure in Amazon Virtual Private Cloud (VPC) environments without direct internet access.

The hierarchical IaC structure is as follows:

  • Modules encapsulate basic AWS services with the required configuration for security and access management. This includes best practice configurations such as the prevention of public access to Amazon Simple Storage Service (S3) buckets, or enforcing encryption for all files stored.
  • In some cases, you need a variety of services to automate processes, such as to deploy ML models in different stages. Therefore, we defined Solutions as a bundle of different modules in a joint configuration for different types of tasks.
  • In addition, we offer complete Blueprints that combine solutions in different environments to meet the many potential needs of a project. In our MLOps blueprint, we define a deployable infrastructure for training, provisioning, and monitoring ML models that are integrated and distributed in AWS accounts. We discuss further details in the next section.

These products are versioned in a central repository by the DAP squad. This lets us continuously improve our IaC and consider new features from AWS, such as Amazon SageMaker Model Registry. Each squad can reference these resources, parameterize them as needed, and finally deploy them in their own AWS accounts.
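The modules themselves are written in Terraform and aren't reproduced here. Purely to illustrate the kind of guardrails a storage module encodes (default KMS encryption and blocked public access), the following boto3 sketch applies equivalent settings imperatively; the bucket and key names are hypothetical.

```python
# Illustrative boto3 sketch (the actual modules are Terraform): apply the same
# guardrails a storage module enforces -- default KMS encryption and blocked
# public access. Bucket and key names are hypothetical.
import boto3

s3 = boto3.client("s3")
BUCKET = "example-ml-project-bucket"       # hypothetical bucket
KMS_KEY_ID = "alias/example-project-key"   # hypothetical KMS key

s3.put_bucket_encryption(
    Bucket=BUCKET,
    ServerSideEncryptionConfiguration={
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": KMS_KEY_ID,
            }
        }]
    },
)

s3.put_public_access_block(
    Bucket=BUCKET,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```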

MLOps Architecture

We provide a ready-to-use blueprint with specific solutions to cover the entire MLOps process. The blueprint contains infrastructure distributed over four AWS accounts for building and deploying ML models. This lets us isolate resources and workflows for the different steps in the MLOps process. The following figure shows the multi-account architecture, and we describe how the responsibility over specific steps of the process is divided between the different technical roles.

The modeling account includes services for the development of ML models. First, the data engineer employs an ETL process to provide relevant data from the SIGNAL IDUNA data lake, the centralized gateway for data-driven workflows in the AWS Cloud. Subsequently, the dataset can be utilized by the data scientist to train and evaluate model candidates. Once ready for extensive experiments, a model candidate is integrated into an automated training pipeline by the ML engineer. We use Amazon SageMaker Pipelines to automate training, hyperparameter tuning, and model evaluation at scale. This also includes model lineage and a standardized approval mechanism for models to be staged for deployment into production. Automated unit tests and code analysis ensure quality and reliability of the code for each step of the pipeline, such as data preprocessing, model training, and evaluation. Once a model is evaluated and approved, we use Amazon SageMaker ModelPackages as an interface to the trained model and relevant meta data.
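SIGNAL IDUNA's actual pipeline definitions aren't shown here, but a minimal SageMaker Pipelines sketch with a training step and a model-registration step gated behind manual approval looks roughly like the following; the container image, S3 paths and model package group name are placeholders.

```python
# Minimal sketch of a SageMaker Pipeline: train a model, then register it in the
# Model Registry pending manual approval. Image URI, S3 paths, and the model
# package group name are placeholders, not SIGNAL IDUNA's actual configuration.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep
from sagemaker.workflow.step_collections import RegisterModel

session = sagemaker.Session()
role = sagemaker.get_execution_role()

estimator = Estimator(
    image_uri="<training-image-uri>",                 # placeholder training container
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<bucket>/model-artifacts",      # placeholder artifact location
    sagemaker_session=session,
)

train_step = TrainingStep(
    name="TrainModel",
    estimator=estimator,
    inputs={"train": TrainingInput("s3://<bucket>/prepared/train")},
)

register_step = RegisterModel(
    name="RegisterModel",
    estimator=estimator,
    model_data=train_step.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name="example-model-group",   # placeholder group name
    approval_status="PendingManualApproval",          # gate deployment behind approval
)

pipeline = Pipeline(
    name="example-mlops-pipeline",
    steps=[train_step, register_step],
    sagemaker_session=session,
)
pipeline.upsert(role_arn=role)
pipeline.start()
```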

The tooling account contains automated CI/CD pipelines with different stages for testing and deployment of trained models. In the test stage, models are deployed into the serving-nonprod account. Although model quality is evaluated in the training pipeline prior to the model being staged for production, here we run performance and integration tests in an isolated testing environment. After passing the testing stage, models are deployed into the serving-prod account to be integrated into production workflows.

Separating the stages of the MLOps workflow into different AWS accounts lets us isolate development and testing from production. Therefore, we can enforce a strict access and security policy. Furthermore, tailored IAM roles ensure that specific services can only access data and other services required for its scope, following the principle of least privilege. Services within the serving environments can additionally be made accessible to external business processes. For example, a business process can query an endpoint within the serving-prod environment for model predictions.
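As an illustration of that last point, a consuming business process could request a prediction from an endpoint in the serving-prod account via the SageMaker runtime API. The endpoint name and feature payload below are made up.

```python
# Hypothetical consumer-side sketch: a business process queries a model endpoint
# in the serving-prod account for a prediction. Endpoint name and features are made up.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

payload = {"contract_age_months": 27, "num_claims": 1, "premium_eur": 54.90}

response = runtime.invoke_endpoint(
    EndpointName="churn-model-prod",      # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
)

print(json.loads(response["Body"].read()))
```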

Benefits of our Approach

This process has many advantages as compared to a strict separation of development and operation for both the ML models, as well as the required infrastructure:

  • Isolation: Every team receives their own set of AWS accounts that are completely isolated from other teams’ environments. This makes it easy to manage access rights and keep the data private to those who are entitled to work with it.
  • Cloud enablement: Team members with little prior experience in cloud DevOps (such as many data scientists) can easily watch the whole process of designing and managing infrastructure since (almost) nothing is hidden from them behind a central service. This creates a better understanding of the infrastructure, which can in turn help them create data science products more efficiently.
  • Product ownership: The use of preconfigured infrastructure solutions and managed services keeps the barrier to managing an ML product in production very low. Therefore, a data scientist can easily take ownership of a model that is put into production. This minimizes the well-known risk of failing to put a model into production after development.
  • Innovation: Since ML engineers are involved long before a model is ready to put into production, they can create infrastructure solutions suitable for new use cases while the data scientists develop an ML model.
  • Adaptability: Since the IaC solutions developed by DAP are freely available, any team can easily adapt them to match the specific needs of their use case.
  • Open source: All new infrastructure solutions can easily be made available via the central DAP code repo to be used by other teams. Over time, this will create a rich code base with infrastructure components tailored to different use cases.

Summary

In this post, we illustrated how cross-functional teams at SIGNAL IDUNA are being enabled to build and run ML products on AWS. Central to our approach is the usage of a dedicated set of AWS accounts for each team in combination with bespoke IaC blueprints and solutions. These two components enable a cross-functional team to create and operate production quality infrastructure. In turn, they can take full end-to-end ownership of their ML products.

Refer to Amazon SageMaker Model Building Pipelines to learn more.

Find more information on ML on AWS on our official page.

References

[1] https://www.handelsblatt.com/finanzen/versicherungsbranche-vorbild-spotify-signal-iduna-wird-von-einer-handwerker-versicherung-zum-agilen-konzern/27381902.html

[2] https://blog.crisp.se/wp-content/uploads/2012/11/SpotifyScaling.pdf

[3] https://aws.amazon.com/compliance/shared-responsibility-model/


About the Authors

Jan Paul Assendorp is an ML engineer with a strong data science focus. He builds ML models and automates model training and the deployment into production environments.

Thomas Lietzow is the Scrum Master of the squad Data Analytics Platform.

Christopher Masch is the Product Owner of the squad Data Analytics Platform with knowledge in data engineering, data science, and ML engineering.

Alexander Meinert is part of the Data Analytics Platform team and works as an ML engineer. Started with statistics, grew on data science projects, found passion for ML methods and architecture.

Dr. Lars Palzer is a data scientist and part of the Data Analytics Platform team. After helping to build the MLOps architecture components, he is now using them to build ML products.

Jan Schillemans is an ML engineer with a software engineering background. He focuses on applying software engineering best practices to ML environments (MLOps).


Bongo Learn provides real-time feedback to improve learning outcomes with Amazon Transcribe

Real-time feedback helps drive learning. This is especially important for designing presentations, learning new languages, and strengthening other essential skills that are critical to succeed in today’s workplace. However, many students and lifelong learners lack access to effective face-to-face instruction to hone these skills. In addition, with the rapid adoption of remote learning, educators are seeking more effective ways to engage their students and provide feedback and guidance in online learning environments. Bongo is filling that gap using video-based engagement and personalized feedback.

Bongo is a video assessment solution that enables experiential learning and soft skill development at scale. Their Auto Analysis™ is an automated reporting feature that provides deeper insight into an individual’s performance and progress. Organizations around the world—both corporate and higher education institutions—use Bongo’s Auto Analysis™ to facilitate automated feedback for a variety of use cases, including individual presentations, objection handling, and customer interaction training. The Auto Analysis™ platform, which runs on AWS and uses Amazon Transcribe, allows learners to demonstrate what they can do on video and helps evaluators get an authentic representation of a learner’s competency across a range of skills.

When users complete a video assignment, Bongo uses Amazon Transcribe, a deep learning-powered automatic speech recognition (ASR) service, to convert speech into text. Bongo analyzes the transcripts to identify the use of keywords and filler words, and assess clarity and effectiveness of the individual’s delivery. Bongo then auto-generates personalized feedback reports based on these performance insights, which learners can utilize as they practice iteratively. Learners can then submit their recording for feedback from evaluators and peers. Learners have reported a strong preference for receiving private and detailed feedback prior to submitting their work for evaluation or peer review.
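Bongo's integration code isn't public, but starting an asynchronous transcription job with Amazon Transcribe and retrieving the transcript location looks roughly like this boto3 sketch; the job name, S3 URI and media format are placeholders.

```python
# Rough boto3 sketch (not Bongo's actual code): transcribe a recorded video and
# fetch the transcript location for downstream analysis. Job name, S3 URI, and
# media format are placeholders.
import time
import boto3

transcribe = boto3.client("transcribe")
JOB_NAME = "learner-presentation-001"

transcribe.start_transcription_job(
    TranscriptionJobName=JOB_NAME,
    Media={"MediaFileUri": "s3://example-bucket/videos/presentation.mp4"},
    MediaFormat="mp4",
    LanguageCode="en-US",  # alternatively, IdentifyLanguage=True auto-detects the language
)

# Poll until the asynchronous job finishes, then read the transcript URI.
while True:
    job = transcribe.get_transcription_job(TranscriptionJobName=JOB_NAME)
    status = job["TranscriptionJob"]["TranscriptionJobStatus"]
    if status in ("COMPLETED", "FAILED"):
        break
    time.sleep(10)

if status == "COMPLETED":
    print(job["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])
```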

Why Bongo chose Amazon Transcribe

During the technical evaluation process, Bongo looked at several speech-to-text vendors and machine learning services. Bruce Fischer, CTO at Bongo, says, “When choosing a vendor, AWS’ breadth and depth of services enabled us to build a complete solution through a single vendor. That saved us valuable development and deployment time. In addition, Amazon Transcribe produces high-quality transcripts with timestamps that allow Bongo Auto Analysis™ to provide accurate feedback to learners and improve learning outcomes. We are excited with how the service has evolved and how its new capabilities enable us to innovate faster.”

Since launch, Bongo has added the custom vocabulary feature of Amazon Transcribe. For example, it can recognize business jargon that is common in sales presentations. Foreign language learning is another important use case for Bongo customers. The automatic language detection feature in Amazon Transcribe and overall language support (37 different languages for batch processing) allows Bongo to deliver Auto Analysis™ in several languages, such as French, Spanish, German, and Portuguese.

Recently, Bongo launched auto-captioning for their on-demand videos. Powered by Amazon Transcribe, captions help address the accessibility needs of Bongo users with learning disabilities and impairments.

Amazon Transcribe enables Bongo’s Auto Analysis™ to quickly and accurately transcribe learner videos and provide feedback on the video that helps a learner employ a ‘practice, reflect, improve’ loop. This enables learners to increase content comprehension, retention, and learning outcomes, and reduces instructor assessment time since they are viewing a better work product. Teachers can focus on providing insightful feedback without spending time on the metrics the Auto Analysis™ produces automatically.

– Josh Kamrath, Bongo’s CEO.

Recently, Dr. Lynda Randall and Dr. Jessica Jaynes from California State University, Fullerton, conducted a research study to analyze the effectiveness of Bongo in an actual classroom setting on student engagement and learning outcomes.[1] The study results showed how the use of Bongo helped increase student comprehension and retention of concepts.

Conclusion

The Bongo team is now looking at how to incorporate other AWS AI services, such as Amazon Comprehend to do further language processing and Amazon Rekognition for visual analysis of videos. Bongo and their AWS team will continue working together to create the best experience for learners and instructors alike. To learn more about Amazon Transcribe and test it yourself, visit the Amazon Transcribe console.

[1] Randall, L.E., & Jaynes, J. A comparison of three assessment types of student engagement and content knowledge in online instruction. Online Learning Journal. (Status: Accepted. Publication date TBD)


About Bongo

Bongo is an embedded solution that drives meaningful assessment, experiential learning, and skill development at scale through video-based engagement and personalized feedback. Organizations use our video workflows to create opportunities for practice, demonstration, analysis, and collaboration. When individuals show what they can do within a real-world learning environment, evaluators get an authentic representation of their competency.


About the Author

Roshni Madaiah is an Account Manager on the AWS EdTech team, where she helps Education Technology customers build cutting edge solutions to transform learning and enrich student experience. Prior to AWS, she worked with enterprises and commercial customers to drive business outcomes via technical solutions. Outside of work, she enjoys traveling, reading and cooking without recipes.


Prepare time series data with Amazon SageMaker Data Wrangler

Time series data is widely present in our lives. Stock prices, house prices, weather information, and sales data captured over time are just a few examples. As businesses increasingly look for new ways to gain meaningful insights from time-series data, the ability to visualize data and apply desired transformations are fundamental steps. However, time-series data possesses unique characteristics and nuances compared to other kinds of tabular data, and require special considerations. For example, standard tabular or cross-sectional data is collected at a specific point in time. In contrast, time series data is captured repeatedly over time, with each successive data point dependent on its past values.

Because most time series analyses rely on the information gathered across a contiguous set of observations, missing data and inherent sparseness can reduce the accuracy of forecasts and introduce bias. Additionally, most time series analysis approaches rely on equal spacing between data points, in other words, periodicity. Therefore, the ability to fix data spacing irregularities is a critical prerequisite. Finally, time series analysis often requires the creation of additional features that can help explain the inherent relationship between input data and future predictions. All these factors differentiate time series projects from traditional machine learning (ML) scenarios and demand a distinct approach to its analysis.

This post walks through how to use Amazon SageMaker Data Wrangler to apply time series transformations and prepare your dataset for time series use cases.

Use cases for Data Wrangler

Data Wrangler provides a no-code/low-code solution to time series analysis with features to clean, transform, and prepare data faster. It also enables data scientists to prepare time series data in adherence to their forecasting model’s input format requirements. The following are a few ways you can use these capabilities:

  • Descriptive analysis – Usually, step one of any data science project is understanding the data. When we plot time series data, we get a high-level overview of its patterns, such as trend, seasonality, cycles, and random variations. It helps us decide the correct forecasting methodology for accurately representing these patterns. Plotting can also help identify outliers, preventing unrealistic and inaccurate forecasts. Data Wrangler comes with a seasonality-trend decomposition visualization for representing components of a time series, and an outlier detection visualization to identify outliers.
  • Explanatory analysis – For multi-variate time series, the ability to explore, identify, and model the relationship between two or more time series is essential for obtaining meaningful forecasts. The Group by transform in Data Wrangler creates multiple time series by grouping data for specified cells. Additionally, Data Wrangler time series transforms, where applicable, allow specification of additional ID columns to group on, enabling complex time series analysis.
  • Data preparation and feature engineering – Time series data is rarely in the format expected by time series models. It often requires data preparation to convert raw data into time series-specific features. You may want to validate that time series data is regularly or equally spaced prior to analysis. For forecasting use cases, you may also want to incorporate additional time series characteristics, such as autocorrelation and statistical properties. With Data Wrangler, you can quickly create time series features such as lag columns for multiple lag periods, resample data to multiple time granularities, and automatically extract statistical properties of a time series, to name a few capabilities.

Solution overview

This post elaborates on how data scientists and analysts can use Data Wrangler to visualize and prepare time series data. We use the bitcoin cryptocurrency dataset from cryptodatadownload with bitcoin trading details to showcase these capabilities. We clean, validate, and transform the raw dataset with time series features and also generate bitcoin volume price forecasts using the transformed dataset as input.

The sample of bitcoin trading data is from January 1 – November 19, 2021, with 464,116 data points. The dataset attributes include a timestamp of the price record, the opening or first price at which the coin was exchanged for a particular day, the highest price at which the coin was exchanged on the day, the last price at which the coin was exchanged on the day, the volume exchanged in the cryptocurrency value on the day in BTC, and corresponding USD currency.

Prerequisites

Download the Bitstamp_BTCUSD_2021_minute.csv file from cryptodatadownload and upload it to Amazon Simple Storage Service (Amazon S3).

Import bitcoin dataset in Data Wrangler

To start the ingestion process to Data Wrangler, complete the following steps:

  1. On the SageMaker Studio console, on the File menu, choose New, then choose Data Wrangler Flow.
  2. Rename the flow as desired.
  3. For Import data, choose Amazon S3.
  4. Upload the Bitstamp_BTCUSD_2021_minute.csv file from your S3 bucket.

You can now preview your data set.

  1. In the Details pane, choose Advanced configuration and deselect Enable sampling.

This is a relatively small data set, so we don’t need sampling.

  1. Choose Import.

You have successfully created the flow diagram and are ready to add transformation steps.

Add transformations

To add data transformations, choose the plus sign next to Data types and choose Edit data types.

Ensure that Data Wrangler automatically inferred the correct data types for the data columns.

In our case, the inferred data types are correct. However, suppose one data type was incorrect. You can easily modify them through the UI, as shown in the following screenshot.

edit and review data types

Let’s kick off the analysis and start adding transformations.

Data cleaning

We first perform several data cleaning transformations.

Drop column

Let’s start by dropping the unix column, because we use the date column as the index.

  1. Choose Back to data flow.
  2. Choose the plus sign next to Data types and choose Add transform.
  3. Choose + Add step in the TRANSFORMS pane.
  4. Choose Manage columns.
  5. For Transform, choose Drop column.
  6. For Column to drop, choose unix.
  7. Choose Preview.
  8. Choose Add to save the step.

Handle missing

Missing data is a well-known problem in real-world datasets. Therefore, it’s a best practice to verify the presence of any missing or null values and handle them appropriately. Our dataset doesn’t contain missing values. But if there were, we would use the Handle missing time series transform to fix them. Commonly used strategies for handling missing data include dropping rows with missing values or filling the missing values with reasonable estimates. Because time series data relies on a sequence of data points across time, filling missing values is the preferred approach. The process of filling missing values is referred to as imputation. The Handle missing time series transform allows you to choose from multiple imputation strategies.

  1. Choose + Add step in the TRANSFORMS pane.
  2. Choose the Time Series transform.
  3. For Transform, choose Handle missing.
  4. For Time series input type, choose Along column.
  5. For Method for imputing values, choose Forward fill.

The Forward fill method replaces the missing values with the non-missing values preceding the missing values.

handle missing time series transform

Backward fill, Constant Value, Most common value and Interpolate are other imputation strategies available in Data Wrangler. Interpolation techniques rely on neighboring values for filling missing values. Time series data often exhibits correlation between neighboring values, making interpolation an effective filling strategy. For additional details on the functions you can use for applying interpolation, refer to pandas.DataFrame.interpolate.
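Outside of Data Wrangler, the same imputation strategies map to one-liners in pandas. The sketch below assumes the raw CSV has been downloaded locally; strategies other than forward fill are shown as comments.

```python
# Illustrative pandas equivalents of the Handle missing imputation strategies.
import pandas as pd

# Depending on the download, the CSV may have a banner row that needs skiprows=1.
df = pd.read_csv("Bitstamp_BTCUSD_2021_minute.csv", parse_dates=["date"])

df["Volume USD"] = df["Volume USD"].ffill()          # Forward fill: carry the last value forward
# df["Volume USD"] = df["Volume USD"].bfill()        # Backward fill
# df["Volume USD"] = df["Volume USD"].fillna(0)      # Constant value
# df["Volume USD"] = df["Volume USD"].interpolate()  # Interpolate from neighboring values
```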

Validate timestamp

In time series analysis, the timestamp column acts as the index column, around which the analysis revolves. Therefore, it’s essential to make sure the timestamp column doesn’t contain invalid or incorrectly formatted time stamp values. Because we’re using the date column as the timestamp column and index, let’s confirm its values are correctly formatted.

  1. Choose + Add step in the TRANSFORMS pane.
  2. Choose the Time Series transform.
  3. For Transform, choose Validate timestamps.

The Validate timestamps transform allows you to check that the timestamp column in your dataset doesn’t have values with an incorrect timestamp or missing values.

  1. For Timestamp Column, choose date.
  2. For Policy dropdown, choose Indicate.

The Indicate policy option creates a Boolean column indicating if the value in the timestamp column is a valid date/time format. Other options for Policy include:

  • Error – Throws an error if the timestamp column is missing or invalid
  • Drop – Drops the row if the timestamp column is missing or invalid
  1. Choose Preview.

A new Boolean column named date_is_valid was created, with true values indicating correct format and non-null entries. Our dataset doesn’t contain invalid timestamp values in the date column. But if it did, you could use the new Boolean column to identify and fix those values.

Validate Timestamp time series transform

  1. Choose Add to save this step.

Time series visualization

After we clean and validate the dataset, we can better visualize the data to understand its different components.

Resample

Because we’re interested in daily predictions, let’s transform the frequency of data to daily.

The Resample transformation changes the frequency of the time series observations to a specified granularity, and comes with both upsampling and downsampling options. Applying upsampling increases the frequency of the observations (for example from daily to hourly), whereas downsampling decreases the frequency of the observations (for example from hourly to daily).

Because our dataset is at minute granularity, let’s use the downsampling option.

  1. Choose + Add step.
  2. Choose the Time Series transform.
  3. For Transform, choose Resample.
  4. For Timestamp, choose date.
  5. For Frequency unit, choose Calendar day.
  6. For Frequency quantity, enter 1.
  7. For Method to aggregate numeric values, choose mean.
  8. Choose Preview.

The frequency of our dataset has changed from per minute to daily.

  1. Choose Add to save this step.
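For reference, downsampling the minute-level data to daily means in pandas (continuing the df loaded in the earlier sketch) looks like this:

```python
# Illustrative pandas equivalent of the Resample transform: minute data -> daily means.
daily = (
    df.set_index("date")
      .select_dtypes("number")   # keep numeric columns (prices, volumes)
      .resample("1D")            # calendar-day frequency
      .mean()
      .reset_index()
)
print(daily.head())
```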

Seasonal-Trend decomposition

After resampling, we can visualize the transformed series and its associated STL (Seasonal and Trend decomposition using LOESS) components using the Seasonal-Trend decomposition visualization. This breaks down the original time series into distinct trend, seasonality, and residual components, giving us a good understanding of how each pattern behaves. We can also use this information when modelling forecasting problems.

Data Wrangler uses LOESS, a robust and versatile statistical method for modelling trend and seasonal components. Its underlying implementation uses polynomial regression for estimating nonlinear relationships present in the time series components (seasonality, trend, and residual).
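Data Wrangler's visualization is built in, but an analogous decomposition can be reproduced in code with the STL implementation in statsmodels. This is a rough equivalent rather than Data Wrangler's internals, and the weekly period is an assumption.

```python
# Analogous seasonal-trend decomposition with statsmodels (not Data Wrangler's internals).
from statsmodels.tsa.seasonal import STL

series = daily.set_index("date")["Volume USD"]   # daily series from the resampling sketch
result = STL(series, period=7).fit()             # period=7 assumes weekly seasonality

trend, seasonal, residual = result.trend, result.seasonal, result.resid
print(trend.tail())
```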

  1. Choose Back to data flow.
  2. Choose the plus sign next to the Steps on Data Flow.
  3. Choose Add analysis.
  4. In the Create analysis pane, for Analysis type, choose Time Series.
  5. For Visualization, choose Seasonal-Trend decomposition.
  6. For Analysis Name, enter a name.
  7. For Timestamp column, choose date.
  8. For Value column, choose Volume USD.
  9. Choose Preview.

The analysis allows us to visualize the input time series and decomposed seasonality, trend, and residual.

  1. Choose Save to save the analysis.

With the seasonal-trend decomposition visualization, we can generate four patterns, as shown in the preceding screenshot:

  • Original – The original time series re-sampled to daily granularity.
  • Trend – The polynomial trend with an overall negative trend pattern for the year 2021, indicating a decrease in Volume USD value.
  • Season – The multiplicative seasonality represented by the varying oscillation patterns. We see a decrease in seasonal variation, characterized by decreasing amplitude of oscillations.
  • Residual – The remaining residual or random noise. The residual series is what remains after the trend and seasonal components have been removed. Looking closely, we observe spikes between January and March, and between April and June, suggesting an opportunity to model such events using historical data.

These visualizations give data scientists and analysts valuable insight into existing patterns and can help you choose a modelling strategy. However, it’s always good practice to validate the output of STL decomposition against the information gathered through descriptive analysis and domain expertise.

To summarize, we observe a downward trend consistent with the original series visualization, which increases our confidence in incorporating the information conveyed by the trend visualization into downstream decision-making. In contrast, although the seasonality visualization confirms the presence of seasonality and the need to remove it with techniques such as differencing, it doesn’t provide the desired level of detail about the various seasonal patterns present, so deeper analysis is required.
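
If you want to reproduce a comparable decomposition outside Data Wrangler, the sketch below uses the LOESS-based STL implementation in statsmodels; the file name, weekly period, and interpolation step are assumptions for illustration, and Data Wrangler’s exact parameters may differ.

```python
# A hedged sketch of a comparable decomposition with statsmodels' STL.
import pandas as pd
from statsmodels.tsa.seasonal import STL

daily = pd.read_csv("bitcoin_daily.csv", parse_dates=["date"])  # assumed file name
series = (
    daily.set_index("date")["Volume USD"]
         .asfreq("D")        # enforce a regular daily frequency
         .interpolate()      # STL requires a series with no missing values
)

result = STL(series, period=7, robust=True).fit()  # period=7 is an assumption
result.plot()  # panels for the observed, trend, seasonal, and residual components
```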

Feature engineering

After we understand the patterns present in our dataset, we can start to engineer new features aimed at increasing the accuracy of the forecasting models.

Featurize datetime

Let’s start the feature engineering process with more straightforward date/time features. Date/time features are created from the timestamp column and provide an optimal avenue for data scientists to start the feature engineering process. We begin with the Featurize datetime time series transformation to add the month, day of the month, day of the year, week of the year, and quarter features to our dataset. Because we’re providing the date/time components as separate features, we enable ML algorithms to detect signals and patterns for improving prediction accuracy.

  1. Choose + Add step.
  2. Choose the Time Series transform.
  3. For Transform, choose Featurize datetime.
  4. For Input Column, choose date.
  5. For Output Column, enter date (this step is optional).
  6. For Output mode, choose Ordinal.
  7. For Output format, choose Columns.
  8. For date/time features to extract, select Month, Day, Week of year, Day of year, and Quarter.
  9. Choose Preview.

The dataset now contains new columns named date_month, date_day, date_week_of_year, date_day_of_year, and date_quarter. The information retrieved from these new features can help data scientists derive additional insights into the relationship between the input features and the output.

Featurize datetime time series transform

  1. Choose Add to save this step.
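
For reference, the same date/time features can be derived with pandas along these lines; the file name is assumed, and note that pandas quarters run 1 to 4, whereas the date_quarter values in this walkthrough run 0 to 3.

```python
# An approximate pandas equivalent of the Featurize datetime step above
# (ordinal output mode, one column per selected feature).
import pandas as pd

daily = pd.read_csv("bitcoin_daily.csv", parse_dates=["date"])  # assumed file name
daily["date_month"] = daily["date"].dt.month
daily["date_day"] = daily["date"].dt.day
daily["date_week_of_year"] = daily["date"].dt.isocalendar().week.astype(int)
daily["date_day_of_year"] = daily["date"].dt.dayofyear
daily["date_quarter"] = daily["date"].dt.quarter - 1  # shift to 0-3 to match the walkthrough
```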

Encode categorical

Date/time features aren’t limited to integer values. You may also choose to treat certain extracted date/time features as categorical variables and represent them as one-hot encoded features, with each column containing binary values. The newly created date_quarter column contains values between 0 and 3 and can be one-hot encoded into four binary columns. Let’s create four new binary features, each representing the corresponding quarter of the year.

  1. Choose + Add step.
  2. Choose the Encode categorical transform.
  3. For Transform, choose One-hot encode.
  4. For Input column, choose date_quarter.
  5. For Output style, choose Columns.
  6. Choose Preview.
  7. Choose Add to add the step.
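
A rough pandas equivalent of this one-hot encoding step, assuming an intermediate CSV of the transformed data, might look like the following.

```python
# One-hot encode date_quarter into one binary column per quarter.
import pandas as pd

daily = pd.read_csv("bitcoin_features.csv")  # placeholder intermediate file
dummies = pd.get_dummies(daily["date_quarter"], prefix="date_quarter", dtype=int)
daily = pd.concat([daily, dummies], axis=1)
```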

Lag feature

Next, let’s create lag features for the target column Volume USD. Lag features in time series analysis are values at prior timestamps that are considered helpful in inferring future values. They also help identify autocorrelation (also known as serial correlation) patterns in the residual series by quantifying the relationship of the observation with observations at previous time steps. Autocorrelation is similar to regular correlation, but between the values in a series and its past values. It forms the basis for autoregressive forecasting models such as those in the ARIMA family.

With the Data Wrangler Lag feature transform, you can easily create lag features n periods apart. Additionally, we often want to create multiple lag features at different lags and let the model decide the most meaningful features. For such a scenario, the Lag features transform helps create multiple lag columns over a specified window size.

  1. Choose Back to data flow.
  2. Choose the plus sign next to the steps on the data flow.
  3. Choose + Add step.
  4. Choose Time Series transform.
  5. For Transform, choose Lag features.
  6. For Generate lag features for this column, choose Volume USD.
  7. For Timestamp Column, choose date.
  8. For Lag, enter 7.
  9. Because we’re interested in observing up to the previous seven lag values, let’s select Include the entire lag window.
  10. To create a new column for each lag value, select Flatten the output.
  11. Choose Preview.

Seven new columns are added for the target column Volume USD, each suffixed with its lag number.

Lag feature time series transform

  1. Choose Add to save the step.
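
A minimal pandas sketch of the same flattened lag window follows; the generated column names are illustrative, so Data Wrangler’s naming may differ.

```python
# Create lag features 1 through 7 for the target column.
import pandas as pd

daily = pd.read_csv("bitcoin_features.csv", parse_dates=["date"])  # assumed file
daily = daily.sort_values("date")
for lag in range(1, 8):  # the entire lag window: lags 1 through 7
    daily[f"Volume USD_lag_{lag}"] = daily["Volume USD"].shift(lag)
```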

Rolling window features

We can also calculate meaningful statistical summaries across a range of values and include them as input features. Let’s extract common statistical time series features.

Data Wrangler implements automatic time series feature extraction capabilities using the open source tsfresh package. With the time series feature extraction transforms, you can automate the feature extraction process. This eliminates the time and effort otherwise spent manually implementing signal processing libraries. For this post, we extract features using the Rolling window features transform. This method computes statistical properties across a set of observations defined by the window size.

  1. Choose + Add step.
  2. Choose the Time Series transform.
  3. For Transform, choose Rolling window features.
  4. For Generate rolling window features for this column, choose Volume USD.
  5. For Timestamp Column, choose date.
  6. For Window size, enter 7.

Specifying a window size of 7 computes features by combining the value at the current timestamp and values for the previous seven timestamps.

  1. Select Flatten to create a new column for each computed feature.
  2. Choose the Minimal subset strategy.

This strategy extracts eight features that are useful in downstream analyses. Other strategies include Efficient Subset, Custom subset, and All features. For the full list of features available for extraction, refer to Overview on extracted features.

  1. Choose Preview.

We can see eight new columns, with the specified window size of 7 in their names, appended to our dataset.

  1. Choose Add to save the step.
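
For intuition, the sketch below computes a few rolling statistics over a 7-observation window with pandas; because Data Wrangler relies on tsfresh’s minimal feature subset, these hand-picked statistics are illustrative rather than an exact match.

```python
# Simplified rolling window statistics over 7 observations (illustrative only;
# not the exact tsfresh minimal subset that Data Wrangler extracts).
import pandas as pd

daily = pd.read_csv("bitcoin_features.csv", parse_dates=["date"])  # assumed file
daily = daily.sort_values("date")
rolling = daily["Volume USD"].rolling(window=7, min_periods=1)
daily["Volume USD_rolling_mean_7"] = rolling.mean()
daily["Volume USD_rolling_std_7"] = rolling.std()
daily["Volume USD_rolling_min_7"] = rolling.min()
daily["Volume USD_rolling_max_7"] = rolling.max()
```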

Export the dataset

We have transformed the time series dataset and are ready to use the transformed dataset as input for a forecasting algorithm. The last step is to export the transformed dataset to Amazon S3. In Data Wrangler, you can choose Export step to automatically generate a Jupyter notebook with Amazon SageMaker Processing code for processing and exporting the transformed dataset to an S3 bucket. However, because our dataset contains just over 300 records, let’s take advantage of the Export data option in the Add Transform view to export the transformed dataset directly to Amazon S3 from Data Wrangler.

  1. Choose Export data.

  1. For S3 location, choose Browse and choose your S3 bucket.
  2. Choose Export data.
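
If you prefer to write the output yourself rather than export from the app, a hedged alternative is to save the transformed dataset to Amazon S3 with pandas; this assumes the s3fs package is installed and uses placeholder bucket and file names.

```python
# Write the transformed dataset to Amazon S3 directly with pandas (requires s3fs).
import pandas as pd

daily = pd.read_csv("bitcoin_transformed.csv")  # assumed local copy of the output
daily.to_csv("s3://your-bucket/data-wrangler/bitcoin_transformed.csv", index=False)
```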

Now that we have successfully transformed the bitcoin dataset, we can use Amazon Forecast to generate bitcoin predictions.

Clean up

If you’re done with this use case, clean up the resources you created to avoid incurring additional charges. For Data Wrangler, you can shut down the underlying instance when you’re finished; refer to the Shut Down Data Wrangler documentation for details. Alternatively, you can continue to Part 2 of this series to use this dataset for forecasting.

Summary

This post demonstrated how to use Data Wrangler to simplify and accelerate time series analysis with its built-in time series capabilities. We explored how data scientists can easily and interactively clean, format, validate, and transform time series data into the desired format for meaningful analysis. We also explored how you can enrich your time series analysis by adding a comprehensive set of statistical features using Data Wrangler. To learn more about time series transformations in Data Wrangler, see Transform Data.


About the Authors

Roop Bains is a Solutions Architect at AWS focusing on AI/ML. He is passionate about helping customers innovate and achieve their business objectives using Artificial Intelligence and Machine Learning. In his spare time, Roop enjoys reading and hiking.

Nikita Ivkin is an Applied Scientist, Amazon SageMaker Data Wrangler.

Read More

Reimagining Modern Luxury: NVIDIA Announces Partnership with Jaguar Land Rover

Jaguar Land Rover and NVIDIA are redefining modern luxury, infusing intelligence into the customer experience.

As part of its Reimagine strategy, Jaguar Land Rover announced today that it will develop its upcoming vehicles on the full-stack NVIDIA DRIVE Hyperion 8 platform, with DRIVE Orin delivering a wide spectrum of active safety, automated driving and parking systems, as well as driver assistance systems built on DRIVE AV software. The system will also deliver AI features inside the vehicle, including driver and occupant monitoring and advanced visualization, leveraging the DRIVE IX software stack.

The iconic maker of modern luxury vehicles and the leader in AI computing will work together to build software-defined features for future Jaguar and Land Rover vehicles with continuously improving automated driving and intelligent features, from 2025.

The result will be some of the world’s most desirable vehicles, preserving the design purity of the distinct Jaguar and Land Rover personalities while transforming the experiences for customers at every step of the journey.

These vehicles will be built on a unified computer architecture that delivers software-defined services for ongoing customer value and innovative new business models. The combination of centralized compute and intelligent features upgraded over the air also enhances supply chain management.

The next step in this reimagination of responsible, modern luxury is implementing safe and convenient AI-powered features.

“Our long-term strategic partnership with NVIDIA will unlock a world of potential for our future vehicles as the business continues its transformation into a truly global, digital powerhouse,” said Thierry Bolloré, CEO of Jaguar Land Rover.

End-to-End Intelligence

Future Jaguar and Land Rover vehicles will be developed with NVIDIA AI from end to end.

This development begins in the data center. Engineers from both companies will work together to train, test and validate new automated driving features using NVIDIA data center solutions.

This includes data center hardware, software and workflows needed to develop and validate autonomous driving technology, from raw data collection through validation. NVIDIA DGX supercomputers provide the building blocks required for DNN development and training, while DRIVE Sim enables the necessary validation, replay and testing in simulation to enable a safe autonomous driving experience.

With NVIDIA Omniverse, engineers can collaborate virtually as well as exhaustively test and validate these DNNs with high-fidelity synthetic data generation.

Jaguar Land Rover will deploy this full-stack solution on NVIDIA DRIVE Hyperion — the central nervous system of the vehicle — which features the DRIVE Orin centralized AI compute platform — the car’s brain. DRIVE Hyperion includes the safety, security systems, networking and surrounding sensors used for autonomous driving, parking and intelligent cockpit applications.

NVIDIA DRIVE Orin

The future vehicles will be continuously improved and supported throughout their lifetimes by some of the world’s foremost software and AI engineers at NVIDIA and Jaguar Land Rover.

“Next-generation cars will transform automotive into one of the largest and most advanced technology industries,” said NVIDIA founder and CEO Jensen Huang. “Fleets of software-defined, programmable cars will offer new functionalities and services for the life of the vehicles.”

A Responsible Future

This new intelligent architecture, in addition to the transition to zero emissions powertrains, ensures Jaguar and Land Rover vehicles will not only transform the experience of customers, but also benefit the surrounding environment.

In addition to an all-electric future, the automaker is aiming to achieve net-zero carbon emissions across its supply chain, products and operations by 2039, incorporating sustainability into its long-heralded heritage.

By developing vehicles that are intelligent and backed by the high-performance compute of NVIDIA, Jaguar Land Rover is investing in technology that is safe for all road users, as well as convenient and comfortable.

This attention to responsibility in the new era of modern luxury extends the unique, emotional attachment that Jaguar and Land Rover vehicles inspire for even more decades to come.

The post Reimagining Modern Luxury: NVIDIA Announces Partnership with Jaguar Land Rover appeared first on The Official NVIDIA Blog.

Read More