Deploy Amazon SageMaker Autopilot models to serverless inference endpoints

Amazon SageMaker Autopilot automatically builds, trains, and tunes the best machine learning (ML) models based on your data, while allowing you to maintain full control and visibility. Autopilot can also deploy trained models to real-time inference endpoints automatically.

If your workloads have spiky or unpredictable traffic patterns and can tolerate cold starts, deploying the model to a serverless inference endpoint is often more cost-efficient.

Amazon SageMaker Serverless Inference is a purpose-built inference option ideal for workloads with unpredictable traffic patterns and that can tolerate cold starts. Unlike a real-time inference endpoint, which is backed by a long-running compute instance, serverless endpoints provision resources on demand with built-in auto scaling. Serverless endpoints scale automatically based on the number of incoming requests and scale down resources to zero when there are no incoming requests, helping you minimize your costs.

In this post, we show how to deploy Autopilot trained models to serverless inference endpoints using the Boto3 libraries for Amazon SageMaker.

Autopilot training modes

Before creating an Autopilot experiment, you can either let Autopilot select the training mode automatically, or you can select the training mode manually.

Autopilot currently supports three training modes:

  • Auto – Based on dataset size, Autopilot automatically chooses either ensembling or HPO mode. For datasets larger than 100 MB, Autopilot chooses HPO; otherwise, it chooses ensembling.
  • Ensembling – Autopilot uses the AutoGluon ensembling technique using model stacking and produces an optimal predictive model.
  • Hyperparameter optimization (HPO) – Autopilot finds the best version of a model by tuning hyperparameters using Bayesian optimization or multi-fidelity optimization while running training jobs on your dataset. HPO mode selects the algorithms most relevant to your dataset and selects the best range of hyperparameters to tune your models.

To learn more about Autopilot training modes, refer to Training modes.

Solution overview

In this post, we use the UCI Bank Marketing dataset to predict if a client will subscribe to a term deposit offered by the bank. This is a binary classification problem type.

We launch two Autopilot jobs using the Boto3 libraries for SageMaker. The first job uses ensembling as the chosen training mode. We then deploy the single ensemble model generated to a serverless endpoint and send inference requests to this hosted endpoint.

The second job uses the HPO training mode. For classification problem types, Autopilot generates three inference containers. We extract these three inference containers and deploy them to separate serverless endpoints. Then we send inference requests to these hosted endpoints.

For more information about regression and classification problem types, refer to Inference container definitions for regression and classification problem types.

We can also launch Autopilot jobs from the Amazon SageMaker Studio UI. If you launch jobs from the UI, be sure to turn off the Auto deploy option in the Deployment and Advanced settings section. Otherwise, Autopilot deploys the best candidate to a real-time endpoint.

Prerequisites

Ensure you have the latest version of Boto3 and the SageMaker Python packages installed:

pip install -U boto3 sagemaker

We need SageMaker Python SDK version >= 2.110.0 and Boto3 version >= 1.24.84.
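To confirm that the installed versions meet these minimums, you can print them in your notebook (a quick check, assuming both packages import cleanly):

import boto3
import sagemaker

# Versions should be >= 1.24.84 and >= 2.110.0, respectively
print(f"boto3 version: {boto3.__version__}")
print(f"sagemaker version: {sagemaker.__version__}")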

Launch an Autopilot job with ensembling mode

To launch an Autopilot job using the SageMaker Boto3 libraries, we use the create_auto_ml_job API. We then pass AutoMLJobConfig, InputDataConfig, and AutoMLJobObjective as inputs to the create_auto_ml_job call. See the following code:

import boto3
import sagemaker
from time import gmtime, strftime

region = boto3.Session().region_name
session = sagemaker.Session()
bucket = session.default_bucket()
role = sagemaker.get_execution_role()
prefix = "autopilot/bankadditional"
sm_client = boto3.Session().client(service_name='sagemaker', region_name=region)

timestamp_suffix = strftime('%d%b%Y-%H%M%S', gmtime())
automl_job_name = f"uci-bank-marketing-{timestamp_suffix}"
max_job_runtime_seconds = 3600
max_runtime_per_job_seconds = 1200
target_column = "y"
problem_type = "BinaryClassification"
objective_metric = "F1"
training_mode = "ENSEMBLING"

automl_job_config = {
    'CompletionCriteria': {
      'MaxRuntimePerTrainingJobInSeconds': max_runtime_per_job_seconds,
      'MaxAutoMLJobRuntimeInSeconds': max_job_runtime_seconds
    },    
    "Mode" : training_mode
}

automl_job_objective= { "MetricName": objective_metric }

input_data_config = [
    {
      'DataSource': {
        'S3DataSource': {
          'S3DataType': 'S3Prefix',
          'S3Uri': f's3://{bucket}/{prefix}/raw/bank-additional-full.csv'
        }
      },
      'TargetAttributeName': target_column
    }
  ]

output_data_config = {
    'S3OutputPath': f's3://{bucket}/{prefix}/output'
}


sm_client.create_auto_ml_job(
    AutoMLJobName=automl_job_name,
    InputDataConfig=input_data_config,
    OutputDataConfig=output_data_config,
    AutoMLJobConfig=automl_job_config,
    ProblemType=problem_type,
    AutoMLJobObjective=automl_job_objective,
    RoleArn=role)

Autopilot returns the BestCandidate model object that contains the InferenceContainers required to deploy the models to inference endpoints. To get the BestCandidate for the preceding job, we use the describe_auto_ml_job API:

job_response = sm_client.describe_auto_ml_job(AutoMLJobName=automl_job_name)
best_candidate = job_response['BestCandidate']
inference_container = job_response['BestCandidate']['InferenceContainers'][0]
print(inference_container)
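The describe call assumes the Autopilot job has already completed. A minimal polling loop (a sketch; the status fields come from the describe_auto_ml_job response) to wait for the job to finish before retrieving the best candidate:

import time

# Poll until the Autopilot job reaches a terminal status
while True:
    job_response = sm_client.describe_auto_ml_job(AutoMLJobName=automl_job_name)
    job_status = job_response['AutoMLJobStatus']
    if job_status in ('Completed', 'Failed', 'Stopped'):
        print(f"Autopilot job finished with status: {job_status}")
        break
    print(f"Autopilot job status: {job_status} - {job_response['AutoMLJobSecondaryStatus']}")
    time.sleep(60)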

Deploy the trained model

We now deploy the preceding inference container to a serverless endpoint. The first step is to create a model from the inference container, then create an endpoint configuration in which we specify the MemorySizeInMB and MaxConcurrency values for the serverless endpoint along with the model name. Finally, we create an endpoint with the endpoint configuration created above.

We recommend choosing your endpoint’s memory size according to your model size. The memory size should be at least as large as your model size. Your serverless endpoint has a minimum RAM size of 1024 MB (1 GB), and the maximum RAM size you can choose is 6144 MB (6 GB).

The memory sizes you can choose are 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.

To help determine whether a serverless endpoint is the right deployment option from a cost and performance perspective, we encourage you to refer to the SageMaker Serverless Inference Benchmarking Toolkit, which tests different endpoint configurations and compares the most optimal one against a comparable real-time hosting instance.

Note that serverless endpoints accept only a single model container; they don't support multi-container or multi-model deployments. Autopilot in ensembling mode generates a single model, so we can deploy this model container as is to the endpoint. See the following code; the model name, endpoint names, and serverless settings shown are example values:

# Example names and serverless settings (adjust as needed)
model_name = f"automl-ensemble-{timestamp_suffix}"
endpoint_config_name = f"epc-{model_name}"
endpoint_name = f"ep-{model_name}"
memory = 2048           # MemorySizeInMB: 1024, 2048, 3072, 4096, 5120, or 6144
max_concurrency = 10    # Maximum concurrent invocations for the serverless endpoint

# Create Model
model_response = sm_client.create_model(
    ModelName=model_name,
    ExecutionRoleArn=role,
    Containers=[inference_container]
)

# Create Endpoint Config
epc_response = sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "ModelName": model_name,
            "VariantName": "AllTraffic",
            "ServerlessConfig": {
                "MemorySizeInMB": memory,
                "MaxConcurrency": max_concurrency
            }
        }
    ]
)

# Create Endpoint
ep_response = sm_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name
)
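Endpoint creation is asynchronous. One way to wait for the serverless endpoint to become available before sending requests is a sketch using the built-in Boto3 waiter:

# Block until the serverless endpoint reaches the InService state
waiter = sm_client.get_waiter('endpoint_in_service')
waiter.wait(EndpointName=endpoint_name)
print(f"Endpoint {endpoint_name} is InService")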

When the serverless inference endpoint is InService, we can test it by sending an inference request and observing the predictions. The following diagram illustrates the architecture of this setup.

Note that we can send raw data as a payload to the endpoint. The ensemble model generated by Autopilot automatically incorporates all required feature-transform and inverse-label transform steps, along with the algorithm model and packages, into a single model.

Send inference request to the trained model

Use the following code to send an inference request to the model trained using ensembling mode:

from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer

payload = "34,blue-collar,married,basic.4y,no,no,no,telephone,may,tue,800,4,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0"

predictor = Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=session,
    serializer=CSVSerializer(),
)

prediction = predictor.predict(payload).decode('utf-8')
print(prediction)
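Alternatively, you can invoke the endpoint directly through the SageMaker runtime client in Boto3; a minimal sketch with the same CSV payload:

import boto3

# Invoke the serverless endpoint with a raw CSV payload
runtime_client = boto3.client('sagemaker-runtime', region_name=region)
response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='text/csv',
    Body=payload
)
print(response['Body'].read().decode('utf-8'))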

Launch an Autopilot Job with HPO mode

In HPO mode, for CompletionCriteria, besides MaxRuntimePerTrainingJobInSeconds and MaxAutoMLJobRuntimeInSeconds, we can also specify MaxCandidates to limit the number of candidates an Autopilot job generates. Note that these are optional parameters, set here only to limit the job runtime for demonstration. See the following code:

training_mode = "HYPERPARAMETER_TUNING"

automl_job_config["Mode"] = training_mode
automl_job_config["CompletionCriteria"]["MaxCandidates"] = 15
hpo_automl_job_name =  f"{model_prefix}-HPO-{timestamp_suffix}"

response = sm_client.create_auto_ml_job(
					  AutoMLJobName=hpo_automl_job_name,
					  InputDataConfig=input_data_config,
					  OutputDataConfig=output_data_config,
					  AutoMLJobConfig=automl_job_config,
					  ProblemType=problem_type,
					  AutoMLJobObjective=automl_job_objective,
					  RoleArn=role,
					  Tags=tags_config
				)

To get the BestCandidate for the preceding job, we can again use the describe_auto_ml_job API:

job_response = sm_client.describe_auto_ml_job(AutoMLJobName=hpo_automl_job_name)
best_candidate = job_response['BestCandidate']
inference_containers = job_response['BestCandidate']['InferenceContainers']
print(inference_containers)

Deploy the trained model

Autopilot in HPO mode for the classification problem type generates three inference containers.

The first container handles the feature-transform steps. Next, the algorithm container generates the predicted_label with the highest probability. Finally, the post-processing inference container performs an inverse transform on the predicted label and maps it to the original label. For more information, refer to Inference container definitions for regression and classification problem types.

We extract these three inference containers and deploy each to a separate serverless endpoint. For inference, we invoke the endpoints in sequence: we send the payload first to the feature-transform container, pass the output of that container to the algorithm container, and finally pass the output of the previous inference container to the post-processing container, which outputs the predicted label.

The following diagram illustrates the architecture of this setup, with the Autopilot model in HPO mode deployed to three serverless endpoints.

We extract the three inference containers from the BestCandidate with the following code:

job_response = sm_client.describe_auto_ml_job(AutoMLJobName=hpo_automl_job_name)
inference_containers = job_response['BestCandidate']['InferenceContainers']

models = list()
endpoint_configs = list()
endpoints = list()

# For brevity, we've encapsulated create_model, create_endpoint_config, and create_endpoint as helper functions
for idx, container in enumerate(inference_containers):
    (status, model_arn) = create_autopilot_model(
        sm_client,
        hpo_automl_job_name,
        role,
        container,
        idx)
    model_name = model_arn.split('/')[1]
    models.append(model_name)

    endpoint_config_name = f"epc-{model_name}"
    endpoint_name = f"ep-{model_name}"
    (status, epc_arn) = create_serverless_endpoint_config(
        sm_client,
        endpoint_config_name,
        model_name,
        memory=2048,
        max_concurrency=10)
    endpoint_configs.append(endpoint_config_name)

    response = create_serverless_endpoint(
        sm_client,
        endpoint_name,
        endpoint_config_name)
    endpoints.append(endpoint_name)
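The helper functions themselves are not shown in this post. A minimal sketch of what they might look like, with signatures and return values inferred from how they are called above (returning the HTTP status code as the first tuple element is an illustrative choice):

def create_autopilot_model(sm_client, automl_job_name, role, container, idx):
    # Create a SageMaker model from one Autopilot inference container
    model_name = f"{automl_job_name}-model-{idx}"
    response = sm_client.create_model(
        ModelName=model_name,
        ExecutionRoleArn=role,
        Containers=[container]
    )
    return (response['ResponseMetadata']['HTTPStatusCode'], response['ModelArn'])

def create_serverless_endpoint_config(sm_client, endpoint_config_name, model_name,
                                      memory=2048, max_concurrency=10):
    # Create an endpoint configuration with a single serverless production variant
    response = sm_client.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[
            {
                "ModelName": model_name,
                "VariantName": "AllTraffic",
                "ServerlessConfig": {
                    "MemorySizeInMB": memory,
                    "MaxConcurrency": max_concurrency
                }
            }
        ]
    )
    return (response['ResponseMetadata']['HTTPStatusCode'], response['EndpointConfigArn'])

def create_serverless_endpoint(sm_client, endpoint_name, endpoint_config_name):
    # Create the serverless endpoint from the endpoint configuration
    return sm_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name
    )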

Send inference request to the trained model

For inference, we send the payload in sequence: first to the feature-transform container, then to the model container, and finally to the inverse-label transform container.

See the following code:

from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer

payload = "51,technician,married,professional.course,no,yes,no,cellular,apr,thu,687,1,0,1,success,-1.8,93.075,-47.1,1.365,5099.1"

# Invoke the three endpoints in sequence, feeding each response to the next endpoint
for endpoint in endpoints:
    try:
        print(f"payload: {payload}")
        predictor = Predictor(
            endpoint_name=endpoint,
            sagemaker_session=session,
            serializer=CSVSerializer(),
        )
        prediction = predictor.predict(payload)
        payload = prediction
    except Exception as e:
        print(f"Error invoking endpoint {endpoint}:\n{e}")
        break
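After the loop completes, payload holds the response from the last (post-processing) endpoint, which is the inverse-transformed predicted label:

# The final payload is the predicted label returned by the post-processing container
predicted_label = payload.decode('utf-8') if isinstance(payload, bytes) else payload
print(predicted_label)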

The full implementation of this example is available in the following Jupyter notebook.

Clean up

To clean up resources, you can delete the created serverless endpoints, endpoint configs, and models:

sm_client = boto3.Session().client(service_name='sagemaker', region_name=region)

# Delete endpoints first, then endpoint configurations, then models
for endpoint in endpoints:
    try:
        sm_client.delete_endpoint(EndpointName=endpoint)
    except Exception as e:
        print(f"Exception:\n{e}")
        continue

for endpoint_config in endpoint_configs:
    try:
        sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config)
    except Exception as e:
        print(f"Exception:\n{e}")
        continue

for autopilot_model in models:
    try:
        sm_client.delete_model(ModelName=autopilot_model)
    except Exception as e:
        print(f"Exception:\n{e}")
        continue

Conclusion

In this post, we showed how to deploy Autopilot-generated models, in both ensembling and HPO modes, to serverless inference endpoints. This approach lets you take advantage of cost-efficient, fully managed ML services like Autopilot to generate models quickly from raw data, and then deploy them to fully managed serverless inference endpoints with built-in auto scaling to reduce costs.

We encourage you to try this solution with a dataset relevant to your business KPIs. You can refer to the solution implemented in a Jupyter notebook in the GitHub repo.

About the Author

Praveen Chamarthi is a Senior AI/ML Specialist with Amazon Web Services. He is passionate about AI/ML and all things AWS. He helps customers across the Americas to scale, innovate, and operate ML workloads efficiently on AWS. In his spare time, Praveen loves to read and enjoys sci-fi movies.

Read More

What Is a Pretrained AI Model?

Imagine trying to teach a toddler what a unicorn is. A good place to start might be by showing the child images of the creature and describing its unique features.

Now imagine trying to teach an artificially intelligent machine what a unicorn is. Where would one even begin?

Pretrained AI models offer a solution.

A pretrained AI model is a deep learning model — an expression of a brain-like neural algorithm that finds patterns or makes predictions based on data — that’s trained on large datasets to accomplish a specific task. It can be used as is or further fine-tuned to fit an application’s specific needs.

Why Are Pretrained AI Models Used?

Instead of building an AI model from scratch, developers can use pretrained models and customize them to meet their requirements.

To build an AI application, developers first need an AI model that can accomplish a particular task, whether that’s identifying a mythical horse, detecting a safety hazard for an autonomous vehicle or diagnosing a cancer based on medical imaging. That model needs a lot of representative data to learn from.

This learning process entails going through several layers of incoming data and emphasizing goal-relevant characteristics at each layer.

To create a model that can recognize a unicorn, for example, one might first feed it images of unicorns, horses, cats, tigers and other animals. This is the incoming data.

Then, layers of representative data traits are constructed, beginning with the simple — like lines and colors — and advancing to complex structural features. These characteristics are assigned varying degrees of relevance by calculating probabilities.

As opposed to a cat or tiger, for example, the more like a horse a creature appears, the greater the likelihood that it is a unicorn. Such probabilistic values are stored at each neural network layer in the AI model, and as layers are added, its understanding of the representation improves.

To create such a model from scratch, developers require enormous datasets, often with billions of rows of data. These can be pricey and challenging to obtain, but compromising on data can lead to poor performance of the model.

Precomputed probabilistic representations — known as weights — save time, money and effort. A pretrained model is already built and trained with these weights.

Using a high-quality pretrained model with a large number of accurate representative weights leads to higher chances of success for AI deployment. Weights can be modified, and more data can be added to the model to further customize or fine-tune it.

Developers building on pretrained models can create AI applications faster, without having to worry about handling mountains of input data or computing probabilities for dense layers.

In other words, using a pretrained AI model is like getting a dress or a shirt and then tailoring it to fit your needs, rather than starting with fabric, thread and needle.

Pretrained AI models are often used for transfer learning and can be based on several model architecture types. One popular architecture type is the transformer model, a neural network that learns context and meaning by tracking relationships in sequential data.

According to Alfredo Ramos, senior vice president of platform at AI company Clarifai — a Premier partner in the NVIDIA Inception program for startups — pretrained models can cut AI application development time by up to a year and lead to cost savings of hundreds of thousands of dollars.

How Are Pretrained Models Advancing AI?

Since pretrained models simplify and quicken AI development, many developers and companies use them to accelerate various AI use cases.

Top areas in which pretrained models are advancing AI include:

  • Natural language processing. Pretrained models are used for translation, chatbots and other natural language processing applications. Large language models, often based on the transformer model architecture, are an extension of pretrained models. One example of a pretrained LLM is NVIDIA NeMo Megatron, one of the world’s largest AI models.
  • Speech AI. Pretrained models can help speech AI applications plug and play across different languages. Use cases include call center automation, AI assistants and voice-recognition technologies.
  • Computer vision. Like in the unicorn example above, pretrained models can help AI quickly recognize creatures — or objects, places and people. In this way, pretrained models accelerate computer vision, giving applications human-like vision capabilities across sports, smart cities and more.
  • Healthcare. For healthcare applications, pretrained AI models like MegaMolBART — part of the NVIDIA BioNeMo service and framework — can understand the language of chemistry and learn the relationships between atoms in real-world molecules, giving the scientific community a powerful tool for faster drug discovery. 
  • Cybersecurity. Pretrained models provide a starting point to implement AI-based cybersecurity solutions and extend the capabilities of human security analysts to detect threats faster. Examples include digital fingerprinting of humans and machines, and detection of anomalies, sensitive information and phishing.
  • Art and creative workflows. Bolstering the recent wave of AI art, pretrained models can help accelerate creative workflows through tools like GauGAN and NVIDIA Canvas.

Pretrained AI models can be applied across industries beyond these, as their customization and fine-tuning can lead to infinite possibilities for use cases.

Where to Find Pretrained AI Models

Companies like Google, Meta, Microsoft and NVIDIA are inventing cutting-edge model architectures and frameworks to build AI models.

These are sometimes released on model hubs or as open source, enabling developers to fine-tune pretrained AI models, improve their accuracy and expand model repositories.

NVIDIA NGC — a hub for GPU-optimized AI software, models and Jupyter Notebook examples — includes pretrained models as well as AI benchmarks and training recipes optimized for use with the NVIDIA AI platform.

NVIDIA AI Enterprise, a fully managed, secure, cloud-native suite of AI and data analytics software, includes pretrained models without encryption. This allows developers and enterprises looking to integrate NVIDIA pretrained models into their custom AI applications to view model weights and biases, improve explainability and debug easily.

Thousands of open-source models are also available on hubs like GitHub, Hugging Face and others.

It’s important that pretrained models are trained using ethical data that’s transparent and explainable, privacy compliant, and obtained with consent and without bias.

NVIDIA Pretrained AI Models

To help more developers move AI from prototype to production, NVIDIA offers several pretrained models that can be deployed out of the box, including:

  • NVIDIA SegFormer, a transformer model for simple, efficient, powerful semantic segmentation — available on GitHub.
  • NVIDIA’s purpose-built computer vision models, trained on millions of images for smart cities, parking management and other applications.
  • NVIDIA NeMo Megatron, the world’s largest customizable language model, as part of NVIDIA NeMo, an open-source framework for building high-performance and flexible applications for conversational AI, speech AI and biology.
  • NVIDIA StyleGAN, a style-based generator architecture for generative adversarial networks, or GANs. It uses transfer learning to generate infinite paintings in a variety of styles.

In addition, NVIDIA Riva, a GPU-accelerated software development kit for building and deploying speech AI applications, includes pretrained models in ten languages.

And MONAI, an open-source AI framework for healthcare research developed by NVIDIA and King’s College London, includes pretrained models for medical imaging.

Learn more about NVIDIA pretrained AI models.

The post What Is a Pretrained AI Model? appeared first on NVIDIA Blog.

Read More

The Hunt Is On: ‘The Witcher 3: Wild Hunt’ Next-Gen Update Coming to GeForce NOW

It’s a wild GFN Thursday — The Witcher 3: Wild Hunt next-gen update will stream on GeForce NOW day and date, starting next week. Today, members can stream new seasons of Fortnite and Genshin Impact, alongside eight new games joining the library.

In addition, the newest GeForce NOW app is rolling out this week with support for syncing members’ Ubisoft Connect library of games, which helps them get into their favorite Ubisoft games even quicker.

Plus, gamers across the U.K., Netherlands and Poland have the first chance to pick up the new HP Chromebook x360, 13.3 inches, built for extreme multitasking with an adaptive 360-degree design and great for cloud gaming. Each Chromebook purchase comes with a one-month GeForce NOW Priority membership for free.

Triss the Season

CD PROJEKT RED releases the next-gen update for The Witcher 3: Wild Hunt — Complete Edition on Wednesday, Dec. 14. The update is free for anyone who owns the game on Steam, Epic Games, or GOG.com, and GeForce NOW members can take advantage of upgraded visuals across nearly all of their devices.

The next-gen update brings vastly improved visuals, a new photo mode, and content inspired by Netflix’s The Witcher series. It also adds RTX Global Illumination, as well as ray-traced ambient occlusion, shadows and reflections that add cinematic detail to the game.

Play as Geralt of Rivia on a quest to track down his adopted daughter Ciri, the Child of Prophecy — and the carrier of the powerful Elder Blood — across all your devices without needing to wait for the update to download and install. GeForce NOW RTX 3080 and Priority members can play with RTX ON and NVIDIA DLSS to explore the beautiful open world of The Witcher at high frame rates on nearly any device — from Macs to mobile devices and more.

Get in Sync

The GeForce NOW 2.0.47 app update begins rolling out this week with support for syncing Ubisoft Connect accounts with your GeForce NOW library.

The 2.0.47 app update brings Ubisoft Connect library syncing.

Members will be able to get to their Ubisoft games faster and easier with this new game-library sync for Ubisoft Connect. Once synced, members will be automatically logged into their Ubisoft account across all devices when streaming supported GeForce NOW games purchased directly from the Ubisoft or Epic Games Store. These include titles like Rainbow Six Siege and Far Cry 6.

The update also adds improvements to voice chat with Chromebook built-in mics, as well as bug fixes. Look for the update to hit PC, Mac and browser clients in the coming days.

‘Tis the Seasons

Fortnite Chapter 4 is available to play in the cloud.

The action never stops on GeForce NOW. This week brings updates to some of the hottest titles streaming from the cloud, and eight new games to play.

Members can jump into Fortnite Chapter 4, now available on GeForce NOW. The chapter features a new island, newly forged weapons, a new realm and new ways to get around, whether riding a dirt bike or rolling around in a snowball. A new cast of combatants is also available, including Geralt of Rivia himself.

Genshin Impact’s Version 3.3 “All Senses Clear, All Existence Void” is also available to stream on GeForce NOW, bringing a new season of events, a new card game called the Genius Invokation TCG, and two powerful allies — the Wanderer and Faruzan — for more stories, fun and challenges.

Here’s the full list of games coming to the cloud this week:

A GeForce NOW paid membership makes a great present for the gamer in your life, so give the gift of gaming with a GeForce NOW gift card. It’s the perfect stocking stuffer or last-minute treat for yourself or a buddy.

Finally, with The Witcher 3: Wild Hunt — Complete Edition on the way, we need to know – Which Geralt are you today? Tell us on Twitter or in the comments below.

The post The Hunt Is On: ‘The Witcher 3: Wild Hunt’ Next-Gen Update Coming to GeForce NOW appeared first on NVIDIA Blog.

Read More

‘23 and AV: Transportation Industry to Drive Into Metaverse, Cloud Technologies

As the autonomous vehicle industry enters the next year, it will start navigating into even greater technology frontiers.

Next-generation vehicles won’t just be defined by autonomous driving capabilities. Everything from the design and production process to the in-vehicle experience is entering a new era of digitization, efficiency, safety and intelligence.

These trends arrive after a wave of breakthroughs in 2022. More automakers announced plans to build software-defined vehicles on the NVIDIA DRIVE Orin system-on-a-chip — including Jaguar Land Rover, NIO, Polestar, Volvo Cars and Xpeng. And in-vehicle compute pushed the envelope with the next-generation NVIDIA DRIVE Thor platform.

Delivering up to 2,000 trillion floating-point operations per second, DRIVE Thor unifies autonomous driving and cockpit functions on a single computer for unprecedented speed and efficiency.

In the coming year, the industry will see even more wide-ranging innovations begin to take hold, as industrial metaverse and cloud technologies become more prevalent.

Simulation technology for AV development has also flourished in the past year. New tools and techniques on NVIDIA DRIVE Sim, including using AI tools for training and validation, have narrowed the gap between the virtual and real worlds.

Here’s what to expect for intelligent transportation in 2023.

Enter the Metaverse

The same NVIDIA Omniverse platform that serves as the foundation of DRIVE Sim for AV development is also revolutionizing the automotive product cycle. Automakers can leverage Omniverse to unify the 3D design and simulation pipelines for vehicles, and build persistent digital twins of their production facilities.

Designers can collaborate across 3D software ecosystems from anywhere in the world, in real time, with Omniverse. With full-fidelity RTX ray tracing showing physically accurate lighting, reflections, and physical behavior, vehicle designs can be evaluated and tested more precisely before physical prototyping ever begins.

Production is the next step in this process, and it requires thousands of parts and workers moving in harmony. With Omniverse, automakers can develop a unified view of their manufacturing processes across plants to streamline operations.

Planners can access the full-fidelity digital twin of the factory, reviewing and optimizing as needed. Every change can be quickly evaluated and validated in virtual, then implemented in the real world to ensure maximum efficiency and ergonomics for factory workers.

Customers can also benefit from enhanced product experiences. Full-fidelity, real-time car configurators, 3D simulations of vehicles, demonstrations in augmented reality and virtual test drives all help bring the vehicle to the customer.

These technologies bridge the gap between the digital and the physical, as the buying experience evolves to include both physical retail spaces and online engagement.

Cloud Migration

As remote work becomes a permanent fixture, cloud capabilities are proving vital to growing industries, including transportation.

Looking ahead, AV developers will be able to access a comprehensive suite of services using NVIDIA Omniverse Cloud to design, deploy and experience metaverse applications anywhere. These applications include simulation, in-vehicle experiences and car configurators.

With cloud-based simulation, AV engineers can generate physically based sensor data and traffic scenarios to test and validate self-driving technology. Developers can also use simulation to design intelligent vehicle interiors.

An autonomous test vehicle running in simulation.

These next-generation cabins will feature personalized entertainment, including streaming content. With the NVIDIA GeForce NOW cloud gaming service, passengers will be able to stream over 1,000 titles from the cloud into the vehicle while charging or waiting to pick up passengers.

Additionally, Omniverse Cloud enables automakers to offer a virtual showroom for an immersive experience to customize a vehicle before purchasing it from anywhere in the world.

Individualized Interiors

Autonomous driving capabilities will deliver a smoother, safer driving experience for all road users. As driving functions become more automated across the industry, vehicle interiors are taking on a bigger role for automakers to create branded experiences.

In addition to gaming, advances in AI and in-vehicle compute are enabling a range of new infotainment technologies, including digital assistants, occupant monitoring, AV visualization, video conferencing and more.

AI and cloud technologies provide personalized infotainment experiences for every passenger.

With NVIDIA DRIVE Concierge, automakers can provide these features across multiple displays in the vehicle. And with software-defined, centralized compute, they can continuously add new capabilities over the air.

This emerging cloud-first approach is transforming every segment of the AV industry, from developing vehicles and self-driving systems to operating global fleets.

The post ‘23 and AV: Transportation Industry to Drive Into Metaverse, Cloud Technologies appeared first on NVIDIA Blog.

Read More

IOM and Microsoft release first-ever differentially private synthetic dataset to counter human trafficking

Migrants rescued last March in the Channel of Sicily by Italian Coast Guard (File photo). © Francesco Malavolta/IOM 2015

Microsoft is home to a diverse team of researchers focused on supporting a healthy global society, including finding ways technology can address human rights problems affecting the most vulnerable populations around the world. With a multi-disciplinary background in human-computer interaction, data science, and the social sciences, the research team partners with community, governmental, and nongovernmental organizations to create open technologies that enable scalable responses to such challenges.  

The United Nations’ International Organization for Migration (IOM) provides direct assistance and support to migrants around the world, as well as victims and survivors of human trafficking. IOM is dedicated to promoting humane and orderly migration by providing services to governments and migrants in its 175 member countries. It recently reported 50 million victims of modern slavery globally, including 3.3 million children, 6.3 million in commercial sexual exploitation, and 22 million trapped in forced marriages. Understanding and addressing problems at this scale requires technology to help anti-trafficking actors and domain experts gather and translate real-world data into evidence that can inform policies and build support systems.

According to IOM, migrants and displaced people represent some of the most vulnerable populations in society. The organization explains that, “while human mobility can be a source of prosperity, innovation, and sustainable development, every migration journey can include risks to safety, which are exacerbated during times of crisis, or when people face extreme vulnerability as they are forced to migrate amid a lack of safe and regular migration pathways.”


Today, using software developed by Microsoft researchers, IOM released its second synthetic dataset from trafficking victim case records, the first ever public dataset to describe victim-perpetrator relations. The synthetic dataset is also the first of its kind to be generated with differential privacy, providing an additional security guarantee for multiple data releases, which enables the sharing of more data and allows more rigorous research to be conducted while protecting privacy and civil liberties. 

The new data release builds on several years of collaboration between Microsoft and IOM to support safe data sharing of victim case records in ways that can inform collective action across the anti-trafficking community. This collaboration began in July 2019 when IOM joined the accelerator program of the Tech Against Trafficking (TAT) coalition, with the goal of advancing the privacy and utility of data made available through the Counter Trafficking Data Collaborative (CTDC) data hub – the first global portal on human trafficking case data. Since then, IOM and Microsoft have collaborated to improve the ways data on identified victims and survivors—as well as their accounts of perpetrators—can be used to combat the proliferation of human trafficking.  

“We are grateful to Microsoft Research for our partnership over almost four years to share data while protecting the safety and privacy of victims and survivors of trafficking.”

– Monica Goracci, IOM’s Director of Programme Support and Migration Management

The critical importance of data privacy when working with vulnerable populations 

When publishing data on victims of trafficking, every effort must be taken to ensure that traffickers cannot identify known victims in published datasets. It is also important to protect individuals’ privacy to avoid stigma or other potential forms of harm or (re)traumatization. Statistical accuracy is another concern: researchers and analysts must be able to extract useful insights from a dataset containing personal information while still guaranteeing victims’ privacy. This is critically important: if a privacy method were to over- or under-report a given pattern in victim cases, it could mislead decision makers into misdirecting scarce resources and failing to tackle the originating problem.

The collaboration between IOM and Microsoft was founded on the idea that rather than redacting sensitive data to create privacy, synthetic datasets can be generated in ways that accurately capture the structure and statistics of underlying sensitive datasets, while remaining private by design. But not all synthetic data comes with formal guarantees of data privacy or accuracy. Therefore, building trust in synthetic data requires communicating how well the synthetic data represents the actual sensitive data, while ensuring that these comparisons do not create privacy risks themselves.

From this founding principle, along with the need to accurately report case counts broken down by different combinations of attributes (e.g., age range, gender, nationality), a solution emerged: to release synthetic data alongside privacy-preserving counts of cases, matching all short combinations of case attributes. The aggregate data thereby supports both evaluation of synthetic data quality and retrieval of accurate counts for official reporting. Through this collaboration and the complementary nature of synthetic data and aggregate data—together with interactive interfaces with which to view and explore both datasets—the open-source Synthetic Data Showcase software was developed.

In September 2021, IOM used Synthetic Data Showcase to release its first downloadable Global Synthetic Dataset, representing data from over 156,000 victims and survivors of trafficking across 189 countries and territories (where victims were first identified and supported by CTDC partners). The new Global Victim-Perpetrator Synthetic Dataset, released today, is CTDC’s second synthetic dataset produced using an updated version of Synthetic Data Showcase with added support for differential privacy. This new dataset includes IOM data from over 17,000 trafficking victim case records and their accounts of over 37,000 perpetrators who facilitated the trafficking process from 2005 to 2022.  Together, these datasets provide vital first-hand information on the socio-demographic profiles of victims, their accounts of perpetrators, types of exploitation, and the overall trafficking process—all of which are critical to better assist survivors and prosecute perpetrators. 

“Data privacy is crucial to the pursuit of efficient, targeted counter-trafficking policies and good migration governance.”

– Irina Todorova, Head of the Assistance to Vulnerable Migrants Unit at IOM’s Protection Division

A differentially private dataset 

In 2006, Microsoft researchers led the initial development of differential privacy, and today it represents the gold standard in privacy protection. It helps ensure that answers to data queries are similar, whether or not any individual data subject is in the dataset, and therefore cannot be used to infer the presence of specific individuals, either directly or indirectly.  
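For reference, the standard formal definition: a randomized mechanism $M$ is $\varepsilon$-differentially private if, for all pairs of datasets $D$ and $D'$ that differ in a single individual's record and for every set of outputs $S$,

\[
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S].
\]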

Existing algorithms for differentially private data synthesis typically create privacy by “hiding” actual combinations of attributes in a sea of fabricated or spurious attribute combinations that don’t specifically reflect what was in the original sensitive dataset.

This can be problematic if the presence of these fabricated attribute combinations misrepresents the real-world situation and misleads downstream decision making, policy making, or resource allocation to the detriment of the underlying population (e.g., encouraging policing of trafficking routes that have not actually been observed). 

When the research team encountered these challenges with existing differentially private synthesizers, they engaged fellow researchers at Microsoft to explore possible solutions. They explained the critical importance of reporting accurate counts of actual attribute combinations in support of statistical reporting and evidence-based intervention, and how the “feature” of fabricating unobserved combinations as a way of preserving privacy could be harmful when attempting to understand real-world patterns of exploitation.

Those colleagues had recently solved a similar problem in a different context: how to extract accurate counts of n-gram word combinations from a corpus of private text data. Their solution, recently published at the 2021 Conference on Neural Information Processing Systems, significantly outperformed the state of the art. In collaboration with the research team working with IOM, they adapted this solution into a new approach to generating differentially private marginals—counts of all short combinations of attributes that represented a differentially-private aggregate dataset.

Because differentially private data has the property that subsequent processing cannot increase privacy loss, any datasets generated from such aggregates retain the same level of privacy. This enabled the team to modify their existing approach to data synthesis—creating synthetic records by sampling attribute combinations until all attributes are accounted for—to extrapolate these noisily reported attribute combinations into full, differentially-private synthetic records. The result is precisely what IOM and similar organizations need to create a thriving data ecosystem in the fight against human trafficking and other human rights violations: accurate aggregate data for official reporting, synthetic data for interactive exploration and machine learning, and differential privacy guarantees that provide protection even over multiple overlapping data releases. 

This new synthesizer is now available to the community via Microsoft’s SmartNoise library within the OpenDP initiative. Unlike existing synthesizers, it provides strong control over the extent to which fabrication of spurious attribute combinations is allowed and augments synthetic datasets with “actual” aggregate data protected by differential privacy.

Access to private-yet-accurate patterns of attributes characterizing victim-perpetrator relationships allows stakeholders to advance the understanding of risk factors for vulnerability and carry out effective counter-trafficking interventions, all while keeping the victims’ identities private.

“The new dataset represents the first global collection of case data linking the profiles of trafficking victims and perpetrators ever made available to the public, while enabling strong privacy guarantees. It provides critical information to better assist survivors and prosecute offenders.” – Claire Galez-Davis, Data Scientist at IOM’s Protection Division. 

An intuitive new interface and public utility web application 

Solving problems at a global scale requires tools that make safe data sharing accessible wherever there is a need and in a way that is understandable by all stakeholders. The team wanted to construct an intuitive interface to help develop a shared evidence base and motivate collective action by the anti-trafficking community. They also wanted to ensure that the solution was available to anyone with a need to share sensitive data safely and responsibly. The new user interface developed through this work is now available as a public utility web application in which private data aggregation and synthesis are performed locally in the web browser, with no data ever leaving the user’s machine.

“I find the locally run web application incredibly interactive and intuitive. It is a lot easier for me to explain the data generation process and teach others to use the new web interface. As the data is processed locally in our computers, I don’t need to worry about data leaks.” – Lorraine Wong, Research Officer at IOM’s Protection Division.  

What’s next for the IOM and Microsoft collaboration 

Microsoft and IOM have made the solution publicly accessible for other organizations, including central government agencies. It can be used by any stakeholder who wants to collect and publish sensitive data while protecting individual privacy.

Through workshops and guidance on how to produce high-quality administrative data, the organizations plan to share evidence on exploitation and abuse to support Member States, other UN agencies, and counter-trafficking organizations around the world. This kind of administrative data is a key source of information providing baseline statistics that can be used to understand patterns, risk factors, trends, and modus operandi that are critical for policy response formulation.

For example, IOM has been collaborating with the UN Office on Drugs and Crime (UNODC) to establish international standards and guidance to support governments in producing high-quality administrative data. It has also been collaborating with the UN International Labour Organization (ILO) to index policy-oriented research on trafficking in a bibliography. Finally, IOM is producing an online course, including a module that includes guidance on synthetic data, to encourage safe data sharing from governments and frontline counter-trafficking agencies.

“Being able to publish more data than we have done in the past, and in an even safer way, is a great achievement,” explained Phineas Jasi, Data Management and Research Specialist at IOM’s Protection Division. He added that “the aim is for these data to inform the evidence base on human trafficking, which in turn helps devise efficient and targeted counter-trafficking policies and achieve good migration governance.”

Translating data into evidence is the goal of the related ShowWhy application from the same Microsoft research team, which guides domain experts through the end-to-end process of developing causal evidence from observational data. Just like Synthetic Data Showcase, it makes advanced data science capabilities accessible to domain experts through a suite of interactive, no-code user interfaces. 

“Driving a coordinated global response against human trafficking requires removing traditional barriers to both data access and data analysis,” said Darren Edge, Director at Microsoft Research. “With our Synthetic Data Showcase and ShowWhy applications, we are aiming to empower domain experts to develop causal evidence for themselves, from sensitive data that couldn’t otherwise be shared, and use this to inform collective action with a precision and scale that couldn’t otherwise be imagined.” 

The post IOM and Microsoft release first-ever differentially private synthetic dataset to counter human trafficking appeared first on Microsoft Research.

Read More