Implement serverless semantic search of image and live video with Amazon Titan Multimodal Embeddings

In today’s data-driven world, industries across various sectors are accumulating massive amounts of video data through cameras installed in their warehouses, clinics, roads, metro stations, stores, factories, or even private facilities. This video data holds immense potential for analysis and monitoring of incidents that may occur in these locations. From fire hazards to broken equipment, theft, or accidents, the ability to analyze and understand this video data can lead to significant improvements in safety, efficiency, and profitability for businesses and individuals.

This data allows for the derivation of valuable insights when combined with a searchable index. However, traditional video analysis methods often rely on manual, labor-intensive processes, which makes them challenging to scale efficiently. In this post, we introduce semantic search, a technique to find incidents in videos based on natural language descriptions of events that occurred in the video. For example, you could search for “fire in the warehouse” or “broken glass on the floor.” This is where multimodal embeddings come into play. We introduce the Amazon Titan Multimodal Embeddings model, which maps visual as well as textual data into the same semantic space, allowing you to use a textual description to find images containing that semantic meaning. This semantic search technique allows you to analyze and understand frames from video data more effectively.

We walk you through constructing a scalable, serverless, end-to-end semantic search pipeline for surveillance footage with Amazon Kinesis Video Streams, Amazon Titan Multimodal Embeddings on Amazon Bedrock, and Amazon OpenSearch Service. Kinesis Video Streams makes it straightforward to securely stream video from connected devices to AWS for analytics, machine learning (ML), playback, and other processing. It enables real-time video ingestion, storage, encoding, and streaming across devices. Amazon Bedrock is a fully managed service that provides access to a range of high-performing foundation models from leading AI companies through a single API. It offers the capabilities needed to build generative AI applications with security, privacy, and responsible AI. Amazon Titan Multimodal Embeddings, available through Amazon Bedrock, enables more accurate and contextually relevant multimodal search. It processes and generates information from distinct data types like text and images. You can submit text, images, or a combination of both as input to use the model’s understanding of multimodal content. OpenSearch Service is a fully managed service that makes it straightforward to deploy, scale, and operate OpenSearch. OpenSearch Service allows you to store vectors and other data types in an index, and offers sub-second query latency even when searching billions of vectors and measuring semantic relatedness, which we use in this post.

We discuss how to balance functionality, accuracy, and budget. We include sample code snippets and a GitHub repo so you can start experimenting with building your own prototype semantic search solution.

Overview of solution

The solution consists of three components:

  • First, you extract frames of a live stream with the help of Kinesis Video Streams (you can optionally extract frames of an uploaded video file as well using an AWS Lambda function). These frames can be stored in an Amazon Simple Storage Service (Amazon S3) bucket as files for later processing, retrieval, and analysis.
  • In the second component, you generate an embedding of the frame using Amazon Titan Multimodal Embeddings. You store the reference (an S3 URI) to the actual frame and video file, and the vector embedding of the frame in OpenSearch Service.
  • Third, you accept a textual input from the user, create an embedding of it using the same model, and use the provided API to query your OpenSearch Service index with OpenSearch’s vector search capabilities, finding images that are semantically similar to your text based on the embeddings generated by the Amazon Titan Multimodal Embeddings model.

This solution uses Kinesis Video Streams to handle any volume of streaming video data without you provisioning or managing any servers. Kinesis Video Streams automatically extracts images from video data in real time and delivers the images to a specified S3 bucket. Alternatively, you can use a serverless Lambda function to extract frames of a stored video file with the Python OpenCV library.
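
If you go the Lambda route, the following minimal sketch shows how frames could be extracted with OpenCV and written to Amazon S3. It is illustrative only: the bucket, prefix, and sampling interval are placeholder values, and OpenCV must be packaged with the function (for example, as a Lambda layer or container image).

    import boto3
    import cv2  # packaged with the Lambda function, e.g., via a layer or container image

    s3 = boto3.client("s3")

    def extract_frames(video_path, bucket, prefix, every_n_seconds=5):
        """Save one JPEG frame every `every_n_seconds` seconds to S3 (illustrative sketch)."""
        capture = cv2.VideoCapture(video_path)
        fps = capture.get(cv2.CAP_PROP_FPS) or 30  # fall back if FPS metadata is missing
        step = max(1, int(fps * every_n_seconds))
        frame_index, saved = 0, 0
        while True:
            success, frame = capture.read()
            if not success:
                break
            if frame_index % step == 0:
                ok, jpeg = cv2.imencode(".jpg", frame)
                if ok:
                    key = f"{prefix}/frame-{saved:06d}.jpg"
                    s3.put_object(Bucket=bucket, Key=key, Body=jpeg.tobytes())
                    saved += 1
            frame_index += 1
        capture.release()
        return saved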

The second component converts these extracted frames into vector embeddings directly by calling the Amazon Bedrock API with Amazon Titan Multimodal Embeddings.

Embeddings are a vector representation of your data that capture semantic meaning. Generating embeddings of text and images using the same model helps you measure the distance between vectors to find semantic similarities. For example, you can embed all image metadata and additional text descriptions into the same vector space. Close vectors indicate that the images and text are semantically related. This allows for semantic image search—given a text description, you can find relevant images by retrieving those with the most similar embeddings, as represented in the following visualization.

Visualisation of text and image embeddings
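
To make this concrete, the following sketch calls the Amazon Titan Multimodal Embeddings model through the Amazon Bedrock runtime to embed a frame and a text description, then compares the two vectors with cosine similarity. The model ID and request fields reflect the Titan Multimodal Embeddings G1 interface at the time of writing; verify them against the current Amazon Bedrock documentation, and treat the file name as a placeholder.

    import base64
    import json
    import math
    import boto3

    bedrock = boto3.client("bedrock-runtime")
    MODEL_ID = "amazon.titan-embed-image-v1"  # Titan Multimodal Embeddings G1

    def titan_embedding(text=None, image_path=None):
        """Return an embedding for text, an image, or both (illustrative sketch)."""
        body = {}
        if text:
            body["inputText"] = text
        if image_path:
            with open(image_path, "rb") as f:
                body["inputImage"] = base64.b64encode(f.read()).decode("utf-8")
        response = bedrock.invoke_model(modelId=MODEL_ID, body=json.dumps(body),
                                        contentType="application/json", accept="application/json")
        return json.loads(response["body"].read())["embedding"]

    def cosine_similarity(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

    frame_vector = titan_embedding(image_path="frame-000001.jpg")  # placeholder file name
    text_vector = titan_embedding(text="fire in the warehouse")
    print(cosine_similarity(frame_vector, text_vector))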

Since December 2023, you can use the Amazon Titan Multimodal Embeddings model for use cases like searching images by text, image, or a combination of text and image. It produces 1,024-dimension vectors (by default), enabling highly accurate and fast search capabilities. You can also configure smaller vector sizes to optimize for cost vs. accuracy. For more information, refer to Amazon Titan Multimodal Embeddings G1 model.

The following diagram visualizes the conversion of a picture to a vector representation. You split the video files into frames and save them in an S3 bucket (Step 1). The Amazon Titan Multimodal Embeddings model converts these frames into vector embeddings (Step 2). You store the embeddings of the video frame as a k-nearest neighbors (k-NN) vector in your OpenSearch Service index, together with the references to the video clip and the frame in the S3 bucket (Step 3). You can add further descriptions in an additional field.

Conversion of a picture to a vector representation
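
The following sketch shows what such an index and a single frame document might look like with the opensearch-py client. The endpoint, credentials, index name, and field names are illustrative placeholders; the dimension of the k-NN field must match the embedding length you configured.

    from opensearchpy import OpenSearch

    client = OpenSearch(
        hosts=[{"host": "my-domain-endpoint.us-east-1.es.amazonaws.com", "port": 443}],  # placeholder
        http_auth=("user", "password"),  # placeholder; use SigV4 or fine-grained access control in practice
        use_ssl=True,
    )

    index_name = "video-frames"  # hypothetical index name
    client.indices.create(index=index_name, body={
        "settings": {"index": {"knn": True}},
        "mappings": {"properties": {
            "embedding": {"type": "knn_vector", "dimension": 1024,
                          "method": {"name": "hnsw", "space_type": "cosinesimil", "engine": "nmslib"}},
            "frame_s3_uri": {"type": "keyword"},
            "video_s3_uri": {"type": "keyword"},
            "description": {"type": "text"},
        }},
    })

    frame_vector = [0.0] * 1024  # replace with the vector produced by the embeddings model
    client.index(index=index_name, body={
        "embedding": frame_vector,
        "frame_s3_uri": "s3://my-bucket/frames/frame-000001.jpg",  # placeholder URIs
        "video_s3_uri": "s3://my-bucket/videos/clip-01.mp4",
        "description": "optional additional description",
    })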

The following diagram visualizes the semantic search with natural language processing (NLP). The third component allows you to submit a query in natural language (Step 1) for specific moments or actions in a video, and returns a list of references to frames that are semantically similar to the query. The Amazon Titan Multimodal Embeddings model (Step 2) converts the submitted text query into a vector embedding (Step 3). You use this embedding to look up the most similar embeddings (Step 4). The stored references in the returned results are used to retrieve the frames and video clip for replay in the UI (Step 5).

semantic search with natural language processing
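
Reusing the client and embedding helper from the earlier sketches, the query side reduces to an approximate k-NN search, roughly as follows. The index and field names remain hypothetical, and the choice of k is discussed later in this post.

    query_vector = titan_embedding(text="fire in the warehouse")  # Steps 2-3: embed the text query

    results = client.search(index="video-frames", body={          # Step 4: k-NN lookup
        "size": 5,
        "query": {"knn": {"embedding": {"vector": query_vector, "k": 5}}},
        "_source": ["frame_s3_uri", "video_s3_uri", "description"],
    })

    for hit in results["hits"]["hits"]:                           # Step 5: references for replay
        print(hit["_score"], hit["_source"]["frame_s3_uri"])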

The following diagram shows our solution architecture.

Solution Architecture

The workflow consists of the following steps:

  1. You stream live video to Kinesis Video Streams. Alternatively, upload existing video clips to an S3 bucket.
  2. Kinesis Video Streams extracts frames from the live video to an S3 bucket. Alternatively, a Lambda function extracts frames of the uploaded video clips.
  3. Another Lambda function collects the frames and generates an embedding with Amazon Bedrock.
  4. The Lambda function inserts the reference to the image and video clip together with the embedding as a k-NN vector into an OpenSearch Service index.
  5. You submit a query prompt to the UI.
  6. A new Lambda function converts the query to a vector embedding with Amazon Bedrock.
  7. The Lambda function performs a k-NN search on the OpenSearch Service image index, using cosine similarity between the query vector and the stored frame embeddings, and returns a list of matching frames.
  8. The UI displays the frames and video clips by retrieving the assets from Kinesis Video Streams using the saved references of the returned results. Alternatively, the video clips are retrieved from the S3 bucket.

This solution was created with AWS Amplify. Amplify is a development framework and hosting service that assists frontend web and mobile developers in building secure and scalable applications with AWS tools quickly and efficiently.

Optimize for functionality, accuracy, and cost

Let’s conduct an analysis of this proposed solution architecture to determine opportunities for enhancing functionality, improving accuracy, and reducing costs.

Starting with the ingestion layer, refer to Design considerations for cost-effective video surveillance platforms with AWS IoT for Smart Homes to learn more about cost-effective ingestion into Kinesis Video Streams.

The extraction of video frames in this solution is configured using Amazon S3 delivery with Kinesis Video Streams. A key trade-off to evaluate is determining the optimal frame rate and resolution to meet the use case requirements balanced with overall system resource utilization. The frame extraction rate can range from as high as five frames per second to as low as one frame every 20 seconds. The choice of frame rate can be driven by the business use case, which directly impacts embedding generation and storage in downstream services like Amazon Bedrock, Lambda, Amazon S3, and the Amazon S3 delivery feature, as well as searching within the vector database. Even when uploading pre-recorded videos to Amazon S3, thoughtful consideration should still be given to selecting an appropriate frame extraction rate and resolution. Tuning these parameters allows you to balance your use case accuracy needs with consumption of the mentioned AWS services.
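
For reference, the Amazon S3 delivery settings mentioned above are controlled through the Kinesis Video Streams image generation configuration. The following Boto3 sketch sets illustrative values; the stream name, destination bucket, Region, sampling interval, and resolution are placeholders, so confirm the allowed parameter ranges in the Kinesis Video Streams documentation for your use case.

    import boto3

    kvs = boto3.client("kinesisvideo")

    kvs.update_image_generation_configuration(
        StreamName="my-surveillance-stream",  # placeholder stream name
        ImageGenerationConfiguration={
            "Status": "ENABLED",
            "ImageSelectorType": "PRODUCER_TIMESTAMP",
            "DestinationConfig": {
                "Uri": "s3://my-frame-bucket/frames/",  # placeholder S3 destination
                "DestinationRegion": "us-east-1",
            },
            "SamplingInterval": 20000,        # one frame every 20 seconds, in milliseconds
            "Format": "JPEG",
            "FormatConfig": {"JPEGQuality": "80"},
            "WidthPixels": 1280,              # lower resolution reduces downstream cost
            "HeightPixels": 720,
        },
    )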

The Amazon Titan Multimodal Embeddings model outputs a vector representation with a default embedding length of 1,024 from the input data. This representation carries the semantic meaning of the input and can be compared with other vectors to measure similarity. For the highest accuracy, it’s recommended to use the default embedding length, but this has a direct impact on query performance and storage costs. To increase speed and reduce costs in your production environment, you can explore alternate embedding lengths, such as 256 and 384. Reducing the embedding length means losing some of the semantic context, which has a direct impact on accuracy, but it improves overall speed and optimizes storage costs.
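
With the request format shown earlier, the embedding length is selected through an embeddingConfig block. The following fragment (reusing the Bedrock client and json import from the earlier sketch) requests a 384-dimension embedding; remember that the k-NN field in your OpenSearch Service index must be created with a matching dimension.

    body = {
        "inputText": "fire in the warehouse",
        "embeddingConfig": {"outputEmbeddingLength": 384},  # 256, 384, or 1024
    }
    response = bedrock.invoke_model(modelId="amazon.titan-embed-image-v1",
                                    body=json.dumps(body))
    embedding = json.loads(response["body"].read())["embedding"]  # 384 values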

OpenSearch Service offers on-demand, reserved, and serverless pricing options with general purpose or storage optimized machine types to fit different workloads. To optimize costs, select Reserved Instances to cover your production workload base, and use on-demand, serverless, and convertible reservations to handle spikes and non-production loads. For lower-demand production workloads, a cost-friendly alternative is pgvector with Amazon Aurora PostgreSQL Serverless, which has lower base consumption units than Amazon OpenSearch Serverless, thereby lowering cost.

Determining the optimal value of K in the k-NN algorithm for vector similarity search is significant for balancing accuracy, performance, and cost. A larger K value generally increases accuracy by considering more neighboring vectors, but comes at the expense of higher computational complexity and cost. Conversely, a smaller K leads to faster search times and lower costs, but may lower result quality. When using the k-NN algorithm with OpenSearch Service, it’s essential to carefully evaluate the K parameter based on your application’s priorities—starting with smaller values like K=5 or 10, then iteratively increasing K if higher accuracy is needed.

As part of the solution, we recommend Lambda as the serverless compute option to process frames. With Lambda, you can run code for virtually any type of application or backend service—all with zero administration. Lambda takes care of everything required to run and scale your code with high availability.

With high amounts of video data, you should consider binpacking your frame processing tasks and running a batch computing job to access a large amount of compute resources. The combination of AWS Batch and Amazon Elastic Container Service (Amazon ECS) can efficiently provision resources in response to jobs submitted in order to eliminate capacity constraints, reduce compute costs, and deliver results quickly.

You will incur costs when deploying the GitHub repo in your account. When you are finished examining the example, follow the steps in the Clean up section later in this post to delete the infrastructure and stop incurring charges.

Refer to the README file in the repository to understand the building blocks of the solution in detail.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Deploy the Amplify application

Complete the following steps to deploy the Amplify application:

  1. Clone the repository to your local disk with the following command:
    git clone https://github.com/aws-samples/Serverless-Semantic-Video-Search-Vector-Database-and-a-Multi-Modal-Generative-AI-Embeddings-Model

  2. Change the directory to the cloned repository.
  3. Initialize the Amplify application:
    amplify init

  4. Clean install the dependencies of the web application:
    npm ci

  5. Create the infrastructure in your AWS account:
    amplify push

  6. Run the web application in your local environment:
    npm run dev

Create an application account

Complete the following steps to create an account in the application:

  1. Open the web application with the stated URL in your terminal.
  2. Enter a user name, password, and email address.
  3. Confirm your email address with the code sent to it.

Upload files from your computer

Complete the following steps to upload image and video files stored locally:

  1. Choose File Upload in the navigation pane.
  2. Choose Choose files.
  3. Select the images or videos from your local drive.
  4. Choose Upload Files.

Upload files from a webcam

Complete the following steps to upload images and videos from a webcam:

  1. Choose Webcam Upload in the navigation pane.
  2. Choose Allow when asked for permissions to access your webcam.
  3. Choose to either upload a single captured image or a captured video:
    1. Choose Capture Image and Upload Image to upload a single image from your webcam.
    2. Choose Start Video Capture, Stop Video Capture, and finally Upload Video to upload a video from your webcam.

Search videos

Complete the following steps to search the files and videos you uploaded:

  1. Choose Search in the navigation pane.
  2. Enter your prompt in the Search Videos text field. For example, we ask “Show me a person with a golden ring.”
  3. Lower the confidence parameter closer to 0 if you see fewer results than you were originally expecting.

The following screenshot shows an example of our results.

Example of results

Clean up

Complete the following steps to clean up your resources:

  1. Open a terminal in the directory of your locally cloned repository.
  2. Run the following command to delete the cloud and local resources:
    amplify delete

Conclusion

A multimodal embeddings model has the potential to revolutionize the way industries analyze incidents captured with videos. AWS services and tools can help industries unlock the full potential of their video data and improve their safety, efficiency, and profitability. As the amount of video data continues to grow, the use of multimodal embeddings will become increasingly important for industries looking to stay ahead of the curve. As innovations like Amazon Titan foundation models continue maturing, they will reduce the barriers to using advanced ML and simplify the process of understanding data in context. To stay updated with state-of-the-art functionality and use cases, refer to the following resources:


About the Authors

Thorben Sanktjohanser is a Solutions Architect at Amazon Web Services supporting media and entertainment companies on their cloud journey with his expertise. He is passionate about IoT, AI/ML and building smart home devices. Almost every part of his home is automated, from light bulbs and blinds to vacuum cleaning and mopping.

Talha Chattha is an AI/ML Specialist Solutions Architect at Amazon Web Services, based in Stockholm, serving key customers across EMEA. Talha holds a deep passion for generative AI technologies. He works tirelessly to deliver innovative, scalable, and valuable ML solutions in the space of large language models and foundation models for his customers. When not shaping the future of AI, he explores scenic European landscapes and delicious cuisines.

Victor Wang is a Sr. Solutions Architect at Amazon Web Services, based in San Francisco, CA, supporting innovative healthcare startups. Victor has spent 6 years at Amazon; previous roles include software developer for AWS Site-to-Site VPN, AWS ProServe Consultant for Public Sector Partners, and Technical Program Manager for Amazon RDS for MySQL. His passion is learning new technologies and traveling the world. Victor has flown over a million miles and plans to continue his eternal journey of exploration.

Akshay Singhal is a Sr. Technical Account Manager at Amazon Web Services, based in San Francisco Bay Area, supporting enterprise support customers focusing on the security ISV segment. He provides technical guidance for customers to implement AWS solutions, with expertise spanning serverless architectures and cost-optimization. Outside of work, Akshay enjoys traveling, Formula 1, making short movies, and exploring new cuisines.

Read More

Prioritizing employee well-being: An innovative approach with generative AI and Amazon SageMaker Canvas

In today’s fast-paced corporate landscape, employee mental health has become a crucial aspect that organizations can no longer overlook. Many companies recognize that their greatest asset lies in their dedicated workforce, and each employee plays a vital role in collective success. As such, promoting employee well-being by creating a safe, inclusive, and supportive environment is of utmost importance.

However, quantifying and assessing mental health can be a daunting task. Traditional methods like employee well-being surveys or manual approaches may not always provide the most accurate or actionable insights. In this post, we explore an innovative solution that uses Amazon SageMaker Canvas for mental health assessment at the workplace.

We delve into the following topics:

  • The importance of mental health in the workplace
  • An overview of the SageMaker Canvas low-code no-code platform for building machine learning (ML) models
  • The mental health assessment model:
    • Data preparation using the chat feature
    • Training the model on SageMaker Canvas
    • Model evaluation and performance metrics
  • Deployment and integration:
    • Deploying the mental health assessment model
    • Integrating the model into workplace wellness programs or HR systems

In this post, we use a dataset from a 2014 survey that measures attitudes towards mental health and the frequency of mental health disorders in the tech workplace. We aggregate and prepare the data for an ML model using Amazon SageMaker Data Wrangler for a tabular dataset on SageMaker Canvas. Then we train, build, test, and deploy the model using SageMaker Canvas, without writing any code.

Discover how SageMaker Canvas can revolutionize the way organizations approach employee mental health assessment, empowering them to create a more supportive and productive work environment. Stay tuned for insightful content that could reshape the future of workplace well-being.

Importance of mental health

Maintaining good mental health in the workplace is crucial for both employees and employers. In today’s fast-paced and demanding work environment, the mental well-being of employees can have a significant impact on productivity, job satisfaction, and overall company success. At Amazon, where innovation and customer obsession are at the core of our values, we understand the importance of fostering a mentally healthy workforce.

By prioritizing the mental well-being of our employees, we create an environment where they can thrive and contribute their best. This helps us deliver exceptional products and services. Amazon supports mental health by providing access to resources and support services. All U.S. employees and household members are eligible to receive five free counseling sessions, per issue every year, via Amazon’s Global Employee Assistance Program (EAP), Resources for Living. Employees can also access mental health care 24/7 through a partnership with the app Twill—a digital, self-guided mental health program. Amazon also partners with Brightline, a leading provider in virtual mental health support for children and teens.

Solution overview

SageMaker Canvas brings together a broad set of capabilities to help data professionals prepare, build, train, and deploy ML models without writing any code. SageMaker Data Wrangler has also been integrated into SageMaker Canvas, reducing the time it takes to import, prepare, transform, featurize, and analyze data. In a single visual interface, you can complete each step of a data preparation workflow: data selection, cleansing, exploration, visualization, and processing. You can also extend the over 300 built-in data transformations with custom Spark commands. The built-in Data Quality and Insights report guides you in performing appropriate data cleansing, verifying data quality, and detecting anomalies such as duplicate rows and target leakage. Other analyses are also available to help you visualize and understand your data.

In this post, we try to understand the factors contributing to the mental health of an employee in the tech industry in a systematic manner. We begin by understanding the feature columns, presented in the following table.

Survey Attribute | Survey Attribute Description
Timestamp | Timestamp when survey was taken
Age | Age of person taking survey
Gender | Gender of person taking survey
Country | Country of person taking survey
state | If you live in the United States, which state or territory do you live in?
self_employed | Are you self-employed?
family_history | Do you have a family history of mental illness?
treatment | Have you sought treatment for a mental health condition?
work_interfere | If you have a mental health condition, do you feel that it interferes with your work?
no_employees | How many employees does your company or organization have?
remote_work | Do you work remotely (outside of an office) at least 50% of the time?
tech_company | Is your employer primarily a tech company/organization?
benefits | Does your employer provide mental health benefits?
care_options | Do you know the options for mental health care your employer provides?
wellness_program | Has your employer ever discussed mental health as part of an employee wellness program?
seek_help | Does your employer provide resources to learn more about mental health issues and how to seek help?
anonymity | Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources?
leave | How easy is it for you to take medical leave for a mental health condition?
mental_health_consequence | Do you think that discussing a mental health issue with your employer would have negative consequences?
phys_health_consequence | Do you think that discussing a physical health issue with your employer would have negative consequences?
coworkers | Would you be willing to discuss a mental health issue with your coworkers?
phys_health_interview | Would you bring up a physical health issue with a potential employer in an interview?
mental_vs_physical | Do you feel that your employer takes mental health as seriously as physical health?
obs_consequence | Have you heard of or observed negative consequences for coworkers with mental health conditions in your workplace?
comments | Any additional notes or comments

Prerequisites

You should complete the following prerequisites before building this model:

Log in to SageMaker Canvas

When the initial setup is complete, you can access SageMaker Canvas with any of the following methods, depending on your environment’s setup:

Import the dataset into SageMaker Canvas

In SageMaker Canvas, you can see quick actions to get started building and using ML and generative artificial intelligence (AI) models with a no-code platform. Feel free to explore any of the out-of-the-box models.

We start by creating a data flow. A data flow in SageMaker Canvas is used to build a data preparation pipeline that can be scheduled to automatically import, prepare, and feed data into a model build. With a data flow, you can prepare data using generative AI, over 300 built-in transforms, or custom Spark commands.

Complete the following steps:

  • Choose Prepare and analyze data.
  • For Data flow name, enter a name (for example, AssessingMentalHealthFlow).
  • Choose Create.

SageMaker Data Wrangler will open.

You can import data from multiple sources, ranging from AWS services, such as Amazon Simple Storage Service (Amazon S3) and Amazon Redshift, to third-party or partner services, including Snowflake or Databricks. To learn more about importing data to SageMaker Canvas, see Import data into Canvas.

  • Choose Import data, then choose Tabular.
  • Upload the dataset you downloaded in the prerequisites section.

After a successful import, you will be presented with a preview of the data, which you can browse.

  • Choose Import data to finish this step.

Run a Data Quality and Insights report

After you import the dataset, the SageMaker Data Wrangler data flow will open. You can run a Data Quality and Insights Report, which will perform an analysis of the data to determine potential issues to address during data preparation. Complete the following steps:

  • Choose Run Data quality and insights report.

  • For Analysis name, enter a name.
  • For Target column, choose treatment.
  • For Problem type, select Classification.
  • For Data size, choose Sampled dataset.
  • Choose Create.

You are presented with the generated report, which details any high priority warnings, data issues, and other insights to be aware of as you add data transformations and move along the model building process.

In this specific dataset, we can see that there are 27 features of different types, very little missing data, and no duplicates. To dive deeper into the report, refer to Get Insights On Data and Data Quality. To learn about other available analyses, see Analyze and Visualize.

Prepare your data

As expected in the ML process, your dataset may require transformations to address issues such as missing values and outliers, or to perform feature engineering prior to model building. SageMaker Canvas provides ML data transforms to clean, transform, and prepare your data for model building without having to write code. The transforms used are added to the model recipe, a record of the data preparation done on your data before building the model. You can refer to these advanced transformations and add them as transformation steps within your Data Wrangler flow.

Alternatively, you can use SageMaker Canvas to chat with your data and add transformations. We explore this option with some examples on our sample dataset.

Use the chat feature for exploratory analysis and building transformations

Before you use the chat feature to prepare data, note the following:

  • Chat for data prep requires the AmazonSageMakerCanvasAIServicesAccess policy. For more information, see AWS managed policy: AmazonSageMakerCanvasAIServicesAccess.
  • Chat for data prep requires access to Amazon Bedrock and the Anthropic Claude v2 model within it. For more information, see Model access.
  • You must run SageMaker Canvas data prep in the same AWS Region as the Region where you’re running your model. Chat for data prep is available in the US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) Regions.

To chat with your data, complete the following steps:

  • Open your SageMaker Canvas data flow.
  • Open your dataset by choosing Source or Data types.

  • Choose Chat for data prep and specify your prompts in the chat window.

  • Optionally, if an analysis has been generated by your query, choose Add to analyses to reference it for later.
  • Optionally, if you’ve transformed your data using a prompt, do the following:
  1. Choose Preview to view the results.
  2. Optionally modify the code in the transform and choose Update.
  3. If you’re happy with the results of the transform, choose Add to steps to add it to the steps pane.

Let’s try a few exploratory analyses and transformations through the chat feature.

In the following example, we ask “How many rows does the dataset have?”

In the following example, we drop the columns Timestamp, Country, state, and comments, because these features will have the least impact on the classification in our model. Choose View code to see the generated Spark code that performs the transformation, then choose Add to steps to add the transformation to the data flow.
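
The generated code varies with the prompt, but a custom Spark transform in SageMaker Data Wrangler typically operates on a DataFrame named df, so a column-drop step might look roughly like the following (an illustrative sketch, not the exact code Canvas generates).

    # Drop columns expected to carry little signal for the classifier
    df = df.drop("Timestamp", "Country", "state", "comments")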

You can provide a name and choose Update to save the data flow.

In the next example, we ask “Show me all unique ages sorted.”

Some ages are negative, so we should filter for valid ages. We drop rows with an age below 0 or above 100 and add this transformation to the steps.

In the following example, we ask “Create a bar chart for null values in the dataset.”

Then we ask for a bar chart for the treatment column.

In the following example, we ask for a bar chart for the work_interfere column.

In the column work_interfere, we replace the NA values with “Don’t know.” We want to make the model weight missing values just as it weights people that have replied “Don’t know.”

For the column self_employed, we want to replace NA with “No” to make the model weight missing values just as it weights people that have replied “No.”

You can choose to add any other transformations as needed. If you’ve followed the preceding transformations, your steps should look like the following screenshot.

Perform an analysis on the transformed data

Now that transformations have been done on the data, you may want to perform analyses to make sure they haven’t affected data integrity.

To do so, navigate to the Analyses tab to create an analysis. For this example, we create a feature correlation analysis with the correlation type linear.

The analysis report will generate a correlation matrix. The correlation matrix measures the positive or negative correlation between pairs of features. A value closer to 1 means positive correlation, and a value closer to -1 means negative correlation.

Linear feature correlation is based on Pearson’s correlation. To find the relationship between a numeric variable (like age or income) and a categorical variable (like gender or education level), we first assign numeric values to the categories in a way that allows them to best predict the numeric variable. Then we calculate the correlation coefficient, which measures how strongly the two variables are related.

Linear categorical to categorical correlation is not supported.

Numeric to numeric correlation is in the range [-1, 1], where 0 implies no correlation, 1 implies perfect correlation, and -1 implies perfect inverse correlation. Numeric to categorical and categorical to categorical correlations are in the range [0, 1], where 0 implies no correlation and 1 implies perfect correlation.

Features that are not either numeric or categorical are ignored.
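
As a rough illustration of the numeric-to-categorical approach described above (not the exact implementation used by SageMaker Canvas), you can encode each category by the mean of the numeric variable within that category and then compute Pearson's correlation on the encoded values.

    import pandas as pd

    def numeric_categorical_correlation(numeric: pd.Series, categorical: pd.Series) -> float:
        """Encode each category by the mean of the numeric variable, then correlate."""
        encoded = categorical.map(numeric.groupby(categorical).mean())
        return abs(numeric.corr(encoded))  # falls in [0, 1], as described above

    # Hypothetical usage with the survey columns, assuming a DataFrame named survey_df:
    # print(numeric_categorical_correlation(survey_df["Age"], survey_df["Gender"]))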

The following table lists for each feature what is the most correlated feature to it.

Feature | Most Correlated Feature | Correlation
Age (numeric) | Gender (categorical) | 0.248216
Gender (categorical) | Age (numeric) | 0.248216
seek_help (categorical) | Age (numeric) | 0.175808
no_employees (categorical) | Age (numeric) | 0.166486
benefits (categorical) | Age (numeric) | 0.157729
remote_work (categorical) | Age (numeric) | 0.139105
care_options (categorical) | Age (numeric) | 0.1183
wellness_program (categorical) | Age (numeric) | 0.117175
phys_health_consequence (categorical) | Age (numeric) | 0.0961159
work_interfere (categorical) | Age (numeric) | 0.0797424
treatment (categorical) | Age (numeric) | 0.0752661
mental_health_consequence (categorical) | Age (numeric) | 0.0687374
obs_consequence (categorical) | Age (numeric) | 0.0658778
phys_health_interview (categorical) | Age (numeric) | 0.0639178
self_employed (categorical) | Age (numeric) | 0.0628861
tech_company (categorical) | Age (numeric) | 0.0609773
leave (categorical) | Age (numeric) | 0.0601671
mental_health_interview (categorical) | Age (numeric) | 0.0600251
mental_vs_physical (categorical) | Age (numeric) | 0.0389857
anonymity (categorical) | Age (numeric) | 0.038797
coworkers (categorical) | Age (numeric) | 0.0181036
supervisor (categorical) | Age (numeric) | 0.0167315
family_history (categorical) | Age (numeric) | 0.00989271

The following figure shows our correlation matrix.

You can explore more analyses of different types. For more details, see Explore your data using visualization techniques.

Export the dataset and create a model

Return to the main data flow and run the SageMaker Data Wrangler validation flow. Upon successful validation, you are ready to export the dataset for model training.

Next, you export your dataset and build an ML model on top of it. Complete the following steps:

  • Open the expanded menu in the final transformation and choose Create model.

  • For Dataset name, enter a name.
  • Choose Export.

At this point, your mental health assessment dataset is ready for model training and testing.

  • Choose Create model.

  • For Model name, enter a name.
  • For Problem type, select Predictive analysis.

SageMaker Canvas suggested this based on the dataset, but you can override this for your own experimentation. For more information about ready-to-use models provided by SageMaker Canvas, see Use Ready-to-use models.

  • Choose Create.

  • For Target column, choose treatment as the column to predict.

Because Yes or No is predicted, SageMaker Canvas detected this is a two-category prediction model.

  • Choose Configure model to set configurations.

  • For Objective metric, leave as the default F1.

F1 averages two important metrics: precision and recall.

  • For Training method, select Auto.

This option selects the algorithm most relevant to your dataset and the best range of hyperparameters to tune model candidates. Alternatively, you could use the ensemble or hyperparameter optimization training options. For more information, see Training modes and algorithm support.

  • For Data split, specify an 80/20 configuration for training and validation, respectively.

  • Choose Save and then Preview model to generate a preview.

This preview runs on a subset of data and provides information on estimated model accuracy and feature importance. Based on the results, you may still apply additional transformations to improve the estimated accuracy.

Although low impact features might add noise to the model, these may still be useful to describe situations specific to your use case. Always combine predictive power with your own context to determine which features to include.

You’re now ready to build the full model with either Quick build or Standard build. Quick build only supports datasets with fewer than 50,000 rows and prioritizes speed over accuracy, training fewer combinations of models and hyperparameters, for rapid prototyping or proving out value. Standard build prioritizes accuracy and is necessary for exporting the full Jupyter notebook used for training.

  • For this post, choose Standard build.

To learn more about how SageMaker Canvas uses training and validation datasets, see Evaluating Your Model’s Performance in Amazon SageMaker Canvas and SHAP Baselines for Explainability.

Your results may differ from those in this post. Machine learning introduces stochasticity in the model training process, which can lead to slight variations.

Here, we’ve built a model that will predict with about 87% accuracy whether an individual will seek mental health treatment. At this stage, think about how you could achieve practical impact with the machine learning model. For example, an organization may consider how to apply the model to preemptively support individuals whose attributes suggest they would seek treatment.

Review model metrics

Let’s focus on the first tab, Overview. Here, Column impact is the estimated importance of each attribute in predicting the target. Information here can help organizations gain insights that lead to actions based on the model. For example, we see that the work_interfere column has the most significant impact in predicting treatment. Additionally, better benefits and care_options increase the likelihood of employees opting in to treatment.

On the Scoring tab, we can visualize a Sankey (or ribbon) plot of the distribution of predicted values with respect to actual values, providing insight into how the model performed during validation.

For more detailed insights, we look at the Advanced metrics tab for metric values the model may have not been optimized for, the confusion matrix, and precision recall curve.

The advanced metrics suggest we can trust the resulting model. False positives (predicting an employee will opt in for treatment when they actually don’t) and false negatives (predicting an employee will opt out when they actually opt in) are low. High numbers for either may make us skeptical about the current build and more likely to revisit previous steps.

Test the model

Now let’s use the model for making predictions. Choose Predict to navigate to the Predict tab. SageMaker Canvas allows you to generate predictions in two forms:

  • Single prediction (single “what-if scenario”)
  • Batch prediction (multiple scenarios using a CSV file)

For a first test, let’s try a single prediction. Wait a few seconds for the model to load, and now you’re ready to generate new inferences. You can change the values to experiment with the attributes and their impact.

For example, let’s make the following updates:

  • Change work_interfere from Often to Sometimes
  • Change benefits from Yes to No

Choose Update and see if the treatment prediction is affected.

In SageMaker Canvas, you can generate batch predictions either manually or automatically on a schedule. Let’s try the manual approach. To learn about automating batch predictions, refer to Automate batch predictions.

  • In practice, use a dataset different from the training data for testing predictions. For this example, though, let’s use the same file as before. Be sure to remove the work_interfere column.
  • Choose Batch prediction and upload the downloaded file.
  • Choose Generate predictions.
  • When it’s complete, choose View to see the predictions.

Deploy the model

The final (optional) step of the SageMaker Canvas workflow for ML models is deploying the model. This uses SageMaker real-time inference endpoints to host the SageMaker Canvas model and expose an HTTPS endpoint for use by applications or developers.

  1. On the Deploy tab, choose Create deployment.
  2. For Deployment name, enter a name.
  3. For Instance type, choose an instance (for this post, ml.m5.2xlarge).
  4. Set Instance count to 1.
  5. Choose Deploy.

This instance configuration is sufficient for the demo. You can change the configuration later from the SageMaker Canvas UI or using SageMaker APIs. To learn more about auto scaling such workloads, see Automatically Scale Amazon SageMaker Models.

After the deployment is successful, you can invoke the endpoint using AWS SDKs or direct HTTPs calls. For more information, see Deploy models for real-time inference.
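
For example, a minimal Boto3 sketch might look like the following. The endpoint name is a placeholder, the CSV row must match your feature columns and their order, and the exact response format depends on the deployed model.

    import boto3

    runtime = boto3.client("sagemaker-runtime")

    # One survey response as a CSV row; replace with your real columns, in training order
    payload = "37,Female,No,Yes,Sometimes,26-100,No,Yes"

    response = runtime.invoke_endpoint(
        EndpointName="canvas-mental-health-deployment",  # placeholder endpoint name
        ContentType="text/csv",
        Body=payload,
    )
    print(response["Body"].read().decode("utf-8"))  # prediction; format depends on the model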

To learn more about model deployment, refer to Deploy your Canvas models to a SageMaker Endpoint and Deploy models for real-time inference.

Clean up

Make sure to log out from SageMaker Canvas by choosing Log out. Logging out of the SageMaker Canvas application releases all resources used by the workspace instance, avoiding additional unintended charges.

Summary

Mental health is a dynamic and evolving field, with new research and insights constantly emerging. Staying up to date with the latest developments and best practices can be challenging, especially in a public forum. Additionally, when discussing mental health, it’s essential to approach the topic with sensitivity, respect, and a commitment to providing accurate and helpful information.

In this post, we showcased an ML approach to building a mental health model using a sample dataset and SageMaker Canvas, a low-code no-code platform from AWS. This can serve as guidance for organizations looking to explore similar solutions for their specific needs. Implementing AI to assess employee mental health and offer preemptive support can yield a myriad of benefits. By promoting detection of potential mental health needs, intervention can be more personalized, reducing the risk of more serious complications in the future. A proactive approach can also enhance employee morale and productivity, mitigate the likelihood of absenteeism and turnover, and ultimately lead to a healthier and more resilient workforce. Overall, using AI for mental health prediction and support signifies a commitment to nurturing a supportive work environment where employees can thrive.

To explore more about SageMaker Canvas with industry-specific use cases, explore a hands-on workshop. To learn more about SageMaker Data Wrangler in SageMaker Canvas, refer to Prepare Data. You can also refer to the following YouTube video to learn more about the end-to-end ML workflow with SageMaker Canvas.

Although this post provides a technical perspective, we strongly encourage readers who are struggling with mental health issues to seek professional help. Remember, there is always help available for those who ask.

Together, let’s take a proactive step towards empowering mental health awareness and supporting those in need.


About the Authors

Rushabh Lokhande is a Senior Data & ML Engineer with AWS Professional Services Analytics Practice. He helps customers implement big data, machine learning, analytics solutions, and generative AI implementations. Outside of work, he enjoys spending time with family, reading, running, and playing golf.

Bruno Klein is a Senior Machine Learning Engineer with AWS Professional Services Analytics Practice. He helps customers implement big data analytics solutions and generative AI implementations. Outside of work, he enjoys spending time with family, traveling, and trying new food.

Ryan Gomes is a Senior Data & ML Engineer with AWS Professional Services Analytics Practice. He is passionate about helping customers achieve better outcomes through analytics, machine learning, and generative AI solutions in the cloud. Outside of work, he enjoys fitness, cooking, and spending quality time with friends and family.

Read More

Introducing Aurora: The first large-scale foundation model of the atmosphere

satellite image of Storm Ciarán

When Storm Ciarán battered northwestern Europe in November 2023, it left a trail of destruction. The low-pressure system associated with Storm Ciarán set new records for England, marking it as an exceptionally rare meteorological event. The storm’s intensity caught many off guard, exposing the limitations of current weather-prediction models and highlighting the need for more accurate forecasting in the face of climate change. As communities grappled with the aftermath, the urgent question arose: How can we better anticipate and prepare for such extreme weather events? 

A recent study by Charlton-Perez et al. (2024) underscored the challenges faced by even the most advanced AI weather-prediction models in capturing the rapid intensification and peak wind speeds of Storm Ciarán. To help address those challenges, a team of Microsoft researchers developed Aurora, a cutting-edge AI foundation model that can extract valuable insights from vast amounts of atmospheric data. Aurora presents a new approach to weather forecasting that could transform our ability to predict and mitigate the impacts of extreme events—including being able to anticipate the dramatic escalation of an event like Storm Ciarán.  

A flexible 3D foundation model of the atmosphere

Figure 1: Aurora is a 1.3 billion parameter foundation model for high-resolution forecasting of weather and atmospheric processes. Aurora is a flexible 3D Swin Transformer with 3D Perceiver-based encoders and decoders. At pretraining time, Aurora is optimized to minimize a loss on multiple heterogeneous datasets with different resolutions, variables, and pressure levels. The model is then fine-tuned in two stages: (1) short-lead time fine-tuning of the pretrained weights and (2) long-lead time (rollout) fine-tuning using Low Rank Adaptation (LoRA). The fine-tuned models are then deployed to tackle a diverse collection of operational forecasting scenarios at different resolutions.

Aurora’s effectiveness lies in its training on more than a million hours of diverse weather and climate simulations, which enables it to develop a comprehensive understanding of atmospheric dynamics. This allows the model to excel at a wide range of prediction tasks, even in data-sparse regions or extreme weather scenarios. By operating at a high spatial resolution of 0.1° (roughly 11 km at the equator), Aurora captures intricate details of atmospheric processes, providing more accurate operational forecasts than ever before—and at a fraction of the computational cost of traditional numerical weather-prediction systems. We estimate that the computational speed-up that Aurora can bring over the state-of-the-art numerical forecasting system Integrated Forecasting System (IFS) is ~5,000x. 

Beyond its impressive accuracy and efficiency, Aurora stands out for its versatility. The model can forecast a broad range of atmospheric variables, from temperature and wind speed to air-pollution levels and concentrations of greenhouse gases. Aurora’s architecture is designed to handle heterogeneous gold standard inputs and generate predictions at different resolutions and levels of fidelity. The model consists of a flexible 3D Swin Transformer with Perceiver-based encoders and decoders, enabling it to process and predict a range of atmospheric variables across space and pressure levels. By pretraining on a vast corpus of diverse data and fine-tuning on specific tasks, Aurora learns to capture intricate patterns and structures in the atmosphere, allowing it to excel even with limited training data when it is being fine-tuned for a specific task. 

Fast prediction of atmospheric chemistry and air pollution

Figure 2: Aurora outperforms operational CAMS across many targets. (a) Sample predictions for total column nitrogen dioxide by Aurora compared to CAMS analysis. Aurora was initialized with CAMS analysis at 1 Sep 2022 00 UTC. Predicting atmospheric gases correctly is extremely challenging due to their spatially heterogeneous nature. In particular, nitrogen dioxide, like most variables in CAMS, is skewed toward high values in areas with large anthropogenic emissions, such as densely populated areas in East Asia. In addition, it exhibits a strong diurnal cycle; e.g., sunlight reduces background levels via a process called photolysis. Aurora accurately captures both the extremes and background levels. (b) Latitude-weighted root mean square error (RMSE) of Aurora relative to CAMS, where negative values (blue) mean that Aurora is better. The RMSEs are computed over the period Jun 2022 to Nov 2022 inclusive. Aurora matches or outperforms CAMS on 74% of the targets.

A prime example of Aurora’s versatility is its ability to forecast air-pollution levels using data from the Copernicus Atmosphere Monitoring Service (CAMS), a notoriously difficult task due to the complex interplay of atmospheric chemistry, weather patterns, and human activities, as well as the highly heterogeneous nature of CAMS data. By leveraging its flexible encoder-decoder architecture and attention mechanisms, Aurora effectively processes and learns from this challenging data, capturing the unique characteristics of air pollutants and their relationships with meteorological variables. This enables Aurora to produce accurate five-day global air-pollution forecasts at 0.4° spatial resolution, outperforming state-of-the-art atmospheric chemistry simulations on 74% of all targets, demonstrating its remarkable adaptability and potential to tackle a wide range of environmental prediction problems, even in data-sparse or highly complex scenarios. 

Data diversity and model scaling improve atmospheric forecasting

One of the key findings of this study is that pretraining on diverse datasets significantly improves Aurora’s performance compared to training on a single dataset. By incorporating data from climate simulations, reanalysis products, and operational forecasts, Aurora learns a more robust and generalizable representation of atmospheric dynamics. It is thanks to its scale and diverse pretraining data corpus that Aurora is able to outperform state-of-the-art numerical weather-prediction models and specialized deep-learning approaches across a wide range of tasks and resolutions.

Figure 3: Pretraining on diverse data and increasing model size improves performance. (a) Performance versus ERA5 2021 at 6h lead time for models pretrained on different dataset configurations (i.e., no fine-tuning) labeled by C1-C4. The root mean square errors (RMSEs) are normalized by the performance of the ERA5-pretrained model (C1). Adding low-fidelity simulation data from CMIP6 (i.e., CMCC and IFS-HR) improves performance almost uniformly (C2). Adding even more simulation data improves performance further on most surface variables and for the atmospheric levels present in this newly added data (C3). Finally, configuration C4, which contains good coverage of the entire atmosphere and also contains analysis data from GFS, achieves the best overall performance with improvements across the board. (b) Pretraining on many diverse data sources improves the forecasting of extreme values at 6h lead time across all surface variables of IFS-HRES 2022. Additionally, the results also hold on wind speed, which is a nonlinear function of 10U and 10V. (c) Bigger models obtain lower validation loss for the same amount of GPU hours. We fit a power law that indicates a 5% reduction in the validation loss for every doubling of the model size.

A direct consequence of Aurora’s scale, both in terms of architecture design and training data corpus, as well as its pretraining and fine-tuning protocols, is its superior performance over the best specialized deep learning models. As an additional validation of the benefits of fine-tuning a large model pretrained on many datasets, we compare Aurora against GraphCast — pretrained only on ERA5 and currently considered the most skillful AI model at 0.25-degree resolution and lead times up to five days. Additionally, we include IFS HRES in this comparison, the gold standard in numerical weather prediction. We show that Aurora outperforms both when measured against analysis, weather station observations, and extreme values. 

Figure 4: Aurora outperforms operational GraphCast across the vast majority of targets. (a) Scorecard versus GraphCast at 0.25-degrees resolution. Aurora matches or outperforms GraphCast on 94% of targets. Aurora obtains the biggest gains (40%) over GraphCast in the upper atmosphere, where GraphCast performance is known to be poor. Large improvements up to 10%-15% are observed at short and long lead times. The two models are closest to each other in the lower atmosphere at the 2-3 day lead time, which corresponds to the lead time GraphCast was rollout-finetuned on. At the same time, GraphCast shows slightly better performance up to five days and at most levels on specific humidity (Q). (b) Root mean square error (RMSE) and mean absolute error (MAE) for Aurora, GraphCast, and IFS-HRES as measured by global weather stations during 2022 for wind speed (left two panels) and surface temperature (right two panels). (c) Thresholded RMSE for Aurora, GraphCast and IFS-HRES normalized by IFS-HRES performance. Aurora demonstrates improved prediction for the extreme values, or tails, of the surface variable distributions. In each plot values to the right of the center line are cumulative RMSEs for targets found to sit above the threshold, and those to the left represent target values sitting below the threshold.

A paradigm shift in Earth system modeling 

The implications of Aurora extend far beyond atmospheric forecasting. By demonstrating the power of foundation models in the Earth sciences, this research paves the way for the development of comprehensive models that encompass the entire Earth system. The ability of foundation models to excel at downstream tasks with scarce data could democratize access to accurate weather and climate information in data-sparse regions, such as the developing world and polar regions. This could have far-reaching impacts on sectors like agriculture, transportation, energy harvesting, and disaster preparedness, enabling communities to better adapt to the challenges posed by climate change. 

As the field of AI-based environmental prediction evolves, we hope Aurora will serve as a blueprint for future research and development. The study highlights the importance of diverse pretraining data, model scaling, and flexible architectures in building powerful foundation models for the Earth sciences. With continued advancements in computational resources and data availability, we can envision a future where foundation models like Aurora become the backbone of operational weather and climate prediction systems, providing timely, accurate, and actionable insights to decision-makers and the public worldwide. 

Acknowledgements

We are grateful for the contributions of Cristian Bodnar, a core contributor to this project.

Digital Bank Debunks Financial Fraud With Generative AI

European neobank bunq is debunking financial fraudsters with the help of NVIDIA accelerated computing and AI.

Dubbed “the bank of the free,” bunq offers online banking anytime, anywhere. Through the bunq app, users can handle all their financial needs exclusively online, without needing to visit a physical bank.

With more than 12 million customers and 8 billion euros’ worth of deposits made to date, bunq has become one of the largest neobanks in the European Union. Founded in 2012, it was the first bank to obtain a European banking license in over three decades.

To meet growing customer needs, bunq turned to generative AI to help detect fraud and money laundering. NVIDIA accelerated computing dramatically sped up the training of its automated transaction-monitoring system.

“AI has enormous potential to help humanity in so many ways, and this is a great example of how human intelligence can be coupled with AI,” said Ali el Hassouni, head of data and AI at bunq.

Faster Fraud Detection

Financial fraud is more prevalent than ever, el Hassouni said in a recent talk at NVIDIA GTC.

Traditional transaction-monitoring systems are rules based, meaning algorithms flag suspicious transactions according to a set of criteria that determine if an activity presents risk of fraud or money laundering. These criteria must be manually set, resulting in high false-positive rates and making such systems labor intensive and difficult to scale.

Instead, using supervised and unsupervised learning, bunq’s AI-powered transaction-monitoring system is completely automated and easily scalable.

Bunq achieved this using NVIDIA GPUs, which accelerated its data processing pipeline more than 5x.

In addition, compared with previous methods, bunq trained its fraud-detection model nearly 100x faster using the open-source NVIDIA RAPIDS suite of GPU-accelerated data science libraries.
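
The article doesn't share bunq's code, but the sketch below illustrates what a GPU-accelerated training step can look like with RAPIDS, using cuDF for data loading and cuML's scikit-learn-style estimators. The file name, feature columns, and label are hypothetical stand-ins, not details of bunq's pipeline.

```python
# Illustrative sketch only -- not bunq's actual pipeline. The file name,
# column names, and labels are hypothetical.
import cudf
from cuml.ensemble import RandomForestClassifier

# Load historical transactions into GPU memory (any cuDF-readable source works).
tx = cudf.read_parquet("transactions.parquet")

features = ["amount", "merchant_risk", "hour_of_day", "country_mismatch"]
X = tx[features].astype("float32")
y = tx["is_fraud"].astype("int32")  # hypothetical label column

# Simple index-based split keeps the example minimal.
n_train = int(len(tx) * 0.8)
X_train, X_test = X.iloc[:n_train], X.iloc[n_train:]
y_train, y_test = y.iloc[:n_train], y.iloc[n_train:]

# Train a GPU-accelerated classifier through a scikit-learn-style API.
model = RandomForestClassifier(n_estimators=200, max_depth=12)
model.fit(X_train, y_train)

# Score held-out transactions and count how many would be flagged for review.
preds = model.predict(X_test)
print(f"Flagged {int(preds.sum())} of {len(X_test)} transactions for review")
```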

RAPIDS is part of the NVIDIA AI Enterprise software platform, which accelerates data science pipelines and streamlines the development and deployment of production-grade generative AI applications.

“We chose NVIDIA’s advanced, GPU-optimized software, as it enables us to use larger datasets and speed the training of new models — sometimes by an order of magnitude — resulting in improved model accuracy and reduced false positives,” said el Hassouni.

AI Across the Bank

Bunq is seeking to tap AI’s potential across its operations.

“We’re constantly looking for new ways to apply AI for the benefit of our users,” el Hassouni said. “More than half of our user tickets are handled automatically. We also use AI to spot fake IDs when onboarding new users, automate our marketing efforts and much more.”

Finn, a personal AI assistant available to bunq customers, is powered by the company’s proprietary large language model and generative AI. It can answer user questions like, “How much did I spend on groceries last month?” and “What’s the name of the Indian restaurant I ate at last week?”

The company is exploring NVIDIA NeMo Retriever, a collection of generative AI microservices available in early access, to further improve Finn’s accuracy. NeMo Retriever is a part of NVIDIA NIM inference microservices, which provide models as optimized containers, available with NVIDIA AI Enterprise.

“Our initial testing of NeMo Retriever embedding NIM has been extremely positive, and our collaboration with NVIDIA on LLMs is poised to help us to take Finn to the next level and enhance customer experience,” el Hassouni said. 
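
As a rough illustration of the retrieval piece, the snippet below assumes an embedding NIM deployed locally behind an OpenAI-compatible /v1/embeddings endpoint. The URL, model identifier, and input_type field are assumptions for the sketch, not details of bunq's deployment.

```python
# Hypothetical sketch: embedding a user question with a retrieval-embedding
# NIM. The endpoint URL, model identifier and "input_type" field are
# assumptions; check them against the microservice's own documentation.
import requests

EMBEDDINGS_URL = "http://localhost:8000/v1/embeddings"  # assumed local deployment

payload = {
    "model": "nvidia/nv-embedqa-e5-v5",  # placeholder model identifier
    "input": ["How much did I spend on groceries last month?"],
    "input_type": "query",               # retrieval models often distinguish queries from passages
}

resp = requests.post(EMBEDDINGS_URL, json=payload, timeout=30)
resp.raise_for_status()
vector = resp.json()["data"][0]["embedding"]
print(f"Embedding dimension: {len(vector)}")
```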

Plus, for the digital bank’s marketing efforts, AI helps analyze consumer engagement metrics to inform future campaigns.

“We’re creating a borderless banking experience for our users, always keeping them at the heart of everything we do,” el Hassouni said.

Watch bunq’s NVIDIA GTC session on demand and subscribe to NVIDIA financial services news.

Learn more about AI and financial services at Money20/20 Europe, a fintech conference running June 4-6 in Amsterdam, where NVIDIA will host an AI Summit in collaboration with AWS, and where bunq will present on a panel about AI for fraud detection.

‘Accelerate Everything,’ NVIDIA CEO Says Ahead of COMPUTEX

“Generative AI is reshaping industries and opening new opportunities for innovation and growth,” NVIDIA founder and CEO Jensen Huang said in an address ahead of this week’s COMPUTEX technology conference in Taipei.

“Today, we’re at the cusp of a major shift in computing,” Huang told the audience, clad in his trademark black leather jacket. “The intersection of AI and accelerated computing is set to redefine the future.”

Huang spoke ahead of one of the world’s premier technology conferences to an audience of more than 6,500 industry leaders, press, entrepreneurs, gamers, creators and AI enthusiasts gathered at the glass-domed National Taiwan University Sports Center set in the verdant heart of Taipei.

The theme: NVIDIA accelerated platforms are in full production, whether through AI PCs and consumer devices featuring a host of NVIDIA RTX-powered capabilities or enterprises building and deploying AI factories with NVIDIA’s full-stack computing platform.

“The future of computing is accelerated,” Huang said. “With our innovations in AI and accelerated computing, we’re pushing the boundaries of what’s possible and driving the next wave of technological advancement.”
 

‘One-Year Rhythm’

More’s coming, with Huang revealing a roadmap for new semiconductors that will arrive on a one-year rhythm. Revealed for the first time, the Rubin platform will succeed the upcoming Blackwell platform, featuring new GPUs, a new Arm-based CPU — Vera — and advanced networking with NVLink 6, CX9 SuperNIC and the X1600 converged InfiniBand/Ethernet switch.

“Our company has a one-year rhythm. Our basic philosophy is very simple: build the entire data center scale, disaggregate and sell to you parts on a one-year rhythm, and push everything to technology limits,” Huang explained.

NVIDIA’s creative team used AI tools from members of the NVIDIA Inception startup program, built on NVIDIA NIM and NVIDIA’s accelerated computing, to create the COMPUTEX keynote. Packed with demos, this showcase highlighted these innovative tools and the transformative impact of NVIDIA’s technology.

‘Accelerated Computing Is Sustainable Computing’

NVIDIA is driving down the cost of turning data into intelligence, Huang explained as he began his talk.

“Accelerated computing is sustainable computing,” he emphasized, outlining how the combination of GPUs and CPUs can deliver up to a 100x speedup while only increasing power consumption by a factor of three, achieving 25x more performance per Watt over CPUs alone.

“The more you buy, the more you save,” Huang noted, highlighting this approach’s significant cost and energy savings.

Industry Joins NVIDIA to Build AI Factories to Power New Industrial Revolution

Leading computer manufacturers, particularly from Taiwan, the global IT hub, have embraced NVIDIA GPUs and networking solutions. Top companies include ASRock Rack, ASUS, GIGABYTE, Ingrasys, Inventec, Pegatron, QCT, Supermicro, Wistron and Wiwynn, which are creating cloud, on-premises and edge AI systems.

The NVIDIA MGX modular reference design platform now supports Blackwell, including the GB200 NVL2 platform, designed for optimal performance in large language model inference, retrieval-augmented generation and data processing.

AMD and Intel are supporting the MGX architecture with plans to deliver, for the first time, their own CPU host processor module designs. Any server system builder can use these reference designs to save development time while ensuring consistency in design and performance.

Next-Generation Networking with Spectrum-X

In networking, Huang unveiled plans for the annual release of Spectrum-X products to cater to the growing demand for high-performance Ethernet networking for AI.

NVIDIA Spectrum-X, the first Ethernet fabric built for AI, delivers 1.6x the network performance of traditional Ethernet fabrics. It accelerates the processing, analysis and execution of AI workloads and, in turn, the development and deployment of AI solutions.

CoreWeave, GMO Internet Group, Lambda, Scaleway, STPX Global and Yotta are among the first AI cloud service providers embracing Spectrum-X to bring extreme networking performance to their AI infrastructures.

NVIDIA NIM to Transform Millions Into Gen AI Developers

With NVIDIA NIM, the world’s 28 million developers can now easily create generative AI applications. NIM — inference microservices that provide models as optimized containers — can be deployed on clouds, data centers or workstations.

NIM also enables enterprises to maximize their infrastructure investments. For example, running Meta Llama 3-8B in a NIM produces up to 3x more generative AI tokens on accelerated infrastructure than without NIM.
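
For a sense of what that looks like in practice, here is a minimal sketch assuming a Llama 3 8B NIM running locally on its default port and exposing an OpenAI-compatible API; the base URL and served model name are assumptions to adapt to your own deployment.

```python
# Minimal sketch: calling a locally deployed Llama 3 8B NIM through its
# OpenAI-compatible API. The base URL and served model name are assumptions
# about a default deployment; adjust both for your environment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")  # local NIMs typically ignore the key

completion = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # model name as served by the container (assumed)
    messages=[{"role": "user", "content": "Explain in two sentences what an AI factory is."}],
    max_tokens=128,
    temperature=0.2,
)
print(completion.choices[0].message.content)
```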


Nearly 200 technology partners — including Cadence, Cloudera, Cohesity, DataStax, NetApp, Scale AI, and Synopsys — are integrating NIM into their platforms to speed generative AI deployments for domain-specific applications, such as copilots, code assistants, digital human avatars and more. Hugging Face is now offering NIM — starting with Meta Llama 3.

“Today we just posted up in Hugging Face the Llama 3 fully optimized, it’s available there for you to try. You can even take it with you,” Huang said. “So you could run it in the cloud, run it in any cloud, download this container, put it into your own data center, and you can host it to make it available for your customers.”

NVIDIA Brings AI Assistants to Life With GeForce RTX AI PCs

NVIDIA’s RTX AI PCs, powered by RTX technologies, are set to revolutionize consumer experiences with over 200 RTX AI laptops and more than 500 AI-powered apps and games.

The RTX AI Toolkit and newly available PC-based NIM inference microservices for the NVIDIA ACE digital human platform underscore NVIDIA’s commitment to AI accessibility.

Project G-Assist, an RTX-powered AI assistant technology demo, was also announced, showcasing context-aware assistance for PC games and apps.

And Microsoft and NVIDIA are collaborating to help developers bring new generative AI capabilities to their Windows native and web apps, with easy API access to RTX-accelerated small language models (SLMs) that enable retrieval-augmented generation (RAG) capabilities running on-device as part of Windows Copilot Runtime.

NVIDIA Robotics Adopted by Industry Leaders

NVIDIA is spearheading the $50 trillion industrial digitization shift, with sectors embracing autonomous operations and digital twins — virtual models that enhance efficiency and cut costs. Through its Developer Program, NVIDIA offers access to NIM, fostering AI innovation.

Taiwanese manufacturers are transforming their factories using NVIDIA’s technology, with Huang showcasing Foxconn’s use of NVIDIA Omniverse, Isaac and Metropolis to create digital twins, combining vision AI and robot development tools for enhanced robotic facilities.

“The next wave of AI is physical AI. AI that understands the laws of physics, AI that can work among us,” Huang said, emphasizing the importance of robotics and AI in future developments.

The NVIDIA Isaac platform provides a robust toolkit for developers to build AI robots, including AMRs, industrial arms and humanoids, powered by AI models and supercomputers like Jetson Orin and Thor.

“Robotics is here. Physical AI is here. This is not science fiction, and it’s being used all over Taiwan. It’s just really, really exciting,” Huang added.

Global electronics giants are integrating NVIDIA’s autonomous robotics into their factories, leveraging simulation in Omniverse to test and validate this new wave of AI for the physical world. This includes over 5 million preprogrammed robots worldwide.

“All the factories will be robotic. The factories will orchestrate robots, and those robots will be building products that are robotic,” Huang explained.

Huang emphasized NVIDIA Isaac’s role in boosting factory and warehouse efficiency, with global leaders like BYD Electronics, Siemens, Teradyne Robotics and Intrinsic adopting its advanced libraries and AI models.

NVIDIA AI Enterprise on the IGX platform, with partners like ADLINK, Advantech and ONYX, delivers edge AI solutions meeting strict regulatory standards, essential for medical technology and other industries.

Huang ended his keynote on the same note he began it on, paying tribute to Taiwan and NVIDIA’s many partners there. “Thank you,” Huang said. “I love you guys.”

KServe Providers Dish Up NIMble Inference in Clouds and Data Centers

Deploying generative AI in the enterprise is about to get easier than ever.

NVIDIA NIM, a set of generative AI inference microservices, will work with KServe, open-source software that automates putting AI models to work at the scale of a cloud computing application.

The combination ensures generative AI can be deployed like any other large enterprise application. It also makes NIM widely available through platforms from dozens of companies, such as Canonical, Nutanix and Red Hat.

The integration of NIM on KServe extends NVIDIA’s technologies to the open-source community, ecosystem partners and customers. Through NIM, they can all access the performance, support and security of the NVIDIA AI Enterprise software platform with an API call — the push-button of modern programming.

Serving AI on Kubernetes

KServe got its start as part of Kubeflow, a machine learning toolkit based on Kubernetes, the open-source system for deploying and managing software containers that hold all the components of large distributed applications.

As Kubeflow expanded its work on AI inference, what became KServe was born and ultimately evolved into its own open-source project.

Many companies have contributed to and adopted KServe, which today runs at AWS, Bloomberg, Canonical, Cisco, Hewlett Packard Enterprise, IBM, Red Hat, Zillow and NVIDIA.

Under the Hood With KServe

KServe is essentially an extension of Kubernetes that runs AI inference like a powerful cloud application. It uses a standard protocol, runs with optimized performance and supports PyTorch, Scikit-learn, TensorFlow and XGBoost without users needing to know the details of those AI frameworks.

The software is especially useful these days, when new large language models (LLMs) are emerging rapidly.

KServe lets users easily go back and forth from one model to another, testing which one best suits their needs. And when an updated version of a model gets released, a KServe feature called “canary rollouts” automates the job of carefully validating and gradually deploying it into production.

Another feature, GPU autoscaling, efficiently manages how models are deployed as demand for a service ebbs and flows, so customers and service providers have the best possible experience.
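
As a sketch of how these features surface to users, the snippet below builds a hypothetical KServe InferenceService spec as a Python dict, with a canary traffic split and replica bounds. The service name, model format and storage URI are placeholders, and field availability should be checked against your KServe version.

```python
# Illustrative sketch: a KServe InferenceService with a canary traffic split
# and replica autoscaling bounds, built as a Python dict and rendered to YAML.
# The service name, model format and storage URI are placeholders.
import yaml

inference_service = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "fraud-scorer"},          # hypothetical service name
    "spec": {
        "predictor": {
            "minReplicas": 1,
            "maxReplicas": 4,                      # scale out as traffic grows
            "canaryTrafficPercent": 10,            # route 10% of traffic to the new revision
            "sklearn": {
                "storageUri": "s3://models/fraud-scorer/v2"  # placeholder model location
            },
        }
    },
}

# Render to YAML; apply the output with `kubectl apply -f -` or your GitOps tooling.
print(yaml.safe_dump(inference_service, sort_keys=False))
```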

An API Call to Generative AI

The goodness of KServe will now be available with the ease of NVIDIA NIM.

With NIM, a simple API call takes care of all the complexities. Enterprise IT admins get the metrics they need to ensure their application is running with optimal performance and efficiency, whether it’s in their data center or on a remote cloud service — even if they change the AI models they’re using.

NIM lets IT professionals become generative AI pros, transforming their company’s operations. That’s why a host of enterprises such as Foxconn and ServiceNow are deploying NIM microservices.

NIM Rides Dozens of Kubernetes Platforms

Thanks to its integration with KServe, users will be able to access NIM on dozens of enterprise platforms such as Canonical’s Charmed Kubeflow and Charmed Kubernetes, Nutanix GPT-in-a-Box 2.0, Red Hat’s OpenShift AI and many others.

“Red Hat has been working with NVIDIA to make it easier than ever for enterprises to deploy AI using open source technologies,” said KServe contributor Yuan Tang, a principal software engineer at Red Hat. “By enhancing KServe and adding support for NIM in Red Hat OpenShift AI, we’re able to provide streamlined access to NVIDIA’s generative AI platform for Red Hat customers.”

“Through the integration of NVIDIA NIM inference microservices with Nutanix GPT-in-a-Box 2.0, customers will be able to build scalable, secure, high-performance generative AI applications in a consistent way, from the cloud to the edge,” said the vice president of engineering at Nutanix, Debojyoti Dutta, whose team contributes to KServe and Kubeflow.

“As a company that also contributes significantly to KServe, we’re pleased to offer NIM through Charmed Kubernetes and Charmed Kubeflow,” said Andreea Munteanu, MLOps product manager at Canonical. “Users will be able to access the full power of generative AI, with the highest performance, efficiency and ease thanks to the combination of our efforts.”

Dozens of other software providers can feel the benefits of NIM simply because they include KServe in their offerings.

Serving the Open-Source Community

NVIDIA has a long track record on the KServe project. As noted in a recent technical blog, KServe’s Open Inference Protocol is used in NVIDIA Triton Inference Server, which helps users run many AI models simultaneously across many GPUs, frameworks and operating modes.
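
A minimal sketch of what a REST call following that protocol can look like is shown below; the host, model name, tensor name, and shape are placeholders rather than any particular deployment.

```python
# Hypothetical sketch of a REST call following the KServe Open Inference (v2)
# Protocol, which NVIDIA Triton Inference Server also implements. The host,
# model name, tensor name and shape are placeholders.
import requests

URL = "http://localhost:8080/v2/models/my-model/infer"  # assumed service address

payload = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3, 0.4],  # flattened row-major tensor contents
        }
    ]
}

resp = requests.post(URL, json=payload, timeout=30)
resp.raise_for_status()
for out in resp.json()["outputs"]:
    print(out["name"], out["shape"], out["data"])
```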

With KServe, NVIDIA focuses on use cases that involve running one AI model at a time across many GPUs.

As part of the NIM integration, NVIDIA plans to be an active contributor to KServe, building on its portfolio of contributions to open-source software that includes Triton and TensorRT-LLM. NVIDIA is also an active member of the Cloud Native Computing Foundation, which supports open-source code for generative AI and other projects.

Try the NIM API on the NVIDIA API Catalog using the Llama 3 8B or Llama 3 70B LLM models today. Hundreds of NVIDIA partners worldwide are using NIM to deploy generative AI.

Watch NVIDIA founder and CEO Jensen Huang’s COMPUTEX keynote to get the latest on AI and more.

Taiwan Electronics Giants Drive Industrial Automation With NVIDIA Metropolis and NIM

Taiwan’s leading consumer electronics giants are making advances with AI automation for manufacturing, as fleets of robots and millions of cameras and sensors drive efficiencies across the smart factories of the future.

Dozens of electronics manufacturing and automation specialists — including Foxconn, Pegatron and Wistron — are showcasing their use of NVIDIA software at COMPUTEX in Taipei and are called out in NVIDIA founder and CEO Jensen Huang’s keynote address.

Companies displayed the latest in computer vision and generative AI using NVIDIA Metropolis for everything from automating product manufacturing to improving worker safety and device performance.

Creating Factory Autonomy

With increasing production challenges, manufacturers are seeing a need to turn factories into autonomous machines, with generative AI and digital twins as a foundation. AI agents driven by large language models (LLMs) are being built to converse with and assist workers on warehouse floors, boosting productivity and safety. And digital twins are helping manufacturers simulate and develop factories and AI-powered automation before deploying them in real factories.

Foxconn and its Ingrasys subsidiary use NVIDIA Omniverse and Metropolis to build digital twins for factories, planning efficiency optimizations and worker safety improvement at a number of manufacturing sites. At COMPUTEX, Foxconn is showing how it uses digital twins to plan placements of many video cameras in factories to optimize its data capture for collecting key insights.

Bringing Generative AI to the Factory Floor

Generative AI is creating productivity leaps across industries. McKinsey forecasts that generative AI will deliver as much as $290 billion in value for the advanced manufacturing industry, while bringing $4.4 trillion annually to the global economy.

At GTC in March, NVIDIA launched NVIDIA NIM, a set of microservices designed to speed up generative AI deployment in enterprises. Supporting a wide range of AI models, it ensures seamless, scalable AI inferencing, on premises or in the cloud, using industry-standard application programming interfaces.

Billions of IoT devices worldwide can tap into Metropolis and NVIDIA NIM for improvements in AI perception to enhance their capabilities.

Advancing Manufacturing With NVIDIA NIM

Linker Vision, an AI vision insights specialist, is adopting NVIDIA NIM to assist factories in deploying AI agents that can respond to natural language queries.

The Taipei company uses NVIDIA Visual Insight Agent (VIA) in manufacturing environments for always-on monitoring of video feeds from factory floors. Prompted in natural language, these ChatGPT-like systems let operators ask for factory-floor video to be monitored for insights and safety alerts, such as when workers are not wearing hardhats.

Operators can ask questions and receive instant, context-aware responses from AI agents that tap into organizational knowledge via retrieval-augmented generation, helping to enhance operational efficiency.

Leading manufacturer Pegatron operates factories spanning more than 20 million square feet; these facilities process and build more than 15 million assemblies per month and deploy more than 3,500 robots across factory floors. It has announced efforts based on NVIDIA NIM and is using Metropolis multi-camera tracking reference workflows to help with worker safety and productivity on factory lines. Pegatron’s workflow fuses digital twins in Omniverse with Metropolis real-time AI to better monitor and optimize operations.

Boosting Automated Visual Inspections

Adoption of NVIDIA Metropolis is helping Taiwan’s largest electronics manufacturers streamline operations and reduce cost as they build and inspect some of the world’s most complex and high-volume products.

Quality control with manual inspections in manufacturing is a multitrillion-dollar challenge. While automated optical inspection systems have been relied upon for some time, legacy AOI systems have high false detection rates, requiring costly secondary manual inspections for verification.

NVIDIA Metropolis for Factories offers a state-of-the-art AI reference workflow for bringing sophisticated and accurate AOI inspection applications to production faster.

TRI, Taiwan’s leading AOI equipment maker, has announced it is integrating the NVIDIA Metropolis for Factories workflow and capabilities into its latest AOI systems and is also planning to use NVIDIA NIM to further optimize system performance.

Wistron is expanding its OneAI platform for visual inspection and AOI with Metropolis. OneAI has been deployed in more than 10 Wistron factories globally, spanning hundreds of inspection points.

MediaTek, a leading innovator in connectivity and multimedia, and one of Taiwan’s largest IoT silicon vendors, announced at COMPUTEX that it’s teaming with NVIDIA to integrate NVIDIA TAO training and pretrained models into its AI development workflow for IoT device customers. The collaboration brings Metropolis and the latest advances in AI and visual perception to billions of IoT far-edge devices and streamlines software development for MediaTek’s next phase of growth in edge IoT.

Learn about NVIDIA Metropolis for Factories, NVIDIA NIM and the NVIDIA Metropolis multi-camera tracking workflow, which developers can use to build state-of-the-art real-time locating services and worker safety into their factory or warehouse operations. 

Foxconn Trains Robots, Streamlines Assembly With NVIDIA AI and Omniverse

Foxconn operates more than 170 factories around the world — the latest one a virtual plant pushing the state of the art in industrial automation.

It’s the digital twin of a new factory in Guadalajara, hub of Mexico’s electronics industry. Foxconn’s engineers are defining processes and training robots in this virtual environment, so the physical plant can produce the next engine of accelerated computing, NVIDIA Blackwell HGX systems, at high efficiency.

To design an optimal assembly line, factory engineers need to find the best placement for dozens of robotic arms, each weighing hundreds of pounds. To accurately monitor the overall process, they situate thousands of sensors, including many networked video cameras in a matrix to show plant operators all the right details.

Virtual Factories Create Real Savings

Such challenges are why companies like Foxconn are increasingly creating virtual factories for simulation and testing.

“Our digital twin will guide us to new levels of automation and industrial efficiency, saving time, cost and energy,” said Young Liu, chairman of the company that last year had revenues of nearly $200 billion.

Based on its efforts so far, the company anticipates that it can increase the manufacturing efficiency of complex servers using the simulated plant, leading to significant cost savings and reducing kilowatt-hour usage by over 30% annually.

Foxconn Teams With NVIDIA, Siemens

Foxconn is building its digital twin with software from the Siemens Xcelerator portfolio including Teamcenter and NVIDIA Omniverse, a platform for developing 3D workflows and applications based on OpenUSD.

NVIDIA and Siemens announced in March that they will connect Siemens Xcelerator applications to NVIDIA Omniverse Cloud API microservices. Foxconn will be among the first to employ the combined services, so its digital twin is physically accurate and visually realistic.

Engineers will employ Teamcenter with Omniverse APIs to design robot work cells and assembly lines. Then they’ll use Omniverse to pull all the 3D CAD elements into one virtual factory where their robots will be trained with NVIDIA Isaac Sim.

Robots Attend a Virtual School

A growing set of manufacturers is building digital twins to streamline factory processes. Foxconn is among the first to take the next step in automation: training its AI robots in the digital twin.

Inside the Foxconn virtual factory, robot arms from manufacturers such as Epson can learn how to see, grasp and move objects with NVIDIA Isaac Manipulator, a collection of NVIDIA-accelerated libraries and AI foundation models for robot arms.

For example, the robot arms may learn how to pick up a Blackwell server and place it on an autonomous mobile robot (AMR). The arms can use Isaac Manipulator’s cuMotion to find inspection paths for products, even when objects are placed in the way.

Foxconn’s AMRs, from Taiwan’s FARobot, will learn how to see and navigate the factory floor using NVIDIA Perceptor, software that helps them build a real-time 3D map that indicates any obstacles. The robot’s routes are generated and optimized by NVIDIA cuOpt, a world-record holding route optimization microservice.

Unlike many transport robots that need to stick to carefully drawn lines on the factory floor, these smart AMRs will navigate around obstacles to get wherever they need to go.

A Global Trend to Industrial Digitization

The Guadalajara factory is just the beginning. Foxconn is starting to design digital twins of factories around the world, including one in Taiwan where it will manufacture electric buses.

Foxconn is also deploying NVIDIA Metropolis, an application framework for smart cities and spaces, to give cameras on the shop floor AI-powered vision. That gives plant managers deeper insight into daily operations and opportunities to further streamline workflows and improve worker safety.

With an estimated 10 million factories worldwide, the $46 trillion manufacturing sector is a rich field for industrial digitalization.

Delta Electronics, MediaTek, MSI and Pegatron are among the other top electronics makers that revealed at COMPUTEX this week how they’re using NVIDIA AI and Omniverse to build digital twins of their factories.

Like Foxconn, they’re racing to make their factories more agile, autonomous and sustainable to serve the demand for more than a billion smartphones, PCs and servers a year.

A reference architecture shows how to develop factory digital twins with the NVIDIA AI and Omniverse platforms, and you can learn about the experiences of five companies doing this work.

Watch NVIDIA founder and CEO Jensen Huang’s COMPUTEX keynote to get the latest on AI and more.

Gen AI Healthcare Accelerated: Dozens of Companies Adopt Meta Llama 3 NIM

Meta Llama 3, Meta’s openly available state-of-the-art large language model — trained and optimized using NVIDIA accelerated computing — is dramatically boosting healthcare and life sciences workflows, helping deliver applications that aim to improve patients’ lives.

Now available as a downloadable NVIDIA NIM inference microservice at ai.nvidia.com, Llama 3 is equipping healthcare developers, researchers and companies to innovate responsibly across a wide variety of applications. The NIM comes with a standard application programming interface that can be deployed anywhere.

For use cases spanning surgical planning and digital assistants to drug discovery and clinical trial optimization, developers can use Llama 3 to easily deploy optimized generative AI models for copilots, chatbots and more.
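
As with other NIM microservices, the Llama 3 NIM exposes an OpenAI-compatible HTTP API. The sketch below assumes a default local deployment; the endpoint, served model name, and prompt are illustrative placeholders rather than any partner's production setup.

```python
# Illustrative sketch: querying a downloaded Llama 3 NIM over plain HTTP for a
# drafting task. The endpoint and model name assume a default local deployment;
# the prompt and output are illustrative and not clinical guidance.
import requests

URL = "http://localhost:8000/v1/chat/completions"  # assumed local NIM endpoint

payload = {
    "model": "meta/llama3-8b-instruct",            # served model name (assumed)
    "messages": [
        {"role": "system", "content": "You are a drafting assistant for clinical research staff."},
        {"role": "user", "content": "Draft three plain-language screening questions for a hypertension trial."},
    ],
    "max_tokens": 256,
}

resp = requests.post(URL, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```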

At COMPUTEX, one of the world’s premier technology events, NVIDIA today announced that hundreds of AI ecosystem partners are embedding NIM into their solutions.

More than 40 of these adopters are healthcare and life sciences startups and enterprises using the Llama 3 NIM to build and run applications that accelerate digital biology, digital surgery and digital health.

Advancing Digital Biology

Techbio and pharmaceutical companies, along with life sciences platform providers, use NVIDIA NIM for generative biology, chemistry and molecular prediction. With the Llama 3 NIM for intelligent assistants and NVIDIA BioNeMo NIM microservices for digital biology, researchers can build and scale end-to-end workflows for drug discovery and clinical trials.

Deloitte’s Atlas AI drug discovery accelerator, powered by NVIDIA BioNeMo, NeMo and Llama 3 NIM microservices, is driving efficiency in garnering data-based insights from gene to function for research copilots, scientific research mining, chemical property prediction and drug repurposing.

Transcripta Bio harnesses Llama 3 and BioNeMo for accelerated intelligent drug discovery. Its proprietary artificial intelligence modeling suite, Conductor AI, uses its Drug-Gene Atlas to help discover and predict the effects of new drugs at transcriptome scale.

Bolstering Clinical Trials

Quantiphi — an AI-first digital engineering company and an Elite Service Delivery Partner in the NVIDIA Partner Network — is using NVIDIA NIM to develop generative AI solutions for clinical research and development, diagnostics and patient care. These innovations are enabling organizations to save substantial cost, enhance workforce productivity and improve patient outcomes.

ConcertAI is advancing a broad set of translational and clinical development solutions within its CARA AI platform. The company has integrated the Llama 3 NIM to support population-scale patient matching to clinical trials, study automation and research site copilots with real-time insights and model management for large-scale AI applications.

Mendel AI is developing clinically focused AI solutions that can understand nuances in medical data at scale to provide actionable insights, with applications across clinical research, real-world evidence generation and cohort selection. It has deployed a fine-tuned Llama 3 NIM for its Hypercube copilot, offering a 36% performance improvement. Mendel is also exploring potential use cases with Llama 3 NIM to extract clinical information from patient records and to translate natural language into clinical queries.

Improving Digital Surgery

The operating room is bolstered by AI and the latest digital technologies, too.

Activ Surgical is using Llama 3 to accelerate development of its AI copilot and augmented-reality solution for real-time surgical guidance. The company’s ActivSight technology, which allows surgeons to view critical physiological structures and functions, aims to reduce surgical complication rates, improving patient care and safety.

Enhancing Digital Health

Generative AI-powered digital health applications enhance patient-doctor interactions, helping to improve patient outcomes and deliver more efficient healthcare.

Precision medicine company SimBioSys recently downloaded the Llama 3 NIM to help analyze a breast cancer patient’s diagnosis and tailor guidance for the physician regarding the patient’s unique characteristics.

Artisight, a startup focused on smart hospital transformation, uses Llama 3 to automate documentation and care coordination in all its clinical locations with ambient voice and vision systems.

AITEM, which offers medical and veterinary AI diagnostic solutions, is building healthcare-specific chatbots with the model.

And Abridge, which offers a generative AI platform for clinical conversations, is using the NIM to build a physician-patient encounter summarization solution.

Transcripta Bio, Activ Surgical, SimBioSys, Artisight, AITEM and Abridge are all members of NVIDIA Inception, a free program that helps startups evolve faster through cutting-edge technology, opportunities to connect with venture capitalists and access to the latest technical resources from NVIDIA.

The NVIDIA NIM collection of inference microservices is available with NVIDIA AI Enterprise, a software platform that streamlines development and deployment of production-grade copilots and other generative AI applications.

Download the Meta Llama 3 NIM now and learn more about how generative AI is reshaping healthcare and other industries by joining NVIDIA at COMPUTEX, running through June 7 in Taipei, Taiwan.
