Amazon AWS – Page 106

Amazon Web Services releases two new Titan vision-language models

December 20, 2023

by Amazon AWS

Novel architectures and carefully prepared training data enable state-of-the-art performance.Read More

Amazon and MIT announce Science Hub 2023 gift project awards and fellowships

December 19, 2023

by Amazon AWS

Four professors awarded for research in machine learning and robotics; two doctoral candidates awarded fellowships.Read More

Driving advanced analytics outcomes at scale using Amazon SageMaker powered PwC’s Machine Learning Ops Accelerator

December 19, 2023

by Ankur Goyal Amazon AWS

This post was written in collaboration with Ankur Goyal and Karthikeyan Chokappa from PwC Australia’s Cloud & Digital business.

Artificial intelligence (AI) and machine learning (ML) are becoming an integral part of systems and processes, enabling decisions in real time, thereby driving top and bottom-line improvements across organizations. However, putting an ML model into production at scale is challenging and requires a set of best practices. Many businesses already have data scientists and ML engineers who can build state-of-the-art models, but taking models to production and maintaining the models at scale remains a challenge. Manual workflows limit ML lifecycle operations to slow down the development process, increase costs, and compromise the quality of the final product.

Machine learning operations (MLOps) applies DevOps principles to ML systems. Just like DevOps combines development and operations for software engineering, MLOps combines ML engineering and IT operations. With the rapid growth in ML systems and in the context of ML engineering, MLOps provides capabilities that are needed to handle the unique complexities of the practical application of ML systems. Overall, ML use cases require a readily available integrated solution to industrialize and streamline the process that takes an ML model from development to production deployment at scale using MLOps.

To address these customer challenges, PwC Australia developed Machine Learning Ops Accelerator as a set of standardized process and technology capabilities to improve the operationalization of AI/ML models that enable cross-functional collaboration across teams throughout ML lifecycle operations. PwC Machine Learning Ops Accelerator, built on top of AWS native services, delivers a fit-for-purpose solution that easily integrates into the ML use cases with ease for customers across all industries. In this post, we focus on building and deploying an ML use case that integrates various lifecycle components of an ML model, enabling continuous integration (CI), continuous delivery (CD), continuous training (CT), and continuous monitoring (CM).

Solution overview

In MLOps, a successful journey from data to ML models to recommendations and predictions in business systems and processes involves several crucial steps. It involves taking the result of an experiment or prototype and turning it into a production system with standard controls, quality, and feedback loops. It’s much more than just automation. It’s about improving organization practices and delivering outcomes that are repeatable and reproducible at scale.

Only a small fraction of a real-world ML use case comprises the model itself. The various components needed to build an integrated advanced ML capability and continuously operate it at scale is shown in Figure 1. As illustrated in the following diagram, PwC MLOps Accelerator comprises seven key integrated capabilities and iterative steps that enable CI, CD, CT, and CM of an ML use case. The solution takes advantage of AWS native features from Amazon SageMaker, building a flexible and extensible framework around this.

Figure 1 -– PwC Machine Learning Ops Accelerator capabilities

In a real enterprise scenario, additional steps and stages of testing may exist to ensure rigorous validation and deployment of models across different environments.

Data and model management provide a central capability that governs ML artifacts throughout their lifecycle. It enables auditability, traceability, and compliance. It also promotes the shareability, reusability, and discoverability of ML assets.
ML model development allows various personas to develop a robust and reproducible model training pipeline, which comprises a sequence of steps, from data validation and transformation to model training and evaluation.
Continuous integration/delivery facilitates the automated building, testing, and packaging of the model training pipeline and deploying it into the target execution environment. Integrations with CI/CD workflows and data versioning promote MLOps best practices such as governance and monitoring for iterative development and data versioning.
ML model continuous training capability executes the training pipeline based on retraining triggers; that is, as new data becomes available or model performance decays below a preset threshold. It registers the trained model if it qualifies as a successful model candidate and stores the training artifacts and associated metadata.
Model deployment allows access to the registered trained model to review and approve for production release and enables model packaging, testing, and deploying into the prediction service environment for production serving.
Prediction service capability starts the deployed model to provide prediction through online, batch, or streaming patterns. Serving runtime also captures model serving logs for continuous monitoring and improvements.
Continuous monitoring monitors the model for predictive effectiveness to detect model decay and service effectiveness (latency, pipeline throughout, and execution errors)

PwC Machine Learning Ops Accelerator architecture

The solution is built on top of AWS-native services using Amazon SageMaker and serverless technology to keep performance and scalability high and running costs low.

Figure 2 – PwC Machine Learning Ops Accelerator architecture

PwC Machine Learning Ops Accelerator provides a persona-driven access entitlement for build-out, usage, and operations that enables ML engineers and data scientists to automate deployment of pipelines (training and serving) and rapidly respond to model quality changes. Amazon SageMaker Role Manager is used to implement role-based ML activity, and Amazon S3 is used to store input data and artifacts.
Solution uses existing model creation assets from the customer and builds a flexible and extensible framework around this using AWS native services. Integrations have been built between Amazon S3, Git, and AWS CodeCommit that allow dataset versioning with minimal future management.
AWS CloudFormation template is generated using AWS Cloud Development Kit (AWS CDK). AWS CDK provides the ability to manage changes for the complete solution. The automated pipeline includes steps for out-of-the-box model storage and metric tracking.
PwC MLOps Accelerator is designed to be modular and delivered as infrastructure-as-code (IaC) to allow automatic deployments. The deployment process uses AWS CodeCommit, AWS CodeBuild, AWS CodePipeline, and AWS CloudFormation template. Complete end-to-end solution to operationalize an ML model is available as deployable code.
Through a series of IaC templates, three distinct components are deployed: model build, model deployment , and model monitoring and prediction serving, using Amazon SageMaker Pipelines
- Model build pipeline automates the model training and evaluation process and enables approval and registration of the trained model.
- Model deployment pipeline provisions the necessary infrastructure to deploy the ML model for batch and real-time inference.
- Model monitoring and prediction serving pipeline deploys the infrastructure required to serve predictions and monitor model performance.
PwC MLOps Accelerator is designed to be agnostic to ML models, ML frameworks, and runtime environments. The solution allows for the familiar use of programming languages like Python and R, development tools such as Jupyter Notebook, and ML frameworks through a configuration file. This flexibility makes it straightforward for data scientists to continuously refine models and deploy them using their preferred language and environment.
The solution has built-in integrations to use either pre-built or custom tools to assign the labeling tasks using Amazon SageMaker Ground Truth for training datasets to provide continuous training and monitoring.
End-to-end ML pipeline is architected using SageMaker native features (Amazon SageMaker Studio , Amazon SageMaker Model Building Pipelines, Amazon SageMaker Experiments, and Amazon SageMaker endpoints).
The solution uses Amazon SageMaker built-in capabilities for model versioning, model lineage tracking, model sharing, and serverless inference with Amazon SageMaker Model Registry.
Once the model is in production, the solution continuously monitors the quality of ML models in real time. Amazon SageMaker Model Monitor is used to continuously monitor models in production. Amazon CloudWatch Logs is used to collect log files monitoring the model status, and notifications are sent using Amazon SNS when the quality of the model hits certain thresholds. Native loggers such as (boto3) are used to capture run status to expedite troubleshooting.

Solution walkthrough

The following walkthrough dives into the standard steps to create the MLOps process for a model using PwC MLOps Accelerator. This walkthrough describes a use case of an MLOps engineer who wants to deploy the pipeline for a recently developed ML model using a simple definition/configuration file that is intuitive.

Figure 3 – PwC Machine Learning Ops Accelerator process lifecycle

To get started, enroll in PwC MLOps Accelerator to get access to solution artifacts. The entire solution is driven from one configuration YAML file (config.yaml) per model. All the details required to run the solution are contained within that config file and stored along with the model in a Git repository. The configuration file will serve as input to automate workflow steps by externalizing important parameters and settings outside of code.
The ML engineer is required to populate config.yaml file and trigger the MLOps pipeline. Customers can configure an AWS account, the repository, the model, the data used, the pipeline name, the training framework, the number of instances to use for training, the inference framework, and any pre- and post-processing steps and several other configurations to check the model quality, bias, and explainability.

Figure 4 – Machine Learning Ops Accelerator configuration YAML

A simple YAML file is used to configure each model’s training, deployment, monitoring, and runtime requirements. Once the config.yaml is configured appropriately and saved alongside the model in its own Git repository, the model-building orchestrator is invoked. It also can read from a Bring-Your-Own-Model that can be configured through YAML to trigger deployment of the model build pipeline.
Everything after this point is automated by the solution and does not need the involvement of either the ML engineer or data scientist. The pipeline responsible for building the ML model includes data preprocessing, model training, model evaluation, and ost-processing. If the model passes automated quality and performance tests, the model is saved to a registry, and artifacts are written to Amazon S3 storage per the definitions in the YAML files. This triggers the creation of the model deployment pipeline for that ML model.

Figure 5 – Sample model deployment workflow

Next, an automated deployment template provisions the model in a staging environment with a live endpoint. Upon approval, the model is automatically deployed into the production environment.
The solution deploys two linked pipelines. Prediction serving deploys an accessible live endpoint through which predictions can be served. Model monitoring creates a continuous monitoring tool that calculates key model performance and quality metrics, triggering model retraining if a significant change in model quality is detected.
Now that you’ve gone through the creation and initial deployment, the MLOps engineer can configure failure alerts to be alerted for issues, for example, when a pipeline fails to do its intended job.
MLOps is no longer about packaging, testing, and deploying cloud service components similar to a traditional CI/CD deployment; it’s a system that should automatically deploy another service. For example, the model training pipeline automatically deploys the model deployment pipeline to enable prediction service, which in turn enables the model monitoring service.

Conclusion

In summary, MLOps is critical for any organization that aims to deploy ML models in production systems at scale. PwC developed an accelerator to automate building, deploying, and maintaining ML models via integrating DevOps tools into the model development process.

In this post, we explored how the PwC solution is powered by AWS native ML services and helps to adopt MLOps practices so that businesses can speed up their AI journey and gain more value from their ML models. We walked through the steps a user would take to access the PwC Machine Learning Ops Accelerator, run the pipelines, and deploy an ML use case that integrates various lifecycle components of an ML model.

To get started with your MLOps journey on AWS Cloud at scale and run your ML production workloads, enroll in PwC Machine Learning Operations.

About the Authors

Kiran Kumar Ballari is a Principal Solutions Architect at Amazon Web Services (AWS). He is an evangelist who loves to help customers leverage new technologies and build repeatable industry solutions to solve their problems. He is especially passionate about software engineering , Generative AI and helping companies with AI/ML product development.

Ankur Goyal is a director in PwC Australia’s Cloud and Digital practice, focused on Data, Analytics & AI. Ankur has extensive experience in supporting public and private sector organizations in driving technology transformations and designing innovative solutions by leveraging data assets and technologies.

Karthikeyan Chokappa (KC) is a Manager in PwC Australia’s Cloud and Digital practice, focused on Data, Analytics & AI. KC is passionate about designing, developing, and deploying end-to-end analytics solutions that transform data into valuable decision assets to improve performance and utilization and reduce the total cost of ownership for connected and intelligent things.

Rama Lankalapalli is a Sr. Partner Solutions Architect at AWS, working with PwC to accelerate their clients’ migrations and modernizations into AWS. He works across diverse industries to accelerate their adoption of AWS Cloud. His expertise lies in architecting efficient and scalable cloud solutions, driving innovation and modernization of customer applications by leveraging AWS services, and establishing resilient cloud foundations.

Jeejee Unwalla is a Senior Solutions Architect at AWS who enjoys guiding customers in solving challenges and thinking strategically. He is passionate about tech and data and enabling innovation.

ISO 42001: A new foundational global standard to advance responsible AI

December 18, 2023

by Swami Sivasubramanian Amazon AWS

Artificial intelligence (AI) is one of the most transformational technologies of our generation and provides opportunities to be a force for good and drive economic growth. The growth of large language models (LLMs), with hundreds of billions of parameters, has unlocked new generative AI use cases to improve customer experiences, boost employee productivity, and so much more. At AWS, we remain committed to harnessing AI responsibly, working hand in hand with our customers to develop and use AI systems with safety, fairness, and security at the forefront.

The AI industry reached an important milestone this week with the publication of ISO 42001. In simple terms, ISO 42001 is an international management system standard that provides guidelines for managing AI systems within organizations. It establishes a framework for organizations to systematically address and control the risks related to the development and deployment of AI. ISO 42001 emphasizes a commitment to responsible AI practices, encouraging organizations to adopt controls specific to their AI systems, fostering global interoperability, and setting a foundation for the development and deployment of responsible AI.

Trust in AI is crucial and integrating standards such as ISO 42001, which promotes AI governance, is one way to help earn public trust by supporting a responsible use approach.

As part of a broad, international community of subject matter experts, AWS has actively collaborated in ISO 42001’s development since 2021 and started laying the foundation prior to the standard’s final publication.

International standards foster global cooperation in developing and implementing responsible AI solutions. Perhaps like no other technology before it, managing these challenges requires collaboration and a truly multidisciplinary effort across technology companies, policymakers, community groups, scientists, and others—and international standards play a valuable role.

International standards are an important tool to help organizations translate domestic regulatory requirements into compliance mechanisms, including engineering practices, that are largely globally interoperable. Effective standards help reduce confusion about what AI is and what responsible AI entails, and help focus the industry on the reduction of potential harms. AWS is working in a community of diverse international stakeholders to improve emerging AI standards on variety of topics, including risk management, data quality, bias, and transparency.

Conformance with the new ISO 42001 standard is one of the ways that organizations can demonstrate a commitment to excellence in responsibly developing and deploying AI systems and applications. We are continuing to pursue the adoption of ISO 42001, and look forward to working with customers to do the same.

We are committed to investing in the future of responsible AI, and helping to inform international standards in the interest of our customers and the communities in which we all live and operate.

About the authors

Swami Sivasubramanian is Vice President of Data and Machine Learning at AWS. In this role, Swami oversees all AWS Database, Analytics, and AI & Machine Learning services. His team’s mission is to help organizations put their data to work with a complete, end-to-end data solution to store, access, analyze, and visualize, and predict.

Accelerating time-to-insight with MongoDB time series collections and Amazon SageMaker Canvas

December 18, 2023

by Igor Alekseev Amazon AWS

This is a guest post co-written with Babu Srinivasan from MongoDB.

As industries evolve in today’s fast-paced business landscape, the inability to have real-time forecasts poses significant challenges for industries heavily reliant on accurate and timely insights. The absence of real-time forecasts in various industries presents pressing business challenges that can significantly impact decision-making and operational efficiency. Without real-time insights, businesses struggle to adapt to dynamic market conditions, accurately anticipate customer demand, optimize inventory levels, and make proactive strategic decisions. Industries such as Finance, Retail, Supply Chain Management, and Logistics face the risk of missed opportunities, increased costs, inefficient resource allocation, and the inability to meet customer expectations. By exploring these challenges, organizations can recognize the importance of real-time forecasting and explore innovative solutions to overcome these hurdles, enabling them to stay competitive, make informed decisions, and thrive in today’s fast-paced business environment.

By harnessing the transformative potential of MongoDB’s native time series data capabilities and integrating it with the power of Amazon SageMaker Canvas, organizations can overcome these challenges and unlock new levels of agility. MongoDB’s robust time series data management allows for the storage and retrieval of large volumes of time-series data in real-time, while advanced machine learning algorithms and predictive capabilities provide accurate and dynamic forecasting models with SageMaker Canvas.

In this post, we will explore the potential of using MongoDB’s time series data and SageMaker Canvas as a comprehensive solution.

MongoDB Atlas

MongoDB Atlas is a fully managed developer data platform that simplifies the deployment and scaling of MongoDB databases in the cloud. It is a document based storage that provides a fully managed database, with built-in full-text and vector Search, support for Geospatial queries, Charts and native support for efficient time series storage and querying capabilities. MongoDB Atlas offers automatic sharding, horizontal scalability, and flexible indexing for high-volume data ingestion. Among all, the native time series capabilities is a standout feature, making it ideal for a managing high volume of time-series data, such as business critical application data, telemetry, server logs and more. With efficient querying, aggregation, and analytics, businesses can extract valuable insights from time-stamped data. By using these capabilities, businesses can efficiently store, manage, and analyze time-series data, enabling data-driven decisions and gaining a competitive edge.

Amazon SageMaker Canvas

Amazon SageMaker Canvas is a visual machine learning (ML) service that enables business analysts and data scientists to build and deploy custom ML models without requiring any ML experience or having to write a single line of code. SageMaker Canvas supports a number of use cases, including time-series forecasting, which empowers businesses to forecast future demand, sales, resource requirements, and other time-series data accurately. The service uses deep learning techniques to handle complex data patterns and enables businesses to generate accurate forecasts even with minimal historical data. By using Amazon SageMaker Canvas capabilities, businesses can make informed decisions, optimize inventory levels, improve operational efficiency, and enhance customer satisfaction.

The SageMaker Canvas UI lets you seamlessly integrate data sources from the cloud or on-premises, merge datasets effortlessly, train precise models, and make predictions with emerging data—all without coding. If you need an automated workflow or direct ML model integration into apps, Canvas forecasting functions are accessible through APIs.

Solution overview

Users persist their transactional time series data in MongoDB Atlas. Through Atlas Data Federation, data is extracted into Amazon S3 bucket. Amazon SageMaker Canvas access the data to build models and create forecasts. The results of the forecasting are stored in an S3 bucket. Using the MongoDB Data Federation services, the forecasts are presented visually through MongoDB Charts.

The following diagram outlines the proposed solution architecture.

Prerequisites

For this solution we use MongoDB Atlas to store time series data, Amazon SageMaker Canvas to train a model and produce forecasts, and Amazon S3 to store data extracted from MongoDB Atlas.

Make sure you have the following prerequisites:

Create an S3 bucket

Configure MongoDB Atlas cluster

Create a free MongoDB Atlas cluster by following the instructions in Create a Cluster. Setup the Database access and Network access.

Populate a time series collection in MongoDB Atlas

For the purposes of this demonstration, you can use a sample data set from from Kaggle and upload the same to MongoDB Atlas with the MongoDB tools , preferably MongoDB Compass.

The following code shows a sample data set for a time series collection:

{
"store": "1 1",
"timestamp": { "2010-02-05T00:00:00.000Z"},
"temperature": "42.31",
"target_value": 2.572,
"IsHoliday": false
}

The following screenshot shows the sample time series data in MongoDB Atlas:

Create an S3 Bucket

Create an S3 bucket in AWS , where the time series data need to be stored and analyzed. Note we have two folders. sales-train-data is used to store data extracted from MongoDB Atlas, while sales-forecast-output contains predictions from Canvas.

Create the Data Federation

Setup the Data Federation in Atlas and register the S3 bucket created previously as part of the data source. Notice the three different database/collections are created in the data federation for Atlas cluster, S3 bucket for MongoDB Atlas data and S3 bucket to store the Canvas results.

The following screenshots shows the setup of the data federation.

Setup the Atlas application service

Create the MongoDB Application Services to deploy the functions to transfer the data from MongoDB Atlas cluster to S3 bucket using the $out aggregation.

Verify the Datasource Configuration

The Application services create a new Altas Service Name that needs to be referred as the data services in the following function. Verify that the Atlas Service Name is created and note it for future reference.

Create the function

Setup the Atlas Application services to create the trigger and functions. The triggers need to be scheduled to write the data to S3 at a period frequency based on the business need for training the models.

The following script shows the function to write to the S3 bucket:

exports = function () {

   const service = context.services.get("");
   const db = service.db("")
   const events = db.collection("");

   const pipeline = [
    {
            "$out": {
               "s3": {
                  "bucket": "<S3_bucket_name>",
                  "region": "<AWS_Region>",
                   "filename": {$concat: ["<S3path>/<filename>_",{"$toString":  new Date(Date.now())}]},
                  "format": {
                        "name": "json",
                        "maxFileSize": "10GB"
                  }
               }
            }
      }
   ];

   return events.aggregate(pipeline);
};

Sample function

The function can be run through the Run tab and the errors can be debugged using the log features in the Application Services. In addition, the errors can be debugged using the Logs menu in the left pane.

The following screenshot shows the execution of the function along with the output:

Create dataset in Amazon SageMaker Canvas

The following steps assume that you have created a SageMaker domain and user profile. If you have not already done so, make sure that you configure the SageMaker domain and user profile. In the user profile, update your S3 bucket to be custom and supply your bucket name.

When complete, navigate to SageMaker Canvas, select your domain and profile, and select Canvas.

Create a dataset supplying the data source.

Select the dataset source as S3

Select the data location from the S3 bucket and select Create dataset.

Review the schema and click Create dataset

Upon successful import, the dataset will appear in the list as shown in the following screenshot.

Train the model

Next, we will use Canvas to set up to train the model. Select the dataset and click Create.

Create a model name, select Predictive analysis, and select Create.

Select target column

Next, click Configure time series model and select item_id as the Item ID column.

Select tm for the time stamp column

To specify the amount of time that you want to forecast, choose 8 weeks.

Now you are ready to preview the model or launch the build process.

After you preview the model or launch the build, your model will be created and can take up to four hours. You can leave the screen and return to see the model training status.

When the model is ready, select the model and click on the latest version

Review the model metrics and column impact and if you are satisfied with the model performance, click Predict.

Next, choose Batch prediction, and click Select dataset.

Select your dataset, and click Choose dataset.

Next, click Start Predictions.

Observe a job created or observe the job progress in SageMaker under Inference, Batch transform jobs.

When the job completes, select the job, and note the S3 path where Canvas stored the predictions.

Visualize forecast data in Atlas Charts

To visualize forecast data, create the MongoDB Atlas charts based on the Federated data (amazon-forecast-data) for P10, P50, and P90 forecasts as shown in the following chart.

Clean up

Delete the MongoDB Atlas cluster
Delete Atlas Data Federation Configuration
Delete Atlas Application Service App
Delete the S3 Bucket
Delete Amazon SageMaker Canvas dataset and models
Delete the Atlas Charts
Log out of Amazon SageMaker Canvas

Conclusion

In this post we extracted time series data from MongoDB time series collection. This is a special collection optimized for storage and querying speed of time series data. We used Amazon SageMaker Canvas to train models and generate predictions and we visualized the predictions in Atlas Charts.

For more information, refer to the following resources.

Try out MongoDB Atlas
Try out MongoDB Atlas Time Series
Try out Amazon SageMaker Canvas
Try out MongoDB Charts

About the authors

Igor Alekseev is a Senior Partner Solution Architect at AWS in Data and Analytics domain. In his role Igor is working with strategic partners helping them build complex, AWS-optimized architectures. Prior joining AWS, as a Data/Solution Architect he implemented many projects in Big Data domain, including several data lakes in Hadoop ecosystem. As a Data Engineer he was involved in applying AI/ML to fraud detection and office automation.

Babu Srinivasan is a Senior Partner Solutions Architect at MongoDB. In his current role, he is working with AWS to build the technical integrations and reference architectures for the AWS and MongoDB solutions. He has more than two decades of experience in Database and Cloud technologies . He is passionate about providing technical solutions to customers working with multiple Global System Integrators(GSIs) across multiple geographies.

Use Amazon DocumentDB to build no-code machine learning solutions in Amazon SageMaker Canvas

December 15, 2023

by Adeleke Coker Amazon AWS

We are excited to announce the launch of Amazon DocumentDB (with MongoDB compatibility) integration with Amazon SageMaker Canvas, allowing Amazon DocumentDB customers to build and use generative AI and machine learning (ML) solutions without writing code. Amazon DocumentDB is a fully managed native JSON document database that makes it straightforward and cost-effective to operate critical document workloads at virtually any scale without managing infrastructure. Amazon SageMaker Canvas is a no-code ML workspace offering ready-to-use models, including foundation models, and the ability to prepare data and build and deploy custom models.

In this post, we discuss how to bring data stored in Amazon DocumentDB into SageMaker Canvas and use that data to build ML models for predictive analytics. Without creating and maintaining data pipelines, you will be able to power ML models with your unstructured data stored in Amazon DocumentDB.

Solution overview

Let’s assume the role of a business analyst for a food delivery company. Your mobile app stores information about restaurants in Amazon DocumentDB because of its scalability and flexible schema capabilities. You want to gather insights on this data and build an ML model to predict how new restaurants will be rated, but find it challenging to perform analytics on unstructured data. You encounter bottlenecks because you need to rely on data engineering and data science teams to accomplish these goals.

This new integration solves these problems by making it simple to bring Amazon DocumentDB data into SageMaker Canvas and immediately start preparing and analyzing data for ML. Additionally, SageMaker Canvas removes the dependency on ML expertise to build high-quality models and generate predictions.

We demonstrate how to use Amazon DocumentDB data to build ML models in SageMaker Canvas in the following steps:

Create an Amazon DocumentDB connector in SageMaker Canvas.
Analyze data using generative AI.
Prepare data for machine learning.
Build a model and generate predictions.

Prerequisites

To implement this solution, complete the following prerequisites:

Have AWS Cloud admin access with an AWS Identity and Access Management (IAM) user with permissions required to complete the integration.
Complete the environment setup using AWS CloudFormation through either of the following options:
1. Deploy a CloudFormation template into a new VPC – This option builds a new AWS environment that consists of the VPC, private subnets, security groups, IAM execution roles, Amazon Cloud9, required VPC endpoints, and SageMaker domain. It then deploys Amazon DocumentDB into this new VPC. Download the template or quick launch the CloudFormation stack by choosing Launch Stack:
2. Deploy a CloudFormation template into an existing VPC – This option creates the required VPC endpoints, IAM execution roles, and SageMaker domain in an existing VPC with private subnets. Download the template or quick launch the CloudFormation stack by choosing Launch Stack:

Note that if you’re creating a new SageMaker domain, you must configure the domain to be in a private VPC without internet access to be able to add the connector to Amazon DocumentDB. To learn more, refer to Configure Amazon SageMaker Canvas in a VPC without internet access.

Follow the tutorial to load sample restaurant data into Amazon DocumentDB.
Add access to Amazon Bedrock and the Anthropic Claude model within it. For more information, see Add model access.

Create an Amazon DocumentDB connector in SageMaker Canvas

After you create your SageMaker domain, complete the following steps:

On the Amazon DocumentDB console, choose No-code machine learning in the navigation pane.
Under Choose a domain and profile¸ choose your SageMaker domain and user profile.
Choose Launch Canvas to launch SageMaker Canvas in a new tab.

When SageMaker Canvas finishes loading, you will land on the Data flows tab.

Choose Create to create a new data flow.
Enter a name for your data flow and choose Create.
Add a new Amazon DocumentDB connection by choosing Import data, then choose Tabular for Dataset type.
On the Import data page, for Data Source, choose DocumentDB and Add Connection.
Enter a connection name such as demo and choose your desired Amazon DocumentDB cluster.

Note that SageMaker Canvas will prepopulate the drop-down menu with clusters in the same VPC as your SageMaker domain.

Enter a user name, password, and database name.
Finally, select your read preference.

To protect the performance of primary instances, SageMaker Canvas defaults to Secondary, meaning that it will only read from secondary instances. When read preference is Secondary preferred, SageMaker Canvas reads from available secondary instances, but will read from the primary instance if a secondary instance is not available. For more information on how to configure an Amazon DocumentDB connection, see the Connect to a database stored in AWS.

Choose Add connection.

If the connection is successful, you will see collections in your Amazon DocumentDB database shown as tables.

Drag your table of choice to the blank canvas. For this post, we add our restaurant data.

The first 100 rows are displayed as a preview.

To start analyzing and preparing your data, choose Import data.
Enter a dataset name and choose Import data.

Analyze data using generative AI

Next, we want to get some insights on our data and look for patterns. SageMaker Canvas provides a natural language interface to analyze and prepare data. When the Data tab loads, you can start chatting with your data with the following steps:

Choose Chat for data prep.
Gather insights about your data by asking questions like the samples shown in the following screenshots.

To learn more about how to use natural language to explore and prepare data, refer to Use natural language to explore and prepare data with a new capability of Amazon SageMaker Canvas.

Let’s get a deeper sense of our data quality by using the SageMaker Canvas Data Quality and Insights Report, which automatically evaluates data quality and detects abnormalities.

On the Analyses tab, choose Data Quality and Insights Report.
Choose rating as the target column and Regression as the problem type, then choose Create.

This will simulate model training and provide insights on how we can improve our data for machine learning. The complete report is generated in a few minutes.

Our report shows that 2.47% of rows in our target have missing values—we’ll address that in the next step. Additionally, the analysis shows that the address line 2, name, and type_of_food features have the most prediction power in our data. This indicates basic restaurant information like location and cuisine may have an outsized impact on ratings.

Prepare data for machine learning

SageMaker Canvas offers over 300 built-in transformations to prepare your imported data. For more information on transformation features of SageMaker Canvas, refer to Prepare data with advanced transformations. Let’s add some transformations to get our data ready for training an ML model.

Navigate back to the Data flow page by choosing the name of your data flow at the top of the page.
Choose the plus sign next to Data types and choose Add transform.
Choose Add step.
Let’s rename the address line 2 column to cities.
1. Choose Manage columns.
2. Choose Rename column for Transform.
3. Choose address line 2 for Input column, enter cities for New name, and choose Add.
Additionally, lets drop some unnecessary columns.
1. Add a new transform.
2. For Transform, choose Drop column.
3. For Columns to drop, choose URL and restaurant_id.
4. Choose Add.
  [
Our rating feature column has some missing values, so let’s fill in those rows with the average value of this column.
1. Add a new transform.
2. For Transform, choose Impute.
3. For Column type, choose Numeric.
4. For Input columns, choose the rating column.
5. For Imputing strategy, choose Mean.
6. For Output column, enter rating_avg_filled.
7. Choose Add.
We can drop the rating column because we have a new column with filled values.
Because type_of_food is categorical in nature, we’ll want to numerically encode it. Let’s encode this feature using the one-hot encoding technique.
1. Add a new transform.
2. For Transform, choose One-hot encode.
3. For Input columns, choose type_of_food.
4. For Invalid handling strategy¸ choose Keep.
5. For Output style¸ choose Columns.
6. For Output column, enter encoded.
7. Choose Add.

Build a model and generate predictions

Now that we have transformed our data, let’s train a numeric ML model to predict the ratings for restaurants.

Choose Create model.
For Dataset name, enter a name for the dataset export.
Choose Export and wait for the transformed data to be exported.
Choose the Create model link at the bottom left corner of the page.

You can also select the dataset from the Data Wrangler feature on the left of the page.

Enter a model name.
Choose Predictive analysis, then choose Create.
Choose rating_avg_filled as the target column.

SageMaker Canvas automatically selects a suitable model type.

Choose Preview model to ensure there are no data quality issues.
Choose Quick build to build the model.

The model creation will take approximately 2–15 minutes to complete.

You can view the model status after the model finishes training. Our model has an RSME of 0.422, which means the model often predicts the rating of a restaurant within +/- 0.422 of the actual value, a solid approximation for the rating scale of 1–6.

Finally, you can generate sample predictions by navigating to the Predict tab.

Clean up

To avoid incurring future charges, delete the resources you created while following this post. SageMaker Canvas bills you for the duration of the session, and we recommend logging out of SageMaker Canvas when you’re not using it. Refer to Logging out of Amazon SageMaker Canvas for more details.

Conclusion

In this post, we discussed how you can use SageMaker Canvas for generative AI and ML with data stored in Amazon DocumentDB. In our example, we showed how an analyst can quickly build a high-quality ML model using a sample restaurant dataset.

We showed the steps to implement the solution, from importing data from Amazon DocumentDB to building an ML model in SageMaker Canvas. The entire process was completed through a visual interface without writing a single line of code.

To start your low-code/no-code ML journey, refer to Amazon SageMaker Canvas.

About the authors

Adeleke Coker is a Global Solutions Architect with AWS. He works with customers globally to provide guidance and technical assistance in deploying production workloads at scale on AWS. In his spare time, he enjoys learning, reading, gaming and watching sport events.

Gururaj S Bayari is a Senior DocumentDB Specialist Solutions Architect at AWS. He enjoys helping customers adopt Amazon’s purpose-built databases. He helps customers design, evaluate, and optimize their internet scale and high performance workloads powered by NoSQL and/or Relational databases.

Tim Pusateri is a Senior Product Manager at AWS where he works on Amazon SageMaker Canvas. His goal is to help customers quickly derive value from AI/ML. Outside of work, he loves to be outdoors, play guitar, see live music, and spend time with family and friends.

Pratik Das is a Product Manager at AWS. He enjoys working with customers looking to build resilient workloads and strong data foundations in the cloud. He brings expertise working with enterprises on modernization, analytical and data transformation initiatives.

Varma Gottumukkala is a Senior Database Specialist Solutions Architect at AWS based out of Dallas Fort Worth. Varma works with the customers on their database strategy and architect their workloads using AWS purpose built databases. Before joining AWS, he worked extensively with relational databases, NOSQL databases and multiple programming languages for the last 22 years.

Boost productivity on Amazon SageMaker Studio: Introducing JupyterLab Spaces and generative AI tools

December 14, 2023

by Pranav Murthy Amazon AWS

Amazon SageMaker Studio offers a broad set of fully managed integrated development environments (IDEs) for machine learning (ML) development, including JupyterLab, Code Editor based on Code-OSS (Visual Studio Code Open Source), and RStudio. It provides access to the most comprehensive set of tools for each step of ML development, from preparing data to building, training, deploying, and managing ML models. You can launch fully managed JuptyerLab with pre-configured SageMaker Distribution in seconds to work with your notebooks, code, and data. The flexible and extensible interface of SageMaker Studio allows you to effortlessly configure and arrange ML workflows, and you can use the AI-powered inline coding companion to quickly author, debug, explain, and test code.

In this post, we take a closer look at the updated SageMaker Studio and its JupyterLab IDE, designed to boost the productivity of ML developers. We introduce the concept of Spaces and explain how JupyterLab Spaces enable flexible customization of compute, storage, and runtime resources to improve your ML workflow efficiency. We also discuss our shift to a localized execution model in JupyterLab, resulting in a quicker, more stable, and responsive coding experience. Additionally, we cover the seamless integration of generative AI tools like Amazon CodeWhisperer and Jupyter AI within SageMaker Studio JupyterLab Spaces, illustrating how they empower developers to use AI for coding assistance and innovative problem-solving.

Introducing Spaces in SageMaker Studio

The new SageMaker Studio web-based interface acts as a command center for launching your preferred IDE and accessing your Amazon SageMaker tools to build, train, tune, and deploy models. In addition to JupyterLab and RStudio, SageMaker Studio now includes a fully managed Code Editor based on Code-OSS (Visual Studio Code Open Source). Both JupyterLab and Code Editor can be launched using a flexible workspace called Spaces.

A Space is a configuration representation of a SageMaker IDE, such as JupyterLab or Code Editor, designed to persist regardless of whether an application (IDE) associated with the Space is actively running or not. A Space represents a combination of a compute instance, storage, and other runtime configurations. With Spaces, you can create and scale the compute and storage for your IDE up and down as you go, customize runtime environments, and pause and resume coding anytime from anywhere. You can spin up multiple such Spaces, each configured with a different combination of compute, storage, and runtimes.

When a Space is created, it is equipped with an Amazon Elastic Block Store (Amazon EBS) volume, which is used to store users’ files, data, caches, and other artifacts. It’s attached to a ML compute instance whenever a Space is run. The EBS volume ensures that user files, data, cache, and session states are consistently restored whenever the Space is restarted. Importantly, this EBS volume remains persistent, whether the Space is in a running or stopped state. It will continue to persist until the Space is deleted.

Additionally, we have introduced the bring-your-own file system feature for users who wish to share environments and artifacts across different Spaces, users, or even domains. This enables you to optionally equip your Spaces with your own Amazon Elastic File System (Amazon EFS) mount, facilitating the sharing of resources across various workspaces.

Creating a Space

Creating and launching a new Space is now quick and straightforward. It takes just a few seconds to set up a new Space with fast launch instances and less than 60 seconds to run a Space. Spaces are equipped with predefined settings for compute and storage, managed by administrators. SageMaker Studio administrators can establish domain-level presets for compute, storage, and runtime configurations. This setup enables you to quickly launch a new space with minimal effort, requiring only a few clicks. You also have the option to modify a Space’s compute, storage, or runtime configurations for further customization.

It’s important to note that creating a Space requires updating the SageMaker domain execution role with a policy like the following example. You need to grant your users permissions for private spaces and user profiles necessary to access these private spaces. For detailed instructions, refer to Give your users access to private spaces.

{
  "Version": "2012-10-17",
  "Statement": [
    {

      "Effect": "Allow",
      "Action": [
        "sagemaker:CreateApp",
        "sagemaker:DeleteApp"
      ],
      "Resource": "arn:aws:sagemaker:{{Region}}:{{AccountId}}:app/*",
      "Condition": {
        "Null": {
          "sagemaker:OwnerUserProfileArn": "true"
        }
      }
    },
    {
      "Sid": "SMStudioCreatePresignedDomainUrlForUserProfile",
      "Effect": "Allow",
      "Action": [
        "sagemaker:CreatePresignedDomainUrl"
      ],
      "Resource": "arn:aws:sagemaker:{{Region}}:{{AccountId}}:user-profile/${sagemaker:DomainId}/${sagemaker:UserProfileName}"
    },
    {
      "Sid": "SMStudioAppPermissionsListAndDescribe",
      "Effect": "Allow",
      "Action": [
        "sagemaker:ListApps",
        "sagemaker:ListDomains",
        "sagemaker:ListUserProfiles",
        "sagemaker:ListSpaces",
        "sagemaker:DescribeApp",
        "sagemaker:DescribeDomain",
        "sagemaker:DescribeUserProfile",
        "sagemaker:DescribeSpace"
      ],
      "Resource": "*"
    },
    {
      "Sid": "SMStudioAppPermissionsTagOnCreate",
      "Effect": "Allow",
      "Action": [
        "sagemaker:AddTags"
      ],
      "Resource": "arn:aws:sagemaker:{{Region}}:{{AccountId}}:*/*",
      "Condition": {
        "Null": {
          "sagemaker:TaggingAction": "false"
        }
      }
    },
    {
      "Sid": "SMStudioRestrictSharedSpacesWithoutOwners",
      "Effect": "Allow",
      "Action": [
        "sagemaker:CreateSpace",
        "sagemaker:UpdateSpace",
        "sagemaker:DeleteSpace"
      ],
      "Resource": "arn:aws:sagemaker:{{Region}}:{{AccountId}}:space/${sagemaker:DomainId}/*",
      "Condition": {
        "Null": {
          "sagemaker:OwnerUserProfileArn": "true"
        }
      }
    },
    {
      "Sid": "SMStudioRestrictSpacesToOwnerUserProfile",
      "Effect": "Allow",
      "Action": [
        "sagemaker:CreateSpace",
        "sagemaker:UpdateSpace",
        "sagemaker:DeleteSpace"
      ],
      "Resource": "arn:aws:sagemaker:{{Region}}:{{AccountId}}:space/${sagemaker:DomainId}/*",
      "Condition": {
        "ArnLike": {
          "sagemaker:OwnerUserProfileArn": "arn:aws:sagemaker:$AWS Region:$111122223333:user-profile/${sagemaker:DomainId}/${sagemaker:UserProfileName}"
        },
        "StringEquals": {
          "sagemaker:SpaceSharingType": [
            "Private",
            "Shared"
          ]
        }
      }
    },
    {
      "Sid": "SMStudioRestrictCreatePrivateSpaceAppsToOwnerUserProfile",
      "Effect": "Allow",
      "Action": [
        "sagemaker:CreateApp",
        "sagemaker:DeleteApp"
      ],
      "Resource": "arn:aws:sagemaker:{{Region}}:{{AccountId}}:app/${sagemaker:DomainId}/*",
      "Condition": {
        "ArnLike": {
          "sagemaker:OwnerUserProfileArn": "arn:aws:sagemaker:${aws:Region}:${aws:PrincipalAccount}:user-profile/${sagemaker:DomainId}/${sagemaker:UserProfileName}"
        },
        "StringEquals": {
          "sagemaker:SpaceSharingType": [
            "Private"
          ]
        }
      }
    },
  ]
}

To create a space, complete the following steps:

In SageMaker Studio, choose JupyterLab on the Applications menu.
Choose Create JupyterLab space.
For Name, enter a name for your Space.
Choose Create space.
Choose Run space to launch your new Space with default presets or update the configuration based on your requirements.

Reconfiguring a Space

Spaces are designed for users to seamlessly transition between different compute types as needed. You can begin by creating a new Space with a specific configuration, primarily consisting of compute and storage. If you need to switch to a different compute type with a higher or lower vCPU count, more or less memory, or a GPU-based instance at any point in your workflow, you can do so with ease. After you stop the Space, you can modify its settings using either the UI or API via the updated SageMaker Studio interface and then restart the Space. SageMaker Studio automatically handles the provisioning of your existing Space to the new configuration, requiring no extra effort on your part.

Complete the following steps to edit an existing space:

On the space details page, choose Stop space.
Reconfigure the compute, storage, or runtime.
Choose Run space to relaunch the space.

Your workspace will be updated with the new storage and compute instance type you requested.

The new SageMaker Studio JupyterLab architecture

The SageMaker Studio team continues to invent and simplify its developer experience with the release of a new fully managed SageMaker Studio JupyterLab experience. The new SageMaker Studio JupyterLab experience combines the best of both worlds: the scalability and flexibility of SageMaker Studio Classic (see the appendix at the end of this post) with the stability and familiarity of the open source JupyterLab. To grasp the design of this new JupyterLab experience, let’s delve into the following architecture diagram. This will help us better understand the integration and features of this new JupyterLab Spaces platform.

In summary, we have transitioned towards a localized architecture. In this new setup, Jupyter server and kernel processes operate alongside in a single Docker container, hosted on the same ML compute instance. These ML instances are provisioned when a Space is running, and linked with an EBS volume that is created when the Space was initially created.

This new architecture brings several benefits; we discuss some of these in the following sections.

Reduced latency and increased stability

SageMaker Studio has transitioned to a local run model, moving away from the previous split model where code was stored on an EFS mount and run remotely on an ML instance via remote Kernel Gateway. In the earlier setup, Kernel Gateway, a headless web server, enabled kernel operations over remote communication with Jupyter kernels through HTTPS/WSS. User actions like running code, managing notebooks, or running terminal commands were processed by a Kernel Gateway app on a remote ML instance, with Kernel Gateway facilitating these operations over ZeroMQ (ZMQ) within a Docker container. The following diagram illustrates this architecture.

The updated JupyterLab architecture runs all kernel operations directly on the local instance. This local Jupyter Server approach typically provides improved performance and straightforward architecture. It minimizes latency and network complexity, simplifies the architecture for easier debugging and maintenance, enhances resource utilization, and accommodates more flexible messaging patterns for a variety of complex workloads.

In essence, this upgrade brings running notebooks and code much closer to the kernels, significantly reducing latency and boosting stability.

Improved control over provisioned storage

SageMaker Studio Classic originally used Amazon EFS to provide persistent, shared file storage for user home directories within the SageMaker Studio environment. This setup enables you to centrally store notebooks, scripts, and other project files, accessible across all your SageMaker Studio sessions and instances.

With the latest update to SageMaker Studio, there is a shift from Amazon EFS-based storage to an Amazon EBS-based solution. The EBS volumes, provisioned with SageMaker Studio Spaces, are GP3 volumes designed to deliver a consistent baseline performance of 3,000 IOPS, independent of the volume size. This new Amazon EBS storage offers higher performance for I/O-intensive tasks such as model training, data processing, high-performance computing, and data visualization. This transition also gives SageMaker Studio administrators greater insight into and control over storage usage by user profiles within a domain or across SageMaker. You can now set default (DefaultEbsVolumeSizeInGb) and maximum (MaximumEbsVolumeSizeInGb) storage sizes for JupyterLab Spaces within each user profile.

In addition to improved performance, you have the ability to flexibly resize the storage volume attached to your Space’s ML compute instance by editing your Space setting either using the UI or API action from your SageMaker Studio interface, without requiring any administration action. However, note that you can only edit EBS volume sizes in one direction—after you increase the Space’s EBS volume size, you will not be able to lower it back down.

SageMaker Studio now offers elevated control of provisioned storage for administrators:

SageMaker Studio administrators can manage the EBS volume sizes for user profiles. These JupyterLab EBS volumes can vary from a minimum of 5 GB to a maximum of 16 TB. The following code snippet shows how to create or update a user profile with default and maximum space settings:

aws --region $REGION sagemaker create-user-profile 
--domain-id $DOMAIN_ID 
--user-profile-name $USER_PROFILE_NAME 
--user-settings '{
    "SpaceStorageSettings": {
        "DefaultEbsStorageSettings":{
            "DefaultEbsVolumeSizeInGb":5,
            "MaximumEbsVolumeSizeInGb":100
        }
    }
}'


# alternatively to update an existing user profile
aws --region $REGION sagemaker update-user-profile 
--domain-id $DOMAIN_ID 
--user-profile-name $USER_PROFILE_NAME 
--user-settings '{
    "SpaceStorageSettings": {
        "DefaultEbsStorageSettings":{
            "DefaultEbsVolumeSizeInGb":25,
            "MaximumEbsVolumeSizeInGb":100 
        }
    }
}'

SageMaker Studio now offers an enhanced auto-tagging feature for Amazon EBS resources, automatically labeling volumes created by users with domain, user, and Space information. This advancement simplifies cost allocation analysis for storage resources, aiding administrators in managing and attributing costs more effectively. It’s also important to note that these EBS volumes are hosted within the service account, so you won’t have direct visibility. Nonetheless, storage usage and associated costs are directly linked to the domain ARN, user profile ARN, and Space ARN, facilitating straightforward cost allocation.
Administrators can also control encryption of a Space’s EBS volumes, at rest, using customer managed keys (CMK).

Shared tenancy with bring-your-own EFS file system

ML workflows are typically collaborative, requiring efficient sharing of data and code among team members. The new SageMaker Studio enhances this collaborative aspect by enabling you to share data, code, and other artifacts via a shared bring-your-own EFS file system. This EFS drive can be set up independently of SageMaker or could be an existing Amazon EFS resource. After it’s provisioned, it can be seamlessly mounted onto SageMaker Studio user profiles. This feature is not restricted to user profiles within a single domain—it can extend across domains, as long as they are within the same Region.

The following example code shows you how to create a domain and attach an existing EFS volume to it using its associated fs-id. EFS volumes can be attached to a domain at the root or prefix level, as the following commands demonstrate:

# create a domain with and attach an existing EFS volume at root level
aws sagemaker create-domain --domain-name "myDomain" 
 --vpc-id {VPC_ID} --subnet-ids {SUNBET_IDS} --auth-mode IAM 
 --default-user-settings 
 "CustomFileSystemConfigs=[{EFSFileSystemConfig={FileSystemId="fs-12345678"}}]"
 
# create a domain with and attach an existing EFS volume at file system prefix leve
aws sagemaker create-domain --domain-name "myDomain" 
 --vpc-id {VPC_ID} --subnet-ids {SUNBET_IDS} --auth-mode IAM 
 --default-user-settings 
 "CustomFileSystemConfigs=[{EFSFileSystemConfig={FileSystemId="fs-12345678", FileSystemPath="/my/custom/path"}}]"

# update an existing domain with your own EFS
aws sagemaker update-domain --region us-west-2 --domain-id d-xxxxx 
    --default-user-settings 
    "CustomFileSystemConfigs=[{EFSFileSystemConfig={FileSystemId="fs-12345678"}}]"

When an EFS mount is made available in a domain and its related user profiles, you can choose to attach it to a new space. This can be done using either the SageMaker Studio UI or an API action, as shown in the following example. It’s important to note that when a space is created with an EFS file system that’s provisioned at the domain level, the space inherits its properties. This means that if the file system is provisioned at a root or prefix level within the domain, these settings will automatically apply to the space created by the domain users.

# attach an a preconfigured EFS to a space
aws sagemaker create-space 
--space-name byofs-space --domain-id "myDomain" 
--ownership-settings "OwnerUserProfileName={USER_PROFILE_NAME}" 
--space-sharing-settings "SharingType=Private" 
--space-settings 
"AppType=JupyterLab,CustomFileSystems=[{EFSFileSystem={FileSystemId="fs-12345678"}}]")

After mounting it to a Space, you can locate all your files located above the admin-provisioned mount point. These files can be found in the directory path /mnt/custom-file-system/efs/fs-12345678.

EFS mounts make is straightforward to share artifacts between a user’s Space or between multiple users or across domains, making it ideal for collaborative workloads. With this feature, you can do the following:

Share data – EFS mounts are ideal for storing large datasets crucial for data science experiments. Dataset owners can load these mounts with training, validation, and test datasets, making them accessible to user profiles within a domain or across multiple domains. SageMaker Studio admins can also integrate existing application EFS mounts while maintaining compliance with organizational security policies. This is done through flexible prefix-level mounting. For example, if production and test data are stored on the same EFS mount (such as fs-12345678:/data/prod and fs-12345678:/data/test), mounting /data/test onto the SageMaker domain’s user profiles grants users access only to the test dataset. This setup allows for analysis or model training while keeping production data secure and inaccessible.
Share Code – EFS mounts facilitate the quick sharing of code artifacts between user profiles. In scenarios where users need to rapidly share code samples or collaborate on a common code base without the complexities of frequent git push/pull commands, shared EFS mounts are highly beneficial. They offer a convenient way to share work-in-progress code artifacts within a team or across different teams in SageMaker Studio.
Share development environments – Shared EFS mounts can also serve as a means to quickly disseminate sandbox environments among users and teams. EFS mounts provide a solid alternative for sharing Python environments like conda or virtualenv across multiple workspaces. This approach circumvents the need for distributing requirements.txt or environment.yml files, which can often lead to the repetitive task of creating or recreating environments across different user profiles.

These features significantly enhance the collaborative capabilities within SageMaker Studio, making it effortless for teams to work together efficiently on complex ML projects. Additionally, Code Editor based on Code-OSS (Visual Studio Code Open Source) shares the same architectural principles as the aforementioned JupyterLab experience This alignment brings several advantages, such as reduced latency, enhanced stability, and improved administrative control, and enables user access to shared workspaces, similar to those offered in JupyterLab Spaces.

Generative AI-powered tools on JupyterLab Spaces

Generative AI, a rapidly evolving field in artificial intelligence, uses algorithms to create new content like text, images, and code from extensive existing data. This technology has revolutionized coding by automating routine tasks, generating complex code structures, and offering intelligent suggestions, thereby streamlining development and fostering creativity and problem-solving in programming. As an indispensable tool for developers, generative AI enhances productivity and drives innovation in the tech industry. SageMaker Studio enhances this developer experience with pre-installed tools like Amazon CodeWhisperer and Jupyter AI, using generative AI to accelerate the development lifecycle.

Amazon CodeWhisperer

Amazon CodeWhisperer is a programming assistant that enhances developer productivity through real-time code recommendations and solutions. As an AWS managed AI service, it’s seamlessly integrated into the SageMaker Studio JupyterLab IDE. This integration makes Amazon CodeWhisperer a fluid and valuable addition to a developer’s workflow.

Amazon CodeWhisperer excels in increasing developer efficiency by automating common coding tasks, suggesting more effective coding patterns, and decreasing debugging time. It serves as an essential tool for both beginner and seasoned coders, providing insights into best practices, accelerating the development process, and improving the overall quality of code. To start using Amazon CodeWhisperer, make sure that the Resume Auto-Suggestions feature is activated. You can manually invoke code suggestions using keyboard shortcuts.

Alternatively, write a comment describing your intended code function and begin coding; Amazon CodeWhisperer will start providing suggestions.

Note that although Amazon CodeWhisperer is pre-installed, you must have the codewhisperer:GenerateRecommendations permission as part of the execution role to receive code recommendations. For additional details, refer to Using CodeWhisperer with Amazon SageMaker Studio. When you use Amazon CodeWhisperer, AWS may, for service improvement purposes, store data about your usage and content. To opt out of the Amazon CodeWhisperer data sharing policy, you can navigate to the Setting option from the top menu then navigate to Settings Editor and disable Share usage data with Amazon CodeWhisperer from the Amazon CodeWhisperer settings menu.

Jupyter AI

Jupyter AI is an open source tool that brings generative AI to Jupyter notebooks, offering a robust and user-friendly platform for exploring generative AI models. It enhances productivity in JupyterLab and Jupyter Notebooks by providing features like the %%ai magic for creating a generative AI playground inside notebooks, a native chat UI in JupyterLab for interacting with AI as a conversational assistant, and support for a wide array of large language model (LLM) providers like AI21, Anthropic, Cohere, and Hugging Face or managed services like Amazon Bedrock and SageMaker endpoints. This integration offers more efficient and innovative methods for data analysis, ML, and coding tasks. For example, you can interact with a domain-aware LLM using the Jupyternaut chat interface for help with processes and workflows or generate example code through CodeLlama, hosted on SageMaker endpoints. This makes it a valuable tool for developers and data scientists.

Jupyter AI provides an extensive selection of language models ready for use right out of the box. Additionally, custom models are also supported via SageMaker endpoints, offering flexibility and a broad range of options for users. It also offers support for embedding models, enabling you to perform inline comparisons and tests and even build or test ad hoc Retrieval Augmented Generation (RAG) apps.

Jupyter AI can act as your chat assistant, helping you with code samples, providing you with answers to questions, and much more.

You can use Jupyter AI’s %%ai magic to generate sample code inside your notebook, as shown in the following screenshot.

JupyterLab 4.0

The JupyterLab team has released version 4.0, featuring significant improvements in performance, functionality, and user experience. Detailed information about this release is available in the official JupyterLab Documentation.

This version, now standard in SageMaker Studio JupyterLab, introduces optimized performance for handling large notebooks and faster operations, thanks to improvements like CSS rule optimization and the adoption of CodeMirror 6 and MathJax 3. Key enhancements include an upgraded text editor with better accessibility and customization, a new extension manager for easy installation of Python extensions, and improved document search capabilities with advanced features. Additionally, version 4.0 brings UI improvements, accessibility enhancements, and updates to development tools, and certain features have been backported to JupyterLab 3.6.

Conclusion

The advancements in SageMaker Studio, particularly with the new JupyterLab experience, mark a significant leap forward in ML development. The updated SageMaker Studio UI, with its integration of JupyterLab, Code Editor, and RStudio, offers an unparalleled, streamlined environment for ML developers. The introduction of JupyterLab Spaces provides flexibility and ease in customizing compute and storage resources, enhancing the overall efficiency of ML workflows. The shift from a remote kernel architecture to a localized model in JupyterLab greatly increases stability while decreasing startup latency. This results in a quicker, more stable, and responsive coding experience. Moreover, the integration of generative AI tools like Amazon CodeWhisperer and Jupyter AI in JupyterLab further empowers developers, enabling you to use AI for coding assistance and innovative problem-solving. The enhanced control over provisioned storage and the ability to share code and data effortlessly through self-managed EFS mounts greatly facilitate collaborative projects. Lastly, the release of JupyterLab 4.0 within SageMaker Studio underscores these improvements, offering optimized performance, better accessibility, and a more user-friendly interface, thereby solidifying JupyterLab’s role as a cornerstone of efficient and effective ML development in the modern tech landscape.

Give SageMaker Studio JupyterLab Spaces a try using our quick onboard feature, which allows you to spin up a new domain for single users within minutes. Share your thoughts in the comments section!

Appendix: SageMaker Studio Classic’s kernel gateway architecture

A SageMaker Classic domain is a logical aggregation of an EFS volume, a list of users authorized to access the domain, and configurations related to security, application, networking, and more. In the SageMaker Studio Classic architecture of SageMaker, each user within the SageMaker domain has a distinct user profile. This profile encompasses specific details like the user’s role and their Posix user ID in the EFS volume, among other unique data. Users access their individual user profile through a dedicated Jupyter Server app, connected via HTTPS/WSS in their web browser. SageMaker Studio Classic uses a remote kernel architecture using a combination of Jupyter Server and Kernel Gateway app types, enabling notebook servers to interact with kernels on remote hosts. This means that the Jupyter kernels operate not on the notebook server’s host, but within Docker containers on separate hosts. In essence, your notebook is stored in the EFS home directory, and runs code remotely on a different Amazon Elastic Compute Cloud (Amazon EC2) instance, which houses a pre-built Docker container equipped with ML libraries such as PyTorch, TensorFlow, Scikit-Learn, and more.

The remote kernel architecture in SageMaker Studio offers notable benefits in terms of scalability and flexibility. However, it has its limitations, including a maximum of four apps per instance type and potential bottlenecks due to numerous HTTPS/WSS connections to a common EC2 instance type. These limitations could negatively affect the user experience.

The following architecture diagram depicts the SageMaker Studio Classic architecture. It illustrates the user’s process of connecting to a Kernel Gateway app via a Jupyter Server app, using their preferred web browser.

About the authors

Pranav Murthy is an AI/ML Specialist Solutions Architect at AWS. He focuses on helping customers build, train, deploy and migrate machine learning (ML) workloads to SageMaker. He previously worked in the semiconductor industry developing large computer vision (CV) and natural language processing (NLP) models to improve semiconductor processes using state of the art ML techniques. In his free time, he enjoys playing chess and traveling. You can find Pranav on LinkedIn.

Kunal Jha is a Senior Product Manager at AWS. He is focused on building Amazon SageMaker Studio as the best-in-class choice for end-to-end ML development. In his spare time, Kunal enjoys skiing and exploring the Pacific Northwest. You can find him on LinkedIn.

Majisha Namath Parambath is a Senior Software Engineer at Amazon SageMaker. She has been at Amazon for over 8 years and is currently working on improving the Amazon SageMaker Studio end-to-end experience.

Bharat Nandamuri is a Senior Software Engineer working on Amazon SageMaker Studio. He is passionate about building high scale backend services with focus on Engineering for ML systems. Outside of work, he enjoys playing chess, hiking and watching movies.

Derek Lause is a Software Engineer at AWS. He is committed to deliver value to customers through Amazon SageMaker Studio and Notebook Instances. In his spare time, Derek enjoys spending time with family and friends and hiking. You can find Derek on LinkedIn.

How AWS Prototyping enabled ICL-Group to build computer vision models on Amazon SageMaker

December 14, 2023

by Dr. Markus Bestehorn Amazon AWS

This is a customer post jointly authored by ICL and AWS employees.

ICL is a multi-national manufacturing and mining corporation based in Israel that manufactures products based on unique minerals and fulfills humanity’s essential needs, primarily in three markets: agriculture, food, and engineered materials. Their mining sites use industrial equipment that has to be monitored because machinery failures can result in loss of revenue or even environmental damages. Due to the extremely harsh conditions (low and high temperatures, vibrations, salt water, dust), attaching sensors to these mining machines for remote monitoring is difficult. Therefore, most machines are manually or visually monitored continuously by on-site workers. These workers frequently check camera pictures to monitor the state of a machine. Although this approach has worked in the past, it doesn’t scale and incurs relatively high costs.

To overcome this business challenge, ICL decided to develop in-house capabilities to use machine learning (ML) for computer vision (CV) to automatically monitor their mining machines. As a traditional mining company, the availability of internal resources with data science, CV, or ML skills was limited.

In this post, we discuss the following:

How ICL developed the in-house capabilities to build and maintain CV solutions that allow automatic monitoring of mining equipment to improve efficiency and reduce waste
A deep dive into a solution for mining screeners that was developed with the support of the AWS Prototyping program

Using the approach described in this post, ICL was able to develop a framework on AWS using Amazon SageMaker to build other use cases based on extracted vision from about 30 cameras, with the potential of scaling to thousands of such cameras on their production sites.

Building in-house capabilities through AWS Prototyping

Building and maintaining ML solutions for business-critical workloads requires sufficiently skilled staff. Outsourcing such activities is often not possible because internal know-how about business process needs to be combined with technical solution building. Therefore, ICL approached AWS for support in their journey to build a CV solution to monitor their mining equipment and acquire the necessary skills.

AWS Prototyping is an investment program where AWS embeds specialists into customer development teams to build mission-critical use cases. During such an engagement, the customer development team is enabled on the underlying AWS technologies while building the use case over the course of 3–6 weeks and getting hands-on help. Besides a corresponding use case, all the customer needs are 3–7 developers that can spend more than 80% of their working time building the aforementioned use case. During this time, the AWS specialists are fully assigned to the customer’s team and collaborate with them remotely or on-site.

ICL’s computer vision use case

For the prototyping engagement, ICL selected the use case for monitoring their mining screeners. A screener is a large industrial mining machine where minerals dissolved in water are processed. The water flows in several lanes from the top of the machine to the bottom. The influx is monitored for each of the lanes individually. When the influx runs out of the lane, it’s called overflow, which indicates that the machine is overloaded. Overflowing influx are minerals that are not processed by the screener and are lost. This needs to be avoided by regulating the influx. Without an ML solution, the overflow needs to be monitored by humans and it potentially takes time until the overflow is observed and handled.

The following images show the input and outputs of the CV models. The raw camera picture (left) is processed using a semantic segmentation model (middle) to detect the different lanes. Then the model (right) estimates the coverage (white) and overflow (red).

Although the prototyping engagement focused on a single type of machine, the general approach to use cameras and automatically process their images while using CV is applicable to a wider range of mining equipment. This allows ICL to extrapolate the know-how gained during the prototyping engagement to other locations, camera types, and machines, and also maintain the ML models without requiring support from any third party.

During the engagement, the AWS specialists and the ICL development team would meet every day and codevelop the solution step by step. ICL data scientists would either work independently on their assigned tasks or receive hands-on, pair-programming support from AWS ML specialists. This approach ensures that ICL data scientists not only gained experience to systematically develop ML models using SageMaker, but also to embed these models into applications as well as automate the whole lifecycle of such models, including automated retraining or model monitoring. After 4 weeks of this collaboration, ICL was able to move this model into production without requiring further support within 8 weeks, and has built models for other use cases since then. The technical approach of this engagement is described in the next section.

Monitoring mining screeners using CV models with SageMaker

SageMaker is a fully managed platform that addresses the complete lifecycle of an ML model: it provides services and features that support teams working on ML models from labeling their data in Amazon SageMaker Ground Truth to training and optimizing the model, as well as hosting ML models for production use. Prior to the engagement, ICL had installed the cameras and obtained pictures as shown in the previous images (left-most image) and stored them in an Amazon Simple Storage Service (Amazon S3) bucket. Before models can be trained, it’s necessary to generate training data. The joint ICL-AWS team addressed this in three steps:

Label the data using a semantic segmentation labeling job in SageMaker Ground Truth, as shown in the following image.
Preprocess the labeled images using image augmentation techniques to increase the number of data samples.
Split the labeled images into training, test, and validation sets, so that the performance and accuracy of the model can be measured adequately during the training process.

To achieve production scale for ML workloads, automating these steps is crucial to maintain the quality of the training input. Therefore, whenever new images are labeled using SageMaker Ground Truth, the preprocessing and splitting steps are run automatically and the resulting datasets are stored in Amazon S3, as shown model training workflow in the following diagram. Similarly, the model deployment workflow uses assets from SageMaker to update endpoints automatically whenever an updated model is available.

ICL is using several approaches to implement ML models into production. Some involve their current AI platform called KNIME, which allows them to quickly deploy models developed in the development environment into production by industrializing them into products. Several combinations of using KNIME and AWS services were analyzed; the preceding architecture was the most suitable to ICL’ s environment.

The SageMaker semantic segmentation built-in algorithm is used to train models for screener grid area segmentation. By choosing this built-in algorithm over a self-built container, ICL doesn’t have to deal with the undifferentiated heavy lifting of maintaining a Convolutional Neural Network (CNN) while being able to use such a CNN for their use case. After experimenting with different configurations and parameters, ICL used a Fully Convolutional Network (FCN) algorithm with a pyramid scene parsing network (PSPNet) to train the model. This allowed ICL to finalize the model building within 1 week of the prototyping engagement.

After a model has been trained, it has to be deployed to be usable for the screener monitoring. In line with the model training, this process is fully automated and orchestrated using AWS Step Functions and AWS Lambda. After the model is successfully deployed on the SageMaker endpoint, incoming pictures from the cameras are resized to fit the model’s input format and then fed into the endpoint for predictions using Lambda functions. The result of the semantic segmentation prediction as well as the overflow detection are then stored in Amazon DynamoDB and Amazon S3 for downstream analysis. If overflow is detected, Amazon Simple Notification Service (Amazon SNS) or Lambda functions can be used to automatically mitigate the overflow and control the corresponding lanes on the affected screener. The following diagram illustrates this architecture.

Conclusion

This post described how ICL, an Israeli mining company, developed their own computer vision approach for automated monitoring of mining equipment using cameras. We first showed how to address such a challenge from an organizational point of view that is focused on enablement, then we provided a detailed look into how the model was built using AWS. Although the challenge of monitoring may be unique to ICL, the general approach to build a prototype alongside AWS specialists can be applied to similar challenges, particularly for organizations that don’t have the necessary AWS knowledge.

If you want to learn how to build a production-scale prototype of your use case, reach out to your AWS account team to discuss a prototyping engagement.

About the Authors

Markus Bestehorn leads the customer engineering and prototyping teams in Germany, Austria, Switzerland, and Israel for AWS. He has a PhD degree in computer science and is specialized in building complex machine learning and IoT solutions.

David Abekasis leads the data science team at ICL Group with a passion to educate others on data analysis and machine learning while helping solve business challenges. He has an MSc in Data Science and an MBA. He was fortunate to research spatial and time series data in the precision agriculture domain.

Ion Kleopas is a Sr. Machine Learning Prototyping Architect with an MSc in Data Science and Big Data. He helps AWS customers build innovative AI/ML solutions by enabling their technical teams on AWS technologies through the co-development of prototypes for challenging machine learning use cases, paving their path to production.

Miron Perel is a Principal Machine Learning Business Development Manager with Amazon Web Services. Miron advises Generative AI companies building their next generation models.

Automate PDF pre-labeling for Amazon Comprehend

December 14, 2023

by Oskar Schnaack Amazon AWS

Amazon Comprehend is a natural-language processing (NLP) service that provides pre-trained and custom APIs to derive insights from textual data. Amazon Comprehend customers can train custom named entity recognition (NER) models to extract entities of interest, such as location, person name, and date, that are unique to their business.

To train a custom model, you first prepare training data by manually annotating entities in documents. This can be done with the Comprehend Semi-Structured Documents Annotation Tool, which creates an Amazon SageMaker Ground Truth job with a custom template, allowing annotators to draw bounding boxes around the entities directly on the PDF documents. However, for companies with existing tabular entity data in ERP systems like SAP, manual annotation can be repetitive and time-consuming.

To reduce the effort of preparing training data, we built a pre-labeling tool using AWS Step Functions that automatically pre-annotates documents by using existing tabular entity data. This significantly decreases the manual work needed to train accurate custom entity recognition models in Amazon Comprehend.

In this post, we walk you through the steps of setting up the pre-labeling tool and show examples of how it automatically annotates documents from a public dataset of sample bank statements in PDF format. The full code is available on the GitHub repo.

Solution overview

In this section, we discuss the inputs and outputs of the pre-labeling tool and provide an overview of the solution architecture.

Inputs and outputs

As input, the pre-labeling tool takes PDF documents that contain text to be annotated. For the demo, we use simulated bank statements like the following example.

The tool also takes a manifest file that maps PDF documents with the entities that we want to extract from these documents. Entities consists of two things: the expected_text to extract from the document (for example, AnyCompany Bank) and the corresponding entity_type (for example, bank_name). Later in this post, we show how to construct this manifest file from a CSV document like the following example.

The pre-labeling tool uses the manifest file to automatically annotate the documents with their corresponding entities. We can then use these annotations directly to train an Amazon Comprehend model.

Alternatively, you can create a SageMaker Ground Truth labeling job for human review and editing, as shown in the following screenshot.

When the review is complete, you can use the annotated data to train an Amazon Comprehend custom entity recognizer model.

Architecture

The pre-labeling tool consists of multiple AWS Lambda functions orchestrated by a Step Functions state machine. It has two versions that use different techniques to generate pre-annotations.

The first technique is fuzzy matching. This requires a pre-manifest file with expected entities. The tool uses the fuzzy matching algorithm to generate pre-annotations by comparing text similarity.

Fuzzy matching looks for strings in the document that are similar (but not necessarily identical) to the expected entities listed in the pre-manifest file. It first calculates text similarity scores between the expected text and words in the document, then it matches all pairs above a threshold. Therefore, even if there are no exact matches, fuzzy matching can find variants like abbreviations and misspellings. This allows the tool to pre-label documents without requiring the entities to appear verbatim. For example, if 'AnyCompany Bank' is listed as an expected entity, Fuzzy Matching will annotate occurrences of 'Any Companys Bank'. This provides more flexibility than strict string matching and enables the pre-labeling tool to automatically label more entities.

The following diagram illustrates the architecture of this Step Functions state machine.

The second technique requires a pre-trained Amazon Comprehend entity recognizer model. The tool generates pre-annotations using the Amazon Comprehend model, following the workflow shown in the following diagram.

The following diagram illustrates the full architecture.

In the following sections, we walk through the steps to implement the solution.

Deploy the pre-labeling tool

Clone the repository to your local machine:

git clone https://github.com/aws-samples/amazon-comprehend-automated-pdf-prelabeling-tool.git

This repository has been built on top of the Comprehend Semi-Structured Documents Annotation Tool and extends its functionalities by enabling you to start a SageMaker Ground Truth labeling job with pre-annotations already displayed on the SageMaker Ground Truth UI.

The pre-labeling tool includes both the Comprehend Semi-Structured Documents Annotation Tool resources as well as some resources specific to the pre-labeling tool. You can deploy the solution with AWS Serverless Application Model (AWS SAM), an open source framework that you can use to define serverless application infrastructure code.

If you have previously deployed the Comprehend Semi-Structured Documents Annotation Tool, refer to the FAQ section in Pre_labeling_tool/README.md for instructions on how to deploy only the resources specific to the pre-labeling tool.

If you haven’t deployed the tool before and are starting fresh, do the following to deploy the whole solution.

Change the current directory to the annotation tool folder:

cd amazon-comprehend-semi-structured-documents-annotation-tools

Build and deploy the solution:

make ready-and-deploy-guided

Create the pre-manifest file

Before you can use the pre-labeling tool, you need to prepare your data. The main inputs are PDF documents and a pre-manifest file. The pre-manifest file contains the location of each PDF document under 'pdf' and the location of a JSON file with expected entities to label under 'expected_entities'.

The notebook generate_premanifest_file.ipynb shows how to create this file. In the demo, the pre-manifest file shows the following code:

[
  {
    'pdf': 's3://<bucket>/data_aws_idp_workshop_data/bank_stmt_0.pdf',
    'expected_entities': 's3://<bucket>/prelabeling-inputs/expected-entities/example-demo/fuzzymatching_version/file_bank_stmt_0.json'
  },
  ...
]

Each JSON file listed in the pre-manifest file (under expected_entities) contains a list of dictionaries, one for each expected entity. The dictionaries have the following keys:

‘expected_texts’ – A list of possible text strings matching the entity.
‘entity_type’ – The corresponding entity type.
‘ignore_list’ (optional) – The list of words that should be ignored in the match. These parameters should be used to prevent fuzzy matching from matching specific combinations of words that you know are wrong. This can be useful if you want to ignore some numbers or email addresses when looking at names.

For example, the expected_entities of the PDF shown previously looks like the following:

[
  {
    'expected_texts': ['AnyCompany Bank'],
    'entity_type': 'bank_name',
    'ignore_list': []
  },
  {
    'expected_texts': ['JANE DOE'],
    'entity_type': 'customer_name',
    'ignore_list': ['JANE.DOE@example_mail.com']
  },
  {
    'expected_texts': ['003884257406'],
    'entity_type': 'checking_number',
    'ignore_list': []
  },
 ...
]

Run the pre-labeling tool

With the pre-manifest file that you created in the previous step, start running the pre-labeling tool. For more details, refer to the notebook start_step_functions.ipynb.

To start the pre-labeling tool, provide an event with the following keys:

Premanifest – Maps each PDF document to its expected_entities file. This should contain the Amazon Simple Storage Service (Amazon S3) bucket (under bucket) and the key (under key) of the file.
Prefix – Used to create the execution_id, which names the S3 folder for output storage and the SageMaker Ground Truth labeling job name.
entity_types – Displayed in the UI for annotators to label. These should include all entity types in the expected entities files.
work_team_name (optional) – Used for creating the SageMaker Ground Truth labeling job. It corresponds to the private workforce to use. If it’s not provided, only a manifest file will be created instead of a SageMaker Ground Truth labeling job. You can use the manifest file to create a SageMaker Ground Truth labeling job later on. Note that as of this writing, you can’t provide an external workforce when creating the labeling job from the notebook. However, you can clone the created job and assign it to an external workforce on the SageMaker Ground Truth console.
comprehend_parameters (optional) – Parameters to directly train an Amazon Comprehend custom entity recognizer model. If omitted, this step will be skipped.

To start the state machine, run the following Python code:

import boto3
stepfunctions_client = boto3.client('stepfunctions')

response = stepfunctions_client.start_execution(
stateMachineArn=fuzzymatching_prelabeling_step_functions_arn,
input=json.dumps(<event-dict>)
)

This will start a run of the state machine. You can monitor the progress of the state machine on the Step Functions console. The following diagram illustrates the state machine workflow.

When the state machine is complete, do the following:

Inspect the following outputs saved in the prelabeling/ folder of the comprehend-semi-structured-docs S3 bucket:
- Individual annotation files for each page of the documents (one per page per document) in temp_individual_manifests/
- A manifest for the SageMaker Ground Truth labeling job in consolidated_manifest/consolidated_manifest.manifest
- A manifest that can be used to train a custom Amazon Comprehend model in consolidated_manifest/consolidated_manifest_comprehend.manifest
On the SageMaker console, open the SageMaker Ground Truth labeling job that was created to review the annotations
Inspect and test the custom Amazon Comprehend model that was trained

As mentioned previously, the tool can only create SageMaker Ground Truth labeling jobs for private workforces. To outsource the human labeling effort, you can clone the labeling job on the SageMaker Ground Truth console and attach any workforce to the new job.

Clean up

To avoid incurring additional charges, delete the resources that you created and delete the stack that you deployed with the following command:

make delete

Conclusion

The pre-labeling tool provides a powerful way for companies to use existing tabular data to accelerate the process of training custom entity recognition models in Amazon Comprehend. By automatically pre-annotating PDF documents, it significantly reduces the manual effort required in the labeling process.

The tool has two versions: fuzzy matching and Amazon Comprehend-based, giving flexibility on how to generate the initial annotations. After documents are pre-labeled, you can quickly review them in a SageMaker Ground Truth labeling job or even skip the review and directly train an Amazon Comprehend custom model.

The pre-labeling tool enables you to quickly unlock the value of your historical entity data and use it in creating custom models tailored to your specific domain. By speeding up what is typically the most labor-intensive part of the process, it makes custom entity recognition with Amazon Comprehend more accessible than ever.

For more information about how to label PDF documents using a SageMaker Ground Truth labeling job, see Custom document annotation for extracting named entities in documents using Amazon Comprehend and Use Amazon SageMaker Ground Truth to Label Data.

About the authors

Oskar Schnaack is an Applied Scientist at the Generative AI Innovation Center. He is passionate about diving into the science behind machine learning to make it accessible for customers. Outside of work, Oskar enjoys cycling and keeping up with trends in information theory.

Romain Besombes is a Deep Learning Architect at the Generative AI Innovation Center. He is passionate about building innovative architectures to address customers’ business problems with machine learning.

Improve your Stable Diffusion prompts with Retrieval Augmented Generation

December 14, 2023

by James Yi Amazon AWS

Text-to-image generation is a rapidly growing field of artificial intelligence with applications in a variety of areas, such as media and entertainment, gaming, ecommerce product visualization, advertising and marketing, architectural design and visualization, artistic creations, and medical imaging.

Stable Diffusion is a text-to-image model that empowers you to create high-quality images within seconds. In November 2022, we announced that AWS customers can generate images from text with Stable Diffusion models in Amazon SageMaker JumpStart, a machine learning (ML) hub offering models, algorithms, and solutions. The evolution continued in April 2023 with the introduction of Amazon Bedrock, a fully managed service offering access to cutting-edge foundation models, including Stable Diffusion, through a convenient API.

As an ever-increasing number of customers embark on their text-to-image endeavors, a common hurdle arises—how to craft prompts that wield the power to yield high-quality, purpose-driven images. This challenge often demands considerable time and resources as users embark on an iterative journey of experimentation to discover the prompts that align with their visions.

Retrieval Augmented Generation (RAG) is a process in which a language model retrieves contextual documents from an external data source and uses this information to generate more accurate and informative text. This technique is particularly useful for knowledge-intensive natural language processing (NLP) tasks. We now extend its transformative touch to the world of text-to-image generation. In this post, we demonstrate how to harness the power of RAG to enhance the prompts sent to your Stable Diffusion models. You can create your own AI assistant for prompt generation in minutes with large language models (LLMs) on Amazon Bedrock, as well as on SageMaker JumpStart.

Approaches to crafting text-to-image prompts

Creating a prompt for a text-to-image model may seem straightforward at first glance, but it’s a deceptively complex task. It’s more than just typing a few words and expecting the model to conjure an image that aligns with your mental image. Effective prompts should provide clear instructions while leaving room for creativity. They must balance specificity and ambiguity, and they should be tailored to the particular model being used. To address the challenge of prompt engineering, the industry has explored various approaches:

Prompt libraries – Some companies curate libraries of pre-written prompts that you can access and customize. These libraries contain a wide range of prompts tailored to various use cases, allowing you to choose or adapt prompts that align with your specific needs.
Prompt templates and guidelines – Many companies and organizations provide users with a set of predefined prompt templates and guidelines. These templates offer structured formats for writing prompts, making it straightforward to craft effective instructions.
Community and user contributions – Crowdsourced platforms and user communities often play a significant role in improving prompts. Users can share their fine-tuned models, successful prompts, tips, and best practices with the community, helping others learn and refine their prompt-writing skills.
Model fine-tuning – Companies may fine-tune their text-to-image models to better understand and respond to specific types of prompts. Fine-tuning can improve model performance for particular domains or use cases.

These industry approaches collectively aim to make the process of crafting effective text-to-image prompts more accessible, user-friendly, and efficient, ultimately enhancing the usability and versatility of text-to-image generation models for a wide range of applications.

Using RAG for prompt design

In this section, we delve into how RAG techniques can serve as a game changer in prompt engineering, working in harmony with these existing approaches. By seamlessly integrating RAG into the process, we can streamline and enhance the efficiency of prompt design.

Semantic search in a prompt database

Imagine a company that has accumulated a vast repository of prompts in its prompt library or has created a large number of prompt templates, each designed for specific use cases and objectives. Traditionally, users seeking inspiration for their text-to-image prompts would manually browse through these libraries, often sifting through extensive lists of options. This process can be time-consuming and inefficient. By embedding prompts from the prompt library using text embedding models, companies can build a semantic search engine. Here’s how it works:

Embedding prompts – The company uses text embeddings to convert each prompt in its library into a numerical representation. These embeddings capture the semantic meaning and context of the prompts.
User query – When users provide their own prompts or describe their desired image, the system can analyze and embed their input as well.
Semantic search – Using the embeddings, the system performs a semantic search. It retrieves the most relevant prompts from the library based on the user’s query, considering both the user’s input and historical data in the prompt library.

By implementing semantic search in their prompt libraries, companies empower their employees to access a vast reservoir of prompts effortlessly. This approach not only accelerates prompt creation but also encourages creativity and consistency in text-to-image generation.y

Prompt generation from semantic search

Although semantic search streamlines the process of finding relevant prompts, RAG takes it a step further by using these search results to generate optimized prompts. Here’s how it works:

Semantic search results – After retrieving the most relevant prompts from the library, the system presents these prompts to the user, alongside the user’s original input.
Text generation model – The user can select a prompt from the search results or provide further context on their preferences. The system feeds both the selected prompt and the user’s input into an LLM.
Optimized prompt – The LLM, with its understanding of language nuances, crafts an optimized prompt that combines elements from the selected prompt and the user’s input. This new prompt is tailored to the user’s requirements and is designed to yield the desired image output.

The combination of semantic search and prompt generation not only simplifies the process of finding prompts but also ensures that the prompts generated are highly relevant and effective. It empowers you to fine-tune and customize your prompts, ultimately leading to improved text-to-image generation results. The following are examples of images generated from Stable Diffusion XL using the prompts from semantic search and prompt generation.

Original Prompt

Prompts from Semantic Search

Optimized Prompt by LLM

a cartoon of a little dog

cute cartoon of a dog having a sandwich at the dinner table
a cartoon illustration of a punk dog, anime style, white background
a cartoon of a boy and his dog walking down a forest lane

A cartoon scene of a boy happily walking hand in hand down a forest lane with his cute pet dog, in animation style.

RAG-based prompt design applications across diverse industries

Before we explore the application of our suggested RAG architecture, let’s start with an industry in which an image generation model is most applicable. In AdTech, speed and creativity are critical. RAG-based prompt generation can add instant value by generating prompt suggestions to create many images quickly for an advertisement campaign. Human decision-makers can go through the auto-generated images to select the candidate image for the campaign. This feature can be a standalone application or embedded into popular software tools and platforms currently available.

Another industry where the Stable Diffusion model can enhance productivity is media and entertainment. The RAG architecture can assist in use cases of avatar creation, for example. Starting from a simple prompt, RAG can add much more color and characteristics to the avatar ideas. It can generate many candidate prompts and provide more creative ideas. From these generated images, you can find the perfect fit for the given application. It increases the productivity by automatically generating many prompt suggestions. The variation it can come up with is the immediate benefit of the solution.

Solution overview

Empowering customers to construct their own RAG-based AI assistant for prompt design on AWS is a testament to the versatility of modern technology. AWS provides a plethora of options and services to facilitate this endeavor. The following reference architecture diagram illustrates a RAG application for prompt design on AWS.

When it comes to selecting the right LLMs for your AI assistant, AWS offers a spectrum of choices to cater to your specific requirements.

Firstly, you can opt for LLMs available through SageMaker JumpStart, utilizing dedicated instances. These instances support a variety of models, including Falcon, Llama 2, Bloom Z, and Flan-T5, or you can explore proprietary models such as Cohere’s Command and Multilingual Embedding, or Jurassic-2 from AI21 Labs.

If you prefer a more simplified approach, AWS offers LLMs on Amazon Bedrock, featuring models like Amazon Titan and Anthropic Claude. These models are easily accessible through straightforward API calls, allowing you to harness their power effortlessly. The flexibility and diversity of options ensure that you have the freedom to choose the LLM that best aligns with your prompt design goals, whether you’re seeking an innovation with open containers or the robust capabilities of proprietary models.

When it comes to building the essential vector database, AWS provides a multitude of options through their native services. You can opt for Amazon OpenSearch Service, Amazon Aurora, or Amazon Relational Database Service (Amazon RDS) for PostgreSQL, each offering robust features to suit your specific needs. Alternatively, you can explore products from AWS partners like Pinecone, Weaviate, Elastic, Milvus, or Chroma, which provide specialized solutions for efficient vector storage and retrieval.

To help you get started to construct a RAG-based AI assistant for prompt design, we’ve put together a comprehensive demonstration in our GitHub repository. This demonstration uses the following resources:

Image generation: Stable Diffusion XL on Amazon Bedrock
Text embedding: Amazon Titan on Amazon Bedrock
Text generation: Claude 2 on Amazon Bedrock
Vector database: FAISS, an open source library for efficient similarity search
Prompt library: Prompt examples from DiffusionDB, the first large-scale prompt gallery dataset for text-to-image generative models

Additionally, we’ve incorporated LangChain for LLM implementation and Streamit for the web application component, providing a seamless and user-friendly experience.

Prerequisites

You need to have the following to run this demo application:

An AWS account
Basic understanding of how to navigate Amazon SageMaker Studio
Basic understanding of how to download a repo from GitHub
Basic knowledge of running a command on a terminal

Run the demo application

You can download all the necessary code with instructions from the GitHub repo. After the application is deployed, you will see a page like the following screenshot.

With this demonstration, we aim to make the implementation process accessible and comprehensible, providing you with a hands-on experience to kickstart your journey into the world of RAG and prompt design on AWS.

Clean up

After you try out the app, clean up your resources by stopping the application.

Conclusion

RAG has emerged as a game-changing paradigm in the world of prompt design, revitalizing Stable Diffusion’s text-to-image capabilities. By harmonizing RAG techniques with existing approaches and using the robust resources of AWS, we’ve uncovered a pathway to streamlined creativity and accelerated learning.

For additional resources, visit the following:

About the authors

James Yi is a Senior AI/ML Partner Solutions Architect in the Emerging Technologies team at Amazon Web Services. He is passionate about working with enterprise customers and partners to design, deploy and scale AI/ML applications to derive their business values. Outside of work, he enjoys playing soccer, traveling and spending time with his family.

Rumi Olsen is a Solutions Architect in the AWS Partner Program. She specializes in serverless and machine learning solutions in her current role, and has a background in natural language processing technologies. She spends most of her spare time with her daughter exploring the nature of Pacific Northwest.

Solution overview

PwC Machine Learning Ops Accelerator architecture

Solution walkthrough

Conclusion

About the Authors

About the authors

MongoDB Atlas

Amazon SageMaker Canvas

Solution overview

Prerequisites

Configure MongoDB Atlas cluster

Populate a time series collection in MongoDB Atlas

Create an S3 Bucket

Create the Data Federation

Setup the Atlas application service

Verify the Datasource Configuration

Create the function

Sample function

Create dataset in Amazon SageMaker Canvas

Train the model

Visualize forecast data in Atlas Charts

Clean up

Conclusion

About the authors

Solution overview

Prerequisites

Create an Amazon DocumentDB connector in SageMaker Canvas

Analyze data using generative AI

Prepare data for machine learning

Build a model and generate predictions

Clean up

Conclusion

About the authors

Introducing Spaces in SageMaker Studio

Creating a Space

Reconfiguring a Space

The new SageMaker Studio JupyterLab architecture

Reduced latency and increased stability

Improved control over provisioned storage

Shared tenancy with bring-your-own EFS file system

Generative AI-powered tools on JupyterLab Spaces

Amazon CodeWhisperer

Jupyter AI

JupyterLab 4.0

Conclusion

Appendix: SageMaker Studio Classic’s kernel gateway architecture

About the authors

Building in-house capabilities through AWS Prototyping

ICL’s computer vision use case

Monitoring mining screeners using CV models with SageMaker

Conclusion

About the Authors

Solution overview

Inputs and outputs

Architecture

Deploy the pre-labeling tool

Create the pre-manifest file

Run the pre-labeling tool

Clean up

Conclusion

About the authors

Approaches to crafting text-to-image prompts

Using RAG for prompt design

Semantic search in a prompt database

Prompt generation from semantic search

RAG-based prompt design applications across diverse industries

Solution overview

Prerequisites

Run the demo application

Clean up

Conclusion

About the authors

Navigation

GenAI Vision Endless Possibilities

"I'm interested in things that change the world or that affect the future and wondrous, new technology where you see it, and you're like, 'Wow, how did that even happen? How is that possible?'" -- Elon Musk

Copyright © 2019-2025 Vedere AI. All Rights Reserved.