Using Amazon Rekognition to improve bicycle safety


Cycling is a fun way to stay fit, enjoy nature, and connect with friends. However, riding is becoming increasingly dangerous, especially where cyclists and cars share the road. According to the NHTSA, an average of 883 people on bicycles are killed in traffic crashes in the United States each year, with about 45,000 injury-only crashes reported annually. While bicycle fatalities account for just over 2% of all traffic fatalities in the United States, as a cyclist, it’s still terrifying to be pushed off the road by a large SUV or truck. To better protect themselves, many cyclists are starting to ride with cameras mounted to the front or back of their bicycles. In this blog post, I demonstrate a machine learning solution that cyclists can use to better identify close calls.

Many US states and countries throughout the world have some form of 3-foot law, which requires motor vehicles to leave about 3 feet (1 meter) of space when passing a bicycle. To promote safety on the road, cyclists are increasingly recording their rides, and if they encounter a dangerous situation where they aren’t given a safe passing distance, they can provide a video of the encounter to local law enforcement to help correct behavior. However, finding a single encounter in a recording of a multi-hour ride is time consuming and often requires specialized video skills to generate a short clip of the encounter.

To solve some of these problems, I developed a simple solution using Amazon Rekognition video analysis. Amazon Rekognition can detect labels (essentially objects) in a video, along with the timestamps at which each object appears, which makes it quick to find any vehicles that show up in the recording of a ride.

If a cyclist’s camera records a passing vehicle, the solution must then determine whether the vehicle came too close to the bicycle—in other words, whether it entered the 3-foot range set by law. If it did, I want to generate a clip of the encounter that can be provided to the relevant authorities. The following figure shows the view from a cyclist’s camera with bounding boxes that identify a vehicle passing too close to the bicycle. A box at the bottom of the image shows the approximate 3-foot area around the bicycle.

A red bounding box identifies a vehicle, while a green bounding box identifies the location of the bicycle. The boxes overlap, showing the vehicle is too close to the bicycle.
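
The overlap test itself is basic rectangle geometry. The following TypeScript sketch shows one way it could be implemented. The BoundingBox shape mirrors the normalized (0–1) box coordinates that Amazon Rekognition returns; the safe-area box and the example vehicle box are illustrative assumptions, not values taken from the solution’s source code.

// Normalized bounding box, in the same form Amazon Rekognition returns
// (Left/Top/Width/Height are fractions of the frame size).
interface BoundingBox {
  Left: number;
  Top: number;
  Width: number;
  Height: number;
}

// Approximate 3-foot safe area around the bicycle, expressed as a box at
// the bottom of the frame. These values are illustrative and would need
// to be tuned for a specific camera mount and field of view.
const SAFE_AREA: BoundingBox = { Left: 0.25, Top: 0.7, Width: 0.5, Height: 0.3 };

// Returns true if two normalized boxes overlap.
function boxesOverlap(a: BoundingBox, b: BoundingBox): boolean {
  const horizontal = a.Left < b.Left + b.Width && b.Left < a.Left + a.Width;
  const vertical = a.Top < b.Top + b.Height && b.Top < a.Top + a.Height;
  return horizontal && vertical;
}

// Example: a vehicle box reported for one timestamp in the video.
const vehicleBox: BoundingBox = { Left: 0.55, Top: 0.6, Width: 0.4, Height: 0.35 };
console.log(boxesOverlap(vehicleBox, SAFE_AREA)); // true -> flag a close encounter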

Solution overview

The architecture of the solution is shown in the following figure.

A video recording is uploaded to an S3 bucket, where it is processed, close encounters are detected, video is extracted, and links to the encounter are provided

The steps of the solution are:

  1. When a cyclist completes a ride, they upload their MP4 videos from the ride into an Amazon Simple Storage Service (Amazon S3) bucket.
  2. The bucket has been configured with an S3 event notification that sends object created notifications to an AWS Lambda function.
  3. The Lambda function kicks off an AWS Step Functions workflow that begins by calling the Amazon Rekognition video analysis StartLabelDetection API (see the sketch after this list). The StartLabelDetection API is configured to detect Bus, Car, Fire Truck, Pickup Truck, Truck, Limo, and Moving Van as labels. It ignores other related non-vehicle labels like License Plate, Wheel, Tire, and Car Mirror.
  4. The Amazon Rekognition API returns JSON results identifying the detected labels and the timestamps at which each object appears.
  5. These JSON results are sent to a Lambda function that performs the geometry math to determine whether a vehicle bounding box overlaps the bicycle safe area, as sketched earlier.
  6. Any detected encounters are passed to AWS Elemental MediaConvert, which creates video snippets corresponding to the detected encounters using the CreateJob API.
  7. MediaConvert creates these videos and uploads them to an S3 bucket.
  8. Another Lambda function is called to generate pre-signed URLs of the videos. This allows the videos to be temporarily downloaded by anyone with the pre-signed URL.
  9. Amazon Simple Notification Service (Amazon SNS) sends an email message with links to the pre-signed URLs.
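
To make step 3 concrete, here is a minimal sketch of how the workflow could call Amazon Rekognition video analysis with the AWS SDK for JavaScript v3. The bucket name, object key, and confidence threshold are placeholder assumptions, and the label inclusion filter simply mirrors the vehicle labels listed above; the deployed solution’s actual code may differ.

import {
  RekognitionClient,
  StartLabelDetectionCommand,
  GetLabelDetectionCommand,
} from "@aws-sdk/client-rekognition";

const rekognition = new RekognitionClient({ region: "us-east-1" });

// Start asynchronous label detection on the uploaded ride video.
// The bucket and key are placeholders for the object created in step 1.
const { JobId } = await rekognition.send(
  new StartLabelDetectionCommand({
    Video: { S3Object: { Bucket: "my-ride-videos-bucket", Name: "rides/morning-ride.mp4" } },
    MinConfidence: 70, // illustrative confidence threshold
    Settings: {
      GeneralLabels: {
        // Report only vehicle labels; related non-vehicle labels are ignored.
        LabelInclusionFilters: ["Bus", "Car", "Fire Truck", "Pickup Truck", "Truck", "Limo", "Moving Van"],
      },
    },
  })
);

// Later (for example, after a Step Functions wait-and-poll loop), fetch the
// detected labels and the timestamps at which they appear.
const results = await rekognition.send(new GetLabelDetectionCommand({ JobId, SortBy: "TIMESTAMP" }));
for (const label of results.Labels ?? []) {
  console.log(label.Timestamp, label.Label?.Name, label.Label?.Instances?.length);
}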

Prerequisites

To use the solution outlined in this post, you must have:

  1. An AWS account with appropriate permissions to allow you to deploy AWS CloudFormation stacks
  2. A video recording in MP4 format with the .MP4 extension using the H.264 codec. The video should be from a front or rear-facing camera, from any off-the-shelf vendor (for example GoPro, DJI, or Cycliq). The maximum file size is 10 GB.

Deploying the solution

  1. Deploy this solution in your environment or select Launch Stack. This solution deploys in the US East (N. Virginia) us-east-1 AWS Region.

Launch stack

  2. The Create stack page from the CloudFormation dashboard appears. At the bottom of the page, choose Next.
  3. On the Specify stack details page, enter the email address where you’d like to receive notifications. Choose Next.
  4. Select the box that says I acknowledge that AWS CloudFormation might create IAM resources and choose Next. Choose Submit and the installation will begin. The solution takes about 5 minutes to install.
  5. You will receive an email confirming your Amazon SNS subscription. You will not receive emails from the solution unless you confirm your subscription.
  6. After the stack completes, select the Outputs tab and take note of the bucket name listed under InputBucket.

Using the solution

To test the solution, I have a sample video in which I asked a stunt driver to drive very close to me.

To begin the video processing, I upload the video to the S3 bucket (the InputBucket from the Outputs tab). The bucket has encryption enabled, so under Properties, I choose Specify an encryption key and select Use bucket settings for default encryption. Choosing Upload begins the upload process, as shown in the following figure.

Uploading the video to S3, I specify the file and the settings for encryption

After a moment, the Step Functions workflow begins processing. After a few minutes, you will receive an email with links to any encounters identified, as shown in the following figure.

An email that contains links to view the detected encounters

In my case, the solution identified two encounters. In the first, I rode too close to a parked car. The second, however, captures the dangerous pass I staged with my stunt driver.

Had this been an actual dangerous encounter, the video clip could be provided to the appropriate authorities to help change behavior and make the road safer for everyone.

Pricing

Because this is a fully serverless solution, you only pay for what you use. With Amazon Rekognition, you pay for the minutes of video that are processed. With MediaConvert, you pay for normalized minutes of video processed, which is each minute of video output with multipliers that apply based on features used. The solution’s use of Lambda, Step Functions, and SNS are minimal and will likely fall under the free tier for most users.

Clean up

To delete the resources created as part of this solution, go to the CloudFormation console, select the stack that was deployed, and choose Delete.

Conclusion

In this example I demonstrated how to use Amazon Rekognition video analysis in a unique scenario. Amazon Rekognition is a powerful computer vision tool that allows you to get insights out of images or video without the overhead of building or managing a machine learning model. Of course, Amazon Rekognition can also handle more advanced use cases than the one I demonstrated here.

I also demonstrated how combining Amazon Rekognition with other serverless services yields a serverless video processing workflow that—in this case—can help improve the safety of cyclists. While you might not be an avid cyclist, the solution demonstrated here can be extended to a variety of use cases and industries. For example, it could be extended to detect wildlife on nature cameras, or you could use Amazon Rekognition streaming video events to detect people and packages in security video.

Get started today by using Amazon Rekognition for your computer vision use case.


About the Author

Mike George is a Principal Solutions Architect at Amazon Web Services (AWS) based in Salt Lake City, Utah. He enjoys helping customers solve their technology problems. His interests include software engineering, security, artificial intelligence (AI), and machine learning (ML).

Read More

Transfer Learning in Scalable Graph Neural Network for Improved Physical Simulation

In recent years, graph neural network (GNN) based models have shown promising results in simulating complex physical systems. However, training a dedicated graph network simulator can be costly, as most models are confined to fully supervised training, and extensive data generated from traditional simulators is required to train the model. How transfer learning could be applied to improve model performance and training efficiency has remained unexplored. In this work, we introduce a pretraining and transfer learning paradigm for graph network simulators.
First, we proposed the scalable graph U-net…Apple Machine Learning Research

Build a dynamic, role-based AI agent using Amazon Bedrock inline agents


AI agents continue to gain momentum, as businesses use the power of generative AI to reinvent customer experiences and automate complex workflows. We are seeing Amazon Bedrock Agents applied in investment research, insurance claims processing, root cause analysis, advertising campaigns, and much more. Agents use the reasoning capability of foundation models (FMs) to break down user-requested tasks into multiple steps. They use developer-provided instructions to create an orchestration plan and carry out that plan by securely invoking company APIs and accessing knowledge bases using Retrieval Augmented Generation (RAG) to accurately handle the user’s request.

Although organizations see the benefit of agents that are defined, configured, and tested as managed resources, we have increasingly seen the need for an additional, more dynamic way to invoke agents. Organizations need solutions that adjust on the fly—whether to test new approaches, respond to changing business rules, or customize solutions for different clients. This is where the new inline agents capability in Amazon Bedrock Agents becomes transformative. It allows you to dynamically adjust your agent’s behavior at runtime by changing its instructions, tools, guardrails, knowledge bases, prompts, and even the FMs it uses—all without redeploying your application.

In this post, we explore how to build an application using Amazon Bedrock inline agents, demonstrating how a single AI assistant can adapt its capabilities dynamically based on user roles.

Inline agents in Amazon Bedrock Agents

This runtime flexibility enabled by inline agents opens powerful new possibilities, such as:

  • Rapid prototyping – Inline agents minimize the time-consuming create/update/prepare cycles traditionally required for agent configuration changes. Developers can instantly test different combinations of models, tools, and knowledge bases, dramatically accelerating the development process.
  • A/B testing and experimentation – Data science teams can systematically evaluate different model-tool combinations, measure performance metrics, and analyze response patterns in controlled environments. This empirical approach enables quantitative comparison of configurations before production deployment.
  • Subscription-based personalization – Software companies can adapt features based on each customer’s subscription level, providing more advanced tools for premium users.
  • Persona-based data source integration – Institutions can adjust content complexity and tone based on the user’s profile, providing persona-appropriate explanations and resources by changing the knowledge bases associated with the agent on the fly.
  • Dynamic tool selection – Developers can create applications with hundreds of APIs, and quickly and accurately carry out tasks by dynamically choosing a small subset of APIs for the agent to consider for a given request. This is particularly helpful for large software as a service (SaaS) platforms needing multi-tenant scaling.

Inline agents expand your options for building and deploying agentic solutions with Amazon Bedrock Agents. For workloads needing managed and versioned agent resources with a pre-determined and tested configuration (specific model, instructions, tools, and so on), developers can continue to use InvokeAgent on resources created with CreateAgent. For workloads that need dynamic runtime behavior changes for each agent invocation, you can use the new InvokeInlineAgent API. With either approach, your agents will be secure and scalable, with configurable guardrails, a flexible set of model inference options, native access to knowledge bases, code interpretation, session memory, and more.
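
As a rough illustration of the runtime model, the following TypeScript sketch invokes an inline agent with the AWS SDK for JavaScript v3. The model ID, instruction, action group, and Lambda ARN are placeholder assumptions, and the request shape should be verified against the current InvokeInlineAgent documentation rather than treated as authoritative.

import {
  BedrockAgentRuntimeClient,
  InvokeInlineAgentCommand,
} from "@aws-sdk/client-bedrock-agent-runtime";

const client = new BedrockAgentRuntimeClient({ region: "us-east-1" });

// Configure the agent entirely at invocation time. Every value below
// (model ID, instruction, action group, Lambda ARN) is a placeholder.
const response = await client.send(
  new InvokeInlineAgentCommand({
    sessionId: "user-123-session-1",
    foundationModel: "anthropic.claude-3-5-sonnet-20240620-v1:0",
    instruction: "You are an HR assistant. Only use the tools provided for this session.",
    actionGroups: [
      {
        actionGroupName: "ApplyVacation",
        description: "Submit and track vacation requests",
        actionGroupExecutor: { lambda: "arn:aws:lambda:us-east-1:111122223333:function:apply-vacation" },
        functionSchema: {
          functions: [
            {
              name: "applyVacation",
              description: "Request vacation days for the current employee",
              parameters: {
                startDate: { type: "string", description: "ISO date", required: true },
                numberOfDays: { type: "integer", description: "Days requested", required: true },
              },
            },
          ],
        },
      },
    ],
    inputText: "I'd like to take three days off starting next Monday.",
  })
);

// The completion streams back as chunks of bytes; concatenate them into text.
let completion = "";
for await (const event of response.completion ?? []) {
  if (event.chunk?.bytes) {
    completion += new TextDecoder().decode(event.chunk.bytes);
  }
}
console.log(completion);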

Solution overview

Our HR assistant example shows how to build a single AI assistant that adapts to different user roles using the new inline agent capabilities in Amazon Bedrock Agents. When users interact with the assistant, the assistant dynamically configures agent capabilities (such as model, instructions, knowledge bases, action groups, and guardrails) based on the user’s role and their specific selections. This approach creates a flexible system that adjusts its functionality in real time, making it more efficient than creating separate agents for each user role or tool combination. The complete code for this HR assistant example is available on our GitHub repo.

This dynamic tool selection enables a personalized experience. When an employee logs in without direct reports, they see a set of tools that they have access to based on their role. They can select from options like requesting vacation time, checking company policies using the knowledge base, using a code interpreter for data analysis, or submitting expense reports. The inline agent assistant is then configured with only these selected tools, allowing it to assist the employee with their chosen tasks. In a real-world example, the user would not need to make the selection, because the application would make that decision and automatically configure the agent invocation at runtime. We make it explicit in this application so that you can demonstrate the impact.

Similarly, when a manager logs in to the same system, they see an extended set of tools reflecting their additional permissions. In addition to the employee-level tools, managers have access to capabilities like running performance reviews. They can select which tools they want to use for their current session, instantly configuring the inline agent with their choices.

The inclusion of knowledge bases is also adjusted based on the user’s role. Employees and managers see different levels of company policy information, with managers getting additional access to confidential data like performance review and compensation details. For this demo, we’ve implemented metadata filtering to retrieve only the appropriate level of documents based on the user’s access level, further enhancing efficiency and security.

Let’s look at how the interface adapts to different user roles.

The employee view provides access to essential HR functions like vacation requests, expense submissions, and company policy lookups. Users can select which of these tools they want to use for their current session.

The manager view extends these options to include supervisory functions like compensation management, demonstrating how the inline agent dynamically adjusts its available tools based on user permissions. Without inline agents, we would need to build and maintain two separate agents.

As shown in the preceding screenshots, the same HR assistant offers different tool selections based on the user’s role. An employee sees options like Knowledge Base, Apply Vacation Tool, and Submit Expense, whereas a manager has additional options like Performance Evaluation. Users can select which tools they want to add to the agent for their current interaction.

This flexibility allows for quick adaptation to user needs and preferences. For instance, if the company introduces a new policy for creating business travel requests, the tool catalog can be quickly updated to include a Create Business Travel Reservation tool. Employees can then choose to add this new tool to their agent configuration when they need to plan a business trip, or the application could automatically do so based on their role.

With Amazon Bedrock inline agents, you can create a catalog of actions that is dynamically selected by the application or by users of the application. This increases the level of flexibility and adaptability of your solutions, making them a perfect fit for navigating the complex, ever-changing landscape of modern business operations. Users have more control over their AI assistant’s capabilities, and the system remains efficient by only loading the necessary tools for each interaction.

Technical foundation: Dynamic configuration and action selection

Inline agents allow dynamic configuration at runtime, enabling a single agent to effectively perform the work of many. By specifying action groups and modifying instructions on the fly, even within the same session, you can create versatile AI applications that adapt to various scenarios without multiple agent deployments.

The following are key points about inline agents:

  • Runtime configuration – Change the agent’s configuration, including its FM, at runtime. This enables rapid experimentation and adaptation without redeploying the application, reducing development cycles.
  • Governance at tool level – Apply governance and access control at the tool level. With agents changing dynamically at runtime, tool-level governance helps maintain security and compliance regardless of the agent’s configuration.
  • Agent efficiency – Provide only necessary tools and instructions at runtime to reduce token usage and improve the agent’s accuracy. With fewer tools to choose from, it’s less complicated for the agent to select the right one, reducing hallucinations in the tool selection process. This approach can also lead to lower costs and improved latency compared to static agents, because removing unnecessary tools, knowledge bases, and instructions reduces the number of input and output tokens processed by the agent’s large language model (LLM).
  • Flexible action catalog – Create reusable actions for dynamic selection based on specific needs. This modular approach simplifies maintenance, updates, and scalability of your AI applications.

The following are examples of reusable actions:

  • Enterprise system integration – Connect with systems like Salesforce, GitHub, or databases
  • Utility tools – Perform common tasks such as sending emails or managing calendars
  • Team-specific API access – Interact with specialized internal tools and services
  • Data processing – Analyze text, structured data, or other information
  • External services – Fetch weather updates, stock prices, or perform web searches
  • Specialized ML models – Use specific machine learning (ML) models for targeted tasks

When using inline agents, you configure parameters for the following:

  • Contextual tool selection based on user intent or conversation flow
  • Adaptation to different user roles and permissions
  • Switching between communication styles or personas
  • Model selection based on task complexity

The inline agent uses the configuration you provide at runtime, allowing for highly flexible AI assistants that efficiently handle various tasks across different business contexts.

Building an HR assistant using inline agents

Let’s look at how we built our HR Assistant using Amazon Bedrock inline agents:

  1. Create a tool catalog – We developed a demo catalog of HR-related tools, including:
    • Knowledge Base – Using Amazon Bedrock Knowledge Bases for accessing company policies and guidelines based on the role of the application user. To filter the knowledge base content based on the user’s role, you also need to provide a metadata file specifying the employee roles that can access each file.
    • Apply Vacation – For requesting and tracking time off.
    • Expense Report – For submitting and managing expense reports.
    • Code Interpreter – For performing calculations and data analysis.
    • Compensation Management – For conducting and reviewing employee compensation assessments (manager-only access).
  2. Set conversation tone – We defined multiple conversation tones to suit different interaction styles:
    • Professional – For formal, business-like interactions.
    • Casual – For friendly, everyday support.
    • Enthusiastic – For upbeat, encouraging assistance.
  3. Implement access control – We implemented role-based access control. The application backend checks the user’s role (employee or manager), provides access to the appropriate tools and information, and passes this information to the inline agent (a simplified sketch of this selection logic follows the list). The role information is also used to configure metadata filtering in the knowledge bases to generate relevant responses. The system allows for dynamic tool use at runtime: users can switch personas or add and remove tools during their session, allowing the agent to adapt to different conversation needs in real time.
  4. Integrate the agent with other services and tools – We connected the inline agent to:
    • Amazon Bedrock Knowledge Bases for company policies, with metadata filtering for role-based access.
    • AWS Lambda functions for executing specific actions (such as submitting vacation requests or expense reports).
    • A code interpreter tool for performing calculations and data analysis.
  5. Create the UI – We created a Flask-based UI that performs the following actions:
    • Displays available tools based on the user’s role.
    • Allows users to select different personas.
    • Provides a chat window for interacting with the HR assistant.

To understand how this dynamic role-based functionality works under the hood, let’s examine the following system architecture diagram.

As shown in the preceding architecture diagram, the system works as follows:

  1. The end-user logs in and is identified as either a manager or an employee.
  2. The user selects the tools that they have access to and makes a request to the HR assistant.
  3. The agent breaks down the request and uses the available tools to address it in steps, which may include:
    1. Amazon Bedrock Knowledge Bases (with metadata filtering for role-based access).
    2. Lambda functions for specific actions.
    3. Code interpreter tool for calculations.
    4. Compensation tool (accessible only to managers to submit base pay raise requests).
  4. The application uses the Amazon Bedrock inline agent to dynamically pass in the appropriate tools based on the user’s role and request.
  5. The agent uses the selected tools to process the request and provide a response to the user.

This approach provides a flexible, scalable solution that can quickly adapt to different user roles and changing business needs.

Conclusion

In this post, we introduced the Amazon Bedrock inline agent functionality and highlighted its application to an HR use case. We dynamically selected tools based on the user’s roles and permissions, adapted instructions to set a conversation tone, and selected different models at runtime. With inline agents, you can transform how you build and deploy AI assistants. By dynamically adapting tools, instructions, and models at runtime, you can:

  • Create personalized experiences for different user roles
  • Optimize costs by matching model capabilities to task complexity
  • Streamline development and maintenance
  • Scale efficiently without managing multiple agent configurations

For organizations demanding highly dynamic behavior—whether you’re an AI startup, SaaS provider, or enterprise solution team—inline agents offer a scalable approach to building intelligent assistants that grow with your needs. To get started, explore our GitHub repo and HR assistant demo application, which demonstrate key implementation patterns and best practices.

To learn more about how to be most successful in your agent journey, read our two-part blog series:

To get started with Amazon Bedrock Agents, check out the following GitHub repository with example code.


About the authors

Ishan Singh is a Generative AI Data Scientist at Amazon Web Services, where he helps customers build innovative and responsible generative AI solutions and products. With a strong background in AI/ML, Ishan specializes in building Generative AI solutions that drive business value. Outside of work, he enjoys playing volleyball, exploring local bike trails, and spending time with his wife and dog, Beau.

Maira Ladeira Tanke is a Senior Generative AI Data Scientist at AWS. With a background in machine learning, she has over 10 years of experience architecting and building AI applications with customers across industries. As a technical lead, she helps customers accelerate their achievement of business value through generative AI solutions on Amazon Bedrock. In her free time, Maira enjoys traveling, playing with her cat, and spending time with her family someplace warm.

Mark Roy is a Principal Machine Learning Architect for AWS, helping customers design and build generative AI solutions. His focus since early 2023 has been leading solution architecture efforts for the launch of Amazon Bedrock, the flagship generative AI offering from AWS for builders. Mark’s work covers a wide range of use cases, with a primary interest in generative AI, agents, and scaling ML across the enterprise. He has helped companies in insurance, financial services, media and entertainment, healthcare, utilities, and manufacturing. Prior to joining AWS, Mark was an architect, developer, and technology leader for over 25 years, including 19 years in financial services. Mark holds six AWS certifications, including the ML Specialty Certification.

Nitin Eusebius is a Sr. Enterprise Solutions Architect at AWS, experienced in Software Engineering, Enterprise Architecture, and AI/ML. He is deeply passionate about exploring the possibilities of generative AI. He collaborates with customers to help them build well-architected applications on the AWS platform, and is dedicated to solving technology challenges and assisting with their cloud journey.

Ashrith Chirutani is a Software Development Engineer at Amazon Web Services (AWS). He specializes in backend system design, distributed architectures, and scalable solutions, contributing to the development and launch of high-impact systems at Amazon. Outside of work, he spends his time playing ping pong and hiking through Cascade trails, enjoying the outdoors as much as he enjoys building systems.

Shubham Divekar is a Software Development Engineer at Amazon Web Services (AWS), working in Agents for Amazon Bedrock. He focuses on developing scalable systems on the cloud that enable AI applications frameworks and orchestrations. Shubham also has a background in building distributed, scalable, high-volume-high-throughput systems in IoT architectures.

Vivek Bhadauria is a Principal Engineer for Amazon Bedrock. He focuses on building deep learning-based AI and computer vision solutions for AWS customers. Outside of work, Vivek enjoys trekking and following cricket.

Read More

Use language embeddings for zero-shot classification and semantic search with Amazon Bedrock


In this post, we discuss what embeddings are, show how to practically use language embeddings, and explore how to use them to add functionality such as zero-shot classification and semantic search. We then use Amazon Bedrock and language embeddings to add these features to a Really Simple Syndication (RSS) aggregator application.

Amazon Bedrock is a fully managed service that makes foundation models (FMs) from leading AI startups and Amazon available through an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case. Amazon Bedrock offers a serverless experience, so you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications using Amazon Web Services (AWS) services without having to manage infrastructure. For this post, we use the Cohere v3 Embed model on Amazon Bedrock to create our language embeddings.

Use case: RSS aggregator

To demonstrate some of the possible uses of these language embeddings, we developed an RSS aggregator website. RSS is a web feed that allows publications to publish updates in a standardized, computer-readable way. On our website, users can subscribe to an RSS feed and have an aggregated, categorized list of the new articles. We use embeddings to add the following functionalities:

  • Zero-shot classification – Articles are classified between different topics. There are some default topics, such as Technology, Politics, and Health & Wellbeing, as shown in the following screenshot. Users can also create their own topics.
    An example of the topics functionality
  • Semantic search – Users can search their articles using semantic search, as shown in the following screenshot. Users can not only search for a specific topic but also narrow their search by factors such as tone or style.
    Example of the semantic search functionality

This post uses this application as a reference point to discuss the technical implementation of the semantic search and zero-shot classification features.

Solution overview

This solution uses the following services:

  • Amazon API Gateway – The API is accessible through Amazon API Gateway. Caching is performed on Amazon CloudFront for certain topics to ease the database load.
  • Amazon Bedrock with Cohere v3 Embed – The articles and topics are converted into embeddings with the help of Amazon Bedrock and Cohere v3 Embed.
  • Amazon CloudFront and Amazon Simple Storage Service (Amazon S3) – The single-page React application is hosted using Amazon S3 and Amazon CloudFront.
  • Amazon Cognito – Authentication is done using Amazon Cognito user pools.
  • Amazon EventBridge – Amazon EventBridge and EventBridge schedules are used to coordinate new updates.
  • AWS Lambda – The API is a Fastify application written in TypeScript. It’s hosted on AWS Lambda.
  • Amazon Aurora PostgreSQL-Compatible Edition and pgvector – Amazon Aurora PostgreSQL-Compatible is used as the database, both for the functionality of the application itself and as a vector store using pgvector.
  • Amazon RDS Proxy – Amazon RDS Proxy is used for connection pooling.
  • Amazon Simple Queue Service (Amazon SQS) – Amazon SQS is used to queue events. It consumes one event at a time so it doesn’t hit the rate limit of Cohere in Amazon Bedrock.

The following diagram illustrates the solution architecture.

Scope of solution

What are embeddings?

This section offers a quick primer on what embeddings are and how they can be used.

Embeddings are numerical representations of concepts or objects, such as language or images. In this post, we discuss language embeddings. By reducing these concepts to numerical representations, we can then use them in a way that a computer can understand and operate on.

Let’s take Berlin and Paris as an example. As humans, we understand the conceptual links between these two words. Berlin and Paris are both cities, they’re capitals of their respective countries, and they’re both in Europe. We understand their conceptual similarities almost instinctively, because we can create a model of the world in our head. However, computers have no built-in way of representing these concepts.

To represent these concepts in a way a computer can understand, we convert them into language embeddings. Language embeddings are high-dimensional vectors that learn their relationships with each other through the training of a neural network. During training, the neural network is exposed to enormous amounts of text and learns patterns based on how words are colocated and relate to each other in different contexts.

Embedding vectors allow computers to model the world from language. For instance, if we embed “Berlin” and “Paris,” we can now perform mathematical operations on these embeddings. We can then observe some fairly interesting relationships. For instance, we could do the following: Paris – France + Germany ~= Berlin. This is because the embeddings capture the relationships between the words “Paris” and “France” and between “Germany” and “Berlin”—specifically, that Paris and Berlin are both capital cities of their respective countries.

The following graph shows the word vector distance between countries and their respective capitals.

The embedding representations of countries, and capitals.

Subtracting “France” from “Paris” removes the country semantics, leaving a vector representing the concept of a capital city. Adding “Germany” to this vector, we are left with something closely resembling “Berlin,” the capital of Germany. The vectors for this relationship are shown in the following graph.

Demonstrating the manipulation of these embeddings to show conceptual similarities.
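
To make the arithmetic tangible, here is a toy TypeScript sketch. The vectors are made up and only three-dimensional, purely to illustrate the operation; real language embeddings have hundreds or thousands of dimensions and their values come from a trained model.

// Made-up 3-dimensional "embeddings" chosen so the analogy works exactly.
const paris = [0.9, 0.8, 0.1];
const france = [0.9, 0.1, 0.1];
const germany = [0.1, 0.1, 0.9];
const berlin = [0.1, 0.8, 0.9];

const subtract = (a: number[], b: number[]) => a.map((v, i) => v - b[i]);
const add = (a: number[], b: number[]) => a.map((v, i) => v + b[i]);

// "Paris - France + Germany" lands on (or near) "Berlin".
const result = add(subtract(paris, france), germany);
console.log(result, berlin); // [0.1, 0.8, 0.9] vs [0.1, 0.8, 0.9]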

For our use case, we use the pre-trained Cohere Embeddings model in Amazon Bedrock, which embeds entire texts rather than a single word. The embeddings represent the meaning of the text and can be operated on using mathematical operations. This property can be useful to map relationships such as similarity between texts.

Zero-shot classification

One way in which we use language embeddings is by using their properties to calculate how similar an article is to one of the topics.

To do this, we break down a topic into a series of different and related embeddings. For instance, for culture, we have a set of embeddings for sports, TV programs, music, books, and so on. We then embed the incoming title and description of the RSS articles, and calculate the similarity against the topic embeddings. From this, we can assign topic labels to an article.
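
Concretely, the nearest-topic check can be as simple as a cosine-similarity comparison between an article embedding and each topic’s embeddings. The following TypeScript sketch is illustrative only; in the application itself this comparison runs in Aurora PostgreSQL with pgvector, as shown later in this post, and the 0.5 threshold is an assumption rather than a value from the application.

// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Each topic is represented by several related embeddings
// (for example, Culture -> sports, TV programs, music, books).
interface Topic {
  name: string;
  embeddings: number[][];
}

// Assign a topic if the article is close enough to any of its embeddings.
function classify(articleEmbedding: number[], topics: Topic[], threshold = 0.5): string[] {
  return topics
    .filter((topic) =>
      topic.embeddings.some((e) => cosineSimilarity(articleEmbedding, e) >= threshold)
    )
    .map((topic) => topic.name);
}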

The following figure illustrates how this works. The embeddings that Cohere generates are high-dimensional, containing 1,024 values (or dimensions). To demonstrate how this system works, we use t-distributed Stochastic Neighbor Embedding (t-SNE), an algorithm designed to reduce the dimensionality of the embeddings, so that we can view them in two dimensions. The following image uses these embeddings to visualize how topics are clustered based on similarity and meaning.

Clustering of different topics

You can use the embedding of an article and check the similarity of the article against the preceding embeddings. You can then say that if an article is clustered closely to one of these embeddings, it can be classified with the associated topic.

This is the k-nearest neighbor (k-NN) algorithm. This algorithm is used to perform classification and regression tasks. In k-NN, you can make assumptions around a data point based on its proximity to other data points. For instance, you can say that an article that has proximity to the music topic shown in the preceding diagram can be tagged with the culture topic.

The following figure demonstrates this with an Ars Technica article. We plot the embedding of the article’s title and description against the topic embeddings: (The climate is changing so fast that we haven’t seen how bad extreme weather could get: Decades-old statistics no longer represent what is possible in the present day).

Display of ars-technica article clustered against different topics

The advantage of this approach is that you can add custom, user-generated topics. You can create a topic by first creating a series of embeddings of conceptually related items. For instance, an AI topic would be similar to the embeddings for AI, Generative AI, LLM, and Anthropic, as shown in the following screenshot.

Adding a custom topic to the application

In a traditional classification system, we’d be required to train a classifier—a supervised learning task where we’d need to provide a series of examples to establish whether an article belongs to its respective topic. Doing so can be quite an intensive task, requiring labeled data and model training. For our use case, we can provide examples, create a cluster, and tag articles without having to provide labeled examples or train additional models. This is shown in the following screenshot of the results page of our website.

In our application, we ingest new articles on a schedule. We use EventBridge schedules to periodically call a Lambda function, which checks if there are new articles. If there are, it creates an embedding from them using Amazon Bedrock and Cohere.

We calculate the article’s distance to the different topic embeddings, and can then determine whether the article belongs to that category. This is done with Aurora PostgreSQL with pgvector. We store the embeddings of the topics and then calculate their distance using the following SQL query:

const topics = await sqlClient.then(it=> it.query(
    `SELECT name, embedding_description, similarity
     FROM (SELECT topic_id as name, embedding_description, (1 - ABS(1 - (embed.embedding <-> $1))) AS "similarity" FROM topic_embedding_link embed) topics
     ORDER BY similarity desc`,
    [toSql(articleEmbedding)]
  ))

The <-> operator in the preceding code calculates the Euclidean distance between the article and the topic embedding. This number allows us to understand how close an article is to one of the topics. We can then determine the appropriateness of a topic based on this ranking.

We then tag the article with the topic. We do this so that the subsequent request for a topic is as computationally light as possible; we do a simple join rather than calculating the Euclidean distance.

const formattedTopicInsert = pgformat(
    `INSERT INTO feed_article_topic_link(topic_id, feed_article_id) VALUES %L ON CONFLICT DO NOTHING`,
    topicLinks
  )

We also cache a specific topic/feed combination because these are calculated hourly and aren’t expected to change in the interim.

Semantic search

As previously discussed, the embeddings produced by Cohere contain a multitude of features; they embed the meanings and semantics of a word or phrase. We’ve also found that we can perform mathematical operations on these embeddings to do things such as calculate the similarity between two phrases or words.

We can use these embeddings and calculate the similarity between a search term and an embedding of an article with the k-NN algorithm to find articles that have similar semantics and meanings to the search term we’ve provided.

For example, in one of our RSS feeds, we have a lot of different articles that rate products. In a traditional search system, we’d rely on keyword matches to provide relevant results. Although it might be simple to find a specific article (for example, by searching “best digital notebooks”), we would need a different method to capture multiple product list articles.

In a semantic search system, we first transform the term “Product list” into an embedding. We can then use the properties of this embedding to perform a search within our embedding space. Using the k-NN algorithm, we can find articles that are semantically similar. As shown in the following screenshot, despite not containing the text “Product list” in either the title or description, we’ve been able to find articles that contain a product list. This is because we were able to capture the semantics of the query and match it to the existing embeddings we have for each article.

Semantic search example

In our application, we store these embeddings using pgvector on Aurora PostgreSQL. pgvector is an open source extension that enables vector similarity search in PostgreSQL. We transform our search term into an embedding using Amazon Bedrock and Cohere v3 Embed.
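
To give an idea of the embedding step, the following TypeScript sketch converts a search term into an embedding by calling the Cohere Embed English v3 model through Amazon Bedrock. The model ID and request body follow the Cohere Embed format as I understand it; treat them as assumptions to verify against the current Amazon Bedrock documentation.

import { BedrockRuntimeClient, InvokeModelCommand } from "@aws-sdk/client-bedrock-runtime";

const bedrock = new BedrockRuntimeClient({ region: "us-east-1" });

// Convert a search term into an embedding with Cohere Embed English v3.
// input_type distinguishes query embeddings from document embeddings.
async function embedSearchTerm(term: string): Promise<number[]> {
  const response = await bedrock.send(
    new InvokeModelCommand({
      modelId: "cohere.embed-english-v3",
      contentType: "application/json",
      accept: "application/json",
      body: JSON.stringify({ texts: [term], input_type: "search_query" }),
    })
  );
  const payload = JSON.parse(new TextDecoder().decode(response.body));
  return payload.embeddings[0];
}

// Example usage: embed the query before running the pgvector search below.
const queryEmbedding = await embedSearchTerm("Product list");

The resulting vector can then be passed as the $2 parameter in the pgvector query that follows.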

After we’ve converted the search term to an embedding, we can compare it with the embeddings on the article that have been saved during the ingestion process. We can then use pgvector to find articles that are clustered together. The SQL code for that is as follows:

SELECT *
FROM (
    SELECT feed_articles.id as id, title, feed_articles.feed_id as feed, feedName, slug, description, url, author, image, published_at as published, 1 - ABS(1 - (embedding <-> $2)) AS "similarity"
    FROM feed_articles
    INNER JOIN (select feed_id, name as feedName from feed_user_subscription fus where fus.user_id=$1) sub on feed_articles.feed_id=sub.feed_id
    ${feedId != undefined ? `WHERE feed_articles.feed_id = $4` : ""}
) AS ranked_articles
WHERE similarity > 0.95
ORDER BY similarity desc
LIMIT $3;

This code calculates the similarity between the search term’s embedding ($2) and the embedding of each article the user is subscribed to. If the similarity is high, we can assume the article is semantically related to the search term, so it’s returned in the results, ordered by how closely it matches.

Prerequisites

To deploy this application in your own account, you need the following prerequisites:

  • An active AWS account.
  • Model access for Cohere Embed English. On the Amazon Bedrock console, choose Model access in the navigation pane, then choose Manage model access. Select the FMs of your choice and request access.

Model access dialog

Deploy the AWS CDK stack

When the prerequisite steps are complete, you’re ready to set up the solution:

  1. Clone the GitHub repository containing the solution files:
    git clone https://github.com/aws-samples/rss-aggregator-using-cohere-embeddings-bedrock
  2. Navigate to the solution directory:
    cd infrastructure
  3. In your terminal, export your AWS credentials for a role or user in ACCOUNT_ID. The role needs to have all necessary permissions for AWS CDK deployment:
    • export AWS_REGION="<region>"
      – The AWS Region you want to deploy the application to
    • export AWS_ACCESS_KEY_ID="<access-key>"
      – The access key of your role or user
    • export AWS_SECRET_ACCESS_KEY="<secret-key>"
      – The secret key of your role or user
  4. If you’re deploying the AWS CDK for the first time, run the following command:
    cdk bootstrap
  5. To synthesize the AWS CloudFormation template, run the following command:
    cdk synth -c vpc_id=<ID of your VPC>
  6. To deploy, use the following command:
    cdk deploy -c vpc_id=<ID of your VPC>

When deployment is finished, you can check these deployed stacks by visiting the AWS CloudFormation console, as shown in the following screenshot.

Application CDK Stack

Clean up

Run the following command in the terminal to delete the CloudFormation stack provisioned using the AWS CDK:

cdk destroy --all

Conclusion

In this post, we explored what language embeddings are and how they can be used to enhance your application. We’ve learned how, by using the properties of embeddings, we can implement a real-time zero-shot classifier and can add powerful features such as semantic search.

The code for this application can be found on the accompanying GitHub repo. We encourage you to experiment with language embeddings and find out what powerful features they can enable for your applications!


About the Author

Thomas Rogers is a Solutions Architect based in Amsterdam, the Netherlands. He has a background in software engineering. At AWS, Thomas helps customers build cloud solutions, focusing on modernization, data, and integrations.

Read More

Physicists Tap James Webb Space Telescope to Track New Asteroids and City-Killer Rock


Asteroids were responsible for extinction events hundreds of millions of years ago on Earth, providing no shortage of doomsday film plots for Hollywood.

But researchers focused on asteroid tracking are on a mission to locate them for today’s real-world concerns: planetary defense.

The new and unexpected discovery tool applied in this research is NASA’s James Webb Space Telescope (JWST), which was tapped for views of these asteroids from previous research, enabled by NVIDIA accelerated computing.

An international team of researchers, led by MIT physicists, reported on the cover of Nature this week how the new method was able to spot 10-meter asteroids within the main asteroid belt located between Jupiter and Mars.

These rocks in space can range from the size of a bus to several Costco stores in width and deliver destruction to cities on Earth.

The finding of more than 100 space rocks of this size marks the smallest asteroids ever detected in the main asteroid belt. Previously, the smallest asteroids spotted measured more than half a mile in diameter.

Researchers say the novel method — tapping into previous studies, asteroid synthetic movement tracking and infrared observations — will help identify and track orbital movements of asteroids likely to approach Earth, supporting asteroid defense efforts.

“We have been able to detect near-Earth objects down to 10 meters in size when they are really close to Earth,” Artem Burdanov, the study’s co-lead author and a research scientist in MIT’s Department of Earth, Atmospheric and Planetary Sciences, told MIT News. “We now have a way of spotting these small asteroids when they are much farther away, so we can do more precise orbital tracking, which is key for planetary defense.”

New research has also supported follow-up observations on asteroid 2024YR4, which is on a potential collision course with Earth by 2032.

Capturing Asteroid Images With Infrared JWST Driven by NVIDIA GPUs

Observatories typically look at the reflected light off asteroids to determine their size, which can be inaccurate. Using a telescope with infrared capabilities, like the JWST, can help track the thermal signals of asteroids, providing a more precise way of gauging their size.

Asteroid hunters focused on planetary defense are looking out for near-Earth asteroids. These rocks have orbits around the Sun that are within 28 million miles of Earth’s orbit. And any asteroid around 450 feet long is capable of demolishing a sizable city.

The asteroid paper’s co-authors included MIT professors of planetary science co-lead Julien de Wit and Richard Binzel. Contributions from international institutions included the University of Liege in Belgium, Charles University in the Czech Republic, the European Space Agency, and institutions in Germany including the Max Planck Institute for Extraterrestrial Physics and the University of Oldenburg.

The work was supported by the NVIDIA Academic Grant Program.

Harnessing GPUs to Save the Planet From Asteroids

The 2024YR4 near-Earth asteroid — estimated as wide as 300 feet and capable of destroying a city the size of New York — has a 2.3% chance of striking Earth.

Movies like Armageddon provide fictional solutions, like implanting a nuclear bomb, but it’s unclear how this could play out off screen.

The JWST technology will soon be the only telescope capable of tracking the space rock as it moves away from Earth before coming back.

The new study used the JWST, the best-ever telescope for infrared observations, on images of TRAPPIST-1, a star located about 40 light-years from Earth that has been studied for signs of atmospheres around its seven terrestrial planets. The data include more than 10,000 images of the star.

After processing the images from JWST to study TRAPPIST-1’s planets, the researchers considered whether they could do more with the datasets. They asked whether they could search for otherwise undetectable asteroids using JWST’s infrared capabilities and a new detection technique they had deployed on other datasets, called synthetic tracking.

The researchers applied synthetic tracking methods, which don’t require prior information about an asteroid’s motion. Instead, they perform a “fully blind” search by testing possible shifts, such as velocity vectors.

Such techniques are computationally intense, and they created bottlenecks until NVIDIA GPUs were applied to such work in recent years. Harnessing GPU-based synthetic tracking increases the scientific return on resources when conducting exoplanet transit-search surveys by recovering serendipitous asteroid detections, the study said.

After applying their GPU-based framework for detecting asteroids in targeted exoplanet surveys, the researchers were able to detect eight known and 139 unknown asteroids, the paper’s authors noted.

“Today’s GPU technology was key to unlocking the scientific achievement of detecting the small-asteroid population of the main belt, but there is more to it in the form of planetary-defense efforts,” said de Wit. “Since our study, the potential Earth-impactor 2024YR4 has been detected, and we now know that JWST can observe such an asteroid all the way out to the main belt as they move away from Earth before coming back. And in fact, JWST will do just that soon.”

 

Gif attribution: https://en.wikipedia.org/wiki/2024_YR4#/media/File:2024_YR4_ESO-VLT.gif

Read More

GeForce NOW Welcomes Warner Bros. Games to the Cloud With ‘Batman: Arkham’ Series


It’s a match made in heaven — GeForce NOW and Warner Bros. Games are collaborating to bring the beloved Batman: Arkham series to the cloud as part of GeForce NOW’s fifth-anniversary celebration. Just in time for Valentine’s Day, gamers can fall in love all over again with Gotham City’s Dark Knight, streaming his epic adventures from anywhere, on nearly any device.

The sweet treats don’t end there. GeForce NOW also brings the launch of the highly anticipated Sid Meier’s Civilization VII.

It’s all part of the lovable lineup of seven games joining the cloud this week.

A Match Made in Gotham City

GeForce NOW is welcoming Warner Bros. Games to the cloud with the Batman: Arkham series, including Batman: Arkham Asylum Game of the Year Edition, Batman: Arkham City Game of the Year Edition and Batman: Arkham Knight Premium. Don the cape and cowl of the world’s greatest detective, bringing justice to the streets of Gotham City with bone-crushing combat and ingenious gadgets.

Batman: Arkham Asylum on GeForce NOW
Villains check in, but they don’t check out.

Experience the dark and gritty world of Gotham City’s infamous asylum in the critically acclaimed action-adventure game Batman: Arkham Asylum. The Game of the Year (GOTY) Edition enhances the original title with additional challenge maps, allowing players to test their skills as the Dark Knight. Unravel the Joker’s sinister plot, face off against iconic villains and harness Batman’s gadgets and detective abilities in this groundbreaking title.

Arkham City on GeForce NOW
Every street is a crime scene.

Members can expand their crimefighting horizons in the open-world sequel, Batman: Arkham City. The GOTY Edition includes the full game, plus all downloadable content, featuring Catwoman, Nightwing and Robin as playable characters. Explore the sprawling super-prison of Arkham City, confront a rogues’ gallery of villains and uncover the mysteries behind Hugo Strange’s Protocol 10. With enhanced gameplay and an even larger arsenal of gadgets, Batman: Arkham City elevates the Batman experience to new heights.

Arkham Knight on GeForce NOW
Just another Tuesday in the neighborhood.

Conclude the Batman: Arkham trilogy in style with Batman: Arkham Knight. The Premium Edition includes the base game and season pass, offering new story missions, additional DC Super-Villains, legendary Batmobiles, advanced challenge maps and alternative character skins. Take control of a fully realized Gotham City, master the iconic Batmobile and face off against the Scarecrow and the mysterious Arkham Knight in this epic finale. With stunning visuals and refined gameplay, Batman: Arkham Knight delivers the ultimate Batman experience.

It’s the ideal time for members to be swept off their feet by the Caped Crusader. Stream the Batman: Arkham series with a GeForce NOW Ultimate membership and experience these iconic titles in stunning 4K resolution at up to 120 frames per second. Feel the heartbeat of Gotham City, the rush of grappling between skyscrapers and the thrill of outsmarting Gotham’s most notorious villains — all from the cloud.

Build an Empire in the Cloud

GeForce NOW’s fifth-anniversary celebration continues this week with the gift of Sid Meier’s Civilization VII in the cloud at launch.

Civ 7 on GeForce NOW
Grow your civilization with the power of the cloud.

2K Games’ highly anticipated sequel comes to the cloud with innovative gameplay mechanics. This latest installment introduces a dynamic three-Age structure — Antiquity, Exploration and Modern — allowing players to evolve their civilizations throughout history and transition between civilizations with the flexibility to shape empires’ destinies.

Explore unknown lands, expand territories, engage in diplomacy or battle with rival nations. Sid Meier’s Civilization VII introduces a crisis-event system at the end of each era, bringing challenges for players to navigate.

With its refined gameplay and bold new features, the title offers both longtime fans and newcomers a fresh and engaging take on the classic formula that has defined the series for decades.

Prepare for New Games

Legacy steel & sorcery on GeForce NOW
Steel and sorcery, meet cloud and FPS.

Legacy: Steel & Sorcery is an action-packed player vs. player vs. environment extraction role-playing game (RPG) set in the fantasy world of Mithrigarde by Notorious Studios, former World of Warcraft developers. Choose from distinctive classes like Warrior, Hunter, Rogue and Priest, each with unique abilities and environmental interactions. The game features a dynamic combat system emphasizing skill-based PvP, a full-loot system and RPG progression elements. Explore expansive outdoor zones solo or team up with friends to search for treasures, complete quests and battle both AI-controlled foes and rival players for an immersive fantasy RPG experience with a fresh twist on the extraction genre.

Look for the following games available to stream in the cloud this week:

What are you planning to play this weekend? Let us know on X or in the comments below.

Read More

ARMOR: Egocentric Perception for Humanoid Robot Collision Avoidance and Motion Planning

Humanoid robots have significant gaps in their sensing and perception, making it hard to perform motion planning in dense environments. To address this, we introduce ARMOR, a novel egocentric perception system that integrates both hardware and software, specifically incorporating wearable-like depth sensors for humanoid robots. Our distributed perception approach enhances the robot’s spatial awareness, and facilitates more agile motion planning. We also train a transformer-based imitation learning (IL) policy in simulation to perform dynamic collision avoidance, by leveraging around 86 hours…Apple Machine Learning Research

Robust Autonomy Emerges from Self-Play

Self-play has powered breakthroughs in two-player and multi-player games. Here we show that self-play is a surprisingly effective strategy in another domain. We show that robust and naturalistic driving emerges entirely from self-play in simulation at unprecedented scale — 1.6~billion~km of driving. This is enabled by GigaFlow, a batched simulator that can synthesize and train on 42 years of subjective driving experience per hour on a single 8-GPU node. The resulting policy achieves state-of-the-art performance on three independent autonomous driving benchmarks. The policy outperforms the…Apple Machine Learning Research