Enabling Fast Gradient Clipping and Ghost Clipping in Opacus

Introduction and Context

Differentially Private Stochastic Gradient Descent (DP-SGD) is the canonical method for training machine learning models with differential privacy. It involves the following two modifications to its non-private counterpart, Stochastic Gradient Descent.

  1. Per-sample gradient clipping: Clip gradients with respect to every sample in the mini-batch, ensuring that its norm is at most a pre-specified value, “Clipping Norm”, C, in every iteration.

  2. Noise addition: Add Gaussian noise of pre-specified variance, depending on the clipping norm and privacy parameters, to the average clipped gradient, in every iteration.

The first change, per-sample gradient clipping, introduces additional complexities since, in general, it requires instantiating per-sample gradients.

Opacus is a PyTorch implementation of DP-SGD. Opacus addresses the above task by employing hook functions, which allows intervening on specific events, such as forward and backward passes. For more details about Opacus, we encourage readers to review the previous blog posts: DP-SGD Algorithm Explained, Efficient Per-Sample Gradient Computation in Opacus and Efficient Per-Sample Gradient Computation for More Layers in Opacus.

While Opacus provides substantial efficiency gains compared to the naive approaches, the memory cost of instantiating per-sample gradients is significant. In particular, memory usage is proportional to the batch size times the number of trainable parameters. Consequently, memory limits Opacus to small batch sizes and/or small models, significantly restricting its range of applications.

We introduce Fast Gradient Clipping and Ghost Clipping to Opacus, which enable developers and researchers to perform gradient clipping without instantiating per-sample gradients. As an example, this allows fine-tuning 7M parameters of BERT on a single 16 GB GPU with a batch size of 1024, with memory comparable to plain PyTorch (without DP-SGD). In contrast, the previous version of Opacus supported a maximum batch size of roughly 256 for the same setting. We provide a tutorial on how to use Fast Gradient Clipping in Opacus with the aforementioned task as an example.

Fast Gradient Clipping and Ghost Clipping

The key idea behind these techniques is based on the following observation: suppose the per-sample gradient norms are known; then gradient clipping can be achieved by backpropagation on a re-weighted loss function $\bar{L}$. This loss function is defined as $\bar{L} = \sum_{i} R_{i} L_{i}$, where $R_i = \min\left(\frac{C}{C_i}, 1\right)$ are the clipping coefficients computed from the per-sample gradient norms $\{C_i\}$ and $\{L_i\}$ are the per-sample losses.

The above idea may seem circular at first glance, as it appears to require instantiating per-sample gradients in order to calculate per-sample gradient norms. However, for certain widely-used components of neural network architectures, such as fully connected/linear layers, it is indeed possible to obtain per-sample gradient norms in a single backpropagation pass without the need for per-sample gradients. This suggests a workflow that involves two backpropagation passes: the first to compute per-sample gradient norms, and the second to compute the aggregated (not per-sample) clipped gradient. The second backpropagation is simply the standard batched backpropagation.
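To make this concrete, here is a minimal sketch of the second, re-weighted backward pass in PyTorch, assuming the per-sample gradient norms from the first pass are available in a tensor per_sample_norms and that the per-sample losses were computed with reduction="none". The variable names are illustrative, not Opacus's internal API.

# Assumptions (illustrative, not the Opacus API):
#   per_sample_losses: tensor of shape [batch_size], from a criterion with reduction="none"
#   per_sample_norms:  tensor of shape [batch_size], the C_i obtained in the first pass
#   max_grad_norm:     the clipping norm C
coeff = (max_grad_norm / (per_sample_norms + 1e-6)).clamp(max=1.0)  # R_i = min(C / C_i, 1)

# Backpropagating through the re-weighted loss yields the sum of clipped
# per-sample gradients without ever materializing them individually.
reweighted_loss = (coeff.detach() * per_sample_losses).sum()
reweighted_loss.backward()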

backpropagation diagram

Figure 1: Comparison between vanilla Opacus (top left), Fast Gradient Clipping (top right), and Ghost Clipping (bottom). Gradient instantiations that become memory bottlenecks are marked in red. Vanilla Opacus must instantiate the full per-sample gradients. Fast Gradient Clipping instantiates the per-sample gradients of one layer at a time to compute their norms, and releases them as soon as the backward pass moves on to the next layer. Ghost Clipping works directly from per-sample activation gradients and per-sample activations, and avoids gradient instantiation entirely.

Fast Gradient Clipping
In Fast Gradient Clipping, the per-sample gradient norm is calculated in three steps:

  1. For each layer, the per-sample gradient is instantiated and its norm is calculated.
  2. The per-sample gradient is then immediately discarded.
  3. The (squared) per-sample gradient norms of each layer are summed up to obtain the overall (squared) per-sample gradient norm.
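For a single linear layer (bias omitted), the three steps above might look like the following sketch, where backprops and activations are the per-sample activation gradients and activations captured for that layer. These names, and the running total_sq_norms accumulator, are illustrative rather than Opacus internals.

import torch

# backprops:      [batch_size, output_width], captured during the backward pass
# activations:    [batch_size, input_width], captured during the forward pass
# total_sq_norms: [batch_size], initialized to zeros before the backward pass

per_sample_grads = torch.einsum("bo,bi->boi", backprops, activations)       # step 1: instantiate
per_layer_sq_norms = per_sample_grads.flatten(start_dim=1).pow(2).sum(dim=1)
del per_sample_grads                                                        # step 2: discard immediately
total_sq_norms += per_layer_sq_norms                                        # step 3: accumulate across layers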

Ghost Clipping
Extending the approach of Fast Gradient Clipping, Ghost Clipping uses the fact that for linear layers¹, per-sample gradient norms can be calculated just from activation gradients and activations. In particular, let backprops and activations be per-sample activation gradients and activations, of dimensions batch_size ✕ output_width and batch_size ✕ input_width, respectively. The per-sample gradient is the outer product of the two, which takes O(batch_size ✕ input_width ✕ output_width) time and space.

The Ghost Clipping trick instead calculates the (squared) norm of backprops and activations, sample-wise, and takes their product, which gives the (squared) norm of the per-sample gradient. This takes O(batch_size ✕ (input_width + output_width)) time and O(batch_size) space to store. Since the per-sample activations and per-sample activation gradients are already stored, additional memory is needed only for storing the norms.
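A sketch of the same per-layer computation with the Ghost Clipping trick, using the same illustrative tensors as above, avoids the outer product entirely:

# For a linear layer, ||outer(g_i, a_i)||_F^2 = ||g_i||^2 * ||a_i||^2, so the
# per-sample gradient norm can be computed without forming the per-sample gradient.
backprop_sq_norms = backprops.pow(2).sum(dim=1)      # [batch_size]
activation_sq_norms = activations.pow(2).sum(dim=1)  # [batch_size]
per_layer_sq_norms = backprop_sq_norms * activation_sq_norms
total_sq_norms += per_layer_sq_norms                 # accumulate across layers as before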

Relationship between Fast Gradient Clipping and Ghost Clipping

  1. Fast Gradient Clipping and Ghost Clipping are complementary techniques. Fast Gradient Clipping can be applied to any type of layer, while Ghost Clipping is a strictly better technique for supported layers.
  2. Our implementation automatically switches to Fast Gradient Clipping when the layer is not supported by Ghost Clipping.

How to use Fast Gradient Clipping in Opacus

The training loop is identical to the standard PyTorch loop. As before, we use PrivacyEngine(), which “sanitizes” the model and optimizer. To enable Ghost Clipping, pass the argument grad_sample_mode="ghost". Additionally, make_private() takes the loss criterion as an extra input and sanitizes it. This allows the two backward passes, and the loss rescaling in between, to be hidden inside loss.backward().

import torch.nn as nn
from opacus import PrivacyEngine

criterion = nn.CrossEntropyLoss()  # example loss function

privacy_engine = PrivacyEngine()
model_gc, optimizer_gc, criterion_gc, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=noise_multiplier,
    max_grad_norm=max_grad_norm,
    criterion=criterion,
    grad_sample_mode="ghost",
)

# The training loop below is identical to that of PyTorch

for input_data, target_data in train_loader:
    output_gc = model_gc(input_data) # Forward pass
    optimizer_gc.zero_grad()
    loss = criterion_gc(output_gc, target_data)
    loss.backward()
    optimizer_gc.step()  # Add noise and update the model

Internally, before the first backward pass, we enable hooks that capture the layer-wise values produced by the forward and backward calls. These values are used to compute the per-sample gradient norms. We then compute the clipping coefficients, rescale the loss function, and disable the hooks, which lets the second backward pass run as a standard PyTorch backward pass.
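Conceptually, the captured values are the same ones standard PyTorch hooks expose. The sketch below illustrates the mechanism on the user's model; it is a simplified illustration, not Opacus's actual hook implementation:

import torch.nn as nn

captured = {}  # per-layer activations and activation gradients

def forward_hook(module, inputs, output):
    # Save per-sample activations (the layer input) during the forward pass.
    captured.setdefault(module, {})["activations"] = inputs[0].detach()

def backward_hook(module, grad_input, grad_output):
    # Save per-sample activation gradients during the backward pass.
    captured.setdefault(module, {})["backprops"] = grad_output[0].detach()

for layer in model.modules():  # model is the module passed to make_private()
    if isinstance(layer, nn.Linear):
        layer.register_forward_hook(forward_hook)
        layer.register_full_backward_hook(backward_hook)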

Memory Complexity Analysis

Consider a multi-layer neural network with the following properties:

L: Number of layers
d: Maximum layer width
B: Batch size
K: Number of non-supported/non-linear layers

The memory overhead of DP-SGD with Ghost Clipping compared to plain (PyTorch) SGD is an additive O(BL), required to store the per-sample gradient norms for all layers. Further, if there is a non-supported layer (if K ≥ 1), then an additional O(Bd²) of memory is needed to instantiate the per-sample gradients of that layer.
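As a rough back-of-the-envelope illustration using the BERT setup from the next section (about 7.6M trainable fp32 parameters), per-sample gradients dominate memory while the per-layer norms stored by Ghost Clipping are negligible. The layer count below is illustrative, not the exact number of instrumented modules:

B, P, L = 512, 7.6e6, 3          # batch size, trainable parameters, instrumented layers (illustrative)
bytes_per_float = 4

per_sample_grad_bytes = B * P * bytes_per_float  # ~15.6 GB, consistent with vanilla DP-SGD going OOM at B = 512 on a 16 GB GPU
per_sample_norm_bytes = B * L * bytes_per_float  # ~6 KB of additional storage for Ghost Clipping

print(f"{per_sample_grad_bytes / 1e9:.1f} GB vs {per_sample_norm_bytes / 1e3:.1f} KB")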

Memory Benchmarking

We provide results on the memory usage for a variety of settings.

Fine-Tuning BERT

We consider the problem of privately fine-tuning the last three layers of BERT for a text classification task. The base model has over 100M parameters, of which we fine-tune the last three layers, BertEncoder, BertPooler, and Classifier, comprising roughly 7.6M parameters. The experiments are run on a P100 GPU with 16 GB of memory.

The following table reports the maximum memory and time taken per iteration for the various methods:

| Method | B = 32 | B = 128 | B = 512 | B = 1024 | B = 2048 |
| --- | --- | --- | --- | --- | --- |
| PyTorch SGD | 236 MB / 0.15 s | 1.04 GB / 0.55 s | 5.27 GB / 2.1 s | 12.7 GB / 4.2 s | OOM |
| DP-SGD | 1,142 MB / 0.21 s | 4.55 GB / 0.68 s | OOM | OOM | OOM |
| FGC DP-SGD | 908 MB / 0.21 s | 3.6 GB / 0.75 s | OOM | OOM | OOM |
| GC DP-SGD | 362 MB / 0.21 s | 1.32 GB / 0.67 s | 5.27 GB / 2.5 s | 12.7 GB / 5 s | OOM |

Each cell shows peak memory / time per iteration; OOM indicates the run exceeded GPU memory.

In terms of peak memory footprint, DP-SGD > FGC DP-SGD ≫ GC DP-SGD ≈ PyTorch SGD. Further, the runtimes are similar because most of the parameters are frozen and the forward pass takes up most of the time.

Synthetic Setup: Memory Profiling

We consider the following setup to profile the memory used by PyTorch SGD, vanilla DP-SGD, and DP-SGD with Ghost Clipping (GC DP-SGD).

  • 2-layer fully connected neural network
    • Input: 5120
    • Hidden: 2560
    • Output: 1280
    • Total number of model parameters = 15.6M
    • Model size = 62.5 MB
  • Batch size: varies, as shown in the table below.

The table below summarizes the max memory increase (in MB) broken down by stages of the training loop for each of the methods.

| Batch Size | Method | Model to GPU | Forward | First Backward | Second Backward | Optimizer Step |
| --- | --- | --- | --- | --- | --- | --- |
| 32 | PyTorch SGD | 62.5 | 0.5 | 62.5 | N/A | 0 |
| 32 | Vanilla DP-SGD | 62.5 | 0.47 | 3,663 | N/A | 162.5 |
| 32 | GC DP-SGD | 62.5 | 0.47 | 63.13 | 50 | 125 |
| 2¹⁷ | PyTorch SGD | 62.5 | 1,920 | 1,932.5 | N/A | 0 |
| 2¹⁷ | Vanilla DP-SGD | OOM | | | | |
| 2¹⁷ | GC DP-SGD | 62.5 | 1,920 | 2,625 | 1,932.5 | 125 |

Industry use case

We tested Ghost Clipping DP-SGD on an internal Meta use case, consisting of a model of roughly 100B parameters with 40M trainable parameters. Our initial results show that Ghost Clipping DP-SGD reduces memory usage by 95% compared to vanilla DP-SGD and achieves memory usage comparable to PyTorch SGD.

Conclusion

In this post, we describe implementations of Fast Gradient Clipping and Ghost Clipping in Opacus that enable memory-efficient training of machine learning models with differential privacy. Currently, the Ghost Clipping implementation only applies to linear layers, but, as outlined in part 3 of the series, it can be extended to “generalized” linear layers such as convolutions and multi-head attention. The current techniques require two explicit backpropagation steps, which increases runtime. To mitigate this, we will explore developments on top of Ghost Clipping, such as the Book-Keeping algorithm.

To learn more about Opacus, visit opacus.ai and github.com/pytorch/opacus.

Acknowledgements

We thank Iden Kalemaj, Darren Liu, Karthik Prasad, Hao Shi, Igor Shilov, Davide Testuggine, Eli Uriegas, Haicheng Wang, and Richard Zou for valuable feedback and suggestions.

  1. There are ways to extend Ghost Clipping to non-linear layers. 

Read More

On the Benefits of Pixel-Based Hierarchical Policies for Task Generalization

Reinforcement learning practitioners often avoid hierarchical policies, especially in image-based observation spaces. Typically, the single-task performance improvement over flat-policy counterparts does not justify the additional complexity associated with implementing a hierarchy. However, by introducing multiple decision-making levels, hierarchical policies can compose lower-level policies to more effectively generalize between tasks, highlighting the need for multi-task evaluations. We analyze the benefits of hierarchy through simulated multi-task robotic control experiments from pixels…Apple Machine Learning Research

Cohere Rerank 3 Nimble now generally available on Amazon SageMaker JumpStart

The Cohere Rerank 3 Nimble foundation model (FM) is now generally available in Amazon SageMaker JumpStart. This model is the newest FM in Cohere’s Rerank model series, built to enhance enterprise search and Retrieval Augmented Generation (RAG) systems.

In this post, we discuss the benefits and capabilities of this new model with some examples.

Overview of Cohere Rerank models

Cohere’s Rerank family of models is designed to enhance existing enterprise search systems and RAG systems. Rerank models improve search accuracy over both keyword-based and embedding-based search systems. Cohere Rerank 3 is designed to reorder documents retrieved by initial search algorithms based on their relevance to a given query. A reranking model, also known as a cross-encoder, is a type of model that, given a query and document pair, outputs a similarity score. For FMs, words, sentences, or entire documents are often encoded as dense vectors in a semantic space. By calculating the cosine of the angle between these vectors, you can quantify their semantic similarity and output it as a single similarity score. You can use this score to reorder the documents by relevance to your query.

Cohere Rerank 3 Nimble is the newest model from Cohere’s Rerank family of models, designed to improve speed and efficiency from its predecessor Cohere Rerank 3. According to Cohere’s benchmark tests including BEIR (Benchmarking IR) for accuracy and internal benchmarking datasets, Cohere Rerank 3 Nimble maintains high accuracy while being approximately 3–5 times faster than Cohere Rerank 3. The speed improvement is designed for enterprises looking to enhance their search capabilities without sacrificing performance.

The following diagram represents the two-stage retrieval of a RAG pipeline and illustrates where Cohere Rerank 3 Nimble is incorporated into the search pipeline.

Flow of Solution

In the first stage of retrieval in the RAG architecture, a set of candidate documents are returned based on the knowledge base that’s relevant to the query. In the second stage, Cohere Rerank 3 Nimble analyzes the semantic relevance between the query and each retrieved document, reordering them from most to least relevant. The top-ranked documents augment the original query with additional context. This process improves search result quality by identifying the most pertinent documents. Integrating Cohere Rerank 3 Nimble into a RAG system enables users to send fewer but higher-quality documents to the language model for grounded generation. This results in improved accuracy and relevance of search results without adding latency.
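As a rough sketch of where the reranker sits in such a pipeline, the following code assumes a placeholder first-stage retriever (first_stage_search) and the co client that is created in the deployment steps later in this post; it is an illustration, not a complete RAG implementation.

# Sketch of a two-stage retrieval pipeline. first_stage_search is a placeholder
# for your existing keyword or embedding-based retriever; co is the Cohere
# client created in the deployment section below.
query = "What emails have been about returning items?"

candidates = first_stage_search(query, k=100)  # stage 1: fetch candidate documents

# Stage 2: rerank candidates by semantic relevance and keep only the top few.
reranked = co.rerank(documents=candidates, query=query,
                     rank_fields=["Title", "Content"], top_n=5)
print(reranked)  # top_n results, each with an index and relevance_score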

Overview of SageMaker JumpStart

SageMaker JumpStart offers access to a broad selection of publicly available FMs. These pre-trained models serve as powerful starting points that can be deeply customized to address specific use cases. You can now use state-of-the-art model architectures, such as language models, computer vision models, and more, without having to build them from scratch.

Amazon SageMaker is a comprehensive, fully managed machine learning (ML) platform that revolutionizes the entire ML workflow. It offers an unparalleled suite of tools that cater to every stage of the ML lifecycle, from data preparation to model deployment and monitoring. Data scientists and developers can use the SageMaker integrated development environment (IDE) to access a vast array of pre-built algorithms, customize their own models, and seamlessly scale their solutions. The platform’s strength lies in its ability to abstract away the complexities of infrastructure management, allowing you to focus on innovation rather than operational overhead. The automated ML capabilities of SageMaker, including automated machine learning (AutoML) features, democratize ML by enabling even non-experts to build sophisticated models. Furthermore, its robust governance features help organizations maintain control and transparency over their ML projects, addressing critical concerns around regulatory compliance.

Prerequisites

Make sure your SageMaker AWS Identity and Access Management (IAM) service role has the AmazonSageMakerFullAccess permission policy attached.

To deploy Cohere Rerank 3 Nimble successfully, confirm one of the following:

  • Make sure your IAM role has the following permissions and you have the authority to make AWS Marketplace subscriptions in the AWS account used:
    • aws-marketplace:ViewSubscriptions
    • aws-marketplace:Unsubscribe
    • aws-marketplace:Subscribe
  • Alternatively, confirm your AWS account has a subscription to the model. If so, you can skip the following deployment instructions and start with subscribing to the model package.

Deploy Cohere Rerank 3 Nimble on SageMaker JumpStart

You can access the Cohere Rerank 3 family of models using SageMaker JumpStart in Amazon SageMaker Studio, as shown in the following screenshot.

Cohere SageMaker JumpStart view

Deployment starts when you choose Deploy, and you may be prompted to subscribe to this model through AWS Marketplace. If you are already subscribed, you can choose Deploy again to deploy the model. After deployment finishes, you will see that an endpoint is created. You can test the endpoint by passing a sample inference request payload or by selecting the testing option using the SDK.

Cohere rerank model card

Subscribe to the model package

To subscribe to the model package, complete the following steps:

  1. Depending on the model you want to deploy, open the model package listing page for cohere-rerank-nimble-english or cohere-rerank-nimble-multilingual.
  2. On the AWS Marketplace listing, choose Continue to subscribe.
  3. On the Subscribe to this software page, review and choose Accept Offer if you and your organization agree with the EULA, pricing, and support terms.
  4. Choose Continue to configuration and then choose an AWS Region.

A product ARN will be displayed. This is the model package ARN that you need to specify while creating a deployable model using Boto3.

Deploy Cohere Rerank 3 Nimble using the SDK

To deploy the model using the SDK, copy the product ARN from the previous step and specify it in the model_package_arn in the following code:

from cohere_aws import Client
import boto3
region = boto3.Session().region_name

model_package_arn = "Specify the model package ARN here"

After you specify the model package ARN, you can create the endpoint, as shown in the following code. Specify the name of the endpoint, the instance type, and the number of instances being used. Make sure you have the account-level service limit for using ml.g5.xlarge for endpoint usage as one or more instances. To request a service quota increase, refer to AWS service quotas.

co = Client(region_name=region)
co.create_endpoint(arn=model_package_arn, endpoint_name="cohere-rerank-3/cohere-rerank-nimble-multilingual", instance_type="ml.g5.xlarge", n_instances=1)

If the endpoint is already created, you just need to connect to it with the following code:

co.connect_to_endpoint(endpoint_name="cohere-rerank-3/cohere-rerank-nimble-multilingual-v3")

Follow a similar process as detailed earlier to deploy Cohere Rerank 3 on SageMaker JumpStart.

Inference example with Cohere Rerank 3 Nimble

Cohere Rerank 3 Nimble offers robust multilingual support. The model is available in both English and multilingual versions supporting over 100 languages.

The following code example illustrates how to perform real-time inference using Cohere Rerank 3 Nimble-English:

documents = [
    {"Title":"Incorrect Password","Content":"Hello, I have been trying to access my account for the past hour and it keeps saying my password is incorrect. Can you please help me?"},
    {"Title":"Confirmation Email Missed","Content":"Hi, I recently purchased a product from your website but I never received a confirmation email. Can you please look into this for me?"},
    {"Title":"Questions about Return Policy","Content":"Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."},
    {"Title":"Customer Support is Busy","Content":"Good morning, I have been trying to reach your customer support team for the past week but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Received Wrong Item","Content":"Hi, I have a question about my recent order. I received the wrong item and I need to return it."},
    {"Title":"Customer Service is Unavailable","Content":"Hello, I have been trying to reach your customer support team for the past hour but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Return Policy for Defective Product","Content":"Hi, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."},
    {"Title":"Wrong Item Received","Content":"Good morning, I have a question about my recent order. I received the wrong item and I need to return it."},
    {"Title":"Return Defective Product","Content":"Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."}
]

In the following code, the top_n inference parameter for Cohere Rerank 3 and Rerank 3 Nimble specifies the number of top-ranked results to return after reranking the input documents. It allows you to control how many of the most relevant documents are included in the final output. To determine an optimal value for top_n, consider factors such as the diversity of your document set, the complexity of your queries, and the desired balance between precision and latency for enterprise search or RAG.

response = co.rerank(documents=documents, query='What emails have been about returning items?', rank_fields=["Title","Content"], top_n=2)

The following is the output from Cohere Rerank 3 Nimble-English:

Documents: [RerankResult<document: {'Title': 'Received Wrong Item', 'Content': 'Hi, I have a question about my recent order. I received the wrong item and I need to return it.'}, index: 4, relevance_score: 0.0068771075>, RerankResult<document: {'Title': 'Wrong Item Received', 'Content': 'Good morning, I have a question about my recent order. I received the wrong item and I need to return it.'}, index: 7, relevance_score: 0.0064131636>]

Cohere Rerank 3 Nimble multilingual support

The multilingual capabilities of Cohere Rerank 3 Nimble-Multilingual enable global organizations to provide consistent, improved search experiences to users across different Regions and language preferences.

In the following example, we create an input payload for a list of emails in multiple languages. We can take the same set of emails from earlier and translate them to different languages. These examples are available under the SageMaker JumpStart model card and are randomly generated for this example.

documents = [
    {"Title":"Contraseña incorrecta","Content":"Hola, llevo una hora intentando acceder a mi cuenta y sigue diciendo que mi contraseña es incorrecta. ¿Puede ayudarme, por favor?"},
    {"Title":"Confirmation Email Missed","Content":"Hi, I recently purchased a product from your website but I never received a confirmation email. Can you please look into this for me?"},
    {"Title":"أسئلة حول سياسة الإرجاع","Content":"مرحبًا، لدي سؤال حول سياسة إرجاع هذا المنتج. لقد اشتريته قبل بضعة أسابيع وهو معيب"},
    {"Title":"Customer Support is Busy","Content":"Good morning, I have been trying to reach your customer support team for the past week but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Falschen Artikel erhalten","Content":"Hallo, ich habe eine Frage zu meiner letzten Bestellung. Ich habe den falschen Artikel erhalten und muss ihn zurückschicken."},
    {"Title":"Customer Service is Unavailable","Content":"Hello, I have been trying to reach your customer support team for the past hour but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Return Policy for Defective Product","Content":"Hi, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."},
    {"Title":"收到错误物品","Content":"早上好,关于我最近的订单,我有一个问题。我收到了错误的商品,需要退货。"},
    {"Title":"Return Defective Product","Content":"Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."}
]

Use the following code to perform real-time inference using Cohere Rerank 3 Nimble-Multilingual:

response = co.rerank(documents=documents, query='What emails have been about returning items?', rank_fields=['Title','Content'], top_n=2)
print(f'Documents: {response}')

The following is the output from Cohere Rerank 3 Nimble-Multilingual:

Documents: [RerankResult<document: {'Title': '收到错误物品', 'Content': '早上好,关于我最近的订单,我有一个问题。我收到了错误的商品,需要退货。'}, index: 7, relevance_score: 0.034553625>, RerankResult<document: {'Title': 'أسئلة حول سياسة الإرجاع', 'Content': 'مرحبًا، لدي سؤال حول سياسة إرجاع هذا المنتج. لقد اشتريته قبل بضعة أسابيع وهو معيب'}, index: 2, relevance_score: 0.00037263767>]

The output translated to English is as follows:

Documents: [RerankResult<document: {'Title': 'Received Wrong Item', 'Content': 'Good morning, I have a question about my recent order. I received the wrong item and need to return it.'}, index: 7, relevance_score: 0.034553625>, RerankResult<document: {'Title': 'Questions about Return Policy', 'Content': 'Hello, I have a question about the return policy for this product. I bought it a few weeks ago and it's defective'}, index: 2, relevance_score: 0.00037263767>]

In both examples, the relevance scores are normalized to be in the range [0, 1]. Scores close to 1 indicate a high relevance to the query, and scores closer to 0 indicate low relevance.

Use cases suitable for Cohere Rerank 3 Nimble

The Cohere Rerank 3 Nimble model provides an option that prioritizes efficiency. The model is ideal for enterprises looking to enable their customers to accurately search complex documentation, build applications that understand over 100 languages, and retrieve the most relevant information from various data stores. In industries such as retail, where website drop-off increases with every 100 milliseconds added to search response time, having a faster AI model like Cohere Rerank 3 Nimble powering the enterprise search system translates to higher conversion rates.

Conclusion

Cohere Rerank 3 and Rerank 3 Nimble are now available on SageMaker JumpStart. To get started, refer to Train, deploy, and evaluate pretrained models with SageMaker JumpStart.

Interested in diving deeper? Check out the Cohere on AWS GitHub repo.


About the Authors

Breanne Warner is an Enterprise Solutions Architect at Amazon Web Services supporting healthcare and life science (HCLS) customers. She is passionate about supporting customers to use generative AI on AWS and evangelizing model adoption. Breanne is also on the Women@Amazon board as co-director of Allyship with the goal of fostering inclusive and diverse culture at Amazon. Breanne holds a Bachelor’s of Science in Computer Engineering from University of Illinois at Urbana Champaign (UIUC)

Nithin Vijeaswaran is a Solutions Architect at AWS. His area of focus is generative AI and AWS AI Accelerators. He holds a Bachelor’s degree in Computer Science and Bioinformatics. Niithiyn works closely with the Generative AI GTM team to enable AWS customers on multiple fronts and accelerate their adoption of generative AI. He’s an avid fan of the Dallas Mavericks and enjoys collecting sneakers.

Karan Singh is a Generative AI Specialist for third-party models at AWS, where he works with top-tier third-party foundational model providers to define and run joint GTM motions that help customers train, deploy, and scale foundational models. Karan holds a Bachelor’s of Science in Electrical and Instrumentation Engineering from Manipal University and a Master’s in Science in Electrical Engineering from Northwestern University, and is currently an MBA Candidate at the Haas School of Business at University of California, Berkeley.

Read More

AI Chases the Storm: New NVIDIA Research Boosts Weather Prediction, Climate Simulation

As hurricanes, tornadoes and other extreme weather events occur with increased frequency and severity, it’s more important than ever to improve and accelerate climate research and prediction using the latest technologies.

Amid peaks in the current Atlantic hurricane season, NVIDIA Research today announced a new generative AI model, dubbed StormCast, for emulating high-fidelity atmospheric dynamics. This means the model can enable reliable weather prediction at mesoscale — a scale larger than storms but smaller than cyclones — which is critical for disaster planning and mitigation.

Detailed in a paper written in collaboration with the Lawrence Berkeley National Laboratory and the University of Washington, StormCast arrives as extreme weather phenomena are taking lives, destroying homes and causing more than $150 billion in damage annually in the U.S. alone.

It’s just one example of how generative AI is supercharging thundering breakthroughs in climate research and actionable extreme weather prediction, helping scientists tackle challenges of the highest stakes: saving lives and the world.

NVIDIA Earth-2 — a digital twin cloud platform that combines the power of AI, physical simulations and computer graphics — enables simulation and visualization of weather and climate predictions at a global scale with unprecedented accuracy and speed.

At COMPUTEX in June, NVIDIA founder and CEO Jensen Huang announced CorrDiff, available through Earth-2.

In Taiwan, for example, the National Science and Technology Center for Disaster Reduction predicts fine-scale details of typhoons using CorrDiff, an NVIDIA generative AI model offered as part of Earth-2.

CorrDiff can super-resolve 25-kilometer-scale atmospheric data by 12.5x down to 2 kilometers — 1,000x faster and using 3,000x less energy for a single inference than traditional methods.

That means the center’s potentially lifesaving work, which previously cost nearly $3 million on CPUs, can be accomplished using about $60,000 on a single system with an NVIDIA H100 Tensor Core GPU. It’s a massive reduction that shows how generative AI and accelerated computing increase energy efficiency and lower costs.

The center also plans to use CorrDiff to predict downwash — when strong winds funnel down to street level, damaging buildings and affecting pedestrians — in urban areas.

Now, StormCast adds hourly autoregressive prediction capabilities to CorrDiff, meaning it can predict future outcomes based on past ones.

A Global Impact From a Regional Focus

Global climate research begins at a regional level.

Physical hazards of weather and climate change can vary dramatically on regional scales. But reliable numerical weather prediction at this level comes with substantial computational costs. This is due to the high spatial resolution needed to represent the underlying fluid-dynamic motions at mesoscale.

Regional weather prediction models — often referred to as convection-allowing models, or CAMs — have traditionally forced researchers to face varying tradeoffs in resolution, ensemble size and affordability.

CAMs are useful to meteorologists for tracking the evolution and structure of storms, as well as for monitoring their convective mode, or how a storm is organized when it forms. For example, the likelihood of a tornado is based on a storm’s structure and convective mode.

A mesoscale convective system visualized using NOAA’s Geostationary Operational Environmental Satellite. Image courtesy of NOAA.

CAMs also help researchers understand the implications for weather-related physical hazards at the infrastructure level.

For example, global climate model simulations can be used to inform CAMs, helping them translate slow changes in the moisture content of large atmospheric rivers into flash-flooding projections in vulnerable coastal areas.

At lower resolutions, machine learning models trained on global data have emerged as useful emulators of numerical weather prediction models that can be used to improve early-warning systems for severe events. These machine learning models typically have a spatial resolution of about 30 kilometers and a temporal resolution of six hours.

Now, with the help of generative diffusion, StormCast enables this at a 3-kilometer, hourly scale.

Despite being in its infancy, the model — when applied with precipitation radars — already offers forecasts with lead times of up to six hours that are up to 10% more accurate than the U.S. National Oceanic and Atmospheric Administration (NOAA)’s state-of-the-art 3-kilometer operational CAM.

Plus, outputs from StormCast exhibit physically realistic heat and moisture dynamics, and can predict over 100 variables, such as temperature, moisture concentration, wind and rainfall radar reflectivity values at multiple, finely spaced altitudes. This enables scientists to confirm the realistic 3D evolution of a storm’s buoyancy — a first-of-its-kind accomplishment in AI weather simulation.

NVIDIA researchers trained StormCast on approximately three-and-a-half years of NOAA climate data from the central U.S., using NVIDIA accelerated computing to speed calculations.

More Innovations Brewing

Scientists are already looking to harness the model’s benefits.

“Given both the outsized impacts of organized thunderstorms and winter precipitation, and the major challenges in forecasting them with confidence, the production of computationally tractable storm-scale ensemble weather forecasts represents one of the grand challenges of numerical weather prediction,” said Tom Hamill, head of innovation at The Weather Company. “StormCast is a notable model that addresses these challenges, and The Weather Company is excited to collaborate with NVIDIA on developing, evaluating and potentially using these deep learning forecast models.”

“Developing high-resolution weather models requires AI algorithms to resolve convection, which is a huge challenge,” said Imme Ebert-Uphoff, machine learning lead at Colorado State University’s Cooperative Institute for Research in the Atmosphere. “The new NVIDIA research explores the potential of accomplishing this with diffusion models like StormCast, which presents a significant step toward the development of future AI models for high-resolution weather prediction.”

Alongside the acceleration and visualization of physically accurate climate simulations, as well as a digital twin of our planet, such research breakthroughs signify how NVIDIA Earth-2 is enabling a new, vital era of climate research.

Learn more about sustainable computing and NVIDIA Research, a global team of hundreds of scientists and engineers focused on topics including climate AI, computer graphics, computer vision, self-driving cars and robotics.

Featured image courtesy of NASA.

See notice regarding software product information.

Read More

Can You Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?

Self-supervised features are typically used in place of filter-bank features in speaker verification models. However, these models were originally designed to ingest filter-banks as inputs, and thus, training them on self-supervised features assumes that both feature types require the same amount of learning for the task. In this work, we observe that pre-trained self-supervised speech features inherently include information required for a downstream speaker verification task, and therefore, we can simplify the downstream model without sacrificing performance. To this end, we revisit the…Apple Machine Learning Research

Perform generative AI-powered data prep and no-code ML over any size of data using Amazon SageMaker Canvas

Amazon SageMaker Canvas now empowers enterprises to harness the full potential of their data by enabling support of petabyte-scale datasets. Starting today, you can interactively prepare large datasets, create end-to-end data flows, and invoke automated machine learning (AutoML) experiments on petabytes of data—a substantial leap from the previous 5 GB limit. With over 50 connectors, an intuitive Chat for data prep interface, and petabyte support, SageMaker Canvas provides a scalable, low-code/no-code (LCNC) ML solution for handling real-world, enterprise use cases.

Organizations often struggle to extract meaningful insights and value from their ever-growing volume of data. You need data engineering expertise and time to develop the proper scripts and pipelines to wrangle, clean, and transform data. Then you must experiment with numerous models and hyperparameters requiring domain expertise. Afterward, you need to manage complex clusters to process and train your ML models over these large-scale datasets.

Starting today, you can prepare your petabyte-scale data and explore many ML models with AutoML by chat and with a few clicks. In this post, we show you how you can complete all these steps with the new integration in SageMaker Canvas with Amazon EMR Serverless without writing code.

Solution overview

For this post, we use a sample dataset of a 33 GB CSV file containing flight purchase transactions from Expedia between April 16, 2022, and October 5, 2022. We use the features to predict the base fare of a ticket based on the flight date, distance, seat type, and others.

In the following sections, we demonstrate how to import and prepare the data, optionally export the data, create a model, and run inference, all in SageMaker Canvas.

Prerequisites

You can follow along by completing the following prerequisites:

  1. Set up SageMaker Canvas.
  2. Download the dataset from Kaggle and upload it to an Amazon Simple Storage Service (Amazon S3) bucket.
  3. Add emr-serverless as a trusted entity to the SageMaker Canvas execution role to allow Amazon EMR processing jobs.

Import data in SageMaker Canvas

We start by importing the data from Amazon S3 using Amazon SageMaker Data Wrangler in SageMaker Canvas. Complete the following steps:

  1. In SageMaker Canvas, choose Data Wrangler in the navigation pane.
  2. On the Data flows tab, choose Tabular on the Import and prepare dropdown menu.
  3. Enter the S3 URI for the file and choose Go, then choose Next.
  4. Give your dataset a name, choose Random for Sampling method, then choose Import.

Importing data from the SageMaker Data Wrangler flow allows you to interact with a sample of the data before scaling the data preparation flow to the full dataset. This improves time and performance because you don’t need to work with the entirety of the data during preparation. You can later use EMR Serverless to handle the heavy lifting. When SageMaker Data Wrangler finishes importing, you can start transforming the dataset.

After you import the dataset, you can first look at the Data Quality Insights Report to see recommendations from SageMaker Canvas on how to improve the data quality and therefore improve the model’s performance.

  1. In the flow, choose the options menu (three dots) for the node, then choose Get data insights.
  2. Give your analysis a name, select Regression for Problem type, choose baseFare for Target column, select Sampled dataset for Data Size, then choose Create.

Assessing the data quality and analyzing the report’s findings is often the first step because it can guide the proceeding data preparation steps. Within the report, you will find dataset statistics, high priority warnings around target leakage, skewness, anomalies, and a feature summary.

Prepare the data with SageMaker Canvas

Now that you understand your dataset characteristics and potential issues, you can use the Chat for data prep feature in SageMaker Canvas to simplify data preparation with natural language prompts. This generative artificial intelligence (AI)-powered capability reduces the time, effort, and expertise required for the often complex tasks of data preparation.

  1. Choose the .flow file on the top banner to go back to your flow canvas.
  2. Choose the options menu for the node, then choose Chat for data prep.

For our first example, converting searchDate and flightDate to datetime format might help us perform date manipulations and extract useful features such as year, month, day, and the difference in days between searchDate and flightDate. These features can find temporal patterns in the data that can influence the baseFare.

  1. Provide a prompt like “Convert searchDate and flightDate to datetime format” to view the code and choose Add to steps.

In addition to data preparation using the chat UI, you can use LCNC transforms with the SageMaker Data Wrangler UI to transform your data. For example, we use one-hot encoding as a technique to convert categorical data into numerical format using the LCNC interface.

  1. Add the transform Encode categorical.
  2. Choose One-hot encode for Transform and add the following columns: startingAirport, destinationAirport, fareBasisCode, segmentsArrivalAirportCode, segmentsDepartureAirportCode, segmentsAirlineName, segmentsAirlineCode, segmentsEquipmentDescription, and segmentsCabinCode.

You can use the advanced search and filter option in SageMaker Canvas to select columns that are of String data type to simplify the process.

Refer to the SageMaker Canvas blog for other examples using SageMaker Data Wrangler. For this post, we simplify our efforts with these two steps, but we encourage you to use both chat and transforms to add data preparation steps on your own. In our testing, we successfully ran all our data preparation steps through the chat using the following prompts as an example:

  • “Add another step that extracts relevant features such as year, month, day, and day of the week which can enhance temporality to our dataset”
  • “Have Canvas convert the travelDuration, segmentsDurationInSeconds, and segmentsDistance column from string to numeric”
  • “Handle missing values by imputing the mean for the totalTravelDistance column, and replacing missing values as ‘Unknown’ for the segmentsEquipmentDescription column”
  • “Convert boolean columns isBasicEconomy, isRefundable, and isNonStop to integer format (0 and 1)”
  • “Scale numerical features like totalFare, seatsRemaining, totalTravelDistance using Standard Scaler from scikit-learn”
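For readers curious what such steps translate to in code, the following is a rough, hand-written pandas/scikit-learn equivalent of a few of the prompts above, using column names from this dataset. It is an illustration, not the code SageMaker Canvas generates, and the local file name is assumed.

import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("itineraries.csv")  # assumption: local copy of the flight-purchase dataset

# Convert searchDate and flightDate to datetime and derive a temporal feature.
df["searchDate"] = pd.to_datetime(df["searchDate"])
df["flightDate"] = pd.to_datetime(df["flightDate"])
df["daysToFlight"] = (df["flightDate"] - df["searchDate"]).dt.days

# One-hot encode a categorical column and convert a boolean column to 0/1.
df = pd.get_dummies(df, columns=["segmentsCabinCode"])
df["isNonStop"] = df["isNonStop"].astype(int)

# Impute missing travel distance with the mean, then scale numeric features.
df["totalTravelDistance"] = df["totalTravelDistance"].fillna(df["totalTravelDistance"].mean())
df[["totalFare", "seatsRemaining", "totalTravelDistance"]] = StandardScaler().fit_transform(
    df[["totalFare", "seatsRemaining", "totalTravelDistance"]]
)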

When these steps are complete, you can move to the next step of processing the full dataset and creating a model.

(Optional) Export your data in Amazon S3 using an EMR Serverless job

You can process the entire 33 GB dataset by running the data flow using EMR Serverless for the data preparation job without worrying about the infrastructure.

  1. From the last node in the flow diagram, choose Export and Export data to Amazon S3.
  2. Provide a dataset name and output location.
  3. It is recommended to keep Auto job configuration selected unless you want to change any of the Amazon EMR or SageMaker Processing configs. (If your data is greater than 5 GB, data processing will run in EMR Serverless; otherwise, it will run within the SageMaker Canvas workspace.)
  4. Under EMR Serverless, provide a job name and choose Export.

You can view the job status in SageMaker Canvas on the Data Wrangler page on the Jobs tab.

You can also view the job status on the Amazon EMR Studio console by choosing Applications under Serverless in the navigation pane.

Create a model

You can also create a model at the end of your flow.

  1. Choose Create model from the node options, and SageMaker Canvas will create a dataset and then navigate you to create a model.
  2. Provide a dataset and model name, select Predictive analysis for Problem type, choose baseFare as the target column, then choose Export and create model.

The model creation process will take a couple of minutes to complete.

  1. Choose My Models in the navigation pane.
  2. Choose the model you just exported and navigate to version 1.
  3. Under Model type, choose Configure model.
  4. Select Numeric model type, then choose Save.
  5. On the dropdown menu, choose Quick Build to start the build process.

When the build is complete, on the Analyze page, you can view the following tabs:

  • Overview – This gives you a general overview of the model’s performance, depending on the model type.
  • Scoring – This shows visualizations that you can use to get more insights into your model’s performance beyond the overall accuracy metrics.
  • Advanced metrics – This contains your model’s scores for advanced metrics and additional information that can give you a deeper understanding of your model’s performance. You can also view information such as the column impacts.

Run inference

In this section, we walk through the steps to run batch predictions against the generated dataset.

  1. On the Analyze page, choose Predict.
  2. To generate predictions on your test dataset, choose Manual.
  3. Select the test dataset you created and choose Generate predictions.
  4. When the predictions are ready, either choose View in the pop-up message at the bottom of the page or navigate to the Status column to choose Preview on the options menu (three dots).

You’re now able to review the predictions.

You have now used the generative AI data preparation capabilities in SageMaker Canvas to prepare a large dataset, trained a model using AutoML techniques, and run batch predictions at scale. All of this was done with a few clicks and using a natural language interface.

Clean up

To avoid incurring future session charges, log out of SageMaker Canvas. To log out, choose Log out in the navigation pane of the SageMaker Canvas application.

When you log out of SageMaker Canvas, your models and datasets aren’t affected, but SageMaker Canvas cancels any Quick build tasks. If you log out of SageMaker Canvas while running a Quick build, your build might be interrupted until you relaunch the application. When you relaunch, SageMaker Canvas automatically restarts the build. Standard builds continue even if you log out.

Conclusion

The introduction of petabyte-scale AutoML support within SageMaker Canvas marks a significant milestone in the democratization of ML. By combining the power of generative AI, AutoML, and the scalability of EMR Serverless, we’re empowering organizations of all sizes to unlock insights and drive business value from even the largest and most complex datasets.

The benefits of ML are no longer confined to the domain of highly specialized experts. SageMaker Canvas is revolutionizing the way businesses approach data and AI, putting the power of predictive analytics and data-driven decision-making into the hands of everyone. Explore the future of no-code ML with SageMaker Canvas today.


About the authors

Bret Pontillo is a Sr. Solutions Architect at AWS. He works closely with enterprise customers building data lakes and analytical applications on the AWS platform. In his free time, Bret enjoys traveling, watching sports, and trying new restaurants.

Polaris Jhandi is a Cloud Application Architect with AWS Professional Services. He has a background in AI/ML & big data. He is currently working with customers to migrate their legacy Mainframe applications to the Cloud.

Peter Chung is a Solutions Architect serving enterprise customers at AWS. He loves to help customers use technology to solve business problems on various topics like cutting costs and leveraging artificial intelligence. He wrote a book on AWS FinOps, and enjoys reading and building solutions.

Read More

Abstracts: August 15, 2024

Microsoft Research Podcast - Abstracts

Members of the research community at Microsoft work continuously to advance their respective fields. Abstracts brings its audience to the cutting edge with them through short, compelling conversations about new and noteworthy achievements.

In this episode, Microsoft Product Manager Shrey Jain and OpenAI Research Scientist Zoë Hitzig join host Amber Tingle to discuss “Personhood credentials: Artificial intelligence and the value of privacy-preserving tools to distinguish who is real online.” In their paper, Jain, Hitzig, and their coauthors describe how malicious actors can draw on increasingly advanced AI tools to carry out deception, making online deception harder to detect and more harmful. Bringing ideas from cryptography into AI policy conversations, they identify a possible mitigation: a credential that allows its holder to prove they’re a person––not a bot––without sharing any identifying information. This exploratory research reflects a broad range of collaborators from across industry, academia, and the civil sector specializing in areas such as security, digital identity, advocacy, and policy.

Transcript

[MUSIC]

AMBER TINGLE: Welcome to Abstracts, a Microsoft Research Podcast that puts the spotlight on world-class research—in brief. I’m Amber Tingle. In this series, members of the research community at Microsoft give us a quick snapshot—or a podcast abstract—of their new and noteworthy papers.

[MUSIC FADES]

Our guests today are Shrey Jain and Zoë Hitzig. Shrey is a product manager at Microsoft, and Zoë is a research scientist at OpenAI. They are two of the corresponding authors on a new paper, “Personhood credentials: Artificial intelligence and the value of privacy-preserving tools to distinguish who is real online.” This exploratory research comprises multidisciplinary collaborators from across industry, academia, and the civil sector. The paper is available now on arXiv. Shrey and Zoë, thank you so much for joining us, and welcome back to the Microsoft Research Podcast.


SHREY JAIN: Thank you. We’re happy to be back.

ZOË HITZIG: Thanks so much.

TINGLE: Shrey, let’s start with a brief overview of your paper. Why is this research important, and why do you think this is something we should all know about?

JAIN: Malicious actors have been exploiting anonymity as a way to deceive others online. And historically, deception has been viewed as this unfortunate but necessary cost as a way to preserve the internet’s commitment to privacy and unrestricted access to information. And today, AI is changing the way we should think about malicious actors’ ability to be successful in those attacks. It makes it easier to create content that is indistinguishable from human-created content, and it is possible to do so in a way that is only getting cheaper and more accessible. And so this paper aims to address a countermeasure to protect against AI-powered deception at scale while also protecting privacy. And I think the reason why people should care about this problem is for two reasons. One is it can very soon become very logistically annoying to deal with these various different types of scams that can occur. I think we’ve all been susceptible to different types of attacks or scams that, you know, people have had. But now these scams are going to become much more persuasive and effective. And so for various different recovery purposes, it can become very challenging to get access back to your accounts or rebuild your reputation that someone may damage online. But more importantly, there’s also very dangerous things that can happen. Kids might not be safe online anymore. Or our ability to communicate online for democratic processes. A lot of the way in which we shape political views today happens online. And that’s also at risk. And in response to that, we propose in this paper a solution titled personhood credentials. Personhood credentials enable people to prove that they are in fact a real person without revealing anything more about themselves online.

TINGLE: Zoë, walk us through what’s already been done in this field, and what’s your unique contribution to the literature here?

HITZIG: I see us as intervening on two separate bodies of work. And part of what we’re doing in this paper is bringing together those two bodies of work. There’s been absolutely amazing work for decades in cryptography and in security. And what cryptographers have been able to do is to figure out protocols that allow people to prove very specific claims about themselves without revealing their full identity. So when you think about walking into a bar and the bartender asks you to prove that you’re over 21—or over 18, depending on where you are—you typically have to show your full driver’s license. And now that’s revealing a lot of information. It says, you know, where you live, whether you’re an organ donor. It’s revealing a lot of information to that bartender. And online, we don’t know what different service providers are storing about us. So, you know, the bartender might not really care where we live or whether we’re an organ donor. But when we’re signing up for digital services and we have to show a highly revealing credential like a driver’s license just to get access to something, we’re giving over too much information in some sense. And so this one body of literature that we’re really drawing on is a literature in cryptography. The idea that I was talking about there, where you can prove privately just isolated claims about yourself, that’s an idea called an anonymous credential. It allows you to be anonymous with respect to some kind of service provider while still proving a limited claim about yourself, like “I am over 18,” or in the case of personhood credentials, you prove, “I am a person.” So that’s all one body of literature. Then there’s this huge other body of literature and set of conversations happening in policy circles right now around what to do about AI. Huge questions abounding. Shrey and I have written a prior paper called “Contextual Confidence and Generative AI,” which we talked about on this podcast, as well, and in that paper, we offered a framework for thinking about the specific ways that generative AI, sort of, threatens the foundations of our modes of communication online. And we outlined about 16 different solutions that could help us to solve the coming problems that generative AI might bring to our online ecosystems. And what we decided to do in this paper was focus on a set of solutions that we thought are not getting enough attention in those AI and AI policy circles. And so part of what this paper is doing is bringing together these ideas from this long body of work in cryptography into those conversations.

TINGLE: I’d like to know more about your methodology, Shrey. How did your team go about conducting this research?

JAIN: So we had a wide range of collaborators from industry, academia, the civil sector who work on topics of digital identity, privacy, advocacy, security, and AI policy which came together to think about, what is the clearest way in which we can explain what we believe is a countermeasure that can protect against AI-powered deception that, from a technological point of view, there’s already a large body of work that we can reference but from a “how this can be implemented.” Discussing the tradeoffs that various different types of academics and industry leaders are thinking about. Can we communicate that very clearly? And so the methodology here was really about bringing together a wide range of collaborators to really bridge these two bodies of work together and communicate it clearly—not just the technical solutions but also the tradeoffs.

TINGLE: So, Zoë, what are the major findings here, and how are they presented in the paper?

HITZIG: I am an economist by training. Economists love to talk about tradeoffs. You know, when you have some of this, it means you have a little bit less of that. It’s kind of like the whole business of economics. And a key finding of the paper, as I see it, is that we begin with what feels like a tradeoff, which is on the one hand, as Shrey was saying, we want to be able to be anonymous online because that has great benefits. It means we can speak truth to power. It means we can protect civil liberties and invite everyone into online spaces. You know, privacy is a core feature of the internet. And at the same time, the, kind of, other side of the tradeoff that we’re often presented is, well, if you want all that privacy and anonymity, it means that you can’t have accountability. There’s no way of tracking down the bad actors and making sure that they don’t do something bad again. And we’re presented with this tradeoff between anonymity on the one hand and accountability on the other hand. All that is to say, a key finding of this paper, as I see it, is that personhood credentials and more generally this class of anonymous credentials that allow you to prove different pieces of your identity online without revealing your entire identity actually allow you to evade the tradeoff and allow you to, in some sense, have your cake and eat it, too. What it allows us to do is to create some accountability, to put back some way of tracing people’s digital activities to an accountable entity. What we also present in the paper are a number of different, sort of, key challenges that will have to be taken into account in building any kind of system like this. But we present all of that, all of those challenges going forward, as potentially very worth grappling with because of the potential for this, sort of, idea to allow us to preserve the internet’s commitment to privacy, free speech, and anonymity while also creating accountability for harm.

TINGLE: So Zoë mentioned some of these tradeoffs. Let’s talk a little bit more about real-world impact, Shrey. Who benefits most from this work?

JAIN: I think there’s many different people that benefit. One is anyone who’s communicating or doing anything online in that they can have more confidence in their interactions. And it, kind of, builds back on the paper that Zoë and I wrote last year on contextual confidence and generative AI, which is that we want to have confidence in our interactions, and in order to do that, one component is being able to identify who you’re speaking with and also doing it in a privacy-preserving way. And I think another person who benefits is policymakers. I think today, when we think about the language and technologies that are being promoted, this complements a lot of the existing work that’s being done on provenance and watermarking. And I think the ability for those individuals to be successful in their mission, which is creating a safer online space, this work can help guide these individuals to be more effective in their mission in that it highlights a technology that is not currently as discussed comparatively to these other solutions and complements them in order to protect online communication.

HITZIG: You know, social media is flooded with bots, and sometimes the problem with bots is that they’re posting fake content, but other times, the problem with bots is that there are just so many of them and they’re all retweeting each other and it’s very hard to tell what’s real. And so what a personhood credential can do is say, you know, maybe each person is only allowed to have five accounts on a particular social media platform.

TINGLE: So, Shrey, what’s next on your research agenda? Are there lingering questions—I know there are—and key challenges here, and if so, how do you hope to answer them?

JAIN: We believe we’ve aggregated a strong set of industry, academic, and, you know, civil sector collaborators, but we’re only a small subset of the people who are going to be interacting with these systems. And so the first area of next steps is to gather feedback about the proposal of a solution that we’ve had and how can we improve that: are there tradeoffs that we’re missing? Are there technical components that we weren’t thinking as deeply through? And I think there’s a lot of narrow open questions that come out of this. For instance, how do personhood credentials relate to existing laws regarding identity theft or protection laws? In areas where service providers can’t require government IDs, how does that apply to personhood credentials that rely on government IDs? I think that there’s a lot of these open questions that we address in the paper that I think need more experimentation and thinking through but also a lot of empirical work to be done. How do people react to personhood credentials, and does it actually enhance confidence in their interactions online? I think that there’s a lot of open questions on the actual effectiveness of these tools. And so I think there’s a large area of work to be done there, as well.

HITZIG: I’ve been thinking a lot about the early days of the internet. I wasn’t around for that, but I know that every little decision that was made in a very short period of time had incredibly lasting consequences that we’re still dealing with now. There’s an enormous path dependence in every kind of technology. And I feel that right now, we’re in that period of time, the small window where generative AI is this new thing to contend with, and it’s uprooting many of our assumptions about how our systems can work or should work. And I’m trying to think about how to set up those institutions, make these tiny decisions right so that in the future we have a digital architecture that’s really serving the goals that we want it to serve.

[MUSIC]

TINGLE: Very thoughtful. With that, Shrey Jain, Zoë Hitzig, thank you so much for joining us today.

HITZIG: Thank you so much, Amber.

TINGLE: And thanks to our listeners, as well. If you’d like to learn more about Shrey and Zoë’s work on personhood credentials and advanced AI, you’ll find a link to this paper at aka.ms/abstracts, or you can read it on arXiv. Thanks again for tuning in. I’m Amber Tingle, and we hope you’ll join us next time on Abstracts.

[MUSIC FADES]

The post Abstracts: August 15, 2024 appeared first on Microsoft Research.

Read More