New Levels Unlocked: Africa’s Game Developers Reach Toward the Next Generation 

Looking for a challenge? Try maneuvering a Kenyan minibus through traffic or dropping seed balls on deforested landscapes.

Or download Africa’s Legends and battle through fiendishly difficult puzzles with Ghana’s Ananse or Nigeria’s Oya by your side.

Games like these are resonating with a hyper-connected African youth population that’s growing fast.

Africa is the youngest region in the world.

Sixty percent of the continent’s population is under 25, and the UN predicts that Africa’s youth population will increase by 42 percent by 2030.

Disposable incomes are rising and high-speed internet connections are proliferating, too.

Africa is projected to have more than 680 million mobile phone users by the end of 2025, driving a surge in the number of gamers.

Across the region, 177 million gamers, or 95 percent of the total, play on mobile devices.

As a result Africa is the fastest-growing region for mobile game downloads, according to mobile insights firm App Annie.

Kenya’s Usiku Games and Ghana’s Leti Arts are among the new generation of African game developers who are pioneering games that connect with the experiences, challenges and histories of these gamers.

Each is developing mobile games aimed at educating youth on a continent where 41 percent of the population is under 15.

And more are coming: in January, South African startup Carry1st raised $20 million from marquee investors such as Andreessen Horowitz and Google for a mobile game publishing platform targeting the African market.

The timing couldn’t be better. Market research firm Mordor Intelligence expects gaming revenue on the continent to grow at a 12 percent annual rate through 2026 compared to 9.6 percent for the entire world.

A Path for Africa’s Gaming Developers

As the creator of Okoa Simba, the first game developed in Kenya to be published globally, Nairobi-based Usiku Games believes it can serve as a role model for future game developers in Africa.

Usiku Games is determined to reach younger audiences with educational messages that are embedded within compelling games and animations.

“Our games directly influence the knowledge and behavior of youth on topics such as gender-based violence, mental health, sexual and reproductive health, education, and peaceful resolution of conflicts,” said Usiku Games founder and CEO Jay Shapiro.

One of its projects includes working in Unreal Engine with NVIDIA technologies to create a 3D game focused on HIV prevention and contraception for teen girls.

“For game developers such as myself, this is about making something that will capture the imagination and inspire vulnerable youth in Africa, and all parts of the world,” said Shapiro, a Toronto native who has lived in Singapore, New York, Mexico and Cambodia. “I want to create rich, visually compelling stories that impact and serve the next generation.”

Creating Visual Stories With NVIDIA GPUs

Founded in 2009 as Ghana’s first game studio, Leti Arts uses NVIDIA GPUs to build mobile games and digital comics based on African history and folklore.

“Games with African settings made by Africans are the best way to cultivate a sense of cultural authenticity,” said Leti co-founder and CEO Eyram Tawia.

A comic and computer game enthusiast since junior high school, Tawia, a fellow of the Mandela Washington Fellowship, the flagship program of the U.S. government’s Young African Leaders Initiative, wanted to turn the stories he’d heard and drawn as a child into immersive experiences.

“Art and culture contribute just as much to an economy as jobs,” Tawia said. “They help increase a community’s social capital, attracting talent, growth and innovation.”

The nine-person company’s most successful games include Africa’s Legends (2014) and The Hottseat (2019).

Leti Arts’ long-term vision is to make games from Africa for the world. Tawia says the high quality of its games helps players relate more closely to the content being produced.

The continent is home to a growing number of game studios. In addition to Usiku Games and Leti Arts, they include Maliyo Games, Kiro’o Games, Kayfo Games and others.

More games, and game developers, are coming. Tawia and Leti Arts have worked to mentor talent through internships, boot camps and workshops.

Last year, Leti trained and supported more than 30 game developers in partnership with ITTHYK Gaming, in a program sponsored by Microsoft.

Tolo Sagala is the heroine of Leti Arts’ “Africa’s Legends – The Game.”

Expanding the Omniverse

Both Usiku and Leti Arts, which are members of NVIDIA Inception, a global program designed to nurture cutting-edge startups, are also exploring NVIDIA Omniverse for real-time 3D design collaboration, AI-powered animation and game development.

With Africa’s gaming industry worth well over half a billion dollars in 2021, investments are also booming for African gaming startups.

“As Africa’s demand for local and regional gaming content grows, more startups are entering this space,” said Kate Kallot, head of Emerging Areas at NVIDIA.

“Africa’s gaming landscape is punctuated by a growing base of startups and studios who are challenging the norms of traditional games, and their impact is anticipated to reach well beyond the continent itself to other game developers and audiences.”

Learn more about Leti Arts and Usiku Games, among others, by catching up on our GTC session focused on the African gaming industry. 

And check out entrepreneur and Leti Arts founder Eyram Tawia’s book, “Uncompromising Passion: The Humble Beginnings of an African Game Industry” (CreateSpace Independent Publishing Platform, 2016).


Read More

Apply profanity masking in Amazon Translate

Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and customizable language translation. This post shows how you can mask profane words and phrases with a grawlix string (“?$#@$”).

Amazon Translate typically chooses clean words for your translation output. But in some situations, you want to prevent words that are commonly considered profane from appearing in the translated output. For example, when you’re translating video captions or subtitle content, or enabling in-game chat, and you want the translated content to be age appropriate and free of any profanity, Amazon Translate allows you to mask profane words and phrases using the profanity masking setting. You can apply profanity masking to both real-time translation and asynchronous batch processing in Amazon Translate. When profanity masking is enabled, the five-character sequence ?$#@$ is used to mask each profane word or phrase, regardless of its length. Amazon Translate detects each profane word or phrase literally, not contextually.

Solution overview

To mask profane words and phrases in your translation output, enable the profanity option under the additional settings on the Amazon Translate console when you run translations through either real-time or asynchronous batch processing requests. The following sections demonstrate using profanity masking for real-time translation requests via the Amazon Translate console, the AWS Command Line Interface (AWS CLI), or the Amazon Translate SDK (Python Boto3).

Amazon Translate console

To demonstrate handling profanity with real-time translation, we use the following sample text in French to be translated into English:

Ne sois pas une garce

Complete the following steps on the Amazon Translate console:

  1. Choose French (fr) as the Source language.
  2. Choose English (en) as the Target language.
  3. Enter the preceding example text in the Source language text area.

The translated text appears under Target language. It contains a word that is considered profane in English.

  4. Expand Additional settings and enable Profanity.

The word is now replaced with the grawlix string ?$#@$.

AWS CLI

Calling the translate-text AWS CLI command with --settings Profanity=MASK masks profane words and phrases in your translated text.

The following AWS CLI commands are formatted for Unix, Linux, and macOS. For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^).

aws translate translate-text \
--text <<INPUT TEXT>> \
--source-language-code fr \
--target-language-code en \
--settings Profanity=MASK

You get a response like the following snippet:

{
    "TranslatedText": "<output text with ?$#@$>",
    "SourceLanguageCode": "fr",
    "TargetLanguageCode": "en",
    "AppliedSettings": {
        "Profanity": "MASK"
    }
}

Amazon Translate SDK (Python Boto3)

The following Python 3 code uses the real-time translation call with the profanity setting:

import boto3

# Create an Amazon Translate client
translate = boto3.client('translate')

SOURCE_TEXT = "<Sample Input Text>"
OUTPUT_LANG_CODE = 'en'

# Translate with automatic source-language detection and profanity masking enabled
result = translate.translate_text(
    Text=SOURCE_TEXT,
    SourceLanguageCode='auto',
    TargetLanguageCode=OUTPUT_LANG_CODE,
    Settings={'Profanity': 'MASK'}
)

print("Translated Text: {}".format(result['TranslatedText']))

Conclusion

You can use the profanity masking setting to mask words and phrases that are considered profane to keep your translated text clean and meet your business requirements. To learn more about all the ways you can customize your translations, refer to Customizing Your Translations using Amazon Translate.


About the Authors

Siva Rajamani is a Boston-based Enterprise Solutions Architect at AWS. He enjoys working closely with customers and supporting their digital transformation and AWS adoption journey. His core areas of focus are serverless, application integration, and security. Outside of work, he enjoys outdoor activities and watching documentaries.

Sudhanshu Malhotra is a Boston-based Enterprise Solutions Architect for AWS. He’s a technology enthusiast who enjoys helping customers find innovative solutions to complex business challenges. His core areas of focus are DevOps, machine learning, and security. When he’s not working with customers on their journey to the cloud, he enjoys reading, hiking, and exploring new cuisines.

Watson G. Srivathsan is the Sr. Product Manager for Amazon Translate, AWS’s natural language processing service. On weekends you will find him exploring the outdoors in the Pacific Northwest.

Read More

How Süddeutsche Zeitung optimized their audio narration process with Amazon Polly

This is a guest post by Jakob Kohl, a Software Developer at the Süddeutsche Zeitung. Süddeutsche Zeitung is one of the leading quality dailies in Germany when it comes to paid subscriptions and unique users. Its website, SZ.de, reaches more than 15 million monthly unique users as of October 2021.

Thanks to smart speakers and podcasts, the audio industry has experienced a real boom in recent years. At Süddeutsche Zeitung, we’re constantly looking for new ways to make our diverse journalism even more accessible. As pioneers in digital journalism, we want to open up more opportunities for Süddeutsche Zeitung readers to consume articles. We started looking for solutions that could provide high-quality audio narration for our articles. Our ultimate goal was to launch a “listen to the article” feature.

In this post, we share how we optimized our audio narration process with Amazon Polly, a service that turns text into lifelike speech using advanced deep learning technologies.

Why Amazon Polly?

We believe that Vicki, the German neural Amazon Polly voice, is currently the best German voice on the market. Amazon Polly offers an impressive ability to switch between languages, correctly pronouncing, for example, English movie titles as well as personal names in different languages (for an example, listen to the article Schall und Wahn on our website).

A big part of our infrastructure already runs on AWS, so using Amazon Polly was a perfect fit. We can combine Amazon Polly with the following components:

  • An Amazon Simple Notification Service (Amazon SNS) topic to which we can subscribe for articles. The articles are sent to this topic by the CMS whenever they’re saved by an editor.
  • An Amazon CloudFront distribution with Lambda@Edge to paywall premium articles, which we can reuse for audio versions of articles.

The Amazon Polly API is easy to use and well documented. It took us less than a week to get our proof of concept to work.

The challenge

Hundreds of new articles are published every day on SZ.de. After initial publication, they might get updated several times for various reasons—new paragraphs are added in news-driven articles, typos are fixed, teasers are changed, or metadata is optimized for search engines.

Generating speech for the initial publication of an article is straightforward, because the whole text needs to be synthesized. But how can we quickly generate the audio for updated versions of articles without paying twice for the same content? Our biggest challenge was to prevent sending the whole text to Amazon Polly repeatedly for every single update.

Our technical solution

Every time an editor saves an article, the new version of the article is published to an SNS topic. An AWS Lambda function is subscribed to this topic and invoked for every new version of an article. This function runs the following steps (a simplified code sketch follows the list):

  1. Check if the new version of the article has already been completely synthesized. If so, the function stops immediately (this may happen when only metadata is changed that doesn’t affect the audio).
  2. Convert the article into multiple SSML documents, roughly one for each text paragraph.
  3. For each SSML document, the function checks if it has already been synthesized to audio using calculated hashes. For example:
    1. If an article is saved for the first time, all SSML documents must be synthesized.
    2. If a typo has been fixed in a single paragraph, only the SSML document for this paragraph must be re-synthesized.
    3. If a new paragraph is added to the article, only the SSML document for this new paragraph must be synthesized.
  4. Send all not-yet-synthesized SSML documents separately to Amazon Polly.
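
The hashing logic in steps 3 and 4 can be sketched roughly as follows. This is not Süddeutsche Zeitung’s production code: it assumes the caller tracks the set of already-synthesized fragment hashes, and the bucket and function names are hypothetical.

import hashlib
import boto3

polly = boto3.client("polly")
s3 = boto3.client("s3")

AUDIO_BUCKET = "audio-fragments-bucket"  # hypothetical fragment bucket

def synthesize_changed_fragments(article_id, ssml_documents, known_hashes):
    """Synthesize only the SSML fragments whose content hash hasn't been seen before."""
    for index, ssml in enumerate(ssml_documents):
        fragment_hash = hashlib.sha256(ssml.encode("utf-8")).hexdigest()
        if fragment_hash in known_hashes:
            continue  # unchanged paragraph: skip, so we don't pay for re-synthesis

        # Synthesize this fragment with the German neural voice
        response = polly.synthesize_speech(
            Engine="neural",
            VoiceId="Vicki",
            OutputFormat="mp3",
            TextType="ssml",
            Text=ssml,
        )

        # Store the audio fragment so a second function can merge all fragments later
        key = f"{article_id}/{index:04d}-{fragment_hash}.mp3"
        s3.put_object(Bucket=AUDIO_BUCKET, Key=key, Body=response["AudioStream"].read())
        known_hashes.add(fragment_hash)

    return known_hashes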

These checks help optimize performance and reduce cost by preventing the synthesis of an entire article multiple times. We avoid incurring additional charges due to minor changes such as a title edit or metadata adjustments for SEO reasons.

The following diagram illustrates the solution workflow.

After Amazon Polly synthesizes the SSML documents, the audio files are sent to an output bucket in Amazon Simple Storage Service (Amazon S3). A second Lambda function listens for object creation on that bucket, waits for all audio fragments of an article to be completed, and merges them into a final audio file using FFmpeg from a Lambda layer. This final audio is sent to another S3 bucket, which is used as the origin in our CloudFront distribution. In CloudFront, we reuse an existing paywall for premium articles for the corresponding audio version.
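
As a rough illustration of the merge step, the following sketch concatenates downloaded MP3 fragments with FFmpeg’s concat demuxer via subprocess. It assumes FFmpeg is available on the path (for example, from the Lambda layer mentioned above); the function name and file paths are hypothetical.

import os
import subprocess
import tempfile

def merge_fragments(fragment_paths, output_path):
    """Concatenate MP3 fragments into one audio file using FFmpeg's concat demuxer."""
    # Write the file list in the format the concat demuxer expects
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as list_file:
        for path in fragment_paths:
            list_file.write(f"file '{path}'\n")
        list_path = list_file.name

    try:
        # "-c copy" stitches the fragments together without re-encoding the audio
        subprocess.run(
            ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
             "-i", list_path, "-c", "copy", output_path],
            check=True,
        )
    finally:
        os.remove(list_path)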

Based on our freemium model, we provide a shortened audio version of premium articles. Non-subscribers are able to listen to the first paragraph for free, but are required to purchase a subscription to access the full article.

Conclusion

Integration of Amazon Polly into our existing infrastructure was very straightforward. Our content requires minimal customization because we only include paragraphs and some additional breaks. The most challenging part was performance and cost optimization, which we achieved by splitting the article up into multiple SSML documents corresponding to paragraphs, checking for changes in each SSML document, and building the whole audio file by merging the fragments. With these optimizations, we are able to achieve the following:

  • Decrease the amount of synthesized characters by at least 50% by only synthesizing real changes.
  • Reduce the time it takes for a change in the article text to appear in the audio because there is less audio to synthesize.
  • Add arbitrary audio files between paragraphs without re-synthesizing the whole article. For example, we can include a sound file in the shortened audio version of a premium article to separate the first paragraph from the ensuing note that a subscription is needed to listen to the full version.

In the first month after the launch of the “listen to the article” feature in our SZ.de articles, we received a lot of positive user feedback. We reached almost 30,000 users during the first 2 months after launch. Of these users, approximately 200 converted to a paid subscription after listening only to the free teaser of an article behind our paywall. The “listen to the article” feature itself isn’t behind our paywall, but users can only listen to premium articles in full if they have a subscription. Our website also offers free articles without a paywall. In the future, we will expand the feature to other SZ platforms, especially our mobile news apps.


About the Author

Jakob Kohl is a Software Developer at the Süddeutsche Zeitung, where he enjoys working with modern technologies on an agile website team. He is one of the main developers of the “listen to an SZ article” feature. In his leisure time, he likes building wooden furniture, where technical and visual design is as important as in web development.

Read More

An International Scientific Challenge for the Diagnosis and Gleason Grading of Prostate Cancer

In recent years, machine learning (ML) competitions in health have attracted ML scientists to work together to solve challenging clinical problems. These competitions provide access to relevant data and well-defined problems where experienced data scientists come to compete for solutions and learn new methods. However, a fundamental difficulty in organizing such challenges is obtaining and curating high quality datasets for model development and independent datasets for model evaluation. Importantly, to reduce the risk of bias and to ensure broad applicability of the algorithm, evaluation of the generalisability of resulting algorithms should ideally be performed on multiple independent evaluation datasets by an independent group of scientists.

One clinical problem that has attracted substantial ML research is prostate cancer, a condition that 1 in 9 men develop in their lifetime. A prostate cancer diagnosis requires pathologists to examine biological tissue samples under a microscope to identify cancer and grade the cancer for signs of aggressive growth patterns in the cells. However, this cancer grading task (called Gleason grading) is difficult and subjective due to the need for visual assessment of cell differentiation and Gleason pattern predominance. Building a large dataset of samples with expert annotations can help with the development of ML systems to aid in prostate cancer grading.

To help accelerate and enable more research in this area, Google Health, Radboud University Medical Center and Karolinska Institutet joined forces to organize a global competition, the Prostate cANcer graDe Assessment (PANDA) Challenge, on the open Kaggle platform. In “Artificial Intelligence for Diagnosis and Gleason Grading of Prostate Cancer: the PANDA challenge”, published in Nature Medicine, we present the results of the challenge. The PANDA challenge provided the largest public whole-slide image dataset available and was open to participants from April 21st until July 23rd, 2020. The development datasets remain available for further research. In this effort, we compiled and publicly released a European cohort of prostate cancer cases for algorithm development and pioneered a standardized evaluation setup for digital pathology that enabled independent, blinded external validation of the algorithms on data from both the United States and EU.

The global competition attracted participants from 65 countries (the size of the circle for each country illustrates the number of participants).

Design of the PANDA Challenge
The challenge had two phases: a development phase (i.e., the Kaggle competition) and a validation phase. During the competition, 1,290 developers from 65 countries competed in building the best performing Gleason grading algorithm, having full access to a development set for algorithm training. Throughout the competition teams submitted algorithms that were evaluated on a hidden tuning set.

In the validation phase, a selection of top performing algorithms were independently evaluated on internal and external validation datasets with high quality reference grades from panels of expert prostate pathologists. In addition, a group of general pathologists graded a subset of the same cases to put the difficulty of the task and dataset in context. The algorithms submitted by the teams were then compared to grades done by groups of international and US general pathologists on these subsets.

Overview of the PANDA challenge’s phases for development and validation.

Research Velocity During the Challenge
We found that a group of Gleason grading ML algorithms developed during a global competition could achieve pathologist-level performance and generalize well to intercontinental and multinational cohorts. On all external validation sets, these algorithms achieved high agreement with urologic pathologists (prostate specialists) and high sensitivity for detecting tumor in biopsies. The Kaggle platform enabled the tracking of teams’ performance throughout the competition. Impressively, the first team to achieve agreement above 0.90 (quadratically weighted Cohen’s kappa) with the prostate pathologists on the internal validation set did so within the first 10 days of the competition. By the 33rd day, the median performance of all teams exceeded a score of 0.85.
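
Quadratically weighted Cohen’s kappa measures agreement between two graders on an ordinal scale, penalizing large disagreements far more than near-misses. Purely as an illustration (not part of the challenge code), it can be computed with scikit-learn on made-up grade assignments:

from sklearn.metrics import cohen_kappa_score

# Made-up grade-group labels (0 = benign, 1-5 = increasing severity) for the same
# eight biopsies, as assigned by an algorithm and by a reference pathologist panel.
algorithm_grades = [0, 1, 2, 2, 3, 4, 5, 5]
reference_grades = [0, 1, 2, 3, 3, 4, 4, 5]

# Quadratic weighting penalizes a grade 1 vs. 5 disagreement much more
# heavily than a grade 3 vs. 4 near-miss.
kappa = cohen_kappa_score(algorithm_grades, reference_grades, weights="quadratic")
print(f"Quadratically weighted kappa: {kappa:.3f}")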

Progression of algorithms’ performances throughout the competition, as shown by the highest score on the tuning and internal validation sets among all participating teams. During the competition teams could submit their algorithm for evaluation on the tuning set, after which they received their score. At the same time, algorithms were evaluated on the internal validation set, without disclosing these results to the participating teams. The development of the top score obtained by any team shows the rapid improvement of the algorithms.

Learning from the Challenge
By moderating the discussion forum on the Kaggle platform, we learned that the teams’ openness in sharing code via Colab notebooks led to rapid improvement across the board, a promising sign for future public challenges, and a clear indication of the power of sharing knowledge on a common platform.

Organizing a public challenge that evaluates algorithm generalization across independent cohorts using high quality reference standard panels presents substantial logistical difficulties. Assembling this size of a dataset across countries and organizations was a massive undertaking. This work benefited from an amazing collaboration between the three organizing institutions which have all contributed respective publications in this space, two in Lancet Oncology and one in JAMA Oncology. Combining these efforts provided a high quality foundation on which this competition could be based. With the publication, Radboud and Karolinska research groups are also open sourcing the PANDA challenge development datasets to facilitate the further improvement of prostate Gleason grading algorithms. We look forward to seeing many more advancements in this field, and more challenges that can catalyze extensive international knowledge sharing and collaborative research.

Acknowledgements
Key contributors to this project at Google include Po-Hsuan Cameron Chen, Kunal Nagpal, Yuannan Cai, David F. Steiner, Maggie Demkin, Sohier Dane, Fraser Tan, Greg S. Corrado, Lily Peng, Craig H. Mermel. Collaborators on this project include Wouter Bulten, Kimmo Kartasalo, Peter Ström, Hans Pinckaers, Hester van Boven, Robert Vink, Christina Hulsbergen-van de Kaa, Jeroen van der Laak, Mahul B. Amin, Andrew J. Evans, Theodorus van der Kwast, Robert Allan, Peter A. Humphrey, Henrik Grönberg, Hemamali Samaratunga, Brett Delahunt, Toyonori Tsuzuki, Tomi Häkkinen, Lars Egevad, Masi Valkonen, Pekka Ruusuvuori, Geert Litjens, Martin Eklund and the PANDA Challenge consortium. We thank Ellery Wulczyn, Annisah Um’rani, Yun Liu, and Dale Webster for their feedback on the manuscript and guidance on the project. We thank our collaborators at NMCSD, particularly Niels Olson, for internal re-use of de-identified data which contributed to the US external validation set. Sincere appreciation also goes to Sami Lachgar, Ashley Zlatinov, and Lauren Winer for their feedback on the blogpost.

Read More

Guiding Frozen Language Models with Learned Soft Prompts

Large pre-trained language models, which are continuing to grow in size, achieve state-of-the-art results on many natural language processing (NLP) benchmarks. Since the development of GPT and BERT, standard practice has been to fine-tune models on downstream tasks, which involves adjusting every weight in the network (i.e., model tuning). However, as models become larger, storing and serving a tuned copy of the model for each downstream task becomes impractical.

An appealing alternative is to share across all downstream tasks a single frozen pre-trained language model, in which all weights are fixed. In an exciting development, GPT-3 showed convincingly that a frozen model can be conditioned to perform different tasks through “in-context” learning. With this approach, a user primes the model for a given task through prompt design, i.e., hand-crafting a text prompt with a description or examples of the task at hand. For instance, to condition a model for sentiment analysis, one could attach the prompt, “Is the following movie review positive or negative?” before the input sequence, “This movie was amazing!”

Sharing the same frozen model across tasks greatly simplifies serving and allows for efficient mixed-task inference, but unfortunately, this is at the expense of task performance. Text prompts require manual effort to design, and even well-designed prompts still far underperform compared to model tuning. For instance, the performance of a frozen GPT-3 175B parameter model on the SuperGLUE benchmark is 5 points below a fine-tuned T5 model that uses 800 times fewer parameters.

In “The Power of Scale for Parameter-Efficient Prompt Tuning”, presented at EMNLP 2021, we explore prompt tuning, a more efficient and effective method for conditioning frozen models using tunable soft prompts. Just like engineered text prompts, soft prompts are concatenated to the input text. But rather than selecting from existing vocabulary items, the “tokens” of the soft prompt are learnable vectors. This means a soft prompt can be optimized end-to-end over a training dataset. In addition to removing the need for manual design, this allows the prompt to condense information from datasets containing thousands or millions of examples. By comparison, discrete text prompts are typically limited to under 50 examples due to constraints on model input length. We are also excited to release the code and checkpoints to fully reproduce our experiments.

Prompt tuning retains the strong task performance of model tuning, while keeping the pre-trained model frozen, enabling efficient multitask serving.

Prompt Tuning
To create a soft prompt for a given task, we first initialize the prompt as a fixed-length sequence of vectors (e.g., 20 tokens long). We attach these vectors to the beginning of each embedded input and feed the combined sequence into the model. The model’s prediction is compared to the target to calculate a loss, and the error is back-propagated to calculate gradients; however, we apply these gradient updates only to our new learnable vectors, keeping the core model frozen. While soft prompts learned in this way are not immediately interpretable, at an intuitive level, the soft prompt is extracting evidence about how to perform a task from the labeled dataset, performing the same role as a manually written text prompt, but without the need to be constrained to discrete language.
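
Our implementation lives in the JAX-based T5X codebase described below; purely to illustrate the mechanics, here is a minimal PyTorch-style sketch in which a randomly initialized soft prompt is the only trainable tensor, prepended to the embedded input of a small frozen stand-in model. All component names and sizes are hypothetical, and a frozen linear head stands in for the text readout that a real frozen T5 would produce.

import torch
import torch.nn as nn

class SoftPromptModel(nn.Module):
    """A frozen toy encoder in which the soft prompt is the only trainable tensor."""

    def __init__(self, embedding, encoder, head, prompt_length=20):
        super().__init__()
        self.embedding = embedding
        self.encoder = encoder
        self.head = head

        # The soft prompt: prompt_length learnable vectors in the embedding space
        embed_dim = embedding.embedding_dim
        self.soft_prompt = nn.Parameter(torch.randn(prompt_length, embed_dim) * 0.5)

        # Freeze every pre-existing weight; only self.soft_prompt keeps requires_grad=True
        for module in (self.embedding, self.encoder, self.head):
            for param in module.parameters():
                param.requires_grad_(False)

    def forward(self, input_ids):
        batch_size = input_ids.size(0)
        embedded = self.embedding(input_ids)                                # (B, T, D)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch_size, -1, -1)   # (B, P, D)
        hidden = self.encoder(torch.cat([prompt, embedded], dim=1))         # (B, P+T, D)
        return self.head(hidden.mean(dim=1))                                # pooled readout

# Hypothetical frozen components standing in for a pre-trained model
vocab_size, embed_dim, num_classes = 1000, 64, 2
embedding = nn.Embedding(vocab_size, embed_dim)
encoder_layer = nn.TransformerEncoderLayer(embed_dim, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
head = nn.Linear(embed_dim, num_classes)

model = SoftPromptModel(embedding, encoder, head)
trainable = [p for p in model.parameters() if p.requires_grad]   # just the soft prompt
optimizer = torch.optim.Adam(trainable, lr=0.3)                  # 0.3 echoes the paper's large learning rate

input_ids = torch.randint(0, vocab_size, (8, 16))    # a fake batch of token ids
labels = torch.randint(0, num_classes, (8,))
loss = nn.functional.cross_entropy(model(input_ids), labels)
loss.backward()                                      # gradients flow only into the soft prompt
optimizer.step()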

Our codebase, implemented in the new JAX-based T5X framework, makes it easy for anyone to replicate this procedure, and provides practical hyperparameter settings, including a large learning rate (0.3), which we found was important for achieving good results.

Since soft prompts have a small parameter footprint (we train prompts with as few as 512 parameters), one can easily pass the model a different prompt along with each input example. This enables mixed-task inference batches, which can streamline serving by sharing one core model across many tasks.

Left: With model tuning, incoming data are routed to task-specific models. Right: With prompt tuning, examples and prompts from different tasks can flow through a single frozen model in large batches, better utilizing serving resources.

Improvement with Scale
When evaluated on SuperGLUE and using a frozen T5 model, prompt tuning significantly outperforms prompt design using either GPT-3 or T5. Furthermore, as model size increases, prompt tuning catches up to the performance level of model tuning. Intuitively, the larger the pre-trained model, the less of a “push” it needs to perform a specific task, and the more capable it is of being adapted in a parameter-efficient way.

As scale increases, prompt tuning matches model tuning, despite tuning 25,000 times fewer parameters.

The effectiveness of prompt tuning at large model scales is especially important, since serving separate copies of a large model can incur significant computational overhead. In our paper, we demonstrate that larger models can be conditioned successfully even with soft prompts as short as 5 tokens. For T5 XXL, this means tuning just 20 thousand parameters (5 tokens times the model’s 4,096-dimensional embedding) to guide the behavior of an 11 billion parameter model.

Resilience to Domain Shift
Another advantage of prompt tuning is its resilience to domain shift. Since model tuning touches every weight in the network, it has the capacity to easily overfit on the provided fine-tuning data and may not generalize well to variations in the task at inference time. By comparison, our learned soft prompts have a small number of parameters, so the solutions they represent may be more generalizable.

To test generalizability, we train prompt tuning and model tuning solutions on one task, and evaluate zero-shot on a closely related task. For example, when we train on the Quora Question Pairs task (i.e., detecting if two questions are duplicates) and evaluate on MRPC (i.e., detecting if two sentences from news articles are paraphrases), prompt tuning achieves +3.2 points higher accuracy than model tuning.

Train    Eval    Tuning    Accuracy      F1
QQP      MRPC    Model     73.1 ±0.9     81.2 ±2.1
QQP      MRPC    Prompt    76.3 ±0.1     84.3 ±0.3
MRPC     QQP     Model     74.9 ±1.3     70.9 ±1.2
MRPC     QQP     Prompt    75.4 ±0.8     69.7 ±0.3

On zero-shot domain transfer between two paraphrase detection tasks, prompt tuning matches or outperforms model tuning, depending on the direction of transfer.

Looking Forward
Prompt-based learning is an exciting new area that is quickly evolving. While several similar methods have been proposed, such as Prefix Tuning, WARP, and P-Tuning, we discuss their pros and cons and demonstrate that prompt tuning is the simplest and the most parameter-efficient method.

In addition to the Prompt Tuning codebase, we’ve also released our LM-adapted T5 checkpoints, which we found to be better-suited for prompt tuning compared to the original T5. This codebase was used for the prompt tuning experiments in FLAN, and the checkpoints were used as a starting point for training the BigScience T0 model. We hope that the research community continues to leverage and extend prompt tuning in future research.

Acknowledgements
This project was a collaboration between Brian Lester, Rami Al-Rfou and Noah Constant. We are grateful to the following people for feedback, discussion and assistance: Waleed Ammar, Lucas Dixon, Slav Petrov, Colin Raffel, Adam Roberts, Sebastian Ruder, Noam Shazeer, Tu Vu and Linting Xue.

Read More