British Newscaster speaking style now available in Amazon Polly

Amazon Polly turns text into lifelike speech, allowing you to create applications that talk and build entirely new categories of speech-enabled products. We’re thrilled to announce the launch of a brand-new British Newscaster speaking style voice: Amy. The speaking style mimics a formal and authoritative British newsreader. This Newscaster voice is the result of our latest achievements in Neural Text-to-Speech (NTTS) technology, making it possible to release new voices with only a few hours of recordings.

Amy’s British English Newscaster voice offers an alternative to the existing Newscaster speaking styles in US English (Matthew and Joanna, launched in July 2019) and US Spanish (Lupe, launched in April 2020). The style is suitable for a multitude of sectors, such as publishing and media. The high quality of the voice and its broadcaster-like style contribute to a more pleasant listening experience to relay news content.

Don’t just take our word for it! Our customer SpeechKit is a text-to-audio service that utilizes Amazon Polly as a core component of their toolkit. Here’s what their co-founder and COO, James MacLeod, has to say about this exciting new style: “News publishers use SpeechKit to publish their articles and newsletters in audio. The Amy Newscaster style is another great improvement from the Polly team; the pitch and clarity of intonation of this style fit well with this type of short-to-mid form news publishing. It provides listeners with a direct and informative style they’re used to hearing from human-read audio articles. As these voices advance, and new listening habits develop, publishers continue to observe improvements in audio engagement. News publishers can now start using the Amy Newscaster style through SpeechKit to make their articles available in audio, at scale, and track audio engagement.”

You can listen to the following samples to hear how this brand-new British Newscaster speaking style sounds:

Amy: 

The following samples are the other Newscaster speaking styles in US English and US Spanish: 

Matthew:

Joanna:

Lupe: 

You can use Amy’s British Newscaster speaking style via the Amazon Polly console, the AWS Command Line Interface (AWS CLI), or the AWS SDKs. The feature is available in all AWS Regions supporting NTTS. For more information, see What Is Amazon Polly? For the full list of available voices, see Voices in Amazon Polly. Or log in to the Amazon Polly console to try it out for yourself! Additionally, Amy Newscaster and other selected Polly voices are now available to Alexa skill developers.
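If you prefer the SDK route, the following is a minimal sketch using the AWS SDK for Python (boto3): the Newscaster style is requested through the SSML news domain together with the neural engine. The sample text and output file name are just illustrative.

```python
import boto3

polly = boto3.client("polly")

# Wrap the text in the SSML news domain to request the Newscaster style.
ssml = (
    '<speak><amazon:domain name="news">'
    "The British Newscaster speaking style is now available in Amazon Polly."
    "</amazon:domain></speak>"
)

response = polly.synthesize_speech(
    Engine="neural",      # Newscaster styles require the neural engine
    VoiceId="Amy",        # British English Newscaster voice
    TextType="ssml",
    Text=ssml,
    OutputFormat="mp3",
)

# Write the returned audio stream to a local file (placeholder name).
with open("amy_newscaster.mp3", "wb") as f:
    f.write(response["AudioStream"].read())
```

The same request works for the other Newscaster voices by swapping the VoiceId, provided the voice supports the news domain.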

 


About the Author

Goeric Huybrechts is a Software Development Engineer in the Amazon Text-to-Speech Research team. At work, he is passionate about everything that touches AI. Outside of work, he loves sports, football in particular, and loves to travel.

Read More

Facebook announces winners of the People’s Expectations and Experiences with Digital Privacy research awards

Facebook is committed to honoring people’s privacy in our products, policies, and services. We serve a diverse global community of users, and we need to understand people’s unique privacy needs around the world. This is why we launched a request for research proposals in August to help broaden and deepen our collective knowledge of global privacy expectations and experiences. After a thorough review process, we’ve selected the following as recipients of these awards.
“I am thrilled by the quality of the proposed work we received in our latest funding opportunity about users’ privacy expectations and experiences,” says Head of Facebook Privacy Research Liz Keneski. “I look forward to seeing the results of these carefully selected funded studies. I expect these projects to have an impact on inclusive privacy at Facebook and in the tech industry at large, including for vulnerable populations in the diverse communities that we serve — and on the ways we measure and understand people’s privacy expectations reliably.”

We sought applications from across the social sciences and technical disciplines, with particular interest in (1) improving understanding of users’ privacy attitudes, concerns, preferences, needs, behaviors, and outcomes, and (2) novel interventions for digital transparency and control that are meaningful for diverse populations, contexts, and data types.

“We received an impressive 147 proposals from 34 countries and 120 universities,” says Sharon Ayalde, Facebook Research Program Manager. “This was our first year offering funding opportunities in the privacy research space. We look forward to continuing these efforts into 2021 and to strengthening our collaborations with key experts globally.”

Thank you to everyone who took the time to submit a proposal, and congratulations to the winners. Recipients of our two earlier privacy research award opportunities were announced in May.

Research award recipients

Deploying visual interventions for users of varying digital literacy levels
Xinru Page, Brian Smith, Mainack Mondal, Norman Makoto Su (Brigham Young University)

Empowering under-resourced parents in teens’ social media privacy education
Natalie Bazarova, Dan Cosley, Ellen Wenting Zou, Leslie Park, Martha Brandt (Cornell University)

Helping people manage privacy settings using social influences
Jason Hong, Laura Dabbish (Carnegie Mellon University)

Measuring privacy across cultural, political, and expressive contexts
Dmitry Epstein, Aysenur Dal, Elizabeth Stoycheff, Erik Nisbet, Kelly Quinn, Olga Kamenchuk, Thorsten Faas (The Hebrew University of Jerusalem)

Privacy personas in the MENA region: A large-scale analysis of 21 countries
Bernard J. Jansen, Joni Salminen, Justin D. Martin, Kareem Darwish, Lene Nielsen, Soon-gyo Jung, Yin (David) Yang (Hamad Bin Khalifa University)

Finalists

Bridging the urban-rural gap in digital privacy autonomy among older adults
Kaileigh Byrne, Bart Knijnenburg (Clemson University)

Digital afterlife and the displaced: Challenges and practices among Rohingyas
Faheem Hussain, Md. Arifur Rahman (Arizona State University)

Digital privacy attitudes and misperceptions among key stakeholders
Joseph Turow, Yphtach Lelkes (University of Pennsylvania)

Digital privacy in a connected world — awareness and expectations of children
Serdar Abaci (University of Edinburgh)

Digital privacy rights and government data collection during COVID-19
Ariadne Vromen, Francesco Bailo, Kimberlee Weatherall (Australian National University)

Informed consent: Conditions of effective consent to privacy messages
Glenna Read, Jonathan Peters (University of Georgia)

Managing your personal “data bank”
Aija E. Leiponen, Joy Z. Wu (Cornell University)

Privacy design for vulnerable and digitally low-literate populations
Maryam Mustafa, Mobin Javed (Lahore University of Management Sciences)

Understanding folk theories of privacy
Jeff Hancock, Xun “Sunny” Liu (Stanford University)

For more information about topics of interest, eligibility, and requirements, visit the award page. To receive updates on new research award opportunities, subscribe to our email newsletter.

The post Facebook announces winners of the People’s Expectations and Experiences with Digital Privacy research awards appeared first on Facebook Research.

Read More

Hyundai Motor Group to Integrate Software-Defined AI Infotainment Powered by NVIDIA DRIVE Across Entire Fleet

From its entry-level vehicles to premium ones, Hyundai Motor Group will deliver the latest in AI-powered convenience and safety to every new customer.

The leading global auto group, which produces more than 7 million vehicles a year, announced today that every Hyundai, Kia and Genesis model will include infotainment systems powered by NVIDIA DRIVE, with production starting in 2022. By making high-performance, energy-efficient compute a standard feature, the group ensures that every vehicle offers a rich, software-defined AI user experience that’s always at the cutting edge.

Hyundai Motor Group has been working with NVIDIA since 2015, developing a state-of-the-art in-vehicle infotainment system on NVIDIA DRIVE that shipped in the Genesis GV80 and G80 models last year. The companies have also been collaborating on an advanced digital cockpit for release in late 2021.

The Genesis G80

Now, the automaker is standardizing AI for all its vehicles by extending NVIDIA DRIVE throughout its entire fleet — marking its commitment to developing software-defined and constantly updateable vehicles for more intelligent transportation.

A Smarter Co-Pilot

AI and accelerated computing have opened the door for a vast array of new functionalities in next-generation vehicles.

Specifically, these software-defined AI cockpit features can be realized with a centralized, high-performance computing architecture. Traditionally, vehicle infotainment requires a collection of electronic control units and switches to perform basic functions, such as changing the radio station or adjusting temperature.

Consolidating these components with the NVIDIA DRIVE software-defined AI platform simplifies the architecture while creating more compute headroom to add new features. With NVIDIA DRIVE at the core, automakers such as Hyundai can orchestrate crucial safety and convenience features, building vehicles that become smarter over time.

The NVIDIA DRIVE platform

These capabilities include driver and occupant monitoring to ensure that eyes stay on the road and that exiting passengers avoid oncoming traffic. They can also elevate in-car convenience by clearly presenting information about the vehicle’s surroundings or recommending faster routes and nearby restaurants.

Delivering the Future to Every Fleet

Hyundai is making this new area of in-vehicle AI a reality for all of its customers.

The automaker will leverage the high-performance compute of NVIDIA DRIVE to roll out its new connected car operating system to every new Hyundai, Kia and Genesis vehicle. The software platform consolidates the massive amounts of data generated by the car to deliver personalized convenience and safety features for the vehicle’s occupants.

By running on NVIDIA DRIVE, the in-vehicle infotainment system can process the vehicle’s myriad data streams in parallel to deliver features instantaneously. It can provide these services regardless of whether the vehicle is connected to the internet, personalizing the experience for each user safely and securely for the ultimate level of convenience.

With this new centralized cockpit architecture, Hyundai Motor Group will bring AI to every new customer, offering software upgradeable applications for the entire life of its upcoming fleet.

The post Hyundai Motor Group to Integrate Software-Defined AI Infotainment Powered by NVIDIA DRIVE Across Entire Fleet appeared first on The Official NVIDIA Blog.

Read More

Learn from the winner of the AWS DeepComposer Chartbusters challenge The Sounds of Science

AWS is excited to announce the winner of the AWS DeepComposer Chartbusters The Sounds of Science Challenge, Sungin Lee. AWS DeepComposer gives developers a creative way to get started with machine learning (ML). In June, we launched Chartbusters, a monthly global competition during which developers use AWS DeepComposer to create original compositions and compete to showcase their ML skills. The third challenge, The Sounds of Science, asked developers to create background music for a video clip.

Sungin is a Junior Solutions Architect for MegazoneCloud, one of the largest AWS partners in South Korea. Sungin studied linguistics and anthropology in university, but made a career change to cloud engineering. When Sungin first started learning about ML, he never knew he would create the winning composition for the Chartbusters challenge.

We interviewed Sungin to learn about his experience competing in the third Chartbusters challenge, which ran from September 2–23, 2020, and asked him to tell us more about how he created his winning composition.


Sungin Lee at his work station.

Getting started with machine learning

Sungin began his interest in ML and Generative Adversarial Networks (GANs) through the vocational education he received as he transitioned to cloud engineering.

“As part of the curriculum, there was a team project in which my team tried to make a model that generates an image according to the given sentence through GANs. Unfortunately, we failed at training the model due to the complexity of it but [the experience] deepened my interest in GANs.”

After receiving his vocational education, Sungin chose to pursue a career in cloud engineering and joined MegazoneCloud. Six months into his career, Sungin’s team leader at work encouraged him to try AWS DeepComposer.

“When the challenge first launched, my team leader told me about the challenge and encouraged me to participate in it. I was already interested in GANs and music, and as a new hire, I wanted to show my machine learning skills.” 

Building in AWS DeepComposer

In The Sounds of Science, developers composed background music for a video clip using the Autoregressive Convolutional Neural Network (AR-CNN) algorithm and edited notes with the newly launched Edit melody feature to better match the music with the provided video.

“I began by selecting the initial melody. When I first saw the video, I thought that one of the sample melodies, ‘Ode to Joy,’ went quite well with the atmosphere of the video and decided to use it. But I wanted the melody to sound more soothing than the original so I slightly lowered the pitch. Then I started enhancing the melody with AR-CNN.”


Sungin composing his melody.

Sungin worked on his competition for a day before generating his winning melody.

“I generated multiple compositions with AR-CNN until I liked the melody. Then I started adding more instruments. I experimented with all the sample models from MuseGAN and decided that rock suited the melody best. I found the ‘edit melody’ feature very helpful. In the process of enhancing the melody with AR-CNN, some off-key notes would appear and disrupt the harmony. But with the ‘edit melody’ feature, I could just remove or modify the wrong note and put the music back in key!”

The Edit melody feature on the AWS DeepComposer console.

“The biggest obstacle was my own doubt. I had a hard time being satisfied with the output, and even thought of giving up on the competition and never submitting any compositions. But then I thought, why give up? So I submitted my best composition by far and won the challenge.”

You can listen to Sungin’s winning composition, “The Joy,” on the AWS DeepComposer SoundCloud page.

Conclusion

Sungin believes that the AWS DeepComposer Chartbusters challenge gave him the confidence in his career transition to continue pursuing ML.

“It has been only a year since I started studying machine learning properly. As a non-Computer Science major without any basic computer knowledge, it was hard to successfully achieve my goals with machine learning. For example, my team project during the vocational education ended up unsuccessful, and the AWS DeepRacer model that I made could not finish the track. Then, when I was losing confidence in myself, I won first place in the AWS DeepComposer Chartbusters challenge! This victory reminded me that I could actually win something with machine learning and motivated me to keep studying.”

Overall, Sungin completed the challenge with a feeling of accomplishment and a desire to learn more.

“This challenge gave me self-confidence. I will keep moving forward on my machine learning path and keep track of new GAN techniques.”

Congratulations to Sungin for his well-deserved win!

We hope Sungin’s story has inspired you to learn more about ML and get started with AWS DeepComposer. Check out the next AWS DeepComposer Chartbusters challenge, and start composing today.

 


About the Author

Paloma Pineda is a Product Marketing Manager for AWS Artificial Intelligence Devices. She is passionate about the intersection of technology, art, and human centered design. Out of the office, Paloma enjoys photography, watching foreign films, and cooking French cuisine.

Read More

Accelerating Research: Texas A&M Launching Grace Supercomputer for up to 20x Boost

Texas A&M University is turbocharging the research of its scientists and engineers with a new supercomputer powered by NVIDIA A100 Tensor Core GPUs.

The Grace supercomputer — named to honor programming pioneer Grace Hopper — handles almost 20 times the processing of its predecessor, Ada.

Texas A&M’s Grace supercomputing cluster comes as user demand at its High Performance Research Computing unit has doubled since 2016. It now has more than 2,600 researchers seeking to run workloads.

The Grace system promises to enhance A&M’s research capabilities and competitiveness. It will allow A&M researchers to keep pace with current trends across multiple fields enabled by advances in high performance computing.

Researchers at Texas A&M University will have access to the new system in December. Dell Technologies is the primary vendor for the Grace system.

Boosting Research

The new Grace architecture will enable researchers to make leaps with HPC in AI and data science. It also provides a foundation for building a workforce skilled in exascale computing, which processes a billion billion calculations per second.

The Grace system is set to support the university’s researchers in drug design, materials science, geosciences, fluid dynamics, biomedical applications, biophysics, genetics, quantum computing, population informatics and autonomous vehicles.

“The High Performance Research Computing lab has a mission to infuse computational and data analysis technologies into the research and creative activities of every academic discipline at Texas A&M,” said Honggao Liu, executive director of the facility.

In 2019, research at Texas A&M University generated $952 million in revenue for the university, which is known for its support of scholarship and scientific discovery.

Petaflops Performance

Like its namesake Grace Hopper — whose work in the 1950s led to the COBOL programming language — the new Grace supercomputing cluster will be focused on fueling innovation and making groundbreaking discoveries.

The system delivers up to 6.2 petaflops of processing power; one petaflops is one quadrillion floating-point operations per second (flops).

In addition to the A100 GPUs, the Grace cluster is powered by single-precision NVIDIA T4 Tensor Core GPUs and NVIDIA RTX 6000 GPUs in combination with more than 900 Dell EMC PowerEdge servers.

The system is interconnected with NVIDIA Mellanox high-speed, low-latency HDR InfiniBand fabric, enabling smart in-network computing engines for accelerated computing. It also includes 5.12PB of usable high-performance DDN storage running the Lustre parallel file system.

The post Accelerating Research: Texas A&M Launching Grace Supercomputer for up to 20x Boost appeared first on The Official NVIDIA Blog.

Read More

TensorFlow Community Spotlight program update

Posted by Marcus Chang, TensorFlow Program Manager

In June we started the TensorFlow Community Spotlight Program to offer the developer community an opportunity to showcase their hard work and passion for ML and AI by submitting their TensorFlow projects for the chance to be featured and recognized on Twitter with the hashtag #TFCommunitySpotlight.

GIF of posture tracking tool in use
Olesya Chernyavskaya, a Community Spotlight winner, created a tool in TensorFlow to track posture and blur the screen if a person is sitting poorly.

Now, a little over four months in, we’ve received many great submissions, and it’s been amazing to see all of the creative uses of TensorFlow across Python, JavaScript, Android, iOS, and many other areas.

We’d like to learn about your projects, too. You can share them with us using this form. Here are our previous Spotlight winners:

Pranav Natekar

Pranav used TensorFlow to create a tool that identifies patterns in Indian Classical music to help students learn the Tabla. Pranav’s GitHub → http://goo.gle/2Z5f7Op

Chart of Indian Classical music patterns


Olesya Chernyavskaya

Working from home and trying to improve your posture? Olesya created a tool in TensorFlow to track posture and blur the screen if a person is sitting poorly. Olesya’s GitHub → https://goo.gle/2CAHvz9

GIF of posture tracking tool in use

Javier Gamazo Tejero

Javier used TensorFlow to capture movement with a webcam and transfer it to Google Street View to give a virtual experience of walking through different cities. Javier’s GitHub → https://goo.gle/3hgkmBc

GIF of virtual walking through cities

Hugo Zanini

Hugo used TensorFlow.js to create real-time semantic segmentation in a browser. Hugo’s GitHub → https://goo.gle/310RDKc

GIF of real-time semantic segmentation

Yana Vasileva

Ambianic.ai created a fall-detection surveillance system in which user data is never sent to any third-party cloud servers. Yana’s GitHub → goo.gle/2XvYY3q

GIF of fall detection surveillance system

Samarth Gulati and Praveen Sinha

These developers had artists upload an image as a texture for the TensorFlow facemesh 3D model and used CSS blend modes to give the illusion of face paint on the user’s face. Samarth’s GitHub → https://goo.gle/2Qe3Gyx

GIF of facemesh 3D model

Laetitia Hebert

Laetitia created a model system for understanding the genes, neurons, and behavior of the roundworm as it naturally moves through a variety of complex postures. Laetitia’s GitHub → https://goo.gle/2ZgLZ6L

GIF of Worm Pose

Henry Ruiz

Rigging.js is a React.js application that utilizes the facemesh TensorFlow.js model. Using a camera, it maps the movements of a person onto a 3D model. Henry’s GitHub → https://goo.gle/3iCXBZj

GIF of Rigging JS

DeepPavlov.ai

The DeepPavlov AI library solves numerous NLP and NLU problems in a short amount of time, using pre-trained models or by letting you train your own variations of them. DeepPavlov’s GitHub → https://goo.gle/3jl967S

DeepPavlov AI library

Mayank Thakur

Mayank created a special hand-gesture feature that complements the traditional face-recognition lock systems on mobile phones and helps increase security. Mayank’s GitHub → https://goo.gle/3j7evyN

GIF of hand gesture software

Firiuza Shigapova

Using TensorFlow 2.x, Firiuza built a library for Graph Neural Networks containing GraphSage and GAT models for node and graph classification problems. Firiuza’s GitHub → https://goo.gle/3kFcmvz

GIF of Graph Neural Networks

Thank you for all the submissions thus far. Congrats to the winners, and we look forward to growing the group of Community Spotlight recipients, so be sure to submit your projects here.

More information

Read More

Announcing the Objectron Dataset

Posted by Adel Ahmadyan and Liangkai Zhang, Software Engineers, Google Research

The state of the art in machine learning (ML) has achieved exceptional accuracy on many computer vision tasks solely by training models on photos. Building upon these successes and advancing 3D object understanding has great potential to power a wider range of applications, such as augmented reality, robotics, autonomy, and image retrieval. For example, earlier this year we released MediaPipe Objectron, a set of real-time 3D object detection models designed for mobile devices, which were trained on a fully annotated, real-world 3D dataset and can predict objects’ 3D bounding boxes.

Yet, understanding objects in 3D remains a challenging task due to the lack of large real-world datasets compared to 2D tasks (e.g., ImageNet, COCO, and Open Images). To empower the research community for continued advancement in 3D object understanding, there is a strong need for the release of object-centric video datasets, which capture more of the 3D structure of an object, while matching the data format used for many vision tasks (i.e., video or camera streams), to aid in the training and benchmarking of machine learning models.

Today, we are excited to release the Objectron dataset, a collection of short, object-centric video clips capturing a larger set of common objects from different angles. Each video clip is accompanied by AR session metadata that includes camera poses and sparse point-clouds. The data also contain manually annotated 3D bounding boxes for each object, which describe the object’s position, orientation, and dimensions. The dataset consists of 15K annotated video clips supplemented with over 4M annotated images collected from a geo-diverse sample (covering 10 countries across five continents).

Example videos in the Objectron dataset.

A 3D Object Detection Solution
Along with the dataset, we are also sharing a 3D object detection solution for four categories of objects — shoes, chairs, mugs, and cameras. These models are released in MediaPipe, Google’s open source framework for cross-platform customizable ML solutions for live and streaming media, which also powers ML solutions like on-device real-time hand, iris and body pose tracking.

Sample results of 3D object detection solution running on mobile.
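For readers who want to try the released models outside of a mobile app, MediaPipe also exposes Objectron through its Python solutions API. The snippet below is a minimal sketch, assuming a MediaPipe release that includes the Objectron Python solution and an OpenCV installation; the input and output image paths are placeholders.

```python
import cv2
import mediapipe as mp

mp_objectron = mp.solutions.objectron
mp_drawing = mp.solutions.drawing_utils

# Run the chair model on a single image (static_image_mode=True).
with mp_objectron.Objectron(static_image_mode=True,
                            max_num_objects=5,
                            min_detection_confidence=0.5,
                            model_name='Chair') as objectron:
    image = cv2.imread('chair.jpg')  # placeholder input path
    results = objectron.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    if results.detected_objects:
        for obj in results.detected_objects:
            # Draw the projected 3D bounding box and its coordinate axes.
            mp_drawing.draw_landmarks(
                image, obj.landmarks_2d, mp_objectron.BOX_CONNECTIONS)
            mp_drawing.draw_axis(image, obj.rotation, obj.translation)

    cv2.imwrite('chair_annotated.jpg', image)  # placeholder output path
```

The model_name argument selects one of the four released categories; setting static_image_mode to False switches to the tracking configuration intended for video streams.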

In contrast to the previously released single-stage Objectron model, these newest versions utilize a two-stage architecture. The first stage employs the TensorFlow Object Detection model to find the 2D crop of the object. The second stage then uses the image crop to estimate the 3D bounding box while simultaneously computing the 2D crop of the object for the next frame, so that the object detector does not need to run on every frame. The second-stage 3D bounding box predictor runs at 83 FPS on an Adreno 650 mobile GPU.

Diagram of a reference 3D object detection solution.

Evaluation Metric for 3D Object Detection
With ground truth annotations, we evaluate the performance of 3D object detection models using 3D intersection over union (IoU) similarity statistics, a commonly used metric for computer vision tasks, which measures how close the bounding boxes are to the ground truth.

We propose an algorithm for computing accurate 3D IoU values for general 3D-oriented boxes. First, we compute the intersection points between the faces of the two boxes using the Sutherland-Hodgman polygon clipping algorithm. This is similar to frustum culling, a technique used in computer graphics. The volume of the intersection is then computed from the convex hull of all the clipped polygons. Finally, the IoU is computed from the volume of the intersection and the volume of the union of the two boxes. We are releasing the evaluation metrics source code along with the dataset.

Computing the 3D intersection over union using the polygon clipping algorithm. Left: Compute the intersection points of each face by clipping the polygon against the box. Right: Compute the volume of the intersection by computing the convex hull of all intersection points (green).
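As a rough companion to the released evaluation code, here is a minimal sketch of the last two steps of the algorithm described above, assuming the Sutherland-Hodgman clipping has already produced the intersection points (that step is omitted); it uses only scipy.spatial.ConvexHull and is not the official implementation.

```python
import numpy as np
from scipy.spatial import ConvexHull
from scipy.spatial.qhull import QhullError

def box_volume(corners):
    """Volume of an oriented 3D box given its (8, 3) corner array."""
    return ConvexHull(np.asarray(corners)).volume

def iou_3d(corners_a, corners_b, intersection_points):
    """3D IoU from precomputed intersection points.

    intersection_points is assumed to contain the points produced by clipping
    the faces of one box against the other, plus any corners of either box
    that lie inside the other. Empty or degenerate (coplanar) intersections
    are treated as zero overlap.
    """
    points = np.asarray(intersection_points)
    if len(points) < 4:
        return 0.0
    try:
        inter = ConvexHull(points).volume
    except QhullError:  # coplanar points make the hull degenerate
        return 0.0
    union = box_volume(corners_a) + box_volume(corners_b) - inter
    return inter / union
```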

Dataset Format
The technical details of the Objectron dataset, including usage and tutorials, are available on the dataset website. The dataset includes bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes, and is stored in the objectron bucket on Google Cloud storage with the following assets:

  • The video sequences
  • The annotation labels (3D bounding boxes for objects)
  • AR metadata (such as camera poses, point clouds, and planar surfaces)
  • Processed dataset: a shuffled version of the annotated frames, in tf.Example format for images and SequenceExample format for videos
  • Supporting scripts to run evaluation based on the metric described above
  • Supporting scripts to load the data into TensorFlow, PyTorch, and Jax and to visualize the dataset, including “Hello World” examples

With the dataset, we are also open-sourcing a data pipeline to parse the dataset in the popular TensorFlow, PyTorch, and Jax frameworks. Example colab notebooks are also provided.
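As a very rough illustration of what consuming the processed tf.Example split can look like with tf.data, here is a minimal sketch; the bucket path and feature keys below are placeholders, and the released data pipeline and colab notebooks remain the authoritative reference for the actual schema.

```python
import tensorflow as tf

# Placeholder glob; the real paths live in the objectron bucket on Google Cloud Storage.
files = tf.io.gfile.glob('gs://objectron/processed/chair/*')

# Placeholder feature keys; consult the released data pipeline for the actual schema.
feature_spec = {
    'image/encoded': tf.io.FixedLenFeature([], tf.string),
    'object/3d_bounding_box': tf.io.VarLenFeature(tf.float32),
}

def parse(serialized):
    example = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(example['image/encoded'])
    box = tf.sparse.to_dense(example['object/3d_bounding_box'])
    return image, box

dataset = tf.data.TFRecordDataset(files).map(parse)
for image, box in dataset.take(1):
    print(image.shape, box.shape)
```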

By releasing this Objectron dataset, we hope to enable the research community to push the limits of 3D object geometry understanding. We also hope to foster new research and applications, such as view synthesis, improved 3D representation, and unsupervised learning. Stay tuned for future activities and developments by joining our mailing list and visiting our github page.

Acknowledgements
The research described in this post was done by Adel Ahmadyan, Liangkai Zhang, Jianing Wei, Artsiom Ablavatski, Mogan Shieh, Ryan Hickman, Buck Bourdon, Alexander Kanaukou, Chuo-Ling Chang, Matthias Grundmann, ‎and Tom Funkhouser. We thank Aliaksandr Shyrokau, Sviatlana Mialik, Anna Eliseeva, and the annotation team for their high quality annotations. We also would like to thank Jonathan Huang and Vivek Rathod for their guidance on TensorFlow Object Detection API.

Read More