Federated learning is an increasingly popular paradigm that enables a large number of entities to collaboratively learn better models. In this work, we study minimax group fairness in federated learning scenarios where different participating entities may only have access to a subset of the population groups during the training phase. We formally analyze how our proposed group fairness objective differs from existing federated learning fairness criteria that impose similar performance across participants instead of demographic groups. We provide an optimization algorithm — FedMinMax — for…
Use computer vision to measure agriculture yield with Amazon Rekognition Custom Labels
In the agriculture sector, identifying and counting the amount of fruit on trees plays an important role in crop estimation. The concept of renting and leasing a tree is becoming popular, where a tree owner leases the tree every year before the harvest based on the estimated fruit yield. The common practice of manually counting fruit is a time-consuming and labor-intensive process, yet it's one of the most important tasks for getting better results from your crop management system. Estimating the amount of fruit and flowers helps farmers make better decisions, not only on leasing prices, but also on cultivation practices and plant disease prevention.
This is where an automated machine learning (ML) solution for computer vision (CV) can help farmers. Amazon Rekognition Custom Labels is a fully managed computer vision service that allows developers to build custom models to classify and identify objects in images that are specific and unique to your business.
Rekognition Custom Labels doesn’t require you to have any prior computer vision expertise. You can get started by simply uploading tens of images instead of thousands. If the images are already labeled, you can begin training a model in just a few clicks. If not, you can label them directly within the Rekognition Custom Labels console, or use Amazon SageMaker Ground Truth to label them. Rekognition Custom Labels uses transfer learning to automatically inspect the training data, select the right model framework and algorithm, optimize the hyperparameters, and train the model. When you’re satisfied with the model accuracy, you can start hosting the trained model with just one click.
In this post, we showcase how you can build an end-to-end solution using Rekognition Custom Labels to detect and count fruit to measure agriculture yield.
Solution overview
We create a custom model to detect fruit using the following steps:
- Label a dataset with images containing fruit using Amazon SageMaker Ground Truth.
- Create a project in Rekognition Custom Labels.
- Import your labeled dataset.
- Train the model.
- Test the new custom model using the automatically generated API endpoint.
Rekognition Custom Labels lets you manage the ML model training process on the Amazon Rekognition console, which simplifies the end-to-end model development and inference process.
Prerequisites
To create an agriculture yield measuring model, you first need to prepare a dataset to train the model with. For this post, our dataset is composed of images of fruit. The following images show some examples.
We sourced our images from our own garden. You can download the image files from the GitHub repo.
For this post, we only use a handful of images to showcase the fruit yield use case. You can experiment further with more images.
To prepare your dataset, complete the following steps:
- Create an Amazon Simple Storage Service (Amazon S3) bucket.
- Create two folders inside this bucket, called raw_data and test_data, to store images for labeling and model testing.
- Choose Upload to upload the images to their respective folders from the GitHub repo.
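If you prefer to script these steps instead of using the console, the following minimal sketch shows one way to create the bucket and upload the images with boto3. The bucket name and local folder paths are placeholders.

```python
import os
import boto3

s3 = boto3.client("s3")
bucket = "your-bucket-name"  # placeholder; bucket names must be globally unique

# Create the bucket (outside us-east-1, also pass a CreateBucketConfiguration
# with your Region's LocationConstraint).
s3.create_bucket(Bucket=bucket)

# Upload the images downloaded from the GitHub repo into the two folders.
for folder in ("raw_data", "test_data"):
    for file_name in os.listdir(folder):
        s3.upload_file(
            os.path.join(folder, file_name),
            bucket,
            f"{folder}/{file_name}",
        )
```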
The uploaded images aren’t labeled. You label the images in the following step.
Label your dataset using Ground Truth
To train the ML model, you need labeled images. Ground Truth provides an easy process to label the images. The labeling task is performed by a human workforce; in this post, you create a private workforce. You can use Amazon Mechanical Turk for labeling at scale.
Create a labeling workforce
Let’s first create our labeling workforce. Complete the following steps:
- On the SageMaker console, under Ground Truth in the navigation pane, choose Labeling workforces.
- On the Private tab, choose Create private team.
- For Team name, enter a name for your workforce (for this post, labeling-team).
- Choose Create private team.
- Choose Invite new workers.
- In the Add workers by email address section, enter the email addresses of your workers. For this post, enter your own email address.
- Choose Invite new workers.
You have created a labeling workforce, which you use in the next step while creating a labeling job.
Create a Ground Truth labeling job
To create your labeling job, complete the following steps:
- On the SageMaker console, under Ground Truth, choose Labeling jobs.
- Choose Create labeling job.
- For Job name, enter fruits-detection.
- Select I want to specify a label attribute name different from the labeling job name.
- For Label attribute name, enter Labels.
- For Input data setup, select Automated data setup.
- For S3 location for input datasets, enter the S3 location of the images, using the bucket you created earlier (s3://{your-bucket-name}/raw-data/images/).
- For S3 location for output datasets, select Specify a new location and enter the output location for annotated data (s3://{your-bucket-name}/annotated-data/).
- For Data type, choose Image.
- Choose Complete data setup.
This creates the image manifest file and updates the S3 input location path. Wait for the message “Input data connection successful.”
- Expand Additional configuration.
- Confirm that Full dataset is selected.
This is used to specify whether you want to provide all the images to the labeling job or a subset of images based on filters or random sampling.
- For Task category, choose Image because this is a task for image annotation.
- Because this is an object detection use case, for Task selection, select Bounding box.
- Leave the other options as default and choose Next.
- Choose Next.
Now you specify your workers and configure the labeling tool.
- For Worker types, select Private.
For this post, you use an internal workforce to annotate the images. You also have the option to select a public contractual workforce (Amazon Mechanical Turk) or a vendor-managed partner workforce, depending on your use case.
- For Private teams, choose the team you created earlier.
- Leave the other options as default and scroll down to Bounding box labeling tool.
It's essential to provide clear instructions in the labeling tool for the private labeling team. These instructions act as a guide for annotators while labeling. Good instructions are concise, so we recommend limiting the verbal or textual instructions to two sentences and focusing on visual instructions. For image classification, we recommend providing one labeled image for each class as part of the instructions.
- Add two labels: fruit and no_fruit.
- Enter detailed instructions in the Description field to guide the workers. For example: "You need to label fruits in the provided image. Make sure to select the label 'fruit' and draw the box tightly around each fruit for better label quality. Also label areas that look similar to fruit but are not fruit with the label 'no_fruit'."
You can also optionally provide examples of well-labeled and poorly labeled images. Make sure that these images are publicly accessible.
- Choose Create to create the labeling job.
After the job is successfully created, the next step is to label the input images.
Start the labeling job
Once you have successfully created the job, its status is InProgress. This means that the job is created and the private workforce is notified via email regarding the task assigned to them. Because you have assigned the task to yourself, you should receive an email with instructions to log in to the Ground Truth labeling project.
- Open the email and choose the link provided.
- Enter the user name and password provided in the email.
You may have to change the temporary password provided in the email to a new password after you log in.
- After you log in, select your job and choose Start working.
You can use the provided tools to zoom in, zoom out, move, and draw bounding boxes in the images.
- Choose your label (fruit or no_fruit) and then draw a bounding box in the image to annotate it.
- When you're finished, choose Submit.
Now you have correctly labeled images that will be used by the ML model for training.
Create your Amazon Rekognition project
To create your agriculture yield measuring project, complete the following steps:
- On the Amazon Rekognition console, choose Custom Labels.
- Choose Get Started.
- For Project name, enter fruits_yield.
- Choose Create project.
You can also create a project on the Projects page. You can access the Projects page via the navigation pane. The next step is to provide images as input.
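If you prefer to work programmatically, a minimal sketch of creating the project with boto3 and the CreateProject API might look like the following; the project name matches the one used in this post.

```python
import boto3

rekognition = boto3.client("rekognition")

# Create the Rekognition Custom Labels project via the API instead of the console.
response = rekognition.create_project(ProjectName="fruits_yield")
print("Project ARN:", response["ProjectArn"])
```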
Import your dataset
To create your agriculture yield measuring model, you first need to import a dataset to train the model with. For this post, our dataset is already labeled using Ground Truth.
- For Import images, select Import images labeled by SageMaker Ground Truth.
- For Manifest file location, enter the S3 location of your manifest file (s3://{your-bucket-name}/fruits_image/annotated_data/fruits-labels/manifests/output/output.manifest).
- Choose Create Dataset.
You can see your labeled dataset.
Now you have your input dataset for the ML model to start training on them.
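Alternatively, you can attach the Ground Truth output to the project with the CreateDataset API. The following is a minimal sketch; the project ARN, bucket, and manifest key are placeholders that you replace with the values from your own labeling job.

```python
import boto3

rekognition = boto3.client("rekognition")

# Placeholders: use your own project ARN, bucket, and manifest key.
project_arn = "arn:aws:rekognition:us-east-1:111122223333:project/fruits_yield/1111111111111"

response = rekognition.create_dataset(
    ProjectArn=project_arn,
    DatasetType="TRAIN",
    DatasetSource={
        "GroundTruthManifest": {
            "S3Object": {
                "Bucket": "your-bucket-name",
                "Name": "path/to/output.manifest",  # manifest written by your labeling job
            }
        }
    },
)
print("Dataset ARN:", response["DatasetArn"])
```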
Train your model
After you label your images and import the dataset, you're ready to train your model. Start the training from your project page and wait for it to complete. Then you can start testing the performance of the trained model.
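Training can also be started through the CreateProjectVersion API. The following is a minimal sketch; the project ARN, version name, and output bucket are placeholders.

```python
import boto3

rekognition = boto3.client("rekognition")

# Placeholders: use your own project ARN and an output bucket/prefix for training results.
response = rekognition.create_project_version(
    ProjectArn="arn:aws:rekognition:us-east-1:111122223333:project/fruits_yield/1111111111111",
    VersionName="fruits_yield.v1",
    OutputConfig={"S3Bucket": "your-bucket-name", "S3KeyPrefix": "training-output/"},
)
print("Model (project version) ARN:", response["ProjectVersionArn"])

# Training can take a while; poll describe_project_versions until the status
# is TRAINING_COMPLETED before moving on to testing.
```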
Test your model
Your agriculture yield measuring model is now ready for use; it needs to be in the Running state before you can test it. To test the model, complete the following steps:
Step 1: Start the model
On your model details page, on the Use model tab, choose Start.
Rekognition Custom Labels also provides the API calls for starting, using, and stopping your model.
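For example, a minimal sketch of starting the model with the StartProjectVersion API might look like the following; the model ARN is a placeholder that you copy from the Use model tab of your project.

```python
import boto3

rekognition = boto3.client("rekognition")

# Placeholder ARN: copy the model ARN from the Use model tab of your project.
model_arn = "arn:aws:rekognition:us-east-1:111122223333:project/fruits_yield/version/fruits_yield.v1/1111111111111"

# Start the model with one inference unit; starting can take several minutes.
rekognition.start_project_version(ProjectVersionArn=model_arn, MinInferenceUnits=1)
```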
Step 2: Test the model
When the model is in the Running state, you can use the sample testing script analyzeImage.py to count the amount of fruit in an image.
- Download this script from the GitHub repo.
- Edit this file to replace the parameter bucket with your bucket name and model with your Amazon Rekognition model ARN.
We use the parameters photo and min_confidence as input for this Python script.
You can run this script locally using the AWS Command Line Interface (AWS CLI) or using AWS CloudShell. In our example, we ran the script via the CloudShell console. Note that CloudShell is free to use.
Make sure to install the required dependencies using the command pip3 install boto3 Pillow, if they aren't already installed.
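For reference, the following is a minimal sketch of what such a script can do with the DetectCustomLabels API; the actual analyzeImage.py in the repo may differ, and the bucket, model ARN, and image key below are placeholders.

```python
import boto3

# Placeholders: replace the bucket, model ARN, and image key with your own values.
bucket = "your-bucket-name"
model = "arn:aws:rekognition:us-east-1:111122223333:project/fruits_yield/version/fruits_yield.v1/1111111111111"
photo = "test_data/15.jpeg"
min_confidence = 85

rekognition = boto3.client("rekognition")
response = rekognition.detect_custom_labels(
    ProjectVersionArn=model,
    Image={"S3Object": {"Bucket": bucket, "Name": photo}},
    MinConfidence=min_confidence,
)

# Count only the detections labeled 'fruit'.
fruit_count = sum(1 for label in response["CustomLabels"] if label["Name"] == "fruit")
print(f"Detected {fruit_count} fruit in {photo}")
```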
The following screenshot shows the output, which detected two fruits in the input image. We supplied 15.jpeg as the photo argument and 85 as the min_confidence value.
The following example shows image 15.jpeg with two bounding boxes.
You can run the same script with other images and experiment by changing the confidence score further.
Step 3: Stop the model
When you're done, remember to stop the model to avoid incurring unnecessary charges. On your model details page, on the Use model tab, choose Stop.
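You can also stop the model programmatically with the StopProjectVersion API; in this minimal sketch the ARN is a placeholder for the model you started earlier.

```python
import boto3

rekognition = boto3.client("rekognition")

# Placeholder ARN: the same model ARN you started earlier.
model_arn = "arn:aws:rekognition:us-east-1:111122223333:project/fruits_yield/version/fruits_yield.v1/1111111111111"

rekognition.stop_project_version(ProjectVersionArn=model_arn)
```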
Clean up
To avoid incurring unnecessary charges, delete the resources used in this walkthrough when they're no longer needed: the Amazon Rekognition project and the S3 bucket.
Delete the Amazon Rekognition project
To delete the Amazon Rekognition project, complete the following steps:
- On the Amazon Rekognition console, choose Use Custom Labels.
- Choose Get started.
- In the navigation pane, choose Projects.
- On the Projects page, select the project that you want to delete.
- Choose Delete.
The Delete project dialog box appears.
- Choose Delete.
- If the project has no associated models:
- Enter delete to delete the project.
- Choose Delete to delete the project.
- If the project has associated models or datasets:
- Enter delete to confirm that you want to delete the model and datasets.
- Choose either Delete associated models, Delete associated datasets, or Delete associated datasets and models, depending on whether the project has datasets, models, or both.
Model deletion might take a while to complete. Note that the Amazon Rekognition console can’t delete models that are in training or running. Try again after stopping any running models that are listed, and wait until the models listed as training are complete. If you close the dialog box during model deletion, the models are still deleted. Later, you can delete the project by repeating this procedure.
- Enter delete to confirm that you want to delete the project.
- Choose Delete to delete the project.
Delete your S3 bucket
You first need to empty the bucket and then delete it.
- On the Amazon S3 console, choose Buckets.
- Select the bucket that you want to empty, then choose Empty.
- Confirm that you want to empty the bucket by entering the bucket name into the text field, then choose Empty.
- Choose Delete.
- Confirm that you want to delete the bucket by entering the bucket name into the text field, then choose Delete bucket.
Conclusion
In this post, we showed you how to create an object detection model with Rekognition Custom Labels. This feature makes it easy to train a custom model that can detect an object class without needing to specify other objects or losing accuracy in its results.
For more information about using custom labels, see What Is Amazon Rekognition Custom Labels?
About the authors
Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.
Sameer Goel is a Sr. Solutions Architect in the Netherlands, who drives customer success by building prototypes on cutting-edge initiatives. Prior to joining AWS, Sameer graduated with a master’s degree from Boston, with a concentration in data science. He enjoys building and experimenting with AI/ML projects on Raspberry Pi. You can find him on LinkedIn.
New programmable materials can sense their own movements
MIT researchers have developed a method for 3D printing materials with tunable mechanical properties, that sense how they are moving and interacting with the environment. The researchers create these sensing structures using just one material and a single run on a 3D printer.
To accomplish this, the researchers began with 3D-printed lattice materials and incorporated networks of air-filled channels into the structure during the printing process. By measuring how the pressure changes within these channels when the structure is squeezed, bent, or stretched, engineers can receive feedback on how the material is moving.
The method opens opportunities for embedding sensors within architected materials, a class of materials whose mechanical properties are programmed through form and composition. Controlling the geometry of features in architected materials alters their mechanical properties, such as stiffness or toughness. For instance, in cellular structures like the lattices the researchers print, a denser network of cells makes a stiffer structure.
This technique could someday be used to create flexible soft robots with embedded sensors that enable the robots to understand their posture and movements. It might also be used to produce wearable smart devices that provide feedback on how a person is moving or interacting with their environment.
“The idea with this work is that we can take any material that can be 3D-printed and have a simple way to route channels throughout it so we can get sensorization with structure. And if you use really complex materials, then you can have motion, perception, and structure all in one,” says co-lead author Lillian Chin, a graduate student in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).
Joining Chin on the paper are co-lead author Ryan Truby, a former CSAIL postdoc who is now an assistant professor at Northwestern University; Annan Zhang, a CSAIL graduate student; and senior author Daniela Rus, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science and director of CSAIL. The paper is published today in Science Advances.
Architected materials
The researchers focused their efforts on lattices, a type of “architected material,” which exhibits customizable mechanical properties based solely on its geometry. For instance, changing the size or shape of cells in the lattice makes the material more or less flexible.
While architected materials can exhibit unique properties, integrating sensors within them is challenging given the materials’ often sparse, complex shapes. Placing sensors on the outside of the material is typically a simpler strategy than embedding sensors within the material. However, when sensors are placed on the outside, the feedback they provide may not provide a complete description of how the material is deforming or moving.
Instead, the researchers used 3D printing to incorporate air-filled channels directly into the struts that form the lattice. When the structure is moved or squeezed, those channels deform and the volume of air inside changes. The researchers can measure the corresponding change in pressure with an off-the-shelf pressure sensor, which gives feedback on how the material is deforming.
Because they are incorporated into the material, these “fluidic sensors” offer advantages over conventional sensor materials.
“Sensorizing” structures
The researchers incorporate channels into the structure using digital light processing 3D printing. In this method, the structure is drawn out of a pool of resin and hardened into a precise shape using projected light. An image is projected onto the wet resin and areas struck by the light are cured.
But as the process continues, the resin remains stuck inside the sensor channels. The researchers had to remove excess resin before it was cured, using a mix of pressurized air, vacuum, and intricate cleaning.
They used this process to create several lattice structures and demonstrated how the air-filled channels generated clear feedback when the structures were squeezed and bent.
“Importantly, we only use one material to 3D print our sensorized structures. We bypass the limitations of other multimaterial 3D printing and fabrication methods that are typically considered for patterning similar materials,” says Truby.
Building off these results, they also incorporated sensors into a new class of materials developed for motorized soft robots known as handed shearing auxetics, or HSAs. HSAs can be twisted and stretched simultaneously, which enables them to be used as effective soft robotic actuators. But they are difficult to “sensorize” because of their complex forms.
They 3D printed an HSA soft robot capable of several movements, including bending, twisting, and elongating. They ran the robot through a series of movements for more than 18 hours and used the sensor data to train a neural network that could accurately predict the robot’s motion.
Chin was impressed by the results — the fluidic sensors were so accurate she had difficulty distinguishing between the signals the researchers sent to the motors and the data that came back from the sensors.
“Materials scientists have been working hard to optimize architected materials for functionality. This seems like a simple, yet really powerful idea to connect what those researchers have been doing with this realm of perception. As soon as we add sensing, then roboticists like me can come in and use this as an active material, not just a passive one,” she says.
“Sensorizing soft robots with continuous skin-like sensors has been an open challenge in the field. This new method provides accurate proprioceptive capabilities for soft robots and opens the door for exploring the world through touch,” says Rus.
In the future, the researchers look forward to finding new applications for this technique, such as creating novel human-machine interfaces or soft devices that have sensing capabilities within the internal structure. Chin is also interested in utilizing machine learning to push the boundaries of tactile sensing for robotics.
“The use of additive manufacturing for directly building robots is attractive. It allows for the complexity I believe is required for generally adaptive systems,” says Robert Shepherd, associate professor at the Sibley School of Mechanical and Aerospace Engineering at Cornell University, who was not involved with this work. “By using the same 3D printing process to build the form, mechanism, and sensing arrays, their process will significantly contribute to researchers aiming to build complex robots simply.”
This research was supported, in part, by the National Science Foundation, the Schmidt Science Fellows Program in partnership with the Rhodes Trust, an NSF Graduate Fellowship, and the Fannie and John Hertz Foundation.
New-and-Improved Content Moderation Tooling
We are introducing a new-and-improved content moderation tool: The Moderation endpoint improves upon our previous content filter, and is available for free today to OpenAI API developers.
To help developers protect their applications against possible misuse, we are introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content — an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.
When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm — content prohibited by our content policy. The endpoint has been trained to be quick, accurate, and to perform robustly across a range of applications. Importantly, this reduces the chances of products “saying” the wrong thing, even when deployed to users at-scale. As a consequence, AI can unlock benefits in sensitive settings, like education, where it could not otherwise be used with confidence.
The Moderation endpoint helps developers to benefit from our infrastructure investments. Rather than build and maintain their own classifiers—an extensive process, as we document in our paper—they can instead access accurate classifiers through a single API call.
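As an illustration, a minimal sketch of calling the Moderation endpoint directly over HTTP might look like the following; it assumes an OPENAI_API_KEY environment variable, and the input text is made up.

```python
import os
import requests

# Send a piece of text to the Moderation endpoint and inspect the result.
response = requests.post(
    "https://api.openai.com/v1/moderations",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"input": "Sample text to check against the content policy."},
)
result = response.json()["results"][0]
print("Flagged:", result["flagged"])
print("Categories:", result["categories"])
```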
As part of OpenAI’s commitment to making the AI ecosystem safer, we are providing this endpoint to allow free moderation of all OpenAI API-generated content. For instance, Inworld, an OpenAI API customer, uses the Moderation endpoint to help their AI-based virtual characters “stay on-script”. By leveraging OpenAI’s technology, Inworld can focus on their core product – creating memorable characters.
Additionally, we welcome the use of the endpoint to moderate content not generated with the OpenAI API. In one case, the company NGL – an anonymous messaging platform, with a focus on safety – uses the Moderation endpoint to detect hateful language and bullying in their application. NGL finds that these classifiers are capable of generalizing to the latest slang, allowing them to remain more confident over time. Use of the Moderation endpoint to monitor non-API traffic is in private beta and will be subject to a fee. If you are interested, please reach out to us at support@openai.com.
Get started with the Moderation endpoint by checking out the documentation. More details of the training process and model performance are available in our paper. We have also released an evaluation dataset, featuring Common Crawl data labeled within these categories, which we hope will spur further research in this area.
Amazon and UW announce inaugural Science Hub faculty research awards
Six UW professors will advance artificial intelligence and robotics research with new grants.
Amazon SageMaker Automatic Model Tuning now supports SageMaker Training Instance Fallbacks
Today Amazon SageMaker announced the support of SageMaker training instance fallbacks for Amazon SageMaker Automatic Model Tuning (AMT) that allow users to specify alternative compute resource configurations.
SageMaker automatic model tuning finds the best version of a model by running many training jobs on your dataset using the ranges of hyperparameters that you specify for your algorithm. Then, it chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose.
Previously, users only had the option to specify a single instance configuration. This can lead to problems when the specified instance type isn’t available due to high utilization. In the past, your training jobs would fail with an InsufficientCapacityError (ICE). AMT used smart retries to avoid these failures in many cases, but it remained powerless in the face of sustained low capacity.
This new feature means that you can specify a list of instance configurations in order of preference, so that your AMT job automatically falls back to the next instance in the list in the event of low capacity.
In the following sections, we walk through these high-level steps for overcoming an ICE:
- Define HyperParameter Tuning Job Configuration
- Define the Training Job Parameters
- Create the Hyperparameter Tuning Job
- Describe training job
Define HyperParameter Tuning Job Configuration
The HyperParameterTuningJobConfig object describes the tuning job, including the search strategy, the objective metric used to evaluate training jobs, the ranges of the parameters to search, and the resource limits for the tuning job. This aspect wasn’t changed with today’s feature release. Nevertheless, we’ll go over it to give a complete example.
The ResourceLimits object specifies the maximum number of training jobs and parallel training jobs for this tuning job. In this example, we're using a random search strategy and specifying a maximum of 10 jobs (MaxNumberOfTrainingJobs) and 5 concurrent jobs (MaxParallelTrainingJobs) at a time.
The ParameterRanges object specifies the ranges of hyperparameters that this tuning job searches. We specify the name, as well as the minimum and maximum value of the hyperparameter to search. In this example, we define the minimum and maximum values for the Continuous and Integer parameter ranges and the names of the hyperparameters ("eta", "max_depth").
AmtTuningJobConfig={
"Strategy": "Random",
"ResourceLimits": {
"MaxNumberOfTrainingJobs": 10,
"MaxParallelTrainingJobs": 5
},
"HyperParameterTuningJobObjective": {
"MetricName": "validation:rmse",
"Type": "Minimize"
},
"ParameterRanges": {
"CategoricalParameterRanges": [],
"ContinuousParameterRanges": [
{
"MaxValue": "1",
"MinValue": "0",
"Name": "eta"
}
],
"IntegerParameterRanges": [
{
"MaxValue": "6",
"MinValue": "2",
"Name": "max_depth"
}
]
}
}
Define the Training Job Parameters
In the training job definition, we define the input needed to run a training job using the algorithm that we specify. After the training completes, SageMaker saves the resulting model artifacts to an Amazon Simple Storage Service (Amazon S3) location that you specify.
Previously, we specified the instance type, count, and volume size under the ResourceConfig parameter. When the instance specified under this parameter was unavailable, an Insufficient Capacity Error (ICE) was thrown.
To avoid this, we now have the HyperParameterTuningResourceConfig parameter under the TrainingJobDefinition, where we specify a list of instances to fall back on. The format of these instances is the same as in ResourceConfig. The job traverses the list top to bottom to find an available instance configuration. If an instance is unavailable, then instead of throwing an Insufficient Capacity Error (ICE), the job chooses the next instance in the list, thereby overcoming the ICE.
TrainingJobDefinition={
"HyperParameterTuningResourceConfig": {
"InstanceConfigs": [
{
"InstanceType": "ml.m4.xlarge",
"InstanceCount": 1,
"VolumeSizeInGB": 5
},
{
"InstanceType": "ml.m5.4xlarge",
"InstanceCount": 1,
"VolumeSizeInGB": 5
}
]
},
"AlgorithmSpecification": {
"TrainingImage": "433757028032.dkr.ecr.us-west-2.amazonaws.com/xgboost:latest",
"TrainingInputMode": "File"
},
"InputDataConfig": [
{
"ChannelName": "train",
"CompressionType": "None",
"ContentType": "json",
"DataSource": {
"S3DataSource": {
"S3DataDistributionType": "FullyReplicated",
"S3DataType": "S3Prefix",
"S3Uri": "s3://<bucket>/test/"
}
},
"RecordWrapperType": "None"
}
],
"OutputDataConfig": {
"S3OutputPath": "s3://<bucket>/output/"
},
"RoleArn": "arn:aws:iam::340308762637:role/service-role/AmazonSageMaker-ExecutionRole-20201117T142856",
"StoppingCondition": {
"MaxRuntimeInSeconds": 259200
},
"StaticHyperParameters": {
"training_script_loc": "q2bn-sagemaker-test_6"
},
}
Run a Hyperparameter Tuning Job
In this step, we’re creating and running a hyperparameter tuning job with the hyperparameter tuning resource configuration defined above.
We initialize a SageMaker client and create the job by specifying the tuning config, training job definition, and a job name.
import boto3
sm = boto3.client('sagemaker')
sm.create_hyper_parameter_tuning_job(
HyperParameterTuningJobName="my-job-name",
HyperParameterTuningJobConfig=AmtTuningJobConfig,
TrainingJobDefinition=TrainingJobDefinition)
Describe training jobs
The following function lists all instance types used during the experiment and can be used to verify whether a SageMaker training job automatically fell back to the next instance in the list during resource allocation.
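A minimal sketch of such a function, assuming the sm client and the tuning job name used in the previous step:

```python
import boto3

sm = boto3.client("sagemaker")  # same client as above

def list_instances(tuning_job_name):
    """Print the instance type each training job in the tuning job actually ran on."""
    summaries = sm.list_training_jobs_for_hyper_parameter_tuning_job(
        HyperParameterTuningJobName=tuning_job_name,
        MaxResults=100,
    )["TrainingJobSummaries"]

    for summary in summaries:
        details = sm.describe_training_job(TrainingJobName=summary["TrainingJobName"])
        print(summary["TrainingJobName"], details["ResourceConfig"]["InstanceType"])

list_instances("my-job-name")
```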
Conclusion
In this post, we demonstrated how you can now define a pool of instances on which your AMT experiment can fall back in the case of an InsufficientCapacityError. We saw how to define a hyperparameter tuning job configuration, as well as specify the maximum number of training jobs and maximum parallel jobs. Finally, we saw how to overcome the InsufficientCapacityError by using the HyperParameterTuningResourceConfig parameter, which can be specified under the training job definition.
To learn more about AMT, visit Amazon SageMaker Automatic Model Tuning.
About the authors
Doug Mbaya is a Senior Partner Solutions Architect with a focus on data and analytics. Doug works closely with AWS partners, helping them integrate data and analytics solutions in the cloud.
Kruthi Jayasimha Rao is a Partner Solutions Architect in the Scale-PSA team. Kruthi conducts technical validations for Partners, enabling them to progress in the Partner Path.
Bernard Jollans is a Software Development Engineer for Amazon SageMaker Automatic Model Tuning.
Design in the Age of Digital Twins: A Conversation With Graphics Pioneer Donald Greenberg
Asked about the future of design, Donald Greenberg holds up a model of a human aorta.
“After my son became an intravascular heart surgeon at the Cleveland Clinic, he hired one of my students to use CAT scans and create digital 3D models of an aortic aneurysm,” said the computer graphics pioneer in a video interview from his office at Cornell University.
The models enabled custom stents that fit so well that patients could leave the hospital soon after they were inserted. It's one example Greenberg gives of how computer graphics are becoming part of every human enterprise.
A Whole New Chapter
Expanding the frontier, he’s creating new tools for an architecture design course based on today’s capabilities for building realistic 3D worlds and digital twins. It will define a holistic process so everyone from engineers to city planners can participate in a design.
The courseware is still at the concept stage, but his passion for it is palpable. “This is my next big project, and I’m very excited about it,” said the computer graphics professor of the work, which is sponsored by NVIDIA.
“NVIDIA is superb at the hardware and the software algorithms, and for a long time its biggest advantage is in how it fits them together,” he said.
Greenberg imagines a design process open enough to include urban planners concerned with affordable housing, environmental activists mindful of sustainable living and neighbors who want to know the impact a new structure might have on their access to sunlight.
“I want to put people from different disciplines in the same foxhole so they can see things from different points of view at the same time,” said Greenberg, whose courses have spanned Cornell’s architecture, art, computer science, engineering and business departments.
Teaching With Omniverse
A multidisciplinary approach has fueled Greenberg’s work since 1968, when he started teaching at both Cornell’s colleges of engineering and architecture. And he’s always been rooted in the latest technology.
Today, that means inspiring designers and construction experts to enter the virtual worlds built with photorealistic graphics, simulations and AI in NVIDIA Omniverse.
“Omniverse expands, to multiple domains, the work done with Universal Scene Description, developed by some of the brightest graphics people at places like Pixar — it’s a superb environment for modern collaboration,” he said.
It’s a capability that couldn’t have existed without the million-X advances in computing Greenberg has witnessed in his 54-year career.
He recalls his excitement in 1979 when he bought a VAX-11/780 minicomputer, his first system capable of a million instructions per second. In one of his many SIGGRAPH talks, he said designers would someday have personal workstations capable of 100 MIPS.
Seeing Million-X Advances
The prediction proved almost embarrassingly conservative.
“Now I have a machine that's 10^12 times more powerful than my first computer — I feel like a surfer riding a tidal wave, and that's one reason I'm still teaching,” he said.
It’s a long way from the system at General Electric’s Visual Simulation Laboratory in Syracuse, New York, where in the late 1960s he programmed on punch cards to help create one of the first videos generated solely with computer graphics. The 18-minute animation wowed audiences and took him and 14 of his architecture students two years to create.
NASA used the same GE system to train astronauts how to dock the Apollo module with the lunar lander. And the space agency was one of the early adopters of digital twins, he notes, a fact that saved the lives of the Apollo 13 crew after a system malfunction two days into their trip to the moon.
From Sketches to Digital Twins
For Greenberg, it all comes down to the power of computer graphics.
“I love to draw, 99% of intellectual intake comes through our eyes and my recent projects are about how to go from a sketch or idea to a digital twin,” he said.
Among his few regrets, he said he’ll miss attending SIGGRAPH in person this year.
“It became an academic home for my closest friends and collaborators, a community of mavericks and the only place I found creative people with both huge imaginations and great technical skills, but it’s hard to travel at my age,” said the 88-year-old, whose pioneering continues in front of his computer screen.
“I have a whole bunch of stuff I’m working on that I call techniques in search of a problem, like trying to model how the retina sees an image — I’m just getting started on that one,” he said.
Learn More About Omniverse at SIGGRAPH
Anyone can get started working on digital twins with Omniverse by taking a free, self-paced online course at the NVIDIA Deep Learning Institute. And individuals can download Omniverse free.
Educators can request early access to the “Graphics & Omniverse” teaching kit. SIGGRAPH attendees can join a session on “The Metaverse for International Educators” or one of four hands-on training labs on Omniverse.
To learn more, watch NVIDIA’s CEO Jensen Huang and others in a special address at SIGGRAPH on-demand.
Reid Blackman: The ethics of AI
The author of Ethical Machines explains why companies pursuing ethical AI must ultimately place the responsibility with their senior leadership.
AI Flying Off the Shelves: Restocking Robot Rolls Out to Hundreds of Japanese Convenience Stores
Tokyo-based startup Telexistence this week announced it will deploy NVIDIA AI-powered robots to restock shelves at hundreds of FamilyMart convenience stores in Japan.
There are 56,000 convenience stores in Japan — the third-highest density worldwide. Around 16,000 of them are run by FamilyMart. Telexistence aims to save time for these stores by offloading repetitive tasks like refilling shelves of beverages to a robot, allowing retail staff to tackle more complex tasks like interacting with customers.
It’s just one example of what can be done by Telexistence’s robots, which run on the NVIDIA Jetson edge AI and robotics platform. The company is also developing AI-based systems for warehouse logistics with robots that sort and pick packages.
“We want to deploy robots to industries that support humans’ everyday life,” said Jin Tomioka, CEO of Telexistence. “The first space we’re tackling this is through convenience stores — a huge network that supports daily life, especially in Japan, but is facing a labor shortage.”
The company, founded in 2017, next plans to expand to convenience stores in the U.S., which is also plagued with a labor shortage in the retail industry — and where more than half of consumers say they visit one of the country’s 150,000 convenience stores at least once a month.
Telexistence Robots Stock Up at FamilyMart
Telexistence will begin deploying its restocking robots, called TX SCARA, to 300 FamilyMart stores in August — and aims to bring the autonomous machines to additional FamilyMart locations, as well as other major convenience store chains, in the coming years.
“Staff members spend a lot of time in the back room of the store, restocking shelves, instead of out with customers,” said Tomioka. “Robotics-as-a-service can allow staff to spend more time with customers.”
TX SCARA runs on a track and includes multiple cameras to scan each shelf, using AI to identify drinks that are running low and plan a path to restock them. The AI system can successfully restock beverages automatically more than 98% of the time.
In the rare cases that the robot misjudges the placement of the beverage or a drink topples over, there’s no need for the retail staff to drop their task to get the robot back up and running. Instead, Telexistence has remote operators on standby, who can quickly address the situation by taking manual control through a VR system that uses NVIDIA GPUs for video streaming.
Telexistence estimates that a busy convenience store needs to restock more than 1,000 beverages a day. TX SCARA’s cloud system maintains a database of product sales based on the name, date, time and number of items stocked by the robots during operation. This allows the AI to prioritize which items to restock first based on past sales data.
Achieving Edge AI With NVIDIA Jetson
TX SCARA has multiple AI models under the hood. An object-detection model identifies the types of drinks in a store to determine which one belongs on which shelf. It’s combined with another model that helps detect the movement of the robot’s arm, so it can pick up a drink and accurately place it on the shelf between other products. A third is for anomaly detection: recognizing if a drink has fallen over or off the shelf. One more detects which drinks are running low in each display area.
The Telexistence team used custom pre-trained neural networks as their base models, adding synthetic and annotated real-world data to fine-tune the neural networks for their application. Using a simulation environment to create more than 80,000 synthetic images helped the team augment their dataset so the robot could learn to detect drinks in any color, texture or lighting environment.
For AI model training, the team relied on an NVIDIA DGX Station. The robot itself uses two NVIDIA Jetson embedded modules: the NVIDIA Jetson AGX Xavier for AI processing at the edge, and the NVIDIA Jetson TX2 module to transmit video streaming data.
On the software side, the team uses the NVIDIA JetPack SDK for edge AI and the NVIDIA TensorRT SDK for high-performance inference.
“Without TensorRT, our models wouldn’t run fast enough to detect objects in the store efficiently,” said Pavel Savkin, chief robotics automation officer at Telexistence.
Telexistence further optimized its AI models using half-precision (FP16) instead of single-precision floating-point format (FP32).
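As a rough illustration (not Telexistence's actual build pipeline), enabling FP16 when building a TensorRT engine from an ONNX model with the TensorRT 8-era Python API looks roughly like the following; the model file name is a placeholder.

```python
import tensorrt as trt

# Sketch: build a TensorRT engine from an ONNX model with FP16 enabled.
# "detector.onnx" is a placeholder file name, not Telexistence's actual model.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("detector.onnx", "rb") as f:
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # run supported layers in half precision

engine = builder.build_serialized_network(network, config)
with open("detector.engine", "wb") as f:
    f.write(engine)
```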
Learn more about the latest in AI and robotics at NVIDIA GTC, running online Sept. 19-22. Registration is free.
3 Questions: Amar Gupta on an integrated approach to enhanced health-care delivery
Covid-19 was somewhat of a metaverse itself. Many of our domains turned digital, with much attention toward one emerging space: virtual care. The pandemic exacerbated the difficulties of providing appropriate medical board oversight to ensure a proper standard of services for patients. MIT researcher and former professor Amar Gupta explores how different states approach quality, safety, and coordination issues related to telemedicine and health care, and how we need to take an integrated approach to address the interoperability challenge and enhance care delivery.
Q: Since the onset of the global Covid-19 pandemic, how has the quality and landscape of patient care changed?
A: Covid-19 has served as a major catalyst for the adoption of virtual techniques in the U.S. and other countries around the globe. This adoption has occurred in many medical specialties, both in urban and rural areas. At the same time, it has raised several issues and challenges that need to be addressed on a priority basis.
In our recent research paper, we found that in the U.S., “the increased amount of virtual care during the Covid-19 pandemic has exacerbated the challenge of providing appropriate medical board oversight to ensure proper quality of care delivery and safety of patients. This is partly due to the conventional model of each state medical board holding responsibility for medical standards and oversight only within the jurisdiction of that state board and partly due to regulatory waivers and reduced enforcement of privacy policies.”
The prevailing restrictions, related to privacy of patient medical records and the ability for doctors from other states to see those records, were temporarily removed or made less prohibitive. This, in turn, can lead to situations where more medical images can go on an unauthorized basis into the public domain.
And then we have the overarching challenge of interoperability across medical practices and organizations, states, and countries. Years ago, it was just one doctor alone, or one medical system. Now a patient is going to multiple hospitals, multiple doctors. We find this creates issues with respect to treatment, as well as quality and safety of the patient, because the records are scattered or not easily accessed. Sometimes the same test is done two, three times over. Sometimes the records of another hospital are not looked at. Increasingly, medical professionals are complaining about the growing problem of information glut. Based partly on our previous work at successfully assisting major re-engineering and interoperability efforts in financial and defense industries, we believe that Covid-19 reinforced the urgent need for a broadly accepted global approach in the health-care interoperability arena.
Q: You recently published a paper about the impact of growing virtual care and the need for an integrated approach to enhance care delivery. Can you elaborate on your research study and subsequent proposal for the medical community?
A: The paper was started based on a presentation that I made in Washington, D.C., to a group of senior government officials about telemedicine, regulation, and quality control. The Federation of State Medical Boards then gave us names and addresses of the state medical boards in the U.S., and some abroad. We wrote to all of them with a questionnaire to find out what they were doing with respect to telemedicine.
A few of the questions we explored were: Do they have any standards for telemedicine in evaluating the quality of services being rendered? How do they deal with complaints? Have they received any complaints related to telemedicine?
We got responses from only some of the medical boards. What was clear is that there weren’t any uniform standards across the nation. In several states, there are two medical boards, one for allopathic medicine and one for osteopathic medicine.
It’s very difficult to be disbarred in the U.S. — the standards are very high. We found that there were cases when a doctor who had been disbarred from medical practice in one state was still practicing in another. There was also a case where the doctor had been disbarred in three states and was practicing in a fourth state.
We have instances of interstate telemedicine in the U.S., intercountry work in Europe, and intercontinental telemedicine today. Patients in the ICU at Emory University in Atlanta, for example, at nighttime, are seen by medical personnel working during day time in Australia. This is consistent with the model that we had proposed in our other paper to improve quality and safety of patients by addressing the consequences of circadian misalignment and sleep deprivation among doctors and other medical personnel.
We don’t want doctors who have been penalized in one city, state, or country going to another country and working there. Here, even within the country, this safeguard has not been historically true. For one, the Federation of the State Medical Boards itself has written that many people do not really register their complaints with them, which is cited in our research. There’s also a database available where state regulators can see what happened in other states with respect to specific doctors. That was used less than 100 times in 2017. In fact, two states used it for more than half of these cases. Some states never used it at all. They were basically neglecting what had happened to the doctor in other states, which was frightening.
The Federation of State Medical Boards recently developed a new technology to address this problem. They created an experimental website called docinfo.org, and they invited us to look at it. Using this site, we tried an experiment, by searching for a specific doctor who had been disbarred in three states. These database sites recommended that we have to go to the sites of the three state medical boards, and it actually took us there. When we got to the state medical boards, all the information has been redacted. This reminded me of write-only memory, where information is available somewhere, but nobody’s able to access it, which doesn’t really help the customer.
One of the state medical boards responded that “our state does not allow us to give any information under the Freedom of Information Act to anybody outside the state.” Another one, in our study, refused to give us any information, and said that, based on what we’ve written before, “I know what you’re going to do with this information. I’m not going to give it to you.”
The aspect of medical personnel other than doctors has been covered in a companion research paper: “Enhancing quality of healthcare and patient safety: oversight of physician assistants, nurses, and pharmacists in era of COVID-19 and beyond,” and its first reference asserts that medical error is the third major cause of death in the U.S.
People argue about the quality and cost of health care. If you look at the U.S. today, the cost per patient is the highest in the whole world. If you look at quality, the U.S. is generally ranked below all the other developed countries. In order to enhance quality and safety of health care as well as reduce overall cost, I propose that we need something like the equivalent of Jeanne Clery Act for health care, which “requires public and private colleges and universities to disclose information about certain crimes that occur on or near campus” — but related to doctors and other medical personnel.
If we have these types of techniques available, then patient-reported outcomes and the use of AI techniques will aid in getting our hands around how to improve health care not just for people, but for health care services and products, too. We really need to take that bigger initiative not only in this nation, but on a seamless basis around the world.
Q: With Covid-19, we saw the proliferation of AI-based solutions with predictive modeling, synthetic biology, and surveillance and contact monitoring. Predating the pandemic, robust AI models have enabled better forecasting, medical imaging, clinical workflows. What ongoing issues need to be addressed?
A: The definition of medicine has changed over the years. At one point, there was a doctor, and that doctor did most of the tasks. The nurse may be there, and a compounder to do the medications. The quality control issue was mainly on the doctor. Today, it’s a blend of the hospital network, doctors, bureaucrats, administrators. There are technical staff in charge of telemedicine systems and computer scientists who work on modeling.
Recently, I supervised a graduate thesis on prescription opioids, and we found that there was systematic discrimination. With white males, they were much more likely to be given the prescription. If it was a woman or a Black person, they were much less likely to get the pills, even with the same set of symptoms and issues. The graduate student also looked at the nurses records, and found that they were repeatedly saying, for one kind of patient, they were “less complaining,” and others were “complaining,” which in turn impacted the chance of getting the opioid prescription.
Now, trained AI models that assist in decision-making will also present bias. But in a situation like this, whom does one file a complaint against? Do you file it against the hospital? The doctor and nurse? The computer scientist?
In today’s world, as these systems are progressing from a single doctor to much more integrated system, it’s becoming more and more difficult to decide who is at fault. If they’re not taken care of earlier, we run the risk of large-scale harm.
AI-based networks are supposed to be trained and retrained at regular intervals using the latest data from a cohort of patients. As patients’ conditions change, and they take different drugs, the way they react to any other drug will be different. Few of these models are going through any retraining process.
About 15 years ago, I had coined the term “three-pronged approach” to describe my vision of evolving health care. The three-pronged approach means that there are people in proximity to the patient, maybe a nurse practitioner or family member who might be helping. There is a doctor who’s a domain expert who may be in another city, another state, another country. There’s IT and AI work that will take place.
The three-pronged approach to health care is very much in vogue today. To find effective solutions, we can’t look at a single prong — we need an integrated approach. While there are over 100 health-care interoperability efforts around the world which pertain to a particular geographic region or a particular medical specialty, we need to address the challenge of interoperability by devising and implementing a broadly accepted staged plan for global adoption, rather than just focusing at local, state, or national level. This, in turn, will also enable superior leveraging and management of health-care personnel, services, and products to support the global quest for health care for all: better, quicker, and less expensive.