Improve search accuracy with Spell Checker in Amazon Kendra

Amazon Kendra is an intelligent search service powered by machine learning. With Amazon Kendra Spell Checker, you can receive spelling suggestions for misspelled terms in your queries. By suggesting corrections for unrecognized terms, Spell Checker helps reduce how often queries return irrelevant results.

In this post, we explore how to use Amazon Kendra Spell Checker on the AWS Management Console, as well as how to enable Spell Checker in an Amazon Kendra-powered search application through the AWS Command Line Interface (AWS CLI) and AWS SDK.

Use Amazon Kendra Spell Checker on the console

You can automatically receive spelling suggestions for your misspelled Amazon Kendra queries when querying through the console.

On the Amazon Kendra console, choose your desired index, then choose Search indexed content in the navigation pane. Make sure that the selected index has ingested documents; in this post, we use the sample AWS documentation found in the Data sources section of the navigation pane.

On the Amazon Kendra search console, simply submit a query as you usually would. Misspelled terms in the query are substituted with suggested terms in the “Did you mean” section of the search console.

Choosing the suggested query submits a new query with the corrected spelling.

The results returned for the suggested query are significantly more relevant, thanks to Spell Checker!

Use Amazon Kendra Spell Checker in search applications

Search applications powered by Amazon Kendra can quickly and easily enable Spell Checker through the AWS CLI or AWS SDK, which we walk through in this section. Additionally, we go over an example of how to process the Spell Checker response.

AWS CLI

Let’s look at how AWS CLI users can opt in to Amazon Kendra Spell Checker to receive spelling suggestions for misspelled query terms. We use the AWS CLI to query Amazon Kendra as usual, with only one small change: we include the --spell-correction-configuration IncludeQuerySpellCheckSuggestions=true argument:

$ aws kendra query --query-text "what is knedar" --index-id [YOUR_INDEX_ID] --spell-correction-configuration IncludeQuerySpellCheckSuggestions=true

In addition to the normal query results, the response from Amazon Kendra now contains a SpellCorrectedQueries list whenever there are spelling suggestions for the query. For more information, see SpellCorrectedQuery.

// Full query response omitted for brevity
"SpellCorrectedQueries": [
  {
    "SuggestedQueryText": "what is kendra",
    "Corrections": [
      {
        "BeginOffset": 8,
        "EndOffset": 14,
        "Term": "knedar",
        "CorrectedTerm": "kendra"
      }
    ]
  }
]

AWS SDK

Next, let’s walk through how Amazon Kendra provides spell check functionality for AWS SDK users. For this example, we use Python 3. We submit a query with a few spelling errors, and print out the SpellCorrectedQueries object in the response:

import boto3

kendra = boto3.client('kendra')

index_id = '[YOUR_INDEX_ID]'
query_text = 'kendra fre teir hours'
spell_correction_configuration = { 'IncludeQuerySpellCheckSuggestions': True }

response = kendra.query(
  IndexId = index_id,
  QueryText = query_text,
  SpellCorrectionConfiguration = spell_correction_configuration
)

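# SpellCorrectedQueries is only present when there are suggestions,
# so fall back to an empty list instead of raising a KeyError.
print(response.get('SpellCorrectedQueries', []))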

The response from Amazon Kendra now contains the expected spelling suggestions:

[
  {
    'SuggestedQueryText': 'kendra free tier hours', 
    'Corrections': [
      {
        'BeginOffset': 7, 
        'EndOffset': 11, 
        'Term': 'fre', 
        'CorrectedTerm': 'free'
      }, 
      {
        'BeginOffset': 12, 
        'EndOffset': 16, 
        'Term': 'teir', 
        'CorrectedTerm': 'tier'
      }
    ]
  }
]
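In a search application, you would usually extract the suggested query text rather than print the raw list. The following is a minimal sketch of that step. The helper name first_suggested_query is hypothetical, and the sketch assumes that SpellCorrectedQueries may be missing from the response when Amazon Kendra has no suggestions:

```python
def first_suggested_query(response):
    """Return the first suggested query text, or None if there is none.

    Hypothetical helper: assumes 'SpellCorrectedQueries' may be absent
    (or empty) when there are no spelling suggestions.
    """
    suggestions = response.get('SpellCorrectedQueries') or []
    if not suggestions:
        return None
    return suggestions[0].get('SuggestedQueryText')


# Trimmed-down stand-in for a real query response.
sample_response = {
    'SpellCorrectedQueries': [
        {'SuggestedQueryText': 'kendra free tier hours', 'Corrections': []}
    ]
}

print(first_suggested_query(sample_response))
print(first_suggested_query({'ResultItems': []}))
```

Returning None when there is no suggestion lets the calling code decide whether to show a “Did you mean” prompt at all.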

Process the Amazon Kendra Spell Check response

Now that we’ve gone over how to programmatically get spelling suggestions through either the AWS CLI or AWS SDK, we can examine how we turn the response into a human-readable suggested query. For this example, we use the sample output from the previous section:

[
  {
    'SuggestedQueryText': 'kendra free tier hours', 
    'Corrections': [
      {
        'BeginOffset': 7, 
        'EndOffset': 11, 
        'Term': 'fre', 
        'CorrectedTerm': 'free'
      }, 
      {
        'BeginOffset': 12, 
        'EndOffset': 16, 
        'Term': 'teir', 
        'CorrectedTerm': 'tier'
      }
    ]
  }
]

Each SpellCorrectedQuery has two keys: SuggestedQueryText and Corrections.

  • SuggestedQueryText maps to a string containing the updated query with the suggested spelling corrections.
  • Corrections maps to a list of Correction objects, each of which contains the beginning and ending offsets of the correction, as well as the original term from the query and the spelling suggestion for that term.
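To make the structure concrete, the following sketch walks the sample output and slices SuggestedQueryText with the offsets of each Correction. In this sample, every slice matches CorrectedTerm; treat that as an observation about the sample data rather than a documented guarantee:

```python
# Sample SpellCorrectedQuery copied from the output shown earlier.
suggestion = {
    'SuggestedQueryText': 'kendra free tier hours',
    'Corrections': [
        {'BeginOffset': 7, 'EndOffset': 11, 'Term': 'fre', 'CorrectedTerm': 'free'},
        {'BeginOffset': 12, 'EndOffset': 16, 'Term': 'teir', 'CorrectedTerm': 'tier'},
    ],
}

text = suggestion['SuggestedQueryText']
slices = []
for correction in suggestion['Corrections']:
    # Slice the suggested text using the offsets of this correction.
    span = text[correction['BeginOffset']:correction['EndOffset']]
    slices.append(span)
    print(correction['Term'], '->', span)
```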

For our example, we want to show the suggested query text with the newly suggested terms italicized, similar to what is done on the Amazon Kendra console. To achieve this, we can add an HTML italics opening tag <i> at the BeginOffset of each Correction and an HTML italics closing tag </i> at the EndOffset of each Correction in the Corrections list. Note that BeginOffset and EndOffset are positions in SuggestedQueryText, so they reflect the length of the corrected terms, not the original terms.

Adding the italics tags to SuggestedQueryText gives us the following suggested query text:

kendra <i>free</i> <i>tier</i> hours
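The tagging step can be sketched in Python as follows. The function name italicize_corrections is hypothetical, and this is one possible approach rather than the code used by the Amazon Kendra console. It applies the corrections from last to first so that inserting tags never shifts an offset that is still to be used:

```python
def italicize_corrections(suggestion):
    """Wrap each corrected term in <i>...</i> tags (hypothetical helper)."""
    text = suggestion['SuggestedQueryText']
    # Process corrections from the end of the string backwards so earlier
    # offsets stay valid after tags are inserted.
    corrections = sorted(
        suggestion['Corrections'],
        key=lambda c: c['BeginOffset'],
        reverse=True,
    )
    for correction in corrections:
        begin = correction['BeginOffset']
        end = correction['EndOffset']
        text = text[:begin] + '<i>' + text[begin:end] + '</i>' + text[end:]
    return text


# Sample SpellCorrectedQuery copied from the output shown earlier.
sample = {
    'SuggestedQueryText': 'kendra free tier hours',
    'Corrections': [
        {'BeginOffset': 7, 'EndOffset': 11, 'Term': 'fre', 'CorrectedTerm': 'free'},
        {'BeginOffset': 12, 'EndOffset': 16, 'Term': 'teir', 'CorrectedTerm': 'tier'},
    ],
}

print(italicize_corrections(sample))
```

If the result is rendered in a web page, consider escaping the query text as well. This sketch leaves that out because escaping changes string lengths and would need to be applied per segment.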

As you can see, Amazon Kendra Spell Checker makes it simple to add spell check functionality to your search application.

Conclusion

Spell Checker is a new, powerful feature offered by Amazon Kendra. It is a simple, effective way to reduce the number of queries that return unhelpful results, because it gives end-users spelling suggestions for misspelled terms.

Spell Checker is available in all AWS Regions where Amazon Kendra is available, and supports all languages currently supported by Amazon Kendra.

To learn more about Amazon Kendra, visit the Amazon Kendra product page.


About the Author

Matthew Peretick is a Software Development Engineer at Amazon Web Services based in New York City. Matthew is a member of the Amazon Kendra team focused on enhancing the Amazon Kendra query experience.
