Automate Sitecore Search Updates with Azure Functions & Sitecore Experience Edge Webhooks

Amit Kumar
15 min read

In my previous blog post: Creating Azure Function Webhooks for Sitecore XM Cloud, I explained how to create Azure Function Webhooks for Sitecore XM Cloud integration. We covered the fundamentals of Azure Functions, webhooks, and the step-by-step process of setting up and registering webhooks with Experience Edge.

Keeping your Sitecore Search index up to date with the latest content from Sitecore XM Cloud is crucial for delivering accurate and relevant search results. This article demonstrates how to automate near-real-time content updates using Azure Function webhooks with Sitecore Experience Edge, use GraphQL queries to extract published content, and push it to Sitecore Search using the Sitecore Search Ingestion API.

🧠 How Experience Edge OnUpdate Events Trigger Azure Webhooks

When content is published to XM Cloud, Sitecore's Experience Edge can send webhook notifications using its OnUpdate event. These events can be directed to Azure Functions, which are set up to listen for these triggers and respond appropriately.

The Azure Function receives the update, extracts important identifiers, and starts the process of content enrichment and re-indexing.

When an Azure Function is integrated with Sitecore XM Cloud, you get:

  • Instant Updates: Get notifications through webhooks when content changes in XM Cloud CMS.

  • Automated Indexing: Push updates to Sitecore Search without manual intervention.

  • Scalability: Serverless architecture ensures cost-efficiency and scalability.

➡️Data Flow Architecture

Before we get into the implementation details, let's first understand the overall data flow:

  1. Content is published in Sitecore XM Cloud

  2. Experience Edge gets the published content

  3. Experience Edge sends an "OnUpdate" event to our Azure Function webhook

  4. The Azure Function handles the event data

  5. The Function uses the Experience Edge GraphQL API to get full content details

  6. The Function updates the matching document in Sitecore Search using the Ingestion API

This setup is shown in the diagram below:

🛠️Implementing the Solution

Step 1: Validate the OnUpdate Event Data

When your Azure Function receives an "OnUpdate" event from Experience Edge, the payload contains important information about the updated content. We need to check that it has the expected attributes. Specifically, we will validate that the webhook data includes an updates array and at least one entry whose entity_definition is LayoutData. Here is a sample webhook payload:

{
  "invocation_id": "c5e5be25-7bcd-473a-a8e4-438c3664ce91",
  "updates": [
    {
      "identifier": "02BEB417C6DD4DF18E2518D771A16325",
      "entity_definition": "Item",
      "operation": "Update",
      "entity_culture": "fr-CA"
    },
    {
      "identifier": "AmitKumar",
      "entity_definition": "SiteInfo",
      "operation": "Update",
      "entity_culture": "all"
    },
    {
      "identifier": "02BEB417C6DD4DF18E2518D771A16325",
      "entity_definition": "Item",
      "operation": "Update",
      "entity_culture": "en"
    },
    {
      "identifier": "8B0A6A3024F94EAEA631E7494C3EA648-layout",
      "entity_definition": "LayoutData",
      "operation": "Update",
      "entity_culture": "fr-CA"
    },
    {
      "identifier": "8B0A6A3024F94EAEA631E7494C3EA648",
      "entity_definition": "Item",
      "operation": "Update",
      "entity_culture": "en"
    },
    {
      "identifier": "8B0A6A3024F94EAEA631E7494C3EA648-layout",
      "entity_definition": "LayoutData",
      "operation": "Update",
      "entity_culture": "en"
    }
  ],
  "continues": false
}

Here's how to validate the key details:

using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;

// The namespace and class names are placeholders; the original snippet omitted them.
namespace ExperienceEdgeWebhooks
{
    public static class ReadJSONFunction
    {
        [FunctionName("ReadJSON")]
        public static async Task<IActionResult> Run(
            [HttpTrigger(AuthorizationLevel.Function, "post", Route = "ReadJSON")] HttpRequest req,
            ILogger log)
        {
            log.LogInformation($"Executing function > ReadJSON: {nameof(Run)}");
            string requestBody = string.Empty;
            try
            {
                // Read the raw request body sent by Experience Edge
                requestBody = await new StreamReader(req.Body).ReadToEndAsync();

                // Deserialize the request body into the data model
                var receivedPageLayoutData = JsonConvert.DeserializeObject<WebhookRequestBody>(requestBody);

                // Validate the request body
                if (receivedPageLayoutData?.Updates == null || !receivedPageLayoutData.Updates.Any())
                {
                    log.LogWarning("No updates found.");
                    return new BadRequestObjectResult("No updates found.");
                }

                // Extract layout data updates
                var pageLayoutData = receivedPageLayoutData.Updates
                    .Where(update => string.Equals(update.ItemType, "LayoutData", StringComparison.OrdinalIgnoreCase) &&
                                     string.Equals(update.ItemEventType, "Update", StringComparison.OrdinalIgnoreCase))
                    .Select(update =>
                    {
                        // Strip the "-layout" suffix to get the bare item ID (null-safe)
                        string itemId = (update.ItemID ?? string.Empty).Replace("-layout", "");
                        return new UpdateLayoutResponse
                        {
                            ItemID = itemId,
                            ItemLanguage = update.ItemLanguage,
                            // Convert the 32-character ID to hyphenated GUID format
                            ItemGUID = Guid.TryParse(itemId, out Guid parsedGuid)
                                ? parsedGuid.ToString("D")
                                : string.Empty
                        };
                    })
                    .ToList();

                // Log "Page Layout Data not found." and return a Bad Request
                if (!pageLayoutData.Any())
                {
                    log.LogWarning("Page Layout Data not found.");
                    return new BadRequestObjectResult("Page Layout Data not found.");
                }
                // After this get the data from Experience Edge GraphQL Delivery API.
            }
            catch (Exception ex)
            {
                // Log the failure so it shows up in the function logs, then report it to the caller
                log.LogError(ex, "Error processing request");
                return new BadRequestObjectResult("Error processing request " + ex.Message);
            }

            // Echo the received payload back to the caller
            return new OkObjectResult(requestBody);
        }
    }
}

The code above relies on a data model that maps the webhook payload, which lets you access and validate the updates and entity_definition attributes.

Data model to map the Experience Edge webhook payload:

public class UpdateLayoutResponse
{
    [JsonProperty("identifier")]
    public string ItemID { get; set; }

    [JsonProperty("entity_culture")]
    public string ItemLanguage { get; set; }

    [JsonProperty("entity_definition")]
    public string ItemType { get; set; }

    [JsonProperty("operation")]
    public string ItemEventType { get; set; }

    /// <summary>
    /// Item ID string in GUID Format
    /// </summary>
    public string ItemGUID { get; set; }
}

public class WebhookRequestBody
{
    [JsonProperty("updates")]
    public List<UpdateLayoutResponse> Updates { get; set; }
}
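
To see the filtering rules from Step 1 in isolation, here is a minimal sketch that reads the same payload with System.Text.Json instead of the Newtonsoft model above. The class and method names (WebhookPayloadReader, GetUpdatedPages) are hypothetical and not part of the original function. It also de-duplicates repeated (item, language) pairs, which is an addition of this sketch rather than something the original code does.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: not part of the original Azure Function.
public static class WebhookPayloadReader
{
    // Returns distinct (item GUID, language) pairs for LayoutData "Update" entries.
    public static List<(string ItemGuid, string Language)> GetUpdatedPages(string webhookJson)
    {
        var pages = new List<(string ItemGuid, string Language)>();
        var seen = new HashSet<string>();

        using JsonDocument doc = JsonDocument.Parse(webhookJson);
        if (doc.RootElement.ValueKind != JsonValueKind.Object ||
            !doc.RootElement.TryGetProperty("updates", out JsonElement updates) ||
            updates.ValueKind != JsonValueKind.Array)
        {
            return pages; // no "updates" array: nothing to process
        }

        foreach (JsonElement update in updates.EnumerateArray())
        {
            if (update.ValueKind != JsonValueKind.Object) continue;

            string type = GetString(update, "entity_definition");
            string operation = GetString(update, "operation");
            if (!string.Equals(type, "LayoutData", StringComparison.OrdinalIgnoreCase) ||
                !string.Equals(operation, "Update", StringComparison.OrdinalIgnoreCase))
            {
                continue; // skip Item, SiteInfo, and non-update entries
            }

            // Strip the "-layout" suffix and normalize to hyphenated GUID format
            string rawId = GetString(update, "identifier").Replace("-layout", "");
            if (!Guid.TryParse(rawId, out Guid itemGuid)) continue;

            string id = itemGuid.ToString("D");
            string language = GetString(update, "entity_culture");
            if (seen.Add(id + "|" + language))
            {
                pages.Add((id, language));
            }
        }
        return pages;
    }

    // Reads a string property, returning an empty string when it is missing.
    private static string GetString(JsonElement element, string name) =>
        element.TryGetProperty(name, out JsonElement value) && value.ValueKind == JsonValueKind.String
            ? value.GetString() ?? string.Empty
            : string.Empty;
}
```

For the sample payload shown earlier, this would yield one entry per language for item 8B0A6A3024F94EAEA631E7494C3EA648 and ignore the Item and SiteInfo entries.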

Step 2: Query Experience Edge GraphQL API

Once you have the item ID (ItemGUID), use the Experience Edge GraphQL Delivery API to fetch the Sitecore item field values that you will send to Sitecore Search.

You might receive multiple updated page items from a single "OnUpdate" event. Therefore, you need to loop through all the page items and build the Experience Edge GraphQL search query for the published items.

Before running the XE GQL query, you need the sc_apikey and the XE GQL endpoint URL. To get the key, log in to your XM Cloud instance, navigate to XM Cloud Deploy App ➡️ Project ➡️ Environment ➡️ Project Details tab, and generate the Delivery API token.

                // After this get the data from Experience Edge GraphQL Delivery API.
                string xeSearchGQLQuery = string.Empty;
                foreach (var item in pageLayoutData)
                {
                    // Placeholder: replace <item.ItemID> and <BASE PAGE LAYOUT ID> in the query below
                    // and store the actual GraphQL query text in this variable.
                    xeSearchGQLQuery = "Replace the <item.ItemID> and <BASE PAGE LAYOUT ID> placeholder and store the actual GQL Query ";
                    /* ========== GraphQL Search Query ==========================
                    query SearchQuery {
                      search(
                        where: {
                          AND: [
                            {
                              OR: [
                                { name: "_path", value: "<item.ItemID>", operator: CONTAINS }
                              ]
                            }
                            { name: "_language", value: "en" }
                            { name: "_templates", value: "<BASE PAGE LAYOUT ID>", operator: CONTAINS }
                          ]
                        }
                        first: 2
                      ) {
                        total
                        results {
                          id(format: "D")
                          name
                          template {
                            id(format: "D")
                            name
                          }
                          path
                          language { name }
                          title: field(name: "Title") { jsonValue }
                          description: field(name: "content") { jsonValue }
                        }
                      }
                    }
                    ============================================================
                     */

                    log.LogInformation($"Executing function GQL Query");
                    try
                    {
                        using var client = new HttpClient();

                        // Set the API key in the request header
                        client.DefaultRequestHeaders.Add("sc_apikey", "<Experience Edge Key>");

                        // Wrap the GraphQL query in a JSON object
                        var xeRequestBody = new
                        {
                            query = xeSearchGQLQuery
                        };

                        // Serialize the GraphQL request wrapper (not the incoming webhook body) to JSON
                        var xeQueryJSON = System.Text.Json.JsonSerializer.Serialize(xeRequestBody);

                        // Create the JSON content for the POST request
                        var xeQueryContent = new StringContent(xeQueryJSON, Encoding.UTF8, "application/json");

                        // Send the POST request
                        var xeQueryResponse = await client.PostAsync("https://edge.sitecorecloud.io/api/graphql/v1", xeQueryContent);

                        // Ensure the response is successful
                        xeQueryResponse.EnsureSuccessStatusCode();

                        // Read the response content
                        var xeQueryResponseResults = await xeQueryResponse.Content.ReadAsStringAsync();

                        // Deserialize the JSON response to the Root object
                        var xeQueryresponseRoot = System.Text.Json.JsonSerializer.Deserialize<ResultRoot>(xeQueryResponseResults, new JsonSerializerOptions
                        {
                            PropertyNameCaseInsensitive = true
                        });

                        if (xeQueryresponseRoot != null && xeQueryresponseRoot.Data != null && xeQueryresponseRoot.Data.Search != null)
                        {
                            // Process the search results
                        }
                        else
                        {
                            log.LogWarning("No search results found.");
                        }
                    }
                    catch (HttpRequestException httpEx)
                    {
                        // Log and handle HTTP-specific exceptions
                        log.LogError($"HTTP Request Error: {httpEx.Message}");
                        throw;
                    }
                    catch (System.Text.Json.JsonException jsonEx)
                    {
                        // Log and handle JSON deserialization errors
                        log.LogError($"JSON Deserialization Error: {jsonEx.Message}");
                        throw;
                    }
                    catch (Exception ex)
                    {
                        // Log and handle any other exceptions
                        log.LogError($"An error occurred: {ex.Message}");
                        throw;
                    }
                }
💡
In the code block above, we build the Experience Edge GraphQL search query dynamically for the published page items. We can also send more than one page item ID in a single call, which reduces the number of XE API calls counted against our quota: 👇
query SearchQuery {
  search(
    where: {
      AND: [
        {
          OR: [
            { name: "_path", value: "<item1.ItemID>", operator: CONTAINS }
            { name: "_path", value: "<item2.ItemID>", operator: CONTAINS }
          ]
        }
        { name: "_language", value: "en" }
        { name: "_templates", value: "<BASE PAGE LAYOUT ID>", operator: CONTAINS }
      ]
    }
    first: 2
  ) {
    total
    results {
      rendered
      id(format: "D")
      name
      template {
        id(format: "D")
        name
      }
      path
      language { name }
      title: field(name: "Title") { jsonValue }
      description: field(name: "content") { jsonValue }
    }
  }
}
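
One possible way to assemble this multi-item query in C# is sketched below. EdgeQueryBuilder and BuildSearchQuery are hypothetical names, and pageSize is a parameter introduced here for illustration (the queries above hard-code first: 2). The sketch assumes the IDs have already been validated as GUIDs, since they are inserted into the query text as-is.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: builds one search query for several page item IDs.
public static class EdgeQueryBuilder
{
    public static string BuildSearchQuery(
        IEnumerable<string> itemIds, string basePageTemplateId, string language, int pageSize)
    {
        // Remove duplicate IDs so each item gets exactly one "_path" clause
        var ids = itemIds.Distinct(StringComparer.OrdinalIgnoreCase).ToList();
        if (ids.Count == 0)
        {
            throw new ArgumentException("At least one item ID is required.", nameof(itemIds));
        }

        // One "_path CONTAINS <id>" clause per item, combined under OR
        string pathClauses = string.Join("\n            ",
            ids.Select(id => $"{{ name: \"_path\", value: \"{id}\", operator: CONTAINS }}"));

        // In an interpolated verbatim string, "{{" emits "{" and "\"\"" emits a quote
        return $@"query SearchQuery {{
  search(
    where: {{
      AND: [
        {{
          OR: [
            {pathClauses}
          ]
        }}
        {{ name: ""_language"", value: ""{language}"" }}
        {{ name: ""_templates"", value: ""{basePageTemplateId}"", operator: CONTAINS }}
      ]
    }}
    first: {pageSize}
  ) {{
    total
    results {{
      id(format: ""D"")
      name
      path
      language {{ name }}
      title: field(name: ""Title"") {{ jsonValue }}
      description: field(name: ""content"") {{ jsonValue }}
    }}
  }}
}}";
    }
}
```

The returned string can be assigned to xeSearchGQLQuery once, before the HTTP call, instead of issuing one request per item.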

After running the above query, you will receive a response in the following format from Experience Edge GraphQL Delivery API. You need to map this response to your custom data model for further processing:

{
  "data": {
    "search": {
      "total": 35,
      "results": [
        {
          "id": "af81673a-e745-40b2-8ea7-34e0589693a7",
          "name": "business",
          "template": {
            "id": "1226c3c7-2d1f-48f5-87b4-c1da459f05e5",
            "name": "Content Page"
          },
          "path": "/sitecore/content/Verticals/Financial/Home/business",
          "language": {
            "name": "en"
          },
          "title": {
            "jsonValue": {
              "value": "Business"
            }
          },
          "description": {
            "jsonValue": {
              "value": "<p>Support for you and your business. Guaranteed.</p>"
            }
          }
        },
        {
          "id": "b2a38fb8-2e85-4d58-afe5-c12beccbee3a",
          "name": "Home",
          "template": {
            "id": "5282d418-6c18-46eb-85af-9116cde9e6eb",
            "name": "Full Width Page"
          },
          "path": "/sitecore/content/Verticals/Financial/Home",
          "language": {
            "name": "en"
          },
          "title": {
            "jsonValue": {
              "value": "PLAY! Financial"
            }
          },
          "description": {
            "jsonValue": {
              "value": ""
            }
          }
        }
      ]
    }
  }
}
You might get more than one result for the Page Item ID (e.g., 5282d418-6c18-46eb-85af-9116cde9e6eb), because the _path filter with CONTAINS can also match other items that have this ID in their path. Before creating the JSON payload for the Sitecore Search Ingestion API, match each item ID in the XE Delivery API response against the identifier values from the Experience Edge webhook payload. This way, you won't send unnecessary page items to Sitecore Search.
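
The mapping and matching described above could look like the following sketch. ResultRoot is the type name used in the Step 2 code, but its shape is not shown in this article, so every class and property below is an assumption derived from the sample response. EdgeResultFilter and KeepWebhookItems are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Assumed shape of the custom data model, derived from the sample response.
public class ResultRoot { public ResultData Data { get; set; } }
public class ResultData { public SearchResponse Search { get; set; } }
public class SearchResponse
{
    public int Total { get; set; }
    public List<SearchResult> Results { get; set; } = new List<SearchResult>();
}
public class SearchResult
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Path { get; set; }
    public LanguageInfo Language { get; set; }
    public FieldValue Title { get; set; }
    public FieldValue Description { get; set; }
}
public class LanguageInfo { public string Name { get; set; } }
public class FieldValue { public FieldJsonValue JsonValue { get; set; } }
public class FieldJsonValue { public string Value { get; set; } }

// Hypothetical helper: keeps only results whose ID was reported by the webhook.
public static class EdgeResultFilter
{
    public static List<SearchResult> KeepWebhookItems(string edgeJson, IEnumerable<string> webhookItemGuids)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        ResultRoot root = JsonSerializer.Deserialize<ResultRoot>(edgeJson, options);

        // GUID casing can differ between the webhook and the response, so compare case-insensitively
        var wanted = new HashSet<string>(webhookItemGuids, StringComparer.OrdinalIgnoreCase);
        var results = root?.Data?.Search?.Results ?? new List<SearchResult>();
        return results.Where(r => r.Id != null && wanted.Contains(r.Id)).ToList();
    }
}
```

Here webhookItemGuids would be the ItemGUID values collected in Step 1.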

Step 3: Update Sitecore Search via Ingestion API

With the content details retrieved, you can now update the corresponding document in Sitecore Search using the Ingestion API. Use the PATCH method to partially update the document (only the specified fields are updated), or use the PUT method to replace the entire document with the provided data.

Currently, the Sitecore Search Ingestion API does not support batch (bulk) updates, so we need to go through the page items one by one. For each item, create a JSON payload that updates the fields in the Sitecore Search document with the actual values from the XE Delivery API response. Here is the code block (assuming you have mapped the XE Delivery API response to the custom data model):

                            // Process the search results
                            // Create one client for all ingestion calls instead of one per document
                            using var searchIngestionAPIClient = new HttpClient();
                            foreach (var result in xeQueryresponseRoot.Data.Search.Results)
                            {
                                log.LogInformation($"Result ID: {result.Id}, Name: {result.Name}");
                                // You can access other properties of the result as needed, for example:
                                // log.LogInformation($"Title: {result.Title?.JsonValue?.Value}");

                                // Build the JSON payload for the Sitecore Search Ingestion API. This assumes the Sitecore item ID
                                // is stored as the documentId, e.g. 68101283-41a8-4712-ba00-58fbfa7fec1c.
                                // The Search Ingestion API endpoint has the format:
                                // "https://discover.sitecorecloud.io/ingestion/v1/domains/{searchDomain}/sources/{searchSourceIds}/entities/{entity}/documents/{documentId}?locale={locale}"
                                // In the URL below, 1234567890 is the domain ID and 09876543 is the source ID.

                                try
                                {
                                    // Interpolate the document ID into the URL instead of sending a literal "{...}" placeholder
                                    string searchIngestionApiUrl = $"https://discover.sitecorecloud.io/ingestion/v1/domains/1234567890/sources/09876543/entities/Content/documents/{result.Id}?locale=en";

                                    // Serialize the payload so that values are inserted and escaped correctly.
                                    // result.Description is assumed to be mapped like result.Title in your data model.
                                    string searchIngestionAPIQuery = System.Text.Json.JsonSerializer.Serialize(new
                                    {
                                        document = new
                                        {
                                            id = result.Id,
                                            fields = new { description = result.Description?.JsonValue?.Value }
                                        }
                                    });

                                    // Create the HTTP request
                                    // PATCH: Partially updates the resource with the provided payload (only the specified fields are updated).
                                    // PUT: Replaces the entire resource with the provided payload.
                                    var searchIngestionAPIrequest = new HttpRequestMessage(HttpMethod.Patch, searchIngestionApiUrl)
                                    {
                                        Headers = { { "Authorization", "{Sitecore Search API Ingestion Key Value}" } },
                                        Content = new StringContent(searchIngestionAPIQuery, Encoding.UTF8, "application/json")
                                    };

                                    // Send with the ingestion client, not the Experience Edge client
                                    var searchIngestionAPIResponse = await searchIngestionAPIClient.SendAsync(searchIngestionAPIrequest);

                                    // Log the outcome and continue, so one document does not stop the remaining updates
                                    if (searchIngestionAPIResponse.IsSuccessStatusCode)
                                    {
                                        log.LogInformation($"Successfully updated document {result.Id}");
                                    }
                                    else
                                    {
                                        log.LogError($"Error updating document {result.Id}: {(int)searchIngestionAPIResponse.StatusCode}");
                                    }
                                }
                                catch (Exception ex)
                                {
                                    log.LogError($"Error processing search request for {result.Id}: {ex.Message}");
                                }
                            }
📣
To update a document in Sitecore Search, you need its unique document ID. Ideally, the Sitecore item ID was used as the document ID during initial indexing. If not, you must first call the Sitecore Search API to look up the document ID, using field values of the page item you want to update.

The Search Ingestion API is in the format https://discover.sitecorecloud.io/ingestion/v1/domains/{searchDomain}/sources/{searchSourceIds}/entities/{entity}/documents/{documentId}?locale={locale}

To do this, we need to gather the required information to replace in the URL above and obtain the API Key.

| Placeholder | Details |
| --- | --- |
| {searchDomain} | This is your Sitecore Search Domain ID. You can find it by logging into the Sitecore Search CEC, then going to Administration ➡️ Domain Settings ➡️ General Settings ➡️ Domain Information ➡️ DOMAIN ID. |
| {searchSourceIds} | This is your Sitecore Search Source ID. You can find it by logging into the Sitecore Search CEC, then going to Sources. There, you will see a list of available Sitecore Search Sources. Choose the Source whose content you want to update and copy its SOURCE ID. |
| {entity} | In Sitecore Search, an entity is a template for the document that has fields. You will add field values that you want to be indexed in Sitecore Search. When creating the Sitecore Search Source, you will add the Tags Definition and select specific Entities, like the content entity. |
| {documentId} | This is the unique ID of the indexed document in Sitecore Search. This article assumes that the Sitecore item ID is used as the document ID. |
| {locale} | This is the language of the indexed document in Sitecore Search. |
| {Sitecore Search API Ingestion Key Value} | This is your Sitecore Search Ingestion API Key value. You can find it by logging into the Sitecore Search CEC, then going to Developer Resources ➡️ API Access ➡️ API Keys Section > ingestion. |
🕵
To retrieve or search items in Sitecore Search, you need the Event/Discover key from Developer Resources ➡️ API Access ➡️ API Keys Section.

Understanding Locale Conversion for Sitecore Search API Integration

The locale identifier format for Sitecore Search is ${language}_${country}. For example, if the Sitecore item language is stored as fr-CA in XM Cloud, the Experience Edge webhook returns the entity_culture value as fr-CA, where "fr" is the language (French) and "CA" is the country (Canada). When updating the document through the Sitecore Search Ingestion API, you should send the locale as fr_CA, not fr-CA. Make sure to handle this locale conversion in your implementation.
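
A minimal sketch of that conversion, combined with filling in the Ingestion API URL placeholders described earlier, might look like this. SearchIngestionUrl, ToSearchLocale, and ForDocument are hypothetical names. The sketch only swaps the separator; whether a language-only culture such as "en" needs a country added depends on how locales are configured in your Sitecore Search domain, which is not covered here.

```csharp
using System;

// Hypothetical helper: not part of the original Azure Function.
public static class SearchIngestionUrl
{
    // Converts an Experience Edge culture ("fr-CA") to the Sitecore Search form ("fr_CA").
    public static string ToSearchLocale(string entityCulture) =>
        entityCulture.Replace('-', '_');

    // Fills the placeholders of the Ingestion API document URL.
    public static string ForDocument(
        string domainId, string sourceId, string entity, string documentId, string entityCulture)
    {
        return "https://discover.sitecorecloud.io/ingestion/v1"
            + $"/domains/{Uri.EscapeDataString(domainId)}"
            + $"/sources/{Uri.EscapeDataString(sourceId)}"
            + $"/entities/{Uri.EscapeDataString(entity)}"
            + $"/documents/{Uri.EscapeDataString(documentId)}"
            + $"?locale={Uri.EscapeDataString(ToSearchLocale(entityCulture))}";
    }
}
```

In the Step 3 loop, the hard-coded URL could then be replaced by a call such as SearchIngestionUrl.ForDocument("1234567890", "09876543", "Content", result.Id, item.ItemLanguage), using the example domain and source IDs from above.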

Step 4: Putting It All Together

You can find all the code used in this blog on GitHub. It includes a complete Azure Function that processes the Experience Edge Webhook response, validates the JSON response, queries the Experience Edge GraphQL API to get the necessary Sitecore Item details, and updates Sitecore Search. To access the code, please scan the QR code below.

⚠️ Challenges and Troubleshooting

While setting up this workflow, I faced these challenges:

  • Correctly mapping webhook payload fields to your search index schema.

  • Mapping complex content structures from GraphQL.

  • Handling authentication and authorization for both GraphQL and Ingestion API endpoints.

  • Handling rate limits and retries for failed PATCH requests.

  • Ensuring that each update happens only once and avoiding duplicate updates.

Pro tip: Addressing these issues requires careful error handling, robust logging, and thorough testing of your Azure Function logic.

✅Best Practices and Considerations

Error Handling and Retry Logic

When implementing this solution, it's important to have strong error handling and retry logic. Network problems or temporary service outages shouldn't cause updates to be lost.
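
As an illustration, a simple retry wrapper with exponential backoff could be placed around the PATCH call. This is a sketch under assumptions, not the approach used in the repository: RetryHelper and SendWithRetryAsync are hypothetical names, and treating HTTP 429 and 5xx responses as retryable is a choice made here, not documented Sitecore Search behavior.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: retries transient HTTP failures with exponential backoff.
public static class RetryHelper
{
    // "send" must create a new HttpRequestMessage on every call,
    // because a request message cannot be sent twice.
    public static async Task<HttpResponseMessage> SendWithRetryAsync(
        Func<Task<HttpResponseMessage>> send, int maxAttempts = 3, int baseDelayMs = 500)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                HttpResponseMessage response = await send();
                int status = (int)response.StatusCode;
                bool transient = status == 429 || status >= 500;

                // Return on success, on non-retryable errors, or when attempts are exhausted
                if (!transient || attempt == maxAttempts) return response;
                response.Dispose();
            }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                // Network-level failure: fall through to the delay and try again.
                // On the last attempt the filter is false and the exception propagates.
            }

            // Wait baseDelayMs, then 2x, then 4x, and so on
            await Task.Delay(baseDelayMs * (1 << (attempt - 1)));
        }
    }
}
```

A call site would pass a lambda that builds the request and sends it, for example () => searchIngestionAPIClient.SendAsync(CreatePatchRequest(result)), where CreatePatchRequest is a hypothetical factory for the request shown in Step 3.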

Performance Optimization

For high-traffic sites, consider using batching logic to manage multiple updates efficiently.

  1. Queue incoming webhook events

  2. Process them in groups at regular times
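
The grouping step could be sketched as follows. How events are queued (for example, in an Azure Storage queue) is outside this sketch, and UpdateBatcher and ToBatches are hypothetical names. The helper removes duplicate (item, locale) pairs first, so a page published several times in a short window is processed once per batch run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: groups queued updates into de-duplicated batches.
public static class UpdateBatcher
{
    public static List<List<(string ItemGuid, string Locale)>> ToBatches(
        IEnumerable<(string ItemGuid, string Locale)> queued, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        // Value tuples compare by value, so Distinct removes repeated (item, locale) pairs
        var distinct = queued.Distinct().ToList();

        var batches = new List<List<(string ItemGuid, string Locale)>>();
        for (int i = 0; i < distinct.Count; i += batchSize)
        {
            batches.Add(distinct.GetRange(i, Math.Min(batchSize, distinct.Count - i)));
        }
        return batches;
    }
}
```

Each batch can then be turned into one multi-item GraphQL query, followed by one PATCH request per document.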

Security Considerations

Always make sure your Azure Function and API keys are well-secured:

  1. Store API keys in Azure Key Vault.

  2. Use managed identities for Azure resources.

  3. Implement proper authentication for your webhook endpoints.
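
For example, instead of hard-coding the sc_apikey and the ingestion key as in the snippets above, the function can read them from application settings, which Azure Functions exposes as environment variables. When an application setting is a Key Vault reference, the platform resolves it before your code runs. The setting names used below are assumptions, and FunctionSettings is a hypothetical helper.

```csharp
using System;

// Hypothetical helper: reads required settings and fails fast when one is missing.
public static class FunctionSettings
{
    public static string GetRequired(string name)
    {
        string value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException($"Application setting '{name}' is not configured.");
        }
        return value;
    }
}

// Usage (setting names are examples, not taken from the original code):
//   string edgeApiKey = FunctionSettings.GetRequired("ExperienceEdgeApiKey");
//   string ingestionKey = FunctionSettings.GetRequired("SitecoreSearchIngestionKey");
```

Failing fast on a missing setting surfaces configuration mistakes on the first request rather than as an authorization error from the downstream API.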

🔚Conclusion

By using Azure Functions, Experience Edge webhooks, Experience Edge Delivery API, and Sitecore Search Ingestion API, you can automate content updates almost in real-time. This method cuts down on manual work, boosts accuracy, and improves user experience.

Thank you to the Sitecore community and the Sitecore Documentation Team for providing invaluable guidance and resources throughout this process. 🙏

Next Steps:

  • Test your Azure Function locally before deploying.

  • Monitor logs to ensure seamless data flow.

  • Explore advanced GraphQL queries for complex content structures.

🙂
Are you ready to dive into this implementation? I'd love to hear your thoughts! 💬 Feel free to share your feedback or any questions you might have in the comments below. And if you enjoy this content, consider subscribing 📰 for more updates and insights. Your engagement means the world to me and helps me continue providing valuable resources! 🌟
