In my previous post, I compared SitecoreAI Pathway with the old XM to XM Cloud Migration Tool at a high level. In this post, I am going deeper — walking through the Sitecore Website migration path step by step, with screenshots from my actual testing.
This is the path most Sitecore developers and administrators will follow when migrating from an existing Sitecore XM or XP instance to SitecoreAI (XM Cloud). It involves extracting source data using the XMComponentExtraction tool, exporting your target site structure, and letting Pathway's AI handle the content audit, mapping, and migration.
I am sharing this as someone still exploring the tool — these are my hands-on findings, not expert advice. If you spot something I have missed or interpreted differently, I would love to hear from you.
Prerequisites Before You Begin
Before starting the migration in Pathway, make sure these are in place:
- Target SitecoreAI site structure — Your target environment should already have templates, renderings, component templates, and page designs configured. Pathway maps content to existing structures; it does not create new ones.
- Media assets migrated — All in-scope media should be moved from source to target first. You can use the XM to XM Cloud Migration Tool for this, as Pathway does not handle media library migration for the Sitecore website path.
- .NET 9.0 Runtime — Required on the machine where you will run the XMComponentExtraction console app.
- PowerShell 7+ — Required for running the export-structure.ps1 script.
- CM access — You need access to the source Sitecore CM instance for deploying the extraction handler and running the tool.
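The two tooling prerequisites above can be checked before you start. This is a minimal sketch, assuming `dotnet` and `pwsh` are on the PATH of the machine where you will run the tools; it relies only on the standard `dotnet --list-runtimes` and `pwsh --version` commands, and the helper names are my own.

```python
import re
import shutil
import subprocess
from typing import List, Optional


def has_dotnet_runtime(list_runtimes_output: str, major: int = 9) -> bool:
    """True if `dotnet --list-runtimes` output lists Microsoft.NETCore.App <major>.x."""
    pattern = re.compile(rf"^Microsoft\.NETCore\.App {major}\.\d+\.\d+", re.MULTILINE)
    return bool(pattern.search(list_runtimes_output))


def pwsh_major_version(version_output: str) -> Optional[int]:
    """Parse the major version from `pwsh --version` output such as 'PowerShell 7.4.6'."""
    match = re.search(r"PowerShell (\d+)\.", version_output)
    return int(match.group(1)) if match else None


def run(cmd: List[str]) -> str:
    """Run a command and return its stdout, or '' if the executable is not found."""
    if shutil.which(cmd[0]) is None:
        return ""
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout


if __name__ == "__main__":
    dotnet_ok = has_dotnet_runtime(run(["dotnet", "--list-runtimes"]))
    pwsh_major = pwsh_major_version(run(["pwsh", "--version"]))
    print(".NET 9.0 runtime found:", dotnet_ok)
    print("PowerShell 7+ found:", pwsh_major is not None and pwsh_major >= 7)
```

The parsing is kept separate from the command execution so the checks can be exercised against captured output from another machine.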
Step 1 — Select CMS (Sitecore Website)
After installing Pathway from the Sitecore Cloud Portal Marketplace, click on it from the Apps section on your Home page. You will land on the migration creation screen.
Select "Sitecore website" as your source. The description says "Extract content from a Sitecore XM website." Give your migration a name and description, then click Next.
Step 1 — Selecting "Sitecore website" as the source and entering migration details
Notice there is also an "Any website" option here — that is the web crawling path I will cover in the next blog post.
Step 2 — Configure SitecoreAI Instance
Step 2 is the most involved part of the setup. There are three things happening on this screen:
1. Configure target environment: On the left, select your target SitecoreAI environment and site. The site path auto-populates based on your selection and is locked once you initialize the migration.
2. Upload target SitecoreAI structure: In the middle section, upload your xmc-structure.json file. This tells Pathway which templates, renderings, and page designs exist in your target environment, so it knows what it can map to.
3. Generate storage for source data: Below that, Pathway generates an Azure Blob Storage SAS URL. This is where your extracted source data will be uploaded. Copy this URL; you will need to pass it as --uploadUrl when running the XMComponentExtraction tool.
Step 2 — Configuration screen with target environment, structure upload, and Azure Blob Storage URL
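Since the upload URL is copied now and used later, it can be worth checking that it has not expired before starting an extraction. The sketch below assumes the URL Pathway generates is a standard Azure SAS URL that carries its expiry in the `se` query parameter; I have not confirmed that every Pathway URL includes it, so a missing `se` is treated as "nothing to check". The example URL in the usage note is made up.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def sas_expiry(upload_url: str) -> Optional[datetime]:
    """Return the expiry time from the SAS `se` parameter, or None if it is absent."""
    values = parse_qs(urlsplit(upload_url).query).get("se")
    if not values:
        return None
    # `se` is an ISO 8601 UTC timestamp, for example 2025-01-31T12:00:00Z
    expiry = datetime.fromisoformat(values[0].replace("Z", "+00:00"))
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry


def is_expired(upload_url: str, now: Optional[datetime] = None) -> bool:
    """True if the URL has an expiry time and that time has passed."""
    expiry = sas_expiry(upload_url)
    if expiry is None:
        return False  # no expiry parameter found, so there is nothing to compare
    return (now or datetime.now(timezone.utc)) >= expiry
```

For example, `is_expired("https://example.blob.core.windows.net/c?se=2025-01-31T12%3A00%3A00Z&sig=abc")` compares that timestamp with the current UTC time.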
Click "Migration Initialized" to start. You will see a confirmation message "Migration initialized successfully." The Migration Summary panel on the right shows all the details.
When you click "Read instruction here" next to the SitecoreAI Structure section, a popup appears with instructions to download the CMSExportStructure package.
Download CMSExportStructure package — this generates the xmc-structure.json file
Preparing the Target Structure (xmc-structure.json)
This is a step that happens outside of Pathway. You need to generate the xmc-structure.json file that describes your target site's structure.
The approach I followed:
- Create a Sitecore package from the target SitecoreAI environment containing the relevant items — Components, Pages, Renderings, and Presentation.
- Download and extract the package zip file.
- In the extracted folders, you will see GUID-named subfolders. This is how a Sitecore package lays out its items on disk: each folder represents an item.
- Place these in the corresponding folders expected by the export script.
- Run the PowerShell script:
pwsh -NoProfile -ExecutionPolicy Bypass -File .\export-structure.ps1
Extracted folder structure — Components, Pages, Presentation, Renderings with GUID-named item folders
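Before running the script, a quick count of the item folders can catch an empty or misplaced folder. This is a rough sketch, assuming the four folder names from the screenshot (Components, Pages, Presentation, Renderings) sit directly under one root directory; the exact location the export script expects is not something I have verified, so pass whichever root you used.

```python
import re
import sys
from pathlib import Path
from typing import Dict, Optional

# Folder names taken from the extracted structure described above.
EXPECTED_FOLDERS = ("Components", "Pages", "Presentation", "Renderings")
GUID_RE = re.compile(r"^\{?[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}?$")


def count_item_folders(root: Path) -> Dict[str, Optional[int]]:
    """Count GUID-named subfolders under each expected folder (None if the folder is missing)."""
    counts: Dict[str, Optional[int]] = {}
    for name in EXPECTED_FOLDERS:
        folder = root / name
        if not folder.is_dir():
            counts[name] = None
            continue
        counts[name] = sum(
            1 for p in folder.rglob("*") if p.is_dir() and GUID_RE.match(p.name)
        )
    return counts


if __name__ == "__main__":
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for name, count in count_item_folders(root).items():
        print(f"{name}: {'missing' if count is None else count}")
```

A folder reported as missing or with a count of 0 is worth fixing before generating the JSON.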
The script reads these serialized items and outputs the xmc-structure.json file. Here is what the JSON structure looks like:
xmc-structure.json — contains Pages, Renderings, and PageDesignMappings from the target site
The JSON contains four main sections: Pages (page templates with their IDs and fields), Renderings (components like "Related Blog Articles" and "Reviews" with their template types), PageDesignMappings (linking page templates to page designs), and PageDesigns.
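A small sanity check on the generated file can flag a missing or empty section before you upload it. The sketch assumes the four sections named above appear as top-level keys with exactly those names; adjust `EXPECTED_SECTIONS` if your file is shaped differently.

```python
import json
import sys
from typing import Dict, List, Optional

# Assumed top-level key names, based on the four sections described above.
EXPECTED_SECTIONS = ("Pages", "Renderings", "PageDesignMappings", "PageDesigns")


def summarize_structure(data: dict) -> Dict[str, Optional[int]]:
    """Return the number of entries per expected section, or None if the section is absent."""
    summary: Dict[str, Optional[int]] = {}
    for name in EXPECTED_SECTIONS:
        section = data.get(name)
        summary[name] = len(section) if isinstance(section, (list, dict)) else None
    return summary


def missing_or_empty(data: dict) -> List[str]:
    """List the expected sections that are absent or contain no entries."""
    return [name for name, size in summarize_structure(data).items() if not size]


if __name__ == "__main__" and len(sys.argv) > 1:
    # utf-8-sig reads the file correctly with or without a byte order mark
    with open(sys.argv[1], encoding="utf-8-sig") as handle:
        structure = json.load(handle)
    print(summarize_structure(structure))
    print("Needs attention:", missing_or_empty(structure) or "nothing")
```

Run it with the path to xmc-structure.json as the only argument.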
Alternatively, if you are using Sitecore Content Serialization (SCS), you can serialize items directly instead of creating a Sitecore package.
Once the JSON is ready, upload it back in the Pathway configuration screen. You will see a confirmation: "SitecoreAI structure file uploaded successfully."
Structure uploaded successfully — note the confirmation message and the Migration Summary showing "Uploaded" status
Extracting Source Data — XMComponentExtraction
Now for the source side. The XMComponentExtraction tool extracts page data as JSON from your Sitecore XM/XP instance. It supports On-Prem, PaaS, and Container environments.
The tool has two parts:
- ExtractorHandler.ashx — A handler that exposes endpoints using the Sitecore Item API on the CM server.
- Extractor.App.Con.exe — A .NET 9.0 console application that calls the handler and uploads extracted data to Azure Blob Storage.
Important prerequisite: Before running the extraction, update the Sitecore.Services.SecurityPolicy setting in .\App_Config\Sitecore\Services.Client\Sitecore.Services.Client.config to ServicesOnPolicy. This grants access to the Entity and Item Services that the handler needs. Remember to revert to ServicesLocalOnlyPolicy once extraction is complete, because ServicesOnPolicy leaves those services open to remote requests.
To install the handler, copy ExtractorHandler.ashx to your CM's wwwroot/sitecore/admin directory. Verify it works by navigating to: /sitecore/admin/ExtractorHandler.ashx?action=getallsitenames
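The same verification can be scripted. This is a sketch only: it assumes the CM is served over HTTPS with a certificate your machine trusts, and the response format is not documented here, so it simply returns the status code and raw body. If the handler requires an authenticated Sitecore session, an anonymous request like this one may return a login page or an error, in which case the browser check remains the reliable one.

```python
from typing import Tuple
from urllib.parse import urlencode
from urllib.request import urlopen

HANDLER_PATH = "/sitecore/admin/ExtractorHandler.ashx"


def handler_url(cm_host: str, action: str = "getallsitenames") -> str:
    """Build the handler URL for a CM host name (HTTPS is an assumption)."""
    return f"https://{cm_host}{HANDLER_PATH}?{urlencode({'action': action})}"


def check_handler(cm_host: str, timeout: float = 15.0) -> Tuple[int, str]:
    """Request the handler and return (HTTP status, response body as text)."""
    with urlopen(handler_url(cm_host), timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body
```

Calling `check_handler("sc1041cm.dev.local")` would hit the same URL as the manual check, using the host name from the command in the next step.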
Then run the console app with the required parameters:
.\Extractor.App.Con.exe --cmHostName=sc1041cm.dev.local --userName=admin --password=b --uploadUrl="<Azure Blob SAS URL from Pathway>"
The tool prompts you to select the site name and enter the language code. Then it processes the items and uploads them to Azure Blob Storage.
XMComponentExtraction console output — login, site selection (website), language (en), processing, and "Extractor App Ended"
Once this completes, the source page data is in Azure Blob Storage. From this point, Pathway handles everything within the SitecoreAI environment — you do not need to keep your source system connection active.
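If you run the extraction more than once, a thin wrapper keeps the password and SAS URL out of your shell history. The sketch below uses only the four parameters from the command above; the environment variable names are hypothetical. Note that the password is still passed to the console app as an argument, so it stays visible in the process list while the app runs.

```python
import os
import subprocess
from typing import List


def build_extractor_args(exe: str, cm_host: str, user: str, password: str, upload_url: str) -> List[str]:
    """Build the argument list; passing a list avoids shell quoting issues with '&' in the SAS URL."""
    return [
        exe,
        f"--cmHostName={cm_host}",
        f"--userName={user}",
        f"--password={password}",
        f"--uploadUrl={upload_url}",
    ]


def redact(args: List[str]) -> List[str]:
    """Return a copy of the arguments that is safe to print or log."""
    secret = ("--password=", "--uploadUrl=")
    return [a.split("=", 1)[0] + "=***" if a.startswith(secret) else a for a in args]


if __name__ == "__main__":
    # Hypothetical environment variable names; set them before running.
    required = ("SC_CM_HOST", "SC_USER", "SC_PASSWORD", "PATHWAY_UPLOAD_URL")
    missing = [name for name in required if name not in os.environ]
    if missing:
        print("Set these environment variables first:", ", ".join(missing))
    else:
        args = build_extractor_args(r".\Extractor.App.Con.exe", *(os.environ[n] for n in required))
        print("Running:", " ".join(redact(args)))
        subprocess.run(args, check=True)  # site name and language prompts still appear
```

The child process inherits the terminal, so the site and language prompts work as they do when the app is started by hand.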
Step 3 — Content Audit
Back in Pathway, Step 3 is where the AI takes over. There are two actions here: Grouping and Template Mapping.
Grouping — Click the Grouping button and the AI analyzes all extracted pages, grouping them by source template. In my test with a simple demo site, it found 1 group ("Sitecore Experience Hub") with 1 page (Home). For real-world sites with hundreds of pages, this is the step where Pathway's value becomes clear — instead of mapping every page individually, the AI identifies common patterns and groups similar pages together.
Template Mapping — Next, click Template Mapping. The AI matches each group to the most appropriate target template. You can click "View template match details" to see the AI's reasoning — why it chose a particular mapping based on page structure and content. In my case, the "Sitecore Experience Hub" group was mapped to the "Page" target template.
Content Audit — Grouping completed (1 group, 1 page), Template Mapping completed, showing "Sitecore Experience Hub" group
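To make the grouping idea concrete, here is a toy illustration of what "grouping by source template" produces. It is not how Pathway's AI works internally, which I have no visibility into; the record shape (`path`, `template`) is made up for the example.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_template(pages: List[dict]) -> Dict[str, List[str]]:
    """Group page paths under the name of the template each page uses."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for page in pages:
        groups[page["template"]].append(page["path"])
    return dict(groups)


# Hypothetical records shaped like my one-page demo site
pages = [{"path": "/sitecore/content/Home", "template": "Sitecore Experience Hub"}]
print(group_by_template(pages))
```

With hundreds of pages, the number of groups stays close to the number of distinct templates, which is why mapping per group scales better than mapping per page.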
Step 4 — Map Content
In Step 4, the AI maps source components to target SitecoreAI components. You can review the suggested mappings and adjust them manually if needed.
Clicking "View page details" shows you the specifics — the page path, page ID, template name, and template ID. This helps you verify that the mapping makes sense for your content model.
Map Content — Component Mapping completed, with Page details showing the Home page mapped to "Page" template
Step 5 — Migration
The final step — click the Migration button and watch the real-time dashboard.
Here is where I have to be honest about my experience. In my test with a simple demo site (1 page), the migration result was 0 succeeded, 1 failed. The Home page at /sitecore/content/Home failed to migrate.
Migration result — 0 succeeded, 1 failed. The Home page did not migrate successfully.
I want to be transparent about this because it reflects the reality of working with a tool that is still in beta. The failure could have several causes: how the demo site was set up, a mismatch in the component mapping, or a limitation in the current version. Because this was a simple single-page test site rather than a real-world scenario, the failure may well be specific to my setup.
In contrast, my "Any Website" migration (using the web crawling path with my blog nehemiahj.com) migrated all 50 pages: 50 succeeded, 0 failed. I will cover that walkthrough in my next post.
What I Learned
A few takeaways from this walkthrough:
The prerequisite setup takes time. Between deploying the XMComponentExtraction handler, changing SecurityPolicy settings, preparing the target structure JSON, and managing the Azure Blob URL, there are several moving parts. Plan for this upfront rather than expecting it to be quick.
The CMSExportStructure step needs attention. Getting the right items serialized and placed in the correct folder structure for the PowerShell script is important. If your JSON is incomplete or missing renderings, the AI will not have enough information to map correctly.
AI reasoning is available but not customizable. You can see why the AI made certain mapping decisions through the "View template match details" option. But you cannot guide or instruct the AI — for example, telling it "these are product pages, not blog posts." This is something I hope Sitecore adds in a future update.
Failures happen and that is okay. The tool is in beta. Not every migration will succeed on the first run, and understanding why it failed is part of the learning process. Post-migration review and refinement should be expected.
The Security Policy change is easy to forget. Reverting ServicesOnPolicy back to ServicesLocalOnlyPolicy after extraction is a security step that should not be skipped. Consider adding it to your migration checklist.
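One way to make that checklist item hard to skip is a small script that reads the config file and reports the current policy. The sketch assumes the setting is a `<setting name="Sitecore.Services.SecurityPolicy" value="...">` element in the file named earlier. It only reads that one file, so if a patch file overrides the setting, the effective value may differ from what it reports.

```python
import sys
import xml.etree.ElementTree as ET
from typing import Optional

SETTING_NAME = "Sitecore.Services.SecurityPolicy"


def security_policy(root: ET.Element) -> Optional[str]:
    """Return the value of the security policy setting, or None if it is not in this file."""
    for setting in root.iter("setting"):
        if setting.get("name") == SETTING_NAME:
            return setting.get("value")
    return None


def is_reverted(root: ET.Element) -> bool:
    """True if the setting points at ServicesLocalOnlyPolicy."""
    return "ServicesLocalOnlyPolicy" in (security_policy(root) or "")


if __name__ == "__main__" and len(sys.argv) > 1:
    config_root = ET.parse(sys.argv[1]).getroot()
    print("Current policy:", security_policy(config_root))
    print("Reverted to ServicesLocalOnlyPolicy:", is_reverted(config_root))
```

Run it with the path to Sitecore.Services.Client.config once extraction is finished.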
Up Next
In the next post, I will walk through the "Any Website" migration path — using Pathway's built-in web crawler to migrate content from my blog (nehemiahj.com). That test had a much better outcome (50/50 success), and the web crawling approach is simpler to set up since it does not require XMComponentExtraction or Azure Blob Storage.