Sitecore Catalog Export for Azure Recommendations API

The Azure Recommendations API requires a product catalog snapshot and a transaction history to train a model. This post shows how to export a Sitecore Commerce reference storefront catalog using PowerShell Extensions.

Bare Minimum

Let’s start small. At a minimum, the Recommendations API needs a SKU #, a product name, and a category name for every item:

AAA04294,Office Language Pack Online DwnLd,Office
AAA04303,Minecraft Download Game,Games
C9F00168,Kiruna Flip Cover,Accessories

The following script will give us the data we need:

$catalog = '/sitecore/Commerce/Catalog Management/Catalogs/Adventure Works Catalog'
$product = '{225F8638-2611-4841-9B89-19A5440A1DA1}' # Commerce Product Template

$products = Get-ChildItem -Path $catalog -Recurse `
    | Where { $_.Template.InnerItem['__Base template'] -like $product }

$products | Select 'Name', `
                   '__Display Name', `
                   @{Name = 'Category'; Expression = {$_.Parent.Name}} `
          | Sort 'Name' -Unique

The result looks like this:

Name        __Display name                   Category
----        --------------                   --------
22565422120 Gift Card                        Departments
AW007-08    Black Diamond Quicksilver II     Carabiners
AW009-08    Black Diamond Quicksilver II     SaleItems
AW013-08    Petzl Spirit                     Adventure Works Catalog
...

Adding Features

Features need to be exported in a special format. Different products in a given catalog may have different features, and even a different number of them. Azure solves this by requiring features as a comma-separated list of name=value pairs:

AAA04294,Office Language Pack Online DwnLd,Office,, softwaretype=productivity
BAB04303,Minecraft DwnLd,Games,, softwaretype=gaming, compatibility=iOS, agegroup=all
C9F00168,Kiruna Flip Cover,Accessories,, compatibility=lumia, hardwaretype=mobile
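
To make the shape of a line more concrete, here is a minimal sketch that assembles one from its parts. Format-CatalogLine and its parameters are hypothetical names of my own, and I am assuming that the empty fourth field in the sample above is the description slot. The sketch also skips the space after each comma that the sample shows.

```powershell
# Hypothetical helper (not part of the export script): builds one catalog line
# in the "id,name,category,description,name=value,..." shape shown above.
function Format-CatalogLine {
    param(
        [string]$Sku,
        [string]$Name,
        [string]$Category,
        [string]$Description = '',
        [System.Collections.IDictionary]$Features = @{}
    )

    # Fixed fields first; the description may be empty, as in the sample.
    $fields = @($Sku, $Name, $Category, $Description)

    # Then one "name=value" entry per feature.
    foreach ($key in $Features.Keys) {
        $fields += "$key=$($Features[$key])"
    }

    return $fields -join ','
}

# [ordered] keeps the features in the order they were declared.
$features = [ordered]@{ compatibility = 'lumia'; hardwaretype = 'mobile' }
$line = Format-CatalogLine -Sku 'C9F00168' -Name 'Kiruna Flip Cover' `
                           -Category 'Accessories' -Features $features
$line
# C9F00168,Kiruna Flip Cover,Accessories,,compatibility=lumia,hardwaretype=mobile
```

A product with no features simply ends after the description field, which is why the number of fields varies from line to line.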

The following addition to the script will add features to the list:

function ExtractFeatures($product)
{
    $fields = $product.Template.OwnFields `
        | Where { $_.Name -notlike 'images' -and $_.Name -notlike '*date'} `
        | Where { $product[$_.Name] -ne '' }

    $features = @()
    foreach($field in $fields)
    {
        $features += @{Name = $field.Name; Value = $product[$field.Name]}
    }

    return $features
}

function ApplyFeatures($product)
{
    foreach($feature in $product.Features)
    {
        $product | Add-Member -Name $feature.Name `
                              -MemberType NoteProperty `
                              -Value "$($feature.Name)=$($feature.Value)"
    }

    $product.PSObject.Properties.Remove('Features')

    return $product
}

# ... (see above)

$products | Select 'Name', `
                   '__Display Name', `
                   @{Name = 'Category'; Expression = {$_.Parent.Name}}, `
                   'Description', `
                   @{Name = 'Features'; Expression = {ExtractFeatures($_)}} `
          | Sort 'Name' -Unique `
          | %{ ApplyFeatures $_ }

PSObject is a dynamic type that you can modify on the fly. First, I extracted a collection of features into a new Features property. Then I turned each feature into its own property on the product object and removed the temporary Features property. CSV export will be able to pick it up transparently. I hope.
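
If you want to see the dynamic part in isolation, here is a small sketch that runs without Sitecore. The product object and its two features are made-up stand-ins for a row coming out of Select; only Add-Member and PSObject.Properties.Remove are taken from the script above.

```powershell
# Stand-in for a product row; real rows would come from the Select above.
$product = [PSCustomObject]@{
    Name     = 'AW029-03'
    Features = @(
        @{ Name = 'BasePrice'; Value = '35.0000' },
        @{ Name = 'Brand';     Value = 'Litware' }
    )
}

# Promote each feature to its own NoteProperty holding "name=value".
foreach ($feature in $product.Features) {
    $product | Add-Member -MemberType NoteProperty `
                          -Name $feature.Name `
                          -Value "$($feature.Name)=$($feature.Value)"
}

# Drop the temporary collection so it does not end up in the export.
$product.PSObject.Properties.Remove('Features')

# List the property names that are left on the object.
$names = $product.PSObject.Properties.Name -join ', '
$names
# Name, BasePrice, Brand
```

Each product ends up with its own set of properties, which is exactly what makes the CSV step interesting.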

CSV

It should now be easy to export the list as CSV. There’s a caveat though.

Both ConvertTo-CSV and Export-CSV will happily export the list for you, but they take the column set from the first record and apply it to every record after it.
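
The caveat is easy to reproduce outside of Sitecore. In this sketch the two rows are made-up sample objects; the second one carries an extra BasePrice property that the first one lacks.

```powershell
# Two sample rows with different property sets (made-up data).
$rows = @(
    [PSCustomObject]@{ Name = 'AW009-08'; Category = 'SaleItems' },
    [PSCustomObject]@{ Name = 'AW007-08'; Category = 'Carabiners'; BasePrice = 'BasePrice=10.0000' }
)

# Converting the whole collection at once: the columns come from the first
# object, so the second row's BasePrice never reaches the output.
$csv = $rows | ConvertTo-Csv -NoTypeInformation
$csv
# "Name","Category"
# "AW009-08","SaleItems"
# "AW007-08","Carabiners"
```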

You won’t see the features in the list. Here’s a trick to make every product in the export keep its own features:

# ... (see above)

$products | Select 'Name', `
                   '__Display Name', `
                   @{Name = 'Category'; Expression = {$_.Parent.Name}}, `
                   'Description', `
                   @{Name = 'Features'; Expression = {ExtractFeatures($_)}} `
          | Sort 'Name' -Unique `
          | %{ ApplyFeatures $_ } `
          | %{ ConvertTo-CSV -InputObject $_ -NoTypeInformation | Select -Skip 1 }

Instead of piping the entire set to ConvertTo-CSV, I converted the products one at a time with ForEach-Object, so each record keeps its own properties. I also dropped the type info and the CSV header row, since Azure doesn’t need labels anyway. Works like a charm!

"AW007-08","Black Diamond Quicksilver II","Carabiners","Straight","BasePrice=10.0000"
"AW009-08","Black Diamond Quicksilver II","SaleItems","Straight"
"AW013-08","Petzl Spirit","Adventure Works Catalog","Straight"
"AW014-08","Petzl Spirit","Carabiners","Straight","BasePrice=14.0000"
"AW029-03","Women's woven tee","Shirts","Short-sleeve, breathable henley, 100% cotton knit","BasePrice=35.0000","Brand=Litware"
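
Here is the same trick as a self-contained sketch that you can run in any PowerShell console. The two rows are made-up sample objects with different property sets, standing in for the products.

```powershell
# Two sample rows with different property sets (made-up data).
$rows = @(
    [PSCustomObject]@{ Name = 'AW009-08'; Category = 'SaleItems' },
    [PSCustomObject]@{ Name = 'AW007-08'; Category = 'Carabiners'; BasePrice = 'BasePrice=10.0000' }
)

# Convert each object on its own and skip its header row, so every line
# carries exactly the properties of the object it came from.
$lines = $rows | ForEach-Object {
    ConvertTo-Csv -InputObject $_ -NoTypeInformation | Select-Object -Skip 1
}
$lines
# "AW009-08","SaleItems"
# "AW007-08","Carabiners","BasePrice=10.0000"
```

The price of this approach is that the output is no longer a regular table, which is fine here because the catalog format expects a variable number of fields per line.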

Commas and Quotes

There’s one more thing I needed to do before the Azure Recommendations API would accept the catalog. As you can tell, the catalog format is not exactly CSV: every line can have a different number of fields, and the Azure backend doesn’t use a CSV parser to read it.

The double quotes in the export above were taken literally. Azure would think that the SKU # is "AW007-08", quotes included, for example. The commas in the descriptions were messing up the parsing as well. My next post will be about the Recommendations API itself and I will write more about it there, but here’s the final version that produces a clean catalog export ready to go:

function ExtractFeatures($product)
{
    # ... (see above)
}

function ApplyFeatures($product)
{
    # ... (see above)
}

function CleanUpCommas($product)
{
    foreach ($prop in $product.PSObject.Properties)
    {
        $src = $product.PSObject.Members[$prop.Name].Value
        $product.PSObject.Members[$prop.Name].Value = $src -replace ",", ";"
    }

    return $product
}

function CleanUpQuotes($line)
{
    return $line -replace """", ""
}

# ... (see above)

$products | Select 'Name', `
                   '__Display Name', `
                   @{Name = 'Category'; Expression = {$_.Parent.Name}}, `
                   'Description', `
                   @{Name = 'Features'; Expression = {ExtractFeatures($_)}} `
          | Sort 'Name' -Unique `
          | %{ ApplyFeatures $_ } `
          | %{ CleanUpCommas $_ } `
          | %{ ConvertTo-CSV -InputObject $_ -NoTypeInformation | Select -Skip 1 } `
          | %{ CleanUpQuotes $_ }
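
To check the two clean-up steps without a Sitecore instance, here is a minimal sketch on a single made-up product object. It sets the property values directly through $prop.Value, which is a slightly shorter variation on CleanUpCommas rather than the exact code above.

```powershell
# Made-up product row with a comma-laden description.
$product = [PSCustomObject]@{
    Name        = 'AW029-03'
    Description = 'Short-sleeve, breathable henley, 100% cotton knit'
    Brand       = 'Brand=Litware'
}

# Replace commas inside values so they are not read as field separators.
foreach ($prop in $product.PSObject.Properties) {
    $prop.Value = $prop.Value -replace ',', ';'
}

# Convert the single object, drop the header row, then strip the quotes.
$line = ConvertTo-Csv -InputObject $product -NoTypeInformation |
    Select-Object -Skip 1 |
    ForEach-Object { $_ -replace '"', '' }

$line
# AW029-03,Short-sleeve; breathable henley; 100% cotton knit,Brand=Litware
```

Keep in mind that stripping every double quote also removes quotes that were part of a field value, so this only suits catalogs where that loss is acceptable.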

Got to love PowerShell. Cheers!