
Search Revenue Analysis: YSL

Visualizing which catalog attributes contribute the most to search-driven revenue.

The Prioritization Engine: How It Works

This report is powered by a Python system that bridges the gap between customer search queries and your product catalog. It prioritizes attributes not by how *often* they are used, but by how much *revenue* they generate.

Here is the step-by-step logic:

1

Filter Catalog Attributes

The system first parses the entire Product Catalog XML. It applies a crucial filter to ignore "noisy" attributes (e.g., booleans like is-new, online-flag) and focuses only on rich, descriptive content.
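As a minimal sketch of this filtering step, the snippet below keeps only attributes whose values look like descriptive text. The XML layout, the `is_descriptive` heuristic, and all names are assumptions for illustration; a real catalog export is larger, may use namespaces, and the production filter may use different rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified catalog snippet (not the real export schema).
CATALOG_XML = """
<catalog>
  <product product-id="P1">
    <attribute id="long-description">Hydrating night serum with niacinamide</attribute>
    <attribute id="is-new">true</attribute>
    <attribute id="online-flag">false</attribute>
    <attribute id="c-dimWeight">0.25</attribute>
  </product>
</catalog>
"""

BOOLEAN_VALUES = {"true", "false"}


def is_descriptive(value: str) -> bool:
    """Assumed heuristic: drop empty values, booleans, and purely numeric values."""
    text = value.strip().lower()
    if not text or text in BOOLEAN_VALUES:
        return False
    return any(ch.isalpha() for ch in text)


def descriptive_attributes(xml_text: str) -> dict[str, list[str]]:
    """Map attribute id -> descriptive values found in the catalog."""
    kept: dict[str, list[str]] = {}
    root = ET.fromstring(xml_text)
    for attr in root.iter("attribute"):
        value = attr.text or ""
        if is_descriptive(value):
            kept.setdefault(attr.get("id"), []).append(value.strip())
    return kept


print(descriptive_attributes(CATALOG_XML))
```

In this sample only long-description survives; the boolean flags and the numeric c-dimWeight value are dropped.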

2

Build Keyword Index

For all *beneficial* attributes (like long-description), the system reads the text. Using an NLP library (NLTK), it *stems* each word (e.g., "running", "runs" -> run) so that inflected forms of a word share one key. It then builds an index mapping each stemmed_keyword to the attributes that contain it.
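The index can be sketched as a dictionary from stemmed keyword to the set of attributes containing it. To keep the example dependency-free, a toy suffix stripper (`toy_stem`, a hypothetical helper) stands in for the NLTK stemmer; it is far cruder than a real stemmer and is only meant to show the data structure.

```python
import re
from collections import defaultdict


def toy_stem(word: str) -> str:
    """Crude stand-in for a real stemmer: strips "-ing" and plural "-s"."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        if len(word) > 2 and word[-1] == word[-2]:  # "runn" -> "run"
            word = word[:-1]
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word


def tokenize(text: str) -> list[str]:
    """Lowercase, split on non-letters, and stem each word."""
    return [toy_stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def build_index(attribute_texts: dict[str, str]) -> dict[str, set[str]]:
    """Map each stemmed keyword to the attribute ids whose text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for attribute_id, text in attribute_texts.items():
        for token in tokenize(text):
            index[token].add(attribute_id)
    return dict(index)


# Hypothetical attribute texts.
texts = {
    "long-description": "Lightweight shoes for running and daily runs",
    "short-description": "Running shoes",
}
index = build_index(texts)
print(sorted(index["run"]))
```

Because both "running" and "runs" reduce to the same key, a single lookup on `run` returns every attribute that mentions either form.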

3

Match Search Queries with Priority Classification

The analysis processes customer search queries with revenue data (from CSV), applying intelligent prioritization:

1. Brand Keyword Check: First checks if query matches brand-specific keywords from database (e.g., "genifique") → 🏷️ Highest Priority
2. Generic Keyword Check: Checks for category keywords (e.g., "serum", "moisturizer") → 🔸 High Priority
3. Token Matching: Stems query words (e.g., "wrinkle cream" → wrinkle, cream) and looks up in reverse index to find matching attributes
4. Value Assessment: Classifies catalog matches as High Volume (search volume above the 80th percentile) → 📊, High Performance ($1000+ revenue or 5%+ conversion), or Other Catalog → 📝
5. Filter Low Value: Queries below minimum thresholds (<10 searches AND <$50 revenue) are filtered out to focus on meaningful opportunities
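The volume threshold and the low-value filter from steps 4 and 5 can be sketched with the standard library. The sample rows, the constant names, the choice to compute the percentile after filtering, and the interpolation method are all assumptions; the report does not specify them.

```python
from statistics import quantiles

# Hypothetical rows: (query, searches, revenue in dollars).
rows = [
    ("genifique serum", 900, 4200.0),
    ("wrinkle cream", 500, 1800.0),
    ("night serum", 150, 640.0),
    ("gift set", 40, 75.0),
    ("travel pouch", 8, 120.0),
    ("xyz", 3, 0.0),
]

MIN_SEARCHES = 10
MIN_REVENUE = 50.0


def is_low_value(searches: int, revenue: float) -> bool:
    """A query is dropped only when it is below BOTH minimum thresholds."""
    return searches < MIN_SEARCHES and revenue < MIN_REVENUE


kept = [row for row in rows if not is_low_value(row[1], row[2])]

# quantiles(..., n=5) returns the 20th/40th/60th/80th percentile cut points;
# the last one is the 80th percentile of search volume.
volume_p80 = quantiles([searches for _, searches, _ in kept], n=5, method="inclusive")[-1]

high_volume = [query for query, searches, _ in kept if searches > volume_p80]
print(volume_p80, high_volume)
```

Note that "travel pouch" survives the filter despite having fewer than 10 searches, because its revenue is above $50.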

4

Attribute Revenue & Prioritize

When a query matches an attribute, the query's entire revenue is attributed to that attribute. The final report (which this page reads) sums the total revenue for all queries that matched each attribute. This is how long-description ends up with over $93k: it matched thousands of high-revenue queries.
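A minimal sketch of this attribution rule follows. The index contents and query data are made up, and counting a query once per matched attribute (rather than once per matching token) is an assumption about how duplicates are handled.

```python
from collections import defaultdict

# Hypothetical reverse index: stemmed keyword -> attributes containing it.
index = {
    "wrinkle": {"long-description", "c-benefits"},
    "cream": {"long-description"},
    "serum": {"short-description"},
}

# Hypothetical (already stemmed) query tokens with their revenue in dollars.
queries = [
    (["wrinkle", "cream"], 1200.0),
    (["serum"], 800.0),
    (["umbrella"], 50.0),  # matches no attribute
]


def attribute_revenue(index, queries):
    """Credit each query's full revenue to every attribute it matched."""
    totals = defaultdict(float)
    for tokens, revenue in queries:
        matched = set()
        for token in tokens:
            matched |= index.get(token, set())
        for attribute_id in matched:
            totals[attribute_id] += revenue  # full revenue, once per attribute
    return dict(totals)


totals = attribute_revenue(index, queries)
for attribute_id, revenue in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(attribute_id, revenue)
```

Under this rule the per-attribute totals can add up to more than total search revenue, since one query may credit several attributes.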

Screenshot of the analysis tool

The Logic: "Before" vs. "After"

Looking at attribute usage count alone is misleading. An attribute might be used on 10,000 products but drive zero revenue. This analysis cross-matches usage with search revenue to find what *truly* matters.

"Before" Analysis

Top 5 Attributes by Usage Count

"After" Analysis

Top 5 Attributes by Revenue Impact

Conclusion: We filter out "noisy" attributes (like c-dimWeight) to focus on revenue-driving attributes (like long-description).

Top 25 Search-Driving Attributes

Deep Dive: The "Why"

Why is 'long-description' #1?

The data shows that broad, text-heavy attributes like long-description and short-description are overwhelmingly the highest revenue drivers.

This is because they contain the highest density of high-intent keywords. Customers searching for specific products (e.g., "advent calendar") or ingredients ("niacinamide") are matched against these rich-text fields.

Top Matched Keywords

(for the #1 attribute)

P.S. Top Unmatched Queries

These are high-revenue queries that did not match any priority catalog attributes. This represents a content gap and an opportunity for future content optimization.

📊 How This Analysis Works

Understanding the methodology behind these insights

🎯 Overview

This analysis identifies which product catalog attributes are most valuable for improving search results and driving revenue. It combines four data sources to provide actionable insights:

  • Product Catalog XML - Contains product attributes and their values
  • Library XML - Contains dynamic content (HTML/text) referenced by catalog attributes
  • Search Queries CSV - Real customer search data with revenue, orders, and conversion metrics
  • Keywords Database JSON - Brand-specific and generic keywords for enhanced matching

❌ The Problem

Customers search for products, but many searches return no results because:

  • Product attributes don't contain the words customers are using
  • Important searchable content is missing or poorly structured
  • No clear understanding of which attributes drive the most revenue

✅ The Solution

By analyzing search queries customers actually use (with their revenue), we can:

  • Identify the most valuable attributes to enhance
  • Prioritize content improvements based on revenue impact
  • Discover gaps where customer language doesn't match product data
  • Optimize search indexing by focusing on high-performing attributes

💰 The Gain

📈
Revenue Recovery
Capture lost revenue from searches with no results
🎯
Content Prioritization
Focus resources on attributes that matter most
📊
Data-Driven Decisions
Use actual customer behavior, not assumptions
Measurable Impact
Track revenue, orders, and conversion improvements

🔑 Key Innovation: Dynamic Content Resolution

Many e-commerce catalogs use content assets to avoid duplication. Instead of storing "Reduces wrinkles, brightens skin" on 50 products, they store it once with a content-id like "retinol-benefits" and reference it.

Example Flow:

1. Catalog XML:
Product: "Night Serum"
c-benefits = "serum-benefits-001"
2. Library XML:
content-id: "serum-benefits-001"
body: "Reduces fine lines and wrinkles. Brightens skin."
3. Application Processing:
✓ Resolves ID → actual text
✓ Tokenizes: ["reduce", "fine", "line", "wrinkle", "brighten", "skin"]
✓ Indexes: wrinkle → c-benefits
4. Customer Query: "wrinkle cream"
✓ Matches token "wrinkle" → c-benefits
✓ MATCH! c-benefits is valuable for wrinkle searches

This means we analyze the real content customers search for, not just content IDs, providing accurate attribution of search value to catalog attributes.
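The resolution step in the example flow can be sketched as a lookup table built from the library file. The XML layout, the escaped-HTML body, and the helper names are assumptions for illustration, and the standard library stands in for lxml to keep the snippet self-contained.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical, simplified library snippet with HTML stored as escaped text.
LIBRARY_XML = """
<library>
  <content content-id="serum-benefits-001">
    <body>&lt;p&gt;Reduces fine lines and wrinkles.&lt;/p&gt; &lt;b&gt;Brightens&lt;/b&gt; skin.</body>
  </content>
</library>
"""


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(markup: str) -> str:
    """Remove tags and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def load_library(xml_text: str) -> dict[str, str]:
    """Map content-id -> cleaned body text."""
    root = ET.fromstring(xml_text)
    return {
        content.get("content-id"): strip_html(content.findtext("body", default=""))
        for content in root.iter("content")
    }


def resolve(value: str, library: dict[str, str]) -> str:
    """Return the library text for a known content ID, else the value itself."""
    return library.get(value, value)


library = load_library(LIBRARY_XML)
print(resolve("serum-benefits-001", library))
print(resolve("Plain attribute text", library))
```

Values that are not content IDs pass through unchanged, so the same function can be applied to every attribute value before tokenizing.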

🎯 5-Level Prioritization System

Not all search queries are equal. Each query is assigned to one of five priority levels, based on keyword matches, search volume, revenue, and conversion:

🏷️
Priority 1: Brand Keyword Matches
Queries containing brand-specific keywords show highest purchase intent
Example: "genifique serum" (contains "genifique")
🔸
Priority 2: Generic Keyword Matches
Queries containing category keywords represent product discovery
Example: "anti-aging serum" (contains "serum")
📊
Priority 3: High Volume Catalog
Matches catalog attributes with search volume above 80th percentile
Example: Query with 500+ searches
Priority 4: High Performance Catalog
Matches catalog with high revenue (≥$1000) OR high conversion (≥5%)
Example: Query with $2000 revenue
📝
Priority 5: Other Catalog
Matches catalog attributes and meets the minimum thresholds (10+ searches OR $50+ revenue)
Standard optimization opportunities
🗑️
Filtered Out: Low Value
Below minimum thresholds (<10 searches AND <$50 revenue) - not worth optimization effort
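As a rough sketch, the levels above can be expressed as one function that checks them in order. The `Query` fields, the keyword sets, the labels, and the decision to apply the low-value filter first are assumptions; the production rules may order the checks differently.

```python
from dataclasses import dataclass

# Hypothetical keyword samples; the real lists come from the keywords database.
BRAND_KEYWORDS = {"genifique"}
GENERIC_KEYWORDS = {"serum", "moisturizer"}


@dataclass
class Query:
    text: str
    searches: int
    revenue: float          # dollars
    conversion: float       # fraction, 0.05 == 5%
    matches_catalog: bool   # True when a token hit the reverse index


def classify(q: Query, volume_p80: float) -> str:
    """Return the priority label for a query, checking levels top-down."""
    words = set(q.text.lower().split())
    if q.searches < 10 and q.revenue < 50:
        return "filtered: low value"
    if words & BRAND_KEYWORDS:
        return "1: brand keyword"
    if words & GENERIC_KEYWORDS:
        return "2: generic keyword"
    if q.matches_catalog:
        if q.searches > volume_p80:
            return "3: high volume catalog"
        if q.revenue >= 1000 or q.conversion >= 0.05:
            return "4: high performance catalog"
        return "5: other catalog"
    return "unmatched"


P80 = 500  # assumed 80th-percentile search volume for this sample
print(classify(Query("genifique serum", 900, 4200.0, 0.04, True), P80))
print(classify(Query("advent calendar", 100, 2000.0, 0.01, True), P80))
```

Queries that reach the end without a keyword or catalog match are the ones that feed an unmatched-queries list.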

🔄 Analysis Workflow

1
Data Ingestion
Load Library XML, Catalog XML, Search CSV, Keywords DB
2
Content Resolution
Resolve content IDs to actual text, clean HTML, build reverse index
3
Query Matching
Match queries to attributes, classify priority, aggregate financials
4
Generate Insights
Sort by priority & metrics, create comprehensive reports, export to CSV/Excel/HTML

🛠️ Technical Stack

Data Processing
  • lxml (XML parsing)
  • pandas (DataFrames)
  • NLTK (NLP, stemming)
Analysis
  • Token-based matching
  • Reverse indexing (O(1) average lookup per token)
  • Financial aggregation
Output
  • Plotly (visualization)
  • CSV/Excel export
  • HTML dashboard

🤖 AI-Powered SFCC Search Enhancements

Beyond catalog analysis, we've implemented AI-driven improvements directly in Salesforce Commerce Cloud to optimize search performance.

🛍️

Armani: Stopwords Optimization

Multi-language stopword additions to improve search precision

🇩🇪
German (de)
41 words added
nur, sehr, oft, immer, nie, gern, vielleicht, weil, denn, obwohl, trotzdem, ohne, gegen, um, mir, dir, ihm, ihn, uns, euch, ihnen, war, waren, warst, wart, wurde, wurden, sei, seid, gewesen, habe, hast, hat, haben, habt, hätte, hättest, hätten, hättet
🇬🇧
English (en)
23 words added
just, get, got, gets, getting, many, much, never, always, often, see, saw, seen, say, says, said, go, goes, gone, went, also, ever, really
🇪🇸
Spanish (es)
44 words added
y, o, u, mas, ni, que, a, de, contra, durante, hacia, hasta, mediante, según, más, menos, no, sí, tampoco, ya, hoy, ayer, mañana, siempre, nunca, jamás, ahora, luego, después, me, te, se, le, les, os, mi, tu, mis, tus, esto, eso, aquello, del, al
🇫🇷
French (fr)
11 words added
jamais, or, voire, allez, veux, veut, voulez, voudrais, voudrait, autres, mal
🇮🇹
Italian (it)
70 words added
sui, sugli, sulle, dal, dai, dagli, dalle, nel, negli, nelle, col, coi, mi, ti, si, ci, vi, ne, gli, li, non, più, meno, come, dove, quando, perché, sempre, mai, poi, prima, dopo, troppo, così, già, è, era, eri, eravamo, eravate, erano, ero, se, né, pure, neppure, oppure, i, gli, stata, state, mio, tuo, suo, miei, tuoi, suoi, mia, tua, sua, mie, tue, sue
🇳🇱
Dutch (nl)
56 words added
de, u, jij, jou, jouw, uw, haar, hen, mijn, onze, jullie, ben, bent, zijn, waren, hebt, heeft, hebben, hadden, doe, doet, doen, deed, deden, mag, moet, wil, kun, kunt, kunnen, op, over, onder, naar, voor, achter, tegen, door, zonder, maar, want, dus, hoewel, tenzij, omdat, hier, daar, waar, niet, ja, nee, misschien, nooit, altijd, vaak, soms, erg, zeer, veel, weinig
🔗

Armani: Synonym Groups

AI-generated synonym mappings to capture search intent across languages

🛍️
Brand & Misspellings
Catching common brand variations and typos
giorgio armani, giorgio, armani, mr armani, armaniy, armanie, gorgio armani, georgio armani, ga
Product Attributes & Benefits
Multi-language benefit keywords
hydrating: hydrate, moisturising, moisturizing, idratante, hidratante, feuchtigkeitsspendend
anti-aging: anti aging, antiage, anti-rides, anti-età, antiedad, anti arrugas
glowing: glow, luminous, illuminateur, luminoso, illuminante, ilumina
long-lasting: long wear, longue tenue, lunga tenuta, larga duración, langer halt
🆕
Product Categories
Makeup and skincare sub-categories
primer: base, pre-base, base trucco
blush: blusher, fard à joues, colorete
highlighter: enlumineur, illuminante, iluminador
nail polish: vernis, smalto, esmalte de uñas
🧪
Key Ingredients
Popular skincare and fragrance ingredients
hyaluronic acid: acide hyaluronique, acido ialuronico
retinol: vitamin a, vitamine a, vitamina a
rose: rose extract, extrait de rose, acqua di rosa
jasmine: jasmin, gelsomino, jazmín
🎁
Gifting & Occasions
Seasonal and holiday search terms
valentine's day: saint valentin, san valentino, valentinstag
mother's day: fête des mères, festa della mamma, muttertag
birthday: anniversaire, compleanno, cumpleaños
father's day: fête des pères, festa del papà, vatertag
💄

YSL: Strategic Stopword & Synonym Management

Brand-aware optimization protecting iconic product names

⚠️ High-Risk Words Identified & Protected
English: "the"
Conflict: YSL product "The Bold" Lipstick
Impact: Search for "The Bold" would become "Bold", potentially returning unwanted results
Action: Removed "the" from stopword list
French: "le", "la", "l'"
Conflict: Iconic products "L'Homme", "La Nuit de L'Homme", "Le Teint"
Impact: "L'Homme" → "Homme", "Le Teint" → "Teint"
Action: Removed from stopword list, ensured product data indexed correctly
French: "or" (gold)
Conflict: Key YSL theme word (gold packaging, fragrances)
Action: Intentionally excluded from French stopword additions
✍️ Safe Stopword Additions
After risk analysis, added the same multi-language stopwords as Armani, with brand-specific exclusions noted above.
🇩🇪 German
41 words
🇬🇧 English
23 words
🇪🇸 Spanish
44 words
🇫🇷 French
10 words *
🇮🇹 Italian
70 words
🇳🇱 Dutch
56 words
* Excluded "or" (gold) due to YSL brand importance
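The risk check can be sketched as a set comparison between stopwords and the words that appear in protected product names. The product-name sample, the existing-stopword sample, and the helper names are assumptions; the French candidate list is the one shown for Armani.

```python
# French candidate additions, as listed in the Armani section.
FRENCH_CANDIDATES = [
    "jamais", "or", "voire", "allez", "veux", "veut",
    "voulez", "voudrais", "voudrait", "autres", "mal",
]

# Hypothetical sample of product names to protect.
PRODUCT_NAMES = ["The Bold", "L'Homme", "La Nuit de L'Homme", "Le Teint"]

# Brand theme words protected by hand ("or" means gold in French).
PROTECTED_THEME_WORDS = {"or"}


def name_tokens(names):
    """Lowercased words of the product names; "l'homme" yields "l'" and "homme"."""
    tokens = set()
    for name in names:
        tokens.update(name.lower().replace("'", "' ").split())
    return tokens


def safe_additions(candidates, protected):
    """Candidate stopwords that do not collide with a protected term."""
    return [word for word in candidates if word not in protected]


def risky_stopwords(stopwords, names):
    """Existing stopwords that also occur inside a protected product name."""
    return sorted(set(stopwords) & name_tokens(names))


protected = name_tokens(PRODUCT_NAMES) | PROTECTED_THEME_WORDS
french = safe_additions(FRENCH_CANDIDATES, protected)
print(len(french), french)

# Hypothetical sample of stopwords already configured.
print(risky_stopwords(["the", "a", "le", "la", "l'", "et"], PRODUCT_NAMES))
```

A check like this only flags collisions for review. Common words such as "de" also appear in product names, so deciding what to remove remains a manual judgment.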
🔗 Comprehensive x-default Synonym Rebuild
Complete consolidation: All language-specific lists merged into single x-default with extensive new groups added.
💄
YSL Franchises
black opium, libre, touche eclat, all hours, pure shots, loveshine, rouge volupte, tatouage couture, nu, y, la nuit, mon paris...
📦
Product Categories
lipstick (pintalabios, rouge à lèvres, rossetto, lippenstift), foundation, mascara, serum, moisturizer...
Attributes & Benefits
hydrating (feuchtigkeitsspendend, hidratante, hydratant), matte, radiant, brightening, plumping...
🌿
Key Ingredients
saffron (safran, azafrán, zafferano), hyaluronic acid, glycolic acid, vitamin c, ceramides...
🎁
Gifting & Occasions
mother's day (fête des mères, dia de la madre), valentine's day, christmas, birthday...
🎯 Strategic Advantage
Single x-default list enables cross-language search: German users can find products using French names, Spanish users can search Italian terms, creating a unified multilingual search experience.
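One way to picture the consolidation is a merge that unions any locale-specific groups sharing a term. This is a sketch under assumptions: the per-locale input structure and `merge_groups` are hypothetical, and the output is a plain list rather than the platform's synonym dictionary format.

```python
def merge_groups(groups_by_locale):
    """Union synonym groups across locales when they share at least one term."""
    merged = []
    for groups in groups_by_locale.values():
        for group in groups:
            current = {term.lower() for term in group}
            remaining = []
            for existing in merged:
                if existing & current:
                    current |= existing   # absorb the overlapping group
                else:
                    remaining.append(existing)
            remaining.append(current)
            merged = remaining
    # Sort for a stable, readable result.
    return sorted(sorted(group) for group in merged)


# Hypothetical per-locale lists, using terms from the groups above.
locale_groups = {
    "es": [["lipstick", "pintalabios"], ["hydrating", "hidratante"]],
    "fr": [["lipstick", "rouge à lèvres"], ["hydrating", "hydratant"]],
    "de": [["lipstick", "lippenstift"], ["hydrating", "feuchtigkeitsspendend"]],
}

for group in merge_groups(locale_groups):
    print(", ".join(group))
```

After merging, a search for "lippenstift" and a search for "pintalabios" resolve to the same group, which is the cross-language behavior described above.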

📈 Expected Impact

🎯
Better Intent Matching
Multi-language synonyms capture customer search intent across all locales
🔍
Reduced Noise
Smart stopword filtering improves result precision while protecting brand terms
💰
Revenue Recovery
Fewer "no results" searches mean more conversions and higher customer satisfaction