Advanced SEO Textbook

Creating An eCommerce Strategy

In this chapter we take a detailed look at the various components that make up a solid eCommerce strategy.

Topic Details
Time: 80
Difficulty Hard

In this chapter we’ll be diving into the most important areas of SEO that should be taken into account when putting together your eCommerce SEO strategy. We’ll look not only at why they’re important and how you should implement them, but also at how Google views and approaches the different facets.

We’ll be taking a look at the most significant aspects of on-page and content SEO for eCommerce sites, from utilising your blog to maximise visibility to considerations of user intent and internal links. We’ll also be reviewing how technical SEO plays into your eCommerce site’s strategy, assessing the importance of site architecture, schema markup, pagination and faceted navigation.

On Page & Content

Blog Content

An often overlooked and undervalued part of eCommerce SEO is blog optimisation and expansion.

Organic traffic from blog content is often given a lower priority as this kind of content doesn’t directly translate into sales.

However, there are many SEO benefits of having an extensive blog that will have a positive impact on your bottom line.

There are three main SEO reasons for building out a high quality and relevant blog.

Content Silos and Content Authority

Your blog can be used as a means of providing supporting content to help develop content silos around your eCommerce category pages. eCommerce categories are limited in the scope of what they can cover, in that they only target keywords that are directly relevant to the category and the products.

By answering related search queries based around your category pages via blog articles, you get more opportunities to create relevant, keyword rich internal links on your website.

These content silos, with their keyword rich anchor text, help send additional, relevant signals to Google, boosting your eCommerce page rankings for their target keywords.

Opportunities for Backlinks

It’s often the case that sites won’t want to – or won’t have a need to – link to an eCommerce page of a website. Blog articles are easier to get organic links for and can be created to act as an opportunity to secure more links.

If your blog content is set up as silos (which we covered in the Information Architecture module), it can funnel the link juice towards your other important pages.

Mid Funnel Traffic and Brand Awareness

The marketing strategy for your online business should reflect that of an offline business. Not everyone who comes through your door is going to buy the first time, but getting them to walk through your door in the first place is an important step.

Users will be searching for different terms at varied stages in their shopping journey, whether they’re simply browsing during the research stage or are at the decision stage of making a purchase. Ignoring blog content means ignoring users who might become your customers further down the line. Blog content gives users an opportunity to get to know your brand and see you as an authority within the niche, which consequently makes them more likely to purchase from you.

Sports supplements giant MYPROTEIN is a great example of how to make the most of a blog. Its blog is called The Zone and serves as an extension of the company’s SEO strategy.

We can see from the screenshot below that the blog section alone is ranking for almost 85k keywords.

Now let’s compare this to the overall number of keywords that the site is ranking for, which is 320k.

More than a quarter of all the keywords MYPROTEIN ranks for come from the blog (roughly 85k of 320k), giving the site a huge amount of organic traffic that drives brand awareness and mid funnel traffic.

In addition to this, their blog has acquired over one thousand referring domains, making up a fifth of their backlinks and contributing greatly to the site’s authority.

The Zone Referring Domains Growth

MYPROTEIN Referring Domains Growth

User Intent

User intent refers to the type of content that a search engine ranks for each keyword – the term could be seen as a bit of a misnomer as the type of page that an individual user is looking for might not reflect what Google thinks they’re looking for.

As we’ve seen, there are three broad categories when it comes to user intent: informational, navigational and transactional.

A typical eCommerce page will rank for keywords that have “transactional” intent, whereas a blog article will have “informational” intent. The third type of user intent is “navigational”, where a user is looking for a page that they have previously visited, or assumes would exist on the internet.

This plays a big role in an eCommerce SEO strategy. A key question that must be asked is: are your eCommerce pages targeting keywords that have transactional user intent?

If a category page is targeting a keyword that returns informational articles, then it’s very unlikely that you are going to rank for this keyword. Therefore, it’s absolutely vital that eCommerce categories and products are optimised for keywords that carry a transactional intent.

Using the sports supplement niche again as an example, the keyword “protein cookies” may sound like something users might be shopping for. However, a look at the SERPs shows us that Google is rewarding recipe pages in the top positions.

If you were planning on targeting this keyword with a category page, it would be unlikely to rank. It would be worth finding similar keywords, such as “protein bars”, that actually return eCommerce pages.

Finding the intent of a keyword is relatively simple – review the top ranking pages and see what type of page is dominant. Some keywords, however, do have “mixed intent” – where different types of pages rank.

For example, the SERPs for the keyword “creatine monohydrate” return both informational and eCommerce pages.
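The check described above (review the top ranking pages and see which page type is dominant) can be sketched in a few lines of Python. This is only an illustration: the page-type labels are assumed to have been assigned by hand after reviewing the SERP, and the `dominance` threshold of 0.7 is an arbitrary assumption rather than anything Google publishes.

```python
from collections import Counter


def classify_keyword_intent(page_types, dominance=0.7):
    """Label a keyword from the page types of its top-ranking results.

    page_types: one hand-assigned label per top-ranking URL,
                e.g. "ecommerce" or "informational".
    dominance:  assumed share a single page type needs before we call
                it the dominant intent; below this we report "mixed".
    """
    if not page_types:
        raise ValueError("need at least one labelled result")
    counts = Counter(page_types)
    top_type, top_count = counts.most_common(1)[0]
    share = top_count / len(page_types)
    return top_type if share >= dominance else "mixed"


# Hypothetical labels for two SERPs of ten results each.
protein_cookies = ["informational"] * 9 + ["ecommerce"]
creatine_monohydrate = ["informational"] * 5 + ["ecommerce"] * 5

print(classify_keyword_intent(protein_cookies))
print(classify_keyword_intent(creatine_monohydrate))
```

A keyword that comes back as "informational" is a candidate for a blog article, while one that comes back as "mixed" may support both a category page and supporting content.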

Another consideration for transactional intent keywords is that there is often a distinction between keywords that return product pages or category pages.

This often reflects where a searcher is within the sales cycle or funnel: some transactional keywords sit much higher up the sales funnel, where the searcher is not ready to buy yet and is still discovering and evaluating what options are available. These searches tend to be more general and return category pages, whereas more specific searches right at the bottom of the funnel return product pages as the user is ready to buy.

The below examples show how for the more generic “whey protein” keyword, category pages are returned.

The more specific “whey protein isolate” keyword, by contrast, returns product pages.

Dealing With Product Unavailability

Almost all eCommerce sites will have a number of Out of Stock (OoS) products on their site – this is a natural part of a successful eCommerce business. However, how you handle products that are no longer in stock has an impact on your SEO.

There are a number of factors at play that should be considered in addressing OoS products on your site. These include:

  • The size of your website
  • The number of out of stock products
  • Whether the product pages have any SEO value
  • Whether the products are temporarily or permanently out of stock

Temporarily Out of Stock Products

If your product is temporarily out of stock then these pages should be left as they are. You will restock these products and will want Google to continue crawling and indexing these pages, so there is no need to do anything from an SEO standpoint.

From a user perspective, it might be useful to include a note that indicates that the product may return. For example, you may say “We’re sorry for the unavailability. Enter your email below and we’ll let you know when this item is back in stock”.

However, this may not always be best because users will most likely simply look elsewhere and leave the page.

An alternative would be to allow users to preorder the item so that it will be delivered to them as soon as the product is available once again. This way, you aren’t missing out on a potential sale.

Permanently Out of Stock Products

If your product is permanently out of stock then the situation is more complex and there are a number of things that need to be considered.

There has to be a consideration of site size – whether your store has thousands of out of stock products or whether your site only has a handful of products in the first place.

A large number of out of stock product pages can contribute to index bloat, where large quantities of pages that have little to no value prevent crawlers from crawling the site efficiently and make the site more difficult for search engines to understand.

The options that come to mind here would be to either 404, 301 redirect or noindex these pages – the recommended course of action would be to 404 permanently out of stock pages and remove all internal links to these pages.

A 301 redirect can cause more issues than it solves. It can result in a bad user experience, in that a user expecting one product will end up looking at a different product or a different page type. In addition to this, it can result in redirect chains if product pages are redirected multiple times, which wastes crawl budget, diffuses link equity and lengthens page load times.

An option for very large sites is to use the unavailable_after rule in the robots meta tag to automate the process of removing these pages from the index, so that products are removed from the index after a specified date and time.
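As a rough illustration, the snippet below builds a robots meta tag carrying the unavailable_after rule for a discontinued product. The function name and the 90-day grace period are assumptions made for this example; the date is written in ISO 8601, which is one of the widely adopted formats Google says it accepts for this rule.

```python
from datetime import date, timedelta


def unavailable_after_tag(discontinued_on, grace_days=90):
    """Return a robots meta tag asking for removal after a cutoff date.

    discontinued_on: the date the product was discontinued.
    grace_days:      assumed number of days to keep the page indexed.
    """
    cutoff = discontinued_on + timedelta(days=grace_days)
    return (
        '<meta name="robots" '
        f'content="unavailable_after: {cutoff.isoformat()}">'
    )


# Example: a product discontinued on 1 January 2024.
print(unavailable_after_tag(date(2024, 1, 1)))
```

The returned string would be placed in the head of the product page template, so the cutoff can be set automatically at the moment a product is marked as discontinued.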

Another consideration is whether the out of stock product page is still providing any value to the site – from an SEO perspective it could have valuable backlinks or be driving organic traffic. Deleting valuable pages, even if the product is out of stock, would only have a negative impact on SEO. In these cases, it would be best to follow recommendations for products that are temporarily out of stock.
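The recommendations in this section can be condensed into a small decision helper. This is a simplified sketch of the guidance above, not a rule set from Google: the `ProductPage` fields and the returned action labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ProductPage:
    url: str
    permanently_out_of_stock: bool
    has_backlinks: bool = False        # valuable external links point here
    has_organic_traffic: bool = False  # the page still attracts search visits


def out_of_stock_action(page):
    """Suggest how to treat an out of stock product page."""
    if not page.permanently_out_of_stock:
        # Temporarily out of stock: leave the page crawlable and indexed.
        return "keep"
    if page.has_backlinks or page.has_organic_traffic:
        # Still valuable for SEO: treat like a temporary stock-out.
        return "keep"
    # No remaining value: return a 404 and remove internal links to it.
    return "404"


pages = [
    ProductPage("/whey-vanilla", permanently_out_of_stock=False),
    ProductPage("/old-shaker", permanently_out_of_stock=True),
    ProductPage("/classic-bar", permanently_out_of_stock=True, has_backlinks=True),
]
for page in pages:
    print(page.url, out_of_stock_action(page))
```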

Dealing With New Product Lines

As has been the case throughout the textbook, there are a handful of interesting patents that help us better understand how Google may work when it comes to eCommerce websites. The first of these is a patent titled “Detecting product lines within product search queries”, which was invented by Ritendra Datta and granted in August 2019.

The patent shines some light on how Google handles new products that are added to pre-existing product lines, and how it determines whether a user’s query is specifically looking for a particular line of products. For example, you may introduce a new wireless alternative to an existing pair of wired headphones.

The aim of this patent is for Google to improve its understanding of shopping-related search queries and by extension, to understand the “various aspects of product categories”. This includes associating products with a brand or a particular product line; the latter of which can sometimes change rapidly over time with new products being added and old ones retired frequently.

A great example of this is with smartphones where companies like Apple and Samsung release variations of the same smartphone with improved camera capabilities or increased storage.

The patent emphasises the need for Google to be able to automatically detect which search terms correspond to designated product lines from the product-related search terms provided by users. On top of this, Google needs to be able to relate these product lines and categories to brands, so that it can recognise more quickly whether new products have been added or old products have been discontinued.

The patent outlines a process to determine the product lines from product search queries as follows:

  • The product query may be classified to identify a product category
  • A brand may be identified from the product query
  • The brand may be selected from a list of known brands for the product category

According to the patent, unknown product lines are identified within the product query with a metric reflecting “how well the unknown product lines correspond to an actual product line within the brand”. In other words, this metric tries to identify whether the new product is part of an existing known line for a particular brand.

The metric is then compared to a specified threshold to determine whether or not to treat the unknown product line as a “new product line of the brand”.

High Precision Query Classifiers

The patent introduces the concept of a High Precision Query Classifier which is used to associate product lines and brands by analysing product search queries.

This works by automatically mapping the user’s query to a product category. A list of known brands for that category, taken from Google’s own repository or index, is then used to check whether any terms within the query specify the product brand. This helps Google quickly identify the product’s brand, which leaves it to decide whether the unknown product line falls within a pre-existing category or not.

The patent goes on to describe how a list of known category attributes may also be used to pick out terms from the query which help identify particular attributes of the product that the user is searching for. For example, users may include the number of megapixels of the camera when searching for a smartphone model, or the amount of RAM when searching for a laptop.

This is depicted in the figure taken from the patent below.

[B] [PL] [A] Product Query Form

Google outlines possible forms that a product query may take, one of which is the [B] [PL] [A] form.

Let’s break this down.

[B] = one or more terms which may represent a brand that is already known from Google’s own list of known brands

[A] = one or more terms which may indicate attributes that are already known for the product’s category

[PL] = one or more unknown terms which may then be identified as a potential new product line

This method of identification can be improved when [PL] is in a form that is already associated with product lines, or where [PL] is frequently found with a brand [B] across numerous product queries. It becomes even easier when the terms of the potential new product line are never “found with brands other than the brand [B] throughout many product queries over time”.

In order to determine how well the new product line fits within the context of existing lines, Google calculates and compares a metric. This metric looks at the number of unique queries that contain the terms of the new product line, and also compares the structure of the product category for that particular brand [B] by looking at shared attributes [A].

For example, Samsung [B] has its “Galaxy” [PL] smartphone range, which identifies each model with “S” followed by the model number, e.g. “Samsung Galaxy S20+”. One difference between a “Samsung Galaxy S20” and a “Samsung Galaxy S20+” is the screen size, i.e. a product attribute [A].

This perceived “pattern” is what Google is looking for to determine whether a new product can be “added” to an existing product line or not based on the user’s search query.

In the patent, Google outlines some of these observations which have been made when trying to identify when unknown terms may reflect a form that is already associated with existing product lines: “An example of a rule or pattern may be that product line terms generally start with a letter and contain few or no numbers.” – this reflects our example of Samsung’s Galaxy S20 range.
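To make the [B] [PL] [A] form more concrete, here is a minimal Python sketch that splits a query into a known brand, known attributes and leftover terms that become a candidate product line. The brand and attribute lists are tiny hypothetical stand-ins for the repositories the patent refers to, and `max_digits` is an assumed reading of “few or no numbers”; none of this is Google’s actual implementation.

```python
# Hypothetical stand-ins for the known-brand list and known attributes.
KNOWN_BRANDS = {"samsung", "apple"}
KNOWN_ATTRIBUTES = {"128gb", "256gb", "5g", "black"}


def parse_product_query(query):
    """Split a query into brand [B], candidate product line [PL], attributes [A]."""
    tokens = query.lower().split()
    brand = next((t for t in tokens if t in KNOWN_BRANDS), None)
    attributes = [t for t in tokens if t in KNOWN_ATTRIBUTES]
    # Whatever is neither a known brand nor a known attribute is "unknown".
    unknown = [
        t for t in tokens
        if t not in KNOWN_BRANDS and t not in KNOWN_ATTRIBUTES
    ]
    return {
        "brand": brand,
        "product_line": " ".join(unknown) or None,
        "attributes": attributes,
    }


def looks_like_product_line(term, max_digits=2):
    """Apply the quoted rule: starts with a letter, few or no numbers."""
    if not term or not term[0].isalpha():
        return False
    return sum(ch.isdigit() for ch in term) <= max_digits


parsed = parse_product_query("Samsung Galaxy S20 128GB")
print(parsed)
print(looks_like_product_line(parsed["product_line"]))
```

A model code such as “7890a002” fails the pattern check because it starts with a digit, whereas “galaxy s20” passes.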

Category Attribute Dictionary

Another concept that the patent mentions is the category attribute dictionary.

This provides a dictionary of known attributes that are already associated with various product categories and brands, which allows Google to quickly check whether new product line terms within the user’s query can be resolved.

For instance, when unknown search terms are found within the user’s query, they could be viewed as product line terms that are associated with a specific brand. The category attribute dictionary can be used as a reference checker to see whether the unknown terms are consistent with the brand [B] or the category associated with the product query.

How Google Processes Product Queries

Knowing how Google processes product queries helps us to understand how we should go about handling new product lines ourselves as website owners. It also helps provide an insight into how Google preserves the search intent behind these kinds of terms.

Below is a breakdown of how Google processes product queries as detailed in the patent:

1. Product query is received

2. Product query is classified to identify possible product categories

3. Product line terms are identified within the product queries of known brands

4. Unknown product line terms are evaluated against the typical product line templates

5. Unknown product line terms are identified as candidate product lines for the known brand

6. A metric is calculated to pair the product line with the brand

7. This metric is then compared against a pairing threshold

8. A product search is performed on the product query, the paired brand and the product line to determine the search results

9. Search results are returned to the user
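Steps 6 and 7 above (calculating a metric and comparing it against a pairing threshold) can be illustrated with a toy example. The patent does not publish a formula in the passages quoted here, so the metric below is purely an assumption: it measures what share of logged queries mentioning a candidate product line also mention the brand. The query log and the 0.7 threshold are likewise hypothetical.

```python
def pairing_metric(candidate, brand, query_log):
    """Share of queries mentioning `candidate` that also mention `brand`.

    This is an illustrative stand-in for the patent's metric,
    not the actual calculation.
    """
    with_candidate = [q.lower() for q in query_log if candidate in q.lower()]
    if not with_candidate:
        return 0.0
    with_brand = sum(1 for q in with_candidate if brand in q.split())
    return with_brand / len(with_candidate)


def pair_product_line(candidate, brand, query_log, threshold=0.7):
    """Pair the candidate line with the brand if the metric meets the threshold."""
    return pairing_metric(candidate, brand, query_log) >= threshold


# Hypothetical query log.
QUERY_LOG = [
    "samsung galaxy s20",
    "samsung galaxy s20 128gb",
    "galaxy s20 case",
    "samsung galaxy buds",
    "apple iphone 12",
]

print(pairing_metric("galaxy", "samsung", QUERY_LOG))
print(pair_product_line("galaxy", "samsung", QUERY_LOG))
```

The more consistently a product line term appears alongside a single brand, the higher this toy metric climbs, which mirrors the patent's observation that pairing is easier when a term is never found with other brands.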

The main takeaway from this patent is that Google is collecting information about products and product attributes from users’ search queries. These are then mapped to existing information about the brand and product categories to determine whether the product the user is looking for is part of a new or existing line of products.

Looking at users’ search queries as opposed to the explicit descriptions that may be detailed on the product page itself or within the page’s schema markup allows Google to get an idea of the interest levels for specific products.

As business owners, you should therefore adopt a consistent template for a series of products to help Google better evaluate and understand the new products that you add to your online store.

Product Vs Accessory Queries

Another patent titled “Distinguishing accessories from products for ranking search results” illustrates how Google adapts its search results by looking at the distinction between products and their accessories. This is important because most eCommerce websites offer accessories on top of their core line of products on their online store. For example, you may sell charging cables, Bluetooth headsets and external storage cards as accessories for smartphones.

The patent aims to solve the issue of product and accessory results being intermingled in the search results which Google states can be “frustrating” for the user. This is because when researching a product, you do not want to see results that are related to an accessory when you haven’t even purchased the product in the first place.

Therefore, in a scenario where the search engine results depict both product and accessory listings, the user has to visually scan and potentially even click each result to decide whether the result is of interest to them or not.

The proposed solution here is for Google to classify queries into product or accessory queries, and then generate search results which focus on the identified classification. In other words, if Google determines that the user’s query is for an accessory, then the results for accessories would be prioritised over those for the products.

This allows Google to find results that are more relevant to what the user is looking for and, in turn, to provide a better experience.

The Offer Data Store

Google explains how a query processor is used to identify results depending on whether the query is related to a product or an accessory.

“Individual results in a set of results can be ordered according to a ranking score computed by the query processor. The result set can be delivered to the client computer as a complete set of results or delivered in segments. For example, the 10 highest ranked results can be delivered first in one web page that includes a link that can be selected by a user to cause the front end server to deliver the next 10 highest ranked results”.

The patent goes on to mention an offer data store. This is a mechanism that includes records that correspond to items that are offered by merchants.

You can register with Google using its Merchant Center to upload and outline the products you offer so that it can retrieve and display these records to the user.

This allows you to specify details about the products and offers such as:

  • The title of the offer
  • A description of the product or offer
  • The price of the offer
  • The category of the offer which can include the name of the category that can be selected from a list provided by the search engine.

This effectively allows Google to have a detailed and accurate representation of the products that you offer and allows the search engine to search this corpus of information to retrieve the right products based on the user’s search query.

Google is able to compare query keywords with the category information outlined by the merchant.

Likewise, the query classifier and result classifier (both of which are part of the query processor) are used to compare the “query keywords and keywords from identified results to offer record keywords that have a calculated probability of being associated with an accessory”. In other words, Google uses these two classifiers to determine how “close” the user’s search query is to the search results of other known keywords and in turn, decide whether to associate the user’s search with a product or accessory.

For example, Google explains how the offer processor may be able to heuristically classify a given offer by looking at the title field of the offer provided by the merchant. The system checks to see if the title includes words or phrases that are often associated with accessories.

For example, for a smartphone retailer, the word “cable” or “charger” may be an identifier of an accessory.
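A title-based heuristic of this kind might look like the sketch below. The `OfferRecord` fields mirror the offer details listed earlier (title, description, price and category), while the keyword probabilities and the 0.5 threshold are invented for illustration and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class OfferRecord:
    title: str
    description: str
    price: float
    category: str


# Hypothetical probabilities that a title word signals an accessory.
ACCESSORY_KEYWORD_PROBABILITY = {"cable": 0.9, "charger": 0.9, "case": 0.85}


def accessory_probability(offer):
    """Return the highest accessory probability of any word in the title."""
    words = offer.title.lower().split()
    return max(
        (ACCESSORY_KEYWORD_PROBABILITY.get(word, 0.0) for word in words),
        default=0.0,
    )


def is_accessory(offer, threshold=0.5):
    """Classify the offer as an accessory if the probability meets the threshold."""
    return accessory_probability(offer) >= threshold


phone = OfferRecord("Samsung Galaxy S20", "5G smartphone", 799.0, "Smartphones")
cable = OfferRecord("USB-C charging cable", "1m cable", 9.99, "Smartphones")
print(is_accessory(phone), is_accessory(cable))
```

For merchants, the practical point is that a clear, descriptive offer title makes this kind of classification far more likely to land on the right side.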

Key Takeaways

This patent is another example of how Google is looking at the search intent of the user’s query to determine the results. In this case, it highlights the importance of properly differentiating between products and accessories to enable the right kinds of offers to be displayed to the user based on their search query.

Product Reviews

The third and final patent that we’ll be looking at covers product reviews, and provides insight into how the search engine finds and aggregates them. Filed in 2004 and granted in 2011, the patent is titled “Method and system for finding and aggregating reviews for a product” and includes information on how Google may approach reviews of consumer products, business products, movies, books, restaurants and more.

If you have an eCommerce website, it’s very likely that you allow your customers to leave reviews on the products and/or services that you offer. Therefore, knowing what kind of information to include on your pages can make it easier for your products to appear in a service like the one described by Google in this patent.

Likewise, regardless of whether or not this method is employed in Google Search, it still helps us understand how Google (a company that prides itself on “organising the world’s information”) may interpret the information that is provided on your website.

In other words, this is an example of how you can be proactive as opposed to reactive when it comes to SEO.

The basis of the patent stems from the idea that “people like to do research on purchases on the web before buying something” and that “search engines tend to be a preferred place to do those searches”. For example, before booking a hotel, you’re most likely going to want to know what kind of experience and stay previous customers had.

The most common type of search query for this intent is as follows:

[product or service] + “review”

For example: samsung galaxy s20 review

However, Google states that although some of these results do contain reviews of the product that the user is researching, many may not. Some may sell the product whilst others may simply provide an aggregation of the reviews from several other websites which Google describes as “cumbersome, time consuming and inefficient”.

As a result, the patent aims to provide a more efficient way to:

  • collect product reviews from the Internet
  • aggregate reviews for the same product
  • provide an aggregated review to end users in a searchable format

Parts of the Proposed System

The system is broken down into three main parts.

1. Aggregated Reviews Backend Server

This component is responsible for:

  • Collecting product reviews from multiple web sites
  • Identifying particular products that are associated with particular product reviews
  • Generating aggregated review information for particular products
  • Storing the product reviews and the aggregated review information

2. Aggregated Reviews Frontend Server

This component is responsible for:

  • Receiving and responding to requests from the client to provide an aggregated review for a product and/or
  • Searching within reviews for a particular product

3. Graphical User Interface

This component is responsible for displaying:

  • Portions of a number of reviews for a product, and
  • A search input area to enable searching for reviews of products containing the search terms.

The system is also made up of the following components:

  • Review Extraction Module – this is responsible for extracting product reviews from the information collected by the crawling module. The crawling module includes a parser for reviews which extracts related information like the review text, author and date as well as product information such as the name and model number.
  • Review Aggregating Module – this is responsible for identifying which products should be associated with which reviews and generates aggregated review information for these products.
  • Aggregated Review Buffer – stores aggregated review information for a product, such as the total number of reviews, an average product rating and phrases that appear frequently in the extracted product reviews associated with the product.
  • Review Database and Indexer – the database is used to store individual reviews and aggregated review information, whereas the indexer indexes the reviews stored in the database so that they can be used by the reviews index.
  • Reviews Index – maps words and phrases, as well as other values such as ClusterIDs (product identifiers) and the name of the author who wrote the review, to reviews.
  • Front End Server – contains an operating system, communication module, a product database for storing product-related information (i.e. product and vendor details) and a reviews index.
  • Review Search Module – this answers search requests and includes a “search all reviews” application, which is used to search for a particular product within all of the reviews in the database, and a “search within reviews for a product” application.
  • Presentation Module – formats the aggregated reviews and search results for display

Collecting Product Reviews

The patent outlines how the system collects product reviews as follows:

1. By selectively crawling review-related sites for product reviews and, importantly, only following “selected links on Web pages, rather than all links”. A traditional crawler, by contrast, would follow all links.

2. Use a seed list of URLs to begin crawling and collecting product reviews.

3. The parser “may identify such links based on the presence of terms or patterns in the URLs of the links” or use the anchor text of the links to decide which URLs to follow.

4. Other sources of reviews may be used to collect this information.

5. Reviews for products may be received via data streams from product reviewers.

6. The backend server may extract information about reviews from pages crawled and store the data in a buffer which separates the review into fields like author, review text, date, product name etc.
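The selective link-following described in steps 1 and 3 can be approximated with a simple filter over a page's links. The list of review-related terms is an assumption made for this sketch; the patent only says that links may be chosen based on terms or patterns in the URL or on the anchor text.

```python
from urllib.parse import urlparse

# Hypothetical terms that suggest a link leads to review content.
REVIEW_TERMS = ("review", "testimonial", "rating")


def should_follow(url, anchor_text=""):
    """Follow a link only if its URL path or anchor text mentions a review term."""
    path = urlparse(url).path.lower()
    text = anchor_text.lower()
    return any(term in path or term in text for term in REVIEW_TERMS)


def select_links(links):
    """Filter an iterable of (url, anchor_text) pairs down to URLs worth crawling."""
    return [url for url, anchor in links if should_follow(url, anchor)]


links = [
    ("https://example.com/headphones/reviews", "See all"),
    ("https://example.com/about", "About us"),
    ("https://example.com/p/123", "Read customer reviews"),
]
print(select_links(links))
```

This is also why descriptive URLs and anchor text matter: a reviews page hidden behind a generic path and a “click here” link would be skipped by a filter like this one.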

Clustering Reviews

The patent also explains how reviews for the same product may be clustered together.

This is achieved by looking for similarities in the product brand name, the model number or the product category.

In some cases, this may not be so simple, as different web sites and reviewers may use different names to describe the same product. Likewise, the model number or product code may not be explicitly mentioned within the review. For example, different web sites refer to the same Canon scanner as:

  • Canon CanoScan 7890a002 Flatbed 7890a002
  • Canon CanoScan LiDE 30
  • Canon CanoScan LiDE 30 Scanner
  • Canon CanoScan LiDE 30 Color Scanner
  • Canon Lide 30 (7890A002)

This results in the search engine having to predict this information based on the information that it does have, such as the title of the page.
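One simple way to picture this clustering is to pull identifying keys out of each product name and group together any names that share a key. The two regular expressions below (a model code mixing letters and digits, and a "word number" pair such as "lide 30") are assumptions chosen to fit the Canon example, not the matching logic described in the patent.

```python
import re


def identifying_keys(name):
    """Extract keys that are likely to identify a product."""
    text = name.lower()
    # Codes of five or more characters mixing letters and digits, e.g. 7890a002.
    keys = set(re.findall(r"\b(?=\w*\d)(?=\w*[a-z])\w{5,}\b", text))
    # A word followed by a number, e.g. "lide 30".
    keys |= set(re.findall(r"\b[a-z]+ \d+\b", text))
    return keys


def cluster_names(names):
    """Group names that share at least one identifying key."""
    clusters = []  # each entry is (keys, member_names)
    for name in names:
        keys, members = identifying_keys(name), [name]
        remaining = []
        for cluster_keys, cluster_members in clusters:
            if keys & cluster_keys:
                # Merge any existing cluster that overlaps with this name.
                keys = keys | cluster_keys
                members = cluster_members + members
            else:
                remaining.append((cluster_keys, cluster_members))
        clusters = remaining + [(keys, members)]
    return [members for _, members in clusters]


names = [
    "Canon CanoScan 7890a002 Flatbed 7890a002",
    "Canon CanoScan LiDE 30",
    "Canon CanoScan LiDE 30 Scanner",
    "Canon CanoScan LiDE 30 Color Scanner",
    "Canon Lide 30 (7890A002)",
]
print(len(cluster_names(names)))
```

Note how the last name, which contains both the model name and the product code, is what links the first listing to the other three. Including both identifiers on your own product pages makes this kind of matching easier.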

Generating and Displaying Aggregate Information

Google generates an aggregate of the information that it has collected for products or services so that it can be presented to the searcher.

This aggregation may take into account things like:

  • The total number of reviews for the product
  • The average rating for the product (this may be the average of weighted ratings from different websites)
  • A distribution of the ratings for the product
  • Commonly occurring phrases within the reviews of the product
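The four aggregates listed above can be computed with a short function. The review structure, the per-site weights and the use of two-word phrases are all assumptions for the purposes of this sketch; the patent does not specify how ratings are weighted or how phrases are extracted.

```python
from collections import Counter


def aggregate_reviews(reviews, site_weights=None, top_phrases=3):
    """Summarise reviews given as dicts with 'site', 'rating' and 'text'."""
    if not reviews:
        raise ValueError("at least one review is required")
    site_weights = site_weights or {}
    # Sites without an explicit weight count as 1.0.
    weights = [site_weights.get(r["site"], 1.0) for r in reviews]
    weighted_sum = sum(w * r["rating"] for w, r in zip(weights, reviews))

    # Count each two-word phrase once per review it appears in.
    phrase_counts = Counter()
    for review in reviews:
        words = review["text"].lower().split()
        phrase_counts.update({" ".join(pair) for pair in zip(words, words[1:])})

    return {
        "total_reviews": len(reviews),
        "average_rating": weighted_sum / sum(weights),
        "distribution": dict(Counter(r["rating"] for r in reviews)),
        "common_phrases": [
            phrase
            for phrase, count in phrase_counts.most_common(top_phrases)
            if count > 1
        ],
    }


reviews = [
    {"site": "a", "rating": 5, "text": "great sound quality"},
    {"site": "b", "rating": 3, "text": "sound quality is average"},
    {"site": "a", "rating": 4, "text": "comfortable fit"},
]
print(aggregate_reviews(reviews, site_weights={"a": 2.0, "b": 1.0}))
```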

Reviews may be displayed to users based on:

  • The position and frequency of words used within the review – i.e. reviews with the product name in the title may be prioritised over those that don’t.
  • Their preferences i.e. users may only wish to see reviews from a particular website.

Review Searching and Display

The review search module supports two different types of search queries:

1. Searching within the reviews that belong to a single product, for example a search for “customer service” in the reviews for a given product.

2. Searching all reviews to find a particular product, for example “best wireless headphone”.
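The two query types can be illustrated with plain substring matching over a small in-memory list of reviews. The `product_id` and `text` field names are hypothetical, and a real system would use the reviews index rather than scanning every review.

```python
def search_within_product(reviews, product_id, phrase):
    """Return one product's reviews that mention the phrase."""
    phrase = phrase.lower()
    return [
        r for r in reviews
        if r["product_id"] == product_id and phrase in r["text"].lower()
    ]


def search_all_reviews(reviews, phrase):
    """Return product IDs whose reviews mention the phrase, most matches first."""
    phrase = phrase.lower()
    hits = {}
    for r in reviews:
        if phrase in r["text"].lower():
            hits[r["product_id"]] = hits.get(r["product_id"], 0) + 1
    return sorted(hits, key=hits.get, reverse=True)


reviews = [
    {"product_id": "hp-1", "text": "Great customer service and sound"},
    {"product_id": "hp-1", "text": "Battery life is poor"},
    {"product_id": "hp-2", "text": "Best wireless headphones I have owned"},
]
print(search_within_product(reviews, "hp-1", "customer service"))
print(search_all_reviews(reviews, "wireless headphones"))
```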

The GUI (Graphical User Interface) of the system would then display the reviews so that searchers can see important information like:

  • The aggregate rating of the product.
  • A list of suggested search terms.
  • A list of sources for the reviews that link back to the source websites or to the corresponding reviews.
  • And more.

The interface also may allow users to click on certain regions which may generate more specific reviews that are related to the user’s search query. For example, the aggregated ratings may allow users to only see reviews that mention “customer service” so that the searcher is able to find relevant comments from previous customers/users based on attributes that are of importance to them.

Further customisation is enabled as users may then be able to view the results based on review length, relevance, date etc.

Key Takeaways

Google acknowledges that product reviews play a pivotal role in the search behaviour of users who are looking to make a purchase online. With this in mind, it’s important that you make it as easy as possible for Google to find reviews on your eCommerce website.

For example, this may influence how you optimise your:

  • URLs – so that Google is able to quickly find “Testimonials” or “Reviews” on your website
  • Anchor Text – so that Google is able to quickly identify patterns and follow links to the pages that you want to be crawled
  • Review Information – to see how you can guide customers to provide feedback on specific attributes of your product or service. For example, you may wish to allow users to rate various features of a pair of headphones i.e. sound quality, durability, design etc. This way, you can make the most of how Google can present users with reviews based on features that are important to them.

We mentioned earlier how this is an example of a proactive SEO strategy as opposed to reactive because whether or not Google actively implements this system within its algorithm is unknown. However, apart from helping provide more information to Google about the products and services you offer, implementing reviews serves as an additional trust signal which also helps build your E-A-T.

This, in turn, increases the chances of turning organic visitors into customers, as people who see that your products have received glowing reviews will be more likely to place an order or hire your services.

Internal Linking

Internal links play a particularly important role in an eCommerce site’s SEO. eCommerce sites are often large and made up of many different components: category pages, product pages, blog content and customer pages. Internal links can be used as a tool to overcome the complications that this creates.

Keyword rich internal link anchor text can be used to send additional relevance signals to Google and help search engines better understand the site. It can help convey to Google not only the keyword the page is targeting but also whether it is a product or category page, or an informational post.

In addition to this, internal links help funnel link juice around the site and pass authority to pages such as product or sub-category pages that usually receive fewer direct backlinks compared to the likes of blog posts.

Internal links should also be used to help Googlebot navigate large sites and ensure that crawlers access the most important parts of the site. This is strongly tied to site architecture – which is discussed in more detail later in this chapter – in that a strong internal linking structure, especially within the site’s main navigation, can help bots crawl the site with ease, add relevancy signals and funnel link juice.
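
One way to sanity check an internal linking structure is to count how many pages link to each URL. The sketch below is a minimal illustration under stated assumptions: the link graph, the URLs and the `inbound_link_counts` helper are all hypothetical, and in practice the graph would come from a crawl of your own site.

```python
from collections import Counter

# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/cameras/", "/blog/"],
    "/cameras/": ["/cameras/q410/", "/"],
    "/blog/": ["/blog/choosing-a-camera/", "/"],
    "/blog/choosing-a-camera/": ["/cameras/", "/cameras/q410/"],
    "/cameras/q410/": ["/cameras/"],
    "/cameras/q500/": ["/cameras/"],
}


def inbound_link_counts(links):
    """Count how many distinct pages link to each page (self-links ignored)."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


counts = inbound_link_counts(links)
# Pages with no inbound internal links cannot be reached by following links.
orphans = sorted(page for page, n in counts.items() if n == 0)
print(orphans)
```

Product pages that show up with zero or very few inbound links are candidates for extra links from category pages or related blog posts.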


Schema Markup

Schema markup plays a vital role in enhancing your eCommerce pages in the SERPs.

For example, product pages without correctly implemented schema markup will not be able to show price, stock availability or the product’s star rating in the results pages. This is a missed opportunity for increasing your click-through-rates for the keywords that you are already ranking well for on the first page of Google.
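
Ahead of the fuller implementation covered later in the module, here is a minimal sketch of the kind of markup involved. It uses Python to build a JSON-LD block with the schema.org `Product`, `Offer` and `AggregateRating` types. Every value (price, currency, rating, review count) is made up for illustration, and whether Google shows any of these details in the results pages remains at Google’s discretion.

```python
import json

# Hypothetical product data; every value below is illustrative only.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "CameraFX Q410 Digital Camera",
    "offers": {
        "@type": "Offer",
        "price": "199.99",
        "priceCurrency": "GBP",
        "availability": "https://schema.org/InStock",
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "87",
    },
}

# Wrap the JSON-LD in the script tag that would sit in the page's HTML.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(product, indent=2)
    + "\n</script>"
)
print(snippet)
```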

Later on in the module we will show you how to implement structured data for your eCommerce website. Here, we’ll explore the theory behind why implementing schema markup should be a crucial part of your eCommerce SEO strategy and how else it may be used by Google.

For this, we focus our attention once again on a Google patent that was filed in 2013 titled “Using structured data for search result deduplication”. Before we dive into the details, it’s worth noting that this patent was later abandoned by Google without explanation. As is the case with any Google patent that we have mentioned throughout The SUSO Method, there is no guarantee that Google actively implements or uses any of the systems described. Instead, these patents provide an insight into what the best practices may be and importantly, offer an insight into how Google may approach Search in the future.

Likewise, the fact that this patent in particular was abandoned by Google does not mean that it is not worth looking at. On the contrary, the patent tells us that Google is thinking about using structured data to combat issues of duplicate content, but that this system may be too difficult to implement or simply may not even be the “best” or most “innovative” solution.

The Problem

As mentioned before, one of the main challenges for eCommerce websites is duplication of content that is most often created as a result of filtering and sorting options that help users find the products that they are looking for. These filters create undesired URLs that ultimately end up getting crawled and indexed by Google. Without taking care to ensure that such pages aren’t discovered by the search engine, you will end up with a disparity between the number of products that you sell, and the number of URLs that Google indexes.

For example, if you have 200 product and category pages, Google may end up indexing over 1,000 URLs for your website. These may include HTTP and HTTPS versions, www and non-www versions, as well as URLs generated by the sorting and filtering parameters.

Not only does this cause serious duplicate content issues, but it also impacts your website’s crawl budget.
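
To make the scale of the problem concrete, the sketch below groups a handful of crawled URLs by the single page they really represent. It is an illustration only: the `example.com` URLs, the list of ignored parameters and the choice of HTTPS without www as the preferred version are all assumptions.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these parameters only sort or filter the same set of products.
IGNORED_PARAMS = {"sort", "order", "colour", "size"}


def normalise(url):
    """Collapse protocol, www and sort/filter variants into one preferred URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS]
    return urlunsplit(("https", host, parts.path, urlencode(kept), ""))


crawled = [
    "http://example.com/cameras/",
    "https://www.example.com/cameras/",
    "https://example.com/cameras/?sort=price",
    "https://example.com/cameras/?colour=black&sort=price",
    "https://example.com/cameras/q410/",
]

groups = defaultdict(list)
for url in crawled:
    groups[normalise(url)].append(url)

for preferred, variants in groups.items():
    print(preferred, len(variants))
```

Here five crawled URLs collapse into two real pages, which is the disparity between pages sold and URLs indexed that this section describes.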

The Solution

Google wants to provide searchers with the best results possible, and as a result wants to avoid presenting search results that are duplicated.

The patent proposes a system that uses Schema markup vocabulary to reduce duplication, so that fewer pages with identical or similar content rank for the same keywords.

The idea here is for Google to be able to recognise and identify which products (or entities) the pages on a website are about. By doing this, the system is able to compare pages that address the same entity and determine which page should be displayed to the user in the search results. This helps “provide a user with a greater diversity of search results that identify a larger number of sites”.

Here are the steps as outlined in the patent:

So, if for example we had the following structured data on a product page for a Camera model, the system would use the markup information to identify that this page is referring to this particular camera model.

<div itemscope itemtype="">
  <div itemprop="name">CameraFX Q410 Digital Camera</div>
  <div itemprop="manufacturer">CameraFX</div>
  <a itemprop="url" href=""></a>
  <div itemprop="description">The CameraFX Q410 Digital Camera is ideal for any photographer, combining both high quality imaging that makes taking pictures easy. </div>
  <div>Product ID: <span itemprop="productID">32720176</span></div>
</div>

This enables Google to identify the most relevant information on a page and so avoid showing multiple pages about the same entity.

Although the system is able to identify these entities, the challenge lies in how it deals with variance. For example, if we had one page that talks about the Statue of Liberty and another that talks about Lady Liberty, the system would need to be able to identify that both of these pages are referring to the same entity.

In order to address this, the patent mentions assigning reference scores for entities that may be seen as related as well as entities which are the same, but use alias names.

The patent also details how eCommerce websites that have multiple entities on the same page can be grouped together as entity sets.

For example, let’s assume our website has four camera model entities (products) which are denoted by c1, c2, c3 and c4.

Structured data markup on web page A may be parsed and mapped to generate an entity set that has the following products: {c1, c2, c4}.

Web page B, may have the set: {c1, c2}.

Web page C may have the set: {c3, c4}.

The search engine would then be able to see which entities are featured on which pages as well as identify overlaps between the pages/entities. For instance, the system may decide that Web page C is better suited for searchers looking for c4 than Web page A.
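
The entity sets above map naturally onto Python sets. The sketch below finds overlaps and ranks the pages that feature a given product. The ranking rule (prefer the page with the fewest entities on it) is our own assumption for illustration; the patent text quoted here does not state how the choice is made.

```python
# Entity sets from the example above: page -> products identified via structured data.
entity_sets = {
    "A": {"c1", "c2", "c4"},
    "B": {"c1", "c2"},
    "C": {"c3", "c4"},
}


def pages_for(entity, entity_sets):
    """Rank pages featuring `entity`, most focused first.

    Assumption: focus is 1 / (number of entities on the page). This heuristic
    is illustrative only and is not taken from the patent.
    """
    candidates = [
        (1 / len(entities), page)
        for page, entities in entity_sets.items()
        if entity in entities
    ]
    return [page for _, page in sorted(candidates, key=lambda c: (-c[0], c[1]))]


def overlap(page_a, page_b, entity_sets):
    """Entities that two pages have in common."""
    return entity_sets[page_a] & entity_sets[page_b]


print(pages_for("c4", entity_sets))
print(sorted(overlap("A", "B", entity_sets)))
```

Under this assumed rule, page C ranks ahead of page A for c4 because c4 makes up a larger share of what page C is about.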

The Takeaway

The key takeaway from this patent is that even though it was abandoned, it shows how Google may use structured data and schema markup to tackle duplication issues, especially with eCommerce websites.

Site Architecture

eCommerce sites can easily turn into large sprawling messes without proper consideration for site structure.

Site architecture is important as it has a large influence on how Google crawls and understands your website. If the site has no clear structure and is difficult to navigate, Google will struggle to find and understand the relevance of the pages that are most important to you.

The site’s architecture should be simple, clear and the crawl depth should be kept shallow. By this, we mean that every page should be accessible from the homepage in no more than four clicks.

Products should be split up into relevant categories and subcategories that should be clear cut, avoiding duplication or overlap where possible. These categories should be reflected in the site’s main navigation menu.

The homepage should link to top level categories and each top level category should link to sub-categories or products. Each page should link back up to the level above it. This structure gives Google clarity in understanding the relevance of each page.

Argos’ main navigation menu is a great example of how a large number of products can be split into relevant categories and sub-categories, ensuring clarity and ease of navigation for both the user and the crawler.

As mentioned before, crawl depth refers to the number of clicks from the homepage that it takes to reach an internal page. The deeper the page, the harder it is for Google to discover and the less likely it is to be crawled. This is a problem that occurs often in eCommerce SEO due to the number of products and categories on a site. The category structure described above will keep products within four clicks of the homepage and allow them to be discovered.
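
Crawl depth can be measured with a breadth-first search over the site’s internal links. The sketch below is a minimal example on a hypothetical six-page site. The URLs and the `crawl_depths` helper are illustrative, and the four-click threshold is the one recommended in this chapter.

```python
from collections import deque

# Hypothetical site structure: page -> pages it links to.
site = {
    "/": ["/cameras/", "/lenses/"],
    "/cameras/": ["/cameras/compact/", "/cameras/dslr/"],
    "/lenses/": [],
    "/cameras/compact/": ["/cameras/compact/q410/"],
    "/cameras/dslr/": [],
    "/cameras/compact/q410/": [],
}


def crawl_depths(site, start="/"):
    """Breadth-first search giving the minimum number of clicks from `start`."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in site.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


depths = crawl_depths(site)
# Flag anything deeper than the four-click guideline.
too_deep = [page for page, depth in depths.items() if depth > 4]
print(depths["/cameras/compact/q410/"], too_deep)
```

Because breadth-first search visits pages level by level, the first time a page is reached is always via its shortest click path from the homepage.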

The below screenshot shows the crawl depth of a large renovation eCommerce site – in it we can see the potential pitfalls of a site with a poor site architecture. The large majority of URLs are 10 levels deep, giving crawlers an extremely hard time discovering all pages.

Later in the module, we will also show you how to structure the URLs for your eCommerce site.


Pagination

Whilst Google no longer takes rel="prev" or rel="next" into account, which was previously a big talking point in eCommerce SEO, pagination remains an important way for products to be displayed within an eCommerce category.

We recommend implementing pagination on category pages according to best practice.

A common but misguided attempt to reduce page duplication is to noindex the second, third and subsequent pages in a paginated set, or to canonicalise them towards the first page of the set. This should be avoided as it will make it more difficult for Google to discover the products that aren’t on the first page of the category.

“We don’t treat pagination differently. We treat them as normal pages.” – John Mueller, Google Webmaster Office-hours Hangout 22 March 2019.

In addition to this, as seen from the quote above, Google now considers each of the paginated pages in a set as separate pages in their own right.

For example, if your website has the following pagination:

Then each of the paginated pages would now be treated as separate pages, just like any other page on your website that is indexed by Google.

This may raise the concern that, with all paginated pages being treated alike, duplicate content may be created and the primary page weakened as a result. Removing the category description from all pages except the first page in the set is one option to reduce duplicate content between paginated pages.
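
The advice above can be sketched as a small helper that decides what each page in a paginated set should output. This is an illustration under assumptions rather than a definitive implementation: the `?page=N` URL pattern, the `example.com` URL and the `paginated_page_meta` helper are hypothetical, and the self-referencing canonical is our reading of the advice not to canonicalise to the first page.

```python
def paginated_page_meta(category_url, page, description):
    """Build head/body hints for one page in a paginated category.

    Assumptions: pages 2+ use a `?page=N` query parameter (a hypothetical URL
    pattern), each page carries a self-referencing canonical, and only the
    first page shows the category description.
    """
    url = category_url if page == 1 else f"{category_url}?page={page}"
    return {
        "canonical": f'<link rel="canonical" href="{url}">',
        "robots": "index, follow",  # paginated pages stay indexable
        "description": description if page == 1 else "",
    }


for n in (1, 2, 3):
    meta = paginated_page_meta("https://example.com/cameras/", n, "Shop our cameras.")
    print(n, meta["canonical"], repr(meta["description"]))
```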

You can find a comprehensive guide on how to deal with pagination here.

Faceted Navigation

Faceted navigation is an essential part of an eCommerce website, especially for categories with a large number of products. It allows the user to refine their search and navigate through your products with ease.

For example, most fashion retailers like Next will allow users to narrow down their search for products by size, colour, fabric etc.

However, whilst it is great for usability, it can cause issues from an SEO perspective if not implemented correctly.

The biggest risk associated with faceted navigation is that it can create a large number of near duplicate pages: each option and variation combination selected creates a new URL.

If you select a colour and a design feature on the Next website, the URL may appear something like:

In addition to duplicate content, faceted navigation can waste a site’s crawl budget on hundreds or thousands of near duplicate pages that you don’t want indexed.

A canonical tag on each filter page pointing back to the root category will help search engines understand the pages as simply variations of one another. It will also consolidate link equity and reduce wasted crawl budget.

For the above example, each variation within that category should canonicalise back to

An alternative way of implementing faceted navigation without generating different URLs for each variation would be to implement it using AJAX – this way the user will be able to navigate between product selections as before, but new URLs will not be created, eliminating the risk of duplicate content or wasted crawl budget.
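
Returning to the canonical tag approach described earlier in this section, the sketch below shows one way to generate the tag for a filtered URL. The facet parameter names, the `example.com` URLs and the `canonical_tag` helper are hypothetical, and a real site would list its own filter parameters.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical facet parameters; a real site would list its own filter names.
FACET_PARAMS = {"colour", "size", "fabric", "sort"}


def canonical_tag(url):
    """Return a canonical link tag pointing a filtered URL at its root category."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in FACET_PARAMS]
    root = urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
    return f'<link rel="canonical" href="{root}">'


print(canonical_tag("https://example.com/dresses/?colour=red&size=12"))
```

Only the listed facet parameters are dropped, so any other query parameter, such as a page number, is left in place.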

You can find more information on the best (and worst) practices for implementing faceted navigation as recommended by Google here.

Link Building

Link building for an eCommerce website follows the same overarching principles as link building for any other site. However, there are some specific factors that must be taken into consideration.

Link building is all about boosting your site’s authority and strengthening your pages’ relevance for target keywords. An eCommerce site has a number of page types – the homepage, category and subcategory pages, product pages and blog pages – so it can be difficult to determine where to point backlinks.

As ever, backlinks to your eCommerce website must reflect the natural balance of links within your niche.

Often within eCommerce sites, most links point towards:

1. Homepage

2. Blog pages

3. Category pages

4. Product pages

However, this is by no means consistent for all niches under the eCommerce umbrella. Similarly, anchor text ratios must also reflect those of your competitors.

The effectiveness of link building also depends on other factors highlighted throughout this section, especially site architecture and internal linking.

There are some specific opportunities and strategies that can be taken advantage of when it comes to link building for your eCommerce website:

  • Creating a high quality blog: sites can often be reluctant to link to an eCommerce page as it can feel as if they’re promoting your products directly. High quality, unique content in a blog is much more attractive to link towards.
  • Use services for journalists such as HARO: often journalists are looking for products or quotes to feature in round-up articles, which will usually come with a link to your site – there are a number of services that allow journalists to post bulletins on a daily basis. You can filter through these and pitch directly to the journalist.
  • Give away products for reviews: publications are often looking for new products to review, so find a relevant website and offer to send them a product to review in exchange for a link back to your website.