Documentation

A comprehensive guide to the SEO Cluster Tool

File Format

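To use the tool, prepare a CSV file containing search engine ranking data. The file should include the following columns:

  • Keyword - The keyword you want to analyze
  • URL - The ranking URL for that keyword
  • Volume - The monthly search volume for that keyword (used to name each cluster)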

Important notes on the format:

  • Each keyword should have exactly 5 URLs associated with it
  • This means each keyword will appear in 5 consecutive rows with different URLs
  • The file should be in CSV format with comma (,) as the delimiter
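The format rules above can be checked before uploading a file. The following is a minimal sketch and not part of the tool itself: the function name find_format_problems is hypothetical, and it assumes the header names match "Keyword" and "URL" regardless of letter case.

```python
import csv
import io
from collections import defaultdict

EXPECTED_URLS_PER_KEYWORD = 5  # taken from the format notes above


def find_format_problems(csv_text):
    """Return a list of human-readable problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))  # comma is the default delimiter
    urls_by_keyword = defaultdict(list)
    for raw_row in reader:
        # Assumption: header names are matched case-insensitively.
        row = {k.strip().lower(): v for k, v in raw_row.items() if k}
        urls_by_keyword[row["keyword"]].append(row["url"])

    problems = []
    for keyword, urls in urls_by_keyword.items():
        if len(urls) != EXPECTED_URLS_PER_KEYWORD:
            problems.append(
                f"{keyword!r} has {len(urls)} URLs, expected {EXPECTED_URLS_PER_KEYWORD}"
            )
        if len(set(urls)) != len(urls):
            problems.append(f"{keyword!r} lists the same URL more than once")
    return problems


# Hypothetical input: "seo tips" is complete, "seo guide" has only one row.
sample = (
    "Keyword,URL,Volume\n"
    + "".join(f"seo tips,example.com/page-{i},1000\n" for i in range(5))
    + "seo guide,example.com/page-0,1500\n"
)
print(find_format_problems(sample))
```

An empty list means the file passed both checks. The sketch does not verify that the 5 rows for a keyword are consecutive.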

Keyword Clustering

The tool groups keywords based on common URLs as follows:

  1. For each keyword, the tool looks at its associated URLs
  2. It compares these URLs with other keywords' URLs
  3. Keywords that share at least a specified number of common URLs (the threshold) are grouped together
  4. The keyword with the highest search volume in the group becomes the group name
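The four steps above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual implementation: the function cluster_keywords and the sample data are hypothetical, and the sketch assumes grouping is transitive (if A matches B and B matches C, all three end up in one cluster), which the tool may or may not do.

```python
from itertools import combinations


def cluster_keywords(urls_by_keyword, volume_by_keyword, threshold):
    """Group keywords that share at least `threshold` URLs.

    Returns {group_name: sorted list of member keywords}.
    Assumption: grouping is transitive; the real tool may behave differently.
    """
    parent = {kw: kw for kw in urls_by_keyword}

    def find(kw):
        # Follow parent links to the representative of kw's group.
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]  # shorten the chain as we go
            kw = parent[kw]
        return kw

    # Steps 1-3: compare every pair of keywords and merge pairs that share enough URLs.
    for a, b in combinations(urls_by_keyword, 2):
        shared = set(urls_by_keyword[a]) & set(urls_by_keyword[b])
        if len(shared) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for kw in urls_by_keyword:
        groups.setdefault(find(kw), []).append(kw)

    # Step 4: name each group after its highest-volume keyword.
    return {
        max(members, key=lambda kw: volume_by_keyword[kw]): sorted(members)
        for members in groups.values()
    }


# Hypothetical data: the first two keywords share 3 URLs, the third shares none.
urls = {
    "running shoes": ["a.com", "b.com", "c.com", "d.com", "e.com"],
    "best running shoes": ["a.com", "b.com", "c.com", "x.com", "y.com"],
    "trail maps": ["m.com", "n.com", "o.com", "p.com", "q.com"],
}
volumes = {"running shoes": 900, "best running shoes": 400, "trail maps": 250}
clusters = cluster_keywords(urls, volumes, threshold=3)
print(clusters)
```

In this sketch a keyword that matches nothing forms a group of its own; the tool reports such keywords as unclustered instead.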

Setting the Threshold

You can choose how many common URLs two keywords need to share to be grouped together:

  • 2 URLs - Larger, broader clusters
  • 3 URLs - Balanced clustering
  • 4 URLs - More specific clusters
  • 5 URLs - Very tight, specific clusters (all 5 URLs must match)
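To see why a lower threshold gives broader clusters, the sketch below counts how many keyword pairs qualify at each setting. The keywords kw1 to kw4 and the helper linked_pairs are hypothetical, and single letters stand in for URLs.

```python
from itertools import combinations

# Hypothetical keywords, each with 5 ranking URLs (one letter per URL).
urls_by_keyword = {
    "kw1": set("abcde"),
    "kw2": set("abcdx"),  # shares 4 URLs with kw1
    "kw3": set("abxyz"),  # shares 2 URLs with kw1 and 3 with kw2
    "kw4": set("abcde"),  # identical to kw1
}


def linked_pairs(urls_by_keyword, threshold):
    """Return the keyword pairs that share at least `threshold` URLs."""
    return [
        (a, b)
        for a, b in combinations(sorted(urls_by_keyword), 2)
        if len(urls_by_keyword[a] & urls_by_keyword[b]) >= threshold
    ]


for threshold in (2, 3, 4, 5):
    pairs = linked_pairs(urls_by_keyword, threshold)
    print(threshold, len(pairs), pairs)
```

With this data, 6 pairs qualify at a threshold of 2, 4 pairs at 3, 3 pairs at 4, and only the identical pair (kw1, kw4) at 5.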

Output Format

The tool generates a new CSV file with the following columns:

Cluster Information

cluster_id - Numeric identifier for each cluster (0 for unclustered keywords)

cluster_silo - Main keyword of the cluster (keyword with highest search volume)

keyword - The individual keyword being analyzed

Search Volume Data

Volume - Monthly search volume for this keyword
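As a rough illustration of the output layout, the sketch below writes the four columns described above. The function write_clusters is hypothetical, as is the "link building" keyword. Two details are assumptions not stated in this guide: clusters are numbered from 1, and an unclustered keyword uses itself as its cluster_silo.

```python
import csv
import io

FIELDNAMES = ["cluster_id", "cluster_silo", "keyword", "Volume"]


def write_clusters(clusters, volume_by_keyword):
    """Serialize {silo: [keywords]} to CSV text using the columns above.

    Assumptions: a single-keyword group counts as unclustered (cluster_id 0)
    and real clusters are numbered from 1 in the order given.
    """
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    next_id = 1
    for silo, keywords in clusters.items():
        if len(keywords) > 1:
            cluster_id = next_id
            next_id += 1
        else:
            cluster_id = 0  # unclustered keyword
        for kw in keywords:
            writer.writerow(
                {
                    "cluster_id": cluster_id,
                    "cluster_silo": silo,
                    "keyword": kw,
                    "Volume": volume_by_keyword[kw],
                }
            )
    return out.getvalue()


clusters = {
    "seo guide": ["seo guide", "seo tips"],
    "link building": ["link building"],  # hypothetical unclustered keyword
}
volumes = {"seo guide": 1500, "seo tips": 1000, "link building": 700}
output_text = write_clusters(clusters, volumes)
print(output_text)
```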

Example

Consider these keywords:

keyword,URL,Volume
seo tips,example.com/seo-guide,1000
seo tips,example.com/best-practices,1000
seo tips,example.com/optimization,1000
seo tips,example.com/ranking-factors,1000
seo tips,example.com/checklist,1000
seo guide,example.com/seo-guide,1500
seo guide,example.com/best-practices,1500
seo guide,example.com/optimization,1500
seo guide,example.com/tutorial,1500
seo guide,example.com/basics,1500

With a threshold of 3, these keywords would be clustered together because they share 3 URLs (example.com/seo-guide, example.com/best-practices, and example.com/optimization). The cluster would be named "seo guide" because that keyword has the higher search volume (1500 > 1000).
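The example can be checked by hand or with a short script. The sketch below parses the sample rows, counts the shared URLs, and picks the higher-volume keyword as the name. It illustrates the reasoning only and is not the tool's own code; the variable names are hypothetical.

```python
import csv
import io

EXAMPLE = """keyword,URL,Volume
seo tips,example.com/seo-guide,1000
seo tips,example.com/best-practices,1000
seo tips,example.com/optimization,1000
seo tips,example.com/ranking-factors,1000
seo tips,example.com/checklist,1000
seo guide,example.com/seo-guide,1500
seo guide,example.com/best-practices,1500
seo guide,example.com/optimization,1500
seo guide,example.com/tutorial,1500
seo guide,example.com/basics,1500
"""

urls, volumes = {}, {}
for row in csv.DictReader(io.StringIO(EXAMPLE)):
    urls.setdefault(row["keyword"], set()).add(row["URL"])
    volumes[row["keyword"]] = int(row["Volume"])

shared = urls["seo tips"] & urls["seo guide"]     # URLs ranking for both keywords
meets_threshold = len(shared) >= 3                # threshold of 3 from the example
cluster_name = max(volumes, key=volumes.get)      # keyword with the highest volume

print(sorted(shared))
print(meets_threshold, cluster_name)
```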