
Topic Modeling and Natural Language Processing (NLP) for SEO: Why They are a Competitive Advantage Today

Emiliano Sammassimo


Lately, there has been a lot of talk about Artificial Intelligence, Deep Learning, and Natural Language Processing in connection with SEO. Or rather, in connection with Google, the search engine that we, as search professionals, work with most often.

In particular, in October 2019 Google launched BERT, an algorithm update that introduces a bidirectional model for natural language understanding. Google’s stated goal is clear from the title of its announcement: "Understanding searches better than ever before".


Google is primarily concerned with language understanding, whether the query is typed on a keyboard (on a computer or smartphone) or spoken aloud. Users ask questions and Google must find the most appropriate answer. Natural Language Processing (NLP) is thus a fundamental part of Google’s search today, and understanding it gives SEO professionals an additional tool for structuring medium- to long-term strategies.

Topic Modeling: what it is

Now that we have clarified the context and how artificial intelligence helps Google algorithms understand user queries, let’s delve a little deeper by introducing and clarifying some fundamental concepts and specific techniques that can help SEO professionals in their daily work.

The first concept to consider is Topic Modeling, a type of statistical model that aims to extract topics from a collection of texts or documents. To understand the context, it may be helpful to identify the difference between topic modeling and topic classification.

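The first, topic modeling, is an unsupervised machine learning technique: it discovers topics without any labeled examples. The second, topic classification, is supervised: it learns from documents that have already been tagged with predefined topics.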

Another substantial difference concerns the output: topic classification typically assigns each document to one of a set of mutually exclusive classes, while topic modeling can associate the same document with several topics, each with its own probability.
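To make the "several topics, each with its own probability" idea concrete, here is a minimal sketch of one common topic model, Latent Dirichlet Allocation (LDA), fitted with collapsed Gibbs sampling. The article does not name a specific algorithm, so LDA is an assumption chosen for illustration. The function name `fit_lda`, the sample documents, and the hyperparameter values are hypothetical. In practice a library such as scikit-learn or gensim would normally be used instead of hand-written sampling.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit a tiny LDA model with collapsed Gibbs sampling.

    docs is a list of token lists. Returns (theta, top_words), where
    theta[d][k] is the estimated probability of topic k in document d.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                       # tokens assigned to topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    # Resample each token's topic given all the other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic distribution (each row sums to 1).
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    top_words = [[w for w, _ in topic_word[t].most_common(3)]
                 for t in range(n_topics)]
    return theta, top_words


# Hypothetical, already tokenized documents (no labels are provided).
docs = [
    "google bert search query language understanding".split(),
    "search query language google bert model".split(),
    "backlink anchor domain authority link outreach".split(),
    "link outreach backlink google search domain".split(),
]
theta, top_words = fit_lda(docs, n_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
```

Each printed row is a probability distribution over topics for one document, which is what distinguishes this output from a single class label. The topics themselves are unnamed: interpreting them (for example by reading `top_words`) is left to the analyst.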

Topic Modeling and SEO

Once the meaning of Topic Modeling and its difference from Topic Classification are clear, it becomes easier to see why this area of machine learning and natural language processing can be useful for SEO.

Example 1: Classifying your own content

The first example of using Topic Modeling involves an editorial website, whether a news site or a corporate blog, with a large archive of content written over the years that follows the various trends of interest in its sector(s).

The content can be run through a topic model to obtain a macro-topic classification for every piece published over the years. Various analyses can then be carried out on the visibility of individual topics, from the most visible to those that are a priority for the business, in order to plan specific actions for each topic.
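As a rough illustration of the visibility analysis described above, the sketch below weights a visibility metric by each article's topic probabilities and ranks the topics. Everything here is hypothetical: the URLs, the topic names, the use of clicks as the visibility metric, and the probabilities, which are assumed to come from a previously fitted topic model.

```python
from collections import defaultdict


def visibility_by_topic(articles):
    """Sum each article's clicks across its topics, weighted by topic probability."""
    totals = defaultdict(float)
    for article in articles:
        for topic, prob in article["topics"].items():
            totals[topic] += prob * article["clicks"]
    # Most visible topic first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical articles: topic probabilities are assumed model output.
articles = [
    {"url": "/blog/a", "clicks": 1200,
     "topics": {"technical seo": 0.75, "content": 0.25}},
    {"url": "/blog/b", "clicks": 400,
     "topics": {"content": 1.0}},
    {"url": "/blog/c", "clicks": 200,
     "topics": {"link building": 0.5, "content": 0.5}},
]

ranking = visibility_by_topic(articles)
for topic, clicks in ranking:
    print(f"{topic}: {clicks:.0f}")
# prints:
# technical seo: 900
# content: 800
# link building: 100
```

Weighting by probability is only one possible design choice. Assigning each article entirely to its most probable topic would be simpler, at the cost of ignoring articles that genuinely span several topics.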

Example 2: Classifying the competition’s content

The second example is entirely similar to the first in both techniques and conclusions. Classifying the content of competitor websites makes it possible to compare it with our own and to find, with the help of artificial intelligence, topics not yet covered on our digital properties as well as areas for improvement. With a topic model in place, competitor monitoring can also be made systematic and refreshed at fixed intervals, both to track the results of the activities we have carried out and to track our position relative to the competition.
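The comparison with competitors can be sketched as follows, assuming both sites have been scored by the same topic model so that topic indices are comparable. The function names, the `min_gap` threshold, and the probability values are hypothetical and only illustrate the idea of a coverage gap.

```python
def topic_share(doc_topic_probs):
    """Average per-document topic probabilities into one share per topic."""
    n_docs = len(doc_topic_probs)
    n_topics = len(doc_topic_probs[0])
    return [sum(doc[t] for doc in doc_topic_probs) / n_docs
            for t in range(n_topics)]


def coverage_gaps(ours, theirs, min_gap=0.1):
    """Return (topic index, gap) where the competitor's share exceeds ours."""
    gaps = [(t, theirs[t] - ours[t]) for t in range(len(ours))]
    gaps = [(t, gap) for t, gap in gaps if gap >= min_gap]
    # Largest gap first: these are the candidate topics to work on.
    return sorted(gaps, key=lambda item: item[1], reverse=True)


# Hypothetical per-document topic probabilities from a shared model.
our_docs = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
competitor_docs = [[0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]

ours = topic_share(our_docs)           # [0.75, 0.25, 0.0]
theirs = topic_share(competitor_docs)  # [0.0, 0.25, 0.75]
gaps = coverage_gaps(ours, theirs)
print(gaps)  # [(2, 0.75)]
```

Re-running the same comparison on a schedule, with the date stored alongside each result, is one way to obtain the periodic monitoring described above.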

Example 3: Classifying the most viewed content

A topic model can also be used to classify the most viewed online content about a particular subject. For example, one could monitor a set of search queries, selected by average monthly search volume, and extract the results from the SERPs, including the classic blue links, the News Box, and the Answer Box where available.

All results are extracted and saved in a database, together with the originating query and the check date. After a defined data collection period, an in-depth analysis can show which players appear most often in the various Google search features and, through Topic Modeling, which content topics receive the most visibility. This approach can be extended to any keyword set, language, and location.
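A minimal sketch of such a database is shown below, using SQLite from the Python standard library. The table name, column names, feature labels, and sample rows are all assumptions made for illustration. How the SERP data is actually collected is outside the scope of this sketch.

```python
import sqlite3

# In-memory database for the example; a file path would be used in practice.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE serp_results (
        query      TEXT NOT NULL,     -- originating search query
        check_date TEXT NOT NULL,     -- ISO date of the check
        feature    TEXT NOT NULL,     -- e.g. organic, news_box, answer_box
        position   INTEGER NOT NULL,  -- rank within the feature
        domain     TEXT NOT NULL,
        url        TEXT NOT NULL
    )
    """
)

# Hypothetical rows from two weekly checks of one query.
rows = [
    ("topic modeling seo", "2024-01-01", "organic", 1, "example.com", "https://example.com/a"),
    ("topic modeling seo", "2024-01-01", "organic", 2, "example.org", "https://example.org/b"),
    ("topic modeling seo", "2024-01-01", "news_box", 1, "example.org", "https://example.org/n"),
    ("topic modeling seo", "2024-01-08", "organic", 1, "example.com", "https://example.com/a"),
    ("topic modeling seo", "2024-01-08", "answer_box", 1, "example.com", "https://example.com/a"),
]
conn.executemany("INSERT INTO serp_results VALUES (?, ?, ?, ?, ?, ?)", rows)

# Which players appear most often in each search feature?
presence = conn.execute(
    """
    SELECT feature, domain, COUNT(*) AS appearances
    FROM serp_results
    GROUP BY feature, domain
    ORDER BY feature, appearances DESC, domain
    """
).fetchall()

for feature, domain, appearances in presence:
    print(feature, domain, appearances)
```

The text of the collected pages could then be passed to a topic model, and the resulting topic labels joined back to this table by URL to see which topics gain the most visibility per feature.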


We have seen how Topic Modeling, among the various natural language processing techniques, can be applied to SEO analysis, both to monitor our website and the competition, and to analyze a large amount of data to support strategies.

This is one of the fundamental points: today, artificial intelligence gives us the opportunity to automatically analyze large amounts of data, even when it is integrated from different sources.

The time once spent collecting and assembling data can instead be devoted to non-repetitive, strategic activities that add real value to the projects entrusted to us.