AI-powered semantic SEO webinar featuring Koray Gubur, with guest Robert Niechcial, host Olesia Korobka, and Sherlock the Cat, on the Duda platform

AI-Powered Semantic SEO with Koray Gubur

AI-Powered Semantic SEO, also known as semantic search optimization, is a more advanced approach to Search Engine Optimization (SEO). It leverages the power of Artificial Intelligence (AI) and semantic technology to improve the accuracy and relevance of search results by understanding user intent and the contextual meaning of terms as they appear in the searchable dataspace.

The topic is so expansive that a detailed explanation wouldn’t fit within the scope of one webinar. Koray Gubur touched upon a few important points, which Robert Niechcial complemented with his own input. Sherlock the cat also had a few meows to add, but not more than that. I suggest you watch the whole webinar on YouTube: https://www.youtube.com/watch?v=81pe-YM9iRI

Koray Gubur presentation or download it here:

Robert Niechcial presentation or download it here:

Key Takeaways. Everything Below Is My Personal Understanding

Cost of Ranking a Website Can’t Be Higher Than Cost of Not Ranking a Website

First introduced by Koray during the Saigon SEO Mastery Summit in March 2023, this principle ties cost to Topical Authority. While content can exhibit quality (through relevance and responsiveness), the expenses tied to crawling, indexing, and ranking (or retrieval) can be too steep. Google doesn’t rank you out of fondness. It will rank you only if doing so is profitable for it. Remember, Google is an advertising company.

This explains why larger websites, such as Amazon, often outrank smaller ones, even when those larger sites are Google’s competitors. Larger sites answer millions of queries, so it is more cost-efficient for Google to serve their content across numerous results than to use millions of smaller websites, each as the source of an answer for just one query among the millions.

Emphasizing Query Responsiveness

To put it in the simplest of terms, your answer to a query should be not only relevant but also appropriately structured, and it should include the key terms from the query. There must not be a gap between a trigram extracted from a query and your response to it. You can delve deeper into this topic in the patent: Scoring candidate answer passages.

Figure: Example of one of the figures from the patent mentioned above (Scoring candidate answer passages), about the distance from Earth to the Moon.
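To make the idea of a "gap" between query and answer more tangible, here is a minimal sketch that checks how many query terms and query trigrams show up verbatim in an answer. This is only a rough proxy of my own, not the scoring method from the patent; the function names and the sample query and answer are assumptions made up for illustration.

```python
import re


def tokens(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(words, n=3):
    """Return the set of consecutive n-word tuples in a token list."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def query_term_coverage(query, answer):
    """Share of distinct query terms that also appear in the answer."""
    query_terms = set(tokens(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokens(answer))) / len(query_terms)


def shared_trigrams(query, answer):
    """Trigrams that occur verbatim in both the query and the answer."""
    return ngrams(tokens(query)) & ngrams(tokens(answer))


# Hypothetical query and answer passage, chosen to match the figure's topic.
query = "distance from earth to moon"
answer = "The average distance from Earth to the Moon is about 384,400 km."

print(query_term_coverage(query, answer))
print(sorted(shared_trigrams(query, answer)))
```

An answer that covers every query term and keeps the query's word sequences close together scores higher on this proxy; a real system would of course weigh far more signals.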

Optimizing an LLM for SEO

This process can be broken down into seven key steps:

  1. Fine-tune an LLM.
  2. Develop a Topical Map.
  3. Create a Semantic Content Network.
  4. Generate Content.
  5. Incorporate Human Involvement.
  6. Improve your Knowledge Base.
  7. Transform your website into a Speaking AI.

To fully or partially execute these steps, you’ll first need to establish a robust knowledge base and then proceed to verbalize it. It’s crucial to safeguard this information, keeping it private. Since semantics are language-agnostic, feel free to utilize different languages for embeddings.

Optimizing small things sitewide makes a great difference.

Use of Language and Context to Improve SEO


Multi-chain reasoning is a concept in AI and ML that involves linking together a series of facts, deductions, or inferences to arrive at a conclusion. With multi-chain reasoning, you help a search engine extrapolate meaning from the content provided. For example, when you mention the Buckingham Palace gift shop, it is implied that the prices are in pounds.

The trick here is to keep your content as clear as possible. Ambiguity makes it difficult for search engines to understand your content and may lead them to use more expensive algorithms or even give up indexing your page. Google operates different tiers of servers for websites, with higher-quality websites being served from better servers to ensure their preservation.

Use language embeddings, which are numerical representations of language. Texts with similar meanings produce embeddings that sit close together even if they’re written in different languages, because search engines are capable of understanding semantics across languages. This idea is visualized through embedding projections, which can help improve your content by identifying relevant connections.

When creating a topical map, crawl a website and create an embedding projection. The distance between embeddings tells you how relevant the pages are to each other. If you have different contexts on your website, it’s necessary to cover the intersecting areas between them. Google assumes that if you want to create a website that covers different domains, you should also touch upon the intersecting areas.
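As a minimal sketch of reading relevance from embedding distance, the snippet below ranks pages by cosine similarity to a target page. The URLs and the three-dimensional vectors are hypothetical placeholders; in practice the vectors would come from whichever embedding model you use on the crawled pages.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical page embeddings; real ones have hundreds of dimensions.
page_embeddings = {
    "/ultrasonic-cleaners": [0.9, 0.1, 0.0],
    "/ultrasonic-cleaners-for-cars": [0.7, 0.6, 0.1],
    "/car-detailing": [0.1, 0.9, 0.2],
}


def rank_by_similarity(target, embeddings):
    """Sort all other pages by similarity to the target page, closest first."""
    target_vec = embeddings[target]
    scores = [
        (url, cosine_similarity(target_vec, vec))
        for url, vec in embeddings.items()
        if url != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


for url, score in rank_by_similarity("/ultrasonic-cleaners", page_embeddings):
    print(f"{url}: {score:.3f}")
```

In this toy data the page that sits between the two contexts ranks closest to the target, which is the kind of intersecting area the paragraph above says you should cover.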

Koray introduced concepts of macro and micro contexts within documents, using the example of the words “ultrasonic”, “cleaners”, and “cars”. If you use quotation marks to search both “hexagon” and “ultrasonic”, the main context is the “ultrasonic wave type”.

NLP and Relevance Configuration for SEO

Fine-tune your selected pre-trained language model to perform the specific tasks you need. You can use Google’s Learning Interpretability Tool to manipulate word relationships within the model. By changing the distance between different concepts, you can influence how the model understands their connections.

Always remember, it’s not only about what you say but also how you say it. Sentence structure and word order influence search engine classification and understanding. Look at it as a math exercise, because you are dealing with algorithms. Don’t blindly follow all those recommendations to focus on “great content”. It doesn’t work this way.

What is relevance configuration? The idea here is that the same message can be conveyed using different sentence structures and word orders. Depending on the search query you’re targeting, you might prefer one structure over another. For example, if you’re targeting a search query involving “financial advisor”, you might want to use a sentence structure that places more emphasis on “financial advisor” rather than “families.”

Utilize AI Correctly

Robert suggests viewing AI, specifically large language models like GPT-4, as a powerful language processor rather than a source of knowledge. Because these models are based on probability, they tend to produce content of similar quality, causing an “inflation” of content on the internet.

One way to make these models more effective is by using a feedback-based, iterative procedure. This involves providing a prompt, receiving output from the AI, providing feedback based on the output, and then refining the prompt based on the feedback.
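The loop described above can be sketched roughly as follows. This is only an illustration under my own assumptions: `generate` and `review` are placeholders for a real model call and a real (human or automated) reviewer, and the `fake_*` functions exist solely so the example runs without any API.

```python
def refine_prompt(prompt, feedback):
    """Append reviewer feedback to the prompt as an extra instruction."""
    return f"{prompt}\nRevision note: {feedback}"


def iterate(prompt, generate, review, max_rounds=3):
    """Run prompt -> output -> feedback -> refined prompt until review passes.

    generate(prompt) returns text; review(output) returns feedback text,
    or None when the output is acceptable.
    """
    output = generate(prompt)
    for _ in range(max_rounds):
        feedback = review(output)
        if feedback is None:
            break
        prompt = refine_prompt(prompt, feedback)
        output = generate(prompt)
    return prompt, output


# Hypothetical stand-ins for a language model and a reviewer.
def fake_generate(prompt):
    if "mention the price" in prompt:
        return "Draft with price"
    return "Draft"


def fake_review(output):
    return None if "price" in output else "mention the price"


final_prompt, final_output = iterate(
    "Write a product summary.", fake_generate, fake_review
)
print(final_prompt)
print(final_output)
```

The `max_rounds` cap is a design choice of the sketch: without it, a reviewer that is never satisfied would keep the loop running forever.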

However, because GPT-4’s training data is outdated, you need to feed it the latest information and data structures. Instead of relying solely on the model’s built-in training data, it’s advised to supplement it with extra data for better output.

Generate Human-Like Content

  1. Extract important information from existing content, like a product description, to inform the AI about the topic.
  2. Use this data as input for the AI model, asking it to generate a review or similar content.
  3. The more detailed and specific your input data, the higher quality output you can expect from the AI.
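A rough sketch of steps 1 and 2, assuming the extracted information is already available as key-value pairs: the function below turns those facts into a prompt you could pass to a language model. The product data and the prompt wording are hypothetical, not something shown in the webinar.

```python
def build_review_prompt(product):
    """Turn extracted product facts into a prompt for a language model."""
    facts = "\n".join(f"- {key}: {value}" for key, value in product.items())
    return (
        "Using only the facts below, write a short, natural-sounding review.\n"
        "Facts:\n"
        f"{facts}\n"
        "Do not add details that are not listed."
    )


# Hypothetical facts extracted from a product description.
product = {
    "name": "Example ultrasonic cleaner",
    "tank size": "2 liters",
    "use case": "cleaning car parts",
}

prompt = build_review_prompt(product)
print(prompt)
```

Adding more fields to `product` is the code-level version of step 3: the more specific the facts, the less the model has to guess.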

This process, if done at scale, can generate a lot of content quickly. However, review your goals before you do that. The aim isn’t just to produce content, but to satisfy user intent – the real reason someone is searching for something.

Data Sources to Extract Information and Feed into AI for Content Generation

There are lots of them. Robert specifically mentioned the following:

  1. Apify: This service aims to make an API for each website in the world, providing numerous data sources.
  2. Rapid API: It offers a large amount of data to build your knowledge base.
  3. Google: By extracting “People Also Ask” and website heading information, you can understand how Google connects related questions and content.

Build your keyword embeddings from Google results, not from other models. You can cluster “People Also Ask” questions from Google and extend them to teach Google about new associations between questions.
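To show what clustering such questions might look like, here is a deliberately simple sketch that groups questions greedily by word overlap (Jaccard similarity). Word overlap is my stand-in so the example runs without any model; with real embeddings you would swap `jaccard` for a vector similarity. The questions and the 0.4 threshold are hypothetical.

```python
import re


def token_set(question):
    """Lowercased set of word tokens in a question."""
    return set(re.findall(r"[a-z0-9]+", question.lower()))


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_questions(questions, threshold=0.4):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []
    for question in questions:
        q_tokens = token_set(question)
        for cluster in clusters:
            if jaccard(q_tokens, token_set(cluster[0])) >= threshold:
                cluster.append(question)
                break
        else:
            # No existing cluster matched, so this question seeds a new one.
            clusters.append([question])
    return clusters


# Hypothetical "People Also Ask" style questions.
questions = [
    "What is an ultrasonic cleaner?",
    "What is an ultrasonic cleaner used for?",
    "How much does car detailing cost?",
    "How much does full car detailing cost?",
]

for group in cluster_questions(questions):
    print(group)
```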

The larger the input, the better: more detailed inputs lead to better outputs.

Be an SEO Engineer

It’s now easier to understand and manipulate Google’s SERPs due to advancements in technology and data accessibility. SEO Engineers can even create their own search engines using databases like Common Crawl and specific relevance algorithms.

An engineering mindset is key for SEO practitioners, as it provides a competitive edge and benefits agencies and communities in the SEO field.

Training and Fine-Tuning Your Model

Among your competitors, choose a website that Google uses as a benchmark for your niche. It should not be affected by updates or other algorithm changes. Scrape it and train your model on its content. You can use OpenAI or other tools for embeddings and fine-tuning. Adjust your style and vocabulary to imitate your most successful competitor. Create templates for each type of question your user might have. Use language models to fill in specific words within these templates. Also use NLP for better results. While LLMs and knowledge graphs are language-agnostic, word order and context can differ between languages.
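The template idea can be sketched with nothing more than Python’s standard `string.Template`. The question types, the template wording, and the slot names below are my own assumptions for illustration; in the workflow described above, a language model would supply the slot values.

```python
from string import Template

# Hypothetical answer templates, one per question type.
ANSWER_TEMPLATES = {
    "definition": Template("$entity is $category that $function."),
    "cost": Template("$entity costs $price on average."),
}


def fill_template(question_type, **slots):
    """Fill the template for a question type; raises KeyError if a slot is missing."""
    return ANSWER_TEMPLATES[question_type].substitute(**slots)


sentence = fill_template(
    "definition",
    entity="An ultrasonic cleaner",
    category="a device",
    function="uses sound waves to remove dirt",
)
print(sentence)
```

Keeping the sentence structure fixed in the template and varying only the slots makes it easy to compare different structures for the same type of question.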

To create topical maps, you need to consider which concepts need individual pages and which ones can be grouped together.

Use NLP tools such as Siamese BERT networks with the Sentence-BERT framework to optimize answer formats and sentence structures for different types of questions. These tools can score your answers, allowing you to understand which sentence structure yields the best results.

Backlinks

When semantic SEO and topical authority methods are combined with backlinks, the cost of link building can be significantly lowered and it can be easier to rank specific documents. Unnatural link building might also require continuous additions to seem natural, which can be a burden.

If you apply good semantic SEO, you can build fewer links, and build them less frequently, while still dominating the niche. Link building is essentially a budget game – the more resources you have, the more links you can build. However, the industry is moving toward a future where user signals may be more valuable than link signals. This means that the focus should be on quality content, accurate semantics, and strong user signals, rather than just increasing the quantity of backlinks.

Instead of Conclusion

  1. Focus on prompts: When dealing with language models, it’s essential to work at the prompt level: give the model as much knowledge as possible on the input side, and then expect good results on the output side. It’s better to experiment with the input and focus on quality over quantity than to expect results from a couple of random sentences.
  2. Understand fundamental search engine needs: Instead of just imitating the best-ranking web pages, it’s essential to understand what search engines fundamentally need, especially for the future ecosystem of search. This understanding will be beneficial for a long time.
  3. Consider cost: It’s important to think about what it costs Google, Bing, and you to see your website. Consider where you invest your time, because your time also costs something.
