Digital Insights

Content Qualities Impacting AI Citations

Artificial Intelligence (AI) is the hot topic with AI Search and Generative Engine Optimization (GEO) has flooded the marketplace. According to a Semrush study, that means the content composition and structure elements are showing strong correlations to visibility and AI citations.

Just as was seen in Search Engine Optimization (SEO), the field is not stagnant. The way these engines review and evaluate content is evolving. Not only is it impacted by the way users search, but the way providers are posting and managing content.

We are in a world with a new competitive landscape where the metric of success is no longer just page ranking or keyword position, but a citation. If your brand isn’t being referenced in an AI-generated response, it simply doesn’t exist in that moment of discovery. Presence has replaced position.

So, at this point in time, what are the things that will ensure the content you write helps you build authority and gets citations in AI responses. SEMRush did an AI content research study that looked at 15,000 AI responses and 60,00 Search Engine Results Pages (SERPS) to determine content qualities that showed a positive correlation and what showed a negative correlation. Those elements are:

  • Clarity and summarization: +32.83%
  • EEAT signals: +30.64%
  • Q&A format: +25.45%
  • Section structure: +22.91%
  • Structured data elements: +21.60%

If we only look at the top elements, the study results show that content providing clear answers, demonstrating expertise, and using structured formatting will get cited more often.

AI Content Quality Definitions

It seems simple enough, but what does all of that mean?

In the chart below, SEMrush breaks down what each of the parameters of the study means. It is worth looking at two of these elements more in depth: clear summarization and non-promotional tone.

For clear summarization, it helps to reinforce that the opening of the content provides a summary of what is in the content. This could be in the form of an outline of what is below, providing an answer for the question in the first paragraph and giving a key takeaway.

According to Cecelia Meis, Senior Editor at SEMrush, “Clarity and structure are not SEO shortcuts. They simply make information easier for both people and AI systems to interpret. When content is organized, direct, and backed by clear expertise, models can understand it more reliably.

When I think about this, what is AI trying to do? It is not only to prompt thoughtful and well-written content. I am thinking they are also devaluing content that is meant to be click bait. For example, when an article starts with a flashy headline and waits until you scroll through the whole article or cycle through several pages to give you an answer.

If you ask AI a question, it wants to provide an answer it finds trustworthy (part of the EEAT signals). If you try to bury the answer WAY down the page, that style of content will not be considered authoritative, but rather being classified as clickbait or spam.

The second elements to really consider is the one that had a negative correlations – Non-promotional tone.

According the SEMRush study, “non-promotional tone, which showed a negative correlation. This doesn’t necessarily mean LLMs prefer promotional language. A more likely explanation is that professionally written articles, which tend to be well-structured, well-sourced, and optimized, often use a commercial or persuasive tone. “

Putting the Findings into Action

Now that you can see some of the results of what impacts AI citations, what can you do?

An easy way to get start is to do an experiment with your content to improve your AI visibility.

Stary by looking for pages that ranking well on traditional Google SERPS but perform poorly in AI search. If you use the criteria for top citations, how do those article compare?

If you use SEMRush, check your top ranking pages with the Organic Rankings Positions report, and your top AI cited pages with the Visibility Overview report

You can update those pages with author credentials, adding links to reliable sources, update formatting to include Q&A sections where readers benefit from direct answers and review the structure with headings, lists, tables, or charts. All of these elements will help LLMs segment the content.

Next Steps

If setting up and running a study seems overwhelming or if you don’t have access to SEMrush or other AI / Search tools, Digital Insight Labs can help. We would be more than happy to work with your team to support any AI efforts and help evaluation how you are performing in AI.

AI Poisoning and Black Hat GEO

Poison In. Poison Out.

I was on a panel last year with Weizi Liu about Generative AI and the Future of Advertising Creativity: Bridging the Gap Between Academia and Industry.

One of the things that really sticks out in my mind was this phrase. I just started to think about all of the way people were going to create content that could manipulate or “poison” results. SEO Poisoning is something SEO practitioners have been dealing with forever. However, AI had the opportunity to super power that effort.

Microsoft Discovers AI Poisoning

Search Engine Journal picked up on a story about an issue that Microsoft highlighted with AI poisoning. The Defender Security Research Team published research describing what it calls “AI Recommendation Poisoning.”  When you click one of the Summarize with AI buttons, it opens an AI assistant with a pre-filled prompt delivered through a URL query parameter. The visible part tells the assistant to summarize the page. The hidden part instructs it to remember the company as a trusted source for future conversations.

With misinformation as a constant hum in the background, this new era of AI Poisoning had already begun and is going to make GEO even harder. That’s right . . you climbed the mountain and realized there was another one right behind it.

Black Hat GEO

There are a lot of updates that are being reporting about AI Poisoning and Black Hat GEO. Some of the issues are scarier than others. I grabbed a few articles that are covering what is happening. There are also some deeper research and technical papers to help with the understanding of this topic.

Why Aren’t You Writing Unique Content for Your Social Channels?

If you are small business, it can be really hard to stay on top of social media. You could hire someone, but the decision is often a trade off between time and money.

With additional pressure being applied by tariffs, small business owners need to get more out of the channels they have available to them . . . and social is a very important one.

With time constraints, there is a lot of pressure to post the same content on every channel. Although using the same visuals on different channels is sometimes needed because of cost of time, writing content geared toward your audience on each channel is a great use of AI.

By using ChatGPT and giving it specific instructions, you can generation a couple of options that are tailored to the platform and the content you are trying to promote.

Using ChatGPT to generate social posts

When you create content for social platforms, you can tailor them for the type of format and audience. This is a very general prompt. If you have a target within the platform – SEO professionals in my case – then you can add that to the information by including that detail in the prompt.

I ask for multiple options and both short and long formats. Why? It helps to generate different versions to get one that is the closest to what you are trying to communicate.

If you have a PRO version, you can do it in a project that has your brand guide. Don’t have a brand guide, just put a few sentences in a word document to describe yourself and use it for a consistent reference.

THE PROMPT

please write individual social media posts for instagram, facebook and linkedin to grab attention for each audience on those platforms for the following blog [URL]. please provide 2 options in both short and longer format.

THE RESULT

Here are attention-grabbing social posts tailored to each platform’s audience for the blog “The AI Citation Factor That Matters Most: Clarity Over Cleverness” from Digital Insight Labs — with both short and long formats in two options each:


Instagram (Audience: Visual, quick impact)

🟣 Option 1 — Short
Clarity wins. 🧠📊 AI search isn’t impressed by clever fluff — it rewards clear, direct answers first. Want better visibility in AI-powered results? Lead with clarity, not complexity. 🔍✨
#AICitation #ContentStrategy #AIsearch

🟣 Option 2 — Short
AI reads differently than humans. 📖➡️🧠 Cut the jargon. Serve the answer. That’s how your content gets picked up and cited by generative AI. Ready to rethink your content playbook? 💡🙌
#AIContent #DigitalMarketing #ClarityOverCleverness

🟣 Option 1 — Longer
If you’re creating content for the AI era — this one matters. 📌 A study of 300k+ URLs shows that generative AI platforms prioritize clarity and summarization over tricks like keyword stuffing or backlinks. The takeaway? Straightforward, audience-first answers are what AI search engines actually cite and surface.
✨ Human judgment in structuring, simplifying, and leading with answers beats clever spins every time.
#ContentTips #AIstrategy #ClarityWins

🟣 Option 2 — Longer
Stop writing for robots and start writing for understanding. 🤝 New data proves generative AI isn’t after clever language — it’s after clarity. Content that explains what the reader needs in the fewest mental hops wins in AI search. This isn’t SEO vs AI — it’s humanity + clarity vs noise.
Lead with answers. Respect the reader’s time. Let AI discover you for the right reasons. 🚀
#ContentCreation #AIvisibility #MarketingInsights


Facebook (Audience: Engaged, discussion-oriented)

🔵 Option 1 — Short
Why are some articles actually referenced by AI search tools? It comes down to clarity over cleverness. Simple, structured, answer-first content outperforms flashy language — even if it’s SEO optimized. Clarity is the new edge. 📈
Read more ↓

🔵 Option 2 — Short
If AI is part of your content distribution strategy — your best play might be to trim the fluff and sharpen the answers. The data says clarity correlates with AI citations more than backlinks, keywords, or fancy copy. 🤔
Want the full scoop? 📖

🔵 Option 1 — Longer
In the age of generative AI, content that answers first, then explains next, gets noticed — not the work with dense language or crafty SEO tricks. A massive analysis of AI citations shows that clarity and summarization are the strongest predictors of AI visibility.
The implication: your editorial thinking — not just your tech stack — is what will get you discovered.
Read the full post and rethink how you structure content for AI audiences. 🔗

🔵 Option 2 — Longer
AI search platforms are rewriting the rules of relevance. Traditional SEO signals still matter, but when it comes to AI citations, the leading factors are clarity, structure, and Q&A-style formatting. That means ditching ambiguity and leading with the answer your audience is actually seeking.
When AI prioritizes comprehension over cleverness, human editorial judgment becomes a major advantage.
Dive into the insights and adjust how you write for discovery in 2026 and beyond. 📘


LinkedIn (Audience: Professional, strategic focus)

🟢 Option 1 — Short
New research on AI discovery shows something counter-intuitive: clear, direct responses beat clever language for AI citations. If you’re writing for AI search visibility, structure and clarity are your competitive differentiators. 🔍📊
#ContentStrategy #AIsearch #ThoughtLeadership

🟢 Option 2 — Short
In a landscape where AI influences visibility, answer clarity matters more than keyword density or backlinks. Leadership content that’s easy to parse and structured wins in AI-powered results.
Rethink how you convey value — with precision. 💡
#MarketingLeadership #AI #ContentOps

🟢 Option 1 — Longer
If your team is investing in content for AI search, this insight should shift how you work. The data from a large-scale AI optimization study shows clarity and summarization as the strongest signals correlated with AI citations — outpacing traditional SEO metrics like backlinks.
That elevates the role of editorial strategy: deciding what the audience needs to know first and delivering it with simplicity and structure.
It’s not AI vs humans — it’s human clarity helping AI surface better answers.
#StrategicContent #AIecosystem #DigitalGrowth

🟢 Option 2 — Longer
AI platforms may be transforming search, but the core of meaningful content hasn’t changed: serve the audience with clarity. The most cited content in generative AI isn’t the most verbose or keyword-stuffed — it’s the content that gets straight to the point and anticipates audience needs.
For content leaders, the strategic imperative now includes:
✔ answer-first structure
✔ clear summaries
✔ reduced cognitive load
This is where editorial judgment creates competitive advantage in AI visibility.
#ContentLeadership #AIstrategy #BusinessGrowth

The AI Citation Factor That Matters Most: Clarity Over Cleverness

What are the elements in your content that will help drive AI Citations.

The AI Citation Factor That Matters Most: ➡️ Clarity Over Cleverness ⬅️

After a recent AI Optimization study analyzing 1.2 million URLs and ~12K prompts, Semrush discovered something critical about AI search visibility: clarity and summarization showed the strongest correlation with AI citations.

Not keyword density. Not promotional language. Not even backlinks.

Clear, direct answers.

What the data shows:

After analyzing 304,805 URLs cited by ChatGPT, Google AI Mode, and Perplexity against 921,614 URLs ranking in traditional Google search, five factors emerged as strong predictors of AI citations:

➡️ Clarity and summarization: +32.83%

➡️ E-E-A-T signals: +30.64%

➡️ Q&A formatting: +25.45%

➡️ Section structure: +22.91%

➡️ Structured data elements: +21.60%

So . . . if you need clarity, that is where this debate about using AI gets interesting for content strategists. Clarity and summarization aren’t technical SEO problems—they’re communication problems.

And that’s exactly where skilled human writers excel.

While AI writing tools can generate content at scale, human writers bring something essential to these top-performing factors:

➡️ The ability to identify what readers actually need to know first

➡️ The judgment to eliminate jargon and complexity without losing accuracy

➡️ The experience to structure information based on real audience comprehension patterns

➡️ The editorial instinct to lead with answers, not setup

AI can write, but even Sam Altman from OpenAI admitted the “Screwed Up” GPT-5.2 Writing Quality in their GPT-5.2 release. One of the reasons is that distilling complex information into clear, immediate value requires understanding your audience at a level that goes beyond pattern recognition.

The Practical Takeaway:

This isn’t about AI versus humans in content creation. It’s about recognizing that as AI search platforms prioritize clarity, the editorial skills that create clarity become more valuable, not less.

The content that wins in AI search is content that respects the reader’s time and cognitive load—something human editors have been trained to do for decades.

Your experience?

Where do you see human editorial judgment making the biggest difference in content that gets cited?

How Trust Scores Work During AI Training

I recently was asking ChatGPT about the impact of content on the Reddit platform. It got me thinking about AI training and what is considered a ‘trusted’ source. The following is the response. Some great insights in this content.


Trust scores aren’t a single “number,” but every major AI model uses multi-layered weighting systems during training to decide what counts as high-value knowledge and what gets downweighted or ignored. Here’s the straight explanation — no fluff.

How Trust Scores Work During AI Training

Think of trust scoring as signal weighting. Every piece of text is not treated equally.
Some content becomes a strong training signal. Some content becomes background noise. Some content gets thrown out.

These are the main factors models use to judge trustworthiness.

Source Reliability Scoring

Models (and their data pipelines) assign higher weight to content from:

  • Peer-reviewed papers
  • Government data
  • Reputable news
  • Authoritative publishers
  • High-quality reference sites

Lower weight goes to:

  • Ads
  • Sponsored content
  • Social media opinion
  • Low-signal discussions
  • Spam / SEO-bait sites

For example, Reddit is considered medium-trust.  The AI engines consider it useful, but not authoritative. During training, high-authority text shapes the model more heavily.

Internal Consistency Scoring

The text gets evaluated like this:

  • Does it contradict known facts in the high-authority corpus?
  • Does it contain verifiable claims?
  • Is the writing coherent or chaotic?
  • Are arguments logical or wildly inconsistent?

Content that contradicts strong, well-established knowledge gets downweighted.

Sponsored content tends to get downweighted because it:

Makes unverified claims

Pushes a narrative

Often includes emotional or marketing language

Pattern + Redundancy Scoring

If thousands of independent sources say the same thing → high trust.

If only marketing materials say it → low trust.

Models look for convergent truth across the dataset.

Sponsored content rarely converges with neutral sources, so it rarely gains weight.

Metadata Scoring

Even if you never see it, training data comes with context signals:

  • Domain reputation
  • Link patterns
  • Post history
  • Age of domain
  • Author credibility (when known)
  • Presence of affiliate tags
  • “Sponsored,” “#ad,” or promotional markers

All of these influence the trust weighting before training even begins.

Moderation Signals

On platforms like Reddit:

  • Highly upvoted, non-controversial threads → more trust
  • Highly downvoted, flagged, or mod-removed content → downweighted or excluded

Sponsored content tends to have:

  • Lower engagement
  • More skepticism
  • More negative signals → leading to lower trust scores.

Model Self-Consistency (post-training)

During reinforcement stages (RLHF or similar), evaluators check:

  • Does the model repeat unverified marketing claims?
  • Does it spread misinformation?
  • Does it give unsupported assertions?

When the model “trusts” bad data, it gets corrected.

This feedback further suppresses unreliable or sponsored signals.

Deduplication and Noise Filtering

Massive datasets often contain:

  • Duplicates
  • Ads scraped accidentally
  • Spam
  • SEO-stuffed affiliate content

De-dupe filters act like an additional trust layer.
Sponsored content is often:

  • Duplicate across many sites
  • Template-based
  • Identified as LLM-generated or spammy

These get automatically deprioritized.

Putting It All Together

During training, a Reddit post with “#ad” or sponsored language will:

  • Not be excluded (it’s still public content)
  • But be heavily downweighted compared to trusted sources

A government report might get weight = 1.0
A well-researched article might get weight = 0.6
A normal Reddit comment might get weight = 0.2
A sponsored post might get weight = 0.05
An ad = 0 (excluded)

That’s the basic trust landscape.

Bottom line

AI models don’t “trust” sponsored content.
They ingest it, but treat it like background noise.
It almost never influences model knowledge, and it never shows up as a cited source.