The 7 dimensions that make AI visibility measurable

AI visibility isn't a feeling — it's a measurable state. I score a site from zero to one hundred across seven dimensions, each carrying its own weight: accessibility 20%, structured data 15%, answer-ready content 15%, entity 15%, off-site presence 25%, credibility signals 5%, content depth 5%. Under every dimension sit concrete, verifiable checks — not impressions, but facts a machine can confirm. Here's the full framework: what each dimension examines, and what it means for what the buyer actually sees in the AI's answer.

The value of a score is that it makes things comparable. Two competitors and your own site, measured with the same yardstick on the same day — that's how the real difference surfaces. Layer by layer, I'll show you what the seven dimensions examine, why each carries the weight it does, and what moves quickly versus what moves slowly.

What are the 7 dimensions, and why these?

The seven dimensions aren't an arbitrary list. Each answers a question the model quietly asks before it names a company: can I reach the site, do I understand what it offers, can I lift a quotable answer from it, am I certain which company this is, does anyone talk about it elsewhere, does it look trustworthy, and is there real substance behind it. The weights reflect how much each of these questions matters.

| Dimension | Weight | What it means from the buyer's side |
| --- | --- | --- |
| D1 · Accessibility | 20% | Can the AI's crawler even reach the site, and is the content visible without JavaScript. |
| D2 · Structured data | 15% | Does the site name itself in machine-readable form: company name, address, offering, related links. |
| D3 · Answer-ready content | 15% | Can a standalone, quotable sentence be lifted in answer to the question, or does the answer get buried in long text. |
| D4 · Entity and NAP | 15% | Do the name, address and phone number match everywhere, so the model doesn't confuse you with another company. |
| D5 · Off-site presence | 25% | Does anyone talk about the company elsewhere: reviews, directories, independent mentions, press. |
| D6 · E-E-A-T signals | 5% | Is the credibility visible: HTTPS, a named author, legal pages, cited sources. |
| D7 · Content depth | 5% | Is there real, on-topic content, and do the important pages link to one another. |
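As a minimal sketch of how these weights could combine into one number: the code below assumes each dimension is first scored on its own 0 to 100 scale, which the article does not spell out. The dictionary keys, the function name and the example values are illustrative, not the actual scoring implementation.

```python
# Weights as stated in the table above; they sum to 100.
WEIGHTS = {
    "accessibility": 20,
    "structured_data": 15,
    "answer_ready_content": 15,
    "entity_nap": 15,
    "off_site_presence": 25,
    "eeat_signals": 5,
    "content_depth": 5,
}


def total_score(dimension_scores: dict[str, float]) -> float:
    """Combine per-dimension scores (each 0-100) into one 0-100 score.

    Assumption: every dimension is scored 0-100 before weighting.
    """
    if set(dimension_scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the seven dimensions")
    for name, score in dimension_scores.items():
        if not 0 <= score <= 100:
            raise ValueError(f"{name} score out of range: {score}")
    weighted = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    return weighted / sum(WEIGHTS.values())


# Hypothetical site: technically sound, almost no external footprint.
example = {
    "accessibility": 90,
    "structured_data": 80,
    "answer_ready_content": 60,
    "entity_nap": 70,
    "off_site_presence": 10,
    "eeat_signals": 100,
    "content_depth": 50,
}
print(total_score(example))  # 59.5
```

Even with strong technical dimensions, the hypothetical site lands at 59.5, because the heaviest-weighted dimension contributes almost nothing.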

D1 — Accessibility (20%). This is the rawest layer, yet the fastest to fix. If the crawler can't even get onto the site, or the content only appears after JavaScript runs, the model sees a blank page — and the company disappears from the answer. The buyer notices nothing of this on the surface, yet everything else hinges on it. According to one international study, roughly 27% of B2B and e-commerce sites unknowingly block large language model crawlers, often at the hosting or CDN level. Where the crawler can't get in, the work done in every other dimension is wasted.
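One of the accessibility checks can be sketched with Python's standard `urllib.robotparser`. This is only an illustration: the crawler tokens listed are examples to verify against each vendor's documentation, the `robots.txt` content is made up, and the check covers `robots.txt` rules only. It says nothing about blocks applied at the hosting or CDN level, or about content that needs JavaScript to render.

```python
from urllib.robotparser import RobotFileParser

# Example crawler tokens; verify the current names with each vendor.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def blocked_crawlers(robots_txt: str, url: str) -> list[str]:
    """Return the crawlers in AI_CRAWLERS that robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt that shuts out one AI crawler and allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_crawlers(sample, "https://example.com/services"))  # ['GPTBot']
```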

D2 — Structured data (15%). Structured markup tells the machine what the company's name is, where it is, what it offers — so it doesn't have to guess. International measurements show that pages with valid structured data appear in AI summaries 20–30% more often. What the buyer sees of this is that the company's details show up accurately in the AI's answer, rather than outdated or incomplete.
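For illustration, here is a minimal sketch of what such markup might look like, built as schema.org JSON-LD from Python. The company name, address, phone number and URLs are placeholders, and `LocalBusiness` is only one of several schema.org types that could fit a given company.

```python
import json

# Placeholder company details; use the real, consistent name, address and phone.
local_business = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Kft.",
    "url": "https://example.com",
    "telephone": "+36 1 555 0100",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "Example utca 1.",
        "addressLocality": "Budapest",
        "postalCode": "1051",
        "addressCountry": "HU",
    },
    # Links to the company's profiles elsewhere (placeholder URL).
    "sameAs": ["https://example.com/profile-on-a-directory"],
}

# JSON-LD is embedded in the page inside a script tag of this type.
json_ld = json.dumps(local_business, indent=2, ensure_ascii=False)
snippet = f'<script type="application/ld+json">\n{json_ld}\n</script>'
print(snippet)
```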

D3 — Answer-ready content (15%). The model most readily lifts the quotable part from the first few sentences after a heading — according to a frequently cited industry source, from the first 40–60 words. If the answer is buried under a long, winding paragraph, the AI looks for it elsewhere. From the buyer's side, this is the situation where the point arrives immediately in answer to their question, not after three paragraphs of preamble.
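A rough way to test a paragraph against that 40 to 60 word window is sketched below. It assumes the text directly after a heading has already been extracted, and it splits sentences naively on end punctuation; the function names and the 60-word default are illustrative choices, not the actual check.

```python
import re


def first_sentence_word_count(paragraph: str) -> int:
    """Count the words in the opening sentence of a paragraph."""
    text = paragraph.strip()
    if not text:
        return 0
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    first_sentence = re.split(r"(?<=[.!?])\s+", text, maxsplit=1)[0]
    return len(first_sentence.split())


def is_answer_ready(paragraph: str, max_words: int = 60) -> bool:
    """True if a complete opening sentence fits within max_words."""
    count = first_sentence_word_count(paragraph)
    return 0 < count <= max_words


direct = "GEO is the practice of making a site readable for AI models. It builds on SEO."
buried = " ".join(["word"] * 80) + ". The short answer comes only here."

print(is_answer_ready(direct))  # True
print(is_answer_ready(buried))  # False
```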

D4 — Entity and NAP consistency (15%). NAP is the trio of name, address and phone number. If these don't appear in exactly the same form on the website, in the Google Business Profile and in directories, the model grows uncertain and would rather skip the company than confuse it with another. The buyer notices this when the AI gives a wrong address or an old phone number — or simply recommends a different business with a similar name instead.
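A NAP comparison can be sketched as follows. One assumption to flag loudly: this version normalizes cosmetic differences (case, punctuation, spacing in phone numbers) before comparing, whereas a stricter check would compare the raw strings. The listings and their values are invented for the example.

```python
import re
from typing import NamedTuple


class Nap(NamedTuple):
    name: str
    address: str
    phone: str


def normalize(nap: Nap) -> Nap:
    """Reduce cosmetic differences so only real mismatches remain."""

    def clean(text: str) -> str:
        # Lowercase, replace punctuation with spaces, collapse whitespace.
        text = re.sub(r"[^\w\s]", " ", text.lower())
        return " ".join(text.split())

    digits = re.sub(r"\D", "", nap.phone)  # keep digits only
    return Nap(clean(nap.name), clean(nap.address), digits)


def nap_mismatches(listings: dict[str, Nap]) -> dict[str, set[str]]:
    """Per field, return the distinct normalized values when more than one exists."""
    mismatches: dict[str, set[str]] = {}
    for field_name in Nap._fields:
        values = {getattr(normalize(nap), field_name) for nap in listings.values()}
        if len(values) > 1:
            mismatches[field_name] = values
    return mismatches


# Hypothetical listings: the directory carries an outdated phone number.
listings = {
    "website": Nap("Example Kft.", "1051 Budapest, Example utca 1.", "+36 1 555 0100"),
    "business_profile": Nap("Example Kft", "1051 Budapest, Example utca 1", "+36-1-555-0100"),
    "directory": Nap("Example Kft.", "1051 Budapest, Example utca 1.", "+36 1 555 0199"),
}
print(nap_mismatches(listings))
```

Here only the `phone` field is reported, since the name and address differ in punctuation alone.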

D5 — Off-site presence (25%). This is the heaviest-weighted dimension, and it deserves a chapter of its own — I explain why below. In short: this is what others say about the company. Reviews, independent sites, forums, press, directories. From the buyer's side this is decisive: the model doesn't believe the company's self-description — it believes what it sees about the company from the outside.

D6 — E-E-A-T signals (5%). The remaining credibility signals not measured elsewhere: does HTTPS run on every page, is there a named author, are the privacy notice and legal imprint available. Small weight, because most of it is already covered by structured data and off-site presence.

D7 — Content depth (5%). Does the company have real, on-topic content, and do the important pages link to one another. It's hard to build topical authority from a half-page brochure site — at most the AI knows the company's name, but no concrete, quotable answer.

Why is off-site presence the heaviest?

Because it's the only dimension the model doesn't read from the company's own website — and that's exactly what drives the actual recommendation. According to an analysis of 7,000 citations, most of the named sources aren't the companies' own pages: in ChatGPT's citations, Wikipedia alone accounts for 47.9 percent, and at Perplexity, forums make up close to half of all references (Digital Bloom, 2025). The model, in other words, judges from the outside. The other six dimensions can be flawless, but if nothing outside speaks about the company, the AI has nothing to draw on.

The densest part of this external footprint is the mass of reviews. There's no fixed review-count threshold; what operates is a trust threshold. According to SOCi's 2026 survey, places recommended by AI sit at 4.3 stars on average, with plenty of fresh, answered reviews; with few reviews, the typical outcome is omission or a mistaken guess. This signal can't be honestly rushed: it accumulates from the feedback of satisfied customers, week after week. I break down what actually counts, and how it's built up honestly, in the "how many reviews you need for an AI recommendation" piece.

One Hungarian detail adds extra weight to this. According to an analysis of 1.3 million citations, sites that translated their content into another language earned up to 327% more AI citations on searches in that language than they did without translation (Weglot). To win Hungarian customers, it's worth becoming visible in Hungarian, and the language of your external mentions matters too.

Off-site presence (25%) is the heaviest-weighted dimension, accessibility (20%) is the fastest to fix, and the content layers (15% each) take months to build up. Weight and speed are not the same thing.

Now comes an important caveat, one the method sets in stone. The 25% doesn't measure whether the AI will recommend the company; that has to be checked live, with an actual query. It measures whether the external footprint the recommendation could draw on is even there. Your competitors are visible to AI by accident; the goal is for your business to become visible on purpose. I lay out point by point why the score isn't the same as the recommendation in the "GEO score versus AI recommendation" article.

And there's another asymmetry. Off-site presence isn't only the heaviest layer, it's the slowest too: it takes 6–12 months to move meaningfully, because it's built from real people's real experiences, partly beyond the company's control. Accessibility, by contrast, is a matter of days. You can follow which layer takes hold at what pace in the "how long it takes for GEO to work" piece. The high weight and the slow build together explain why this is the weakest point for most Hungarian SMEs, and at the same time why it's the most valuable, since it's the hardest to copy.

What does a measurement look like in practice?

A single measurement is only worth anything if it's repeatable. That's why the method is strictly fixed: the same checks run under every dimension, and every result is dated. Not an estimate, but a fact a machine can confirm — does the crawler reach the site, is the structured data there, does the NAP match, and so on across all seven dimensions.

The measurement begins with a data-quality gate, before any score is produced. If the site returns an error message, is empty, or sits behind a bot wall, it doesn't get a bad score — it gets a "not measurable" mark. This is an important honesty rule: a site that simply wasn't reachable is not the same as a site that performed badly. I don't conflate the two, and neither does the report.
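The gate described above might be sketched like this. The `FetchResult` fields, the `bot_wall_detected` flag (assumed to come from some upstream detector) and the `min_chars` threshold for "empty" are all hypothetical; the article gives no concrete numbers.

```python
from dataclasses import dataclass
from typing import Optional

NOT_MEASURABLE = "not measurable"


@dataclass
class FetchResult:
    status_code: int          # HTTP status returned for the measured page
    body_text: str            # visible text extracted from the response
    bot_wall_detected: bool   # hypothetical flag set by an upstream detector


def data_quality_gate(result: FetchResult, min_chars: int = 200) -> Optional[str]:
    """Return a reason if the site cannot be measured, otherwise None.

    min_chars is an assumed threshold for an "empty" page.
    """
    if result.status_code >= 400:
        return f"{NOT_MEASURABLE}: HTTP {result.status_code}"
    if result.bot_wall_detected:
        return f"{NOT_MEASURABLE}: bot wall"
    if len(result.body_text.strip()) < min_chars:
        return f"{NOT_MEASURABLE}: empty page"
    return None  # the site passes the gate and can be scored


unreachable = FetchResult(status_code=503, body_text="", bot_wall_detected=False)
healthy = FetchResult(status_code=200, body_text="x" * 500, bot_wall_detected=False)

print(data_quality_gate(unreachable))  # not measurable: HTTP 503
print(data_quality_gate(healthy))      # None
```

Returning a reason instead of a zero keeps "could not be measured" separate from "measured and scored low".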

Repeatability is what makes the number trackable. I re-measure the same seven dimensions monthly, with the same checks, and produce a dated comparison against the previous month. So progress shows in the numbers, not in a promise: the technical score can jump as early as the first month, the content scores in the second or third, and off-site presence creeps upward slowly and steadily. You can follow the steps of the process on the "how it works" page.
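A dated month-over-month comparison could look roughly like the sketch below. The record layout (`measured_on`, `scores`), the dates and every score value are invented for illustration.

```python
from datetime import date


def compare_measurements(previous: dict, current: dict) -> dict[str, float]:
    """Per-dimension change between two dated measurements."""
    if previous["scores"].keys() != current["scores"].keys():
        raise ValueError("both measurements must use the same dimensions")
    if current["measured_on"] <= previous["measured_on"]:
        raise ValueError("current measurement must be later than the previous one")
    return {
        name: current["scores"][name] - previous["scores"][name]
        for name in current["scores"]
    }


# Hypothetical measurements one month apart (three dimensions shown for brevity).
january = {
    "measured_on": date(2025, 1, 15),
    "scores": {"accessibility": 40, "answer_ready_content": 30, "off_site_presence": 10},
}
february = {
    "measured_on": date(2025, 2, 15),
    "scores": {"accessibility": 90, "answer_ready_content": 35, "off_site_presence": 12},
}

print(compare_measurements(january, february))
# {'accessibility': 50, 'answer_ready_content': 5, 'off_site_presence': 2}
```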

My own scorecard is public too: I hold myself to the same yardstick. A freshly launched site necessarily starts low on the off-site presence dimension, and that's true of mine as well. I don't hide it: I describe the full weighting and every single check used for the measurement, openly and point by point, on the methodology page. Anyone who measures others has to be measured by their own yardstick too.

Seen as a whole, the logic of the seven dimensions is simple. Some layers are about the AI being able to reach and understand the site; these can be fixed quickly, with technical work. The heaviest-weighted layer, however, is about others talking about the company: that's slow, patient work, partly carried out beyond the company itself. The score is precisely what makes this duality visible. I explore why none of this replaces classic search optimization, but instead builds on top of it, on the SEO and GEO page.

The score is neither a reward nor a verdict. It's more of a map: it shows which layer is already in place, and which is still waiting for its time. The technical dimensions are the road already travelled, off-site presence the distance still ahead. Measure only one of them and you draw half a map — and half a map isn't worth buying even at half price.

The question today is no longer whether AI visibility can be measured, but who measures it honestly, and who sells a promise instead. The seven dimensions put this into numbers: they don't tell you the AI will recommend you, they tell you where the company stands today and what the next step is. A measurable, dated starting point is worth more than ten confident predictions.

The eighth axis: distinctiveness, that is, selection

The seven dimensions answer a single question: can the model read you. That's readiness, the entry ticket. But there's a second question, one that only becomes sharp once the answer to the first is already yes: if you make it into the set of candidate companies, will you be the one obvious answer, or one of fourteen interchangeable results? The two are not the same, and they don't take the same work. Readiness measures whether the model finds and understands the company. Distinctiveness measures whether the model singles it out from the rest when it has to choose. I explain this duality in more detail in the "AI finds you, but does it choose you" piece.

This difference now has strong, independent backing. According to Bain's March 2026 banking study, which analyzed more than a billion AI citations, large language models "smooth out the generic message and amplify the recurring patterns" — bland positioning that could be slapped on anyone is algorithmically disadvantaged (Bain, 2026). According to a separate Bain analysis from April 2026, 89 percent of unbranded queries are filled by third-party sources — meaning where a company leaves no clear, distinctive trace, the model brings in something else in its place. Distinctiveness, then, isn't a matter of style — it's a selection factor.

Importantly, this is a measured state, not a promise. Just as I say honestly about the readiness score that it isn't the same as the recommendation, the same must be said about distinctiveness: it measures where the company stands today on the interchangeability scale — it doesn't promise that improving distinctiveness brings more customers. The causal chain between the two isn't proven, and I don't claim it; as I also lay out point by point in the GEO score versus AI recommendation article, readiness and outcome are two separate measurements. A measurable, dated starting point on interchangeability is worth exactly as much as it is honest.

I examine distinctiveness across seven sub-metrics, each from a different angle of selection:

| Sub-metric | What it examines |
| --- | --- |
| U1 · Substitution resistance | When the model is asked to name a substitute, does it state a real difference, or is the answer "essentially the same." |
| U2 · Unprompted mention | Does the company come up on its own in answer to a neutral buyer question, and in what position, or only when specifically asked. |
| U3 · Comparison survival | In a head-to-head choice, does the model pick the company, and on what grounds: a unique reason or an interchangeable generality. |
| U4 · Descriptive distinctiveness | Can the model, in its own words, distinguish the company from competitors, and does it do so on clinical/professional capability, or only on positioning. |
| U5 · Attribute ownership | Is there a concrete attribute the model unambiguously ties to this company, or does it award every important attribute to a competitor. |
| U6 · Cross-model agreement | Do several models (ChatGPT, Gemini, Claude) identify the company in the same, consistent way, or with a different name and a different profile per model. |
| U7 · Specificity spread | Does the company's own surface offer the model a concrete, quantified, quotable claim, or a generic claim of superiority with nothing to lift. |

The logic is the same as with the seven dimensions: under every sub-metric a repeatable, dated measurement runs, and the measurement happens in two modes. One is the "no-search" mode, which examines the model's trained knowledge — this measures the mechanism. The other is the search-assisted mode, which shows what the buyer actually sees when they ask, live, today. The two often differ, and the difference is precisely the lesson: just because a company appears in the answer doesn't mean the model chooses it.
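To make the two-mode comparison concrete, here is a small sketch. It assumes each model answer has already been parsed into an ordered list of company names; how the models are queried and how names are extracted is outside its scope. The function names and company names are hypothetical.

```python
from typing import Optional


def mention_position(ranked_companies: list[str], company: str) -> Optional[int]:
    """1-based position of company in an answer's ordered list, or None if absent."""
    wanted = company.casefold()
    for index, name in enumerate(ranked_companies, start=1):
        if name.casefold() == wanted:
            return index
    return None


def compare_modes(
    no_search: list[str], search_assisted: list[str], company: str
) -> dict[str, Optional[int]]:
    """Position of the company in each measurement mode."""
    return {
        "no_search": mention_position(no_search, company),
        "search_assisted": mention_position(search_assisted, company),
    }


# Hypothetical parsed answers to the same neutral buyer question.
trained_knowledge_answer = ["Beta Kft.", "Gamma Kft."]
live_search_answer = ["Beta Kft.", "Gamma Kft.", "Alpha Kft."]

print(compare_modes(trained_knowledge_answer, live_search_answer, "Alpha Kft."))
# {'no_search': None, 'search_assisted': 3}
```

In this made-up case the company appears only in the search-assisted answer, and in third place: it is present, but not the model's first choice.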

One honest limit. The distinctiveness measurement does not tell you one thing: that higher distinctiveness brings more sales. That would be a causal claim, and I have no proof of it — just as I don't claim that raising the readiness score raises the recommendation rate. Both are measured states, not promised outcomes. What I do commit to: I show you where the company stands today on the interchangeability scale, dated, with the same yardstick as its competitors — you draw the conclusion from it.

Why it isn't the number of checkmarks that matters, but their weight

The market keeps producing AI-visibility evaluators that compete on the number of dimensions: "twenty-six criteria," "dozens of points." This easily misleads, so it's worth clarifying the difference. A binary checkmark says only this: it's there, or it isn't. A weighted dimension says more: it examines not just whether something exists, but also how much it weighs in actual visibility. I work with seven domains, but behind each of them more than ten checks roll up into a single, weighted score. The goal isn't to collect checkmarks, it's to measure the outcome.

The difference isn't hair-splitting. A site can tick every point on a long checklist and still remain invisible to AI — because actual visibility isn't decided by the number of ticked technical items, but by the weight of external mentions, reviews and credibility. According to Ahrefs' large-sample analysis, the real predictors of AI visibility are brand mentions, ratings and authority, not the length of a checklist (Ahrefs, 2025). That's why I measure a weighted output, not a checkmark count: anyone who ticks thirty tiny technical points but stands empty in the heaviest-weighted dimension — off-site presence — finishes at the back in both the score and the AI's answer.
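The gap between a checkmark count and a weighted score can be shown with a short calculation. The sketch assumes a dimension's score is simply the share of its checks that pass, and that each dimension has twelve checks; both are simplifications for illustration, not the documented method.

```python
# Weights as stated in the article.
WEIGHTS = {
    "accessibility": 20,
    "structured_data": 15,
    "answer_ready_content": 15,
    "entity_nap": 15,
    "off_site_presence": 25,
    "eeat_signals": 5,
    "content_depth": 5,
}


def checkmark_rate(checks: dict[str, tuple[int, int]]) -> float:
    """Share of all checks passed, ignoring which dimension they belong to."""
    passed = sum(p for p, _ in checks.values())
    total = sum(t for _, t in checks.values())
    return 100 * passed / total


def weighted_score(checks: dict[str, tuple[int, int]]) -> float:
    """Weighted score, assuming dimension score = share of its checks passed."""
    return sum(WEIGHTS[name] * p / t for name, (p, t) in checks.items())


# Hypothetical site: every check ticked except the whole off-site dimension.
site = {name: (12, 12) for name in WEIGHTS}
site["off_site_presence"] = (0, 12)

print(round(checkmark_rate(site), 1))  # 85.7
print(weighted_score(site))            # 75.0
```

The checklist view reports about 86% of items ticked, while the weighted view drops to 75, because the one empty dimension is the one that carries the most weight.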

I don't measure the number of criteria, I measure the output. Thirty ticked trifles are worth less than a single heavy-weighted dimension if that one stands empty. The long list reassures — but the weight decides.

Frequently asked questions

What are the 7 dimensions that make AI visibility measurable?

Accessibility (20%), structured data (15%), answer-ready content (15%), entity and NAP consistency (15%), off-site presence (25%), E-E-A-T credibility signals (5%) and content depth (5%). Under each dimension sit concrete, machine-verifiable checks, and each counts toward the zero-to-one-hundred score with its own weight.

Why does off-site presence carry the greatest weight?

Because it's the only dimension the model doesn't read from the company's own website, and it's what drives the actual recommendation. International measurements show that the overwhelming majority of AI citations come from third-party sources: reviews, directories, press (Digital Bloom, 2025). That's why it gets 25% — and that's also why it's the slowest layer to build.

Does a high score mean the AI will recommend you?

No. The score measures readiness, not the outcome. It shows whether the external footprint and technical foundation the recommendation could draw on are there — but whether a given model actually names the company can only be checked live, with a query. I never conflate the two.

What happens if a site isn't reachable at the moment of measurement?

Then it doesn't get a bad score, but a "not measurable" mark. A site that just returned an error or was empty is not the same as a site that performed badly. That's why the measurement begins with a data-quality gate, before any score is produced.

What is the eighth axis, distinctiveness?

The eighth axis measures distinctiveness: whether, once a company makes it into the set of candidates, the model chooses it or treats it as one of the interchangeable results. The seven readiness dimensions answer whether the model finds and understands the company; distinctiveness answers whether it singles it out from the rest. It's a separate measurement because it answers a different question, and it's a measured state, not a promised outcome: I don't claim that improving it brings more customers, I only show where the company stands today on the interchangeability scale.

Sources