Key Takeaway
Tom's Link Authority is a free Windows desktop tool that queries the Common Crawl web graph (120 million domains, 4.4 billion edges) to show who links to any domain. It includes backlink lookup, link gap analysis against competitors, and batch domain processing. Optional paid edge shard databases unlock full backlink data; they are sold by the first letter of the target domain at $5 each.

How to Check Backlinks and Find Link Gaps Without a Subscription

Backlink data is one of the most expensive categories in SEO tooling. Ahrefs charges $99 per month at its cheapest tier. SEMrush starts at $139.95. Moz Pro is $99. Even the budget options run $30–50 monthly, and they all require an account, a credit card, and a commitment to keep paying whether you use the tool this month or not. For freelancers, small site owners, and indie developers who check backlinks a few times a month, that pricing model makes no sense.

The underlying data these tools query is not proprietary. The largest publicly available web graph comes from Common Crawl, a nonprofit that crawls the web and publishes the results as open datasets. The crawl that powers Tom's Link Authority contains over 120 million domains and 4.4 billion link edges — comparable in scale to what commercial tools index, though with different coverage characteristics. The difference is that Common Crawl data is free to access. The challenge has always been that it comes as enormous compressed text files measured in terabytes, and turning those files into something queryable requires significant processing infrastructure.

Tom's Link Authority solves that processing problem by pre-building the entire Common Crawl web graph into SQLite databases that run locally on your Windows machine. The desktop app is free. The rank databases (covering all 120 million domains with Harmonic Centrality scores) are free. The edge shard databases — which contain the actual backlink data showing which domains link to which — are available as 27 separate files organised alphabetically by target domain, priced at $5 each as a one-time purchase.

What Common Crawl Data Actually Shows You

Before getting into the tool itself, it helps to understand what Common Crawl data is and what it is not. Common Crawl runs web-wide crawls roughly every quarter. Each crawl discovers billions of pages and records the links between them. When this data is processed into a domain-level web graph, you get a dataset that shows which domains link to which other domains, with edge counts indicating how many page-level links exist between each pair.
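As a rough illustration of that processing step, the sketch below collapses page-level links into domain-level edges with counts. The function names and sample URLs are hypothetical, and stripping "www." from the hostname is a crude stand-in for a proper registered-domain lookup; it is not a description of how Common Crawl or TLA actually builds the graph.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Crude stand-in for a registered-domain lookup: lowercase host minus 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def domain_edges(page_links):
    """Collapse (source_url, target_url) pairs into {(src_domain, dst_domain): count}."""
    counts = Counter()
    for src_url, dst_url in page_links:
        src, dst = host_of(src_url), host_of(dst_url)
        if src and dst and src != dst:  # skip unparsable URLs and internal links
            counts[(src, dst)] += 1
    return counts


# Hypothetical page-level links, as a crawler might record them.
page_links = [
    ("https://blog.example/post-1", "https://www.target.example/"),
    ("https://blog.example/post-2", "https://target.example/pricing"),
    ("https://news.example/story", "https://target.example/"),
    ("https://target.example/", "https://target.example/about"),
]
edges = domain_edges(page_links)
for (src, dst), count in sorted(edges.items()):
    print(f"{src} -> {dst}: {count} page-level link(s)")
```

The two links from blog.example become a single edge with a count of 2, which is exactly the level of detail a domain-level graph keeps and the point at which anchor text and page URLs are lost.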

This data is genuinely useful for backlink analysis. It tells you who links to your competitors, how many domains link to a target site, and where the overlap and gaps are between your link profile and someone else's. What it does not do is provide the granularity of a real-time crawler — you won't get anchor text, link placement, follow/nofollow status, or page-level URLs. Those details require crawling the actual pages, which is what the subscription tools do continuously. Common Crawl data trades granularity for scale and cost: you get a comprehensive map of the web's link structure without paying monthly for access.

For most practical link building decisions — identifying who links to competitors, finding domains worth reaching out to, comparing your link profile against others in your niche — domain-level data is sufficient. You rarely need to know the exact anchor text on a specific page to decide whether a domain is worth pursuing for a backlink.

How Tom's Link Authority Works

The tool is a portable Windows desktop application — single executable, no installer, no account required. It ships with a built-in database downloader that fetches the free rank databases on first run. These rank databases contain Harmonic Centrality scores for all 120 million domains in the Common Crawl graph, which gives you a domain authority metric derived from actual link structure rather than proprietary algorithms.

The Backlinks tab is where the tool earns its name. Enter any domain and TLA queries the relevant edge shard database to return every domain that links to it, sorted by link count. Each result shows the linking domain, its Harmonic Centrality score, and the number of edges (page-level links) between the two domains. For a domain like tomdahne.com, this might return dozens of linking domains. For a major site, it can return thousands.
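The lookup this tab performs can be pictured as a single SQL query. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table names, column names, and sample rows are assumptions made for illustration, since the article does not document TLA's actual schema.

```python
import sqlite3

# Hypothetical schema: TLA's real table layout is not documented here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ranks (domain TEXT PRIMARY KEY, harmonic REAL);
    CREATE TABLE edges (source TEXT, target TEXT, links INTEGER);
    CREATE INDEX idx_edges_target ON edges (target);
""")
conn.executemany(
    "INSERT INTO ranks VALUES (?, ?)",
    [("blog.example", 0.62), ("news.example", 0.81), ("forum.example", 0.35)],
)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [
        ("blog.example", "target.example", 7),
        ("news.example", "target.example", 3),
        ("forum.example", "other.example", 4),
    ],
)


def backlinks(conn, target):
    """Return (linking_domain, harmonic_score, link_count), most links first."""
    return conn.execute(
        """
        SELECT e.source, r.harmonic, e.links
        FROM edges AS e
        LEFT JOIN ranks AS r ON r.domain = e.source
        WHERE e.target = ?
        ORDER BY e.links DESC, e.source
        """,
        (target,),
    ).fetchall()


for source, score, links in backlinks(conn, "target.example"):
    print(source, score, links)
```

An index on the target column is what would keep a query like this fast on a table with billions of rows, and it is one plausible reason for splitting the data by target domain in the first place.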

Edge shards are organised by the first letter of the target domain. If you want backlink data for domains starting with "t" (tomdahne.com, twitter.com, techcrunch.com), you purchase the "t" shard. You buy only the letters you need, so a client list spread across several letters means several shards. At $5 each, most practitioners need three to five shards to cover their regular client base and competitors.
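To work out which shards a given client list needs, you can map each domain to its first letter and count the distinct letters. In the sketch below, the "other" bucket for domains that start with a digit is an assumption (the article says there are 27 files but does not say what the 27th contains), and the function names and most of the sample domains are hypothetical.

```python
import string

PRICE_PER_SHARD = 5  # USD, one-time, per the pricing stated in the article


def shard_key(domain: str) -> str:
    """Map a target domain to the shard that would hold its backlinks."""
    first = domain.strip().lower()[:1]
    if first and first in string.ascii_lowercase:
        return first
    return "other"  # assumption: a catch-all for digits and anything else


def shards_needed(domains):
    """Distinct shards required to look up every domain in the list."""
    return sorted({shard_key(d) for d in domains})


portfolio = ["tomdahne.com", "mysite.example", "shop.example", "123tools.example"]
needed = shards_needed(portfolio)
cost = PRICE_PER_SHARD * len(needed)
print(needed, f"${cost} one-time")
```

Several domains that share a first letter cost nothing extra, so the total depends on how many distinct letters appear in your list rather than on how many domains you track.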

Link Gap Analysis

The Link Gap tab is arguably the most actionable feature. Enter your domain and a competitor's domain, and TLA shows you every domain that links to your competitor but not to you. This is your outreach list — real domains with demonstrated willingness to link to sites in your space, where you currently have no relationship.

The results are sorted by the competitor's link count from each domain, which surfaces the strongest linking relationships first. Combined with the Harmonic Centrality score for each linking domain, you can quickly prioritise which outreach targets are worth pursuing. A domain with high centrality and multiple links to your competitor is a stronger target than a low-authority site with a single link.

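The gap analysis works across different shard letters as long as you have the relevant shards installed. Because shards are keyed by the target domain, your backlinks and your competitor's backlinks live in the shards for their respective first letters. If your domain starts with "m" and your competitor starts with "s", you need both the "m" and "s" shards to run a complete gap analysis.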
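Conceptually, the link gap is a set difference: the competitor's linking domains minus your own, ordered by the competitor's link count. The sketch below shows that logic on made-up data. The function name and the dictionary inputs are hypothetical and say nothing about how TLA implements the feature internally.

```python
def link_gap(yours, competitor):
    """Domains linking to the competitor but not to you.

    Both arguments map linking_domain -> link_count. The result is sorted
    by the competitor's link count, highest first, then alphabetically.
    """
    gap = {domain: n for domain, n in competitor.items() if domain not in yours}
    return sorted(gap.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical backlink profiles for the two sites.
yours = {"blog.example": 2, "forum.example": 1}
competitor = {
    "blog.example": 5,
    "news.example": 3,
    "wiki.example": 3,
    "dir.example": 1,
}

for domain, links in link_gap(yours, competitor):
    print(domain, links)
```

Here blog.example drops out because it already links to you, which leaves news.example, wiki.example, and dir.example as the outreach candidates.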

Batch Processing

The Batch tab accepts a list of domains (paste from a spreadsheet or text file) and processes them all at once. For each domain, TLA looks up its Harmonic Centrality rank and score, total inbound link count, and referring domain count. The results export to CSV, which makes it straightforward to import into a spreadsheet for further analysis or to sort and filter a prospect list by authority.
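A batch run of this kind amounts to cleaning a pasted list, looking up each domain, and writing rows to CSV. The sketch below does that with Python's standard csv module. The RANKS dictionary stands in for the rank database, and its values, the column names, and the function names are all hypothetical.

```python
import csv
import io

# Hypothetical stand-in for the rank database: domain -> (rank, harmonic score).
RANKS = {
    "news.example": (1200, 0.81),
    "blog.example": (54000, 0.62),
}


def batch_lookup(domains, ranks):
    """Normalise each pasted line and attach its rank data (None if unknown)."""
    rows = []
    for raw in domains:
        domain = raw.strip().lower()
        if not domain:
            continue  # ignore blank lines in the pasted list
        rank, score = ranks.get(domain, (None, None))
        rows.append({"domain": domain, "rank": rank, "harmonic": score})
    return rows


def to_csv(rows):
    """Serialise the rows to CSV text; unknown values become empty cells."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["domain", "rank", "harmonic"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


pasted = "News.example\nblog.example\n\nunknown.example\n"
rows = batch_lookup(pasted.splitlines(), RANKS)
csv_text = to_csv(rows)
print(csv_text)
```

Keeping unknown domains in the output with empty cells, rather than dropping them, makes it easy to spot entries that are missing from the graph when you sort the spreadsheet.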

This is useful for evaluating large lists quickly — sorting a directory submission list by authority, checking which sites in a blogroll actually have link value, or comparing a set of competitors at a glance.

The Harmonic Centrality Score

Most commercial tools use proprietary "domain authority" metrics — Moz's DA, Ahrefs' DR, SEMrush's Authority Score. These are useful but opaque: you don't know exactly how they're calculated, and they can be manipulated by link schemes that the tool's own algorithms haven't caught yet.

Harmonic Centrality is a well-established graph theory metric. For a given node (domain), it sums the reciprocals of the shortest-path distances from every other node that can reach it: a domain one link away contributes 1, a domain two links away contributes 1/2, and a domain with no path to it contributes nothing. Domains that sit a few link hops from large, well-connected parts of the web therefore score higher. It is resistant to simple manipulation because boosting your score requires acquiring links from genuinely well-connected parts of the web, not just from any set of domains willing to link to you.
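To make the metric concrete, here is a minimal sketch that computes harmonic centrality for one node of a toy graph: the sum of 1/distance over every node that can reach it, found with a breadth-first search over incoming links. The graph and the function name are hypothetical, and this exact per-node search is for illustration only. The article does not describe how the scores are computed at the scale of 120 million domains.

```python
from collections import deque


def harmonic_centrality(edges, node):
    """Sum of 1/d(u, node) over all u that can reach `node` via directed links."""
    # Reverse adjacency: for each domain, the set of domains linking to it.
    incoming = {}
    for src, dst in edges:
        incoming.setdefault(dst, set()).add(src)

    dist = {node: 0}
    queue = deque([node])
    total = 0.0
    while queue:  # breadth-first search backwards along the links
        current = queue.popleft()
        for src in incoming.get(current, ()):
            if src not in dist:
                dist[src] = dist[current] + 1
                total += 1.0 / dist[src]
                queue.append(src)
    return total


# Toy graph: a and b link to t, c links to a, and t links to x.
edges = [("a", "t"), ("b", "t"), ("c", "a"), ("t", "x")]
print(harmonic_centrality(edges, "t"))  # a and b at distance 1, c at distance 2
print(harmonic_centrality(edges, "c"))  # nothing links to c
```

In the toy graph, t scores 1 + 1 + 1/2 = 2.5, while c, which receives no links, scores 0. A link from a domain that is itself widely linked brings all of that domain's linkers one hop closer, which is why links from well-connected domains matter more than links from isolated ones.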

Tom's Link Authority computes Harmonic Centrality across the entire 120-million-domain Common Crawl graph and expresses it as both a raw score and a percentile rank. The free rank databases include this data for every domain in the graph — no edge shards required for basic authority lookups.
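One common way to turn a raw score into a percentile rank is to count how many domains score strictly lower. The sketch below assumes that definition, and the scores are invented. The article does not specify which percentile formula TLA uses, so treat this as one plausible reading rather than the tool's method.

```python
from bisect import bisect_left


def percentile_rank(score, sorted_scores):
    """Percentage of domains whose score is strictly below `score`.

    `sorted_scores` must be in ascending order.
    """
    below = bisect_left(sorted_scores, score)
    return 100.0 * below / len(sorted_scores)


# Hypothetical raw harmonic scores for a four-domain graph, ascending.
scores = [0.10, 0.35, 0.62, 0.81]
print(percentile_rank(0.62, scores))
print(percentile_rank(0.81, scores))
```

A percentile is often easier to act on than the raw number, because it tells you where a domain sits relative to the rest of the graph without needing to know the score's scale.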

Tip: Tom's AI Rank Checker at tomdahne.com/ai-rank-checker provides a free online lookup of any domain's Harmonic Centrality score without downloading the desktop app. It queries the same Common Crawl data.

Data Freshness and Updates

Common Crawl releases new data roughly every quarter. Tom's Link Authority processes each new release into updated rank databases and edge shards. The current data is built from the April 2026 crawl. When a new crawl drops, updated shards are processed and made available to existing purchasers — the shard purchase covers the letter, and updates for that letter are included as they're processed.

The quarterly update cycle means the data is not real-time. A link acquired yesterday will not appear in TLA until the next Common Crawl release is processed. For link prospecting and competitive analysis this lag is acceptable — the web's link graph changes slowly at the domain level, and a quarterly snapshot captures the vast majority of stable linking relationships.

A Worked Example: Building an Outreach List

To make this concrete, here is how the pieces fit together in a typical link-prospecting session. Say you run a small SEO tools site and want to find link opportunities you are currently missing. You start in the Link Gap tab, entering your own domain and the domain of a competitor who ranks above you. With the relevant shards installed, TLA returns every domain that links to the competitor but not to you, sorted by how many links the competitor receives from each.

The top of that list is your priority outreach set — sites that have already demonstrated they will link to a tool like yours. Sort by Harmonic Centrality to push the most authoritative linking domains to the top, then export the list to CSV. From there you can drop a second competitor's link sources into the Batch tab, pull their authority scores, and merge the two lists to find domains that link to several rivals but not to you. Those multi-competitor linkers are the strongest targets of all: their willingness to link to sites in your space is established several times over.
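The merge step at the end of that workflow can be done in a spreadsheet, but it is also a few lines of code once the lists are exported. The sketch below counts how many competitors each domain links to, after excluding domains that already link to you. All names and data are hypothetical, and the inputs are assumed to be sets of linking domains taken from the exported CSVs.

```python
from collections import Counter


def multi_competitor_linkers(your_linkers, competitor_linkers, min_competitors=2):
    """Domains that link to at least `min_competitors` rivals but not to you.

    `competitor_linkers` maps competitor -> set of domains linking to it.
    Returns (domain, competitor_count) pairs, most competitors first.
    """
    counts = Counter()
    for linkers in competitor_linkers.values():
        counts.update(set(linkers) - set(your_linkers))
    ranked = [(d, n) for d, n in counts.items() if n >= min_competitors]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


your_linkers = {"blog.example"}
competitor_linkers = {
    "rival-one.example": {"blog.example", "news.example", "wiki.example"},
    "rival-two.example": {"news.example", "dir.example"},
    "rival-three.example": {"news.example", "wiki.example"},
}

for domain, count in multi_competitor_linkers(your_linkers, competitor_linkers):
    print(domain, count)
```

With this data, news.example links to all three rivals and wiki.example to two, so they lead the list. dir.example links to only one rival and falls below the threshold.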

None of this requires a subscription, a login, or sending your prospect list to a third-party server. The whole workflow runs against local SQLite databases on your own machine, and the only cost is the one-time shard purchases for the letters you actually work with.

When to Use TLA vs a Subscription Tool

Tom's Link Authority is not a replacement for Ahrefs or SEMrush in every scenario. If you need real-time link monitoring, anchor text analysis, or page-level backlink data, you need a subscription crawler. If you're an agency managing dozens of clients with daily reporting requirements, the commercial tools earn their monthly fee through automation and freshness that a quarterly-update desktop tool cannot match.

Where TLA wins is the use case most freelancers and small site owners actually have: periodic competitive analysis, link prospecting before an outreach campaign, and authority checks on potential partners or directories. For these tasks, a one-time $5-per-shard purchase with no expiry and no recurring cost is significantly more economical than any monthly subscription. For domain-level questions (who links to whom, and how often), the data is comparable in scale, though its coverage reflects what Common Crawl happened to crawl.

The two approaches also complement each other. Use TLA to build your prospecting list and identify gap opportunities, then use a subscription tool's free trial or day pass to drill into page-level details for your highest-priority targets before you begin outreach.

Getting Started

Download Tom's Link Authority from tomdahne.com/link-authority. The desktop app is free. On first run, it downloads the rank databases automatically — these cover all 120 million domains and give you Harmonic Centrality lookups and batch processing at no cost. Edge shards for full backlink data are available per-letter at $5 each from the same page.

If your SEO workflow includes technical site auditing alongside link analysis, Tom's Site Auditor provides a full offline crawl and audit for any domain — the two tools together cover the technical and link sides of SEO without a single subscription.