Building KeyScope

In late 2024 I started building a YouTube keyword research tool called KeyScope. The idea came from conversations with friends who work on YouTube regularly. None of them had a simple, focused tool for what they actually wanted: type a keyword, see a defensible volume number, move on. Nearly every keyword tool I found in that space is a Chrome extension packed with a kitchen sink of other features, and most of them quietly admit (if you read carefully) that their YouTube search volume numbers are estimates.

Why I built it

I wanted to build the simple version. And I wanted the estimates underneath to be defensible, because people make content decisions based on those numbers.

Why Google was the baseline

YouTube does not expose a real keyword search volume API. You can pull trend lines, scrape autocomplete, read the results page, but no endpoint will tell you how many people searched for "how to tune a guitar" last month. Every tool in the category works around this by estimating.

Google, on the other hand, has real keyword volume data through Google Ads. So the question I started with was simple. If I can trust Google Ads volume for a keyword, how much of that volume also shows up on YouTube? The two search engines have very different patterns, but Google was the only one I could actually build a defensible baseline from.

The real insight came from looking at how different content categories behave on the two platforms.

The category problem

Gaming searches skew heavily toward YouTube. A surprisingly large share of gaming search traffic ends up there instead of on Google, because people go to YouTube to watch walkthroughs, reviews, and gameplay. Business and finance keywords are the opposite. People search "how to refinance a mortgage" on Google, not on YouTube, because they're looking for information, not a watch list.

A single blanket multiplier from Google to YouTube doesn't work. What works is a category-specific multiplier. So I went and built one.

I processed around 25,000 trend files from DataForSEO, analyzed over 8,000 keywords, and calculated a per-category coefficient: the ratio of YouTube volume to Google Ads volume for that category. The full system covers 15 categories, and the spread is wide.

Category	Coefficient	Source
Pets & Animals	0.97	Calculated
Film & Animation	0.95	Calculated
Music	0.80	Corrected
Gaming	0.485	Calculated
Cooking & Food	0.26	Calculated
Health & Fitness	0.16	Calculated
Business & Finance	0.08	Estimated

The formula underneath is simple.

YouTube_Volume = Google_Ads_Volume × Category_Coefficient

The interesting part isn't the formula. It's the 25,000 files underneath it, because without real data those coefficients would just be guesses.
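Applied to the coefficients in the table above, the formula looks roughly like this. It is an illustrative sketch rather than KeyScope's actual code: the function name and the choice to round to a whole number are assumptions, and only the seven categories shown in the table are included.

```python
# Coefficients copied from the table above (7 of the 15 categories).
CATEGORY_COEFFICIENTS = {
    "Pets & Animals": 0.97,
    "Film & Animation": 0.95,
    "Music": 0.80,
    "Gaming": 0.485,
    "Cooking & Food": 0.26,
    "Health & Fitness": 0.16,
    "Business & Finance": 0.08,
}


def estimate_youtube_volume(google_ads_volume: int, category: str) -> int:
    """Hypothetical helper: YouTube_Volume = Google_Ads_Volume x Category_Coefficient."""
    if category not in CATEGORY_COEFFICIENTS:
        # Refuse to guess: a missing category should not fall back to a blanket multiplier.
        raise KeyError(f"no coefficient for category: {category}")
    # Rounding to a whole number of searches is an assumption of this sketch.
    return round(google_ads_volume * CATEGORY_COEFFICIENTS[category])
```

For a keyword with 10,000 monthly Google Ads searches, this gives 4,850 for Gaming and 800 for Business & Finance, which shows how far apart two categories can land from the same baseline.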

For the few categories where I didn't have enough direct data, I used proxy estimation (Travel derived from Entertainment, DIY derived as an average of Cooking and Pets) and marked them clearly in the code as lower-confidence. A creator using the tool deserved to know which numbers to trust more.
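One way to keep that confidence label attached to the number is to store the two together. The sketch below is an assumption about shape, not the real code: the `Coefficient` class and `average_proxy` helper are hypothetical, and it assumes proxy-derived values carry the same "Estimated" label used in the table's Source column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coefficient:
    value: float
    source: str  # "Calculated", "Corrected", or "Estimated", as in the table

    @property
    def low_confidence(self) -> bool:
        # Assumption: anything not derived from direct data is lower-confidence.
        return self.source == "Estimated"


def average_proxy(*parents: Coefficient) -> Coefficient:
    """Derive a proxy coefficient as the plain average of its parent categories."""
    if not parents:
        raise ValueError("at least one parent coefficient is required")
    value = sum(p.value for p in parents) / len(parents)
    return Coefficient(round(value, 3), "Estimated")


cooking = Coefficient(0.26, "Calculated")
pets = Coefficient(0.97, "Calculated")

# DIY as the average of Cooking and Pets, per the rule described above.
diy = average_proxy(cooking, pets)
```

With the table values, the averaging rule puts DIY at 0.615. That figure is just arithmetic on the published coefficients, not a number taken from the project itself.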

None of this was meant to be a one-time calculation. The plan was to revisit the coefficients periodically as new data came in, because search patterns shift and a coefficient from six months ago might not reflect where searches actually go now. That recalibration cadence was part of the design.

The database pivot

I started the whole project on Supabase because it's the default "cheap backend for an MVP" choice. It was fine for a couple of weeks. Then I started pulling in real DataForSEO data at scale and hit the free-tier limits.

This is where my Ubersuggest years paid off. Analytics tools are heavily database-dependent in a way most SaaS apps aren't. You're not just storing user records. You're storing every keyword anyone ever searched, every API response, every coefficient calculation, every cache lookup.

The database is critical to the product.

When Supabase started to feel constraining, I went looking for a backend I could run indefinitely without paying, because I wasn't going to pay for infra on a side project at the pre-product stage. Cloudflare D1 had a much more generous free tier, and once I was thinking about D1 I was also thinking about Cloudflare Workers for the backend. The whole stack drifted that way in a couple of days.

I also built cost-aware habits into the database from day one. Time-to-live on every cached DataForSEO response. Multi-layer caching so the same keyword never cost me twice. A coefficient cache with a one-hour expiry. The rule in my head was: if I have to pay to fetch the same data twice, I built it wrong.
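The "never pay twice" rule can be sketched as a stack of TTL caches in front of the paid call. Everything here is a hypothetical illustration: the `TTLCache` class, the `get_or_fetch` helper, and the in-memory storage are assumptions (in KeyScope the slower layer would presumably be the database), and the clock is injectable only so the expiry logic can be exercised without waiting.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire ttl_seconds after being set."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def get_or_fetch(key: str, layers: List[TTLCache], fetch: Callable[[str], Any]) -> Any:
    """Check each layer in order; call the paid fetch only if every layer misses."""
    for index, layer in enumerate(layers):
        value = layer.get(key)
        if value is not None:
            # Backfill the faster layers that missed, using their own TTLs.
            for earlier in layers[:index]:
                earlier.set(key, value)
            return value
    value = fetch(key)
    for layer in layers:
        layer.set(key, value)
    return value
```

A coefficient cache with a one-hour expiry would be `TTLCache(3600)` in this sketch. The point of the layering is that a short-lived fast cache can expire without triggering a paid fetch, as long as a longer-lived layer behind it still holds the response.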

The DataForSEO cost story

DataForSEO is the paid API underneath everything. Their live endpoint is expensive per call, and a free-tier keyword tool that makes a live call for every search would go broke fast. They also have a standard queue endpoint that's cheaper but takes a few seconds, plus a tasks_ready endpoint that costs almost nothing and tells you when a queued result is ready to pick up.

I wired the service to submit to the standard queue, poll tasks_ready every couple of seconds, and only retrieve the result when it was actually ready. That combination, plus the multi-layer cache, cut my DataForSEO costs by about 23% per keyword compared to the live endpoint approach. For a paying product that's a rounding error. For a free-tier MVP, it's the difference between the project being sustainable and not.
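The submit, poll, retrieve flow can be sketched as a small loop. This is a shape illustration, not DataForSEO's client API: `submit_task`, `list_ready_task_ids`, and `get_result` are hypothetical callables standing in for whatever wraps the queue, tasks_ready, and result endpoints, and the two-second interval and 60-second timeout are assumed defaults.

```python
import time
from typing import Any, Callable, Collection


def fetch_via_queue(
    keyword: str,
    submit_task: Callable[[str], str],                       # hypothetical: queues a task, returns its id
    list_ready_task_ids: Callable[[], Collection[str]],      # hypothetical: the cheap readiness check
    get_result: Callable[[str], Any],                        # hypothetical: retrieves a finished task
    poll_interval: float = 2.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Submit to the queue, poll readiness, and retrieve only once the task is ready."""
    task_id = submit_task(keyword)
    deadline = clock() + timeout
    while True:
        # Only the cheap readiness call runs on every iteration.
        if task_id in list_ready_task_ids():
            return get_result(task_id)
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} not ready after {timeout} seconds")
        sleep(poll_interval)
```

Passing the three calls in as functions keeps the loop independent of any particular HTTP client, and the result would normally go straight into the cache so the same keyword is not queued again.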

Why it's on hold

The short version: life priorities. I built the algorithm, the database, the API integration, the authentication, an admin dashboard, a beta invitation system, and most of the core UI components. What I didn't finish was the last-mile work that turns a working backend into a product people use: integrating the UI, polishing the dashboard, finishing exports, setting up a real production deployment, adding billing.

The honest version: I hit the point where the remaining work was less interesting than the rest of my life. KeyScope was still pre-alpha at that stage. My brother was my test user, which was exactly the right shape for the project at that point: one trusted person poking at it and catching the rough edges before I opened it up to a wider circle. That wider opening was meant to happen during the polish-and-deploy phase, and I never got to that phase. The domain is still registered. The repo is still there. The project is on hold.

To really scale KeyScope, the next version is probably a Chrome extension, because Google doesn't expose much YouTube data server-side and an extension can read what the user is looking at directly. That would be a meaningful re-architecture, not a polish pass. So KeyScope sits where it sits.

What I kept

The biggest takeaway has nothing to do with YouTube. It's that when you're building an analytics tool, the database isn't a component, it's the spine. Every design decision in KeyScope eventually touched the database: how often to call DataForSEO, how long to cache a coefficient, how to structure queries for D1's free tier, how to handle retention when I didn't know what people would search for. The Ubersuggest years taught me this in the abstract. KeyScope made it concrete.

The smaller takeaway, which still matters: when a whole category of tools is "estimating" something, go find out what they're estimating from. I almost trusted a few existing tools as a starting point, and I'm glad I didn't. They turned out to be drinking from roughly the same source I eventually built my own algorithm on top of.