Sitemap + Knowledge-Base Gap Analysis: 7-Step Topic Discovery Workflow

Most teams fill calendars with brainstorms and keyword exports. Feels productive. It is not. You already have the map and the source code sitting in front of you: your sitemap and your Knowledge Base. Those two artifacts contain your real intent, the exact language your brand uses, and the coverage gaps you can own fast.
What follows is a practical, two-hour workflow that converts that map and that code into 10 to 12 enriched topics with clear angles and KB claim anchors. Fewer guesses. Fewer rewrites. Faster approvals. And yes, a repeatable system you can run monthly without adding headcount.
Key Takeaways:
- Turn a sitemap and KB inventory into a gap matrix that reveals high‑opportunity topics
- Use a simple rubric to prioritize by search intent, topical authority lift, and posting capacity
- Wire KB claims into H2s so drafts are answer‑ready for LLMs and stay on‑message
- Replace ad hoc brainstorms with a 7‑step, tool‑agnostic discovery workflow
- Prevent cannibalization by mapping every topic to a specific node, intent, and internal links
- Ship a 2-hour discovery sprint that hands off a ready‑to‑execute Topic Bank
Why Brainstorms Miss Your Best Topics
Your sitemap and KB are the untapped moat
- Treat the sitemap as your architectural plan and the KB as your canon. Start here, not with keyword tools. Pull a sitemap export, list parent nodes, and map how users actually navigate. Pair that with KB chunks that hold your positioning and proof. This mix reveals intent, internal language, and gaps. If a topic cannot tie to both, it is likely noise. For a productized version of this, review the content discovery workflow so you see how sitemap-driven analysis aligns with coverage.
- Promise clarity and speed to stakeholders. Tell them you will deliver 10 to 12 enriched topics with one-sentence angles and KB anchors in two hours. Use a simple stack: one sitemap export, 20 KB chunks, one spreadsheet. No specialized tools required. The focus is traceability and control. Every topic links to a page and a claim, which reduces back‑and‑forth later.
- Anchor for LLM visibility as you plan. Write H2s like retrieval anchors. Add short, specific paragraphs to each future section. KB claims become the grounding that LLMs quote. This is how you get consistent branded citations and avoid generic filler that never gets surfaced.
The hidden blind spots in brainstorm-first plans
- Identify the three big misses. One, duplicate coverage that cannibalizes rankings. Two, missing content for core navigational nodes like product or solution pages. Three, ungrounded topics that drift from brand language and force editors into rework. Think of the last draft that got kicked back because the angle felt off. That was a process failure, not a writer failure.
- Run a quick diagnostic before you ideate. Scan top‑level navigation and your highest‑traffic clusters. Compare them to your last 20 posts. Note where clusters are overfed and where pillar pages lack support. If you cannot link a topic cleanly to a sitemap node and a KB claim, cut it. You will reduce noise and focus on surface area you can own.
- Set expectations for approvals. Share the plan: one audit, one gap matrix, one prioritized queue. Stakeholders want fewer surprises and drafts that reflect the product narrative. Show them the inputs, the scoring, and the planned internal links. Confidence goes up. Edits go down.
Curious what this looks like in practice? Request a demo now.
Redefine Topic Discovery As A Systems Audit
Why the traditional approach fails your org
- Brainstorms do not scale because they ignore constraints, voice, and capacity. You get inconsistency, missed authority, and worried stakeholders. A systems audit reads the site like a map and the KB like source code. It brings structure, intent, and factual grounding into discovery so topics align to how you sell and serve customers.
- Treat discovery as orchestration. Your sitemap defines architecture. Your Knowledge Base defines the canon. Your cadence defines throughput. Tie topic selection directly to posting volume and approvals. If you need a primer on aligning discovery with publishing, review your publishing cadence so capacity, routing, and approvals inform what you choose to write next.
- Name the shift out loud to the team. Bold the line in your working doc: We are not brainstorming. We are executing a 7‑step audit that converts structure into topics. Then follow the sequence and measure outputs, not opinions.
The 7-step workflow at a glance
- Prepare inputs. Export the sitemap and inventory KB chunks. Output: two clean lists with IDs, titles, and tags.
- Extract seeds and entities. Highlight recurring product terms, pains, and capabilities. Output: seed phrases and entity sets with basic frequency scores.
- Build a gap matrix. Cross pages or clusters against seeds. Output: clear blanks, partials, overlaps, and link opportunities.
- Run an enrichment session. Generate 10 to 12 topics with one-sentence angles and KB claim anchors. Output: a candidate Topic Bank.
- Score with a rubric. Prioritize by intent fit, authority lift, and capacity. Output: a ranked queue with risk notes.
- Create structured briefs. Outline H2/H3s and place verbatim KB claims as anchors. Output: one‑page briefs ready for drafting.
- Hand off in a 2-hour sprint. Share the Topic Bank with statuses and links. Output: a single source of truth for production.
The Hidden Cost Of Guesswork
Duplicate coverage and cannibalization
- Play out the waste. You ship three posts that target the same head term with overlapping subheads. Traffic splits, rankings wobble, and none of the posts earn links. You spend 12 hours writing, 6 hours editing, 2 hours on design. That is 20 hours with no compounding visibility. A sitemap‑to‑post map would have shown the overlap before you wrote a word. If you want help spotting issues, study your search coverage gaps and build your own simple tracker.
- Use a basic alignment table in planning. Map each upcoming post to a node and intent so teams avoid duplicates and protect authority.
| Post Title | Sitemap Node | Core Intent |
|---|---|---|
| How to price ACME | /pricing | Navigational |
| ACME vs Competitor | /compare/competitor | Comparative |
| ACME for Finance Teams | /solutions/finance | Solution |
- Close with the felt truth. Cannibalization feels like running a race in sand. You work harder and go nowhere. A simple post-to-node map like the one above would have prevented it.
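The alignment table can also be checked mechanically. The sketch below is a minimal illustration, assuming planning rows are kept as (title, node, intent) tuples; the fourth row and the `find_overlaps` helper are hypothetical and added only to show an overlap being flagged.

```python
from collections import defaultdict

# Planning rows: (post title, sitemap node, core intent).
# The first three mirror the table above; the last is a hypothetical overlap.
planned_posts = [
    ("How to price ACME", "/pricing", "Navigational"),
    ("ACME vs Competitor", "/compare/competitor", "Comparative"),
    ("ACME for Finance Teams", "/solutions/finance", "Solution"),
    ("ACME pricing explained", "/pricing", "Navigational"),
]


def find_overlaps(posts):
    """Group titles by (node, intent) and keep groups with more than one post."""
    groups = defaultdict(list)
    for title, node, intent in posts:
        groups[(node, intent)].append(title)
    return {key: titles for key, titles in groups.items() if len(titles) > 1}


overlaps = find_overlaps(planned_posts)
for (node, intent), titles in overlaps.items():
    print(f"Possible cannibalization at {node} ({intent}): {titles}")
```

A shared node and intent is only a signal, not proof of cannibalization, so treat each flagged group as a prompt for a human decision: merge, re-angle, or cut.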
Ungrounded topics and brand drift
- Generic topic lists cause language drift. Editors push back, PMs get nervous, writers redo work. That is expensive. The fix is simple. Place two to three KB claim anchors inside the outline before writing. These anchors keep the draft inside the rails while leaving room for creativity.
- Make anchors concrete. Example: “Position ACME as the bridge between discovery and publishing, per Brand Intelligence claim #14.” That single line saves rounds of edits because it protects messaging. If you need to catalog your canonical language and claims, start with your brand positioning cues.
- Bake anchors into the brief itself. Place them under the H2 where the claim will appear. Note the chunk ID. During review, editors verify placement, not just vibes.
Throughput stalls and missed windows
- Fuzzy discovery slows everything. Work piles up, approvals stall, and you miss time‑sensitive opportunities. Think of an emerging trend that needed 48 hours, not two weeks. If your queue is not prioritized by capacity and intent, that opportunity slides.
- Make capacity explicit. If you can publish four pieces per month, queue eight weeks of topics with contingencies. Tie selection to cadence and routing. Discovery is the front door to the factory, not a separate exercise.
- Connect discovery and operations. Use a shared sheet with statuses and owners. If an item needs product review, flag it early. Missing this step is why “easy” articles get stuck for days.
From Overwhelmed To In Control
A quick story that mirrors the reader
- Picture this. You have 30 tabs open, a deadline, and a vague prompt. You feel the clock. You export the sitemap, pull 20 KB chunks, paste them into a sheet. Suddenly the map appears. You see clusters, gaps, and phrases your brand actually uses. The path shows up, one section at a time. Relief beats overwhelm.
- Be honest with yourself and the team. This is not magic. It is a repeatable way to reduce risk and friction. You trade improvisation for orchestration. Small wins stack. If you want a simple pattern for stacking progress, skim the idea of layering micro wins and use it to guide your first sprint.
- Run the sprint once. Share the Topic Bank with your exec. Let results, not arguments, do the convincing. Proof beats theory.
What you actually want from discovery
- The outcomes are simple. A prioritized queue. Clear angles. Grounded claims. A posting plan that fits real capacity. Confidence and speed beat novelty. Stakeholders prefer predictable, high‑quality throughput over random sparks that die in review.
- Use precise verbs in your plan. Generate topics, orchestrate flow, optimize coverage, publish consistently, measure performance, verify claims. These imply control and movement, not chaos. Your map and your canon do the heavy lifting.
- Assert the new model clearly. All of this is possible with a lightweight, tool-agnostic workflow that respects both sitemap architecture and KB truths. Once you feel it, you will not go back.
A Better Way: The 7-Step, Tool-Agnostic Workflow
Step 1: Prepare, export sitemap and inventory KB chunks
- Export the current sitemap to CSV or XML. Capture URL, title, parent node, and last modified date. In parallel, inventory KB chunks with IDs, short summaries, and tags. Keep formats simple; spreadsheets are fine. Traceability is the goal. You want to link every topic back to a page and to a KB anchor. For context on what to capture, review a sitemap export approach and mirror the fields.
- Choose inclusion criteria wisely. Keep core site sections, product pages, solution pages, docs if relevant, and pillar articles. For the KB, prioritize policy docs, messaging, product notes, FAQs, and sales decks. This keeps noise low and signal high for the next steps.
- Chunk with intent. Aim for 150 to 300 word chunks with a slug and a one-line claim summary. You are building reusable Lego bricks. Later, you will anchor angles to these claims so drafts stay aligned.
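If the export is a standard XML sitemap, a few lines of Python can turn it into spreadsheet-ready rows. This is a rough sketch under assumptions: the sample XML, the `parse_sitemap` and `chunk_words` helpers, and the rule of deriving the parent node from the URL path are all illustrative. Standard sitemap XML carries no page titles, so those would need to come from a crawl or CMS export.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample export; replace with the contents of your own sitemap.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/pricing</loc><lastmod>2024-01-15</lastmod></url>
  <url><loc>https://example.com/solutions/finance</loc></url>
</urlset>"""


def parse_sitemap(xml_text):
    """Return one row per URL with path, derived parent node, and lastmod."""
    rows = []
    root = ET.fromstring(xml_text)
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        path = urlparse(loc).path.rstrip("/") or "/"
        # Assumption: the parent node is the path with its last segment removed.
        parent = path.rsplit("/", 1)[0] or "/"
        rows.append({"url": loc, "path": path,
                     "parent_node": parent, "last_modified": lastmod})
    return rows


def chunk_words(text, max_words=300):
    """Split text into chunks of at most max_words words (naive word split)."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


rows = parse_sitemap(SAMPLE_SITEMAP)
for row in rows:
    print(row)
```

The chunker splits on word count alone, so the final chunk can fall below the 150-word floor; merge or trim those by hand when writing the one-line claim summaries.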
Step 2: Extract seeds and entities from KB content
- Scan KB chunks for recurring nouns, verbs, and entities. Highlight repeated product terms, customer pains, and feature capabilities. Create two lists: seed phrases and entity sets. These feed topic generation and keep angles grounded in your real language. If you need a home for this, use your brand language catalog as the source of truth.
- Keep the method lightweight. Paste chunks into a spreadsheet, add a column for candidate seeds, another for entities. Add quick scores for frequency or importance. You do not need complex NLP. Human judgment wins here because you know your product and buyer.
- Maintain a “do not use” list. Remove off‑brand, ambiguous, or legacy terms that cause drift. This prevents frustrating rework later and keeps drafts consistent.
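For teams that prefer a quick script over manual highlighting, the frequency pass can be approximated with a word counter. This is only a sketch: the chunk texts, the stopword set, and the `DO_NOT_USE` terms are hypothetical, and single-word counts are a starting point for human judgment, not a replacement for it.

```python
import re
from collections import Counter

# Hypothetical KB chunks keyed by chunk ID.
kb_chunks = {
    "kb-001": "Topic discovery starts with the sitemap. The sitemap shows gaps.",
    "kb-002": "Brand voice keeps drafts consistent. Gaps slow approvals. Avoid synergy.",
}

STOPWORDS = {"the", "with", "in", "a", "of", "and", "to"}
DO_NOT_USE = {"synergy"}  # hypothetical off-brand term


def seed_frequencies(chunks, stopwords=STOPWORDS, banned=DO_NOT_USE):
    """Count candidate seed words across chunks, skipping stopwords and banned terms."""
    counts = Counter()
    for text in chunks.values():
        for token in re.findall(r"[a-z][a-z\-]+", text.lower()):
            if token not in stopwords and token not in banned:
                counts[token] += 1
    return counts


freqs = seed_frequencies(kb_chunks)
print(freqs.most_common(5))
```

Paste the top results into the candidate-seeds column, then promote or discard them by hand.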
Step 3: Build a gap matrix: pages vs. KB topics
- Create a matrix with rows as sitemap pages or clusters and columns as KB seed topics. Mark where a page clearly addresses a seed, where it is partial, and where coverage is missing. Use color coding for speed. The visual will reveal blind spots you can fill immediately. For inspiration on cluster thinking, explore a topic gap matrix approach.
- Group pages by intent. Buckets like product, solution, learn, and compare make missing content more obvious. This also prevents cannibalization within the same cluster because each topic has a defined home.
- Add a “link opportunity” column. When a new topic fills a gap, list the internal pages it should support. This tightens site architecture while you plan and sets up internal links for launch.
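The matrix itself is just a pages-by-seeds grid with a status in each cell. A minimal sketch, assuming coverage judgments are recorded by hand as "covered" or "partial" and anything unrecorded counts as "missing"; the pages, seeds, and helper names are illustrative.

```python
# Hypothetical inputs from Steps 1 and 2.
pages = ["/pricing", "/solutions/finance", "/compare/competitor"]
seeds = ["topic discovery", "brand voice"]

# Hand-recorded judgments; any (page, seed) pair not listed is treated as missing.
coverage = {
    ("/pricing", "topic discovery"): "partial",
    ("/solutions/finance", "brand voice"): "covered",
}


def build_gap_matrix(pages, seeds, coverage):
    """Return {page: {seed: status}} with 'missing' as the default status."""
    return {
        page: {seed: coverage.get((page, seed), "missing") for seed in seeds}
        for page in pages
    }


def list_gaps(matrix):
    """Flatten the matrix into (page, seed, status) rows that still need work."""
    return [
        (page, seed, status)
        for page, row in matrix.items()
        for seed, status in row.items()
        if status != "covered"
    ]


matrix = build_gap_matrix(pages, seeds, coverage)
gaps = list_gaps(matrix)
for page, seed, status in gaps:
    print(f"{status:8} {page} x {seed}")
```

The flattened gap list is a convenient place to add the link-opportunity column, since each row already names the page a new topic would support.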
Step 4: Run an enrichment session to generate 10 to 12 topics
- Facilitate quickly. For each high-priority seed, propose two to three angles that match a specific intent and sitemap node. Write one sentence that states the angle and the KB claim it will prove. Aim for 10 to 12 topics in 30 minutes. Keep moving. Do not wordsmith. Use this format: “Angle X for Node Y, anchored to Claim Z.”
- Make angles actionable. Example: “Using customer intent data to prioritize briefs for /solutions/content-ops, anchored to Brand Intelligence claim #7.” That single sentence compresses strategy and guardrails. It speeds approvals because reviewers see purpose and proof.
- Capture metadata now. For each topic, add target persona, funnel stage, internal link targets, and KB anchor IDs. This becomes your Topic Bank. You will score it next.
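One way to keep the “Angle X for Node Y, anchored to Claim Z” format consistent is to store each topic as a small record and render the sentence from its fields. The `Topic` class below is a hypothetical sketch, and the persona, funnel stage, and link target values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One Topic Bank entry; field names are illustrative."""
    angle: str
    node: str
    claim_id: str
    persona: str = ""
    funnel_stage: str = ""
    link_targets: list = field(default_factory=list)

    def one_liner(self):
        # Renders the "Angle X for Node Y, anchored to Claim Z" format.
        return f"{self.angle} for {self.node}, anchored to {self.claim_id}."


topic = Topic(
    angle="Using customer intent data to prioritize briefs",
    node="/solutions/content-ops",
    claim_id="Brand Intelligence claim #7",
    persona="Content lead",        # placeholder value
    funnel_stage="Consideration",  # placeholder value
    link_targets=["/solutions/content-ops"],
)
print(topic.one_liner())
```

Because the sentence is generated from the record, a missing node or claim ID shows up as an obvious blank instead of slipping through review.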
Ready to see the workflow running end to end without coordination overhead? Try using an autonomous content engine for always-on publishing.
How Oleno Operationalizes The Workflow
Step 5: Prioritize with a simple rubric
- Score every topic on three dimensions: search intent fit, topical authority lift, and posting capacity. Use a 1 to 5 scale for each, then sort by the total. Add a tie‑breaker for internal urgency. Check the top 10 for cluster balance and internal link value. To keep decisions grounded in operations, align your rubric with your real prioritization framework.
- Do capacity math in the open. If you can publish four pieces per month, plan eight weeks with contingencies. Nothing kills momentum like overcommitting. The rubric becomes your forcing function to respect reality and keep stakeholders aligned.
- Flag risk early. If a claim is sensitive or needs product review, mark it on the topic. This reduces last-minute escalations and protects timelines.
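The rubric and the capacity math both reduce to a few lines of arithmetic. The sketch below assumes each dimension is scored 1 to 5, urgency is used only to break ties, and a month is treated as four weeks; the topic titles, scores, and function names are made up for illustration.

```python
# Hypothetical scored topics; each dimension uses the 1 to 5 scale.
topics = [
    {"title": "Topic A", "intent_fit": 5, "authority_lift": 4, "capacity": 3, "urgency": 1},
    {"title": "Topic B", "intent_fit": 4, "authority_lift": 4, "capacity": 4, "urgency": 3},
    {"title": "Topic C", "intent_fit": 2, "authority_lift": 3, "capacity": 5, "urgency": 5},
]


def total_score(topic):
    """Sum of the three rubric dimensions."""
    return topic["intent_fit"] + topic["authority_lift"] + topic["capacity"]


def rank_topics(topics):
    """Sort by total score, then by urgency as the tie-breaker, highest first."""
    return sorted(topics, key=lambda t: (total_score(t), t["urgency"]), reverse=True)


def queue_size(pieces_per_month, weeks, weeks_per_month=4):
    """Slots to fill, assuming a month is roughly four weeks."""
    return pieces_per_month * weeks // weeks_per_month


ranked = rank_topics(topics)
slots = queue_size(pieces_per_month=4, weeks=8)
for topic in ranked[:slots]:
    print(total_score(topic), topic["urgency"], topic["title"])
```

Topics A and B tie on total, so urgency decides the order. Contingency topics sit just below the slot cutoff and move up when something stalls in review.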
Step 6: Create structured briefs with KB anchors
- Use a one-page brief template. Include H1, H2 outline, thesis, one-sentence angle, two to three KB claim anchors with chunk IDs, target persona, funnel stage, internal link plan, and metadata placeholders for slug, meta title, and description. Keep it tight so editors can audit structure quickly. For approvals and routing, mirror your structured brief template.
- Insert verbatim KB claims into the outline. Note exactly where each claim appears. This eliminates brand drift and protects messaging. Editors now verify placement and accuracy, not rewrite the story.
- Add a verification checklist: sources, claims, internal link targets, and reviewers. You will reduce last‑minute rewrites and stakeholder anxiety before publish.
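A brief with these fields can live in a spreadsheet row or a small JSON document. The sketch below shows one possible shape, not a required schema: the field names, the `make_brief` helper, and the sample claim text are all hypothetical. It enforces the two-to-three anchor rule and checks that every anchor points at an H2 in the outline.

```python
import json


def make_brief(h1, h2_outline, thesis, angle, anchors, persona, funnel_stage, link_targets):
    """Assemble a brief; anchors is a list of {"chunk_id", "claim", "h2"} dicts."""
    if not 2 <= len(anchors) <= 3:
        raise ValueError("expected two to three KB claim anchors")
    for anchor in anchors:
        if anchor["h2"] not in h2_outline:
            raise ValueError(f"anchor {anchor['chunk_id']} points at an unknown H2")
    return {
        "h1": h1,
        "h2_outline": h2_outline,
        "thesis": thesis,
        "angle": angle,
        "kb_anchors": anchors,
        "persona": persona,
        "funnel_stage": funnel_stage,
        "internal_links": link_targets,
        # Placeholders to be filled in during drafting.
        "metadata": {"slug": "", "meta_title": "", "meta_description": ""},
    }


# All values below are placeholders for illustration.
brief = make_brief(
    h1="Example H1",
    h2_outline=["Why it matters", "How it works"],
    thesis="Example thesis.",
    angle="Example angle for /solutions/content-ops, anchored to claim kb-007.",
    anchors=[
        {"chunk_id": "kb-007", "claim": "Example claim text.", "h2": "Why it matters"},
        {"chunk_id": "kb-014", "claim": "Example claim text.", "h2": "How it works"},
    ],
    persona="Content lead",
    funnel_stage="Consideration",
    link_targets=["/solutions/content-ops"],
)
print(json.dumps(brief, indent=2))
```

Failing fast on a missing or misplaced anchor moves the verification checklist to the front of the process, before anyone starts drafting.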
Step 7: Ship a 2-hour discovery sprint and hand off the Topic Bank
- Run a timed agenda. 20 minutes to export and inventory, 25 minutes to extract seeds and entities, 25 minutes to build the gap matrix, 30 minutes to enrich and produce 10 to 12 angles, and 20 minutes to score and select. Those five blocks fill the full 120 minutes, so draft briefs for the top three right after the sprint, or trim a block to make room. Keep it tight. You can do this. Then set a recurring content sprint cadence so it becomes habit.
- Prepare the handoff artifact. A Topic Bank spreadsheet with tabs for seeds, the gap matrix, topics with scores, and short brief templates. Add a status column for approvals. This is your single source of truth for production.
- Iterate monthly. Feed performance data back into seeds and topics. Update the KB anchors as product language evolves. Discovery becomes a governed loop, not a one-off.
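Before running the sprint, it helps to total the named agenda blocks against the two-hour budget. A quick check, assuming a 120-minute sprint; the block labels are shorthand for the agenda items above.

```python
SPRINT_MINUTES = 120  # the two-hour budget

# (block, minutes) pairs taken from the timed agenda.
agenda = [
    ("Export and inventory", 20),
    ("Extract seeds and entities", 25),
    ("Build the gap matrix", 25),
    ("Enrich into 10 to 12 angles", 30),
    ("Score and select", 20),
]

allocated = sum(minutes for _, minutes in agenda)
remaining = SPRINT_MINUTES - allocated
print(f"Allocated: {allocated} min, left for brief drafting: {remaining} min")
```

If the remainder is zero, either shorten a block or schedule brief drafting as a follow-up so the sprint still ends on time.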
Oleno runs this entire model for you so you do not have to coordinate, prompt, or edit. Topic intake reads your sitemap and KB, Topic Intelligence generates enriched topics daily, and Angle Builder creates narrative paths using a seven‑step pattern. Brand Studio enforces your voice. The system generates JSON briefs with H2/H3s, metadata placeholders, and internal linking cues. Draft generation references your Knowledge Base for factual grounding and your Sales Narrative Framework for structure. QA‑Gate scores every draft across structure, voice alignment, KB accuracy, SEO integrity, LLM clarity, and narrative completeness. Minimum score to pass is 85. Enhancements add schema, TL;DR, and internal links. CMS connectors publish directly to WordPress, Webflow, or Storyblok with media, metadata, schema, and logs. Scheduling and capacity ensure even daily distribution. Instead of 12 hours of manual discovery and 6 hours of edits, you set rules once and get consistent, LLM‑ready articles automatically.
Ready to compress your discovery, drafting, and publishing into one governed flow? Request a demo.
Conclusion
You do not need more brainstorms. You need a system. Start with the assets you already own, your sitemap and your Knowledge Base, then run a 7-step audit that converts structure into topics, topics into briefs, and briefs into published articles. The payoff is predictable coverage, fewer rewrites, and drafts that LLMs can quote.
Run the 2-hour sprint once. Feel the control replace the overwhelm. Then choose your path. Keep running the workflow by hand with spreadsheets, or let Oleno operationalize it end to end so discovery, angles, briefs, drafts, QA, and publishing flow on their own.
Generated automatically by Oleno.
About Daniel Hebert
I'm the founder of Oleno, SalesMVP Lab, and yourLumira. Been working in B2B SaaS in both sales and marketing leadership for 13+ years. I specialize in building revenue engines from the ground up. Over the years, I've codified writing frameworks, which are now powering Oleno.