Streamlining & Simplifying Your Data Model for Copilot

Welcome to the second article in the AI-Ready Semantic Model series. If you missed Article 1, we explored the fundamentals of good model structure, namely relationships, metadata, and naming conventions, and how they lay the foundation for Copilot to work effectively. Now it's time to tackle what might be the most underappreciated aspect of a semantic model: simplicity.

If the semantic model is the nervous system of Power BI, then this article is about decluttering that nervous system before it has a panic attack. I’ve worked with enough models now to know that it’s not just AI that gets confused by a bloated dataset, humans are just as prone to being overwhelmed by a swamp of half-named columns or duplicated fields.

Early in my Power BI journey, I was very much in the "include everything and sort it out later" camp. One model in particular springs to mind, where I genuinely believed I was doing everyone a favour by including all the raw columns, every code field, every staging artefact, every financial and sub-financial measure variation. It wasn’t so much a semantic model as it was a data hoarding experiment.

When it comes to Copilot, simplification is even more important. Copilot doesn’t have context from office banter or that one time someone explained what ‘Amount1’ means. It just sees what’s in the model and tries to make sense of it. So your job is to reduce the noise, clarify the signal, and make sure it has fewer wrong paths to wander down.

Image thoughtfully produced by Acendant Analytics using Canva

Hide What Doesn’t Matter

If a column isn’t used in visuals, calculations, or any business conversation since 2019, then hide it. Better yet, don’t include it in the model at all. Technical keys, ETL helper fields, flags that no one understands , all of these are landmines for both users and Copilot. I’ve seen analysts try to build visuals using raw keys that had summarisation set to ‘None’, only to wonder why nothing was aggregating. But it’s not their fault, the model let them do it.

By hiding unused columns and redundant tables, you reduce clutter and make it easier for Copilot to guess what users mean. It won’t accidentally pick up ‘Product_ID_Internal’ instead of ‘Product Name’, which can save a lot of apologetic Teams messages.

Use Explicit Measures, Hide the Raw Stuff

Fact tables are for storing data, not showing it. Columns like Sales Amount, Discounts, or Costs should be hidden in favour of well-defined DAX measures. This keeps your visuals clean with centralised logic so your Copilot won’t end up generating 17 variations of ‘Amount’ in a visual.

Explicit measures are the vocabulary Copilot learns from. When a user says "show me revenue by region", you want Copilot to confidently grab ‘Total Sales’ rather than make a best guess between ‘Amount’, ‘Amt’, or ‘Sales_Value_New’.

Another key task is disabling summarisation on fields like IDs or month numbers. This helps avoid those moments where someone proudly builds a chart and ends up summing the fiscal month. That’s how we end up with a Y-axis labelled ‘78’, and no one knows why.

Assign Proper Data Categories

This one’s a quiet hero. By tagging your fields with appropriate data categories, like ‘City’, ‘Country’, ‘Longitude’, or ‘Date’, you help both visuals and Copilot interpret the data properly. It’s a tiny change with a big impact.

I’ve seen reports where geography fields were stored as plain text, only to fail in map visuals or throw Copilot’s logic off entirely. Telling Power BI what a field represents gives it a fighting chance to be rendered correctly and makes it more likely that natural language queries resolve into something actually useful.

Avoid “Everything in One Table” Syndrome

We’ve all seen that model. One giant table with every metric, every category, every date, and a few spare user-defined flags that no one dares delete. It’s tempting to build like this because there’s fewer relationships to manage with everything in one place. Except what you gain in perceived simplicity, you lose in performance (and sanity).

These mega-tables lead to ambiguous joins, duplicated columns, and slow refresh times. Copilot, faced with 200 columns, may not pick the right one. You’ll also run into the dreaded duplicate results if users start combining fields without understanding how they connect.

Remove Redundant Calculated Columns

Calculated columns have their place, but many of them belong in Power Query or as DAX measures. If a column doesn’t need to exist on every row of your data, consider whether it needs to exist at all.

Redundant calculated columns inflate your model size, as well as chewing through memory, no wonder they slow everything down. I’ve seen models bogged down by high-cardinality calculated columns that were unnecessarily materialised row-by-row. These columns weren’t used in measures or filtered properly, making the model preform at a snail’s pace.

Stop Duplicating Fields Across Tables

This one gets messy fast. You’ve got ‘Product Name’ in the Product table, great. Then someone adds ‘Product Name’ again in the Sales table for convenience. Multiply that across Customers, Territories, Channels and so on, now suddenly your model has eight copies of every key field.

Users won’t know which one to pick, so Copilot definitely won’t. Worse still, it might guess inconsistently across questions, resulting in similar queries producing totally different answers.

Stick to one source of truth per attribute. Prefer dimension tables, and train users (and Copilot) to pull fields from those.

Real World Consequences (Trust Me, I’ve Seen Them)

When models get cluttered, weird things start to happen. I’ve seen business users swear the data was wrong, only to find they were dragging unrelated columns from two unconnected tables into the same visual. Or the time a data scientist built a DAX query using measures from three different fact tables with no relationships between them. The query ballooned into a monster, consumed far more memory than it needed to, and tanked the capacity for everyone. Performance alerts were firing, report visuals ground to a halt, and everyone else in the workspace got caught in the crossfire. It wasn’t pretty.

Then there are the marketing analysts who couldn’t figure out why their visual comparing Campaign and Region produced duplicates and blanks. They were combining columns from two dimensions with no shared relationship. Copilot would have struggled too, because it uses the same relationship logic underneath. A clean model protects everyone, human or AI, from this sort of chaos.

Performance Isn’t Just About Speed

Sure, stripping out unnecessary columns and optimising relationships will reduce memory load, refresh time, and make everything faster. That’s nice, but the real win is clarity.

A simplified model is easier to maintain, easier to query, and far easier for Copilot to interpret. It’s the difference between asking a well-organised analyst for a sales trend, versus rifling through a stranger’s filing cabinet hoping for the best.

Give your users, and your AI assistant, the best chance at getting things right. Start by taking things out, then take out a bit more. Then explain to your team why they didn’t need three columns for “Date Label”. They’ll thank you later.

Next up: Tabular Editor and the joy of letting robots do the boring bits for you.

Next
Next

Do You Really Need a Semantic Layer? It Depends, But Probably Yes