The Ontology of the Construction Industry

Every few days, a customer asks me how Archdesk uses AI, usually out of FOMO. I could give the easy answer: a chatbot, AI-driven workflows, agents, MCP, all the things you'd expect to hear. We've built all of them already. But every time I reach for that list, the software engineer in me refuses to leave it there, because the list misses the point, and I feel I need to educate instead.

The honest answer doesn't start with AI features at all. It starts with the fundamentals every non-technical person should understand before asking the question. I'm a software engineer by training, and I see businesses as data models. When I look at a company, I see its structure, its work and its challenges as objects, their attributes, and the relationships between them. I started programming when I was only ten years old, and my brain is naturally skewed towards thinking in data objects. It took me many years to realise this isn't how most other people see the world. I see a business's ontology. And how good any AI can ever be, on top of that company, is decided entirely by how good that underlying model is. Get the model right and the AI is almost an afterthought. Get it wrong and, well, you just end up with fancy chatbots.

You can't reason about a world you never modeled.

That way of seeing is the whole answer, and it's the thinking that shaped the company (Archdesk) from the first line of code. But before I can explain it, I have to back up and explain what an ontology even is.

What an ontology actually is

An ontology is just a description of the things in a domain, the properties they have, and the ways they relate to each other. Objects, attributes, relationships. The easiest way to see it is with something simple, like a library: a Book, an Author, a Member, a Loan. An author writes books. A loan connects a member to a book on a date. That's the whole model.

[Figure: A tiny ontology. Four nouns and the lines between them.]

Once the objects and relationships exist, the system can answer questions no one ever stored, like "which members have overdue books by this author?", because the relationships carry the answer.
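To make that concrete, here is a minimal Python sketch of the library model. The four classes follow the four nouns above. The `due` and `returned` fields, the function name and the sample data are assumptions added purely for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Author:
    name: str


@dataclass(frozen=True)
class Book:
    title: str
    author: Author  # relationship: an author writes books


@dataclass(frozen=True)
class Member:
    name: str


@dataclass(frozen=True)
class Loan:
    member: Member         # relationship: a loan connects a member...
    book: Book             # ...to a book...
    due: date              # ...on a date (assumed here to be the due date)
    returned: bool = False


def overdue_members_for_author(loans, author, today):
    """Walk Loan -> Book -> Author. 'Overdue by author' is never stored anywhere."""
    return sorted({
        loan.member.name
        for loan in loans
        if not loan.returned and loan.due < today and loan.book.author == author
    })


# Illustrative data only.
author_a = Author("A. Author")
author_b = Author("B. Author")
book_1 = Book("First Book", author_a)
book_2 = Book("Second Book", author_b)
alice, bob = Member("Alice"), Member("Bob")
loans = [
    Loan(alice, book_1, date(2024, 1, 10)),
    Loan(bob, book_1, date(2024, 3, 1)),
    Loan(bob, book_2, date(2024, 1, 5)),
]
today = date(2024, 2, 1)

print(overdue_members_for_author(loans, author_a, today))  # ['Alice']
```

Nothing in the data says "Alice has an overdue book by this author". The answer falls out of following the relationships.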

You already live inside ontologies other people built. Open Salesforce and a sales team thinks in Accounts, Opportunities and Leads. Nobody argues about whether an "Opportunity" should exist. It just does, because the software made it real. Palantir took the same idea to its limit: a blank canvas you can model any business on, wire up its objects and actions, then run AI over the top. That principle is exactly right. The only thing I'd change is that the canvas shouldn't start blank for an industry we already understand. For construction, it can be drawn in advance.

And no, this isn't just a database. A database describes how data is stored. An ontology describes what the data means and how it connects, in the language of the business itself. The idea isn't even new to software: decades ago, object-oriented programming taught us to model real things as objects that carry their own data and behavior, and domain-driven design (DDD) taught us to build software around the language of the business rather than the shape of a database. An ontology is simply that discipline applied to a whole industry. Archdesk was built on many DDD principles.

A schema tells a computer where to put a number. An ontology tells it what the number is.

The industry built its model in a spreadsheet

Construction companies already model their world, and have done for decades. The tool they reached for was the spreadsheet, and that was a sensible choice. Excel assumes nothing. It doesn't insist that a project has a "type" or a manager or anything at all. It let every contractor write their own dialect of construction without asking a vendor for permission.

Excel was never the data model. It was the blank canvas every contractor drew one on.

The trouble is that a spreadsheet can't understand. It knows cell C3 equals A1 plus B2. It has no idea that A1 is a committed cost, that B2 is a change order, or why moving one should move the other. And as the foundation for a whole company it falls apart fast. Every sheet is a copy, integrity rests on whoever touched the file last, and it all breaks the moment you try to pull tens of projects into one system.

A pile of brilliant tools, and no model underneath

So companies bought software instead, and the market gave them plenty: a tool for scheduling, one for safety, one for documents, one for accounting, each excellent at its one job. I admire these tools. The problem is what you're left with when you own forty of them and none of them talk to each other. There's a reason for it, too. Construction software was built by construction people, who built sharp tools for the problems in front of them. Almost nobody stopped to ask the architectural question underneath it all: what is the shared data model that every one of these tools should be speaking to?

Our industry's software was built by builders. That's its genius, and its blind spot.

I'll be honest about my own part in this. For years I talked about my company the way every vendor does, in features, modules and solutions, because that's the language the market knows how to buy. But underneath, it has only ever been one thing: a data platform for construction. AI has, almost by accident, finally made the industry care about that layer, the one I've cared about all along.

What I want isn't a walled garden. Keep your best tools. Archdesk has scheduling built in, but if your planner lives in Primavera P6 or MS Project, fine. Estimate in Bluebeam, model in Revit, post your numbers to Sage or Xero. A specialist tool can live happily inside one stage of the work. Just don't let it become an island.

Take a simple example. A sensor sits in a fresh concrete pour and reports how strong it is as it cures. On its own, in its own app, that reading is just a number a site engineer glances at. But connect it to the schedule task it controls, striking the formwork, and the schedule starts to understand it. When the concrete is strong enough, the task is released. When it cures too slowly, the task holds, the delay is flagged, and the cost updates before a day is quietly lost. It's the same number either way. What changes is that it's now connected to the work it affects.

Stored, it's just a reading. Connected to the task it gates, it's a decision.
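A minimal sketch of that gating relationship in Python. The `Task` fields, the 10 MPa threshold, the 48-hour expectation and the function name are all hypothetical, chosen only to show a reading driving a task. This is not a description of how Archdesk implements it.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    required_strength_mpa: float  # hypothetical release threshold
    expected_by_hour: int         # hours after the pour by which strength was planned
    status: str = "held"
    flags: list = field(default_factory=list)


def apply_reading(task, strength_mpa, hours_since_pour):
    """Update the gated task from a single sensor reading and return its status."""
    if strength_mpa >= task.required_strength_mpa:
        task.status = "released"
    else:
        task.status = "held"
        # Still too weak after the planned curing time: flag the delay.
        if hours_since_pour > task.expected_by_hour:
            task.flags.append("curing slower than planned")
    return task.status


strike = Task("Strike formwork", required_strength_mpa=10.0, expected_by_hour=48)
print(apply_reading(strike, 6.5, 24))   # held
print(apply_reading(strike, 8.0, 60))   # held
print(strike.flags)                     # ['curing slower than planned']
print(apply_reading(strike, 11.2, 72))  # released
```

The reading itself never changes meaning. The link to the task is what turns it into a release, a hold or a flagged delay.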

A project runs through one continuous lifecycle. The point tools each own a moment in it.

Opportunity (CRM)
Estimation (Bluebeam)
Design (Revit)
Procurement (Excel)
Mobilisation (H&S app)
Construction (Primavera P6)
Handover (Snagging app)

One connected data model, running underneath every stage

Each point solution is excellent at its one job, at its one moment in the work. None of that has to change. They just stop being islands, and feed what they know into one shared model that runs underneath the whole lifecycle.

AI belongs above the tools, not inside them

Most "construction AI" today is a search engine pointed at a pile of documents. It returns passages that look right, wrapped in fluent language. Ask it a real question and the cracks show: it can't total a portfolio budget or trace a dependency, and it has no idea what state your projects are in. It has read everything ever written about construction and knows nothing about your portfolio.

Here's a thought I keep coming back to: for this work, I've stopped seeing much difference between a human and an AI. Ask a project director whether a job will land on time and on budget, and they go and search P6, the mailboxes, a dozen spreadsheets, and then, perhaps after a few hours, they come back with an answer. Give them one cohesive model and it takes minutes. An AI is no different. We just keep bolting it onto the application, a little chatbot in the corner of every tool, when it belongs above them, the way a person at a desk sits above the tools, knowing every system and how to move between them. Neither is smarter than the other. Both are only as good as the model they can reach.

The ontology is what moves AI from retrieval to reasoning.

The hardest part isn't the building. It's the language.

After tens of thousands of projects run through our platform, across companies on six continents, one thing is clear: the way work flows through a construction company is remarkably consistent. A Cat A fit-out and a stretch of motorway have nothing in common to build, but the process is the same: someone estimates, procures, schedules, builds and gets paid. The real obstacle is that every region and firm has invented its own words for it. One company's "stages" are another's "phases". A "variation" here is a "change order" there. A "BOQ" is an "estimate" is a "schedule of values". They build the same way and describe it in words that don't translate.

No two of our customers have ever had the same data model. Every one of them stands on the same backbone.

I don't say that loosely. Across the entire customer base, there are no two accounts configured exactly alike, not one matching pair. Different names, different classifications, different fields, different workflows. And yet, underneath all of it, the same handful of objects relating to each other in the same handful of ways. That contradiction, total variety on the surface and total consistency underneath, is the single most important thing I've learned about this industry. It's also the whole reason the model can be pre-built and still fit everyone.

What changes is the surface: the words, the classifications, the custom fields. What doesn't change is the deep structure, that scope gets priced, scheduled, procured, delivered and paid for, whatever you call each step. So the answer isn't to pre-build anyone's business. It's to build one stable home for each underlying concept, and let every contractor label it in their own language. This is where the giants struggle. Procore and Autodesk Construction Cloud are huge in construction software, but both lean on rigid, opinionated models, and a fixed model is a bet that every customer works the way you imagined.
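One way to picture "one stable home per concept, labelled in each contractor's language" is a small mapping layer. The sketch below is an assumption-laden illustration in Python: the concept keys, tenant names and function names are invented here and do not describe Archdesk's actual configuration.

```python
# Canonical concepts: the stable "home" for each idea (hypothetical keys).
CANONICAL = {"phase", "change_order", "estimate"}

# Each contractor labels the same concepts in its own words (illustrative data).
TENANT_LABELS = {
    "contractor_a": {
        "phase": "Stage",
        "change_order": "Variation",
        "estimate": "BOQ",
    },
    "contractor_b": {
        "phase": "Phase",
        "change_order": "Change Order",
        "estimate": "Schedule of Values",
    },
}


def label_for(tenant, concept):
    """Return the tenant's own word for a canonical concept."""
    if concept not in CANONICAL:
        raise KeyError(f"unknown concept: {concept}")
    return TENANT_LABELS.get(tenant, {}).get(concept, concept)


def concept_for(tenant, label):
    """Map a tenant's word back to the canonical concept, or None if unknown."""
    for concept, text in TENANT_LABELS.get(tenant, {}).items():
        if text.lower() == label.lower():
            return concept
    return None


print(label_for("contractor_a", "change_order"))    # Variation
print(concept_for("contractor_b", "Change Order"))  # change_order
```

Both contractors end up pointing at the same `change_order` concept, so anything reasoning over the model can compare them without caring which word each one uses.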

The construction ontology, object by object

The construction ontology is big. Archdesk on its own defines more than 300 domain objects, and a single company will extend that with plenty of its own. So let's not try to cover all of it. Let's focus on the basics, the backbone, as a connected world. It starts with the Company, and under it the top-level objects: the Projects, the Clients you build for, the Suppliers you buy from, the Users who do the work. When you create a project, we ask for one thing, its name. We don't assume how you classify your work, because that assumption is what makes rigid software break. Everything else, the types, the phases, the custom fields, you add in your own language.

If there's one object to understand in all of construction, you find it by going deep. Inside a Project, within an Estimate (your BOQ, your quote, your schedule of values, whatever you call it), sits a Section, and inside that Section is a single line item. The BOQ Item. It has a quantity, a unit of measure, a price and a margin, and it is the single atom of the whole industry, the unit of work that everything else orbits. You pick the grain: a whole package at a high level, or a single bolt from a bill of materials. Almost every process touches it. The same item is estimated, approved through submittals and specifications, scheduled, subcontracted, delivered, valued and invoiced, and it carries its identity the whole way. The very same item is revenue to the contractor who installs it and cost to the owner who pays for it, two sides of one object. That is why a number on a board report can be traced down to the exact bolt, and why the bolt can be rolled back up to the board.

From the boardroom to the bolt, every number has a parent.
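The roll-up and the trace can be sketched in a few lines of Python. The classes mirror the Estimate, Section and BOQ Item described above. Treating `unit_price` as a selling price and `margin` as a fraction of it is an assumption made for this example, as are the method names and the sample figures.

```python
from dataclasses import dataclass, field


@dataclass
class BOQItem:
    description: str
    quantity: float
    unit: str
    unit_price: float  # assumed: selling price per unit
    margin: float      # assumed: fraction of the selling price, e.g. 0.25

    @property
    def total(self):
        return self.quantity * self.unit_price

    @property
    def cost(self):
        return self.total * (1 - self.margin)


@dataclass
class Section:
    name: str
    items: list = field(default_factory=list)

    def total(self):
        return sum(item.total for item in self.items)


@dataclass
class Estimate:
    name: str
    sections: list = field(default_factory=list)

    def total(self):
        # Roll up: items -> sections -> estimate.
        return sum(section.total() for section in self.sections)

    def trace(self, description):
        """Drill down: return the path from the estimate to a single item."""
        for section in self.sections:
            for item in section.items:
                if item.description == description:
                    return [self.name, section.name, item.description]
        return None


bolt = BOQItem("M16 bolt", 400, "nr", 2.50, 0.25)
beam = BOQItem("Steel beam", 10, "nr", 900.0, 0.25)
paint = BOQItem("Wall paint", 200, "m2", 15.0, 0.25)
estimate = Estimate("Office fit-out estimate", [
    Section("Structure", [bolt, beam]),
    Section("Finishes", [paint]),
])

print(estimate.total())            # 13000.0
print(estimate.trace("M16 bolt"))  # ['Office fit-out estimate', 'Structure', 'M16 bolt']
```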

Around the scope item, the commercial world assembles itself. The BOQ groups items into Sections and totals them into an estimate. The CBS, the cost-breakdown structure, is the spine every cost reports against, so you can hang any KPI on it: commitments, change, earned value, CPI, SPI. Change Orders revise scope or price and auto-compare against the original BOQ to show the impact on both budget and margin. On the client side, Valuations certify progress period by period, break into Valuation Items, and flow into a Sale Invoice, which carries its own Invoice Items and is finally closed by a Payment. On the supply side, a Purchase Order commits spend through its PO Items, recognizes delivered value against the CBS before any invoice arrives, and reconciles against the supplier's bill when it lands. Those invoice items also reconcile against goods received, the GRN, so the accountants stay happy and nothing gets paid for twice.
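The supply-side reconciliation above resembles what accountants usually call a three-way match between the order, the goods received and the supplier's invoice. Here is a minimal Python sketch of that check for a single PO item. The function names and the rule itself (approve no more than what was both ordered and received, less what was already invoiced) are assumptions for illustration, not Archdesk's documented logic.

```python
def approvable_quantity(ordered, received, already_invoiced, invoiced_now):
    """Quantity on the incoming invoice line that can be approved for payment.

    Never more than what was both ordered and received (the GRN), and never
    a quantity that an earlier invoice has already covered.
    """
    still_payable = min(ordered, received) - already_invoiced
    return max(0, min(invoiced_now, still_payable))


def uninvoiced_delivered_value(received, already_invoiced, unit_price):
    """Delivered value to recognize as cost before the supplier's bill arrives."""
    return max(0, received - already_invoiced) * unit_price


# Illustrative figures: 100 ordered, 60 received, 40 already invoiced,
# and a new invoice arrives claiming 30.
print(approvable_quantity(100, 60, 40, 30))       # 20
print(uninvoiced_delivered_value(60, 40, 12.0))   # 240.0
```

In this example the supplier bills for 30 units, but only 20 delivered units are still unpaid, so 10 are held back until more goods are received.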

Then comes one of the most valuable relationships in the whole model, where it starts to earn real money: the marriage of schedule and cost. A Program holds the schedule; its Tasks, the WBS, carry dates, dependencies and resources. Link the exact BOQ items and quantities to a task, and cost stops being a static number and becomes a function of time. That one connection forecasts your cash week by week, projects cost at completion, and tracks productivity live. Plan to pour a thousand cubic metres at two hundred a day, pour a hundred and fifty, and the model sees the gap the same day, not at month end.

A schedule tells you when. A budget tells you how much. Only their marriage tells you whether you'll make it.
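The concrete pour example works out as follows. This is a simple Python sketch that assumes the crew keeps pouring at the rate achieved so far and that durations round up to whole days. The function and field names are hypothetical.

```python
import math


def forecast(total_qty, planned_rate, done_qty, days_elapsed):
    """Compare planned and actual output on a task and project the finish."""
    if days_elapsed <= 0 or done_qty <= 0:
        raise ValueError("need at least one day of recorded progress")
    planned_days = math.ceil(total_qty / planned_rate)
    actual_rate = done_qty / days_elapsed
    # Assumption: the remaining work continues at the rate achieved so far.
    remaining_days = math.ceil((total_qty - done_qty) / actual_rate)
    forecast_days = days_elapsed + remaining_days
    return {
        "planned_days": planned_days,
        "actual_rate": actual_rate,
        "productivity": actual_rate / planned_rate,
        "forecast_days": forecast_days,
        "slip_days": forecast_days - planned_days,
    }


# Plan: 1,000 m3 at 200 m3/day. Day one: 150 m3 poured.
result = forecast(total_qty=1000, planned_rate=200, done_qty=150, days_elapsed=1)
print(result["productivity"])  # 0.75
print(result["slip_days"])     # 2
```

After one day at 150 m³ instead of 200, the five-day pour already forecasts to seven days. That gap is visible the same day only because quantity and task are linked.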

Resources flow into the same web. Timesheets clock crews against a task with GPS, turning hours into labor cost on the right CBS line. Assets, the plant and machinery, are scheduled globally across every job. And the records that keep a portfolio honest hang off the project too: Documents that extend into a full CDE, RFIs, and a Risk Register in which risks can be raised against any task, supplier or scope item.

Last, the part that lets the model outgrow anything we shipped: the meta-model. Forms let you build your own objects and attach them to almost anything, and the Workflow engine reacts to every event. Invent a Near-miss report we never planned for, make it first-class, route it to the safety lead, and reason about it the next morning. Every object, event and action is reachable through an API, and we've been API-first since day one.

At this point a fair challenge usually lands. Fine, it's a beautiful data model, but the reality of construction is brutal. You can't expect a crew on a muddy site to sit filling in forms all day to keep the thing alive. And you're right. If the only way in were our forms, this would already be dead.

So separate two things that get confused: the model, and the way data gets into it. The model is the definition of what your company is and how it connects. The interface is just a door, and there can be as many doors as you like. The same scope item can be updated from the web, from a phone in a site hut, by an agent reading an email, by voice, from a photo or a video of the work, or straight through the API from a tool you already run. The people on site are already capturing this data somewhere, in a message, a daily diary, another app. We are not asking them to drop that and learn our screens. We're asking that whatever they already do flows into one shared model underneath.

And this is where it gets interesting. Once the model is solid, the interface no longer has to be one fixed set of screens built for the average user. It can be generated, on the fly, for the person in front of it: a simple form for one role, a chat for another, a voice prompt for someone with their hands full on site, a dashboard for the office. Each shaped around what that person actually needs to see and capture. A model that is easier to feed is a model that stays truer, and a truer model makes every answer above it better. The interface stops being a barrier to keeping the data alive and starts pulling its weight.

The model is the thing that has to be right. How the data gets in is a detail, and every door is allowed.

Exhibit 2 · The Construction Ontology

The exhibit shows roughly 20 of Archdesk's 300+ entities, a deliberately small slice. Start at the BOQ Item to see how much of the business runs through it. A full implementation runs to more than 300 connected entities like these, which together describe the ontology of the construction industry as a whole, and from this backbone every company shapes and extends its own.

"Is this project going to be late, and why?"

Ask the question every director really wants answered, and watch three kinds of AI handle it. A generic LLM knows construction in general but nothing about your job, so you get a horoscope: "delays are common on projects like this, you may want to review your schedule, resourcing and supply chain." A document AI reads your files and hands back a plausible passage: "clause 14.2 of your contract sets out liquidated damages in the event of late completion." Both sound reasonable. Neither has the faintest idea whether your project is actually late.

An ontology-grounded AI does something else entirely. Its only real job is to turn your plain question into the right walk across the model, and the answer back into a sentence. So it walks: "Phase 3 is eleven days behind. Timesheets show the concrete pour running at 150 m³/day against a planned 200. That slips the linked WBS task, drops your CBS productivity index, and moves forecast completion past the practical-completion milestone. It pushes your next client valuation, and the cash it releases, into the next quarter." The same kind of walk answers the question that keeps senior people awake, what happens if a key supplier fails, tracing from their open purchase orders out to every undelivered item, every dependent task, every project at risk. One question, and the model does the reasoning the documents never could.

Retrieval finds the clause. Reasoning tells you what it costs you.
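The supplier-failure walk is, at heart, a graph traversal. The Python sketch below uses plain dictionaries and invented records (supplier names, PO ids, task ids and projects are all hypothetical) to show the shape of the walk: open purchase orders, then undelivered items, then the linked tasks and everything downstream of them, then the projects at risk.

```python
from collections import deque

# Illustrative records only; field names are assumptions, not a real schema.
purchase_orders = [
    {"id": "PO-1", "supplier": "SteelCo", "item": "beams", "delivered": False, "task": "T1"},
    {"id": "PO-2", "supplier": "SteelCo", "item": "bolts", "delivered": True, "task": "T1"},
    {"id": "PO-3", "supplier": "GlassCo", "item": "glazing", "delivered": False, "task": "T4"},
]
tasks = {
    "T1": {"project": "Tower A", "successors": ["T2"]},
    "T2": {"project": "Tower A", "successors": ["T3"]},
    "T3": {"project": "Tower A", "successors": []},
    "T4": {"project": "School B", "successors": []},
}


def supplier_failure_impact(supplier):
    """Trace a supplier's undelivered items to every task and project they put at risk."""
    undelivered = [
        po for po in purchase_orders
        if po["supplier"] == supplier and not po["delivered"]
    ]
    # Breadth-first walk along task dependencies, starting from the linked tasks.
    queue = deque(po["task"] for po in undelivered)
    at_risk = set()
    while queue:
        task_id = queue.popleft()
        if task_id in at_risk:
            continue
        at_risk.add(task_id)
        queue.extend(tasks[task_id]["successors"])
    return {
        "items": sorted(po["item"] for po in undelivered),
        "tasks": sorted(at_risk),
        "projects": sorted({tasks[t]["project"] for t in at_risk}),
    }


print(supplier_failure_impact("SteelCo"))
# {'items': ['beams'], 'tasks': ['T1', 'T2', 'T3'], 'projects': ['Tower A']}
```

No language model is involved in the walk itself. In this picture, its job would be to turn the plain question into a call like this one and the result back into a sentence.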

Most of this never needed AI at all

Look back over those answers: the sensor releasing a task, the project found eleven days late, the failing supplier traced to the work that stops. Every one is exact. None needed probability, a language model, or a single guess. They're deterministic walks across a correct model, arithmetic and relationships, nothing more. Almost all the construction AI I've seen is straining to answer questions a deterministic system could answer perfectly, if only the data sat in the right model. We reach for probability to cover for a foundation we never poured.

This isn't an argument against AI. It's an argument about order. Once the model is right, a language model earns its place, as the interface that turns plain English into a precise query and back into a sentence, and for the rare problems that really are open-ended. But the reasoning that matters happens in the model beneath it.

The language model is the voice. The ontology is the mind.

Will this project be on time and on budget, before it even starts?

This is the question I really want to answer. Stand in front of a project that hasn't broken ground and ask it plainly. Nobody can answer that honestly today, and getting it wrong is not abstract: it's how contractors who looked fine on paper quietly go under. But imagine software that could, even at 75% accuracy, in an industry that runs on thin margins and forecasts that are mostly hope. It isn't a fantasy. Reverse-engineer the question and it breaks into smaller ones a data model can hold: how this scope priced last time, how that subcontractor performed, where this kind of job tends to slip. Run thousands of jobs inside one model, each fully understood rather than half-remembered in a folder, and you have a structured memory to reason from. A system that only saw outcomes spots a correlation. One that understands the why can tell you something you can act on.

A spreadsheet remembers what happened. An ontology understands why. Only one of them can tell you what happens next.

A word for the other founders reading this. This is a long-term vision we've been building toward for many years, quietly, one object at a time. AI didn't start it. AI accelerated it, and gave the rest of the world the vocabulary to finally understand what we were doing. Not all of our customers see the whole picture yet, but more of them do every year, and it's my job to articulate it clearly and to keep educating the market until they do. It's a long shot. I know how it sounds. But it is absolutely worth pursuing, and pursuing it is exactly what we are doing.

So, back to where we started. A customer asks how Archdesk uses AI. Now that you know how I see the world, you can probably guess my answer. It isn't a feature, or a chatbot bolted on the side. It's a question of my own: what do you want it to be able to do? Build the model honestly, and almost anything you can ask of it, it can eventually do.

I built it this way on day one because, seeing the world the way I do, I couldn't work out how to build it any other way.

Michał Mojżesz, Founder & CEO, Archdesk
