When an executive sees an AI tool spin up a functional landing page in thirty seconds, a perfectly reasonable question follows: “Why should we pay an agency for a custom website when we can build one ourselves with AI?”
It's a fair question. But it confuses production with strategy. Here's what the actual research, not the marketing, says about where AI helps and where it doesn't.
AI Has Commoditised Basic Code. That Part Is True.
There's no point pretending otherwise. Writing boilerplate HTML, CSS, and standard JavaScript is no longer a premium skill. AI tools do it quickly and well enough for simple use cases.
If you need a static digital brochure for a side project with no integrations, no SEO history to protect, and no scaling ambitions, an AI site builder may be exactly what you need.
But if you're running a growing business, managing a merger or acquisition, or trying to build something that actually converts, the picture changes substantially once you look at the data.
Where the Research Shows AI Actually Falls Apart
McKinsey's research lab tested generative AI coding tools with more than 40 of its own developers across the United States and Asia. The results were genuinely impressive for simple, well-scoped tasks: documentation could be completed in roughly half the time, and new code written in nearly half the time.
But the same study found a sharp drop-off as task complexity increased.
“Time savings shrank to less than 10 percent on tasks that developers deemed high in complexity due to, for example, their lack of familiarity with a necessary programming framework.”
That finding comes directly from McKinsey & Company's own published research on developer productivity. The full report is listed under Sources.
It isn't an isolated finding. Stack Overflow's 2025 Developer Survey, the largest annual survey of working developers in the world, collected responses from more than 49,000 developers across 177 countries. The results show a widening gap between AI adoption and AI trust.
84% of developers now use or plan to use AI tools, up from 76% the year before. Yet only 29% say they trust the accuracy of what those tools produce, down sharply from previous years. The single most common complaint, cited by 66% of respondents, was AI solutions that are “almost right, but not quite.” The second most common, cited by 45%, was that debugging AI-generated code is more time-consuming than writing it properly the first time.
Full results are published directly by Stack Overflow. The 2025 Developer Survey AI findings are listed under Sources.
Two Peer-Reviewed Studies Confirm the Pattern
Industry surveys are useful, but peer-reviewed academic research carries more weight because it is independently reviewed before publication. Two studies stand out.
Study 1: Asleep at the Keyboard?
Researchers from New York University and the University of Calgary tested GitHub Copilot across 1,689 generated programs covering high-risk security scenarios. They found that approximately 40% of the code Copilot generated contained exploitable security vulnerabilities. The study was peer-reviewed and presented at the 2022 IEEE Symposium on Security and Privacy, one of the top venues in computer security research, and later republished in Communications of the ACM.
Read the full paper, “Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions” (Pearce, Ahmad, Tan, Dolan-Gavitt & Karri, IEEE S&P 2022).
Study 2: Do Users Write More Insecure Code with AI Assistants?
Researchers at Stanford University ran the first large-scale controlled user study on this question. Participants given access to an AI coding assistant wrote significantly less secure code than a control group working without one, and they were more likely to believe their insecure code was secure. The study was peer-reviewed and presented at ACM CCS 2023, one of the top academic conferences in computer and communications security.
Read the full paper, “Do Users Write More Insecure Code with AI Assistants?” (Perry, Srivastava, Kumar & Boneh, ACM CCS 2023).
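To make “insecure code” concrete, here is a minimal sketch of one classic vulnerability class of the kind this research examines: SQL injection. It is an illustration only, not code taken from either study. The table, the function names, and the payload are all hypothetical, and Python's built-in sqlite3 module stands in for a real database.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory database with two example users.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)]
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input is pasted straight into the SQL string,
    # so a crafted value can change the meaning of the query.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return [row[0] for row in conn.execute(query)]


def find_user_safe(conn, name):
    # Parameterised query: the driver treats the input purely as data.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


conn = make_db()
payload = "x' OR '1'='1"  # hypothetical malicious input

leaked = find_user_unsafe(conn, payload)  # matches every row in the table
blocked = find_user_safe(conn, payload)   # matches nothing
```

Both functions look plausible and both return the right answer for ordinary input, which is why this kind of flaw tends to survive a quick read and only surfaces under deliberate review or testing.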
The Tasks That Actually Drive Business Value Can't Be Automated
Even where AI generates functionally correct code, the work that determines whether a website performs as a business asset has nothing to do with syntax. It has to do with judgment.
Stakeholder alignment. When two companies merge, two brand identities and two technical environments have to become one coherent experience. AI cannot interview your leadership team or mediate a disagreement between what marketing wants and what your systems can support.
SEO migration. Mishandling URL structures and redirects during a merger or platform migration can erase years of organic search ranking in days. This requires a planned, tested migration strategy, not a generated page.
Custom integrations. Connecting a CRM to inventory systems or building checkout flows on a headless CMS requires understanding your specific business logic, not just plausible-looking code.
Conversion optimisation. Getting visitors to convert is a user psychology and brand communication problem, refined through testing and iteration, not a one-shot generation task.
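The SEO migration point is the easiest of these to show in code. As a rough sketch of what a “planned, tested” migration can involve, the check below audits a redirect map before launch for three common problems: old URLs with no redirect, redirect chains, and redirect loops. The function name, the URL paths, and the map itself are hypothetical, and a real migration audit would cover far more than this.

```python
def audit_redirects(old_urls, redirects):
    """Flag unmapped URLs, redirect chains, and redirect loops.

    old_urls:  paths that exist on the current site (hypothetical data)
    redirects: planned mapping of old path -> new path
    """
    problems = {"unmapped": [], "chains": [], "loops": []}
    for url in old_urls:
        if url not in redirects:
            # No redirect planned: the old page would simply disappear.
            problems["unmapped"].append(url)
            continue
        seen = [url]
        target = redirects[url]
        # Follow the redirect until it lands on a path that is not redirected.
        while target in redirects:
            if target in seen:
                problems["loops"].append(url)
                break
            seen.append(target)
            target = redirects[target]
        else:
            # Landed on a final page; more than one hop means a chain.
            if len(seen) > 1:
                problems["chains"].append(url)
    return problems


# Hypothetical example data for a site being migrated.
old_urls = ["/about-us", "/products/widgets", "/blog/launch", "/contact"]
redirects = {
    "/about-us": "/company",
    "/products/widgets": "/shop/widgets",
    "/shop/widgets": "/store/widgets",
    "/blog/launch": "/news/launch",
    "/news/launch": "/blog/launch",
}

report = audit_redirects(old_urls, redirects)
```

In this made-up example the report flags "/contact" as unmapped, "/products/widgets" as a two-hop chain, and "/blog/launch" as a loop. Deciding which old pages deserve a redirect at all, and where each should point, is still a judgment call that a script like this cannot make.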
The Bottom Line
AI can build a web page. The research is consistent on where it currently struggles: complex, high-stakes, judgment-heavy work, which is exactly the work that determines whether a website functions as a business asset rather than a digital brochure.
We use AI in our own workflow, the same way every credible agency does in 2026. The difference is that every output goes through human review before it reaches your business.
Sources
Karaci Deniz B, Gnanasambandam C, Harrysson M, Hussin A and Srivastava S (2023), Unleashing developer productivity with generative AI, McKinsey & Company, 27th June 2023, Available at: https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/unleashing-developer-productivity-with-generative-ai (accessed: 30th June 2026).
Stack Overflow (2025), 2025 Developer Survey: AI, Stack Overflow, 2025, Available at: https://survey.stackoverflow.co/2025/ai (accessed: 30th June 2026).
Pearce H, Ahmad B, Tan B, Dolan-Gavitt B and Karri R (2022), Asleep at the keyboard? Assessing the security of GitHub Copilot's code contributions, IEEE Symposium on Security and Privacy, 2022, Available at: https://arxiv.org/abs/2108.09293 (accessed: 30th June 2026).
Perry N, Srivastava M, Kumar D and Boneh D (2023), Do users write more insecure code with AI assistants?, ACM CCS 2023, 26th November 2023, Available at: https://doi.org/10.1145/3576915.3623157 (accessed: 30th June 2026).