    Your AI Pair Programmer Is Lying to You (And That’s Okay)

    The best engineers I know aren’t afraid of AI tools. They’re afraid of engineers who trust them blindly.

    In the past two years, AI coding assistants have gone from novelty to standard equipment. GitHub Copilot, Cursor, Claude, ChatGPT — if you’re building software in 2026 and not using at least one of these, you’re probably moving slower than your peers. But here’s the uncomfortable truth that nobody talks about in the breathless productivity posts: AI pair programmers hallucinate, confidently, about the exact things that will hurt you most.

    I’ve shipped production systems with AI assistance and I’ve watched teams burn entire sprints chasing phantom bugs that AI-generated code introduced. The difference between these outcomes isn’t whether you use AI — it’s how you use it. Let me show you what I’ve learned.

    The Confidence Problem

    Here’s what makes AI coding tools uniquely dangerous compared to, say, copying from Stack Overflow: they’re always confident. Stack Overflow answers have votes, comments, and dissenting opinions. An AI assistant just… tells you something. Authoritatively. In your preferred programming language. Formatted perfectly. And sometimes it’s completely wrong.

    I’ve seen Copilot suggest deprecated API calls with no indication they’re deprecated. I’ve seen Claude generate database queries that are semantically correct but will trigger a full table scan on millions of rows. I’ve seen GPT-4 write authentication middleware that looks right, compiles, passes basic tests — and has a subtle JWT validation bug that only matters when tokens are near expiration.
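
    To make that last example concrete, here’s a minimal sketch of the bug class (illustrative, not the actual incident code; the caching detail and the names are mine): expiration is verified correctly the first time a token is seen, then the claims are cached by raw token and never re-checked.

        import jwt  # PyJWT

        # Hypothetical middleware helper. It compiles, and every test that uses
        # a freshly minted token passes.
        _claims_cache: dict[str, dict] = {}

        def authenticate(token: str, secret: str) -> dict:
            if token in _claims_cache:
                # Bug: a cache hit skips jwt.decode, so exp is never re-checked.
                # A token validated at 14:59 still authenticates at 15:01.
                return _claims_cache[token]
            claims = jwt.decode(token, secret, algorithms=["HS256"])  # verifies exp
            _claims_cache[token] = claims
            return claims

    The fix is one line (re-check exp on cache hits, or drop the cache), but nothing in the code’s shape tells you it’s needed.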

    None of these tools know they’re wrong. They don’t have a confidence meter you can check. The output looks identical whether they’re absolutely right or subtly broken.

    This is different from human pair programming. When a human colleague suggests something they’re unsure about, there are signals — hesitation, a qualifier like “I think this works,” or they’ll explicitly say “let me check the docs.” AI gives you the same tone whether it’s reciting something it learned from a million examples or confabulating something it half-remembers.

    Where AI Actually Shines (And Where It Doesn’t)

    Once you internalize the confidence problem, you can start routing tasks appropriately. Here’s my rough mental model after two years of heavy AI-assisted development:

    AI is excellent for:

    • Boilerplate and scaffolding. Generate a FastAPI router, a React component structure, a Terraform module skeleton. The risk is low because you’ll customize it anyway, and the time savings are real (see the sketch after this list).
    • Syntax you don’t write every day. Regex patterns, date formatting, shell one-liners, SQL window functions. Not because AI is infallible here, but because you’ll verify these anyway and the starting point saves time.
    • Explaining unfamiliar code. “What does this Python decorator do?” is a killer use case. AI is surprisingly good at reading and explaining code, often better than writing it.
    • Test case generation. Give it your function signature and ask for edge cases. It’ll think of things you won’t. Then you verify whether those cases are actually interesting.
    • First drafts of documentation. Painful to write, easy to review. AI does the painful part.
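
    To make the first bullet concrete, here’s the kind of scaffold I mean: a FastAPI router skeleton (a sketch; Widget and the routes are placeholders, not a recommended design).

        from fastapi import APIRouter, HTTPException
        from pydantic import BaseModel

        router = APIRouter(prefix="/widgets", tags=["widgets"])

        class Widget(BaseModel):
            id: int
            name: str

        @router.get("/{widget_id}", response_model=Widget)
        async def get_widget(widget_id: int) -> Widget:
            # TODO: replace with a real lookup against your data store
            raise HTTPException(status_code=404, detail="widget not found")

        @router.post("/", response_model=Widget, status_code=201)
        async def create_widget(widget: Widget) -> Widget:
            # TODO: persist before returning
            return widget

    Nothing here is clever, which is exactly why delegating it is safe: any mistake is shallow, and you’ll touch every line anyway.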

    AI is dangerous for:

    • Security-sensitive code. Authentication, authorization, encryption, input validation. This is where hallucinated confidence kills you. The code looks right. The tests pass. The CVE gets filed six months later.
    • Anything involving your specific infrastructure. AI doesn’t know your database schema, your service topology, your rate limits, your SLAs. It’ll write something plausible for a generic version of your problem, not your actual problem.
    • Performance-critical paths. AI optimizes for the correctness and readability of the average code it was trained on. It won’t know that your hot path processes 50k events/second, or that one particular list comprehension is going to kill you (see the sketch after this list).
    • Library versions and compatibility. AI training data has a cutoff. It doesn’t know that the API you’re calling changed in v3.2 or that the package you’re using has a known bug in the version your team pinned to.
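
    For the performance bullet, a sketch of the pattern (hypothetical pipeline; the names are illustrative). Both versions are correct; only one survives a hot path.

        def count_valid_naive(events: list[dict]) -> int:
            # Allocates two throwaway lists per batch: invisible in a script,
            # painful at 50k events/second.
            payloads = [e["payload"] for e in events]
            valid = [p for p in payloads if p is not None]
            return len(valid)

        def count_valid_streaming(events: list[dict]) -> int:
            # Single pass, no intermediate allocations.
            return sum(1 for e in events if e["payload"] is not None)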

    The Senior Engineer Advantage

    Here’s something counterintuitive: AI tools have widened the gap between senior and junior engineers, not closed it.

    The common narrative is that AI democratizes expertise — a junior developer with Copilot can now write code that used to require years of experience. There’s some truth to this for the writing part. But software engineering isn’t mostly about writing code. It’s about knowing what to build, what to avoid, how to test it, how it’ll fail, and how to debug it when it does.

    A senior engineer using AI tools can:

    • Instantly recognize when generated code has a design smell, even if it compiles fine
    • Ask better, more specific prompts that constrain the solution space toward correct answers
    • Know exactly which parts of the output to scrutinize and which are safe to trust
    • Catch the subtle performance, security, or correctness issues in AI output
    • Use AI to explore the solution space 10x faster and then apply judgment to pick the right one

    A junior engineer using AI tools will produce more code faster. But without the judgment layer, they’ll also introduce more subtle bugs faster, make more architecturally wrong decisions faster, and take longer to debug the resulting mess.

    AI is a force multiplier. It multiplies whatever judgment you already have. That’s good news if you have good judgment, and concerning news if you’re still building it.

    Practical Techniques That Actually Work

    After years of iteration, here are the specific practices I use to get the most out of AI coding tools without getting burned:

    Prompt for constraints, not just solutions. Instead of “write a function to authenticate users,” try “write a function to authenticate users. It should: use bcrypt for password verification, return a typed result object (not throw exceptions), handle the case where the user doesn’t exist without leaking that information, and be testable without a database connection.” Specificity dramatically improves output quality.
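
    Here’s roughly what those constraints steer the model toward (a sketch under the constraints above; User, AuthResult, and the injected find_user are illustrative names, not a real library):

        from dataclasses import dataclass
        from typing import Callable, Optional

        import bcrypt

        @dataclass(frozen=True)
        class User:
            username: str
            password_hash: bytes

        @dataclass(frozen=True)
        class AuthResult:
            ok: bool
            user: Optional[User] = None

        # Dummy hash so unknown users still cost one bcrypt check (no timing leak).
        _DUMMY_HASH = bcrypt.hashpw(b"placeholder", bcrypt.gensalt())

        def authenticate_user(username: str, password: str,
                              find_user: Callable[[str], Optional[User]]) -> AuthResult:
            user = find_user(username)  # injected lookup: testable without a database
            stored = user.password_hash if user else _DUMMY_HASH
            valid = bcrypt.checkpw(password.encode(), stored)
            if user is None or not valid:
                return AuthResult(ok=False)  # identical result either way: no leak
            return AuthResult(ok=True, user=user)

    Every constraint is visible in the output: bcrypt for verification, a result object instead of exceptions, indistinguishable unknown-user and wrong-password paths, and a lookup you can stub with a plain function.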

    Ask AI to review AI’s output. This sounds silly but it works. Generate code with one prompt, then paste it back and ask: “What are the potential failure modes, edge cases, or security issues in this code?” AI is often better at critiquing than generating, and this surfaces issues the generation pass missed.

    Test-first, generate-second. Write your tests manually — they encode your actual requirements, your real edge cases, your actual constraints. Then use AI to generate the implementation. The tests act as a specification that AI can’t ignore, and they’ll immediately catch hallucinated behavior.
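
    A sketch of what that looks like (pytest; slugify and its module are my own toy example): the cases encode the real requirements, and the implementation is whatever the model produces that turns them green.

        import pytest

        from slugify_impl import slugify  # hypothetical AI-generated module under test

        @pytest.mark.parametrize("raw, expected", [
            ("Hello World", "hello-world"),
            ("  padded  ", "padded"),          # requirement: trim whitespace
            ("Crème Brûlée", "creme-brulee"),  # requirement: fold accents to ASCII
            ("", ""),                          # edge case: empty input
        ])
        def test_slugify(raw: str, expected: str) -> None:
            assert slugify(raw) == expected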

    Keep a “hallucination log.” When AI confidently gives you something wrong, write it down. After a few weeks, you’ll see patterns in what your tools get wrong consistently. Use these to build a mental filter for where to apply extra scrutiny.

    Use AI for exploration, humans for decisions. AI is great for generating five different ways to approach a problem. Humans (you) should decide which one is right for your context. Never let AI make architectural decisions — let it generate options and you evaluate them.

    The Bigger Picture: What This Changes About Engineering

    We’re in an awkward transitional period where AI tools are genuinely powerful but not yet trustworthy enough to deploy unsupervised. The engineers who’ll thrive are those who treat AI like a brilliant but inexperienced colleague: someone whose output you value and want to review, not rubber-stamp.

    The skills that are becoming more valuable: code review ability, systems thinking, debugging intuition, security awareness, performance instinct. These are the things AI is worst at, and the things for which humans are hardest to replace.

    The skills that are becoming less valuable in isolation: knowing syntax, writing boilerplate, translating known algorithms into code. These are the things AI is best at.

    If you’re a senior engineer worried about AI: don’t be. Double down on judgment, systems thinking, and the ability to evaluate AI output critically. These become more valuable, not less.

    If you’re a junior engineer: embrace AI tools, but don’t let them replace the learning you need to build judgment. Use AI to learn — ask it to explain what it generated, read the output critically, understand the trade-offs. Don’t let it be a shortcut past building the mental models you’ll need to supervise it.

    Conclusion: Trust but Verify (Actually, Just Verify)

    The old advice was “trust but verify.” With AI coding assistants, I’d drop the first part: just verify. Not because the tools are bad — they’re remarkable — but because they’re not trustworthy in the specific sense of knowing when they’re wrong. They’ll never say “I’m not sure about this one.” That means you have to be sure.

    The engineers getting the most out of AI in 2026 aren’t the ones who trust it most. They’re the ones who’ve built a clear mental model of where AI helps and where it hurts, who verify accordingly, and who are using the time savings to do the deeper, harder work that AI still can’t touch.

    Your AI pair programmer is lying to you sometimes. The good news: it’s also saving you hours every week. Learning to tell which is which — that’s the job now.
