AI Coding Tools vs. Reality: Lessons from Weeks of Testing Kiro, Claude Code, and More
Dr. Ryan Ries here. I just spent weeks testing every AI coding platform I could get my hands on.
I'm a bit disappointed, though honestly not surprised. Let's just say the marketing promises and the actual reality exist in completely different universes.
The Great Experiment
When it launched in beta, I got early access to Kiro, AWS's new coding platform, and decided to build something ambitious: an automated SOW generation system that would pull Zoom transcripts, create a RAG database of proposals, generate custom prompts for each section, and include a chat function for querying past work.
I would be the hero of our GTM team if I could pull this off!
The platform understood my requirements perfectly. The architectural design looked solid. The task breakdown seemed logical. I hit the button and watched it start writing code.
Unit tests appeared. A React frontend took shape. Google OAuth got integrated. Everything looked beautiful.
Then I tried to actually run it, and it all fell apart.
Nothing worked.
When The Wheels Fall Off
Here's what nobody tells you about these AI coding platforms: they can write code that passes unit tests but completely fails in practice.
I spent days fighting with Kiro to fix basic issues. It would create endless test scripts instead of fixing the actual problems. It would suggest the same broken solutions repeatedly. When I asked it to make OAuth work across all tabs, it built an absurdly complex polling system that crashed the entire application.
The context window kept filling up, forcing the system to summarize and lose track of what we were doing. I'd spend hours making progress, hit the context limit, and have to start over explaining everything to a fresh instance.
Where Coding Tools & Reality Intersect
Microsoft just laid off developers after announcing that over 30% of their code is now AI-generated. Fiverr's CEO sent a memo warning employees that AI is coming for everyone's job, not just developers.
But as you can see from my example, these tools are not capable enough to replace experienced developers. Do they make us more productive? Absolutely. But the gap between marketing claims and actual capability is enormous.
I tested multiple platforms. Full app creation tools like Lovable and Base44. Code IDEs like Cursor, Kiro, and Claude Code. Function creation through standard LLMs. Each has distinct strengths and serious limitations.
What Actually Works
The spec-driven approach shows real promise. Instead of just telling the AI to start coding, you work through requirements, let it design the architecture (which you review and modify), create a task list (which you refine), and only then let it write code. Kiro is set up nicely to follow this approach.
This works when you treat the AI as a junior developer who needs detailed direction, not a senior engineer who can figure things out independently.
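To make "detailed direction" concrete, here's roughly the level of specificity I mean. This is a minimal, hypothetical sketch of one tightly scoped task from a system like my SOW generator; the function name, parameters, and spec are mine for illustration, not actual output from Kiro or Claude Code:

```python
# Hypothetical task spec and implementation for one small piece of a
# SOW-style system. Names and parameters are illustrative only.

def chunk_transcript(transcript: str, max_words: int = 200) -> list[str]:
    """Split a Zoom transcript into chunks for a RAG database.

    Spec handed to the AI:
      - Split on blank lines (speaker turns) first; never split mid-sentence.
      - Pack consecutive turns into chunks of at most `max_words` words.
      - A single turn longer than `max_words` becomes its own chunk.
      - Return chunks in original order; skip empty ones.
    """
    turns = [t.strip() for t in transcript.split("\n\n") if t.strip()]
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for turn in turns:
        words = len(turn.split())
        # Flush the current chunk before this turn would blow the budget.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(turn)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

When the task is scoped this tightly, the spec does most of the work: the AI just has to color inside the lines, and you can verify the result in minutes instead of days.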
I will say, though, that Claude Code performs better than Kiro right now. But even the best tools require constant supervision. They over-engineer solutions. They get confused and spiral. They need you to maintain control.
The Real Value Proposition
These tools excel at rapid prototyping. If you're an experienced developer who understands available services and can give detailed instructions, you can build things much faster than before.
They're terrible at "set it and forget it" development. They're not ready to build entire systems unsupervised. They work best when creating specific functions under tight direction rather than constructing complete applications.
The three-phase model I'm seeing emerge makes sense:
Phase 1: Human with AI assistant (where we are now)
Phase 2: Human-agent teams (coming soon)
Phase 3: Human-led, agent-operated (we’ve still got a while on this)
My Recommendation
If you're exploring these tools, start with function-level work. Give extremely detailed instructions. Review everything the AI produces. Maintain strict version control because these systems don't handle Git well. Never let them have access to production credentials.
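And "review everything" means writing your own checks, not trusting the tests the AI generates to grade its own homework. A minimal sketch of what I'd write by hand against the hypothetical chunk_transcript function from earlier (the sow_prototype module name is made up):

```python
# Human-written sanity tests for the hypothetical chunk_transcript above.
from sow_prototype import chunk_transcript  # hypothetical module name


def test_respects_word_budget():
    transcript = "\n\n".join(
        f"Speaker: line {i} with a few words" for i in range(50)
    )
    for chunk in chunk_transcript(transcript, max_words=40):
        # Every turn here is short, so each packed chunk stays under budget.
        assert len(chunk.split()) <= 40


def test_preserves_order_and_content():
    transcript = "Alice: hello there\n\nBob: general kenobi"
    # Both turns fit in one chunk, joined in the original order.
    assert chunk_transcript(transcript, max_words=100) == [
        "Alice: hello there\n\nBob: general kenobi"
    ]
```

A few minutes of tests like these would have caught most of what Kiro got wrong long before I tried to run the full application.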
The tools are evolving fast. Six months from now, many of these problems might be solved. But today, treat AI coding assistants as productivity multipliers for skilled developers, not replacements for engineering knowledge.
The people who succeed with these tools will be those who combine a deep technical understanding with the ability to effectively direct AI. That's a very different skill set than either traditional development or what the marketing materials promise.
What's your experience been with AI coding tools? I'm curious if you're seeing similar gaps between promise and reality.
Until next time,
Ryan
Now, time for this week’s AI-generated image and the prompt I used to create it:
Create an image of me thinking that I am the hero of Mission's GTM team. You should see Mission's GTM team in the background and they're disappointed. I am wearing a cape and look like a superhero. However, I find out that my solution to our problem doesn't work, and I am actually not the hero I thought I was after all. Use the attached image of me as a reference photo.
