One Thing: Cursor, Claude, and the Mocking Struggle

As someone who spends a lot of time coding, I rely on Cursor as my daily IDE, with Claude 3.5 Sonnet as my go-to model for coding assistance. Its large context window and relatively strong coding benchmark scores make it incredibly useful for code generation, especially on complex or large-scale tasks.

That said, I’ve noticed a surprising gap in its abilities: it tends to struggle with tasks like writing or refactoring tests. Mocking, in particular, trips it up when I try to refactor existing tests into cleaner, more modular versions. While it’s great at generating full functions or files, breaking down or improving test cases often produces awkward or incomplete results.
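To make that concrete, here's a minimal sketch of the kind of refactor I mean, in Python with pytest and unittest.mock. The names here (PaymentClient, checkout, the payment_client fixture) are hypothetical illustrations, not from my actual codebase: repeated inline mock setup gets pulled into a shared fixture so each test states only its intent.

```python
# Hypothetical example: a client wrapper we never want to hit for real in tests.
from unittest.mock import MagicMock

import pytest


class PaymentClient:
    """Stand-in for a real HTTP client; tests should never call it directly."""

    def charge(self, amount: int) -> dict:
        raise RuntimeError("network call not allowed in tests")


def checkout(client: PaymentClient, amount: int) -> str:
    response = client.charge(amount)
    return "ok" if response.get("status") == "succeeded" else "failed"


# Before: every test repeats the same mock setup inline.
def test_checkout_success_verbose():
    client = MagicMock(spec=PaymentClient)
    client.charge.return_value = {"status": "succeeded"}
    assert checkout(client, 100) == "ok"


# After: the shared setup moves into a fixture, and each test shrinks to its intent.
@pytest.fixture
def payment_client():
    client = MagicMock(spec=PaymentClient)
    client.charge.return_value = {"status": "succeeded"}
    return client


def test_checkout_success(payment_client):
    assert checkout(payment_client, 100) == "ok"


def test_checkout_failure(payment_client):
    payment_client.charge.return_value = {"status": "declined"}
    assert checkout(payment_client, 100) == "failed"
```

Nothing exotic, but that consolidation step is exactly the kind of work I usually end up finishing by hand.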

This has been a good reminder that, while tools like Claude can supercharge certain workflows, they’re not perfect substitutes for human judgment—especially when it comes to ensuring high-quality, well-structured test code. For now, I’ve found it best to pair its strengths (code generation) with my own focus on refining tests and improving mocks.

It’s an ongoing learning process, but it’s also a fascinating window into the strengths and limitations of today’s AI models.