Pair programming with an AI agent

I moved away from programming because I usually get fed up with it after three days. That’s when frustration sets in, as I get stuck more and more. But I may have found my way out!

For the last three months I’ve had fun programming with AI in Cursor. It has improved significantly over this period, with multiple iterations of its agent models and interactions. Programming is now less “getting stuck” and more “building”.

I should add that I’m not a professional programmer in any shape or form. I program purely for pleasure.

What is Cursor?

Cursor is a fork of Visual Studio Code (VS Code) with AI integrations. Some capabilities, such as tab completion, run locally, while the larger AI integrations rely on the cloud.

Cursor analyzes your entire codebase, not just the file you are in. It can interact with the terminal, executing commands on its own. It also knows its way around coding-related tools such as databases, Docker containers, and Git.

The good

It’s amazing to see your thoughts come to life by programming with Cursor. Just watch your browser do live-reloading as agents set up entire functionalities. Interactivity, database, APIs, unit-tests, documentation, test data… The agent takes a holistic approach and creates functionality end-to-end.

It’s not just programming:

  • Debugging is much more efficient with an agent, as the chat interface forces you to clearly describe what’s wrong. The distance to the code is good in this case: it allows for reflection instead of ‘diving in directly’. Together with the agent, one can reason more broadly about the root cause.
  • The agent can also execute terminal commands, so I can do tricky tasks quickly in human language (Git rebases, database re-initialization, renaming functions/variables).
  • Understanding. I understand the codebase better because of the agent’s explanations. It listens to my suggestions and gives counter-arguments: “why don’t we do this instead?”

In the best cases, it feels like I’m programming with an overly friendly and articulate developer. Everything is “great!” and explained in long passages.

The bad

LLM agents love generating. MORE CODE! MORE CHAT! Every edit shows the lines removed and added, and the balance always tips towards more code: 230 lines removed, 1090 added. This becomes unmanageable in the long run.
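
To keep an eye on that growth, the same arithmetic can be run over a whole diff. Below is a minimal Python sketch, assuming the input is the text printed by `git diff --numstat` (one line per file: added count, removed count, path, separated by tabs). The function name `net_line_change` and the sample file names are made up for illustration; this is not something Cursor provides.

```python
def net_line_change(numstat: str) -> tuple[int, int, int]:
    """Sum added and removed lines from `git diff --numstat` output.

    Each line holds an added count, a removed count and a path,
    separated by tabs. Binary files show "-" instead of counts
    and are skipped here.
    """
    added = removed = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # blank line or binary file
        added += int(parts[0])
        removed += int(parts[1])
    return added, removed, added - removed


# Made-up sample whose totals match the edit described above.
sample = "700\t150\tsrc/app.ts\n390\t80\tsrc/api.ts\n-\t-\tassets/logo.png\n"
added, removed, net = net_line_change(sample)
print(f"+{added} -{removed} (net {net:+d})")  # +1090 -230 (net +860)
```

A net of +860 lines for a single request is a good moment to stop and check whether the change could have been smaller.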

The agent randomly generates absurdly complex solutions. It writes custom code for problems that are easily solved with a standard HTML element (button, dialog), or generates new code instead of reusing existing code. I’ve had plenty of cases where I could reduce the code size by a factor of 10 by rewriting it myself.

Agents aren’t neutral. They are eager to overdeliver. You ask for A, and they deliver A+B+Z. You have to pay close attention to ensure they don’t suddenly do things you didn’t ask for. These unexpected additions are buried in long chat responses!

There seems to be a “ghost in the machine.” One agent instance is “better” than another. Sometimes, you have to roll back large chunks of code when an agent gets stuck in its thought process and fails to find a solution. Reset and repeat. A new agent might solve the same problem in a blink. Be on your toes, and commit your code at the right moments.

Setting up a solid project is not something agents excel at. You need a good template to start with. Ultimately, LLM agents remain predictive algorithms based on input. If you don’t provide the right input, you won’t get a good prediction.

Conclusion: exit developers?

The generated code is good, but not great. Mindlessly clicking “generate more” doesn’t work; I have to put in the effort. The AI provides suggestions, but I have to stay in control.

But most importantly: software is only partially code. A large part consists of object models, business rules, and interactions. You need to have a clear understanding of these concepts before asking AI to take action.

It seems the experts are safe for now.

Beware the dangers of the easy life

Working with Cursor is fun and easy. But “fun and easy” comes with a trade-off. At first, I overestimated Cursor and spent too much time “chatting and generating.” Suddenly, hours had passed on a problem I could have solved myself in 30 minutes. Easy choices now, hard times later!