Sam Stromberg (.me)

Vibe coding for half-technical roles

I recently joked to a friend that I was doing my best to avoid the AI-Take-Industrial Complex, but it was only partially true -- I've had the germ of this post in the hopper for a while, and, like all good blogging, putting it here saves you from having to converse with me about it. Conducting a job search in tech in 2025 seemingly requires at minimum salting AI-related keywords into everything you submit/produce, but I feel the need to demonstrate my bona fides in the domain as well.

I feel fortunate that essentially everyone in my immediate circle is more "hater" than "booster", so, if anything, I'm worried this will come across as not harsh enough. Rest assured that, regarding AI on the macro level (a future post, promise), I'm concerned about an investment bubble, impact on the electrical grid (driving up bills), ruining the educational system, and the like; for the purposes of today, I'm going to try to set that aside and look at the qualitative experience of using LLM-assisted coding agents, that is to say, evaluating these offerings on their own terms and tabling consideration of externalities.

Similarly, although the "talking to a computer" UI is an extremely useful paradigm for AI companies -- building on a lifetime of exposure to science fiction but also the relatively good coverage of dialogue/Q&A interactions in the training data -- that's not the use case I find most interesting. Because LLMs don't carry an underlying world model, it's not useful to try to talk about things that can be true/false ("will your car dealership honor my 2-for-1 coupon?" or "can you give me some relevant legal citations?"). Fortunately, computer programming, while relying on true and false in a symbolic-logic way, isn't as dependent on retrieving reliable information about the world,[1] so I'm going to discuss agentic coding, trying to give it a chance to lead with its strengths.

Can't be a real hater until you've tried it

We can begin with the maybe-obvious(?): I didn't put together this website, from scratch, myself -- I used free-trial periods with both Cursor and GitHub Copilot to kick-start this along with a couple other partially-complete projects.

But... by any reasonable definition, I didn't write any previous iterations of this site all by myself, either. There's always been a library of examples in Your First Website! Learn HTML for Anyone, No Seriously Anyone, A Baby Could Do This, or there was a layout I inspected and "borrowed" from someone else's site, or there was a lot of tabbing back-and-forth between Stack Overflow, public repos, and terrible AWS documentation throughout.

So, as far as the use-case goes, I would still put us on a continuum with Clippy[2]:

You: *starts typing*
Computer: I see you're trying to do [thing], perhaps I can help?
You: Yeah, I guess so.
Computer: *breaks all the formatting*

I didn't hate Clippy then, so the general idea of Claude-Code-in-a-Cursor-sidebar isn't DOA, and I'm going to continue to call my agent Clippy throughout the post.

Pretty sure I'm the target audience for this

While some of the most enthusiastic endorsements of agentic coding have come from working software engineers (SWEs, a non-acronym but sure), it's my impression that the intended use case is for half-technical people, who want to do things that have already been done a million times (and therefore are well-represented in the model's training data).

Getting a bit off-topic, but it seems like the long-term play is to destroy the market power of senior SWEs, who are generally well-compensated and less subject to the boom-and-bust cycle of other tech-company roles, so the strategy surely isn't to allow the benefits of the new tool to accrue to them exclusively.[3] Instead, I would expect the goal to be de-skilling, lowering the barrier to entry so that programming becomes a small share of a bunch of jobs. And that's exactly where I sit, a Product Manager who has completed some lEaRn2c0dE crash courses, and has some feel for Computer, although I am not and do not desire to be a SWE.

In particular, I have some comprehension of a couple languages and use cases but am plagued with writing stuff that doesn't compile, usually for dumb reasons like syntax. This is, of course, a pain for me to debug, because I'm not conversant enough to know the rules, and I also generally jump to "there must be something really wrong here, I need to understand this better" when actually it's trivial to fix. Except that there are more problems further down, so after applying the fix the program still doesn't compile, and it's not immediately clear that it worked. A tool a little bit smarter than an ordinary IDE extension -- which offers suggestions that seem more for people who know what they're doing -- could be right up my alley.

Clippy's trying really hard to be helpful

Indeed, the good news for me is that "getting the thing to compile" is one of agentic AI's strengths -- if you give it access to your terminal (which... probably not good for security), it'll try to run the script before and after making changes, and if it crashes, it will read the error log and try again (e.g., by using a larger context window to read the codebase, which is more expensive). In building this site, just about all the bugs that prevented a successful build were inflicted by me. For example, Clippy and I had collaborated to write a bunch of code to allow draft blog posts (so that I can draft them in the project folder without automatically publishing them)... a capability that was already built into the current version of Eleventy. Several years ago, when it wasn't, a couple users wrote the functionality themselves and blogged about it, so when prompted "how can I make drafts," Clippy surfaced the outdated info and attempted to graft it on, causing duplication, naming collisions, and so on.
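For the record, the modern version is a few lines of config. Something like this, going from my memory of the Eleventy 3 docs -- so treat the exact API as an assumption and check the docs before trusting me:

```js
// eleventy.config.js -- sketch of a drafts setup, assuming Eleventy 3's
// preprocessor API (verify against the current docs)
export default function (eleventyConfig) {
  // Skip any file with `draft: true` in its front matter, but only during
  // a production build -- drafts still render locally with --serve
  eleventyConfig.addPreprocessor("drafts", "njk,md,liquid", (data, content) => {
    if (data.draft && process.env.ELEVENTY_RUN_MODE === "build") {
      return false; // false = ignore this file entirely
    }
  });
}
```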

I was also surprised to discover that the agent will occasionally get stuck in a loop, making an ineffective fix, reverting it to try an alternative that's also ineffective, reverting that... only to try the first thing again, etc. You can manually terminate the loop and revise what you're asking for, which generally worked when I encountered it, but an equivalent of the "reaching an identical position 3x is a draw" rule in chess would seem easy enough to implement (though perhaps the incentive isn't there: many users are on metered subscriptions, so the AI provider doesn't mind burning down your balance).
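The guard really is small. Here's a sketch of what I mean, with hypothetical names throughout -- this is not any vendor's actual API:

```js
import { createHash } from "node:crypto";

// Hypothetical repetition guard: hash each proposed patch, and if the same
// patch shows up a third time, stop and hand control back to the human.
const attempts = new Map();

function isThreefoldRepetition(proposedPatch) {
  const key = createHash("sha256").update(proposedPatch).digest("hex");
  const count = (attempts.get(key) ?? 0) + 1;
  attempts.set(key, count);
  return count >= 3; // chess rules: the same position three times is a draw
}
```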

There's a bit of an "if all you have is a hammer" issue here, too -- LLMs are good at generating text, so agentic AI is good at writing code, so its tendency is to add and add and add, a Sorcerer's Apprentice situation. In a separate project (building a game engine using Svelte, details to come), the agent suggested a lot of helper functions over the course of development, but several of them ended up written in multiple locations across the codebase -- it seems like the agent runs keyword searches on the directory, then reviews 100-line chunks of code to try to get "context"... so when it would assume a different keyword for the same function, the existing implementation would be missed and it'd just write it again. This part is intrinsic to the LLM architecture, which produces results probabilistically and so doesn't allow for a "canonical" way to reproduce the same keyword every time.
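To be concrete about my mental model here -- a guess from watching the tool work, not its documented internals -- the retrieval step behaves roughly like this toy version:

```js
import { readFileSync, readdirSync } from "node:fs";
import { join } from "node:path";

// Toy keyword retrieval -- my assumption about what the agent roughly does,
// not its real pipeline. Scan files, return ~100-line chunks that mention
// the keyword.
function findSnippets(dir, keyword, chunkLines = 100) {
  const hits = [];
  for (const name of readdirSync(dir)) {
    if (!name.endsWith(".js")) continue; // non-recursive, for brevity
    const lines = readFileSync(join(dir, name), "utf8").split("\n");
    for (let i = 0; i < lines.length; i += chunkLines) {
      const chunk = lines.slice(i, i + chunkLines);
      if (chunk.some((line) => line.includes(keyword))) {
        hits.push({ file: name, startLine: i, text: chunk.join("\n") });
      }
    }
  }
  return hits;
}

// The failure mode: a search for "formatDate" comes back empty because the
// existing helper is named "prettyDate" -- so the agent writes a second one.
```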

Editing and refactoring are less-hammerable problems -- and I found that I was more comfortable taking the lead than telling the machine to just start deleting things. Because Clippy doesn't have an overall picture of the project (just whatever 100-line context snippets it retrieves for a given response), it'll leave junk floating around that looks plausibly-useful, remove duplicative bits from where they belong while leaving them elsewhere, and so on. Some of the recommendations I've read about agentic coding use a lot of ancillary documentation (e.g., "use [agent 1] to write a PRD, then use [agent 2] to architect, then [agent 3] to write"); I did ask Clippy to write documentation and a feature roadmap, which was very... verbose, but like any documentation quickly became out-of-date and required human work to update (asking the agent to make the updates just led to a bunch of new text tacked on to the end, no actual editing).

And, of course, the notorious sycophancy of commercial LLMs was present (even after I put in a custom instruction to try to prevent it). Every "hey, quit trying to do that, it clearly isn't working" was met with a "Wow! You're so right! I'm just a dumb robot compared to you, thank you for helping me with that excellent suggestion!" which is grating at best and should perhaps call into question any trust I was placing in the agent to do things correctly.

It's not necessarily less work, but it does feel lazier

So, overall, Clippy wrote me a lot of code, most of which compiled successfully the first time around. A proportion of it was unnecessary or duplicative, and so rather than hunting for misplaced punctuation or skimming search results for the pattern I'm trying to implement, I spent more time watching the "thinking" spinner and editing -- harmonizing variable names, consolidating some of the many new files generated, and deleting stuff.

Having edited my high school newspaper way-back-when, I enjoy editing; in the context of software, I definitely prefer it to the blundering-forward-until-I-hit-an-obstacle feeling of writing it myself. Is it faster / less work? That's hard to say: it still took me several days and iterations to get through the functionality I wanted, and I would get fatigued after a few hours (from cognitive load) and need to walk away for a bit. On like a macro timeline, I feel like this site came together as quickly as my S3 flat-file site or Ghost instance, and with some added functionality -- despite my knowing essentially nothing about how to implement responsive design, the site seems to handle mobile screen widths (and lower-res alternates for images) reasonably well (some of this is to the credit of Eleventy rather than an agent, but I don't know that I would have figured it out). And the site is supposedly screen-reader friendly, uses large-enough targets for navigation on mobile, etc., if we can take Clippy's word for it.

So, my estimate is that it's roughly the same amount of work, but replaces some of the most frustrating parts (for me) with things I enjoy more, and helped surface some of my unknown unknowns. Editing hundreds of lines of text that unspool faster than a person can type, well, feels lazier, in a way that appeals to me. So my productivity is not 10x'd, or even 2x'd, but my odds of following through on "I should do this side project" were significantly better, which isn't nothing.

Destruction of economic value

I can't give vibe coding a full endorsement, however: I wasn't paying for the compute I used. After the Cursor trial ended, I looked up my token utilization, and it ran into the hundreds of dollars ://// To an extent, this is because the pricing model charges more for output than input, and as a less-technical user, I was putting in less input and demanding more output than a power user -- no doubt, an actual SWE could better control those costs by being more judicious in when to use the agent, avoiding dead ends and infinite loops, but as suggested above (and by Cory Doctorow if you aren't taking my word for it) the intent is to reduce the need for skill(ed labor), so this does constitute a problem. In that light, I can't make the case that these are "worthwhile" products, or that I've been converted to a paying customer. Frankly, for the nontrivial cost we're talking about here, I could have reached the same end result by asking a friend to pair-program for a couple hours, or using some (deterministic / non-LLM) specialized tools for my specific goals, or even, like, paying a gig worker with domain experience to help.
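To make the input/output asymmetry concrete, here's the back-of-the-envelope version -- the rates are placeholders I made up to be directionally right, not anyone's actual price sheet:

```js
// Placeholder per-million-token rates; real prices vary by model and
// change constantly -- these are only in the right ballpark.
const inputPerM = 3;   // $ per million input tokens
const outputPerM = 15; // $ per million output tokens

// A vibe-coding pattern: short prompts in, whole files of generated code out.
const inputTokens = 2_000_000;
const outputTokens = 10_000_000;

const cost = (inputTokens / 1e6) * inputPerM + (outputTokens / 1e6) * outputPerM;
console.log(`~$${cost}`); // $6 of input + $150 of output = $156 -- output dominates
```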

Considering that Cursor et al are widely believed to be burning cash at egregious rates (part of why they've kept raising new rounds of money and subscription prices), and that actual model-builders like Anthropic aren't doing any better, I would be worried that the sticker price for what I did doesn't even reflect what it actually cost everyone up the chain to provide. Agentic AI can't be these orgs' loss-leader; if anything, it's the one use case that's actually creating value compared to "being a chattier way to ask Internet a question" or "interjecting bullet points of dubious veracity on top of your regular search results." So if this isn't earning its keep, it's hard to see what will.

Notes


  1. There is a problem in programming analogous to the fake legal citations issue: effectively zero files in a modern computer are self-contained; everything uses libraries and functions written and stored elsewhere. But these libraries (or packages, I don't honestly know the difference if any) have names / referents, so AI coding agents can easily end up pointing to something that doesn't exist or is actively malicious. ↩︎

  2. I was just old enough to have used the version of Microsoft Office that debuted their smart assistants, each complete with a bunch of animations related to various triggered behaviors. I remember Clippy, a small Einstein (decades before Salesforce went there), and a red rubber ball(?). Wikipedia tells me, rudely, that the paperclip was actually officially named Clippit... which is either a great example of the Mandela Effect or a brilliant prank being played on us all 28 years later. He(?)'ll always be Clippy to me.

    Anyway, teenage me preferred Clippy to the other assistants, even though I was aware (probably from reading Newsweek) that he was considered extremely corny, because his range of animations, like unfolding and forming shapes as well as making surprised-eyebrows, was more expressive and appealing (Einstein had a little lightbulb or something, but didn't display much emotional range). ↩︎

  3. I don't actually want to dig into this at the level of like pulling up citations... but if we think about how much 'management' (i.e., the largest VC firms, representing big institutional money) is allocating to enterprise AI companies, and about the "we want to replace people's jobs with this" pitch, there's one really obvious target. ↩︎