From e8db015769c6b1e69abd11b5e9bd46ceb53804a2 Mon Sep 17 00:00:00 2001
From: Youwen Wu
Date: Tue, 25 Jun 2024 02:44:44 -0700
Subject: [PATCH] content: first draft of AI post

---
 src/posts/2024-06-25-ai-code.md | 120 ++++++++++++++++++++++++++++++++
 1 file changed, 120 insertions(+)
 create mode 100644 src/posts/2024-06-25-ai-code.md

diff --git a/src/posts/2024-06-25-ai-code.md b/src/posts/2024-06-25-ai-code.md
new file mode 100644
index 0000000..b142105
--- /dev/null
+++ b/src/posts/2024-06-25-ai-code.md
@@ -0,0 +1,120 @@
---
author: "Youwen Wu"
authorTwitter: "@youwen"
desc: "tldr: ai is terrible"
keywords: "ai, llm, chatgpt, copilot"
lang: "en"
title: "the case against AI generated code"
updated: "2024-06-25T12:00:00Z"
---

Many software developers believe that the coming "AI revolution" will end the
software industry and that ChatGPT will replace all software developers. While
I cannot predict the future, I am convinced that every developer who genuinely
believes the current state of generative AI can meaningfully supplant the work
of the average software developer has only worked on toy projects. See:
projects like [Devin](https://www.codemotion.com/magazine/ai-ml/is-devin-fake/)
(recently shown to be faked).

No, I am not worried about AI taking my job, nor am I trying to discredit it
in hopes of saving the industry. I'm not even surprised by generative AI. I
had been following GPT-2 since 2020 and tested OpenAI's first "Instruct"
models in their playground many months before ChatGPT was released. It's true
that these tools have made a considerable jump in the past 2-3 years, but such
is the nature of pretty much everything in tech. They did not just suddenly
become insanely good. They are the culmination of decades of research and
rudimentary models, and the result of us finally having the sheer compute
power necessary to train them. The technology has simply become usable at
last, and ChatGPT made it available to "normies" by making it accessible
without calling an API. There is no evidence to suggest that generative models
will continue to make jumps like the one from GPT-2 to GPT-3.

Of course, generative AI could prove to be a useful tool for developers. Many
developers already say it is. But I recently removed Copilot autocomplete from
my editor because I found it caused more trouble than it was worth, and I'd
like to reflect on why.

## the "LLM software engineer" grift

The issue with people's perception of AI is that LLMs are fairly capable of
doing a simple task acceptably. People see ChatGPT solve a Leetcode problem in
Python with the optimal solution and claim programmers have become obsolete.
True, as of now ChatGPT is most likely better at Leetcode than the majority of
programmers. The problem is conflating Leetcode with building real software (a
mistake also made by many software recruiters). Being able to complete many
simple, well-defined tasks does not translate to being able to construct real
production software. Real code requires knowing how to design things so they
don't suck and how to fit everything together, which can only be done by
something that genuinely understands what it's doing, and that currently does
not include LLMs. If this changes in the future, then perhaps developers
really are doomed. As of now, however, anyone who's tried to make ChatGPT
build upon its own code or iteratively develop larger projects from simple
instructions knows how utterly stupid and useless it is at that.

The key issue is that these abilities simply cannot scale. The amount of
"logical understanding" an LLM can simulate in a short conversation really is
impressive. But while a human engineer exhibits _actual_ logical
understanding, the LLM pretends.

You may say that there is no difference between "true" logic and "pretend"
logic if they achieve the same result. Sure. While I think there's still a
philosophical distinction to be made, practically speaking, "pretend logic" is
probably just as useful as "real logic". The problem is that LLMs are not good
enough at pretending. A real engineer can build part of their codebase, test
it out, debug it, and iterate on it. They (hopefully) retain an understanding
of the code they previously wrote and can improve, extend, and build upon it.
Meanwhile, tell an LLM to build a simple API and it might do so correctly, but
then tell it to build a frontend that interfaces with that API properly, and
chances are it will fail. You might be able to coax results out with some
prompt engineering, but tell it to extend the API and frontend into a
non-trivial app and it will completely break down, spitting out fake syntax
and subtly yet catastrophically wrong results.
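To make "subtly yet catastrophically wrong" concrete, here's a minimal
hand-written sketch of the failure mode I mean (in TypeScript; the shapes and
names are invented for illustration, not actual model output): two halves
generated in separate prompts, each plausible on its own, that disagree about
the data shape. Nothing crashes, and nothing works.

```typescript
// Hypothetical illustration, not real LLM output: the "API" half nests
// users under `data`, while the "frontend" half generated in a later
// prompt assumes a flat `users` field. No exception is thrown; the UI
// just silently renders nothing.

// What the "build me a simple API" prompt might produce:
function handleGetUsers() {
  return {
    data: { users: [{ id: 1, name: "Ada" }, { id: 2, name: "Grace" }] },
  };
}

// What the later "build a frontend for it" prompt might produce:
function renderUsers(response: any): string {
  const users = response.users ?? []; // undefined at runtime, so: []
  return users.map((u: { name: string }) => u.name).join(", ");
}

console.log(renderUsers(handleGetUsers())); // "" -- no error, just wrong
```

Each half passes a glance review on its own. The bug only exists _between_
them, and that cross-cutting consistency is exactly what a human engineer
maintains while iterating and what an LLM loses between prompts.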
There's a reason why Devin had to fake its results.

## but what about copilot?

Plugins like Copilot or Codeium simply provide advanced autocomplete
suggestions and help complete localized tasks. Didn't I just say that AI is
good at completing simple, well-defined tasks?

True, I definitely think tools like Copilot are much more useful than the
"ChatGPT full stack software engineer" pipe dream. But these tools present a
different set of issues. First of all, their usefulness is inversely
proportional to your actual programming skill and mastery of a language.
They're really good at suggesting obvious solutions and idiomatic syntax that
you might not know when first learning a language. Type the beginning of a
common code snippet or function call and they'll infer and fill in the rest.
But this isn't all that useful once you're actually familiar with your
language and your tools. At best, it provides a marginal speed increase.
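For example (a hypothetical completion, but typical of what these tools do
well): type the signature of a classic debounce helper and the textbook body
appears, because thousands of near-identical copies exist in the training
data.

```typescript
// Hypothetical example of the completions these tools excel at: after the
// signature below is typed, the suggested body is the textbook debounce
// implementation found all over the internet.
function debounce<T extends unknown[]>(
  fn: (...args: T) => void,
  delayMs: number
): (...args: T) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    clearTimeout(timer); // reset the countdown on every call
    timer = setTimeout(() => fn(...args), delayMs);
  };
}

// Genuinely helpful if you've never written one; a marginal time-save if
// you have, because you'd type it nearly as fast yourself.
const logResize = debounce(() => console.log("resized"), 250);
logResize();
```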
## the code just sucks

The biggest downside of these tools is that _they simply write terrible
code._ You've seen them write optimal Leetcode solutions, but as I said,
Leetcode solutions are both readily available and nothing like real software.
Think about it. The average software developer is TypeScript Tommy, who
dropped out of Udemy to pursue his dreams of becoming a React boilerplate
developer. This is the code that Copilot and every other LLM trains on. At the
end of the day, LLMs are extremely advanced probability machines, and the
majority of the code available on the internet for them to train on is just
_awful_. When you're writing code, do you want a mid-tier developer constantly
placing distracting suggestions in front of you? Not only does this break your
flow of thought as you come up with your own (likely better) code, but you
also have to review what basically amounts to amateur-level code sprayed
across your editor, which often contains subtle errors that lead you to spend
just as much time reading documentation or Googling as implementing it
yourself would have taken. This might work for other amateur devs, but
shouldn't the goal be to master your craft? If you're at a skill level where
you need to rely on LLM suggestions to be productive and they're often better
than your own code, you should probably avoid these tools and focus on
improving your own skills first.

## it's not all bad?

One concession I will make is that LLM code is great for rapid prototyping,
where you don't care at all about code quality and just need something that
holds up for one use and doesn't immediately error out. If you're good at
prompting, you should be able to get an LLM to produce shitty prototype code
much faster than a human can. I keep Copilot chat around in my editor for this
exact reason. This is pretty much only useful for toy projects, rudimentary
demos, or simple bash scripting, though. Again, bring LLMs to anything that
wouldn't fit in a 20-minute YouTube tutorial, and they completely fail.
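To put a size on "wouldn't fit in a 20-minute YouTube tutorial", this is
roughly the ceiling: a disposable, single-use script like the sketch below
(hand-written here as an illustration; the file name and fields are invented),
which an LLM will usually get right on the first prompt.

```typescript
// Tutorial-sized, single-use prototype code: the kind of thing an LLM
// reliably produces. File name and fields are invented for illustration.
import { readFileSync, writeFileSync } from "node:fs";

interface Entry {
  name: string;
  score: number;
}

// Read a JSON array, keep the top scorers, dump a CSV. No edge cases, no
// real error handling: exactly prototype-grade code.
const entries: Entry[] = JSON.parse(readFileSync("scores.json", "utf8"));

const top = entries
  .filter((e) => e.score >= 50)
  .sort((a, b) => b.score - a.score);

const csv = top.map((e) => `${e.name},${e.score}`).join("\n");
writeFileSync("top-scores.csv", `name,score\n${csv}\n`);
console.log(`wrote ${top.length} rows to top-scores.csv`);
```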