content: first draft of AI post
parent 88d69e0002
commit e8db015769
1 changed file with 120 additions and 0 deletions
src/posts/2024-06-25-ai-code.md (new file)

@@ -0,0 +1,120 @@
---
author: "Youwen Wu"
authorTwitter: "@youwen"
desc: "tldr: ai is terrible"
keywords: "ai, llm, chatgpt, copilot"
lang: "en"
title: "the case against AI generated code"
updated: "2024-05-25T12:00:00Z"
---

Many software developers believe that the coming "AI revolution" will end the
software industry and that ChatGPT will replace all software developers. While
I cannot predict the future, I am convinced that every developer who genuinely
believes the current state of generative AI can meaningfully supplant the work
of the average software developer has only worked on toy projects. See:
projects like [Devin](https://www.codemotion.com/magazine/ai-ml/is-devin-fake/)
(recently shown to be faked).

No, I am not worried about AI taking my job, and I am not trying to discredit
it in hopes of saving the industry. I'm not even surprised by generative AI. I
had been following GPT-2 since 2020 and tested the first "Instruct" models by
OpenAI in their playground many months before ChatGPT was released. It's true
that these tools have made a considerable jump in progress in the past 2-3
years, but such is the nature of pretty much everything in tech. They did not
just suddenly become insanely good. They are the culmination of decades of
research into rudimentary models and the result of us finally having the sheer
compute power necessary to train them. It's just that the technology has
finally become usable, and ChatGPT made it available to "normies" by making it
accessible without calling an API. There is no evidence to suggest that
generative models will continue to make jumps like the one from GPT-2 to
GPT-3.

Of course, generative AI could prove to be a useful tool for developers. Many
developers already say it is. But I recently removed Copilot autocomplete from
my editor because I found it caused more trouble than it was worth, and I'd
like to reflect on why.

## the "LLM software engineer" grift

The issue with people's perception of AI is that LLMs are fairly good at doing
simple tasks acceptably. People see ChatGPT solve a Leetcode problem in Python
with the optimal solution and claim programmers have become obsolete. True,
ChatGPT is most likely better than the majority of programmers at Leetcode as
of now. The problem is conflating Leetcode with building real software (a
mistake also made by many software recruiters). Being able to complete
multiple simple, well-defined tasks does not translate to being able to
construct real production software. Real code requires knowing how to design
things so they don't suck and how to put everything together, which can only
be done by something that actually, logically understands what it's doing -
which currently does not include LLMs. If this changes in the future, then
perhaps developers really are doomed. As of now, however, anyone who has tried
to make ChatGPT build upon its own code or iteratively develop larger projects
from simple instructions knows how utterly stupid and useless it is. The key
issue is that these abilities simply do not scale. It really is quite
impressive how much "logical understanding" an LLM can simulate in a short
conversation. But while a human engineer exhibits _actual_ logical
understanding, the LLM pretends.
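
To make the distinction concrete, this is the kind of well-defined,
self-contained task an LLM nails - the textbook-optimal solution to a classic
Leetcode problem (two-sum, shown here as an illustration), which appears in
thousands of near-identical copies online:

```python
def two_sum(nums: list[int], target: int) -> list[int]:
    """Return indices of the two numbers that sum to target."""
    # The textbook O(n) solution: remember each value's index and
    # look up the complement as we scan.
    seen: dict[int, int] = {}
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
    return []
```

One function, fully specified input and output. Producing this says nothing
about being able to design and evolve a real system.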

You may say that there is no difference between "true" logic and "pretend"
logic if they achieve the same result. Sure. While I think there's still a
philosophical distinction to be made, practically speaking, "pretend logic" is
probably just as useful as "real logic". The problem is that LLMs are not good
enough at pretending. A real engineer can build part of their codebase, test
it out, debug it, and iterate. They (hopefully) retain an understanding of the
code they wrote before and can improve, extend, and build upon it. Meanwhile,
tell an LLM to build a simple API and it might do so correctly, but then tell
it to build a frontend that interfaces with that API properly, and chances are
it will completely fail. You might be able to coax results out with some
prompt engineering, but tell it to extend the API and frontend into a
non-trivial app and it will completely break down, spitting out fake syntax
and subtly yet catastrophically wrong results.
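
To illustrate what "fake syntax" looks like, here's a contrived sketch of the
failure mode - not output from any specific model, and the URL is a
placeholder:

```python
# Plausible-looking generated code that calls an API which doesn't exist.
import requests

resp = requests.get("https://api.example.com/items")
items = resp.json_list()  # hallucinated: Response has .json(), not .json_list()
for item in items:
    print(item["name"])
```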

There's a reason why Devin had to fake its results.

## but what about copilot?

Plugins like Copilot or Codeium simply provide advanced autocomplete
suggestions and help complete localized tasks. Didn't I say that AI is good at
completing simple and well-defined tasks?

True, I definitely think tools like Copilot are much more useful than the
"ChatGPT full stack software engineer" pipe dream. But these tools present a
different set of issues. First of all, their usefulness is inversely
proportional to your actual programming skill and mastery of a language.
They're really good at suggesting obvious solutions and idiomatic syntax that
you might not know when first learning a language. Type the beginning of a
common code snippet or function call and it'll fill in and infer the rest. But
this isn't all that useful if you're actually familiar with your language and
your tools. At best, it provides a marginal speed increase.
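
The pattern looks something like this - a made-up example of the kind of
completion I mean, not a transcript of an actual suggestion:

```python
# You type the signature...
def celsius_to_fahrenheit(celsius: float) -> float:
    # ...and the autocomplete offers the obvious idiomatic body:
    return celsius * 9 / 5 + 32
```

Useful if you'd otherwise have to look up the formula or the idiom; a marginal
time-save if you would have typed it without thinking anyway.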

## the code just sucks

The biggest downside of these tools is _they simply write terrible code._
You've seen them write optimal Leetcode solutions, but as I said, Leetcode
solutions are both readily available and nothing like real software. Think
about it. The average software developer is TypeScript Tommy, who dropped out
of Udemy to pursue his dreams of becoming a React boilerplate developer. This
is the code that Copilot and every other LLM trains on. At the end of the day,
LLMs are simply extremely advanced probability machines, and the majority of
the code available for them to train on is just _awful_. When you're writing
code, do you want a mid-tier developer constantly placing distracting
suggestions in front of you? Not only does this break your flow of thought as
you come up with your own (likely better) code, but you also have to review
what basically amounts to amateur-level code sprayed across your editor, code
that often contains subtle errors which lead you to spend just as much time
reading documentation or Googling as implementing it yourself would have
taken. This might work for other amateur devs, but shouldn't the goal be to
master your craft? If you're at a skill level where you need to rely on LLM
suggestions to be productive and they're often better than your own code, you
should probably avoid these tools in the first place and focus on improving
your own skills.
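
As an illustration of the kind of subtle error I mean - a classic Python
pitfall that shows up constantly in training data, not a quote of any actual
suggestion:

```python
# Looks fine, passes a quick manual test, and is quietly broken.
def add_tag(tag: str, tags: list[str] = []) -> list[str]:
    # Bug: the default list is created once, at function definition,
    # and shared across calls, so tags leak between unrelated calls.
    tags.append(tag)
    return tags

add_tag("draft")   # ["draft"]
add_tag("urgent")  # ["draft", "urgent"] - not the fresh list you expected
```

Spotting this in a suggestion takes longer than writing the correct version
yourself.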

## it's not all bad?

One concession I will make is that LLM code is great for rapid prototyping,
where you don't care at all about code quality and just need something that
holds up for one use and doesn't immediately error out. If you're good at
prompting, you should be able to get an LLM to create shitty prototype code
much faster than a human can. I keep Copilot chat around in my editor for this
exact reason. This is pretty much only useful for toy projects, rudimentary
demos, or simple bash scripting, though. Again, bring LLMs to anything that
wouldn't fit in a 20-minute YouTube tutorial, and they completely fail.
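
For concreteness, this is the tier of throwaway code I mean - a hypothetical
one-shot script where "runs once without erroring" is the entire quality bar:

```python
# Quick-and-dirty prototype: print the 20 most common words in a file.
# No error handling, no edge cases - run once, read the output, delete.
import sys
from collections import Counter

words = Counter(open(sys.argv[1]).read().lower().split())
for word, count in words.most_common(20):
    print(f"{count:5d}  {word}")
```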