03 10 2024
After finishing my MS at the end of 2021 and starting at Stripe in 2022, a number of events in my personal life kept me from staying abreast of all the work happening in the ML world, specifically with Large Language Models (LLMs). While I had read about some recent developments and, of course, used ChatGPT, LLMs still felt like a novelty to me. I was heads-down at Stripe, working in a large enterprise codebase with custom frameworks and architecture. ChatGPT never felt like a useful part of that workflow, so I only used it for occasional trivial tasks.
Recently I've had time to sit down and really dig into LLM technology, and the deeper I go, the more I am blown away. It felt appropriate to organize my initial thoughts on the subject.
At a high level, my current thinking about LLMs can be summarized as follows:
I'm guilty of largely brushing off developments in LLM technology over the past couple of years. Personally, I see more negative externalities from this technology than positives, so like many developers, I thought that it was mostly hype because I wanted it to be mostly hype. However, the reality is that these models have advanced at an astonishing pace, and ignoring them is no longer an option.
My personal philosophy on life leans towards Voltaire's command that "we must cultivate our own garden". I firmly believe that the path to the good life is to find value in the ordinary, immediate world around you through useful effort and self-reliance. But I also acknowledge that the fabric of Voltaire's society wasn't imminently threatened by superhuman, stochastic generative software.
So, alas, the time to ignore AI has passed, whether I agree with the direction and pace of development or not. As for the implications, there is no consensus on what the impact will look like, either for specific jobs or for the broader economy. However, it seems safe to say that, at a minimum, there will be non-trivial effects in many knowledge-work professions. For better or worse, it has already become an arms race, and those who want to stay in their line of work will have to learn to increase their productivity with these new tools just to keep up. Whether that will be enough to stay employed, and what the net effect on job numbers will be, remains uncertain.
Right now, if you push LLMs to the limits of their capabilities, it becomes clear that they are basically ultra-efficient information-retrieval systems that are mysteriously capable of producing coherent, flowing output. Arguably the "killer app" of our generation is the last great information-retrieval system invented: Google Search. Prior to high-quality web search, the internet was mostly a weird research tool and not particularly helpful to the everyday person. I think LLMs clearly have the potential to cause another paradigm shift in our interactions with computers.
To my eyes, LLMs becoming the next "killer app" is inevitable, but there are still some speed bumps on the path to widespread adoption. Hallucinations in output, where the model generates false or nonsensical information, are the most obvious roadblock. Additionally, the user experience and integration of these models into hardware products is the next logical step. When an iPhone has integrated Whisper-quality speech transcription, GPT-4-level generative capability, and fluid prompting, LLM usage will be ubiquitous. I'll know we're there when my mother is routinely using this tech, which I'd guess will be fairly soon.
A few years ago, the majority of people seemed to agree that Artificial General Intelligence (AGI) was far enough into the future that there wasn't much debate about what the definition of AGI truly is. The very fact that this is now a point of contention is shocking and indicative of the progress made in this short window of time. So whether state-of-the-art LLMs like GPT-4 or Claude 3 independently qualify as AGI is beside the point to me. These systems demonstrate a version of "intelligence" that is already superior to human experts on our own evaluations of intelligence (i.e., standardized testing). They are already on the path towards being able to replace at least some clerical and white-collar work. I think this warrants more serious conversations about how and why the technology should be used, as well as pushing for government policy changes that safeguard against not just misuse but also the possible economic consequences of legitimate use.
I have been on the internet long enough to remember when the material on LessWrong read like the writing of overly-rational techno-futurist cultists, and AI alignment sounded like something more relevant to sci-fi writers than researchers. However, the rapid progress in LLM technology has legitimized and accelerated AI safety concerns, making them a pressing issue that requires action.
The fact that AI progress has gone much faster than anyone expected in the last decade, coupled with our lack of a true understanding of why our current models work so well, is what makes some of the existential doomsday scenarios feel suddenly more plausible. It doesn't seem likely that AI models could gain some notion of sentience, but we have no evidence that it's impossible. Plug a sentient multimodal model into a robot (as OpenAI is already trying to do) and AI-induced extinction sounds like a plausible outcome.
Even ignoring the future capabilities of AI, our current models already present too many opportunities for abuse. The most concerning to me is the creation of disinformation; video generation models like Sora scare me the most here. Imagine an improved version of Sora with fewer artifacts and better character permanence. What will the justice system look like when we can't trust video evidence anymore? When you could be on trial for a crime you never committed, but there is generated video "evidence" showing you did?
As these models become more prevalent and powerful, ensuring their safety and alignment with human values is table stakes for a stable society.
I believe the best path forward is to understand as much as we can about LLM technology, its capabilities, limitations, and potential implications. In that vein, I'm particularly excited about two directions of work:
Trying to understand the inner workings of neural networks, a field called "mechanistic interpretability"
For this I'm working through Neel Nanda's mechanistic interpretability material, first reviewing the fundamentals and upskilling in my weak areas, then doing some experiments using Neel's TransformerLens library.
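To give a flavor of what those experiments look like, here is a minimal sketch using TransformerLens's documented API (HookedTransformer.from_pretrained and run_with_cache) to load GPT-2 small and peek at a cached activation. The prompt and the layer index are arbitrary choices for illustration, not part of any specific experiment from the course material.

```python
from transformer_lens import HookedTransformer

# Load GPT-2 small with hooks attached to every intermediate activation.
model = HookedTransformer.from_pretrained("gpt2")

prompt = "The Eiffel Tower is located in the city of"

# Run the model and cache every intermediate activation for inspection.
logits, cache = model.run_with_cache(prompt)

# Look at the residual stream after block 5 (layer choice is arbitrary here).
resid_post_5 = cache["resid_post", 5]  # shape: [batch, seq_len, d_model]
print(resid_post_5.shape)

# Decode the model's most likely next token for the prompt.
next_token_id = logits[0, -1].argmax().item()
print(model.tokenizer.decode(next_token_id))
```

The appeal of this workflow is that every internal tensor is exposed by name, so "understanding the inner workings" becomes a matter of forming hypotheses about specific activations and then testing them directly.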
Aligning models to (good faith) human preferences, the most common method right now being reinforcement learning from human feedback (RLHF)
On this topic I'm starting by reading some papers. I think the most relevant papers here are:
Then I plan to reproduce the results from this last paper, but on a much smaller model. I'm not sure yet how I'll scale this down to be reasonable enough to do meaningful evaluation.
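To keep myself honest about what the reward-modeling step in RLHF actually optimizes, here is a minimal sketch of the standard pairwise preference loss (the Bradley-Terry-style objective used to train reward models on human comparison data). This is not code from any of the papers; the scalar rewards are made up for illustration, and in practice they would come from a reward model head scoring (prompt, completion) pairs.

```python
import torch
import torch.nn.functional as F

def preference_loss(chosen_rewards: torch.Tensor,
                    rejected_rewards: torch.Tensor) -> torch.Tensor:
    """Pairwise preference loss: push the reward of the human-preferred
    completion above the reward of the rejected completion."""
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy batch of 3 comparison pairs with made-up scalar rewards.
chosen = torch.tensor([1.2, 0.3, -0.5], requires_grad=True)
rejected = torch.tensor([0.7, 0.9, -1.0])

loss = preference_loss(chosen, rejected)
loss.backward()
print(f"loss = {loss.item():.4f}")
```

The full RLHF pipeline then uses the trained reward model as the optimization target for a policy-gradient step (e.g., PPO), which is where most of the scaling difficulty I mentioned above comes in.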
While the rapid progress in this field is both exciting and concerning, it is clear that the AI progress ship has sailed. Those who wish to stay relevant and competitive in their respective fields must embrace and adapt to this new reality.
Rather than resisting or ignoring these advances, we should join the efforts to shape and guide the development of LLMs in a responsible and beneficial manner. By actively engaging with this technology, learning about its inner workings, contributing to its safe and ethical development, and taking part in discussions about how it is used, we can try to steer adoption and development towards being a positive force that enhances and augments human capabilities rather than replacing or diminishing them. It is up to us to go forward from here with wisdom, caution, and a commitment to not causing more problems than we solve.
Thanks to Claude 3 for providing edits on drafts of this post. AI has already replaced the job of copy editing my writing, previously reserved for my friends.