floating online,
touching grass,

Every important thought, insight, and question I have about AI


A collection of random thoughts, reflections, and insights about the current state and future of AI and LLMs.

  • AI has already massively augmented my personal flow; it has made me much more productive and much more efficient. Most interesting, though, is that I am now much more impatient about getting answers to my questions.

    • I’ve been conditioned to expect answers quickly now. What consequences does this new behavior, induced by LLMs/ChatGPT, have on other aspects of life?
    • Interestingly, when the ChatGPT UI lags or crashes, especially when I’m in the middle of cranking some work out and have explicitly gone to it to search for answers, I get extremely frustrated. I am already becoming reliant on it like a drug, and my productivity can take a massive hit when I’m not able to use it.
  • Not to state the obvious here, but it’s pretty clear to me at this point that this technology will massively shape the world in the next decade. This thread of discussion already comes up often (“AI is going to change the world etc. etc.”), but what is not talked about as much is how it will tangibly do that: by making everyone in the world significantly more productive.

    • We’ve seen this pattern before with defining periods and technologies (think the Industrial Revolution, the printing press, the personal computer), and the direct consequence of people becoming more productive has generally been an increase in quality of life.
    • What happens when the entire world becomes one or two orders of magnitude more productive? Think about the shift in quality of life from before to after the Industrial Revolution; we’re talking about a potential impact on the world of that scale, if not more! Massively accelerated scientific research, democratized access to knowledge, improved services, and so much more.
    • There are serious risks posed by AI, and it’s important to be mindful of them and to keep dedicating research and resources to them (especially in areas like mechanistic interpretability and alignment research), but I don’t think we should use those risks as reasons to limit progress.
  • People talk about the danger of X-Risk (existential risk in AI) much more often than they talk about the short-term impacts AI will have on the workforce and disruption to labor. Why?

    • Given how fast this technology is progressing, it’s clear that millions (if not more) of certain kinds of jobs (like knowledge work) will be automated (partially or completely), and this means a lot of people are going to lose their jobs. New jobs will be created, but this does not change the fact that the labor force will be massively impacted, starting now or in the near future at the latest.
    • I think the AI industry spends too much time aimlessly arguing about X-Risk (without concretely doing things about it) and not enough time thinking about the very short-term impacts this will have on people’s lives. We need to do more of the latter.
    • Part of the reason is naturally the magnitude of the danger of X-Risk, but I think another reason this narrative is pushed by AI industry leaders is that it is a somewhat “sexy” problem (h/t Kasey on the discussion), which spreads mindshare and incentivizes funding in the short-term to support the industry. Obviously, this is not the only (and probably not the main) reason, especially in certain circles (like academic ones). My most important takeaway is that we need to start thinking about and doing something about the short-term impacts of AI immediately, because millions of lives are going to be disrupted and there’s not enough tangible research and discussion on mitigating the negative consequences.
  • In the short-term, there is massive leverage unlocked by AI for personal usage

  • LLMs are very good at things like translation, pattern-matching, and busy work. In their current state, they may not be good at directly solving complex problems, but taking advantage of them in creative ways can unlock a whole new set of potential problems they can tackle.

    • What this means is that, besides the obvious problems that LLMs are good for (search, question answering, pattern matching), there is a whole suite of new problems that can be solved by reduction
      • That is, maybe it’s difficult for LLMs to directly do X. But maybe it’s easy for them to do Y, and you can build some symbolic machine on top of Y’s output that lets you do X.
      • One common pattern I’ve seen of this: you take some unstructured input, you feed that into an LLM to get some structured output, and you then use that output to solve your problem. For example, it may not be directly feasible to have an LLM generate a custom UI from a long instruction manual, but it’s probably feasible for it to generate some JSON object that you can then use to build the UI.
      • This is obviously a simple (and maybe silly) example, but I suspect there is a whole rich suite of applications that involve building symbolic systems or interpreters on top of the outputs of LLMs. In other words, as opposed to LLMs solving your entire problem, you can use them in different steps of your pipeline.
    • Side note, but as a concrete example from recent events that highlights LLMs’ strengths: we were working on building some graphs for a project a couple of weeks ago. The graphs were initially written in matplotlib, but we wanted to add interactivity to them, so we needed to port them over to a new library, plotly, which we had never used before. With an LLM’s help, we were able to port the graphs over and change them to what we wanted them to look like within 30 minutes. Trying to do so manually would probably have taken at least 3 days.
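
The reduction pattern above (hard task X, easier task Y, symbolic code on top of Y’s output) can be sketched roughly as follows. This is a minimal illustration, not a real integration: `fake_llm`, the form-spec shape, and the field types are all hypothetical, and the "model" is a stub that returns canned JSON so the example runs on its own.

```python
import json
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; returns a canned JSON string."""
    return json.dumps({
        "title": "Router setup",
        "fields": [
            {"label": "Network name", "type": "text"},
            {"label": "Enable guest Wi-Fi", "type": "checkbox"},
        ],
    })


# Field types the deterministic renderer knows how to draw (an assumption of this sketch).
ALLOWED_TYPES = {"text", "checkbox"}


def extract_form_spec(manual: str, llm: Callable[[str], str] = fake_llm) -> dict:
    """Step Y: ask the model for structured output, then validate it."""
    raw = llm("Return a JSON form spec for this manual:\n" + manual)
    spec = json.loads(raw)  # raises ValueError if the model returned malformed JSON
    for field in spec["fields"]:
        if field["type"] not in ALLOWED_TYPES:
            raise ValueError(f"unsupported field type: {field['type']}")
    return spec


def render_form(spec: dict) -> str:
    """Step X: ordinary, deterministic code turns the validated spec into a text UI."""
    lines = [f"== {spec['title']} =="]
    for field in spec["fields"]:
        widget = "[ ]" if field["type"] == "checkbox" else "[______]"
        lines.append(f"{field['label']}: {widget}")
    return "\n".join(lines)


ui = render_form(extract_form_spec("...long instruction manual text..."))
print(ui)
```

The point of the split is that the model only has to produce a small, checkable object; everything after the validation step is regular code that behaves the same way every time.
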
  • Many industries will be, and currently are being, eaten or disrupted by AI. In particular, notice how lots of pure software companies are trying to majorly integrate AI into their products (Replit treating AI as a first-class citizen, no-code apps, many B2B SaaS products integrating chatbots, etc.).

    • The common failure mode I’m seeing is that a lot of people are trying to build similar-style light wrappers for solving one of {search, question answering, generation…} for some pure software use case, as opposed to studying some unique problem in a (potentially outdated or unsexy) domain and figuring out how to properly integrate AI for that particular use case. The answer is not always throwing everything into a VectorDB or slapping an agent over some data.
    • The same thing happened with crypto in the last bull market cycle, where people were working backwards from the technologies to solve problems as opposed to working backwards from the problems to see how, and if, a technology could be used to help solve them
      • It’s easy to get excited by a new technology and then try to force it into some existing problem for the sake of doing so, instead of asking: is this actually necessary?
  • LLMs are new types of “natural-language” computers

    • We can take many of the existing concepts we have in the CPU and find analogies in LLMs. Instead of bits in registers, LLMs operate on tokens in context. Instead of short-term memory like RAM, we have prompts and contexts in LLMs. Instead of long-term disk storage, we have vector databases which LLMs can plug into to search, retrieve, etc. Effectively, LLMs are computers that allow us to (non-deterministically) execute language!
      • Note that I mean non-deterministic with respect to the intent of the prompt (i.e. not “same prompt, different output”, but “same intent, different prompts” can produce massively different results).
    • I can see a future where the software/firmware of every computer (in every form factor, even your thermostat) ships with a built-in LLM for its particular use case.
    • In particular, this new perspective on LLMs is exciting because for the past 50-60 years, our model and core architecture of computation has mostly remained unchanged, but Transformers are allowing us to create these differentiable computers that we can use for a new form of computation. And this means that we now have another primitive at our fingertips that (in addition to being directly used to solve problems) can build fundamentally new systems of computation.
      • For example, one can imagine some new type of computer that allows LLMs to operate at the level of the instruction cycle: every instruction can either be executed in terms of opcodes at the level of bits (i.e. how instructions are executed in computers now), or we can suddenly have these new LLM opcodes that execute more complex instructions.
  • In the short-term, LLMs are going to massively augment software development, and already are. Think of them as an additional toolkit or API we now have access to that offers a rich set of features and computation we can execute. Right now, using LLMs can feel somewhat foreign or novel relative to the rest of the application logic, like some service you’re separately calling, but in the future, every application is going to be an “LLM application”. The same way that we don’t say “I built a web server application” because building a web server is so deeply intertwined with the software process, soon every application will draw on LLMs as another primitive of computation, and this will be indiscernible from the rest of the software stack we rely on today.
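
To make the “LLM opcode” idea a bit more concrete, here is a toy sketch of an instruction loop where most opcodes are ordinary deterministic operations and one opcode hands its operand to a model. Everything here is hypothetical: the opcode names, the register layout, and `stub_llm`, which just returns the first word of its input so the example is runnable without a real model.

```python
from typing import Any, Callable


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: returns the first word of the prompt."""
    return prompt.split()[0]


def run(program: list[tuple], llm: Callable[[str], str] = stub_llm) -> dict[str, Any]:
    """Fetch-decode-execute loop over a tiny made-up instruction set."""
    regs: dict[str, Any] = {}
    pc = 0  # program counter
    while pc < len(program):
        op, *args = program[pc]
        if op == "LOAD":    # LOAD dst, value: put a literal into a register
            regs[args[0]] = args[1]
        elif op == "ADD":   # ADD dst, a, b: classic deterministic opcode
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "LLM":   # LLM dst, src: delegate a "complex instruction" to the model
            regs[args[0]] = llm(str(regs[args[1]]))
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return regs


program = [
    ("LOAD", "a", 2),
    ("LOAD", "b", 3),
    ("ADD", "c", "a", "b"),
    ("LOAD", "text", "thermostat reports high temperature"),
    ("LLM", "label", "text"),
]
state = run(program)
print(state["c"], state["label"])
```

In a real system the `LLM` branch would be the only non-deterministic step, which is what makes it interesting to treat it as just another instruction rather than a separate service.
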

    • When we write code today for example, we don’t think “oh I am using this special fancy primitive called a for loop”, we think “I need to iterate over some elements, I’ll use a loop.” The same will be true for LLMs, for example “oh I can use this fancy primitive to create some JSON from this text” will become “I need to extract structured data from unstructured text, I’ll use an LLM.”
    • This is a very natural extension (not replacement) of software development. Many people think that LLMs will make code obsolete when in fact, LLMs will be used to augment and add more complex logic in our code that we could not easily add before.
    • One unanticipated benefit of these new systems that will emerge is that the complexity is encapsulated and contained to specific parts of the application.
  • It’s very easy to build toys or cool demos with LLMs; it’s very hard to build resilient, reliable systems.

  • We are currently in a massive echo chamber of AI application ideas. We need to be more ambitious, more imaginative, more daring.

    • Most people are using AI (and especially LLMs) to do the same thing. This is dangerous because it’s easy to lock and restrict yourself to a small, finite set of use cases for how you can take advantage of these new primitives (if I had a penny for every time I’ve seen a langchain-style chatbot or question-answering agent on some data that nobody uses…)
    • There is a whole rich surface area of problems that we can use LLMs for. Why are we hung up on the same types of applications? Is building another chatbot for the sake of building a chatbot the best we can do?
  • Speaking of chatbots, the novelty of the chatbot form factor that is being forced into every application has quickly worn off. Aside from the obvious use case of conversational agents that you can use to get an answer quickly, this does not feel like the right form factor for the next generation of computing.

  • Most AI startups today will probably be obsolete in 1-5 years

    • Any startup that is some light wrapper on top of an LLM provider has no technical moat, and so can easily be replaced by LLM providers
    • Distribution becomes very important if you plug into some LLM provider: because you don’t own the core tech, your only (or main) moat might be your users
      • But this is still vulnerable to disruption because most LLM providers will probably have more distribution than you, so this is not a very defensible position to be in as a startup. As one example, think of all the GPT-4 apps that got disrupted by plugins, even if they did initially have distribution.
  • The biggest winners of AI/LLMs will be users and large infrastructure providers. Everyone else in the middle will still benefit, but less so.

  • LLMs and autoregressive models are probably not the path to AGI, but that doesn’t mean they will not massively augment human intelligence

  • AI increases the surface area of software’s impact on the world

  • Compute is the new oil

  • Open source models and on-device inference will continue to get better exponentially, but they may not catch up with the massive models. Scaling laws have mostly continued to hold, but it’s still unclear if the pattern will continue.

  • In order to sustain regular improvements in models, hardware will need to get exponentially better, because the cost of training these models continues to grow exponentially while the gains from each additional unit of spend may continue to diminish (making it uneconomical to keep scaling)

  • I suspect that on-device inference will be huge

    • Open source models are continuing to get better very quickly. What happens when you are no longer bottlenecked on an API call but can quickly, privately, and securely run LLM calls natively on your own hardware?
    • These open source models will probably not be as good as the largest models for complex tasks, but for many constrained problems, it will be faster and cheaper to run local LLMs
  • AI makes cryptography more important. In fact, the two are deeply intertwined.

    • The Internet would not have been possible without cryptography (TLS, SSL, hashing, encryption for communication etc.)
    • The next “internet”-scale technologies with AI will need cryptography to guarantee things like
      • computational integrity (i.e. I have a large model, I don’t open source the weights, how can you as a user guarantee I ran the model correctly as intended)
      • distinguishing generative from human content
      • protecting and securing data
    • In the same way that many of the cryptographic primitives that power the Internet have slowly disappeared from a product perspective (as an end-user, I don’t need to think about why or how SSL works, I just know https is secure etc.), these problems in AI point to cryptographic primitives that will, by necessity, be integrated into the AI products we use but be mostly removed from the end-user experience
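
As a very rough sketch of what the simplest versions of these guarantees might look like, the snippet below uses only Python’s standard `hashlib` and `hmac` modules. The function names are hypothetical, and the scope is deliberately narrow: a hash commitment only shows that a given weights file matches what a provider published (it does not prove the model was actually run with those weights, which would need something much heavier, such as verifiable computation), and an HMAC tag only lets someone holding the shared key check that a piece of text came from the provider (a real provenance scheme would more likely use public-key signatures or watermarking).

```python
import hashlib
import hmac


def commit_to_weights(weights: bytes) -> str:
    """Provider side: publish a SHA-256 digest of the (private) weights."""
    return hashlib.sha256(weights).hexdigest()


def verify_weights(weights: bytes, published_commitment: str) -> bool:
    """Auditor side: check that disclosed weights match the published digest."""
    return hmac.compare_digest(commit_to_weights(weights), published_commitment)


def tag_output(secret_key: bytes, text: str) -> str:
    """Provider side: attach a keyed tag to generated text."""
    return hmac.new(secret_key, text.encode("utf-8"), hashlib.sha256).hexdigest()


def is_model_output(secret_key: bytes, text: str, tag: str) -> bool:
    """Verifier side (must hold the same key): check the tag against the text."""
    return hmac.compare_digest(tag_output(secret_key, text), tag)


# Toy usage with made-up bytes standing in for real weights and keys.
weights = b"\x00\x01 pretend these are model weights"
commitment = commit_to_weights(weights)
key = b"shared-secret-for-illustration-only"
tag = tag_output(key, "This paragraph was generated.")

print(verify_weights(weights, commitment))
print(is_model_output(key, "This paragraph was edited by a human.", tag))
```

Neither primitive is visible to an end-user here, which is the same property the SSL comparison above is pointing at.
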
  • Cycles of product iteration in AI are much faster, which means you can gain users more quickly, but users will also churn more quickly. In other words, don’t be deceived by AI companies that just started and are already doing millions in revenue; this may not be sustainable, and many of these users will churn.

    • The rule of thumb is that how sticky your product is (how lindy its usage is) is proportional to the time it takes to acquire those users.
  • Similar to crypto in the last 2 years, we now have a lot of capital financing companies in a bubble-esque environment (exhibit a), which is interesting in particular because of the macro-state of things given inflation, rates, and the state of the markets in general.

  • When will AI disappear into our lives?

    • From The Computer of the 21st Century, “the most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it.”

If you’ve made it this far, I would love to hear your thoughts. Please reach out on Twitter!