Read Hackers and Painters again for GenAI


Paul Graham's book Hackers and Painters is a classic read for technologists and has not faded after two decades. In this new GenAI era, let's revisit a few chapters and draw some inspiration.

Book excerpts

This section excerpts key ideas from the following chapters:

  • The Other Road Ahead -- Web-based software offers the biggest opportunity since the arrival of the microcomputer.
  • Beating the Averages -- For web-based applications you can use whatever language you want. So can your competitors.

Chapter: "The Other Road Ahead"

When software moves off the desktop and onto the server, the key changes are:

  • Encapsulation -- Normal users who just want to write an email do not have to worry about the operating system, drivers, or patches.
  • Convergence -- In the web age, all you need is a browser, which can be made by various providers.
  • Abstraction -- The idea of "your computer" is replaced by "your data". You can get to your data from any computer.
  • Tolerance -- In the old days, leaving your PDA in a taxi was as disastrous as a disk crash. In the new web-based era, you do not worry about backups.
  • Accessibility -- You do not need to install anything. You can use software as soon as you buy it. Upgrades happen in the backend.
  • Concurrency -- Web-based software allows multiple users to edit the same document at the same time. Paul argues this is a natural software design choice rather than a demand driven by users.
  • Security -- Since there is no complex desktop environment, there is much less room for viruses to play around in the constrained browser.
  • Control -- Everything is on the server, so developers have more control. They can tailor the environment and devote more resources to building valuable features for users.
  • Agnostic -- Desktop software is usually developed in lower-level languages. Developers can choose any language for web-based software.
  • Fluidity -- Web applications can be released immediately, instead of following a coarse-grained release cycle.
  • Bugs -- A large chunk of the text is devoted to the advantages of web applications in dealing with bugs. One key observation is that (back in the day) bugs in web apps were often corner cases found by advanced users, who were more forgiving, or even happy, to be part of the building process.
  • Support -- In the old world, support was a way to comfort customers rather than solve problems. In the web-based world, support is more immediate, and developers are happy to hear what users say.
  • Ideation -- Paul argues that ideas lead to more ideas, similar to writing, where half the words only reveal themselves after you finish the first half. Implementing ideas quickly means more ideas will come.
  • Manpower -- As Fred Brooks pointed out in The Mythical Man-Month, adding people to a project tends to slow it down.
  • Trace -- Web developers can study users' click traces to understand their experience and iterate directly in the backend.
  • Marketing -- Users can get the actual experience via a "test drive", which is itself the marketing.
  • Subscription -- Developers can openly charge subscription fees, instead of forcing people to install/upgrade.
  • Gatekeeping -- When access becomes easier for users, it also becomes easier for the bad guys. For example, every employee might be able to access credit card information (as was the case in Viaweb's time).
  • Democracy -- "Personal computer" once sounded the way "personal satellite" sounds today. The key reason the desktop took over from the mainframe was that more people and small companies could afford to write software, proliferating the ecosystem. After decades of software moving off the server (mainframe) to the desktop, software is now moving back to the server, enabling even more developers to build software.

Chapter: "Beating the Averages"

The chapter Beating the Averages mainly argues the advantages of Lisp. The advantages of being declarative, using macros to capture repetitive patterns, and staying high level are real, but hardly comprehensible to script-native generations (say, people who started full-stack web development with JS). Also note that IDEs in Paul's era lagged behind, so programmers had to balance editing time against thinking time. The population of Lisp authors was significantly smaller than that of other languages, making them statistically superior in brain capacity. When choosing the medium of communication with computers, they preferred "more thinking and less speaking".

Today's IDEs are so powerful, documentation is so rich, search engines are so efficient, and everything is further uplevelled by AI. It is hard to assert that thinking is superior to speaking, because speaking is a way of thinking, as LLMs demonstrate. A programmer who was mediocre by the past generation's standards can now deliver more and test out ideas rapidly. In the end, history does not bother to remember the failed ideas; only the successful ones last in the textbooks.

The whole point of the web and Lisp is: ship faster, break faster, succeed faster.

That is still true in the AI world, with agents and prompts (English).

Mapping the concepts

Now let's revisit those key traits of the web articulated in Paul's essay: Encapsulation, Convergence, Abstraction, Tolerance, Accessibility, Concurrency, Security, Control, Agnostic, Fluidity, Bugs, Support, Manpower, Ideation, Trace, Marketing, Subscription, Gatekeeping, Democracy.

Encapsulation/ Convergence/ Agnostic/ Fluidity/ Control

In the new GenAI world, user interfaces will converge to chat.

Actually, today's models are already very capable of building complex and polished UIs. That means users do not have to learn the elaborate designs crafted by software companies; they can simply chat and spin up a nice UI tailor-made for them. Software may be released as an Agent, via MCP, or through Computer Use protocols. After a while, people will realize: why use models to build a personalized UI for every single use case? Why not simply chat and drive the system to perform the action directly?

The client side converges to a chat interface, and all the business logic moves to the server side. Going forward, software developers are agnostic about which LLM they use; they can always choose the cheapest and most capable offering on the market, similar to how they choose cloud providers today. They also have greater control of the execution environment, making things more efficient, e.g. caching.

Software release is more fluid, because the UI always stays the same (chat). Releasing new functionality is simply a matter of adding capabilities (say, tools) in the backend. Only the users who enter that particular conversational path will notice a change.
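
To make the pattern concrete, here is a minimal sketch of a chat backend where shipping a feature is just registering another tool, and the LLM provider sits behind a swappable function. The names (`call_llm`, `track_order`) are hypothetical placeholders, not any particular vendor's API.

```python
# A minimal sketch (not the author's implementation) of the "chat front end,
# tools in the backend" pattern. The LLM call is abstracted behind a plain
# function so the provider can be swapped like a cloud vendor.
from typing import Callable, Dict

TOOLS: Dict[str, Callable[[str], str]] = {}

def tool(name: str):
    """Register a backend capability; shipping a feature is just adding one."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return register

@tool("track_order")
def track_order(order_id: str) -> str:
    return f"Order {order_id} is on the way."   # placeholder business logic

def call_llm(prompt: str) -> str:
    """Swap in any provider here (cheapest or most capable); the chat UI never changes."""
    raise NotImplementedError("plug in your chosen LLM provider")

def handle_chat(user_message: str) -> str:
    # Ask the model which registered tool (if any) should serve this message.
    decision = call_llm(
        f"Available tools: {list(TOOLS)}. User said: {user_message}. "
        "Reply with a tool name and argument, or 'none'."
    )
    # ... parse `decision`, dispatch to TOOLS[name], and phrase the reply via the LLM.
    return decision
```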

Marketing a new feature is no longer done by email. It is just a "hi, would you like to...".

Tolerance/ Accessibility/ Concurrency

From the users' point of view, they see more tolerance, more accessibility, and a new type of concurrency.

Users do not worry about losing data or migrating data between providers. Think of the note-taking software of the past generation: it was very hard to migrate from one to another, and a lot of data collected that way became a liability rather than an asset.

In the GenAI world, you only need to remember the prompts. In the ideal world, all providers would answer factual questions in substantially the same way. Imagine that you only need a few megabytes to record thousands of questions (queries) in a specific domain in order to "stay an expert". When you need to look something up, you do not dig into a note-taking app or a search engine; all you need to do is decompress the knowledge given the query.
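
As a toy illustration of "remember the prompts, regenerate the knowledge", here is a tiny prompt vault that stores only the queries and replays them against whatever model is available; `ask_model` is a hypothetical stand-in for your provider of choice.

```python
# A toy sketch (my own, not from the essay): store domain queries, not answers,
# and decompress the knowledge on demand by re-asking a model.
import json
from pathlib import Path

VAULT = Path("prompt_vault.json")  # a few megabytes can hold thousands of queries

def save_query(topic: str, query: str) -> None:
    vault = json.loads(VAULT.read_text()) if VAULT.exists() else {}
    vault.setdefault(topic, []).append(query)
    VAULT.write_text(json.dumps(vault, indent=2))

def recall(topic: str, ask_model) -> list[str]:
    """Re-ask every stored query; the answers are regenerated, never stored."""
    vault = json.loads(VAULT.read_text())
    return [ask_model(q) for q in vault.get(topic, [])]
```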

This means a very high tolerance for losing what you possess. Do you still worry as much about losing things compared to ten years ago? Disaster used to be losing a 3.5-inch floppy disk, then a suddenly corrupted HDD, then losing the username/password to a webmail provider, then note-taking software or cloud storage becoming too jammed to be of any meaningful use. Now things are different: you only need to remember a few prompts, and that is like Ali Baba's passphrase to the treasure cave.

The accessibility side is more apparent. As long as you own a chat interface, you can interact with any service provider. The output from various service providers can be nicely organized in one place, in a coherent manner, for future access -- poe.com is one such prototype. This can go further and become a browser, an operating system, or a piece of hardware. We will brainstorm more in the next note.

There will be a new type of concurrency. The cloud era enabled simultaneous editing, but conflict resolution has always been a challenge: there has to be a locking mechanism. Now imagine that everyone contributes to an AI-maintained knowledge pool instead of a document (as an artifact); we get a new way of resolving conflicts. AI can simply summarize the concurrent inputs and resolve the editorial conflicts automatically. If there is conflicting information, AI can put up a debate and let humans decide later. In other words, people can focus more on the ultimate objective of communication, instead of spending time working around the medium of communication.
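
A rough sketch of what such AI-mediated concurrency could look like, assuming a hypothetical `summarize` wrapper around an LLM: contributions are merged without locks, and genuine contradictions are parked for humans.

```python
# A rough sketch (my own, not from the essay) of AI-mediated concurrency:
# instead of locking a shared document, concurrent contributions are merged by
# a model, and irreconcilable contradictions are set aside for a human to decide.
from dataclasses import dataclass, field

@dataclass
class KnowledgePool:
    facts: list[str] = field(default_factory=list)
    disputes: list[str] = field(default_factory=list)  # left for humans to settle

    def merge(self, contributions: list[str], summarize) -> None:
        # No lock: everyone writes concurrently, the model reconciles afterwards.
        prompt = (
            "Merge these notes into consistent statements. "
            "Prefix irreconcilable contradictions with 'DISPUTE:'.\n"
            + "\n".join(self.facts + contributions)
        )
        for line in summarize(prompt).splitlines():
            (self.disputes if line.startswith("DISPUTE:") else self.facts).append(line)
```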

Ideation/ Trace/ Manpower

It is apparent that idea-to-market speed is improved by a large margin.

This time, we observe not only user behaviour (like clicks in the past) but also user thoughts (the chats). A chat contains much richer information than mere instructions; it is also a trace of what the user is thinking. Upon fixing a problem, we can simply "resume the conversation" and see how the user reacts.
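
A small sketch of chat-as-trace, with hypothetical names: the conversation is stored as data, and after a fix the developer replays it from the turn that previously failed.

```python
# A small sketch (hypothetical, not a product API) of chat-as-trace.
from dataclasses import dataclass

@dataclass
class Turn:
    role: str   # "user" or "assistant"
    text: str

def resume_conversation(trace: list[Turn], failed_at: int, agent) -> str:
    """Re-run the agent on the conversation up to the previously failing turn."""
    history = trace[:failed_at]
    return agent(history)   # the user's own words double as the bug report
```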

I really like the idea that ideas generate more ideas. Once you can get something done fast, you can get more things done faster. Users are no longer passively waiting for future releases. The man-month will become even more mythical.

From subscription to tokens

Subscription is a "wholesale" business. The future will be token-based pricing, where charges can be very granular.

Imagine that an agent offers a solution to a user and estimates the token cost plus the premium paid to the tools/providers. The user can decide on the fly whether to proceed.
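
Here is what that could look like in a few lines; every price below is made up purely for illustration.

```python
# A minimal sketch (illustrative only; all prices are invented) of granular,
# token-based charging: estimate the cost of one task, show it in the chat,
# and let the user approve before any tokens are spent.
PRICE_PER_1K_TOKENS = 0.002   # hypothetical provider rate, USD
TOOL_PREMIUM = 0.01           # hypothetical per-call premium paid to a tool vendor

def estimate_cost(expected_tokens: int, tool_calls: int) -> float:
    return expected_tokens / 1000 * PRICE_PER_1K_TOKENS + tool_calls * TOOL_PREMIUM

def propose(plan: str, expected_tokens: int, tool_calls: int, confirm) -> bool:
    cost = estimate_cost(expected_tokens, tool_calls)
    # `confirm` is whatever asks the user in the chat, e.g. "Proceed for $0.03?"
    return confirm(f"{plan} -- estimated cost ${cost:.2f}. Proceed?")
```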

Users do not need to buy a "wholesale" bundle of features derived from the aggregated demand of all the other users, nor buy a package of the product manager's self-esteem. On the other hand, developers do not need to go through a complex product design lifecycle to evaluate ROI before pulling off an idea. Things become straightforward: the features requested by more users via chat are definitely the next ones to develop. Developers can focus on the most popular branches of the chat tree to elaborate their software offerings. This is the true beauty: put your money (tokens) where your mouth is.

One talks, one pays and one uses.

When prepaid tokens become mainstream, the prospect of future token valuation will stimulate a financial market. This is when crypto comes into play. "Every individual can go IPO" would become a reality.

Security/ Gatekeeping

It is always a challenge to balance two types of preferences and failures:

  • Type 1: I prefer to keep my data extremely private, and I can endure long latency, complex processes, and maintenance costs to keep it that way. In the worst case, no one can access the data, including myself -- say, when the hard disk holding the single copy of a file is damaged.
  • Type 2: I prefer my data to have multiple replicas and to be easy to access from various places at the time I need it. In the worst case, everyone can access my personal data, but being able to access it is still more important than keeping it private.

An example of Type 1 is encrypting all of one's data on a private hard disk or CD and writing the key down on paper. I used to do this, and I ended up losing my keys after a decade. In another case, I once "invented" a naive encryption algorithm myself and stored plenty of files on the family computer. Two decades later, I could not find a suitable environment to run the Visual Basic program.

An example of Type 2 is to "cloudify everything". Credentials are everywhere: GitHub, 1Password, Google Drive, Dropbox, email, WeChat, ... These are not only website passwords but also identity documents. Once you get access to one of them, you get access to more. There are dozens of people in the world who are able to completely ruin one's digital life.

Abstraction

The paradigm shift:

  • Your computer
  • Your data
  • (Your knowledge)
  • Your identity

In the ultimate world, we do not have to own anything; even data is not owned by us. An LLM is an effective world compressor. There is no such boundary as "your data", because everyone's data is in a giant mixture. As long as we own and control an identity, we have our own way of interacting with the world, possess resources that are unique to us, and drive actions that no one else does.

AI being a giant compressor is one enabling technology.

Another key pillar is powered by cryptography, and we will see more developments when the world's attention circles back to the blockchain industry.

The phrase "your knowledge" is put in brackets because it is hard to distinguish it from its neighbours. We can argue that data is equivalent to knowledge, depending on the level of compression. We can also argue that one's identity is itself knowledge (think of the passcodes you remember, the way of speaking you master, the events you memorize).

One thing that is clear in the new world is that we become less reliant on possessing data and more reliant on possessing knowledge -- a compressed form of data. Put in the context of GenAI, we do not have to carefully take notes on the generated result, but we do pay attention to the prompt/chat sequences that lead to a particular result 100 or 1000 times larger than the prompt; those sequences are essentially a form of "knowledge".

Support/ Bug/ Marketing

Support happens during use. Repetitive patterns in chats signal where support may be needed, and humans can chime in to help "call the tools" or "understand the intent". It is indistinguishable whether a bot or a human is responding (latency aside). The bot/agent can also give interim shallow responses and "circle back" once it receives guidance from human experts.

While the web era hid the operating system and programming language from normal users, the AI era blurs whether the backend is an LLM agent or a human operator.

Bugs come in a different form. Instead of showing a broken UI or spitting out error messages, the new bug becomes a simple "Oh, I cannot process that at the moment" or "Sorry, I cannot understand. Can you elaborate a little bit?". Only the software developers see the various types of bugs in the backend: quota exceeded, tool-calling errors, database issues, malformed user input, ... you name it.
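
A brief sketch of this "one gentle reply, many backend causes" pattern; the error types are hypothetical examples, not a specific framework's exceptions.

```python
# A brief sketch (my illustration) of how varied backend failures can collapse
# into one gentle chat reply, while the full detail stays in the developers' logs.
import logging

logger = logging.getLogger("agent")

class QuotaExceeded(Exception): ...
class ToolCallError(Exception): ...

def answer(user_message: str, run_agent) -> str:
    try:
        return run_agent(user_message)
    except (QuotaExceeded, ToolCallError, ValueError) as exc:
        logger.error("backend bug for %r: %s", user_message, exc)  # developers see this
        return "Sorry, I cannot process that at the moment. Could you elaborate a bit?"
```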

After some "bugs" are resolved, it takes minimal effort to push the fix to market. Developers can get the list of impacted users and push a message to them, or use a more curated message to steer the conversation onto the expected path.

Democracy

Software logic has moved between the centre and the edge multiple times in the past:

  • Mainframe - centre (server). Well-educated professionals, upfront hardware investment.
  • Desktop - edge. Small companies, hackers, hobbyists.
  • Web - centre (server). More hackers, web developers (without OS knowledge).
  • Cloud - centre (cloud provider). The infrastructure is further offloaded. Developers can focus on building value.
  • Mobile - edge. Mobile developers, better interactions, rich UIs. Serverless architectures enable frontend (web/mobile app) developers to roll out full-stack applications easily.
  • AI - centre (the future client side is only a multi-modal chat). Anyone who can write a prompt can come up with the core logic.

Despite the large swings in where the business logic lives, one consistent trend is that software development becomes more democratic: from low-level coding in the early days (assembly/C++), to quick scripting (PHP/Python/JS), and today -- "prompting"!

In the near future, everyone will be able to build an agent with the following ingredients (a rough sketch follows the list):

  • Knowledge base: a collection of documents
  • Prompt: natural language instructions to steer the behaviour of agents
  • (Optional) Tools: these are less in demand given the upcoming agent-to-agent communication, MCP, and Computer Use. Many tools will be available via OpenAPI. The author only needs natural language to specify when to do what with which tools.
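
A rough sketch of how small such an agent definition could be; the field names and the example agent are my own invention.

```python
# A rough sketch (hypothetical field names) of a minimal agent definition:
# a knowledge base, a prompt, and an optional tool list.
from dataclasses import dataclass, field

@dataclass
class AgentSpec:
    knowledge_base: list[str]          # paths or URLs of documents
    prompt: str                        # natural-language instructions
    tools: list[str] = field(default_factory=list)  # e.g. MCP server names (optional)

travel_helper = AgentSpec(
    knowledge_base=["visa_rules.md", "airline_faq.pdf"],   # hypothetical documents
    prompt="Help the user plan trips; always confirm dates before booking.",
    tools=["calendar", "flight_search"],                   # hypothetical MCP tools
)
```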

Macro

Another idea, following the macro concept in Lisp, is to let AI build the tools it needs. Paul argues that about 25% of Viaweb's code was macros that could not easily be written in other languages. Translated to today's AI world, that probably means a meta-agent that builds other agents, or an LLM that writes tools/MCP servers itself. Of course, we cannot open this backdoor and offload all the responsibility onto users. There must be some gatekeeping mechanism, either human review or constraints on the structure of how tools are built (how the macros are expanded).
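
A speculative sketch of that gatekeeping idea: the model drafts a tool, but nothing becomes callable until a human reviewer signs off. All names and the flow are hypothetical.

```python
# A speculative sketch of a "macro-like" meta-agent: the model drafts a new
# tool, and the tool only becomes callable after human review -- the
# gatekeeping step mentioned above.
APPROVED_TOOLS: dict[str, str] = {}   # name -> reviewed source code

def draft_tool(spec: str, generate_code) -> str:
    """Ask the model to write a tool from a natural-language spec."""
    return generate_code(f"Write a small, self-contained tool that: {spec}")

def register_tool(name: str, source: str, human_review) -> bool:
    # Constrain expansion: nothing runs until a person signs off.
    if human_review(source):
        APPROVED_TOOLS[name] = source
        return True
    return False
```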

Stay high level

Quote from the book:

At places like MIT they were writing programs in high-level languages in the early 1960s, but many companies continued to write code in machine language well into the 1980s. I bet a lot of people continued to write machine language until the processor, like a bartender eager to close up and go home, finally kicked them out by switching to a RISC instruction set.

Writing prompts and building multi-step agents is fast in development but costly in token consumption. If we look back at how high-level programming languages took off, we might expect the infrastructure cost to drop dramatically and rapidly.

That means we can stay high level and focus on building value at this stage. All we need to do is wait, and the break-even point of the ROI equation will eventually come.
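
As a back-of-the-envelope illustration (every number below is invented), the break-even intuition looks like this:

```python
# Purely illustrative arithmetic: high-level prompting saves developer time
# today and only needs token prices to keep falling for the ROI to flip.
def break_even_tasks(dev_hours_saved: float, hourly_rate: float,
                     extra_tokens_per_task: int, price_per_1k: float) -> float:
    """How many tasks until the extra token spend eats the saved dev cost."""
    saved = dev_hours_saved * hourly_rate
    extra_cost_per_task = extra_tokens_per_task / 1000 * price_per_1k
    return saved / extra_cost_per_task

# e.g. 40 hours saved at $50/h vs. 20k extra tokens per task at $0.002/1k tokens
print(break_even_tasks(40, 50, 20_000, 0.002))   # -> 50,000 tasks before break-even
```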

Remark

Paul's book is a resounding piece worth rereading every few years. Although new technology emerges very quickly, the core of society and business stays the same.

Bibliography: Paul Graham. Hackers & Painters: Big Ideas from the Computer Age. O'Reilly Media, 2004.
