Typeland: The Power of Set Theory

Kiru Sebato

JavaScript is an imperative language: you write a series of statements, and JavaScript executes them unquestioningly. TypeScript, true to its name, introduces the abstract concept of types, which are defined with a distinct syntax of their own. Types live in a domain that I refer to as Typeland. Unlike regular JavaScript, Typeland is declarative: you declare the shape a type should have, and TypeScript then determines whether another type matches that shape.
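As a minimal sketch of this shape matching, consider the snippet below. The names HasLabel, describeItem, and button are made up for illustration. The object never claims to be a HasLabel; TypeScript accepts it simply because its shape fits.

```typescript
// Declare the desired shape once.
interface HasLabel {
  label: string;
}

// The function only states which shape it needs.
function describeItem(item: HasLabel): string {
  return `Item: ${item.label}`;
}

// This object has an extra property, but it still matches the declared shape.
const button = { label: 'Save', disabled: false };
const described = describeItem(button);
console.log(described);
```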

The Programmer's Bane: Mathematics

Programmers typically don't require extensive computation skills, as they work with machines specifically designed for computing. Many programmers lack knowledge of the mathematical principles underlying concepts like cryptography, computer graphics, and machine learning. Those who possess such expertise are often regarded as geniuses by their respective communities.

Typeland is based on the mathematical discipline of set theory, where new types can be derived from others through set unions, intersections, and differences. However, most TypeScript programmers are unfamiliar with set theory, and thus cannot fully harness the power of Typeland. For everyday work, a basic understanding of Typeland is sufficient: most TypeScript users simply need to know how to declare the type of a variable or argument and how to use generics. Yet Typeland offers so much more. With well-crafted type definitions, you can precisely design how developers interface with your API, essentially shaping the developer's experience.
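To make the three set operations mentioned above concrete, here is a small sketch. The color types are invented for this example. The union and intersection operators are part of the type syntax, and the difference is expressed with the built-in Exclude utility type.

```typescript
// Each string literal type is a set with exactly one member.
type Primary = 'red' | 'green' | 'blue';
type Warm = 'red' | 'orange' | 'yellow';

type AnyColor = Primary | Warm;            // union: members of either set
type WarmPrimary = Primary & Warm;         // intersection: only 'red'
type CoolPrimary = Exclude<Primary, Warm>; // difference: 'green' | 'blue'

// The arrays below only compile because every element is a member of its set.
const anyColor: AnyColor[] = ['red', 'green', 'blue', 'orange', 'yellow'];
const warmPrimary: WarmPrimary[] = ['red'];
const coolPrimary: CoolPrimary[] = ['green', 'blue'];
```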

TypeScript Black Magic

I often refer to the ability to write advanced type definitions as "black magic." To the untrained eye, complex type definitions appear cryptic. Consider the following example.

type PairToken<T extends string> =
  string extends T
  ? string
  : T extends `${infer P}.${'open' | 'close'}`
    ? P
    : never;

In this TypeScript code, three main concepts are employed: generics, with T being either the general string type or a set of concrete string values; a conditional type (written like a ternary if-else) that separates the general string from concrete strings; and a second conditional type that extracts the desired type P from a string template pattern.

The extends keyword essentially corresponds to the "is a subset of" relation in set theory. The check string extends T covers the case where T is the general string type. This is necessary because the general string type can never be a subset of a set of concrete strings, so the second conditional would always fail for it. It is analogous to asking whether the set of all integers fits inside the set of integers from 1 to 10. Without this check, PairToken<string> would resolve to never.
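The effect of that first check can be sketched by comparing PairToken with a variant that leaves it out. UnguardedPairToken is a hypothetical name used only for this comparison.

```typescript
type PairToken<T extends string> =
  string extends T
  ? string
  : T extends `${infer P}.${'open' | 'close'}`
    ? P
    : never;

// Hypothetical variant without the guard, for comparison only.
type UnguardedPairToken<T extends string> =
  T extends `${infer P}.${'open' | 'close'}` ? P : never;

// With the guard, the general string type passes through unchanged.
const guarded: PairToken<string> = 'anything';

// Without the guard the result is never, so no value can be assigned to it.
// @ts-expect-error
const unguarded: UnguardedPairToken<string> = 'anything';
```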

If, on the other hand, T is a set of concrete strings, such as 'foo' | 'bar' | 'baz', then the type inference proceeds with the second conditional. It utilizes another lesser-known feature of TypeScript: the infer keyword. In this case, it accomplishes two tasks: A) identifying the subset of strings within T that end with .open or .close, and B) extracting only the portions of the strings preceding these two suffixes. Consequently, the following two types are identical:

let foo: PairToken<'foo.open' | 'foo.close' | 'bar'>;
let bar: 'foo';
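If you want the compiler to confirm this, one possible sketch is a small equality helper. IsSame is a hypothetical helper, not a built-in type; it resolves to true only when both types are assignable to each other.

```typescript
type PairToken<T extends string> =
  string extends T
  ? string
  : T extends `${infer P}.${'open' | 'close'}`
    ? P
    : never;

type Tokens = 'foo.open' | 'foo.close' | 'bar';

// Hypothetical helper: true only when A and B are mutually assignable.
type IsSame<A, B> = [A] extends [B] ? ([B] extends [A] ? true : false) : false;

// This line compiles only if PairToken<Tokens> is exactly 'foo'.
const matchesFoo: IsSame<PairToken<Tokens>, 'foo'> = true;

// 'bar' has no suffix, so it contributes nothing; only 'foo' remains.
const pairName: PairToken<Tokens> = 'foo';
```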

This type can now be used to improve the IntelliSense results a developer can get using your library. For example:

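class MyClass<T extends Record<string, unknown>> {
  constructor(protected data: T) {}

  definePair(
    // keyof T may also contain number and symbol keys;
    // "& string" keeps only the string keys that PairToken accepts
    name: PairToken<keyof T & string>,
    openSequence: string,
    closeSequence: string,
  ) {
    // ... implementation omitted
  }
}

const obj = new MyClass({
  'foo.open': 'some value',
  'foo.close': 'some other value',
  'bar': 'some final value',
});

// Only 'foo' is accepted as the name; the two sequences are placeholder values
obj.definePair('foo', '(', ')');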

The advantage of this solution over a plain string parameter is twofold. Firstly, your users will receive IntelliSense suggestions for only 'foo'. This is particularly beneficial when the definition of obj is located far away, possibly at the beginning of a several-hundred-line source file or in a different file altogether, because developers won't need to navigate to the definition to remind themselves of what to use. Secondly, they will receive a TypeScript compilation error if they modify the original definition (say, rename the 'foo' prefix of both keys to 'bar') without adjusting the method call.
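The compile-time check can be sketched as follows. PairRegistry and its pairs array are stand-ins I made up for the omitted implementation, and the bracket characters are placeholder values. The second call is rejected by the compiler, which the @ts-expect-error comment documents.

```typescript
type PairToken<T extends string> =
  string extends T
  ? string
  : T extends `${infer P}.${'open' | 'close'}`
    ? P
    : never;

// Hypothetical stand-in for a class with the omitted implementation.
class PairRegistry<T extends Record<string, unknown>> {
  readonly pairs: string[] = [];

  constructor(protected data: T) {}

  definePair(
    pairName: PairToken<keyof T & string>,
    openSequence: string,
    closeSequence: string,
  ): this {
    // Assumption: recording the pair name is enough for this sketch.
    this.pairs.push(String(pairName));
    return this;
  }
}

const registry = new PairRegistry({ 'foo.open': 1, 'foo.close': 2, bar: 3 });

// Accepted: 'foo' has both an .open and a .close token.
registry.definePair('foo', '(', ')');

// Rejected at compile time: 'bar' is not a pair prefix.
// @ts-expect-error
registry.definePair('bar', '[', ']');
```

Keep in mind that the rejection happens in the type checker only; a build that merely strips the types would still execute the second call.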

Let's make this example even more complex:

type ScopeMode<T extends string, M extends string> =
  string extends T | M
  ? string
  : T extends `${infer P}.${'open' | 'close'}`
    ? (P & M)
    : never;

The objective of this extended version is to identify all modes in M that correspond to tokens in T sharing a common prefix P and differing only in the suffix .open or .close.

The first difference is that string extends T has been expanded to string extends T | M. In set theory, if either T or M contains the generic string, then the union of T and M will also contain the generic string. As a result, the delicate logic concealed within the 'else' branch will only be computed if neither T nor M is the generic string.
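A short sketch of both cases follows. The token and mode names are invented for this example.

```typescript
type ScopeMode<T extends string, M extends string> =
  string extends T | M
  ? string
  : T extends `${infer P}.${'open' | 'close'}`
    ? (P & M)
    : never;

type Tokens = 'tag.open' | 'tag.close' | 'comment.open' | 'comment.close' | 'text';
type Modes = 'tag' | 'attribute';

// Both inputs are concrete: only 'tag' is a mode and also a pair prefix.
const scopeMode: ScopeMode<Tokens, Modes> = 'tag';

// One input is the general string type: the guard short-circuits to string.
const fallback: ScopeMode<string, Modes> = 'whatever';
```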

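The second difference is that the returned type is now P & M. In set theory, this intersection is equivalent to asking for all members that can be found in both sets P and M. Written as imperative code, it would look similar to the following, with one difference: the function below requires both the .open and the .close token, whereas the type accepts a mode as soon as either one is present.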

function isScopeMode(mode: string, tokens: Set<string>): boolean {
  return tokens.has(`${mode}.open`) && tokens.has(`${mode}.close`);
}

function getScopeModes(modes: Set<string>, tokens: Set<string>): Set<string> {
  const result = new Set<string>();
  for (const mode of modes) {
    if (isScopeMode(mode, tokens))
      result.add(mode);
  }
  return result;
}
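For comparison, here is a usage sketch of the two functions with made-up token and mode names. Note that 'comment' is dropped because its .close token is missing.

```typescript
function isScopeMode(mode: string, tokens: Set<string>): boolean {
  return tokens.has(`${mode}.open`) && tokens.has(`${mode}.close`);
}

function getScopeModes(modes: Set<string>, tokens: Set<string>): Set<string> {
  const result = new Set<string>();
  for (const mode of modes) {
    if (isScopeMode(mode, tokens))
      result.add(mode);
  }
  return result;
}

// Example data, invented for this sketch.
const tokens = new Set(['tag.open', 'tag.close', 'comment.open', 'text']);
const modes = new Set(['tag', 'comment', 'attribute']);

const scopeModes = getScopeModes(modes, tokens);
console.log(Array.from(scopeModes)); // → ['tag']
```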

Effectively, Typeland is a separate programming language layered on top of JavaScript. It appears somewhat similar and even borrows some of its keywords, but primarily follows the paradigm of declarative programming languages. Your declarations essentially populate an internal database within the TypeScript engine, which is then used for a specific purpose: type inference and checking.

The Type Caveat

Type inference and checking are powerful features of TypeScript. However, don't let them deceive you. Typeland is only coherent within its strictly defined boundaries. Any occurrence of the "any" keyword is a potential point of failure. Any contact point with the outside world - outside of Typeland, that is - is a potential point of failure. Any form of communication between two processes is a potential point of failure.

Wherever an "any" value is interpreted and imported into Typeland, malformed data can introduce bugs in places where the types promise safety. My stance on TypeScript is that it should be seen and treated as an alternative to both JSDoc and ESLint. Although "any" may serve as a convenient workaround for troublesome TypeScript issues, you remain responsible for ensuring that your code functions as intended, and using "any" can prevent TypeScript from alerting you when a change breaks other sections of your code.
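One common way to guard such a boundary is to check the incoming value at runtime before letting it into Typeland. The sketch below is only an illustration; PairConfig, isPairConfig, and parsePairConfig are hypothetical names, and the JSON string stands in for any data arriving from outside.

```typescript
// Hypothetical shape of data arriving from outside Typeland.
interface PairConfig {
  open: string;
  close: string;
}

// Runtime check that narrows an unchecked value to PairConfig.
function isPairConfig(value: unknown): value is PairConfig {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate['open'] === 'string'
    && typeof candidate['close'] === 'string';
}

function parsePairConfig(json: string): PairConfig {
  // JSON.parse returns any; the annotation forces a check before use.
  const parsed: unknown = JSON.parse(json);
  if (!isPairConfig(parsed)) {
    throw new TypeError('Malformed pair config');
  }
  return parsed;
}

const pair = parsePairConfig('{"open":"(","close":")"}');
console.log(pair.open, pair.close);
```

The point of the design is that a malformed payload fails loudly at the boundary instead of silently travelling through code whose types claim it is well-formed.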

Conclusion

A great developer knows how to use their tools, but a master developer knows how to create great tools. TypeScript's Typeland is a powerful and expressive domain that enables developers to craft precise type definitions and enhance the developer experience. By comprehending the underlying set theory and declarative nature of Typeland, you can harness its full potential to produce more robust and maintainable code. However, every tool has its caveats. Exercise caution with the "any" and "unknown" keywords, as no "interface" should rely on them.
