
The unreasonable effectiveness of old computers
Sat, 30 Dec 2023 20:08:13

My ThinkPad X260 does not offer nearly as much compute as my 2020 MacBook Pro, despite the ThinkPad having more than twice the RAM. The ThinkPad also has an HDD instead of an SSD like my Mac’s, and sometimes this shows in the noise and the latency. This might make it sound like the ThinkPad suffers from unequivocal shortcomings, but consider that I’m rarely, if ever, fully harnessing my MacBook’s computing power. I’m not training a large language model on a massive dataset, and I rarely find myself compiling huge programs or doing intensive graphics work. I’d say I spend 95 percent or more of my time writing code or essays for school, browsing the internet, listening to music, and reading PDF/EPUB/DJVU files.

However, the MacBook is much more permissive of terrible computer-usage habits. I am (and probably many other people out there are) guilty of things like keeping the computer running for weeks, if not months at a time (when the more prudent thing to do would be to, well, shut it down when not using it), leading to absolutely egregious uptimes. When using the web browser, I tend to hoard tabs like no one’s business, which in my experience only serves to diffuse my focus and consumes stupid amounts of processing power.

If anything, using the ThinkPad makes me much more aware of my habits. It forces me to ask myself: do I really need to have the browser open? Do I really need this particular tab in front of me, right now? In many cases, the answer is no.

Digital minimalism (no, I’m not necessarily talking about suckless software, though I guess it does fall under this umbrella term) and digital hygiene have been buzzwords for a long time in our modern age of behemoth programs, where compute (to an extent) is cheaper and more accessible than ever before. Especially in the wake of the COVID pandemic, it seems that the human-tech interface has transformed irrevocably, with the tacit and harmful expectation that you should be “plugged in” and reachable literally 24/7.

Computer scientist and professor Mark Weiser coined the term “ubiquitous computing” in 1988, and later remarked that:

Ubiquitous computing names the third wave in computing, just now beginning. First were mainframes, each shared by lots of people. Now we are in the personal computing era, person and machine staring uneasily at each other across the desktop. Next comes ubiquitous computing, or the age of calm technology, when technology recedes into the background of our lives.

(Emphasis is mine.) It’s a beautiful vision of what was once the future. But have we even come close to realizing Weiser’s prophecy? It’s difficult to say. Computing is ubiquitous nowadays, this much is true and has been true for a while. But nothing about this feels “calm”, and technology does not seem to have “receded into the background of our lives”. If anything, it has done the opposite.

Even though the bigger picture has gone astray, there are still actions you can take to create a healthier interface with your phone and/or your personal computer. Hence the title of this entry: operating within the constraints of older hardware can, counterintuitively, transform the user experience for the better. You may even find that it changes the way you see computers and similar devices into what they should be: more so tools and portals, less so lifelines and coping mechanisms.

Holistic (language) learning
Wed, 20 Dec 2023 18:45:17

NB: All video links are to an Invidious instance.

The only person I’ve seen who talks about what is dubbed “holistic language learning” is Days and Words (Lamont) on YouTube. It is basically incorporating your language learning into your life such that your non-language-learning habits influence, and are in turn influenced by, your language-learning habits, in a symbiosis. That might be a vague description, but there are simple enough examples to point towards: see his video on language learning and exercise, or cold showers.

A big part of Lamont’s central thesis is that putting yourself in uncomfortable situations, whether it be a physically strenuous run, an ice-cold shower, or waking up to an alarm you purposefully set for 4 AM, ends up lowering the activation energy required for you to face the discomfort of listening to your target language. It’s analogous to the fable about the frog in a pot of slowly boiling water.

In another video, he recommends taking notes while watching videos (instead of forcing yourself to pay attention to every word spoken) or playing mindless games while listening to audiobooks (somehow, the low-level stimulation leads to a drastic increase in your perceived endurance for listening to audio you can only partially comprehend).1 I’ve found that I can easily listen to a book in Russian for up to three hours while playing 2048 or Tetris, something I definitely could not pull off if I were just listening to the book and doing nothing else.

I wonder if such actions and habits could be generalized to other skills outside of language learning, such as programming or drawing.

Some people might say it seems cynical to have to reach for ways of “engineering” yourself into doing what should be innocent hobbies, powered purely by “passion”.

The truth is that you can be truly passionate about something but still never quite seem to be able to pull together the intention to do it. A pithy example: I love to read literature, but I can rarely get myself into the state of solid focus that is required to go through a dense novel for meaningful lengths of time, not just in fits and starts.

My thoughts on this topic stem from observations about learning that I’ve been thinking about more and more lately:

  1. The foolproof way to get better at something is by doing that thing many times over.
    1. There are very few exceptions to this across most fields and skills.
    2. There are caveats (of course, there is a spectrum from inefficient ways of practicing to more efficient ways), but the core statement still holds.
  2. Learning something deeply requires near-constant skepticism and self-assessment.
  3. The more you know, the more you learn.
    1. This may seem unfair (surely, the people who need to learn the quickest and the most are beginners) but it’s just the way it is.

Though this might seem like a given, it’s crystallized for me recently in ways that I just can’t ignore. Of course, there’s always some resistance. From talking to my friend about autodidacticism, I’ve noticed that we tend to put all of our attention on the how of our plans, but never end up actually executing them.2 At the risk of repeating myself from the virtual memory post a few weeks ago, it’s far too easy to fool yourself into thinking that you’ve understood something when you still have gaping holes in your mental model.

Is this just me regurgitating the basics of Dunning-Kruger? Probably. But I’ve been working on generalizing that mindset so that I can find better ways to combat it, regardless of what it is I’m trying to learn. Because of course you’re not motivated to learn when you have the (false) belief that you already have a handle on things.

I’ve been trying to combine Lamont’s notion of “holistic” learning and the three tenets that I mentioned above, in order to create a practice that will work with me, and hopefully this gives you some food for thought as well.


  1. If anyone knows of any empirically studied mechanism behind this effect, namely the effect of mild distraction actually improving your long-term focus on certain tasks, I would love to see it.↩︎

  2. This procrastination-through-planning manifests in other ways too. A big phenomenon on the Russian language learning subreddit, r/russian, is Russian learners uploading photos of their Cyrillic handwriting—which is fine in and of itself, but sometimes it seems that people mistake practicing their handwriting for the very act of learning Russian. It’s a quick way to get compliments from native speakers without having to put in that much effort. You know what would be more helpful? Learning how to touch type in Russian!↩︎

Operating systems and an uncertain future
December 3, 2023

My school stopped making CS students take Operating Systems about four years ago because there weren't enough faculty available to teach the course every year. Before that, OS was a core part of the major. To my knowledge, we are one of the only schools that don't require CS students to take OS.

This means that it is possible, perhaps even likely, to go through the four-year path to a Bachelor's in CS without ever learning what a syscall is. This may mean nothing in the grand scheme of things, but I think it's more important than ever to at least have a little knowledge of what's going on under the hood. Using abstractions without really thinking about how they work is partly how we got to the state of the web today: a garbage fire, to put it lightly! Not that that's a hot take, especially around these parts.
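To make "what's going on under the hood" a bit more concrete, here's a minimal C sketch (on a Unix-like system; it isn't taken from any particular course) contrasting a standard-library call with the syscall it ultimately sits on top of:

    /* printf() is a libc convenience: it formats and buffers, and eventually
     * asks the kernel to output the bytes via the write() syscall.
     * write() itself is just a thin wrapper around that kernel request. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        printf("hello from the standard library\n");        /* buffered, formatted I/O */
        write(STDOUT_FILENO, "hello from a syscall\n", 21);  /* direct request to the kernel */
        return 0;
    }

Nothing fancy, but it's the kind of two-line demo that makes the boundary between user space and the kernel visible at all.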

I don't want to make it seem like I have any experience running an entire college department with hundreds of moving parts, under pressure to seek greater and greater funding, shouldering the burden of catering to hundreds of students, and so on. I don't have much pedagogical experience, either. So when I see questionable decisions I usually move on with my day and think well, they must've had a good reason to do X, so whatever. I also don't want to come off like I'm being gatekeepy; there's almost nothing that annoys me more than STEM-associated gatekeeping culture.

But hey, to end on a positive note, maybe the fact that OS is being managed by two of the best professors in the entire department will change things :) This is not an exaggeration; this is backed up by (somewhat dubious) statistics and online reviews. Despite my earlier snarkiness, I'm nothing but optimistic about the class's future.

Virtual Memory
November 30, 2023

Virtual memory is hard. It is the first real hurdle you might come across in a systems course (or perhaps in the college Computer Science track, generally speaking). It started out as the simplest possible abstraction—more-or-less basic arithmetic—but today it is a complex beast with many moving parts, and with just as many essential systems scaffolded on top of it. Computer architecture classes derive at least part of their notoriety from virtual memory. But what exactly makes it so intimidating?

From taking an Intro to Systems course and now being almost done with an Operating Systems course, I’ve noticed that understanding, really understanding virtual memory is easier said than done. My professor once said:

...virtual memory is one of those things where you can easily fool yourself that you've understood it, when you actually haven't.

Add that to the laundry list of jarringly profound truths that I have received in this class. This includes ruminations on the economic state of the world brought on by the topic of file systems (most of the storage on disk is taken up by a few large files... much like how billionaires own most of our collective wealth).

What can I say, he's an interesting teacher. I digress, though.

This post got a bit longer than I thought it would, so you can just read the “The small program in question” section. But feel free to peruse “Why it might be difficult to understand VM” for some of my thoughts on teaching and understanding VM as a systems concept.

Don’t get me wrong, I don’t yet have a perfect mental model of it either – because it really is more difficult than it seems! I think the creed that “almost everything in CS can be solved with indirection” [1] is generally true, and VM is one of the quintessential triumphs of systems abstraction.

The small program in question

Link to GH repo. I of course welcome forks, pull requests, and feedback in general.

Why it might be difficult to understand VM

TL;DR:

I will say this: like many other systems concepts (see scheduling, the process abstraction, etc) VM is relatively easy to “get” at a high level. Like okay, the computer abstracts a huge address space for each process, and it maps the few parts that the process actually uses to physical memory. Great.

But dig into the implementation even just a little bit, and you’ll find a heap (no pun intended) of implications and details to consider. For example, how would you maintain these mappings of virtual addresses to their physical counterparts?

If you decided to use a page table (as real-world systems do), which maps virtual pages (fixed-size units of memory, usually 4KB) to physical frames in what are called page table entries, you should be wary of the fact that these page tables grow linearly with the size of the virtual address space (VAS). And VASes tend to be very big! Your computer may have a 64-bit address space, so there are 2^64 possible addresses. This is an incomprehensibly huge number: more than the number of grains of sand on Earth, and many orders of magnitude more than the number of stars in our galaxy.
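As a rough illustration (a toy single-level table with 4KB pages, not any real kernel's layout), the mapping machinery boils down to splitting a virtual address into a virtual page number and an offset, then looking the page number up:

    /* Toy single-level page table lookup, assuming 4KB pages.
     * The table is a flat array of entries indexed by virtual page number. */
    #include <stdint.h>
    #include <stdbool.h>

    #define PAGE_SHIFT 12                    /* 4KB pages: low 12 bits are the offset */
    #define PAGE_SIZE  (1u << PAGE_SHIFT)

    typedef struct {
        uint64_t frame;    /* physical frame number the page maps to */
        bool     present;  /* is the page actually resident in RAM? */
    } pte_t;

    /* Translate vaddr via the flat table; returns false if the page is unmapped. */
    bool translate(const pte_t *table, uint64_t vaddr, uint64_t *paddr) {
        uint64_t vpn    = vaddr >> PAGE_SHIFT;      /* which virtual page? */
        uint64_t offset = vaddr & (PAGE_SIZE - 1);  /* where within that page? */
        if (!table[vpn].present)
            return false;                           /* page fault territory */
        *paddr = (table[vpn].frame << PAGE_SHIFT) | offset;
        return true;
    }

The problem the next few paragraphs describe is exactly that flat table: it needs one entry per virtual page, whether or not the page is ever used.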

Then consider the fact that you have to maintain one of these tables for each running process. It’s common for computers nowadays to have hundreds of running processes (sometimes more than a thousand) at any given time, so you can see why the size would pose a problem. To quantify such a size:

Size of page table = Number of virtual pages in the VAS * Size of a page table entry

For a theoretical system with a 32-bit VAS, 4-byte (32-bit) page table entries, and 4KB pages, this would be four megabytes (2^20 entries * 2^5 bits per entry / 2^3 bits per byte = 2^22 bytes) per running process! What's the point of virtualizing memory if so much of it is just used on maintaining these page tables?
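If you want to sanity-check that arithmetic, here's a throwaway C snippet using the same hypothetical parameters (32-bit VAS, 4KB pages, 4-byte entries):

    /* Sanity check of the page table size arithmetic for the hypothetical
     * 32-bit system above (not any real machine's configuration). */
    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint64_t vas_size   = 1ull << 32;   /* 32-bit virtual address space */
        uint64_t page_size  = 4096;         /* 4KB pages */
        uint64_t pte_size   = 4;            /* 4-byte page table entries */

        uint64_t num_pages  = vas_size / page_size;   /* 2^20 pages */
        uint64_t table_size = num_pages * pte_size;   /* 2^22 bytes */

        printf("%llu pages -> %llu-byte page table (%llu MB) per process\n",
               (unsigned long long)num_pages,
               (unsigned long long)table_size,
               (unsigned long long)(table_size >> 20));
        return 0;
    }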

Well, page tables in modern systems circumvent this issue by using some more clever indirection, in what are called multi-level page tables. First, we chop up the previous large page table into page-sized chunks. We then organize these chunks into a level-based hierarchy. The top-level page table (there is exactly one of these per process) has page table entries that don’t hold virtual-to-physical mappings, but rather point to other page tables. That means a single top-level entry fans out to an entire lower-level table, so with n levels it can cover on the order of (PGSIZE / entry size)^(n-1) pages of the address space. We are paging the page table itself! Very cool.

Another significant pro of multi-level page tables is that if there are pages that aren’t currently being used in RAM (and there are usually many of these unmapped pages, since most processes don’t use that much memory), we don’t have to traverse the n levels for those pages’ mappings, nor do we even have to store the entries for those pages in the lower levels. In fact, the only page table we really need in RAM at all times is the top-level page table. This saves us a lot of space because the memory used by our page tables is no longer a function of the size of our VAS (which is huge), but a function of how much memory is being used by the program (which tends to be small).
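Here's a toy two-level version of the earlier lookup (again a sketch, not x86's or RISC-V's actual entry formats) that shows where the savings come from: a lower-level table that was never needed is simply a null pointer.

    /* Toy two-level page table walk for a 32-bit address: 10 bits of top-level
     * index, 10 bits of leaf index, 12 bits of page offset. An unmapped 4MB
     * region costs nothing beyond a NULL slot in the top-level table. */
    #include <stdint.h>
    #include <stddef.h>
    #include <stdbool.h>

    #define PAGE_SHIFT  12
    #define INDEX_BITS  10
    #define INDEX_MASK  ((1u << INDEX_BITS) - 1)

    typedef struct {
        uint32_t frame;    /* physical frame number */
        bool     present;
    } pte_t;

    typedef struct {
        pte_t *tables[1u << INDEX_BITS];  /* NULL => no leaf table allocated */
    } top_table_t;

    /* Walk both levels; returns false (conceptually, a page fault) if either
     * the leaf table is missing or the leaf entry isn't present. */
    bool walk(const top_table_t *top, uint32_t vaddr, uint32_t *paddr) {
        uint32_t top_idx  = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK;
        uint32_t leaf_idx = (vaddr >> PAGE_SHIFT) & INDEX_MASK;

        pte_t *leaf_table = top->tables[top_idx];
        if (leaf_table == NULL || !leaf_table[leaf_idx].present)
            return false;

        *paddr = (leaf_table[leaf_idx].frame << PAGE_SHIFT)
               | (vaddr & ((1u << PAGE_SHIFT) - 1));
        return true;
    }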

Anyways, that’s only one of the optimizations that are commonly used in real-world VM systems. This sort of paradigm with multiple layers of indirection shows up all the time in systems because of its ability to bestow a single pointer with access to orders of magnitude more data than it could reach on its own. There are many more details that go into a complete VM implementation... but this post is long as it is. I'll catch you after finals?

[1] Attributed to David Wheeler, the first person in the world to receive a PhD in Computer Science.

[2] When my professor gave the midterm for his graduate-level OS class, he included a question related to VM that less than half the class got full credit on. In the end, he had to take it out of the grading. Even grad students struggle with it!

Tiny update from college
January 30, 2023

Been very busy with college stuff. It feels very nice to be taking classes in things I'm actually interested in, a welcome and dramatic contrast from high school. The workload is... a lot, but it's somewhat offset by my interest in the content. Hopefully I will be able to get a post up soon-ish, I have had some ideas rolling around in my head for a while.

The fundamental four panels of storytelling
January 15, 2023

In the Western world, the common way to describe narrative structure (and likely the way you learned it in school) is exposition, rising action, climax, falling action, and denouement (or a slight variation on that).

China, Japan, and Korea have their own version of this, called qǐ chéng zhuǎn hé in the original Chinese. In hanzi, it is written like this: 起承轉合. In Korean and Japanese, however, the words derive from these Chinese characters, which as you can see have a different fourth character: 起承轉結. Unlike in Western narrative structure, conflict is not even implied. Kishōtenketsu or ki-seung-jeon-gyeol can be applied in formal essays, four-panel comic strips, TV show arcs, and basically any mode of storytelling.

There was a meme on the Korean internet where if someone was obsessed with a certain thing, you would describe it as ki-seung-jeon-, but replace the last word, gyeol, with whatever they were obsessed with. This is to say that, no matter what they're talking (telling a story) about, the conclusion will always be that certain thing they're fixating on. For example, if someone is obsessed with Star Wars, you'd describe it as ki-seung-jeon-Star-Wars.