Today I Learned

Everything from 2025

Spanner: Google's Globally-Distributed Database

https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf

I recently had reason to briefly explain how Google Spanner works while discussing Amazon Aurora Limitless. I have no idea if Limitless implements the same tech as Spanner, but I don't think the guess is entirely unreasonable. Again, I'm not making claims about Limitless; I really don't know how it works. But someone was explaining it to me and it just sounded like Spanner.

The Ultimate Insider Threat: North Korean IT Workers

https://cloud.google.com/transform/ultimate-insider-threat-north-korean-it-workers

Worth knowing if you're in the market for a job or hiring. The flip side of these scammers trying to get hired so they can infect your company with malware is scammers running fake interviews for devs, where the technical interview involves running malware on your machine. They'll say it's something like a web API server you'll build a frontend against. They're hoping you use your employer's laptop to do it and have a bunch of credentials on the machine. If they're really fancy, they'll wait in the background and see what they can do over the next couple of days. Even if it's just your personal laptop, expect to get extorted.

Stay vigilant out there.

Wikipedia: Henry Dreyfuss

https://en.wikipedia.org/wiki/Henry_Dreyfuss

Henry Dreyfuss is responsible for so many of the designs of everyday objects you take for granted. I recently brought him up in a discussion where we lamented the frankly appalling industrial design of many modern versions of things, with their pointless touchscreens, wireless radios, and impossible-to-use ergonomics. I joked that Dreyfuss must be rolling in his grave.

Cheating the Reaper in Go

https://mcyoung.xyz/2025/04/21/go-arenas/

I've always tried to explain to devs that you can do manual memory management in a garbage collected language. The simplest method is to just start using resource pools. The reason to manually manage memory is to prevent the garbage collector from interrupting you at random. If you have a lot of identical objects you're routinely allocating for a limited amount of time, I suggest you look into resource pools.
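A minimal sketch of the resource pool idea. Python is used here purely for illustration (the linked piece is about Go), and the `BufferPool` name, its methods, and the buffer sizes are all hypothetical. The point is only that identical objects get handed back and reused instead of being allocated fresh each time.

```python
from collections import deque


class BufferPool:
    """A tiny pool of reusable, identically sized bytearray buffers (hypothetical example)."""

    def __init__(self, count, size):
        self._size = size
        # Pay the allocation cost once, up front.
        self._free = deque(bytearray(size) for _ in range(count))

    def acquire(self):
        # Reuse an idle buffer when one exists; otherwise fall back to allocating.
        if self._free:
            return self._free.pop()
        return bytearray(self._size)

    def release(self, buf):
        # Hand the buffer back. Contents are not cleared, so the next
        # user must overwrite whatever it reads.
        self._free.append(buf)
```

A caller would `acquire()` a buffer at the start of a unit of work and `release()` it at the end, so steady-state work allocates nothing new.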

Either way, I came across this piece discussing how you can dig deeper in Go if you want to get even more manual control of your memory in a garbage collected language. They show how you can effectively implement an arena allocator.

When would you want an arena allocator? Whenever you have a bunch of work that allocates as it goes, followed by a single point in time where you want to wipe it all away and start again. Instead of randomly pausing and running mark-sweep, you can just set the memory index to zero when the work is done. It's perfect for work loops like web request handlers, scratch space during game update/render loops, and the gaps between files in batch processing.
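The "set the memory index to zero" idea can be sketched as a bump allocator over one preallocated block. This is not the implementation from the linked piece, which works in Go and deals with the garbage collector directly. The `Arena` class and its method names are assumptions made up for this illustration.

```python
class Arena:
    """Bump allocator over a single preallocated block (illustrative sketch)."""

    def __init__(self, capacity):
        self._block = bytearray(capacity)
        self._view = memoryview(self._block)
        self._offset = 0  # the "memory index": position of the next free byte

    def alloc(self, size):
        # Hand out the next `size` bytes by bumping the offset forward.
        end = self._offset + size
        if end > len(self._block):
            raise MemoryError("arena exhausted")
        chunk = self._view[self._offset:end]
        self._offset = end
        return chunk

    def reset(self):
        # Free everything at once. Chunks handed out earlier must not be
        # used after this, since later allocations will overlap them.
        self._offset = 0

    @property
    def used(self):
        return self._offset
```

In a request handler, you would call `alloc()` freely while building the response and `reset()` once after it is sent, so the cost of freeing is a single assignment no matter how many allocations happened.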

While I like manually controlling memory, I also like being lazy. I like the idea of structuring languages where you can nest manually controlled memory regions within an otherwise GC environment. Allowing you to reach for power tools when appropriate.

Part 3/3: Processing 23 Billion Rows of ALIEN TXTBASE Stealer Logs

https://www.troyhunt.com/processing-23-billion-rows-of-alien-txtbase-stealer-logs/

Part 2/3: Experimenting with Stealer Logs in Have I Been Pwned

https://www.troyhunt.com/experimenting-with-stealer-logs-in-have-i-been-pwned/

Part two in which stealer logs

Part 1/3: Begging for Bounties and More Info Stealer Logs

https://www.troyhunt.com/begging-for-bounties-and-more-info-stealer-logs/

Three posts about beg bounties. In our ongoing deterioration into a society predominantly defined by scams and cons, bug bounties are increasingly a target of malicious actors. In this case, one group of attackers has created huge pools of people infected with malware. They use that malware to steal login credentials to websites and build massive dossiers, colloquially referred to as stealer logs. These are sold in private channels until they get shared in a channel with a leak. At that point these collections hit the web.

This is where our new bottom feeders enter the scene. Being essentially talentless and morally bankrupt, they search out these lists and then contact all the sites who have users with computers infected with malware (so basically any site with a big enough user base). When someone shows up with a list of legitimate user credentials for your site, you take notice. How'd they get them? Your worst fear: there's a problem in your site leaking credentials.

The reality is much more benign. They're available through a few well known internet forums. They then try to spin a tale about how you need to pay them thousands of monies for forwarding them to you. Even if one percent of companies react before they think, contact a few thousand companies and you can make bank before it becomes too obvious a scam.

Lecture Friday: Elements of Programming Style - Brian Kernighan

Loud sound at the 56 minute mark. The fire alarm goes off and it's quite loud because the recording is fairly soft.

I quite like the elements of style discussed here. The style rules presented tend to port fairly well as general concepts to any other language. For example, I now try and order the blocks of my conditionals to put the shorter block at the top, even when it requires negating the conditional as I first think of it.
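As a small illustration of the conditional-ordering habit, here are two equivalent versions of the same function. The function names, the order structure, and the discount rule are all made up for this example and do not come from the lecture.

```python
def summarize_long_first(order):
    # Condition as first thought of: the long block sits on top and the
    # short case is stranded far below it.
    if order["items"]:
        total = sum(item["price"] * item["qty"] for item in order["items"])
        if total > 100:
            total *= 0.9  # hypothetical bulk discount
        return f"{len(order['items'])} items, total {total:.2f}"
    else:
        return "empty order"


def summarize_short_first(order):
    # Negated condition: the short block comes first, so the reader can
    # dismiss it immediately and focus on the main path.
    if not order["items"]:
        return "empty order"
    else:
        total = sum(item["price"] * item["qty"] for item in order["items"])
        if total > 100:
            total *= 0.9  # hypothetical bulk discount
        return f"{len(order['items'])} items, total {total:.2f}"
```

Both functions return the same result for every input. Only the reading order changes.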

I'm sure I've picked up a couple other techniques from here and I hope you will too.

I'm Not Feeling The Async Pressure

https://lucumr.pocoo.org/2020/1/1/async-pressure/

Another piece to add to the long-running debate on whether or not Python's asyncio is worth it. My guess is that in 3-5 years it'll become a regrettable complexity in the ecosystem, once free threading without the GIL becomes generally available. Just in time for everyone to have a significant investment in red versus blue functions that we can't easily back out of.

If language committees could stop adding everything just because it exists in another language, C++ style, that would be great. In JavaScript's case, async was a worthwhile feature to add in order to solve the existing problem of having to live inside an event loop outside your control while simultaneously having access to anonymous functions. More specifically, given such a language, developers will instinctively nest lambda callbacks until they're in so deep they're forced to start using single spaces as indentation.

CSP as in Go takes a step forward by not color coding functions, but takes two steps back by having the standard library magically hard code which syscalls are blocking. The more I look at the problem, the more I think I have to agree with Jeff Bonwick and Bryan Cantrill, who note that userspace multitasking will forever overpromise and underdeliver. Fundamentally, non-preemptive multitasking leads to an endless series of design problems you don't have if you just let your OS schedule you.

Lecture Friday: Scott Aaronson | How Much Math Is Knowable?

Great recap of many of the proofs we have about provability.

Wikipedia: Great Stork Derby

https://en.wikipedia.org/wiki/Great_Stork_Derby

Not the type of thing I usually share, but it kind of sticks with you.

Rough Idling

https://flak.tedunangst.com/post/rough-idling

Frustration at the state of software being interminably busy doing nothing of value.

What's in OpenBSD 7.7?

https://flak.tedunangst.com/post/Whats-in-OpenBSD-77

What's in OpenBSD 7.7? Essentially just an AMD GPU driver with a small Unix OS wrapped around it at this point.

I've kind of wondered for a while when the graphics card will just subsume the motherboard like a reverse integrated graphics stack.

Lecture Friday: Practical Data Oriented Design (DoD)

The thing that takes a bit to understand about Data Oriented Design is how the name implies it should be similar to some sort of code aesthetic like Object Oriented Programming. You go looking for the abstractions and intellectual virtues to structure code around but you never find any. It seems hard to understand because you keep trying to put it alongside philosophies like functional programming, procedural programming, and object oriented programming. The problem is it doesn't fit. You're comparing dogma to empiricism. It's like getting the intellectual tradition of Francis Bacon suddenly injected back into software development after a few decades concerned with moral purity and virtue. Turns out, you can measure things that matter to users and your code comes up wanting because you're not as good a developer as you think you are.

Unstructured Thoughts on the Problems of OSS/FOSS

https://www.gingerbill.org/article/2025/04/22/unstructured-thoughts-on-oss/

Great set of thoughts that bounce the problems around open source back and forth. I'm still lost in the quagmire. I can't say this really points out the road beyond it, but more thoughts on the table may help us discern the way forward.

Some Surprising Code Execution Sources In Bash

https://yossarian.net/til/post/some-surprising-code-execution-sources-in-bash/

Works on bash and ksh. Zsh isn't impacted. So, it turns out [[ "${num}" -eq 42 ]] allows code execution when num holds attacker-controlled input, because the operands of -eq are evaluated as arithmetic expressions. Yeah, I didn't know that one. For any non-trivial program you really should be using something other than a shell script. My dividing line continues to be arrays. If you need an array to solve the problem in a shell script, you really should be using a different language to solve the problem.

A New Oral Culture

https://www.oblomovka.com/wp/2025/02/12/a-new-oral-culture/

Interesting thoughts when you think about how it relates to project documentation.

The Best Programmers I Know

https://endler.dev/2025/best-programmers/

Not much I learned from this one, but always worth sharing things for those getting started.

🦕 RansomLook 🦖

https://www.ransomlook.io/

Added this one to the links page, but you really need to be aware just how bad the problem with ransomware has become. We're talking hundreds of organizations a month.

How to Report Bugs Effectively

https://www.chiark.greenend.org.uk/~sgtatham/bugs.html

The key to a good bug report is helping the developer recreate the problem. If you know how to reliably recreate it, a developer can usually fix it. So take the time to figure out the simplest way to cause the problem to happen before reporting it.

13 Things I Would Have Told Myself Before Building An Autorouter

https://blog.autorouting.com/p/13-things-i-would-have-told-myself

I always love someone rubbing real performance engineering work in the face of developers who think Rust or C++ just magically make your garbage code fast or those who think interpreted languages can't possibly be fast enough. Developer skill continues to be the dominant factor in real world application performance.

Win32 Is The Only Stable ABI on Linux

https://blog.hiler.eu/win32-the-only-stable-abi/

This is just sad, but seemingly true. So many people in the POSIX ecosystem just don't take backward compatibility as seriously as you have to in order to be treated as a platform. I can still run games compiled for Windows in the 90s just fine. Want to back test your website on old versions of WebKit known to exist in many IoT devices? Basically impossible given the state of API churn and system "innovation".

From False Alarms to Real Threats: Protecting Cryptography Against Quantum

https://threatresearch.ext.hp.com/protecting-cryptography-quantum-computers/

Yeah, there's a lot of hyperventilating about quantum computing. This at least tries to give a mostly realistic picture of where we are, what needs doing before it becomes a reality, and why getting started this early matters. There are two key points. First, the first group to get Shor's algorithm running at a realistic scale will most likely be sworn to national secrecy. Second, agencies with fat wallets are definitely hoovering up data streams likely to have juicy secrets worth the cost of storing for a decade or more.

Teaching Smart People How to Learn

https://hbr.org/1991/05/teaching-smart-people-how-to-learn

A colleague at work sent this my way. The 60s to the 90s truly had some great management philosophy and science going on before the dot-com boom and big tech came to dominate management discourse by fiat.

Landrun

https://github.com/Zouuup/landrun

I love code sandboxes for services. Any tool that makes setting them up simpler is worth a look in my opinion. Haven't tested this out, but I'm keeping an eye on it.

The End of US Democracy and the Implications for International Relations

https://www.e-ir.info/2025/03/17/the-end-of-us-democracy-and-the-implications-for-international-relations/

Trying to situate the next few decades in the context of other countries' backsliding helps you understand what's likely in store. You won't be able to just vote for a reversal of what's likely to come to pass.

Kings Over the Necessaries of Life

https://farmaction.us/wp-content/uploads/2024/09/Kings-Over-the-Necessaries-of-Life-Monopolization-and-the-Elimination-of-Competition-in-Americas-Agriculture-System_Farm-Action.pdf

Great report picking apart many of the reasons for the price "shocks" we're seeing.

Microsoft CEO Admits That AI Is Generating Basically No Value

https://ca.finance.yahoo.com/news/microsoft-ceo-admits-ai-generating-123059075.html

lol

How Complex Systems Fail

https://how.complexsystems.fail/

Classic insights on complex systems.

CPU Benchmarks - Year on Year Performance

https://www.cpubenchmark.net/year-on-year.html

If you still think Moore's Law is going to save you, you're a decade out of date at this point. This marks the first time we've seen performance start to regress. It seems like we're starting to crest the S-curve on performance available on computers. The future is in squeezing the software, not the hardware.

Dazed & Confused: A Large-Scale Real-World User Study of reCAPTCHAv2

https://arxiv.org/pdf/2311.10911

reCAPTCHA was already bad, and it continues to get worse.

SoK: Understanding Design Choices and Pitfalls of Trusted Execution Environments

https://dl.acm.org/doi/pdf/10.1145/3634737.3644993

Trusted Execution Environments continue to be my computing enemy number one. Get your grubby little hands off my system. To that end, know thy enemy.

Unexpected Benefits of Building Your Own Tools

https://tiniuc.com/make-more-tools/

Another take on why a team that only uses off-the-shelf tools will always struggle to do a fraction of what a set of purpose-built tools can achieve.

Smooth Paths Using Catmull-Rom Splines

https://qroph.github.io/2018/07/30/smooth-paths-using-catmull-rom-splines.html

Another great curve worth understanding. While everyone reaches for Bezier curves first, these are interesting in the context of compressing freehand drawings and adding visually pleasing smoothing.
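A rough sketch of why these curves suit freehand smoothing: the curve passes through every control point, so you can keep a handful of sampled pen positions and regenerate a smooth path between them. This uses the plain uniform Catmull-Rom formula, which may be simpler than the variant the linked post works through. The function names and the endpoint-duplication trick in `smooth_path` are my own assumptions for the example.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Point at t in [0, 1] on the uniform Catmull-Rom segment running from p1 to p2."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2 * b
               + (c - a) * t
               + (2 * a - 5 * b + 4 * c - d) * t2
               + (-a + 3 * b - 3 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def smooth_path(points, samples=8):
    """Sample a curve through every point in `points` (needs at least two points)."""
    # Duplicate the endpoints so the first and last segments have the
    # neighbours the formula needs (an assumption, not the only option).
    padded = [points[0], *points, points[-1]]
    path = []
    for i in range(len(padded) - 3):
        for s in range(samples):
            path.append(catmull_rom(*padded[i:i + 4], s / samples))
    path.append(tuple(float(v) for v in points[-1]))
    return path
```

Calling `smooth_path([(0, 0), (1, 2), (3, 3), (4, 0)])` returns a denser list of points that still hits each of the four originals exactly.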

Lecture Friday: The ultimate guide to JTBD

A colleague sent me Bob Moesta as a reference when we were talking about a few things. I love the 2x2 framework discussed when thinking about why people change products.

Difference between Spline, B-Spline and Bezier Curves

https://www.geeksforgeeks.org/difference-between-spline-b-spline-and-bezier-curves/

A great primer on a few useful curves. You can never understand curve functions too deeply if you're working in computer graphics.

Beware of Metacognitive Laziness: Effects of Generative Artificial Intelligence on Learning Motivation, Processes, and Performance

https://arxiv.org/pdf/2412.09315

Worth keeping in mind. I'd love to see a replication and further extensions here to see what sort of effects we're likely to see.

Let's Talk About AI And End-to-End Encryption

https://blog.cryptographyengineering.com/2025/01/17/lets-talk-about-ai-and-end-to-end-encryption/

Another step towards treating users like cattle instead of pets.

Mistakes Engineers Make in Large Established Codebases

https://www.seangoedecke.com/large-established-codebases/

Another acolyte of Socrates. I like the idea that the ability to work on large legacy codebases is what separates senior developers from junior developers. I mean, it's self-serving since I'm someone who works on such codebases. Either way, good food for thought on development.

That's Not an Abstraction, That's Just a Layer of Indirection

https://fhur.me/posts/2024/thats-not-an-abstraction

Go look at abstract art. Now look at your abstraction. Well those don't go together.

Yeah, your abstraction is just giving things names. Real abstraction moves away from the nature of the thing. Relational algebra is an abstraction away from searching tuples of information. JSX is JavaScript wearing an HTML skin suit. Knowing the difference is important.

Why Events Are A Bad Idea (for high-concurrency servers)

https://web.stanford.edu/class/cs240/readings/events-bad.pdf

Paper arguing that threads and events are essentially duals of one another. They advocate for greater compiler support of threads, which is something I haven't really seen outside of language extensions like Cilk. I'd love to see what more we could do with the compiler to better signal context switching to the system, rather than trying to outsmart it from our local point of view as we keep doing with userland threading.

Python for Lisp Programmers

https://www.norvig.com/python-lisp.html

I've always thought Python was fairly Lisp-like, if Lisp were based on dictionaries instead of lists. Sure, Python's syntax isn't expressed in terms of dictionaries, but the runtime generally is.

Wikipedia: Amdahl's Law

https://en.wikipedia.org/wiki/Amdahl's_law

It's important to understand that horizontal scalability is also limited. You can't just throw more cores at a problem. There are those tasks that are embarrassingly parallel, but they represent only a subset of interesting computations.
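To put numbers on that limit, Amdahl's law can be written as a one-line function. The function and parameter names are mine. The formula itself is the standard one: speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that parallelizes and n is the number of cores.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Overall speedup when only `parallel_fraction` of the work scales with cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Even with 95% of the work parallelizable, the serial 5% caps the
    # speedup below 20x no matter how many cores you add.
    for cores in (2, 8, 64, 1024):
        print(cores, round(amdahl_speedup(0.95, cores), 2))
```

Only the embarrassingly parallel case, where the parallel fraction is 1.0, scales linearly with core count.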